Saturday, March 29, 2014

Programming the BeagleBone's PRU

The BeagleBone Black is already a very powerful device with Linux under its belt, but one thing many may not know is that it also contains two real-time units that can be programmed on their own (like having two Arduinos on board) - at least I never knew until the day before writing this post. Every bone comes with two PRUs (Programmable Real-Time Units).

In my efforts to learn what this feature was all about and how I could use it, I unfortunately didn't find much help from Google. That may just have been because I was looking in the wrong places. Nevertheless, I found my answers and want to share them here on my blog.

Simple Tutorials:
https://github.com/TekuConcept/PRU_Demo



PRUs on the bone are programmed with a special form of assembly specific to the TI platforms.

Time (or patience) may not be in your favor but I promise you: once you learn how to program in assembly in general, your eyes will be opened to the power of your own computers and to possibilities beyond what Linux and C++ have to offer. (With assembly on any processor, you are literally telling the processor how to do its job, just like when you program an Arduino. Assembly is to binary executable as hex is to binary value.)

I recommend learning assembly on Intel i386 processors with NASM first, before moving on to assembly for the TI PRUs, and what better place to start than the well-put-together tutorials at www.tutorialspoint.com/assembly_programming. Don't worry: once you've completed all the lessons there, you'll be able to jump right into PRU programming using these links (or this post) as your guide:
processors.wiki.ti.com/index.php/PRU_Assembly_Instructions,
processors.wiki.ti.com/index.php/PRU_Assembly_Reference_Guide



Take a look at some tutorials/examples:
http://boxysean.com/blog/2012/08/12/first-steps-with-the-beaglebone-pru/
(for the example.c file in the above, change the header locations from 'pruss/*.h' to '/usr/include/*.h')
http://github.com/beagleboard/am335x_pru_package/tree/master/pru_sw/example_apps
http://mythopoeic.org/bbb-pru-minimal/
Don't forget to run 'modprobe uio_pruss' before executing any PRU code! To enable PRU, try typing the following in shell: 'echo BB-BONE-PRU-01 > /sys/devices/bone_capemgr.9/slots'



Set up BeagleBone Black:
Check to see if the packages already exist with these shell commands:
whereis modprobe # or modprobe -h
whereis pasm # or pasm -h
...both of which should return file locations (or help info). If they do, you're all ready to roll.
Otherwise enter the following (these are the commands I tested with Debian):

  • git clone git://github.com/beagleboard/am335x_pru_package.git
    (I had trouble cloning directly to the bone so I cloned to my desktop and then used SFTP to copy the files over)
  • cd am335x_pru_package/pru_sw/app_loader/interface
  • make CROSS_COMPILE=""
  • cd ../../utils/pasm_source
  • chmod u+x linuxbuild
  • ./linuxbuild
  • cd ../../utils
  • mv pasm pasm_2
  • cd ../example_apps
  • make CROSS_COMPILE=""
  • ldconfig
  • cd ../utils
  • cp pasm_2 /usr/bin/pasm
    (the binary was renamed to pasm_2 a few steps earlier, so that is the file to copy)



Assembly Comparison:
"Nearly all instructions (with exception of accessing memory external to PRU) are single-cycle execute (5 ns when running at 200 MHz)" If you know Verilog some of this may be familiar...
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
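The 5 ns figure quoted above is easy to turn into a timing budget. Here is a rough sketch in C (chosen only because it runs on any desktop) that converts cycle counts to nanoseconds and works out how many passes a delay loop needs. The function names are my own, and the "2 cycles per pass" in the usage note assumes a loop made of one SUB plus one QBNE branch with no memory access, so treat the numbers as estimates.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the PRU core runs at 200 MHz, as in the quote above. */
#define PRU_CLOCK_HZ 200000000u

/* Nanoseconds taken by `cycles` single-cycle instructions. */
uint64_t pru_cycles_to_ns(uint64_t cycles)
{
    return cycles * 1000000000ull / PRU_CLOCK_HZ;
}

/* Passes of a delay loop needed to burn `delay_us` microseconds,
   given how many cycles one pass of the loop costs. */
uint32_t pru_delay_iterations(uint32_t delay_us, uint32_t cycles_per_iter)
{
    assert(cycles_per_iter > 0);
    uint64_t cycles = (uint64_t)delay_us * (PRU_CLOCK_HZ / 1000000u);
    return (uint32_t)(cycles / cycles_per_iter);
}
```

For example, pru_delay_iterations(1000, 2) gives the loop count for a 1 ms delay with a two-instruction loop body.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -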
NASM memory registers:

  • 32bit: EAX, EBX, ECX, EDX
  • 16bit: AX, BX, CX, DX
  •  8bit: AH, AL, BH, BL, CH, CL, DH, DL
PRU memory registers:
  • 32bit: r0, r1, r2, r3            . . . r30, r31
  • 16bit: r0.w0, r0.w1              . . . r31.w0
  •  8bit: r0.b0, r0.b1, r0.b2       . . . r31.b0
  •  1bit: r0.t0, r0.b0.t0, r0.w1.t8 . . . r31.t31
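
To make the PRU sub-field names concrete, here is a small C sketch that pulls the same fields out of a 32-bit value. The helper names (pru_byte, pru_word, pru_bit) are made up for this post and are not part of pasm. The sketch assumes the layout described in TI's PRU Assembly Reference Guide, where .w1 is the 16-bit window starting at bit 8 (so it overlaps .w0 and .w2).

```c
#include <assert.h>
#include <stdint.h>

/* rN.bK: byte field, K = 0..3, bits [8K+7 : 8K]. */
uint8_t pru_byte(uint32_t reg, unsigned k)
{
    assert(k < 4);
    return (uint8_t)(reg >> (8 * k));
}

/* rN.wK: word field, K = 0..2, a 16-bit window starting at bit 8K
   (w0 = bits 15:0, w1 = bits 23:8, w2 = bits 31:16). */
uint16_t pru_word(uint32_t reg, unsigned k)
{
    assert(k < 3);
    return (uint16_t)(reg >> (8 * k));
}

/* rN.tK: single bit, K = 0..31. */
unsigned pru_bit(uint32_t reg, unsigned k)
{
    assert(k < 32);
    return (reg >> k) & 1u;
}
```

Chained names work the same way: r0.w1.t8 is bit 8 of the w1 window, which is bit 16 of r0, i.e. pru_bit(pru_word(r0, 1), 8) equals pru_bit(r0, 16).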
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NASM general:
  • MOV [dst],[src]: dst=src;
  • MOV BYTE/WORD/DWORD/QWORD/TBYTE [dst],[src]
  • JMP [label]: goto label;
  • CALL [proc]: goto proc -> return to call;
  • RET: return;
PRU general:
  • MOV/LDI [dst],[src]: dst=src;
  • MVIB/MVIW/MVID [dst],[src];
  • LBBO [dst],[adr],[off],[cnt]: 'Copy cnt bytes into dst from memory address adr+off; cnt!=0'
  • SBBO [src],[adr],[off],[cnt]: 'Copy cnt bytes from src to memory address adr+off; cnt!=0'
  • ZERO [src],[sz]: memset(src, 0, sz);
  • JMP [label]: goto label;
  • JAL [rtrn],[label]: goto label -> return to rtrn;
  • CALL [proc]: goto proc -> return to call;
  • RET: return;
  • WBS [src],[val]: while(!(src&(1<<val)));
  • WBC [src],[val]: while(src&(1<<val));
  • HALT: 'pauses PRU until manually resumed'
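
If the burst and wait instructions feel abstract, this C sketch models them on the desktop. It is only a model under my own assumptions: PRU memory is stood in for by a plain byte array, the function names are invented for this post, and the two *_waiting helpers just return the loop condition instead of actually spinning.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* LBBO dst, adr, off, cnt: copy cnt bytes from memory at adr+off into dst. */
void lbbo(void *dst, const uint8_t *mem, uint32_t adr, uint32_t off, uint32_t cnt)
{
    assert(cnt != 0);
    memcpy(dst, mem + adr + off, cnt);
}

/* SBBO src, adr, off, cnt: copy cnt bytes from src to memory at adr+off. */
void sbbo(const void *src, uint8_t *mem, uint32_t adr, uint32_t off, uint32_t cnt)
{
    assert(cnt != 0);
    memcpy(mem + adr + off, src, cnt);
}

/* WBS keeps waiting while the chosen bit is still clear. */
int wbs_waiting(uint32_t src, unsigned bit)
{
    assert(bit < 32);
    return !(src & (1u << bit));
}

/* WBC keeps waiting while the chosen bit is still set. */
int wbc_waiting(uint32_t src, unsigned bit)
{
    assert(bit < 32);
    return (src & (1u << bit)) != 0;
}
```

Storing a register with sbbo and reading it back with lbbo at the same adr and off returns the original value, which is the usual way to pass data between the PRU and the host through shared memory.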
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NASM arithmetic:
  • INC [a]:     a++;
  • DEC [a]:     a--;
  • ADD [a],[b]: a+=b;
  • SUB [a],[b]: a-=b;
  • MUL [m]:     a*=m;  (unsigned; a is EAX)
  • IMUL [m]:    a*=m;  (signed)
  • DIV [d]:     a/=d;  (unsigned)
  • IDIV [d]:    a/=d;  (signed)
PRU arithmetic:
  • ADD [v],[a],[b]: v=a+b;
  • SUB [v],[a],[b]: v=a-b;
  • ADC [v],[a],[b]: v=a+b+carry;
  • SUC [v],[a],[b]: v=a-b-carry;
  • RSB [v],[a],[b]: v=b-a;
  • RSC [v],[a],[b]: v=b-a-carry;
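
The carry variants exist so that values wider than 32 bits can be handled in pieces. As a sketch of the idea (in C, with names I made up; the real PRU keeps the carry in a status bit rather than returning it), here is a 64-bit addition built from one ADD on the low words followed by one ADC on the high words:

```c
#include <assert.h>
#include <stdint.h>

/* A 32-bit result plus the carry flag it would leave behind. */
typedef struct {
    uint32_t value;
    unsigned carry;
} pru_result;

/* ADD v,a,b: v = a + b; carry is set on unsigned overflow. */
pru_result pru_add(uint32_t a, uint32_t b)
{
    pru_result r = { a + b, 0 };
    r.carry = r.value < a;
    return r;
}

/* ADC v,a,b: v = a + b + carry_in. */
pru_result pru_adc(uint32_t a, uint32_t b, unsigned carry_in)
{
    assert(carry_in <= 1);
    uint64_t wide = (uint64_t)a + b + carry_in;
    pru_result r = { (uint32_t)wide, (unsigned)(wide >> 32) };
    return r;
}

/* 64-bit add: ADD the low halves, then ADC the high halves. */
uint64_t add64(uint64_t x, uint64_t y)
{
    pru_result lo = pru_add((uint32_t)x, (uint32_t)y);
    pru_result hi = pru_adc((uint32_t)(x >> 32), (uint32_t)(y >> 32), lo.carry);
    return ((uint64_t)hi.value << 32) | lo.value;
}
```

SUC and RSC play the same role for multi-word subtraction.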
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
NASM logic:
  • AND [a],[b]: a&=b;
  • OR  [a],[b]: a|=b;
  • XOR [a],[b]: a^=b;
  • NOT [a]: a=~a;
  • CMP [a],[b]: compares a with b and sets the flags (greater/equal/less); TEST [a],[b] sets them from a&b instead
    JG [label]: jumps if greater than
    JE [label]: jumps if equal to
    JL [label]: jumps if less than
PRU logic:
  • AND [v],[a],[b]: v=a&b;
  • OR  [v],[a],[b]: v=a|b;
  • XOR [v],[a],[b]: v=a^b;
  • NOT [v],[a]    : v=~a;
  • MIN [v],[a],[b]: v=a<b?a:b;
  • MAX [v],[a],[b]: v=a>b?a:b;
  • LSL [v],[a],[b]: v=a<<b;
  • LSR [v],[a],[b]: v=a>>b;
  • CLR [v],[a],[b]: v=a&~(1<<b);
  • SET [v],[a],[b]: v=a|(1<<b);
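
The one-line descriptions of the PRU bit and min/max instructions translate directly into C. The sketch below does just that so the behavior can be checked on a desktop; the pru_* names are mine, and the assert on the bit index is my own guard rather than something the hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* CLR v,a,b: v = a with bit b cleared. */
uint32_t pru_clr(uint32_t a, unsigned b)
{
    assert(b < 32);
    return a & ~(1u << b);
}

/* SET v,a,b: v = a with bit b set. */
uint32_t pru_set(uint32_t a, unsigned b)
{
    assert(b < 32);
    return a | (1u << b);
}

/* LSL / LSR v,a,b: logical shifts left and right. */
uint32_t pru_lsl(uint32_t a, unsigned b)
{
    assert(b < 32);
    return a << b;
}

uint32_t pru_lsr(uint32_t a, unsigned b)
{
    assert(b < 32);
    return a >> b;
}

/* MIN / MAX v,a,b: unsigned minimum and maximum. */
uint32_t pru_min(uint32_t a, uint32_t b) { return a < b ? a : b; }
uint32_t pru_max(uint32_t a, uint32_t b) { return a > b ? a : b; }
```

Setting a bit and then clearing the same bit gets you back to the starting value, which is a handy sanity check when toggling an output pin through r30.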