Developing FPGA backends

fgrenyard · November 4, 2021, 11:13am

Hello,

I’m currently developing a neuromorphic FPGA architecture that could potentially serve as a backend for spiking Nengo models. I have a prototype up and running, which compiles and runs the model from the “Nengo: Under the Hood” example in Nengo’s documentation. I’d like to develop the project beyond a prototype; could your team give me some advice on how to take it further?

Thanks!

xchoo · November 15, 2021, 4:18am

Hi @fgrenyard, and welcome to the Nengo forums!

I’m not sure if you are aware, but we do actually have an FPGA backend for Nengo! Note, however, that it only supports a small number of FPGAs (essentially 3 dev boards), so your work will still be helpful.

As for how to develop the project beyond the prototype, you can look to our NengoFPGA codebase for guidance. I can give you a run-down of how we developed our FPGA backend, and give insights on some of the decisions that were made along the way.

The first thing you’ll probably want to do is decide the scope of the Nengo model that you want to be able to run on the FPGA. Given the size limitations of the FPGA development boards we were using, we limited our FPGA backend to simulating one Nengo network (within a Nengo model) on each FPGA board. We had sketched out a “full” FPGA backend (where the entire Nengo model is put on the FPGA), but the development of that project was put on hold in favour of other projects.

The other thing you’ll need to figure out is how to communicate with the rest of the Nengo software. Since we had limited the FPGA backend to simulating one Nengo network (embedded in a Nengo model), we decided to use UDP sockets to communicate with the FPGA development board. This communication went through the ARM processor on the development board, with the Linux OS running on the ARM processor acting as a proxy. We discovered that this form of communication was a significant bottleneck (transferring data from the FPGA fabric to the ARM processor is a slow process), and we were attempting to find alternative methods of communication before work on the project was halted. One idea we had was to implement an Ethernet controller on the FPGA fabric so that we could handle the UDP connection on the FPGA itself, rather than through the ARM processor.
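As a rough illustration of the host side of such a UDP exchange, here is a minimal sketch using only Python's standard library. The packet layout (a float32 timestamp followed by float32 values) and the `fake_board` stand-in are assumptions made for the example; they are not NengoFPGA's actual protocol.

```python
import socket
import struct
import threading


def pack_packet(t, values):
    """Pack a timestamp and a vector as little-endian float32s (hypothetical layout)."""
    return struct.pack("<%df" % (len(values) + 1), t, *values)


def unpack_packet(data):
    """Inverse of pack_packet: returns (t, [values])."""
    n = len(data) // 4
    t, *values = struct.unpack("<%df" % n, data)
    return t, values


def fake_board(sock):
    """Stand-in for the FPGA board: doubles each value and sends it back."""
    data, addr = sock.recvfrom(4096)
    t, values = unpack_packet(data)
    sock.sendto(pack_packet(t, [2.0 * v for v in values]), addr)


def round_trip(t, values, timeout=2.0):
    """Send one packet to a local stand-in board and wait for the reply."""
    board = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    board.settimeout(timeout)
    board.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    worker = threading.Thread(target=fake_board, args=(board,))
    worker.start()
    host = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    host.settimeout(timeout)
    try:
        host.sendto(pack_packet(t, values), board.getsockname())
        reply, _ = host.recvfrom(4096)
    finally:
        host.close()
        worker.join()
        board.close()
    return unpack_packet(reply)
```

On real hardware, `fake_board` would be replaced by whatever listens on the board (the ARM-side proxy described above, or logic on the fabric), and the host would send one such packet per simulation timestep.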

You could also implement entire Nengo models on the FPGA board. For this approach to work, you’ll need to familiarize yourself with the Nengo build process. This is the process by which Nengo takes a Nengo model and converts it into the NumPy operations that actually run the simulation. You’ll need to replicate this process to “compile” the Nengo model into an FPGA bitstream, or into memory values that are read by a generic Nengo model bitstream. The approach taken here would depend on how you program the FPGA with the Nengo model code.

Also, be sure to check out our Nengo community contributor guide, and best of luck to you.

fgrenyard · November 15, 2021, 7:24pm

Hi @xchoo, thank you!

My aims for the system include making the hardware portable to many different FPGA architectures, sizes, and boards, so the hardware is written in SystemVerilog and is heavily parametrised so that the memory usage can be finely controlled.

A rundown of the existing backend’s codebase would be very useful, thank you! I have had a look through the codebase and I would like to determine the best way to interface my current implementation with Nengo.

The scope that I am eventually aiming for is for full models to be deployed onto dev boards for use in robotics, edge applications, etc. with no host needed. Before developing an interface to Nengo, the first thing I would like to do is expand the ensemble dimensionality, since the example code only has one dimension. (A clarification of how higher dimensions are implemented at a low level would also be very helpful!) After that, I would like to deploy one ensemble as a first step, then expand to full models and add learning rules.
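For reference on the dimensionality question: in Nengo, each neuron has a D-dimensional encoder, and its input current is the gain times the dot product of that encoder with the represented vector, plus a bias. Going from one dimension to D dimensions therefore turns a single multiply per neuron into a D-term multiply-accumulate. The sketch below shows this in floating point and in a hypothetical fixed-point form; the Q-format (`frac_bits=12`) and the function names are assumptions for illustration, and the radius scaling that Nengo folds into its scaled encoders is left out.

```python
def input_currents(x, encoders, gain, bias):
    """Floating point: J_i = gain_i * (e_i . x) + bias_i for every neuron i."""
    currents = []
    for e_i, g_i, b_i in zip(encoders, gain, bias):
        dot = sum(e_d * x_d for e_d, x_d in zip(e_i, x))
        currents.append(g_i * dot + b_i)
    return currents


def to_fixed(value, frac_bits=12):
    """Quantise a float to an integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))


def input_currents_fixed(x_q, scaled_encoders_q, bias_q, frac_bits=12):
    """Integer version with the gain pre-multiplied into the encoders.

    Each neuron needs one multiply-accumulate per dimension, which is the
    part that grows when moving from 1-D to D-D ensembles.
    """
    currents = []
    for e_i, b_i in zip(scaled_encoders_q, bias_q):
        acc = 0
        for e_d, x_d in zip(e_i, x_q):
            acc += e_d * x_d  # product carries 2 * frac_bits fractional bits
        # Shift back to frac_bits (arithmetic shift, rounds toward -infinity).
        currents.append((acc >> frac_bits) + b_i)
    return currents
```

Folding the gain into the encoders ahead of time means the hardware only stores one weight per neuron per dimension, at the cost of a wider dynamic range for those weights.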

The communication with the host computer is done via USB UART at the moment, with a Verilog UART on the FPGA and the pySerial library on the host. I chose this because many FPGA dev boards have USB serial capability, which increases portability. It runs at around 2 Mb/s, but a ‘hat’ for my dev board (Alchitry Au) allows higher data rates over USB-C, which could be an option for other boards too. At the moment, only the input/output values of the system are transferred, but I would like to add debug information, spike addresses, etc.
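To make the serial link concrete, here is one way the input/output values could be framed on the host before being written to the UART. The frame layout (start byte `0xA5`, value count, little-endian int16 payload, XOR checksum) and the Q-format are purely hypothetical; the post does not describe the actual wire format.

```python
import struct

START = 0xA5  # hypothetical start-of-frame marker


def _xor(payload):
    """XOR of all payload bytes, used as a simple checksum."""
    checksum = 0
    for b in payload:
        checksum ^= b
    return checksum


def encode_frame(values, frac_bits=12):
    """Build a frame: START, count, little-endian int16 payload, XOR checksum."""
    if len(values) > 255:
        raise ValueError("at most 255 values fit in one frame")
    scale = 1 << frac_bits
    # Quantise to signed 16-bit fixed point, saturating at the int16 limits.
    ints = [max(-32768, min(32767, int(round(v * scale)))) for v in values]
    payload = struct.pack("<%dh" % len(ints), *ints)
    return bytes([START, len(ints)]) + payload + bytes([_xor(payload)])


def decode_frame(frame, frac_bits=12):
    """Validate a frame and return its values as floats."""
    if len(frame) < 3 or frame[0] != START:
        raise ValueError("bad start byte")
    count = frame[1]
    if len(frame) != 3 + 2 * count:
        raise ValueError("bad length")
    payload = frame[2:2 + 2 * count]
    if _xor(payload) != frame[-1]:
        raise ValueError("checksum mismatch")
    ints = struct.unpack("<%dh" % count, payload)
    return [i / (1 << frac_bits) for i in ints]
```

The bytes returned by `encode_frame` could then be passed to the `write` method of a pySerial `Serial` object. A start byte and checksum also leave room to add other frame types later, such as debug data or spike addresses.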

At the moment, the board is programmed as follows: a compiler I’ve written in Python compiles the parameters from the constructed model into a binary memory file, which is then baked into the bitstream when the synthesis tools run. This allows for full optimisation of all the data paths to reduce the model’s memory footprint on the FPGA’s fabric.
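A minimal sketch of what such a parameter-to-memory-file step might look like is shown below. It assumes a text image with one hex word per line (the kind of file the SystemVerilog `$readmemh` task can load) and a 16-bit word with 12 fractional bits; the actual compiler and file format used in this project are not described in the thread, so treat every name and width here as hypothetical.

```python
def to_twos_complement(value, frac_bits, width):
    """Quantise a float to signed fixed point and return the raw bit pattern."""
    q = int(round(value * (1 << frac_bits)))
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    q = max(lo, min(hi, q))  # saturate rather than wrap on overflow
    return q & ((1 << width) - 1)  # two's-complement encoding of negatives


def memory_image(values, frac_bits=12, width=16):
    """Return the memory contents as text, one zero-padded hex word per line."""
    digits = (width + 3) // 4
    lines = [
        format(to_twos_complement(v, frac_bits, width), "0%dx" % digits)
        for v in values
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file (for example with `open(path, "w")`) gives an image the synthesis tools can pick up. Because `width` and `frac_bits` are parameters, the same step can be reused for differently sized memories.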

Thank you very much for your help, I’m looking forward to developing this system further!

xchoo · November 18, 2021, 4:45pm

I wrote up a brief explanation of the builder (for learning rules, but it also applies to all other Nengo objects) here. There is also documentation on the Nengo build process here. You can probably write custom builder code to do the FPGA compilation step.

Another approach would be to use the standard Nengo build process (i.e., just create a Nengo simulator object), and iterate through the various Nengo objects (ensembles, connections, etc.) to extract out the parameters you need for the FPGA model (e.g., iterate through all of the ensembles to get the gain and bias values).
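A sketch of that second approach is below. It relies on two things from Nengo: `Network.all_ensembles`, and the built-ensemble data in `sim.data[ens]`, which exposes `gain`, `bias`, and `encoders`. The function name and the layout of the returned dictionary are my own choices for illustration.

```python
def extract_ensemble_params(model, sim):
    """Collect per-ensemble parameters from a built Nengo simulator.

    `model` is expected to be a nengo.Network and `sim` a nengo.Simulator
    built from it; only `model.all_ensembles` and `sim.data[ens]` are used.
    """
    params = {}
    for index, ens in enumerate(model.all_ensembles):
        built = sim.data[ens]
        # Fall back to a positional name when the ensemble has no label.
        key = ens.label or "ensemble_%d" % index
        params[key] = {
            "n_neurons": ens.n_neurons,
            "dimensions": ens.dimensions,
            "gain": [float(g) for g in built.gain],
            "bias": [float(b) for b in built.bias],
            "encoders": [[float(e) for e in row] for row in built.encoders],
        }
    return params
```

With Nengo installed, this would be called as `with nengo.Simulator(model) as sim: params = extract_ensemble_params(model, sim)`. Note that two ensembles sharing a label would overwrite each other in this sketch, so a real tool would want unique keys.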

fgrenyard · November 23, 2021, 11:01am

Hi @xchoo,

Thank you for your help! I have a single ensemble up and running with Nengo (using the standard Nengo build process approach to extract the parameters). I’m now working on expanding the framework to support higher-dimensional representations and whole networks. I’ll post the progress of the project on this thread.

[Image: Screenshot 2021-11-23 at 10.55.57]

xchoo · November 24, 2021, 1:41am

That’s neat! Congrats!