Understanding pipeline of nengo/nengo_dl on hardware

Dear Nengo community,

I am trying to understand the nengo/nengo_dl pipeline. As an example, let’s suppose we have a two-layer model with a single neuron in each layer.


Suppose the model is trained and we want to run it on some hardware, for instance Loihi or an FPGA. My understanding of the pipeline so far is:
Step 1: Given an input, it is first multiplied by the weight w1, and the product is sent to the first-layer LIF neuron.
Step 2: The LIF neuron spikes according to its assigned tuning curve.
Step 3: These spikes are then decoded for the next layer? Am I correct? And are spikes then assigned to the decoded value based on the tuning curve? If my understanding is correct, what does this decoding look like? I assumed everything after the first layer is done in spikes/currents, so how are these spikes/currents treated at the next layer’s neurons?

Thank you in advance for your answer. :slight_smile:

Hi @Choozi,

If you are running it on hardware, the pipeline is typically hardware dependent. With respect to models run on hardware, Nengo and NengoDL merely serve to train the models to obtain the connection and neuron parameter weights needed for the network to perform some function. It is the responsibility of the respective backends (e.g., NengoLoihi, NengoFPGA, etc.) to then convert the network architecture (including the weights) into a structure that the hardware can natively use.

I’m not familiar with all of the hardware supported by Nengo, but most hardware applies the following pipeline to the neural network computation (using your two-layer network as an example):

  1. Some input is provided. This input is multiplied by the connection weight matrix w1 and is fed into the first layer of LIF neurons. This input (that has been multiplied by the connection weight matrix) is treated as input current to the LIF neurons.
  2. For spiking neurons, the input current is fed into the neuron’s activation function. Depending on the state of the neuron (i.e., what the membrane voltage is, how much input current is being provided, etc.), the neuron may or may not generate a spike. Note that this is a “real-time” process. The membrane voltages are updated every timestep to determine if the neuron spikes or not.
    Note: For rate neurons, using the input current and the neuron’s response curve, a firing rate is computed (basically, the neuron’s response curve maps some input current to some output firing rate).
  3. The spikes generated by the first layer of neurons are filtered by a post-synaptic filter, and then multiplied with the connection weights w2. The post-synaptic filter “smooths” out the spikes and in combination with the connection weights “converts” the spike trains into input currents for the second neural layer.
  4. As with step 2, the input current from the previous step is fed into the second layer neuron’s activation functions which in turn generates (or not) spikes from the second layer.
  5. To get some output, the same post-synaptic smoothing is applied to the spike train of layer 2. These smoothed spike trains are multiplied with the weights w3 (also known as output weights) to generate the output signal.
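The five steps above can be sketched in plain Python for the two-layer, one-neuron-per-layer example. This is only an illustration under stated assumptions: the weights `w1`, `w2`, `w3`, the biases, the time constants, and the helper names are all made up for the example, and the neuron update is a simplified leaky integrate-and-fire step with no refractory period, not any particular backend’s implementation.

```python
import math

DT = 0.001        # simulation timestep (s)
TAU_RC = 0.02     # membrane time constant (s), assumed
TAU_SYN = 0.005   # synaptic (lowpass) time constant (s), assumed


def lif_step(voltage, current):
    """Advance a simplified LIF neuron by one timestep (no refractory period).

    Returns the new voltage and the spike output (0 or 1/DT).
    """
    voltage += (current - voltage) * (1.0 - math.exp(-DT / TAU_RC))
    if voltage > 1.0:              # threshold is normalised to 1
        return 0.0, 1.0 / DT       # reset and emit a spike of magnitude 1/dt
    return max(voltage, 0.0), 0.0


def lowpass_step(state, value):
    """One step of a first-order lowpass (post-synaptic) filter."""
    decay = math.exp(-DT / TAU_SYN)
    return decay * state + (1.0 - decay) * value


def run_pipeline(inputs, w1=2.0, w2=0.05, w3=0.01, bias1=0.5, bias2=0.5):
    """Run the hypothetical two-layer network and return its output signal."""
    v1 = v2 = 0.0          # membrane voltages of the two neurons
    syn1 = syn2 = 0.0      # filtered spike trains
    outputs = []
    for x in inputs:
        v1, spike1 = lif_step(v1, w1 * x + bias1)      # steps 1 and 2
        syn1 = lowpass_step(syn1, spike1)              # step 3: filter ...
        v2, spike2 = lif_step(v2, w2 * syn1 + bias2)   # ... weight, then step 4
        syn2 = lowpass_step(syn2, spike2)              # step 5: filter ...
        outputs.append(w3 * syn2)                      # ... and output weight
    return outputs


outputs = run_pipeline([1.0] * 500)   # constant input for 0.5 s
```

Only `spike1` and `spike2` would travel between layers on spiking hardware; the filtering and weighting happen where the spike arrives.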

You should note above that for spiking networks, the “communication” between each layer is a spike train. When the spike train “arrives” at the destination neuron, this is the point where the post-synaptic filter and the connection weight are applied. Also note that since applying the post-synaptic filter and applying the connection weights are both linear operations (to be precise, the application of the post-synaptic filter is a convolution, and the application of the connection weights is a multiplication), the post-synaptic filter can be combined with the connection weights (i.e., scaling the post-synaptic filter), so the operation can be done in one step.
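As a quick check of that last point, here is a minimal sketch showing that filtering and then weighting a spike train gives the same result as weighting and then filtering it. The lowpass filter form, the time constant, and the weight value are assumptions chosen for illustration.

```python
import math

DT = 0.001
TAU_SYN = 0.005  # assumed synaptic time constant (s)


def lowpass(signal, tau=TAU_SYN, dt=DT):
    """Filter a whole signal with a first-order lowpass synapse."""
    decay = math.exp(-dt / tau)
    state, filtered = 0.0, []
    for value in signal:
        state = decay * state + (1.0 - decay) * value
        filtered.append(state)
    return filtered


# A spike train with a spike (magnitude 1/dt) on every 10th timestep.
spikes = [1.0 / DT if (i + 1) % 10 == 0 else 0.0 for i in range(100)]
weight = 0.3  # hypothetical connection weight

filter_then_weight = [weight * y for y in lowpass(spikes)]
weight_then_filter = lowpass([weight * s for s in spikes])

max_difference = max(
    abs(a - b) for a, b in zip(filter_then_weight, weight_then_filter)
)
print(max_difference)  # zero up to floating-point rounding
```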

I should note that the terms “encoder” and “decoder” are pretty much only used in Nengo itself, and they may not physically exist in the hardware (unless the hardware is capable of doing computation using the NEF algorithm – i.e., the hardware is capable of using factorized weights). Rather, for some hardware, there only exists one “connection weight” matrix between neural layers. For such hardware, the respective Nengo backend would combine the encoders and decoders together to form the connection weight matrix.
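To illustrate what “combining the encoders and decoders” means, here is a small sketch with made-up numbers. The matrix shapes and values are hypothetical, and the per-neuron gains are folded into the encoders as one possible convention; an actual backend may organise this differently.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Hypothetical values: 3 neurons in the first layer, 2 in the second,
# communicating through a 2-dimensional represented value.
decoders = [[0.2, -0.1, 0.4],     # shape (dimensions, n_pre)
            [0.0, 0.3, -0.2]]
encoders = [[1.0, 0.0],           # shape (n_post, dimensions)
            [-0.6, 0.8]]
gains = [2.0, 1.5]                # one gain per post-synaptic neuron

# Fold the gains into the encoders, then collapse into one weight matrix.
scaled_encoders = [[g * e for e in row] for g, row in zip(gains, encoders)]
weights = matmul(scaled_encoders, decoders)   # shape (n_post, n_pre)

activities = [[50.0], [120.0], [10.0]]        # filtered pre-synaptic activity

# Factorised route: decode to the represented value, then encode.
factorised = matmul(scaled_encoders, matmul(decoders, activities))
# Full-weight route: a single matrix multiplication.
direct = matmul(weights, activities)
```

Both routes give the same input currents; hardware without factorized weights only needs `weights`.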

Dear @xchoo,

Thank you very much for your prompt and detailed reply. I have a few more questions.

  1. Some input is provided. This input is multiplied by the connection weight matrix w1 and is fed into the first layer of LIF neurons. This input (that has been multiplied by the connection weight matrix) is treated as input current to the LIF neurons.

How is this current calculated? Looking at different posts, I came up with something like this: “In Nengo a spike is a single timestep event with a magnitude of 1/dt. If you make a connection from an a.neurons object, the resulting current fed to the post object depends on the synaptic filter applied on the connection. If the synapse is None or 0, then no change to the spike is applied, and what is essentially a 1/dt single timestep spike of current is sent to the post population. If a synapse is specified, the spike is convolved with the synapse, and that resulting current is sent to the post population (over time).”

My question is: how much current does a spike represent? Is this hardware specific?

  2. For spiking neurons, the input current is fed into the neuron’s activation function. Depending on the state of the neuron (i.e., what the membrane voltage is, how much input current is being provided, etc.), the neuron may or may not generate a spike. Note that this is a “real-time” process. The membrane voltages are updated every timestep to determine if the neuron spikes or not.
    Note: For rate neurons, using the input current and the neuron’s response curve, a firing rate is computed (basically, the neuron’s response curve maps some input current to some output firing rate).

How does this transition from rate neurons to spiking neurons take place? For instance, if the firing rate of a neuron is 100Hz, does that mean that when we translate it to a spiking neuron, the neuron will send a spike every 10 milliseconds?

And if we double the rate, i.e., 200Hz, then the spiking version of this neuron will spike every 5 milliseconds, right?

I should note that the terms “encoder” and “decoder” are pretty much only used in Nengo itself, … connection weight matrix.

When I am using nengo_dl and defining layers with Keras, does this include encoders and decoders as well?

The current is calculated literally as I described it – the input signal (be it from some exterior source, or as a spike train from a preceding neuron) is multiplied by some connection weight, then filtered by a synaptic filter. This process is consistent with the summary you quote in your question.

I must point out that there is a disconnect between the currents in Nengo and the currents found in biology. Nengo, like other neural network simulators, is an abstraction of the processes found in biology, so there isn’t necessarily an exact mapping between Nengo and biology (i.e., 1 unit of current in Nengo doesn’t necessarily map onto 1 uA of current in biology).

A spike doesn’t represent current. A spike represents an instantaneous increase in voltage in some given timeframe. If you want to calculate the amount of current the spike represents, you’ll need to know the resistance of the axon the spike is travelling down (for physical systems). In Nengo, this resistance is not modelled (i.e., the resistance is 0), so using the formula I = \frac{V}{R}, we get that in Nengo, one spike has infinite current. For hardware like Loihi, spikes are typically digital signals (i.e., a ‘1’ being sent down a wire), so the amount of current contained in the spike depends on the hardware being used.

I should note that the amount of current contained in a spike is typically unimportant. The spike serves to inform the system that an event has occurred, and a current equal to the connection weight multiplied by some post-synaptic current (PSC) [the PSC is in turn determined by the synaptic filter being used] should be fed into the succeeding neuron.
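A small sketch of this idea: a single spike of magnitude 1/dt is passed through a lowpass filter and scaled by a connection weight. The filter form, the time constant, and the weight are assumptions for illustration. The filter spreads the spike out over time but preserves its area, so the total contribution of one spike is set by the weight, not by the “current” of the spike itself.

```python
import math

DT = 0.001       # timestep (s)
TAU_SYN = 0.005  # assumed lowpass synapse time constant (s)
WEIGHT = 0.25    # hypothetical connection weight

decay = math.exp(-DT / TAU_SYN)

# A single spike of magnitude 1/dt at the first timestep, then silence.
spike_train = [1.0 / DT] + [0.0] * 199

state = 0.0
current = []
for spike in spike_train:
    state = decay * state + (1.0 - decay) * spike   # post-synaptic filter
    current.append(WEIGHT * state)                  # scaled by the weight

spike_area = sum(spike_train) * DT     # area under the raw spike
current_area = sum(current) * DT       # area under the resulting current

print(spike_area, current_area)  # approximately 1 and WEIGHT
```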

In Nengo / NengoDL, converting a rate model to a spiking model typically involves just replacing the rate neurons with equivalently configured (i.e., same intercepts, same max rates, same refractory times, etc.) spiking neurons, all the while keeping the rest of the network unchanged.

When you do so, a rate neuron that was firing at 100Hz will spike with an inter-spike interval of 1/100s (10ms) at steady state. The “at steady state” qualifier here is very important because the firing rate equivalence assumes that the input to the neuron is constant. Unlike rate neurons (whose output firing rates can change very quickly), the internal dynamics of spiking neurons mean that they require time to process quick changes to their input, and this is the reason why spiking models are more noisy than rate models.

As a quick example, let’s say you have a rate neuron where the output firing rate is such (using 1ms timesteps): 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 200, 100, 100, 100 …
The spiking neuron equivalent would generate a spike after the first 10 timesteps, and (roughly) after the next 9 timesteps. If you calculate the effective spike rate of the spiking neuron, you’ll see that it’s just slightly over 100Hz, and the 200Hz jump seen in the rate neuron output completely disappears in the spiking case.

Yes, this is correct, but once again, with the caveat that the comparison is made at steady state.

No. The connections between neural layers in most NengoDL models are done between neuron objects. Making such a connection “bypasses” (i.e., does not use) the encoders and decoders. But, in the grand scheme of things, this is not important since encoders and decoders are abstract concepts used to solve for the connection weights (using the NEF algorithm), whereas with NengoDL, you are using Keras training methods to do the same thing (i.e., to solve for the connection weights).

@xchoo Thank you very much for taking the time to answer. I would like to clear up a few more doubts…

Unlike rate neurons (whose output firing rates can change very quickly), the internal dynamics of spiking neurons mean that they require time to process quick changes to their input, and this is the reason why spiking models are more noisy than rate models.

Does “noisier” here mean that it is noisy because it is not spiking at every timestep?

As a quick example, let’s say you have a rate neuron where the output firing rate is such (using 1ms timesteps): 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 200, 100, 100, 100 …

For this example, the rate-based neuron will send 1 spike every timestep until the 10th timestep. At the 11th timestep it will send 2 spikes. Right?

Will the spiking version of the neuron only send a single spike at the 11th timestep, and be idle for the rest of the timesteps?

If you calculate the effective spike rate of the spiking neuron, you’ll see that it’s just slightly over 100Hz, and the 200Hz jump seen in the rate neuron output completely disappears in the spiking case.

Can you please elaborate on this?

But, in the grand scheme of things, this is not important since encoders and decoders are abstract concepts used to solve for the connection weights (using the NEF algorithm), whereas with NengoDL, you are using Keras training methods to do the same thing (i.e., to solve for the connection weights)

Does this mean that we are only changing the weights and biases during training when using NengoDL, and that the gain and the rest of the neuron parameters are not changed?

Spiking networks are typically noisier because information is only transmitted when a spike is generated. For example, if a rate network outputs this (with a 1ms timestep):

100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100 …

The equivalent spike output would be:

0, 0, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1000

I.e., 1 spike every 10 timesteps (note that the spike is 1000 in magnitude since it’s 1/dt). If you compare the rate and spike output, you can easily come to the conclusion that the spike output is much noisier than the rate network output.
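The two outputs above can be compared numerically. In this minimal sketch (plain Python, 1ms timestep as in the example), the spike train averages out to the same 100Hz as the rate output, but any individual timestep is either 0 or 1000.

```python
DT = 0.001  # 1 ms timestep

n_steps = 20
rate_output = [100.0] * n_steps
# One spike of magnitude 1/dt on every 10th timestep.
spike_output = [1.0 / DT if (i + 1) % 10 == 0 else 0.0 for i in range(n_steps)]

mean_rate = sum(rate_output) / n_steps
mean_spike = sum(spike_output) / n_steps
spike_count = sum(1 for s in spike_output if s > 0)

# Largest per-timestep deviation from the mean: the "noise" in each signal.
rate_deviation = max(abs(r - mean_rate) for r in rate_output)
spike_deviation = max(abs(s - mean_spike) for s in spike_output)

print(mean_rate, mean_spike, spike_count)
print(rate_deviation, spike_deviation)
```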

It depends on how the neuron dynamics are implemented. For a neuron like the LIF neuron, there is a refractory period where the neuron doesn’t spike at all. More likely, what would happen is that the neuron spikes at the 10th timestep, and then spikes again at the 19th timestep (instead of the 20th timestep).

This is what I meant by the effective spike rate being just slightly over 100Hz: the effective spike rate of the spiking LIF neuron would be 100Hz, then ~111Hz, then back to 100Hz. Compared to the rate network, though, the 200Hz jump doesn’t carry through to the spike output.

It depends on how you configure your NengoDL network. Typically, all of the neuron parameters will change. With respect to neuron parameters, there are really only two that define the behaviour of a (rate) neuron: the bias and the gain. Biases, as you mentioned, are separate inputs to the neuron and can be trained. Gains are typically rolled into the connection weights, so when the connection weights are changed, so too are the gains. Spiking neurons have additional parameters like the refractory time constant or the RC time constant, but those are not trainable.

Note that NengoDL operates somewhat differently from Nengo (core). In Nengo core, it is possible to get access to things like the gains and biases post-learning, since the learning rules implemented in Nengo don’t typically change those parameters. In NengoDL, however, TensorFlow is used to train these networks, and the gains and connection weights (encoders + decoders) are combined to form the TensorFlow equivalents before training. After training, it is therefore possible that these individual components (encoders, decoders, gains) become inseparable from the overall connection weight values.

Dear @xchoo, thank you for your prompt reply and for answering my questions so patiently. I still have some confusion that I would like to clear up.

I.e., 1 spike every 10 timesteps (note that the spike is 1000 in magnitude since it’s 1/dt). If you compare the rate and spike output, you can easily come to the conclusion that the spike output is much noisier than the rate network output.

I will try to explain my understanding with the following example. Suppose we have a rate neuron with the following rates, with a timestep of 1ms:

Rate neuron: 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 200, 100, 100, 100, 100, 100, 100, 100, 100, 100 …

Assuming no refractory period, the spiking equivalent would be:

Spike neuron: 0, 0, 0, 0, 0, 0, 0, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 0, 0, 1000, 0
Right?

When we are looking at the rate neuron with 100Hz, it will generate a spike every 10ms. So in the above example, the first spike will be at 10ms. Since the rate is 200Hz at the 11th timestep, the second spike will now be at 15ms. At 16ms the rate is 100Hz again, so the next spike will be generated at 25ms, as shown in the figure below.

Is my understanding correct?

Now the spiking equivalent neuron. My understanding here is that 1/dt represents a threshold, assuming that there is no refractory period. Right?

So the neuron accumulates for the first 9 steps, and when it reaches 1000 at the 10th timestep it spikes. At the 11th timestep the rate is 200Hz, and then 100Hz again, so 1000 is next reached at the 19th timestep, and hence it spikes. Then it spikes again at the 29th step, and so on…

This is what my understanding is about rate neurons and spiking neurons.

In NengoDL, however, since TensorFlow is used to train these networks, and since the gains and connection weights (encoders + decoders) are combined to form the TensorFlow equivalents before training, after training, it is possible that these individual components (encoders, decoders, gains) become inseparable from the overall connections weight value.

So this means that we can represent the behavior of the neuron (with constant refractory and RC time constants) by the connection weights and biases only? What I mean is that, in this case, I would only need the biases and connection weights to map it onto an FPGA.

Another question is about training. When we train in nengo_dl, does the neuron behave like a conventional artificial neural network neuron during the training process, or is it a rate-based neuron?

Thank you very much for your patience :grimacing: and for answering my questions.

That is correct.

Ah… Here is where you go wrong. Rate neurons are so-called “rate” neurons because the output of the neuron is the current activity (firing rate) of the neuron. They don’t produce any spikes at all. Thus, if the rate neuron has an output firing rate of a constant 100Hz, the output of the neuron will be 100 in each timestep.

Spiking neurons, on the other hand, spike at the frequency determined by the activity of the neuron. Thus, if the output firing rate of a spiking neuron is 100Hz, they will spike every 10ms. Here’s a plot comparing the outputs of two identically configured neurons, one rate and one spiking:

In the plot, we see that, as expected, the rate neuron has a constant output of 100, whereas the spiking neuron spikes every 10 timesteps (the dt is 1ms). The inputs to both neurons are identical. We also note that each spike in the spiking neuron output is 1/dt (which is expected). I should also note that these plots were generated in Nengo, and in Nengo, the data starts at t=dt, so the 5th timestep (for example) is at 0.006s and not 0.005s.

Now, let’s see what happens if the input is modified so that at the 11th timestep, the input value is increased so that the rate neuron should output 200 instead of 100. For the rate neuron, the change in the output should be immediate, since the rate neuron has no internal dynamics (i.e., no membrane voltages to calculate and propagate). For the spiking neuron, however, the sudden increase in the input will affect the output, but it takes time for the membrane voltage to accumulate and generate a spike. So, instead of an instantaneous change in the output, the second spike is generated at timestep 19 instead of 20.

If we were to calculate the effective firing rate of the spiking neuron, we see that for the first bit of the simulation, it’s 100Hz (spiking every 10ms). But for the second bit of the simulation, it’s ~111.11Hz (a spike after 9ms instead of 10ms). Compared to the rate neuron output, however, we see that the sudden jump in the input values is all but lost for the spiking neuron (only increased by ~11Hz, whereas the rate neuron increased by 100Hz). This is what I meant when I said that the 200Hz jump seen in the rate neuron output all but disappears in the spiking case.
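The spike timing in this example can be reproduced with an idealised accumulator. This is a sketch under simplifying assumptions (no leak, no refractory period, and the accumulated amount is proportional to the rate), so it is not the LIF model used for the plots; the function name is made up for the example.

```python
def spike_steps(rates_hz, dt_ms=1):
    """Return the 1-based timesteps at which a non-leaky integrator spikes.

    Each timestep adds rate * dt to an accumulator. A spike is emitted
    once 1000 Hz*ms (i.e. rate * dt summing to 1) has built up, and that
    amount is then removed. Integers avoid rounding at the threshold.
    """
    threshold = 1000
    accumulated = 0
    steps = []
    for step, rate in enumerate(rates_hz, start=1):
        accumulated += rate * dt_ms
        if accumulated >= threshold:
            steps.append(step)
            accumulated -= threshold
    return steps


rates = [100] * 10 + [200] + [100] * 19   # 200 Hz at the 11th timestep
steps = spike_steps(rates)
print(steps)  # [10, 19, 29]

# Effective firing rate between consecutive spikes, in Hz.
intervals_ms = [b - a for a, b in zip([0] + steps, steps)]
effective_rates = [1000 / interval for interval in intervals_ms]
```

The intervals come out as 10ms, 9ms, and 10ms, i.e. 100Hz, ~111Hz, and 100Hz.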

If you want to experiment with this example, here’s the code I used to generate the plots. You can change the output of input_func to see how it affects the outputs of the neurons. There are some comments in the code which you should read carefully. They tell you why certain values are set the way they are.
test_rate_spike_compare.py (3.5 KB)

This depends on the type of neuron used. For the LIF and ReLU neurons, this is the case (assuming the \tau_{ref} and \tau_{RC} values are the same on the FPGA as they are in the Nengo model). However, since the \tau_{ref} and \tau_{RC} values can be changed by the user, it’s typically better to transfer those values as well to the FPGA.

In NengoDL, the training process uses the rate-based version of whatever neuron type you have configured your network to use. In some machine learning software (e.g., TensorFlow), the “conventional artificial neural network neuron” is the rate version of the linear neuron. There is no equivalent neuron type in Nengo, but the nengo.RectifiedLinear() neuron type closely approximates it (the activation of the linear neuron extends below 0Hz, whereas the rectified linear neuron stops at 0Hz).
Other machine learning software uses ReLU neurons (identical to the nengo.RectifiedLinear() neuron), while others use TanH or Sigmoid neurons (both of these neuron types are also available in Nengo).

@xchoo Thank you very much for taking the time to answer my questions, and thank you for the script. Although I understand the concept behind one spike, things get a bit confusing when the rate changes multiple times.

For instance, extending the example in your script, when I hold the rate at 200Hz for 2ms, the second spike in the spiking domain occurs at around 19ms, as shown in the figure below.

Now when I leave a 1ms gap between the two rate changes, the behavior in the spiking domain is the same, as shown below.

Similarly, if the rate changes are at 11ms and 18ms, the behavior in the spiking domain remains the same, as shown below.

Now, in the case where the rate neuron has three rate changes, the second spike occurs at 18ms.

What I observed is that, regardless of where the rate changes occur, every change in rate causes a 1ms step (the spike occurs 1ms earlier) in the spiking domain.

There must be some simple mathematical concept behind it that I fail to grasp. Can you please explain this behavior?

Thank you very much once again for answering my questions :slight_smile: and, importantly, for your patience. :grimacing:

If you look at the code for the rate LIF neuron, you’ll see that the output firing rate of the neuron is purely a function of the input current. That is to say, at every timestep, if you know what the input current to the neuron is, you can plug it into the mathematical formula to give you the output spike rate. This is analogous to having a short water pipe (facing down). If you pour 100ml into the pipe, it immediately exits the other end of the pipe. If you hook the pipe up to a faucet, the amount of water exiting the pipe will change as fast as you open or close the faucet.
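For reference, here is a sketch of that formula as the standard steady-state LIF rate equation, with the firing threshold normalised to 1. The default values `tau_rc=0.02` and `tau_ref=0.002` and the function names are assumptions for this example; check them against the neuron configuration you actually use.

```python
import math


def lif_rate(current, tau_rc=0.02, tau_ref=0.002):
    """Steady-state firing rate (Hz) of an LIF neuron for a constant current."""
    if current <= 1.0:
        return 0.0   # at or below threshold: the neuron never fires
    return 1.0 / (tau_ref + tau_rc * math.log1p(1.0 / (current - 1.0)))


def current_for_rate(rate, tau_rc=0.02, tau_ref=0.002):
    """Invert lif_rate: the constant current that produces the given rate.

    Only valid for 0 < rate < 1 / tau_ref.
    """
    return 1.0 / (1.0 - math.exp((tau_ref - 1.0 / rate) / tau_rc))


j_100 = current_for_rate(100.0)
j_200 = current_for_rate(200.0)
print(j_100, j_200)
```

Note that, with these parameters, doubling the firing rate takes more than double the input current, because the mapping from current to rate is nonlinear.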

If you look at the spiking LIF neuron, on the other hand, you’ll see that the way Nengo determines if the neuron has spiked or not depends on the voltage (membrane voltage) of the neuron. If you follow the code, this membrane voltage is updated every timestep, but a spike is only generated when the membrane voltage crosses a specific threshold (the spiking threshold). What this means is that there is no direct mapping between the input current and the output spike train. Rather, the input current causes the voltage to accumulate (i.e., integrate), and only when the voltage crosses the spike threshold is a spike generated.
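The update described above can be sketched for a single neuron in plain Python. This is a simplified illustration written for this thread, not a copy of Nengo’s source: the parameter defaults, the threshold of 1, and the way the spike time inside a timestep is estimated are all assumptions of the sketch.

```python
import math


def simulate_lif(currents, dt=0.001, tau_rc=0.02, tau_ref=0.002):
    """Simulate one spiking LIF neuron; return the spike output per timestep."""
    voltage = 0.0
    refractory_time = 0.0
    output = []
    for current in currents:
        # Only integrate for the part of the timestep that lies outside
        # the refractory period.
        refractory_time -= dt
        delta_t = min(max(dt - refractory_time, 0.0), dt)

        # Leaky integration: the voltage relaxes towards the input current.
        voltage -= (current - voltage) * math.expm1(-delta_t / tau_rc)

        if voltage > 1.0:
            # Estimate when within the timestep the threshold was crossed.
            t_spike = dt + tau_rc * math.log1p(
                -(voltage - 1.0) / (current - 1.0)
            )
            voltage = 0.0
            refractory_time = tau_ref + t_spike
            output.append(1.0 / dt)     # a spike has magnitude 1/dt
        else:
            voltage = max(voltage, 0.0)
            output.append(0.0)
    return output


# Constant current that gives a 100 Hz steady-state rate with these defaults.
current = 1.0 / (1.0 - math.exp((0.002 - 1.0 / 100.0) / 0.02))
spikes = simulate_lif([current] * 1000)   # 1 second of simulated time
n_spikes = sum(1 for s in spikes if s > 0)
print(n_spikes)  # close to 100
```

The output at each timestep depends on the accumulated `voltage` and `refractory_time`, not only on the current at that timestep.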

The spiking neuron activation function is analogous to having a bucket (or something like this fountain), rather than a short pipe. In this analogy, when the bucket overflows, a spike is considered to be “generated” (for the tipping fountain, when the bamboo thing tips over, that’s when a spike is generated). For a bucket, if you put 100ml into the bucket, it doesn’t necessarily overflow. In the examples above, where it takes 10 timesteps to generate a spike, it’s equivalent to putting 100ml into the 1L bucket at every timestep. After 10 timesteps, the amount of water in the bucket reaches 1L, and overflows, so a spike is generated.

We can use the pipe / bucket analogy to see how the rate and spike neurons will react to quick changes in the input current. For the rate neuron (pipe), a quick change to the input flow of water will result in an equally quick change in the output flow of water. But, for the spiking neuron (bucket), changing the input flow of water merely changes how fast the bucket fills. Thus, doing 200+200+100+100+100 ml of water (in 5 timesteps) is roughly equivalent to doing 200+100+100+100+200 ml of water (in 5 timesteps) (see the note about LIF neurons at the end). That is why in your plots above, there is little difference between the first few plots. It’s only when you introduce a third 200Hz input that it impacts the output firing rate (because an additional 200ml input will cause the bucket to fill faster).
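The bucket analogy can be written out as a tiny sketch. The function name, the 1L capacity, and the trailing 100ml inflows are taken from or made up around the analogy; the bucket here has no leak.

```python
def steps_to_overflow(inflows_ml, capacity_ml=1000):
    """Return the 1-based timestep at which a non-leaky bucket first fills."""
    level = 0
    for step, amount in enumerate(inflows_ml, start=1):
        level += amount
        if level >= capacity_ml:
            return step
    return None  # the bucket never filled


tail = [100] * 10   # steady 100 ml per timestep afterwards
early_burst = steps_to_overflow([200, 200, 100, 100, 100] + tail)
split_burst = steps_to_overflow([200, 100, 100, 100, 200] + tail)
three_bursts = steps_to_overflow([200, 200, 200, 100, 100] + tail)

print(early_burst, split_burst, three_bursts)  # 8 8 7
```

Moving the second 200ml pour around does not change when the bucket fills; only adding a third one does.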

A Note About LIF Neurons:
If you examine the name “LIF”, it stands for: Leaky-Integrate-and-Fire. The “integrate” part I’ve already explained, where the neuron integrates the input current (fills the bucket). The “fire” part I’ve also explained, where the neuron only fires when the membrane voltage reaches a specific value (the bucket overflows). The “leaky” part I haven’t yet explained. In the LIF neuron model, current actually slowly leaks out of the neuron. This is like having a small hole in the bucket. When you stop feeding the neuron with input current, the membrane voltage will slowly decay to 0. Likewise, if you stop putting water into a bucket that has a small hole in it, the water in the bucket will eventually slowly all leak out.

What this means is that there will be a difference in behaviour if you change when the input current is fed to the neuron. In the example plots you posted above, there actually should be a small difference in the case where you have 200Hz for timesteps 1+2 compared to 200Hz for timesteps 1+8. But, because the neuron is firing fairly fast compared to the dt of the simulation, this difference becomes lost (the dt is too large to resolve this small difference in spike times). If you reduce the firing rate of the neuron to 10Hz, or if you make the dt a smaller value (e.g., 10ns), you should see the difference pop up.
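The effect of the leak can be seen in a small sketch of a leaky integrator without a threshold. The input values, `dt`, and `tau_rc` are arbitrary choices for illustration. The same total input arrives in both cases, but input that arrives earlier has had more time to leak away, so the final voltages differ slightly.

```python
import math


def leaky_level(inputs, dt=0.001, tau_rc=0.02):
    """Voltage of a leaky integrator (no threshold) after a sequence of inputs."""
    voltage = 0.0
    for current in inputs:
        # The voltage relaxes towards the input; older input has leaked more.
        voltage += (current - voltage) * (1.0 - math.exp(-dt / tau_rc))
    return voltage


# Same inputs in a different order: the extra input arrives early or late.
early = leaky_level([2.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
late = leaky_level([2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0])
difference = late - early

print(early, late, difference)
```

The difference is small compared to the voltage itself, which is consistent with it being hidden by a coarse dt.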