[Nengo Loihi]: Computable functions and Spiking Behaviour

No. The architecture of the Loihi board is not the same as a traditional CPU’s, so I hesitate to use the word “CPU” to describe it. By “Loihi board”, I mean a collection of Loihi chips. Each chip contains several neurocores, which hold the circuitry for computing the neuron activation functions, as well as any connection weights (matrix operations) needed. Each chip also contains a few Lakemont CPUs, which are in charge of handling I/O and timing signals. If you want to know more about the architecture of the Loihi chip and Loihi board, I recommend posting on the INRC forums.

If you consider just execution time, then yes, it will be more “efficient” to perform the AveragePooling operation on a GPU. However, if you are considering power, then no, it will probably still be more efficient to perform the AveragePooling on the Loihi chip. This assumes a batch size of 1, though. For larger batch sizes, the GPU becomes more efficient if you amortize the power across the size of the batch.

We discuss the relationship between energy use and batch sizes in this paper (although you should note that this is for one specific application, so it may not necessarily generalize to others).

If I recall correctly, the matrix multiplication operations are done on a per spike basis within the Loihi neurocores, but I’ll have to double check that.

Essentially, yes. The on-chip neurons are just a specially tuned nengo.Ensemble, and they function the same as regular ensembles.

Sorry for being a bit persistent here… but with respect to the following,

does it mean that the AveragePooling op is done by neurocores without the generation of spikes of course?

I guess the above relates to the question I asked. Please let me know accordingly, and also what “per spike basis” means.

And thanks for the paper… that should definitely give me some context about the energy usage and related topics :slight_smile: .

That is correct. The average pooling operation can be performed purely in the connection weights, and does not require any additional neural ensembles (i.e., no additional spike generation) to do it.
For example, if you had an ensemble A with 4 neurons and an ensemble B with 2 neurons, and you wanted to compute a (2x1) average pooling, the connection weights between the two ensembles would look like this:
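A minimal sketch of that setup (the names here are illustrative, not from the original post): each row of the transform averages two of A’s neurons, so the pooling is done entirely by the connection weights.

import numpy as np
import nengo

with nengo.Network() as net:
    # Ensemble A (4 neurons) average-pooled down to ensemble B (2 neurons)
    # purely through the connection weights.
    ens_a = nengo.Ensemble(n_neurons=4, dimensions=1)
    ens_b = nengo.Ensemble(n_neurons=2, dimensions=1)

    # Each neuron in B receives the average of two neurons in A (weight 0.5).
    pool_weights = np.array([[0.5, 0.5, 0.0, 0.0],
                             [0.0, 0.0, 0.5, 0.5]])
    nengo.Connection(ens_a.neurons, ens_b.neurons, transform=pool_weights)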

You’ll notice that no additional ensembles are needed between ensemble A and B to compute the average pooling operation.

In the Loihi board / chip, whenever a neuron spikes, a “network” packet is generated and sent from the neurocore simulating the source neuron to the neurocore simulating the destination neuron. When the packet arrives at the destination neurocore, several computations are made, one of which is the connection weight computation. Since connection weights connect individual neurons, there is only one connection weight between any two neurons, and it only needs to be applied when a spike is transmitted between those two neurons. Thus, the connection weight matrix operations are only computed on a per spike basis.

Hello @xchoo, thanks for resolving the doubt about the computational substrate for the AveragePooling op. I suppose this execution of the sum of weighted inputs in neurocores holds true for any nengo.Convolution() operation. Just confirming… in your posted network above, the output from Ensemble A is the filtered (i.e., synapsed) output, which is multiplied by 0.5 and then serves as the input x to Ensemble B, where the current J is calculated, followed by the voltage update in Ensemble B’s neurons, leading to a spike (if the voltage reaches threshold); the spike is then further synapsed before being output from Ensemble B, and thus the AveragePooling op continues if it exists after the Ensemble B layer. Right?

Also, with respect to my previous comment on this thread, I guess the following was missed.

In the above question, one doubt now stands resolved, i.e., the Convolution operations are performed on neurocores (on the Loihi board, of course). What about synaptic filtering?

Also, let’s say I have to calculate the inter-spike interval. This can perhaps be done in a nengo.Node(), as it is supposed to hold arbitrary Python code. The nengo.Node() would then output the inter-spike interval of the selected neurons, so I guess it would be considered a non-passthrough node, and hence executed on the host PC… right? If it is considered a non-passthrough node, is there a way to make it passthrough? I strongly anticipate there should be a way, as spikes are inherent to the Loihi board and calculating inter-spike intervals at each time-step should be trivial (perhaps there is even a Loihi API exposed for it?). One more question: can we change the weights of the matrices/kernels in nengo.Convolution() every time-step? Sorry for asking these many questions on this thread… :grin:

Yes. Although, the synaptic filtering computation is done on the input to a neuron, not on the output from a neuron (the output of a neuron is just a spike).

If I recall correctly, the synaptic filtering is performed as part of the connection weight computation on the destination neuron’s neurocore (or just before the neurocore… I can’t remember what Intel denotes to be part of / outside the neurocore – regardless, it’s still on the Loihi chip).

That is correct. If you are performing custom python computations in a nengo.Node, that will be performed on the PC.

I don’t believe there is currently a method by which you can measure the firing rates of neurons on the board itself, as this would require measurement hardware on the Loihi chip, which I don’t think is present. However, what you could do is use NengoLoihi’s Loihi “emulation” mode to compute the inter-spike intervals (ISIs) on your PC (the emulator runs entirely on the PC, but the firing rates of the neurons should be identical to running on the Loihi board). Then, to run it on the board, simply disable your ISI nengo.Node.
To use the NengoLoihi emulation mode, do

with nengo_loihi.Simulator(target="sim") as sim:

instead of

with nengo_loihi.Simulator(target="loihi") as sim:

Yes, but if you want to do this, you’ll need to use an online learning rule (probably a custom one?) to achieve it. The online learning mechanisms are the only things on the Loihi board that are able to change connection weights as the simulation progresses. They do come at a performance cost though, and I’ll have to check with the NengoLoihi devs, as there are some restrictions you need to adhere to regarding online learning rules. Otherwise, you’ll run into the issue where parts of the rule run on your PC, and parts of it on the Loihi board.
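For context, here is a minimal sketch of what an online learning rule looks like in core Nengo (PES here; the setup is illustrative, and whether a particular rule runs fully on the Loihi board is exactly the kind of restriction mentioned above):

import nengo

with nengo.Network() as net:
    stim = nengo.Node(lambda t: 0.5)
    pre = nengo.Ensemble(n_neurons=50, dimensions=1)
    post = nengo.Ensemble(n_neurons=50, dimensions=1)
    error = nengo.Node(size_in=1)

    nengo.Connection(stim, pre)
    # The PES rule modifies this connection's weights every timestep.
    conn = nengo.Connection(pre, post, learning_rule_type=nengo.PES(learning_rate=1e-4))

    # Error signal (actual - target) drives the weight updates.
    nengo.Connection(post, error)
    nengo.Connection(stim, error, transform=-1)
    nengo.Connection(error, conn.learning_rule)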

With respect to the following:

Actually, I need to calculate the ISI during the inference phase in Nengo-DL (with the possible option of deploying the network on the Loihi board), hence I was looking for a direct way to calculate it. The ISIs would obviously change based on the input, and BTW… I don’t need the ISI during the training phase or so. Next, I wanted to modify the nengo.Convolution() kernels during the inference phase as well. Since this would involve modifying the layers of the converted network in Nengo-DL, I guess I should open another topic focused on it. Anyways, please do let me know your thoughts about calculating the ISI during the inference phase, with a focus on on-board computation (if there exists any way).

I spoke to the NengoLoihi devs, and gave it some more thought. If you just want to compute the ISI of neurons in your network, then all you need to do is either probe the neurons (nengo.Probe(ens.neurons)) or connect the .neurons object to a Node that computes the ISI. It is true that the node will run on your PC, but as long as you don’t need to feed the output of that ISI node back into the network, it should not slow down your network simulation too much.

The issue with off-chip Node computation is really when you have the output of the node feed back into the network. Then you incur I/O penalties going off-chip (which you already do if you are probing outputs from your network), and additional penalties for feeding the value back to the chip.

This is what the NengoLoihi dev sent me, for context:

Yes, you can probe the .neurons attribute of an ensemble just like you can in Nengo core to get the output spike trains. You can then calculate rate or ISI or whatever.

Note that when precompute=True, we use NxSDK’s spike probes, which are faster but have limits on the number you can have (the limits are per-chip, I think; something like 2048 per chip). With precompute=False, we probe them manually within a custom SNIP, which is a bit slower, but it doesn’t have the same limits (though there are still some limits based on the amount of memory available to the SNIP).
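For reference, precompute is a flag on the NengoLoihi simulator, so (assuming net is the network built above) switching between the two probing mechanisms described in that quote would look something like:

# NxSDK spike probes (faster, but per-chip limits on probe count):
with nengo_loihi.Simulator(net, target="loihi", precompute=True) as sim:
    sim.run(1.0)

# Probing via a custom SNIP (slower, limited by SNIP memory instead):
with nengo_loihi.Simulator(net, target="loihi", precompute=False) as sim:
    sim.run(1.0)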

Hello @xchoo, thanks for giving it some more thought. Below is a custom script I wrote for the calculation of the ISI (I haven’t implemented the exact function yet, but it should be straightforward, given that I am able to access the spikes (with amplitude 1/dt) in my _get_isi()). My goal is to calculate the ISI for more than one input scalar (i.e., more than one inp_node1).

import nengo
from nengo.utils.ensemble import tuning_curves

SEED = 89
x1, x2 = 0.9, 0.1

def _get_isi(t, x):
  print(x)
  #return x[0], x[1]

with nengo.Network(seed=SEED) as net:
  # Create Input Nodes.
  inp_node1 = nengo.Node(x1)
  inp_node2 = nengo.Node(x2)

  # Create 2 Ensembles.
  ens1 = nengo.Ensemble(n_neurons=1, dimensions=1, seed=SEED, radius=1)
  ens2 = nengo.Ensemble(n_neurons=1, dimensions=1, seed=SEED, radius=1)

  # Create the ISI Node.
  isi_node = nengo.Node(output=_get_isi, size_in=1)

  # Connect the Input nodes to the Input ensembles.
  nengo.Connection(inp_node1, ens1, synapse=None) # Default synapse is Lowpass(0.005)
  nengo.Connection(inp_node2, ens2, synapse=None) # Default synapse is Lowpass(0.005)

  # Connect the Input ensembles to the ISI node.
  nengo.Connection(ens1.neurons, isi_node, synapse=None)
  nengo.Connection(ens2.neurons, isi_node, synapse=None)

  # Check the representation of inputs.
  probe1 = nengo.Probe(ens1, synapse=nengo.Lowpass(0.005))
  probe2 = nengo.Probe(ens2, synapse=nengo.Lowpass(0.005))

  # Check the spiking pattern.
  spikes1 = nengo.Probe(ens1.neurons) # Default synapse is None
  spikes2 = nengo.Probe(ens2.neurons) # Default synapse is None

with nengo.Simulator(net) as sim:
  sim.run(1)
  vctr_points_1, activities1 = tuning_curves(ens1, sim)
  vctr_points_2, activities2 = tuning_curves(ens2, sim)

It runs error-free when size_in=1, i.e., when I access spikes for only one input, although there are two connections made to the isi_node:

# Connect the Input ensembles to the ISI node.
nengo.Connection(ens1.neurons, isi_node, synapse=None)
nengo.Connection(ens2.neurons, isi_node, synapse=None)

which I thought should throw an error due to size_in=1, but no… no error is thrown. However, when I set size_in=2 to access the spikes for both inputs (i.e., inp_node1 and inp_node2) in the function _get_isi(), it throws this error: ValidationError: Connection.transform: Transform output size (1) not equal to connection output size (2). I looked through a few scant examples of creating and using a Node with size_in=2 (or more), and found that one needs to pass transform=np.eye(2) in the nengo.Connection() when creating a connection between the ensemble and the isi_node, but that gave me the impression of creating a multi-dimensional ensemble – which I guess is not useful for me, as I want to record the ISI from each individual neuron (as done in Nengo-DL, e.g., where the vector input is presented to the first Conv/Dense layer for n_steps and each neuron in the Ensemble represents one scalar element of the vector input). Can you please help me with recording the ISI from individual neurons, each representing a scalar?

I have a few more questions though, which are sort of related to my question here. In the following tuning curve plots:

plt.plot(vctr_points_1, activities1)

[image: tuning curve of ens1’s neuron]

plt.plot(vctr_points_2, activities2)

[image: tuning curve of ens2’s neuron]

you can see that both neurons have the same curve, and that’s because of the same seed value. I see that a scalar value of 0.7 or more will lead both neurons to spike, and that is indeed the case, as can be seen below (filtered output for the scalar 0.9):

plt.plot(sim.data[probe1])

[image: filtered output plot for the scalar 0.9]

However, for the scalar input of 0.1, the neuron is not supposed to spike as per the tuning curve plot, and that is indeed reflected in the filtered output plot below.

plt.plot(sim.data[probe2])

[image: filtered output plot for the scalar 0.1]

Now, since the radius is set to one, I am supposed to input values in the range [-1, 1] only, but as can be seen above, it’s not guaranteed that a single neuron will spike for an input in that range (a group of neurons will). So how is it done in Nengo-DL, where one neuron represents each scalar input (either direct inputs or ones calculated in the network, e.g., convolved ones)? Please let me know this too.

In Nengo, almost all signals are arrays (NumPy arrays). When you connect to a nengo.Node, the size_in parameter determines the dimensionality of the signal being fed into the node. When you do something like nengo.Node(size_in=2), Nengo creates a Node that expects a 2-dimensional array every timestep.

When you connect to / from a .neurons object, Nengo expects / returns an array whose dimensionality is the same as the number of neurons. For every timestep, the value in each dimension of the array denotes either the input to the corresponding neuron (connecting to .neurons) or whether or not that neuron has spiked (connecting from .neurons).

That is correct, no error is thrown. This is because you are connecting 1D arrays (the output of the .neurons object of an Ensemble with 1 neuron) to the input of the node. Furthermore, by doing this:

# Connect the Input ensembles to the ISI node.
nengo.Connection(ens1.neurons, isi_node, synapse=None)
nengo.Connection(ens2.neurons, isi_node, synapse=None)

what you are actually doing is summing the 1D outputs of the two ensembles together before feeding them to the ISI node. If you set the input x2 to be equal to x1, this will become apparent.

For your specific use case, if you want to connect the .neurons output of different ensembles to the same node, you’ll have to do something like this:

isi_node = nengo.Node(output=_get_isi, size_in=2) # Size in == total number of neurons you want to probe
nengo.Connection(ens1.neurons, isi_node[:ens1.n_neurons], synapse=None)
nengo.Connection(ens2.neurons, isi_node[ens1.n_neurons:], synapse=None)

Here, we use the array slicing feature of connections to connect the output of ens1 to the first ens1.n_neurons dimensions of the isi_node input, and likewise for the neurons output of ens2.

The neurons used in NengoDL networks are typically ReLU or spiking ReLU neurons. Unlike LIF neurons, ReLU neurons have an activation function that is just a ramp, which makes it easier to use a single neuron to represent scalar values. For networks that use LIF neurons (and this applies to networks that use ReLU neurons as well), the training process tunes the connection weights (which affect the gains and biases, and by extension the intercepts and max_rates, of the neurons) such that, for the given problem, the neuron representation in that range is most suitable. This is how single neurons are able to represent scalar values in NengoDL / TF-trained models (i.e., because they are highly tuned).
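To make the ramp-vs-LIF distinction concrete, here is a small sketch (with illustrative gain and bias values) comparing the two rate responses using Nengo’s neuron models:

import numpy as np
import nengo

x = np.linspace(-1, 1, 201)
gain, bias = np.array([100.0]), np.array([0.0])

# ReLU: the rate response is a pure ramp, rate = max(0, gain * x + bias).
relu_rates = nengo.RectifiedLinear().rates(x, gain, bias)

# LIF: the rate response is a nonlinear, saturating curve.
lif_rates = nengo.LIF().rates(x, gain, bias)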

In contrast, when you create a random ensemble in Nengo, and connect them to each other using encoders / decoders (i.e., the NEF), the entire ensemble is “generally” tuned (via the decoders) to represent a range of values (default is -1 to 1).

This resolves my doubt. Thanks! I can see the summed-up spike values (i.e., 2/dt) when x2 = x1 = 0.9. Upon using the array slicing feature on Nodes, I can see that I can access the spiking activity of both ensembles (1 neuron each). However, I attempted to calculate the ISI for a single neuron in the meantime, and my following code fails with the error: UnboundLocalError: local variable 'prev_spk_time' referenced before assignment. It seems calculating the ISI is not as straightforward as I anticipated. Code:

def _get_isi(t, x):
  #print(x)
  int_t = int(t*1000.0)
  if prev_spk_time == None and x[int_t] == 1000.0:
    prev_spk_time = int_t
  elif x[int_t] == 1000.0:
    print("ISI: ", int_t - prev_spk_time)
    prev_spk_time = int_t

I have defined the variable prev_spk_time globally so that it should be accessible to the function _get_isi(). Now, the function _get_isi() is called/executed at every time-step, so any variable defined in its scope will get re-initialized again and again, thus losing any information stored at the previous time-step. Therefore, I tried using an external variable prev_spk_time to store the integer time-step of the previous spike (relative to the current activity), but the run fails with the above-mentioned error. Is it possible to compute the ISI within a Node?

With respect to the Nengo-DL representation of values, I actually had a premonition of such highly tuned neurons,

and the above gives me more info, although I do not know the exact maths behind it, i.e., the mathematical steps for how the TF-trained weights result in highly tuned values of gains and biases. I do see here and here that one can calculate the max_rates and intercepts from the gain and bias, and vice versa of course. Can you please point me to the code which has the mathematics of tuning these parameters from the TF-learned weights? A short example/explanation would also be very helpful.

The next thing on my mind is… if the neurons (be they SpikingReLU or LIF in the Nengo-DL domain) are highly tuned to represent a value, do they spike every time-step to approximate the value? Or is there some gap (ISI) in their spiking activity, e.g., would a neuron (with radius -1 to 1) representing 0.9 spike more often than a neuron representing the value 0.5? I mean… if the neuron spikes every time-step (for SpikingReLU of course) or every other possible time-step (once the refractory period is over, for LIF neurons), then the Loihi board will be active all the time, thus no sparsity and little/no power savings? I did check the spiking activity of a few randomly selected SpikingReLU neurons (after the TF-trained network was converted) and I did see some gaps between spikes in those random neurons, and I wanted to confirm this behaviour theoretically with you. Please let me know accordingly.

For normal Nengo ensembles (with NEF computation), I do see gaps between the spikes of each neuron when representing values of varied magnitude, although the approximation of “smaller magnitude” values with just one neuron is not very accurate. I guess this problem shouldn’t exist in Nengo-DL, due to the highly tuned neurons, as you explained.

This behaviour is consistent with Python’s handling of variable scopes. If you want to use a global variable to store an external value, you’ll need to declare the variable as global inside the function:

def my_func(t, x):
    global prev_spk_time
    ... # rest of code

You can also do things like use a mutable function parameter (e.g., a list), which persists across calls:

def my_func(t, x, prev_spk_time=[0]):
    if prev_spk_time[0] == None ...  # example use
    ... # rest of code
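Putting that together with your earlier function, a minimal sketch of a working ISI print-out (assuming a single neuron, the default dt = 0.001, and noting that x is indexed by neuron, so the spike value is x[0] rather than x[int_t]) would be:

prev_spk_time = None  # timestep of the previous spike, shared across calls

def _get_isi(t, x):
    global prev_spk_time
    int_t = int(t * 1000.0)   # current integer timestep (dt = 0.001)
    if x[0] == 1000.0:        # a spike this timestep (amplitude 1/dt)
        if prev_spk_time is not None:
            print("ISI: ", int_t - prev_spk_time)
        prev_spk_time = int_t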

When TF trains weights, it modifies the connection weights between individual layers, and between a bias layer and an activation layer. Since these operations are linear, modifying the connection weights between layers adds a multiplicative scaling term to a neuron’s gain. Likewise, modifying the connection weights between a bias layer and the neuron adds an additive term to the neuron’s bias. Changing the gain and bias of a neuron has the effect of modifying the neuron’s response curve. If you look at how the input currents are calculated for neurons (J = gain · x + bias), you’ll see how this affects the current being computed for the neuron.
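As a toy illustration of that point (all values here are made up), a trained weight w and bias b on a linear input stage can be folded into an equivalent gain and bias:

import numpy as np

gain, bias = 1.5, 0.2   # a neuron's original gain and bias
w, b = 2.0, -0.3        # hypothetical TF-trained weight and bias terms
x = 0.7                 # input value

J_trained = gain * (w * x + b) + bias          # current with trained weights
J_folded = (gain * w) * x + (gain * b + bias)  # same current, folded into gain/bias
assert np.isclose(J_trained, J_folded)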

No. The neurons will still spike at whatever firing rate is representative of the value being represented. Remember that the concept here is that a specific firing rate of a neuron maps onto a specific output value for that neuron, and the mapping is determined by the neuron’s activation function.

Hello @xchoo, thank you for pointing out the global declaration; silly me, I had completely forgotten about it.

About the following,

this makes things a little clearer to me. Take the example of a simple model with an InputLayer, followed by a Dense layer and then an Output layer; one spiking neuron in the Dense layer has weighted inputs from all the inputs in the InputLayer (i.e., a many-to-one connection for a single neuron in the Dense layer). Now, let’s talk about just that one particular spiking neuron. As per the architecture, it can receive a range of inputs from the InputLayer and will represent the weighted sum of those inputs, thus representing a range of weighted sums. According to you, the connection weights (between layers, and between the bias layer and activation layer) influence that single neuron’s gain and bias by multiplication by a scaling term (say term1 × gain) and addition of a scaling term (say term2 + bias) respectively. My question is: are these two terms term1 and term2 representative of all the individual connection weights from the InputLayer to that single spiking neuron in the Dense layer? If yes, how are these scaling terms term1 and term2 computed? If it involves a lot of details, please don’t bother mentioning all of them; just a confirmation of my first question will do for now.

Next, after the desired computation of a spiking neuron’s gain and bias, as you said, they will influence its response curve; so I guess… that particular spiking neuron is tuned to represent a range of weighted sums (thus indirectly representing the range of raw inputs from the InputLayer). Therefore, at the extremes, one particular weighted sum will result in maximal spiking of the neuron (say 200Hz) and say… another weighted sum will result in zero spiking of the neuron (or just above zero, which represents minimal spiking). Thus the spiking neuron is now tuned such that it can accurately represent any value in the weighted-sums range over a sufficiently long period of network simulation. Or in other words… in light of the following:

the filtered spike signal saturates to the expected representative value. Right?

Yes and no. Conceptually, yes, term1 and term2 would be representative of all of the individual connection weights. But in practice, the computation is done using a matrix multiplication, so these weights remain separate (i.e., individual elements in the connection weight matrix), and the matrix multiplication operation puts them together; no term1 or term2 value is calculated explicitly.
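A toy numerical sketch of that (values are made up): the trained weights live in the weight matrix, and the matrix multiplication combines them without any explicit term1 / term2 being computed:

import numpy as np

W = np.array([[0.2, -0.5, 0.1, 0.7]])  # trained weights, InputLayer -> 1 Dense neuron
b = np.array([0.3])                    # trained bias for that neuron
x = np.array([0.9, 0.1, -0.4, 0.5])    # activities of the 4 input neurons

J = W @ x + b  # input current; the matmul combines all individual weights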

Yes, that is correct.

Thanks @xchoo for the insights. I understand there’s more to the actual calculation involved in tuning the spiking neurons in Nengo-DL, but I will rest my case here for the time being with the abstract understanding obtained so far. Thank you very much for this informative discussion!

Hello @xchoo, just trying to confirm my interpretations of the following plots with you. These are obtained from a trained and converted Nengo-DL network and are the outputs of the first Conv layer. The converted network has a scale_firing_rates value of 250. Below is the code I used to collect spikes and the corresponding smoothed values from the neurons in the Ensemble corresponding to the first Conv layer. As discussed above, the spiking neurons are supposed to be highly tuned to represent the scalar values.

with nengo_model.net:
  nengo_dl.configure_settings(stateful=False)
  #################################################
  conv0_spikes = nengo.Probe(nengo_model.net.ensembles[0].neurons[:100])
  conv0_values = nengo.Probe(nengo_model.net.ensembles[0].neurons[:100], synapse=0.005)
  #################################################

where nengo_model is the converted model.

The first plot corresponds to the smoothed values from the spikes of 4 neurons.

plt.plot(sim_data[conv0_values][0][:,0:4])

[image: smoothed values of the 4 neurons]
and this second plot corresponds to the spikes of those same 4 neurons.

plt.plot(sim_data[conv0_spikes][0][:,0:4])

[image: spike trains of the same 4 neurons]
The spikes have an amplitude of 4 because they were scaled; otherwise, I believe they would have an amplitude of 1.
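For reference, that amplitude is consistent with the following arithmetic (assuming the default dt, and assuming the converter scales neuron amplitudes by 1 / scale_firing_rates, which is my understanding of how scale_firing_rates keeps the overall output magnitude unchanged):

dt = 0.001                 # default simulation timestep
scale_firing_rates = 250   # from the converter settings above

amplitude = 1.0 / scale_firing_rates  # converter-scaled spike amplitude
spike_height = amplitude / dt         # = 4, matching the plotted spikes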

From the first plot, it seems that even though the neurons are trying to best represent the scalar value, they actually fluctuate a lot around the value they are meant to represent. However, the ranges of the fluctuations vary, i.e., the orange neuron is trying to represent a higher value, and as can be seen in the second plot, the orange neuron is spiking faster than the others.

Overall, what I draw from both plots is that neurons which have to represent a higher value have to spike comparatively faster than the others. This same story should hold for neurons in later/deeper Ensembles… right? Please let me know your thoughts as well.

That is correct. Typically, to represent a larger (scalar) value, you’ll have a higher neuron firing rate.

This behaviour should apply to all neurons in the network, regardless of where they are in the network.

Great! Thanks for confirming!

Just needed a quick confirmation with respect to the following picture:
[image: Node with four weighted inputs (transforms a, b, c, d) and two weighted outputs (transforms e, f)]

The above Node receives four inputs as shown. The inputs related to transform=a and transform=d come directly from other Nodes (which again just receive input from neurons and output the same input to two objects: one being the Node in question, and the other being an Ensemble). The inputs related to transform=b and transform=c come from an Ensemble, and the outputs related to transform=e and transform=f go to a Node (exactly the same kind of instance as this one) and an Ensemble.

Functionality-wise, the above Node simply sums the 4 inputs and outputs two transformed copies of the summed input. Is this a passthrough Node? From this source, it appears to be a passthrough node, so I am just confirming.

Another question, which I know might be a bit elusive to answer: will a network composed of such Nodes (assuming they are passthrough) and Ensembles be capable of running entirely on Loihi, where the input to the network is simply from Nodes which output exact copies of their input (i.e., no function computation) from previous Ensembles’ neurons?

Please let me know.

Yes. The node configuration you have in your picture would be considered a passthrough node.

In theory, if your network only contains ensembles and passthrough nodes, then the entire thing will run on Loihi, with only the network inputs and network probes as I/O to and from the Loihi board / chips. In practice however, this assumes that the number of neurons and the dimensionality of the connection weight matrices are within the limits of the Loihi system. If they do exceed the Loihi limits, though, I think NengoLoihi will warn you of this when you build the simulator object.

Thanks for confirming @xchoo, and yes, the number of neurons in such a Net is well below the limits of Loihi, and the transforms are simply scalars, so I should be good here too.