Yes, this is correct. I should clarify, though, that in Nengo, the “input weights” of the neurons consist of several components: any encoders that are used (only for NEF-based ensembles; this doesn’t apply to converter-based networks), the connection transformation, and the neuron’s gain. What is being modified by the scale_firing_rates parameter is just the neuron’s gain value, so if you were to probe the connection weights, you would not see the change reflected there.
This is correct. Since the ReLU neuron has a linear activation function, a linear increase in the neuron’s gain will result in a linear increase in the neuron’s max_rates. This does not hold true for neurons with non-linear activation functions.
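For reference, here is a minimal sketch of how this parameter is passed to the NengoDL converter; the Keras model is a hypothetical one-layer stand-in, purely for illustration:

```python
import nengo
import nengo_dl
import tensorflow as tf

# Hypothetical one-layer ReLU model, used only to illustrate the converter call
inp = tf.keras.Input(shape=(1,))
out = tf.keras.layers.Dense(units=1, activation=tf.nn.relu)(inp)
model = tf.keras.Model(inputs=inp, outputs=out)

# scale_firing_rates multiplies the converted neurons' gains (not the connection
# weights), and divides the spike amplitudes by the same factor to compensate
converter = nengo_dl.Converter(
    model,
    swap_activations={tf.nn.relu: nengo.SpikingRectifiedLinear()},
    scale_firing_rates=10,
)
```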
A rate network example
The scale_firing_rates parameter changes the performance of the network because of the temporal nature of spikes. To explain this, let us consider a simple two-neuron, two-layer network, with the connection weights indicated in parentheses:
inp --(*1)--> A --(*2)--> B ---> output
First, let us consider the network with rate neurons. If we set inp to 1, neuron A will receive an input of 1, and thus its output firing rate will be 1. This value (1) is then fed through the connection from A to B, resulting in an input of 2 to neuron B. Given an input of 2, the output firing rate of neuron B would be 2, resulting in a network output of 2.
Before we move on to the spiking neurons, let us now run the rate network over time, using the same constant input of 1. The expected outputs of each layer at each timestep would be (note: this assumes no propagation delay, to simplify the example):
t: dt 2dt 3dt ... ndt
inp: 1 1 1 1
A: 1 1 1 1
B: 2 2 2 2
Now, because the network is running over time, we need some way of evaluating the network output. One method for doing this is to average the network output over a window of time. Another method is to simply choose a specific time t (e.g., the last timestep of the simulation) to read the output. For rate neurons, both of these methods produce identical results.
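To make that concrete, here is a tiny NumPy sketch of the rate version of this network (the weights and input are the ones from the diagram above):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

n_steps = 20
inp = 1.0

# Rate network: every timestep each neuron simply outputs its firing rate
a_out = np.array([relu(1.0 * inp) for _ in range(n_steps)])  # neuron A
b_out = np.array([relu(2.0 * a) for a in a_out])             # neuron B

# Both evaluation methods agree for rate neurons
print(b_out.mean())  # window average -> 2.0
print(b_out[-1])     # last timestep  -> 2.0
```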
The spiking network example
Now, let us convert the neurons to spiking neurons. The neuron activation logic for the SpikingRectifiedLinear neuron can be found here, and I’ll be using it as a guide to demonstrate the behaviour of the spiking model. Note that the neuron activation logic allows for multiple spikes to occur in one timestep.
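As a rough paraphrase (not the exact library code), the per-timestep update looks something like this:

```python
import numpy as np

def spiking_relu_step(J, voltage, dt, amplitude=1.0):
    """Simplified sketch of SpikingRectifiedLinear's per-timestep logic.

    J         -- input current (which equals the desired firing rate for ReLU neurons)
    voltage   -- accumulator state carried between timesteps
    amplitude -- spike scaling (the converter sets this to 1/scale_firing_rates)
    """
    voltage = voltage + np.maximum(J, 0) * dt  # integrate the rate over this timestep
    n_spikes = np.floor(voltage)               # can be more than one spike per timestep
    output = n_spikes * amplitude / dt         # each spike carries a value of amplitude/dt
    voltage = voltage - n_spikes               # keep the leftover fraction for next time
    return output, voltage
```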
So, let’s run through the simulation. For an input of 1, neuron A should have a firing rate of 1Hz. This means that it should spike once every 1/dt timesteps. For this example, let’s take dt to be 0.1s, so it should spike every 10 timesteps:
t: dt 2dt 3dt ... 9dt 10dt 11dt
inp: 1 1 1 1 1 1
A: 0 0 0 0 10 0
From the output above, several things are evident:
- The neuron only outputs a value when it spikes. At every other time, the output is 0.
- The output spike has a value of 1/dt. This is to ensure that the total “energy” of the spiking neuron, averaged over time, is equivalent to the expected output value of 1 (for an input of 1). I.e., if you take that 1/dt spike and average it over 1/dt timesteps, you’ll get the same output as the rate neuron (see the quick check below).
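A quick check of that averaging claim, with dt = 0.1s as above:

```python
import numpy as np

dt = 0.1
a_spikes = np.array([0.0] * 9 + [1.0 / dt])  # one 1/dt spike every 10 timesteps
print(a_spikes.mean())                       # -> 1.0, same as the rate neuron's output
```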
Now comes the interesting part: what happens to neuron B? At t = 10dt, the input to B will be 20, and if we put that through the SpikingRectifiedLinear activation function, it will produce two spikes in that timestep. Assuming no propagation delay, the output of the network will look like this:
t: dt 2dt 3dt ... 9dt 10dt 11dt
inp: 1 1 1 1 1 1
A: 0 0 0 0 10 0
B: 0 0 0 0 20 0
Now, let’s try to evaluate the network output. If we were to average over a 10dt window, we’d get the “correct” output of 2. However, if we were to choose an arbitrary point in the simulation to evaluate the output, we would see that 9 times out of 10 the network output is 0, which is completely incorrect!
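To see this end to end, here is a sketch that wires two of the spiking_relu_step neurons from above into the inp --(*1)--> A --(*2)--> B network (again with no propagation delay). The scale_firing_rates argument mimics what the converter does by multiplying each neuron’s input by the scale and dividing the spike amplitude by it:

```python
def simulate(scale_firing_rates=1.0, n_steps=20, dt=0.1, inp=1.0):
    amplitude = 1.0 / scale_firing_rates  # compensate for the scaled-up gain
    v_a = v_b = 0.0
    a_out, b_out = [], []
    for _ in range(n_steps):
        # Neuron A: connection weight of 1, gain scaled by scale_firing_rates
        a, v_a = spiking_relu_step(1.0 * inp * scale_firing_rates, v_a, dt, amplitude)
        # Neuron B: connection weight of 2, driven by A's output on the same timestep
        b, v_b = spiking_relu_step(2.0 * a * scale_firing_rates, v_b, dt, amplitude)
        a_out.append(a)
        b_out.append(b)
    return a_out, b_out

a_out, b_out = simulate(scale_firing_rates=1.0, n_steps=1000)
print(sum(b_out) / len(b_out))  # window average, close to the "correct" output of 2
print(b_out[-1])                # a single arbitrary timestep is almost always 0
```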
The takeaway from this very simplistic example is that for spiking networks, information is only transmitted from layer to layer, and from layer to output when a spike happens. For more complex networks this can also have the effect of reducing the overall network accuracy because spikes may have to arrive at the same time for a neuron to produce an output spike.
The spiking network with scale_firing_rates
So, how does the scale_firing_rates parameter address this issue? For this example, let’s take the extreme approach of increasing the scale_firing_rates parameter to 10. As before, the input to neuron A is 1, but with a gain of 10, the neuron’s expected firing rate is now 10Hz (i.e., one spike every timestep). To compensate for the increased firing rate, we divide the amplitude of the output spike by scale_firing_rates to keep the neuron’s total “energy” the same. Since a spike has a default amplitude of 1/dt, the new spike amplitude would be 1/(10*dt) = 1.
This results in the following output:
t: dt 2dt 3dt ... 9dt 10dt 11dt
inp: 1 1 1 1 1 1
A: 1 1 1 1 1 1
Now, let’s look at neuron B. At each timestep, the neuron receives an input value of 2. With the scale_firing_rates value of 10, this increases the neuron input to 20. This means the expected neuron firing rate is 20Hz, i.e., two spikes every timestep, or rather, an unscaled output of 20 every timestep. As with neuron A, we divide the amplitude of the spikes by scale_firing_rates, which results in an output of 2 every timestep:
t: dt 2dt 3dt ... 9dt 10dt 11dt
inp: 1 1 1 1 1 1
A: 1 1 1 1 1 1
B: 2 2 2 2 2 2
Looking back, we recognize this as exactly the output of the rate network! Thus, by increasing the scale_firing_rates parameter, we’ve essentially replicated the behaviour of the rate network.
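Running the earlier simulate sketch with scale_firing_rates=10 shows the same thing:

```python
a_out, b_out = simulate(scale_firing_rates=10.0, n_steps=20)
print(a_out[:5])  # [1.0, 1.0, 1.0, 1.0, 1.0] -- same as the rate network's A
print(b_out[:5])  # [2.0, 2.0, 2.0, 2.0, 2.0] -- same as the rate network's output
```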
Of course, increasing the scale_firing_rates parameter to such a large value is the extreme case, and in general we don’t do this, because we lose the advantage of spikes, which is that information is only transmitted (i.e., energy is only used) when a spike is emitted. You can perform the same exercise with a lower value of scale_firing_rates and see the impact it has. As a quick example, let’s set scale_firing_rates = 5. What you should see is this:
t: dt 2dt 3dt ... 9dt 10dt 11dt
inp: 1 1 1 1 1 1
A: 0 2 0 0 2 0
B: 0 4 0 0 4 0
Here, the network is only producing spikes every other timestep (i.e., the energy use is about half that of the scale_firing_rates=10 network), and the output value (4) is closer to the expected output (2) than in the original scale_firing_rates=1 spiking network. As in the Keras-to-SNN example, synaptic filtering can be added to smooth out the spikes to produce a more desirable output (sketched below).
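Here is one way to sketch that filtering step, reusing the simulate function from above with a Lowpass synapse (the tau value is just an arbitrary choice for this toy dt of 0.1s):

```python
import numpy as np
import nengo

# Filter the spiking output of the scale_firing_rates=5 network with a lowpass synapse
_, b_out = simulate(scale_firing_rates=5.0, n_steps=200)
filtered = nengo.Lowpass(tau=0.5).filt(np.asarray(b_out), dt=0.1)
print(filtered[-20:].mean())  # hovers around the expected output of 2
```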
The quick summary of the effect of the scale_firing_rates parameter is that it increases the spiking rate of the neurons in the network, thereby increasing the rate of information flow (recall, information only flows when a spike occurs) through the network, and thus increasing its accuracy and performance.
Caveat: I should mention that what I’ve explained above is super simplified to give you a general idea of how converting to spiking neurons can impact the network’s performance. In reality, because of the dynamics of the network (I didn’t even include synaptic filtering or propagation delay in my example), things can get a lot more complicated.