How to change the learning rate during training

YL0910 · May 2, 2022, 12:11pm

Hi,
I'd like to ask whether Nengo can change the learning rate during training. For example, start with a large learning rate to explore the range of possible solutions, then reduce it to converge gradually on the optimal solution.
I know that such algorithms exist for ANNs. Does Nengo have a mechanism for changing the learning rate during training, or is there an example of a similar alternative?
Any good ideas or examples are welcome!

xchoo · May 4, 2022, 2:08am

Hi @YL0910!

The learning rules included in the default installation of Nengo don’t support the modification of the learning rate as part of the simulation (i.e., they are fixed throughout the entire simulation). However, most of the Nengo learning rules are of a form similar to the PES learning rule, i.e.,

\Delta \omega = -\kappa \mathbf{E} a

Or, in words: the change in the weights is proportional to the learning rate (\kappa) multiplied by some error signal (\mathbf{E}) multiplied by the activity of the neuron (a). Looking at the formulation above, we can see that modifying the learning rate by some scalar is equivalent to keeping the learning rate constant and modifying the error signal by that same scalar (because multiplication is commutative). Thus, if you are using the default Nengo learning rules, you can “modify the learning rate” by modifying the magnitude of the error signal (since the learning rate is static).

If you want to use more complex learning rules (ones that don’t follow the form I stated above), or want a “proper” implementation of learning rate tempering, an alternative approach is to write a custom learning rule that supports a variable learning rate.

YL0910 · June 30, 2022, 10:01am

Thank you for your detailed reply! I have experimented and found a problem that puzzles me. I set up an error scaler to change the learning rate, but the results were not satisfactory. Here is a demo that illustrates the problem.

import nengo

model = nengo.Network()

def Error_Scaler(t):
    if t > 0.3:
        return 1
    else:
        return 0

def product(x):
    return x[0] * x[1]

with model:
    input = nengo.Node(5)
    scaler = nengo.Node(Error_Scaler)
    a = nengo.Ensemble(10, dimensions=2)
    b = nengo.Ensemble(10, dimensions=1)
    nengo.Connection(input, a[0])
    nengo.Connection(scaler, a[1])
    nengo.Connection(a, b, function=product)
    p_b = nengo.Probe(b, synapse=0.01)

with nengo.Simulator(model) as sim:
    sim.run(0.8)
    result = sim.data[p_b]

To see the effect of the scaler more clearly, the scaler was set to 0 at the beginning of the experiment, i.e., b should be 0 until 0.3 s. But this was not the case.
P_b.csv (10.3 KB)
Suggestions are welcome! Thanks a lot!

xchoo · June 30, 2022, 11:27pm

It looks to me like you are trying to implement a product network (to change the learning rate) in neurons. I recommend checking out this example first if you are taking this approach.

The short answer to why your network is not behaving as you expect it to is that there are simply too few neurons in the a ensemble. If you want a relatively accurate product, I’d go as high as 500 or 1000 neurons for that ensemble. The multiplication example also utilizes several other things to improve the accuracy of the computed product:

– Using encoders that are aligned to the diagonals. This ensures that each neuron in the product ensemble represents both inputs to the product equally.

combined.encoders = Choice([[1, 1], [-1, 1], [1, -1], [-1, -1]])

– Increasing the radius of the ensemble to ~\sqrt{2}. This is an important detail, as the radius of the ensemble should be scaled to match the expected magnitude of the inputs. As an example, if input and scalar have an expected input range from -2 to 2, then the ensemble must be able to represent the vector value [2, 2] (and [-2, -2]) well, meaning the radius of the ensemble should be \sqrt{2^2 + 2^2} = 2\sqrt{2}.

It’s recommended that the expected ranges of input and scalar be about the same, but if this is not the case, you’ll want to modify the scaling factor on either input so that they roughly match.
As an example, if input is expected to have a range of -1 to 1, but scalar is expected to have a range from -5 to 5, you’ll want to scale the scalar input down by 1/5, and then compensate for it in the product function:

scalar_scale = 1.0/5.0

def product(x):
    return x[0] * x[1] / scalar_scale

nengo.Connection(scalar, a[1], transform=scalar_scale)
nengo.Connection(a, b, function=product)