Learning a Linear Transformation

SebVol · June 16, 2022, 9:23am

Hi community,

I’m quite new to Nengo and simply trying to comprehend all the examples and also wanted to watch the Summer school videos to become comfortable with the API.

However, as I was experimenting with the examples, I wondered if it was possible to learn a linear transformation in the form of a matrix multiplication (Y = AX) with the PES learning rule.

I thought maybe it could look something like this:

    # -- input and pre population
    inp = nengo.Node([1, 2, 3, 4])
    pre = nengo.Ensemble(1,dimensions=4)
    nengo.Connection(inp, pre)
    
    matrix = np.matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]])

    # -- post populations
    post_pes = nengo.Ensemble(1, dimensions=4)

    # -- reference population, containing the actual product
    product = nengo.Ensemble(1, dimensions=4)
    nengo.Connection(inp, product, transform=matrix, synapse=None)

    # -- error populations
    error_pes = nengo.Ensemble(1, dimensions=4)
    nengo.Connection(post_pes, error_pes)
    nengo.Connection(product, error_pes, transform=-1)

xchoo · June 21, 2022, 2:12am

Hi @SebVol, and welcome to the Nengo forums!

Yes, absolutely! The PES learning rule is quite capable of learning linear (and non-linear) transformations, as evidenced by the examples here. As a bit of a teaser, here’s the output of a network I’ve made that learns a randomly shuffled identity matrix (i.e., it just permutes the order of the vector elements). This output was recorded after the network had been trained for 200s, and while the output was being probed, the learning signal was inhibited to demonstrate that the network has generalized the learned connection (similar to what is done in this example).

From the plots, we see that the network has done a pretty good job at learning the shuffled identity matrix.

You are generally on the right track, although, there are a few pointers I have for you.

First, note that in Nengo, by default (i.e., if you don’t change the radius value), neural ensembles are optimized to represent values within a certain range, typically the unit hypersphere (i.e., given a vector input, the magnitude of that vector should be at most 1). While it may be possible for the network to learn with inputs that violate this “constraint” (really an optimization parameter), for simplicity you should keep the inputs to the ensembles within this range.
Second, as the dimensionality of the ensembles increases, the number of neurons in each ensemble should increase as well. In some networks, we use a linear multiplier (e.g., 50 * dim), but for the network I put together, I’m using a superlinear one (50 * dim ** 1.5). This is mostly from experience, since I know that more neurons will help the network generalize. It does come at the cost of slowing down the simulation, though.
Third, the network you have proposed should work with one minor change (apart from the points above): you seem to be missing the connection between pre and post_pes, which is the connection the learning actually takes place on.

SebVol · June 27, 2022, 3:54pm

Hi @xchoo,

thanks for your reply.

Unfortunately, I was not at my computer last week and could only work on it today, so sorry for my late reply.

I changed my code to the more practical case, where my input is a (13, 38) matrix and my output is a (13, 10) matrix. Because I wanted to use the visualization of the Nengo simulation, I was running my code directly with Nengo and not in a Jupyter notebook. However, I ran into some difficulties where my model simply did not learn anything and my output was constant. I can’t figure out why:

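import numpy as np
import nengo

# c_mat (13, 38) and s_mat (13, 10) hold the input and target rows (defined elsewhere)
model = nengo.Network()
with model: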
    inp_ens = nengo.Ensemble(n_neurons=500,dimensions=38,radius=10) #define radius for range of input values
    x_pos = 0
    y_pos = 0
    
    def inp_loop(t):
        global x_pos
        temp = x_pos
        #print(temp)
        if x_pos<12:
            x_pos+=1
        else:
            x_pos=0
        return c_mat[temp]
        
    def out_loop(t):
        global y_pos
        temp1 = y_pos
        if y_pos<12:
            y_pos+=1
        else:
            y_pos=0
        return s_mat[temp1]
        
    
    stim = nengo.Node(inp_loop)
    target = nengo.Node(out_loop)
    
    nengo.Connection(stim,inp_ens)
    
    
    output = nengo.Node(None,size_in = 10)
    
    def my_func(x):
        return 0
    
    learn_con = nengo.Connection(inp_ens.neurons,output, transform=np.zeros(shape=(10,500)),
                                learning_rule_type=nengo.PES(learning_rate=0.0001))
                                
    
    error = nengo.Node(None,size_in=10)
    
    nengo.Connection(output,error)
    nengo.Connection(target,error,transform=-1)
    nengo.Connection(error,learn_con.learning_rule)

I used the loops in the code so that it can run endlessly in the simulation.

And I have a follow-up question about the code you provided.
How did you define the time dimension?

Thanks again!

xchoo · June 30, 2022, 2:32am

Hi @SebVol,

There are several steps you can take to debug why your network may not be learning. The first is to probe the output of stim, target, and error to make sure that those nodes are outputting and computing the correct things (e.g., stim and target should be outputting the correct vector values). Keep in mind the radius of the input vectors: as I mentioned in my first reply, ensembles are optimized to represent values within their radius.

The next thing you will want to explore is the comment I made in my original post about increasing the number of neurons as the dimensionality of the ensemble increases.

Since you are trying to represent a 38-dimensional input signal, I’d wager that 500 neurons is simply not enough to learn the matrix in question.
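To put rough numbers on this, here is a small sketch that evaluates the two sizing heuristics from my original post (50 * dim and 50 * dim ** 1.5). The `suggested_neurons` helper is hypothetical, and both rules are rules of thumb rather than guarantees.

```python
def suggested_neurons(dim, base=50, exponent=1.5):
    """Heuristic neuron count for an ensemble with `dim` dimensions."""
    return int(round(base * dim ** exponent))


for dim in (4, 38):
    linear = suggested_neurons(dim, exponent=1.0)
    superlinear = suggested_neurons(dim, exponent=1.5)
    print(f"dim={dim}: 50*dim -> {linear}, 50*dim**1.5 -> {superlinear}")
```

For 38 dimensions, both heuristics suggest well over 500 neurons.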

The last thing you’ll want to look at is the training time; recall that I trained my network for 200s.

It may be that you are simply not giving the network enough time to learn the function. Try increasing the simulation time to see if that has an effect on the learning.

For Nengo, you can run the simulation in an infinite loop by using the sim.step() function, like so:

with nengo.Simulator(model) as sim:
    while <stopping_condition>:
        sim.step()

Regarding your question about the “time dimension”, I’m not sure what you are asking. In Nengo, the time axis is determined by the dt of the simulation and by how many timesteps were taken during the simulation. By default, Nengo simulations are created with dt=0.001 (1ms).

SebVol · July 2, 2022, 7:24am

Thanks @xchoo, I managed to get something working! However my plots aren’t as smooth as yours.

For Nengo, you can run the simulation in an infinite loop by using the sim.step() function, like so

Yes, that basically answers my question. I thought I needed to introduce a new dimension for time manually.
The only thing I couldn’t find is whether there is an optimal way to choose the synapse time constant relative to the stimulus presentation time. For example, if I present the stimulus in my simulation for 0.1s, should my synapse value always be one hundredth of the stimulus presentation time, or something similar?

xchoo · July 11, 2022, 2:13am

It looks like your network is working; it just hasn’t been training for long enough to see the effects of the training. Keep in mind that I trained my network for 200s, whereas the plot you posted seems to cover only 4 seconds. Also keep in mind that in my network, I inhibit the learning process after the 200s training period (see my post above). This has the effect of the learning rule not causing the input to “spike” as it adjusts the output to match the input.

If you are using the default synapse (which is the exponential synapse), the rough rule of thumb is that in \tau_{syn} seconds, the response to a step input will rise from its original value to about 2/3rds of the final step value. As an example, if your synapse has \tau_{syn} = 0.005s and you feed in a step input from 0 to 1, then after 0.005s the output of the synapse will be roughly 0.66 (0.632 to be precise, see here). To be on the safe side, then, the stimulus you present to the network should stay constant for at least 3\tau_{syn}. Of course, this is for a one-layer network; you may want to add additional time for networks with multiple layers to ensure there is enough time for the information to propagate through the entire network.
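The rule of thumb can be checked with a few lines of plain Python. This is only a sketch: it assumes an ideal first-order lowpass response, 1 - exp(-t / tau), and the `row_index` helper is a hypothetical way to hold each stimulus row for a fixed time by indexing on the simulation time (a node function is called once per timestep, so a counter that advances on every call moves to a new row every dt).

```python
import math


def lowpass_step_response(t, tau):
    """Output of a first-order lowpass (exponential) synapse at time t
    after a 0 -> 1 step input."""
    return 1.0 - math.exp(-t / tau)


def row_index(t, presentation_time, n_rows):
    """Index of the row to present at time t when each row is held for
    presentation_time seconds and the rows repeat cyclically."""
    return int(t // presentation_time) % n_rows


tau_syn = 0.005  # default synapse time constant in Nengo
for n_tau in (1, 3, 5):
    frac = lowpass_step_response(n_tau * tau_syn, tau_syn)
    print(f"after {n_tau} tau: {frac:.3f} of the final value")

# Shortest presentation time suggested by the 3 * tau rule of thumb.
min_hold = 3 * tau_syn
print("hold each stimulus for at least", min_hold, "s")
```

With the default synapse, a presentation time of 0.1s is comfortably longer than 3\tau_{syn}; there is no need for the synapse to be a fixed fraction of the presentation time.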

SebVol · July 20, 2022, 7:36am

Hey @xchoo,

thanks a lot! I toyed around with different presentation lengths of the stimulus and everything is working now (even when the error signal is inhibited).

Thanks again for the tips!

xchoo · August 11, 2022, 3:36am

Here’s my code if you want to compare:
test_learn_matrix.py (2.8 KB)

SebVol · August 11, 2022, 8:19am

Thank you for the code. Computing the error with a function inside a node is a nice idea; it makes my code run a little quicker.
Previously, I used a neuron ensemble for the error.