Time sequence autoencoder extensions

Hi everyone,

Recently I’ve been playing around with a spiking Nengo autoencoder, which I was able to get working with the help of some of the great folks here (thanks for that!).

Here are some results from a model I trained to reconstruct neural spikes (trained on real neural data). The reconstructions are on the left and the original spikes are on the right.

I have two questions I’m thinking about now:

  1. In this example, each spike is a sample of 64 data points over a small window of time (so my input layer is 64 neurons). The inputs are fed into the network in parallel. However, I’d like to feed these data points (the spike waveform) in serial as a time sequence, and then reconstruct that time sequence to get the waveform. Would using Legendre Memory Units (LMUs) be the best way to accomplish this, or is there some other way?
  2. The second thing I want to explore is the idea of “temporal compression” referenced in this thread: Dimensionality Reduction/Generative Modeling with Nengo
    I’ve already seen some interesting results about how the length of the simulation affects reconstruction loss:

From left to right, the reconstructions are for 50 time steps, 100 time steps, and 150 time steps respectively.

Something I want to experiment with is simulating different parts of the network for different amounts of time. For example, if I simulate the hidden layers for a shorter amount of time than the output layer, I could then learn a more temporally sparse representation of the input data, right?

Is there an easy way to do this with Nengo? If I did it by creating different networks, how do I connect between their ensembles so I can then simulate one of them for a certain number of time steps and another for a different number of time steps? Or would there be another way to do it?

Sorry if this is pretty long; I’m just curious to hear people’s thoughts on this so I know what direction to go in. Thanks!


This is very cool! I’m not super familiar with autoencoders, so best if other people can chime in on these questions, but I’ll give my thoughts in any case.

  1. For tasks with temporal dependencies (especially if you want to use spikes) LMUs are likely the best option, yes. I would guess that changing from your current network to a sequential one would be similar to how the normal spiking MNIST example differs from the sequential MNIST example that uses LMUs.

  2. There currently isn’t a straightforward way to simulate some parts of a network for different amounts of time (or with different timestep lengths) in Nengo. One way that you could possibly hack this in (which should work, but feels very weird) would be to encapsulate a whole Nengo simulation in a node, which can be connected to the rest of the network, but there would definitely be complications with that kind of approach.

    The main issue is that everything in Nengo is designed to be run continuously over time, so none of the components know at the outset how long they’re going to be running for. So, while you could separately run network A for 3 seconds and network B for 2 seconds, if you want bidirectional communication, you’ll have to think about when network B should be communicating back to network A, since their timesteps do not line up. You can get around this a bit by making their simulation lengths even multiples of one another (e.g., network A runs for twice the length of network B, so you have them communicate on every second timestep), but you’ll have to figure out whether you want to subsample the output of the faster network, or buffer and summarize (e.g., average) the incoming data.

    If you don’t need bidirectional communication and are doing more of a feedforward structure, you can always have your various networks simulated separately and pass information from one network to the next using probes from the upstream networks. E.g., run network A for 3 seconds, look at the probed data to figure out the input to network B, which you can run for 2 seconds, and so on (there’s a rough sketch of this below).

    I hope one of these ideas will lead you down a productive path :wink: Let us know how things are progressing, this project is super cool!
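To make the feedforward idea concrete, here’s a minimal sketch. It’s purely illustrative (the networks, dimensions, and the averaging step are all placeholders rather than anything from your model): run network A on its own, summarize its probed output however makes sense for your task, and feed that into a separately simulated network B.

import nengo
import numpy as np

# Network A: some stand-in dynamics with a probe on its output
with nengo.Network() as net_a:
    stim = nengo.Node(np.sin)
    ens_a = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, ens_a)
    probe_a = nengo.Probe(ens_a, synapse=0.01)

with nengo.Simulator(net_a) as sim_a:
    sim_a.run(3.0)  # run network A for 3 seconds

# summarize the probed data; here, average over the last 1000 timesteps
# (the last second at the default dt) to get a constant input for network B
b_input = np.mean(sim_a.data[probe_a][-1000:], axis=0)

# Network B: driven by the summarized output of network A
with nengo.Network() as net_b:
    inp_b = nengo.Node(b_input)
    ens_b = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(inp_b, ens_b)
    probe_b = nengo.Probe(ens_b, synapse=0.01)

with nengo.Simulator(net_b) as sim_b:
    sim_b.run(2.0)  # run network B for 2 seconds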


Hi Trevor, thanks for your help. I’ve gotten started with some of these and will report back how it goes.

I’ve run into a bit of trouble using the LMU, and I was wondering if you could help. I don’t quite understand how the LMU is implemented in a network of spiking neurons, i.e. what the network architecture would look like. Is there a handy diagram somewhere? The reason I ask is that I want to create something like a “bottleneck” in the middle hidden layer of a regular autoencoder, like in the following diagram:

I tried looking at the paper but couldn’t find any useful clues.

My second question has to do with training and testing the network. In an autoencoder, the training data input into the network is also the target, but it seems like this causes problems with the loss function used in the given code. My question is: will any general loss function work with the LMU? Is there some reason why we move away from tf.losses.mse to tf.losses.SparseCategoricalCrossentropy?

If I just change the target from the digit labels to the digit images themselves, I get some errors when I run sim.evaluate and sim.fit. Specifically:

InvalidArgumentError:  logits and labels must have the same first dimension, got logits shape [100,784] and labels shape [78400]    

I think I have to change how I’m setting up the loss function and maybe also the optimizer for training and testing, but not sure how to get this to work with an LMU. I think a lot of my issues are from not understanding LMUs that well. Are there any additional resources on them?

Hi @khanus, Trevor is a little busy, so let’s see if I can address some of your questions.

I’ve run into a bit of trouble using the LMU, and I was wondering if you could help. I don’t quite understand how the LMU is implemented in a network of spiking neurons, i.e. what the network architecture would look like. Is there a handy diagram somewhere?

You can find a block diagram of the basic LMU architecture here: https://appliedbrainresearch.com/lmu/
As for a spiking Nengo implementation of the LMU, you can find that here:
https://www.nengo.ai/nengo-loihi/examples/lmu.html
The example code above is for the NengoLoihi backend in particular, so it’ll have to be slightly modified to run in standard Nengo, but not too many changes will need to be made. If you require help getting it to run in Nengo, let me know.

The reason I ask is that I want to create something like a “bottleneck” in the middle hidden layer of a regular autoencoder, like in the following diagram:

The LMU architecture also has a similar “bottleneck”. Comparing with your diagram, the “code” layer is similar to the “hidden” layer in the LMU network. The encoder layers correspond to the LMU’s $e_x$, $e_h$, and $e_m$ encoders, and the projection from the green encoder layer to the code layer is similar to the projection between $u_t$ and “Linear” in the LMU network, as well as to the projection between the “hidden” and “Linear” layers in the LMU. One important difference between the network architecture you posted and the LMU is that the LMU is a recurrent network (notice that $m$ and $h$ project back onto themselves).
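To make the recurrence explicit, my reading of the update equations from the LMU paper is roughly:

$$u_t = e_x^T x_t + e_h^T h_{t-1} + e_m^T m_{t-1}$$

$$m_t = \bar{A} m_{t-1} + \bar{B} u_t$$

$$h_t = f(W_x x_t + W_h h_{t-1} + W_m m_t)$$

So the dimensionality of $h_t$ plays the role of your “code” layer, and the $h_{t-1}$ and $m_{t-1}$ terms on the right-hand sides are what make the network recurrent.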

My second question has to do with training and testing the network. In an autoencoder, the training data input into the network is also the target, but it seems like this causes problems with the loss function used in the given code. My question is: will any general loss function work with the LMU? Is there some reason why we move away from tf.losses.mse to tf.losses.SparseCategoricalCrossentropy?

The loss function used to train the system depends on the specifics of the task you are trying to solve. This article is a good primer on how to choose a loss function for your network. For the psMNIST example, the LMU network is being tasked with classifying the inputs into 10 separate output classes. It is for this reason that we use the cross-entropy loss function in that code.
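As a rough sketch (the probe name p here is just an assumption about how the output probe is named in your script), the classification setup looks something like this, and it’s this loss that you’d swap out for an autoencoder:

import tensorflow as tf

# psMNIST is a 10-way classification task, so the example uses a
# cross-entropy loss (and an accuracy metric) on the output probe
sim.compile(
    optimizer=tf.optimizers.Adam(),
    loss={p: tf.losses.SparseCategoricalCrossentropy(from_logits=True)},
    metrics={p: tf.metrics.sparse_categorical_accuracy},
)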

If I just change the target from the digit labels to the digit images themselves, I get some errors when I run sim.evaluate and sim.fit. Specifically:

If you are referring to the psMNIST example, some changes will need to be made to the code to get it to train an “autoencoder”. First, the out output node has a size of 10 (for the 10 digit classes); this will have to be modified to 784 (i.e., 28 x 28 pixels) if you want it to reproduce the input image. Next, you’ll need to generate the “label” data. sim.fit and sim.evaluate expect one label per training input. In this case, your one label would be a 784-dimensional vector corresponding to the associated input image. Finally (at least this is the last change that comes to mind at the moment), you’ll want to change the loss function. Since the task you want to solve is no longer a classification task, the cross-entropy loss function is no longer appropriate, and you’ll probably want to switch back to mse.
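In rough code, the first and last of those changes would look something like this (again treating out and p as the names used in the example, so adapt them to whatever your script calls them):

# output node resized to reproduce a full 28 x 28 image (was size_in=10)
out = nengo.Node(size_in=28 * 28)
p = nengo.Probe(out)

# reconstruction is a regression task, so switch back to mean squared error
sim.compile(optimizer=tf.optimizers.Adam(), loss={p: tf.losses.mse})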

I hope that helps!


Hi @xchoo, thanks for your help! I’ve tried implementing all of the steps you mentioned but I think I’m still doing something wrong. To troubleshoot, I’ve left the LMU cell as it is and only made the changes you list here (so I’m not creating that sort of “bottleneck” an autoencoder has). Right now I just want to see if I can get the more basic setup working.

I think my problem is with the data I use for sim.fit. Right now I’ve got it like this:

sim.fit(train_images, train_images, epochs=1)

where train_images.shape is (784,1). When I do this my network doesn’t seem to learn anything; the reconstructed images are just noise. Based on what you said above though, I thought this should be the right shape for the data. Am I missing something here? I’m having trouble figuring out what else the problem could be since I didn’t change much else from the original LMU example. As you said, I’m using mse as my loss function and Adam as the optimizer with a learning rate of 0.001.

It’s hard to answer this question without seeing more of your script. Is it possible to post a link to your script, or post your code here?

sequential_lmu_autoencoder.ipynb (32.5 KB)

Here’s my code as a jupyter notebook. Thanks!

Hi @xchoo, apologies if you’re busy, but I was just wondering if there was any update on this? Thanks a ton!

Hi @khanus, unfortunately, I haven’t made much progress with your notebook. I have been busy with a few other things. One thing I might want to suggest is to try running the training for more than one epoch, and see if that improves your reconstruction.

That’s totally cool, take your time! Don’t wanna impose here too much. I had it running for 5 epochs earlier but just reduced it to 1 temporarily so I could iterate a bit faster. Even with 5 epochs it didn’t come close to resembling what it should look like so I didn’t think that was the problem. I’ll keep working on this in the meantime and see if I can figure it out. Whenever you can find time to take a look I’d appreciate it. Thanks!

Hi @khanus

I finally had the time to take a look at your code to figure out what was wrong with it. As it turns out, it was a rather simple fix (but I only discovered it after taking apart pretty much your entire notebook… :laughing: )

The offending cell was the one with this code:

with nengo_dl.Simulator(net, minibatch_size=100, unroll_simulation=16) as sim:
    # Load the saved simulator parameters
    sim.freeze_params(net)
    ...

As it turns out (and I am not familiar enough with NengoDL to have spotted this quickly), freeze_params doesn’t actually do what you want the code to do. Looking at your code, I believe you intended that block of code to load up whatever parameters you had saved during the network training run (i.e., in the cell above, after the sim.fit call).

However, what the code in your notebook actually does is create a new simulator object and call the freeze_params function on net. Since your notebook creates an entirely new simulator object, the network is re-initialized with random weights, effectively removing the effect of the training.

As a side note, what the freeze_params function does is take whatever had happened in that sim context block (i.e., anything within an as sim block) and transfer it back into the net object. If you had called the freeze_params function in the previous cell, it should have worked. But, since you created a new sim context block before calling freeze_params, NengoDL made a new simulator (and reinitialized everything) before the freeze_params call.
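In other words, the alternative fix would have been to move the call into the training cell itself, something like this (a sketch only, with ... standing in for your existing training code):

with nengo_dl.Simulator(net, minibatch_size=100, unroll_simulation=16) as sim:
    sim.compile(...)  # same training setup as before
    sim.fit(...)

    # still inside the same simulator context: copy the trained parameters
    # back into `net`, so a later nengo_dl.Simulator(net) starts from them
    sim.freeze_params(net)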

Fixing your code is simple – instead of using freeze_params, use the load_params function, like so:

with nengo_dl.Simulator(net, minibatch_size=100, unroll_simulation=16) as sim:
    # Load the saved simulator parameters
    sim.load_params("./lmu_params")
    ...
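Note that load_params assumes your training cell actually saved the parameters to that path. If it doesn’t already, adding something like this right after the sim.fit call (inside the training simulator’s context) will do it:

    # save the trained parameters so a later simulator can load them
    sim.save_params("./lmu_params")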

Another thing I did to get your code to work was to restructure the training output data. I noticed that it wasn’t using the correct array dimensions, so I added this:

# flatten each image into a single 784-dimensional vector
train_outputs = train_images.reshape((train_images.shape[0], -1))
test_outputs = test_images.reshape((test_images.shape[0], -1))

# add a singleton time axis so the targets match NengoDL's
# (batch, timesteps, dimensions) data format
train_outputs = train_outputs[:, None, :]
test_outputs = test_outputs[:, None, :]

And modified the training to do this instead:

sim.fit(train_images, train_outputs, epochs=n_epochs) 
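For reference, the shapes I’d expect after these changes (assuming the usual psMNIST data layout from the example) are roughly:

# NengoDL expects training data shaped (batch, timesteps, dimensions)
print(train_images.shape)   # e.g. (60000, 784, 1): one pixel per timestep
print(train_outputs.shape)  # e.g. (60000, 1, 784): one 784-d target per image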

I’ve attached my version of the LMU auto-encoder network below. Note that in my notebook, I have two variants of the auto-encoder network: the first is the LMU version that you were trying to implement, and the second is a feed-forward version that is based on this NengoDL example. I used the feed-forward version to debug the rest of the code in the notebook, since it was way faster to train and test.

sequential_lmu_autoencoder.ipynb (44.6 KB)

Here’s an example input-output pair from the test dataset using the LMU auto-encoder network that had been trained on the 60,000 image training set for 1 epoch.

There are definitely more improvements you can make to the code, but I think this should give you a good start.

Oh, and apologies for the wait!


Awesome, thanks so much! That makes a lot of sense; I should have read the documentation a little more carefully. I can’t actually remember what made me put the freeze_params call in a different sim block; it might just have been a leftover from something I was trying before. Excited to work with this, I’ve got some things planned to try out. Will make a post with what I end up finding!