How to apply weight decay with NengoDL in a Nengo-style network

I haven’t been able to find a good example of how to apply regularization to the weights of a connection with NengoDL. I’ve come up with a solution that seems to work, but I feel like it is not ideal and there is a better method that I am missing. Can anyone point me to an example of applying L2 regularization to weights in addition to the standard loss function?
Here is what I have been doing so far:

import nengo
import tensorflow as tf
import numpy as np
import nengo_dl

input_dim = 100
hidden_size = 1024
output_dim = 2
n_epochs = 10
n_samples = 1000

# random placeholder data
train_input = np.random.uniform(size=(n_samples, input_dim))
train_output = np.random.uniform(size=(n_samples, output_dim))
test_input = np.random.uniform(size=(n_samples, input_dim))
test_output = np.random.uniform(size=(n_samples, output_dim))
val_input = np.random.uniform(size=(n_samples, input_dim))
val_output = np.random.uniform(size=(n_samples, output_dim))


with nengo.Network(seed=13) as net:
    net.config[nengo.Connection].synapse = None
    net.config[nengo.Connection].transform = nengo_dl.dists.Glorot()
    neuron_type = nengo.LIF(amplitude=0.01)

    nengo_dl.configure_settings(stateful=False, keep_history=False)

    inp = nengo.Node(np.zeros((input_dim,)))

    hidden_ens = nengo.Ensemble(
        n_neurons=hidden_size,
        dimensions=1,
        neuron_type=neuron_type
    )

    out = nengo.Node(size_in=output_dim)

    conn_in = nengo.Connection(inp, hidden_ens.neurons, synapse=None)
    conn_out = nengo.Connection(hidden_ens.neurons, out, synapse=None)

    out_p = nengo.Probe(out, label="out_p")

    p_weight_in = nengo.Probe(conn_in, "weights", label="p_weight_in")
    p_weight_out = nengo.Probe(conn_out, "weights", label="p_weight_out")

    out_p_filt = nengo.Probe(out, synapse=0.1, label="out_p_filt")
    

minibatch_size = 200

def mse_loss(y_true, y_pred):
    # compute the loss on the final timestep only
    return tf.metrics.MSE(y_true[:, -1], y_pred[:, -1])


with nengo_dl.Simulator(net, minibatch_size=minibatch_size) as sim:

    # add single timestep to training/validation data
    train_input = train_input[:, None, :]
    train_output = train_output[:, None, :]

    val_input = val_input[:, None, :]
    val_output = val_output[:, None, :]

    sim.compile(
        optimizer=tf.optimizers.Adam(0.001),
        loss={
            out_p: mse_loss,
            p_weight_in: nengo_dl.losses.Regularize(),
            p_weight_out: nengo_dl.losses.Regularize(),
        }
    )

    # I get errors if I don't provide input for the regularization probes
    dummy_train_array = np.empty((train_input.shape[0], 1, p_weight_in.size_in), dtype=bool)
    dummy_val_array = np.empty((val_input.shape[0], 1, p_weight_in.size_in), dtype=bool)
    history = sim.fit(
        train_input,
        {
            out_p: train_output,
            p_weight_in: dummy_train_array,
            p_weight_out: dummy_train_array,
        },
        epochs=n_epochs,
        validation_data=(
            val_input,
            {
                out_p: val_output,
                p_weight_in: dummy_val_array,
                p_weight_out: dummy_val_array,
            }
        )
    )

If I sim.compile with the regularization probes, I get errors unless I also include data for them in sim.fit, for both training and validation, even though that data is ignored. My workaround was to use dummy numpy arrays, but for large datasets these can cause memory problems. Making the arrays boolean helps a little with memory, but I feel like there has to be a better way that I am missing.


Hi Brent!

Unfortunately, there’s not a more natural way to do regularization in NengoDL the way there is in, for example, Keras. Because NengoDL combines a lot of signals behind the scenes, adding regularization is not trivial. That said, we do plan on adding better support for regularization at some point in the future.

For now, what you’re doing is the recommended way. I think I’ve found a way for you to avoid making those huge dummy arrays, though.

The first thing is to not use the dummy arrays in your loss function, since you don’t need them. Rather than using the built-in nengo_dl.losses.Regularize, I think you’ll have to make your own. Something like:

def reg_loss(y_true, y_pred):
    # L2 penalty on the probed weights; y_true is ignored
    return tf.reduce_mean(tf.square(y_pred))
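(If you wanted L1 regularization instead, the same pattern applies; just return tf.reduce_mean(tf.abs(y_pred)).)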

You can see that this still takes a value for y_true; it just ignores it. You’ll still have to pass dummy arrays, but when you declare them, make the number of “dimensions” (the third index) zero:

dummy_train_array = np.empty((train_input.shape[0], 1, 0))
dummy_val_array = np.empty((val_input.shape[0], 1, 0))

This way, they still pass the checks on the first and second indices, but they have size zero so they don’t take up space.
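(An array with a zero-length axis takes up essentially no memory; its .nbytes is 0 no matter how large the other dimensions are.)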

I checked that this works. You’ll want to use loss_weights in sim.compile to control the amount of regularization.
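For example, something along these lines (the 1e-3 weights are just placeholder values for illustration; you’d tune them for your problem):

sim.compile(
    optimizer=tf.optimizers.Adam(0.001),
    loss={
        out_p: mse_loss,
        p_weight_in: reg_loss,
        p_weight_out: reg_loss,
    },
    # scale each regularization term relative to the main loss
    loss_weights={out_p: 1.0, p_weight_in: 1e-3, p_weight_out: 1e-3},
)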

Looking at nengo_dl.losses.Regularize, it looks like y_true is ignored in it as well. So you could just do the second part (make the dummy arrays have zero size).
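In other words, you could keep nengo_dl.losses.Regularize() in your compile call in place of reg_loss, and just shrink the dummy arrays as above.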

Thanks! Setting the third index to 0 works great; I didn’t realize that one didn’t have to match.
There is still a bit of weirdness in that history records both training and validation values each epoch for every set of weights being regularized, but that doesn’t use much memory, so it’s not a big deal.