Can the gradient of a NengoDL network's output be computed with respect to its input for use in an objective function?

Hello all,

I started learning Nengo a couple of months ago for a project that I have been working on over the summer. More recently, I started to explore NengoDL and I have run into an issue concerning the calculation of the gradients of my network’s output with respect to its input.

Given a network with input x and output y, I want to compute dy/dx in order to use this as part of the objective function during training. In TensorFlow, this task is trivial, since you can just use tf.gradients(). Here is an example of part of the loss function I am using for my non-spiking network in TensorFlow that works as desired.

u = self.net_u(x, y, t)

u_t = tf.gradients(u, t)[0]
u_x = tf.gradients(u, x)[0]
u_y = tf.gradients(u, y)[0]

In this example, x, y, & t are network inputs and u is the network output. I can then go on and use these gradients in the rest of the objective function.
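
To give a sense of where this is headed, here is a simplified sketch of how those gradients then enter the loss. The residual below is just an illustrative placeholder, not my actual equation, and u_true stands in for whatever target data I fit against:

# Illustrative placeholder residual built from the gradients above (not my real equation).
f = u_t + u_x + u_y

# Total loss: data-fitting term plus residual term (u_true is placeholder target data).
loss = tf.reduce_mean(tf.square(u - u_true)) + tf.reduce_mean(tf.square(f))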

I am now trying to do the same thing in NengoDL. However, when I attempt what I thought was a reasonably similar approach in NengoDL using tf.gradients(), I get a None result for the gradients. Here is a minimal example of what I tried to do. The example is a bit long because I think it is necessary to show the network construction, training setup, and objective function. The None issue occurs in the objective function.

# --------------------------- IMPORT LIBRARIES ---------------------------

# Import built-in necessary libraries.
import nengo
import nengo_dl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# --------------------------- SET NETWORK & TRAINING PROPERTIES ---------------------------

# Define simulation properties.
dt_sim = 0.001
t_sim_duration = 2*np.pi

# Define training properties.
batch_size_train = 10000
minibatch_size_train = 50
n_train_epochs = 50
learning_rate = 0.001

# Simple function that we will approximate.
def f_approx(x):

    return np.sin(x)


# --------------------------- BUILD THE NETWORK ---------------------------

network = nengo.Network()

network.config[nengo.Ensemble].neuron_type = nengo.LIF(amplitude=0.001)
network.config[nengo.Connection].synapse = None

with network:

    # Create the network components.
    x_node = nengo.Node(output=lambda x: x, label='x Node')
    x_ens = nengo.Ensemble(n_neurons=20, dimensions=1, radius=1.25*t_sim_duration, seed=0, label='x Ensemble')
    y_node = nengo.Node(output=None, size_in=1)
    output_node = nengo.Node(output=None, size_in=2)

    # Connect up the network components.
    nengo.Connection(x_node, x_ens)
    nengo.Connection(x_ens, y_node, function=f_approx)
    nengo.Connection(x_node, output_node[0])
    nengo.Connection(y_node, output_node[1])

    # Collect data from the network.
    x_probe = nengo.Probe(x_node)
    y_probe_nofilt = nengo.Probe(y_node)
    y_probe_filt = nengo.Probe(y_node, synapse=0.01)
    output_probe_nofilt = nengo.Probe(output_node)
    output_probe_filt = nengo.Probe(output_node, synapse=0.01)


# --------------------------- SIMULATE THE UNTRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE UNTRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Input [-]'); plt.title('Network Input vs Time')
plt.plot(sim.trange(), sim.data[x_probe], label='x')

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Untrained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


# --------------------------- GENERATE NETWORK TRAINING DATA ---------------------------

# Create the training data inputs.
xs_training_input = np.random.uniform(0, t_sim_duration, size=(batch_size_train, 1, 1))

# Compute the training data outputs.
ys_training_output = f_approx(xs_training_input)

# Concatenate the xs & ys data.  I am doing this to get all of this information into the outputs variable of the objective function.
training_outputs = np.concatenate((xs_training_input, ys_training_output), axis=-1)

# Original training data dictionary.  Output is only y.
# data_training = {x_node: xs_training_input, y_probe_nofilt: ys_training_output}

# New training data dictionary.  Output is both x & y.
data_training = {x_node: xs_training_input, output_probe_nofilt: training_outputs}

# Define the objective function.
def train_objective(outputs, targets):

    # Retrieve the tensors associated with the network's inputs and outputs.
    x = outputs[:, :, 0]
    y = outputs[:, :, 1]

    # Compute the gradient of the network output with respect to the input.
    dydx = tf.gradients(y, x)[0]  # THIS VALUE COMES OUT TO NONE.

    print('dydx = ', dydx)

    # Compute the MSE of the output.
    return tf.reduce_mean(tf.square(outputs[:, :, -1] - targets[:, :, -1]))


# --------------------------- TRAIN THE NETWORK ---------------------------

with nengo_dl.Simulator(network=network, minibatch_size=minibatch_size_train) as sim:

    # Original train call, uses just the y data.
    # sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={y_probe_nofilt: train_objective}, n_epochs=n_train_epochs)

    # New train call, uses both the x & y data.
    sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={output_probe_nofilt: train_objective}, n_epochs=n_train_epochs)

    # Freeze the parameters.
    sim.freeze_params(network)



# --------------------------- SIMULATE THE TRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE TRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Trained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


plt.show()

In this example, I pass both the network input and output to the objective function and attempt to take their gradient, yielding None.

This indicates to me that TensorFlow is not recognizing that the network’s inputs are in fact connected to its output through the Nengo nodes/ensembles that comprise the network. Alternatively, it could mean that the relationship is non-differentiable. That would certainly be a problem when simulating the Nengo network, since I am using LIF neurons, but since the gradient calculation only happens during training (which, as I understand it, swaps in the LIFRate neuron type), I assume differentiability is not the issue.

My question is therefore:

How can I compute the gradients of the output of a NengoDL network with respect to its input? More specifically, how can I do this during training to incorporate these values into the loss/objective function?

Any assistance or advice that you could provide would be greatly appreciated. It seems like the solution should be something simple that I am missing.

Thanks!

The reason your gradient calculation isn’t working has to do with a tricky part of TensorFlow, which is the difference between concrete values (e.g. numpy arrays) and symbolic Tensors.

When you create xs_training_input, ys_training_output, and training_outputs, those are not the (symbolic) inputs and outputs of the network. They are just numpy arrays of data (which you will be passing in as input values and target values). Those data arrays will be passed into the objective function, but they are just constant values fed into the simulation; they aren’t the network input/output tensors.

So when you then compute dydx from those slices in your objective (note that I think you meant to use targets[:, :, 0] and targets[:, :, 1] here, but it wouldn’t work in either case), you’re not computing the gradient between the network inputs and outputs; you’re just computing the gradient between different parts of that array you fed in as training_outputs, so the result is None.

It’s the same as if, in TensorFlow, you tried to do something like this:

a = tf.placeholder(...)
b = a + 1

input_vals = np.zeros(...)
output_vals = sess.run(b, feed_dict={a: input_vals})

combined = np.concatenate((input_vals, output_vals), axis=-1)
c = tf.placeholder(...)
# c has no graph connection back to a or b, so this gradient is None:
sess.run(tf.gradients(c[..., 0], c[..., 1]), feed_dict={c: combined})

If you want to compute the gradients in TensorFlow you need to use the symbolic tensors (a and b), not the numpy values (input_vals and output_vals).
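
In the toy example above, the correct approach is to define the gradient between the symbolic tensors and only supply concrete values when evaluating it, e.g.

g = tf.gradients(b, a)[0]                        # symbolic gradient of b w.r.t. a
g_vals = sess.run(g, feed_dict={a: input_vals})  # evaluated with concrete inputs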

So, long story short, what you want is access to the symbolic network inputs inside your objective. Those are stored in the sim.tensor_graph.input_ph dictionary, so you could use them inside your objective like this:

with nengo_dl.Simulator(...) as sim:
    def my_objective(outputs, targets):
        inputs = tf.transpose(sim.tensor_graph.input_ph[my_node], (2, 0, 1))
        dydx = tf.gradients(outputs, inputs)
        ...

(where my_node is the nengo Node that you want the input tensor for). Note that for internal implementation reasons the input_phs have the batch dimension last (so they have shape (n_steps, node.size_out, minibatch_size)), which is why I’ve transposed them there so they line up with the batch-first outputs/targets.

Thank you for your prompt and thorough response.

It makes sense that you would need to use the symbolic input and output tensors of the network in order to compute the gradient with tf.gradients(). I did not know that you could access the symbolic network input tensors using the sim.tensor_graph.input_ph dictionary – that’s very helpful.

A quick point of clarification: It makes sense that the targets input to the objective function would not be symbolic tensors since I am specifying it via a dictionary of numpy arrays. However, what about the outputs input to the objective function? Is outputs a symbolic tensor?

I am thinking about this because I implemented your suggestion and made several minor changes to my code. Yet, I still get the same result as before with dydx = tf.gradients(outputs, inputs) evaluating to None. Here is the updated code.

# --------------------------- IMPORT LIBRARIES ---------------------------

# Import built-in necessary libraries.
import nengo
import nengo_dl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# --------------------------- SET NETWORK & TRAINING PROPERTIES ---------------------------

# Define simulation properties.
dt_sim = 0.001
t_sim_duration = 2*np.pi

# Define training properties.
batch_size_train = 10000
minibatch_size_train = 50
n_train_epochs = 50
learning_rate = 0.001

# Simple function that we will approximate.
def f_approx(x):

    return np.sin(x)


# --------------------------- BUILD THE NETWORK ---------------------------

network = nengo.Network()

network.config[nengo.Ensemble].neuron_type = nengo.LIF(amplitude=0.001)
network.config[nengo.Connection].synapse = None

with network:

    # Create the network components.
    x_node = nengo.Node(output=lambda x: x, label='x Node')
    x_ens = nengo.Ensemble(n_neurons=20, dimensions=1, radius=1.25*t_sim_duration, seed=0, label='x Ensemble')
    y_node = nengo.Node(output=None, size_in=1)

    # Connect up the network components.
    nengo.Connection(x_node, x_ens)
    nengo.Connection(x_ens, y_node, function=f_approx)

    # Collect data from the network.
    x_probe = nengo.Probe(x_node)
    y_probe_nofilt = nengo.Probe(y_node)
    y_probe_filt = nengo.Probe(y_node, synapse=0.01)


# --------------------------- SIMULATE THE UNTRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE UNTRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Input [-]'); plt.title('Network Input vs Time')
plt.plot(sim.trange(), sim.data[x_probe], label='x')

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Untrained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


# --------------------------- GENERATE NETWORK TRAINING DATA ---------------------------

# Create the training data inputs.
xs_training_input = np.random.uniform(0, t_sim_duration, size=(batch_size_train, 1, 1))

# Compute the training data outputs.
ys_training_output = f_approx(xs_training_input)

# Training data dictionary.
data_training = {x_node: xs_training_input, y_probe_nofilt: ys_training_output}


# --------------------------- TRAIN THE NETWORK ---------------------------

with nengo_dl.Simulator(network=network, minibatch_size=minibatch_size_train) as sim:

    # Define the objective function we will use when training our SNN. (Note that the SNN will be converted to an ANN for training.)
    def train_objective(outputs, targets):

        # Retrieve the symbolic input tensor.
        inputs = tf.transpose(sim.tensor_graph.input_ph[x_node], (2, 0, 1))

        # Compute the gradient of the network output with respect to the input.
        dydx = tf.gradients(outputs, targets)[0]                  # THIS IS STILL NONE.

        # Compute the MSE of the output.
        return tf.reduce_mean(tf.square(outputs - targets))       # This is just an example so that the code works, I ultimately want to incorporate dydx.

    # Train the network.
    sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={y_probe_nofilt: train_objective}, n_epochs=n_train_epochs)

    # Freeze the parameters.
    sim.freeze_params(network)



# --------------------------- SIMULATE THE TRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE TRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Trained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


plt.show()

Do you have any ideas about why I might still be getting a None result from the gradient? Could it have something to do with the outputs tensor?

Thanks again.

In your code you’re still computing the gradients w.r.t. targets; you need to change that to tf.gradients(outputs, inputs).

My apologies.

That was just a copy and paste error / typo I made when editing the code for the forum post. I pasted it into the forum field and when I was tidying it up I must have accidentally changed the variable name.

I actually have what you suggest.

Here is a direct copy and paste of the code that still yields the None result for the gradient.

# --------------------------- IMPORT LIBRARIES ---------------------------

# Import built-in necessary libraries.
import nengo
import nengo_dl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# --------------------------- SET NETWORK & TRAINING PROPERTIES ---------------------------

# Define simulation properties.
dt_sim = 0.001
t_sim_duration = 2*np.pi

# Define training properties.
batch_size_train = 10000
minibatch_size_train = 50
n_train_epochs = 50
learning_rate = 0.001

# Simple function that we will approximate.
def f_approx(x):

    return np.sin(x)


# --------------------------- BUILD THE NETWORK ---------------------------

network = nengo.Network()

network.config[nengo.Ensemble].neuron_type = nengo.LIF(amplitude=0.001)
network.config[nengo.Connection].synapse = None

with network:

    # Create the network components.
    x_node = nengo.Node(output=lambda x: x, label='x Node')
    x_ens = nengo.Ensemble(n_neurons=20, dimensions=1, radius=1.25*t_sim_duration, seed=0, label='x Ensemble')
    y_node = nengo.Node(output=None, size_in=1)

    # Connect up the network components.
    nengo.Connection(x_node, x_ens)
    nengo.Connection(x_ens, y_node, function=f_approx)

    # Collect data from the network.
    x_probe = nengo.Probe(x_node)
    y_probe_nofilt = nengo.Probe(y_node)
    y_probe_filt = nengo.Probe(y_node, synapse=0.01)


# --------------------------- SIMULATE THE UNTRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE UNTRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Input [-]'); plt.title('Network Input vs Time')
plt.plot(sim.trange(), sim.data[x_probe], label='x')

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Untrained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


# --------------------------- GENERATE NETWORK TRAINING DATA ---------------------------

# Create the training data inputs.
xs_training_input = np.random.uniform(0, t_sim_duration, size=(batch_size_train, 1, 1))

# Compute the training data outputs.
ys_training_output = f_approx(xs_training_input)

# Training data dictionary.
data_training = {x_node: xs_training_input, y_probe_nofilt: ys_training_output}


# --------------------------- TRAIN THE NETWORK ---------------------------

with nengo_dl.Simulator(network=network, minibatch_size=minibatch_size_train) as sim:

    # Define the objective function we will use when training our SNN. (Note that the SNN will be converted to an ANN for training.)
    def train_objective(outputs, targets):

        # Retrieve the symbolic input tensor.
        inputs = tf.transpose(sim.tensor_graph.input_ph[x_node], (2, 0, 1))

        # Compute the gradient of the network output with respect to the input.
        dydx = tf.gradients(outputs, inputs)[0]                  # THIS IS STILL NONE.

        # Compute the MSE of the output.
        return tf.reduce_mean(tf.square(outputs - targets))       # This is just an example so that the code works, I ultimately want to incorporate dydx.

    # Train the network.
    sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={y_probe_nofilt: train_objective}, n_epochs=n_train_epochs)

    # Freeze the parameters.
    sim.freeze_params(network)



# --------------------------- SIMULATE THE TRAINED NETWORK ---------------------------

with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    sim.run(t_sim_duration)


# --------------------------- PLOT THE TRAINED NETWORK RESULTS ---------------------------

plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Trained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


plt.show()

Please let me know if you have any idea what might still be causing the None. Thanks!

I don’t see anything else obviously wrong, so I’ll have to dig into it a bit more. I’m not sure if anyone has tried to compute gradients inside a NengoDL objective before, so we’re in new territory. I probably won’t have a chance to dig deeper into it today, but I can take a look in the next day or two.

No worries! I appreciate your help. Let me know when you get a chance to take a closer look at it.

Ah, this is my mistake: we don’t want to compute the gradient between the output and the transposed input, because the transpose op we created is not actually part of the computation that produces the output. We need to compute the gradient between the output and the original input placeholder:

        # Retrieve the symbolic input tensor.
        inputs = sim.tensor_graph.input_ph[x_node]

        # Compute the gradient of the network output with respect to the input.
        dydx = tf.gradients(outputs, inputs)[0]  

Note that this does mean that the gradients will be organized batch-last, so just keep that in mind when doing any other calculations (you might want to transpose the gradients back).
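
For instance, continuing that excerpt, you could bring the gradients back to batch-first like this:

        # dydx currently has shape (n_steps, size, minibatch_size); make it batch-first
        dydx = tf.transpose(dydx, (2, 0, 1))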

Thank you for the suggestion. It makes sense that we have to take the gradient of the network’s output with respect to the original input, before the transpose. This does in fact yield a non-None result for dydx.

Continuing to the next logical step, I would like to use this gradient as part of the error calculation inside the objective function. However, any inclusion of the dydx term in the error calculation currently gives me the error message:

TypeError: Second-order gradient for while loops not supported.

The code that I am using to produce this error is shown below. The only significant difference from before is in the train_objective() function.

# -------------------------------------------------- IMPORT LIBRARIES --------------------------------------------------

# Import built-in necessary libraries.
import nengo
import nengo_dl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# ----------------------------------------- SET NETWORK & TRAINING PROPERTIES ------------------------------------------

# Define simulation properties.
dt_sim = 0.001
t_sim_duration = 2*np.pi

# Define training properties.
batch_size_train = 10000
minibatch_size_train = 50
n_train_epochs = 50
learning_rate = 0.001

# Simple function that we will approximate.
def f_approx(x):

    return np.sin(x)


# ------------------------------------------------- BUILD THE NETWORK --------------------------------------------------

# Create the network.
network = nengo.Network()

# Set default network properties.
network.config[nengo.Ensemble].neuron_type = nengo.LIF(amplitude=0.001)
network.config[nengo.Connection].synapse = None

# Create the components of the network.
with network:

    # Create the network nodes & ensembles.
    x_node = nengo.Node(output=lambda x: x, label='x Node')
    x_ens = nengo.Ensemble(n_neurons=20, dimensions=1, radius=1.25*t_sim_duration, seed=0, label='x Ensemble')
    y_node = nengo.Node(output=None, size_in=1)

    # Connect up the network components.
    nengo.Connection(x_node, x_ens)
    nengo.Connection(x_ens, y_node, function=f_approx)

    # Collect data from the network.
    x_probe = nengo.Probe(x_node)
    y_probe_nofilt = nengo.Probe(y_node)
    y_probe_filt = nengo.Probe(y_node, synapse=0.01)


# ------------------------------------------- SIMULATE THE UNTRAINED NETWORK -------------------------------------------

# Setup the simulation.
with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    # Run the simulation.
    sim.run(t_sim_duration)


# ----------------------------------------- PLOT THE UNTRAINED NETWORK RESULTS -----------------------------------------

# Plot the network input.
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Input [-]'); plt.title('Network Input vs Time')
plt.plot(sim.trange(), sim.data[x_probe], label='x')

# Plot the network output.
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Untrained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()


# ------------------------------------------- GENERATE NETWORK TRAINING DATA -------------------------------------------

# Create the training data inputs.
xs_training_input = np.random.uniform(0, t_sim_duration, size=(batch_size_train, 1, 1))

# Compute the training data outputs.
ys_training_output = f_approx(xs_training_input)

# Training data dictionary.
data_training = {x_node: xs_training_input, y_probe_nofilt: ys_training_output}


# ------------------------------------------------- TRAIN THE NETWORK --------------------------------------------------

with nengo_dl.Simulator(network=network, minibatch_size=minibatch_size_train) as sim:

    # Define the objective function we will use when training our SNN. (Note that the SNN will be converted to an ANN for training.)
    def train_objective(outputs, targets):

        # Retrieve the symbolic input tensor.
        inputs = sim.tensor_graph.input_ph[x_node]

        # Compute the gradient of the network output with respect to the input.
        dydx = tf.transpose(tf.gradients(outputs, inputs)[0], (2, 0, 1))  # This is no longer None!

        # Print dydx (for debugging).
        print('dydx =', dydx)

        # Compute the loss of this output.
        # return tf.reduce_mean(tf.square(outputs - targets))     # Standard MSE.  This works fine.
        return tf.reduce_mean(tf.square(dydx))                    # Example error that uses gradient term.  This yields "TypeError: Second-order gradient for while loops not supported."

    # Train the network.
    sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={y_probe_nofilt: train_objective}, n_epochs=n_train_epochs)

    # Freeze the parameters.
    sim.freeze_params(network)



# -------------------------------------------- SIMULATE THE TRAINED NETWORK --------------------------------------------

# Setup the simulation.
with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:

    # Run the simulation.
    sim.run(t_sim_duration)


# ------------------------------------------ PLOT THE TRAINED NETWORK RESULTS ------------------------------------------

# Plot the network output.
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Trained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()

# Display the results.
plt.show()

My brief research into this error message leads me to believe that it means exactly what it says: for whatever reason, TensorFlow doesn’t support computing second-order gradients through while loops. At first I thought this was odd, since I am only computing a first-order gradient, but then I realized that the second-order gradient likely comes into play when the error gradient is computed for backpropagation. Furthermore, I assume that the while loop is created by NengoDL in the background in order to simulate the network over multiple time steps. Hence the error appears when using NengoDL but not when running my ANN version in plain TensorFlow (which has no need for a while loop).
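
To sanity-check that interpretation, I put together a tiny plain-TensorFlow sketch (my own construction, not anything from NengoDL) that I believe trips the same restriction: the first-order gradient through a tf.while_loop is fine, but backpropagating a loss built from that gradient requires a second-order gradient through the loop.

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1,))

# Toy computation wrapped in a while loop (standing in for the simulation loop).
_, y = tf.while_loop(lambda i, v: i < 5,
                     lambda i, v: (i + 1, v * x),
                     (tf.constant(0), x))

dydx = tf.gradients(y, x)[0]            # first-order gradient through the loop: fine
loss = tf.reduce_mean(tf.square(dydx))  # loss that depends on that gradient
grads = tf.gradients(loss, x)           # I believe this raises the same TypeError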

Does this seem like a reasonable summary of where the problem is coming from?

Interestingly, I do not believe that I actually need to simulate the network for multiple time steps during training. This is for the same reason mentioned in the NengoDL tutorial “Coming from Nengo to NengoDL.” The network is not recurrent nor does it include synaptic filters, and the training version of the network uses LIFRate neurons. This gives the network no temporal dynamics. So, if I am correct about NengoDL using the while loop to simulate the network over time, then it potentially is not applicable in this case.

Is there a way to prevent the while loop from being created if this is in fact the case?

Do you have any other suggestions regarding how I can include the gradient in the objective function output and circumvent this error?

Thanks for your continued support!

Yes, your analysis of why you’re getting that error sounds correct.

Unfortunately there isn’t at the moment; the Nengo simulation always runs within a while loop. However, we’re in the midst of a large rewrite of NengoDL for TensorFlow 2.0, and at the end of that process it should be possible to optionally run things with or without a while loop. That probably won’t be ready for a while though (on the order of 1-2 months), but you can follow along here: https://github.com/nengo/nengo-dl/pull/95.

I’m not sure if this will work for your particular use case or not, but one thing that NengoDL allows you to do is manually compute the error gradient outside the simulation, and then feed it in directly to the training process.

Unfortunately there aren’t any nicely written examples of this; the closest is this test: https://github.com/nengo/nengo-dl/blob/master/nengo_dl/tests/test_simulator.py#L1319. But the basic idea is that you would manually compute the gradients you’re looking for, like this:

with nengo_dl.Simulator(net) as sim:
    # symbolic gradient of the probed output w.r.t. the input placeholder
    gradients = tf.gradients(sim.tensor_graph.probe_arrays[y_probe_nofilt],
                             sim.tensor_graph.input_ph[x_node])
    # evaluate it for your data (n_steps = number of simulation timesteps)
    dydx = sim.sess.run(gradients, feed_dict=sim._fill_feed(
        n_steps, data={x_node: xs_training_input}, training=True))
    # compute d_err/d_y yourself, then feed it in as the "target"; objective=None
    # tells NengoDL to treat those values as the error gradient directly
    err_grad = ...
    sim.train({x_node: xs_training_input, y_probe_nofilt: err_grad},
              optimizer=..., objective={y_probe_nofilt: None})

So I’m not sure whether it’d be possible to compute that err_grad = ... part for your use case. What you want there is d_err / d_y (i.e. the derivative of your error function w.r.t. the output of the network), which hopefully you might be able to compute once you have that dydx value?
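
As a purely hypothetical illustration of that last step: if your objective were just a plain MSE on the output, err = mean((y - y_target)**2), then d_err / d_y is analytic, and you could compute it in numpy from the network’s output values (e.g. evaluated with a sess.run on sim.tensor_graph.probe_arrays[y_probe_nofilt], analogous to the gradient evaluation above):

    # y_pred: the network's output values for this batch, shaped like the targets
    err_grad = 2.0 * (y_pred - ys_training_output) / ys_training_output.size
    sim.train({x_node: xs_training_input, y_probe_nofilt: err_grad},
              optimizer=..., objective={y_probe_nofilt: None})

For an objective that involves dydx the calculation will be more involved, but the mechanics of feeding it in are the same.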

Thanks for the information. It makes sense that NengoDL would default to always using a while loop due to the temporal nature of the networks that are typically being simulated. It would be cool if there was an option to avoid the while loop in the future though!

For now, I will pursue your suggestion of computing the error gradient manually outside of an objective function. I remember seeing in the documentation some mention of the fact that you can do this, but no explicit examples. The code you have given me should be more than enough to get me started though.

It will likely be a bit before I am able to get back around to testing this out, but when I do I will either confirm that this workaround was successful or be back again with questions.

Thanks again for all the help!