Hello all,
I started learning Nengo a couple of months ago for a project that I have been working on over the summer. More recently, I started to explore NengoDL and I have run into an issue concerning the calculation of the gradients of my network’s output with respect to its input.
Given a network with input x and output y, I want to compute dy/dx in order to use this as part of the objective function during training. In TensorFlow, this task is trivial, since you can just use tf.gradients(). Here is an example of part of the loss function I am using for my non-spiking network in TensorFlow that works as desired.
u = self.net_u(x, y, t)
u_t = tf.gradients(u, t)[0]
u_x = tf.gradients(u, x)[0]
u_y = tf.gradients(u, y)[0]
In this example, x, y, and t are the network inputs and u is the network output. I can then use these gradients in the rest of the objective function.
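For completeness, here is a minimal, self-contained sketch of that pattern. It assumes TensorFlow graph mode (via tf.compat.v1) and uses a hypothetical closed-form net_u as a stand-in for my actual network, so the gradient values can be checked by hand.

```python
import tensorflow as tf

def net_u(x, y, t):
    # Hypothetical stand-in for self.net_u: any differentiable function of the inputs.
    return tf.sin(x) * tf.cos(y) * tf.exp(-t)

with tf.Graph().as_default():
    # Placeholders play the role of the network inputs.
    x = tf.compat.v1.placeholder(tf.float32, shape=(None, 1))
    y = tf.compat.v1.placeholder(tf.float32, shape=(None, 1))
    t = tf.compat.v1.placeholder(tf.float32, shape=(None, 1))

    u = net_u(x, y, t)

    # Symbolic gradients of the output with respect to each input.
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_y = tf.gradients(u, y)[0]

    # Evaluate the gradients at the origin.
    with tf.compat.v1.Session() as sess:
        feed = {x: [[0.0]], y: [[0.0]], t: [[0.0]]}
        u_x_val, u_y_val, u_t_val = sess.run([u_x, u_y, u_t], feed_dict=feed)

print(u_x_val, u_y_val, u_t_val)
```

Because u is built directly from x, y, and t, each call to tf.gradients finds a path through the graph and returns a tensor rather than None.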
I am now trying to do the same thing in NengoDL. However, when I attempt what I thought was a reasonably similar approach in NengoDL using tf.gradients(), I get a None result for the gradients. Here is a minimal example of what I tried to do. The example is a bit long because I think it is necessary to show the network construction, training setup, and objective function. The None issue occurs in the objective function.
# --------------------------- IMPORT LIBRARIES ---------------------------
# Import the necessary libraries.
import nengo
import nengo_dl
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
# --------------------------- SET NETWORK & TRAINING PROPERTIES ---------------------------
# Define simulation properties.
dt_sim = 0.001
t_sim_duration = 2*np.pi
# Define training properties.
batch_size_train = 10000
minibatch_size_train = 50
n_train_epochs = 50
learning_rate = 0.001
# Simple function that we will approximate.
def f_approx(x):
    return np.sin(x)
# --------------------------- BUILD THE NETWORK ---------------------------
network = nengo.Network()
network.config[nengo.Ensemble].neuron_type = nengo.LIF(amplitude=0.001)
network.config[nengo.Connection].synapse = None
with network:
    # Create the network components.
    x_node = nengo.Node(output=lambda x: x, label='x Node')
    x_ens = nengo.Ensemble(n_neurons=20, dimensions=1, radius=1.25*t_sim_duration, seed=0, label='x Ensemble')
    y_node = nengo.Node(output=None, size_in=1)
    output_node = nengo.Node(output=None, size_in=2)
    # Connect up the network components.
    nengo.Connection(x_node, x_ens)
    nengo.Connection(x_ens, y_node, function=f_approx)
    nengo.Connection(x_node, output_node[0])
    nengo.Connection(y_node, output_node[1])
    # Collect data from the network.
    x_probe = nengo.Probe(x_node)
    y_probe_nofilt = nengo.Probe(y_node)
    y_probe_filt = nengo.Probe(y_node, synapse=0.01)
    # The unfiltered probe has no synapse; the filtered probe uses a 10 ms synapse.
    output_probe_nofilt = nengo.Probe(output_node)
    output_probe_filt = nengo.Probe(output_node, synapse=0.01)
# --------------------------- SIMULATE THE UNTRAINED NETWORK ---------------------------
with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:
    sim.run(t_sim_duration)
# --------------------------- PLOT THE UNTRAINED NETWORK RESULTS ---------------------------
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Input [-]'); plt.title('Network Input vs Time')
plt.plot(sim.trange(), sim.data[x_probe], label='x')
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Untrained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()
# --------------------------- GENERATE NETWORK TRAINING DATA ---------------------------
# Create the training data inputs.
xs_training_input = np.random.uniform(0, t_sim_duration, size=(batch_size_train, 1, 1))
# Compute the training data outputs.
ys_training_output = f_approx(xs_training_input)
# Concatenate the xs & ys data. I am doing this to get all of this information into the outputs variable of the objective function.
training_outputs = np.concatenate((xs_training_input, ys_training_output), axis=-1)
# Original training data dictionary. Output is only y.
# data_training = {x_node: xs_training_input, y_probe_nofilt: ys_training_output}
# New training data dictionary. Output is both x & y.
data_training = {x_node: xs_training_input, output_probe_nofilt: training_outputs}
# Define the objective function.
def train_objective(outputs, targets):
    # Retrieve the tensors associated with the network's inputs and outputs.
    x = outputs[:, :, 0]
    y = outputs[:, :, 1]
    # Compute the gradient of the network output with respect to the input.
    dydx = tf.gradients(y, x)[0]  # THIS VALUE COMES OUT TO NONE.
    print('dydx = ', dydx)
    # Compute the MSE of the output.
    return tf.reduce_mean(tf.square(outputs[:, :, -1] - targets[:, :, -1]))
# --------------------------- TRAIN THE NETWORK ---------------------------
with nengo_dl.Simulator(network=network, minibatch_size=minibatch_size_train) as sim:
    # Original train call, uses just the y data.
    # sim.train(data=data_training, optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate), objective={y_probe_nofilt: train_objective}, n_epochs=n_train_epochs)
    # New train call, uses both the x & y data.
    sim.train(
        data=data_training,
        optimizer=tf.train.AdamOptimizer(learning_rate=learning_rate),
        objective={output_probe_nofilt: train_objective},
        n_epochs=n_train_epochs,
    )
    # Freeze the parameters.
    sim.freeze_params(network)
# --------------------------- SIMULATE THE TRAINED NETWORK ---------------------------
with nengo_dl.Simulator(network=network, dt=dt_sim) as sim:
    sim.run(t_sim_duration)
# --------------------------- PLOT THE TRAINED NETWORK RESULTS ---------------------------
plt.figure(); plt.xlabel('Time [s]'); plt.ylabel('Network Output [-]'); plt.title('Network Output vs Time (Trained)')
plt.plot(sim.trange(), sim.data[y_probe_filt], label='y SNN')
plt.plot(sim.trange(), f_approx(sim.trange()), label='y True')
plt.legend()
plt.show()
In this example, I pass both the network input and the network output to the objective function and attempt to take the gradient of the output with respect to the input, which yields None.
This suggests to me that TensorFlow does not recognize that the network's input is connected to its output through the Nengo nodes and ensembles that make up the network. Alternatively, it could mean that the relationship is non-differentiable. That would certainly be a problem when simulating the Nengo network, since I am using LIF neurons. However, I only perform the gradient calculation during training, which (as I understand it) swaps in the LIFRate neuron type, so I assume that differentiability is not the issue.
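One possible reading of the None result, sketched under assumptions rather than confirmed for NengoDL: tf.gradients returns None whenever the target tensor has no path in the graph back to the tensor being differentiated against. In the objective above, x and y are two slices of the same outputs tensor, so y may not be downstream of the x slice even though both originate from the network input. In the sketch below, plain TensorFlow ops stand in for the NengoDL network, and the names inp, stacked, x_slice, and y_slice are hypothetical.

```python
import tensorflow as tf

with tf.Graph().as_default():
    # Stand-in for the network input.
    inp = tf.compat.v1.placeholder(tf.float32, shape=(None, 1), name="inp")
    # Stand-in for the network computation.
    out = tf.sin(inp)
    # Stand-in for the two-dimensional output node that carries both x and y.
    stacked = tf.concat([inp, out], axis=-1)

    # Slices taken inside the objective function.
    x_slice = stacked[:, 0]
    y_slice = stacked[:, 1]

    # y_slice is not computed from x_slice, so there is no path between them.
    grad_wrt_slice = tf.gradients(y_slice, x_slice)[0]
    # y_slice is computed from inp, so this gradient is defined.
    grad_wrt_input = tf.gradients(y_slice, inp)[0]

print("gradient w.r.t. slice is None:", grad_wrt_slice is None)
print("gradient w.r.t. input is None:", grad_wrt_input is None)
```

If the same thing is happening in my network, the gradient would have to be taken with respect to the tensor that actually feeds the network, not a slice of the probed output. I do not know how to get at that tensor from inside a NengoDL objective function.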
My question is therefore:
How can I compute the gradients of the output of a NengoDL network with respect to its input? More specifically, how can I do this during training to incorporate these values into the loss/objective function?
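As a reference for the quantity I am after, here is a minimal NumPy sketch that approximates dy/dx for the target function by central finite differences. This is a swapped-in numerical check performed outside TensorFlow, not the in-graph gradient I actually need for the loss. The helper central_difference and the step size eps are assumptions chosen for illustration.

```python
import numpy as np

def f_approx(x):
    # Same target function as in the example above.
    return np.sin(x)

def central_difference(f, x, eps=1e-5):
    # Hypothetical helper: second-order accurate estimate of df/dx.
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

# Sample the same input range used for training.
xs = np.linspace(0.0, 2.0 * np.pi, 50)
dydx_reference = central_difference(f_approx, xs)

# For sin(x) the exact derivative is cos(x), so the error should be tiny.
max_error = np.max(np.abs(dydx_reference - np.cos(xs)))
print("max abs error vs cos(x):", max_error)
```

A trained network's dy/dx, once I can compute it, should roughly follow this reference curve.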
Any assistance or advice that you could provide would be greatly appreciated. It seems like the solution should be something simple that I am missing.
Thanks!