I was following the tutorials on implementing PES on Loihi using NengoLoihi, and got the following output on Loihi.
INFO:DRV: SLURM is being run in background
INFO:DRV: Connecting to 10.212.98.106:42345
INFO:DRV: Host server up..............Done 0.15s
INFO:DRV: Encoding axons/synapses.....Done 9.48ms
INFO:DRV: Compiling Embedded snips....Done 0.34s
INFO:DRV: Compiling Host snips........Done 0.62s
INFO:DRV: Compiling MPDS Registers....Done 1.06ms
INFO:HST: Args chip=0 cpu=0 /homes/rgaurav/nxsdk_1_nengo_loihi/lib/python3.8/site-packages/nxsdk/driver/compilers/../../../temp/1647051637.507753/launcher_chip0_lmt0.bin --chips=1 --remote-relay=1
INFO:DRV: Booting up..................Done 0.64s
INFO:DRV: Encoding probes.............Done 0.67ms
INFO:DRV: Transferring probes.........Done 2.51ms
INFO:DRV: Configuring registers.......Done 0.01s
INFO:DRV: Transferring spikes.........Done 0.65ms
INFO:HST: [Host] Listening for client
INFO:HST: [Host] Connected to client
INFO:HST: chip=0 cpu=0 time 100
INFO:HST: chip=0 cpu=0 time 200
INFO:HST: chip=0 cpu=0 time 300
INFO:HST: chip=0 cpu=0 time 400
INFO:HST: chip=0 cpu=0 time 500
INFO:HST: chip=0 cpu=0 time 600
INFO:HST: chip=0 cpu=0 time 700
INFO:HST: chip=0 cpu=0 time 800
INFO:HST: chip=0 cpu=0 time 900
INFO:HST: chip=0 cpu=0 time 1000
The PES learning is done between two Ensembles (and not the direct connections between neurons) in the linked tutorial. Two related questions:
Why do I see INFO:HST: chip=0 cpu=0 time 1000 type output logs when running on Loihi? In my crude understanding, it appears that CPUs are involved somewhere to compute the weights through NengoLoihi PES? Shouldn’t the computation of weights entirely take place on Loihi neurocores? Please correct me if I am wrong here.
When I attempted PES learning directly between two neurons with Nengo, things worked as expected. But with the NengoLoihi simulator, I get the following error: NotImplementedError: Learning rule presynaptic object must be an Ensemble (got 'Neurons'), which I suppose is coming from here. The function it comes from, build_chip_to_host(), again makes me think that CPUs are involved in learning the weights. Is that so? (I haven’t looked into the code in detail, though.)
The learning is being done using the neurocores, yes. What you are seeing are debug outputs from the snip (the C code that handles I/O between the Loihi board and the python NxSDK) that is running. I believe you see this output because when you have learning in your network, the network goes from running in “precompute” mode to “stream” mode. In “precompute” mode, all of the inputs to the network are precomputed, and are sent to the board as a big chunk (all of the inputs are put on the board when the simulation starts). In “stream” mode, input data is streamed to the board every timestep. You can turn on/off precompute mode with the precompute parameter on the NengoLoihi simulator.
Because of the way PES is implemented on the Loihi hardware (I believe the “error” input is a node, or something), it means that the inputs to the network (this includes the “error” input) cannot be precomputed, since the “error” input is dependent on the ensemble output. This being the case, the simulator is smart enough to switch out of the precompute mode and run in stream mode.
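As a toy illustration of why a feedback-dependent input cannot be precomputed, here is a minimal sketch in plain Python. It does not use NengoLoihi or NxSDK; the names (open_loop_input, gain, learning_rate) and the values are assumptions made up for the example, and the single scalar gain is only a stand-in for the learned decoders.

```python
# Toy illustration only: plain Python, no NengoLoihi or NxSDK calls.
import math


def open_loop_input(t):
    """Stimulus that depends only on time, so it can be computed in advance."""
    return math.sin(2 * math.pi * t)


dt = 0.001
n_steps = 1000

# "Precompute" style: every input value is known before the run starts,
# so the whole list could be handed over in one chunk.
precomputed = [open_loop_input(i * dt) for i in range(n_steps)]

# "Stream" style: the error depends on the output of the current step,
# so it can only be produced one step at a time while the run is in progress.
gain = 0.0            # stand-in for the learned decoders (hypothetical)
learning_rate = 0.05  # hypothetical value
errors = []
for x in precomputed:
    output = gain * x                  # what the learned connection produces now
    error = output - x                 # needs the output of this very step
    gain -= learning_rate * error * x  # the update changes the next output
    errors.append(error)

print(f"final gain: {gain:.6f}, last error: {errors[-1]:.2e}")
```

The stimulus list exists before the loop starts, whereas each error value only exists once the output of that step is known. That dependency is what rules out sending everything to the board up front.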
Once again, this is a limitation of the PES implementation on the Loihi board. @Eric will know the full details, but the gist of it is that with Loihi, only PES with decoder learning has been implemented. It may have something to do with the constraints of how the learning rule is implemented in the hardware itself (which the PES rule is then mapped onto), or some other hardware constraints (like the hardware being unable to access the full connection weights).
Hello @xchoo! Thank you for sparing some time for such a quick response!
From the tutorial I linked, the network has an input Node which produces white noise, and the error is calculated through an Ensemble. Therefore, I guess, it’s the input Node which streams the signal into the network by executing on the CPU. The input Node here is anyways not a PassThrough node, so it executes on the CPU, hence the related logs: INFO:HST: chip=0 cpu=0 time 1000, right?
I tried setting precompute=True and got the following error: BuildError: precompute=True not supported when using learning rules. It doesn’t matter as of now for my use case, because I will be accepting the inputs from a network deployable on Loihi anyways.
However, I do need to do the PES learning directly between the neurons, for which I see that you have pointed it out as NengoLoihi’s limitation on Loihi boards as of now. I suppose, to realize PES learning between neurons through NengoLoihi on Loihi boards, one needs to figure out how to do PES using NxSDK n2Core APIs and then integrate that with NengoLoihi… right? I do have some leads on how to begin here.
No… This is not the case. Because the input Node is an input, it can still be precomputed. The node I am referring to (the one that cannot be precomputed) is created by NengoLoihi internally as part of the PES learning rule implementation.
Yes, this is what I described:
Yeah, if you want to do PES on neuron-to-neuron connections, you’ll probably need to implement it yourself through the NxSDK API. I’m sure there was some limitation that prevented us from doing so in the first place (probably has something to do with the error signal being too large of a dimensionality, since you’ll need 1 value for each element in the weight matrix). I’ll ping @Eric to provide some insight.
The Loihi chip does not support 3-factor learning rules in the neurocores themselves (or rather, they only support a global “reward” signal, not the more local error signal that we need for PES); to get local 3-factor learning working, you need to use one of the x86 processors to inject the third factor. The easiest way for us to do this was to simply have the third factor (error signal) coming from the host.
So, for example, when you have an Ensemble->Ensemble connection with PES learning, NengoLoihi actually puts the post Ensemble off the chip, to facilitate using its decoded value in the error signal for the learning rule. I don’t think this is the only alternative; there are likely ways to do this and keep that ensemble on-chip, but it would certainly require some forethought.
Things get even more complicated when you get into Neuron->Neuron connections, etc. Essentially, we just implemented a limited version of PES to demonstrate learning on Loihi for one of our early demos; we never received funding to expand this out into the more fully-fledged learning system that would be ideal.
All of this will hopefully become easier on Loihi2, which has better support for 3-factor learning rules (that said, we haven’t played around with this yet to figure out what it might look like for PES specifically).
Sorry for the misunderstanding @xchoo … but if this is the case, then doesn’t it mean that some component of PES learning is happening on CPUs, thus PES in its entirety is not computed on Loihi neurocores? Or are the debug outputs from the snip - executed on CPUs - not all related to the actual PES learning?
Also, could you elaborate on the following?
I didn’t get the context of error signal being too large of a dimensionality, since you’ll need 1 value for each element in the weight matrix?
@Eric, please correct me if I am wrong here. PES is based on the delta rule: learning rate x (target output - predicted output) x input, where (target output - predicted output) is the error signal (which can be considered global or local depending on the context) and input is the presynaptic activity. Three-factor learning rules, on the other hand, also involve a term for postsynaptic activity, i.e. a modulatory global error term/signal plus the two local terms of the Hebbian rule: terms for pre- and postsynaptic activity (which could be pre and post traces). So how does PES fall into the class of three-factor rules? Or am I misreading your previous reply? If so, please elaborate on it a bit.
I understand that PES needs an error signal (third term) that comes from the host CPUs, and to facilitate this, the post Ensemble is put off chip, but this defeats the purpose of having an SNN entirely deployed on Loihi with on-chip learning enabled! Doesn’t it? What if a way can be devised to get the error signal from on-chip neurons? I suppose that should help in executing SNNs in their entirety on-chip, with learning supported?
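To make the delta rule above concrete, here is a minimal sketch in plain Python of decoder learning with that update. It is not NengoLoihi code. The rectified-linear tuning curves, the parameter ranges, the learning rate, and the division by the number of neurons are all assumptions chosen for illustration, and the target is simply the identity function.

```python
# Sketch of the delta rule: decoder += rate * (target - predicted) * input.
# All names and values here are hypothetical; this is not NengoLoihi code.
import random

random.seed(0)
n_neurons = 50
learning_rate = 0.01  # hypothetical value

# Hypothetical rectified-linear tuning curves standing in for the
# presynaptic activity of an ensemble representing a scalar x in [-1, 1].
gains = [random.uniform(0.5, 2.0) for _ in range(n_neurons)]
biases = [random.uniform(-1.0, 1.0) for _ in range(n_neurons)]
encoders = [random.choice([-1.0, 1.0]) for _ in range(n_neurons)]


def activities(x):
    """Presynaptic activity (the 'input' factor) for a represented value x."""
    return [max(0.0, g * e * x + b) for g, e, b in zip(gains, encoders, biases)]


decoders = [0.0] * n_neurons


def decode(x):
    """Predicted output: weighted sum of activities with the current decoders."""
    return sum(d * a for d, a in zip(decoders, activities(x)))


def mean_abs_error(points):
    """Average |target - predicted| where the target is the identity."""
    return sum(abs(x - decode(x)) for x in points) / len(points)


test_points = [i / 10 - 1 for i in range(21)]
before = mean_abs_error(test_points)

for _ in range(5000):
    x = random.uniform(-1.0, 1.0)
    a = activities(x)
    predicted = sum(d * ai for d, ai in zip(decoders, a))
    error = x - predicted  # target output - predicted output
    for i in range(n_neurons):
        # Each decoder changes by rate * error * its own presynaptic activity.
        decoders[i] += (learning_rate / n_neurons) * error * a[i]

after = mean_abs_error(test_points)
print(f"mean abs error before: {before:.3f}, after: {after:.3f}")
```

In this sketch the update uses only the presynaptic activity and the error; no postsynaptic activity term appears, which is the distinction the question is drawing.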
The “target output” acts as a third factor I believe, though maybe there are those who would argue that it’s actually a two-factor rule. In any case, it’s not supported by Loihi on the neurocores, so we have to involve a CPU somewhere. It would probably be possible to do it with the onboard x86 cores, but we’ve opted to have the host in the loop because that was easier to implement with our existing workflow.
As I mentioned, though, Loihi2 has better learning support that may allow PES to fully run on neurocores.