Nengo to Nengo Loihi discretization issue

I’m trying to port code that runs fine in Nengo to Nengo Loihi.
I ran into a BuildError: “Could not find appropriate weight exponent”
I’m assuming this is caused by the transform on one of the connections, which requires more precision than is available on Loihi. Is this correct?
There are multiple decoders to be calculated in the code. How can I isolate which one is causing trouble?

On a more general note, what pitfalls should one be wary of when using the nengo-loihi backend?

The full error message:

---------------------------------------------------------------------------
BuildError                                Traceback (most recent call last)
<ipython-input-2-1ff2f4579a5b> in <module>
     49         c_p = nengo.Probe(net.output, synapse=0.01)
     50 
---> 51     sim = nengo_loihi.Simulator(net, dt=0.01)
     52     sim.run(model.t_max)

~/miniconda3/envs/loihi/lib/python3.5/site-packages/nengo_loihi/simulator.py in __init__(self, network, dt, seed, model, precompute, target, progress_bar, remove_passthrough, hardware_options)
    203 
    204         if target != "simreal":
--> 205             discretize_model(self.model)
    206 
    207         if target in ("simreal", "sim"):

~/miniconda3/envs/loihi/lib/python3.5/site-packages/nengo_loihi/discretize.py in discretize_model(model)
    221     
    222     for block in model.blocks:
--> 223         discretize_block(block)
    224 
    225 

~/miniconda3/envs/loihi/lib/python3.5/site-packages/nengo_loihi/discretize.py in discretize_block(block)
    238     w_max = max(w_maxs) if len(w_maxs) > 0 else 0
    239 
--> 240     p = discretize_compartment(block.compartment, w_max)
    241     for synapse in block.synapses:
    242         discretize_synapse(synapse, w_max, p["w_scale"], p["w_exp"])

~/miniconda3/envs/loihi/lib/python3.5/site-packages/nengo_loihi/discretize.py in discretize_compartment(comp, w_max)
    307                 break
    308         else:
--> 309             raise BuildError("Could not find appropriate weight exponent")
    310     elif b_max > 1e-8:
    311         b_scale = BIAS_MAX / b_max

Hi and welcome to the forum!

This error is likely due to very small weights on one of the connections in your model, as these are difficult to represent correctly with the 8-bit quantization required by Loihi. Extremely large weights can also cause the issue, but this is less common. In terms of debugging, one option is to run your script with ipython --pdb so that you drop into the debugger when the error is raised and can print information about the compartment that’s causing the problem. If you give all of the ensembles in your model unique labels, you should then be able to see which ensemble is at the root of the problem, since its label will be passed on to the compartment shown in your stack trace here.
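To make the small-weight problem concrete, here is a minimal sketch of quantizing a set of weights to signed 8-bit integers with one shared scale. It is an illustration only, not the algorithm in nengo_loihi/discretize.py; the function name quantize_8bit and the example weights are made up.

```python
def quantize_8bit(weights):
    """Quantize weights to signed 8-bit integers using one shared scale.

    Illustrative only: this is NOT the Nengo Loihi discretization code.
    """
    w_max = max(abs(w) for w in weights)
    if w_max == 0:
        return [0 for _ in weights], 1.0
    # Map the largest magnitude onto the largest 8-bit value (127).
    scale = 127 / w_max
    return [round(w * scale) for w in weights], scale


# Hypothetical weights: two "normal" values and two very small ones.
weights = [0.5, -0.2, 1e-4, -3e-5]
q, scale = quantize_8bit(weights)
print(q)  # [127, -51, 0, 0]
```

The two small weights collapse to zero because they are far below one quantization step (0.5 / 127, roughly 0.004). When the range between the largest and smallest weights on a block is this wide, there may be no scale that represents both.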

A member of the Nengo Loihi development team, @Eric, has recently created a PR to add objects to all build errors, which will hopefully make it much easier to track down which objects in your model are causing problems. The relevant branch to check out is here: https://github.com/nengo/nengo-loihi/pull/289.

In terms of more general tips, tricks, and pitfalls to avoid when using the nengo-loihi backend, we’ve compiled a list of things here: https://www.nengo.ai/nengo-loihi/tips.html.

We’re always looking to add to this, so I’ll check with the team about providing some more context for users around the role of discretization on Loihi.

Let us know if there’s more information that would be helpful, or you have any followup questions.


Thank you very much. It was indeed a precision issue caused by very small weights. I truncated the decimal places to solve it.

I have a follow-up question on using NxSDK along with nengo-loihi when running models on chip. For example, the probes for power profiling seem to be available only through NxSDK as of now. Since nengo_loihi interfaces with NxSDK, what would be the appropriate way to access the board/chip/core objects, say for power profiling, but also in general?

An issue that I run into when going from the emulator to the chip is exceeding the number of cores available, even with a small number of neurons (i.e. <2000). I tried to manually split the ensembles across blocks using nengo_loihi.BlockShape, but that attribute cannot be found.
Using the example in the documentation, I get the same error.

with nengo.Network() as net:
    nengo_loihi.add_params(net)
    ens = nengo.Ensemble(16, 1)
    net.config[ens].block_shape = nengo_loihi.BlockShape((2, 2), (4, 4))

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-e545d4dbba7e> in <module>
      2     nengo_loihi.add_params(net)
      3     ens = nengo.Ensemble(16, 1)
----> 4     net.config[ens].block_shape = nengo_loihi.BlockShape((2, 2), (4, 4))

AttributeError: module 'nengo_loihi' has no attribute 'BlockShape'

Looking at the code here, https://github.com/nengo/nengo-loihi/blob/master/nengo_loihi/config.py, it seemed like it should be called as nengo_loihi.config.BlockShape(), but that fails as well. What am I doing wrong? I have the most recent version of nengo_loihi available through pip.

Thanks again!

Great, happy to have helped sort out the precision issue. For using NxSDK with Nengo Loihi, you can access the NxsdkBoard object with sim.sims["loihi"].nxsdk_board and then add probes etc. as described in the NxSDK documentation.

Regarding the BlockShape object not being found, you’ll have to switch to the development version of nengo-loihi by cloning the GitHub repository – BlockShape is relatively new and hasn’t made it into an official release yet.
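As a rough sketch of what the documentation example asks for, assuming BlockShape((2, 2), (4, 4)) means "treat the 16 neurons as a 4x4 grid and tile it with 2x2 blocks", the arithmetic works out as below. The helper count_blocks is hypothetical and only illustrates the tiling; it is not part of nengo-loihi.

```python
import math


def count_blocks(block_shape, ensemble_shape):
    """Number of blocks needed to tile ensemble_shape with block_shape.

    Hypothetical helper for illustration; not a nengo-loihi function.
    """
    n_blocks = 1
    for block_len, ens_len in zip(block_shape, ensemble_shape):
        # Each axis needs ceil(ens_len / block_len) blocks to be covered.
        n_blocks *= math.ceil(ens_len / block_len)
    return n_blocks


n_blocks = count_blocks((2, 2), (4, 4))
neurons_per_block = 2 * 2
print(n_blocks, neurons_per_block)  # 4 4
```

So the 16-neuron ensemble would be split into 4 blocks of 4 neurons each. Smaller blocks mean more blocks in total, which matters when you are already close to the core limit.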

Let me know if you run into anything else!


I seem to be running into an error with the energy and time probes on Loihi. I’m not sure whether this has to do with nengo_loihi or nxsdk, but this is what happens. I set up the probes as:

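# Import paths below are assumed; adjust them to your nengo-loihi / NxSDK versions
from nengo_loihi.hardware.allocators import Greedy
from nxsdk.api.enums.api_enums import ProbeParameter
from nxsdk.graph.monitor.probes import PerformanceProbeCondition

hardware_options = {"n_chips": 4, "allocator": Greedy()}
with nengo_loihi.Simulator(net, dt=0.001, precompute=False,
                           hardware_options=hardware_options) as sim:
    # NxSDK board object created by Nengo Loihi for this simulation
    board = sim.sims["loihi"].nxsdk_board
    eProbe = board.probe(probeType=ProbeParameter.ENERGY,
                         probeCondition=PerformanceProbeCondition(
                             tStart=1, tEnd=3000, bufferSize=1024, binSize=1))
    tProbe = board.probe(probeType=ProbeParameter.ENERGY,
                         probeCondition=PerformanceProbeCondition(
                             tStart=1, tEnd=3000, bufferSize=1024, binSize=1))

    sim.run(3)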

The error message after running up to 1024 time steps (the bufferSize) is:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-25-1242324bd0f8> in <module>
     61                                  tStart=1,tEnd=3000,bufferSize=100,binSize=1))
     62 
---> 63         sim.run(3)
     64         t_audio = sim.trange()
     65     #sim.run(model.t_audio)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in run(self, time_in_seconds)
    328                 steps,
    329             )
--> 330             self.run_steps(steps)
    331 
    332     def run_steps(self, steps):

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in run_steps(self, steps)
    341             raise SimulatorClosed("Simulator cannot run because it is closed.")
    342 
--> 343         self._runner.run_steps(steps)
    344         self._n_steps += steps
    345         logger.info("Finished running for %d steps", steps)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in loihi_bidirectional_with_host(self, steps)
    528             self.host.step()
    529             self._host2chip(self.loihi)
--> 530             self._chip2host(self.loihi)
    531         self.timers.stop("run")
    532 

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in _chip2host(self, sim)
    410             for probe, receiver in self.model.chip2host_receivers.items()
    411         )
--> 412         sim.chip2host(probes_receivers)
    413 
    414     def _host2chip(self, sim):

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/hardware/interface.py in chip2host(self, probes_receivers)
    421         assert self.host_snip.connected
    422 
--> 423         raw_data = self.host_snip.recv_bytes(self.bytes_per_step)
    424 
    425         # create views into data for different chips

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/hardware/interface.py in recv_bytes(self, bytes_expected)
    855         if len(data) < bytes_expected:
    856             raise RuntimeError(
--> 857                 "Received (%d) less than expected (%d)" % (len(data), bytes_expected)
    858             )
    859 

RuntimeError: Received (0) less than expected (512)

The Loihi documentation says that the probe’s buffer competes with snips for memory on the embedded Lakemont CPU.

My guess is that when the snips are temporarily stopped by the probe, the nengo_loihi interface decides to close and halt the board. Is there an alternative way to set up an energy probe?

Thanks again!

I’m curious how long the probes halt the simulation; it’s quite possible that Nengo Loihi is timing out when waiting for data from the board.

You can increase the amount of time that Nengo Loihi will wait (and the number of times it tries again) like this:

nengo_loihi.hardware.interface.HostSnip.recv_timeout = 1.0  # Default is 0.01 (10 ms)
nengo_loihi.hardware.interface.HostSnip.recv_retries = 100  # Default is 10

You can put these anywhere in your script before the run call. As you can see, the default behavior is to wait for 10 ms 10 times, which means we only wait 100 ms, which is very likely not enough.
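To show how the two settings interact, here is a generic retry loop of the kind described above. It is a sketch under the assumption that the receiver simply tries again after each timeout; recv_with_retries, worst_case_wait, and the recv callable are hypothetical names, not the actual HostSnip implementation.

```python
import time


def recv_with_retries(recv, timeout, retries):
    """Call recv() up to `retries` times, pausing `timeout` seconds after
    each empty result. Hypothetical stand-in for a blocking receive."""
    for _ in range(retries):
        data = recv()
        if data:
            return data
        time.sleep(timeout)
    raise RuntimeError("No data received after %d tries" % retries)


def worst_case_wait(timeout, retries):
    """Longest total time (in seconds) spent waiting before giving up."""
    return timeout * retries


print(worst_case_wait(0.01, 10))   # defaults: about 0.1 s in total
print(worst_case_wait(1.0, 100))   # settings above: up to 100 s in total
```

With the larger values, a pause on the board of up to 100 s is tolerated before the error is raised, at the cost of waiting that long if the board has actually stopped responding.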


Thank you! Increasing the time does resolve that error.

My problem with the probes still persists, though. The board.run() method in NxSDK seems to do some initialization of timestamps that the sim.run() method perhaps skips. That is all I could make out from the nxtime.py file. Have you come across any example that uses nengo_loihi together with NxSDK probes?

The error message (I’ve clipped the initial output before this):

INFO:DRV:      Executing...................Done 0.44s
INFO:HST:  chip=0 cpu=0 Waited to exit (nonsense sum -13580)
WARNING    /homes/iamgvj/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxtime.py:75: RuntimeWarning: divide by zero encountered in double_scalars
  deltaSum) / numNonZeroPhaseTimes
 [py.warnings]
INFO:DRV:      Processing timeseries.......Error 0.04s
INFO:HST:  [Host] Received shutdown signal: -1
INFO:HST:  [Host] Wrote superhost shutdown signal: 8192 bytes
INFO:HST:  [Host] Closing server socket
INFO:HST:  chip=2 cpu=0 halted, status=0x0
INFO:HST:  chip=0 cpu=0 halted, status=0x0
INFO:HST:  chip=1 cpu=0 halted, status=0x0
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-046a2acc9e24> in <module>
     63         nengo_loihi.hardware.interface.HostSnip.recv_retries = 10  # Default is 10
     64 
---> 65         sim.run(2.9)
     66         t_audio = sim.trange()
     

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in run(self, time_in_seconds)
    328                 steps,
    329             )
--> 330             self.run_steps(steps)
    331 
    332     def run_steps(self, steps):

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in run_steps(self, steps)
    341             raise SimulatorClosed("Simulator cannot run because it is closed.")
    342 
--> 343         self._runner.run_steps(steps)
    344         self._n_steps += steps
    345         logger.info("Finished running for %d steps", steps)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/simulator.py in loihi_bidirectional_with_host(self, steps)
    533         self.timers.start("shutdown")
    534         logger.info("Waiting for run_steps to complete...")
--> 535         self.loihi.wait_for_completion()
    536         logger.info("run_steps completed")
    537         self.timers.stop("shutdown")

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/hardware/interface.py in wait_for_completion(self)
    258 
    259     def wait_for_completion(self):
--> 260         d_func(self.nxsdk_board, b"ZmluaXNoUnVu")
    261 
    262 

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nengo_loihi/nxsdk_obfuscation.py in d_func(obj, kwargs, *attrs)
     75         kwargs = {deobfuscate(k): v for k, v in kwargs.items()}
     76     func = d_get(obj, *attrs)
---> 77     return func(**kwargs)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxboard.py in finishRun(self)
    287     def finishRun(self):
    288         """Finish the run"""
--> 289         return self.executor.finish()
    290 
    291     @deprecated("startDriver is being deprecated. Use start instead.")

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/driver/executor.py in finish(self)
    119         if self._state is ExecutionState.RUNNING:
    120             self._wait()
--> 121             self._notifyListeners(ExecutionEventEnum.POST_EXECUTION)
    122             self._state = ExecutionState.FINISHED
    123 

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/driver/executor.py in _notifyListeners(self, event)
    145                 listener.preExecution(self.numStepsToBeExecuted)
    146             elif event == ExecutionEventEnum.POST_EXECUTION:
--> 147                 listener.postExecution()
    148             elif event == ExecutionEventEnum.ON_STOP:
    149                 listener.onStop()

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/driver/listeners/composite_monitor.py in postExecution(self)
     54     def postExecution(self) -> None:
     55         with timedContextLogging("Processing timeseries", NxSDKLogger.NXDRIVER):
---> 56             [m.postExecution() for m in self._collection.values()]
     57 
     58     def onStop(self) -> None:

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/driver/listeners/composite_monitor.py in <listcomp>(.0)
     54     def postExecution(self) -> None:
     55         with timedContextLogging("Processing timeseries", NxSDKLogger.NXDRIVER):
---> 56             [m.postExecution() for m in self._collection.values()]
     57 
     58     def onStop(self) -> None:

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/driver/listeners/monitors/performance_monitor.py in postExecution(self)
     41 
     42     def postExecution(self):
---> 43         self._energyTimeMonitor.updateProbes()
     44 
     45     @property

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxenergy_time.py in updateProbes(self)
    831                     if prb.id in energyProbeData.energyProbeContainer.keys():
    832                         probeData = energyProbeData.energyProbeContainer[prb.id]
--> 833                         prb._updateProbe(probeData)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxenergy_time.py in _updateProbe(self, data)
    383 
    384         rawTimeProbeData = data.timeProbeData
--> 385         tProbeData = super()._updateProbe(rawTimeProbeData)
    386         powerData = data.powerData
    387 

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxenergy_time.py in _updateProbe(self, data)
     53         self.numRuns += 1
     54         pdata = TimeProbeDataPerRun(runId=self.numRuns, data=data,
---> 55                                     etMonitor=self.etMonitor)
     56         self.numSteps += pdata.numSteps
     57         self.timeProbeDataPerRunList.append(pdata)

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxtime.py in __init__(self, runId, data, etMonitor)
     31         self.numPhasesPerBin = self.etMonitor.numPhaseBins
     32         self.binSize = self.etMonitor.binSize
---> 33         self._postProcessData(data)
     34         self._logger = get_logger("NET.PRB")
     35 

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxtime.py in _postProcessData(self, data)
     62         self.startTimeStampsPerTS = np.empty(0, dtype=np.float64)
     63         self.endTimeStampsPerTS = np.empty(0, dtype=np.float64)
---> 64         self._processBins()
     65 
     66     def _processBins(self):

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxtime.py in _processBins(self)
     84                 # case 1: nSteps is a multiple of binSize
     85                 tileSize = self.binSize
---> 86                 self._unpackTimeBins(begin, end, tileSize, tStart)
     87             elif nSteps < self.binSize:
     88                 # case 2: Can be implemented as a special case of case 3.

~/miniconda3/envs/loihi_dev_env/lib/python3.5/site-packages/nxsdk/graph/nxtime.py in _unpackTimeBins(self, begin, end, tileSize, tStart)
    126         self.endTimeStampsPerTS = np.append(self.endTimeStampsPerTS,
    127                                             endTimeStamps)
--> 128         return endTimeStamps[-1]
    129 
    130     def _debugPrint(self):

IndexError: index -1 is out of bounds for axis 0 with size 0

Hm… I’ve not seen that error, but it definitely seems to me like a bug in NxSDK’s time probes. We have used the energy probes a fair bit, but not the time probes, so it’s possible that they’re not working for whatever reason.

If you’re interested in profiling your simulation, Nengo Loihi provides some built-in timing information that you can access with

print(sim.timers)

If that’s sufficient for what you’re doing, then you can remove the time probes and just use the NxSDK energy probes. If not, we might be able to add more things to the timers object, if you have something specific you want to time.