Vanilla Nengo uses NumPy to perform all of the neural computations of the network, and if you run the simulations independently (i.e., spawning a separate Python process for each simulation), they shouldn’t affect each other too much. How much they do affect each other depends on the CPU architecture you are using and the CPU scheduler the OS has implemented.
Testing your code on my machine (AMD Ryzen 9 5950X: 16 cores, 32 threads) running Windows, I get the following results:
- 1 simulation: 2:35
- 2 simulations: minimum - 2:36, maximum - 2:38, average 2:37
- 4 simulations: minimum - 2:40, maximum - 2:42, average 2:41
- 6 simulations: minimum - 2:40, maximum - 2:46, average 2:43.5
Looking at the results above, there is an increase in the time it takes to complete the simulations, but it is not a large one, and it is roughly linear in the number of simulations being run. Watching the processor usage (in the Windows Task Manager), I can see the CPU cores constantly being switched in and out of running the processes, so I surmise that the additional parallel simulations add extra overhead as more processes are swapped around the cores.
However, I definitely would not classify these results as limiting the simulations to only a small fraction of the CPU cores. On my system, with my OS, running multiple parallel simulations will definitely slow all of them down, but not by much (roughly a 1% slowdown per added simulation).
I also performed the same experiment on our compute cluster. We use SLURM to partition the available CPU cores, and unlike Windows, processes are pinned to specific cores while they are running (a process doesn’t get swapped around to other CPU cores). Here are the results from this experiment:
- 1 simulation: 2:24
- 2 simulations: min: 2:24, max: 2:27, avg: 2:25.5
- 4 simulations: min: 2:24, max: 2:26, avg: 2:25
- 8 simulations: min: 2:26, max: 2:34, avg: 2:27.75
- 12 simulations: min: 2:28, max: 2:37, avg: 2:31.75
Once again, while there is some increase in simulation time, running more parallel simulations still nets you more results than running the simulations sequentially: going from 1 to 12 parallel simulations adds only about 8 s to the average runtime. Some increase is expected, since the simulations contend for more than just CPU cores; they also share system memory and the CPU caches. More parallel processes means more overhead when accessing these shared resources, which inevitably slows all of the processes down, but not by much.
And, as before, I wouldn’t consider this a situation where only a small fraction of the CPU cores are usable.
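If you want to reproduce the SLURM-style core pinning yourself, on Linux you can restrict a process to specific cores with `os.sched_setaffinity`. This is a sketch under the assumption you are on Linux (the call does not exist on Windows or macOS, hence the `hasattr` guard so it degrades gracefully elsewhere):

```python
import os


def pin_to_cores(cores):
    """Restrict the current process to the given set of CPU core IDs.

    This mirrors what SLURM does with its cpusets: once pinned, the
    scheduler will not migrate the process to any core outside `cores`.
    Returns the resulting affinity set, or None if unsupported.
    """
    if hasattr(os, "sched_setaffinity"):  # Linux-only API
        os.sched_setaffinity(0, set(cores))  # pid 0 = current process
        return os.sched_getaffinity(0)
    return None  # affinity control not available on this platform


# Pin this process to core 0 only (hypothetical choice for the demo).
pinned = pin_to_cores({0})
print(pinned)
```

Pinning each simulation process to its own core like this removes the core-swapping overhead I observed under Windows.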
Note that all of these experiments were performed using the code you provided, without any modifications, on the latest version of Nengo.
If you are still experiencing the issue you describe, can you post some runtime data (like I did above), as well as the CPU architecture you are testing on?