Thank you, Sean, for the feedback!
All I know about Nengo is what I read in the "How to Build a Brain" book
and the 25 tutorials provided. I have not explicitly studied reinforcement
learning in Nengo because my primary concern for the moment is whether it
is suitable for my purposes. As for the learning rule, until now I have
been using MSTDPET ("Reinforcement Learning Through Modulation of
Spike-Timing-Dependent Synaptic Plasticity", Florian 2007), although PES
also seemed appropriate.
To make it clear:
The agent-based model (ABM) will be a Java application with many agents
(e.g. 500) that are born, live and die throughout the simulation.
Each agent will have its own separate Nengo model, which has to be created
at its birth based on the parameters of its parents' models, accessed and
stored at each time step, and possibly discarded on death.
So, in terms of how Nengo will be called by the Java code, there will be
discrete functions executed at arbitrary time points:
Called on birth
Arguments: Model parameters
Return: Nengo model
Explanation: The Java function must call Nengo to build a model with the
specified parameters and store it in the most accessible way for the
That would be your
import nengo
# build a model of some sort
model = nengo.Network()
# make a simulator object
sim = nengo.Simulator(model)
Called each timestep
Arguments: Nengo model, Input from environment
Return: Updated Nengo model, Output to environment
Explanation: The Java function must call Nengo to load the stored model,
modify synapses based on prediction error, give the input, run the
simulator, retrieve the output and store the updated model.
That would be your
# get data from Java program somehow using inter-process communication
# maybe a pipe or a shared file?
data = get_data_from_java()  # placeholder, not a Nengo function
# increment a time-step in the model
sim.step()
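The per-timestep call described above could be sketched as follows. This is only a rough sketch under assumptions: the names `build_agent`, `step_agent` and `input_buf` are made up for illustration, and the tiny network (one input node, one ensemble, one probe) stands in for whatever the real agent model turns out to be. The idea is that the input node reads from a NumPy buffer that the caller overwrites before each `sim.step()`, and the output is the latest probed sample.

```python
import numpy as np


def build_agent(n_inputs, n_neurons=100, seed=None):
    """Build one agent's network and simulator (illustrative layout only)."""
    import nengo  # imported here so step_agent below has no hard dependency

    input_buf = np.zeros(n_inputs)  # the caller overwrites this before each step
    with nengo.Network(seed=seed) as model:
        # The node re-reads the buffer at every simulation step.
        stim = nengo.Node(lambda t: input_buf)
        ens = nengo.Ensemble(n_neurons, dimensions=n_inputs)
        nengo.Connection(stim, ens)
        probe = nengo.Probe(ens, synapse=0.01)
    sim = nengo.Simulator(model, progress_bar=False)
    return sim, input_buf, probe


def step_agent(sim, input_buf, probe, observation):
    """Feed one observation, advance one time step, return the newest output."""
    input_buf[:] = observation          # input from the environment
    sim.step()                          # advance the simulator by one dt
    return np.array(sim.data[probe][-1])  # most recent probed sample
```

A call would then look like `out = step_agent(sim, input_buf, probe, [0.3, -0.1])`. One thing to keep in mind: `sim.data[probe]` keeps every recorded sample, so memory use grows with the number of steps in a long-running simulation.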
I would imagine that the best way to achieve this is to never terminate the
simulators or store them in a file, but just keep 500(!) Nengo simulators
running in the background and access the right one each time an agent in
the Java application has to act. But I don't really know if that would be
possible.
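The "keep them all alive and pick the right one" idea can be sketched as a small registry keyed by agent id. This is a minimal sketch, not a tested design: `AgentPool` and its method names are hypothetical, and `make_sim` is assumed to be any function that turns model parameters into an object with a `step()` method (for Nengo, something that returns a `nengo.Simulator`). Whether 500 simulators actually fit in memory would depend on the size of each model.

```python
class AgentPool:
    """Keeps one live simulator per agent id (hypothetical helper)."""

    def __init__(self, make_sim):
        self._make_sim = make_sim  # callable: params -> object with .step()
        self._sims = {}

    def birth(self, agent_id, params):
        """Create and register a simulator for a newborn agent."""
        if agent_id in self._sims:
            raise KeyError(f"agent {agent_id!r} already exists")
        self._sims[agent_id] = self._make_sim(params)

    def step(self, agent_id, n_steps=1):
        """Advance one agent; all other simulators stay paused as they are."""
        sim = self._sims[agent_id]
        for _ in range(n_steps):
            sim.step()
        return sim

    def death(self, agent_id):
        """Drop the agent's simulator and release its resources if possible."""
        sim = self._sims.pop(agent_id)
        close = getattr(sim, "close", None)
        if close is not None:
            close()

    def __len__(self):
        return len(self._sims)
```

A simulator that is not being stepped simply sits in the dictionary with its state intact, so "pausing" and "resuming" an agent amounts to not calling, and then calling again, `pool.step(agent_id)`.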
So we need to answer the following questions:
- Is it possible to have so many models active independently at the same
time? Without the GUI, of course, just as processes.
- Is it possible to pause them and resume them at any time? I guess the
sim.step function does exactly that, right?
- Is it possible to retrieve data (output or parameters) at any time point
and then resume from where they stopped?
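For retrieving data from the Java side at arbitrary points, one possible approach is a line-based protocol over a pipe: the Java application writes one JSON request per line and reads one JSON reply per line. The sketch below is an assumption about how that could look, not an existing Nengo feature; the function name `serve`, the `"op"`/`"args"` message fields and the handler table are all invented for illustration.

```python
import json


def serve(handlers, in_stream, out_stream):
    """Read one JSON request per line, dispatch on "op", write one JSON reply."""
    for line in in_stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            request = json.loads(line)
            handler = handlers[request["op"]]
            result = handler(**request.get("args", {}))
            reply = {"ok": True, "result": result}
        except Exception as exc:  # report the failure instead of crashing
            reply = {"ok": False, "error": repr(exc)}
        out_stream.write(json.dumps(reply) + "\n")
        out_stream.flush()  # make the reply visible to the other process now
```

In a real setup this would be called as `serve(handlers, sys.stdin, sys.stdout)`, with the Python process started from Java (for example through `ProcessBuilder`) and handlers such as `"birth"`, `"step"` and `"death"` mapped to the corresponding Python functions. Results must be JSON-serialisable, so NumPy arrays would need `.tolist()` first.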
As for the model itself, I would appreciate any help, suggestions or ideas
on how to implement it in Nengo, as I am a total newbie to it.
I think that it would need:
- Associative memory component with ongoing learning (Odor vector to Taste
vector)
- Controlled Associative memory component with learning based on prediction
error (Taste vector and energy deficit to predicted reward/punishment
vector, modulated by prediction error)
- Decision-making component (Odor vector, Taste vector, Predicted
reward/punishment to goal-oriented actions[4&blank]) where actions are
feed, ascend odor gradient (follow the smell of rewarding food), descend
odor gradient (avoid the smell of predator), explore (try to find something
rewarding to do!), wait (no-goal)
- Controlled Motor component (action to move/turn right/turn left elementary
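As a starting point for the first component in the list above (Odor vector to Taste vector with ongoing learning), a PES-based sketch might look like this. It follows the usual Nengo pattern of a learned connection plus an error ensemble connected to `conn.learning_rule`; everything else (the function name `build_odor_taste_memory`, the buffers, the dimensions, neuron counts and learning rate) is an arbitrary assumption and would need tuning. It does not cover the modulated, decision-making or motor components.

```python
import numpy as np


def build_odor_taste_memory(dims=4, n_neurons=200, learning_rate=1e-4, seed=0):
    """Odor -> taste association learned online with PES (illustrative only)."""
    import nengo  # imported lazily; requires Nengo to be installed

    odor_buf = np.zeros(dims)   # current odor, written by the caller
    taste_buf = np.zeros(dims)  # taste actually experienced (teaching signal)

    with nengo.Network(seed=seed) as model:
        odor_in = nengo.Node(lambda t: odor_buf)
        taste_in = nengo.Node(lambda t: taste_buf)
        odor = nengo.Ensemble(n_neurons, dimensions=dims)
        predicted = nengo.Ensemble(n_neurons, dimensions=dims)
        error = nengo.Ensemble(n_neurons, dimensions=dims)

        nengo.Connection(odor_in, odor)
        # Learned connection: starts by predicting zeros, adapted by PES.
        conn = nengo.Connection(
            odor,
            predicted,
            function=lambda x: np.zeros(dims),
            learning_rule_type=nengo.PES(learning_rate=learning_rate),
        )
        # error = predicted taste - actual taste
        nengo.Connection(predicted, error)
        nengo.Connection(taste_in, error, transform=-1)
        nengo.Connection(error, conn.learning_rule)

        probe = nengo.Probe(predicted, synapse=0.01)
    return model, odor_buf, taste_buf, probe
```

The returned `model` can be passed to `nengo.Simulator(model)` and stepped one time step at a time, with `odor_buf` and `taste_buf` updated between steps and the prediction read from `sim.data[probe]`.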
Thanks in advance