Nengo RL performance with standard OpenAI cart pole problem

simondlevy · June 26, 2019, 11:07am

Hi guys,

I hope it’s okay for me to cross-post this from my original post on the RL discussion thread, because that thread seems to been dormant. Having just noticed the activity on this new thread, I figured it might make more sense to post here.

So: I’d like to use RL in Nengo to continue the project I started at last summer’s Brain Camp, where Terry helped me use Nengo to build a simple PID-controller plugin for a quadcopter flight simulator in UnrealEngine4. Since then I’ve replaced the built-in UE4 physics with a dynamics model taken from the literature, which I’ve also translated to Python to enable me to work directly with Nengo. The dynamics model is very simple: you input the four (or six or however many) motor values, and you get back the vehicle state in a form similar to what you’d get from sensors (IMU, altimeter, etc.). The idea would be to train up a Nengo-based RL network using this dynamic model and a simple reward metric (perhaps distance covered within some altitude band, to avoid ballistic maneuvers). Then the Nengo controller would be run directly in the actual simulator, using a general Python plugin I’m working on, for a live demo of Nengo flying the quadcopter. The ultimate goal would of course be using Nengo (running on an ODROID, Jetson, or whatever) to fly a real 'copter.

I’ve been reading through Daniel’s papers to get a general sense of how to approach RL with Nengo, but if someone can point me to one or more concrete examples that I can play with, that’d be a big help.

Thanks,
Simon