It is known that “a faster learning rate for PES results in over-fitting to the most recent online example, while a slower learning rate does not learn quickly enough.”
However, I have seen in several places (such as the adaptive control example) that it is sometimes preferred to decrease the learning rate while amplifying the post output.
For example, instead of doing
```python
pre = nengo.Ensemble(100, dimensions=1)
post = nengo.Node(size_in=1)
nengo.Connection(pre, post, learning_rule_type=nengo.PES(1e-4))
```
the learning rate is reduced by a factor of 10 and the output signal is amplified by a factor of 10:
```python
pre = nengo.Ensemble(100, dimensions=1)
# Note: a Node callable with size_in > 0 receives (t, x), not just x
post = nengo.Node(lambda t, x: 10 * x, size_in=1)
nengo.Connection(pre, post, learning_rule_type=nengo.PES(1e-5))
```
When comparing these two, the post signal looks very similar. Intuitively, they are exactly the same model: multiplying the output by x is equivalent to multiplying all the decoder weights by x, which in turn is equivalent to multiplying the learning rate (the weight updates) by x.
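To check that intuition, here is a minimal NumPy sketch (deliberately not using Nengo itself) of a PES-style decoder update driven by an error signal. The update rule `d -= kappa * e * a / n`, the fixed activity vector, and all names here are my own simplification of the learning dynamics, not Nengo's internals; under these assumptions, dividing the learning rate by 10 while multiplying the output gain by 10 gives an identical post trajectory:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
a = rng.uniform(0, 1, size=n)   # fixed neuron activities for one input
target = 0.7                    # desired post output

def run(kappa, gain, steps=200):
    """Simulate a PES-style decoder update with a gain applied after decoding."""
    d = np.zeros(n)             # decoders
    ys = []
    for _ in range(steps):
        y = gain * d.dot(a)     # post output (amplifier applied after decoding)
        e = y - target          # error signal fed to the learning rule
        d -= kappa * e * a / n  # PES-style update, scaled by the learning rate
        ys.append(y)
    return np.array(ys)

fast = run(kappa=0.5, gain=1.0)    # analogous to PES(1e-4) with no amplifier
slow = run(kappa=0.05, gain=10.0)  # analogous to PES(1e-5) with a 10x output

print(np.allclose(fast, slow))  # → True: the post signals match
```

Since the error is computed from the (amplified) output in both cases, the effective step on the output is `gain * kappa` in each, so the trajectories coincide up to floating point.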
What are the advantages of the second example? Why was the second option preferred over the first in the adaptive control example?