Scaling up similarity

Although I’m still not totally sure what you’re going for here and still think a video call would be a good idea, looking at your code, I would like to note this is how you make a cosine similarity network in Nengo:

prod = nengo.Product()
dot_output = nengo.Node(size_in=1)
nengo.Connection(self.product.output, self.output, transform=np.ones((1, dimensions)))

It’s anything but self-explanatory. Where did self.product.output and self.output come from? And what does np.ones((1, dimensions)) do? Plus, what the heck are you doing dimensioning a connection? Obviously I haven’t read enough of the documentation; can you point out which parts of the documentation cover this stuff?
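For what it's worth, the transform=np.ones((1, dimensions)) part can be read without any Nengo at all. A connection's transform is a matrix that multiplies the signal passing through it, so a (1, dimensions) matrix of ones turns a dimensions-long vector into a single number by summing its entries. Assuming the product stage outputs the element-wise products a[i] * b[i], that sum is the dot product. Here is a minimal NumPy sketch of just that arithmetic; the vectors a and b are made up for illustration and no neurons are simulated:

```python
import numpy as np

dimensions = 3
a = np.array([1.0, 2.0, 3.0])   # hypothetical first input
b = np.array([0.5, -1.0, 2.0])  # hypothetical second input

# What an element-wise product stage would output: one value per dimension.
elementwise = a * b

# The transform from the snippet: a 1 x dimensions matrix of ones.
transform = np.ones((1, dimensions))

# Applying the transform sums the element-wise products into one number.
dot_output = transform @ elementwise   # shape (1,)

# Dividing by the vector lengths turns the dot product into cosine similarity.
cosine = dot_output[0] / (np.linalg.norm(a) * np.linalg.norm(b))
```

So "dimensioning a connection" here just means the connection carries a dimensions-wide signal in and a one-dimensional signal out.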

Is this all supposed to work in a single def statement?

Colour me confused.

I thought I would take a moment and explain the theory behind what I am trying to do.

Essentially, it all has to do with information entropy.

As we learn, we integrate information; as we integrate information, it becomes more entropic.

The Maximum Entropy Principle, as I understand it, states that the most entropy is in the best-learned data. In other words, maximum entropy exists where information integration is most complete.

mEPLA, my theory, takes this into account and turns it around to find new opportunities for learning. Essentially, it says that minimum entropy is found in areas where information integration is less complete. By extension, this means that by allocating learning resources to areas of low entropy we can find opportunities to learn and better integrate information.

Similarity is an approximation of entropy: the more similar a thing is, the more integrated its information is, and the less learning can be achieved.

What I think I am doing is proving that similarity detection is a cheap way to detect learning opportunities. By inverting the logic, I am converting what I think is a natural similarity detector into a novelty detector, which can detect learning opportunities and, by applying learning resources to them, increase the information integration of the whole brain. I am just doing it almost as soon as the information gets to the cortex.

Theoretically novelty is an indication of a learning opportunity.

But to monitor entropy I need to operate in bulk, comparing the entropy of each input with all the others around it. I can then relatively easily gate the lowest-entropy locations, so that the lowest entropy triggers the gating of the outputs. This is probably an early example of salience.
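The gating step described above could be sketched like this, assuming each input already has a single similarity score standing in for its entropy. The scores and signal values are invented, and a hard argmin is only one possible gating rule:

```python
import numpy as np

# Hypothetical similarity score per input (higher = more like its neighbours).
similarity = np.array([0.92, 0.35, 0.88])

# Hypothetical signal carried by each input, one row per input.
signals = np.array([[0.1, 0.2],
                    [0.7, -0.4],
                    [0.0, 0.3]])

# The least similar input is treated as the most novel one.
winner = int(np.argmin(similarity))

# Gate: pass the winner's signal through and zero out the rest.
gate = np.zeros(len(similarity))
gate[winner] = 1.0
gated = signals * gate[:, None]
```

Only the row belonging to the lowest-scoring input survives in gated; everything else is suppressed.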

OK, I was disappointed with cosine similarity; it doesn’t do what I want for bulk similarity detection. I stumbled across a possible way of getting around the range problem. In essence, as the matrix gets more complex, the results get out of range: the more complex the matrix, the further the results fall outside the range the neurons can express. This is probably because it depends on a product. What I need is a way of taking the bulk similarity of a specific row in the matrix. The solution might be taking a partial derivative of the cosine similarity of the whole matrix. Apparently it can be tuned to pick out the relative similarity of any particular row in the matrix.
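One possible reading of the range problem, offered as a guess rather than a diagnosis: a raw dot product grows with the number of dimensions, while cosine similarity divides by the vector lengths and so stays within [-1, 1]. The sketch below uses all-ones vectors purely for illustration and involves no neurons:

```python
import numpy as np

def raw_dot(a, b):
    """Unnormalised dot product; grows with vector length and dimension."""
    return float(np.dot(a, b))

def cosine_similarity(a, b):
    """Dot product divided by both vector norms; bounded in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare the two measures as the dimensionality grows.
for d in (4, 16, 64):
    v = np.ones(d)
    print(d, raw_dot(v, v), cosine_similarity(v, v))
```

If the out-of-range values come from the unnormalised product, keeping the inputs normalised (or scaling the product by the expected magnitude) would be one thing to try before reaching for derivatives.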

Obviously it is not as simple as taking the partial derivative of a cosine similarity. While a derivative reduces the complexity of the equation slightly, the matrix created by taking the derivative of a 2D matrix becomes a 3D matrix. Obviously my math is not up to this. I’ll come back once I have a handle on what I am trying to do mathematically.

I remembered reading that entropy is an average, and taking the average over the permutations of possible vectors seems to be a superior way to achieve bulk similarity. I have rewritten the script with this in mind. I am still sticking to three inputs to prove the assumption, but it seems to work. The problem now is that the significance of the gating signal is diluted as the number of inputs increases.
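A minimal sketch of the averaging idea, under assumptions that may not match the rewritten script: each input is a row vector, its bulk similarity is the mean cosine similarity to every other row, and novelty is one minus that mean. The function name bulk_similarity and the three example inputs are hypothetical:

```python
import numpy as np

def bulk_similarity(inputs):
    """Mean cosine similarity of each row to all the other rows."""
    unit = inputs / np.linalg.norm(inputs, axis=1, keepdims=True)
    cos = unit @ unit.T                  # all pairwise cosine similarities
    n = len(inputs)
    # Drop the self-similarity on the diagonal (always 1) before averaging.
    return (cos.sum(axis=1) - np.diag(cos)) / (n - 1)

inputs = np.array([[1.0, 0.0],
                   [0.9, 0.1],
                   [0.0, 1.0]])   # third row is the odd one out

scores = bulk_similarity(inputs)
novelty = 1.0 - scores            # low similarity reads as high novelty
most_novel = int(np.argmax(novelty))
```

With N inputs, each pairwise term carries a weight of 1/(N-1) in the average, so a single unusual neighbour moves the score less as N grows. That would be consistent with the dilution described above.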