When you bind vector A with vector B, you are convolving the two, a process that involves multiplying their elements. As you keep binding more concepts, A to B to C to D…, the elements of your vector could conceivably grow larger and larger (unless some are fractional or zero).
In the real world, neurons have maximum firing rates, and post-synaptic potentials have maximums too. So I was wondering whether there is a ‘saturation’ of the elements in these vectors, i.e. they can’t go above a certain maximum. Secondly, is there a normalization of the vectors, i.e. the total magnitude of a vector can’t go above some value?
My second question is: what was the rationale for selecting convolution? I understand that it has three desirable properties:
- it produces a product that is far away from each of its constituents
- if a vector representing ‘pink’ is similar to a vector representing ‘red’, and ‘green’ is less similar, then ‘red’ convolved with vector A will be more similar to ‘pink’ convolved with vector A than ‘green’ convolved with vector A is
- You can invert vector ‘pink’ and convolve it with ‘pink-A’ to get just ‘A’ back again.
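These three properties can be sketched with plain NumPy (a toy illustration with random vectors, not SPA/Nengo code; the `approx_inverse` is the approximate inverse used in Holographic Reduced Representations):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # dimensionality; SPA models typically use a few hundred

def unit(v):
    return v / np.linalg.norm(v)

def sim(x, y):
    # cosine similarity
    return float(np.dot(unit(x), unit(y)))

def cconv(a, b):
    # circular convolution, computed via the FFT
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def approx_inverse(a):
    # the HRR approximate inverse: keep element 0, reverse the rest
    return np.concatenate(([a[0]], a[:0:-1]))

A = unit(rng.standard_normal(d))
pink = unit(rng.standard_normal(d))
red = unit(0.7 * pink + 0.3 * unit(rng.standard_normal(d)))  # similar to pink
green = unit(rng.standard_normal(d))                          # dissimilar to pink

bound = cconv(pink, A)

# 1) the bound vector is far from both constituents
print(sim(bound, pink), sim(bound, A))  # both near 0

# 2) binding with the same vector roughly preserves similarity
sim_red = sim(cconv(red, A), cconv(pink, A))
sim_green = sim(cconv(green, A), cconv(pink, A))
print(sim_red, sim_green)  # sim_red is much higher

# 3) unbinding with the approximate inverse recovers A (noisily)
sim_recovered = sim(cconv(approx_inverse(pink), bound), A)
print(sim_recovered)  # well above chance
```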
So these are all advantages.
But is there any theoretical (not just practical) rationale for choosing ‘convolution’?
More SPA-knowledgeable folks can give a better answer than me, but I’ll try to respond with a few references you can look into in the meantime!
The short answer to your question is that the vectors used in SPA are usually unit vectors so that we don’t run into the problems that you point out here. The theory behind the vector manipulations used by SPA is described in Tony Plate’s Holographic Reduced Representations architecture, which is detailed in his publications.
On the neural side, the neurons we typically use in SPA networks do saturate at a certain firing rate, so we can only represent vectors in a certain range of magnitudes. @jgosmann has done a lot of work looking into how we can best use the representational range of spiking neurons to do the vector manipulations used by SPA; see this paper for details.
This is only partially correct. Even with unit vectors, the vector representing a set of bound concepts will grow larger as more concepts are added. This can be prevented by introducing normalization, but that makes binding with circular convolution a lossy operation. (As you note, neurons saturate and will thus introduce some normalization-like behaviour.) This might make it necessary to use clean-up memories.
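A rough NumPy illustration of both points (toy vectors, not a Nengo model): a memory trace summing n bound pairs grows in norm roughly like √n, and unbinding from the (normalized) trace is only approximate, degrading as more pairs are stored, which is why a clean-up memory can become necessary:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 512

def unit(v):
    return v / np.linalg.norm(v)

def cconv(a, b):
    # circular convolution via the FFT
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def approx_inverse(a):
    # HRR approximate inverse: keep element 0, reverse the rest
    return np.concatenate(([a[0]], a[:0:-1]))

norms, sims = [], []
for n in (1, 5, 20):
    pairs = [(unit(rng.standard_normal(d)), unit(rng.standard_normal(d)))
             for _ in range(n)]
    # superimpose n bound pairs into one memory trace of unit vectors
    trace = np.sum([cconv(a, b) for a, b in pairs], axis=0)
    norms.append(float(np.linalg.norm(trace)))
    # unbind the first pair from the normalized trace; this is lossy
    recovered = cconv(approx_inverse(pairs[0][0]), unit(trace))
    sims.append(float(np.dot(unit(recovered), pairs[0][1])))

print(norms)  # grows roughly like sqrt(n)
print(sims)   # retrieval quality degrades with n
```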
Another way to circumvent this problem is to use unitary vectors (note that all unitary vectors are unit vectors, but not all unit vectors are unitary). A unitary vector preserves the length of the other vector under circular convolution. The terminology is derived from unitary transforms, because circular convolution with such a vector can be shown to be a unitary transform. I would recommend using unitary vectors where required, but only there. The number of unitary vectors is significantly smaller than the number of general semantic pointers that fit into a d-dimensional space. Where unitary vectors are required depends on how the representations are constructed, but usually only a few vectors need to be unitary.
The `nengo.spa.pointer.SemanticPointer` class has a `make_unitary` method to generate unitary pointers. It normalizes all the complex Fourier coefficients of the vector, which preserves the phase shifts done by the circular convolution but eliminates all scaling of the Fourier coefficients, and thus preserves vector lengths.
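The idea behind that method can be sketched in plain NumPy (an illustration of the math, not the actual Nengo implementation): normalize every Fourier coefficient to magnitude 1, so that binding with the resulting vector leaves lengths unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 512

def unit(v):
    return v / np.linalg.norm(v)

def cconv(a, b):
    # circular convolution via the FFT
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def make_unitary(v):
    # normalize every Fourier coefficient to magnitude 1: the phases
    # (and hence the shifts done by circular convolution) are kept,
    # all scaling of the coefficients is removed
    X = np.fft.fft(v)
    return np.fft.ifft(X / np.abs(X)).real

u = make_unitary(rng.standard_normal(d))
v = unit(rng.standard_normal(d))
print(np.linalg.norm(u))            # 1.0: unitary vectors are unit vectors
print(np.linalg.norm(cconv(u, v)))  # 1.0: binding with u preserves length
```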
Regarding the second question: there are other operations that could be used for binding (e.g. permutation) that we probably should look into more, but haven’t yet. Other reasons to use circular convolution: it is fairly straightforward to implement with the NEF/Nengo (once you figure out all the required transformation matrices), and we have some intuition about what circular convolution does to the vectors (it’s somewhat like a rotation).
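For a sense of the permutation alternative, here is a minimal NumPy sketch (my own illustration, not an SPA implementation): binding by a fixed random permutation is exactly invertible and norm-preserving, while still producing a vector far from its input.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 512

def unit(v):
    return v / np.linalg.norm(v)

a = unit(rng.standard_normal(d))

# binding by a fixed random permutation of the elements
perm = rng.permutation(d)
inv_perm = np.argsort(perm)  # the inverse permutation

bound = a[perm]
print(np.allclose(bound[inv_perm], a))  # True: unbinding is exact
print(np.dot(bound, a))                 # near 0: bound is far from a
```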
I have to correct myself too. I looked into this matter again today and it occurred to me that I based the statement that the number of unitary vectors is significantly smaller than the number of general semantic pointers on assumptions that are most likely wrong. In fact, a space might fit the same number of Semantic Pointers regardless of whether the vectors are unitary or not. (I don’t have a proof or derivation of that yet, and intuitions don’t always work well in high-dimensional spaces, so take this with a grain of salt. But matters are more complicated than I initially thought.)