In INNE we have implemented three kinds of unsupervised learning algorithms: Hebbian models with different learning rules, simple competitive learning and the Kohonen feature map.
The panel to drive the simulation will open when the item HEBBIAN MODEL in the MODEL sub-menu is chosen.
Hebb's model: an overview
The Hebbian unsupervised algorithms are based on a law formulated by Donald Hebb [Hebb 1949] in 1949, who studied the cellular modifications that occur in animals during learning. He observed that the connection between two neurons is strengthened when both neurons are active simultaneously.
The networks are linear and a typical architecture is shown in the figure. The i-th output $O_i$ is given by $O_i = \sum_j w_{ij}\,\xi_j$, where $\xi_j$ is the j-th component of the input pattern and $w_{ij}$ is the weight from input j to output unit i.
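As a concrete illustration (not part of INNE: the array names W and xi and the sizes used here are chosen only for the example), the forward pass of such a linear network can be sketched in a few lines of Python/NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 8, 3                                # number of inputs and outputs (arbitrary sizes)
W = rng.normal(scale=0.1, size=(M, N))     # weight matrix, one row per output unit
xi = rng.normal(size=N)                    # one input pattern

O = W @ xi                                 # O_i = sum_j w_ij * xi_j
print(O)
```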
The learning rules that we have implemented in the Hebbian module are the plain Hebbian rule with an explicit renormalization of the weights, Oja's one-unit rule, Oja's M-unit rule and Sanger's rule (the Generalized Hebbian Algorithm); they are described in turn below.
The plain Hebbian rule, $\Delta w_j = \eta\, O\, \xi_j$ (with $\eta$ the learning rate and $O$ the output of a single linear unit), strengthens the output in turn for each input presented, so frequent input patterns have the most influence in the long run and come to produce the largest output. But there is a problem: the weights keep growing without bound and learning never stops; in fact, the direction associated with the largest eigenvalue $\lambda_{\max}$ of $C$ (the input correlation matrix) eventually becomes dominant, so that $\mathbf{w}$ gradually approaches an eigenvector corresponding to $\lambda_{\max}$, with an ever-increasing norm.
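A minimal sketch of this behaviour, written in Python/NumPy purely for illustration (the learning rate, the number of steps and the synthetic correlated inputs are our own choices, not INNE's), is:

```python
import numpy as np

rng = np.random.default_rng(0)
N, eta, steps = 4, 0.01, 500
A = rng.normal(size=(N, N))        # mixing matrix: inputs xi = A z have correlation C = A A^T
w = rng.normal(scale=0.1, size=N)

for t in range(steps):
    xi = A @ rng.normal(size=N)    # a zero-mean, correlated input pattern
    O = w @ xi                     # linear output
    w += eta * O * xi              # plain Hebbian update: dw_j = eta * O * xi_j

C = A @ A.T
lam, V = np.linalg.eigh(C)
print(np.linalg.norm(w))                       # keeps growing as the number of steps increases
print(abs(V[:, -1] @ w) / np.linalg.norm(w))   # ~1: w aligns with the top eigenvector of C
```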
We have prevented the divergence of this rule by constraining the growth of the weight vector $\mathbf{w}$: we use a simple renormalization $w_j' = w_j/|\mathbf{w}|$ of all the weights after each update.
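The same loop with the explicit renormalization added, again as an illustrative sketch rather than INNE's actual code:

```python
import numpy as np

rng = np.random.default_rng(1)
N, eta = 4, 0.01
A = rng.normal(size=(N, N))
w = rng.normal(size=N)
w /= np.linalg.norm(w)

for t in range(2000):
    xi = A @ rng.normal(size=N)
    O = w @ xi
    w += eta * O * xi              # plain Hebbian step
    w /= np.linalg.norm(w)         # renormalization w_j' = w_j / |w| after each update

print(np.linalg.norm(w))           # stays at 1 by construction
```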
In 1982 Oja suggested a cleverer approach: he showed that it is possible to make the weight vector approach a constant length without performing any explicit normalization.
Oja's rule corresponds to adding to the plain Hebbian rule a weight decay proportional to $O^2$:
$\Delta w_j = \eta\, O\, (\xi_j - O\, w_j)$.
Oja's rule converges to a weight vector $\mathbf{w}$ with the following properties: it has unit length ($|\mathbf{w}| = 1$), it lies along the eigenvector of $C$ with the largest eigenvalue, and it therefore maximizes the variance $\langle O^2 \rangle$ of the output.
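An illustrative sketch of Oja's one-unit rule on synthetic data (the learning rate and the data below are arbitrary choices, not taken from INNE):

```python
import numpy as np

rng = np.random.default_rng(2)
N, eta = 4, 0.005
A = rng.normal(size=(N, N))
C = A @ A.T                          # input correlation matrix <xi xi^T>
w = rng.normal(scale=0.1, size=N)

for t in range(10000):
    xi = A @ rng.normal(size=N)
    O = w @ xi
    w += eta * O * (xi - O * w)      # Oja's rule: Hebbian term plus decay proportional to O^2

lam, V = np.linalg.eigh(C)
print(np.linalg.norm(w))             # ~1 without any explicit renormalization
print(abs(V[:, -1] @ w))             # ~1: w points along the eigenvector of C with largest eigenvalue
```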
This rule is useful for a one-output network; it would however be desirable to have an M-output network that extracts the first M principal components. Oja and Sanger [Sanger 1989] have both designed one-layer feed-forward networks that do this; Oja's M-unit rule is
$\Delta w_{ij} = \eta\, O_i\, (\xi_j - \sum_{k} O_k\, w_{kj})$.
With Oja's M-unit rule the M weight vectors converge so as to span the same subspace as the first M eigenvectors of $C$, but they do not find the eigenvector directions themselves. The rule gives weight vectors that differ from trial to trial while still spanning the right subspace; they depend on the initial conditions and on the particular data samples seen during learning. On average the variance of each output is the same, which may be useful in applications where one wants to keep the information spread uniformly across the units. Furthermore, if any algorithm of this sort is implemented in real brains, it would look more like Oja's rule than Sanger's.
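A sketch of Oja's M-unit rule on synthetic data, showing that the rows of W span the leading subspace without being the individual eigenvectors (again, all parameters are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, eta = 6, 2, 0.005
A = rng.normal(size=(N, N))
C = A @ A.T
W = rng.normal(scale=0.1, size=(M, N))

for t in range(20000):
    xi = A @ rng.normal(size=N)
    O = W @ xi
    # Oja's M-unit rule: dW_ij = eta * O_i * (xi_j - sum_k O_k W_kj)
    W += eta * np.outer(O, xi - O @ W)

lam, V = np.linalg.eigh(C)
top = V[:, -M:]                                  # eigenvectors of the M largest eigenvalues
# Each row of W lies almost entirely in the span of those eigenvectors ...
print(np.linalg.norm(W @ top, axis=1) / np.linalg.norm(W, axis=1))   # ~[1, 1]
# ... but the individual rows are generally not the eigenvectors themselves
print(np.abs(W @ top))
```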
Sanger's rule extracts the principal components in order; it performs exactly the Karhunen-Loève transform. The Generalized Hebbian Algorithm was designed by combining Oja's one-unit rule with the Gram-Schmidt orthogonalization process:
$\Delta w_{ij} = \eta\, O_i\, (\xi_j - \sum_{k \le i} O_k\, w_{kj})$.
Sanger [Sanger 1989] has proved the following:
Theorem: If $W$ (the weight matrix) is assigned random weights at time zero and the learning rate is suitably decreased over time, then with probability 1 the Generalized Hebbian Algorithm will converge, and $W$ will approach the matrix whose rows are the first M eigenvectors of the input correlation matrix $C$, ordered by decreasing eigenvalue.
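The following sketch illustrates the theorem numerically on synthetic data; for simplicity it uses a small constant learning rate rather than one decreasing to zero, so the convergence is only approximate, and all parameters are illustrative rather than taken from INNE:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, eta = 6, 3, 0.002
A = rng.normal(size=(N, N))
C = A @ A.T
W = rng.normal(scale=0.1, size=(M, N))

for t in range(40000):
    xi = A @ rng.normal(size=N)
    O = W @ xi
    # Sanger's rule (GHA): dW_ij = eta * O_i * (xi_j - sum_{k <= i} O_k W_kj)
    W += eta * (np.outer(O, xi) - np.tril(np.outer(O, O)) @ W)

lam, V = np.linalg.eigh(C)
top = V[:, ::-1][:, :M].T                 # first M eigenvectors of C, largest eigenvalue first
print(np.abs(np.sum(W * top, axis=1)))    # each ~1: row i of W matches the i-th eigenvector (up to sign)
```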
An example of an application to image compression
The panel to drive the simulation, with its option panel allowing the user to select the learning rule.
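As a rough sketch of the idea behind the image-compression example (a synthetic image, an 8x8 block size and M = 8 retained coefficients are assumed here only for illustration; they are not INNE's settings), one can train Sanger's rule on image blocks and reconstruct each block from its M outputs:

```python
import numpy as np

rng = np.random.default_rng(5)

# A synthetic 64x64 "image" (smooth gradient plus noise) stands in for a real picture
x = np.linspace(0.0, 1.0, 64)
image = np.outer(x, x) + 0.05 * rng.normal(size=(64, 64))

# Cut the image into non-overlapping 8x8 blocks, flattened to 64-dimensional inputs
blocks = image.reshape(8, 8, 8, 8).swapaxes(1, 2).reshape(-1, 64)
blocks = blocks - blocks.mean(axis=0)          # zero-mean inputs

M, eta = 8, 0.01
W = rng.normal(scale=0.01, size=(M, 64))       # M output units, 64 inputs

for epoch in range(200):
    for xi in blocks[rng.permutation(len(blocks))]:
        O = W @ xi
        W += eta * (np.outer(O, xi) - np.tril(np.outer(O, O)) @ W)   # Sanger's rule

# Compression: keep only the M outputs per block; reconstruction: xi_hat = W^T O
codes = blocks @ W.T
recon = codes @ W
err = np.mean((blocks - recon) ** 2) / np.mean(blocks ** 2)
print(f"relative reconstruction error with {M} of 64 coefficients: {err:.3f}")
```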