As an in-class activity, we will replicate an experiment described in Section 9.2 (pages 226-231) of our text. We train an artificial neural network to recognize a set of 10x10-pixel images of numerals.
Training:
We are given a series of N patterns, q0, q1, ..., qN-1, with each
pattern being a vector of n values, each of which is either -1 or +1.
An n-node network is modeled as a complete graph with real-valued
weights set according to the following formula, for all 0 ≤ i < n and
0 ≤ j < n:

wij = (Σk qki qkj) / n

where k ranges over all patterns qk, 0 ≤ k < N, in the training set.
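As an illustration (not part of the provided code; the name
hebbian_weights and the use of NumPy are our own), this formula can be
computed with outer products:

  import numpy as np

  def hebbian_weights(patterns):
      """Illustrative sketch: w[i][j] = (sum over patterns k of qki*qkj) / n."""
      q = np.asarray(patterns)   # shape (N, n), entries -1 or +1
      n = q.shape[1]
      return q.T @ q / n         # entry (i, j) sums qki*qkj over k, divided by n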
Classification:
To perform a classification, we seek a steady state in which each node
of the network outputs a value xi (either -1 or +1) that is consistent
with the following rule:

xi = -1 if Σj≠i wij xj < 0
xi = +1 if Σj≠i wij xj ≥ 0

We seek this equilibrium with an iterative algorithm. If not currently
at equilibrium, we randomly select some node whose output violates the
rule and flip its value.
After reaching equilibrium, we check to see if we have reached one of the training patterns, or an unknown configuration.
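As a sketch of what that equilibrium test might look like (the function
name and representation here are our own, not part of the provided
code):

  def at_equilibrium(x, weights):
      """Sketch: True if every node's output already satisfies the update rule."""
      n = len(x)
      for i in range(n):
          total = sum(weights[i][j] * x[j] for j in range(n) if j != i)
          desired = -1 if total < 0 else +1
          if x[i] != desired:
              return False
      return True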
Experimental Setup:
We can train a network using a portion of the sample images. To be
consistent with the book, we start with pattern "1", then "2", and so
on, using "0" only as the tenth pattern, if desired. The user may
choose how many of those patterns to include in the training set (as
we will see, it is difficult to effectively differentiate between all
10 patterns when using only 100 pixels).
Once trained, we perform one or more tests. Each test consists of taking one of the original samples (a randomly chosen one, by default) and intentionally introducing noise by flipping each bit of that pattern independently with some probability p (e.g., p = 0.10). We then run the classification process and, when it concludes, determine whether it reached the original image for that numeral, some other numeral, or some equilibrium distinct from all samples in the training set.
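This perturbation is presumably handled by the provided driver, since
you implement only the two functions described below; as a sketch of
the idea (the name perturb is our own):

  def perturb(pattern, p, randgen):
      """Sketch: flip each bit independently with probability p."""
      return [-v if randgen.random() < p else v for v in pattern]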
Our software allows for an arbitrary number of such tests, and it reports the overall success rate, as well as a matrix showing how often each query numeral was (mis)classified as each possible result. For example, if training on 4 samples and using 30% noise, we get the following results for 1000 trials:
Overall success rate of 0.7080
         1     2     3     4  other
  1:   205     .     .     .    50
  2:     .   163     4     .    74
  3:     .     7   145     .   115
  4:     1     .     .   195    41
We see that we had the most trouble with the numeral "3",
occasionally classifying it as a "2", and many times reaching some
other steady state.
Unfortunately, we will see that this success falls apart when we add more patterns (and, in particular, with this specific collection of patterns). In fact, when we add in the numeral "5", the training results in a network for which even the unperturbed numerals "2", "3", and "5" are no longer steady states.
Software:
The necessary software can be found at
turing:/Public/goldwasser/362/hopfield/
or downloaded as the following zip file.
You are responsible for implementing the following two functions:
train(patterns, weights)
patterns is a sequence of patterns, with each pattern being a
linearized sequence of 100 values, each of which is either -1 or
+1.
weights is a 100 x 100 matrix of floating-point values (initially 0.0), representing the wij values. You are to modify that matrix of weights as the result of your training. (Note that you need not return anything from this function.)
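A minimal sketch of one way to implement train, following the weight
formula from the Training section (our own sketch, not the official
solution):

  def train(patterns, weights):
      """Sketch: accumulate wij = (sum over patterns of q[i]*q[j]) / n in place."""
      n = len(weights)                    # n = 100 for this assignment
      for q in patterns:
          for i in range(n):
              for j in range(n):
                  weights[i][j] += q[i] * q[j] / n
      # nothing is returned; the caller's weights matrix has been modified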
classify(query, weights, randgen, maxIterations)
query is the initial sample to classify, represented as a
linearized vector of 100 values, each -1 or +1.
weights is the same matrix that was set during training
randgen is an instance of Python's Random class that you should use for selecting random choices. Our reason for passing you this generator is that it is automatically seeded in accordance with the command-line options.
maxIterations is a maximum number of iterations after which you should stop the process. We expect the process to converge much more quickly, but we want some mechanism in place to avoid infinite oscillation.
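A minimal sketch of one way to implement classify, consistent with the
iterative process described earlier (our own sketch; the exact return
convention expected by the provided framework may differ):

  def classify(query, weights, randgen, maxIterations):
      """Sketch: repeatedly flip a randomly chosen inconsistent node."""
      x = list(query)
      n = len(x)
      for _ in range(maxIterations):
          # find every node whose output violates the update rule
          bad = []
          for i in range(n):
              total = sum(weights[i][j] * x[j] for j in range(n) if j != i)
              desired = -1 if total < 0 else +1
              if x[i] != desired:
                  bad.append(i)
          if not bad:
              break                       # equilibrium reached
          x[randgen.choice(bad)] *= -1    # use the provided generator
      return x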
Usage: hopfield.py [options]
Options:
-h, --help show this help message and exit
-a show all test patterns and exit
-s SEED seed for all randomization [default: clock]
Experiment Options:
-n PATTERNS number of patterns to use in training [default: 4]
-p PROB Probability of perturbing each bit in the test pattern [default: 0.1]
-r REPS Number of independent tests to perform [default: 1]
-f NUMERAL force numeral to choose as basis for test query [default: random]
-m ITERATIONS maximum number of iterations to perform per query [default: 10000]
Display Options:
-t STEPS trace status every t steps (no trace if 0) [default: 0]
-d DELAY per step delay for trace; manual if 0 [default: 0.001]
-v visualize trace [default: False]
-w WIDTH width of window for visualization [default: 200]
-q no console output (other than statistics) [default: False]
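For example, the experiment reported above (4 patterns, 30% noise, 1000 trials) would correspond to a command along these lines:

  python hopfield.py -n 4 -p 0.3 -r 1000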