As an in-class activity, we will replicate an experiment described in Section 9.2 (pages 226-231) of our text. We train an artificial neural network to recognize a set of 10x10-pixel images of numerals.
Training:
We are given a series of N patterns, q0, q1, ..., qN-1, with each
pattern being a vector of n values, each of which is either -1 or +1.
An n-node network is modeled as a complete graph with real-valued
weights set according to the following formula, for all 0 ≤ i < n and
0 ≤ j < n:

wij = (Σk qki qkj) / n

where k ranges over all patterns qk, 0 ≤ k < N, in the training set.
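As an illustration (not part of the provided code; the name
hebbian_weights and the use of NumPy are our own), this formula can be
computed with outer products:

  import numpy as np

  def hebbian_weights(patterns):
      """Illustrative sketch: w[i][j] = (sum over patterns k of qki*qkj) / n."""
      q = np.asarray(patterns)   # shape (N, n), entries -1 or +1
      n = q.shape[1]
      return q.T @ q / n         # entry (i, j) sums qki*qkj over k, divided by n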
Classification:
To perform a classification, we seek a steady state in which each node
of the network outputs a value xi (either -1 or +1) that is consistent
with the following rule:

xi = -1 if Σj≠i wij xj < 0
xi = +1 if Σj≠i wij xj ≥ 0

We seek this equilibrium with an iterative algorithm. If not currently
at equilibrium, we randomly select some node whose output violates the
rule and flip its value.
After reaching equilibrium, we check to see if we have reached one of the training patterns, or an unknown configuration.
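As a sketch of what that equilibrium test might look like (the function
name and representation here are our own, not part of the provided
code):

  def at_equilibrium(x, weights):
      """Sketch: True if every node's output already satisfies the update rule."""
      n = len(x)
      for i in range(n):
          total = sum(weights[i][j] * x[j] for j in range(n) if j != i)
          desired = -1 if total < 0 else +1
          if x[i] != desired:
              return False
      return True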
Experimental Setup:
We can train a network using a portion of the sample images. To be
consistent with the book, we start with pattern "1", then "2", and so
on, using "0" only as the tenth pattern, if desired. The user may
choose how many of those patterns to include in the training set (as
we will see, it is difficult to effectively differentiate between all
10 patterns when using only 100 pixels).
Once trained, we perform one or more tests. Each test consists of taking one of the original samples (a randomly chosen one, by default) and intentionally introducing noise by flipping each bit of that pattern independently with some probability p (e.g., p = 0.10). We then run the classification process and, when it concludes, determine whether it reached the original image for that numeral, some other numeral, or some equilibrium distinct from all samples in the training set.
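This perturbation is presumably handled by the provided driver, since
you implement only the two functions described below; as a sketch of
the idea (the name perturb is our own):

  def perturb(pattern, p, randgen):
      """Sketch: flip each bit independently with probability p."""
      return [-v if randgen.random() < p else v for v in pattern]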
Our software allows for an arbitrary number of such tests, and it reports the overall success rate, as well as a matrix showing how often each query numeral was (mis)classified as each possible result. For example, if training on 4 samples and using 30% noise, we get the following results for 1000 trials:
Overall success rate of 0.7080
         1     2     3     4  other
  1:   205     .     .     .    50
  2:     .   163     4     .    74
  3:     .     7   145     .   115
  4:     1     .     .   195    41
We see that we had the most trouble with the numeral "3",
occasionally classifying it as a "2", and many times reaching some
other steady state.
Unfortunately, we will see that this success falls apart when we add more patterns (and, in particular, with this specific collection of patterns). In fact, when we add in the numeral "5", the training results in a network for which even the unperturbed numerals "2", "3", and "5" are no longer steady states.
Software:
The necessary software can be found at
turing:/Public/goldwasser/362/hopfield/
or downloaded as the following zip file.
You are responsible for implementing the following two functions:
train(patterns, weights)
patterns is a sequence of patterns, with each pattern being a
linearized sequence of 100 values, each of which is either -1 or
+1.
weights is a 100 x 100 matrix of floating-point values (initially 0.0), representing the wij values. You are to modify that matrix of weights as the result of your training. (Note that you need not return anything from this function.)
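A minimal sketch of one way to implement train, following the weight
formula from the Training section (our own sketch, not the official
solution):

  def train(patterns, weights):
      """Sketch: accumulate wij = (sum over patterns of q[i]*q[j]) / n in place."""
      n = len(weights)                    # n = 100 for this assignment
      for q in patterns:
          for i in range(n):
              for j in range(n):
                  weights[i][j] += q[i] * q[j] / n
      # nothing is returned; the caller's weights matrix has been modified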
classify(query, weights, randgen, maxIterations)
query is the initial sample to classify, represented as a
linearized vector of 100 values, each -1 or +1.
weights is the same matrix that was set during training
randgen is an instance of Python's Random class that you should use for selecting random choices. Our reason for passing you this generator is that it is automatically seeded in accordance with the command-line options.
maxIterations is a maximum number of iterations after which you should stop the process. We expect the process to converge much more quickly, but we want some mechanism in place to avoid infinite oscillation.
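A minimal sketch of one way to implement classify, consistent with the
iterative process described earlier (our own sketch; the exact return
convention expected by the provided framework may differ):

  def classify(query, weights, randgen, maxIterations):
      """Sketch: repeatedly flip a randomly chosen inconsistent node."""
      x = list(query)
      n = len(x)
      for _ in range(maxIterations):
          # find every node whose output violates the update rule
          bad = []
          for i in range(n):
              total = sum(weights[i][j] * x[j] for j in range(n) if j != i)
              desired = -1 if total < 0 else +1
              if x[i] != desired:
                  bad.append(i)
          if not bad:
              break                       # equilibrium reached
          x[randgen.choice(bad)] *= -1    # use the provided generator
      return x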
Usage: hopfield.py [options]
Options:
-h, --help show this help message and exit
-a show all test patterns and exit
-s SEED seed for all randomization [default: clock]
Experiment Options:
-n PATTERNS number of patterns to use in training [default: 4]
-p PROB Probability of perturbing each bit in the test pattern [default: 0.1]
-r REPS Number of independent tests to perform [default: 1]
-f NUMERAL force numeral to choose as basis for test query [default: random]
-m ITERATIONS maximum number of iterations to perform per query [default: 10000]
Display Options:
-t STEPS trace status every t steps (no trace if 0) [default: 0]
-d DELAY per step delay for trace; manual if 0 [default: 0.001]
-v visualize trace [default: False]
-w WIDTH width of window for visualization [default: 200]
-q no console output (other than statistics) [default: False]
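For example, the experiment reported above (4 patterns, 30% noise, 1000 trials) would correspond to a command along these lines:

  python hopfield.py -n 4 -p 0.3 -r 1000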