3. An engineering approach
3.1 A simple neuron
An artificial neuron is a device with many inputs and one output. The neuron has two modes of operation: the training mode and the using mode. In the training mode, the neuron can be trained to fire (or not) for particular input patterns. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.
A simple neuron
3.2 Firing rules
The firing rule is an important concept in neural networks and accounts for their high flexibility. A firing rule determines how one calculates whether a neuron should fire for any input pattern. It relates to all the input patterns, not only the ones on which the node was trained.
A simple firing rule can be implemented by using the Hamming distance technique. The rule goes as follows:
Take a collection of training patterns for a node, some of which cause it to fire (the 1-taught set of patterns) and others which prevent it from doing so (the 0-taught set). Patterns not in the collection cause the node to fire if, on comparison, they have more input elements in common with the 'nearest' pattern in the 1-taught set than with the 'nearest' pattern in the 0-taught set. If there is a tie, the pattern remains in the undefined state.
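In Python, a minimal sketch of this rule might look as follows (the names hamming and fire, and the encoding of patterns as bit strings, are illustrative choices, not part of the original description):

    def hamming(a, b):
        """Number of positions at which two equal-length bit patterns differ."""
        return sum(x != y for x, y in zip(a, b))

    def fire(pattern, one_taught, zero_taught):
        """Hamming-distance firing rule for one node: '1' = fire, '0' = do not
        fire, '0/1' = undefined (nearest taught patterns equally distant)."""
        d_one = min(hamming(pattern, p) for p in one_taught)
        d_zero = min(hamming(pattern, p) for p in zero_taught)
        if d_one < d_zero:
            return "1"
        if d_zero < d_one:
            return "0"
        return "0/1"   # tie: the output stays undefined

Taught patterns are handled automatically: a pattern in the 1-taught set is at distance 0 from itself, so the node fires on it.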
For example, a 3-input neuron is taught to output 1 when the input (X1, X2 and X3) is 111 or 101 and to output 0 when the input is 000 or 001. Then, before applying the firing rule, the truth table is:
X1:  | 0 | 0 | 0   | 0   | 1   | 1 | 1   | 1 |
X2:  | 0 | 0 | 1   | 1   | 0   | 0 | 1   | 1 |
X3:  | 0 | 1 | 0   | 1   | 0   | 1 | 0   | 1 |
OUT: | 0 | 0 | 0/1 | 0/1 | 0/1 | 1 | 0/1 | 1 |
As an example of the way the firing rule is applied, take the pattern 010. It differs from 000 in 1 element, from 001 in 2 elements, from 101 in 3 elements and from 111 in 2 elements. Therefore, the 'nearest' pattern is 000, which belongs in the 0-taught set. Thus the firing rule requires that the neuron should not fire when the input is 010. On the other hand, 011 is equally distant from two taught patterns that have different outputs and thus the output stays undefined (0/1).
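Using the hamming and fire sketches above, these distances can be checked directly:

    hamming("010", "000")   # 1
    hamming("010", "001")   # 2
    hamming("010", "101")   # 3
    hamming("010", "111")   # 2
    fire("010", one_taught={"111", "101"}, zero_taught={"000", "001"})   # '0'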
By applying the firing rule to every column, the following truth table is obtained:
X1:  | 0 | 0 | 0 | 0   | 1   | 1 | 1 | 1 |
X2:  | 0 | 0 | 1 | 1   | 0   | 0 | 1 | 1 |
X3:  | 0 | 1 | 0 | 1   | 0   | 1 | 0 | 1 |
OUT: | 0 | 0 | 0 | 0/1 | 0/1 | 1 | 1 | 1 |
The difference between the two truth tables is called the generalisation of the neuron. Therefore the firing rule gives the neuron a sense of similarity and enables it to respond 'sensibly' to patterns not seen during training.
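As a check, looping over all eight input patterns with the fire sketch above reproduces the generalised truth table:

    one_taught, zero_taught = {"111", "101"}, {"000", "001"}
    for x in ["000", "001", "010", "011", "100", "101", "110", "111"]:
        print(x, "->", fire(x, one_taught, zero_taught))
    # 000 -> 0, 001 -> 0, 010 -> 0, 011 -> 0/1,
    # 100 -> 0/1, 101 -> 1, 110 -> 1, 111 -> 1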
3.3 Pattern Recognition - an example
An important application of neural networks is pattern recognition. Pattern recognition can be implemented by using a feed-forward neural network (figure 1) that has been trained accordingly. During training, the network learns to associate outputs with input patterns. When the network is used, it identifies the input pattern and produces the associated output pattern. The power of neural networks comes to life when a pattern that has no associated output is given as an input. In this case, the network gives the output that corresponds to a taught input pattern that is least different from the given pattern.
Figure 1.
For example:
The network of figure 1 is trained to recognise the patterns T and H. The associated patterns are all black and all white respectively as shown below.
If we represent black squares with 0 and white squares with 1, then the truth tables for the 3 neurons after generalisation are:
X11: | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
X12: | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
X13: | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
OUT: | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Top neuron
X21: | 0 | 0   | 0 | 0   | 1   | 1 | 1   | 1 |
X22: | 0 | 0   | 1 | 1   | 0   | 0 | 1   | 1 |
X23: | 0 | 1   | 0 | 1   | 0   | 1 | 0   | 1 |
OUT: | 1 | 0/1 | 1 | 0/1 | 0/1 | 0 | 0/1 | 0 |
Middle neuron
X31: | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
X32: | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
X33: | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
OUT: | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 |
Bottom neuron
From the tables, it can be seen that the following associations can be extracted:
For an input that differs from the taught 'T' pattern in only a few squares, the output should be all black, since the input pattern is almost the same as the 'T' pattern.
Similarly, for an input close to the taught 'H' pattern, the output should be all white, since the input pattern is almost the same as the 'H' pattern.
Here, the top row is 2 errors away from a T and 3 from an H, so the top output is black. The middle row is 1 error away from both T and H, so that output is undefined (and in practice random). The bottom row is 1 error away from T and 2 away from H, therefore the output is black. The total output of the network is still in favour of the T shape.
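The whole example can be sketched with the fire function from section 3.2. The 3x3 bitmaps below (0 = black, 1 = white, as above) and the name recognise are assumptions made for illustration; each output neuron is taught only its own row of the two patterns, with the H row as its 1-taught (white) pattern and the T row as its 0-taught (black) pattern:

    # Assumed 3x3 bitmaps, row by row (0 = black, 1 = white).
    T_ROWS = ["000", "101", "101"]   # T: solid top bar, centre stem below
    H_ROWS = ["010", "000", "010"]   # H: two posts, solid crossbar

    def recognise(rows):
        """One output neuron per row of the input grid."""
        return [fire(row, one_taught={h}, zero_taught={t})
                for row, t, h in zip(rows, T_ROWS, H_ROWS)]

    recognise(T_ROWS)                 # ['0', '0', '0']   all black, as taught
    recognise(H_ROWS)                 # ['1', '1', '1']   all white, as taught
    recognise(["001", "100", "101"])  # ['0', '0/1', '0'] a distorted T: mostly black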
3.4 A more complicated neuron
The previous neuron doesn't do anything that conventional computers don't do already. A more sophisticated neuron (figure 2) is the McCulloch and Pitts model (MCP). The difference from the previous model is that the inputs are 'weighted': the effect that each input has on decision making depends on the weight of the particular input. The weight of an input is a number which, when multiplied with the input, gives the weighted input. These weighted inputs are then added together and, if they exceed a pre-set threshold value, the neuron fires. In any other case the neuron does not fire.
Figure 2. An MCP neuron
In mathematical terms, the neuron fires if and only if:
X1W1 + X2W2 + X3W3 + ... > T
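A minimal sketch of this decision in Python (the function name and the example weights and threshold are illustrative):

    def mcp_fire(inputs, weights, threshold):
        """MCP decision: fire (1) iff the weighted sum of the inputs exceeds the threshold."""
        weighted_sum = sum(x * w for x, w in zip(inputs, weights))
        return 1 if weighted_sum > threshold else 0

    # With weights (1, 1) and threshold 1.5, the neuron computes a logical AND:
    mcp_fire((1, 1), (1, 1), 1.5)   # 1: 1*1 + 1*1 = 2 > 1.5
    mcp_fire((1, 0), (1, 1), 1.5)   # 0: 1*1 + 0*1 = 1 <= 1.5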
The addition of input weights and of the threshold makes this neuron a very flexible and powerful one. The MCP neuron has the ability to adapt to a particular situation by changing its weights and/or threshold. Various algorithms exist that cause the neuron to 'adapt'; the most widely used are the delta rule and back-propagation of error. The delta rule adjusts the weights of a single layer of units, while back-propagation extends the same idea to multi-layer feed-forward networks.
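As an illustration of such adaptation, here is a sketch of the delta rule in its simplest form, for a single threshold unit (in this form it is also known as the perceptron learning rule); the learning rate, epoch count, AND task, and all names are assumptions made for the example. Each weight is nudged in proportion to the output error and its own input, and a bias term plays the role of an adjustable threshold:

    def delta_rule_train(samples, epochs=20, lr=0.2):
        """Adapt weights with w_i <- w_i + lr * (target - output) * x_i;
        the bias acts as the negative of the threshold."""
        w, bias = [0.0, 0.0], 0.0
        for _ in range(epochs):
            for x, target in samples:
                out = 1 if sum(xi * wi for xi, wi in zip(x, w)) + bias > 0 else 0
                err = target - out
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                bias += lr * err
        return w, bias

    # Teach a 2-input neuron the AND function by adapting weights and threshold.
    AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    weights, bias = delta_rule_train(AND)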