“Any man could, if he were so inclined, be the sculptor of his own brain.” ― Santiago Cajal
From helping robots identify objects and creating photo-realistic images of fake celebrities to converting thought into text, neural networks are doing amazing things! As the name suggests, they are based on our understanding of the brain. This blog post explains how neural networks, also called ANNs (artificial neural networks), mimic the physiology and functioning of the human brain.
Let’s start with the high-level physiology of the human brain. It has three key parts: the hindbrain, the midbrain and the forebrain. The hindbrain and midbrain control basic body processes such as respiration, digestion and survival responses, much as they do in the brains of other mammals. Our evolutionary difference from other species comes from the forebrain, a densely packed, three-dimensional layer of neurons under the skull. Branches from these neurons form interconnections called synapses, where memory is stored. To put the complexity in perspective, a human brain has about 100 billion neurons, each connected to about 10,000 other neurons, cumulatively storing an estimated 1 to 1,000 terabytes of data.
So, how does our brain think? When we receive an external stimulus such as a sight or a sound, data travels as electrical signals along a path between neurons. The specific path is determined by the strength of the inter-neuron connections, which is itself the cumulative result of all previous learning experiences. A neuron may receive signals from many other neurons. If the sum of all input signals crosses its activation threshold, it transmits the signal to the next connection; otherwise the signal dies at that neuron. Thinking essentially involves taking the information from input neurons and progressively abstracting it through multiple connections among ‘thinking neurons’, finally leading to muscle instructions from output neurons. Our ability to abstract from raw information is a key attribute of intelligence, which is why many common tests of thinking ability (e.g. GMAT, GRE, SAT) check our ‘if – then’ or ‘so – what’ skills.
Now visualize the neuron as a hardware computational unit, the strength of inter-neuron connections as a quantitative weight on the signals into each computational unit, and the activation threshold as a quantitative threshold for releasing information, and you have the basic building blocks of an ANN. The computational unit in a neural network is called a ‘node’ and, unlike the three-dimensionally packed neurons, nodes are connected to each other in layers: an input layer for receiving information, an output layer for generating results and multiple ‘hidden layers’ in between for processing. Every node assigns a ‘weight’ to each incoming connection and computes the weighted sum of the data from all its incoming nodes. If the weighted sum is above the quantitative threshold, the node fires its output to the next connected node. In the active state, each layer of the ANN takes information, combines it into the next level of abstraction, and passes it to the next layer, until the information reaches the highest level of abstraction at the output layer. The more layers there are, the deeper the network; hence the phrase ‘deep learning’.
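To make the node-and-layer picture concrete, here is a minimal sketch in Python. It is an illustration under assumptions rather than how any real ANN library works: the function names (node_output, layer_output, forward) and all the weights and thresholds are made up for this example, and a node here simply passes on its weighted sum when that sum exceeds its threshold.

```python
# Illustrative sketch only: the names, weights and thresholds are hypothetical.

def node_output(inputs, weights, threshold):
    """Weighted sum of the inputs; the node 'fires' only above its threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Below or at the threshold the signal dies at this node (output 0.0).
    return weighted_sum if weighted_sum > threshold else 0.0

def layer_output(inputs, layer):
    """A layer is a list of (weights, threshold) pairs, one pair per node."""
    return [node_output(inputs, weights, threshold) for weights, threshold in layer]

def forward(inputs, layers):
    """Pass the signal through each layer in turn, from input to output."""
    signal = inputs
    for layer in layers:
        signal = layer_output(signal, layer)
    return signal

# A tiny network: two inputs -> two hidden nodes -> one output node.
hidden_layer = [([0.5, 0.5], 0.4), ([1.0, -1.0], 0.0)]
output_layer = [([1.0, 1.0], 0.5)]
network = [hidden_layer, output_layer]

print(forward([1.0, 0.0], network))  # [1.5]: both hidden nodes fire
print(forward([0.0, 1.0], network))  # [0.0]: the signal dies at the output node
```

In practice, most modern networks replace the hard threshold with a smooth activation function and a bias term, but the layered weighted-sum structure is the same idea.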
Unlike the brain, an ANN starts with randomly assigned weights and thresholds. By running data with known outcomes through the network, the weights and thresholds are adjusted to minimize the error between the network’s output and the known outcome. This ‘try, try again until you succeed’ approach (called supervised learning) continues until the output is acceptably accurate, at which stage the ANN is deemed ‘trained’.
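The training loop can be sketched for a single node as follows. This is a hedged illustration, not the method of any particular system: it uses the classic perceptron learning rule as a stand-in for ‘adjust the weights and threshold to reduce the error’, and the function names, learning rate and toy data set (the logical AND of two inputs) are all assumptions chosen for the example. Larger networks are usually trained with gradient-based methods instead.

```python
import random

# Illustrative sketch only: a single threshold node trained with the
# perceptron learning rule. All names and settings here are hypothetical.

def predict(inputs, weights, threshold):
    """Fire (1) if the weighted sum exceeds the threshold, otherwise 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

def train(examples, learning_rate=0.1, max_epochs=1000, seed=0):
    """Start from random weights and threshold; adjust them after every mistake."""
    rng = random.Random(seed)
    n_inputs = len(examples[0][0])
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
    threshold = rng.uniform(-1.0, 1.0)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(inputs, weights, threshold)
            if error != 0:
                mistakes += 1
                # Strengthen or weaken each connection in proportion to its input.
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, inputs)]
                # Lower the threshold if the node should have fired, raise it otherwise.
                threshold -= learning_rate * error
        if mistakes == 0:  # every known outcome is reproduced: 'trained'
            break
    return weights, threshold

# Known outcomes: the node should fire only when both inputs are 1 (logical AND).
and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
trained_weights, trained_threshold = train(and_examples)

for inputs, target in and_examples:
    print(inputs, "->", predict(inputs, trained_weights, trained_threshold),
          "(expected", target, ")")
```

The fixed seed only makes the random starting point repeatable; any starting point should work for this toy problem, because a single threshold node can separate the AND cases.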
Let’s take the example of a facial recognition ANN system. Here, initial training involves showing the system a very large number of photos, each with its identifier (e.g. a name). Once the system is trained, the input layer starts at the pixel level of the image; the initial layers detect basic features (e.g. color); the next set of layers abstracts them into higher-level features (e.g. a nose); and the process continues until the output layer matches the face with a name. The process becomes faster once the system learns common patterns of the human face (e.g. eyes are above the nose), just as experienced people often make better decisions in a familiar context.
ANNs are incredibly effective because you really don’t have to tell them how to do the job. They replace task-specific traditional programming with ‘learning from data’ (see earlier blog post on ‘Why data is the new oil’). Now that chip manufacturers have started integrating neural engines into their processors, it’s only a matter of time before ANNs become core to our daily lives. Jürgen Schmidhuber paints a picture of that future in Scientific American:
“Today’s largest [ANN] have a billion connections or so. In 25 years, we should have rather cheap, human-cortex-sized [ANN] with more than 100,000 billion electronic connections. A few decades later, we may have cheap computers with the raw computational power of all of the planet’s 10 billion human brains together…!”