Artificial intelligence technology has been around for some 80 years. Although it has gained significant traction only recently, with applications transforming almost every industry over the past few years, its foundations were laid around the time of the Second World War.
Alan Turing is considered one of the pioneers of computing and artificial intelligence, and an examination of his papers from around that time indeed reveals some of the earliest mentions of machines that can mimic the human mind and its reasoning. In fact, the ideas behind machine learning and deep learning go back to the latter part of World War II, when scientists were beginning to build computer systems meant to process information in much the same way as the human brain.
One of the first mentions of this topic in the literature is a 1943 paper by Warren McCulloch and Walter Pitts, who proposed an artificial neuron, a computational model of the “nerve net” in the brain. This laid the foundation for the first wave of research into the topic. In the 1950s, in his paper “Computing Machinery and Intelligence”, Alan Turing suggested a framework for building intelligent machines and methods for testing their intelligence.
In the late 1950s, Bernard Widrow and Ted Hoff at Stanford University developed a neural network application that reduced noise in phone lines. Around the same time, Frank Rosenblatt built the Perceptron while working in academia and with the US government.
The Perceptron was based on the concept of a neural network and was meant to be able to perform tasks that humans usually performed, such as recognizing images and even walking and talking.
The Perceptron, however, only had a single layer of neurons and was limited in how much it could do.
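That single-layer limitation can be made concrete with a small sketch (a modern, illustrative reconstruction, not Rosenblatt's original implementation): the perceptron rule readily learns a linearly separable function such as AND, but no single layer of weights can fit XOR.

```python
# Illustrative sketch of the perceptron learning rule (a modern
# reconstruction for clarity, not Rosenblatt's original implementation).

def train_perceptron(samples, epochs=20, lr=0.1):
    """Learn two weights and a bias from (input, target) pairs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred          # the perceptron update rule
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# AND is linearly separable, so a single layer learns it perfectly.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w_and, b_and = train_perceptron(AND)
print([predict(w_and, b_and, x) for x, _ in AND])   # [0, 0, 0, 1]

# XOR is not linearly separable: no single layer of weights can fit it.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
w_xor, b_xor = train_perceptron(XOR)
print(all(predict(w_xor, b_xor, x) == t for x, t in XOR))  # False
```

Overcoming the XOR-type limitation required networks with more than one layer, which is precisely where the story picks up again in the 1980s.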
Marvin Minsky, a colleague of Rosenblatt and an old high school classmate of his, wrote the book “Perceptrons: An Introduction to Computational Geometry”, which detailed the limitations of the Perceptron and of neural networks more broadly, and helped bring on an AI winter that lasted until the mid-1980s.
Around 1986, interest in neural networks was renewed, in part by physicists who were developing novel mathematical techniques, and a landmark paper co-authored by Geoffrey Hinton described back-propagation as a way to overcome some of the limitations of neural networks (although some practitioners point to a Finnish mathematician, Seppo Linnainmaa, as having invented back-propagation as early as the 1960s).
This led to a revival of the field and creation of some of the first practical applications, such as detection of fraud in credit card transactions.
In the late 1980s, at Carnegie Mellon, Dean Pomerleau built a self-driving car using neural networks. Yann LeCun, then at Bell Labs and later at NYU, pioneered the use of neural networks for image-recognition tasks, and his 1998 paper established the concept of convolutional neural networks, which mimic the human visual cortex.
In parallel, John Hopfield popularized the “Hopfield” network, the first recurrent neural network. This was subsequently expanded upon by Sepp Hochreiter and Jürgen Schmidhuber in 1997 with the introduction of long short-term memory (LSTM), which greatly improved the efficiency and practicality of recurrent neural networks.
Although these applications of the 1980s and 1990s created momentum for the field, they soon reached their limits for lack of sufficient data and computing power. There followed another AI winter, fortunately a shorter one, lasting about a decade.
Around 2006, Geoffrey Hinton and a few others in Canada published a landmark paper in Science about the potential of “multi-layered” neural networks. This, along with the promise such networks subsequently showed in speech recognition, re-ignited interest in neural networks.
This was enough to attract interest and research dollars and over the next few years, companies such as Microsoft and Google ramped up their research in the field dramatically.
In 2012, Hinton and two of his students highlighted the power of deep learning when they obtained significant results in the well-known ImageNet competition, based on a dataset collated by Fei-Fei Li and others.
At the same time, Jeffrey Dean and Andrew Ng were doing breakthrough work on large scale image recognition at Google Brain.
Deep learning also enhanced the existing field of reinforcement learning, led by researchers such as Richard Sutton, leading to the game-playing successes of systems developed by DeepMind.
Given the impressive results these systems demonstrated to the entire world, everyone woke up to the potential of deep learning and neural networks.
In 2014, Ian Goodfellow published his paper on generative adversarial networks, which along with reinforcement learning has become the focus of much of the recent research in the field.
Continuing advances in AI capabilities have led to Stanford University’s One Hundred Year Study on Artificial Intelligence, founded by Eric Horvitz, building on the long-standing research he and his colleagues have led at Microsoft Research.
We have benefited from the input and guidance of many of these pioneers in our research over the past few years.
The rest, as they say, is history! With large amounts of digitized data and significantly more powerful computers now available, progress has come fast in various sectors.
Even today, we are only at the dawn of AI’s potential.
Most applications to date rely on supervised learning, in which the algorithm is taught with annotated data (thousands or millions of labeled examples) and learns from that data to identify patterns and make predictions.
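As a toy illustration of this idea (not tied to any particular system mentioned here), a nearest-neighbour classifier “learns” purely from annotated examples and then predicts labels for points it has never seen; the feature values and labels below are invented for the sketch.

```python
# Toy supervised learning: a 1-nearest-neighbour classifier built from
# a handful of annotated examples. In practice the training set would
# contain thousands or millions of such labeled examples.

# Annotated training data: (feature vector, label) pairs.
train = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((3.0, 3.1), "dog"),
    ((3.2, 2.9), "dog"),
]

def classify(point):
    """Label a new point with the label of its closest training example."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda ex: dist2(ex[0], point))[1]

print(classify((0.9, 1.1)))  # cat
print(classify((3.1, 3.0)))  # dog
```

The “learning” here is simply memorizing labeled data; more powerful supervised methods, such as the neural networks described above, instead fit adjustable weights to the labeled examples.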
The long-term power of neural nets may well reside in unsupervised learning, where algorithms learn without labeled training examples, discovering patterns and structure in raw data on their own.
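A minimal sketch of this contrast, assuming invented data: simple 2-means clustering groups unlabeled points into two clusters without ever being told a label or a desired outcome.

```python
# Toy unsupervised learning: 2-means clustering on unlabeled 1-D points.
# No labels or desired outcomes are given; the grouping emerges from the
# data alone. Values are invented for illustration.

points = [1.0, 2.0, 0.0, 8.0, 9.0, 7.0]
centers = [points[0], points[3]]  # naive initialisation from the data

for _ in range(10):
    # Assign each point to its nearest center, then recompute the centers
    # as the mean of their assigned points.
    clusters = ([], [])
    for p in points:
        nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
        clusters[nearest].append(p)
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))  # [1.0, 8.0]
```

The algorithm recovers the two natural groups in the data by itself, which is the essential promise of unsupervised methods at much larger scale.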
Reinforcement learning, which in many ways mimics human learning mechanisms, will also be at the leading edge of achieving AI’s full potential.