Back in 1943, McCulloch and Pitts [1] proposed a simple computational model of an artificial neuron, and showed how networks of connected neurons could represent concepts from logic, such as AND, OR and NOT. They also suggested that it should be possible for networks of their artificial neurons to learn, but they did not know how to achieve this. In 1950, Alan Turing wrote an influential paper called “Computing Machinery and Intelligence” [2] that proposed the Turing Test and introduced important concepts such as Machine Learning (ML) and Reinforcement Learning. In the 1950s and 1960s, Arthur Samuel [3] developed a series of programs that used Machine Learning to play checkers (draughts), outperforming him and other human players.
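To make the McCulloch-Pitts idea concrete, here is a minimal sketch in Python (my own illustration, not code from their paper) of a threshold neuron that fires when the weighted sum of its binary inputs reaches a threshold; the particular weights and thresholds below are just one possible choice for expressing AND, OR and NOT.

```python
def mp_neuron(inputs, weights, threshold):
    """A McCulloch-Pitts style neuron: output 1 if the weighted sum of
    binary inputs reaches the threshold, otherwise 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Logic gates expressed as single threshold neurons (illustrative settings;
# other weight/threshold combinations work equally well).
AND = lambda a, b: mp_neuron([a, b], [1, 1], threshold=2)
OR  = lambda a, b: mp_neuron([a, b], [1, 1], threshold=1)
NOT = lambda a:    mp_neuron([a],    [-1],   threshold=0)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 1) == 1 and OR(0, 0) == 0
assert NOT(0) == 1 and NOT(1) == 0
```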
From the 1950s to the present day, there has been progress in many directions under the umbrella of AI, proceeding in parallel with developments in computer systems generally.
The pace of AI and ML research internationally has increased in the past decade, driven by the convergence of four major factors: New theoretical advances in AI and ML algorithms; the availability of very large training data sets, because ML in particular is heavily dependent on data to learn from; the development of open-source software frameworks for ML, and ML models that can be re-used and extended; and new massively parallel computer architectures, including GPUs that were originally developed for high-performance computer games.
AI, Machine Learning, Neural Nets, Deep Learning
The goal of Artificial Intelligence is typically defined as creating computer systems that can perform tasks that require intelligence when performed by humans. These tasks include reasoning, solving problems, representing knowledge, processing human language, perceiving the world, moving within an environment, and learning.
Among the areas of AI, machine learning is one of the most important. Machine learning is concerned with creating computer systems that get better at a task as they gain experience; for example, programs that get better at chess the more they play, or software that learns to identify objects in photos after being trained on large numbers of photos labelled with different objects.
There are many approaches to machine learning, including rule learners, statistical methods, and artificial neural networks (ANNs). ANNs, also called neural nets, are designed to map inputs (such as a photo) to outputs (such as a list of objects in the photo) by passing signals through a network of artificial neurons connected to each other by weighted links; these artificial neurons are related to those that McCulloch and Pitts proposed back in 1943. ANNs can “learn” by adjusting their weights, which changes the mapping from inputs to outputs. During a learning phase, an ANN is repeatedly presented with many samples of inputs and the corresponding correct answers, and slowly adjusts its weights until its outputs approximate the correct answers as well as possible.
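As a rough illustration of that learning phase, the sketch below (my own, using toy data and arbitrary settings, not anything from the article) trains a single sigmoid neuron by repeatedly presenting it with inputs and the correct answers, and nudging its weights so that its outputs move closer to those answers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: four input examples and the correct answer for each
# (the neuron should learn the OR of its two inputs).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 1.0])

w = rng.normal(size=2)   # weights on the links into the neuron
b = 0.0                  # bias of the neuron
lr = 0.5                 # learning rate: how big each adjustment is

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):
    out = sigmoid(X @ w + b)         # the neuron's current answers
    error = out - y                  # how far they are from the correct answers
    # Nudge the weights and bias so the outputs move towards the targets
    # (the gradient step for a sigmoid neuron with cross-entropy error).
    w -= lr * (X.T @ error) / len(X)
    b -= lr * error.mean()

print(np.round(sigmoid(X @ w + b), 2))   # now close to [0, 1, 1, 1]
```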
While neural nets had been explored in ML in the 1980s and 1990s, it is in the last decade that they have led to a new development, termed deep learning. At a basic level, deep learning involves learning with very large neural nets where the inputs are connected via many layers of neurons to the outputs. Because of their size and structure, they are able to represent complex mappings. In addition, early layers can be considered to act as learnable subroutines that are re-used by later layers, or to perform automatic feature processing at multiple levels. Training deep networks has given rise to new challenges that are being addressed via the advances in computation and data that I mentioned earlier.
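To give a sense of the “many layers” structure, here is a small sketch (again my own, with arbitrary layer sizes and random, untrained weights) of a forward pass through several fully connected layers; in a real deep learning system these weights would be learned, typically using one of the open-source frameworks mentioned earlier.

```python
import numpy as np

rng = np.random.default_rng(1)

def layer(n_in, n_out):
    """Randomly initialised weights and biases for one fully connected layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)

def relu(z):
    return np.maximum(z, 0.0)

# A deep network: a 64-dimensional input mapped through three hidden layers
# of 128 neurons each, down to 10 outputs.
layers = [layer(64, 128), layer(128, 128), layer(128, 128), layer(128, 10)]

def forward(x, layers):
    """Pass the input through each layer in turn; earlier layers feed later ones."""
    for W, b in layers[:-1]:
        x = relu(x @ W + b)   # hidden layers apply a non-linearity
    W, b = layers[-1]
    return x @ W + b          # the final layer produces the outputs

x = rng.normal(size=(1, 64))       # one example input (e.g. extracted image features)
print(forward(x, layers).shape)    # -> (1, 10)
```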
“Superhuman AI”, “Artificial General Intelligence”, and Hype
At times there can be quite a lot of hype in the AI community. The term Superhuman AI is occasionally used by people hyping their work, although it has a specific accepted definition that is not based in hype: an AI system that can outperform humans at some specific task. We already have some examples of superhuman AI. For example, AlphaGo Zero [6] is able to perform better than any of the world’s best human players of Go. It learned by playing millions of simulated games against itself, rather than studying human masters. A poker-playing program called Libratus [7] has been demonstrated to beat the best professional human poker players in specially-arranged tournaments.
Some people in the AI community question whether we should use the term “superhuman AI” at all, as it may carry inappropriate implications about what AI can do. I have reflected on this in recent years when attending a major AI conference, the International Joint Conference on AI. It includes an annual contest called RoboCup, in which the world’s top teams compete at robot soccer. If you were to watch them, you would quickly conclude that we’re a long way from superhuman AI. Below is a photo I took while watching the RoboCup contest in 2017. It occurred to me at the time that the little child in the photo could easily beat any of the robots playing soccer. This gap between expectation and reality is one source of hype.
Superhuman AI leads on to the concept of artificial general intelligence. Superhuman performance has been achieved in some very specific tasks by highly specialised AI systems. This is analogous to how we, as humans, tend to build specialised machines that can outperform humans at specific tasks, whether the task is to perform calculations or dig large holes. But of course, the more specialised a machine is, the less generally useful it is across a wide range of different tasks. Current systems that have achieved superhuman performance, such as those that play Go and Poker, are each dedicated to one specific task, and tend to fail when the task parameters are changed, such as when the size of the Go board is increased.
Artificial general intelligence is a much more distant goal than superhuman AI. It is defined as the ability of an intelligent agent to perform at a human level any intellectual task that a human can. This is a much harder job than outperforming humans at individual tasks. Many AI researchers estimate that it is at least 50 years away, and others question whether it is achievable at all. At present, there is no consensus in the field about whether the current paradigm of deep learning will end up being a key component of artificial general intelligence, or whether brand new paradigms will have to be formulated. A question that has often been asked in the history of AI research is whether successful systems for tackling specific tasks are providing us with stepping-stones towards artificial general intelligence. As the philosopher of AI, Hubert Dreyfus, phrased it, “Are we climbing a tree to reach the moon?”
Looking to the Future
While there has been a certain amount of hype in the field of AI, and it is by no means certain that artificial general intelligence will be achieved in our lifetimes, there is little doubt that AI is starting to become a transformative technology that will spread throughout society, as computers and smartphones already have. It is therefore important for us to understand what AI is about, what it promises, and where those promises fall short. It is also essential for universities to lead the way through research and teaching in AI.
References
[1] McCulloch, W.S. & Pitts, W. (1943). “A logical calculus of the ideas immanent in nervous activity.” Bulletin of Mathematical Biophysics, 5, 115–133.
[2] Turing, A. (1950). “Computing Machinery and Intelligence.” Mind, LIX (236): 433–460.
[3] Samuel, A. L. (1959). “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development, 44: 206–226.
[6] Silver, D. et al. (2017). “Mastering the game of Go without human knowledge.” Nature, 550 (7676): 354–359.
[7] Spice, B. & Allen, G. (2017). “Upping the Ante: Top Poker Pros Face Off vs. Artificial Intelligence.” Carnegie Mellon University.
Profiles
Prof. Michael Madden is the Established Professor and Head of the School of Computer Science at NUI Galway. He leads the Machine Learning Research Group, which he set up in 2001. His research focuses on new theoretical advances in machine learning, motivated by addressing important data-driven applications in medicine, engineering, and the physical sciences. This has led to over 100 publications, 4 patents and a spin-out company. He has spent time as a Visiting Research Scientist at the University of Helsinki, the University of California Irvine, and UC Berkeley.