University College London
Autonomous learning agents have flown stunt manoeuvres in helicopters, driven a car 100 miles across the desert, and controlled power stations. Autonomous planning agents have solved airline logistics, designed sewerage layouts, and defeated human champions at chess.
In practice, solving these problems required significant human expertise. Unfortunately, in many other real-world problems, human expert knowledge is unavailable, inaccurate, or prohibitively expensive. This project proposes a new paradigm for autonomous learning, planning, and knowledge representation, based entirely on the learning system's own experience, thereby removing these limitations.
The inspiration for this approach comes from the game of Go. This complex game is widely viewed as a grand challenge for artificial intelligence, one that has thwarted traditional approaches to learning and planning. Recently, we have developed a radically different approach, resulting in the first program, MoGo, to perform at human master level.
MoGo's central new idea is to simulate thousands of imaginary games. MoGo evaluates a position by the proportion of those simulations that lead to a win. As it simulates more games and sees more positions, it predicts the outcome of each position more accurately, and hence makes better decisions. This simple idea has dramatically outperformed all previous approaches to computer Go.
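MoGo's actual simulations operate over Go positions and are guided by sophisticated search, but the core evaluate-by-simulation idea can be illustrated on a far simpler game. Below is a minimal sketch in Python, substituting a toy single-pile Nim game for Go; the game, function names, and simulation count are illustrative assumptions, not MoGo's code:

```python
import random

def legal_moves(stones):
    # Toy game standing in for Go: players alternately take 1-3 stones;
    # whoever takes the last stone wins.
    return [k for k in (1, 2, 3) if k <= stones]

def rollout(stones, to_move, us):
    """Play one imaginary game to the end with uniformly random moves.
    Returns 1 if player `us` takes the last stone (wins), else 0."""
    while True:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            return 1 if to_move == us else 0
        to_move = 1 - to_move  # the other player's turn

def monte_carlo_value(stones, to_move, n_simulations=2000):
    """Evaluate a position as the proportion of simulated games won
    by the player to move -- the evaluation idea described above."""
    wins = sum(rollout(stones, to_move, us=to_move)
               for _ in range(n_simulations))
    return wins / n_simulations
```

With one stone left, the player to move always wins, so the estimated value is exactly 1.0; with two stones, uniformly random play wins about half the time. Running more simulations tightens these estimates, which is how evaluation quality improves with experience.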
The key insight of MoGo is that knowledge can be represented by predictions about the future. These predictions can be learnt directly from experience, to provide autonomous learning. They can also be learnt from simulated experience, to provide autonomous planning.
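One standard way to learn such predictions from experience is temporal-difference learning; the sketch below is a generic tabular TD(0) illustration under my own simplifying assumptions, not MoGo's actual learning rule:

```python
def td0(episodes, alpha=0.1):
    """Tabular TD(0): learn to predict each state's final outcome.
    Each episode is (list_of_states, outcome), with outcome 1 = win, 0 = loss."""
    V = {}  # learned predictions: estimated probability of winning
    for states, outcome in episodes:
        for i, s in enumerate(states):
            # Target is the next state's current prediction, or the
            # actual outcome when the episode ends.
            target = V.get(states[i + 1], 0.5) if i + 1 < len(states) else outcome
            V[s] = V.get(s, 0.5) + alpha * (target - V.get(s, 0.5))
    return V
```

Feeding this update simulated episodes instead of real ones turns the same mechanism into a planning method, mirroring the point above that autonomous learning and autonomous planning share one representation.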
In my research, I am applying this experience-based approach much more widely. Like Go, many real-world planning and decision-making problems have huge search spaces and no reliable source of expert knowledge. My goal is to extend MoGo's approach into a general framework for high-performance planning and decision-making in the real world.