
How can a computer understand what is happening in a video?

Prize lecture


18:30 - 19:30


The Royal Society, 6-9 Carlton House Terrace, London, SW1Y 5AG



2017 Milner Award Lecture given by Professor Andrew Zisserman FRS.

How can a computer recognise people and what they are doing and saying in a video stream? The answer is by learning, and learning can take many different forms. 

One form is known as 'strong supervision': the computer is shown many (thousands of) examples of a person or of the action they are performing, and from these it learns a model to classify the video. Another form is known as 'weak' or 'self-supervision': here the computer learns directly from the structure of a video stream.

This lecture explains how both forms of supervision can be used to train neural networks using deep learning. It is illustrated throughout with examples including: recognising people by their faces, recognising human actions, automated lip reading, and using both sound and images in concord for training.

The Award 

The Royal Society Milner Award, kindly supported by Microsoft Research, is given annually for outstanding achievement in computer science by a European researcher. 

The award replaces the Royal Society and Académie des sciences Microsoft Award and is named in honour of Professor Robin Milner FRS (1934-2010), a pioneer in computer science.

Professor Andrew Zisserman FRS was awarded the 2017 Milner Award in recognition of his exceptional achievements in computer programming, which include work on computational theory and commercial systems for geometrical images.

For all enquiries, please contact the Scientific Programmes team.
