Next-generation molecular and evolutionary epidemiology of infectious disease: challenges and opportunities

Event

Starts: 16 May 2012, 09:00

Ends: 17 May 2012, 17:00

Location

Kavli Royal Society Centre, Chicheley Hall, Newport Pagnell, Buckinghamshire, MK16 9JJ

Overview

Satellite meeting organised by Dr Oliver Pybus, Professor Christophe Fraser and Professor Andrew Rambaut 

Event details

The dynamic interaction between genetically-variable infectious diseases and their hosts represents one of the most complex and intensively-studied phenomena in biology. This satellite meeting will provide a forum for discussion and exploration by researchers working on inter-disciplinary approaches in genomics, immunology, epidemiology and computing. We particularly encourage the participation of early-career researchers and those working on quantitative approaches.

Biographies of the organisers and speakers are available below. Audio recordings are freely available, and the programme can be downloaded.

This meeting was preceded by a related discussion meeting Next-generation molecular and evolutionary epidemiology of infectious disease 14 - 15 May 2012.

Schedule of talks

Session 1

Chair

Professor Christophe Fraser, Imperial College London, UK

Microevolution in Clostridium difficile genomes reveals limited hospital transmission

Dr Xavier Didelot, University of Oxford, UK

Abstract

The control of Clostridium difficile infection (CDI) is a major international healthcare priority, hindered by a limited understanding of transmission epidemiology for these bacteria. However, transmission studies of bacterial pathogens are rapidly being transformed by the advent of next-generation sequencing. Here we sequence whole C. difficile genomes from 486 CDI cases arising over 4 years in Oxfordshire. We show that we can estimate the times back to common ancestors of bacterial lineages with sufficient resolution to distinguish whether direct transmission is plausible or not. Time depths were inferred using a within-host evolutionary rate that we estimated at 2.3 mutations per genome per year based on serially isolated genomes (with 95% credibility interval 1.6-3.0). The subset of plausible transmissions was found to be highly associated with pairs of patients sharing time and space in hospital. Conversely, the majority (81%) of pairs of genomes matched by conventional typing and isolated from patients within a month of each other were too distantly related to be direct transmissions. Our results suggest that nosocomial transmission between symptomatic CDI cases contributes far less to current rates of CDI acquisition than has been widely assumed, which clarifies the importance of future research into other transmission routes, for example from asymptomatic carriers. With the costs of DNA sequencing rapidly falling and its use becoming more and more widespread, genomics will revolutionize our understanding of the transmission of bacterial pathogens.
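The arithmetic behind this kind of plausibility check can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes a strict molecular clock in which mutations accumulate as a Poisson process at the point estimate of 2.3 mutations per genome per year quoted in the abstract. The function names and the idea of summing both branch lengths back to the common ancestor are assumptions made for the example.

```python
import math

# Point estimate quoted in the abstract (mutations per genome per year).
RATE = 2.3


def expected_snvs(years_to_ancestor_a, years_to_ancestor_b, rate=RATE):
    """Expected number of differences between two genomes.

    Assumption: both lineages accumulate mutations independently at the
    same rate, so the expectation scales with the total branch time
    separating the two samples via their common ancestor.
    """
    return rate * (years_to_ancestor_a + years_to_ancestor_b)


def prob_at_least(d, mean):
    """P(X >= d) for X ~ Poisson(mean), computed from the lower tail."""
    below = sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(d))
    return 1.0 - below


if __name__ == "__main__":
    # Two isolates whose common ancestor existed six months before each sample.
    mean = expected_snvs(0.5, 0.5)
    print("expected differences:", mean)
    # Probability of seeing 10 or more differences under that scenario.
    print("P(>= 10 differences):", prob_at_least(10, mean))
```

Under these assumptions, a pair of genomes showing many more differences than the expected count for a recent common ancestor would be an unlikely candidate for direct transmission.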

The transmission and microevolution of MRSA as revealed by next-generation sequencing

Dr Ed Feil, University of Bath, UK

Abstract

The advent of next-generation sequencing technology is set to revolutionise our understanding of the intersection between transmission dynamics and micro-evolutionary processes in bacterial pathogens. Here I will review recent work in this area, focussing on methicillin-resistant Staphylococcus aureus (MRSA). These studies illustrate the range of questions that can be addressed on differing epidemiological scales: from the global picture, down to clustering on a national level, between hospitals within a single country, within a single hospital, and finally temporal divergence within a single host. I will consider key future questions relating to the extent to which key phenotypes (resistance / virulence) might be predicted from sequence data, and the relationships between patterns of mutation and recombination, the efficiency of purifying selection and epidemiological behaviour.

Why is Asia ahead?

Dr Sarah Cobey, Harvard University, USA

Abstract

The most successful strains of influenza A (H3N2) tend to emerge from the tropics of East and Southeast Asia. In the absence of controlled trials, it is difficult to determine how the global ecology of influenza interacts with its antigenic evolution, and why some host populations might have a greater impact than others. I will review hypotheses of how influenza’s epidemiology could vary between regions. Using a toy model, I will show how each factor is expected to change the relative contributions of different host populations to the emergence of new variants. Predictions from this model will then be compared to measures of recent influenza evolution.

Modelling the growth and transmission of infectious disease by linking epidemiology and population genetics

Dr Daniel Wilson, University of Oxford, UK

Abstract

Understanding the transmission of infectious disease is important for monitoring outbreaks, informing public health policy, and improving intervention strategies. Traditionally the fields of population genetics and epidemiology have been studied separately; however it is clear that using genetic information alongside epidemiological models has great potential for understanding the dynamics of infectious disease.  Directly estimating epidemiological parameters such as transmission rates can be difficult, as it relies on comprehensive monitoring during an outbreak where relevant processes may be hidden or undetectable. However, genetic information provides an alternative window into the past. I will talk about a combined coalescent-based meta-population model for estimating the parameters of standard SI, SIS and SIR epidemiological models from genetic data. I will apply these models to a meta-analysis of Hepatitis C virus (HCV), with the aim of explaining differences in patterns of genetic diversity between populations in terms of the underlying epidemiological dynamics. I will look at differences between datasets in the growth rate of HCV and whether they are explained by subtype, host population size or prevalence of disease to understand the factors that drive global variation in Hepatitis C diversity.

Session 2

Chair

Professor Marc Suchard, University of California, Los Angeles, USA

The birth-death skyline plot and beyond

Professor Tanja Stadler, ETH Zurich, Switzerland

Abstract

Beyond reconstruction of ancestral relationships, phylogenetic trees can be used to infer the processes that generated them. We will introduce the "birth-death skyline plot" that explicitly estimates the rate of transmission, recovery / death and sampling, and allows all of these parameters to vary through time in a piecewise fashion. This model is a powerful exploratory method for understanding the processes driving phylogenetic diversity in measurably evolving populations such as RNA viruses. Implemented in the software package BEAST, the birth-death skyline plot is a direct replacement for the Bayesian skyline plot on viral phylogenies and more accurately models the different roles of incidence and prevalence in determining the phylogenetic diversity of an epidemic. The method is applied to HIV-1 sequence data from the UK as well as to an HCV dataset from Egypt, revealing interesting temporal changes of the basic reproductive number.

I will furthermore give an outlook on how we employ the birth-death skyline plot as well as other birth-death-based approaches to properly account for classical epidemiological dynamics with finite-size host populations (SIR dynamics). I will in particular show that it is possible to estimate the size of the host population using viral phylogenies and apply this method to an HIV-1 dataset from Switzerland.

Exploring the genomic diversity of pathogen populations: a multivariate approach

Dr Thibaut Jombart, Imperial College London, UK

Abstract

Genetic sequence data are becoming increasingly available for a range of pathogens at a variety of spatial and temporal scales. These data can be exploited to inform infectious disease epidemiology in various ways, from the reconstruction of the historical spread of a disease worldwide to the near real-time genetic monitoring of local outbreaks. Here, we show how recent developments in multivariate methods can be used to investigate the genetic diversity of possibly large pathogen sequence datasets. This approach can identify clusters of genetically related infections and describe the spatio-temporal dynamics of the genetic diversity of pathogen populations. It can also be used to reveal alleles which most discriminate groups of pathogens, which can for instance be employed to detect host-specific genetic features.

While useful for exploring pathogen genetic data at large scales, this approach may be less relevant at smaller scales where the overall genetic diversity remains relatively low. This is typically the case in disease outbreaks, where clear-cut genetic clusters might be difficult to identify, but where sequence data may still contain relevant information about transmission pathways. We show how a simple graph approach can be used for reconstructing transmission trees (“who infected whom”) in the case of densely sampled outbreaks. We conclude on how such approaches may be improved by integrating simultaneously genetic and epidemiological information for the reconstruction of disease outbreaks.

Bayesian inference of epidemiological parameters using birth-death tree priors

Professor Alexei Drummond, University of Auckland, New Zealand

Abstract

A general piecewise-constant birth-death-sampling tree prior is described that acts as a kernel for the construction of a class of epidemiological tree priors that are parameterized by fundamental epidemiological parameters such as R0 and the infectious interval. This class of priors enables Bayesian inference of epidemiological parameters directly from appropriately sampled molecular sequence data. I will review recent work on this family of tree priors and describe efforts to extend the family both in terms of the observational process (handling sampling heterogeneity) and in terms of the spatial dynamics (handling population structure via multiple demes). Examples of the method will be provided for Dengue-4 and HIV-1.
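As a rough illustration of how birth-death-sampling rates map onto epidemiological quantities, the sketch below converts three constant rates into R0, a mean infectious period and a sampling proportion. It is not the speaker's implementation. It assumes a single time interval and that an individual stops being infectious on sampling, and the function and parameter names are hypothetical.

```python
def birth_death_summary(transmission_rate, removal_rate, sampling_rate):
    """Convert birth-death-sampling rates into epidemiological summaries.

    Assumptions (labelled, not taken from the abstract):
    - rates are constant within the interval considered;
    - a sampled individual is removed from the infectious pool, so the
      total rate of becoming non-infectious is removal + sampling.
    """
    total_exit = removal_rate + sampling_rate
    return {
        # Average number of secondary infections per infected individual.
        "R0": transmission_rate / total_exit,
        # Mean time spent infectious, in the same time unit as the rates.
        "infectious_period": 1.0 / total_exit,
        # Fraction of infected individuals that end up in the sample.
        "sampling_proportion": sampling_rate / total_exit,
    }


if __name__ == "__main__":
    # Illustrative rates per year; these values are made up for the example.
    print(birth_death_summary(2.0, 0.75, 0.25))
```

A piecewise-constant version would apply the same conversion separately within each time interval.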

Next generation molecular epidemiology in public health settings

Dr Marijn van Ballegooigen, RIVM, The Netherlands

Abstract

The quantity and quality of molecular data on infectious diseases routinely collected in public health surveillance and outbreak investigations are rapidly increasing. This has naturally led to the question of how these data can be used to inform policy makers. In this presentation I would like to present two recent studies that attempt to address this question.

The first case is the monitoring of a vaccination program against hepatitis B based on sequence data. Hepatitis B is caused by a sexually transmittable virus that can cause liver failure years after initial infection. The Netherlands introduced risk group vaccination against hepatitis B in 2002. Because initial infections are often asymptomatic, routine surveillance is sensitive to observation bias. In this case, however, surveillance and coalescent reconstructions of effective population size show a matching trend, suggesting the vaccination program is effective.

The second case is the reconstruction of an outbreak of avian influenza in poultry farms. Analysis of molecular sequences obtained from (nearly) all farms, combined with geographic and temporal information enables a probabilistic reconstruction of individual farm to farm transmissions. This detailed reconstruction of the transmission tree makes it possible to estimate the relative transmission risk of different farm types and even enables the estimation of the role of wind in farm to farm transmission. This information makes it possible to design better intervention strategies.

Molecular data currently collected for public health are typically affected by missing data, biased sampling and small-scale outbreaks. Scientific methods that can adequately deal with these shortcomings may offer the best opportunities for public health settings.

Session 3

Chair

Professor Andrew Rambaut, University of Edinburgh, UK

Combining whole genome sequencing and network models to understand the epidemiology of bovine TB in the UK

Dr Roman Biek, University of Glasgow, UK

Abstract

Quantifying transmission dynamics of pathogens infecting multiple host species can pose significant research challenges, especially when the sampling process is biased towards certain types of host. This is exemplified by Mycobacterium bovis, the bacterium causing bovine TB (bTB) in cattle. In the UK, badgers are considered an important wildlife reservoir for bTB, which is thought to prevent the successful eradication of the disease from cattle. However, despite considerable research effort, the epidemiological role badgers play in maintaining and spreading bTB to cattle is still poorly understood. Here, we show how whole genome sequencing (WGS) technology can be combined with high-resolution data on contact networks of cattle to shed new light onto this problem. Focussing on a small cluster of infected cattle and badger samples from Northern Ireland, we provide the first direct genetic evidence of M. bovis persistence on farms over multiple outbreaks with a continued, ongoing interaction with local badgers. In addition to providing novel insights into bTB epidemiology, even at extremely local scales, our study suggests that WGS based on more extensive sampling will allow quantification of the extent and direction of M. bovis transmission between cattle and badgers, especially in situations where detailed demographic and contact data for cattle are also available.

Incorporating geographic information systems data into phylogenetic analysis

Dr Rebecca Gray, University of Oxford, UK

Abstract

Geographic information systems data (GIS) has been a valuable tool to correlate the spread of infectious diseases with environmental variables. Independently, molecular epidemiology relies upon pathogen genetic mutations that segregate in space and time, which are used in increasingly sophisticated evolutionary models to infer migration paths, rates, and population demography. Clearly a comprehensive approach that incorporates both GIS and evolutionary analyses would allow for rigorous hypothesis testing and greater understanding of the forces governing disease movements. I will discuss the advantages of using GIS in molecular epidemiological studies as well as some of the current computational and theoretical challenges. I will present some recent work on West Nile Virus and rabies virus in which we have used information gained from the phylogeny on migration patterns within the context of GIS.

Antigenic flux in the influenza virus population

Dr Trevor Bedford, University of Edinburgh, UK

Abstract

Owing to rapid mutation, the evolution of the influenza virus occurs on a human timescale; rather than being forced to infer past evolutionary events, we can observe them in near real-time. While individuals develop long-lasting immunity to particular influenza strains after infection, antigenic mutations to the influenza virus genome result in proteins that are recognized to a lesser degree by the human immune system, leaving individuals susceptible to future infection. Mutations are only transiently advantageous; the virus population must keep evolving antigenically to stay ahead of developing human immunity. This talk focuses on the process of antigenic innovation and the spread of novel strains through the human population. In this case, we have serological data from the hemagglutination inhibition (HI) assay comparing the level of cross-reactivity between different strains of influenza, as well as sequence data across strains. Here, we use a probabilistic framework called Bayesian multidimensional scaling (BMDS) to find a single consistent representation of antigenic distances between viruses by placing strains on a two-dimensional map. We integrate sequence evolution by treating BMDS location as a continuous diffusion across the phylogenetic tree. In this context, we examine the process of antigenic drift and investigate historical choices in vaccine strain by the World Health Organization.

Multiscale evolutionary dynamics of HIV

Dr Katrina Lythgoe, Imperial College London, UK

Abstract

Through the use of next-generation sequencing, evidence is growing that ancestral HIV-1 genotypes (i.e. the viral genotypes observed during early infection) are, at least sometimes, preferentially transmitted over the majority virus circulating in a donor at the time of transmission.  This ancestral virus probably persists at a low frequency within hosts due to the cycling of virus through very long-lived memory CD4+ T-Cells, a process that we call ‘store and retrieve’.  We show how incorporating the store and retrieve process into our models can help explain two puzzling phenomena: (1) the fact that HIV-1 appears to evolve much faster within individuals than it does at the epidemic level and (2) the low levels of resistance found in developed countries despite the widespread use of antiretroviral drugs.  The preferential transmission of ancestral virus needs to be properly integrated into evolutionary models if we are to accurately predict the evolution of immune escape, drug resistance and virulence in HIV-1 at the population level. Moreover, early infection viruses should be the major target for vaccine design, since these are the viral strains primarily involved in transmission.

Session 4

Chair

Dr Oliver Pybus, University of Oxford, UK

Integrating sequence variation and protein structure to identify sites under positive or negative selection

Professor Claus Wilke, University of Texas, USA

Abstract

We present a novel method to identify sites under positive or negative selection in protein-coding genes. Our method combines a traditional Goldman-Yang model of coding-sequence evolution with information obtained from the 3D structure of the evolving protein, specifically the relative solvent accessibility (RSA) of individual residues. We allow individual sites to fall into different evolutionary-rate classes, and we model the RSA-dependence of rate classes via linear functions. We demonstrate that our RSA-dependent model provides a significantly better fit to molecular sequence data than a traditional, RSA-independent model. We further show that our model provides a natural, RSA-dependent neutral baseline for the evolutionary rate ratio omega=dN/dS, and that sites that deviate from this neutral baseline can be considered to be positively or negatively selected. We apply our method to the influenza proteins haemagglutinin and neuraminidase. For haemagglutinin, our method recovers positively selected sites in known antibody binding regions or near the sialic-acid binding site. For neuraminidase, which has no sites with omega>1, our method recovers positively selected sites involved with Tamiflu resistance and negatively selected sites that participate in important stabilizing hydrogen bonds.

Within-host and between-host evolutionary rates across the HIV-1 genome

Dr Samuel Alizon, CNRS, France

Abstract

HIV evolves rapidly over the course of an infection due to its short generation times and to the selective pressure exerted by the host’s immune response. The virus is therefore subject to multi-level selective pressures: at the within-host level, natural selection favours virus strains that grow rapidly inside the host, whereas at the between-host level it favours strains that spread rapidly in the host population. HIV within-host evolutionary rates have been suggested to be approximately 10 times higher than its between-host evolutionary rates. However, this conclusion is based on few analyses of a short portion of the virus envelope gene, and it has been shown, for instance for HCV, that a difference in evolutionary rates can be restricted to a small genomic region. Here, we study in detail these evolutionary rates across the HIV genome using longitudinal data collected in two hosts, one of which is a long-term non-progressor. Our results provide the first large-scale overview of the differences in the HIV rates of molecular evolution at the within- and between-host levels. This work has implications for the understanding of the role of the transmission bottleneck in the evolutionary dynamics of HIV.

Coauthor: Christophe Fraser

Toward realistic models for the evolutionary emergence of novel pathogens

Dr James Lloyd-Smith, University of California, Los Angeles, USA

Abstract

Over the past decade, a nascent body of theory has explored the process by which novel pathogen strains can emerge by evolutionary adaptation in response to new environments (such as new host species).  These models have clarified basic principles, but their depiction of pathogen evolution has been simplistic and there has been almost no connection to empirical research.  In this talk I will present several new models aiming to address these shortcomings.  First I will show how consideration of more complex genotype spaces, motivated by empirical research, can overturn the standard finding that higher mutation rates lead to greater probability of emergence.  Next I will introduce a cross-scale model for pathogen emergence, which accounts for selection acting at within-host and population scales, and show how cross-scale conflicts in selection can prevent emergence of a nearby, fitter genotype.  Finally, time permitting, I will present a phylodynamic analysis of the emergence of transmissible defective dengue viruses in Southeast Asia in 2001, and discuss common principles and lessons for pathogen emergence research in general.
