Beyond the S. aureus comet: what tree shapes occur in large bacterial genomic data?
Professor Caroline Colijn, Simon Fraser University, Canada
When methicillin-resistant Staphylococcus aureus (MRSA) arose and disseminated widely, some phylogenetic trees of MRSA-containing types of Staphylococcus aureus had a distinctive 'comet' shape, with a 'comet head' of recently adapted resistant isolates set against a 'comet tail' that was predominantly drug sensitive. Placing an isolate in the context of such a 'comet' helped public health laboratories interpret local data within the broader setting of S. aureus evolution. In this work, Professor Colijn and her colleagues ask what other tree shapes, analogous to the MRSA comet, are present in bacterial whole-genome sequencing (WGS) datasets. They extract trees from large bacterial genomic datasets, visualise them as images, and cluster the images. They find nine major groups of tree images, including the 'comet', star-like phylogenies, 'barbell' phylogenies and other shapes, and comment on the evolutionary and epidemiological stories these shapes might illustrate.
Genome-scale metabolic network reconstructions of hundreds of diverse Escherichia coli strains reveal strain-specific adaptations and evolutionary trajectories
Dr Jonathan Monk, University of California San Diego, USA
Bottom-up approaches to systems biology rely on constructing a mechanistic basis for the biochemical and genetic processes that underlie cellular functions. Genome-scale network reconstructions of metabolism are built from all known metabolic reactions and metabolic genes in a target organism. A network reconstruction can be converted into a mathematical format and thus lends itself to mathematical analysis. Genome-scale models (GEMs) enable a systems approach to characterising the pan and core metabolic capabilities of the E. coli species. The models have been used to systematically analyse growth capabilities in more than 650 different growth-supporting environments, as well as to predict strain-specific auxotrophies. In this work, genome-scale models were constructed for more than 300 representative strains of E. coli across all 295 HC1100 levels. The models were used to study E. coli metabolic diversity and speciation on a large scale. The results show that unique strain-specific metabolic capabilities correspond to pathotypes and environmental niches. Genome-scale analysis of multiple strains of a species can thus be used to define the metabolic essence of a microbial species and to delineate growth differences that shed light on the process of adaptation to a particular microenvironment.
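To illustrate what converting a reconstruction into a mathematical format can look like, here is a minimal sketch assuming the widely used stoichiometric-matrix representation; the abstract does not name the format used. The three-reaction network and the names `reactions`, `stoichiometric_matrix` and `net_production` are hypothetical and are not taken from the E. coli models described in the talk.

```python
# Hypothetical toy network (not from the talk): uptake -> A, A -> B, B -> export.
# Each reaction maps metabolite names to stoichiometric coefficients
# (negative = consumed, positive = produced).
reactions = {
    "uptake_A": {"A": 1},
    "A_to_B": {"A": -1, "B": 1},
    "export_B": {"B": -1},
}
metabolites = ["A", "B"]


def stoichiometric_matrix(metabolites, reactions):
    """Return S as a list of rows: one row per metabolite, one column per reaction."""
    return [[reactions[r].get(m, 0) for r in reactions] for m in metabolites]


def net_production(S, fluxes):
    """Compute S * v: the net production rate of each metabolite for flux vector v."""
    return [sum(coeff * v for coeff, v in zip(row, fluxes)) for row in S]


S = stoichiometric_matrix(metabolites, reactions)

# A flux vector is at steady state when every metabolite has zero net production.
balanced = net_production(S, [2.0, 2.0, 2.0])
unbalanced = net_production(S, [2.0, 1.0, 1.0])

print(S)           # [[1, -1, 0], [0, 1, -1]]
print(balanced)    # [0.0, 0.0]
print(unbalanced)  # [1.0, 0.0]  (metabolite A accumulates)
```

In this representation, questions such as whether a strain can grow in a given environment are typically posed as constraints on the flux vector, which is what makes the reconstruction amenable to mathematical analysis.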
New methods with high accuracy and scalability for large-scale phylogenetic estimation
Professor Tandy Warnow, University of Illinois, USA
The estimation of phylogenetic trees for individual genes or multi-locus datasets is a basic part of much biological research. To enable large trees to be computed, Disjoint Tree Mergers (DTMs) have been developed; these methods operate by dividing the input sequence dataset into disjoint subsets, constructing a tree on each subset, and then combining the subset trees (using auxiliary information) into a tree on the full dataset. DTMs have been used to advantage for multi-locus species tree estimation, enabling highly accurate species trees at reduced computational effort compared to leading species tree estimation methods. The talk will show that DTMs can be used to improve the accuracy and speed of species tree estimation methods (eg, ASTRAL) as well as gene tree estimation methods (eg, RAxML), thus enabling these methods to run efficiently on much larger datasets than currently possible, and without the need for high-performance computing platforms or massive parallelism. These methods are available in open source form on GitHub.
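The divide, build and merge steps described above can be sketched as follows. This is a deliberately naive illustration and not any of the DTM methods from the talk. It assumes pairwise distances as the auxiliary information, splits the taxa into fixed-size chunks, uses average-linkage clustering in place of a real tree estimator, and keeps each subset tree intact as a clade when merging. All function names are hypothetical.

```python
from itertools import combinations


def average_distance(a, b, dist):
    """Mean pairwise distance between two lists of leaves."""
    return sum(dist[frozenset((x, y))] for x in a for y in b) / (len(a) * len(b))


def agglomerate(clusters, dist):
    """Repeatedly join the two closest clusters (average linkage).

    clusters is a list of (tree, leaves) pairs; a tree is a leaf name or a
    pair of subtrees. Returns a single (tree, leaves) pair.
    """
    clusters = list(clusters)
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_distance(clusters[p[0]][1], clusters[p[1]][1], dist),
        )
        (tree_i, leaves_i), (tree_j, leaves_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), leaves_i + leaves_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


def disjoint_tree_merge(taxa, dist, subset_size):
    """Toy divide-and-conquer pipeline in the spirit of a DTM (assumptions above)."""
    # 1. Divide the taxa into disjoint subsets (naive fixed-size chunks).
    subsets = [taxa[k:k + subset_size] for k in range(0, len(taxa), subset_size)]
    # 2. Construct a tree on each subset independently.
    subset_trees = [agglomerate([(t, [t]) for t in s], dist) for s in subsets]
    # 3. Combine the subset trees using the auxiliary distances; each subset
    #    tree is treated as an indivisible cluster, so it survives unchanged.
    return agglomerate(subset_trees, dist)[0]


taxa = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 1.0, frozenset(("C", "D")): 1.0,
    frozenset(("A", "C")): 4.0, frozenset(("A", "D")): 4.0,
    frozenset(("B", "C")): 4.0, frozenset(("B", "D")): 4.0,
}
tree = disjoint_tree_merge(taxa, dist, subset_size=2)
print(tree)  # (('A', 'B'), ('C', 'D'))
```

The computational saving comes from step 2: the expensive tree estimation runs only on small subsets, and only the comparatively cheap merge step touches the full dataset.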
Dr Nicholas Croucher, Imperial College London, UK
Dr John Lees, Imperial College London, UK
Dr Cheryl P Andam, University at Albany, State University of New York, USA