ResearchHub | Open Science Community

Seasonal dynamics of the wild rodent faecal virome

Jayna Raghwani et al., Feb 9, 2022

Abstract: Viral discovery studies in wild animals often rely on cross-sectional surveys at a single time point. As a result, our understanding of the temporal stability of wild animal viromes remains poorly resolved. While studies of single host-virus systems indicate that host and environmental factors influence seasonal virus transmission dynamics, comparable insights for whole viral communities in multiple hosts are lacking. Leveraging non-invasive faecal samples from a long-term wild rodent study, we characterised viral communities of three common European rodent species (Apodemus sylvaticus, A. flavicollis, and M. glareolus) living in temperate woodland over a single year. Our findings indicate that a substantial fraction of the rodent virome is seasonally transient and associated with vertebrate or bacterial hosts. Further analyses of one of the most abundant virus families, Picornaviruses, show pronounced temporal changes in viral richness and diversity, which were associated with host density, ambient temperature, rainfall and humidity, both concurrently and at lags of up to ∼3 months. This suggests complex feedbacks from host and environmental factors on virus transmission and shedding in seasonal habitats. Overall, this study emphasises the importance of understanding the seasonal dynamics of wild animal viromes in order to better predict and mitigate zoonotic risks.

Ecology

Biochemistry

Jointly inferring the dynamics of population size and sampling intensity from molecular sequences

Kris Parag et al., Jul 2, 2019

Abstract: Estimating past population dynamics from molecular sequences that have been sampled longitudinally through time is an important problem in infectious disease epidemiology, molecular ecology and macroevolution. Popular solutions, such as the skyline and skygrid methods, infer past effective population sizes from the coalescent event times of phylogenies reconstructed from sampled sequences, but assume that sequence sampling times are uninformative about population size changes. Recent work has started to question this assumption by exploring how sampling time information can aid coalescent inference. Here we develop, investigate, and implement a new skyline method, termed the epoch sampling skyline plot (ESP), to jointly estimate the dynamics of population size and sampling rate through time. The ESP is inspired by real-world data collection practices and comprises a flexible model in which the sequence sampling rate is proportional to the population size within an epoch but can change discontinuously between epochs. We show that the ESP is accurate under several realistic sampling protocols and we prove analytically that it can at least double the best precision achievable by standard approaches. We generalise the ESP to incorporate phylogenetic uncertainty in a new Bayesian package (BESP) in BEAST2. We re-examine two well-studied empirical datasets from virus epidemiology and molecular evolution and find that the BESP improves upon previous coalescent estimators and generates new, biologically useful insights into the sampling protocols underpinning these datasets. Sequence sampling times provide a rich source of information for coalescent inference that will become increasingly important as sequence collection intensifies and becomes more formalised.
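
For readers unfamiliar with the "standard approaches" the abstract compares against, the following is a minimal sketch of the textbook skyline estimators, not the ESP or BESP themselves. It assumes an isochronously sampled tree in which each interval ends in exactly one coalescent event, with time measured in units where the coalescent rate for k lineages is C(k, 2) / N. The function names and input layout are hypothetical and chosen only for illustration.

```python
from math import comb


def classic_skyline(lineages, durations):
    """Per-interval estimate N_i = C(k_i, 2) * dt_i.

    lineages[i] is the number of lineages during interval i and
    durations[i] is the length of that interval.
    """
    return [comb(k, 2) * dt for k, dt in zip(lineages, durations)]


def grouped_skyline(lineages, durations, group_size):
    """Piecewise-constant estimate that pools consecutive intervals.

    For a segment holding m coalescent events, the maximum likelihood
    estimate of a constant N is (1 / m) * sum_i C(k_i, 2) * dt_i.
    """
    estimates = []
    for start in range(0, len(lineages), group_size):
        ks = lineages[start:start + group_size]
        dts = durations[start:start + group_size]
        total = sum(comb(k, 2) * dt for k, dt in zip(ks, dts))
        estimates.append(total / len(ks))
    return estimates


if __name__ == "__main__":
    # Illustrative (made-up) intervals for a 4-tip tree.
    ks = [4, 3, 2]
    dts = [0.1, 0.2, 0.5]
    print(classic_skyline(ks, dts))
    print(grouped_skyline(ks, dts, group_size=3))
```

Neither function uses the sampling times, which is exactly the information the abstract argues these estimators leave unused.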

Genetics

Artificial Intelligence

A computationally tractable birth-death model that combines phylogenetic and epidemiological data

Alexander Zarebski et al., Oct 22, 2020

Abstract: Inferring the dynamics of pathogen transmission during an outbreak is an important problem in infectious disease epidemiology. In mathematical epidemiology, estimates are often informed by time series of confirmed cases, while in phylodynamics genetic sequences of the pathogen, sampled through time, are the primary data source. Each data type provides different, and potentially complementary, insight; recent studies have recognised that combining data sources can improve estimates of the transmission rate and number of infected individuals. However, inference methods are typically highly specialised and field-specific and are either computationally prohibitive or require intensive simulation, limiting their real-time utility. We present a novel birth-death phylogenetic model and derive a tractable analytic approximation of its likelihood, the computational complexity of which is linear in the size of the dataset. This approach combines epidemiological and phylodynamic data to produce estimates of key parameters of transmission dynamics and the number of unreported infections. Using simulated data we (a) show that the approximation agrees well with existing methods, (b) validate the claim of linear complexity and (c) explore robustness to model misspecification. This approximation facilitates inference on large datasets, which is increasingly important as large genomic sequence datasets become commonplace.

Author summary: Mathematical epidemiologists typically study time series of cases, i.e. the epidemic curve, to understand the spread of pathogens. Genetic epidemiologists study similar problems but do so using measurements of the genetic sequence of the pathogen, which also contain information about the transmission process. There have been many attempts to unite these approaches so that both data sources can be utilised. However, striking a suitable balance between model flexibility and fidelity, in a way that is computationally tractable, has proven challenging; there are several competing methods but for large datasets they are intractable. As sequencing of pathogen genomes becomes more common, and an increasing amount of epidemiological data is collected, this situation will only be exacerbated. To bridge the gap between the time series and genomic methods we developed an approximation scheme, called TimTam, which can accurately and efficiently estimate key features of an epidemic such as the prevalence of the infection and the effective reproduction number, i.e. how many people are currently infected and the degree to which the infection is spreading.

Genetics

Artificial Intelligence

Are skyline plot-based demographic estimates overly dependent on smoothing prior assumptions?

Kris Parag et al., Jan 27, 2020

In Bayesian phylogenetics, the coalescent process provides an informative framework for inferring dynamical changes in the effective size of a population from a sampled phylogeny (or tree) of its sequences. Popular coalescent inference methods such as the Bayesian Skyline Plot, Skyride and Skygrid all model this population size with a discontinuous, piecewise-constant likelihood but apply a smoothing prior to ensure that posterior population size estimates transition gradually with time. These prior distributions implicitly encode extra population size information that is not available from the observed coalescent tree (data). Here we present a novel statistic, Ω, to quantify and disaggregate the relative contributions of the coalescent data and prior assumptions to the resulting posterior estimate precision. Our statistic also measures the additional mutual information introduced by such priors. Using Ω we show that, because it is surprisingly easy to over-parametrise piecewise-constant population models, common smoothing priors can lead to overconfident and potentially misleading conclusions, even under robust experimental designs. We propose Ω as a useful tool for detecting when posterior estimate precision is overly reliant on prior choices.

Genetics

Artificial Intelligence

Adaptive Estimation for Epidemic Renewal and Phylogenetic Skyline Models

Kris Parag et al., Jul 16, 2019

Estimating temporal changes in a target population from phylogenetic or count data is an important problem in ecology and epidemiology. Reliable estimates can provide key insights into the climatic and biological drivers influencing the diversity or structure of that population and support hypotheses concerning its future growth or decline. In infectious disease applications, the individuals infected across an epidemic form the target population. The renewal model estimates the effective reproduction number, R, of the epidemic from counts of its observed cases. The skyline model infers the effective population size, N, underlying a phylogeny of sequences sampled from that epidemic. Practically, R measures ongoing epidemic growth while N informs on historical caseload. While the two models solve distinct problems, the reliability of their estimates depends on p-dimensional piecewise-constant functions. If p is misspecified, the model might underfit significant changes or overfit noise and promote a spurious understanding of the epidemic, which might misguide intervention policies or misinform forecasts. Surprisingly, no transparent yet principled approach for optimising p exists. Usually, p is heuristically set, or obscurely controlled via complex algorithms. We present a computable and interpretable p-selection method based on the minimum description length (MDL) formalism of information theory. Unlike many standard model selection techniques, MDL accounts for the additional statistical complexity induced by how parameters interact. As a result, our method optimises p so that R and N estimates properly adapt to the available data. It also outperforms comparable Akaike and Bayesian information criteria on several classification problems. Our approach requires some knowledge of the parameter space and exposes the similarities between renewal and skyline models.
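
To make the p-selection problem concrete, here is a minimal sketch of the Akaike and Bayesian information criteria that the abstract uses as baselines, applied to a piecewise-constant Poisson rate. It does not implement the MDL method of the paper. The equal-width segmentation, the Poisson count model, and every function name are assumptions made purely for illustration.

```python
from math import lgamma, log


def poisson_loglik(counts, rate):
    """Poisson log-likelihood of counts under a single constant rate."""
    if rate == 0:
        # The fitted rate is a segment mean, so it is zero only when
        # every count in the segment is zero.
        return 0.0
    return sum(c * log(rate) - rate - lgamma(c + 1) for c in counts)


def piecewise_loglik(counts, p):
    """Maximised log-likelihood with p contiguous, near-equal segments.

    Requires 1 <= p <= len(counts) so that no segment is empty.
    """
    n = len(counts)
    bounds = [i * n // p for i in range(p + 1)]
    total = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        segment = counts[lo:hi]
        rate = sum(segment) / len(segment)  # MLE of a constant rate
        total += poisson_loglik(segment, rate)
    return total


def select_p(counts, candidates, criterion="bic"):
    """Return the candidate p with the lowest AIC or BIC, plus all scores."""
    n = len(counts)
    scores = {}
    for p in candidates:
        penalty = p * log(n) if criterion == "bic" else 2 * p
        scores[p] = penalty - 2 * piecewise_loglik(counts, p)
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Made-up counts with a single step change halfway through.
    counts = [2, 2, 2, 2, 20, 20, 20, 20]
    best, scores = select_p(counts, candidates=[1, 2, 4])
    print(best, scores)
```

Both criteria penalise only the number of segments p, which is the limitation the abstract contrasts with MDL's accounting for how parameters interact.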

Genetics

Artificial Intelligence

Optimising Renewal Models for Real-Time Epidemic Prediction and Estimation

Kris Parag et al., Nov 8, 2019

The effective reproduction number, R_t, is an important prognostic for infectious disease epidemics. Significant changes in R_t can forewarn about new transmissions or predict the efficacy of interventions. The renewal model infers R_t from incidence data and has been applied to Ebola virus disease and pandemic influenza outbreaks, among others. This model estimates R_t using a sliding window of length k. While this facilitates real-time detection of statistically significant R_t fluctuations, inference is highly k-sensitive. Models with too large or small k might ignore meaningful changes or over-interpret noise-induced ones. No principled k-selection scheme exists. We develop a practical yet rigorous scheme using the accumulated prediction error (APE) metric from information theory. We derive exact incidence prediction distributions and integrate these within an APE framework to identify the k best supported by available data. We find that this k optimises short-term prediction accuracy and expose how common, heuristic k-choices, which seem sensible, could be misleading.
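
The sliding-window estimate described above can be sketched as follows. This is the commonly used Poisson renewal model with a conjugate gamma prior, shown only to illustrate how the window length k enters the estimate; it is not the APE selection scheme of the paper. The prior hyperparameters, the serial interval weights, and the function names are all illustrative assumptions.

```python
def total_infectiousness(incidence, serial_weights):
    """Lambda_t = sum_u I_{t-u} * w_u, the infection pressure on day t.

    serial_weights[u - 1] is the assumed probability that the serial
    interval equals u days. Lambda_0 is 0 because no earlier cases exist.
    """
    pressure = []
    for t in range(len(incidence)):
        pressure.append(sum(
            incidence[t - u] * w
            for u, w in enumerate(serial_weights, start=1)
            if t - u >= 0
        ))
    return pressure


def windowed_rt_mean(incidence, serial_weights, k, shape=1.0, scale=5.0):
    """Posterior mean of R_t over the window of the last k days.

    With a Gamma(shape, scale) prior and Poisson renewal likelihood,
    the posterior mean is
    (shape + sum of cases) / (1 / scale + sum of Lambda) over the window.
    """
    pressure = total_infectiousness(incidence, serial_weights)
    estimates = {}
    for t in range(k, len(incidence)):
        window = range(t - k + 1, t + 1)
        cases = sum(incidence[s] for s in window)
        exposure = sum(pressure[s] for s in window)
        estimates[t] = (shape + cases) / (1.0 / scale + exposure)
    return estimates


if __name__ == "__main__":
    # Made-up daily incidence and a hypothetical serial interval.
    incidence = [1, 2, 4, 8, 16, 16, 16, 12, 9]
    weights = [0.5, 0.3, 0.2]
    for k in (1, 3, 5):
        print(k, windowed_rt_mean(incidence, weights, k))
```

Running the example with several values of k shows the trade-off the abstract describes: short windows react quickly but are noisy, while long windows smooth over real changes.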

Artificial Intelligence

Epidemiology

Optimal Point Process Filtering for Birth-Death Model Estimation

Kris Parag et al., Jul 19, 2017

The discrete-space, continuous-time birth-death model is a key process for describing phylogenies in the absence of coalescent approximations. Extensively used in macroevolution for analysing diversification, and in epidemiology for estimating viral dynamics, the birth-death process (BDP) is an important null model for inferring the parameters of reconstructed phylogenies. In this paper we show how optimal, point process (Snyder) filtering techniques can be used for parametric inference on BDPs. Specifically, we introduce the Bayesian Snyder filter (SF) to estimate birth and death rate parameters, given a reconstructed phylogeny. Our estimation procedure makes use of the equivalent Markov birth process description for a reconstructed birth-death phylogeny (Nee et al., 1994). We first analyse the popular constant rate BDP and show that our method gives results consistent with previous work. Among these results is an analytic solution to the special case of the Yule-Furry model. We also find an equivalence between the SF Poisson likelihood and two standard conditioned birth-death model likelihoods. We then generalise our estimation problem to BDPs with time-varying rates and numerically solve the SF for two illustrative cases. Our results compare well with a recent Markov chain Monte Carlo method by Höhna et al. (2016) and we numerically show that both methods are solving the same likelihood functions. Lastly we apply the SF to a model selection problem on empirical data. We use the Australian Agamid dataset and predict the same relative model fit as that of the original maximum likelihood technique developed and used by Rabosky (2006) for this dataset. While several capable parametric and non-parametric birth-death estimators already exist, ours is the first to take the Nee et al. approach, and directly computes the posterior distribution of the parameters.
The SF makes no approximations, beyond those required for parameter space discretisation and numerical integration, and is mean square error optimal. It is deterministic, easily implementable and flexible. We think SFs present a promising alternative parametric BDP inference engine.
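
As a point of reference for the Yule-Furry special case mentioned in the abstract, the following minimal sketch computes the textbook maximum likelihood estimate of the birth rate in a pure-birth process: the number of births divided by the total lineage-time observed. It is not the Snyder filter, and the function name and input layout are hypothetical.

```python
def yule_birth_rate_mle(branching_times, end_time, initial_lineages=1):
    """MLE of the per-lineage birth rate in a Yule (pure-birth) process.

    branching_times are the times, measured forward from the root at
    time 0, at which one lineage splits into two. end_time is when
    observation stops (for example, the present).
    """
    lineages = initial_lineages
    previous = 0.0
    total_lineage_time = 0.0
    for t in sorted(branching_times):
        # Accumulate time lived by all lineages since the last event.
        total_lineage_time += lineages * (t - previous)
        lineages += 1
        previous = t
    total_lineage_time += lineages * (end_time - previous)
    return len(branching_times) / total_lineage_time


if __name__ == "__main__":
    # Made-up tree: splits at times 1 and 2, observed until time 3.
    print(yule_birth_rate_mle([1.0, 2.0], end_time=3.0))
```

Because each lineage splits at a constant rate, the likelihood depends on the data only through the event count and the summed lineage-time, which is why a closed-form estimate exists for this case.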

Artificial Intelligence

Demography

On Signalling and Estimation Limits for Molecular Birth-Processes

Kris Parag, May 13, 2018

Understanding and uncovering the mechanisms or motifs that molecular networks employ to regulate noise is a key problem in cell biology. As it is often difficult to obtain direct and detailed insight into these mechanisms, many studies instead focus on assessing the best precision attainable on the signalling pathways that compose these networks. Molecules signal one another over such pathways to solve noise-regulating estimation and control problems. Quantifying the maximum precision of these solutions delimits what is achievable and allows hypotheses about underlying motifs to be tested without requiring detailed biological knowledge. The pathway capacity, which defines the maximum rate of transmitting information along it, is a widely used proxy for precision. Here it is shown, for estimation problems involving elementary yet biologically relevant birth-process networks, that capacity can be surprisingly misleading. A time-optimal signalling motif, called birth-following, is derived and proven to better the precision expected from the capacity, provided the maximum signalling rate constraint is large and the mean one is above a certain threshold. When the maximum constraint is relaxed, perfect estimation is predicted by the capacity. However, the true achievable precision is found to be highly variable and sensitive to the mean constraint. Since the same capacity can map to different combinations of rate constraints, it can only equivocally measure precision. Deciphering the rate constraints on a signalling pathway may therefore be more important than computing its capacity.

Artificial Intelligence

Molecular Biology
