Untangling the influences of unmodeled evolutionary processes on phylogenetic signal in a forensically important HIV-1 transmission cluster

Vinson P. Doyle, Louisiana State University
John J. Andersen, Louisiana State University
Bradley J. Nelson, Louisiana State University
Michael L. Metzker, Baylor College of Medicine
Jeremy M. Brown, Louisiana State University

Abstract

Stochastic models of sequence evolution have been developed to reflect many biologically important processes, allowing for accurate phylogenetic reconstruction when an appropriate model is selected. However, commonly used models do not incorporate several potentially important biological processes. Spurious phylogenetic inference may result if these processes play an important role in the evolution of a dataset yet are not incorporated into assumed models. Few studies have attempted to assess the relative importance of multiple processes in producing spurious inferences. The application of phylogenetic methods to infer the source of HIV-1 transmission clusters depends upon accurate phylogenetic results, yet there are several relevant unmodeled biological processes (e.g., recombination and convergence) that may cause complications. Here, through analyses of HIV-1 env sequences from a small, forensically important transmission cluster, we tease apart the impact of these processes and present evidence suggesting that convergent evolution and high rates of insertions and deletions (causing alignment uncertainty) led to spurious phylogenetic signal with forensic relevance. Previous analyses show paraphyly of HIV-1 lineages sampled from an individual who, based on non-phylogenetic evidence, had never acted as a source of infection for others in this transmission cluster. If true, this pattern calls into question assumptions underlying phylogenetic approaches to source and recipient identification. By systematically assessing the contribution of different unmodeled processes, we demonstrate that removal of sites likely influenced by strong positive selection both reduces the alignment-wide signal supporting paraphyly of viruses sampled from this individual and eliminates support for the effects of recombination. Additionally, the removal of ambiguously aligned sites alters strongly supported relationships among viruses sampled from different individuals. These observations highlight the need to jointly consider multiple unmodeled evolutionary processes and motivate a phylogenomic perspective when inferring viral transmission histories. © 2014 Elsevier Inc.