A general framework for branch length estimation in Ancestral Recombination Graphs.
Yun Deng, William S DeWitt, Yun S Song, Rasmus Nielsen
Abstract
Inference of Ancestral Recombination Graphs (ARGs) is of central interest in the analysis of genomic variation. ARGs can be specified in terms of topologies and coalescence times. The coalescence times are usually estimated using an informative prior derived from coalescent theory, but this may generate biased estimates and can also complicate downstream inferences based on ARGs. Here, we introduce Prior-Oblivious Length Estimation in Genealogies with Oriented Networks (POLEGON), an approach for estimating ARG branch lengths that uses an uninformative prior. Using extensive simulations, we show that this method provides improved estimates of coalescence times and leads to more accurate inferences of effective population sizes under a wide range of demographic assumptions (population expansion, bottleneck, split, etc.). It also improves other downstream inferences, including estimates of mutation rates. We apply the method to data from the 1000 Genomes Project to investigate population size histories and differential mutation signatures across populations. We also estimate coalescence times in the Human Leukocyte Antigen (HLA) region and show that they exceed 30 million years in multiple segments.