MathBio Seminar

Monday, May 1, 2017 - 4:00pm

Pier Francesco Palamara

Harvard School of Public Health

Location

University of Pennsylvania

318 Carolyn Lynch Lab

Coalescent hidden Markov models (HMMs) such as the pairwise sequentially Markovian coalescent (PSMC, Li and Durbin, 2010) make it possible to estimate the locus-specific posterior distribution of the time to most recent common ancestor (TMRCA) of a pair of haploid chromosomes when high-coverage sequencing data are available. I will present the “ascertained sequentially Markovian coalescent” (ASMC), a coalescent HMM that can accurately estimate locus-specific TMRCA probabilities in widely available SNP array data. ASMC uses an extremely efficient recursive formulation of the forward/backward HMM algorithm, which enables the analysis of very large data sets and the reconstruction of a detailed landscape of coalescent times along the genome. I will describe results from running ASMC in several cohorts, including ~120,000 unrelated British individuals from the UK Biobank data set, where we find that multiple loci underwent positive selection during the past ~200 generations. At deeper time scales, we detect widespread negative selection concentrated in regions enriched for heritability in several disease phenotypes.
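
For readers unfamiliar with the forward/backward algorithm mentioned above, the sketch below shows the textbook version for a generic discrete HMM. It is not ASMC's recursive formulation, which the abstract does not spell out. Here the hidden states stand in for discretized TMRCA intervals and the observations for per-site allele sharing (0 = match, 1 = mismatch); the function name `posterior_states` and every number in the toy parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def posterior_states(init, trans, emit, obs):
    """Posterior P(state at site t | all observations) for a discrete HMM.

    init:  (K,)   initial state probabilities
    trans: (K, K) trans[i, j] = P(next state j | current state i)
    emit:  (K, M) emit[i, o]  = P(observation o | state i)
    obs:   sequence of observation indices, one per site
    """
    n_sites, n_states = len(obs), len(init)
    fwd = np.zeros((n_sites, n_states))
    bwd = np.ones((n_sites, n_states))

    # Forward pass; each row is rescaled to sum to 1 to avoid underflow.
    fwd[0] = init * emit[:, obs[0]]
    fwd[0] /= fwd[0].sum()
    for t in range(1, n_sites):
        fwd[t] = (fwd[t - 1] @ trans) * emit[:, obs[t]]
        fwd[t] /= fwd[t].sum()

    # Backward pass, rescaled the same way.
    for t in range(n_sites - 2, -1, -1):
        bwd[t] = trans @ (emit[:, obs[t + 1]] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()

    # The per-site scaling constants cancel when each row is normalized.
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)

# Toy example (hypothetical values): two TMRCA bins, "recent" and "ancient".
init = np.array([0.5, 0.5])
trans = np.array([[0.95, 0.05],
                  [0.05, 0.95]])
emit = np.array([[0.99, 0.01],   # recent ancestry: mismatches are rare
                 [0.90, 0.10]])  # ancient ancestry: mismatches more likely
obs = [0, 0, 1, 1, 0]
posterior = posterior_states(init, trans, emit, obs)  # shape (5, 2)
```

This plain version costs O(K²) per site for K hidden states, which is the kind of cost an efficient recursive formulation would aim to reduce when the analysis covers very many pairs of chromosomes.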