Mutations, genetic identity, and data granularity

MathBio Seminar

Monday, April 24, 2017 - 4:00pm

Jun Li

University of Michigan

Location

University of Pennsylvania

318 Carolyn Lynch Lab

I will talk about two studies in which new insights emerged once we worked at a different level of data granularity. First, in collaboration with Sebastian Zoellner, we analyzed ~36 million extremely rare variants (defined as singletons in ~4,000 individuals) uniformly ascertained in an as-yet-unpublished whole-genome sequencing dataset. Our goal was to estimate mutation rate variation across the genome and to identify genomic and sequence-based predictors of such variation. We found that some genomic features, such as H3K36me3 peaks and CpG islands, can either increase or decrease mutation rates depending on the adjacent sequence context. This shows that their impact on mutation rates cannot be understood by studying all mutation subtypes in aggregate. In the second study, in collaboration with Noah Rosenberg, we assessed the possibility of using an individual's microsatellite genotype data to find matching records in a database of SNP genotypes, even when the two datasets share no markers. Using ~1,000 samples typed on both the 13 tandem repeat markers in the FBI standard forensic panel and 650K common variants routinely typed in GWAS, we demonstrate the feasibility of cross-identifying individuals between the criminal justice system on the one hand and genetic or ancestry research on the other. These results add to the list of examples where group-level patterns cannot always be transferred to the individual level, or vice versa. Choosing the right level of granularity for an inquiry thus continues to be one of the biggest challenges in data science.
