Antibodies must recognize a great diversity of antigens to protect us from infectious disease. The binding properties of antibodies are determined by the sequences of their corresponding B cell receptors (BCRs). These BCR sequences are created in "draft" form by VDJ recombination, which randomly selects V, D, and J genes, deletes nucleotides from their ends, and joins them together with additional random nucleotides. If they pass initial screening and bind an antigen, these sequences then undergo an evolutionary process of mutation and selection, "revising" the BCR to improve binding to its cognate antigen. It has recently become possible to sequence the BCRs resulting from this process in high throughput. Although these sequences implicitly contain a wealth of information about both antigen exposure and the process by which we learn to resist pathogens, this information can only be extracted using computer algorithms.
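As a rough illustration of the "draft" step described above, the following Python sketch simulates a toy rearrangement: it picks one V, D, and J segment at random, trims a random number of bases from the joining ends, and inserts random nucleotides at the two junctions. The gene segments, the function names (`trim`, `random_insertion`, `rearrange`), and the uniform deletion and insertion lengths are all assumptions made for illustration; they are not real germline genes and not the model presented in the talk.

```python
import random

# Hypothetical toy germline segments; real V, D, and J genes are much longer.
V_GENES = ["CAGGTGCAGCTGGTG", "GAGGTGCAGCTGTTG"]
D_GENES = ["GGTATAGCAGC", "GTATTACTATG"]
J_GENES = ["ACTACTTTGACTAC", "CTGGTTCGACCCC"]


def trim(seq, rng, max_del, left, right):
    """Delete between 0 and max_del bases from each selected end of seq."""
    n_left = rng.randint(0, max_del) if left else 0
    n_right = rng.randint(0, max_del) if right else 0
    return seq[n_left:len(seq) - n_right]


def random_insertion(rng, max_len):
    """Return between 0 and max_len uniformly random nucleotides."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def rearrange(rng, max_del=3, max_ins=4):
    """Simulate one toy rearrangement (uniform choices are an assumption)."""
    # V is trimmed at its 3' end, D at both ends, J at its 5' end.
    v = trim(rng.choice(V_GENES), rng, max_del, left=False, right=True)
    d = trim(rng.choice(D_GENES), rng, max_del, left=True, right=True)
    j = trim(rng.choice(J_GENES), rng, max_del, left=True, right=False)
    vd = random_insertion(rng, max_ins)  # inserted bases at the V-D junction
    dj = random_insertion(rng, max_ins)  # inserted bases at the D-J junction
    return {"v": v, "vd": vd, "d": d, "dj": dj, "j": j,
            "sequence": v + vd + d + dj + j}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    for _ in range(3):
        print(rearrange(rng)["sequence"])
```

Inference runs in the opposite direction: given only the final sequence, one must work out which segments, deletions, and insertions most plausibly produced it.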
In this talk, I will describe two recent projects to develop model-based inferential tools for analyzing BCR sequences: first, a hidden Markov model (HMM) framework to reconstruct BCR rearrangement events and determine which BCRs derive from the same rearrangement event, and second, a novel method for assessing selection on BCRs that sidesteps the difficulty of distinguishing selection from motif-driven mutation. We use this new method to derive a per-residue map of selection from millions of reads, which provides a more nuanced view of the constraints on framework and variable regions.
This work is joint with Trevor Bedford (Fred Hutch), Vladimir Minin (UW Statistics), and Duncan Ralph (Fred Hutch).