ResearchHub | Open Science Community

Functional antibodies exhibit light chain coherence

David Jaffe et al., Apr 25, 2022

The vertebrate adaptive immune system modifies the genome of individual B cells to encode antibodies binding particular antigens [1]. In most mammals, antibodies are composed of a heavy and a light chain, which are sequentially generated by recombination of V, D (for heavy chains), J, and C gene segments. Each chain contains three complementarity-determining regions (CDR1-3), contributing to antigen specificity. Certain heavy and light chains are preferred for particular antigens [2–21]. We considered pairs of B cells sharing the same heavy chain V gene and CDRH3 amino acid sequence and isolated from different donors, also known as public clonotypes [22,23]. We show that for naive antibodies (not yet adapted to antigens), the probability that they use the same light chain V gene is ∼10%, whereas for memory (functional) antibodies it is ∼80%. This property of functional antibodies is a phenomenon we call light chain coherence. We also observe it when similar heavy chains recur within a donor. Thus, though naive antibodies appear to recur by chance, the recurrence of functional antibodies reveals surprising constraint and determinism in the processes of V(D)J recombination and immune selection. For most functional antibodies, the heavy chain determines the light chain.
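The pairwise statistic described in this abstract can be illustrated with a minimal Python sketch. It assumes each cell is a record with hypothetical field names (`donor`, `heavy_v`, `cdrh3`, `light_v`) and uses invented toy data; it is not the authors' pipeline, only one way to count how often cross-donor pairs with the same heavy chain V gene and CDRH3 amino acid sequence also share a light chain V gene.

```python
from itertools import combinations


def light_chain_coherence(cells):
    """Fraction of cross-donor cell pairs that share heavy V gene and CDRH3
    and also use the same light chain V gene. Returns None if no such pairs."""
    # Group cells by (heavy chain V gene, CDRH3 amino acid sequence).
    groups = {}
    for cell in cells:
        groups.setdefault((cell["heavy_v"], cell["cdrh3"]), []).append(cell)

    same_light = total = 0
    for members in groups.values():
        for a, b in combinations(members, 2):
            if a["donor"] == b["donor"]:
                continue  # only pairs from different donors are counted
            total += 1
            same_light += a["light_v"] == b["light_v"]
    return same_light / total if total else None


# Toy data (invented for illustration only).
toy_cells = [
    {"donor": "d1", "heavy_v": "IGHV3-23", "cdrh3": "CARDYW", "light_v": "IGKV1-39"},
    {"donor": "d2", "heavy_v": "IGHV3-23", "cdrh3": "CARDYW", "light_v": "IGKV1-39"},
    {"donor": "d3", "heavy_v": "IGHV3-23", "cdrh3": "CARDYW", "light_v": "IGLV2-14"},
    {"donor": "d1", "heavy_v": "IGHV1-69", "cdrh3": "CARGGW", "light_v": "IGKV3-20"},
    {"donor": "d1", "heavy_v": "IGHV1-69", "cdrh3": "CARGGW", "light_v": "IGKV3-20"},
]
coherence = light_chain_coherence(toy_cells)
```

In the toy data, three cross-donor pairs share a heavy chain and one of them shares a light chain V gene; the last two cells come from the same donor and are skipped.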

Genetics

Immunology

48

Paper


enclone: precision clonotyping and analysis of immune receptors

David Jaffe et al., Apr 22, 2022

Abstract: Half a billion years of evolutionary battle forged the vertebrate adaptive immune system, an astonishingly versatile factory for molecules that can adapt to arbitrary attacks. The history of an individual encounter is chronicled within a clonotype: the descendants of a single fully rearranged adaptive immune cell. For B cells, reading this immune history for an individual remains a fundamental challenge of modern immunology. Identification of such clonotypes is a magnificently challenging problem for three reasons: (1) The cell history is inferred rather than directly observed: the only available data are the sequences of V(D)J molecules occurring in a sample of cells. (2) Each immune receptor is a pair of V(D)J molecules. Identifying these pairs at scale is a technological challenge and cannot be done with perfect accuracy: real samples are mixtures of cells and fragments thereof. (3) These molecules can be intensely mutated during the optimization of the response to particular antigens, blurring distinctions between kindred molecules. It is thus impossible to determine clonotypes exactly. All solutions to this problem make a trade-off between sensitivity and specificity; useful solutions must address actual artifacts found in real data. We present enclone [1], a system for computing approximate clonotypes from single cell data, and demonstrate its use and value with the 10x Genomics Immune Profiling Solution. To test it, we generate data for 1.6 million individual B cells, from four humans, including deliberately enriched memory cells, to tax the algorithm and provide a resource for the community. We analytically determine the specificity of enclone’s clonotyping algorithm, showing that on this dataset the probability of co-clonotyping two unrelated B cells is around 10⁻⁹. We prove that using only heavy chains increases the error rate by two orders of magnitude. enclone comprises a comprehensive toolkit for the analysis and display of immune receptor data.
It is ultra-fast, easy to install, has public source code, comes with public data, and is documented at bit.ly/enclone. It has three “flavors” of use: (1) as a command-line tool run from a terminal window that yields visual output; (2) as a command-line tool that yields parseable output that can be fed to other programs; and (3) as a graphical version (GUI).
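One way to make the idea of a co-clonotyping error rate concrete is an empirical check on data from several donors: two cells from different donors cannot be true clonal relatives, so any cross-donor pair placed in the same clonotype is an error. The Python sketch below is not enclone's algorithm or the analytic derivation the abstract refers to; the function name, the input layout of `(donor, clonotype_id)` tuples, and the toy data are assumptions made for illustration.

```python
from collections import Counter


def pairs(k):
    """Number of unordered pairs among k items."""
    return k * (k - 1) // 2


def cross_donor_error_rate(cells):
    """cells: list of (donor, clonotype_id) tuples.
    Returns the fraction of cross-donor cell pairs assigned to the same
    clonotype, or None when there are no cross-donor pairs."""
    donor_counts = Counter(donor for donor, _ in cells)
    clonotype_counts = Counter(clono for _, clono in cells)
    donor_clonotype_counts = Counter(cells)

    # All cross-donor pairs = all pairs minus same-donor pairs.
    cross_pairs = pairs(len(cells)) - sum(pairs(k) for k in donor_counts.values())
    # Co-clonotyped cross-donor pairs = co-clonotyped pairs minus those
    # whose two cells come from the same donor.
    bad_pairs = (sum(pairs(k) for k in clonotype_counts.values())
                 - sum(pairs(k) for k in donor_clonotype_counts.values()))
    return bad_pairs / cross_pairs if cross_pairs else None


# Toy assignments (invented): clonotype c1 wrongly spans donors d1 and d2.
toy = [("d1", "c1"), ("d1", "c1"), ("d2", "c1"), ("d2", "c2"), ("d3", "c3")]
rate = cross_donor_error_rate(toy)
```

Counting pairs from group sizes rather than enumerating them keeps the check linear in the number of cells, which matters at the scale of millions of cells mentioned in the abstract.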

Biophysics

Immunology

40

Paper


Comparing 10x Genomics single-cell 3’ and 5’ assay in short- and long-read sequencing

Justine Hsu et al., Oct 28, 2022

Abstract: Barcoding strategies are fundamental to droplet-based single-cell sequencing, and understanding the biases and caveats between approaches is essential. Here, we comprehensively evaluated both short and long reads of the cDNA obtained through the two marketed approaches from 10x Genomics, the “3’ assay” and the “5’ assay”, which attach barcodes at different ends of the mRNA molecule. Although the barcode detection, cell-type identification, and gene expression profile are similar in both assays, the 5’ assay captured more exonic molecules and fewer intronic molecules compared to the 3’ assay. We found that 13.7% of genes sequenced have longer average read lengths and are more complete (spanning both polyA-site and TSS) in the long reads from the 5’ assay compared to the 3’ assay. These genes are characterized by long average transcript length, high intron number, and low expression overall. Despite these differences, cell-type-specific isoform profiles observed from the two assays remain highly correlated. This study provides a benchmark for choosing the single-cell assay for the intended research question, and insights regarding platform-specific biases to be mindful of when analyzing data, particularly across samples and technologies.

Genetics

Molecular Biology

1

Paper


Resolving the Full Spectrum of Human Genome Variation using Linked-Reads

Patrick Marks et al., Dec 8, 2017

Large-scale population-based analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. Standard short-read approaches, used primarily due to accuracy, throughput and costs, fail to give a complete picture of a genome. They struggle to identify large, balanced structural events, cannot access repetitive regions of the genome and fail to resolve the human genome into its two haplotypes. Here we describe an approach that retains long-range information while harnessing the power of short reads. Starting from only ~1 ng of DNA, we produce barcoded short-read libraries. The use of novel informatic approaches allows the barcoded short reads to be associated with the long molecules of origin, producing a novel datatype known as 'Linked-Reads'. This approach allows for simultaneous detection of small and large variants from a single Linked-Read library. We have previously demonstrated the utility of whole genome Linked-Reads (lrWGS) for performing diploid, de novo assembly of individual genomes (Weisenfeld et al. 2017). In this manuscript, we show the utility of reference-based analysis using a single Linked-Read library for full spectrum genome analysis. We demonstrate the ability of Linked-Reads to reconstruct megabase-scale haplotypes and to recover parts of the genome that are typically inaccessible to short reads, including phenotypically important genes such as STRC, SMN1 and SMN2. We demonstrate the ability of both lrWGS and Linked-Read Whole Exome Sequencing (lrWES) to identify complex structural variations, including balanced events, single exon deletions, and single exon duplications. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
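The idea of associating barcoded short reads with their long molecules of origin can be sketched in Python as follows. This is a simplified illustration and not the published pipeline: it assumes reads are already aligned and given as hypothetical `(barcode, chrom, pos)` tuples, and it treats reads with the same barcode on the same chromosome as one molecule unless they are separated by more than `max_gap` bases, where the 50 kb default is an arbitrary illustrative value.

```python
from collections import defaultdict


def infer_molecules(reads, max_gap=50_000):
    """Group aligned reads into inferred long molecules.

    reads: iterable of (barcode, chrom, pos) tuples.
    Returns a list of (barcode, chrom, start, end, n_reads) tuples.
    max_gap is a hypothetical threshold chosen for illustration."""
    positions_by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        positions_by_key[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(positions_by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule, start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


# Toy alignments (invented for illustration).
toy_reads = [
    ("BC1", "chr1", 1_000), ("BC1", "chr1", 21_000), ("BC1", "chr1", 40_000),
    ("BC1", "chr1", 900_000), ("BC2", "chr1", 5_000),
]
molecules = infer_molecules(toy_reads)
```

Here the first three BC1 reads fall within the gap threshold and form one inferred molecule, while the distant BC1 read and the BC2 read each form their own.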

Genetics

Molecular Biology

0

Paper
