ResearchHub | Open Science Community

Decontamination of ambient RNA in single-cell RNA-seq with DecontX

Shiyi Yang et al., Mar 5, 2020

Abstract: Droplet-based microfluidic devices have become widely used to perform single-cell RNA sequencing (scRNA-seq). However, ambient RNA present in the cell suspension can be aberrantly counted along with a cell’s native mRNA and result in cross-contamination of transcripts between different cell populations. DecontX is a novel Bayesian method to estimate and remove contamination in individual cells. DecontX accurately predicts contamination levels in a mouse-human mixture dataset and removes aberrant expression of marker genes in PBMC datasets. We also compare the contamination levels between four different scRNA-seq protocols. Overall, DecontX can be incorporated into scRNA-seq workflows to improve downstream analyses.

Genetics

Ecology

1

Paper

Whole-genome doubling confers unique genetic vulnerabilities on tumour cells

Ryan Quinton et al., Jan 27, 2021

Genetics

Molecular Biology

0

Paper

Whole genome doubling confers unique genetic vulnerabilities on tumor cells

Ryan Quinton et al., Jun 19, 2020

Summary: Whole genome doubling (WGD) occurs early in tumorigenesis and generates genetically unstable tetraploid cells that fuel tumor development. Cells that undergo WGD (WGD+) must adapt to accommodate their abnormal tetraploid state; however, the nature of these adaptations, and whether they confer vulnerabilities that can subsequently be exploited therapeutically, is unclear. Using sequencing data from ∼10,000 primary human cancer samples and essentiality data from ∼600 cancer cell lines, we show that WGD gives rise to common genetic traits that are accompanied by unique vulnerabilities. We reveal that WGD+ cells are more dependent on spindle assembly checkpoint signaling, DNA replication factors, and proteasome function than WGD− cells. We also identify KIF18A, which encodes a mitotic kinesin, as being specifically required for the viability of WGD+ cells. While loss of KIF18A is largely dispensable for accurate chromosome segregation during mitosis in WGD− cells, its loss induces dramatic mitotic errors in WGD+ cells, ultimately impairing cell viability. Collectively, our results reveal new strategies to specifically target WGD+ cancer cells while sparing the normal, non-transformed WGD− cells that comprise human tissue.

Genetics

Molecular Biology

4

Paper

Celda: A Bayesian model to perform co-clustering of genes into modules and cells into subpopulations using single-cell RNA-seq data

Zhe Wang et al., Nov 17, 2020

Abstract: Single-cell RNA-seq (scRNA-seq) has emerged as a powerful technique to quantify gene expression in individual cells and elucidate the molecular and cellular building blocks of complex tissues. We developed a novel Bayesian hierarchical model called Cellular Latent Dirichlet Allocation (Celda) to perform simultaneous co-clustering of genes into transcriptional modules and cells into subpopulations. Celda can quantify the probabilistic contribution of each gene to each module, each module to each cell population, and each cell population to each sample. We used Celda to identify transcriptional modules and cell subpopulations in a publicly available peripheral blood mononuclear cell (PBMC) dataset. Celda identified a population of proliferating T cells and a single plasma cell which were missed by two other clustering methods. Celda identified transcriptional modules that highlighted unique and shared biological programs across cell types. Celda also outperformed a PCA-based approach for gene clustering on simulated data. Overall, Celda presents a novel statistically principled approach towards characterizing transcriptional programs and cellular heterogeneity in single-cell RNA-seq data.

Genetics

Artificial Intelligence

9

Paper

Decontamination of ambient RNA in single-cell RNA-seq with DecontX

Shiyi Yang et al., Jul 16, 2019

Droplet-based microfluidic devices have become widely used to perform single-cell RNA sequencing (scRNA-seq) and discover novel cellular heterogeneity in complex biological systems. However, ambient RNA present in the cell suspension can be incorporated into these droplets and aberrantly counted along with a cell's native mRNA. This results in cross-contamination of transcripts between different cell populations and can potentially decrease the precision of downstream analyses. We developed a novel hierarchical Bayesian method called DecontX to estimate and remove contamination in individual cells from scRNA-seq data. DecontX accurately predicted the proportion of contaminated counts in a mixture of mouse and human cells. Decontamination of PBMC datasets removed aberrant expression of cell-type-specific marker genes from other cell types and improved overall separation of cell clusters. In general, DecontX can be incorporated into scRNA-seq workflows to assess the quality of dissociation protocols and improve downstream analyses.

Ecology

Biochemistry

0

Paper

Interactive analysis of single-cell data using flexible workflows with SCTK2.0

Yichen Wang et al., Jul 14, 2022

Summary: Analysis of single-cell RNA-seq (scRNA-seq) data can reveal novel insights into heterogeneity of complex biological systems. Many tools and workflows have been developed to perform different types of analysis. However, these tools are spread across different packages or programming environments, rely on different underlying data structures, and can only be utilized by people with knowledge of programming languages. In the Single Cell Toolkit 2.0 (SCTK2.0), we have integrated a variety of popular tools and workflows to perform various aspects of scRNA-seq analysis. All tools and workflows can be run in the R console or using an intuitive graphical user interface built with R/Shiny. HTML reports generated with Rmarkdown can be used to document and recapitulate individual steps or entire analysis workflows. We show that the toolkit offers more features when compared with existing tools and allows for a seamless analysis of scRNA-seq data for non-computational users. Highlights: Intuitive graphical user interface for interactive analysis of scRNA-seq data; allows non-computational users to analyze scRNA-seq data with end-to-end workflows; provides interoperability between tools across different programming environments; produces HTML reports for reproducibility and easy sharing of results.

Artificial Intelligence

Biophysics

10

Paper

Comprehensive generation, visualization, and reporting of quality control metrics for single-cell RNA sequencing data

Rui Hong et al., Nov 17, 2020

Abstract: Single-cell RNA sequencing (scRNA-seq) can be used to gain insights into cellular heterogeneity within complex tissues. However, a variety of technical artifacts can be present in scRNA-seq data and need to be assessed before downstream analyses can be performed. While several algorithms and tools have been developed to perform individual quality control (QC) tasks, they are scattered in different packages across several programming environments. Comprehensive pipelines to streamline the process of generating and visualizing QC metrics are lacking. To address this need, we built the SCTK-QC pipeline within the singleCellTK R package (https://github.com/compbiomed/singleCellTK). Features in this pipeline include the ability to import data from 11 different preprocessing tools or file formats, perform empty droplet detection with 2 different algorithms, generate standard quality control metrics such as number of UMIs per cell or the percentage of mitochondrial counts, predict doublets using 6 different algorithms, and estimate ambient RNA. QC data can be exported to R and/or Python objects used in popular downstream workflows. Results are visualized in an easy-to-read HTML report. This pipeline can also be used by non-computational users with an interactive graphical user interface developed with R/Shiny. Overall, the SCTK-QC pipeline will streamline and standardize QC analysis for scRNA-seq data across a variety of different single-cell transcriptomic platforms and preprocessing tools.
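
As a rough illustration of the two standard QC metrics named in the abstract above (number of UMIs per cell and percentage of mitochondrial counts), here is a minimal pure-Python sketch. It is not the SCTK-QC implementation. The function name `qc_metrics`, the toy count table, and the use of the `MT-` gene-name prefix to flag mitochondrial genes (a common convention for human gene symbols) are all assumptions made for the example.

```python
from typing import Dict, List


def qc_metrics(counts: Dict[str, List[int]], mito_prefix: str = "MT-") -> List[dict]:
    """Compute simple per-cell QC metrics.

    counts maps gene name -> list of UMI counts, one entry per cell.
    mito_prefix is an assumed naming convention for mitochondrial genes.
    """
    n_cells = len(next(iter(counts.values())))
    metrics = []
    for j in range(n_cells):
        # Total UMIs observed in cell j across all genes.
        total = sum(row[j] for row in counts.values())
        # UMIs in cell j that come from mitochondrial genes.
        mito = sum(
            row[j] for gene, row in counts.items() if gene.startswith(mito_prefix)
        )
        # Number of genes with at least one UMI in cell j.
        detected = sum(1 for row in counts.values() if row[j] > 0)
        pct_mito = 100.0 * mito / total if total else 0.0
        metrics.append(
            {"total_umis": total, "genes_detected": detected, "pct_mito": pct_mito}
        )
    return metrics


# Hypothetical toy data: 4 genes x 3 cells.
toy = {
    "MT-CO1": [5, 0, 20],
    "CD3E": [10, 0, 0],
    "MS4A1": [0, 30, 0],
    "LYZ": [5, 10, 0],
}
result = qc_metrics(toy)
for cell_index, m in enumerate(result):
    print(cell_index, m)
```

In this toy table the third cell has all of its counts in the mitochondrial gene, the kind of cell that a high mitochondrial percentage is typically used to flag for closer inspection.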

Molecular Biology

Cancer Research

18

Paper

Pipeliner: A Nextflow-based framework for the definition of sequencing data processing pipelines

Anthony Federico et al., Nov 23, 2018

The advent of high-throughput sequencing technologies has led to the need for flexible and user-friendly data pre-processing platforms. The Pipeliner framework provides an out-of-the-box solution for processing various types of sequencing data. It combines the Nextflow scripting language and the Anaconda package manager to generate modular computational workflows. We have used Pipeliner to create several pipelines for sequencing data processing, including bulk RNA-seq, single-cell RNA-seq (scRNA-seq), and Digital Gene Expression (DGE) data. This report highlights the design methodology behind Pipeliner, which enables the development of highly flexible and reproducible pipelines that are easy to extend and maintain on multiple computing environments. We also provide a quick start user guide demonstrating how to set up and execute available pipelines with toy datasets.

Molecular Biology

Environmental Engineering

0

Paper
