ResearchHub | Open Science Community

Museum of Spatial Transcriptomics

Lambda Moses et al., Oct 13, 2023

The function of many biological systems, such as embryos, liver lobules, intestinal villi, and tumors, depends on the spatial organization of their cells. In the past decade, high-throughput technologies have been developed to quantify gene expression in space, and computational methods have been developed that leverage spatial gene expression data to identify genes with spatial patterns and to delineate neighborhoods within tissues. To assess the ability and potential of spatial gene expression technologies to drive biological discovery, we present a curated database of literature on spatial transcriptomics dating back to 1987, along with a thorough analysis of trends in the field, such as usage of experimental techniques, species and tissues studied, and computational approaches used. Our analysis places current methods in historical context, and we derive insights about the field that can guide current research strategies. A companion supplement offers a more detailed look at the technologies and methods analyzed: https://pachterlab.github.io/LP_2021/

Leverage (Statistics)

Context (Archaeology)

Computational Biology


Voyager: exploratory single-cell genomics data analysis with geospatial statistics

Lambda Moses et al., May 26, 2024

Exploratory spatial data analysis (ESDA) can be a powerful approach to understanding single-cell genomics datasets, but it is not yet part of standard data analysis workflows. In particular, geospatial analyses, which have been developed and refined for decades, have yet to be fully adapted and applied to spatial single-cell analysis. We introduce the Voyager platform, which systematically brings the geospatial ESDA tradition to (spatial) -omics, with local, bivariate, and multivariate spatial methods not yet commonly applied to spatial -omics, united by a uniform user interface. Using Voyager, we showcase biological insights that can be derived with its methods, such as biologically relevant negative spatial autocorrelation. Underlying Voyager is the SpatialFeatureExperiment data structure, which combines Simple Feature with SingleCellExperiment and AnnData to represent and operate on geometries bundled with gene expression data. Voyager has comprehensive tutorials demonstrating ESDA built on GitHub Actions to ensure reproducibility and scalability, using data from popular commercial technologies. Voyager is implemented in both R/Bioconductor and Python/PyPI, and features compatibility tests to ensure that both implementations return consistent results.
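To make "negative spatial autocorrelation" concrete, here is a minimal sketch of global Moran's I, a standard statistic from the geospatial literature. It is written in plain Python for illustration only; the function name `morans_i`, the toy chain of four spots, and the binary adjacency weights are assumptions, and none of this is Voyager's actual interface.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (here simply 1 for neighbors and 0 otherwise; an assumption).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights.
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between pairs of locations.
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    # Total squared deviation, which normalizes the statistic.
    denom = sum(d * d for d in dev)
    return (n / s0) * cross / denom


# Hypothetical toy data: four spots on a line, each adjacent to its
# immediate neighbors only.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

alternating = morans_i([1, 0, 1, 0], chain)  # neighbors always differ
clustered = morans_i([1, 1, 0, 0], chain)    # neighbors mostly agree
```

With these toy inputs the alternating pattern gives a negative value and the clustered pattern a positive one, which is the sense in which neighboring spots with dissimilar expression show negative spatial autocorrelation.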

Geospatial Analysis

Computer Science

Spatial Analysis


The tidyomics ecosystem: enhancing omic data analyses

W Hutchison et al., Sep 6, 2024


Quantitative assessment of single-cell RNA-seq clustering with CONCORDEX

Kayla Jackson et al., Oct 24, 2023

Many single-cell RNA-sequencing (scRNA-seq) data analysis workflows rely on methods that embed and visualize the properties of a k-nearest neighbor (kNN) graph in two dimensions. These visualizations are typically combined with categorical labels assigned to individual data points and can support a range of analysis tasks, even though these embeddings are known to distort the local and global properties of the graph. Rather than relying on a two-dimensional visualization, we introduce a method for quantitatively assessing the concordance between a set of labels and the k-nearest neighbor graph. Our method, called CONCORDEX, computes for each node the fraction of neighbors with the same or different label, and compares the result to the mean obtained with random labeling of the graph. CONCORDEX can be used for any categorical label and can be interpreted via an intuitive heatmap visualization. We demonstrate its utility for assessment of clustering results and “integration”. Since CONCORDEX can be used to directly visualize properties of a kNN graph, we also use CONCORDEX to evaluate how well two-dimensional embeddings capture the local and global structure of the underlying graph. We have made CONCORDEX available as a Python-based command line tool (https://github.com/pachterlab/concordex) and as a software package in Bioconductor: https://bioconductor.org/packages/concordexR
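The idea described in this abstract, comparing the same-label neighbor fraction on a kNN graph with the mean under random labeling, can be sketched as follows. This is a simplified illustration under stated assumptions, not the CONCORDEX implementation: the function names, the brute-force neighbor search, the permutation count, and the toy points are all hypothetical, and only the single "same label" fraction is computed rather than the full label-by-label matrix behind the heatmap.

```python
import math
import random


def knn_graph(points, k):
    """Brute-force kNN: for each point, the indices of its k nearest others."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def same_label_fraction(neighbors, labels):
    """Mean over nodes of the fraction of neighbors sharing the node's label."""
    per_node = [
        sum(labels[j] == labels[i] for j in nbrs) / len(nbrs)
        for i, nbrs in enumerate(neighbors)
    ]
    return sum(per_node) / len(per_node)


def concordance_vs_random(neighbors, labels, n_permutations=200, seed=0):
    """Return (observed fraction, mean fraction under shuffled labels)."""
    rng = random.Random(seed)
    observed = same_label_fraction(neighbors, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # random labeling with the same label counts
        null.append(same_label_fraction(neighbors, shuffled))
    return observed, sum(null) / len(null)


# Hypothetical toy data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = ["a"] * 4 + ["b"] * 4

graph = knn_graph(points, k=3)
observed, random_mean = concordance_vs_random(graph, labels)
```

In this toy case every point's three nearest neighbors carry its own label, so the observed fraction is at its maximum, while the shuffled-label mean sits well below it. A large gap between the two is the kind of signal the abstract describes for labels that agree with the graph.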

Computer Science

Visualization

Categorical Variable


The impact of package selection and versioning on single-cell RNA-seq analysis

Joseph Rich et al., May 28, 2024

Standard single-cell RNA-sequencing (scRNA-seq) analysis workflows consist of converting raw read data into cell-gene count matrices through sequence alignment, followed by analyses including filtering, highly variable gene selection, dimensionality reduction, clustering, and differential expression analysis. Seurat and Scanpy are the most widely used packages implementing such workflows, and are generally thought to implement individual steps similarly. We investigate in detail the algorithms and methods underlying Seurat and Scanpy and find that there are, in fact, considerable differences in the outputs of Seurat and Scanpy. The extent of differences between the programs is approximately equivalent to the variability that would be introduced in benchmarking scRNA-seq datasets by sequencing less than 5% of the reads or analyzing less than 20% of the cell population. Additionally, distinct versions of Seurat and Scanpy can produce very different results, especially during parts of differential expression analysis. Our analysis highlights the need for users of scRNA-seq to carefully assess the tools on which they rely, and for developers of scientific software to prioritize transparency, consistency, and reproducibility in their tools.

Consistency (Knowledge Bases)

Workflow

Computer Science


kallisto, bustools, and kb-python for quantifying bulk, single-cell, and single-nucleus RNA-seq

Delaney Sullivan et al., Nov 22, 2023


The term "RNA-seq" refers to a collection of assays based on sequencing experiments that involve quantifying RNA species from bulk tissue, from single cells, or from single nuclei. The kallisto, bustools, and kb-python programs are free, open-source software tools for performing this analysis that together can produce gene expression quantification from raw sequencing reads. The quantifications can be individualized for multiple cells, multiple samples, or both. Additionally, these tools allow gene expression values to be classified as originating from nascent RNA species or mature RNA species, making this workflow amenable to both cell-based and nucleus-based assays. This protocol describes in detail how to use kallisto and bustools in conjunction with a wrapper, kb-python, to preprocess RNA-seq data.

Python (Programming Language)

RNA

RNA-seq


Modular and efficient pre-processing of single-cell RNA-seq

Páll Melsted et al., May 6, 2020

Analysis of single-cell RNA-seq data begins with pre-processing of sequencing reads to generate count matrices. We investigate algorithm choices for the challenges of pre-processing, and describe a workflow that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near-optimal in speed and memory. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses. Documentation and tutorials for using the kallisto | bus workflow are available at .

Workflow

Modular Design

Computer Science


The Genetic Architecture of Dietary Iron Overload and Associated Pathology in Mice

Brie Fuqua et al., Oct 24, 2023

Tissue iron overload is a frequent pathologic finding in multiple disease states including non-alcoholic fatty liver disease (NAFLD), neurodegenerative disorders, cardiomyopathy, diabetes, and some forms of cancer. The role of iron, as a cause or consequence of disease progression and observed phenotypic manifestations, remains controversial. In addition, the impact of genetic variation on iron overload-related phenotypes is unclear, and the identification of genetic modifiers is incomplete. Here, we used the Hybrid Mouse Diversity Panel (HMDP), consisting of over 100 genetically distinct mouse strains optimized for genome-wide association studies and systems genetics, to characterize the genetic architecture of dietary iron overload and pathology. Dietary iron overload was induced by feeding male mice (114 strains, 6-7 mice per strain on average) a high-iron diet for six weeks, and then tissues were collected at 10-11 weeks of age. Liver metal levels and gene expression were measured by ICP-MS/ICP-AES and RNA-seq, and lipids were measured by colorimetric assays. FaST-LMM was used for genetic mapping, and Metascape, WGCNA, and Mergeomics were used for pathway, module, and key driver bioinformatics analyses. Mice on the high-iron diet accumulated iron in the liver, with a 6.5-fold difference across strain means. The iron-loaded diet also led to a spectrum of copper deficiency and anemia, with liver copper levels highly positively correlated with red blood cell count, hemoglobin, and hematocrit. Hepatic steatosis of varying severity was observed histologically, with 52.5-fold variation in triglyceride levels across the strains. Liver triglyceride and iron mapped most significantly to an overlapping locus on chromosome 7 that has not been previously associated with either trait. Based on network modeling, significant key drivers for both iron and triglyceride accumulation are involved in cholesterol biosynthesis and oxidative stress management. To make the full data set accessible and usable by others, we have made our data and analyses available on a resource website.

Biology

Steatosis

Fatty Liver
