ResearchHub | Open Science Community

Joint Variational Autoencoders for Multimodal Imputation and Embedding

Noah Kalafut et al. | Oct 18, 2022

Abstract: Single-cell multimodal datasets measure various characteristics of individual cells, enabling a deep understanding of cellular and molecular mechanisms. However, multimodal data generation remains costly and challenging, and modalities are frequently missing. Recently, machine learning approaches have been developed for data imputation, but they typically require fully matched multimodal data to learn common latent embeddings, which potentially lack modality specificity. To address these issues, we developed an open-source machine learning model, Joint Variational Autoencoders for multimodal Imputation and Embedding (JAMIE). JAMIE takes single-cell multimodal data in which samples may be only partially matched across modalities. Variational autoencoders learn the latent embeddings of each modality. Then, embeddings from matched samples across modalities are aggregated to identify joint cross-modal latent embeddings before reconstruction. To perform cross-modal imputation, the latent embeddings of one modality can be used with the decoder of the other modality. For interpretability, Shapley values are used to prioritize input features for cross-modal imputation and known sample labels. We applied JAMIE to both simulated data and emerging single-cell multimodal data, including gene expression, chromatin accessibility, and electrophysiology in human and mouse brains. JAMIE significantly outperforms existing state-of-the-art methods in general and prioritizes multimodal features for imputation, providing potentially novel mechanistic insights at cellular resolution.
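
The data flow described in the abstract (per-modality encoders, aggregation of matched embeddings, and decoding with the other modality's decoder) can be illustrated with a toy sketch. This is not the JAMIE implementation: the deterministic linear maps below stand in for trained variational autoencoders, the plain mean is only an assumed aggregation rule, and every name and weight (`ENC_A`, `ENC_B`, `DEC_B`, `joint_embedding`, `impute_b_from_a`) is hypothetical.

```python
# Toy illustration of the cross-modal data flow; NOT the JAMIE implementation.
# All weights, names, and the averaging rule are illustrative assumptions.

def linear(weights, vector):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

# Hypothetical weights: modality A has 3 features, modality B has 2 features,
# and the shared latent space has 2 dimensions.
ENC_A = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]  # A features -> latent
ENC_B = [[2.0, 0.0], [0.0, 2.0]]            # B features -> latent
DEC_B = [[0.5, 0.0], [0.0, 0.5]]            # latent -> B features

def joint_embedding(cell_a, cell_b):
    """Aggregate the two embeddings of a matched cell (plain mean, assumed)."""
    z_a = linear(ENC_A, cell_a)
    z_b = linear(ENC_B, cell_b)
    return [(a + b) / 2.0 for a, b in zip(z_a, z_b)]

def impute_b_from_a(cell_a):
    """Cross-modal imputation: encode with A's encoder, decode with B's decoder."""
    return linear(DEC_B, linear(ENC_A, cell_a))

# A matched cell measured in both modalities, and a cell measured only in A.
print(joint_embedding([1.0, 2.0, 3.0], [0.5, 2.5]))
print(impute_b_from_a([4.0, 1.0, 1.0]))
```

In this sketch a cell measured only in modality A still receives an estimate in modality B's feature space, which is the behavior the abstract describes for partially matched samples.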

Artificial Intelligence

Biophysics

Brain and Organoid Manifold Alignment (BOMA), a machine learning framework for comparative gene expression analysis across brains and organoids

Chenfeng He et al. | Jun 14, 2022

Abstract: Organoids have become valuable models for understanding cellular and molecular mechanisms in human development, including brain development. However, whether developmental gene expression programs are preserved between human organoids and brains, especially in specific cell types, remains unclear. Importantly, there is a lack of effective computational approaches for comparative data analyses between organoids and developing humans. To address this, taking into account public data availability and research significance, we developed a machine learning framework, Brain and Organoid Manifold Alignment (BOMA), for comparative gene expression analysis of brains and organoids. BOMA identifies conserved and specific developmental trajectories as well as developmentally expressed genes and functions, especially at cellular resolution. It first performs a global alignment and then uses manifold learning to locally refine the alignment, revealing conserved developmental trajectories between brains and organoids. Using BOMA, we found that human cortical organoids align better with certain brain cortical regions than with non-cortical regions, implying that organoids preserve developmental gene expression programs specific to brain regions. Additionally, our alignment of non-human primate and human brains reveals highly conserved gene expression around birth. We also integrated and analyzed developmental scRNA-seq data of human brains and organoids, showing conserved and specific cell trajectories and clusters. Further identification of the genes expressed in these clusters, together with enrichment analyses, reveals brain- or organoid-specific developmental functions and pathways. Finally, we experimentally validated important specifically expressed genes using immunofluorescence. BOMA is open source and available as a web tool for general community use.

Genetics

Developmental Biology

MANGEM: a web app for Multimodal Analysis of Neuronal Gene expression, Electrophysiology and Morphology

Robert Olson et al. | Apr 4, 2023

Single-cell techniques have enabled the acquisition of multi-modal data, particularly for neurons, to characterize cellular functions. Patch-seq, for example, combines patch-clamp recording, cell imaging, and single-cell RNA-seq to obtain electrophysiology, morphology, and gene expression data from a single neuron. While these multi-modal data offer potential insights into neuronal functions, they can be heterogeneous and noisy. To address this, machine-learning methods have been used to align cells from different modalities onto a low-dimensional latent space, revealing multi-modal cell clusters. However, these methods can be challenging to use for biologists and neuroscientists without computational expertise, and the more computationally expensive ones also require suitable computing infrastructure. To address these issues, we developed a cloud-based web application, MANGEM (Multimodal Analysis of Neuronal Gene expression, Electrophysiology, and Morphology), at https://ctc.waisman.wisc.edu/mangem. MANGEM provides a step-by-step, accessible, and user-friendly interface to machine-learning alignment methods for neuronal multi-modal data while enabling real-time visualization of the characteristics of raw and aligned cells. It can be run asynchronously for large-scale data alignment, provides users with various downstream analyses of aligned cells, and visualizes the analytic results, such as identifying multi-modal cell clusters and detecting genes correlated with electrophysiological and morphological features. We demonstrated the usage of MANGEM by aligning Patch-seq multimodal data of neuronal cells in the mouse visual cortex.
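
One downstream analysis mentioned above, detecting genes correlated with an electrophysiological feature, can be sketched as follows. This is a minimal illustration and not MANGEM's code: the abstract does not say which correlation measure the app uses, so Pearson correlation is an assumption here, and the gene names, expression values, and feature values are made up.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Hypothetical per-cell values: expression of three genes and one
# electrophysiological feature, measured across the same four cells.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [4.0, 3.0, 2.0, 1.0],
    "gene_c": [1.0, 3.0, 2.0, 4.0],
}
ephys_feature = [10.0, 20.0, 30.0, 40.0]

# Rank genes by the strength (absolute value) of their correlation.
ranked = sorted(
    ((gene, pearson(values, ephys_feature)) for gene, values in expression.items()),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for gene, r in ranked:
    print(f"{gene}: r = {r:.2f}")
```

Ranking by absolute value keeps strongly anti-correlated genes near the top alongside strongly correlated ones; a real analysis would also need a significance test and multiple-testing correction.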

Artificial Intelligence

Biophysics
