ResearchHub | Open Science Community

Integrating deep learning and unbiased automated high-content screening to identify complex disease signatures in human fibroblasts

Lauren Schiff et al., Nov 16, 2020

Drug discovery for diseases such as Parkinson’s disease is impeded by the lack of screenable cellular phenotypes. We present an unbiased phenotypic profiling platform that combines automated cell culture, high-content imaging, Cell Painting, and deep learning. We applied this platform to primary fibroblasts from 91 Parkinson’s disease patients and matched healthy controls, creating the largest publicly available Cell Painting image dataset to date at 48 terabytes. We use fixed weights from a convolutional deep neural network trained on ImageNet to generate deep embeddings from each image and train machine learning models to detect morphological disease phenotypes. Our platform’s robustness and sensitivity allow the detection of individual-specific variation with high fidelity across batches and plate layouts. Lastly, our models confidently separate LRRK2 and sporadic Parkinson’s disease lines from healthy controls (receiver operating characteristic area under the curve of 0.79, standard deviation 0.08), supporting the capacity of this platform for complex disease modeling and drug screening applications.
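As a reading aid for the separation metric quoted above, the following is a minimal sketch of how a receiver operating characteristic area under the curve can be computed from classifier scores. It is not the authors' pipeline: the function name `roc_auc` and the toy labels and scores are hypothetical, and the pairwise formulation (the fraction of positive/negative pairs ranked correctly, with ties counted as half) is only one standard way to obtain the value.

```python
def roc_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a correct pair. This equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disease line, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. A standard deviation like the one quoted would typically come from repeating the evaluation over several held-out splits.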

Genetics

Artificial Intelligence


Common genetic variation impacts stress response in the brain

Carina Seah et al., Dec 28, 2023

To explain why individuals exposed to identical stressors experience divergent clinical outcomes, we determine how molecular encoding of stress modifies genetic risk for brain disorders. Analysis of post-mortem brain (n=304) revealed 8557 stress-interactive expression quantitative trait loci (eQTLs) that dysregulate expression of 915 eGenes in response to stress and lie in stress-related transcription factor binding sites. Response to stress is robust across experimental paradigms: up to 50% of stress-interactive eGenes validate in glucocorticoid-treated hiPSC-derived neurons (n=39 donors). Stress-interactive eGenes show brain region- and cell type-specificity and, in post-mortem brain, implicate glial and endothelial mechanisms. Stress dysregulates long-term expression of disorder risk genes in a genotype-dependent manner; stress-interactive transcriptomic imputation uncovered 139 novel genes conferring brain disorder risk only in the context of traumatic stress. Molecular stress-encoding explains individualized responses to traumatic stress; incorporating trauma into genomic studies of brain disorders is likely to improve diagnosis, prognosis, and drug discovery.

Genetics

Philosophy


ScaleFEx^SM: a lightweight and scalable method to extract fixed features from single cells in high-content imaging screens

Gabriel Comolet et al., Jul 9, 2023

High-content imaging (HCI) is a popular technique that leverages high-throughput datasets to uncover phenotypes of cell populations in vitro. When the differences between populations (such as a healthy and a disease state) are completely unknown, it is crucial to build very large HCI screens to account for individual (donor) variation and to include enough replicates to create a reliable model. One approach to highlight phenotypic differences is to reduce images into a set of features using unbiased methods, such as embeddings or autoencoders. These methods are powerful at preserving the predictive power contained in each image while removing most of the unimportant image features and noise (e.g., background). However, they do not provide interpretable information about the features driving the decision process of the AI algorithm used. While tools such as CellProfiler have been developed to address this issue, scaling them to large sample batches containing hundreds of thousands of images poses computational challenges. Additionally, the resulting feature vector, which is computationally expensive to generate, is very large (over 3000 features) and contains many redundant features, making it challenging to perform further analysis and identify the truly relevant features. Ultimately, the presence of too many non-meaningful features increases the risk of overfitting and can skew downstream predictions. To address this issue, we have developed ScaleFEx^SM, a Python pipeline that extracts multiple generic fixed features at the single-cell level and can be deployed across large high-content imaging datasets with low computational requirements. This pipeline efficiently and reliably computes features related to shape, size, intensity, texture, and granularity, as well as correlations between channels. Additionally, it allows the measurement of features specifically related to mitochondria and RNA, as these are important channels with characteristics worth measuring on their own. The measured features can be used not only to separate populations of cells using AI tools, but also to highlight the specific interpretable features that differ between populations. We applied ScaleFEx^SM to identify the phenotypic shifts that multiple cell lines undergo when exposed to different compounds. We used a combination of recursive feature elimination, logistic regression, correlation analysis, and dimensionality reduction representations to narrow down to the most meaningful features that described the drug shifts. Furthermore, we used the best-scoring features to extract images of cells for each class closest to the average to visually highlight the phenotypic shifts caused by the drugs. Using this approach, we were able to identify features linked to the drug shifts in line with the literature, and we could visually validate their involvement in the morphological changes of the cells. ScaleFEx^SM can be used as a powerful tool to understand the underlying phenotypes of complex diseases and subtle drug shifts at the single-cell level, bringing us a step closer to identifying disease-modifying compounds for the major diseases of our time.
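The step of picking, for each class, the cells closest to the class average can be illustrated with a small sketch. This is not the ScaleFEx^SM code: the function `closest_to_class_mean`, the two-feature vectors, and the class labels are hypothetical, and plain Euclidean distance on unscaled features is an assumption made for brevity.

```python
import math


def closest_to_class_mean(features, labels):
    """Return {label: index of the sample nearest that class's mean feature vector}."""
    by_class = {}
    for idx, (vec, lab) in enumerate(zip(features, labels)):
        by_class.setdefault(lab, []).append((idx, vec))

    result = {}
    for lab, members in by_class.items():
        dim = len(members[0][1])
        # Per-feature mean over all cells in this class.
        mean = [sum(vec[d] for _, vec in members) / len(members) for d in range(dim)]
        # Keep the index of the cell whose feature vector is nearest the mean.
        result[lab] = min(members, key=lambda m: math.dist(m[1], mean))[0]
    return result


# Hypothetical per-cell feature vectors (e.g. two selected features per cell).
features = [[1.0, 2.0], [2.0, 2.0], [6.0, 2.0],
            [10.0, 0.0], [11.0, 1.0], [12.0, 2.0]]
labels = ["control", "control", "control", "treated", "treated", "treated"]

representatives = closest_to_class_mean(features, labels)
print(representatives)
```

In practice the features would usually be standardized first so that no single measurement dominates the distance, and the returned indices would be used to look up the corresponding cell images.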

Genetics

Artificial Intelligence

