Pathway analyses are key methods to analyze 'omics experiments. Nevertheless, integrating data from different 'omics technologies and different species still requires considerable bioinformatics knowledge. Here we present the novel ReactomeGSA resource for comparative pathway analyses of multi-omics datasets. ReactomeGSA can be used through Reactome's existing web interface and the novel ReactomeGSA R Bioconductor package with explicit support for scRNA-seq data. Data from different species are automatically mapped to a common pathway space. Public data from ExpressionAtlas and Single Cell ExpressionAtlas can be directly integrated in the analysis. ReactomeGSA greatly reduces the technical barrier for multi-omics, cross-species, comparative pathway analyses. We used ReactomeGSA to characterize the role of B cells in anti-tumor immunity. We compared B cell-rich and B cell-poor human cancer samples from five of the Cancer Genome Atlas (TCGA) transcriptomics and two of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) proteomics studies. B cell-rich lung adenocarcinoma samples lacked the otherwise present activation through NFkappaB. This may be linked to the presence of a specific subset of tumor associated IgG+ plasma cells that lack NFkappaB activation in scRNA-seq data from human melanoma. This showcases how ReactomeGSA can derive novel biomedical insights by integrating large multi-omics datasets.

Increasingly available approaches such as transcriptome sequencing (RNA-seq), MS-based shotgun proteomics, and microarray studies enable us to characterize genome- and proteome-wide expression changes. This leads to the challenge of deriving relevant biological insights from lists of hundreds of regulated genes and proteins. Pathway analysis techniques have emerged as a solution to this problem. Resources like the Gene Ontology (GO) (1. The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 2019; 47: D330-D338), the Kyoto Encyclopedia of Genes and Genomes (KEGG) (2. Kanehisa M. Furumichi M. Tanabe M. Sato Y. Morishima K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 2017; 45: D353-D361), the Molecular Signatures Database (MSigDB) (3. Subramanian A. Tamayo P. Mootha V.K. Mukherjee S. Ebert B.L. Gillette M.A. Paulovich A. Pomeroy S.L. Golub T.R. Lander E.S. Mesirov J.P.
Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. U S A. 2005; 102: 15545-15550), or Reactome (4. Jassal B. Matthews L. Viteri G. Gong C. Lorente P. Fabregat A. Sidiropoulos K. Cook J. Gillespie M. Haw R. Loney F. May B. Milacic M. Rothfels K. Sevilla C. Shamovsky V. Shorser S. Varusai T. Weiser J. Wu G. Stein L. Hermjakob H. D'Eustachio P. The reactome pathway knowledgebase. Nucleic Acids Res. 2020; 48: D498-D503) organize existing biological knowledge into gene sets or pathways. Pathway analysis approaches can use these resources to represent long lists of regulated genes and proteins as biologically defined pathways. This leads to a more intuitive interpretation of the data and increases the statistical power. Although single genes or proteins may only show small, nonsignificant changes, synchronous changes within a pathway may reveal a biologically important effect. Thereby, pathway analysis has become an essential resource for 'omics data analyses.

The increasing availability of public 'omics datasets has made it common practice to include these in analyses. This data integration is commonly complicated if datasets were created in different species or using different 'omics approaches. Pathway analysis approaches offer a solution to this problem because data can be mapped to the more general and comparable pathway space.

Existing web-based pathway analysis resources, such as PANTHER (5. Mi H. Huang X. Muruganujan A. Tang H. Mills C. Kang D. Thomas P.D. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 2017; 45: D183-D189), the Database for Annotation, Visualization and Integrated Discovery (DAVID) (6. Huang D.W. Sherman B.T. Lempicki R.A.
Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009; 4: 44-57) or Reactome's pathway analysis (7. Fabregat A. Sidiropoulos K. Viteri G. Forner O. Marin-Garcia P. Arnau V. D'Eustachio P. Stein L. Hermjakob H. Reactome pathway analysis: a high-performance in-memory approach. BMC Bioinformatics. 2017; 18: 142) all provide over-representation analyses. This type of pathway analysis only tests whether a list of genes is overrepresented in a specific pathway. These approaches have the advantage that the user input is simple, but they ignore any underlying quantitative information, at the cost of reduced statistical power. Moreover, users must manually separate up- and down-regulated genes and process them in separate analyses. Thereby, any result is only a partial representation of the underlying biological changes.

The recently developed iLINCS resource extends the concept of single-resource pathway analysis to a powerful multi-omics and multi-resource analysis (8. Pilarczyk M. Najafabadi M.F. Kouril M. Vasiliauskas J. Niu W. Shamsaei B. Mahi N. Zhang L. Clark N. Ren Y. White S. Karim R. Xu H. Biesiada J. Bennet M.F. Davidson S. Reichard J.F. Stathias V. Koleti A. Vidovic D. Clark D.J.B. Schurer S. Ma'ayan A. Meller J. Medvedovic M. Connecting omics signatures of diseases, drugs, and mechanisms of actions with iLINCS. bioRxiv 826271v1. 2019; 13). It tests whether a list of gene/protein identifiers correlates with a large set of pre-computed signatures. These signatures are often the result of differential expression analyses. Therefore, like the aforementioned resources, iLINCS ignores any underlying quantitative information in the final comparison. Additionally, the comparison with public data is limited to pre-defined experimental designs and comparisons whose results are stored as pre-computed signatures.
Therefore, a large portion of the data remains unused.

Here, we present the novel Reactome gene set analysis system "ReactomeGSA." ReactomeGSA supports the comparative pathway analysis of multiple independent datasets. Datasets are submitted to a single pathway analysis and represented side-by-side on the pathway level. It uses gene set analysis methods that take the quantitative information into consideration and thereby performs the differential expression analysis directly on the pathway level. Data from different species are automatically mapped to a common pathway space through Reactome's internal mapping system. All supported gene set analysis methods are optimized for different types of 'omics approaches, including single cell RNA-sequencing (scRNA-seq) data. Public datasets can be directly integrated from ExpressionAtlas and Single Cell ExpressionAtlas (9. Papatheodorou I. Moreno P. Manning J. Fuentes A.M.-P. George N. Fexova S. Fonseca N.A. Füllgrabe A. Green M. Huang N. Huerta L. Iqbal H. Jianu M. Mohammed S. Zhao L. Jarnuczak A.F. Jupp S. Marioni J. Meyer K. Petryszak R. Prada Medina C.A. Talavera-López C. Teichmann S. Vizcaino J.A. Brazma A. Expression Atlas update: from tissues to single cells. Nucleic Acids Res. 2020; 48: D77-D83). We used ReactomeGSA to show that B cell receptor signaling is surprisingly down-regulated in B cell-rich lung adenocarcinoma, in contrast to four other human cancers. We could further link this to IgG+ plasma cells in scRNA-seq data. ReactomeGSA thereby provides easy access to multi-omics, cross-species, comparative pathway analysis to reveal key biological mechanisms by integrating large 'omics datasets.

The ReactomeGSA analysis system is accessible through Reactome's web-based pathway browser application (https://www.reactome.org) and the "ReactomeGSA" R Bioconductor package.
Both access ReactomeGSA's web-based application programming interface (API), which is also publicly accessible at https://gsa.reactome.org. The backend is a Kubernetes application (https://kubernetes.io/) currently consisting of six deployments. Each deployment represents one Docker container (Docker Inc, https://www.docker.com). All data are stored in a Redis instance (https://redis.io/). The different components are linked through a message system provided by RabbitMQ (Pivotal, https://www.rabbitmq.com/). All components of the ReactomeGSA backend are developed in Python. The actual gene set analysis is performed using R Bioconductor (10. Huber W. Carey V.J. Gentleman R. Anders S. Carlson M. Carvalho B.S. Bravo H.C. Davis S. Gatto L. Girke T. Gottardo R. Hahne F. Hansen K.D. Irizarry R.A. Lawrence M. Love M.I. MacDonald J. Obenchain V. Oleś A.K. Pagès H. Reyes A. Shannon P. Smyth G.K. Tenenbaum D. Waldron L. Morgan M. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods. 2015; 12: 115-121) packages through the rpy2 (https://rpy2.github.io/) Python interface to the R language in the worker node (Fig. 1).

A key advantage of this setup is that the complete ReactomeGSA application can be described in a single YAML file, a Kubernetes configuration file. Because all Docker containers are freely available on Docker Hub (https://hub.docker.com), the ReactomeGSA system can be deployed using the single "kubectl apply -f reactome_gsa.yaml" command. We created a single YAML-formatted configuration file to quickly adapt ReactomeGSA to different use cases (i.e., the number of resources available to the different nodes). Detailed information on how to adapt ReactomeGSA can be found on the GitHub repository (https://github.com/reactome/ReactomeGSA). Thereby, users can set up their own version of the ReactomeGSA system within minutes and deploy it locally or in the cloud.
At the time of writing, ReactomeGSA supports three different analysis methods: Camera through the "limma" (11. Ritchie M.E. Phipson B. Wu D. Hu Y. Law C.W. Shi W. Smyth G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015; 43: e47) package, PADOG through the "PADOG" package (12. Tarca A.L. Draghici S. Bhatti G. Romero R. Down-weighting overlapping genes improves gene set analysis. BMC Bioinformatics. 2012; 13: 136), and the single-sample gene set enrichment analysis (ssGSEA) (13. Barbie D.A. Tamayo P. Boehm J.S. Kim S.Y. Moody S.E. Dunn I.F. Schinzel A.C. Sandy P. Meylan E. Scholl C. Fröhling S. Chan E.M. Sos M.L. Michel K. Mermel C. Silver S.J. Weir B.A. Reiling J.H. Sheng Q. Gupta P.B. Wadlow R.C. Le H. Hoersch S. Wittner B.S. Ramaswamy S. Livingston D.M. Sabatini D.M. Meyerson M. Thomas R.K. Lander E.S. Mesirov J.P. Root D.E. Gilliland D.G. Jacks T. Hahn W.C. Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1. Nature. 2009; 462: 108-112) through the "GSVA" (14. Hänzelmann S. Castelo R. Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013; 14: 7) package. All pathway analyses are performed by the worker node in the ReactomeGSA system (Fig. 1). The workflow in ReactomeGSA consists of the following steps: First, the user's input data are validated in terms of experimental design, validity of submitted identifiers, and data format. Next, all identifiers are mapped to the respective human UniProt identifiers (see below). Then, the selected pathway analysis is performed for each of the submitted datasets.
The parameters for the pathway analysis (such as the kernel to use for the ssGSEA analysis) are automatically chosen based on the selected data type. Finally, the pathway analysis result is converted to Reactome's internal data format to render the result in the PathwayBrowser.

Reactome's manual curation is based on human UniProt identifiers (15. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019; 47: D506-D515). Thus, as a first step in the analysis, the submitted identifiers are mapped to human UniProt using Reactome's identifier mapping system. A key issue in mapping identifiers between different identifier systems and across species is to resolve one-to-many mappings. In these cases, the ReactomeGSA system keeps an internal record of all mappings. Genes that map to multiple UniProt identifiers which all belong to the same pathway are only added once to this pathway. Thereby, one-to-many mappings are resolved at the pathway level and inaccuracies introduced through identifier conversions are greatly reduced.

To increase the coverage of Reactome pathways, pathways can be extended through medium and high confidence interactions derived from IntAct (16. Orchard S. Ammari M. Aranda B. Breuza L. Briganti L. Broackes-Carter F. Campbell N.H. Chavali G. Chen C. del-Toro N. Duesbury M. Dumousseau M. Galeota E. Hinz U. Iannuccelli M. Jagannathan S. Jimenez R. Khadake J. Lagreid A. Licata L. Lovering R.C. Meldal B. Melidoni A.N. Milagros M. Peluso D. Perfetto L. Porras P. Raghunath A. Ricard-Blum S. Roechert B. Stutz A. Tognolli M. van Roey K. Cesareni G. Hermjakob H. The MIntAct project–IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 2014; 42: D358-D363). This function considerably extends Reactome's coverage.
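The pathway-level resolution of one-to-many mappings described above can be illustrated with a minimal Python sketch. The mapping tables, the identifiers, and the function name are hypothetical and chosen only for illustration; this is not ReactomeGSA's actual implementation. The sketch shows the stated principle: a submitted gene is counted once per pathway, no matter how many of its mapped UniProt identifiers belong to that pathway.

```python
from collections import defaultdict


def genes_per_pathway(gene_to_uniprot, pathway_members):
    """Collect the submitted genes that hit each pathway.

    gene_to_uniprot: dict, submitted identifier -> set of UniProt accessions
    pathway_members: dict, pathway id -> set of UniProt accessions
    Returns a dict, pathway id -> set of submitted identifiers.
    """
    hits = defaultdict(set)
    for gene, accessions in gene_to_uniprot.items():
        for pathway, members in pathway_members.items():
            # A set stores each gene once, even if several of its
            # accessions are members of the same pathway.
            if accessions & members:
                hits[pathway].add(gene)
    return dict(hits)


# Hypothetical toy data: GENE_A maps to two accessions in PATHWAY_1.
gene_to_uniprot = {
    "GENE_A": {"P00001", "P00002"},
    "GENE_B": {"P00003"},
}
pathway_members = {
    "PATHWAY_1": {"P00001", "P00002", "P00003"},
    "PATHWAY_2": {"P00002"},
}

result = genes_per_pathway(gene_to_uniprot, pathway_members)
print(sorted(result["PATHWAY_1"]))  # ['GENE_A', 'GENE_B']
```

In this toy example GENE_A contributes a single entry to PATHWAY_1 although two of its accessions are pathway members, so the one-to-many mapping does not inflate the pathway's gene count.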
At the time of writing, the ReactomeGSA system supports five types of quantitative 'omics data: microarray intensities, transcriptomics raw and normalized read counts, and proteomics spectral counts and intensity-based quantitative data. Internally, these different types of data are processed using two different methods: statistics for discrete quantitative data (in the case of raw transcriptomics read counts and spectral counting-based quantitative proteomics data) and statistics for continuous data. For Camera and PADOG, discrete values are normalized using edgeR's (17. McCarthy D.J. Chen Y. Smyth G.K. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012; 40: 4288-4297) calcNormFactors function. Then, the data are transformed using limma's voom function (18. Law C.W. Chen Y. Shi W. Smyth G.K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 2014; 15: R29). Continuous data are directly processed using limma (11) and normalized using limma's normalizeBetweenArrays function. The pathway analysis is subsequently performed using limma's camera function or PADOG as implemented in the respective Bioconductor R package (19. Tarca A.L. Bhatti G. Romero R. A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity. PLoS ONE. 2013; 8: e79217). For the ssGSEA method (13), the analysis is performed using the GSVA Bioconductor R package (14). Discrete data are processed using a Poisson kernel and continuous data using a Gaussian kernel. Thereby, multiple types of 'omics data can be supported.

The analysis of scRNA-seq data is supported through the ReactomeGSA R package's "analyze_sc_clusters" function, as well as through the direct import of data from the Single Cell Expression Atlas (9). In both cases, we calculate the mean expression of genes within a cluster. For the R package, this is done through either "Seurat"'s (20. Stuart T. Butler A. Hoffman P. Hafemeister C. Papalexi E. Mauck 3rd, W.M. Hao Y. Stoeckius M. Smibert P. Satija R. Comprehensive Integration of Single-Cell Data. Cell. 2019; 177: 1888-1902.e21) "AverageExpression" function, or through scater's (21. McCarthy D.J. Campbell K.R. Lun A.T.L. Wills Q.F. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics.
2017; 33: 1179-1186) "aggregateAcrossCells" function, depending on the input object. Single cell data retrieved from the Single Cell Expression Atlas are processed using custom Python code (see https://github.com/reactome/gsa-backend for details). This approach to create pseudo-bulk RNA-seq data resembles previously described methods to calculate differentially expressed genes (22. Amezquita R.A. Lun A.T.L. Becht E. Carey V.J. Carpp L.N. Geistlinger L. Marini F. Rue-Albrecht K. Risso D. Soneson C. Waldron L. Pagès H. Smith M.L. Huber W. Morgan M. Gottardo R. Hicks S.C. Orchestrating single-cell analysis with Bioconductor. Nat. Methods. 2020; 17: 137-145). Thereby, all pathway analysis methods supported by the ReactomeGSA analysis system are accessible to scRNA-seq data as well.

The TCGA transcriptomics data for melanoma (TCGA-SKCM) (23. Cancer Genome Atlas Network. Genomic Classification of Cutaneous Melanoma. Cell. 2015; 161: 1681-1696), lung adenocarcinoma (TCGA-LUAD) (24. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014; 511: 543-550), lung squamous cell carcinoma (TCGA-LUSC) (25. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012; 489: 519-525), ovarian cancer (TCGA-OV) (26. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011; 474: 609-615), and breast cancer (TCGA-BRCA) (27. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012; 490: 61-70) were retrieved using the "TCGAbiolinks" R Bioconductor package (28. Colaprico A. Silva T.C. Olsen C. Garofano L. Cava C.
Garolini D. Sabedot T.S. Malta T.M. Pagnotta S.M. Castiglioni I. Ceccarelli M. Bontempi G. Noushmehr H. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 2016; 44: e71). For all datasets apart from melanoma, only primary tumor samples were retained. Genes with at least 10 reads in fewer than 30% of the samples were removed. The abundance of plasmablast-like B cells (TIPB) was quantified using the single-sample Gene Set Enrichment Analysis (ssGSEA) method (13) as implemented in the "GSVA" R Bioconductor package (14). Plasmablast-like B cells were defined by the signature genes CD38, CD27, and PAX5 (29. Griss J. Bauer W. Wagner C. Simon M. Chen M. Grabmeier-Pfistershammer K. Maurer-Granofszky M. Roka F. Penz T. Bock C. Zhang G. Herlyn M. Glatz K. Läubli H. Mertz K.D. Petzelbauer P. Wiesner T. Hartl M. Pickl W.F. Somasundaram R. Steinberger P. Wagner S.N. B cells sustain inflammation and predict response to immune checkpoint blockade in human melanoma. Nat. Commun. 2019; 10: 4186). Samples were classified as TIPB-high or TIPB-low by a split at the median expression of the TIPB signature across all samples of the cohort.
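The expression filter and the median split described above can be sketched in a few lines of Python. The function names, the toy values, and the handling of samples that lie exactly on the median (assigned to "low" here) are assumptions made for illustration; the analysis reported in the text was performed in R.

```python
from statistics import median


def keep_gene(counts, min_reads=10, min_fraction=0.3):
    """True if the gene has >= min_reads in at least min_fraction of samples."""
    expressed = sum(1 for c in counts if c >= min_reads)
    return expressed / len(counts) >= min_fraction


def median_split(scores):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly on the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Toy read counts for two hypothetical genes across five samples.
counts = {
    "GENE_A": [0, 3, 12, 25, 40],  # >= 10 reads in 3 of 5 samples: kept
    "GENE_B": [0, 0, 2, 11, 5],    # >= 10 reads in 1 of 5 samples: removed
}
filtered = {g: c for g, c in counts.items() if keep_gene(c)}

# Toy signature scores (for example ssGSEA scores) for four samples.
scores = {"S1": 0.10, "S2": 0.40, "S3": 0.25, "S4": 0.80}
groups = median_split(scores)
print(groups)
```

Because the split uses the cohort-specific median, the two groups are balanced within each study, but a "high" label in one cohort is not directly comparable to a "high" label in another.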
Overall survival was assessed using the R "survival" package. The comparative pathway analysis was performed using the ReactomeGSA R Bioconductor package. In all studies, plasmablast "high" and "low" samples were compared with each other using PADOG (12). The complete R code of this analysis, including the detailed versions of all R packages used, is available in the respective Jupyter notebook (see Data availability).

Data processed through the common data analysis pipeline (CDA) were downloaded from the CPTAC data portal (breast cancer at https://cptac-data-portal.georgetown.edu/cptac/s/S015, ovarian cancer at https://cptac-data-portal.georgetown.edu/cptac/s/S020). For breast cancer (30. Mertins P. Mani D.R. Ruggles K.V. Gillette M.A. Clauser K.R. Wang P. Wang X. Qiao J.W. Cao S. Petralia F. Kawaler E. Mundt F. Krug K. Tu Z. Lei J.T. Gatza M.L. Wilkerson M. Perou C.M. Yellapantula V. Huang K.-L. Lin C. McLellan M.D. Yan P. Davies S.R. Townsend R.R. Skates S.J. Wang J. Zhang B. Kinsinger C.R. Mesri M. Rodriguez H. Ding L. Paulovich A.G. Fenyö D. Ellis M.J. Carr S.A. NCI CPTAC. Proteogenomics connects somatic mutations to signalling in breast cancer. Nature. 2016; 534: 55-62), we used the proteome-level iTRAQ summary; for ovarian cancer (31. Zhang H. Liu T. Zhang Z. Payne S.H. Zhang B. McDermott J.E. Zhou J.-Y. Petyuk V.A. Chen L. Ray D. Sun S. Yang F. Chen L. Wang J. Shah P. Cha S.W. Aiyetan P. Woo S. Tian Y. Gritsenko M.A. Clauss T.R. Choi C. Monroe M.E. Thomas S. Nie S. Wu C. Moore R.J. Yu K.-H. Tabb D.L. Fenyö D. Bafna V. Wang Y. Rodriguez H. Boja E.S. Hiltke T. Rivers R.C. Sokoll L. Zhu H. Shih I.-M. Cope L. Pandey A. Zhang B. Snyder M.P. Levine D.A. Smith R.D. Chan D.W. Rodland K.D.
CPTAC Investigators. Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer. Cell. 2016; 166: 755-765) the PNNL-based protein-level iTRAQ summary. Samples were matched to the respective TCGA samples through the short barcode using the first 11 characters. Only unambiguous matches were retained. Plasmablast abundance-based groupings were transferred from the respective TCGA data set. The data were analyzed using the ReactomeGSA R package and PADOG.

Raw read counts of the scRNA-seq data set by Jerby-Arnon et al. (32. Jerby-Arnon L. Shah P. Cuoco M.S. Rodman C. Su M.-J. Melms J.C. Leeson R. Kanodia A. Mei S. Lin J.-R. Wang S. Rabasha B. Liu D. Zhang G. Margolais C. Ashenberg O. Ott P.A. Buchbinder E.I. Haq R. Hodi F.S. Boland G.M. Sullivan R.J. Frederick D.T. Miao B. Moll T. Flaherty K.T. Herlyn M. Jenkins R.W. Thummalapalli R. Kowalczyk M.S. Cañadas I. Schilling B. Cartwright A.N.R. Luoma A.M. Malu S. Hwu P. Bernatchez C. Forget M.-A. Barbie D.A. Shalek A.K. Tirosh I. Sorger P.K. Wucherpfennig K. Van Allen E.M. Schadendorf D. Johnson B.E. Rotem A. Rozenblatt-Rosen O. Garraway L.A. Yoon C.H. Izar B. Regev A. A cancer cell program promotes T cell exclusion and resistance to checkpoint blockade. Cell. 2018; 175: 984-997.e24) were retrieved from the Gene Expression Omnibus (GEO, identifier GSE115978). The data were processed using "Seurat" version 3.1 (20) following the new scTransform normalization strategy, regressing out the patient and cohort properties. To identify the B cells among all cells, we used the first 35 components of the principal component analysis for the subsequent steps. The neighbor graph and the clustering were computed using the default parameters. B cell clusters were identified based on a high expression of CD20 (MS4A1), CD79A, CD19, and CD138 (SDC1).
B cells were extracted from the data set and re-processed, starting with the normalization step. Here, the top 11 components of the principal component analysis were used for the respective analysis steps. B cell clusters were subsequently classified following the strategy by Sanz et al. (33. Sanz I. Wei C. Jenks S.A. Cashman K.S. Tipton C. Woodruff M.C. Hom J. Lee F.E.-H. Challenges and opportunities for consistent classification of human B cell and plasma cell populations. Front. Immunol. 2019; 10: 2458). Plasmablast-like B cells and plasma cells were differentiated based on a low expression of MS4A1 (CD20) in plasmablast-like B cells. Finally, the ssGSEA analysis was performed using the ReactomeGSA R package's analyze_sc_clusters function. The complete workflow, including the detailed versions of all R packages used, can be found in the respective Jupyter notebook (see Data availability).

ReactomeGSA can be accessed through Reactome's web interface (https://www.reactome.org/PathwayBrowser/#TOOL=AT) or through the novel "ReactomeGSA" R Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.ReactomeGSA, Fig. 1). Both access the public API (https://gsa.reactome.org) to perform the pathway analysis. The analysis system is a Kubernetes application based on the microservice paradigm that automatically scales to current demand (see Methods for details). This infrastructure enables us to offer computationally expensive pathway analysis methods through an open interface. ReactomeGSA currently supports three methods: PADOG (12), Camera through the limma R package (11), and the ssGSEA (13) through the GSVA (14. Hänzelmann S. Castelo R. Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioi