ResearchHub | Open Science Community

The ProteomeXchange consortium in 2017: supporting the cultural change in proteomics public data deposition

Eric Deutsch et al., Oct 11, 2016

The ProteomeXchange (PX) Consortium of proteomics resources (http://www.proteomexchange.org) was formally started in 2011 to standardize data submission and dissemination of mass spectrometry proteomics data worldwide. We give an overview of the current consortium activities and describe the advances of the past few years. Augmenting the PX founding members (PRIDE and PeptideAtlas, including the PASSEL resource), two new members have joined the consortium: MassIVE and jPOST. ProteomeCentral remains as the common data access portal, providing the ability to search for data sets in all participating PX resources, now with enhanced data visualization components. We describe the updated submission guidelines, now expanded to include four members instead of two. As demonstrated by data submission statistics, PX is supporting a change in culture of the proteomics field: public data sharing is now an accepted standard, supported by requirements for journal submissions resulting in public data release becoming the norm. More than 4500 data sets have been submitted to the various PX resources since 2012. Human is the most represented species with approximately half of the data sets, followed by some of the main model organisms and a growing list of more than 900 diverse species. Data reprocessing activities are becoming more prominent, with both MassIVE and PeptideAtlas releasing the results of reprocessed data sets. Finally, we outline the upcoming advances for ProteomeXchange.

Biochemistry

Molecular Biology

Genome sequence of a serotype M3 strain of group A Streptococcus: Phage-encoded toxins, the high-virulence phenotype, and clone emergence

Stephen Beres et al., Jul 16, 2002

Genome sequences are available for many bacterial strains, but there has been little progress in using these data to understand the molecular basis of pathogen emergence and differences in strain virulence. Serotype M3 strains of group A Streptococcus (GAS) are a common cause of severe invasive infections with unusually high rates of morbidity and mortality. To gain insight into the molecular basis of this high-virulence phenotype, we sequenced the genome of strain MGAS315, an organism isolated from a patient with streptococcal toxic shock syndrome. The genome is composed of 1,900,521 bp, and it shares ≈1.7 Mb of related genetic material with genomes of serotype M1 and M18 strains. Phage-like elements account for the great majority of variation in gene content relative to the sequenced M1 and M18 strains. Recombination produces chimeric phages and strains with previously uncharacterized arrays of virulence factor genes. Strain MGAS315 has phage genes that encode proteins likely to contribute to pathogenesis, such as streptococcal pyrogenic exotoxin A (SpeA) and SpeK, streptococcal superantigen (SSA), and a previously uncharacterized phospholipase A2 (designated Sla). Infected humans had anti-SpeK, -SSA, and -Sla antibodies, indicating that these GAS proteins are made in vivo. SpeK and SSA were pyrogenic and toxic for rabbits. Serotype M3 strains with the phage-encoded speK and sla genes increased dramatically in frequency late in the 20th century, commensurate with the rise in invasive disease caused by M3 organisms. Taken together, the results show that phage-mediated recombination has played a critical role in the emergence of a new, unusually virulent clone of serotype M3 GAS.

Genetics

Epidemiology

A High-Confidence Human Plasma Proteome Reference Set with Estimated Concentrations in PeptideAtlas

Terry Farrah et al., Jun 3, 2011

Biochemistry

Molecular Biology

Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks

James Smoot et al., Mar 26, 2002

Acute rheumatic fever (ARF), a sequela of group A Streptococcus (GAS) infection, is the most common cause of preventable childhood heart disease worldwide. The molecular basis of ARF and the subsequent rheumatic heart disease are poorly understood. Serotype M18 GAS strains have been associated for decades with ARF outbreaks in the U.S. As a first step toward gaining new insight into ARF pathogenesis, we sequenced the genome of strain MGAS8232, a serotype M18 organism isolated from a patient with ARF. The genome is a circular chromosome of 1,895,017 bp, and it shares 1.7 Mb of closely related genetic material with strain SF370 (a sequenced serotype M1 strain). Strain MGAS8232 has 178 ORFs absent in SF370. Phages, phage-like elements, and insertion sequences are the major sources of variation between the genomes. The genomes of strain MGAS8232 and SF370 encode many of the same proven or putative virulence factors. Importantly, strain MGAS8232 has genes encoding many additional secreted proteins involved in human-GAS interactions, including streptococcal pyrogenic exotoxin A (scarlet fever toxin) and two uncharacterized pyrogenic exotoxin homologues, all phage-associated. DNA microarray analysis of 36 serotype M18 strains from diverse localities showed that most regions of variation were phages or phage-like elements. Two epidemics of ARF occurring 12 years apart in Salt Lake City, UT, were caused by serotype M18 strains that were genetically identical, or nearly so. Our analysis provides a critical foundation for accelerated research into ARF pathogenesis and a molecular framework to study the plasticity of GAS genomes.

Genetics

Immunology

A repository of assays to quantify 10,000 human proteins by SWATH-MS

George Rosenberger et al., Sep 15, 2014

Mass spectrometry is the method of choice for deep and reliable exploration of the (human) proteome. Targeted mass spectrometry reliably detects and quantifies pre-determined sets of proteins in a complex biological matrix and is used in studies that rely on the quantitatively accurate and reproducible measurement of proteins across multiple samples. It requires the one-time, a priori generation of a specific measurement assay for each targeted protein. SWATH-MS is a mass spectrometric method that combines data-independent acquisition (DIA) and targeted data analysis and vastly extends the throughput of proteins that can be targeted in a sample compared to selected reaction monitoring (SRM). Here we present a compendium of highly specific assays covering more than 10,000 human proteins and enabling their targeted analysis in SWATH-MS datasets acquired from research or clinical specimens. This resource supports the confident detection and quantification of 50.9% of all human proteins annotated by UniProtKB/Swiss-Prot and is therefore expected to find wide application in basic and clinical research. Data are available via ProteomeXchange (PXD000953-954) and SWATHAtlas (SAL00016-35).

History

Biochemistry

A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis

Paola Picotti et al., Jan 18, 2013

High-throughput peptide synthesis and mass spectrometry are used to generate a near-complete reference map of the Saccharomyces cerevisiae proteome; two versions of the map (supporting discovery- and hypothesis-driven proteomics) are then applied to a protein-based quantitative trait locus analysis. Complete 'gold standard' reference maps of the components within a system are valuable resources for a research community. This paper presents one such resource, a complete mass-spectrometric reference map of the budding yeast Saccharomyces cerevisiae. The map comes in two versions, one for discovery-driven (shotgun) and the other for hypothesis-driven (targeted) proteomic measurements, and will support most studies performed with contemporary proteomic technologies. The maps provide essentially a set of highly specific assays for the detection and quantification of every yeast protein in any sample, and their value is demonstrated here in a protein quantitative trait locus analysis. Experience from different fields of life sciences suggests that accessible, complete reference maps of the components of the system under study are highly beneficial research tools. Examples of such maps include libraries of the spectroscopic properties of molecules, or databases of drug structures in analytical or forensic chemistry. Such maps, and methods to navigate them, constitute reliable assays to probe any sample for the presence and amount of molecules contained in the map. So far, attempts to generate such maps for any proteome have failed to reach complete proteome coverage [1,2,3]. Here we use a strategy based on high-throughput peptide synthesis and mass spectrometry to generate an almost complete reference map (97% of the genome-predicted proteins) of the Saccharomyces cerevisiae proteome. We generated two versions of this mass-spectrometric map, one supporting discovery-driven (shotgun) [3,4] and the other supporting hypothesis-driven (targeted) [5,6] proteomic measurements. Together, the two versions of the map constitute a complete set of proteomic assays to support most studies performed with contemporary proteomic technologies. To show the utility of the maps, we applied them to a protein quantitative trait locus (QTL) analysis [7], which requires precise measurement of the same set of peptides over a large number of samples. Protein measurements over 78 S. cerevisiae strains revealed a complex relationship between independent genetic loci, influencing the levels of related proteins. Our results suggest that selective pressure favours the acquisition of sets of polymorphisms that adapt protein levels but also maintain the stoichiometry of functionally related pathway members.

Genetics

Molecular Biology

Human SRMAtlas: A Resource of Targeted Assays to Quantify the Complete Human Proteome

Ulrike Kusebauch et al., Jul 1, 2016

Genetics

History

Phosphoproteomic Analysis Reveals Interconnected System-Wide Responses to Perturbations of Kinases and Phosphatases in Yeast

Bernd Bodenmiller et al., Dec 21, 2010

The phosphorylation and dephosphorylation of proteins by kinases and phosphatases constitute an essential regulatory network in eukaryotic cells. This network supports the flow of information from sensors through signaling systems to effector molecules and ultimately drives the phenotype and function of cells, tissues, and organisms. Dysregulation of this process has severe consequences and is one of the main factors in the emergence and progression of diseases, including cancer. Thus, major efforts have been invested in developing specific inhibitors that modulate the activity of individual kinases or phosphatases; however, it has been difficult to assess how such pharmacological interventions would affect the cellular signaling network as a whole. Here, we used label-free, quantitative phosphoproteomics in a systematically perturbed model organism (Saccharomyces cerevisiae) to determine the relationships between 97 kinases, 27 phosphatases, and more than 1000 phosphoproteins. We identified 8814 regulated phosphorylation events, describing the first system-wide protein phosphorylation network in vivo. Our results show that, at steady state, inactivation of most kinases and phosphatases affected large parts of the phosphorylation-modulated signal transduction machinery, and not only the immediate downstream targets. The observed cellular growth phenotype was often well maintained despite the perturbations, arguing for considerable robustness in the system. Our results serve to constrain future models of cellular signaling and reinforce the idea that simple linear representations of signaling pathways might be insufficient for drug development and for describing organismal homeostasis.

Biochemistry

Molecular Biology

PASSEL: The PeptideAtlas SRM experiment library

Terry Farrah et al., Feb 9, 2012

Public repositories for proteomics data have accelerated proteomics research by enabling more efficient cross‐analyses of datasets, supporting the creation of protein and peptide compendia of experimental results, supporting the development and testing of new software tools, and facilitating the manuscript review process. The repositories available to date have been designed to accommodate either shotgun experiments or generic proteomic data files. Here, we describe a new kind of proteomic data repository for the collection and representation of data from selected reaction monitoring (SRM) measurements. The PeptideAtlas SRM Experiment Library (PASSEL) allows researchers to easily submit proteomic data sets generated by SRM. The raw data are automatically processed in a uniform manner and the results are stored in a database, where they may be downloaded or browsed via a web interface that includes a chromatogram viewer. PASSEL enables cross‐analysis of SRM data, supports optimization of SRM data collection, and facilitates the review process of SRM data. Further, PASSEL will help in the assessment of proteotypic peptide performance in a wide array of samples containing the same peptide, as well as across multiple experimental protocols.

Molecular Biology

Food Science

The Human Plasma Proteome Draft of 2017: Building on the Human Plasma PeptideAtlas from Mass Spectrometry and Complementary Assays

Jochen Schwenk et al., Sep 22, 2017

Human blood plasma provides a highly accessible window to the proteome of any individual in health and disease. Since its inception in 2002, the Human Proteome Organization's Human Plasma Proteome Project (HPPP) has been promoting advances in the study and understanding of the full protein complement of human plasma and on determining the abundance and modifications of its components. In 2017, we review the history of the HPPP and the advances of human plasma proteomics in general, including several recent achievements. We then present the latest 2017-04 build of Human Plasma PeptideAtlas, which yields ∼43 million peptide-spectrum matches and 122,730 distinct peptide sequences from 178 individual experiments at a 1% protein-level FDR globally across all experiments. Applying the latest Human Proteome Project Data Interpretation Guidelines, we catalog 3509 proteins that have at least two non-nested uniquely mapping peptides of nine amino acids or more and >1300 additional proteins with ambiguous evidence. We apply the same two-peptide guideline to historical PeptideAtlas builds going back to 2006 and examine the progress made in the past ten years in plasma proteome coverage. We also compare the distribution of proteins in historical PeptideAtlas builds in various RNA abundance and cellular localization categories. We then discuss advances in plasma proteomics based on targeted mass spectrometry as well as affinity assays, which during early 2017 target ∼2000 proteins. Finally, we describe considerations about sample handling and study design, concluding with an outlook for future advances in deciphering the human plasma proteome.

Biochemistry

Molecular Biology
