We have merged four different views of the human plasma proteome, based on different methodologies, into a single nonredundant list of 1,175 distinct gene products. The methodologies used were 1) literature search for proteins reported to occur in plasma or serum; 2) multidimensional chromatography of proteins followed by two-dimensional electrophoresis and mass spectrometry (MS) identification of resolved proteins; 3) tryptic digestion and multidimensional chromatography of peptides followed by MS identification; and 4) tryptic digestion and multidimensional chromatography of peptides from low-molecular-mass plasma components followed by MS identification. Of the 1,175 nonredundant gene products, 195 were included in more than one of the four input datasets, and only 46 appeared in all four. Predictions of signal sequence and transmembrane domain occurrence, as well as Gene Ontology annotation assignments, allowed characterization of the nonredundant list and comparison of the data sources. The “nonproteomic” literature (468 input proteins) is strongly biased toward signal sequence-containing extracellular proteins, while the three proteomics methods showed a much higher representation of cellular proteins, including nuclear, cytoplasmic, and kinesin complex proteins. Cytokines and protein hormones were almost completely absent from the proteomics data (presumably due to low abundance), while categories like DNA-binding proteins were almost entirely absent from the literature data (perhaps unexpected and therefore not sought). Most major categories of proteins in the human proteome are represented in plasma, with the distribution at successively deeper layers shifting from mostly extracellular to a distribution more like the whole (primarily cellular) proteome. The resulting nonredundant list confirms the presence of a number of interesting candidate marker proteins in plasma and serum.
The human plasma proteome is likely to contain most, if not all, human proteins, as well as proteins derived from some viruses, bacteria, and fungi. Many of the human proteins, introduced by low-level tissue leakage, ought to be present at very low concentrations (≪pg/ml), while others, such as albumin, are present in very large amounts (≫mg/ml). Numerous post-translationally modified forms of each protein are likely to be present, along with literally millions of distinct clonal immunoglobulin (Ig) sequences. This complexity and enormous dynamic range make plasma the most difficult specimen to be dealt with by proteomics (1: Anderson N.L., Anderson N.G. The human plasma proteome: History, character, and diagnostic prospects. Mol. Cell. Proteomics. 2002; 1: 845-867). At the same time, plasma is the most generally informative proteome from a medical viewpoint. Almost all cells in the body communicate with plasma directly or through extracellular or cerebrospinal fluids, and many release at least part of their contents into plasma upon damage or death. Some medical conditions, such as myocardial infarction, are officially defined based on the increase of a specific protein in the plasma (e.g. cardiac troponin-T), and it is difficult to argue convincingly that there is any disease state that does not produce some specific pattern of protein change in the body’s working fluid.

The abbreviations used are: Ig, immunoglobulin; MS, mass spectrometry; GO, Gene Ontology; 2DE, two-dimensional electrophoresis; NR, nonredundant; TM, transmembrane; LC, liquid chromatography; MS/MS, tandem MS; IT, ion trap.
This immense diagnostic potential has spurred a rapid acceleration in the search for protein disease markers by a wide variety of proteomics strategies. Current methods of proteomics are only beginning to catalog the contents of plasma. Two-dimensional electrophoresis was able to resolve 40 distinct plasma proteins in 1976 (2: Anderson L., Anderson N.G. High resolution two-dimensional electrophoresis of human plasma proteins. Proc. Natl. Acad. Sci. U. S. A. 1977; 74: 5421-5425), but, because of the dynamic range problem, this number had only grown to 60 in 1992 (3: Hughes G.J., Frutiger S., Paquet N., Ravier F., Pasquali C., Sanchez J.C., James R., Tissot J.D., Bjellqvist B., Hochstrasser D.F. Plasma protein map: An update by microsequencing. Electrophoresis. 1992; 13: 707-714) and is substantially unchanged today, a quarter century later. It is now clear that more than two dimensions of conventional resolution are required to progress beyond this point. Recently, several truly multidimensional survey efforts have been mounted, with the result that the number of distinct proteins detected has increased dramatically. Additional dimensions of separation can be introduced at any of three levels: a) separation of intact proteins, either by specific binding (e.g. subtraction of defined high-abundance proteins) or continuous resolution (e.g. electrophoresis or chromatography); b) separation of peptides derived from plasma proteins, either by specific binding (e.g. capture by anti-peptide antibodies) or continuous resolution (e.g. chromatography); and c) separation of peptides, and particularly their fragments, by mass spectrometry (MS). Many possible combinations of these dimensions can be implemented, the only limitations being the effort, cost, and time of analyzing many fractions or runs instead of one.
In this article, we have compared and combined data from three different multidimensional strategies with data from a fourth, classical source (the protein biochemistry and clinical chemistry literature) to provide a meta-level overview of both the contents and the rate of discovery of new components in plasma. The three experimental datasets are derived from 1) whole protein separation by a three-dimensional process (immunosubtraction/ion exchange/size exclusion) followed by two-dimensional electrophoresis (2DE) followed by MS identification of resolved spots (4: Pieper R., Su Q., Gatlin C.L., Huang S.T., Anderson N.L., Steiner S. Multi-component immunoaffinity subtraction chromatography: An innovative step towards a comprehensive survey of the human plasma proteome. Proteomics. 2003; 3: 422-432); 2) Ig subtraction followed by trypsin digestion followed by two-dimensional liquid chromatography (LC) (ion exchange/reversed phase) followed by tandem MS (MS/MS) (5: Adkins J.N., Varnum S.M., Auberry K.J., Moore R.J., Angell N.H., Smith R.D., Springer D.L., Pounds J.G. Toward a human blood serum proteome: Analysis by multidimensional separation coupled with mass spectrometry. Mol. Cell. Proteomics. 2002; 1: 947-955); and 3) molecular mass fractionation, followed by trypsin digestion followed by two-dimensional LC (cation exchange/reversed phase) followed by MS/MS (6: Tirumalai R.S., Chan K.C., Prieto D.A., Issaq H.J., Conrads T.P., Veenstra T.D. Characterization of the low molecular weight human serum proteome. Mol. Cell. Proteomics. 2003; 2: 1096-1103). These three experimental approaches have two features in common (the removal of most Igs, by specific subtraction or size, and the use of MS for molecular identification), but otherwise they span the gamut of proteomics discovery approaches: separation at the protein level, separation at the tryptic peptide level, and a hybrid.
Combining experimental data with literature search results on proteins detected in plasma (representing a large body of accumulated “nonproteomics” data) should provide a broad perspective on plasma contents. Because the same proteins detected by various methods can be referred to by different names or accession numbers, we have used a sequence-based approach to eliminate redundancy and cluster all occurrences of the same protein. The resulting list makes it possible to examine the overlap between the various approaches and to see whether they are biased toward particular classes of proteins. In addition, a pooled nonredundant list should provide a relatively unbiased survey of the kinds of proteins present in plasma, which could have important diagnostic implications. Finally, a large list of proteins actually observed in plasma paves the way for top-down, targeted proteomics approaches to the discovery of disease markers: the development of accurate high-throughput specific assays for selected candidates from this list, as a supplement to the use of single methods for marker discovery in small sample sets. In the longer term, proteins with strong, mechanistic disease relationships may be viable therapeutic candidates as well.

Manual Medline searches were performed for titles or abstracts containing human plasma or serum proteins, excluding articles on membranes, stimulation, drug, and dose. A total of 468 entries were collected, of which 458 had a human sequence accession number in one or more of the major databases.

Intact proteins were fractionated by chromatography and 2DE and identified by MS, generating the dataset described by Pieper et al. (7: Pieper R., Gatlin C.L., Makusky A.J., Russo P.S., Schatz C.R., Miller S.S., Su Q., McGrath A.M., Estock M.A., Parmar P.P., Zhao M., Huang S.T., Zhou J., Wang F., Esquer-Blasco R., Anderson N.L., Taylor J., Steiner S. The human serum proteome: Display of nearly 3700 chromatographically separated protein spots on two-dimensional electrophoresis gels and identification of 325 distinct proteins. Proteomics. 2003; 3: 1345-1364). Briefly, human blood sera were obtained in equal volumes from two healthy male donors (ages 40 and 80). Albumin, haptoglobin, transferrin, transthyretin, α-1-antitrypsin, α-1-acid glycoprotein, hemopexin, and α-2-macroglobulin were removed by immunoaffinity chromatography. The immunoaffinity-subtracted serum concentrate was fractionated further by sequential anion exchange and size exclusion chromatography. The resulting 66 samples were individually subjected to 2DE. All visible Coomassie Blue R250 spots were cut out, destained, reduced, alkylated, and digested with trypsin. All extracted peptides were analyzed by matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) on a Bruker Biflex or Autoflex mass spectrometer (Bruker, Billerica, MA) and searched against Swiss-Prot. Those samples that did not give positive identification by MALDI-TOF were subjected to LC-MS/MS analysis by ion trap (IT) MS (Thermo Finnigan LCQ, Woburn, MA) and searched against the National Center for Biotechnology Information (NCBI) database using SEQUEST.

A published dataset prepared by Adkins et al. (5) was used. Briefly, human blood serum was obtained from a healthy anonymous female donor. Igs were depleted by affinity adsorption chromatography using protein A/G. The resulting Ig-depleted serum was digested with trypsin and separated by strong cation exchange on a polysulfoethyl A column followed by reverse-phase separation on a capillary C18 column.
The capillary column was interfaced to an IT-MS (Thermo Finnigan LCQ Deca XP) using electrospray ionization. The IT-MS was configured to perform MS/MS scans on the three most intense precursor masses from a single MS scan. All samples were measured over a mass/charge (m/z) range of 400–2,000, with fractions containing high complexity being measured with segmented m/z ranges. Tandem mass spectra were analyzed by SEQUEST as described using the NCBI May 2002 database.

The fourth dataset is that described by Tirumalai et al. (6), focused on the lower-molecular-mass plasma proteome. Briefly, standard human serum was purchased from the National Institute of Standards and Technology. High-molecular-mass proteins were removed in the presence of acetonitrile using Centriplus centrifugal filters with a molecular mass cutoff of 30 kDa. The low-molecular-mass filtrate was reduced, alkylated, and digested with trypsin. The digested sample was fractionated by strong cation exchange chromatography on a polysulfoethyl A column. Reversed-phase LC was subsequently performed on a 300-Å Jupiter C-18 column coupled online to an IT-MS (Thermo Finnigan LCQ Deca XP). Each full MS scan was followed by three MS/MS scans in which the three most abundant peptide molecular ions were selected. MS/MS spectra were searched against a human protein database using SEQUEST.

The Blastp protein comparison algorithm (8: Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990; 215: 403-410; 9: Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z., Miller W., Lipman D.J. Gapped blast and psi-blast: A new generation of protein database search programs. Nucleic Acids Res. 1997; 25: 3389-3402) was used to query the sequence of each protein identified against a database containing the aggregate sequences of all proteins identified by any method. Sequences sharing greater than 95% identity over an aligned region were grouped into “unique sequence clusters.” Sequences were unmasked, and the minimum alignment length considered was 15 aa. This similarity-based approach was sufficient to group identical sequences, sequence fragments, and splice variants. Annotation in the nonredundant table was reported for the “best annotated” protein in the cluster set.

Signal peptides were predicted using the commercially available SignalP version 2.0 neural net and hidden Markov model (HMM) algorithms (10: Nielsen H., Engelbrecht J., Brunak S., von Heijne G. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 1997; 10: 1-6) and the sigmask (11: Swindells M., Rae M., Pearce M., Moodie S., Miller R., Leach P. Application of high-throughput computing in bioinformatics. Philos. Transact. Ser. A. Math. Phys. Eng. Sci. 2002; 360: 1179-1189) signal masking program developed as part of Inpharmatica’s Biopendium (12: Michalovich D., Overington J., Fagan R. Protein sequence analysis in silico: Application of structure-based bioinformatics to genomic initiatives. Curr. Opin. Pharmacol. 2002; 2: 574-580) protein annotation database. Each sequence received a score of +1 for a statistically significant positive signal peptide prediction from any of the three algorithms. The scores 0, 1, 2, and 3 for a particular sequence were then converted to the qualitative terms “no,” “possible signal,” “signal,” or “signal confident,” respectively.

Transmembrane (TM) regions were predicted using the commercial version of the TMHMM version 2.0 algorithm (13: Krogh A., Larsson B., von Heijne G., Sonnhammer E.L. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 2001; 305: 567-580). The total number of TM helices predicted per sequence was reported for each protein sequence. When a predicted TM region overlapped a predicted signal sequence (as it did in 40 cases in H_Plasma_NR_v2), this was interpreted as a signal sequence only.

Sequences were scanned against a library of Biopendium and iPSI-BLAST (9, 11)-like protein profiles constructed from SCOP (14: Murzin A.G., Brenner S.E., Hubbard T., Chothia C. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995; 247: 536-540), PFAM (15: Bateman A., Birney E., Cerruti L., Durbin R., Etwiller L., Eddy S.R., Griffiths-Jones S., Howe K.L., Marshall M., Sonnhammer E.L. The pfam protein families database. Nucleic Acids Res. 2002; 30: 276-280), PRINTS (16: Attwood T.K., Bradley P., Flower D.R., Gaulton A., Maudling N., Mitchell A.L., Moulton G., Nordle A., Paine K., Taylor P., Uddin A., Zygouri C. Prints and its automatic supplement, preprints. Nucleic Acids Res. 2003; 31: 400-402), and PROSITE (17: Sigrist C.J., Cerutti L., Hulo N., Gattiker A., Falquet L., Pagni M., Bairoch A., Bucher P. PROSITE: A documented database using patterns and profiles as motif descriptors. Brief Bioinform. 2002; 3: 265-274) domain families. Hits to these profiles were reported at a statistical e-value cut-off of 1e-5.
This cut-off was chosen to maximize profile coverage and minimize the occurrence of false positives. Sequences were not masked for low complexity or coiled coils prior to profile scanning.

NCBI GI number accessions for the sequences were matched to their SPTR (18: Boeckmann B., Bairoch A., Apweiler R., Blatter M.C., Estreicher A., Gasteiger E., Martin M.J., Michoud K., O’Donovan C., Phan I., Pilbout S., Schneider M. The Swiss-Prot protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 2003; 31: 365-370) equivalents based on sequences sharing >95% sequence identity over 90% of the query sequence length. GO (19: Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T., Harris M.A., Hill D.P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J.C., Richardson J.E., Ringwald M., Rubin G.M., Sherlock G. Gene Ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 2000; 25: 25-29) component, process, and function terms were then extracted from text-based annotation files available for download from the GO database ftp site: ftp.geneontology.org/pub/go/gene-associations/gene_association.goa_human. For graphical reporting, a series of GO terms in each category were extracted by text searching of relevant keywords (indicated by the category names on plots) through all the assigned GO definitions. A GO component summary for the whole human proteome was prepared by applying the same approach to the complete GO human database referred to above.

The nonredundant (NR) plasma database was assembled as a series of tables in a PostgreSQL relational database and queried to derive summary statistics for tables and figures shown here.

Four sets of accession numbers for proteins occurring in plasma (468 from Lit, 319 from 2DEMS, 607 [reported as 490 nonredundant accessions] from LCMS1, and 341 from LCMS2) were combined to yield 1,735 total initial accessions (Table I).
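The keyword-based grouping of GO terms described above can be sketched roughly as follows. This is only an illustration: the keyword list is hypothetical (the text says only that keywords correspond to the category names on the plots), and the input is assumed to be a mapping from protein accession to the names of its assigned GO terms, already resolved from the annotation file.

```python
from collections import Counter

# Hypothetical keywords chosen for illustration; the actual list is not given in the text.
CATEGORY_KEYWORDS = {
    "extracellular": "extracellular",
    "nuclear": "nucleus",
    "cytoplasmic": "cytoplasm",
    "kinesin complex": "kinesin complex",
}


def tally_go_categories(go_terms_by_protein):
    """Count the proteins whose assigned GO term names contain each keyword."""
    counts = Counter()
    for terms in go_terms_by_protein.values():
        for category, keyword in CATEGORY_KEYWORDS.items():
            # A protein is counted at most once per category.
            if any(keyword in term.lower() for term in terms):
                counts[category] += 1
    return counts


# Made-up accessions and term assignments, for demonstration only.
example_terms = {
    "protA": ["extracellular space"],
    "protB": ["nucleus", "cytoplasm"],
    "protC": ["kinesin complex", "cytoplasm"],
}
example_counts = tally_go_categories(example_terms)
```

Applying the same function to the plasma list and to the whole human proteome would give the two distributions being compared, under the assumption that one keyword per category is sufficient.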
A total of 55 of the input accessions referred to nonhuman sequences, and these were not considered further in the present analysis. A very conservative method of selecting distinct proteins was used in order to avoid counting sequence variants, splice variants, or cleavage products of one gene product as different: any sequences that shared a region larger than 15 aa with greater than 95% sequence identity were assigned to the same cluster and reported as a single entry in the nonredundant set. Fig. 1 shows one result of applying these criteria, in this case resulting in the assignment of 10 initial accessions to a single cluster for haptoglobin, a major plasma protein found in all four initial datasets and whose three separate subunit types are derived from a single translation product. This case also highlights the general observation that not all datasets used the same primary accession database (for example, NCBI GI, Swiss-Prot, or RefSeq). The largest cluster (109 “redundant” entries) is accounted for by Igs, where all the Ig heavy and light chains of all types were clustered together as one entry arbitrarily chosen as S40354 (an Ig κ chain sequence). Thus 6.2% of the input accessions were Igs, despite the fact that each of the experimental methods included steps to remove these molecules.

Table I. Protein redundancy within and between datasets
 | Lit | LCMS1 | LCMS2 | 2DEMS | Total
Beginning accessions | 468 | 607 | 341 | 319 | 1735
Minus nonhuman | 458 | 580 | 330 | 312 | 1680
Minus intrasource redundancy and nonhuman accessions | 433 | 475 | 318 | 283 | 1509
Unique to source in NR | 284 | 334 | 221 | 141 | 980
Total combined NR list | – | – | – | – | 1175

This approach is more conservative (fewer distinct proteins reported) than the methods used in some of the input data sources, which accounts for the decrease in each set when intra-set redundancy is removed (1,509 human accessions remain).
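The clustering criterion above can be illustrated with a short sketch. It assumes that pairwise alignment hits are already available as (query, subject, percent identity, alignment length) tuples, that clusters are formed transitively (single linkage), and that an alignment of exactly 15 aa qualifies; none of these details is specified in the text, and the accession names are made up.

```python
from collections import defaultdict

MIN_IDENTITY = 95.0  # percent; the text requires "greater than 95%" identity
MIN_ALIGN_LEN = 15   # aa; treating exactly 15 aa as acceptable is an assumption


def cluster_accessions(accessions, hits):
    """Group accessions into clusters linked by qualifying pairwise hits."""
    parent = {acc: acc for acc in accessions}

    def find(acc):
        # Walk up to the cluster representative, shortening the path on the way.
        while parent[acc] != acc:
            parent[acc] = parent[parent[acc]]
            acc = parent[acc]
        return acc

    for query, subject, identity, length in hits:
        if identity > MIN_IDENTITY and length >= MIN_ALIGN_LEN:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two clusters

    clusters = defaultdict(set)
    for acc in accessions:
        clusters[find(acc)].add(acc)
    return sorted(sorted(members) for members in clusters.values())


# Made-up accessions and hits, for demonstration only.
example_accessions = ["acc1", "acc2", "acc3", "acc4", "acc5"]
example_hits = [
    ("acc1", "acc2", 99.0, 120),  # qualifies
    ("acc2", "acc3", 96.5, 15),   # qualifies, so acc3 joins acc1 via acc2
    ("acc4", "acc1", 99.0, 10),   # alignment too short
    ("acc4", "acc5", 90.0, 200),  # identity too low
]
example_clusters = cluster_accessions(example_accessions, example_hits)
```

Because linkage is transitive in this sketch, a fragment matching two otherwise distinct sequences would merge them, which is one way a single cluster can absorb many accessions, as in the Ig case.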
When inter-set redundancies are removed (making the full list nonredundant by the criteria described above), a total of 1,175 distinct proteins remain. The entire nonredundant set, here abbreviated H_Plasma_NR_v2 (H_Plasma_NR_v1 being Table I of Ref. 1), is provided as a supplemental data table. Of these 1,175 proteins, 980 occur in only one source. Because so many entries occur only once, and given the non-zero frequency of false MS identifications, independent confirmation will be required to validate most of this list as true plasma components. Of the 1,175 nonredundant human proteins in H_Plasma_NR_v2, 195 entries, or 17%, were present in more than one dataset (set H_Plasma_195: Fig. 2 and Table II). Only 46 (4%) were found in all four sets of accessions (Total_sources = 4, shown in bold type in Table II). Of these, only one (inter-α trypsin inhibitor heavy chain H1) is predicted to have even a single transmembrane domain, and only one (the hemoglobin β chain, presumably released from red cell lysis) is predicted not to have a signal sequence.
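The signal sequence and TM annotations referred to above follow the scoring rules given in the methods: one point per positive predictor, mapped to a qualitative label, with TM helices that overlap a predicted signal sequence treated as signal only. A minimal sketch is shown below; the function names and the representation of regions as inclusive (start, end) residue positions are assumptions made for illustration.

```python
# Labels for scores 0, 1, 2, and 3, as listed in the text.
SIGNAL_LABELS = ("no", "possible signal", "signal", "signal confident")


def signal_label(neural_net_hit, hmm_hit, sigmask_hit):
    """Convert three yes/no signal peptide predictions to a qualitative label."""
    score = int(neural_net_hit) + int(hmm_hit) + int(sigmask_hit)
    return SIGNAL_LABELS[score]


def count_tm_helices(tm_regions, signal_region=None):
    """Count predicted TM helices, skipping any that overlap the signal region."""
    if signal_region is None:
        return len(tm_regions)
    sig_start, sig_end = signal_region
    return sum(
        1 for start, end in tm_regions
        if end < sig_start or start > sig_end  # keep only non-overlapping helices
    )


# Made-up predictions, for demonstration only.
example_label = signal_label(True, True, False)
example_tm = count_tm_helices([(5, 25), (100, 120)], signal_region=(1, 22))
```

In this example the first helix overlaps the predicted signal region and is not counted, so only the second contributes to the TM total.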
These characteristics (presence of signal sequence and absence of transmembrane domains) are those expected for major plasma proteins secreted by organs such as the liver.

Table II. Plasma proteins detected in at least two datasets
Accession | Lit | 2DEMS | LCMS1 | LCMS2 | Total_accessions | Total_sources | Signal | TM | Description
P10809 | 1 | 0 | 1 | 1 | 3 | 3 | No | 0 | 60-kDa heat shock protein, mitochondrial precursor (Hsp60) (60-kDa chaperonin) (CPN60) (Heat shock protein 60) (HSP-60) (mitochondrial matrix protein P1) (P60 lymphocyte protein) (hucha60)
AAB27045 | 0 | 0 | 1 | 1 | 2 | 2 | Possible signal | 0 | 70-kDa peroxisomal membrane protein homolog (internal fragment)
P02570 | 2 | 4 | 0 | 2 | 8 | 3 | No | 0 | Actin, cytoplasmic 1 (β-actin)
Q15848 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | Adiponectin precursor (30-kDa adipocyte complement-related protein) (ACRP30) (adipose most abundant gene transcript 1) (apm-1) (gelatin-binding protein)
NP_001124 | 0 | 1 | 1 | 0 | 2 | 2 | Signal confident | 0 | Afamin precursor; α-albumin (Homo sapiens)
P02763 | 1 | 1 | 1 | 0 | 3 | 3 | Signal confident | 0 | α-1-acid glycoprotein 1 precursor (AGP 1) (orosomucoid 1) (OMD 1)
P01011 | 1 | 1 | 2 | 0 | 4 | 3 | Signal confident | 0 | α-1-antichymotrypsin precursor (ACT)
P01009 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | α-1-antitrypsin precursor (α-1 protease inhibitor) (α-1-antiproteinase) (PRO0684/PRO2209)
P04217 | 1 | 1 | 1 | 0 | 3 | 3 | No | 0 | α-1B-glycoprotein precursor (α-1-B glycoprotein)
P08697 | 1 | 1 | 1 | 1 | 4 | 4 | Signal | 0 | α-2-antiplasmin precursor (α-2-plasmin inhibitor) (α-2-PI) (α-2-AP)
P02765 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | α-2-HS-glycoprotein precursor (Fetuin-A) (α-2-Z-globulin) (Ba-α-2-glycoprotein) (PRO2743)
P01023 | 1 | 1 | 2 | 1 | 5 | 4 | Signal confident | 0 | α-2-macroglobulin precursor (α-2-M)
P02760 | 1 | 1 | 1 | 0 | 3 | 3 | Signal confident | 0 | AMBP protein precursor [contains α-1-microglobulin (protein HC) (complex-forming glycoprotein heterogeneous in charge) (α-1 microglycoprotein); inter-α-trypsin inhibitor light chain (ITI-LC) (bikunin) (HI-30)]
P01019 | 1 | 1 | 1 | 1 | 4 | 4 | Signal | 0 | Angiotensinogen precursor [contains angiotensin I (Ang I); angiotensin II (Ang II); angiotensin III (Ang III) (Des-Asp[1]-angiotensin II)]
P01008 | 1 | 1 | 1 | 1 | 4 | 4 | Signal | 0 | Antithrombin-III precursor (ATIII) (PRO0309)
P02647 | 1 | 1 | 2 | 2 | 6 | 4 | Signal confident | 0 | Apolipoprotein A-I precursor (Apo-AI)
P02652 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Apolipoprotein A-II precursor (Apo-AII) (apoa-II)
P06727 | 1 | 1 | 1 | 2 | 5 | 4 | Signal confident | 0 | Apolipoprotein A-IV precursor (Apo-AIV)
P04114 | 1 | 1 | 2 | 0 | 4 | 3 | Signal confident | 0 | Apolipoprotein B-100 precursor (Apo B-100) [contains: apolipoprotein B-48 (Apo B-48)]
P02655 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Apolipoprotein C-II precursor (Apo-CII)
P02656 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Apolipoprotein C-III precursor (Apo-CIII)
P05090 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Apolipoprotein D precursor (Apo-D) (apod)
P02649 | 1 | 3 | 2 | 1 | 7 | 4 | Signal confident | 0 | Apolipoprotein E precursor (Apo-E)
Q13790 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Apolipoprotein F precursor (Apo-F)
O14791 | 1 | 1 | 1 | 1 | 4 | 4 | Signal | 0 | Apolipoprotein L1 precursor (apolipoprotein L-I) (apolipoprotein L) (apol-I) (Apo-L) (apol)
P08519 | 1 | 0 | 1 | 0 | 2 | 2 | Signal | 0 | Apolipoprotein(a) precursor (EC 3.4.21.−) (Apo(a)) (Lp(a))
P06576 | 0 | 1 | 1 | 0 | 2 | 2 | Possible signal | 0 | ATP synthase β chain, mitochondrial precursor (EC 3.6.3.14)
P01160 | 1 | 0 | 1 | 0 | 2 | 2 | Signal confident | 0 | Atrial natriuretic factor precursor (ANF) (atrial natriuretic peptide) (ANP) (prepronatriodilatin) [contains: cardiodilatin-related peptide (CDP)]
P02749 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | β-2-glycoprotein I precursor (apolipoprotein H) (Apo-H) (B2GPI) (β(2)GPI) (activated protein C-binding protein) (APC inhibitor)
P01884 | 1 | 0 | 0 | 1 | 2 | 2 | Signal confident | 0 | β-2-microglobulin precursor
I39467 | 0 | 0 | 1 | 1 | 2 | 2 | No | 0 | Bullous pemphigoid antigen, human (fragment)
P04003 | 1 | 1 | 1 | 0 | 3 | 3 | Signal | 0 | C4b-binding protein α chain precursor (c4bp) (proline-rich protein) (PRP)
P20851 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | C4b-binding protein β chain precursor
P05109 | 1 | 1 | 0 | 0 | 2 | 2 | No | 0 | Calgranulin A (Migration inhibitory factor-related protein 8) (MRP-8) (cystic fibrosis antigen) (CFAG) (P8) (leukocyte L1 complex light chain) (S100 calcium-binding protein A8) (calprotectin L1L subunit)
NP_001729 | 0 | 1 | 1 | 0 | 2 | 2 | No | 0 | Carbonic anhydrase I; carbonic dehydratase (Homo sapiens)
P22792 | 1 | 1 | 1 | 0 | 3 | 3 | No | 0 | Carboxypeptidase N 83-kDa chain (carboxypeptidase N regulatory subunit) (fragment)
P15169 | 1 | 1 | 0 | 0 | 2 | 2 | Signal | 0 | Carboxypeptidase N catalytic chain precursor (EC 3.4.17.3) (arginine carboxypeptidase) (kinase 1) (serum carboxypeptidase N) (SCPN) (anaphylatoxin inactivator) (plasma carboxypeptidase B)
P07339 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | Cathepsin D precursor (EC 3.4.23.5)
P07711 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | Cathepsin L precursor (EC 3.4.22.15) (major excreted protein) (MEP)
P25774 | 0 | 1 | 0 | 1 | 2 | 2 | Signal confident | 0 | Cathepsin S precursor (EC 3.4.22.27)
NP_005185 | 0 | 0 | 1 | 1 | 2 | 2 | No | 0 | CCAAT/enhancer binding protein β, interleukin 6-dependent
O43866 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | CD5 antigen-like precursor (SP-α) (CT-2) (igm-associated peptide)
NP_005187 | 0 | 0 | 2 | 1 | 3 | 2 | No | 0 | Centromere protein F (350/400kd, mitosin); mitosin; centromere
P00450 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Ceruloplasmin precursor (EC 1.16.3.1) (ferroxidase)
NP_006421 | 0 | 0 | 1 | 1 | 2 | 2 | No | 0 | Chaperonin containing TCP1, subunit 4 (δ); chaperonin
NP_004061 | 0 | 0 | 1 | 1 | 2 | 2 | No | 10 | Chloride channel Ka; chloride channel, kidney, A; hclc-Ka (Homo sapiens)
P06276 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 1 | Cholinesterase precursor (EC 3.1.1.8) (acylcholine acylhydrolase) (choline esterase II) (butyrylcholine esterase) (pseudocholinesterase)
P10909 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Clusterin precursor (complement-associated protein SP-40,40) (complement cytolysis inhibitor) (CLI) (NA1 and NA2) (apolipoprotein J) (Apo-J) (TRPM-2)
P00740 | 1 | 1 | 0 | 1 | 3 | 3 | Signal | 0 | Coagulation factor IX precursor (EC 3.4.21.22) (Christmas factor)
P12259 | 1 | 0 | 1 | 1 | 3 | 3 | Signal confident | 0 | Coagulation factor V precursor (activated protein C cofactor)
P00451 | 1 | 0 | 1 | 0 | 2 | 2 | Signal | 0 | Coagulation factor VIII precursor (procoagulant component) (antihemophilic factor) (AHF)
P00742 | 1 | 1 | 0 | 0 | 2 | 2 | Signal confident | 0 | Coagulation factor X precursor (EC 3.4.21.6) (Stuart factor)
P00748 | 1 | 1 | 1 | 1 | 4 | 4 | Signal confident | 0 | Coagulation factor XII precursor (EC 3.4.21.38) (Hageman factor) (HAF)
P00488 | 1 | 1 | 0 | 1 | 3 | 3 | No | 0 | Coagulation factor XIII A chain precursor (EC 2.3.2.13) (protein-glutamine γ-glutamyltransferase A chain) (transglutaminase A chain)
P05160 | 1 | 1 | 1 | 0 | 3 | 3 | Signal confident | 0 | Coagulation facto