Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.

Eric W Deutsch, Zhi Sun, David S Campbell, Pierre-Alain Binz, Terry Farrah, David Shteynberg, Luis Mendoza, Gilbert S Omenn, Robert L Moritz

J Proteome Res

Institute for Systems Biology, Seattle, Washington 98109, United States.

Published: November 2016

The results of shotgun proteomics mass spectrometry data analysis can be greatly affected by the choice of reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem that is especially acute in human sample analysis. All sequence databases are genome-based, compiling the sequences of predicted genes and their protein translation products. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and to make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases, suitable for mass spectrometry proteomics sequence database searching, is called the Tiered Human Integrated Search Proteome set. To evaluate the utility of these databases, we analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. Approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if identifying sequence variants or discovering sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month, generates a new set of search databases, and makes them available for download at http://www.peptideatlas.org/thisp/.
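The "search simple, then check uniqueness against a more complex database" idea can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the accessions and sequences are invented toy data, the function names are hypothetical, peptides are matched by plain substring search without regard to enzyme specificity, and isoleucine and leucine are treated as interchangeable because they are isobaric.

```python
from io import StringIO

# Toy FASTA content; accessions and sequences are invented for illustration.
TIER1_FASTA = """>P1
MKTAYIAKQRQISFVK
"""
TIER4_FASTA = """>P1
MKTAYIAKQRQISFVK
>P1-2
MKTAYIAKQRGGSFVK
"""

def read_fasta(handle):
    """Yield (accession, sequence) pairs from a FASTA-formatted text handle."""
    accession, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if accession is not None:
                yield accession, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the accession.
            accession, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if accession is not None:
        yield accession, "".join(chunks)

def normalize(sequence):
    """Upper-case and merge I into L (assumption: I/L are indistinguishable)."""
    return sequence.upper().replace("I", "L")

def map_peptides(peptides, fasta_text):
    """Return {peptide: sorted accessions whose sequence contains the peptide}."""
    hits = {peptide: [] for peptide in peptides}
    for accession, sequence in read_fasta(StringIO(fasta_text)):
        normalized = normalize(sequence)
        for peptide in peptides:
            if normalize(peptide) in normalized:
                hits[peptide].append(accession)
    return {peptide: sorted(accs) for peptide, accs in hits.items()}

def lost_uniqueness(peptides, small_fasta, large_fasta):
    """Peptides that map to one protein in the small database but several in the large one."""
    small = map_peptides(peptides, small_fasta)
    large = map_peptides(peptides, large_fasta)
    return [p for p in peptides if len(small[p]) == 1 and len(large[p]) > 1]

# Peptides assumed to have been identified in a search against the small database.
identified = ["TAYIAK", "QISFVK"]
print(map_peptides(identified, TIER4_FASTA))
print(lost_uniqueness(identified, TIER1_FASTA, TIER4_FASTA))
```

In this toy case the first peptide is unique in the small database but shared with a second entry in the larger one, which is the kind of ambiguity the recheck is meant to reveal. A real workflow would read the downloaded database files and would probably use an indexed lookup rather than a nested loop.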

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096980
DOI: http://dx.doi.org/10.1021/acs.jproteome.6b00445
