Metaproteomics, a method for untargeted, high-throughput identification of proteins in complex samples, provides functional information about microbial communities and can tie functions to specific taxa. Metaproteomics often generates less data than other omics techniques, but analytical workflows can be improved to increase usable data in metaproteomic outputs. Identification of peptides in the metaproteomic analysis is performed by comparing mass spectra of sample peptides to a reference database of protein sequences. Although these protein databases are an integral part of the metaproteomic analysis, few studies have explored how database composition impacts peptide identification. Here, we used cervicovaginal lavage (CVL) samples from a study of bacterial vaginosis (BV) to compare the performance of databases built using six different strategies. We evaluated broad versus sample-matched databases, as well as databases populated with proteins translated from metagenomic sequencing of the same samples versus sequences from public repositories. Smaller sample-matched databases performed significantly better, driven by the statistical constraints on large databases. Additionally, large databases attributed up to 34% of significant bacterial hits to taxa absent from the sample, as determined orthogonally by 16S rRNA gene sequencing. We also tested a set of hybrid databases which included bacterial proteins from NCBI RefSeq and translated bacterial genes from the samples. These hybrid databases had the best overall performance, identifying 1,068 unique human and 1,418 unique bacterial proteins, ~30% more than a database populated with proteins from typical vaginal bacteria and fungi. Our findings can help guide the optimal identification of proteins while maintaining statistical power for reaching biological conclusions. 
IMPORTANCE Metaproteomic analysis can provide valuable insights into the functions of microbial and cellular communities by identifying a broad, untargeted set of proteins. The databases used in the analysis of metaproteomic data influence results by defining which proteins can be identified. Moreover, the size of the database impacts the number of identifications after accounting for false discovery rates (FDRs). Few studies have tested the performance of different strategies for building a protein database to identify proteins from metaproteomic data, and those that have done so largely focused on highly diverse microbial communities. We tested a range of databases on CVL samples and found that a hybrid sample-matched approach, using publicly available proteins from organisms present in the samples as well as proteins translated from metagenomic sequencing of the samples, had the best performance. However, our results also suggest that public sequence databases will continue to improve as more bacterial genomes are published.
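The two ideas at the center of the abstract, restricting a database to organisms present in the sample and measuring how many hits land on taxa that an orthogonal method did not detect, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the genus-level matching, and the toy data are hypothetical and do not come from the study, whose actual pipeline is not described in this abstract.

```python
# Illustrative sketch only: names, genus-level matching, and toy data are
# assumptions, not the authors' pipeline.

def sample_matched(proteins, detected_genera):
    """Keep only (protein_id, genus) entries whose genus was detected
    in the sample (for example, by 16S rRNA gene sequencing)."""
    return [(pid, genus) for pid, genus in proteins if genus in detected_genera]


def absent_taxa_fraction(hit_genera, detected_genera):
    """Fraction of bacterial hits assigned to genera that were NOT
    detected by the orthogonal method. Returns 0.0 for an empty hit list."""
    if not hit_genera:
        return 0.0
    absent = sum(1 for genus in hit_genera if genus not in detected_genera)
    return absent / len(hit_genera)


# Hypothetical toy data: a broad database and the genera seen by 16S.
broad_db = [
    ("P1", "Lactobacillus"),
    ("P2", "Gardnerella"),
    ("P3", "Escherichia"),
    ("P4", "Bacillus"),
]
detected = {"Lactobacillus", "Gardnerella"}

matched_db = sample_matched(broad_db, detected)
hits_from_broad_search = ["Lactobacillus", "Gardnerella", "Escherichia", "Bacillus"]
fraction_absent = absent_taxa_fraction(hits_from_broad_search, detected)
```

With this toy input, the sample-matched database keeps two of the four entries, and half of the hits from the broad search fall on genera that were not detected. Real data would require a taxonomic assignment step for each identified protein, which this sketch takes as given.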
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10469846 | PMC |
http://dx.doi.org/10.1128/msystems.00678-22 | DOI Listing |
NPJ Biofilms Microbiomes
September 2025
GFZ Helmholtz Centre for Geosciences, Potsdam, Germany.
Eukaryotic algae-dominated microbiomes thrive on the Greenland Ice Sheet (GrIS) in harsh environmental conditions, including low temperatures, high light, and low nutrient availability. Chlorophyte algae bloom on snow, while streptophyte algae dominate bare ice surfaces. Empirical data about the cellular mechanisms responsible for their survival in these extreme conditions are scarce.
J Environ Manage
September 2025
School of Human Settlements and Civil Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi Province, 710049, China.
The stability of microbial communities within sewer systems is essential for maintaining effluent quality and infrastructure longevity. However, the functional consequences of viral interactions with biofilms remain poorly characterised. This study examines the effects of bacteriophage MS2 adsorption on biofilm structure, metabolism, and pathogenic potential in a simulated 1 km sewer pipeline.
Health Inf Sci Syst
December 2025
Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408 USA.
The gut microbiome plays a fundamental role in human health and disease. Individual variations in the microbiome and the corresponding functional implications are key considerations to enhance precision health and medicine. Metaproteomics has recently revealed protein expression that might be associated with human health and disease.
J Microbiol Methods
August 2025
Department of Human Sciences & James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA.
Metaproteomic analysis offers critical insights into gut microbiome function; however, efficient microbial protein extraction from fecal samples remains challenging due to the complexity of different types of bacterial cell walls in the microbiome. In this study, we systematically compared three representative detergent-based lysis buffers (sodium dodecyl sulfate_urea, dodecyl β-D-maltoside_urea, sodium dodecyl sulfate_ dodecyl β-D-maltoside_urea) for metaproteomics sample preparation. After multiple levels of analyses, we identified SDS_DDM_urea as the most efficient option for extracting diverse microbial proteins, peptides, and identifying microbial species.
Proteomics
August 2025
Fred Hutchinson Cancer Center, Division of Public Health Sciences, Seattle, Washington, USA.
The human gut microbiome is a diverse community of microorganisms residing in the gastrointestinal tract. The storage condition of fecal samples may impact the taxonomic and protein compositions of microbiomes in these samples. Here, we performed a mass spectrometry-based metaproteomic study to assess the impact of storage media on human gut microbiome in fecal samples.