The Cytochrome C Oxidase subunit I gene ("COI") is the de facto standard for animal DNA barcoding. Organism identification based on COI requires an accurate and extensive annotated database of COI sequences. Such a database can also be of value in reconstructing evolutionary history and in diversity studies. Two COI databases are currently available: BOLD and Midori. BOLD's submissions conform to stringent sequence and metadata requirements; BOLD is specific to COI but makes no attempt to be comprehensive. Midori, derived from GenBank, has more sequences but less stringent standards than BOLD, resulting in higher error rates. To address the need for a comprehensive and accurate COI database, we adapted the ARBitrator algorithm, which classifies based only on sequence properties and has successfully auto-curated bacterial genes mined from GenBank. The adapted algorithm, which we call CO-ARBitrator, built a database of over a million metazoan COI sequences. Its sensitivity and specificity are significantly higher than those of Midori. Its specificity is comparable to what BOLD achieves with data quality prerequisites. Results and software are publicly available.
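For readers unfamiliar with the two metrics the abstract compares, the following is a minimal sketch of how sensitivity and specificity are conventionally computed from classification counts. The function names and the example counts are hypothetical illustrations; they are not taken from the paper and do not reproduce CO-ARBitrator's evaluation.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of genuine COI sequences that the classifier accepts."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-COI sequences that the classifier rejects."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only (not results from the paper).
example_sensitivity = sensitivity(true_pos=90, false_neg=10)
example_specificity = specificity(true_neg=95, false_pos=5)
```

A database can score well on one metric and poorly on the other: accepting every candidate sequence maximizes sensitivity but lowers specificity, which is why the abstract reports both.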
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6080493 | PMC |
http://dx.doi.org/10.1038/sdata.2018.156 | DOI Listing |
Mar Environ Res
August 2025
State Key Laboratory of Submarine Geoscience, School of Oceanography, Shanghai Jiao Tong University, Shanghai, 200030, China; State Key Laboratory of Submarine Geoscience, Second Institute of Oceanography, Ministry of Natural Resources, Hangzhou, 310012, China; Key Laboratory of Marine Ecosystem Dyn
Using environmental DNA (eDNA) metabarcoding with mitochondrial COI gene markers for biodiversity assessments has been gaining popularity. This approach is particularly advantageous in marine ecosystems due to the significant challenges posed by traditional sampling methods. However, limitations like primer specificity and primer-template bias during the PCR amplification can affect the accuracy of biodiversity assessments, though these issues have not been quantitatively evaluated to date.
Zookeys
July 2025
Sciences Department, Museums Victoria Research Institute, Museums Victoria, GPO Box 666, Melbourne, Victoria 3001, Australia.
Phylum Annelida are ubiquitous metazoans found in almost every terrestrial and aquatic habitat on Earth. Historically, taxonomic studies on the phylum have been focused largely on its major groups, polychaetes, oligochaetes and leeches, so that while family-level keys for each group are available, no single-source identification guide exists to the world's annelid families. Here, the first illustrated linear key to annelid families is provided and family-level descriptions and diagnoses that distinguish individuals of each family from those of other families in the phylum are updated.
Metazoan cells signal to each other via direct contact between cell surface proteins (CSPs) and by interactions of CSP receptors with secreted ligands. CSP extracellular domain (ECD) interactions control organ development and physiology and are perturbed in disease states. However, because they cannot be accurately assessed using standard high-throughput screening techniques, they are underrepresented in protein interaction databases.
PeerJ
July 2025
Centre for Conservation Ecology and Genomics, University of Canberra, Canberra, ACT, Australia.
DNA barcoding is a widely used tool for species identification, with its reliability heavily dependent on reference databases. While the quality of these databases has long been debated, a critical knowledge gap remains in their comprehensive evaluation and comparison at regional scales. Marine metazoan species in the western and central Pacific Ocean (WCPO), a region characterized by high biodiversity and limited sequencing efforts, are an example of this gap.
Whole-genome sequencing provides lists of genes of putative relevance to organismal biology. However, in all metazoans, a large fraction of inferred genes has no known functions, in some cases with no orthologs in related species, and even orthology at the DNA-sequence level often not providing indisputable evidence of gene function. A first step towards resolving the functional features of gene encyclopedias in multicellular species is to evaluate the tissues in which individual genes are expressed.