Article Abstract

Computer vision has increasingly shown potential to improve data processing efficiency in ecological research. However, training computer vision models requires large amounts of high-quality, annotated training data. This poses a significant challenge for researchers looking to create bespoke computer vision models, as substantial human resources and biological replicates are often needed to adequately train these models. Synthetic images have been proposed as a potential solution for generating large training datasets, but models trained with synthetic images often generalize poorly to real photographs. Here we present a modular pipeline for training generalizable classification models using synthetic images. Our pipeline includes 3D asset creation using 3D scanners, synthetic image generation with open-source computer graphics software, and domain adaptive classification model training. We demonstrate our pipeline by applying it to skulls of 16 mammal species in the order Carnivora. We explore several domain adaptation techniques, including maximum mean discrepancy (MMD) loss, fine-tuning, and data supplementation. Using our pipeline, we were able to improve classification accuracy on real photographs from 55.4% to a maximum of 95.1%. We also conducted qualitative analysis with t-distributed stochastic neighbor embedding (t-SNE) and gradient-weighted class activation mapping (Grad-CAM) to compare different domain adaptation techniques. Our results demonstrate the feasibility of using synthetic images for ecological computer vision and highlight the potential of museum specimens and 3D assets for scalable, generalizable model training.
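
As an illustration of the maximum mean discrepancy (MMD) idea mentioned in the abstract, the following is a minimal sketch of a biased squared-MMD estimate with a Gaussian (RBF) kernel, computed between two sets of feature vectors. It is not the authors' implementation: the kernel choice, the bandwidth `sigma`, the helper names (`rbf_kernel_mean`, `mmd2`, `sample_features`), and the randomly generated "features" are all assumptions made for the example. In a domain-adaptive training setup, a quantity of this kind would typically be added to the classification loss so that synthetic-image and real-photograph features are pulled toward the same distribution.

```python
import math
import random


def rbf_kernel_mean(xs, ys, sigma):
    """Average Gaussian (RBF) kernel value over all pairs (x, y)."""
    total = 0.0
    for x in xs:
        for y in ys:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total / (len(xs) * len(ys))


def mmd2(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two feature sets.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    """
    return (
        rbf_kernel_mean(source, source, sigma)
        + rbf_kernel_mean(target, target, sigma)
        - 2.0 * rbf_kernel_mean(source, target, sigma)
    )


def sample_features(rng, mean, n=100, dim=8):
    """Hypothetical stand-in for features extracted by a network."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
synthetic = sample_features(rng, mean=0.0)     # "synthetic image" features
real_matched = sample_features(rng, mean=0.0)  # same distribution
real_shifted = sample_features(rng, mean=1.5)  # shifted distribution

matched_gap = mmd2(synthetic, real_matched, sigma=2.0)
shifted_gap = mmd2(synthetic, real_shifted, sigma=2.0)
print(f"MMD^2, matched domains: {matched_gap:.4f}")
print(f"MMD^2, shifted domains: {shifted_gap:.4f}")
```

The estimate is larger when the two feature sets come from different distributions, which is why minimizing it during training can reduce the gap between the synthetic and real domains.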

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12407421
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0329482

Similar Publications

Primary agricultural products are closely related to our daily lives, as they serve not only as raw materials for food processing but also as products directly purchased by consumers. These products face the issue of freshness decline and spoilage during both production and consumption. Freshness degradation induces sensory deterioration and nutritional loss and promotes harmful substance accumulation, causing gastrointestinal issues or even endangering life.

Background: Guidelines recommend leaving in situ rectosigmoid polyps diagnosed during colonoscopy that are 5 mm or smaller if the endoscopist optically predicts them to be non-neoplastic. However, no randomised controlled trial has been done to examine the efficacy and safety of this strategy.

Methods: This open-label, multicentre, non-inferiority, randomised controlled trial enrolled adults aged 18 years or older undergoing colonoscopy for screening, surveillance, or clinical indications across four Italian centres.

Crossmodal correspondences - systematic mappings between stimulus attributes in different modalities - are ubiquitous in the general population. For example, high-pitched (vs low-pitched) sounds are commonly associated with elevated (vs low) positions in space, and rounded (vs angular) shapes tend to be linked to the term 'Bouba' (vs 'Kiki'). There is still some debate about the role of immediate sensory experience versus conceptual colour understanding in crossmodal correspondences.

Artificial intelligence (AI) and machine learning (ML) are rapidly transforming healthcare, with growing interest in their application to rare pediatric surgical conditions. In these settings, limited data availability often hampers traditional research. Although pediatric surgery has historically been slower than other specialties in adopting ML, recent years have seen an increase in AI-driven tools designed for surgical care.

Vision foundation models have demonstrated vast potential in achieving generalist medical segmentation capability, providing a versatile, task-agnostic solution through a single model. However, current generalist models involve simple pre-training on various medical data containing irrelevant information, often resulting in the negative transfer phenomenon and degenerated performance. Furthermore, the practical applicability of foundation models across diverse open-world scenarios, especially in out-of-distribution (OOD) settings, has not been extensively evaluated.
