Citations

20

Article Abstract

Video classification, an essential task in computer vision, aims to automatically identify and label video content. However, current mainstream video classification models face two significant challenges in practical applications. First, classification accuracy is limited, mainly because of the complexity and diversity of video data, including subtle differences between categories, background interference, and illumination variations. Second, the number of trainable parameters is large, which lengthens training time and increases energy consumption. To address these problems, we propose the OM-Video Swin Transformer (OM-VST) model, which extends the Video Swin Transformer (VST) with a multi-scale feature fusion module and an optimized downsampling module to improve the model's ability to perceive and represent feature information. To verify its performance, we compared OM-VST with mainstream video classification models, such as VST, SlowFast, and TSM, on a public dataset. The results show that OM-VST improves accuracy by 2.81% while reducing the number of parameters by 54.7%.
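The parameter figure in the abstract is a relative reduction, and the arithmetic behind such a figure can be sketched as follows. This is only an illustration: the abstract does not give absolute parameter counts or name the baseline for the 54.7% figure, so the baseline of 100 million parameters and both helper names below are hypothetical.

```python
def relative_reduction(baseline_params: int, reduced_params: int) -> float:
    """Return the percentage reduction in parameter count relative to the baseline."""
    if baseline_params <= 0:
        raise ValueError("baseline_params must be positive")
    return 100.0 * (baseline_params - reduced_params) / baseline_params


def params_after_reduction(baseline_params: int, reduction_percent: float) -> int:
    """Return the parameter count left after a given percentage reduction."""
    if not 0.0 <= reduction_percent <= 100.0:
        raise ValueError("reduction_percent must be between 0 and 100")
    return round(baseline_params * (1.0 - reduction_percent / 100.0))


# Hypothetical baseline size, chosen only to make the arithmetic easy to follow.
# It is NOT a figure from the article.
baseline = 100_000_000

# 54.7% is the reduction reported in the abstract.
reduced = params_after_reduction(baseline, 54.7)

print(f"baseline: {baseline:,} parameters")
print(f"reduced:  {reduced:,} parameters")
print(f"relative reduction: {relative_reduction(baseline, reduced):.1f}%")
```

Under this assumed baseline, a 54.7% reduction leaves a little under half of the original parameters. The same two helpers apply to any baseline once the real counts are known.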

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11884693
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0318884

Publication Analysis

Top Keywords

video classification: 16
om-vst model: 12
optimized downsampling: 8
downsampling module: 8
multi-scale feature: 8
feature fusion: 8
mainstream video: 8
classification models: 8
model training: 8
swin transformer: 8

Similar Publications

AI Model Based on Diaphragm Ultrasound to Improve the Predictive Performance of Invasive Mechanical Ventilation Weaning: Prospective Cohort Study.

JMIR Form Res

September 2025

Department of Critical Care Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangdong Provincial Geriatrics Institute, No. 106, Zhongshaner Rd, Guangzhou, 510080, China, 86 15920151904.

Background: Point-of-care ultrasonography has become a valuable tool for assessing diaphragmatic function in critically ill patients receiving invasive mechanical ventilation. However, conventional diaphragm ultrasound assessment remains highly operator-dependent and subjective. Previous research introduced automatic measurement of diaphragmatic excursion and velocity using 2D speckle-tracking technology.

For effective treatment of bacterial infections, it is essential to identify the species causing the infection as early as possible. Current methods typically require hours of overnight culturing of a bacterial sample and a larger quantity of cells to function effectively. This study uses one-hour phase-contrast time-lapses of single-cell bacterial growth collected from microfluidic chip traps, also known as a "mother machine".

Emotion annotation in code-mixed languages like Hinglish (Hindi-English) presents unique challenges due to linguistic complexity and resource constraints. This study introduces a hybrid active learning framework that combines lexical rules, machine learning, and iterative expert feedback to achieve cost-efficient, high-accuracy emotion annotation. Grounded in psychological theories of emotion, including Discrete Emotions Theory and Cognitive Appraisal Theory, the framework employs bilingual emotion dictionaries (e.

The application of the clinical nursing pathway in the anesthesia recovery room is of great significance for improving nursing quality and reducing the incidence of complications. However, the influence of the clinical nursing pathway construction scheme and implementation path on patient outcomes in the anesthesia recovery room is not clear. In this study, 200 patients in the surgical anesthesia recovery room, aged 50 to 70 years old and graded as American Society of Anesthesiologists Physical Status Classification System (ASA) II-III, were randomly divided into the control group (n=100) and the interventional group (n=100).

Objective: To evaluate the diagnostic performance of a combined model incorporating ultrasound video-based radiomics features and clinical variables for distinguishing between benign and malignant breast lesions. Methods: A total of 346 patients (173 benign and 173 malignant) were retrospectively enrolled. Breast ultrasound videos were acquired and processed using semi-automatic segmentation in 3D Slicer.