Multimodal large language models (MLLMs) have recently shown significant advancements in video understanding, excelling in content reasoning and instruction-following tasks. However, hallucination, where models generate inaccurate or misleading content, remains underexplored in the video domain. Building on the observation that MLLM visual encoders often fail to distinguish visually different yet semantically similar video pairs, we introduce VIDHALLUC, the largest benchmark designed to examine hallucinations in MLLMs for video understanding. It consists of 5,002 videos, paired to highlight cases prone to hallucinations. VIDHALLUC assesses hallucinations across three critical dimensions: (1) action, (2) temporal sequence, and (3) scene transition. Comprehensive testing shows that most MLLMs are vulnerable to hallucinations across these dimensions. Furthermore, we propose DINO-HEAL, a training-free method that reduces hallucinations by incorporating spatial saliency from DINOv2 to reweight visual features during inference. Our results show that DINO-HEAL consistently improves performance on VIDHALLUC, achieving an average improvement of 3.02% in mitigating hallucinations across all tasks. Both the VIDHALLUC benchmark and DINO-HEAL code are available at https://people-robots.github.io/vidhalluc.
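The abstract does not give the DINO-HEAL formula, so the following is only a minimal sketch of one plausible reading of "reweighting visual features with spatial saliency". It assumes per-patch features from the MLLM's visual encoder and a non-negative per-patch saliency score (for example, derived from DINOv2 attention). The function name `reweight_features`, the mean-normalisation step, and the array shapes are all hypothetical and are not taken from the released code.

```python
import numpy as np

def reweight_features(features, saliency, eps=1e-8):
    """Scale each patch feature by its relative saliency (illustrative only).

    features: array of shape (num_patches, dim), visual encoder outputs.
    saliency: array of shape (num_patches,), non-negative saliency scores.
    Assumption: weights are normalised to have mean 1, so the overall
    feature scale is roughly preserved. The paper may use another scheme.
    """
    features = np.asarray(features, dtype=float)
    saliency = np.asarray(saliency, dtype=float)
    if features.ndim != 2 or saliency.shape != (features.shape[0],):
        raise ValueError("expected features (N, D) and saliency (N,)")
    if np.any(saliency < 0):
        raise ValueError("saliency scores must be non-negative")
    # Normalise so that an average-saliency patch keeps its original scale.
    weights = saliency / (saliency.mean() + eps)
    # Broadcast the per-patch weight across the feature dimension.
    return features * weights[:, None]

# Usage: three patches with 2-dimensional features.
feats = np.ones((3, 2))
sal = np.array([1.0, 2.0, 3.0])
reweighted = reweight_features(feats, sal)
```

Because the step only rescales features at inference time, a scheme of this kind needs no additional training, which matches the training-free property described above.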
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12408113 | PMC |
| http://dx.doi.org/10.1109/cvpr52734.2025.01281 | DOI Listing |
Mar Pollut Bull
September 2025
St Abbs Marine Station, The Harbour, St Abbs TD14 5PW, United Kingdom. Electronic address:
The offshore renewable energy industry is expanding rapidly due to decarbonisation commitments and the need for energy security. This will change the marine environment in ways that are not fully understood, including through an increase in subsea power cables. Movement of electricity through these cables generates an electromagnetic field (EMF), which might affect marine species.
Comput Biol Med
September 2025
School of Medicine, University of Western Australia, 35 Stirling Hwy, Crawley, 6009, WA, Australia; Harry Perkins Institute of Medical Research, 5 Robin Warren Dr, Murdoch, 6150, WA, Australia; Department of Cardiology, Fiona Stanley Hospital, 11 Robin Warren Dr, Murdoch, 6150, WA, Australia. Electr
Remote photoplethysmography (rPPG) promises to turn digital cameras into medical devices, with the measurement of heart rate and oxygen saturation and the diagnosis of arrhythmias already demonstrated. The face-centric nature of current rPPG techniques makes it difficult for open datasets to include subjects with clinically relevant cardiorespiratory conditions without sharing private medical information. The neck, with few identifiable characteristics, is well suited to overcoming this limitation, as it serves as a region of interest (ROI) for pulse detection during jugular venous examination, a common clinical technique.
JMIR Form Res
September 2025
Department of Orthopedics, The First People's Hospital of Guannan: Lianyun, Lianyungang, China.
Background: Adolescence is a critical period for lifelong health, which makes access to accurate and comprehensive sexuality education essential. As video platforms become a primary source of information for adolescents, the quality of their content significantly impacts their physical and mental health.
Objective: This study aimed to evaluate the quality, reliability, understandability, and actionability of adolescent sexuality education videos on major Chinese platforms (Bilibili, TikTok or Douyin, and Kwai), analyze associated user comment sentiment and topics, identify predictors of quality and reliability, and provide recommendations.
Psychometrika
September 2025
Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, USA.
Empathic accuracy (EA) is the ability to accurately understand another person's thoughts and feelings, which is crucial for social and psychological interactions. Traditionally, EA is assessed by comparing a perceiver's moment-to-moment ratings of a target's emotional state with the target's own self-reported ratings at corresponding time points. However, misalignments between these two sequences are common due to the complexity of emotional interpretation and individual differences in behavioral responses.