Open vocabulary detection for concealed object detection in AMMW image.

Chenjiang Jiang, Chunyu Li, Xuejun Zhao

Sci Rep

Shanghai Key Laboratory of Crime Scene Evidence, Shanghai Research Institute of Criminal Science and Technology, Shanghai 200072, China.

Published: August 2025

Millimeter-wave imaging systems currently play a central role in security detection systems. Existing concealed object detectors for millimeter-wave images can only detect the categories they were trained on and fail when they encounter new, unseen categories. Accurately identifying the increasingly diverse types and shapes of concealed objects is therefore a pressing challenge. This paper proposes a novel open vocabulary detection algorithm, Open-MMW, which can recognize more diverse and untrained objects. This is the first time that open vocabulary detection has been introduced into the task of millimeter-wave image detection. We improve the YOLO-World detector framework by designing a Multi-Scale Convolution and a Task-Integrated Block to optimize feature extraction and detection accuracy. Additionally, the Text-Image Interaction Module leverages attention mechanisms to address the challenge of feature alignment between millimeter-wave images and text. Extensive experiments conducted on public and private datasets demonstrate the effectiveness of Open-MMW. Compared to the baseline model, Open-MMW improves recall by 13.7%, precision by 13.9%, mAP@0.5 by 14.2%, and mAP@[0.5-0.95] by 10.3%. The performance improvements are even more significant compared to state-of-the-art multimodal interaction models, showcasing powerful zero-shot detection capabilities not present in traditional closed-set detection.
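As a rough illustration of what separates open vocabulary detection from closed-set detection, the sketch below scores one region embedding against text embeddings of category names using cosine similarity, so a new category can be added at inference time without retraining. It is a generic sketch of region-text matching, not the Open-MMW architecture or its Text-Image Interaction Module; the function names, the toy three-dimensional embeddings, and the 0.5 threshold are all assumptions made for the example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_scores(region_embedding, text_embeddings):
    """Cosine similarity between one region embedding and each label's text embedding."""
    region = l2_normalize(region_embedding)
    return {
        label: sum(r * t for r, t in zip(region, l2_normalize(emb)))
        for label, emb in text_embeddings.items()
    }


def classify_region(region_embedding, text_embeddings, threshold=0.5):
    """Return (best_label, score), or (None, score) if no label reaches the threshold."""
    scores = cosine_scores(region_embedding, text_embeddings)
    best_label = max(scores, key=scores.get)
    if scores[best_label] < threshold:
        return None, scores[best_label]
    return best_label, scores[best_label]


# Hypothetical toy embeddings; a real system would obtain these from
# a text encoder and an image backbone rather than hand-written vectors.
vocabulary = {
    "knife": [1.0, 0.0, 0.0],
    "phone": [0.0, 1.0, 0.0],
}
region = [0.9, 0.1, 0.0]
label, score = classify_region(region, vocabulary)

# Extending the vocabulary needs only a new text embedding, not retraining.
vocabulary["lighter"] = [0.0, 0.0, 1.0]
```

Because the label set is just a dictionary of text embeddings, the same scoring function handles the added "lighter" entry without any change to the detector side, which is the property a closed-set classification head lacks.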

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12354676
DOI: http://dx.doi.org/10.1038/s41598-025-13935-y

Similar Publications

A Frontier Review of Semantic SLAM Technologies Applied to the Open World.

Sensors (Basel)

August 2025

School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.

Le Miao, Wen Liu, Zhongliang Deng

With the growing demand for autonomous robotic operations in complex and unstructured environments, traditional semantic SLAM systems, which rely on closed-set semantic vocabularies, are increasingly limited in their ability to robustly perceive and understand diverse and dynamic scenes. This paper focuses on the paradigm shift toward open-world semantic scene understanding in SLAM and provides a comprehensive review of the technological evolution from closed-world assumptions to open-world frameworks. We survey the current state of research in open-world semantic SLAM, highlighting key challenges and frontiers.

From pilot to practice: a scoping review protocol mapping the development of AI-enabled solutions for maternal health using technology readiness levels.

BMJ Open

August 2025

Digital Global Public Health, Hasso-Plattner-Institut, University Potsdam, Potsdam, Germany.

Nico Marquardt, Vladimir Choi, Charles Martyn-Dickens, Marelize Gorgens, Sam Mathewlynn

Introduction: Maternal mortality remains a critical public health challenge in low- and middle-income countries (LMICs), where over 92% of global maternal deaths occur. Artificial intelligence (AI)-enabled solutions are increasingly recognised for their potential to improve and expand health services delivered to women. Such solutions can accelerate how health systems address gaps in maternal healthcare, including prevention, early detection, intervention and treatment.

Effects of Elicitation Method on Functionally Relevant Item Selection in Spanish and English Monolinguals and Bilinguals.

Int J Lang Commun Disord

August 2025

North Alabama Medical Center, Florence, Alabama, USA.

Dallin J Bailey, Esther Barahona Wilkes

Purpose: One important decision speech language pathologists make when planning anomia treatment is the identification and selection of the specific vocabulary items to target during therapy. However, this process is not entirely straightforward. Although 'functional relevance' has high face validity for the identification of target items, interpretations differ, which may impact which words are selected for therapy.

Sentence recognition in quiet and amidst single-talker babble in Chinese kindergarten-aged children with cochlear implants.

J Acoust Soc Am

August 2025

Department of Speech-Language-Hearing Sciences and Center for Neurobehavioral Development, University of Minnesota, Minneapolis, USA.

Linjun Zhang, Jiuju Wang, Tian Hong, Yang Zhao, Hua Shu

This study aimed to investigate open-set sentence recognition in quiet and amidst single-talker babble among Mandarin-speaking children with cochlear implants (CIs) and to identify the cognitive and linguistic factors that influence performance. Open-set sentence recognition was assessed in both conditions, alongside measurement of cognitive skills (operational efficiency and auditory short-term memory) and linguistic skills (oral vocabulary and syntactic competence) in kindergarten-aged children with CIs (n = 22; age = 59.8 ± 10.

UpGen: Unleashing Potential of Foundation Models for Training-Free Camouflage Detection via Generative Models.

IEEE Trans Image Process

August 2025

Ji Du, Jiesheng Wu, Desheng Kong, Weiyun Liang, Fangwei Hao

Camouflaged Object Detection (COD) aims to segment objects resembling their environment. To address the challenges of extensive annotations and complex optimizations in supervised learning, recent prompt-based segmentation methods excavate insightful prompts from Large Vision-Language Models (LVLMs) and refine them using various foundation models. These are subsequently fed into the Segment Anything Model (SAM) for segmentation.
