Smart City Infrastructure Monitoring with a Hybrid Vision Transformer for Micro-Crack Detection.

Rashid Nasimov, Young Im Cho

Sensors (Basel)

Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 13120, Republic of Korea.

Published: August 2025

Innovative and reliable structural health monitoring (SHM) is indispensable for ensuring the safety, dependability, and longevity of urban infrastructure. However, conventional methods are inefficient, labor-intensive, and error-prone, particularly when detecting subtle structural anomalies such as micro-cracks. To address this issue, this study proposes a novel deep-learning framework based on a modified Detection Transformer (DETR) architecture. The framework is enhanced by integrating a Vision Transformer (ViT) backbone and a specially designed Local Feature Extractor (LFE) module. The proposed ViT-based DETR model leverages ViT's capability to capture global contextual information through its self-attention mechanism. The introduced LFE module significantly enhances the extraction and refinement of complex local spatial features in images. The LFE employs convolutional layers with residual connections and non-linear activations, facilitating efficient gradient propagation and reliable identification of micro-level defects. Experimental validation on the benchmark SDNET2018 dataset and a custom dataset of damaged bridge images demonstrates that the proposed Vision-Local Feature Detector (ViLFD) model outperforms existing approaches, including DETR variants and YOLO-based models (versions 5-9), establishing new state-of-the-art performance. The proposed model achieves superior accuracy (95.0%), precision (0.94), recall (0.93), F1-score (0.93), and mean Average Precision (mAP@0.5 = 0.89), confirming its capability to accurately and reliably detect subtle structural defects. The introduced architecture represents a significant advancement toward automated, precise, and reliable SHM solutions applicable in complex urban environments.
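
The abstract describes the LFE only in general terms (convolutional layers, residual connections, non-linear activations). As a rough illustration of that idea, and not the authors' implementation, the following pure-Python sketch builds a single-channel residual block of the form ReLU(x + conv(ReLU(conv(x)))). The function names, the 3x3 kernel size, the zero padding, and the hand-picked edge kernel are all assumptions made for this example; the actual layer counts, channel widths, and learned weights are not given in the abstract.

```python
def relu(feature_map):
    """Element-wise ReLU non-linearity on a 2D list of floats."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def conv3x3(feature_map, kernel):
    """Same-size 3x3 cross-correlation with zero padding (assumed setup)."""
    h, w = len(feature_map), len(feature_map[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:  # skip zero-padded border
                        acc += feature_map[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def residual_block(x, kernel1, kernel2):
    """Hypothetical LFE-style block: ReLU(x + conv(ReLU(conv(x))))."""
    branch = conv3x3(relu(conv3x3(x, kernel1)), kernel2)
    # The skip connection adds the input back, so the block only has to
    # learn a correction and gradients can flow through the identity path.
    summed = [[a + b for a, b in zip(rx, rb)] for rx, rb in zip(x, branch)]
    return relu(summed)


# Toy 5x5 "image" with a one-pixel-wide vertical line standing in for a crack.
image = [[0.0, 0.0, 1.0, 0.0, 0.0] for _ in range(5)]
# Hand-picked horizontal edge kernel (illustrative, not a learned weight).
edge = [[0, 0, 0], [-1, 2, -1], [0, 0, 0]]

if __name__ == "__main__":
    for row in residual_block(image, edge, edge):
        print(row)
```

With all-zero kernels the convolutional branch contributes nothing and the block returns its non-negative input unchanged, which is the identity behaviour that makes residual connections helpful for gradient propagation. With the edge kernel, the thin line is amplified while the flat background stays at zero.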

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12389922
DOI: http://dx.doi.org/10.3390/s25165079

Similar Publications

Deep feature engineering for accurate sperm morphology classification using CBAM-enhanced ResNet50.

PLoS One

September 2025

School of Computer Science, CHART Laboratory, University of Nottingham, Nottingham, United Kingdom.

Şafak Kılıç

Background and Objective: Male fertility assessment through sperm morphology analysis remains a critical component of reproductive health evaluation, as abnormal sperm morphology is strongly correlated with reduced fertility rates and poor assisted reproductive technology outcomes. Traditional manual analysis performed by embryologists is time-intensive, subjective, and prone to significant inter-observer variability, with studies reporting up to 40% disagreement between expert evaluators. This research presents a novel deep learning framework combining the Convolutional Block Attention Module (CBAM) with the ResNet50 architecture and advanced deep feature engineering (DFE) techniques for automated, objective sperm morphology classification.

3D-CNN Enhanced Multiscale Progressive Vision Transformer for AD Diagnosis.

IEEE J Biomed Health Inform

September 2025

Fei Huang, Nanguang Chen, Anqi Qiu

Vision Transformer (ViT) applied to structural magnetic resonance images has demonstrated success in the diagnosis of Alzheimer's disease (AD) and mild cognitive impairment (MCI). However, three key challenges have yet to be well addressed: 1) ViT requires a large labeled dataset to mitigate overfitting while most of the current AD-related sMRI data fall short in the sample sizes. 2) ViT neglects the within-patch feature learning, e.

Flexible and robust cell-type annotation for highly multiplexed tissue images.

Cell Syst

September 2025

Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA.

Huangqingbo Sun, Shiqiu Yu, Anna Martinez Casals, Anna Bäckström, Yuxin Lu

Identifying cell types in highly multiplexed images is essential for understanding tissue spatial organization. Current cell-type annotation methods often rely on extensive reference images and manual adjustments. In this work, we present a tool, the Robust Image-Based Cell Annotator (RIBCA), that enables accurate, automated, unbiased, and fine-grained cell-type annotation for images with a wide range of antibody panels without requiring additional model training or human intervention.

Temporal Modeling With Frozen Vision-Language Foundation Models for Parameter-Efficient Text-Video Retrieval.

IEEE Trans Neural Netw Learn Syst

September 2025

Leqi Shen, Tianxiang Hao, Tao He, Yifeng Zhang, Pengzhang Liu

Temporal modeling plays an important role in effectively adapting powerful pretrained text-image foundation models to text-video retrieval. However, existing methods often rely on additional heavy trainable modules, such as transformers or BiLSTMs, which are inefficient. In contrast, we avoid introducing such heavy components by leveraging frozen foundation models.

AI Model Based on Diaphragm Ultrasound to Improve the Predictive Performance of Invasive Mechanical Ventilation Weaning: Prospective Cohort Study.

JMIR Form Res

September 2025

Department of Critical Care Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangdong Provincial Geriatrics Institute, No. 106, Zhongshaner Rd, Guangzhou, 510080, China, 86 15920151904.

Feier Song, Huazhang Liu, Huan Ma, Xuanhui Chen, Shouhong Wang

Background: Point-of-care ultrasonography has become a valuable tool for assessing diaphragmatic function in critically ill patients receiving invasive mechanical ventilation. However, conventional diaphragm ultrasound assessment remains highly operator-dependent and subjective. Previous research introduced automatic measurement of diaphragmatic excursion and velocity using 2D speckle-tracking technology.
