Recently, generating dense maps in real time has become a hot research topic in the mobile robotics community, since dense maps provide more informative and continuous features than sparse maps. Implicit depth representations (e.g., depth codes) derived from deep neural networks have been employed in visual-only or visual-inertial simultaneous localization and mapping (SLAM) systems, which achieve promising performance on both camera motion and local dense geometry estimation from monocular images. However, the existing visual-inertial SLAM systems combined with depth codes are either built on a filter-based SLAM framework, which can only update poses and maps in a relatively small local time window, or on a loosely-coupled framework, in which the prior geometric constraints from the depth estimation network are not exploited to boost state estimation. To address these drawbacks, we propose DiT-SLAM, a novel real-time Dense visual-inertial SLAM system with Implicit depth representation and Tightly-coupled graph optimization. Most importantly, the poses, sparse maps, and low-dimensional depth codes are optimized in a tightly-coupled graph that considers the visual, inertial, and depth residuals simultaneously. Meanwhile, we propose a light-weight monocular depth estimation and completion network, which combines attention mechanisms with a conditional variational auto-encoder (CVAE) to predict uncertainty-aware dense depth maps from lower-dimensional codes. Furthermore, a robust point sampling strategy that accounts for the spatial distribution of 2D feature points is proposed to provide geometric constraints in the tightly-coupled optimization, especially in textureless or featureless indoor environments. We evaluate our system on open benchmarks. The proposed methods achieve better performance on both dense depth estimation and trajectory estimation than the baseline and other systems.
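To make the "tightly-coupled" idea in the abstract above concrete, here is a minimal sketch in which a toy pose and a one-dimensional depth code are refined jointly by minimizing stacked visual, inertial, and depth residuals. All measurement values, weights, and the `decode_depth` stand-in for the CVAE decoder are hypothetical placeholders, not DiT-SLAM's actual factors.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy measurements (hypothetical values for illustration only).
obs_px    = np.array([1.0, 2.0])   # "visual" observation of the pose
imu_pose  = np.array([0.9, 2.1])   # pose predicted by IMU pre-integration
obs_depth = np.array([2.2])        # sparse depth measurement

def decode_depth(code):
    # Stand-in for the CVAE decoder: maps a low-dimensional code to depth.
    return 2.0 + 0.5 * code

def residuals(x):
    pose, code = x[:2], x[2]
    r_vis = pose - obs_px                    # visual term
    r_imu = pose - imu_pose                  # inertial term
    r_dep = decode_depth(code) - obs_depth   # depth-code term
    # The weights play the role of each factor's information matrix.
    return np.concatenate([1.0 * r_vis, 0.5 * r_imu, 2.0 * r_dep])

x0  = np.zeros(3)                   # [pose_x, pose_y, depth_code]
sol = least_squares(residuals, x0)  # joint (tightly-coupled) update
print(sol.x)
```

Because all three residual blocks share one state vector, a depth-code update can correct the pose and vice versa; in a loosely-coupled design each block would be solved separately.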
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9102487 | PMC
http://dx.doi.org/10.3390/s22093389 | DOI Listing
Sensors (Basel)
August 2025
James Watt School of Engineering, University of Glasgow, Glasgow G12 8QQ, UK.
Visual-Inertial Odometry (VIO) systems often suffer from degraded performance in environments with low texture. Although some previous works have combined line features with point features to mitigate this problem, the line features still degrade under more challenging conditions, such as varying illumination. To tackle this, we propose DeepLine-VIO, a robust VIO framework that integrates learned line features extracted via an attraction-field-based deep network.
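For context on how line features typically enter a VIO cost function, the sketch below shows the common point-to-line residual: the signed distances of a line's two projected endpoints to the observed 2D line. This is a generic formulation, not necessarily DeepLine-VIO's exact factor, and the line and endpoint values are hypothetical.

```python
import numpy as np

def line_residual(l, p1, p2):
    # Signed point-to-line distances of the projected endpoints p1, p2
    # (pixel coords) to the observed line l = (a, b, c), normalized so
    # that a^2 + b^2 = 1.
    a, b, c = l
    d1 = a * p1[0] + b * p1[1] + c
    d2 = a * p2[0] + b * p2[1] + c
    return np.array([d1, d2])

# Hypothetical observed line x - y = 0 and two projected endpoints.
l = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
print(line_residual(l, np.array([0.0, 0.1]), np.array([1.0, 0.9])))
```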
View Article and Find Full Text PDF

Sensors (Basel)
July 2025
School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China.
Simultaneous Localization and Mapping (SLAM) remains challenging in dynamic environments. Recent approaches that combine deep learning with SLAM for dynamic scenes fall into two types: faster but less accurate object-detection-based methods, and highly accurate but computationally costly instance-segmentation-based methods. In addition, maps that lack semantic information hinder robots from understanding their environment and performing complex tasks.
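The detection-based variant the abstract mentions usually reduces to discarding feature points that fall inside detected dynamic-object boxes, as in the hedged sketch below (a generic illustration with synthetic values, not this paper's pipeline).

```python
import numpy as np

def filter_dynamic(points, boxes):
    # Drop feature points inside any detected dynamic-object bounding
    # box (x_min, y_min, x_max, y_max). This is the cheap detection-based
    # filtering the abstract contrasts with per-pixel segmentation.
    keep = np.ones(len(points), dtype=bool)
    for (x0, y0, x1, y1) in boxes:
        inside = ((points[:, 0] >= x0) & (points[:, 0] <= x1) &
                  (points[:, 1] >= y0) & (points[:, 1] <= y1))
        keep &= ~inside
    return points[keep]

pts = np.array([[10.0, 10.0], [50.0, 60.0]])
print(filter_dynamic(pts, [(40, 40, 80, 80)]))  # keeps only the static point
```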
View Article and Find Full Text PDF

Sensors (Basel)
June 2025
School of Mechanical and Electrical Engineering, Shenzhen Polytechnic University, Shenzhen 518055, China.
This paper presents SE2-LET-VINS, an enhanced Visual-Inertial Simultaneous Localization and Mapping (VI-SLAM) system built upon the classic Visual-Inertial Navigation System for Monocular Cameras (VINS-Mono) framework, designed to improve localization accuracy and robustness in complex environments. By integrating a Lightweight Neural Network (LET-NET) for high-quality feature extraction with Special Euclidean Group in 2D (SE2) optical flow tracking, the system achieves superior performance in challenging scenarios such as low lighting and rapid motion. The proposed method processes Inertial Measurement Unit (IMU) and camera data, utilizing pre-integration and RANdom SAmple Consensus (RANSAC) for precise feature matching.
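As a rough illustration of the RANSAC matching step, the sketch below uses OpenCV's robust homography fitting to reject outlier correspondences; this is a generic formulation with synthetic points, not the SE2-LET-VINS pipeline itself.

```python
import cv2
import numpy as np

# Synthetic matched keypoints from two frames (hypothetical values):
# pts2 is pts1 under near-identity motion plus noise, with a few gross
# outliers injected for RANSAC to reject.
rng  = np.random.default_rng(0)
pts1 = (rng.random((100, 2)) * 640).astype(np.float32)
pts2 = pts1 + rng.normal(0, 1, (100, 2)).astype(np.float32)
pts2[:10] += 50.0  # gross outliers

# Matches inconsistent with the estimated homography (reprojection
# error > 3 px) are flagged as outliers in `mask`.
H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)
inliers1 = pts1[mask.ravel() == 1]
print(len(inliers1), "of", len(pts1), "matches kept")
```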
View Article and Find Full Text PDF

Sensors (Basel)
April 2025
School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China.
Monocular visual-inertial odometry based on the MSCKF algorithm has demonstrated computational efficiency even with limited resources. The MSCKF-VIO is primarily designed for localization tasks, in which environmental features such as points, lines, and planes are tracked across consecutive images. These tracked features are subsequently triangulated using the historical IMU/camera poses in the state vector to perform measurement updates.
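The triangulation step the abstract refers to is commonly the linear (DLT) method: given two historical camera poses and the feature's pixel observations, solve a small homogeneous system for the 3D point. A minimal sketch with hypothetical poses and observations, not this paper's exact implementation:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    # Linear (DLT) triangulation of one point from two 3x4 projection
    # matrices P1, P2 and normalized pixel observations x1, x2 -- the
    # standard step applied to tracked features before a filter update.
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Hypothetical poses: identity camera and a camera shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X  = triangulate(P1, P2, np.array([0.0, 0.0]), np.array([-0.5, 0.0]))
print(X)  # ~ [0, 0, 2]
```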
View Article and Find Full Text PDF

Sensors (Basel)
January 2025
Department of Mechanical Engineering and Mechanics (MEM), Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA.
The maturity of simultaneous localization and mapping (SLAM) methods has now reached a level that motivates in-depth, problem-specific reviews. The focus of this study is to investigate the evolution of vision-based, LiDAR-based, and combined methods and to evaluate their performance in enclosed, GPS-denied (EGD) conditions for infrastructure inspection. This paper categorizes and analyzes the SLAM methods in detail, considering sensor-fusion type and chronological order.
View Article and Find Full Text PDF