Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

The cooperative, connected, and automated mobility (CCAM) infrastructure plays a key role in understanding and enhancing the environmental perception of autonomous vehicles (AVs) driving in complex urban settings. However, deploying CCAM infrastructure requires efficiently selecting the computational processing layer and deploying machine learning (ML) and deep learning (DL) models to achieve greater AV performance in complex urban environments. In this paper, we propose a computational framework and analyze the effectiveness of a custom-trained DL model (YOLOv8) deployed on diverse devices and settings across a vehicle-edge-cloud layered architecture. Our main focus is to understand the interplay between the DL model's accuracy and its execution time during deployment across the layered framework. We therefore investigate the trade-offs between accuracy and time that arise when the YOLOv8 model is deployed at each layer of the computational framework, considering the CCAM infrastructure, i.e., sensory devices, computation, and communication, at each layer. The findings reveal that the performance metrics (e.g., 0.842 mAP@0.5) of the deployed DL model remain consistent regardless of device type across all layers of the framework. However, inference times for object detection tasks vary substantially with the deployment environment: the Jetson AGX (non-GPU) reduces inference time by 72% relative to the Raspberry Pi (non-GPU), while the Jetson AGX Xavier (GPU) reduces inference time by 90% relative to the Jetson AGX ARMv8 (non-GPU). The paper also provides a complete comparison of average transfer, preprocessing, and total times across devices such as the Apple M2 Max, Intel Xeon, Tesla T4, NVIDIA A100, and Tesla V100. Our findings guide researchers and practitioners in selecting the most appropriate device type and environment for deploying DL models in production.
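
As an illustrative aside, the following minimal sketch shows how per-stage inference times and mAP@0.5 can be measured for a custom YOLOv8 model with the ultralytics Python package; "best.pt", "sample.jpg", and "data.yaml" are hypothetical placeholders, not artifacts from the paper.

    from ultralytics import YOLO

    # Load custom-trained YOLOv8 weights (hypothetical file name).
    model = YOLO("best.pt")

    # Single-image prediction; Results.speed reports per-stage times in ms.
    results = model("sample.jpg")
    print(results[0].speed)  # {'preprocess': ..., 'inference': ..., 'postprocess': ...}

    # Validation on a labelled dataset; box.map50 is mAP@0.5.
    metrics = model.val(data="data.yaml")
    print(f"mAP@0.5: {metrics.box.map50:.3f}")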

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11014187
DOI Listing: http://dx.doi.org/10.3390/s24072080

Publication Analysis

Top Keywords

jetson agx (12); deep learning (8); ccam infrastructure (8); complex urban (8); computational framework (8); time (8); time deployment (8); device type (8); non-gpu reducing (8); reducing inference (8)

Similar Publications

In this paper, we present a case study covering the implementation experience and a methodological framework, built on a comprehensive comparative analysis of the YOLOX and YOLOv12 object detection models for agricultural automation systems deployed on the Jetson AGX Orin edge computing platform. We examined the architectural differences between the models and their impact on detection capabilities in data-imbalanced potato-harvesting environments. Both models were trained on identical datasets with images capturing potatoes, soil clods, and stones, and their performance was evaluated through 30 independent trials under controlled conditions.
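
The snippet does not state how the 30 trials were aggregated; below is a minimal sketch of one reasonable treatment, pairing per-trial scores and testing the mean difference. The score arrays are synthetic placeholders, not the study's data.

    import numpy as np
    from scipy.stats import ttest_rel

    rng = np.random.default_rng(0)
    # Synthetic stand-ins for 30 per-trial mAP scores of each model;
    # real values would come from the evaluation harness.
    yolox = rng.normal(0.80, 0.02, 30)
    yolov12 = rng.normal(0.83, 0.02, 30)

    print(f"YOLOX:   {yolox.mean():.3f} +/- {yolox.std(ddof=1):.3f}")
    print(f"YOLOv12: {yolov12.mean():.3f} +/- {yolov12.std(ddof=1):.3f}")

    # Paired test, assuming the trials share controlled conditions.
    t, p = ttest_rel(yolov12, yolox)
    print(f"paired t-test: t={t:.2f}, p={p:.4g}")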

While Vision Transformers (ViTs) have shown consistent progress in computer vision, deploying them in real-time decision-making scenarios (< 1 ms) is challenging. Current computing platforms, such as CPUs, GPUs, and FPGA-based solutions, struggle to meet this deterministic low-latency requirement even with quantized ViT models. Some approaches use pruning or sparsity to reduce model size and latency, but this often comes at the cost of accuracy.
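
As a hedged illustration of the quantization route mentioned above, here is a minimal PyTorch sketch of post-training dynamic quantization applied to a ViT; torchvision's vit_b_16 serves purely as a stand-in architecture and this is not the cited paper's method.

    import torch
    from torchvision.models import vit_b_16

    model = vit_b_16(weights=None).eval()  # stand-in ViT architecture

    # Replace nn.Linear layers (the bulk of a ViT) with int8 dynamically
    # quantized versions; runs on CPU.
    qmodel = torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 3, 224, 224)
    with torch.inference_mode():
        out = qmodel(x)
    print(out.shape)  # torch.Size([1, 1000])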

AnimalRTPose: Faster cross-species real-time animal pose estimation.

Neural Netw

October 2025

School of Physics, Northeast Normal University, Renmin Street 5268, Changchun, Jilin, 130024, China.

Recent advancements in computer vision have facilitated the development of sophisticated tools for analyzing complex animal behaviors, yet the diversity of animal morphology and environmental complexities present significant challenges to real-time animal pose estimation. To address these challenges, we introduce AnimalRTPose, a one-stage model designed for cross-species real-time animal pose estimation. At its core, AnimalRTPose leverages CSPNeXt, a novel backbone network that integrates depthwise separable convolution with skip connections for high-frequency feature extraction, a channel attention mechanism (CAM) to enhance the fusion of high-frequency and low-frequency features, and spatial pyramid pooling (SPP) to capture multi-scale contextual information.
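
A minimal PyTorch sketch of the depthwise-separable-convolution-with-skip-connection pattern that the CSPNeXt description references; the layer sizes are arbitrary, not the paper's configuration.

    import torch
    import torch.nn as nn

    class DWSeparableBlock(nn.Module):
        def __init__(self, channels: int):
            super().__init__()
            # Depthwise: one 3x3 filter per channel (groups=channels).
            self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                       groups=channels, bias=False)
            # Pointwise: 1x1 conv mixes information across channels.
            self.pointwise = nn.Conv2d(channels, channels, 1, bias=False)
            self.bn = nn.BatchNorm2d(channels)
            self.act = nn.SiLU()

        def forward(self, x):
            out = self.act(self.bn(self.pointwise(self.depthwise(x))))
            return x + out  # skip connection

    x = torch.randn(1, 64, 56, 56)
    print(DWSeparableBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])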

Deploying deep learning models on edge devices offers advantages in terms of data security and communication latency. However, optimizing these models to achieve fast computing speeds without sacrificing accuracy can be challenging, especially in video surveillance applications where real-time processing is crucial. In this study, we formulate the deployment of gait recognition models as a multi-objective selection problem in which we seek to simultaneously minimize several objectives, such as latency and energy consumption, while maintaining accuracy.
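
A minimal sketch of the underlying selection idea: keep only Pareto-optimal variants under an accuracy floor. The candidate tuples and the floor are hypothetical, not the study's data.

    # Candidate variants: (name, latency_ms, energy_mJ, accuracy).
    candidates = [
        ("model_a", 12.0, 45.0, 0.91),
        ("model_b", 18.0, 30.0, 0.93),
        ("model_c", 25.0, 32.0, 0.90),
        ("model_d", 11.0, 50.0, 0.88),
    ]

    def pareto_front(items, acc_floor=0.90):
        # Drop variants below the accuracy floor, then keep those not
        # dominated on (latency, energy): dominated means another variant
        # is no worse on both objectives and strictly better on one.
        feasible = [c for c in items if c[3] >= acc_floor]
        return [
            c for c in feasible
            if not any(
                o[1] <= c[1] and o[2] <= c[2] and (o[1] < c[1] or o[2] < c[2])
                for o in feasible
            )
        ]

    print([name for name, *_ in pareto_front(candidates)])  # ['model_a', 'model_b']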

Accurate 3D object detection is crucial for autonomous vehicles (AVs) to navigate safely in complex environments. This paper introduces a novel fusion framework that integrates camera image-based and LiDAR data-based detection. Unlike conventional fusion approaches, which often struggle with feature misalignment, the proposed framework enhances spatial consistency and multi-level feature aggregation, significantly improving detection accuracy.
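
A generic sketch of the geometric step common to camera-LiDAR fusion, projecting LiDAR points into the image plane; the intrinsic matrix K, the extrinsic transform T, and the point cloud are hypothetical values, not the paper's calibration or method.

    import numpy as np

    K = np.array([[721.5, 0.0, 609.6],
                  [0.0, 721.5, 172.9],
                  [0.0, 0.0, 1.0]])  # camera intrinsics (hypothetical)
    T = np.eye(4)                    # LiDAR -> camera extrinsics (hypothetical)

    # Random LiDAR points (x, y, z) standing in for a real sweep.
    points = np.random.rand(100, 3) * [40.0, 10.0, 3.0]

    homo = np.hstack([points, np.ones((100, 1))])  # homogeneous coordinates
    cam = (T @ homo.T)[:3]                         # transform into camera frame
    cam = cam[:, cam[2] > 0]                       # keep points in front of camera
    uv = (K @ cam) / cam[2]                        # perspective projection
    print(uv[:2].T[:5])                            # first few pixel coordinates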
