Target-aware transformer tracking with hard occlusion instance generation.

Front Neurorobot

Key Laboratory of Precision Opto-Mechatronics Technology, Ministry of Education, School of Instrumentation and Opto-Electronics Engineering, Beihang University, Beijing, China.

Published: January 2024


Citations

20

Article Abstract

Visual tracking is a crucial task in computer vision that has been applied in diverse fields. Recently, the transformer architecture has been widely adopted in visual tracking and has replaced the Siamese structure as the mainstream framework. Although transformer-based trackers have demonstrated remarkable accuracy in general circumstances, their performance in occluded scenes remains unsatisfactory. This is primarily because they cannot recognize the incomplete appearance information that remains when the target is occluded. To address this issue, we propose a novel transformer tracking approach, referred to as TATT, which integrates a target-aware transformer network and a hard occlusion instance generation module. The target-aware transformer network uses an encoder-decoder structure to facilitate interaction between template and search features, extracting target information from the template feature to enhance the unoccluded parts of the target in the search features. It directly predicts the boundary between the target region and the background to generate tracking results. The hard occlusion instance generation module employs multiple image similarity calculation methods to select the image patch in a video sequence that is most similar to the target, and uses it to generate an occlusion instance that mimics real scenes without adding an extra network. Experiments on five benchmarks (LaSOT, TrackingNet, Got10k, OTB100, and UAV123) demonstrate that our tracker achieves promising performance while running at approximately 41 fps on GPU. Specifically, our tracker achieves the highest AUC scores of 65.5% and 61.2% in the partial and full occlusion evaluations on LaSOT, respectively.
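The hard occlusion instance generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measures, the fusion rule, or the pasting strategy, so the choices below (normalized cross-correlation and histogram intersection, averaged with equal weight, a single-frame search, and pasting over the left part of the target) are assumptions made for illustration only. All function names are hypothetical, and images are plain grayscale lists of rows so that the example needs only the standard library.

```python
import math

def crop(img, x, y, w, h):
    """Return the h-by-w window of a 2D grayscale image (list of rows)."""
    return [row[x:x + w] for row in img[y:y + h]]

def flatten(patch):
    return [v for row in patch for v in row]

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1]; 0 if either patch is flat."""
    fa, fb = flatten(a), flatten(b)
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((p - ma) * (q - mb) for p, q in zip(fa, fb))
    den = math.sqrt(sum((p - ma) ** 2 for p in fa) * sum((q - mb) ** 2 for q in fb))
    return num / den if den > 0 else 0.0

def hist_intersection(a, b, bins=8, max_val=256):
    """Histogram intersection in [0, 1] for intensities in [0, max_val)."""
    def hist(p):
        counts = [0] * bins
        vals = flatten(p)
        for v in vals:
            counts[min(int(v * bins / max_val), bins - 1)] += 1
        return [c / len(vals) for c in counts]
    return sum(min(p, q) for p, q in zip(hist(a), hist(b)))

def similarity(a, b):
    """Assumed fusion rule: equal-weight average, with NCC rescaled to [0, 1]."""
    return 0.5 * ((ncc(a, b) + 1) / 2) + 0.5 * hist_intersection(a, b)

def boxes_overlap(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x1 < x2 + w2 and x2 < x1 + w1 and y1 < y2 + h2 and y2 < y1 + h1

def most_similar_patch(frame, target_box, stride=1):
    """Scan target-sized windows that do not overlap the target; return the best (box, score)."""
    x, y, w, h = target_box
    target = crop(frame, x, y, w, h)
    best_box, best_score = None, -1.0
    for cy in range(0, len(frame) - h + 1, stride):
        for cx in range(0, len(frame[0]) - w + 1, stride):
            cand = (cx, cy, w, h)
            if boxes_overlap(cand, target_box):
                continue
            score = similarity(target, crop(frame, cx, cy, w, h))
            if score > best_score:
                best_box, best_score = cand, score
    return best_box, best_score

def make_occlusion_instance(frame, target_box, occluder_box, cover=0.5):
    """Copy the frame and paste the occluder over the left `cover` fraction of the target."""
    x, y, w, h = target_box
    ox, oy, _, _ = occluder_box
    out = [row[:] for row in frame]
    cols = max(1, int(round(w * cover)))
    for dy in range(h):
        for dx in range(cols):
            out[y + dy][x + dx] = frame[oy + dy][ox + dx]
    return out

# Toy 6x8 frame: a 2x2 target at (1, 1) and a look-alike distractor at (5, 3).
frame = [[0] * 8 for _ in range(6)]
frame[1][1:3], frame[2][1:3] = [10, 200], [200, 10]
frame[3][5:7], frame[4][5:7] = [20, 210], [210, 20]
target_box = (1, 1, 2, 2)
occluder_box, score = most_similar_patch(frame, target_box)
hard_instance = make_occlusion_instance(frame, target_box, occluder_box, cover=0.5)
```

In the toy frame, the search picks the look-alike distractor rather than the flat background, which is what makes the resulting occlusion "hard": the pasted occluder resembles the target it partly covers. A real pipeline would search across frames of the video sequence and operate on color images; both are outside the scope of this sketch.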

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10806154
DOI: http://dx.doi.org/10.3389/fnbot.2023.1323188

Publication Analysis

Top Keywords

occlusion instance: 16
target-aware transformer: 12
hard occlusion: 12
instance generation: 12
transformer tracking: 8
tracking hard: 8
visual tracking: 8
transformer network: 8
generation module: 8
search features: 8
