Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification.

Yanping Li, Yizhang Liu, Hongyun Zhang, Cairong Zhao, Zhihua Wei, Duoqian Miao

IEEE Trans Image Process

Published: May 2024

Person re-identification (ReID) typically encounters varying degrees of occlusion in real-world scenarios. While previous methods have addressed this using handcrafted partitions or external cues, they often compromise semantic information or increase network complexity. In this paper, we propose a new method, termed OAT, that approaches the problem from a different perspective. Specifically, we first use a Transformer backbone with multiple class tokens for diverse pedestrian feature learning. Because the self-attention mechanism in the Transformer focuses solely on low-level feature correlations and neglects higher-order relations among different body parts or regions, we propose the Second-Order Attention (SOA) module to capture more comprehensive features. To keep the computation efficient, we further derive approximation formulations for implementing second-order attention. Observing that the importance of the semantics associated with different class tokens varies with the uncertain location and size of the occlusion, we propose the Entropy Guided Fusion (EGF) module for multiple class tokens. An uncertainty analysis is performed on each class token: tokens with lower information entropy receive higher weights, and tokens with higher entropy receive lower weights. This dynamic weight adjustment mitigates the impact of occlusion-induced uncertainty on feature learning and thereby helps the model learn discriminative class token representations. Extensive experiments on occluded and holistic person re-identification datasets demonstrate the effectiveness of the proposed method.
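
The entropy-based weighting described in the abstract can be illustrated with a minimal sketch. This is not the authors' EGF module: the abstract does not give its formulas, so the sketch assumes that each class token has its own classification logits, that uncertainty is the Shannon entropy of the softmax over those logits, and that fusion weights come from a softmax over the negated entropies. The function names (`softmax`, `entropy`, `entropy_guided_fusion`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_guided_fusion(token_features, token_logits):
    """Fuse class-token features, favouring low-entropy tokens.

    Assumption (not from the paper): weights = softmax(-entropy).
    token_features: one feature vector per class token.
    token_logits:   one vector of identity logits per class token.
    """
    entropies = [entropy(softmax(logits)) for logits in token_logits]
    weights = softmax([-h for h in entropies])
    dim = len(token_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, token_features))
        for d in range(dim)
    ]
    return fused, weights, entropies


# Toy example: token 0 is confident, token 1 is maximally uncertain.
features = [[1.0, 0.0], [0.0, 1.0]]
logits = [[5.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
fused, weights, entropies = entropy_guided_fusion(features, logits)
print(weights)  # the first weight is the larger of the two
```

In this toy case the second token, whose prediction is uniform (as might happen when its region is occluded), contributes less to the fused feature than the confident first token.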

DOI: http://dx.doi.org/10.1109/TIP.2024.3393360

Top Keywords: class tokens; second-order attention; person re-identification; multiple class; feature learning; class token; weights assigned; class; occlusion-aware transformer; transformer second-order

Similar Publications

PM: A new prompting multi-modal model paradigm for few-shot medical image classification.

Comput Methods Programs Biomed

September 2025

Key Laboratory of Social Computing and Cognitive Intelligence (Ministry of Education), Dalian University of Technology, Dalian, 116024, China; School of Computer Science and Technology, Dalian University of Technology, Dalian, 116024, China.

Zhenwei Wang, Qiule Sun, Bingbing Zhang, Pengfei Wang, Jianxin Zhang

Background And Objective: Few-shot learning has emerged as a key technological solution to address challenges such as limited data and the difficulty of acquiring annotations in medical image classification. However, relying solely on a single image modality is insufficient to capture conceptual categories. Therefore, medical image classification requires a comprehensive approach to capture conceptual category information that aids in the interpretation of image content.

Diagnosing autism spectrum disorders using a double deep Q-Network framework based on social media footprints.

Front Med (Lausanne)

August 2025

Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia.

Nesren S Farhah, Ahmed Abdullah Alqarni, Nadhem Ebrahim, Sultan Ahmad

Introduction: Social media is increasingly used in many contexts within the healthcare sector. The growing prevalence of Internet use via computers and mobile devices presents an opportunity for social media to serve as a tool for the rapid and direct distribution of essential health information. Autism spectrum disorders (ASD) are a complex group of neurodevelopmental conditions with enduring effects.

DSTANet: A Lightweight and High-Precision Network for Fine-Grained and Early Identification of Maize Leaf Diseases in Field Environments.

Sensors (Basel)

August 2025

College of Electrical Engineering and Information, Northeast Agricultural University, Harbin 150030, China.

Xinyue Gao, Lili He, Yinchuan Liu, Jiaxin Wu, Yuying Cao

Early and accurate identification of maize diseases is crucial for ensuring sustainable agricultural development. However, existing maize disease identification models face challenges including high inter-class similarity, intra-class variability, and limited capability in identifying early-stage symptoms. To address these limitations, we proposed DSTANet (decomposed spatial token aggregation network), a lightweight and high-performance model for maize leaf disease identification.

Automatic analysis of negation cues and scopes for medical texts in French using language models.

Comput Biol Med

August 2025

Inria, Lyon Research Center, F-69603, Villeurbanne, France; AIstroSight, Inria, Université Claude Bernard Lyon 1, Hospices Civils de Lyon, Villeurbanne, F-69603, France.

S Sadoune, A Richard, F Talbot, T Guyet, L Boussel

Objective: Correct automatic analysis of a medical report requires the identification of negations and their scopes. Since most of the available training data comes from medical texts in English, applying existing methods to other languages usually takes additional work. Here, we introduce a supervised learning method that automatically identifies negation cues and determines their scopes in French medical reports using language models based on BERT.

High efficiency classification of thyroid cytopathological images based on knowledge distillation and vision transformer.

Sci Rep

August 2025

Department of Endocrinology and Metabolism, Shanghai Jiao Tong University School of Medicine Affiliated Sixth People's Hospital, Shanghai, 200233, China.

Jiazhe Zhang, Haolin Zhang, Peng Jiang, Qin Huang, Guangya Zhu

Thyroid cancer is one of the most common types of cancer, and pathological diagnosis based on Fine Needle Aspiration Cytology is the clinical standard for assessing it. However, the complex structure and large data volume of thyroid pathology images pose challenges for the accuracy and efficiency of automatic diagnosis. To address this practical problem, this paper proposes a knowledge distillation method called Multi-Dimensional Knowledge Distillation, which involves feature-based distillation and response-based distillation.
