Interaction-Aware Transformer Network for Human-Object Interaction Detection.

Weibo Jiang, Weihong Ren, Jiandong Tian, Hanwei Ma, Bowen Chen, Honghai Liu

IEEE Trans Cybern

Published: September 2025

Human-object interaction (HOI) detection tackles the joint localization and classification of HOIs. Recent HOI detection methods are mainly based on transformer networks, in which explicit object-level priors (e.g., scene layout, object appearance, or category) are usually fed into the transformer to improve the object queries. Although these methods have achieved remarkable results, they pay too little attention to implicit action-level information, which is the fundamental element of HOI. In this work, we propose an interaction-aware transformer network (IATN) that obtains an interaction-aware query by jointly utilizing implicit action-level priors and explicit object-level priors. Specifically, we design an action-aware module (AAM) to aggregate implicit action priors from the scene level and the instance level, respectively. We then design an action-oriented graph (AOG), in which human and object features are the graph nodes and action semantics are the graph edges, to aggregate priors jointly from the action level and the object level. The resulting interaction-aware query is then used to obtain the HOI predictions. In addition, we leverage knowledge distillation to enhance the action-level priors by transferring the final HOI predictions to the intermediate features. Extensive experiments on the HICO-DET and V-COCO datasets verify the effectiveness of the proposed interaction-aware model.
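The abstract does not state how the final HOI predictions are transferred to the intermediate features. As a rough illustration only, the sketch below uses a generic temperature-scaled KL-divergence distillation loss, with the final prediction logits acting as the teacher and the intermediate action logits as the student. The function names, the temperature value, and the example logits are all hypothetical, and the softmax formulation is an assumption (action labels in HOI are often multi-label, in which case a sigmoid-based variant would be more natural).

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Generic KD loss: KL(teacher || student) on softened distributions.

    Assumption: this is a standard distillation loss, not necessarily the
    loss used in the paper. The T^2 factor is the usual gradient rescaling.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from final predictions
    q = softmax(student_logits, temperature)  # intermediate action predictions
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three action classes for one human-object pair.
final_hoi_logits = [2.0, 0.5, -1.0]
intermediate_logits = [0.3, 0.2, 0.1]
loss = distillation_loss(final_hoi_logits, intermediate_logits)
print(f"distillation loss: {loss:.4f}")
```

In training, a term like this would typically be added to the main detection loss so that the intermediate action features are pulled toward the final predictions; the teacher side is usually treated as a constant (no gradient) when doing so.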

DOI: http://dx.doi.org/10.1109/TCYB.2025.3587037

Publication Analysis

Top Keywords

interaction-aware transformer

transformer network

human-object interaction

hoi detection

object level

implicit action-level

interaction-aware query

action-level priors

hoi predictions

priors

Similar Publications

The cognitive impacts of large language model interactions on problem solving and decision making using EEG analysis.

Front Comput Neurosci

July 2025

Department of Engineering, The University of Hong Kong, Hong Kong, China.

Ting Jiang, Jihua Wu, Stephen C H Leung

Introduction: The increasing integration of large language models (LLMs) into human-AI collaboration necessitates a deeper understanding of their cognitive impacts on users. Traditional evaluation methods have primarily focused on task performance, overlooking the underlying neural dynamics during interaction.

Methods: In this study, we introduce a novel framework that leverages electroencephalography (EEG) signals to assess how LLM interactions affect cognitive processes such as attention, cognitive load, and decision-making.

Multi-camera spatiotemporal deep learning framework for real-time abnormal behavior detection in dense urban environments.

Sci Rep

July 2025

Physics Department, Science College, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Sai Babu Veesam, B Tarakeswara Rao, Zarina Begum, R S M Lakshmi Patibandla, Arvin Arun Dcosta

The emerging density in today's urban environments requires a strong multi-camera architecture for real-time abnormality detection and behavior analysis. Most of the existing methods tend to fail in detecting unusual behaviors due to occlusion, dynamic scene changes and high computational inefficiency. These failures often result in high rates of false positives and poor generalization for unseen anomalies.

Cross-Skeleton Interaction Graph Aggregation Network for Representation Learning of Mouse Social Behaviour.

IEEE Trans Image Process

January 2025

Feixiang Zhou, Xinyu Yang, Fang Chen, Long Chen, Zheheng Jiang

Automated social behaviour analysis of mice has become an increasingly popular research area in behavioural neuroscience. Recently, pose information (i.e.

Interformer: an interaction-aware model for protein-ligand docking and affinity prediction.

Nat Commun

November 2024

AI Lab, Tencent, Shenzhen, China.

Houtim Lai, Longyue Wang, Ruiyuan Qian, Junhong Huang, Peng Zhou

In recent years, the application of deep learning models to protein-ligand docking and affinity prediction, both vital for structure-based drug design, has garnered increasing interest. However, many of these models overlook the intricate modeling of interactions between ligand and protein atoms in the complex, consequently limiting their capacity for generalization and interpretability. In this work, we propose Interformer, a unified model built upon the Graph-Transformer architecture.
