Citations

20

Article Abstract

Human-object interaction (HOI) detection tackles the problem of jointly localizing and classifying HOIs. Recent HOI detection methods are mainly based on transformer networks, where explicit object-level priors (e.g., scene layout, object appearance, or category) are usually fed into the transformer to improve the object queries. Although these methods have achieved remarkable results, they pay too little attention to implicit action-level information, which is the fundamental element of HOI. In this work, we propose an interaction-aware transformer network (IATN) that obtains an interaction-aware query by jointly utilizing implicit action-level priors and explicit object-level priors. Specifically, we design an action-aware module (AAM) to aggregate implicit action priors from the scene level and the instance level. We then design an action-oriented graph (AOG), in which human and object features are the graph nodes and action semantics are the graph edges, to aggregate priors jointly from the action level and the object level. The resulting interaction-aware query is used to produce the HOI predictions. In addition, we leverage knowledge distillation to enhance the action-level priors by transferring the final HOI predictions to the intermediate features. Extensive experiments on the HICO-DET and V-COCO datasets verify the effectiveness of the proposed interaction-aware model.
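
The distillation step mentioned in the abstract can be illustrated with a generic, temperature-softened distillation loss. This is only a sketch under stated assumptions, not the formulation used in the paper: the abstract gives no loss function, so the KL-divergence form, the temperature value, the function names, and the toy logits below are all hypothetical. The final HOI prediction plays the role of the teacher and a prediction made from intermediate features plays the role of the student.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(final_logits, intermediate_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the teacher is the final HOI prediction and the student
    is a prediction from intermediate features. The T^2 factor is the
    conventional scaling for softened targets, not a detail taken from
    the paper.
    """
    teacher = softmax(final_logits, temperature)
    student = softmax(intermediate_logits, temperature)
    kl = sum(
        t * (math.log(t) - math.log(s))
        for t, s in zip(teacher, student)
        if t > 0.0
    )
    return kl * temperature ** 2


# Toy action logits for one human-object pair (hypothetical values).
final_logits = [2.0, 0.5, -1.0]
intermediate_logits = [1.0, 1.0, 0.0]
loss = distillation_loss(final_logits, intermediate_logits)
```

Minimizing such a loss with respect to the intermediate branch would pull its action distribution toward the final prediction. The loss is zero when the two sets of logits agree.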

Source
http://dx.doi.org/10.1109/TCYB.2025.3587037

Publication Analysis

Top Keywords

interaction-aware transformer: 8
transformer network: 8
human-object interaction: 8
hoi detection: 8
object level: 8
implicit action-level: 8
interaction-aware query: 8
action-level priors: 8
hoi predictions: 8
priors: 6

Similar Publications

Introduction: The increasing integration of large language models (LLMs) into human-AI collaboration necessitates a deeper understanding of their cognitive impacts on users. Traditional evaluation methods have primarily focused on task performance, overlooking the underlying neural dynamics during interaction.

Methods: In this study, we introduce a novel framework that leverages electroencephalography (EEG) signals to assess how LLM interactions affect cognitive processes such as attention, cognitive load, and decision-making.

The growing density of today's urban environments requires a robust multi-camera architecture for real-time abnormality detection and behavior analysis. Most existing methods tend to miss unusual behaviors because of occlusion, dynamic scene changes, and computational inefficiency. These failures often result in high false-positive rates and poor generalization to unseen anomalies.

Automated social behaviour analysis of mice has become an increasingly popular research area in behavioural neuroscience. Recently, pose information (i.e.

In recent years, the application of deep learning models to protein-ligand docking and affinity prediction, both vital for structure-based drug design, has garnered increasing interest. However, many of these models overlook the intricate modeling of interactions between ligand and protein atoms in the complex, consequently limiting their capacity for generalization and interpretability. In this work, we propose Interformer, a unified model built upon the Graph-Transformer architecture.
