Context-aware emotion recognition (CAER) leverages comprehensive scene information, including facial expressions, body postures, and contextual background. However, current studies rely predominantly on facial expressions, body postures, and global contextual features; the interactions between agents (target individuals) and other objects in the scene are usually absent or incomplete. In this article, a three-dimensional view relationship-based CAER (TDRCer) method is proposed, comprising two branches: a personal emotional branch (PEB) and a contextual emotional branch (CEB). First, the PEB extracts facial expression and body posture features from the agent. A vision transformer (ViT), pretrained by contrastive learning with a novel loss function that combines Euclidean distance and cosine similarity, enhances the robustness of the facial expression features. Meanwhile, human body contour images extracted by semantic segmentation are fed into another ViT to extract body posture features. Second, the CEB extracts global contextual features and the interactive relationships among objects in the scene. Images with the agents' bodies masked out are fed into a ViT to extract global contextual features. By leveraging both the gaze angle and a depth map, a three-dimensional view graph (3DVG) is constructed to represent the interactive relationships between agents and objects in the scene, and a graph convolutional network extracts interactive relationship features from the 3DVG. Finally, a multiplicative fusion strategy fuses the features of the two branches, and the fused features are used to classify emotions. TDRCer achieves an accuracy of 89.90% on the CAER-S dataset and a mean average precision (mAP) of 36.02% on the EMOTIons in context (EMOTIC) dataset. The code is available at https://github.com/mengTender/TDRCer.
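The abstract does not give the exact form of the loss that combines Euclidean distance and cosine similarity, so the following is only a minimal sketch of one plausible pairwise contrastive formulation (the function names, the `margin`, and the mixing weight `alpha` are assumptions, not TDRCer's published design):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def combined_contrastive_loss(a, b, positive, margin=1.0, alpha=0.5):
    """Hypothetical contrastive loss mixing a Euclidean term with a
    cosine term. Positive pairs are pulled together (small distance,
    high similarity); negative pairs are pushed apart, with the
    Euclidean penalty active only inside the margin."""
    d = euclidean(a, b)
    c = cosine(a, b)
    if positive:
        return alpha * d ** 2 + (1 - alpha) * (1 - c)
    return alpha * max(margin - d, 0.0) ** 2 + (1 - alpha) * max(c, 0.0)
```

Under this sketch, identical embeddings incur zero loss as a positive pair, while orthogonal embeddings beyond the margin incur zero loss as a negative pair; the intuition is that the distance term controls magnitude while the cosine term controls direction.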
DOI: http://dx.doi.org/10.1109/TNNLS.2024.3476249
Oper Neurosurg
September 2025
Department of Neurosurgery and the Training Base of Neuroendoscopic Physicians under the Chinese Medical Doctor Association, Jiangsu Clinical Medicine Center of Tissue Engineering and Nerve Injury Repair, Affiliated Hospital of Nantong University, Nantong, Jiangsu Province, China.
Background And Objectives: Microvascular decompression (MVD) for hemifacial spasm (HFS) is commonly conducted under a microscope. We report a large series of fully endoscopic MVDs for HFS and describe our initial experience with 3-dimensional (3D) endoscopy.
Methods: Clinical data of 204 patients with HFS who underwent fully endoscopic MVD using 2-dimensional (2D) and 3D endoscopy (191 and 13 patients, respectively) from July 2017 to October 2024 were retrospectively analyzed.
PLoS One
September 2025
School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing, China.
Multi-modal data fusion plays a critical role in enhancing the accuracy and robustness of perception systems for autonomous driving, especially for the detection of small objects. However, small object detection remains particularly challenging due to sparse LiDAR points and low-resolution image features, which often lead to missed or imprecise detections. Currently, many methods process LiDAR point clouds and visible-light camera images separately, and then fuse them in the detection head.
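The per-modality-then-fuse pipeline the abstract criticizes can be sketched as detection-head (late) fusion: each sensor scores candidate boxes on its own, and scores are only combined at the end. The function name, dict layout, and fixed LiDAR weight `w` below are illustrative assumptions, not any specific system's API:

```python
def late_fuse(lidar_dets, cam_dets, w=0.6):
    """Hypothetical late fusion in a detection head. Each modality
    independently maps candidate box IDs to confidence scores; the
    head combines them with a fixed LiDAR weight `w`. A box seen by
    only one modality keeps just that modality's down-weighted score,
    which is one reason small objects with sparse LiDAR points or
    weak image features end up under-confident."""
    fused = {}
    for box_id in set(lidar_dets) | set(cam_dets):
        ls = lidar_dets.get(box_id, 0.0)
        cs = cam_dets.get(box_id, 0.0)
        fused[box_id] = w * ls + (1 - w) * cs
    return fused
```

This makes the abstract's complaint concrete: because the modalities never exchange features before the head, a detection missed by one sensor cannot be recovered by the other, only averaged down.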
J Chem Inf Model
September 2025
College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310018, China.
Drug-drug interactions (DDIs) present a significant challenge in clinical practice, as they may lead to adverse reactions, diminished therapeutic efficacy, and serious risks to patient safety. However, most existing methods depend on single-view representations of drug molecules or substructures, which limits their capacity to capture the diverse and complex nature of drug properties. To overcome this limitation, we propose MGRL-DDI, a multiview graph representation learning framework that comprehensively models drug structures from three complementary perspectives: three-dimensional (3D) molecular graphs, motif graphs, and molecular graphs.
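Each of the three views described above yields its own embedding of a drug, which must be combined before DDI prediction. A minimal sketch of such multiview fusion follows; the function name and the two fusion modes are assumptions for illustration, not MGRL-DDI's actual architecture:

```python
def fuse_views(v3d, vmotif, vmol, mode="concat"):
    """Hypothetical fusion of three per-view drug embeddings
    (3D molecular graph, motif graph, molecular graph), each given
    as a plain list of floats. 'concat' preserves all per-view
    information at the cost of dimensionality; 'mean' averages
    element-wise and requires equal-length views."""
    if mode == "concat":
        return v3d + vmotif + vmol  # list concatenation
    if mode == "mean":
        return [(a + b + c) / 3 for a, b, c in zip(v3d, vmotif, vmol)]
    raise ValueError(f"unknown fusion mode: {mode}")
```

In learned systems the fusion itself is typically parameterized (e.g., attention over views) rather than fixed, but the input/output shape of the operation is the same.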
Surg Endosc
September 2025
Department of Lower Gastrointestinal Surgery, Kitasato University School of Medicine, Kanagawa, Japan.
Background: The greatest advantage of robotic surgery is that it enables precise surgery through a magnified three-dimensional (3D) image effect, multi-degree-of-freedom forceps, and a stable surgical field. However, it has disadvantages such as a lack of tactile sensation and the existence of blind spots, and the possibility of organ damage is higher than in open or laparoscopic surgery. To address these two major problems, we developed BirdView™, a wide-view camera system.
Hepatobiliary Surg Nutr
August 2025
Department of Liver Surgery, Peking Union Medical College Hospital (PUMCH), Peking Union Medical College (PUMC) & Chinese Academy of Medical Sciences (CAMS), Beijing, China.