Context-aware emotion recognition (CAER) leverages comprehensive scene information, including facial expressions, body postures, and contextual background. However, current studies rely predominantly on facial expressions, body postures, and global contextual features; the interactions between agents (target individuals) and other objects in the scene are usually absent or incomplete. In this article, a three-dimensional view relationship-based CAER (TDRCer) method is proposed, comprising two branches: a personal emotional branch (PEB) and a contextual emotional branch (CEB). First, the PEB extracts facial expression and body posture features from the agent. A vision transformer (ViT), pretrained by contrastive learning with a novel loss function that combines Euclidean distance and cosine similarity, enhances the robustness of the facial expression features. Meanwhile, human body contour images extracted by semantic segmentation are fed into another ViT to extract body posture features. Second, the CEB extracts global contextual features and the interactive relationships among objects in the scene. Images with the agents' bodies masked out are fed into a ViT to extract global contextual features. By leveraging both the gaze angle and a depth map, a three-dimensional view graph (3DVG) is constructed to represent the interactive relationships between agents and objects in the scene, and a graph convolutional network extracts interactive relationship features from the 3DVG. Finally, a multiplicative fusion strategy fuses the features of the two branches, and the fused features are used to classify emotions. TDRCer achieves an accuracy of 89.90% on the CAER-S dataset and a mean average precision (mAP) of 36.02% on the EMOTIons in context (EMOTIC) dataset. The code is available at https://github.com/mengTender/TDRCer.
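The abstract does not give the exact form of the loss that combines Euclidean distance and cosine similarity, so the following is only a minimal sketch of one plausible pairwise contrastive formulation (the function names, the `margin`, and the mixing weight `alpha` are assumptions, not TDRCer's published design):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def combined_contrastive_loss(a, b, positive, margin=1.0, alpha=0.5):
    """Hypothetical contrastive loss mixing a Euclidean term with a
    cosine term. Positive pairs are pulled together (small distance,
    high similarity); negative pairs are pushed apart, with the
    Euclidean penalty active only inside the margin."""
    d = euclidean(a, b)
    c = cosine(a, b)
    if positive:
        return alpha * d ** 2 + (1 - alpha) * (1 - c)
    return alpha * max(margin - d, 0.0) ** 2 + (1 - alpha) * max(c, 0.0)
```

Under this sketch, identical embeddings incur zero loss as a positive pair, while orthogonal embeddings beyond the margin incur zero loss as a negative pair; the intuition is that the distance term controls magnitude while the cosine term controls direction.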
DOI: http://dx.doi.org/10.1109/TNNLS.2024.3476249
Oper Neurosurg
September 2025
Department of Neurosurgery and the Training Base of Neuroendoscopic Physicians under the Chinese Medical Doctor Association, Jiangsu Clinical Medicine Center of Tissue Engineering and Nerve Injury Repair, Affiliated Hospital of Nantong University, Nantong, Jiangsu Province, China.
Background And Objectives: Microvascular decompression (MVD) for hemifacial spasm (HFS) is commonly conducted under a microscope. We report a large series of fully endoscopic MVDs for HFS and describe our initial experience with 3-dimensional (3D) endoscopy.
Methods: Clinical data of 204 patients with HFS who underwent fully endoscopic MVD using 2-dimensional (2D) and 3D endoscopy (191 and 13 patients, respectively) from July 2017 to October 2024 were retrospectively analyzed.
PLoS One
September 2025
School of Mechanical and Electrical Engineering, China University of Mining and Technology (Beijing), Beijing, China.
Multi-modal data fusion plays a critical role in enhancing the accuracy and robustness of perception systems for autonomous driving, especially for the detection of small objects. However, small object detection remains particularly challenging due to sparse LiDAR points and low-resolution image features, which often lead to missed or imprecise detections. Currently, many methods process LiDAR point clouds and visible-light camera images separately, and then fuse them in the detection head.
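The per-modality-then-fuse pipeline the abstract criticizes can be sketched as detection-head (late) fusion: each sensor scores candidate boxes on its own, and scores are only combined at the end. The function name, dict layout, and fixed LiDAR weight `w` below are illustrative assumptions, not any specific system's API:

```python
def late_fuse(lidar_dets, cam_dets, w=0.6):
    """Hypothetical late fusion in a detection head. Each modality
    independently maps candidate box IDs to confidence scores; the
    head combines them with a fixed LiDAR weight `w`. A box seen by
    only one modality keeps just that modality's down-weighted score,
    which is one reason small objects with sparse LiDAR points or
    weak image features end up under-confident."""
    fused = {}
    for box_id in set(lidar_dets) | set(cam_dets):
        ls = lidar_dets.get(box_id, 0.0)
        cs = cam_dets.get(box_id, 0.0)
        fused[box_id] = w * ls + (1 - w) * cs
    return fused
```

This makes the abstract's complaint concrete: because the modalities never exchange features before the head, a detection missed by one sensor cannot be recovered by the other, only averaged down.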
J Chem Inf Model
September 2025
College of Life Sciences and Medicine, Zhejiang Sci-Tech University, Hangzhou 310018, China.
Drug-drug interactions (DDIs) present a significant challenge in clinical practice, as they may lead to adverse reactions, diminished therapeutic efficacy, and serious risks to patient safety. However, most existing methods depend on single-view representations of drug molecules or substructures, which limits their capacity to capture the diverse and complex nature of drug properties. To overcome this limitation, we propose MGRL-DDI, a multiview graph representation learning framework that comprehensively models drug structures from three complementary perspectives: three-dimensional (3D) molecular graphs, motif graphs, and molecular graphs.
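Each of the three views described above yields its own embedding of a drug, which must be combined before DDI prediction. A minimal sketch of such multiview fusion follows; the function name and the two fusion modes are assumptions for illustration, not MGRL-DDI's actual architecture:

```python
def fuse_views(v3d, vmotif, vmol, mode="concat"):
    """Hypothetical fusion of three per-view drug embeddings
    (3D molecular graph, motif graph, molecular graph), each given
    as a plain list of floats. 'concat' preserves all per-view
    information at the cost of dimensionality; 'mean' averages
    element-wise and requires equal-length views."""
    if mode == "concat":
        return v3d + vmotif + vmol  # list concatenation
    if mode == "mean":
        return [(a + b + c) / 3 for a, b, c in zip(v3d, vmotif, vmol)]
    raise ValueError(f"unknown fusion mode: {mode}")
```

In learned systems the fusion itself is typically parameterized (e.g., attention over views) rather than fixed, but the input/output shape of the operation is the same.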
Surg Endosc
September 2025
Department of Lower Gastrointestinal Surgery, Kitasato University School of Medicine, Kanagawa, Japan.
Background: The greatest advantage of robotic surgery is that it enables precise surgery through a magnified three-dimensional (3D) image effect, multi-degree-of-freedom forceps, and a stable surgical field. However, it has disadvantages such as a lack of tactile sensation and the existence of blind spots, and the possibility of organ damage is higher than in open or laparoscopic surgery. To address these two major problems, we developed BirdView™, a wide-view camera system.
Hepatobiliary Surg Nutr
August 2025
Department of Liver Surgery, Peking Union Medical College Hospital (PUMCH), Peking Union Medical College (PUMC) & Chinese Academy of Medical Sciences (CAMS), Beijing, China.