Humans convey their intentions through both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically with nonverbal context, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to consider not only the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN), which models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.
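The idea of shifting a word representation by its nonverbal context can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the equations, so the mean pooling (standing in for a recurrent encoder over subword frames), the scalar sigmoid gates, the norm cap `beta`, and all function names below are assumptions made for illustration. The sketch also assumes the word, visual, and acoustic vectors share one dimension, so no learned projections are needed.

```python
import math

def mean_pool(frames):
    """Average a sequence of feature vectors (hypothetical stand-in for a recurrent encoder)."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def shift_word(word_vec, visual_frames, acoustic_frames, beta=0.5):
    """Return word_vec displaced by a gated, norm-limited nonverbal shift.

    All vectors are assumed to have the same dimension (an illustrative simplification).
    """
    h_v = mean_pool(visual_frames)    # summary of visual frames during the word
    h_a = mean_pool(acoustic_frames)  # summary of acoustic frames during the word
    # Scalar gates: how strongly each modality influences this particular word.
    g_v = sigmoid(dot(h_v, word_vec))
    g_a = sigmoid(dot(h_a, word_vec))
    shift = [g_v * v + g_a * a for v, a in zip(h_v, h_a)]
    # Cap the shift at beta * ||word_vec|| so the lexical meaning still dominates.
    s_norm = norm(shift)
    scale = min(1.0, beta * norm(word_vec) / s_norm) if s_norm > 0 else 0.0
    return [w + scale * s for w, s in zip(word_vec, shift)]
```

In this sketch the same word vector ends up at different points depending on the accompanying frames, and a word with no nonverbal activity is returned unchanged. In a trained model the pooling, gates, and projections would be learned rather than fixed.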
Source: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7098710 (PMC)
Cereb Cortex
August 2025
Department of Psychology, University of Milano-Bicocca, Milan, Italy.
Semantic composition allows us to construct complex meanings (e.g., "dog house", "house dog") from simpler constituents ("dog", "house").
J Am Acad Orthop Surg Glob Res Rev
September 2025
From the University of California, Riverside, Riverside, CA (Arroyo, Moore); the Birmingham Heersink School of Medicine, University of Alabama, Birmingham, AL (Cruz); the Warren Alpert Medical School, Brown University, Providence, RI (Rodarte); Department of Orthopaedic Surgery, Banner University Sp
Introduction: Orthopaedic surgery has historically been among the least ethnically diverse fields in medicine. The latest American Academy of Orthopaedic Surgeons (AAOS) Census report in 2018 indicates that only 2.2% of all practicing orthopaedic surgeons in the United States identify as Hispanic/Latino.
Curr Dir Psychol Sci
May 2025
Oklahoma State University, Department of Psychology.
Hearing a single word can initiate a sequence of activation that spreads from the representation of the word (e.g., "candy") to words that share auditory and visual form (e.
J Cogn Dev
April 2025
Department of Psychology, University of Wisconsin-Madison, Madison, United States.
The role of function in toddlers' object labeling has been debated for decades in developmental science. We aimed to clarify the relation between toddlers' understanding of functions and words using a set of everyday objects that varied in the number of associated functions (e.g.
Sci Rep
August 2025
School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, 430073, China.
The secondary structure of a protein serves as the foundation for constructing its three-dimensional (3D) structure, which in turn is critical for determining its function and role in biological processes. Therefore, accurately predicting secondary structure not only facilitates the understanding of a protein's 3D conformation but also provides essential insights into its interactions, functional mechanisms, and potential applications in biomedical research. Deep learning models are particularly effective in protein secondary structure prediction because of their ability to process complex sequence data and extract meaningful patterns, thereby increasing prediction accuracy and efficiency.