Article Abstract

Humans convey their intentions through both verbal and nonverbal behaviors during face-to-face communication. Speaker intentions often vary dynamically depending on nonverbal contexts, such as vocal patterns and facial expressions. As a result, when modeling human language, it is essential to consider not only the literal meaning of the words but also the nonverbal contexts in which these words appear. To better model human language, we first model expressive nonverbal representations by analyzing the fine-grained visual and acoustic patterns that occur during word segments. In addition, we seek to capture the dynamic nature of nonverbal intents by shifting word representations based on the accompanying nonverbal behaviors. To this end, we propose the Recurrent Attended Variation Embedding Network (RAVEN), which models the fine-grained structure of nonverbal subword sequences and dynamically shifts word representations based on nonverbal cues. Our proposed model achieves competitive performance on two publicly available datasets for multimodal sentiment analysis and emotion recognition. We also visualize the shifted word representations in different nonverbal contexts and summarize common patterns regarding multimodal variations of word representations.
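
The idea of shifting a word representation by its accompanying nonverbal cues can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the gating form, the norm-based cap on the shift, the `beta` value, and every function and parameter name below are assumptions made for illustration. Mean pooling is swapped in for the recurrent subword encoders that the model's name suggests, and the weights are random rather than learned.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def norm(vec):
    return math.sqrt(sum(x * x for x in vec))


def mean_pool(frames):
    # Stand-in for a recurrent encoder: average the subword-level frames
    # that fall inside one word segment.
    return [sum(col) / len(frames) for col in zip(*frames)]


def make_params(dim_w, dim_v, dim_a):
    # Hypothetical, randomly initialised weights (a real model would learn them).
    def rand_vec(n):
        return [random.uniform(-1.0, 1.0) for _ in range(n)]

    return {
        "gate_v": rand_vec(dim_v + dim_w),
        "gate_a": rand_vec(dim_a + dim_w),
        "proj_v": [rand_vec(dim_v) for _ in range(dim_w)],
        "proj_a": [rand_vec(dim_a) for _ in range(dim_w)],
    }


def shift_word(word_vec, visual_vec, acoustic_vec, params, beta=0.5):
    # Scalar gates decide how much each modality contributes for this word.
    gate_v = sigmoid(dot(params["gate_v"], visual_vec + word_vec))
    gate_a = sigmoid(dot(params["gate_a"], acoustic_vec + word_vec))
    # Project each nonverbal summary into the word-embedding space.
    proj_v = matvec(params["proj_v"], visual_vec)
    proj_a = matvec(params["proj_a"], acoustic_vec)
    shift = [gate_v * pv + gate_a * pa for pv, pa in zip(proj_v, proj_a)]
    # Cap the shift so it never exceeds beta times the word vector's norm
    # (an assumed safeguard that keeps the word meaning dominant).
    shift_norm = norm(shift)
    scale = 1.0 if shift_norm == 0 else min(beta * norm(word_vec) / shift_norm, 1.0)
    return [w + scale * s for w, s in zip(word_vec, shift)]


random.seed(0)
params = make_params(dim_w=4, dim_v=3, dim_a=2)
word = [0.5, -0.2, 0.1, 0.3]
visual_frames = [[0.1, 0.2, 0.3], [0.0, 0.4, 0.1]]
acoustic_frames = [[0.3, -0.1], [0.2, 0.0], [0.1, 0.1]]
shifted = shift_word(word, mean_pool(visual_frames), mean_pool(acoustic_frames), params)
```

In this sketch the same word yields a different `shifted` vector whenever the visual or acoustic frames change, and an all-zero nonverbal input leaves the word vector unchanged.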

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7098710PMC
