Developing fine-grained sense-aware lexical sophistication indices based on the CEFR levels of word senses.

Behav Res Methods

School of International Chinese Language Education, Beijing Normal University, 19 Xinjiekouwai St, Beijing, 100875, China.

Published: July 2025


Article Abstract

Lexical sophistication has attracted attention across research domains concerned with language production and text complexity. However, the vast majority of existing lexical sophistication measures do not systematically distinguish among the senses of polysemous words; instead, they treat all senses of a polysemous word as equally sophisticated. To address this limitation, the current study introduces a system that automatically assigns the words in a text to levels of the Common European Framework of Reference for Languages (CEFR) based on the senses in which they are used in context, using the English Vocabulary Profile as a reference. We further propose a set of fine-grained sense-aware lexical sophistication indices based on the CEFR levels of word senses and evaluate the extent to which these indices can predict holistic scores of second language (L2) English writing quality using 1,236 exam scripts from the CLC-FCE dataset (Yannakoudakis et al., 2011). The results show that these fine-grained sense-aware indices are more strongly correlated with scores than existing lexical sophistication measures, with three significant predictors explaining 11.8% of the variance in holistic scores. A regression model that combines the new indices with existing ones achieves substantially greater predictive power than models built with either set of indices alone. We discuss the potential implications of our findings for future research in L2 lexical sophistication.
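To make the idea of a sense-aware index concrete, the following is a minimal sketch and not the authors' system. It assumes the tokens have already been sense-tagged by some upstream step, and it uses a tiny hypothetical sense inventory (`TOY_SENSE_LEVELS`) whose entries and levels are invented for illustration and are not taken from the English Vocabulary Profile. The two index types shown (per-level proportions and a mean level) are assumptions about what such indices might look like; the abstract does not specify the actual formulas.

```python
from collections import Counter

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]
# Map each level to an ordinal score (A1 = 1 ... C2 = 6); an assumption for this sketch.
LEVEL_SCORE = {level: i + 1 for i, level in enumerate(CEFR_LEVELS)}

# Hypothetical sense inventory: (lemma, sense_id) -> CEFR level.
# Values are illustrative only, not real English Vocabulary Profile entries.
TOY_SENSE_LEVELS = {
    ("run", "move_fast"): "A1",
    ("run", "manage"): "B2",
    ("bank", "money"): "A2",
}


def sense_aware_indices(tagged_tokens, sense_levels):
    """Compute toy sense-aware indices from (lemma, sense_id) pairs.

    Tokens whose sense is missing from the inventory are skipped.
    Returns per-level proportions and the mean ordinal level,
    or an empty dict if no token could be assigned a level.
    """
    levels = [sense_levels[tok] for tok in tagged_tokens if tok in sense_levels]
    if not levels:
        return {}
    counts = Counter(levels)
    n = len(levels)
    indices = {f"prop_{lvl}": counts.get(lvl, 0) / n for lvl in CEFR_LEVELS}
    indices["mean_level"] = sum(LEVEL_SCORE[lvl] for lvl in levels) / n
    return indices


# Two uses of "run" receive different levels because their senses differ.
example = [
    ("run", "move_fast"),
    ("run", "manage"),
    ("bank", "money"),
    ("unknown", "x"),  # not in the inventory, so it is ignored
]
result = sense_aware_indices(example, TOY_SENSE_LEVELS)
```

A sense-blind measure would assign both occurrences of "run" the same level; here the two occurrences contribute to different level proportions, which is the distinction the abstract describes.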

Source
http://dx.doi.org/10.3758/s13428-025-02741-z

Similar Publications

This paper presents a mixed-method approach to analyzing news media, combining quantitative linguistic metrics with qualitative discourse frameworks. We first extract linguistic features such as quotations, readability levels, and lexical richness, then perform named entity recognition and topic modeling. To add depth, we apply Fairclough's model of critical discourse analysis (highlighting social and cultural contexts) together with Goffman's frame analysis of social behavior, enabling a systematic comparison of narrative strategies and community engagement.

Mental illnesses often manifest through behavioral changes, with speech serving as a key medium for expressing thoughts and emotions. Applying computational linguistics to speech data is a promising approach to uncovering objective biomarkers for the early detection of mental illnesses. This study analyzed speech transcripts from 80 youths at ultra-high risk of psychosis (UHR) and 329 healthy controls, examining text features such as sentiment variability, cohesion, lexical sophistication, morphology, syntactic sophistication, and lexical diversity.

Purpose: When faced with challenging communicative situations, people with dysarthria are commonly advised to rephrase their message, using common words and keeping sentences short and manageable. However, it remains unclear whether relevant clinical populations can implement these changes on demand. The goals of this study were to (a) identify lexical changes that occur when speakers are prompted to rephrase sentences and (b) examine how rephrasing messages affects acoustic measures of speech production and listener perceptual ratings.

Genre-based research holds significant theoretical and practical importance in second language acquisition (SLA). While many L2 English writing studies have suggested that argumentative writing is generally more challenging than narrative writing, whether this generalization applies to typologically different languages, such as German with its complex morphological and structural properties, requires further investigation. Specifically, genre effects on L2 writing development remain insufficiently understood for non-English languages, particularly regarding complexities beyond syntactic and lexical.
