Perplexity and proximity: Large language model perplexity complements semantic distance metrics for the detection of incoherent speech.

Weizhe Xu, Serguei Pakhomov, Patrick Heagerty, Eric Horvitz, Ellen R Bradley, Josh Woolley, Andrew Campbell, Alex Cohen, Dror Ben-Zeev, Trevor Cohen

J Biomed Inform

Biomedical Informatics and Medical Education (BIME), University of Washington, Seattle, WA, USA; Behavioral Research in Technology (BRiTE) Center, Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA.

Published: August 2025

Objective: Semantic coherence in speech is characterized by a logical, connected flow of ideas. A lack of coherence in speech may reflect disorganized thinking, a core feature of psychosis in schizophrenia spectrum disorders (SSDs). Tools for automated assessment of semantic coherence in language could facilitate early detection of SSDs and improve symptom monitoring, enabling more timely intervention. Large language models (LLMs) have demonstrated strong capabilities on numerous language-centric tasks and have shown promise for analyzing semantic coherence, because their innate measure of language perplexity is a natural fit for the surprising turns that incoherent narrative often takes. This study aims to develop a novel representation and associated measure of semantic coherence using LLM-based perplexity metrics and to compare this measure with traditional vector distance-based coherence metrics.

Method: We evaluated "bag" and "chain" models based on LLM perplexities as measures of semantic coherence. Regression models were trained on both single and paired combinations of perplexity- and proximity-based features to predict human ratings of semantic coherence made with standardized instruments. Performance was evaluated on held-out examples from a training set of speech samples from individuals experiencing psychotic symptoms and on a test set of clinical interviews with patients diagnosed with SSDs, both labeled with human assessments of the severity of disorganized thinking.
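The two feature families named in the Method can be illustrated with a minimal sketch. The abstract does not specify how the "bag" and "chain" models are constructed, so the code below shows only the generic quantities: perplexity computed from per-token log-probabilities, and cosine similarity between two vectors as a proximity measure. The function names and all numeric values are hypothetical, and the log-probabilities are assumed to come from some LLM rather than from any model the authors used.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (a proximity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up log-probabilities: a continuation the model finds likely
# versus one it finds surprising given the preceding context.
expected_turn = [math.log(0.5)] * 4
surprising_turn = [math.log(0.05)] * 4

# Made-up 3-dimensional "sentence vectors" for two adjacent utterances.
sentence_a = [0.2, 0.7, 0.1]
sentence_b = [0.1, 0.8, 0.2]

print("perplexity (expected):  ", perplexity(expected_turn))
print("perplexity (surprising):", perplexity(surprising_turn))
print("cosine proximity:       ", cosine_similarity(sentence_a, sentence_b))
```

Higher perplexity marks text the model finds harder to predict, while lower cosine similarity marks adjacent units that are farther apart in vector space; these are the two kinds of signal the study combines.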

Results: The best performance was achieved using a combination of perplexity and proximity features, yielding a Spearman correlation with human ratings of 0.61 (vs. 0.56 with proximity features alone) on leave-one-out cross-validation in the training set, and 0.54 (vs. 0.52 with proximity features alone) on the test set.
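The evaluation protocol reported in the Results (a regression on combined features, scored by Spearman correlation under leave-one-out cross-validation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which regression model was used, so ordinary least squares stands in for it, and the helper names and toy data are hypothetical.

```python
from statistics import mean


def rank(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


def fit_ols(X, y):
    """Least-squares weights [intercept, w1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    d = len(rows[0])
    # Augmented system [A^T A | A^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(d):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][d] / A[i][i] for i in range(d)]


def predict(w, row):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))


def loo_spearman(X, y):
    """Fit on all samples but one, predict the held-out one, then correlate."""
    preds = []
    for i in range(len(X)):
        w = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(w, X[i]))
    return spearman(preds, y)


# Toy data: each row is [perplexity_feature, proximity_feature]; ratings are made up.
X = [[2.1, 0.80], [3.5, 0.65], [5.0, 0.70], [6.2, 0.40], [8.9, 0.35], [4.0, 0.55]]
ratings = [1.0, 2.0, 2.0, 4.0, 5.0, 3.0]
print("LOO Spearman (combined features):", loo_spearman(X, ratings))
```

Comparing `loo_spearman` on a single feature column against the combined features mirrors the kind of comparison the abstract reports (0.61 combined versus 0.56 with proximity features alone in cross-validation).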

Conclusion: We developed novel methods for assessing semantic coherence using LLM perplexities and found them complementary to proximity-based methods. Combined, these methods showed improved performance across two datasets, highlighting the potential of LLMs to enhance automated diagnosis and monitoring of SSDs.

DOI: http://dx.doi.org/10.1016/j.jbi.2025.104899
