Re-evaluating Theory of Mind evaluation in large language models.

Philos Trans R Soc Lond B Biol Sci

Department of Psychology, Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.

Published: August 2025

The question of whether large language models (LLMs) possess Theory of Mind (ToM), often defined as the ability to reason about others' mental states, has sparked significant scientific and public interest. However, the evidence as to whether LLMs possess ToM is mixed, and the recent growth in evaluations has not resulted in a convergence. Here, we take inspiration from cognitive science to re-evaluate the state of ToM evaluation in LLMs. We argue that a major reason for the disagreement on whether LLMs have ToM is a lack of clarity on whether models should be expected to match human behaviours, or the computations underlying those behaviours. We also highlight ways in which current evaluations may be deviating from 'pure' measurements of ToM abilities, which also contributes to the confusion. We conclude by discussing several directions for future research, including the relationship between ToM and pragmatic communication, which could advance our understanding of artificial systems as well as human cognition. This article is part of the theme issue 'At the heart of human communication: new views on the complex relationship between pragmatics and Theory of Mind'.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12351311	PMC
http://dx.doi.org/10.1098/rstb.2023.0499	DOI Listing

Similar Publications

Events in the stream of behavior.

Curr Opin Behav Sci

October 2025

Washington University in St. Louis.

Maverick E Smith, Jeffrey M Zacks, Zachariah M Reagh

The human mind constructs and updates models of events during comprehension. Event models are multidimensional, multi-timescale, and structured. They enable prediction, shape memory formation, and facilitate action control.

Applied Behavior Analysis in the Crosshairs: Neurodiversity, the Intact Mind, and Autism Politics.

Perspect Behav Sci

September 2025

History and Sociology of Science Department, University of Pennsylvania, 249 South 36th Street, Philadelphia, PA 19104 USA.

Amy S F Lutz

Recent attacks on applied behavior analysis (ABA) by neurodiversity advocates share a common theme with opposition to other supports, such as subminimum wage vocational programs and congregate residential settings: the intact mind assumption, which maintains that even profoundly autistic people have typical intelligence, even if they present as severely cognitively impaired. This article examines the history of the intact mind assumption, which was largely shaped by psychoanalytic theory in the mid-20th century, as well as its impact on contemporary disability policy and practice.

Integrating Kolb's experiential learning theory into nursing education: a four-stage intervention with case analysis, mind maps, reflective journals, and peer simulations for advanced health assessment.

Front Med (Lausanne)

August 2025

School of Nursing, Anhui University of Chinese Medicine, Hefei, China.

Jing Cheng, Yingting Wu, Li Huang, Yuehong Wu, Yuxiang Guan

Purpose: This study evaluates the effectiveness of integrating case-based mind maps and reflective journals within Kolb's experiential learning framework in advanced nursing education.

Methods: The study design compared 2023 (control group, n = 46) and 2024 (experimental group, n = 57) cohorts of nursing master's students. The experimental group received a Kolb-based intervention comprising: case analysis (concrete experience), reflective journals (reflective observation), mind maps (abstract conceptualization), and peer-led simulations (active experimentation).

Emotion understanding in infants and young children: How input shapes emotional development.

Adv Child Dev Behav

September 2025

University of California, Davis, CA, USA.

Vanessa LoBue, Marianella Casasola, Lisa M Oakes

Here, we will review the developmental literature on how infants and young children learn about emotions. We take a process-based perspective, highlighting how the protracted trajectory of emotional development unfolds concurrently with changes in children's cognitive abilities, and how variability based on context, culture, and experience shapes this trajectory over time. We will also emphasize the role of input in this development, a factor that has often been ignored.

Children's Understanding of How Past Experience Shapes Future Expectations.

Child Dev

September 2025

Department of Psychology, Yale University, New Haven, Connecticut, USA.

Rosie Aboody, Caiqin Zhou, Julian Jara-Ettinger

As adults, we do not expect ignorant agents to behave randomly or always get things wrong. Instead, we expect them to act reasonably, guided by past experiences. We test whether 4-to-6-year-olds share this intuition and use it to infer others' knowledge, or whether they rely on a simple "ignorance = error" heuristic identified in past work.
