The recent explosion of Large Language Models (LLMs) has provoked lively debate about "emergent" properties of the models, including intelligence, insight, creativity, and meaning. These debates are rocky for two main reasons: the emergent properties sought are not well defined, and the grounds for their dismissal often rest on a fallacious appeal to extraneous factors, like the LLM training regime, or on fallacious assumptions about processes within the model. The latter issue is a particular roadblock for LLMs because their internal processes are largely unknown: they are colossal black boxes. In this paper, I try to cut through these problems by first identifying one salient feature shared by systems we regard as intelligent, conscious, or sentient: their responsiveness to environmental conditions that may not be near in space and time. Such systems engage with subjective worlds ("s-worlds") which may or may not conform to the actual environment. Observers can infer s-worlds from behavior alone, enabling hypotheses about perception and cognition that do not require evidence from the internal operations of the systems in question. The reconstruction of s-worlds offers a framework for comparing cognition across species, affording new leverage on the possible sentience of LLMs. Here, I examine one prominent LLM, OpenAI's GPT-4. Drawing on philosophical phenomenology and cognitive ethology, I examine the pattern of errors made by GPT-4 and propose that they originate in the absence of any analogue of the human subjective awareness of time. This deficit suggests that GPT-4 lacks the capacity to construct a stable perceptual world: the temporal vacuum undermines any capacity to maintain a consistent, continuously updated model of its environment. Accordingly, none of GPT-4's statements are epistemically secure. Because the anthropomorphic illusion is nonetheless so strong, I conclude by suggesting that GPT-4 works with its users to construct improvised works of fiction.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339530 | PMC |
| http://dx.doi.org/10.3389/fpsyg.2024.1292675 | DOI Listing |
J Pediatr Surg
September 2025
Harvard Medical School, Boston, MA, United States; Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, MA, United States.
Introduction: Large language models (LLMs) have been shown to translate information from highly specific domains into lay-digestible terms. Pediatric surgery remains an area in which it is difficult to communicate clinical information in an age-appropriate manner, given the vast diversity in language comprehension levels across patient populations and the complexity of procedures performed. This study evaluates LLMs as tools for generating explanations of common pediatric surgeries to increase efficiency and quality of communication.
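As a rough illustration of the task this study sets for LLMs, here is a minimal sketch of prompting a chat model for an age-appropriate explanation of a procedure. The model name, prompt wording, and grade-level target are illustrative assumptions; the study's actual prompts are not reproduced here.

```python
# Minimal sketch: generating a lay explanation of a pediatric surgery.
# Assumptions (not from the abstract): an OpenAI-style chat API, the
# model name, the prompt wording, and the grade-level target.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def explain_procedure(procedure: str, grade_level: int = 6) -> str:
    """Ask the model for an age-appropriate explanation of a surgery."""
    prompt = (
        f"Explain a {procedure} to a family at a US grade-{grade_level} "
        "reading level. Avoid jargon, and define any unavoidable medical terms."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(explain_procedure("laparoscopic appendectomy"))
```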
Int J Surg
September 2025
The Third Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China.
J Physician Assist Educ
September 2025
Andrew P. Chastain, DMS, PA-C, is an assistant professor at Butler University, Indianapolis, Indiana.
Introduction: Artificial intelligence tools show promise in supplementing traditional physician assistant (PA) education, particularly in developing clinical reasoning skills. However, limited research exists on custom Generative Pretrained Transformer (GPT) applications in PA education. This study evaluated student experiences and perceptions of a custom GPT-based clinical reasoning tool.
AJR Am J Roentgenol
September 2025
Department of Radiology, Stanford University, Stanford, CA, USA.
The increasing complexity and volume of radiology reports present challenges for the timely communication of critical findings. This study evaluated the performance of two out-of-the-box LLMs in detecting and classifying critical findings in radiology reports using various prompt strategies. The analysis included 252 radiology reports of varying modalities and anatomic regions extracted from the MIMIC-III database, divided into a prompt-engineering tuning set of 50 reports, a holdout test set of 125 reports, and a pool of 77 remaining reports used as examples for few-shot prompting.
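To make the few-shot setup concrete, here is a minimal sketch of folding labeled reports from an example pool into a classification prompt. The API style, model name, binary label set, and example reports are all illustrative assumptions; the study's actual prompts and models are not shown here.

```python
# Minimal sketch: few-shot classification of critical findings.
# Assumptions (not from the abstract): an OpenAI-style chat API, the
# model name, the binary label set, and the example reports below.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical labeled examples standing in for the 77-report example pool
FEW_SHOT = [
    ("CT head: acute subdural hematoma with midline shift.", "critical"),
    ("Chest radiograph: no acute cardiopulmonary abnormality.", "not critical"),
]

def classify_report(report_text: str) -> str:
    """Label one report by replaying few-shot examples as chat turns."""
    messages = [{
        "role": "system",
        "content": "Label each radiology report as 'critical' or 'not critical'.",
    }]
    for example, label in FEW_SHOT:
        messages.append({"role": "user", "content": example})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": report_text})
    resp = client.chat.completions.create(model="gpt-4o", messages=messages)
    return resp.choices[0].message.content.strip()
```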
Acta Neurochir (Wien)
September 2025
Department of Neurosurgery, Istinye University, Istanbul, Turkey.
Background: Recent studies suggest that large language models (LLMs) such as ChatGPT are useful tools for medical students or residents when preparing for examinations. These studies, especially those conducted with multiple-choice questions, emphasize that the level of knowledge and response consistency of the LLMs are generally acceptable; however, further optimization is needed in areas such as case discussion, interpretation, and language proficiency. Therefore, this study aimed to evaluate the performance of six distinct LLMs on Turkish and English neurosurgery multiple-choice questions and to assess their accuracy and consistency in a specialized medical context.
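For readers unfamiliar with how accuracy and consistency are scored in repeated-query studies like this one, here is a minimal sketch of the bookkeeping. The `ask_model` callable and the scoring rules are illustrative assumptions, not the authors' actual pipeline.

```python
# Minimal sketch: scoring MCQ accuracy and answer consistency.
# Assumption (not from the abstract): ask_model(model_name, stem) is a
# hypothetical callable returning an answer letter. The modal answer is
# scored for accuracy; consistency is the mean agreement with that mode.
from collections import Counter

def evaluate(ask_model, model_name: str, questions: list, runs: int = 3) -> dict:
    correct, agreement = 0, 0.0
    for q in questions:  # each q: {"stem": str, "answer": "A".."E"}
        answers = [ask_model(model_name, q["stem"]) for _ in range(runs)]
        mode, count = Counter(answers).most_common(1)[0]
        correct += mode == q["answer"]
        agreement += count / runs
    n = len(questions)
    return {"accuracy": correct / n, "consistency": agreement / n}
```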