Spin-glass model of in-context learning.

Phys Rev E

Sun Yat-sen University, PMI Lab, School of Physics, Guangzhou 510275, People's Republic of China.

Published: July 2025



Article Abstract

Large language models show a surprising in-context learning ability: they can use a prompt to form a prediction for a query without any additional training, in stark contrast to conventional supervised learning. Providing a mechanistic interpretation and linking this empirical phenomenon to physics are thus challenging and remain unsolved. We study a simple yet expressive transformer with linear attention and map this structure to a spin-glass model with real-valued spins, where the couplings and fields account for the intrinsic disorder in the data. The spin-glass model explains how the weight parameters interact with one another during pretraining, and further clarifies why an unseen function can be predicted from a prompt alone, without further training. Our theory reveals that for single-instance learning, increasing the task diversity leads to the emergence of in-context learning by allowing the Boltzmann distribution to converge to a unique correct solution of the weight parameters. The pretrained transformer therefore displays predictive power in a prompt setting. The proposed analytically tractable model thus offers a promising avenue for interpreting many intriguing but puzzling properties of large language models.
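As a rough illustration of the setting the abstract describes, the sketch below pretrains a single linear-attention readout on many randomly drawn linear-regression tasks and then evaluates it on unseen tasks from a prompt alone, with no further weight updates. The parametrization (a single trainable d x d matrix acting on a prompt moment), the Gaussian data, and all hyperparameters are simplifying assumptions for illustration, not the article's exact model.

# Minimal sketch (not the paper's exact setup): in-context learning of linear
# functions with a single linear-attention readout, pretrained across many tasks.
# All names and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx = 5, 40                     # input dimension, prompt length

def make_tasks(n_tasks, n_ctx, d):
    """Each task: a random weight vector w, prompt pairs (x, w.x), and a query."""
    w = rng.standard_normal((n_tasks, d))
    X = rng.standard_normal((n_tasks, n_ctx, d))
    y = np.einsum('td,tcd->tc', w, X)
    xq = rng.standard_normal((n_tasks, d))
    yq = np.einsum('td,td->t', w, xq)
    return X, y, xq, yq

def predict(Gamma, X, y, xq):
    """Linear attention: y_hat = xq^T Gamma (1/n) sum_c y_c x_c."""
    moment = np.einsum('tc,tcd->td', y, X) / X.shape[1]
    return np.einsum('td,de,te->t', xq, Gamma, moment)

# Pretraining by SGD over many independent tasks (task diversity).
Gamma = np.zeros((d, d))
lr, n_steps, batch = 0.05, 2000, 64
for _ in range(n_steps):
    X, y, xq, yq = make_tasks(batch, n_ctx, d)
    err = predict(Gamma, X, y, xq) - yq
    moment = np.einsum('tc,tcd->td', y, X) / n_ctx
    grad = np.einsum('t,td,te->de', err, xq, moment) / batch
    Gamma -= lr * grad

# Test on unseen tasks: no weight update, only the prompt is provided.
X, y, xq, yq = make_tasks(1000, n_ctx, d)
mse = np.mean((predict(Gamma, X, y, xq) - yq) ** 2)
print(f"test MSE on unseen tasks: {mse:.3f}")   # much smaller than the task variance (= d) if ICL emerged

In this toy setup, pretraining over a sufficiently diverse set of tasks drives the trainable matrix toward the single solution that generalizes to new prompts, which loosely mirrors the convergence-to-a-unique-solution picture described in the abstract.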


Source
http://dx.doi.org/10.1103/5l5m-4nk5

Publication Analysis

Top Keywords

in-context learning: 12
large language: 8
language models: 8
spin glass: 8
glass model: 8
weight parameters: 8
learning: 5
spin-glass model: 4
model in-context: 4
learning large: 4

Similar Publications

An experiment using a predictive learning task with college students evaluated the impact of a stimulus associated with extinction in an AAB renewal design. In the first phase, four groups of participants learned specific relationships between two cues (X and Y) and two outcomes (O1 and O2) in Context A. Subsequently, both cues underwent extinction in the same Context A.


Biomedical named entity recognition (NER) is a high-utility natural language processing (NLP) task, and large language models (LLMs) show promise, particularly in few-shot settings (i.e., with limited training data).


Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B.

Eur Radiol

September 2025

Institute of Diagnostic and Interventional Neuroradiology, TUM University Hospital, School of Medicine and Health, Technical University of Munich, Munich, Germany.

Objectives: To evaluate the potential of LLMs to generate sequence-level brain MRI protocols.

Materials And Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists.


Medical Entity Linking in Low-Resource Settings with Fine-Tuning-Free LLMs.

Stud Health Technol Inform

September 2025

Chair of Medical Informatics, Institute of AI and Informatics in Medicine (AIIM), TUM University Hospital, Technical University of Munich, Munich, Germany.

Introduction: Medical entity linking is an important task in biomedical natural language processing, aiming to align textual mentions of medical concepts with standardized concepts in ontologies. Most existing approaches rely on supervised models or domain-specific embeddings, which require large datasets and significant computational resources.

Objective: The objective of this work is (1) to investigate the effectiveness of large language models (LLMs) in improving both candidate generation and disambiguation for medical entity linking through synonym expansion and in-context learning, and (2) to evaluate this approach against traditional string-matching and supervised methods.
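As a rough illustration of the two stages this objective names, the sketch below pairs synonym-expanded fuzzy matching for candidate generation with an in-context prompt for disambiguation. The tiny ontology, synonym table, and prompt wording are hypothetical placeholders, not taken from the study, and the actual LLM call is left out.

# Hypothetical sketch: candidate generation via synonym expansion + fuzzy
# matching, then an in-context prompt for disambiguation (LLM call not shown).
from difflib import get_close_matches

ONTOLOGY = {"C0027051": "myocardial infarction", "C0020538": "hypertension"}   # illustrative only
SYNONYMS = {"heart attack": "myocardial infarction", "high blood pressure": "hypertension"}

def generate_candidates(mention, k=3):
    """Expand the mention with known synonyms, then fuzzy-match ontology terms."""
    queries = {mention.lower(), SYNONYMS.get(mention.lower(), mention.lower())}
    terms = list(ONTOLOGY.values())
    hits = set()
    for q in queries:
        hits.update(get_close_matches(q, terms, n=k, cutoff=0.6))
    return [(cid, term) for cid, term in ONTOLOGY.items() if term in hits]

def build_disambiguation_prompt(mention, context, candidates):
    """In-context prompt: a worked example plus the candidate list; an LLM
    would be asked to answer with a single concept ID."""
    lines = [
        "Link the mention to one ontology concept.",
        "Example: mention 'MI' in 'patient admitted with MI' -> C0027051",
        f"Mention: '{mention}' in context: '{context}'",
        "Candidates: " + ", ".join(f"{cid} ({term})" for cid, term in candidates),
        "Answer with the concept ID only:",
    ]
    return "\n".join(lines)

cands = generate_candidates("heart attack")
print(cands)
print(build_disambiguation_prompt("heart attack", "history of heart attack", cands))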


Performance and improvement strategies for adapting generative large language models for electronic health record applications: A systematic review.

Int J Med Inform

August 2025

Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA 02115, United States; Department of Medicine, Harvard Medical School, Boston, MA 02115, United States.

Purpose: To synthesize performance and improvement strategies for adapting generative LLMs in EHR analyses and applications.

Methods: We followed the PRISMA guidelines to conduct a systematic review of articles from PubMed and Web of Science published between January 1, 2023 and November 9, 2024. Multiple reviewers, including biomedical informaticians and a clinician, were involved in the article review process.
