Large language models show a surprising in-context learning ability: they can use a prompt to form a prediction for a query without any additional training, in stark contrast to traditional supervised learning. Providing a mechanistic interpretation and linking this empirical phenomenon to physics are therefore challenging and remain unsolved. We study a simple yet expressive transformer with linear attention and map this structure to a spin glass model with real-valued spins, where the couplings and fields capture the intrinsic disorder in the data. The spin glass model explains how the weight parameters interact with each other during pretraining, and further clarifies why an unseen function can be predicted from a prompt alone, without further training. Our theory reveals that for single-instance learning, increasing the task diversity leads to the emergence of in-context learning by allowing the Boltzmann distribution to converge to a unique correct solution of the weight parameters. The pretrained transformer therefore displays predictive power in a prompt setting. The proposed analytically tractable model thus offers a promising avenue for interpreting many intriguing but puzzling properties of large language models.
DOI: http://dx.doi.org/10.1103/5l5m-4nk5
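As a rough illustration of the setup this abstract describes, the sketch below implements a single linear-attention read-out that predicts the label of a query from an in-context prompt of (x, y) pairs. It is a minimal sketch under assumed conventions, not the paper's construction: the function name, the Gaussian task distribution, and the choice of weight matrix W = I/n are illustrative placeholders.

```python
# Minimal sketch (not the paper's exact construction): a single linear-attention
# read-out that predicts y_query from an in-context prompt of (x, y) pairs.
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx = 8, 32                      # input dimension, prompt length

def linear_attention_predict(X, y, x_query, W):
    """Prediction of a linear-attention head: sum_i y_i * (x_i^T W x_query)."""
    scores = X @ W @ x_query          # unnormalised linear attention scores
    return scores @ y                 # value-weighted sum = predicted label

# One in-context task: an unseen linear function y = w_task . x
w_task = rng.normal(size=d)
X = rng.normal(size=(n_ctx, d))
y = X @ w_task

# With W = I/n_ctx (an assumed, idealised pretrained solution for isotropic
# inputs), the head averages inner products over the prompt.
W = np.eye(d) / n_ctx
x_query = rng.normal(size=d)
print("prediction:", linear_attention_predict(X, y, x_query, W))
print("target    :", w_task @ x_query)
```

With W proportional to the identity, the read-out reduces to an averaged inner-product regression on the prompt, which illustrates how a fixed pretrained weight matrix can fit an unseen linear function from the prompt alone.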
Learn Behav
September 2025
Universidad Nacional Autónoma de México, Mexico, Mexico.
An experiment using a predictive learning task with college students evaluated the impact of a stimulus associated with extinction in an AAB renewal design. Four groups of participants learned a specific relationship between two cues (X and Y) and two outcomes (O1 and O2) in Context A during the first phase. Subsequently, both cues were subjected to extinction in the same Context A.
Biomedical named entity recognition (NER) is a high-utility natural language processing (NLP) task, and large language models (LLMs) show promise, particularly in few-shot settings (i.e., with limited training data).
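Since this abstract refers to few-shot (in-context) use of LLMs for biomedical NER, a generic prompt-construction sketch is given below. The demonstration sentences, entity types, and the commented-out call_llm() stub are hypothetical placeholders, not taken from the cited study.

```python
# Hypothetical sketch of few-shot (in-context) prompting for biomedical NER.
FEW_SHOT_EXAMPLES = [
    ("Aspirin reduced the risk of myocardial infarction.",
     [("Aspirin", "CHEMICAL"), ("myocardial infarction", "DISEASE")]),
    ("Mutations in BRCA1 are linked to breast cancer.",
     [("BRCA1", "GENE"), ("breast cancer", "DISEASE")]),
]

def build_prompt(query_sentence: str) -> str:
    """Assemble a few-shot prompt: demonstrations first, then the query."""
    lines = ["Extract biomedical entities as (span, type) pairs.", ""]
    for text, entities in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {text}")
        lines.append(f"Entities: {entities}")
        lines.append("")
    lines.append(f"Sentence: {query_sentence}")
    lines.append("Entities:")
    return "\n".join(lines)

# call_llm() stands in for whatever LLM API is available.
# print(call_llm(build_prompt("Metformin is used to treat type 2 diabetes.")))
print(build_prompt("Metformin is used to treat type 2 diabetes."))
```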
Eur Radiol
September 2025
Institute of Diagnostic and Interventional Neuroradiology, TUM University Hospital, School of Medicine and Health, Technical University of Munich, Munich, Germany.
Objectives: To evaluate the potential of LLMs to generate sequence-level brain MRI protocols.
Materials and Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists.
Stud Health Technol Inform
September 2025
Chair of Medical Informatics, Institute of AI and Informatics in Medicine (AIIM), TUM University Hospital, Technical University of Munich, Munich, Germany.
Introduction: Medical entity linking is an important task in biomedical natural language processing, aiming to align textual mentions of medical concepts with standardized concepts in ontologies. Most existing approaches rely on supervised models or domain-specific embeddings, which require large datasets and significant computational resources.
Objective: The objective of this work is (1) to investigate the effectiveness of large language models (LLMs) in improving both candidate generation and disambiguation for medical entity linking through synonym expansion and in-context learning, and (2) to evaluate this approach against traditional string-matching and supervised methods.
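A rough sketch of the two stages named in this objective, candidate generation with synonym expansion followed by LLM-based disambiguation via an in-context prompt, is given below. The toy ontology, expand_synonyms(), and the prompt format are assumptions for illustration, not the study's implementation.

```python
# Illustrative sketch of medical entity linking in two stages:
# (1) candidate generation with string matching over (expanded) synonyms,
# (2) an in-context prompt asking an LLM to pick the best concept.
from difflib import SequenceMatcher

ONTOLOGY = {  # toy stand-in for a real ontology such as UMLS
    "C0011849": ["diabetes mellitus", "DM"],
    "C0020538": ["hypertension", "high blood pressure"],
}

def expand_synonyms(mention: str) -> list[str]:
    # In the described approach an LLM proposes synonyms; here we simply
    # return the mention itself as a placeholder.
    return [mention]

def generate_candidates(mention: str, top_k: int = 3):
    """Score every concept by its best string similarity to any variant."""
    scored = []
    for variant in expand_synonyms(mention):
        for cui, names in ONTOLOGY.items():
            best = max(SequenceMatcher(None, variant.lower(), n.lower()).ratio()
                       for n in names)
            scored.append((best, cui))
    return sorted(scored, reverse=True)[:top_k]

def disambiguation_prompt(mention: str, context: str, candidates) -> str:
    """Build an in-context prompt for LLM-based disambiguation."""
    options = "\n".join(f"- {cui}: {ONTOLOGY[cui][0]}" for _, cui in candidates)
    return (f"Mention: {mention}\nContext: {context}\n"
            f"Candidates:\n{options}\nAnswer with the single best concept ID.")

cands = generate_candidates("high blood pressure")
print(disambiguation_prompt("high blood pressure",
                            "Patient has a history of high blood pressure.",
                            cands))
```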
Int J Med Inform
August 2025
Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA 02115, United States; Department of Medicine, Harvard Medical School, Boston, MA 02115, United States.
Purpose: To synthesize performance and improvement strategies for adapting generative LLMs in EHR analyses and applications.
Methods: We followed the PRISMA guidelines to conduct a systematic review of articles from PubMed and Web of Science published between January 1, 2023 and November 9, 2024. Multiple reviewers, including biomedical informaticians and a clinician, were involved in the article review process.