A hallmark of intelligence is the ability to adapt behavior to changing environments, which requires adapting one's own learning strategies. This phenomenon is known as learning to learn in cognitive science and meta-learning in artificial intelligence. While this phenomenon is well-established in humans and animals, no quantitative framework exists for characterizing the trajectories through which biological agents adapt their learning strategies. Previous computational studies, which either assume fixed strategies or use task-optimized neural networks, do not explain how humans refine strategies through experience. Here we show that humans adjust their reinforcement learning strategies in a manner resembling gradient-based online optimization. We introduce DynamicRL, a framework using neural networks to track how participants' learning parameters (e.g., learning rates and decision temperatures) evolve throughout experiments. Across four diverse bandit tasks, DynamicRL consistently outperforms traditional reinforcement learning models with fixed parameters, demonstrating that humans continuously adapt their strategies over time. These dynamically estimated parameters reveal trajectories that systematically increase expected rewards, with updates significantly aligned with policy gradient ascent directions. Furthermore, this learning process operates across multiple timescales, with strategy parameters updating more slowly than behavioral choices, and update effectiveness correlates with local gradient strength in the reward landscape. Our work offers a generalizable approach for characterizing meta-learning trajectories, bridging theories of biological and artificial intelligence by providing a quantitative method for studying how adaptive behavior is optimized through experience.
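To make the two parameters named in the abstract concrete, the following is a minimal sketch of a conventional fixed-parameter learner for a bandit task. It is not DynamicRL and is not taken from the paper. It assumes a delta-rule value update and a softmax choice rule, and all function names are hypothetical. The last function estimates, by finite differences, how expected reward changes with the decision temperature, which is the kind of gradient direction the abstract compares parameter updates against.

```python
import math


def softmax(values, temperature):
    """Turn action values into choice probabilities.

    Lower temperature makes choices more deterministic (assumed choice rule).
    """
    top = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def update_value(q, reward, learning_rate):
    """Delta-rule update: move the value estimate toward the observed reward."""
    return q + learning_rate * (reward - q)


def expected_reward(values, reward_probs, temperature):
    """Expected reward of the softmax policy, given each arm's true reward probability."""
    probs = softmax(values, temperature)
    return sum(p * r for p, r in zip(probs, reward_probs))


def reward_gradient_wrt_temperature(values, reward_probs, temperature, eps=1e-5):
    """Central finite-difference estimate of d(expected reward)/d(temperature)."""
    upper = expected_reward(values, reward_probs, temperature + eps)
    lower = expected_reward(values, reward_probs, temperature - eps)
    return (upper - lower) / (2 * eps)


# Hypothetical two-armed bandit in which the learner's values match the true probabilities.
values = [0.2, 0.8]
reward_probs = [0.2, 0.8]
gradient = reward_gradient_wrt_temperature(values, reward_probs, temperature=0.5)
```

In this toy setting the gradient is negative, so lowering the temperature (choosing more greedily) raises expected reward. A learner whose temperature drifted downward over trials would be moving in the gradient-ascent direction for that parameter.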
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12324363 | PMC |
| http://dx.doi.org/10.1101/2025.07.28.667308 | DOI Listing |
Dev Psychobiol
September 2025
Department of Psychiatry, University of Illinois Chicago, Chicago, Illinois, USA.
Depressed mothers often experience parenting difficulties, which can persist after their symptoms have remitted. However, not all depressed mothers show parenting struggles, suggesting that there could be unidentified characteristics that increase risk. Specifically, neurobiological models emphasize that reward system deficits contribute to maladaptive parenting and depression, but no studies have evaluated how they could conjointly lead to parenting challenges.
Nat Commun
September 2025
Animal Physiology Unit, Institute of Neurobiology, University of Tübingen, Tübingen, Germany.
Interval timing, the ability to perceive and estimate durations between events, is essential for many animal behaviors. In mammals, it is linked to specific cortical and sub-cortical brain regions, but its neural basis in birds remains unclear. We trained two male carrion crows on a time estimation task using visual stimuli, cueing them to wait for a minimum duration of 1500 ms, 3000 ms, or 6000 ms before responding to receive a reward.
BMJ Lead
September 2025
Green Templeton College, University of Oxford, Oxford, UK.
Background: In 2021, Dr Kalra embraced an opportunity for a leadership role at a start-up healthcare organisation in India. This gave him an opportunity to adapt his National Health Service (NHS) leadership experience to the evolving Indian private healthcare landscape. This paper shares his lived experience as a National Medical Director and delves into the experiences and leadership insights he acquired during this period.
J Safety Res
September 2025
University of Massachusetts Amherst, 160 Governors Drive, Amherst, MA 01002, USA.
Introduction: Effective driver education for teen drivers is increasingly important, especially as Advanced Driver Assistance Systems (ADAS) become standard in modern vehicles. This study examines driver education programs in the Commonwealth of Massachusetts and explores how well they are positioned to prepare young drivers to understand and safely use ADAS technologies.
Method: Through a convergent mixed-methods approach, we analyzed thematic data from interviews and surveys of key stakeholders and performed sentiment analysis to capture their concerns and attitudes.
Int J Med Inform
September 2025
Tenured Professor (Profesora Titular), Universidad de Alicante, Spain.
Background: Immersive Virtual Reality (IVR) is increasingly used in health sciences education to simulate high-risk, low-frequency scenarios such as mass casualty incidents. While prior research has focused on student outcomes, instructors' perceptions of available IVR tools remain underexplored.
Objective: To evaluate instructors' perceptions regarding ease of use, educational value, and technical quality of the "VR-Triage" immersive simulation tool in a disaster and mass casualty incident course.