HMM for discovering decision-making dynamics using reinforcement learning experiments.

Biostatistics

Department of Biostatistics, Columbia University, 722 West 168th St, New York, NY, 10032, United States.

Published: December 2024


Article Abstract

Major depressive disorder (MDD), a leading cause of years of life lived with disability, presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks in the laboratory that involve making choices or responding to stimuli associated with different outcomes, such as gains or losses. Reinforcement learning (RL) models are fitted to extract parameters that measure various aspects of reward processing (e.g., reward sensitivity) and so characterize how patients make decisions in behavioral tasks. Recent findings suggest that a single RL model is inadequate for characterizing reward learning; instead, decision-making may switch between multiple strategies. An important scientific question is how the dynamics of strategies in decision-making affect the reward learning ability of individuals with MDD. Motivated by the probabilistic reward task within the Establishing Moderators and Biosignatures of Antidepressant Response in Clinical Care (EMBARC) study, we propose a novel RL-HMM (hidden Markov model) framework for analyzing reward-based decision-making. Our model accommodates switching between two distinct decision-making strategies under an HMM: subjects either make decisions based on the RL model or opt for random choices. We account for a continuous RL state space and allow time-varying transition probabilities in the HMM. We introduce a computationally efficient expectation-maximization (EM) algorithm for parameter estimation and use a nonparametric bootstrap for inference. Extensive simulation studies validate the finite-sample performance of our method. We apply our approach to the EMBARC study to show that MDD patients are less engaged in RL than healthy controls, and that engagement is associated with brain activity in the negative affect circuitry during an emotional conflict task.
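
As a rough illustration of the two-strategy idea in the abstract, the following Python sketch runs a forward filter over a two-state HMM in which an "engaged" state emits choices from a softmax over Q-values and a "random" state chooses uniformly. It is not the authors' RL-HMM or their EM algorithm: the function names, parameter names, and default values are all hypothetical, the transition probabilities are held constant rather than time-varying, and Q-values are updated on every trial regardless of the hidden state.

```python
import math


def softmax_prob(q, beta, action):
    """Probability of `action` under a softmax over Q-values (inverse temperature beta)."""
    m = max(beta * v for v in q)
    exps = [math.exp(beta * v - m) for v in q]
    return exps[action] / sum(exps)


def engagement_filter(choices, rewards, alpha=0.3, beta=3.0,
                      stay_engaged=0.9, stay_random=0.8, init_engaged=0.5):
    """Return filtered P(engaged at trial t | data up to t) and the log-likelihood.

    Hypothetical simplifications: two actions, constant transitions, and a
    delta-rule Q update applied on every trial.
    """
    q = [0.0, 0.0]
    # trans[i][j] = P(next state j | current state i); 0 = engaged, 1 = random
    trans = [[stay_engaged, 1.0 - stay_engaged],
             [1.0 - stay_random, stay_random]]
    belief = [init_engaged, 1.0 - init_engaged]
    loglik = 0.0
    filtered = []
    for t, (a, r) in enumerate(zip(choices, rewards)):
        if t > 0:
            # Predict step: propagate the belief through the transition matrix.
            belief = [sum(belief[i] * trans[i][j] for i in range(2))
                      for j in range(2)]
        # Emission: softmax choice if engaged, coin flip if random.
        emit = [softmax_prob(q, beta, a), 0.5]
        joint = [belief[s] * emit[s] for s in range(2)]
        norm = sum(joint)
        loglik += math.log(norm)
        belief = [x / norm for x in joint]
        filtered.append(belief[0])
        # Delta-rule update of the chosen action's value.
        q[a] += alpha * (r - q[a])
    return filtered, loglik


# Made-up choice and reward sequences, for illustration only.
choices = [0, 0, 1, 0, 0, 0, 1, 0]
rewards = [1, 1, 0, 1, 0, 1, 0, 1]
filtered, loglik = engagement_filter(choices, rewards)
```

In a full analysis the parameters would be estimated (the abstract mentions EM) rather than fixed, and uncertainty would come from resampling subjects with a nonparametric bootstrap; neither step is shown here.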

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12090054
DOI: http://dx.doi.org/10.1093/biostatistics/kxae033

Publication Analysis

Top Keywords

reward processing: 12
reinforcement learning: 8
behavioral tasks: 8
reward learning: 8
embarc study: 8
reward: 7
decision-making: 5
hmm discovering: 4
discovering decision-making: 4
decision-making dynamics: 4

Similar Publications

Background: Although current evidence supports the effectiveness of social norm feedback (SNF) interventions, their sustained integration into primary care remains limited. Drawing on the elements of the antimicrobial SNF intervention strategy identified through the Delphi-based evidence applicability evaluation, this study aims to explore the barriers and facilitators to its implementation in primary care institutions, thereby informing future optimization.

Methods: Based on the five domains of the Consolidated Framework for Implementation Research (CFIR), we developed semi-structured interview and focus group discussion guides.

Reduction in reward-driven behaviour depends on the basolateral but not central nucleus of the amygdala in female rats.

J Neurosci

September 2025

Center for Studies in Behavioural Neurobiology, Department of Psychology, Concordia University, Montreal, QC, Canada, H4B 1R6

Adaptive behavior depends on a dynamic balance between acquisition and extinction memories. Male and female rodents differ in extinction learning rates, suggesting potential sex-based differences in this balance. In males, deletion of extinction-recruited neurons in the central nucleus (CN) of the amygdala impairs extinction retrieval, shifting behavior toward acquisition (Lay et al.

Objective: To assess biological factors associated with anhedonia in depression and amotivation in cannabis use (PROSPERO: CRD42023422438).

Method: A systematic review was conducted of 8 electronic databases. Inclusion criteria included original research studies that investigated the association of biological factors or behavioral tasks with depression combined with concepts of anhedonia or cannabis combined with concepts of amotivation including apathy.

Neural Correlates of Reward Processing: Impact of Individual Differences in Preference for Prosocial Interactions.

Brain Behav

September 2025

Centre For Cognitive and Clinical Neuroscience, College of Health, Medicine and Life Sciences, Brunel University of London, London, UK.

Introduction: There is an ongoing debate about the neural mechanisms and subjective preferences involved in the processing of social rewards compared to non-social reward types.

Methods: Using whole-brain functional magnetic resonance imaging (fMRI), we examined brain activation patterns during the anticipation and consumption phases of monetary and social rewards (using the Monetary and Social Incentive Delay Task-MSIDT, featuring human avatars) and their associations with self-reported social reward preferences measured by the Social Reward Questionnaire (SRQ) in 20 healthy right-handed individuals.

Results: In the anticipation phase, all reward types activated the dorsal striatum, middle cingulo-insular (salience) network, inferior frontal gyrus (IFG), and supplementary motor areas.

Decision-making often involves evaluating trade-offs between potential rewards and aversive outcomes, engaging both motivational drive and affective judgment. The ventral striatum (VS) and ventral pallidum (VP) are key regions in these processes. While the VS is associated with reward processing and incentive motivation, the VP encodes hedonic value and mediates motivated behaviors.
