Commun Psychol
September 2025
Social connection, a basic human need, is vital during adolescence. How a lack of connection impacts adolescent behaviour is unclear. To address this question, we employed experimental short-term isolation with and without access to virtual social interactions (iso total; iso with media; order counterbalanced, both compared to a separate baseline session).
Proc Natl Acad Sci U S A
September 2025
The tendency to repeat past choices more often than expected from the history of outcomes has been repeatedly empirically observed in reinforcement learning experiments. It can be explained by at least two computational processes: asymmetric update and (gradual) choice perseveration. A recent meta-analysis showed that both mechanisms are detectable in human reinforcement learning.
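The two processes named above can be sketched as follows. This is a minimal illustration, not the authors' model: the function names, the parameter values (`alpha_pos`, `alpha_neg`, `tau`, `beta`, `phi`), and the softmax choice rule are all assumptions made for the example.

```python
import math

def asymmetric_update(q, reward, alpha_pos=0.3, alpha_neg=0.1):
    """Update a value estimate with a larger learning rate for positive
    prediction errors than for negative ones (assumed parameter values)."""
    delta = reward - q                      # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta

def update_choice_trace(trace, chosen, tau=0.2):
    """Gradual perseveration: the trace of the chosen option moves toward 1,
    the traces of the other options decay toward 0, independent of outcomes."""
    return [c + tau * ((1.0 if i == chosen else 0.0) - c)
            for i, c in enumerate(trace)]

def choice_probabilities(q_values, trace, beta=5.0, phi=1.0):
    """Softmax over learned values plus a perseveration bonus (hypothetical weights)."""
    logits = [beta * q + phi * c for q, c in zip(q_values, trace)]
    top = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Under these assumptions, either mechanism alone makes the previously chosen option more likely to be chosen again than the outcome history would predict, which is why the two are hard to tell apart from choice data.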
In-context learning enables large language models (LLMs) to perform a variety of tasks, including solving reinforcement learning (RL) problems. Given their potential use as (autonomous) decision-making agents, it is important to understand how these models behave in RL tasks and the extent to which they are susceptible to biases. Motivated by the fact that, in humans, it has been widely documented that the value of a choice outcome depends on how it compares to other local outcomes, the present study focuses on whether similar value encoding biases apply to LLMs.
The rise of social media has profoundly altered the social world, introducing new behaviors that can satisfy our social needs. However, it is not yet known whether human social strategies, which are well adapted to the offline world we developed in, operate as effectively within this new social environment. Here, we describe how the computational framework of reinforcement learning (RL) can help us to precisely frame this problem and diagnose where behavior-environment mismatches emerge.
Commun Psychol
December 2024
Individuals often rely on the advice of more experienced peers to minimise uncertainty and increase success likelihood. In most domains where knowledge is acquired through experience, advisers are themselves continuously learning. Here we examine the way advising behaviour changes throughout the learning process, and the way individual traits and costs and benefits of giving advice shape this behaviour.
In the present study, we investigate and compare reasoning in large language models (LLMs) and humans, using a selection of cognitive psychology tools traditionally dedicated to the study of (bounded) rationality. We presented to human participants and an array of pretrained LLMs new variants of classical cognitive experiments, and cross-compared their performances. Our results showed that most of the included models presented reasoning errors akin to those frequently ascribed to error-prone, heuristic-based human reasoning.
Trends Cogn Sci
September 2024
While decision theories have evolved over the past five decades, their focus has largely been on choices among a limited number of discrete options, even though many real-world situations have a continuous-option space. Recently, theories have attempted to address decisions with continuous-option spaces, and several computational models have been proposed within the sequential sampling framework to explain how we make a decision in continuous-option space. This article aims to review the main attempts to understand decisions on continuous-option spaces, give an overview of applications of these types of decisions, and present puzzles to be addressed by future developments.
Recent evidence indicates that reward value encoding in humans is highly context dependent, leading to suboptimal decisions in some cases, but whether this computational constraint on valuation is a shared feature of human cognition remains unknown. Here we studied the behaviour of n = 561 individuals from 11 countries of markedly different socioeconomic and cultural makeup. Our findings show that context sensitivity was present in all 11 countries.
Recent advances in the field of machine learning have yielded novel research perspectives in behavioural economics and financial markets microstructure studies. In this paper we study the impact of individual trader learning characteristics on markets using a stock market simulator designed with a multi-agent architecture. Each agent, representing an autonomous investor, trades stocks through reinforcement learning, using a centralized double-auction limit order book.
Background: Drugs like opioids are potent reinforcers thought to co-opt value-based decisions by overshadowing other rewarding outcomes, but how this happens at a neurocomputational level remains elusive. Range adaptation is a canonical process of fine-tuning representations of value based on reward context. Here, we tested whether recent opioid exposure impacts range adaptation in opioid use disorder, potentially explaining why shifting decision making away from drug taking during this vulnerable period is so difficult.
While navigating a fundamentally uncertain world, humans and animals constantly evaluate the probability of their decisions, actions or statements being correct. When explicitly elicited, these confidence estimates typically correlate positively with neural activity in a ventromedial-prefrontal (VMPFC) network and negatively in a dorsolateral and dorsomedial prefrontal network. Here, combining fMRI with a reinforcement-learning paradigm, we leverage the fact that humans are more confident in their choices when seeking gains than when avoiding losses to reveal a functional dissociation: whereas the dorsal prefrontal network correlates negatively with a condition-specific confidence signal, the VMPFC network positively encodes a task-wide confidence signal incorporating the valence-induced bias.
Reinforcement-based adaptive decision-making is believed to recruit fronto-striatal circuits. A critical node of the fronto-striatal circuit is the thalamus. However, direct evidence of its involvement in human reinforcement learning is lacking.
Reinforcement learning research in humans and other species indicates that rewards are represented in a context-dependent manner. More specifically, reward representations seem to be normalized as a function of the value of the alternative options. The dominant view postulates that value context-dependence is achieved via a divisive normalization rule, inspired by perceptual decision-making research.
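As a rough illustration of what a divisive normalization rule looks like, the sketch below divides an outcome by the summed value of all options in its context. The exact functional form differs between studies; the function name and the `sigma` constant are assumptions made for this example and are not taken from the article.

```python
def divisive_normalization(outcome, context_outcomes, sigma=0.0):
    """Return the outcome scaled by the total value of the options in its context.

    `context_outcomes` should include the outcome itself; `sigma` is an
    assumed semi-saturation constant that keeps the denominator positive.
    """
    denominator = sigma + sum(context_outcomes)
    if denominator == 0:
        raise ValueError("normalization is undefined when the denominator is zero")
    return outcome / denominator
```

With `sigma = 0`, an outcome of 10 in a context of [10, 30] and an outcome of 1 in a context of [1, 3] both map to the same normalized value, so the encoded value tracks relative rather than absolute magnitude.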
We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias.
Recent evidence indicates that reward value encoding in humans is highly context-dependent, leading to suboptimal decisions in some cases. But whether this computational constraint on valuation is a shared feature of human cognition remains unknown. To address this question, we studied the behavior of individuals from across 11 countries of markedly different socioeconomic and cultural makeup using an experimental approach that reliably captures context effects in reinforcement learning.
Behavioral results suggest that learning by trial-and-error (i.e., reinforcement learning) relies on a teaching signal, the prediction error, which quantifies the difference between the obtained and the expected reward.
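The prediction error described above can be written as a one-line computation. The sketch below pairs it with a standard delta-rule update for illustration; the `learning_rate` parameter and both function names are assumptions, not details taken from the article.

```python
def prediction_error(obtained_reward, expected_reward):
    """Difference between the obtained and the expected reward."""
    return obtained_reward - expected_reward

def update_expectation(expected_reward, obtained_reward, learning_rate=0.1):
    """Delta-rule sketch: move the expectation a fraction of the way toward
    the obtained reward (learning_rate is an assumed free parameter)."""
    delta = prediction_error(obtained_reward, expected_reward)
    return expected_reward + learning_rate * delta
```

A positive prediction error (outcome better than expected) raises the expectation and a negative one lowers it, which is the sense in which it acts as a teaching signal.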
Standard models of decision-making assume each option is associated with subjective value, regardless of whether this value is inferred from experience (experiential) or explicitly instructed probabilistic outcomes (symbolic). In this study, we present results that challenge the assumption of unified representation of experiential and symbolic value. Across nine experiments, we presented participants with hybrid decisions between experiential and symbolic options.
Behav Neurosci
February 2023
Do we preferentially learn from outcomes that confirm our choices? In recent years, we investigated this question in a series of studies implementing increasingly complex behavioral protocols. The learning rates fitted in experiments featuring partial or complete feedback, as well as free and forced choices, were systematically found to be consistent with a choice-confirmation bias. One of the prominent behavioral consequences of the confirmatory learning rate pattern is choice hysteresis: that is, the tendency to repeat previous choices despite contradictory evidence.
Understanding how learning changes during human development has been one of the long-standing objectives of developmental science. Recently, advances in computational biology have demonstrated that humans display a bias when learning to navigate novel environments through rewards and punishments: they learn more from outcomes that confirm their expectations than from outcomes that disconfirm them. Here, we ask whether confirmatory learning is stable across development, or whether it might be attenuated in developmental stages in which exploration is beneficial, such as in adolescence.
Background: Value-based decision-making impairment in depression is a complex phenomenon: while some studies did find evidence of blunted reward learning and reward-related signals in the brain, others indicate no effect. Here we test whether such reward sensitivity deficits are dependent on the overall value of the decision problem.
Methods: We used a two-armed bandit task with two different contexts, one 'rich' and one 'poor', in which both options were associated with an overall positive or an overall negative expected value, respectively.
Trends Cogn Sci
July 2022
Humans do not integrate new information objectively: outcomes carrying a positive affective value and evidence confirming one's own prior belief are overweighed. Until recently, theoretical and empirical accounts of the positivity and confirmation biases assumed them to be specific to 'high-level' belief updates. We present evidence against this account.