Citations

20

Article Abstract

Introduction: Recent advances in computational neuroscience highlight the significance of prefrontal cortical meta-control mechanisms in facilitating flexible and adaptive human behavior. In addition, hippocampal function, particularly the capacity for mental simulation, proves essential to this adaptive process. Building on these neuroscientific insights, we present a novel neuroscience-inspired reinforcement learning architecture that demonstrates rapid adaptation to environmental dynamics whilst managing variable goal states and state-transition uncertainties.

Methods: The architecture integrates prefrontal meta-control mechanisms with hippocampal replay, which in turn optimizes task performance from limited experience. We evaluated this approach through experimental simulations across three distinct paradigms: the two-stage Markov decision task, which is frequently used in human learning and decision-making research; an established benchmark suite for model-based reinforcement learning; and a variant incorporating multiple goals under uncertainty.

Results: Experimental results demonstrate the architecture's superior performance compared with baseline reinforcement learning algorithms across multiple metrics: average reward, choice optimality, and the number of trials required for success.

Discussion: These findings advance our understanding of computational reinforcement learning whilst contributing to the development of brain-inspired learning agents capable of flexible, goal-directed behavior within dynamic environments.
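To make the two-stage Markov decision task and the idea of learning from replayed experience more concrete, here is a minimal sketch. It is not the architecture described in the abstract: it has no meta-control component, and it uses a plain tabular Dyna-Q style agent (Q-learning plus replay of stored transitions) as a stand-in for replay-based learning. The transition probability, reward probabilities, and all function names are assumptions chosen for illustration.

```python
import random

# Hypothetical task parameters (illustrative only, not taken from the paper).
COMMON_P = 0.7  # probability of the "common" first-stage transition
REWARD_P = {(1, 0): 0.8, (1, 1): 0.2, (2, 0): 0.3, (2, 1): 0.6}


def step(state, action, rng):
    """Two-stage task: state 0 is stage one, states 1 and 2 are stage two."""
    if state == 0:
        common = rng.random() < COMMON_P
        # Action 0 commonly leads to state 1, action 1 commonly to state 2.
        next_state = 1 + action if common else 2 - action
        return next_state, 0.0, False
    # Second stage: a probabilistic binary reward ends the episode.
    reward = 1.0 if rng.random() < REWARD_P[(state, action)] else 0.0
    return None, reward, True


def greedy(q, state):
    """Pick the action with the highest Q-value (ties go to action 0)."""
    return max((0, 1), key=lambda a: q[(state, a)])


def q_update(q, s, a, r, s2, done, alpha, gamma):
    """One-step Q-learning update on a single transition."""
    target = r if done else r + gamma * max(q[(s2, 0)], q[(s2, 1)])
    q[(s, a)] += alpha * (target - q[(s, a)])


def run_dyna_q(episodes=500, replay_steps=10, alpha=0.1, gamma=1.0,
               epsilon=0.1, seed=0):
    """Dyna-Q style loop: learn from each real step, then replay memories."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(3) for a in range(2)}
    memory = []  # stored transitions act as a sample-based model
    total_reward = 0.0
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                action = greedy(q, state)  # exploit
            next_state, reward, done = step(state, action, rng)
            q_update(q, state, action, reward, next_state, done, alpha, gamma)
            memory.append((state, action, reward, next_state, done))
            # Replay: extra updates from remembered transitions.
            for _ in range(replay_steps):
                q_update(q, *rng.choice(memory), alpha, gamma)
            total_reward += reward
            state = next_state
    return q, total_reward / episodes


if __name__ == "__main__":
    q_values, avg_reward = run_dyna_q()
    print("average reward per episode:", avg_reward)
    print("greedy first-stage action:", greedy(q_values, 0))
```

Setting `replay_steps=0` reduces the agent to ordinary Q-learning, which gives a simple baseline for comparing how much the replayed updates help when real experience is limited.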

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11983510
DOI: http://dx.doi.org/10.3389/fncom.2025.1559915

Publication Analysis

Top Keywords

reinforcement learning: 20
prefrontal meta-control: 8
mental simulation: 8
learning agents: 8
dynamic environments: 8
meta-control mechanisms: 8
learning: 7
reinforcement: 5
meta-control incorporating: 4
incorporating mental: 4

Similar Publications

Reward delays are often associated with reduced probability of reward, although standard assessments of delay discounting do not specify the degree of reward certainty. Thus, the extent to which estimates of delay discounting are influenced by uncontrolled variance in perceived reward certainty remains unclear. Here we examine 370 participants who were randomly assigned to complete a delay discounting task in which reward certainty was either unspecified (n = 184) or specified as 100% (n = 186) in the task trials and task instructions.

The increasing dependence on cloud computing as a cornerstone of modern technological infrastructures has introduced significant challenges in resource management. Traditional load-balancing techniques often prove inadequate in addressing cloud environments' dynamic and complex nature, resulting in suboptimal resource utilization and heightened operational costs. This paper presents a novel smart load-balancing strategy incorporating advanced techniques to mitigate these limitations.

Rule following as choice: The role of reinforcement rate and rule accuracy on rule-following behavior.

J Exp Anal Behav

September 2025

Laboratorio de Análisis de la Conducta, Universidad Nacional Autónoma de México. Facultad de Estudios Superiores Iztacala.

Rules can control the listener's behavior, yet few studies have examined variables that quantitatively determine the extent of this control relative to other rules and contingencies. To explore these variables, we employed a novel procedure that required a choice between rules. Participants clicked two buttons on a computer screen to earn points exchangeable for money.

Multiagent Inductive Policy Optimization.

IEEE Trans Neural Netw Learn Syst

September 2025

Policy optimization methods are promising for tackling high-complexity reinforcement learning (RL) tasks with multiple agents. In this article, we derive a general trust region for policy optimization methods by considering the effect of subpolicy combinations among agents in multiagent environments. Based on this trust region, we propose an inductive objective for training the policy function, which ensures that agents learn monotonically improving policies.

In essence, reinforcement learning (RL) solves an optimal control problem (OCP) by employing a neural network (NN) to fit the optimal policy from state to action. The accuracy of policy approximation is often very low in complex control tasks, leading to unsatisfactory control performance compared with online optimal controllers. A primary reason is that the landscape of the value function is always rugged in most areas and also flat at the bottom, which impairs convergence to the minimum point.
