An agent in a nonstationary contextual bandit problem should balance exploration against the exploitation of (periodic or structured) patterns present in its previous experiences. Handcrafting an appropriate historical context is an attractive way to transform a nonstationary problem into a stationary problem that can be solved efficiently. However, even a carefully designed historical context may introduce spurious relationships or lack a convenient representation of crucial information. To address these issues, we propose an approach that learns to represent the relevant context for a decision based solely on the raw history of interactions between the agent and the environment. This approach relies on a combination of features extracted by recurrent neural networks with a contextual linear bandit algorithm based on posterior sampling. Our experiments on a diverse selection of contextual and noncontextual nonstationary problems show that our recurrent approach consistently outperforms its feedforward counterpart, which requires handcrafted historical contexts, while being more widely applicable than conventional nonstationary bandit algorithms. Although it is very difficult to provide theoretical performance guarantees for our new approach, we also prove a novel regret bound for linear posterior sampling with measurement error that may serve as a foundation for future theoretical work.
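The linear posterior sampling component mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the feature vectors are already given (a stand-in for the features a recurrent network would extract), a Gaussian prior on a shared weight vector, and Gaussian reward noise with known variance. All names (`LinearPosteriorSampling`, `run_demo`, `prior_var`, `noise_var`) and the toy environment are hypothetical.

```python
import math
import random


class LinearPosteriorSampling:
    """Thompson sampling with a Gaussian posterior over linear reward weights.

    Assumption: reward = w . x + Gaussian noise with known variance noise_var.
    """

    def __init__(self, dim, prior_var=1.0, noise_var=0.25, seed=0):
        self.dim = dim
        self.noise_var = noise_var
        self.mean = [0.0] * dim
        # Prior covariance: prior_var times the identity matrix.
        self.cov = [[prior_var if i == j else 0.0 for j in range(dim)]
                    for i in range(dim)]
        self.rng = random.Random(seed)

    def _cholesky(self):
        # Lower-triangular factor L such that L L^T equals the covariance.
        d = self.dim
        low = [[0.0] * d for _ in range(d)]
        for i in range(d):
            for j in range(i + 1):
                s = self.cov[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
                if i == j:
                    low[i][j] = math.sqrt(max(s, 1e-12))  # guard against round-off
                else:
                    low[i][j] = s / low[j][j]
        return low

    def sample_weights(self):
        # Draw w ~ N(mean, cov) as mean + L z with z standard normal.
        low = self._cholesky()
        z = [self.rng.gauss(0.0, 1.0) for _ in range(self.dim)]
        return [self.mean[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                for i in range(self.dim)]

    def select(self, arm_features):
        # Act greedily with respect to one posterior sample.
        w = self.sample_weights()
        scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in arm_features]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, x, reward):
        # Rank-one Bayesian linear regression update (Sherman-Morrison form).
        cx = [sum(self.cov[i][j] * x[j] for j in range(self.dim))
              for i in range(self.dim)]
        denom = self.noise_var + sum(xi * ci for xi, ci in zip(x, cx))
        err = reward - sum(mi * xi for mi, xi in zip(self.mean, x))
        self.mean = [m + c * err / denom for m, c in zip(self.mean, cx)]
        self.cov = [[self.cov[i][j] - cx[i] * cx[j] / denom
                     for j in range(self.dim)] for i in range(self.dim)]


def run_demo(steps=500, seed=0):
    """Toy stationary two-armed problem with one-hot features (hypothetical)."""
    env_rng = random.Random(seed + 1)
    true_w = [0.2, 0.8]                # hypothetical ground-truth weights
    arms = [[1.0, 0.0], [0.0, 1.0]]    # one feature vector per arm
    agent = LinearPosteriorSampling(dim=2, seed=seed)
    pulls = [0, 0]
    for _ in range(steps):
        a = agent.select(arms)
        x = arms[a]
        reward = sum(w * xi for w, xi in zip(true_w, x)) + env_rng.gauss(0.0, 0.5)
        agent.update(x, reward)
        pulls[a] += 1
    return agent, pulls
```

Calling `run_demo()` returns the fitted agent and the per-arm pull counts. In a setting like the one the abstract describes, the fixed one-hot vectors would be replaced by features computed from the interaction history, which is the part this sketch deliberately leaves out.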
DOI: http://dx.doi.org/10.1162/neco_a_01539
Bayesian Anal
January 2025
Department of Statistics, University of Washington, Seattle, USA.
We introduce the BREASE framework for the Bayesian analysis of randomized controlled trials with binary treatment and outcome. Approaching the problem from a causal inference perspective, we propose parameterizing the likelihood in terms of the Baseline Risk, Efficacy, and Adverse Side Effects of the treatment, along with a flexible, yet intuitive and tractable jointly independent beta prior distribution on these parameters, which we show to be a generalization of the Dirichlet prior for the joint distribution of potential outcomes. Our approach has a number of desirable characteristics when compared to current mainstream alternatives: (i) it naturally induces prior dependence between expected outcomes in the treatment and control groups; (ii) as the baseline risk, efficacy and risk of adverse side effects are quantities commonly present in the clinicians' vocabulary, the hyperparameters of the prior are directly interpretable, thus facilitating the elicitation of prior knowledge and sensitivity analysis; and (iii) we provide analytical formulae for the marginal likelihood, Bayes factor, and other posterior quantities, as well as an exact posterior sampling algorithm and an accurate and fast data-augmented Gibbs sampler in cases where traditional MCMC fails.
Dev Med Child Neurol
September 2025
Neuropsychology Service, Psychological and Mental Health Services, Great Ormond Street Hospital, London, UK.
Aim: To systematically review neurocognitive outcomes associated with postoperative paediatric cerebellar mutism syndrome (pCMS), comparing children with and without pCMS after posterior fossa tumour surgery, and in relation to moderating demographic and clinical risk factors.
Method: PsycInfo, Medline, and Embase databases were systematically searched up to December 2024. Studies of children aged 2 to 18 years with pCMS who had undergone standardized neurocognitive assessment were included.
J Neurosci
September 2025
Jefferson Moss Rehabilitation Research Institute, Thomas Jefferson University, Elkins Park, PA 19027.
Tool use is a complex motor planning problem. Prior research suggests that planning to use tools involves resolving competition between different tool-related action representations. We therefore reasoned that competition may also be exacerbated with tools for which the motions of the tool and the hand are incongruent (e.
Objective: Previous studies of nerve distribution in the orofacial complex have focused primarily on the anatomic courses of nerve fibers and have rarely addressed the density of nerve distribution. The nerve distribution in the mandible was described in only one report which showed an increase in nerve distribution density moving from the alveolar crest toward the inferior alveolar nerve. However, no previous reports have focused on the nerve distribution density in the maxilla.
Bioinformatics
September 2025
Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, United Kingdom.
Summary: In Bayesian phylogenetic and phylodynamic studies it is common to summarise the posterior distribution of trees with a time-calibrated summary phylogeny. While the maximum clade credibility (MCC) tree is often used for this purpose, we here show that a novel summary tree method, the highest independent posterior subtree reconstruction (HIPSTR), contains consistently better-supported clades than MCC. We also provide faster computational routines for estimating both summary trees in an updated version of TreeAnnotator X, an open-source software program that summarizes the information from a sample of trees and returns many helpful statistics such as individual clade credibilities contained in the summary tree.