The coevolution of signalling is a complex problem within animal behaviour, and is also central to communication between artificial agents. The Sir Philip Sidney game was designed to model this dyadic interaction from an evolutionary biology perspective, and was formulated to demonstrate the emergence of honest signalling. We use Multi-Agent Reinforcement Learning (MARL) to show that in the majority of cases, the resulting behaviour adopted by agents is not that shown in the original derivation of the model. This paper demonstrates that MARL can be a powerful tool to study evolutionary dynamics and understand the underlying mechanisms of learning over generations; particularly advantageous is the interpretability of this type of approach, as well as the fact that it allows us to study emergent behaviour without the need to constrain the strategy space from the outset. Although the game was originally intended to exemplify honest signalling, we show that it provides no incentive for such behaviour. In the majority of cases, the optimal outcome is one that does not require a signal for the resource to be given. This type of interaction is observed within animal behaviour, and is sometimes denoted proactive prosociality. High learning rates and low discount rates in the reinforcement learning model are shown to be optimal for achieving the outcome that maximises both agents' reward, and proximity to the given threshold leads to suboptimal learning.
Source: http://dx.doi.org/10.1371/journal.pcbi.1013302
Sci Adv
September 2025
Movement Disorder and Neuromodulation Unit, Department of Neurology, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany.
Subthalamic deep brain stimulation (STN-DBS) provides unprecedented spatiotemporal precision for the treatment of Parkinson's disease (PD), allowing for direct real-time state-specific adjustments. Inspired by findings from optogenetic stimulation in mice, we hypothesized that STN-DBS can mimic dopaminergic reinforcement of ongoing movement kinematics during stimulation. To investigate this hypothesis, we delivered DBS bursts during particularly fast and slow movements in 24 patients with PD.
Adv Physiol Educ
September 2025
Department of Physiology, University College Cork, Western Gateway Building, Cork, Ireland.
This study aimed to evaluate the effectiveness of online synchronous and asynchronous teaching formats for undergraduate physiology education in a medical program in Ireland, with a specific focus on the use of the LabTutor (Lt) LabStation online laboratory platform for remote access. To understand how the Lt platform was used by students and whether it enhanced their learning experience in physiology, we conducted a survey and questionnaire. We focused on students' access to Lt activities and examined any gender differences in the utilization of, and attitudes towards, these activities in a 'Fundamentals of Medicine' module for first-year medical students (n=65).
J Palliat Med
September 2025
Department of Medicine, Section of Palliative Care, Stanford University School of Medicine, Stanford, California, USA.
A half-day workshop improved palliative care clinicians' ability to integrate psychological concepts into serious illness communication but created demand for longitudinal learning. To pilot "Process Rounds," a four-session, case-based, adapted psychotherapeutic supervision group reinforcing formulation, countertransference, and mindful intervention. Workshop graduates from four cohorts were invited; 25/143 enrolled.
IEEE J Biomed Health Inform
September 2025
This study aims to optimize the dynamic administration regimen of prophylactic enoxaparin in critically ill patients to reduce the risk of VTE, major bleeding, and 30-day all-cause mortality. We developed, and internally and externally validated, an artificial intelligence (AI) policy based on a double dueling deep Q-network, using data from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database (training and internal test set) and the eICU Collaborative Research Database (eICU-CRD, external test set). We compared the performance of the AI policy, the clinician's policy, the weight-tiered policy, and the fixed 40-mg-once-daily (QD) policy.
Cerebellum
September 2025
Neuropsychology and Applied Cognitive Neuroscience Laboratory, State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China.
Reward processing involves several components, including reward anticipation, cost-effort computation, reward consumption, reward sensitivity, and reward learning. Recent research has highlighted the cerebellum's role in reward processing. This study aimed to investigate the effects of cerebellar stimulation on reward processing using high-definition transcranial direct current stimulation (HD-tDCS).