Learning-based policy optimization methods have shown great potential for building general-purpose control systems. However, existing methods still struggle to achieve complex task objectives while ensuring policy safety during the learning and execution phases for black-box systems. To address these challenges, we develop data-driven safe policy optimization (D2SPO), a novel reinforcement learning (RL)-based policy improvement method that jointly learns a control barrier function (CBF) for system safety and a linear temporal logic (LTL)-guided RL algorithm for complex task objectives. Unlike many existing works that assume known system dynamics, D2SPO learns a provably safe CBF for black-box dynamical systems by carefully constructing the data sets and redesigning the loss functions; the CBF continuously evolves for improved system safety as RL interacts with the environment. To deal with complex task objectives, we take advantage of the capability of LTL to represent task progress and develop an LTL-guided RL policy for efficient completion of various tasks with LTL objectives. Extensive numerical and experimental studies demonstrate that D2SPO outperforms most state-of-the-art (SOTA) baselines and can achieve a safety rate of over 95% and task completion rates of nearly 100%. The experiment video is available at https://youtu.be/2RgaH-zcmkY.
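The abstract does not state the form of the learned CBF, so the following is only a minimal sketch of the general idea of a CBF-based safety check, not the D2SPO method. It uses the standard discrete-time CBF condition h(x') >= (1 - alpha) * h(x). The barrier function, the single-integrator dynamics (a known stand-in for the black-box system), the value of alpha, and all function names are assumptions made for illustration.

```python
import numpy as np

def barrier(x, obstacle=np.array([0.0, 0.0]), radius=1.0):
    """Hypothetical barrier h(x): positive outside a circular obstacle, negative inside."""
    diff = x - obstacle
    return float(np.dot(diff, diff) - radius ** 2)

def step(x, u, dt=0.1):
    """Assumed single-integrator dynamics x' = x + dt * u (stand-in for the unknown system)."""
    return x + dt * u

def satisfies_cbf(x, u, alpha=0.5):
    """Discrete-time CBF condition: h(x') >= (1 - alpha) * h(x), with 0 < alpha <= 1."""
    return barrier(step(x, u)) >= (1.0 - alpha) * barrier(x)

def safe_action(x, u_rl, candidates):
    """Return the RL action if it passes the check, else the closest passing candidate."""
    if satisfies_cbf(x, u_rl):
        return u_rl
    passing = [u for u in candidates if satisfies_cbf(x, u)]
    if not passing:
        raise ValueError("no candidate action satisfies the CBF condition")
    return min(passing, key=lambda u: float(np.linalg.norm(u - u_rl)))

# Usage: an aggressive action toward the obstacle is replaced by a milder one.
x = np.array([2.0, 0.0])
u_rl = np.array([-10.0, 0.0])
candidates = [np.array([-1.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
chosen = safe_action(x, u_rl, candidates)
```

In this sketch the safety check only filters actions proposed by the policy. In a data-driven setting such as the one the abstract describes, both the barrier and the transition model would have to be estimated from interaction data rather than written down by hand.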
DOI: http://dx.doi.org/10.1109/TNNLS.2023.3339885
J Med Internet Res
September 2025
University College London, London, United Kingdom.
Background: Online postal self-sampling (OPSS) allows service users to screen for sexually transmitted infections (STIs) by ordering a self-sampling kit online, taking their own samples, returning them to a laboratory for testing, and receiving their results remotely. OPSS availability and use have increased over the past decade, both in the United Kingdom and globally, but OPSS has been adopted in different regions of England at different times and with different models of delivery. It is not known why certain models were chosen or how implementation strategies have influenced outcomes, including the sustainability of OPSS in sexual health service delivery.
Purpose: In Armenia, a lower-middle-income country, cancer causes 21% of all deaths, with over half of cases diagnosed at advanced stages. Without universal health insurance, patients rely on out-of-pocket payments or black-market channels for costly immunotherapies, underscoring the need for real-world data to inform equitable policy reforms.
Methods: We conducted a multicenter, retrospective cohort study of patients who received at least one dose of an immune checkpoint inhibitor (ICI) between January 2017 and December 2023 across six Armenian oncology centers.
PLoS One
September 2025
School of Public Health, College of Health Sciences, Makerere University, Kampala, Uganda.
Background: Despite advances in HIV care, viral load suppression (VLS) among adolescents living with HIV (ALHIV) in Uganda continues to lag behind that of adults, even with the introduction of dolutegravir (DTG)-based regimens, the Youth and Adolescent Peer Supporter (YAPS) model, and community-based approaches. Understanding the factors associated with HIV viral load non-suppression in this population is critical to inform HIV treatment policy. This study assessed the prevalence and predictors of viral load non-suppression among ALHIV aged 10-19 years on DTG-based ART in Soroti City, Uganda.
Integr Environ Assess Manag
September 2025
School of Public Health, Taipei Medical University, New Taipei City, 235040, Taiwan.
Incorporating bioaccessibility into health risk assessments enhances the accuracy of exposure estimates for heavy metal (HM) pollution, supports targeted remediation, and informs public health and policy decisions, particularly for vulnerable populations. Because HM bioaccessibility depends on local soil and geographic characteristics, identifying its relationship with soil properties is crucial for assessing soil pollution potential. Although HM concentrations can be measured relatively easily, bioaccessibility requires complex laboratory procedures, limiting routine applications in regulatory contexts.
Policy optimization methods are promising for tackling high-complexity reinforcement learning (RL) tasks with multiple agents. In this article, we derive a general trust region for policy optimization methods by considering the effect of subpolicy combinations among agents in multiagent environments. Based on this trust region, we propose an inductive objective to train the policy function, which can ensure that agents learn monotonically improving policies.