A PHP Error was encountered

Severity: Warning

Message: file_get_contents(https://...@gmail.com&api_key=61f08fa0b96a73de8c900d749fcb997acc09&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests

Filename: helpers/my_audit_helper.php

Line Number: 197

Backtrace:

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 197
Function: file_get_contents

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 271
Function: simplexml_load_file_from_url

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3165
Function: getPubMedXML

File: /var/www/html/application/controllers/Detail.php
Line: 597
Function: pubMedSearch_Global

File: /var/www/html/application/controllers/Detail.php
Line: 511
Function: pubMedGetRelatedKeyword

File: /var/www/html/index.php
Line: 317
Function: require_once

Multiagent Inductive Policy Optimization. | LitMetric

Multiagent Inductive Policy Optimization.

IEEE Trans Neural Netw Learn Syst

Published: September 2025


Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Policy optimization methods are promising to tackle high-complexity reinforcement learning (RL) tasks with multiple agents. In this article, we derive a general trust region for policy optimization methods by considering the effect of subpolicy combinations among agents in multiagent environments. Based on this trust region, we propose an inductive objective to train the policy function, which can ensure agents learn monotonically improving policies. Furthermore, we observe that the policy always updates very weakly before falling into a local optimum. To address this, we introduce a cost regarding policy distance in the inductive objective to strengthen the motivation of agents to explore new policies. This approach strikes a balance during training, where the policy update step size remains within the constraints of the trust region, preventing excessive updates while avoiding getting stuck in local optima. Simulations on wind farm (WF) control tasks and two multiagent benchmarks demonstrate the high performance of the proposed multiagent inductive policy optimization (MAIPO) method.

Download full-text PDF

Source
http://dx.doi.org/10.1109/TNNLS.2025.3601360DOI Listing

Publication Analysis

Top Keywords

policy optimization
16
trust region
12
multiagent inductive
8
policy
8
inductive policy
8
optimization methods
8
inductive objective
8
multiagent
4
optimization
4
optimization policy
4

Similar Publications