We study regret minimization in online episodic linear Markov Decision Processes and propose a computationally efficient policy optimization algorithm that obtains rate-optimal Õ(√K) regret, where K denotes the number of episodes. Our work is the first to establish the optimal rate of convergence (in terms of K) in the stochastic setting with bandit feedback using a policy-optimization-based approach, and the first to establish the optimal rate in the adversarial setup with full-information feedback, for which no algorithm with an optimal rate guarantee was previously known. Copyright 2024 by the author(s)
Expensive multi-objective optimization problems often allow only a limited number of function evaluations, which makes them challenging to solve with traditional evolutionary algorithms that require large sample sizes. Multi-objective efficie...
This study revives the Turtle Trading Rule, offering new parameters and a more stable strategy for increased profit opportunities. Proposed in the 1980s, the Turtle Trading method is a classic trend trading strategy c...
In recent decades, large-scale optimization has received significant research attention; however, most of these studies have not considered problems with functional constraints. The introduction of constraints signific...
Chaotic Evolution (CE) is a significantly faster and more robust method for solving single-objective and multi-objective optimization problems. However, there are various factors that can impact the performance of CE,...
The importance of the UAV path-planning problem lies in ensuring the safe and efficient completion of tasks by UAVs while maximizing resource utilization and minimizing risks. Addressing this issue, a new algorithm na...
With the wide application of UAVs, the scenarios have become more complex, and UAV tasking is facing more *** of all, a heterogeneous multi-UAV collaborative multi-tasking assignment model is constructed, which includ...
Decision-optimisation problems are ubiquitous in all areas of production practice. The predominant methodology for their resolution is the abstraction of the problem into a mathematical model, followed by the applicat...
The Whale Optimization Algorithm (WOA) is a revolutionary meta-heuristic approach dedicated to intelligent problem exploration and optimization. The communal foraging activities of humpback whales inspired this evolutionary...
To solve the issues of long localization times, low population diversity, slow convergence rates, and weak fault tolerance in the intelligent optimisation algorithms currently in use for fault localization in active d...