This paper tackles the challenge of using multiple robots to search for unknown dynamic targets in complex, large environments. As both the number of robots and the environmental complexity increase, coordinating efficient distributed searches becomes more difficult. Previous methods formulate search either as exploration or as exploiting an initial target distribution to accelerate the process. However, these methods cannot dynamically build or update the target distribution from scratch, and they overlook the differences among already-searched regions during the search. We propose DAPTP, a novel Distributed Awareness Planner using Time Potential for dynamic target search. At the core of DAPTP is the time potential map, which estimates the target distribution in the environment from historical search information. The importance of searched regions is distinguished by their corresponding time potential. Building on this, the coverage and search direction of each robot are planned by maximizing the change in time potential, ensuring that areas with the greatest potential variation receive prioritized attention. We conduct extensive experiments in both simulations and real-world scenarios. The results demonstrate that our approach significantly surpasses state-of-the-art methods in reducing search steps and improving collective environmental awareness, area search rate, number of detected targets, and success rate. The source code is available at: https://***/arclab-hku/DAPTP.
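The abstract's core idea can be sketched as follows. This is a minimal, hypothetical illustration of a "time potential map" in which a cell's potential grows with the time elapsed since it was last searched, and the planner greedily targets the cell whose visit would yield the largest potential change; DAPTP's actual formulation is not given in the abstract, so the class name, update rule, and greedy target selection here are illustrative assumptions.

```python
import numpy as np

class TimePotentialMap:
    """Toy time potential map: each cell remembers when it was last searched."""

    def __init__(self, shape):
        self.last_seen = np.zeros(shape)  # timestep each cell was last searched
        self.t = 0

    def step(self, searched_cells):
        """Advance one timestep and mark the cells covered this step."""
        self.t += 1
        for r, c in searched_cells:
            self.last_seen[r, c] = self.t

    def potential(self):
        # Potential = time elapsed since last search, so long-unsearched
        # regions accumulate the highest potential.
        return self.t - self.last_seen

    def best_cell(self):
        """Greedy target: the cell whose visit causes the largest potential drop."""
        p = self.potential()
        return np.unravel_index(np.argmax(p), p.shape)

m = TimePotentialMap((4, 4))
m.step([(0, 0), (0, 1)])
m.step([(1, 0)])
# Cells that were never searched hold the maximum potential and are
# prioritized next.
print(m.best_cell())
```

A full planner would additionally fuse each robot's sensor footprint and coordinate the team, but the greedy argmax above captures the "maximize change in time potential" intuition.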
The development of machine learning and artificial intelligence algorithms, together with progress in unmanned aerial vehicle swarm technology, has significantly enhanced the intelligence and autonomy of unmanned aerial vehicles in search missions, yielding greater efficiency when searching unknown areas. However, as search scenarios become more complex, existing unmanned aerial vehicle swarm search methods lack scalability and efficient cooperation. Furthermore, as the scale of search scenarios grows, the accuracy and real-time performance of global information are difficult to ensure, necessitating the provision of local information. This paper focuses on the large-scale search scenario and splits it to provide both local and global information for running unmanned aerial vehicle swarm search algorithms. Since the search environment is often unknown, dynamic, and complex, it requires adaptive decision-making in a constantly changing environment, which makes it suitable to model as a Markov decision process. Considering the sequential nature of the scenario, we propose a distributed collaborative search method based on a multi-agent reinforcement learning algorithm that operates efficiently in complex, large-scale scenarios. Additionally, the proposed method uses a convolutional neural network to process high-dimensional map data with almost no loss of structural information. Experimental results demonstrate that the proposed method can collaboratively search unknown areas, avoid collisions and repetitions, and find all targets faster than the benchmarks.
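The splitting of a large-scale map into per-UAV local observations alongside global information might look like the sketch below. The window radius, padding value, and function name are illustrative assumptions, not the paper's parameters; the idea is only that each agent feeds a fixed-size local crop of the global grid to its CNN policy.

```python
import numpy as np

def local_observation(global_map, pos, radius=2):
    """Crop a (2*radius+1)^2 window centered on a UAV's cell.

    Out-of-bounds cells are padded with -1 so the observation shape stays
    fixed regardless of the UAV's position, which a CNN input requires.
    """
    padded = np.pad(global_map, radius, constant_values=-1)
    r, c = pos[0] + radius, pos[1] + radius
    return padded[r - radius:r + radius + 1, c - radius:c + radius + 1]

# Toy 6x6 global map; the agent at the top-left corner still receives a
# full 5x5 patch thanks to the padding.
world = np.arange(36).reshape(6, 6)
obs = local_observation(world, (0, 0))
print(obs.shape)  # (5, 5); this patch is the per-agent CNN input
```

Keeping the observation a 2-D patch (rather than flattening it) is what preserves the spatial structure the abstract says the convolutional network exploits.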
This paper proposes a new approach to the unsupervised learning of multi-agent player systems operating in a high-performance environment, in which cooperative agents are trained to become experts in specific stages of a game. The proposal is implemented in an automatic Checkers player named D-MA-Draughts, which comprises 26 agents. The first agent specializes in the initial and intermediate game stages, whereas the remaining agents are specialists in endgame stages (defined as boards containing at most 12 pieces). Each agent consists of a multilayer neural network trained without human supervision through temporal difference methods. The best move is determined by the distributed search algorithm known as the Young Brothers Wait Concept. Each endgame agent chooses a move for a particular profile of endgame board. These profiles are defined by a clustering process performed by a Kohonen self-organizing map (SOM) on a database of endgame boards retrieved from real matches. Once trained, the D-MA-Draughts agents can act in a match according to two distinct game dynamics. The D-MA-Draughts architecture extends two preliminary versions: MP-Draughts, a multi-agent system with a serial search algorithm, and D-VisionDraughts, a single agent with a distributed search algorithm. The gains of D-MA-Draughts are estimated through several tournaments against these preliminary versions. The results show that D-MA-Draughts improves upon its predecessors by significantly reducing training time and endgame loops, beating them in several tournaments.
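The endgame-profile clustering described above can be sketched with a minimal one-dimensional Kohonen SOM that groups board feature vectors into profiles, one per specialist agent. The vector dimension, number of units, learning rate, and neighborhood schedule here are illustrative assumptions; the paper's actual SOM configuration and board encoding may differ.

```python
import numpy as np

def train_som(boards, n_units=4, epochs=30, lr=0.5, seed=0):
    """Fit a 1-D Kohonen SOM: each unit's weight vector becomes a board profile."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(n_units, boards.shape[1]))
    for epoch in range(epochs):
        # Neighborhood radius shrinks over training, from global to local updates.
        sigma = max(1.0, n_units / 2 * (1 - epoch / epochs))
        for x in boards:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            d = np.abs(np.arange(n_units) - bmu)                  # grid distance to BMU
            h = np.exp(-(d ** 2) / (2 * sigma ** 2))              # neighborhood weight
            weights += lr * h[:, None] * (x - weights)
    return weights

def profile(weights, board):
    """Route an endgame board to its cluster (i.e., to its specialist agent)."""
    return int(np.argmin(np.linalg.norm(weights - board, axis=1)))

# Toy 'boards': two well-separated clusters of 8-dimensional feature vectors.
rng = np.random.default_rng(1)
boards = np.vstack([rng.normal(0, 0.1, (10, 8)), rng.normal(5, 0.1, (10, 8))])
w = train_som(boards)
print(profile(w, boards[0]), profile(w, boards[-1]))
```

After training, `profile` plays the role of the dispatcher: a new endgame position is mapped to the SOM unit it best matches, and the corresponding specialist network evaluates the move.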