We offer a new in-depth investigation of global path planning (GPP) for unmanned ground vehicles, specifically an autonomous mining sampling robot named ROMIE. GPP is essential for ROMIE's optimal performance and is translated into solving the traveling salesman problem, a complex graph-theory challenge that is crucial for determining the most effective route to cover all sampling locations in a mining field. This problem is central to enhancing ROMIE's operational efficiency and competitiveness against human labor by optimizing cost and time. The primary aim of this research is to advance GPP by developing, evaluating, and improving a cost-efficient software and web application. We delve into an extensive comparison and analysis of Google operations research (OR)-Tools optimization algorithms. Our study is driven by the goal of applying and testing the limits of OR-Tools' capabilities by integrating reinforcement learning techniques for the first time. This enables us to compare these methods with OR-Tools, assessing their computational effectiveness and real-world application efficiency. Our analysis seeks to provide insights into the effectiveness and practical application of each technique. Our findings indicate that q-learning stands out as the optimal strategy, demonstrating superior efficiency by deviating only 1.2% on average from the optimal solutions across our datasets. Advancing the global path planning algorithm is studied as a means of transforming geochemical mining sampling with autonomous vehicles. Cutting-edge algorithms are harnessed to solve the intricate traveling salesman problem, optimizing route efficiency. A novel analysis of operations research tools and reinforcement learning techniques is presented, demonstrating q-learning's superior efficiency (codes provided for benchmarking). Technological advancements set a new benchmark for autonomous mining operations. (c) 2024 WILEY-VCH GmbH
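As a rough illustration of the OR-Tools side of this comparison, the sketch below solves a toy TSP over four sampling locations with Google OR-Tools' routing solver. The distance matrix, search parameters, and time limit are invented for illustration and are not the paper's configuration.

```python
# Minimal sketch: TSP over sampling locations with Google OR-Tools.
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Illustrative symmetric distance matrix for four sampling points.
distance_matrix = [
    [0, 12, 9, 20],
    [12, 0, 7, 15],
    [9, 7, 0, 11],
    [20, 15, 11, 0],
]

manager = pywrapcp.RoutingIndexManager(len(distance_matrix), 1, 0)  # 1 vehicle, depot 0
routing = pywrapcp.RoutingModel(manager)

def distance_callback(from_index, to_index):
    # Map solver indices back to node indices before looking up distances.
    return distance_matrix[manager.IndexToNode(from_index)][manager.IndexToNode(to_index)]

transit_idx = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit_idx)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC
params.local_search_metaheuristic = routing_enums_pb2.LocalSearchMetaheuristic.GUIDED_LOCAL_SEARCH
params.time_limit.FromSeconds(1)

solution = routing.SolveWithParameters(params)
index = routing.Start(0)
route = []
while not routing.IsEnd(index):
    route.append(manager.IndexToNode(index))
    index = solution.Value(routing.NextVar(index))
route.append(manager.IndexToNode(index))
print("route:", route, "cost:", solution.ObjectiveValue())
```

On a real field, the matrix would be built from the sampling locations' coordinates, and the time limit would be tuned against the q-learning baseline the paper benchmarks.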
ISBN (digital): 9781665403870
ISBN (print): 9781665403870
As an important component of intelligent manufacturing, the logistics AGV has drawn the attention of many scholars to its path planning problem. At present, path planning algorithms based on reinforcement learning suffer from slow convergence and unstable results: to obtain a better return function, the logistics AGV needs to perform different actions to gain more experience and information. To balance exploration and exploitation, the traditional q-learning algorithm introduces an exploration factor as a probability value into the AGV's action-selection strategy and otherwise selects the state-action pair with the largest q-value each time. This makes the system prone to falling into a local optimum, slows down the convergence rate of the whole process, and causes fluctuations in the final action-selection results. To solve this problem, this paper proposes an improved strategy that dynamically adjusts the exploration factor epsilon, i.e., different values of epsilon are chosen at different stages of reinforcement learning, which better resolves the contradiction between exploration and exploitation. Simulations and real experiments prove that the improved reinforcement learning algorithm converges faster and that the stability of the convergence result is improved.
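A minimal sketch of the idea, assuming a toy grid environment and an invented three-stage epsilon schedule (the paper's actual schedule, environment, and hyperparameters are not reproduced here):

```python
# Sketch: tabular q-learning with a stage-dependent exploration factor.
import numpy as np

n_states, n_actions, n_episodes = 25, 4, 500
alpha, gamma = 0.1, 0.95
Q = np.zeros((n_states, n_actions))
moves = [-1, 1, -5, 5]  # the four actions shift the state index on a 5x5 grid

def epsilon(episode):
    # Dynamic exploration factor: explore widely early, exploit mostly late.
    if episode < 150:
        return 0.9
    if episode < 350:
        return 0.3
    return 0.05

def step(state, action):
    # Placeholder environment with the goal at the last state.
    next_state = min(max(state + moves[action], 0), n_states - 1)
    reward = 1.0 if next_state == n_states - 1 else -0.01
    return next_state, reward, next_state == n_states - 1

for episode in range(n_episodes):
    state, done, steps = 0, False, 0
    while not done and steps < 200:
        if np.random.rand() < epsilon(episode):
            action = np.random.randint(n_actions)   # explore
        else:
            action = int(np.argmax(Q[state]))       # exploit the largest q-value
        next_state, reward, done = step(state, action)
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state
        steps += 1
```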
In this paper we introduce a new approach to discrete-time semi-Markov decision processes based on the sojourn time process. Different characterizations of discrete-time semi-Markov processes are exploited, and decision processes are constructed by their means. With this new approach, the agent is allowed to consider different actions depending also on the sojourn time of the process in the current state. A numerical method based on q-learning algorithms for finite-horizon reinforcement learning and stochastic recursive relations is investigated. Finally, we consider two toy examples: one in which the reward depends on the sojourn time, in accordance with the gambler's fallacy; the other in which the environment is semi-Markov even though the reward function does not depend on the sojourn time. These are used to carry out numerical evaluations of the previously presented q-learning algorithm and of a different naive method based on deep reinforcement learning.
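A minimal sketch of finite-horizon q-learning on a sojourn-time-augmented state, with invented semi-Markov dynamics in which the probability of leaving a state depends on the time already spent there (the paper's exact recursion and examples are not reproduced):

```python
# Sketch: finite-horizon q-learning where the state includes the sojourn time.
import random
from collections import defaultdict

H = 10                       # finite horizon
actions = [0, 1]
alpha, gamma = 0.2, 1.0
# One q-table per decision epoch; keys are (state, sojourn_time) pairs.
Q = [defaultdict(lambda: [0.0, 0.0]) for _ in range(H)]

def env_step(state, sojourn, action):
    # Illustrative semi-Markov dynamics: the chance of leaving a state
    # grows with the sojourn time already spent in it.
    stay_prob = max(0.1, 0.8 - 0.1 * sojourn)
    if random.random() < stay_prob:
        return state, sojourn + 1, -0.1            # remain; sojourn clock ticks
    return 1 - state, 1, 1.0 if action == 1 else 0.0

for episode in range(2000):
    state, sojourn = 0, 1
    for t in range(H):
        key = (state, sojourn)
        if random.random() < 0.2:
            a = random.choice(actions)              # explore
        else:
            a = Q[t][key].index(max(Q[t][key]))     # exploit
        ns, nsoj, r = env_step(state, sojourn, a)
        # Backward value at the horizon is just the immediate reward.
        target = r if t == H - 1 else r + gamma * max(Q[t + 1][(ns, nsoj)])
        Q[t][key][a] += alpha * (target - Q[t][key][a])
        state, sojourn = ns, nsoj
```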
ISBN (print): 9798400709418
This paper looks at interference-aware spectrum allocation in 6G cellular networks. A novel dynamic resource-sharing algorithm is proposed, aiming to use the available spectral resources efficiently and to minimize the interference common to multiple technologies or users. The proposed algorithm consists of two sub-algorithms: the first phase is a channel-selection algorithm, which selects the best channels for each licensee based on their signal-to-interference-plus-noise ratio (SINR) and interference levels. The second phase is an optimization algorithm, which promotes the most valuable spectrum access and resource allocation in line with the interference requirements of the specific user or network. The proposed algorithm follows the access strategy described in the 3GPP 5G-NR specification, in which fast spectrum allocation and resource-sharing principles across multiple licensees are used to maximize spectrum usage. Results suggest that the algorithm can achieve effective spectrum utilization while providing high levels of interference mitigation. The proposed system offers a promising technique to enhance 6G spectrum allocation and is expected to be an attractive solution for operators seeking to deploy a dynamic, interference-resistant communications service.
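A rough sketch of the first, SINR-based channel-selection phase as a greedy assignment; the power, interference, and noise values and the one-channel-per-user rule are illustrative assumptions, not the paper's algorithm:

```python
# Sketch: greedy SINR-based channel selection across users.
import numpy as np

rng = np.random.default_rng(0)
n_users, n_channels = 4, 6
signal = rng.uniform(1.0, 5.0, (n_users, n_channels))        # received signal power
interference = rng.uniform(0.1, 1.0, (n_users, n_channels))  # co-channel interference
noise = 0.05

sinr = signal / (interference + noise)
assignment, taken = {}, set()
# Users pick channels in order of their best achievable SINR, one channel each,
# so high-interference channels are avoided whenever alternatives exist.
for user in np.argsort(-sinr.max(axis=1)):
    best = max((c for c in range(n_channels) if c not in taken),
               key=lambda c: sinr[user, c])
    assignment[int(user)] = best
    taken.add(best)
print(assignment)
```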
One desired aspect of microservice architecture is the ability to self-adapt its own architecture and behavior in response to changes in the operational environment. To achieve the desired high levels of self-adaptability, this research implements a distributed microservice architecture model running on a swarm cluster, as informed by the Monitor, Analyze, Plan, and Execute over a shared Knowledge (MAPE-K) model. The proposed architecture employs multiple adaptation agents supported by a centralized controller, which can observe the environment and execute a suitable adaptation action. The adaptation planning is managed by a deep recurrent q-learning network (DRqN). It is argued that such integration between DRqN and Markov decision process (MDP) agents in a MAPE-K model offers a distributed microservice architecture with self-adaptability and high levels of availability and scalability. Integrating DRqN into the adaptation process improves the effectiveness of the adaptation and reduces adaptation risks, including resource overprovisioning and thrashing. The performance of DRqN is evaluated against deep q-learning and policy gradient algorithms, including (1) a deep q-learning network (DqN), (2) a dueling DqN (DDqN), (3) a policy gradient neural network, and (4) deep deterministic policy gradient. The DRqN implementation in this paper outperforms the aforementioned algorithms in terms of total reward, adaptation time, error rate, and convergence and training time. We strongly believe that DRqN is more suitable for driving the adaptation in a distributed service-oriented architecture and offers better performance than other dynamic decision-making algorithms.
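For orientation, a minimal PyTorch sketch of the recurrent q-network at the core of such a DRqN: an LSTM over a window of monitored metrics followed by a q-value head. Layer sizes, the observation window, and the four-action space are illustrative; the MAPE-K integration and training loop are omitted.

```python
# Sketch: recurrent q-network (LSTM + q-value head) for adaptation planning.
import torch
import torch.nn as nn

class DRQN(nn.Module):
    def __init__(self, obs_dim=8, hidden_dim=64, n_actions=4):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, seq_len, obs_dim) window of monitored metrics.
        out, hidden = self.lstm(obs_seq, hidden)
        q_values = self.q_head(out[:, -1])  # q-values from the last time step
        return q_values, hidden

# Example: pick an adaptation action (e.g., scale out/in, migrate, no-op)
# from a 10-step window of 8 monitored metrics per step.
net = DRQN()
obs = torch.randn(1, 10, 8)
q, _ = net(obs)
action = int(q.argmax(dim=1))
```

The recurrence is the design point: by summarizing a history of observations, the network can act on trends (e.g., sustained load growth) that a feedforward DqN observing a single snapshot would miss.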
ISBN (print): 9781424480166
This paper deals with mobile-centered decision making in heterogeneous networks, where intelligent mobile terminals take autonomous decisions about the JRRM actions, which consist of connecting to one of the available systems. This distributed decision making is made possible by q-learning algorithms implemented within the mobile terminals, which enable them to profit from their past experience in order to enhance their subsequent decisions. We develop an original Markovian model that allows us to analytically study the evolution of the q-learning process, and we show how the performance is enhanced until convergence.
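A toy sketch of such terminal-side learning, here reduced to a stateless q-update over the candidate systems with invented reward statistics (the paper's Markovian model is richer than this):

```python
# Sketch: a mobile terminal learning which system to attach to from experience.
import random

systems = ["WLAN", "cell_A", "cell_B"]          # hypothetical available systems
q = {s: 0.0 for s in systems}
alpha, eps = 0.1, 0.1

def observed_reward(system):
    # Placeholder for the QoS the terminal actually measures after attaching.
    means = {"WLAN": 5.0, "cell_A": 2.0, "cell_B": 3.0}
    return random.gauss(means[system], 1.0)

for session in range(1000):
    if random.random() < eps:
        choice = random.choice(systems)          # occasionally try another system
    else:
        choice = max(q, key=q.get)               # otherwise use past experience
    q[choice] += alpha * (observed_reward(choice) - q[choice])

print(max(q, key=q.get))  # the system the terminal has learned to prefer
```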