ISBN (print): 9781665435741
Crowdtesting has been one of the hot spots in artificial intelligence in recent years, and multi-agent crowdtesting systems are one way to deal with the complex problems that arise during the crowdtesting process. How to improve efficiency while ensuring the robustness of the system has become an urgent issue. This paper takes task assignment in the crowdtesting process as its research background. Building on a single-agent baseline, it proposes a multi-agent collaboration framework, designs an imperfect-information sharing mechanism combined with q-learning, and finally optimizes the algorithm. We performed relevant simulation experiments. Compared with traditional machine learning algorithms on indicators such as robustness and adaptability, our algorithm, a q-learning model based on imperfect information under multi-agent crowdtesting (qMIMC), performed well. In addition, this paper provides a reference for the application of multi-agent systems in crowdtesting systems.
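The abstract does not spell out the sharing mechanism, but the core ideas (per-agent tabular q-learning for task assignment plus partial exchange of learned values) can be sketched as below. The class name, the epsilon-greedy policy, the sharing fraction, and the averaging rule are illustrative assumptions, not the paper's actual qMIMC design.

```python
import random
from collections import defaultdict

class CrowdtestingAgent:
    """Tabular q-learning agent that assigns tasks to crowd workers (illustrative)."""

    def __init__(self, n_workers, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)          # Q[(task_state, worker)] -> value
        self.n_workers = n_workers
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose_worker(self, task_state):
        # epsilon-greedy assignment of the current task to a worker
        if random.random() < self.epsilon:
            return random.randrange(self.n_workers)
        return max(range(self.n_workers), key=lambda w: self.q[(task_state, w)])

    def update(self, task_state, worker, reward, next_state):
        # standard q-learning temporal-difference update
        best_next = max(self.q[(next_state, w)] for w in range(self.n_workers))
        target = reward + self.gamma * best_next
        self.q[(task_state, worker)] += self.alpha * (target - self.q[(task_state, worker)])

    def share_imperfect(self, other, fraction=0.3):
        # "imperfect-information" sharing: only a random subset of entries is
        # exchanged, so no agent ever sees a peer's full value table
        keys = list(self.q)
        for key in random.sample(keys, k=int(fraction * len(keys))):
            other.q[key] = 0.5 * (other.q[key] + self.q[key])
```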
Unmanned aerial vehicle (UAV) path planning can be treated as a nondeterministic polynomial (NP)-hard problem or an optimization problem. Conventional approaches are unable to handle it effectively because of discontinuity, non-linearity, multi-modality, and inseparability. Meta-heuristic algorithms, on the other hand, are effective at tackling these issues because they are simple, adaptable, and derivative free. To enhance performance in a variety of challenging circumstances, this paper proposes a novel q-learning-based multi-objective sheep flock optimizer with a Cauchy operator (q-MOSFO-CA) to solve constrained UAV path planning problems. The multi-objective functions considered here are costs and constraints (threat, terrain, turning, climbing, and gliding constraints) used to determine a feasible and optimal path. To reduce the probability of falling into local optima, to address the shortcoming of unbalanced convergence, and to maintain the exploitation and exploration capability, the Cauchy operator (CA) is integrated with the sheep flock optimization (SFO) algorithm. The q-learning model is introduced to balance the global and local searches: the exploration model performs the global search, whereas the exploitation model performs the local search to attain an optimal solution. In the simulations, statistical analysis is conducted under two scenarios, and essential measures such as the number of iterations at convergence (NIC), evaluation time (ET), energy consumption, and convergence behavior are reported. The proposed method obtains an NIC of 1305 and 1436, an ET of 12.8 and 15.2 s, and energy consumption of 20,600 and 21,465 J for Scenarios 1 and 2, respectively. A novel technique (q-MOSFO-CA) for efficient UAV path planning is thus introduced to ensure the vehicle's safety more accurately. To propose a multi-objective sheep flock optimization with Cauchy operator (MOSFO-CA) technique, multi-objec…
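A minimal sketch of the two ingredients the abstract names: a heavy-tailed Cauchy perturbation for global search and a q-learning switch that decides, per candidate path, whether to explore (Cauchy jump) or exploit (move toward the current best). The single-state design, the +/-1 reward, and the cost() aggregation are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def cauchy_perturb(position, scale=0.5):
    # heavy-tailed Cauchy jumps help escape local optima (global search)
    return position + scale * np.random.standard_cauchy(size=position.shape)

def exploit_step(position, best, step=0.3):
    # local move toward the current best candidate path
    return position + step * (best - position)

def q_select_phase(q_row, epsilon=0.1):
    # q-learning chooses between global search (action 0) and local search (action 1)
    if np.random.rand() < epsilon:
        return np.random.randint(2)
    return int(np.argmax(q_row))

def iterate(population, best, q_table, state, cost, alpha=0.1, gamma=0.9):
    # one iteration over a population of candidate UAV paths; each row is a
    # flattened waypoint sequence, and cost() is assumed to aggregate path
    # length plus threat/terrain/turn/climb/glide penalties
    for i, pos in enumerate(population):
        action = q_select_phase(q_table[state])
        candidate = cauchy_perturb(pos) if action == 0 else exploit_step(pos, best)
        reward = 1.0 if cost(candidate) < cost(pos) else -1.0
        if reward > 0:
            population[i] = candidate
        q_table[state, action] += alpha * (reward + gamma * q_table[state].max()
                                           - q_table[state, action])
    return population, q_table
```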
The influence maximization problem, which has attracted great attention in social network analysis, aims at selecting a small set of influential spreaders so that the information cascade triggered by the seed set is maximized. The majority of existing works focus on developing single-stage seeding strategies that ignite all the seeds before the influence spreads. However, this cannot capture practical scenarios in which one would like to make further decisions based on observed activations. In this paper, we investigate policies for the intractable sequential influence maximization problem. We propose a q-learning-driven discrete differential evolution algorithm in which the reinforcement q-learning model is treated as a parameter controller that adaptively adjusts the parameters during the evolution of the algorithm. The policy distributes seeding actions over the spreading process by dynamically estimating the latest node status of the network. Extensive simulations are conducted on six real-world social networks, and the findings demonstrate the superiority and effectiveness of the hybrid meta-heuristic algorithm compared with state-of-the-art methods.
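The abstract casts q-learning as a parameter controller for the discrete differential evolution loop. A minimal sketch of such a controller follows; the (F, CR) parameter pool, the epsilon-greedy selection, and the improvement-based reward are assumptions, since the paper's exact state/action design is not given in the abstract.

```python
import random
import numpy as np

# candidate (F, CR) settings the controller can pick from; the actual pool
# used in the paper is not stated in the abstract
PARAM_POOL = [(0.4, 0.1), (0.5, 0.5), (0.9, 0.9)]

def controller_pick(q_row, epsilon=0.1):
    # q-learning controller: pick an (F, CR) pair for the next DE generation
    if random.random() < epsilon:
        return random.randrange(len(PARAM_POOL))
    return int(np.argmax(q_row))

def controller_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # reward = improvement in the best seed set's estimated influence spread
    td = reward + gamma * q_table[next_state].max() - q_table[state, action]
    q_table[state, action] += alpha * td

# inside the DE loop (sketch):
#   action = controller_pick(q_table[state])
#   F, CR = PARAM_POOL[action]
#   ... apply discrete mutation/crossover to candidate seed sets with F and CR ...
#   controller_update(q_table, state, action, reward, next_state)
```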
This study explores the impact of aging on reinforcement learning in mice, focusing on changes in learning rates and behavioral strategies. A 5-armed bandit task (5-ABT) and a computational q-learning model were used to evaluate the positive and negative learning rates and the inverse temperature across three age groups (3, 12, and 18 months). Results showed a significant decline in the negative learning rate of 18-month-old mice that was not observed for the positive learning rate. This suggests that older mice maintain the ability to learn from successful experiences while their ability to learn from negative outcomes declines. We also observed a significant age-dependent variation in the inverse temperature, reflecting a shift in action selection policy. Middle-aged mice (12 months) exhibited a higher inverse temperature than both younger and older mice, indicating greater reliance on previously rewarding experiences and reduced exploratory behavior. This study provides new insights into aging research by demonstrating that there are age-related differences in specific components of reinforcement learning, and that these differences follow a non-linear pattern with age.
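The behavioral model described (separate learning rates for positive and negative prediction errors, plus a softmax choice rule governed by the inverse temperature) can be sketched as below. The reward probabilities and parameter values in the simulation loop are invented for illustration and are not the study's fitted values.

```python
import numpy as np

def softmax_choice(q_values, beta):
    # action probabilities under inverse temperature beta; a higher beta means
    # stronger exploitation of previously rewarded arms
    p = np.exp(beta * (q_values - q_values.max()))
    return p / p.sum()

def update_q(q_values, arm, reward, alpha_pos, alpha_neg):
    # separate learning rates for positive and negative prediction errors
    delta = reward - q_values[arm]
    alpha = alpha_pos if delta > 0 else alpha_neg
    q_values[arm] += alpha * delta
    return q_values

# illustrative 5-armed bandit session
rng = np.random.default_rng(0)
p_reward = np.array([0.2, 0.3, 0.5, 0.7, 0.9])
q = np.zeros(5)
for _ in range(200):
    arm = rng.choice(5, p=softmax_choice(q, beta=3.0))
    reward = float(rng.random() < p_reward[arm])
    q = update_q(q, arm, reward, alpha_pos=0.4, alpha_neg=0.2)
```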
In this article, we present a q-learning-enabled safe navigation system, S-Nav, that recommends routes in a road network by minimizing travel through categorically demarcated COVID-19 hotspots. S-Nav takes the source and destination as inputs from the commuter and recommends a safe path for traveling. The system dodges hotspots and ensures minimal passage through them in unavoidable situations, reducing the commuter's risk of being exposed to these contaminated zones and contracting the virus. To achieve this, we formulate the reward function for the reinforcement learning model by imposing zone-based penalties and demonstrate that S-Nav achieves convergence under all conditions. To ensure real-time results, we propose an Internet of Things (IoT)-based architecture incorporating the cloud and fog computing paradigms: the cloud is responsible for training on large road networks, while geographically aware fog nodes take the results from the cloud and retrain them on smaller road networks. Through extensive implementation and experiments, we observe that S-Nav recommends reliable paths in near real time. In contrast to state-of-the-art techniques, S-Nav limits passage through red/orange zones to roughly 2% of a route and keeps close to 100% of travel in green zones, at the cost of about 18% additional travel distance compared to the riskier shortest paths.
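The abstract's central modeling choice, a reward function with zone-based penalties, can be sketched with tabular q-learning over a road graph. The penalty magnitudes, goal bonus, and graph representation below are assumptions for illustration, not S-Nav's actual values or its IoT/fog architecture.

```python
import random
from collections import defaultdict

# hypothetical zone penalties; the magnitudes used by S-Nav are not stated
ZONE_PENALTY = {"green": -1, "orange": -50, "red": -100}
GOAL_REWARD = 100

def reward(next_node, zone_of, destination):
    # small step cost in green zones, heavy penalties for orange/red hotspots,
    # and a bonus for reaching the destination
    r = ZONE_PENALTY[zone_of[next_node]]
    return r + GOAL_REWARD if next_node == destination else r

def train(graph, zone_of, source, destination,
          episodes=500, alpha=0.1, gamma=0.95, epsilon=0.2):
    # tabular q-learning over a road graph given as {node: [neighbor, ...]}
    q = defaultdict(float)
    for _ in range(episodes):
        node = source
        while node != destination:
            nbrs = graph[node]
            if random.random() < epsilon:
                nxt = random.choice(nbrs)
            else:
                nxt = max(nbrs, key=lambda n: q[(node, n)])
            r = reward(nxt, zone_of, destination)
            best_next = max((q[(nxt, n)] for n in graph.get(nxt, [])), default=0.0)
            q[(node, nxt)] += alpha * (r + gamma * best_next - q[(node, nxt)])
            node = nxt
    return q
```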