ISBN (print): 9781665435741
Crowdtesting has been one of the hot spots in artificial intelligence in recent years, and multi-agent crowdtesting systems are one way to deal with the complex problems that arise during the crowdtesting process. How to improve efficiency while ensuring the robustness of the system has become an urgent issue. This paper takes task assignment in the crowdtesting process as its research background. Building on a single-agent baseline, it proposes a multi-agent collaboration framework, designs an imperfect-information sharing mechanism combined with q-learning, and finally optimizes the algorithm. We performed relevant simulation experiments. Compared with traditional machine learning algorithms on indicators such as robustness and adaptability, our algorithm, a q-learning model based on imperfect information under multi-agent crowdtesting (qMIMC), performed well. In addition, this paper provides a reference for the application of multi-agent systems in crowdtesting systems.
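The abstract does not spell out the sharing mechanism, but the core ideas (per-agent tabular q-learning for task assignment plus partial exchange of learned values) can be sketched as below. The class name, the epsilon-greedy policy, the sharing fraction, and the averaging rule are illustrative assumptions, not the paper's actual qMIMC design.

```python
import random
from collections import defaultdict

class CrowdtestingAgent:
    """Tabular q-learning agent that assigns tasks to crowd workers (illustrative)."""

    def __init__(self, n_workers, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)          # Q[(task_state, worker)] -> value
        self.n_workers = n_workers
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose_worker(self, task_state):
        # epsilon-greedy assignment of the current task to a worker
        if random.random() < self.epsilon:
            return random.randrange(self.n_workers)
        return max(range(self.n_workers), key=lambda w: self.q[(task_state, w)])

    def update(self, task_state, worker, reward, next_state):
        # standard q-learning temporal-difference update
        best_next = max(self.q[(next_state, w)] for w in range(self.n_workers))
        target = reward + self.gamma * best_next
        self.q[(task_state, worker)] += self.alpha * (target - self.q[(task_state, worker)])

    def share_imperfect(self, other, fraction=0.3):
        # "imperfect-information" sharing: only a random subset of entries is
        # exchanged, so no agent ever sees a peer's full value table
        keys = list(self.q)
        for key in random.sample(keys, k=int(fraction * len(keys))):
            other.q[key] = 0.5 * (other.q[key] + self.q[key])
```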
Unmanned aerial vehicle (UAV) path planning can be treated as a nondeterministic polynomial (NP)-hard problem or an optimization problem. Conventional approaches are unable to handle it effectively because of discontinuity, non-linearity, multi-modality, and inseparability. Meta-heuristic algorithms, on the other hand, are effective at tackling these issues because they are simple, adaptable, and derivative free. To enhance performance in a variety of challenging circumstances, this paper proposes a novel q-learning-based multi-objective sheep flock optimizer with a Cauchy operator (q-MOSFO-CA) to solve constrained UAV path planning problems. The multi-objective functions considered here are costs and constraints (threat, terrain, turning, climbing, and gliding constraints) used to determine a feasible and optimal path. To reduce the probability of falling into local optima, to address the shortcoming of unbalanced convergence, and to maintain the exploitation and exploration capability, the Cauchy operator (CA) is integrated with the sheep flock optimization (SFO) algorithm. The q-learning model is introduced to balance the global and local searches: the exploration model performs the global search, whereas the exploitation model performs the local search to attain an optimal solution. In the simulations, statistical analysis is conducted under two scenarios, and essential measures such as the number of iterations at convergence (NIC), evaluation time (ET), energy consumption, and convergence behavior are reported. The proposed method obtains an NIC of 1305 and 1436, an ET of 12.8 and 15.2 s, and energy consumption of 20,600 and 21,465 J for Scenarios 1 and 2, respectively. A novel technique (q-MOSFO-CA) for efficient UAV path planning is thus introduced to ensure the vehicle's safety more accurately. To propose a multi-objective sheep flock optimization with Cauchy operator (MOSFO-CA) technique, multi-objec…
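A minimal sketch of the two ingredients the abstract names: a heavy-tailed Cauchy perturbation for global search and a q-learning switch that decides, per candidate path, whether to explore (Cauchy jump) or exploit (move toward the current best). The single-state design, the +/-1 reward, and the cost() aggregation are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def cauchy_perturb(position, scale=0.5):
    # heavy-tailed Cauchy jumps help escape local optima (global search)
    return position + scale * np.random.standard_cauchy(size=position.shape)

def exploit_step(position, best, step=0.3):
    # local move toward the current best candidate path
    return position + step * (best - position)

def q_select_phase(q_row, epsilon=0.1):
    # q-learning chooses between global search (action 0) and local search (action 1)
    if np.random.rand() < epsilon:
        return np.random.randint(2)
    return int(np.argmax(q_row))

def iterate(population, best, q_table, state, cost, alpha=0.1, gamma=0.9):
    # one iteration over a population of candidate UAV paths; each row is a
    # flattened waypoint sequence, and cost() is assumed to aggregate path
    # length plus threat/terrain/turn/climb/glide penalties
    for i, pos in enumerate(population):
        action = q_select_phase(q_table[state])
        candidate = cauchy_perturb(pos) if action == 0 else exploit_step(pos, best)
        reward = 1.0 if cost(candidate) < cost(pos) else -1.0
        if reward > 0:
            population[i] = candidate
        q_table[state, action] += alpha * (reward + gamma * q_table[state].max()
                                           - q_table[state, action])
    return population, q_table
```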
The influence maximization problem, which has attracted great attention in social network analysis, aims at selecting a small set of influential spreaders so that the information cascade triggered by the seed set is maximized. The majority of existing works focus on developing single-stage seeding strategies that ignite all the seeds before the influence spreads. However, this cannot capture practical scenarios in which one would like to make further decisions based on observed activations. In this paper, we investigate policies for the intractable sequential influence maximization problem. We propose a q-learning-driven discrete differential evolution algorithm in which the reinforcement q-learning model is treated as a parameter controller that adaptively adjusts the parameters during the evolution of the algorithm. The policy distributes seeding actions over the spreading process by dynamically estimating the latest node status of the network. Extensive simulations are conducted on six real-world social networks, and the findings demonstrate the superiority and effectiveness of the hybrid meta-heuristic algorithm compared with state-of-the-art methods.
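The abstract casts q-learning as a parameter controller for the discrete differential evolution loop. A minimal sketch of such a controller follows; the (F, CR) parameter pool, the epsilon-greedy selection, and the improvement-based reward are assumptions, since the paper's exact state/action design is not given in the abstract.

```python
import random
import numpy as np

# candidate (F, CR) settings the controller can pick from; the actual pool
# used in the paper is not stated in the abstract
PARAM_POOL = [(0.4, 0.1), (0.5, 0.5), (0.9, 0.9)]

def controller_pick(q_row, epsilon=0.1):
    # q-learning controller: pick an (F, CR) pair for the next DE generation
    if random.random() < epsilon:
        return random.randrange(len(PARAM_POOL))
    return int(np.argmax(q_row))

def controller_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    # reward = improvement in the best seed set's estimated influence spread
    td = reward + gamma * q_table[next_state].max() - q_table[state, action]
    q_table[state, action] += alpha * td

# inside the DE loop (sketch):
#   action = controller_pick(q_table[state])
#   F, CR = PARAM_POOL[action]
#   ... apply discrete mutation/crossover to candidate seed sets with F and CR ...
#   controller_update(q_table, state, action, reward, next_state)
```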
This study explores the impact of aging on reinforcement learning in mice, focusing on changes in learning rates and behavioral strategies. A 5-armed bandit task (5-ABT) and a computational q-learning model were used to evaluate the positive and negative learning rates and the inverse temperature across three age groups (3, 12, and 18 months). Results showed a significant decline in the negative learning rate of 18-month-old mice that was not observed for the positive learning rate. This suggests that older mice maintain the ability to learn from successful experiences while their ability to learn from negative outcomes declines. We also observed a significant age-dependent variation in the inverse temperature, reflecting a shift in action selection policy. Middle-aged mice (12 months) exhibited a higher inverse temperature than both younger and older mice, indicating greater reliance on previously rewarding experiences and reduced exploratory behavior. This study provides new insights into aging research by demonstrating that there are age-related differences in specific components of reinforcement learning, and that these differences follow a non-linear pattern with age.
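The behavioral model described (separate learning rates for positive and negative prediction errors, plus a softmax choice rule governed by the inverse temperature) can be sketched as below. The reward probabilities and parameter values in the simulation loop are invented for illustration and are not the study's fitted values.

```python
import numpy as np

def softmax_choice(q_values, beta):
    # action probabilities under inverse temperature beta; a higher beta means
    # stronger exploitation of previously rewarded arms
    p = np.exp(beta * (q_values - q_values.max()))
    return p / p.sum()

def update_q(q_values, arm, reward, alpha_pos, alpha_neg):
    # separate learning rates for positive and negative prediction errors
    delta = reward - q_values[arm]
    alpha = alpha_pos if delta > 0 else alpha_neg
    q_values[arm] += alpha * delta
    return q_values

# illustrative 5-armed bandit session
rng = np.random.default_rng(0)
p_reward = np.array([0.2, 0.3, 0.5, 0.7, 0.9])
q = np.zeros(5)
for _ in range(200):
    arm = rng.choice(5, p=softmax_choice(q, beta=3.0))
    reward = float(rng.random() < p_reward[arm])
    q = update_q(q, arm, reward, alpha_pos=0.4, alpha_neg=0.2)
```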
In this article, we present a q-learning-enabled safe navigation system, S-Nav, that recommends routes in a road network by minimizing travel through categorically demarcated COVID-19 hotspots. S-Nav takes the source and destination as inputs from the commuter and recommends a safe path for traveling. The system dodges hotspots and ensures minimal passage through them in unavoidable situations, reducing the commuter's risk of being exposed to these contaminated zones and contracting the virus. To achieve this, we formulate the reward function for the reinforcement learning model by imposing zone-based penalties and demonstrate that S-Nav achieves convergence under all conditions. To ensure real-time results, we propose an Internet of Things (IoT)-based architecture incorporating the cloud and fog computing paradigms: the cloud is responsible for training on large road networks, while geographically aware fog nodes take the results from the cloud and retrain them on smaller road networks. Through extensive implementation and experiments, we observe that S-Nav recommends reliable paths in near real time. In contrast to state-of-the-art techniques, S-Nav limits passage through red/orange zones to roughly 2% of a route and keeps close to 100% of travel in green zones, at the cost of about 18% additional travel distance compared to the riskier shortest paths.
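The abstract's central modeling choice, a reward function with zone-based penalties, can be sketched with tabular q-learning over a road graph. The penalty magnitudes, goal bonus, and graph representation below are assumptions for illustration, not S-Nav's actual values or its IoT/fog architecture.

```python
import random
from collections import defaultdict

# hypothetical zone penalties; the magnitudes used by S-Nav are not stated
ZONE_PENALTY = {"green": -1, "orange": -50, "red": -100}
GOAL_REWARD = 100

def reward(next_node, zone_of, destination):
    # small step cost in green zones, heavy penalties for orange/red hotspots,
    # and a bonus for reaching the destination
    r = ZONE_PENALTY[zone_of[next_node]]
    return r + GOAL_REWARD if next_node == destination else r

def train(graph, zone_of, source, destination,
          episodes=500, alpha=0.1, gamma=0.95, epsilon=0.2):
    # tabular q-learning over a road graph given as {node: [neighbor, ...]}
    q = defaultdict(float)
    for _ in range(episodes):
        node = source
        while node != destination:
            nbrs = graph[node]
            if random.random() < epsilon:
                nxt = random.choice(nbrs)
            else:
                nxt = max(nbrs, key=lambda n: q[(node, n)])
            r = reward(nxt, zone_of, destination)
            best_next = max((q[(nxt, n)] for n in graph.get(nxt, [])), default=0.0)
            q[(node, nxt)] += alpha * (r + gamma * best_next - q[(node, nxt)])
            node = nxt
    return q
```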