Introduction: This study proposes a Q-learning-based optimization method for cultural heritage tourism routes, using the Historic Centre of Macau as a case study. The goal is to efficiently visit multiple attractions within a limited time. Materials and methods: Coordinates of 25 heritage sites were obtained through the Google Maps API, and the Haversine formula was used to calculate distances. We designed a state space, action space, and reward function based on distance and time for dynamic route optimization. Results and conclusion: The results show that the Q-learning algorithm produces an optimal route that covers all the attractions while shortening the overall path and achieving rapid convergence. The optimized routes improve visit efficiency and balance attraction utilization, preventing overcrowding in popular areas. This approach provides practical implications for intelligent cultural heritage tourism planning: by designing an intelligent tourism route planning system, it helps tourists explore the Historic Centre of Macau more efficiently. Future research will focus on refining the reward function by incorporating visitor preferences and real-time traffic conditions for greater personalization and applicability.
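The distance step of the pipeline above can be sketched in a few lines; the coordinates below are approximate, illustrative values for two of the sites, not the paper's data:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    R = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

# Approximate coordinates for the Ruins of St. Paul's and Senado Square (illustrative)
d = haversine_km(22.1975, 113.5410, 22.1934, 113.5398)
```

The resulting pairwise distances would feed the reward function, which penalizes long hops between attractions.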
This paper presents a new method to reduce infected cells and free virus particles (virions) via a nonlinear HIV model. Three scenarios are considered for control performance evaluation. In the first, the system and initial conditions are assumed to be completely known. In the second, the initial conditions are taken randomly. In the third scenario, in addition to uncertainty in the initial conditions, an additive noise is taken into account. The optimal control method is used to design an effective drug schedule that reduces the number of infected cells and free virions with and without uncertainty. Using the Q-learning algorithm, one of the most widely applied algorithms in reinforcement learning, the drug delivery rate is obtained off-line. Since Q-learning is a model-free algorithm, the performance of the controller is expected not to change significantly in the presence of uncertainty. Simulation results confirm that the proposed control method performs well and controls the free virions effectively for both certain and uncertain HIV models.
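The off-line tabular Q-learning loop underlying such a controller can be sketched generically; the toy "infection level" chain below is an illustrative stand-in for the nonlinear HIV dynamics, not the paper's model:

```python
import random

def q_learning(n_states, n_actions, step, episodes=300, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Generic tabular Q-learning; `step(s, a)` returns (next_state, reward, done)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        for _ in range(100):  # cap episode length
            if rng.random() < eps:
                a = rng.randrange(n_actions)           # explore
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])  # exploit
            s2, r, done = step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if done:
                break
    return Q

def step(s, a):
    """Toy stand-in dynamics: action 1 ('administer drug') lowers the infection level."""
    s2 = min(s + 1, 4) if a == 1 else s   # state 4 = controlled viral load
    done = s2 == 4
    return s2, (10.0 if done else -1.0), done

Q = q_learning(5, 2, step)
```

Because the update only needs sampled transitions, the same loop runs unchanged when the dynamics are uncertain or noisy, which is the model-free property the paper exploits.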
Coverage path planning (CPP) is a fundamental problem for mobile robots across a variety of applications. Q-learning-based coverage path planning algorithms have recently begun to be explored. To overcome the tendency of traditional Q-learning to fall into local optima, this paper introduces new reward functions derived from the predator-prey model into a traditional Q-learning-based CPP solution: a comprehensive reward function that incorporates three components, a predation avoidance reward function, a smoothness reward function, and a boundary reward function. In addition, the influence of the weighting parameters on the total reward function is discussed. Extensive simulation results and practical experiments verify that the proposed predator-prey-reward-based Q-learning CPP (PP-Q-learning-based CPP) outperforms traditional BCD and Q-learning-based CPP in terms of repetition ratio and number of turns.
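The comprehensive reward described above is a weighted combination of the three components; a minimal sketch with illustrative weights (the paper's actual weighting parameters are precisely what its discussion explores):

```python
def total_reward(r_predation, r_smooth, r_boundary, w=(0.5, 0.3, 0.2)):
    """Weighted aggregation of the three component rewards (illustrative weights)."""
    return w[0] * r_predation + w[1] * r_smooth + w[2] * r_boundary

# Shifting weight from predation avoidance to smoothness changes the trade-off:
r_default = total_reward(1.0, 0.2, 0.0)                    # favours avoiding revisits
r_smoothed = total_reward(1.0, 0.2, 0.0, w=(0.2, 0.6, 0.2))  # favours fewer turns
```

Tuning the weight vector shifts the learned paths between low repetition ratio (predation avoidance dominant) and fewer turns (smoothness dominant).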
This paper presents the application and design of a novel stochastic optimal control methodology based on the Q-learning method for solving automatic generation control (AGC) under the new control performance standards (CPS) of the North American Electric Reliability Council (NERC). The aims of the CPS are to relax the control constraint requirements of AGC plant regulation and enhance the frequency dispatch support effect from interconnected control areas. The NERC's CPS-based AGC problem is a dynamic stochastic decision problem that can be modeled as a reinforcement learning (RL) problem based on Markov decision process theory. In this paper, the Q-learning method is adopted as the RL core algorithm, with CPS values regarded as the rewards from the interconnected power systems; the CPS control and relaxed control objectives are formulated as immediate reward functions by means of a linear weighted aggregative approach. By regulating a closed-loop CPS control rule to maximize the long-term discounted reward during online learning, the optimal CPS control strategy is gradually obtained. This paper also introduces a practical semisupervisory group prelearning method to improve the stability and convergence of Q-learning controllers during the prelearning process. Tests on the China Southern Power Grid demonstrate that the proposed control strategy effectively enhances the robustness and relaxation property of AGC systems while ensuring CPS compliance. DOI: 10.1061/(ASCE)EY.1943-7897.0000017. (C) 2011 American Society of Civil Engineers.
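The linear weighted aggregation of the CPS objective and the relaxed-control objective into an immediate reward might look like the following sketch; the weights, units, and exact functional form here are assumptions, not those of the paper:

```python
def immediate_reward(cps1_percent, ace_mw, w1=0.7, w2=0.3):
    """Linear weighted aggregation (illustrative): reward CPS1 compliance above
    the 100% threshold and penalise the area control error magnitude.
    cps1_percent: CPS1 compliance in percent; ace_mw: area control error in MW."""
    return w1 * (cps1_percent - 100.0) - w2 * abs(ace_mw)

r_good = immediate_reward(200.0, 10.0)   # well-compliant, small error
r_poor = immediate_reward(100.0, 50.0)   # threshold compliance, large error
```

The Q-learning controller then selects regulation commands that maximize the discounted sum of such rewards rather than any single-step value.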
Routing plays a critical role in data transmission for underwater acoustic sensor networks (UWSNs) in the Internet of Underwater Things (IoUT). Traditional routing methods suffer from high end-to-end delay, limited bandwidth, and high energy consumption. With the development of artificial intelligence and machine learning algorithms, many researchers apply these new methods to improve the quality of routing. In this paper, we propose a Q-learning-based multi-hop cooperative routing protocol (QMCR) for UWSNs. The protocol can automatically choose nodes with the maximum Q-value as forwarders based on distance information. Furthermore, we combine cooperative communications with the Q-learning algorithm to reduce network energy consumption and improve communication reliability. Simulation results show that the running time of QMCR is less than one-tenth of that of the artificial fish-swarm algorithm (AFSA), while the routing energy consumption is kept at the same level. Owing to the extremely fast speed of the algorithm, QMCR is a promising routing design method for UWSNs, especially in the face of the extremely dynamic underwater acoustic channels of the real ocean environment.
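The forwarder-selection rule, picking the candidate next hop with the maximum Q-value, is straightforward to sketch; the node names and Q-values below are illustrative:

```python
def choose_forwarder(q_values, candidates):
    """Select the candidate next hop with the largest learned Q-value."""
    return max(candidates, key=lambda n: q_values[n])

# Illustrative per-neighbour Q-values, e.g. learned from distance information
q_values = {"n1": 0.2, "n2": 0.9, "n3": 0.5}
best = choose_forwarder(q_values, ["n1", "n2", "n3"])  # → "n2"
```

Because each node only compares its neighbours' Q-values locally, the per-hop decision is constant-time, which is consistent with the large running-time advantage reported over population-based search such as AFSA.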
Due to the complexity of interactive environments, dynamic obstacle avoidance path planning poses a significant challenge to agent mobility. Dynamic path planning is a complex multi-constraint combinatorial optimization problem. Some existing algorithms easily fall into local optima when solving such problems, limiting their convergence speed and accuracy. Reinforcement learning has clear advantages in solving decision-sequence problems in complex environments, and Q-learning is one such reinforcement learning method. To improve the algorithm's value evaluation on practical problems, this paper introduces priority weights into the Q-learning algorithm. The improved algorithm is compared with existing algorithms and applied to dynamic obstacle avoidance path planning. Experiments show that the improved algorithm dramatically improves convergence speed and accuracy and increases the value evaluation, finding the shortest path of 16 units in 27 seconds.
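One plausible reading of the priority-weight idea is to scale the temporal-difference update by a priority factor; the exact form used in the paper is not given here, so this sketch is an assumption:

```python
def priority_update(Q, s, a, r, s_next, priority, alpha=0.1, gamma=0.9):
    """One Q-learning step with the temporal-difference error scaled by a
    priority weight (illustrative form): high-priority transitions learn faster."""
    td_error = r + gamma * max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * priority * td_error
    return Q[s][a]

Q = [[0.0, 0.0], [0.0, 0.0]]
v = priority_update(Q, 0, 1, 1.0, 1, priority=2.0)  # doubled effective step size
```

Weighting important transitions more heavily is one standard way to speed convergence of value estimates without changing the fixed point of the update.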
Underwater acoustic sensor networks (UASNs) have emerged as a viable networking approach in recent years due to their numerous aquatic applications. As a vital component of UASNs, routing protocols are essential for ensuring reliable data transmission and extending the longevity of UASNs. Recently, several clustering-based routing protocols have been proposed to reduce energy consumption and overcome the resource constraints of deployed sensor nodes. However, they rarely consider the hot-spot problem and the sink-node isolation problem in multihop underwater sensor networks. In this article, we propose a Q-learning-based hierarchical routing protocol with unequal clustering (QHUC) for determining an effective data forwarding path to extend the lifespan of UASNs. First, a hierarchical network structure is constructed for initialization. Then, a combination of unequal clustering and the Q-learning algorithm is applied to the hierarchical structure to disperse the remaining energy more evenly throughout the network. With the Q-learning algorithm, a globally optimal cluster head (CH) and next hop can be determined better than with a greedy approach. In addition, the Q-value that guarantees optimal routing decisions can be computed without incurring any additional cost by combining the Q-learning algorithm with clustering. Simulation results show that QHUC achieves efficient routing and significantly prolongs the network lifetime.
The capability of a cyber-physical power system (CPPS) to recover from cascading failures caused by extreme events and restore prefailure functionality is a critical focus of resilience research. In contrast to the strongly coupled systems studied by most researchers, this article examines weakly coupled CPPS, exploring result-oriented recovery approaches to enhance system resilience. Various repair methods are compared in terms of the resilience of weakly coupled CPPS across different coupling modes and failover probabilities. Using the Q-learning algorithm, an optimized network restoration sequence is obtained that minimizes the negative influence of failures on network functionality while reducing power loss. The proposed method's effectiveness and generalizability are comprehensively verified through simulation experiments on weakly coupled CPPS built from the IEEE 39, IEEE 118, and IEEE 300 networks and their corresponding scale-free networks. Its rationality is verified through two recovery mechanisms: single-node recovery and multinode recovery. Comparing the proposed method with heuristic and optimization-based recovery methods shows that it significantly accelerates network recovery and improves network resilience, achieving better resilience centrality. These findings provide valuable insights for decision making in CPPS recovery work.
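Learning a restoration sequence with tabular Q-learning can be sketched by treating the set of already-repaired nodes as the state and the next node to repair as the action; the node names and functionality gains below are illustrative toy values, not drawn from the IEEE test systems:

```python
import random

def learn_restoration_order(nodes, gain, episodes=3000, alpha=0.2, gamma=0.95, eps=0.2, seed=1):
    """Tabular Q-learning over restoration states (the frozenset of repaired nodes).
    Discounting favours recovering high-gain nodes early; `gain` is illustrative."""
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        restored = frozenset()
        while len(restored) < len(nodes):
            acts = [n for n in nodes if n not in restored]
            Q.setdefault(restored, {n: 0.0 for n in acts})
            if rng.random() < eps:
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda n: Q[restored][n])
            nxt = restored | {a}
            future = 0.0
            if len(nxt) < len(nodes):
                Q.setdefault(nxt, {n: 0.0 for n in nodes if n not in nxt})
                future = max(Q[nxt].values())
            Q[restored][a] += alpha * (gain[a] + gamma * future - Q[restored][a])
            restored = nxt
    start = frozenset()
    return max(Q[start], key=lambda n: Q[start][n])

first = learn_restoration_order(["generator", "relay", "sensor"],
                                {"generator": 10.0, "relay": 3.0, "sensor": 1.0})
```

Since the discount factor makes early functionality gains worth more, the learned policy repairs the highest-impact node first, which is the intuition behind sequencing recovery to limit functionality loss.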
This paper presents an intelligent wind-speed-sensorless maximum power point tracking (MPPT) method for a variable-speed wind energy conversion system (VS-WECS) based on a Q-learning algorithm. The Q-learning algorithm maintains a Q-value for each state-action pair, updated using a reward and a learning rate. The inputs defining the states are the electrical power received by the grid and the rotational speed of the generator. In this paper, Q-learning is equipped with a peak detection technique, which drives the system toward peak power even if learning is incomplete, making real-time tracking faster. To make learning uniform, each state has its own learning parameter instead of a common learning parameter for all states, as in conventional Q-learning. Therefore, if a half-learned system is running at the peak point, this does not affect the learning of unvisited states. In addition, wind speed change detection is combined with the proposed algorithm, enabling it to work under varying wind speed conditions. Moreover, knowledge of the wind turbine characteristics and wind speed measurements is not needed. The algorithm is verified through simulations and experiments and compared with the perturbation and observation (P&O) algorithm.
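The per-state learning rate idea can be sketched directly: each state keeps its own rate, and only the visited state's rate decays, so dwelling at the peak does not slow learning elsewhere. The update form and decay factor here are illustrative assumptions:

```python
def update_with_state_lr(Q, lr, s, a, r, s_next, gamma=0.9, decay=0.99):
    """Q-learning step with a separate learning rate per state; only the
    visited state's rate decays, so unvisited states keep learning quickly."""
    Q[s][a] += lr[s] * (r + gamma * max(Q[s_next]) - Q[s][a])
    lr[s] *= decay  # decay applies only to the visited state

Q = [[0.0, 0.0], [0.0, 0.0]]
lr = [0.5, 0.5]
update_with_state_lr(Q, lr, 0, 0, 1.0, 1)  # state 1's rate is untouched
```

With a single shared rate, long operation at one operating point would shrink the step size for every state; the per-state rates avoid exactly that.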
To increase competition, control prices, and decrease inefficiency in the carbon allowance auction market, limits on bidding price and volume can be set. With these limits, participants face the same cap on bidding price and volume; without them, participants have different values per unit of carbon allowance, so some participants may be strong and others weak. Given the impact of these limits on the auction, this paper compares uniform and discriminatory pricing in a carbon allowance auction with and without the limits, using a multi-agent-based model consisting of the government and supply chains. The government determines the supply chains' initial allowances. The supply chains compete in the carbon auction market and determine their bidding strategies based on the Q-learning algorithm; they then optimize their tactical and operational decisions. They can also trade their carbon allowances in a carbon trading market in which the price is freely determined by carbon supply and demand. Results show that without the limits, the carbon price under uniform pricing is less than or equal to that under discriminatory pricing, while there is no difference between the two when limits apply. Overall, the auction reduces the profit of the supply chains. This negative effect is smaller under uniform than under discriminatory pricing in the case without limits. Nevertheless, the strong supply chains make large profits from the auction when the mitigation rate is high.
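A bidding agent's strategy choice from its Q-table is typically epsilon-greedy; a minimal sketch with illustrative discrete bid prices (the paper's actual action space is not specified here):

```python
import random

def pick_bid(q_row, bid_prices, eps=0.1, rng=None):
    """Epsilon-greedy choice over discrete bid prices from one Q-table row:
    explore a random price with probability eps, otherwise bid the argmax."""
    rng = rng or random.Random(0)
    if rng.random() < eps:
        return rng.choice(bid_prices)
    return bid_prices[max(range(len(bid_prices)), key=lambda i: q_row[i])]

bid = pick_bid([0.1, 0.8, 0.3], [10.0, 20.0, 30.0], eps=0.0)  # greedy → 20.0
```

After each auction round, the agent would update the Q-value of the chosen price with the realized profit as the reward, gradually shaping its bidding strategy.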