Search Results - Inner Mongolia University Library

Optimal Decision-Making Method for a Plug-In Electric Taxi in Uncertain Environment

IEEE ACCESS, 2021, vol. 9, pp. 62467-62477

Authors: You, Yang; Zhu, Jisong; Huang, Yichuan; Jing, Zhaoxia. South China Univ Technol Sch Elect Power Engn Guangzhou 510640 Peoples R China

This paper studies the optimal decision-making problem for a plug-in electric taxi (PET) in a time-varying complex environment, i.e., a passenger environment, charging station environment, traffic environment, and taxi company management system, in order to maximize PET profit in a short-term operating cycle. First, this problem is formulated as a sequential decision-making problem composed of multiple decision slots. Then, to make the model more practical, the model is divided into two parts: an external environment and an electric taxi model for refinement. The uncertainty and time-varying characteristics of four environmental aspects, including passengers, charging stations, traffic, and taxi company management systems, are analysed and modelled. The transitions between adjacent processes and the environmental feedback of each process are modelled by further subdividing both the serving process and the charging process of the PET into multiple subprocesses, including cruising, carrying passengers, driving to the charging station, queueing before charging, and connecting to the power grid for charging. There are several uncertain factors in the sequential decision-making process for the PET, which leads to difficulty in solving the problem. To address this difficulty, the model-free algorithm sarsa is chosen. Finally, the effectiveness of the proposed method is verified by simulation results.

Keywords: Public transportation; Decision making; Electric vehicles; Uncertainty; Charging stations; Probability distribution; Power grids; Plug-in electric taxi; decision making; uncertainty; sarsa algorithm; load modeling

A Case Study: Characterization of Performance Inconsistency for Reinforcement Learning on Flappy Bird Game

12th International Conference on ICT Convergence (ICTC) - Beyond the Pandemic Era with ICT Convergence Innovation

Authors: Shakerimov, Aidar; Li, Dmitriy; Park, Jurn-Gyu. Nazarbayev Univ Comp Sci Sch Engn & Digital Sci Nur Sultan Kazakhstan

ISBN: (print) 9781665423830

One of the serious problems with Reinforcement Learning (RL) algorithms is that their performance usually varies when the same experiment is repeated or reproduced. Although RL results are hard to reproduce because of the algorithms' intrinsic variance, this variance has not been investigated systematically. Through this case study on the Flappy Bird environment, we introduce and characterize four important factors in the performance inconsistency of RL algorithms: 1) the level of environment randomness, 2) the order of the action-value update process, 3) the exploration rate strategy, and 4) the selection between on- and off-policy algorithms. Using a quantitative metric (coefficient of variation), we compare, analyze and investigate the results and the effects of each factor on the performance inconsistency/variance in RL. We believe our experimental results and analysis will provide opportunities to obtain an efficient agent that repeats/reproduces more consistent performance results.

Keywords: Reinforcement Learning (RL); Performance Inconsistency; State Discretization; Q-learning; sarsa algorithm
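
The abstract above names the coefficient of variation as its consistency metric. As a rough illustration only, the sketch below computes it as the sample standard deviation divided by the mean over repeated runs. The scores are invented for the example, and the choice of sample (rather than population) standard deviation is an assumption; the record does not say which the authors used.

```python
import statistics


def coefficient_of_variation(scores):
    """Return stdev / |mean| for a list of per-run scores.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    mean = statistics.mean(scores)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(scores) / abs(mean)


# Hypothetical final scores from five repeated runs of two agents.
runs_a = [100, 102, 98, 101, 99]   # tightly clustered results
runs_b = [100, 150, 50, 120, 80]   # same mean, much wider spread

cv_a = coefficient_of_variation(runs_a)
cv_b = coefficient_of_variation(runs_b)
print(f"agent A: CV = {cv_a:.4f}")
print(f"agent B: CV = {cv_b:.4f}")
```

Both hypothetical agents have the same mean score, so the lower CV for agent A reflects only its smaller run-to-run spread, which is the sense in which the metric measures consistency.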

Power Control Research for Device-to-Device Wireless Network Underlying Reinforcement Learning

Authors: Kang Han; Chengyin Ye. College of Computer and Communication, LiaoNing Petrochemical University

To address the problem that co-channel interference reduces system data throughput when cellular user spectrum resources are reused, a D2D communication link power control method combined with the reinforcement learning sarsa algorithm is proposed. The method combines the model-free control of RL with the on-policy character of sarsa, in which the actions taken are consistent with the policy being evaluated. The simulation results show that it can significantly reduce average transmission power in the communication process and effectively improve the data throughput.

Keywords: D2D communications; reinforcement learning; sarsa algorithm; power control

AGV Path Planning Model based on Reinforcement Learning

Chinese Automation Congress (CAC)

Authors: Liao, Xiaofei; Wang, Yang; Xuan, Yiliang; Wu, Dequan. Donghua Univ Coll Informat Sci & Technol Shanghai 201620 Peoples R China; Donghua Univ Engn Res Ctr Digitized Text & Fash Technol Minist Educ Shanghai 201620 Peoples R China

ISBN: (print) 9781728176871

With the rapid growth of logistics transportation, automated guided vehicle (AGV) technology has developed speedily. Path planning is one of the key research topics for AGVs. It is difficult to plan an optimal path from the starting position to the target position for an AGV in a complex environment. In this paper, reinforcement learning technology is introduced to solve the problem that AGV path planning is difficult to model because of the complex and unknown environment. The sarsa algorithm based on a simulated annealing strategy can effectively guide the AGV to plan the optimal path in the grid graph and improve the success rate. To address the problem that traditional reinforcement learning algorithms process data insufficiently in large-scale state spaces, the potential field method combined with the deep Q-network algorithm is proposed for AGV path planning. The algorithm can effectively guide the AGV to carry out optimal path planning and solves the problem that traditional reinforcement learning algorithms cannot deal with complex spaces. Finally, these algorithms are applied to the AGV path planning system to simulate the motion state of a single AGV from the loading point to the unloading point. This verifies that our algorithms can effectively implement intelligent AGV path planning and improve the efficiency of warehousing logistics.

Keywords: AGV; path planning; reinforcement learning; sarsa algorithm; deep Q network algorithm

A Texas Hold'em decision model based on Reinforcement Learning

32nd Chinese Control And Decision Conference (CCDC)

Authors: Zhang, XiaoChuan; Li, Yi. Chongqing Univ Technol Dept Comp Sci Chongqing Peoples R China

ISBN: (print) 9781728158556

Texas Hold'em is a typical example of a computer incomplete-information game. Traditional machine learning methods have been unable to deal with the huge search state space of Texas Hold'em. In this paper, a value-based reinforcement learning algorithm is adopted, which can deal with the huge amount of data without manual extraction of data features and can realize unsupervised training of the model. However, value-based reinforcement learning suffers from an overly complex acquisition process and from the overestimation of the DQN algorithm. Therefore, this paper introduces the DQN-S model, which combines a reinforcement learning algorithm with Monte Carlo game search and integrates the sarsa algorithm to solve the above problems to a certain extent. Finally, experimental data show that the overestimation effect of the DQN algorithm is weakened to a certain extent after the sarsa algorithm is incorporated. The average return of each game of DQN-S model and model is more than 5 chips, and the average return of each game of DQN-S model is more than 3 chips in the game with its strongest training version.

Keywords: Incomplete information machine game; Reinforcement learning; Texas Hold'em; DQN-S model; Monte Carlo search; sarsa algorithm; DQN algorithm

Backward Q-learning: The combination of sarsa algorithm and Q-learning

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2013, vol. 26, no. 9, pp. 2184-2193

Authors: Wang, Yin-Hao; Li, Tzuu-Hseng S.; Lin, Chih-Jui. Natl Cheng Kung Univ Dept Elect Engn AiRobots Lab Tainan 70101 Taiwan

Reinforcement learning (RL) has been applied to many fields and applications, but the action selection policy still faces a dilemma between exploration and exploitation. The best-known reinforcement learning methods are the Q-learning and the sarsa algorithms, but they possess different characteristics. Generally speaking, the sarsa algorithm converges faster, while the Q-learning algorithm has a better final performance. However, the sarsa algorithm easily gets stuck in a local minimum, and Q-learning needs a longer time to learn. Most of the literature investigates the action selection policy. Instead of studying an action selection strategy, this paper focuses on how to combine Q-learning with the sarsa algorithm, and presents a new method, called backward Q-learning, which can be implemented in the sarsa algorithm and Q-learning. The backward Q-learning algorithm directly tunes the Q-values, and then the Q-values indirectly affect the action selection policy. Therefore, the proposed RL algorithms can enhance learning speed and improve final performance. Finally, three experiments, including cliff walk, mountain car, and cart-pole balancing control system, are used to verify the feasibility and effectiveness of the proposed scheme. All the simulations illustrate that the backward Q-learning based RL algorithm outperforms the well-known Q-learning and the sarsa algorithm. (C) 2013 Elsevier Ltd. All rights reserved.

Keywords: Backward Q-learning; Q-learning; Reinforcement learning; sarsa algorithm
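
The abstract above contrasts sarsa with Q-learning. For orientation, here is a minimal sketch of the two standard tabular update rules as they are commonly stated in textbooks; it is not the backward Q-learning method of the paper. The state names, action names, learning rate and discount factor are all hypothetical values chosen for the example.

```python
from collections import defaultdict

ALPHA = 0.5  # learning rate (illustrative value, not from the paper)
GAMMA = 0.9  # discount factor (illustrative value, not from the paper)


def sarsa_update(q, s, a, r, s_next, a_next):
    """On-policy update: bootstrap from the action actually taken in s_next."""
    target = r + GAMMA * q[(s_next, a_next)]
    q[(s, a)] += ALPHA * (target - q[(s, a)])


def q_learning_update(q, s, a, r, s_next, actions):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r + GAMMA * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += ALPHA * (target - q[(s, a)])


actions = ["left", "right"]
q_sarsa = defaultdict(float)
q_qlearn = defaultdict(float)
for table in (q_sarsa, q_qlearn):
    table[("s1", "left")] = 1.0   # hypothetical existing estimates
    table[("s1", "right")] = 5.0

# Same transition (s0, right, reward 0, s1) for both learners.
# The behaviour policy explores and picks "left" in s1.
sarsa_update(q_sarsa, "s0", "right", 0.0, "s1", "left")
q_learning_update(q_qlearn, "s0", "right", 0.0, "s1", actions)

print("sarsa      Q(s0, right):", q_sarsa[("s0", "right")])
print("Q-learning Q(s0, right):", q_qlearn[("s0", "right")])
```

On this single transition the sarsa estimate moves toward the value of the exploratory action actually taken, while the Q-learning estimate moves toward the value of the best available action, which is the on-policy versus off-policy distinction the two algorithms are known for.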

Reinforcement Learning algorithms in Global Path Planning for Mobile Robot

International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM)

Authors: Sichkar, Valentyn N. ITMO Univ Dept Control Syst & Robot St Petersburg Russia

ISBN: (print) 9781538681190

The paper is devoted to the research of two approaches for global path planning for mobile robots, based on Q-Learning and sarsa algorithms. The study has been done with different adjustments of two algorithms that made it possible to learn faster. The implementation of two Reinforcement Learning algorithms showed differences in learning time and the methods of building path to avoid obstacles and to reach a destination point. The analysis of obtained results made it possible to select optimal parameters of the considered algorithms for the tested environments. Experiments were performed in virtual environments where algorithms learned which steps to choose in order to get a maximum payoff and reach the goal avoiding obstacles.

Keywords: reinforcement learning; Q-Learning algorithm; sarsa algorithm; path planning; mobile agent

A sarsa-based adaptive controller for building energy conservation

JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2018, vol. 18, no. 2, pp. 329-338

Authors: Fu, Qiming; Hu, Lingyao; Wu, Hongjie; Hu, Fuyuan; Hu, Wen; Chen, Jianping. Suzhou Univ Sci & Technol Inst Elect & Informat Engn Suzhou 215009 Jiangsu Peoples R China; Suzhou Univ Sci & Technol Jiangsu Key Lab Intelligent Bldg Energy Efficienc Suzhou 215009 Jiangsu Peoples R China; Suzhou Univ Sci & Technol Suzhou Key Lab Mobile Networking & Appl Technol Suzhou 215009 Jiangsu Peoples R China

In the field of building equipment control, traditional methods suffer from instability and slow convergence. To deal with these problems, a new sarsa-based adaptive controller (SAC) is proposed. The proposed method uses the sarsa algorithm to model the exchange mechanism of building energy consumption and tries to find the best control policy, one that decreases energy consumption without sacrificing the comfort of the building occupants. Compared with the PID method, the proposed SAC has better convergence performance and robustness.

Keywords: Energy conservation; Reinforcement learning; sarsa algorithm; adaptive controller

Hybrid Robotic Reinforcement Learning for Inspection/Correction Tasks

Procedia Manufacturing, 2019, vol. 39, pp. 406-413

Authors: Hoda Nasereddin; Gerald M. Knapp. Louisiana State University 3277 Patrick F. Taylor Hall Baton Rouge 70803 USA; Louisiana State University 3240-A Patrick F. Taylor Hall Baton Rouge 70803 USA

The ability to rapidly program robots for complex tasks is an important precursor to wider adoption of robotics in industry. Robot programming is often time consuming and brittle to unanticipated variations in processing. Automated robot task learning is a solution to this problem. Reinforcement Learning (RL) is a commonly used approach for a robot to autonomously learn simple tasks. In RL, rewards are used to guide the robot towards learning an optimal plan or control policy. RL, however, has proven to be of limited value for problems with large state spaces and considerable environmental variability. In this paper, we investigate formulation of the RL approach for inspect/correct types of tasks, specifically a misplaced block in a simple grid-world environment (requiring searching the grid world to identify a missing block and returning the missing block back to the target). We use a hybrid method, combining the sarsa algorithm and a model of the environment. The model of the environment is used as a reference model to reduce the state space, avoiding unnecessary exploration of the environment. A main focus of this research is the impact of task variability on RL performance.

Keywords: Reinforcement Learning; Inspect; Correct; sarsa algorithm

Online Energy Management and Heterogeneous Task Scheduling for Smart Communities with Residential Cogeneration and Renewable Energy

ENERGIES, 2018, vol. 11, no. 8, p. 2104

Authors: Cao, Yongsheng; Zhang, Guanglin; Li, Demin; Wang, Lin; Li, Zongpeng. Donghua Univ Minist Educ Coll Informat Sci & Technol Engn Res Ctr Digitized Text & Fash Technol Shanghai 201620 Peoples R China; Shanghai Jiao Tong Univ Dept Automat Shanghai 200240 Peoples R China; Wuhan Univ Sch Comp Sci Wuhan 430072 Hubei Peoples R China

With the development of renewable energy technology and communication technology in recent years, many residents now utilize renewable energy devices in their residences with energy storage systems. We have full confidence in the promising prospects of sharing idle energy with others in a community. However, it is a great challenge to share residents' energy with others in a community to minimize the total cost of all residents. In this paper, we study the problem of energy management and task scheduling for a community with renewable energy and residential cogeneration, such as residential combined heat and power system (resCHP) to pay the least electricity bill. We take elastic and inelastic load demands into account which are delay intolerant and delay tolerant tasks in the community. The minimum cost problem of a non-cooperative community is extracted into a random non-convex optimization problem with some physical constraints. Our objective is to minimize the time-average cost for each resident in the community, including the cost of the external grid and natural gas. The Lyapunov optimization theory and a primal-dual gradient method are adopted to tackle this problem, which needs no future data and has low computational complexity. Furthermore, we design a cooperative renewable energy sharing algorithm based on State-action-reward-state-action (sarsa) algorithm, in the condition that each residence in the community is able to communicate with its neighbors by a central controller. Finally, extensive simulations are presented to validate the proposed algorithms by using practical data.

Keywords: dynamic energy management; resCHP system; energy sharing; sarsa algorithm; smart grid
