Search results - Inner Mongolia University Library

Reinforcement learning based mainline dynamic speed limit adjustment of expressway off-ramp upstream under connected and autonomous vehicles environment


IET INTELLIGENT TRANSPORT SYSTEMS, 2022, Vol. 16, No. 12, pp. 1809-1819

Authors: Xiao, Daiquan; Kang, Shengyang; Xu, Xuecai; Shen, Zhenwu

Affiliations: Huazhong Univ Sci & Technol, Sch Civil & Hydraul Engn, Wuhan 430074, Peoples R China; Shenzhen Urban Transport Planning Ctr Co Ltd, Shenzhen, Peoples R China; Wuhan Huake Quanda Transport Planning & Design Co, Wuhan, Peoples R China

With rapid urbanization and the continuing growth in the number of automobiles, expressway on- and off-ramp areas have become bottlenecks where recurrent congestion occurs frequently. Various methods have been employed to relieve the traffic jams that form around off-ramps, and mainline variable speed limit (VSL) control is one of them. In this study, mainline VSL adjustment upstream of the off-ramp is investigated with a reinforcement learning algorithm under the connected vehicles environment to alleviate traffic congestion. First, assumptions are made to suit the traffic conditions of mainline VSL control upstream of the off-ramp; a VSL algorithm based on reinforcement learning is then presented, with Q-learning chosen as the main algorithm. Next, the state space, action space, and reward function required by Q-learning are constructed in turn, and the related parameters are labelled. After that, on a platform based on Python and VISSIM, three schemes are designed: free control (Scheme 0), rule-based mainline VSL adjustment upstream of the off-ramp (Scheme 1), and Q-learning-based mainline VSL adjustment upstream of the off-ramp (Scheme 2). The three schemes are simulated and compared quantitatively to reflect the off-ramp travel efficiency. The results indicate that the Q-learning-based mainline dynamic VSL adjustment performs best on both general and specific indexes. The results provide potential insights for relieving traffic congestion and controlling traffic flow under the CAV environment.

Keywords: mobile robots; road vehicles; mainline VSL control; intelligent transportation systems; reinforcement learning algorithm; traffic jam; connected vehicles environment; off-ramp area; mainline dynamic speed limit adjustment; learning (artificial intelligence); mainline VSL adjustment; VSL algorithm; traffic flow control; mainline variable speed limit control accounts; automobiles; off-ramp travel efficiency; traffic congestion; mainline dynamic VSL adjustment; road traffic; road traffic control; reinforcement learning; Python; traffic engineering computing; Q-learning algorithm; autonomous vehicles environment
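The abstract above names a state space, an action space, and a reward function as the ingredients of its Q-learning formulation. As a rough illustration only, the sketch below applies one standard tabular Q-learning update to a speed-limit decision. The state labels, the candidate speed limits, the reward value, and the learning parameters are hypothetical placeholders and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical action space: candidate mainline speed limits in km/h.
SPEED_LIMITS = (60, 80, 100)

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in SPEED_LIMITS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Q-table with every entry initialised to zero.
q_table = defaultdict(float)

# Hypothetical transition: in a "congested" state the agent posts 60 km/h,
# observes a reward of 1.0, and traffic moves to a "moderate" state.
new_value = q_update(q_table, "congested", 60, 1.0, "moderate")
print(new_value)
```

In a real controller the state would be a discretised traffic measurement and the reward would come from the simulator, neither of which this listing specifies.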


Adaptive hysteresis compensation control of a macro-fiber composite bimorph by improved reinforcement learning


JOURNAL OF INTELLIGENT MATERIAL SYSTEMS AND STRUCTURES, 2024, Vol. 35, No. 19, pp. 1471-1482

Authors: Li, Xingqiu; Hu, Kaiming; Li, Hua; Wang, Ban; Xu, Suan; He, Yuchen

Affiliations: China Jiliang Univ, Sch Mech & Elect Engn, 258 Xueyuan St, Hangzhou 310018, Zhejiang, Peoples R China; Zhejiang Univ, Sch Aeronaut & Astronaut, Hangzhou, Zhejiang, Peoples R China; Hangzhou City Univ, Dept Mech Engn, Hangzhou, Zhejiang, Peoples R China

The hysteresis characteristics related to the frequency and amplitude of the control signal seriously affect the precision of displacement tracking control of the macro-fiber composite (MFC) bimorph. The traditional feedforward compensator tends to exhibit low precision in controlling displacement. Although incorporating a feedback controller enhances the control accuracy to a certain extent, existing feedback controllers have weak adaptive ability. Thus, the Q-learning (QL) algorithm is combined with the Bouc-Wen (BW) feedforward compensator in this study. The output voltage from the BW compensator has a more significant impact on the control accuracy and convergence speed of the QL algorithm. Therefore, this paper proposes an improved QL (IQL) algorithm that leverages the control error. Moreover, a BW-IQL adaptive control method is proposed to realize precise adaptive control of the MFC bimorph. The proposed method is compared experimentally with the traditional BW, BW-PID, and BW-fuzzy PID controllers. Relative to these three controllers, the BW-IQL controller reduces the average relative error by 86.9%, 59.9%, and 41.8%, respectively. Meanwhile, the Q̄ table plays a major role in error suppression in the IQL controller. These results verify that the BW-IQL control method has higher adaptability and accuracy.

Keywords: Macro-fiber composite bimorph; hysteresis; reinforcement learning; Q-learning algorithm; Bouc-Wen model


Maximum entropy-based optimal threshold selection using deterministic reinforcement learning with controlled randomization


SIGNAL PROCESSING, 2002, Vol. 82, No. 7, pp. 993-1006

Author: Yin, PY

Affiliation: Ming Chuan Univ, Dept Informat Management, Tao Yuan 333, Taiwan

Traditional maximum entropy-based thresholding methods are very popular and efficient for bilevel thresholding. However, they become very computationally expensive when extended to multilevel thresholding, because of the inevitable exhaustive search for the optimal thresholds that maximize the posterior entropy. In this paper, a reinforcement learning (RL) approach is proposed for maximum entropy thresholding. We show that finding the optimal thresholds under the maximum entropy criterion is equivalent to learning an optimal policy of the RL problem. Therefore, the powerful Q-learning algorithm, which is widely used in RL, can be employed to eliminate the computational burden of maximum entropy-based thresholding methods. The experimental results show that the proposed method is suitable for multilevel thresholding and performs better than the genetic algorithm-based entropy thresholding method. (C) 2002 Published by Elsevier Science B.V.

Keywords: genetic algorithms; image segmentation; maximum entropy criterion; multilevel thresholding; Q-learning algorithm; reinforcement learning
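To make the cost that the abstract above refers to more concrete, here is a minimal sketch of the exhaustive-search baseline, assuming the usual summed class-entropy criterion (often attributed to Kapur et al.). It is not the paper's reinforcement learning method; the function names and the toy histograms are hypothetical.

```python
import math
from itertools import combinations

def class_entropy(probs):
    """Shannon entropy of one class after renormalising its probabilities."""
    total = sum(probs)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in probs if p > 0)

def max_entropy_thresholds(histogram, k):
    """Exhaustively search k thresholds that maximise the summed class entropy."""
    n = sum(histogram)
    probs = [count / n for count in histogram]
    levels = len(probs)
    best_score, best_cut = -math.inf, None
    # A threshold t puts grey levels up to and including t in the lower class.
    for cut in combinations(range(levels - 1), k):
        bounds = [0] + [t + 1 for t in cut] + [levels]
        score = sum(class_entropy(probs[lo:hi])
                    for lo, hi in zip(bounds, bounds[1:]))
        if score > best_score:
            best_score, best_cut = score, cut
    return best_cut, best_score

# Toy histogram with 8 equally populated grey levels, one threshold.
print(max_entropy_thresholds([1] * 8, 1))
```

The loop visits every combination of k thresholds, so the number of candidates grows combinatorially with k. That growth is the burden the abstract says a learned policy is meant to avoid.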


Code Dissemination of Long Chain Wireless Sensor Networks Based on Q-learning



4th International Symposium on Application of Materials Science and Energy Materials

Authors: Ming Yue Wang; Gui Geng Zeng

Affiliation: College of Telecommunication & Information Engineering, NJUPT

Long chain wireless sensor networks have been deployed in a variety of applications, such as along railway lines and power lines. However, there is little research on code dissemination protocols for the long chain topology. In this paper, we propose a new code dissemination algorithm for the long chain topology based on the Q-learning algorithm (QLCD). In the proposed algorithm, redundancy is reduced by allowing only one forwarder to broadcast data packets. Q-learning is used to search for the optimal forwarder for data dissemination. The forwarder needs to meet two objectives: reduce the dissemination time and prolong the network lifetime. To solve this multi-objective problem, the paper introduces an integrated approach that balances the two goals. The simulation results show that QLCD reduces the average dissemination time and prolongs the cycle life compared with Deluge.

Keywords: Long chain; Wireless sensor network; Q-learning algorithm; code dissemination


Optimal Values Selection of Q-learning Parameters in Stochastic Mazes


Author: Xiaolin Zhou

Affiliation: School of Mathematics and Information Sciences, Guangzhou University

Because the Q-learning algorithm is model-free, requiring no prior information about the environment and letting agents learn by themselves, it is widely applied to path planning. Nonetheless, the choice of parameter values has a crucial impact on the results. This paper presents how to determine appropriate values of the learning rate and discount factor, and how these parameters affect the overall results. Agents with different learning rates or discount factors are run in randomly generated mazes, and the results are aggregated and compared. With the learning rate varied and the discount factor held fixed, the aggregated data for a learning rate of 0.9 converge much more quickly than for the other settings (0.6, 0.3, 0.1). With only the discount factor varied (the single-variable principle), the group with a discount factor of 0.9 finds shorter paths, and finds them faster, than the other groups (0.6, 0.3, 0.1). When both the learning rate and the discount factor are set to 0.9 (the other groups use 1.0, 0.1, and 0), the 0.9 group is more stable than the 0.1 group and converges within 80 iterations, whereas the 1.0 and 0 groups do not.

Keywords: Reinforcement learning; Q-learning algorithm; Optimal Value; Learning Rate; Discount Factor; Path planning; Stochastic Maze; Obstacles avoidance
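The kind of experiment the abstract above describes can be sketched as follows, under several assumptions: a fixed 3x3 maze instead of randomly generated ones, a reward of -1 per step, an epsilon-greedy policy, and 300 training episodes. None of these settings come from the paper, and the sketch is not meant to reproduce its results.

```python
import random

# Hypothetical 3x3 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..",
        ".#.",
        "..G"]
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(pos, move):
    """Move within the maze; hitting a wall or the border leaves pos unchanged."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return pos

def train(alpha, gamma, episodes=300, epsilon=0.1, seed=0):
    """Train a tabular Q-learning agent and return its Q-table."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        pos = (0, 0)
        for _ in range(100):  # cap on steps per episode
            values = [q.get((pos, a), 0.0) for a in range(4)]
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = values.index(max(values))
            nxt = step(pos, MOVES[action])
            done = MAZE[nxt[0]][nxt[1]] == "G"
            future = 0.0 if done else max(q.get((nxt, a), 0.0) for a in range(4))
            # A reward of -1 per step makes shorter paths more valuable.
            target = -1.0 + gamma * future
            q[(pos, action)] = values[action] + alpha * (target - values[action])
            pos = nxt
            if done:
                break
    return q

def greedy_path_length(q, limit=50):
    """Follow the greedy policy from the start; return steps taken, or None."""
    pos = (0, 0)
    for steps in range(1, limit + 1):
        values = [q.get((pos, a), 0.0) for a in range(4)]
        pos = step(pos, MOVES[values.index(max(values))])
        if MAZE[pos[0]][pos[1]] == "G":
            return steps
    return None

# Vary the learning rate while holding the discount factor at 0.9.
for alpha in (0.9, 0.6, 0.3, 0.1):
    print(alpha, greedy_path_length(train(alpha=alpha, gamma=0.9)))
```

Swapping the loop to vary `gamma` with `alpha` fixed gives the second comparison mentioned in the abstract. A study like the one described would also average over many mazes and seeds rather than a single run.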


Multi-objective traffic signal control model for traffic management


TRANSPORTATION LETTERS-THE INTERNATIONAL JOURNAL OF TRANSPORTATION RESEARCH, 2015, Vol. 7, No. 4, pp. 196-200

Authors: Long, Q.; Zhang, J.-F.; Zhou, Z.-M.

Affiliation: Hunan City Univ, Sch Civil Engn, Yiyang 413000, Hunan, Peoples R China

Traffic signals are one of the main traffic management tools used to control traffic flow on the roads, and they should reflect traffic managers' intentions in different tasks. This paper presents a multi-objective optimization model and a solution algorithm aimed at the intricate structure of traffic control. First, queuing lengths, delay times, and stop times were chosen as the evaluation indexes of the optimization model. Second, the weights of the optimization indexes were determined with a fuzzy analytic process (FAP) according to the traffic managers' strategy. Finally, the multi-objective optimization model was solved with the Q-learning algorithm, so that the signal control scheme of the intersection was produced in real time while taking the traffic management strategy into account. The simulation results showed that the method could effectively improve traffic efficiency at the intersection while fully embodying the intentions of the traffic managers.

Keywords: Multi-objective model; Traffic management strategy; Intersection signal controlling; Fuzzy analytic process; Q-learning algorithm
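One common way to feed several weighted indexes into a single-objective learner such as Q-learning is weighted-sum scalarization. The sketch below assumes that approach purely for illustration; the abstract above does not state how the paper combines its indexes, and the weights, the scaling of each index to [0, 1], and the function names are hypothetical.

```python
def normalise(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

def signal_reward(queue_length, delay, stops, weights):
    """Negative weighted sum of three indexes, each already scaled to [0, 1]."""
    w_queue, w_delay, w_stops = normalise(weights)
    cost = w_queue * queue_length + w_delay * delay + w_stops * stops
    # Lower cost is better, so the learner receives the negated cost as reward.
    return -cost

# Hypothetical manager strategy that prioritises delay over queues and stops.
print(signal_reward(queue_length=0.5, delay=0.2, stops=0.1, weights=[3, 5, 2]))
```

Changing the weight vector changes which signal plans the learner prefers, which is one way a management strategy could be expressed through the reward.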


Interference Mitigation for Coexisting Wireless Body Area Networks: Distributed Learning Solutions


IEEE ACCESS, 2020, Vol. 8, pp. 24209-24218

Authors: George, Emy Mariam; Jacob, Lillykutty

Affiliation: Natl Inst Technol Calicut, Kozhikode 673601, India

When multiple wireless body area networks (WBANs) operate in close proximity to each other, inter-user interference considerably degrades the signal-to-interference-plus-noise ratio of the packets arriving at each WBAN coordinator. In addition, the propagation paths within each WBAN experience fading due to continuous changes in body posture and the mobility of the human body. The most preferred coexistence mechanism specified in the IEEE 802.15.6 standard is channel hopping, which fails to consider the varying radio environment and the obtained reward in its channel selection. This paper therefore investigates the channel selection problem for interference mitigation in a time-varying environment. We formulate the channel selection problem as a finite repeated potential game and propose two learning algorithms, the Stochastic Learning Algorithm (SLA) and the Stochastic Estimator Learning Algorithm (SELA), to achieve the Nash Equilibrium (NE) of the game. Numerical results show that the learning algorithms converge to the NE point of the game. The performance of the two algorithms and the impact of their parameters are also analyzed.

Keywords: Channel hopping; IEEE 802.15.6; interference mitigation; stochastic estimator learning algorithm; stochastic learning algorithm; potential game; Q-learning algorithm; WBAN
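Stochastic learning schemes for channel selection are often built on a linear reward-inaction update of a probability vector. The sketch below assumes that form and is not necessarily the SLA defined in the paper. The step size, the per-channel rewards, and the function name are hypothetical, and rewards are assumed to be normalised to [0, 1].

```python
import random

def sla_update(probs, chosen, reward, step_size=0.1):
    """Linear reward-inaction update of a channel-selection probability vector.

    The chosen channel's probability moves towards 1 in proportion to the
    reward; the others shrink so that the vector still sums to 1.
    """
    return [
        p + step_size * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, p in enumerate(probs)
    ]

rng = random.Random(0)
mean_reward = [0.2, 0.9, 0.4]  # hypothetical normalised reward per channel
probs = [1 / 3] * 3            # start with a uniform mixed strategy
for _ in range(500):
    channel = rng.choices(range(3), weights=probs)[0]
    probs = sla_update(probs, channel, mean_reward[channel])
print([round(p, 3) for p in probs])
```

A zero reward leaves the strategy unchanged, which is the "inaction" part of the rule. In a multi-WBAN setting each coordinator would run its own copy of the update using only its locally observed reward.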


A communication security anti-interference decision model using deep learning in intelligent industrial IoT environment


SOFT COMPUTING, 2022, Vol. 26, No. 16, pp. 7993-8002

Authors: Yan, Lichao; Hu, Juan; Wang, Yi; Zheng, Ning; Di, Jinhong

Affiliation: Zhengzhou Univ Aeronaut, Sch Intelligent Engn, 15 Wenyuan West Rd, Zhengzhou 450046, Henan, Peoples R China

Because traditional anti-jamming decision algorithms cannot meet the security needs of smart city development, this paper proposes a communication security anti-interference decision algorithm using deep learning in an intelligent industrial IoT environment. First, an interactive system model of cognitive users and disruptors with an intelligent perception function is constructed. In addition, the interference intensity and channel gain are analyzed comprehensively to design an optimization goal that maximizes network capacity. Then, by modeling the interaction between the cognitive environment and the decision engine as the interaction between environment and agent in deep reinforcement learning, the Q-learning algorithm is used to explore the maximum action reward fed back to the cognitive decision engine, so as to intelligently obtain the effective interference parameters of the communication state. Finally, the proposed algorithm is demonstrated experimentally on the MATLAB simulation platform. The results show that when the number of links is 300, the network capacity of the proposed algorithm is about 960 bit·s^-1·Hz^-1 and the cumulative average reward reaches 0.59, which is better than the comparison algorithm and realizes highly reliable autonomous decision-making.

Keywords: Internet of things; Reinforcement learning; Q-learning algorithm; Network capacity maximization; Action reward; Communication security; Anti-jamming decision-making


Research on path planning algorithm of mobile robot based on reinforcement learning


SOFT COMPUTING, 2022, Vol. 26, No. 18, pp. 8961-8970

Authors: Pan, Guoqian; Xiang, Yong; Wang, Xiaorui; Yu, Zhongquan; Zhou, Xinzhi

Affiliations: Sichuan Univ, Coll Elect & Informat Engn, Chengdu, Sichuan, Peoples R China; CAAC, Res Inst 2, Chengdu, Sichuan, Peoples R China; Civil Aviat Logist Technol Co Ltd, Chengdu, Sichuan, Peoples R China

To address the low learning efficiency and slow convergence of reinforcement learning when a mobile robot plans paths in a complex environment, a reinforcement learning method based on the path planning result of each round is proposed. First, the algorithm adds an obstacle learning matrix to improve the success rate of path planning. It then introduces a heuristic reward that speeds up the learning process by reducing the search space. It further proposes a method of dynamically adjusting the exploration factor to balance exploration and exploitation in path planning, so as to further improve the performance of the algorithm. Finally, simulation experiments in a grid environment show that, compared with the Q-learning algorithm, the improved algorithm not only shortens the average path length the robot needs to reach the target position but also learns more efficiently, so that the robot can find the optimal path more quickly. The code of the EPRQL algorithm proposed in this paper has been published on GitHub: https://***/ panpanpanguoguoqian/ ***.

Keywords: Complex environment; Mobile robot; Path planning; Q-learning algorithm
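The abstract above mentions dynamically adjusting the exploration factor but does not give the adjustment rule. As a generic stand-in, the sketch below uses an exponentially decaying epsilon with epsilon-greedy action selection. The schedule, its constants, and the function names are assumptions and should not be read as the paper's EPRQL rule.

```python
import math
import random

def exploration_factor(episode, eps_start=0.9, eps_end=0.05, decay=0.01):
    """Exponentially decay epsilon from eps_start towards eps_end."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay * episode)

def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over a list of action values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: random action
    # Exploit: index of the highest action value.
    return max(range(len(q_values)), key=q_values.__getitem__)

rng = random.Random(0)
for episode in (0, 100, 500):
    eps = exploration_factor(episode)
    print(episode, round(eps, 3), choose_action([0.1, 0.7, 0.2], eps, rng))
```

Early episodes explore widely and later episodes mostly exploit the learned values. A schedule driven by each round's planning result, as the abstract suggests, would replace the episode counter with some measure of recent path quality.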


Distributed multi-agent scheme support for service continuity in IMS-4G-Cloud networks


COMPUTERS & ELECTRICAL ENGINEERING, 2015, Vol. 42, pp. 49-59

Authors: Hsieh, Han-Chuan; Chen, Jiann-Liang

Affiliation: Natl Taiwan Univ Sci & Technol, Dept Elect Engn, Taipei, Taiwan

In this study, the Quality of Service (QoS) needed to support service continuity in heterogeneous networks is achieved by a Distributed Multi-Agent Scheme (DMAS) based on cooperation concepts and an awareness algorithm. A set of problem-solving agents autonomously process local tasks and interoperate cooperatively via an in-cloud blackboard system to provide QoS and mobility information. A Q-learning awareness algorithm calculates the expected rewards of a handoff to each access network. These rewards are then used by the problem-solving agents to determine which actions must be performed. Agents located in the integrated IMS-4G-Cloud networks handle service continuity by using a handoff mechanism. Through operations and cooperation among active agents, these phases select a policy for predictive and anticipated IP Multimedia Subsystem (IMS) handoff management. Compared with conventional IMS handoff management, the proposed DMAS scheme achieves shorter handoff delay and better QoS for real-time service applications. (C) 2014 Elsevier Ltd. All rights reserved.

Keywords: Heterogeneous network; Cooperative networking; Distributed Multi-Agent Scheme (DMAS); IP Multimedia Subsystem (IMS); Q-learning algorithm; Quality of Service (QoS)

