Aiming at the problems of punctuality, parking accuracy, and energy saving in urban rail train operation, an intelligent control method for automatic train operation (ATO) based on a deep Q-network (DQN) is proposed. A train dynamics model is established that satisfies the safety principles and the various constraints of automatic urban rail train driving. Considering the rules and sequences governing transitions between working conditions on inter-station runs, the agent in the DQN algorithm acts as the train controller, adjusting the automatic driving strategy in real time according to the train's operating state and environment and optimizing the generation of the automatic driving curve. Taking the Beijing Yizhuang subway line as an example, simulation results show that the DQN control method reduces energy consumption by 12.32% compared with traditional PID train control and improves punctuality and parking accuracy. At the same time, the DQN automatic driving control method can adjust the train's running state dynamically in real time and shows good adaptability and robustness to changes in operating-environment parameters.
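To make the optimization objective concrete, the following is a minimal sketch of a reward function such an ATO agent might use; the paper does not give its exact reward terms, so the weights and penalty forms below are illustrative assumptions.

```python
# Hypothetical reward shaping for an ATO DQN agent, balancing energy use,
# schedule deviation, and stopping error. All weights are assumptions.

def ato_reward(energy_kwh: float, time_error_s: float, stop_error_m: float,
               w_energy: float = 1.0, w_time: float = 0.5, w_stop: float = 2.0) -> float:
    """Negative weighted cost: lower energy consumption, smaller schedule
    deviation, and smaller stopping error all increase the reward."""
    return -(w_energy * energy_kwh
             + w_time * abs(time_error_s)
             + w_stop * abs(stop_error_m))

# Example step: 0.8 kWh consumed, 3 s late, stopped 0.2 m past the mark.
print(ato_reward(0.8, 3.0, 0.2))  # -2.7
```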
Texas Hold'em is a representative incomplete-information game. Existing research that computes a Nash equilibrium as a Texas Hold'em strategy suffers from high resource consumption and overly conservative play. To address these problems, an integrated model combining deep learning and reinforcement learning is proposed. First, to reduce the storage consumed by the large Texas Hold'em state space, a long short-term memory (LSTM) network is designed to predict game results. Because the LSTM takes the win rate and historical action information as input, a convolutional neural network (CNN) is designed to predict the current win rate. Second, to give the strategy the ability to adjust dynamically, a deep Q-network (DQN) generates the strategy from the results predicted by the LSTM. Finally, an agent is implemented to provide training data for the LSTM. Experimental results show that the model wins more chips, demonstrating that it can serve as a solution for incomplete-information games.
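As a rough illustration of the LSTM result predictor described above, the sketch below assumes the input at each betting step is a feature vector combining the predicted win rate with encoded action history; the layer sizes and the three-way result head (win/lose/tie) are assumptions, not the paper's architecture.

```python
# Minimal LSTM game-result predictor sketch (PyTorch). Dimensions are assumed.
import torch
import torch.nn as nn

class ResultLSTM(nn.Module):
    def __init__(self, feat_dim=8, hidden=64, n_outcomes=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outcomes)

    def forward(self, seq):                  # seq: (batch, steps, feat_dim)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])         # result logits from the last step

model = ResultLSTM()
dummy = torch.randn(4, 10, 8)                # 4 hands, 10 betting steps each
print(model(dummy).shape)                    # torch.Size([4, 3])
```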
ISBN (print): 9798350360882; 9798350360899
Rapidly developing technology and a changing market cause customer requirements (CRs) to change rapidly. In the early stage of product development, timely access to CRs is crucial. However, previous design methods cannot capture changes in customers' requirement preferences in time and produce a better design scheme. Therefore, a dynamic configuration design method based on the cloud model is proposed that handles requirement uncertainty, captures CR preferences, and quickly solves for a design scheme. First, a backward cloud transformation algorithm, Multiple Backward Cloud Transformation based on Sampling with Replacement (MBCT-SR), converts the CR data into a multi-level evaluation cloud model, and the CR preference is calculated. Second, a design structure matrix is constructed to map the CR preference onto configuration instances. Then, a Deep Q-Network (DQN) model is established and trained, and the updated CR preference is fed in to quickly solve for the product configuration scheme. Finally, the effectiveness of the proposed method is verified through an example analysis of high-speed train bogies.
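For the DQN selection step above, one can picture the state as a vector of CR preference weights and each action as a candidate configuration instance. The epsilon-greedy choice below is a generic sketch of that step; the preference encoding and Q-values are invented for illustration.

```python
# Illustrative epsilon-greedy configuration selection; values are assumptions.
import random
import numpy as np

def select_config(q_values: np.ndarray, epsilon: float = 0.1) -> int:
    """Pick a configuration-instance index from one state's Q-value row."""
    if random.random() < epsilon:           # occasionally explore alternatives
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))         # otherwise exploit the best-known one

q_row = np.array([1.8, 2.4, 0.9, 1.1])      # Q-values for 4 candidate instances
print(select_config(q_row))                 # usually 1 (the highest Q-value)
```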
As a newly emerging computing paradigm, edge computing shows great capability in supporting and boosting 5G and Internet-of-Things (IoT) oriented applications, e.g., scientific workflows, with low-latency, elastic, and on-demand provisioning of computational resources. However, geographically distributed IoT resources are usually interconnected through unreliable communications and ever-changing contexts, which introduces strong heterogeneity, potential vulnerability, and instability in the computing infrastructure at different levels. It thus remains a challenge to enforce high fault tolerance for edge-IoT scientific computing task flows, especially when the supporting infrastructure is deployed in a collaborative, distributed, and dynamic environment that is prone to faults and failures. This work proposes a novel fault-tolerant scheduling approach for edge-IoT collaborative workflows. The proposed approach first conducts a dependency-based task allocation analysis, then leverages a Primary-Backup (PB) strategy for tolerating task failures that occur at edge nodes, and finally designs a deep Q-learning algorithm for identifying a near-optimal workflow task scheduling scheme. We conduct extensive simulation-based case studies on multiple randomly generated workflows and real-world edge-IoT server-position datasets. The results clearly suggest that the proposed method outperforms state-of-the-art competitors in terms of task completion ratio, server active time, and resource utilization.
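The core of the Primary-Backup idea is that each task is replicated on two distinct nodes, so a single node failure cannot lose the task. The sketch below illustrates that placement under an assumed least-loaded heuristic; the node names and load metric are invented, and this is not the paper's scheduling algorithm.

```python
# Minimal Primary-Backup (PB) placement sketch; heuristic and data are assumed.

def place_with_backup(node_load: dict) -> tuple:
    """Return (primary, backup): the two least-loaded distinct edge nodes."""
    ranked = sorted(node_load, key=node_load.get)
    primary, backup = ranked[0], ranked[1]
    node_load[primary] += 1                  # account for the new task copies
    node_load[backup] += 1
    return primary, backup

loads = {"edge-A": 2, "edge-B": 0, "edge-C": 1}
print(place_with_backup(loads))              # ('edge-B', 'edge-C')
```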
In cloud platform applications, the user's goal is to obtain high-quality application services, while the service provider's goal is to earn revenue by executing the tasks users submit. The platform built from the service provider's application resources must improve the mapping between service requests and resources to achieve higher value. A review of current resource management in cloud environments shows that many task scheduling and resource allocation algorithms are still hampered by factors such as the diversity, dynamics, and multiple constraints of resources and tasks. This paper focuses on task scheduling and resource configuration for Software as a Service (SaaS) applications in a dynamic and uncertain cloud environment. Automatically and intelligently allocating the user task requests that continually reach SaaS applications to appropriate resources for execution is a challenging online scheduling problem. To this end, a real-time task scheduling method based on deep reinforcement learning is proposed that performs this allocation so that the limited virtual machine resources rented by SaaS providers are used in a balanced and efficient manner. In experiments against five other task scheduling algorithms, the proposed algorithm is shown not only to improve the efficiency of deploying workflows in an IaaS public cloud but also to ensure that the resources provisioned for SaaS are used in a balanced and efficient manner.
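Since balanced VM usage is the stated objective, a natural reward signal penalizes uneven utilization. The sketch below is one hypothetical way to express that; the balance term (negative standard deviation of per-VM utilization) and the latency weight are assumptions, as the paper's exact reward is not given here.

```python
# Hypothetical DRL scheduling reward favoring balanced VM utilization.
import numpy as np

def scheduling_reward(vm_utilization: np.ndarray, task_latency_s: float,
                      w_balance: float = 1.0, w_latency: float = 0.1) -> float:
    """Higher when utilization is even across VMs and task latency is low."""
    return -(w_balance * float(np.std(vm_utilization))
             + w_latency * task_latency_s)

# Three VMs at nearly equal utilization, 2 s task latency.
print(scheduling_reward(np.array([0.60, 0.62, 0.58]), task_latency_s=2.0))
```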
In the realm of Natural Language Processing (NLP), Abstract Text Summarization (ATS) holds a crucial position, involving the transformation of lengthy textual content into concise summaries while retaining essential i...
Based on the Deep Q-Network (DQN) reinforcement learning algorithm, an active fault-tolerance method with incremental actions is proposed for the control system of a once-through steam generator (OTSG) subject to sensor faults. We first establish an OTSG model as the interaction environment for the reinforcement learning agent. The agent chooses an action according to the system state obtained from the pressure sensor; the incremental actions allow it to gradually approach the optimal strategy for the current fault, and the agent then updates its network using the rewards obtained during the interaction. In this way, the active fault-tolerant control process of the OTSG is transformed into the reinforcement learning agent's decision-making process. Comparison experiments against a traditional reinforcement learning (RL) algorithm with fixed strategies show that the proposed active fault-tolerant controller provides accurate and rapid control under sensor faults, stabilizing the OTSG pressure near the set-point value so that the OTSG runs normally and stably.
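The incremental-action idea can be sketched as the DQN's discrete actions being small adjustments added to the current control output, letting the controller creep toward the optimum under a faulty sensor. The increment set and clipping range below are illustrative assumptions, not the paper's values.

```python
# Minimal incremental-action sketch: actions adjust, rather than set, the control.

INCREMENTS = [-0.10, -0.01, 0.0, 0.01, 0.10]   # assumed discrete action set

def apply_incremental_action(control: float, action_idx: int,
                             lo: float = 0.0, hi: float = 1.0) -> float:
    """Add the chosen increment to the control signal, clipped to a valid range."""
    return max(lo, min(hi, control + INCREMENTS[action_idx]))

u = 0.50
u = apply_incremental_action(u, 4)   # +0.10 -> 0.60
u = apply_incremental_action(u, 1)   # -0.01 -> 0.59
print(round(u, 2))
```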
Reinforcement learning has emerged as a prominent technique for enhancing robot obstacle avoidance capabilities in recent years. This research provides a comprehensive overview of reinforcement learning methods, focus...
ISBN (print): 9781665462501
The 5G ecosystem is shaping the future of communication networks, enabling innovation and digital transformation not only for individual users but also for companies, industries, and communities. In this scenario, technologies such as Software Defined Networking (SDN) offer telecommunications providers a way to create agile, scalable, and efficient platforms capable of meeting 5G requirements. However, as network environments and systems grow increasingly complex, both in size and in dynamic behavior, the number of vulnerabilities in them can be very high. In addition, attackers are continuously improving intrusion methods, which are becoming harder to detect. For this reason, in this study we deploy a system based on a Reinforcement Learning (RL) agent capable of applying different countermeasures to defend a network against intrusion and DDoS attacks using SDN. The approach is framed as a serious game in which a defender and an attacker take actions based on observations of the environment, i.e., the current network status. Defenders and attackers are trained with the Deep Q-Learning (DQN) algorithm and several of its variations, namely Prioritized Replay, Dueling, and Double DQN, and their results are compared to find the best attack-mitigation strategy. The results of this paper show that RL algorithms can successfully produce more versatile agents able to interpret and adapt to different situations and thus deploy the best countermeasure to protect the network. The results also show that the Complete strategy, which combines the three DQN variations analyzed, yields agents with the best decision-making for responding to attacks.
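Of the variations named above, Double DQN has a compact core: the online network selects the next action and the target network evaluates it, reducing Q-value overestimation. The sketch below shows only that target computation; the stand-in linear networks and the 4-dim state / 3 countermeasures are assumptions, not the paper's SDN encoding.

```python
# Double DQN target computation sketch (PyTorch); network shapes are assumed.
import torch
import torch.nn as nn

online = nn.Linear(4, 3)    # stand-in Q-nets: 4-dim state, 3 countermeasures
target = nn.Linear(4, 3)

def double_dqn_target(reward, next_state, gamma=0.99):
    with torch.no_grad():
        best_action = online(next_state).argmax(dim=1, keepdim=True)   # select
        q_next = target(next_state).gather(1, best_action).squeeze(1)  # evaluate
    return reward + gamma * q_next

r = torch.tensor([1.0, 0.0])
s_next = torch.randn(2, 4)
print(double_dqn_target(r, s_next))   # one bootstrapped target per transition
```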
Indoor temperature and relative humidity control in office buildings is crucial, as it affects the thermal comfort, work efficiency, and even health of the occupants. In China, fan coil units (FCUs) are widely used as air-conditioning equipment in office buildings. Conventional FCU control methods often ignore the impact of indoor relative humidity on occupants by treating indoor temperature as the single control objective. This study took FCUs with a fresh-air system in an office building in Beijing as the research object and proposed a deep reinforcement learning (RL) control algorithm to adjust the air supply volume of the FCUs. To improve the joint satisfaction rate for indoor temperature and relative humidity control, the proposed RL algorithm adopts the deep Q-network algorithm. To train it, a detailed simulation environment was built in the Transient System Simulation Tool (TRNSYS), comprising a building model and a model of the FCUs with the fresh-air system. The simulation environment interacts with the RL agent in real time through a self-developed TRNSYS-Python co-simulation platform, on which the RL algorithm was trained, tested, and evaluated. The results indicate that, compared with traditional on/off and rule-based controllers, the proposed RL algorithm increases the joint satisfaction rate for indoor temperature and relative humidity by 12.66% and 9.5%, respectively. This study provides preliminary direction for deep reinforcement learning control strategies for indoor temperature and relative humidity in office building heating, ventilation, and air-conditioning (HVAC) systems.
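The joint-satisfaction objective suggests a reward that pays off only when temperature and humidity are both within their comfort bands. The sketch below is one hypothetical encoding of that; the band limits are assumptions, not the study's setpoints.

```python
# Hypothetical joint comfort reward for the FCU control agent.

TEMP_BAND = (22.0, 26.0)    # assumed comfort range, deg C
RH_BAND = (40.0, 60.0)      # assumed comfort range, % relative humidity

def joint_comfort_reward(temp_c: float, rh_pct: float) -> float:
    """+1 only when BOTH variables are in band, matching joint satisfaction."""
    temp_ok = TEMP_BAND[0] <= temp_c <= TEMP_BAND[1]
    rh_ok = RH_BAND[0] <= rh_pct <= RH_BAND[1]
    return 1.0 if (temp_ok and rh_ok) else -1.0

print(joint_comfort_reward(24.5, 55.0))   # 1.0
print(joint_comfort_reward(24.5, 70.0))   # -1.0 (humidity out of band)
```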