Aiming at the problems of punctuality, parking accuracy, and energy saving in urban rail train operation, an intelligent control method for automatic train operation (ATO) based on a deep Q-network (DQN) is proposed. A train dynamics model is established that satisfies the safety principles and the various constraints of automatic urban rail train driving. Considering the rules and sequences governing transitions between working conditions on inter-station runs, the agent in the DQN algorithm acts as the train controller, adjusting the automatic driving strategy in real time according to the train's operating state and environment and optimizing the generation of the automatic driving curve. Taking the Beijing Yizhuang subway line as an example, simulation results show that the DQN control method reduces energy consumption by 12.32% compared with traditional PID train control and improves punctuality and parking accuracy. At the same time, the DQN automatic driving control method can adjust the train's running state dynamically in real time and shows good adaptability and robustness to changes in operating-environment parameters.
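To make the optimization objective concrete, the following is a minimal sketch of a reward function such an ATO agent might use; the paper does not give its exact reward terms, so the weights and penalty forms below are illustrative assumptions.

```python
# Hypothetical reward shaping for an ATO DQN agent, balancing energy use,
# schedule deviation, and stopping error. All weights are assumptions.

def ato_reward(energy_kwh: float, time_error_s: float, stop_error_m: float,
               w_energy: float = 1.0, w_time: float = 0.5, w_stop: float = 2.0) -> float:
    """Negative weighted cost: lower energy consumption, smaller schedule
    deviation, and smaller stopping error all increase the reward."""
    return -(w_energy * energy_kwh
             + w_time * abs(time_error_s)
             + w_stop * abs(stop_error_m))

# Example step: 0.8 kWh consumed, 3 s late, stopped 0.2 m past the mark.
print(ato_reward(0.8, 3.0, 0.2))  # -2.7
```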
Texas Hold'em is a representative incomplete-information game. Existing research that computes a Nash equilibrium as a Texas Hold'em strategy suffers from high resource consumption and overly conservative play. To address these problems, an integrated model combining deep learning and reinforcement learning is proposed. First, to reduce the storage consumed by the large Texas Hold'em state space, a long short-term memory (LSTM) network is designed to predict game results. Because the LSTM takes the win rate and historical action information as input, a convolutional neural network (CNN) is designed to predict the current win rate. Second, to give the strategy the ability to adjust dynamically, a deep Q-network (DQN) generates the strategy from the results predicted by the LSTM. Finally, an agent is implemented to provide training data for the LSTM. Experimental results show that the model wins more chips, demonstrating that it can serve as a solution for incomplete-information games.
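As a rough illustration of the LSTM result predictor described above, the sketch below assumes the input at each betting step is a feature vector combining the predicted win rate with encoded action history; the layer sizes and the three-way result head (win/lose/tie) are assumptions, not the paper's architecture.

```python
# Minimal LSTM game-result predictor sketch (PyTorch). Dimensions are assumed.
import torch
import torch.nn as nn

class ResultLSTM(nn.Module):
    def __init__(self, feat_dim=8, hidden=64, n_outcomes=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_outcomes)

    def forward(self, seq):                  # seq: (batch, steps, feat_dim)
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])         # result logits from the last step

model = ResultLSTM()
dummy = torch.randn(4, 10, 8)                # 4 hands, 10 betting steps each
print(model(dummy).shape)                    # torch.Size([4, 3])
```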
ISBN (print): 9798350360882; 9798350360899
Rapidly developing technology and a changing market cause customer requirements (CRs) to change rapidly. In the early stage of product development, timely access to CRs is crucial. However, previous design methods cannot capture changes in customers' requirement preferences in time and produce a better design scheme. Therefore, a dynamic configuration design method based on the cloud model is proposed that handles requirement uncertainty, captures CR preferences, and quickly solves for a design scheme. First, a backward cloud transformation algorithm, Multiple Backward Cloud Transformation based on Sampling with Replacement (MBCT-SR), converts the CR data into a multi-level evaluation cloud model, and the CR preference is calculated. Second, a design structure matrix is constructed to map the CR preference onto configuration instances. Then, a Deep Q-Network (DQN) model is established and trained, and the updated CR preference is fed in to quickly solve for the product configuration scheme. Finally, the effectiveness of the proposed method is verified through an example analysis of high-speed train bogies.
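For the DQN selection step above, one can picture the state as a vector of CR preference weights and each action as a candidate configuration instance. The epsilon-greedy choice below is a generic sketch of that step; the preference encoding and Q-values are invented for illustration.

```python
# Illustrative epsilon-greedy configuration selection; values are assumptions.
import random
import numpy as np

def select_config(q_values: np.ndarray, epsilon: float = 0.1) -> int:
    """Pick a configuration-instance index from one state's Q-value row."""
    if random.random() < epsilon:           # occasionally explore alternatives
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))         # otherwise exploit the best-known one

q_row = np.array([1.8, 2.4, 0.9, 1.1])      # Q-values for 4 candidate instances
print(select_config(q_row))                 # usually 1 (the highest Q-value)
```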
As a newly emerging computing paradigm, edge computing shows great capability in supporting and boosting 5G and Internet-of-Things (IoT) oriented applications, e.g., scientific workflows, with low-latency, elastic, and on-demand provisioning of computational resources. However, geographically distributed IoT resources are usually interconnected through unreliable communications and ever-changing contexts, which introduces strong heterogeneity, potential vulnerability, and instability in the computing infrastructure at different levels. It thus remains a challenge to enforce high fault tolerance for edge-IoT scientific computing task flows, especially when the supporting infrastructure is deployed in a collaborative, distributed, and dynamic environment that is prone to faults and failures. This work proposes a novel fault-tolerant scheduling approach for edge-IoT collaborative workflows. The proposed approach first conducts a dependency-based task allocation analysis, then leverages a Primary-Backup (PB) strategy for tolerating task failures that occur at edge nodes, and finally designs a deep Q-learning algorithm for identifying a near-optimal workflow task scheduling scheme. We conduct extensive simulation-based case studies on multiple randomly generated workflows and real-world edge-IoT server-position datasets. The results clearly suggest that the proposed method outperforms state-of-the-art competitors in terms of task completion ratio, server active time, and resource utilization.
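The core of the Primary-Backup idea is that each task is replicated on two distinct nodes, so a single node failure cannot lose the task. The sketch below illustrates that placement under an assumed least-loaded heuristic; the node names and load metric are invented, and this is not the paper's scheduling algorithm.

```python
# Minimal Primary-Backup (PB) placement sketch; heuristic and data are assumed.

def place_with_backup(node_load: dict) -> tuple:
    """Return (primary, backup): the two least-loaded distinct edge nodes."""
    ranked = sorted(node_load, key=node_load.get)
    primary, backup = ranked[0], ranked[1]
    node_load[primary] += 1                  # account for the new task copies
    node_load[backup] += 1
    return primary, backup

loads = {"edge-A": 2, "edge-B": 0, "edge-C": 1}
print(place_with_backup(loads))              # ('edge-B', 'edge-C')
```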
In cloud platform applications, the user's goal is to obtain high-quality application services, while the service provider's goal is to earn revenue by executing the tasks users submit. The platform built from the service provider's application resources must improve the mapping between service requests and resources to achieve higher value. A review of current resource management in cloud environments shows that many task scheduling and resource allocation algorithms are still hampered by factors such as the diversity, dynamics, and multiple constraints of resources and tasks. This paper focuses on task scheduling and resource configuration for Software as a Service (SaaS) applications in a dynamic and uncertain cloud environment. Automatically and intelligently allocating the user task requests that continually reach SaaS applications to appropriate resources for execution is a challenging online scheduling problem. To this end, a real-time task scheduling method based on deep reinforcement learning is proposed that performs this allocation so that the limited virtual machine resources rented by SaaS providers are used in a balanced and efficient manner. In experiments against five other task scheduling algorithms, the proposed algorithm is shown not only to improve the efficiency of deploying workflows in an IaaS public cloud but also to ensure that the resources provisioned for SaaS are used in a balanced and efficient manner.
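Since balanced VM usage is the stated objective, a natural reward signal penalizes uneven utilization. The sketch below is one hypothetical way to express that; the balance term (negative standard deviation of per-VM utilization) and the latency weight are assumptions, as the paper's exact reward is not given here.

```python
# Hypothetical DRL scheduling reward favoring balanced VM utilization.
import numpy as np

def scheduling_reward(vm_utilization: np.ndarray, task_latency_s: float,
                      w_balance: float = 1.0, w_latency: float = 0.1) -> float:
    """Higher when utilization is even across VMs and task latency is low."""
    return -(w_balance * float(np.std(vm_utilization))
             + w_latency * task_latency_s)

# Three VMs at nearly equal utilization, 2 s task latency.
print(scheduling_reward(np.array([0.60, 0.62, 0.58]), task_latency_s=2.0))
```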
In the realm of Natural Language Processing (NLP), Abstract Text Summarization (ATS) holds a crucial position, involving the transformation of lengthy textual content into concise summaries while retaining essential i...
Based on the Deep Q-Network (DQN) reinforcement learning algorithm, an active fault-tolerance method with incremental actions is proposed for the control system of a once-through steam generator (OTSG) subject to sensor faults. We first establish an OTSG model as the interaction environment for the reinforcement learning agent. The agent chooses an action according to the system state obtained from the pressure sensor; the incremental actions allow it to gradually approach the optimal strategy for the current fault, and the agent then updates its network using the rewards obtained during the interaction. In this way, the active fault-tolerant control process of the OTSG is transformed into the reinforcement learning agent's decision-making process. Comparison experiments against a traditional reinforcement learning (RL) algorithm with fixed strategies show that the proposed active fault-tolerant controller provides accurate and rapid control under sensor faults, stabilizing the OTSG pressure near the set-point value so that the OTSG runs normally and stably.
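The incremental-action idea can be sketched as the DQN's discrete actions being small adjustments added to the current control output, letting the controller creep toward the optimum under a faulty sensor. The increment set and clipping range below are illustrative assumptions, not the paper's values.

```python
# Minimal incremental-action sketch: actions adjust, rather than set, the control.

INCREMENTS = [-0.10, -0.01, 0.0, 0.01, 0.10]   # assumed discrete action set

def apply_incremental_action(control: float, action_idx: int,
                             lo: float = 0.0, hi: float = 1.0) -> float:
    """Add the chosen increment to the control signal, clipped to a valid range."""
    return max(lo, min(hi, control + INCREMENTS[action_idx]))

u = 0.50
u = apply_incremental_action(u, 4)   # +0.10 -> 0.60
u = apply_incremental_action(u, 1)   # -0.01 -> 0.59
print(round(u, 2))
```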
Reinforcement learning has emerged as a prominent technique for enhancing robot obstacle avoidance capabilities in recent years. This research provides a comprehensive overview of reinforcement learning methods, focus...
ISBN (print): 9781665462501
The 5G ecosystem is shaping the future of communication networks, enabling innovation and digital transformation not only for individual users but also for companies, industries, and communities. In this scenario, technologies such as Software Defined Networking (SDN) offer telecommunications providers a way to create agile, scalable, and efficient platforms capable of meeting 5G requirements. However, as network environments and systems grow increasingly complex, both in size and in dynamic behavior, the number of vulnerabilities in them can be very high. In addition, attackers are continuously improving intrusion methods, which are becoming harder to detect. For this reason, in this study we deploy a system based on a Reinforcement Learning (RL) agent capable of applying different countermeasures to defend a network against intrusion and DDoS attacks using SDN. The approach is framed as a serious game in which a defender and an attacker take actions based on observations of the environment, i.e., the current network status. Defenders and attackers are trained with the Deep Q-Learning (DQN) algorithm and several of its variations, namely Prioritized Replay, Dueling, and Double DQN, and their results are compared to find the best attack-mitigation strategy. The results of this paper show that RL algorithms can successfully produce more versatile agents able to interpret and adapt to different situations and thus deploy the best countermeasure to protect the network. The results also show that the Complete strategy, which combines the three DQN variations analyzed, yields agents with the best decision-making for responding to attacks.
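Of the variations named above, Double DQN has a compact core: the online network selects the next action and the target network evaluates it, reducing Q-value overestimation. The sketch below shows only that target computation; the stand-in linear networks and the 4-dim state / 3 countermeasures are assumptions, not the paper's SDN encoding.

```python
# Double DQN target computation sketch (PyTorch); network shapes are assumed.
import torch
import torch.nn as nn

online = nn.Linear(4, 3)    # stand-in Q-nets: 4-dim state, 3 countermeasures
target = nn.Linear(4, 3)

def double_dqn_target(reward, next_state, gamma=0.99):
    with torch.no_grad():
        best_action = online(next_state).argmax(dim=1, keepdim=True)   # select
        q_next = target(next_state).gather(1, best_action).squeeze(1)  # evaluate
    return reward + gamma * q_next

r = torch.tensor([1.0, 0.0])
s_next = torch.randn(2, 4)
print(double_dqn_target(r, s_next))   # one bootstrapped target per transition
```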
Indoor temperature and relative humidity control in office buildings is crucial, as it affects the thermal comfort, work efficiency, and even health of the occupants. In China, fan coil units (FCUs) are widely used as air-conditioning equipment in office buildings. Conventional FCU control methods often ignore the impact of indoor relative humidity on occupants by treating indoor temperature as the single control objective. This study took FCUs with a fresh-air system in an office building in Beijing as the research object and proposed a deep reinforcement learning (RL) control algorithm to adjust the air supply volume of the FCUs. To improve the joint satisfaction rate for indoor temperature and relative humidity control, the proposed RL algorithm adopts the deep Q-network algorithm. To train it, a detailed simulation environment was built in the Transient System Simulation Tool (TRNSYS), comprising a building model and a model of the FCUs with the fresh-air system. The simulation environment interacts with the RL agent in real time through a self-developed TRNSYS-Python co-simulation platform, on which the RL algorithm was trained, tested, and evaluated. The results indicate that, compared with traditional on/off and rule-based controllers, the proposed RL algorithm increases the joint satisfaction rate for indoor temperature and relative humidity by 12.66% and 9.5%, respectively. This study provides preliminary direction for deep reinforcement learning control strategies for indoor temperature and relative humidity in office building heating, ventilation, and air-conditioning (HVAC) systems.
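The joint-satisfaction objective suggests a reward that pays off only when temperature and humidity are both within their comfort bands. The sketch below is one hypothetical encoding of that; the band limits are assumptions, not the study's setpoints.

```python
# Hypothetical joint comfort reward for the FCU control agent.

TEMP_BAND = (22.0, 26.0)    # assumed comfort range, deg C
RH_BAND = (40.0, 60.0)      # assumed comfort range, % relative humidity

def joint_comfort_reward(temp_c: float, rh_pct: float) -> float:
    """+1 only when BOTH variables are in band, matching joint satisfaction."""
    temp_ok = TEMP_BAND[0] <= temp_c <= TEMP_BAND[1]
    rh_ok = RH_BAND[0] <= rh_pct <= RH_BAND[1]
    return 1.0 if (temp_ok and rh_ok) else -1.0

print(joint_comfort_reward(24.5, 55.0))   # 1.0
print(joint_comfort_reward(24.5, 70.0))   # -1.0 (humidity out of band)
```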