ISBN: 9788995003848 (print)
The main advantage of reinforcement learning is that it can provide solutions the designer did not anticipate. This study shows how a mobile robot can acquire unexpected forms of motion through reinforcement learning. The results show that a mobile robot capable of two-dimensional movement can acquire unexpected motion forms for both forward motion and rotation. The mechanisms behind these motions were investigated in order to understand how they are obtained. Moreover, since the system has a two-dimensional factor, this study examines how learned knowledge is forgotten during learning. In addition, it examines the learning of a knowledge manipulation method that yields new learning results with respect to the two-dimensional factor.
ISBN: 9781479900930 (print)
In recent years, energy overconsumption, arising from the rapid growth of the Internet's scale and service types, has become increasingly serious. In this context, the Information and Communication Technology (ICT) sector has devoted extensive attention to research on green networking. From an energy-saving point of view, this paper designs a power consumption model and a QoS model for unicast. Furthermore, it proposes a unicast routing algorithm for the green Internet based on the Chandy-Misra algorithm and the Q-learning algorithm, and compares it with an ant-colony-based self-adaptive energy-saving routing algorithm with respect to power consumption, routing success rate, and running time. The results show that the proposed intelligent unicast routing algorithm can effectively reduce network energy consumption while maintaining good performance.
ISBN: 9783037859728 (print)
To maximize total profit and improve the service level, a new approach to dynamic joint decision-making on price and delivery date in make-to-order (MTO) manufacturing firms is proposed, based on queuing theory and the Q-learning algorithm. Simulation results show that the proposed algorithm outperforms a static price and delivery date policy in both total profit and service level. The total profit does not increase with the growing number of accepted orders, and the number of accepted orders must match the production capacity.
ISBN: 9781728176840 (print)
In this paper, we propose a UAV dynamic path planning algorithm to solve the path planning problem of a single UAV in a dynamic environment. The contributions of this paper are twofold: (1) global and local path planning are combined to improve planning efficiency; (2) an improved Q-learning algorithm is combined with the artificial potential field method to solve the problem that effective path planning cannot be performed between two path nodes. Finally, MATLAB simulation results demonstrate the effectiveness of the algorithm.
ISBN: 9781479970162 (print)
In this paper, two adaptive inventory control models, one centralized and one decentralized, are established for a multi-echelon, multi-cycle supply chain consisting of one supplier and one retailer with non-stationary stochastic demand. In the centralized model, the supplier uses a vendor-managed inventory replenishment policy and the retailer does not keep any stock. The supplier uses an improved exponential smoothing method to forecast future demand, the EOQ model to determine the replenishment quantity for the retailer, and an adaptive approach to determine its safety stock against demand fluctuation. A reinforcement learning algorithm is adopted to select a proper safety factor according to the stochastic demand. In the decentralized model, by contrast, the supplier and the retailer each hold their own inventory and safety stock; that is, they control their inventory independently. In both cases, the aim is to satisfy a predefined target service level. The simulation study considers two types of demand patterns, stationary and non-stationary. The bullwhip effect generated in the course of forecasting and processing demand information is analyzed. The results show that the proposed method can satisfy the given service level and mitigate the bullwhip effect to some extent.
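The building blocks named in this abstract (exponential smoothing, the EOQ model, and a safety stock scaled by a safety factor) can be illustrated with their textbook forms. The sketch below is only an illustration under stated assumptions: it uses plain simple exponential smoothing rather than the paper's improved variant, the classic EOQ formula, and the common rule "safety stock = safety factor × demand standard deviation × √lead time". All function and parameter names are hypothetical and do not come from the paper.

```python
import math


def exponential_smoothing(demands, alpha):
    """Simple exponential smoothing; returns the one-step-ahead forecast.

    Assumption: plain smoothing, not the improved variant used in the paper.
    """
    forecast = demands[0]
    for demand in demands[1:]:
        forecast = alpha * demand + (1 - alpha) * forecast
    return forecast


def eoq(demand_rate, order_cost, holding_cost):
    """Classic economic order quantity: sqrt(2 * D * K / h)."""
    return math.sqrt(2 * demand_rate * order_cost / holding_cost)


def safety_stock(safety_factor, demand_std, lead_time):
    """Textbook safety stock: k * sigma * sqrt(L).

    In the abstract, the safety factor k is what the reinforcement
    learning algorithm selects; here it is simply passed in.
    """
    return safety_factor * demand_std * math.sqrt(lead_time)
```

For instance, `eoq(forecast, order_cost, holding_cost)` could be evaluated with the forecast returned by `exponential_smoothing`, and a learning agent could then try different values of `safety_factor` and observe the resulting service level.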
ISBN: 9781728171005 (print)
Smart meters (SMs) play a pivotal role in the smart grid by being able to report the electricity usage of consumers to the utility provider (UP) almost in real time. However, this could leak sensitive information about the consumers to the UP or a third party. Recent works have leveraged the availability of energy storage devices, e.g., a rechargeable battery (RB), in order to provide privacy to the consumers with minimal additional energy cost. In this paper, a privacy-cost management unit (PCMU) is proposed based on a model-free deep reinforcement learning algorithm called deep double Q-learning (DDQL). Empirical results evaluated on actual SM data are presented to compare DDQL with the state of the art, i.e., classical Q-learning (CQL). Additionally, the performance of the method is investigated for two concrete cases where attackers aim to infer the actual demand load and the occupancy status of dwellings. Finally, an abstract information-theoretic characterization is provided.
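The abstract does not describe the DDQL update itself. As a rough illustration of what distinguishes double Q-learning from classical Q-learning, the sketch below computes the standard double-Q bootstrap target: one set of value estimates (here called "online") picks the next action, and a second set ("target") evaluates it. The function name, argument names, and the use of plain Python lists in place of neural network outputs are all assumptions made for this example, not details from the paper.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard double Q-learning target for one transition.

    next_q_online: action values for the next state from the online estimator
    next_q_target: action values for the next state from the target estimator
    (both are hypothetical stand-ins for network outputs)
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Select the action with the online estimator ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... but evaluate it with the target estimator.
    return reward + gamma * next_q_target[best_action]
```

Classical Q-learning would instead use `reward + gamma * max(next_q_target)`, selecting and evaluating with the same estimates, which is known to overestimate action values.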
ISBN: 9781424401772 (print)
In this paper, agent-based simulation is employed to study power market operation under two alternative pricing systems: uniform and discriminatory (pay-as-bid). Power suppliers are modeled as adaptive agents capable of learning through interaction with their environment, following a reinforcement learning algorithm. The SA-Q-learning algorithm, a slightly modified version of the popular Q-learning algorithm, is used in this paper; it offers a solution to the difficult problem of balancing exploration and exploitation and was chosen for its quick convergence. A test system with five supplier agents is used to study the suppliers' behavior under the uniform and the pay-as-bid pricing systems.
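The abstract does not spell out how SA-Q-learning balances exploration and exploitation. One common formulation, assumed here and not confirmed by this abstract, borrows the Metropolis acceptance rule from simulated annealing: a random action replaces the greedy one with probability exp((Q(s, random) − Q(s, greedy)) / T), and the temperature T is lowered over time so that behavior becomes increasingly greedy. The function below is a minimal sketch of that rule; its name and signature are hypothetical.

```python
import math
import random


def metropolis_action(q_row, temperature, rng):
    """Choose an action for one state using a Metropolis-style rule.

    q_row: list of action values for the current state
    temperature: annealing temperature (higher means more exploration)
    rng: a random.Random instance
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    if temperature <= 0:
        return greedy
    candidate = rng.randrange(len(q_row))
    # The exponent is never positive because the greedy action has the largest value.
    accept_prob = math.exp((q_row[candidate] - q_row[greedy]) / temperature)
    return candidate if rng.random() < accept_prob else greedy
```

At a high temperature almost every random candidate is accepted, while near zero the rule reduces to greedy selection, so a decaying temperature schedule moves the agent from exploration to exploitation.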
ISBN: 9781538678138 (print)
Instead of using a classical offline data-driven optimization technique for traffic network signal control, this work explores the potential of an online data-driven optimization technique. A dynamic modeling technique is proposed that uses the Q-learning (QL) algorithm to observe and learn the inflow-outflow traffic behavior online and to extract the model parameters used to update the evaluation model in the fitness function of a genetic algorithm (GA). The proposed GA with dynamic modeling is called dyna-GA. Dyna-GA is then integrated into a hierarchical multi-agent traffic signal control system consisting of two layers. The lower layer consists of several local agents that have autonomy in controlling their local intersections, whereas the upper layer consists of one supervisory agent that has jurisdiction over all the local agents. The supervisory agent can override a local control decision when a conflict occurs. The robustness of the proposed dyna-GA under several traffic scenarios is tested using a simulated arterial traffic network. The simulation results show that the proposed dyna-GA performs better in minimizing travel delay than the classical GA, which does not have the dynamic model.
ISBN: 9781424442409 (print)
A reinforcement learning (RL) method applied to dynamic load allocation in an AGC system is presented. The problem can be modeled as a Markov Decision Process (MDP). The Q-learning algorithm is introduced as a model-free learning algorithm. It learns an optimal action strategy from experience by exploring an unknown system and receiving rewards. Rewards are chosen to express how well actions control the system. Applications of the Q-learning algorithm to a two-area power system model and the China Southern Power Grid model are presented. The case study shows that the Q-learning algorithm enhances the performance of the AGC system under CPS.
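The learning loop this abstract describes (explore an unknown system, receive rewards, improve the action strategy) can be illustrated with standard tabular Q-learning. The sketch below is a generic example, not the paper's AGC controller: the five-state chain environment, the hyperparameter values, and all names are assumptions chosen only to keep the example small and runnable.

```python
import random


def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.

    step(state, action) must return (next_state, reward, done).
    Every episode starts in state 0.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                # Greedy choice, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Model-free update: only the observed transition is used.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def chain_step(state, action):
    """Toy environment (hypothetical): action 1 moves right, action 0 moves left.

    Reaching state 4 ends the episode with reward 1; all other rewards are 0.
    """
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done
```

Calling `q_learning(5, 2, chain_step)` returns a table of action values from which a greedy policy can be read off per state. In an application such as the one in the abstract, the states, actions, and reward function would have to be defined from the control problem itself.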
ISBN: 9781424495375 (print)
This study achieves Quality-of-Service (QoS) management in heterogeneous networking using a distributed multiagent scheme (DMAS) based on the concept of cooperation and an awareness algorithm. The proposed scheme is developed to support QoS management in a user-accepted and cost-effective fashion. It consists of a collection of problem-solving agents with three modules: the knowledge source, the in-cloud blackboard system, and the control engine built into the scheme. A set of problem-solving agents autonomously process local tasks and cooperatively interoperate via the in-cloud blackboard system to guarantee QoS. The awareness algorithm, the Q-learning algorithm, calculates the expected rewards of a handoff to each access network. These rewards are then used by the problem-solving agents to determine what to do. Through operations and cooperation among the active agents, a policy is selected and a user-accepted schedule that meets the specified QoS is generated. Compared with traditional QoS management mechanisms, the proposed DMAS scheme has a 36% lower packet loss ratio in video streaming applications and a 34% lower average delay in VoIP applications, with only a minor sacrifice in system computational complexity.