Agents' learning behavior usually exhibits biased judgments influenced by many internal and external factors. We incorporate an improved Q-learning algorithm into reinforcement learning and examine it with the prisoner's dilemma game on activity-driven networks. A heterogeneous learning rate and an epsilon-greedy exploration mechanism are taken into account when modeling agents' decision-making. Simulation results show that the proposed reinforcement learning mechanism is conducive to the emergence of defective behavior, i.e., defection can maximize an agent's expected payoff regardless of its neighbors' strategies. In addition, we find that the density of defectors is proportional to the temptation gain, the vision level, and the number of edges connected by activated agents. Interestingly, when the inherent learning rate is small, increasing the exploration rate can suppress the appearance of defectors; conversely, when it is large, increasing the exploration rate reduces the number of defectors only insignificantly.
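As a rough illustration of the ingredients named in this abstract (Q-learning, per-agent learning rates, epsilon-greedy exploration, activity-driven pairing), a minimal sketch follows. It is not the authors' improved algorithm: the payoff values (R=1, P=0.1, S=0, T=b), the single-state Q-table, the uniform draw of learning rates, and the names `QAgent`, `payoff`, and `simulate` are all assumptions made for illustration, and the vision level is not modeled.

```python
import random

C, D = 0, 1  # cooperate, defect


def payoff(my_action, other_action, b=1.5):
    """Row player's payoff. Assumed values: R=1, S=0, T=b (temptation gain), P=0.1."""
    table = {(C, C): 1.0, (C, D): 0.0, (D, C): b, (D, D): 0.1}
    return table[(my_action, other_action)]


class QAgent:
    """Single-state Q-learner with its own learning rate and exploration rate."""

    def __init__(self, alpha, epsilon, rng, gamma=0.9):
        self.alpha = alpha      # heterogeneous: each agent gets its own value
        self.epsilon = epsilon  # probability of a random (exploratory) action
        self.gamma = gamma      # discount factor
        self.rng = rng
        self.q = [0.0, 0.0]     # Q-values for C and D

    def act(self):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if self.rng.random() < self.epsilon or self.q[C] == self.q[D]:
            return self.rng.choice((C, D))
        return C if self.q[C] > self.q[D] else D

    def update(self, action, reward):
        # Standard Q-learning update, specialized to a single state.
        target = reward + self.gamma * max(self.q)
        self.q[action] += self.alpha * (target - self.q[action])


def simulate(n_agents=50, steps=200, m=2, activity=0.3, b=1.5, epsilon=0.1, seed=0):
    """Return the fraction of defecting moves over all games played.

    Each step, every agent activates with probability `activity` and plays
    one game with each of `m` randomly chosen partners (a simplified
    activity-driven network with a uniform activity rate).
    """
    rng = random.Random(seed)
    agents = [QAgent(rng.uniform(0.05, 0.5), epsilon, rng) for _ in range(n_agents)]
    defections, moves = 0, 0
    for _ in range(steps):
        for i, agent in enumerate(agents):
            if rng.random() >= activity:
                continue
            others = [j for j in range(n_agents) if j != i]
            for j in rng.sample(others, m):
                a_i, a_j = agent.act(), agents[j].act()
                agent.update(a_i, payoff(a_i, a_j, b))
                agents[j].update(a_j, payoff(a_j, a_i, b))
                defections += (a_i == D) + (a_j == D)
                moves += 2
    return defections / moves if moves else 0.0
```

Calling `simulate(b=1.2)` and `simulate(b=1.8)` gives one way to probe how the temptation gain affects the defector fraction in this toy setting; any trend observed here says nothing about the paper's reported results.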
ISBN: 9781479900763 (print)
This paper investigates enhanced Inter-Cell Interference Coordination (eICIC) techniques for Heterogeneous Networks (HetNets), and models this strategic coexistence as a multi-player system in which interference management strategies inspired by a form of reinforcement learning known as distributed Q-learning are [...]. [...], this paper focuses on time-domain eICIC techniques in which each macrocell optimizes its Almost Blank Subframe (ABS) configuration, consisting of ABS density and reduced power [...]. Instead of relying on a predefined configuration, the system is designed to learn optimal ABS configurations by directly interacting with the environment. To substantiate our theoretical findings, system-level simulations are carried out in which our proposed solution is compared with the conventional approach in which the ABS configuration is fixed. The proposed solution is shown to yield substantial user throughput gains compared to a fixed ABS configuration.
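The idea of a macrocell learning its ABS configuration by trial and error, rather than using a fixed one, can be sketched as a stateless epsilon-greedy Q-learner over a small set of candidate configurations. Everything concrete here is an assumption for illustration: the candidate values in `ACTIONS`, the function `toy_reward` (a made-up stand-in for measured user throughput, not a system-level simulator), and the name `learn_abs_config`. In a distributed setting each macrocell would run its own copy of such a learner.

```python
import random

# Hypothetical candidate configurations: (ABS density, fraction of transmit
# power kept during ABS). The values are illustrative only.
ACTIONS = [(d, p) for d in (0.125, 0.25, 0.375, 0.5) for p in (0.0, 0.5)]


def toy_reward(density, power):
    """Made-up stand-in for observed throughput; peaks at (0.375, 0.5) by construction."""
    return 1.0 - 2.0 * abs(density - 0.375) - 0.6 * abs(power - 0.5)


def learn_abs_config(steps=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Learn the best configuration by interacting with the (toy) environment."""
    rng = random.Random(seed)
    q = [0.0] * len(ACTIONS)  # one Q-value per candidate configuration
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))                    # explore
        else:
            a = max(range(len(ACTIONS)), key=q.__getitem__)    # exploit
        reward = toy_reward(*ACTIONS[a])
        q[a] += alpha * (reward - q[a])  # stateless Q-learning update
    best = max(range(len(ACTIONS)), key=q.__getitem__)
    return ACTIONS[best], q
```

A fixed-configuration baseline corresponds to always applying one entry of `ACTIONS`; the learner instead shifts toward whichever entry the environment rewards most.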
In this paper, we propose a centralized adaptive inventory control model for a multi-level, multi-cycle supply chain consisting of one supplier and one retailer with non-stationary random demand. In our approach, the fuzzy exponential smoothing method is adopted to forecast future demand, and the EOQ (Economic Order Quantity) model determines the order quantity. [...], a reinforcement learning algorithm is developed to evaluate the effects of safety [...]. The objective is to satisfy a given target service level predefined for the [...]. Two types of demand process patterns, known and unknown demand distribution, are [...]. [...], the bullwhip effect generated while processing demand information is [...]. [...] results show that the proposed control method can improve the service level and reduce the bullwhip effect to some extent.
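The forecasting and ordering steps mentioned in this abstract can be illustrated with textbook formulas. The sketch below uses plain (not fuzzy) exponential smoothing as a simplified stand-in, the classic EOQ formula sqrt(2DK/h), and the common safety-stock rule z * sigma * sqrt(lead time). The function names and parameters are assumptions for illustration, and the reinforcement learning component (which would adjust the safety setting) is not shown.

```python
import math


def exp_smooth(demands, alpha=0.3):
    """Simple exponential smoothing; returns the forecast for the next period."""
    forecast = demands[0]
    for d in demands[1:]:
        forecast = alpha * d + (1 - alpha) * forecast
    return forecast


def eoq(demand_rate, order_cost, holding_cost):
    """Classic Economic Order Quantity: sqrt(2 * D * K / h)."""
    return math.sqrt(2 * demand_rate * order_cost / holding_cost)


def safety_stock(z, sigma, lead_time):
    """Common safety-stock rule: safety factor * demand std dev * sqrt(lead time)."""
    return z * sigma * math.sqrt(lead_time)


def reorder_point(forecast, lead_time, z, sigma):
    """Expected lead-time demand plus safety stock."""
    return forecast * lead_time + safety_stock(z, sigma, lead_time)
```

In an adaptive scheme of the kind described, the safety factor `z` would be the quantity a learning algorithm tunes to reach the target service level; here it is simply an input.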
ISBN: 9781450372930 (print)
Port power, as one of the important scenarios for promoting energy substitution, offers great substitution potential and strong interaction capability. To explore how port power can participate in power grid regulation, this paper proposes a participation strategy based on multi-agent system (MAS) technology. First, the controllable resources are analyzed and modeled from both the port power side and the power grid side, that is, the interactive resources on both the supply and demand sides, which lays the foundation for the key technology research. Second, to reflect the economic and environmental benefits of energy substitution, and to take into account the benefits of port ships, ports, power grids, and the government, an MAS-based optimization is proposed. Finally, the Q-learning algorithm is used to solve the model, and an example verifies the effectiveness of the model and the control strategy.
ISBN: 9781479970186 (print)
In this paper, two adaptive inventory control models, centralized and decentralized respectively, for a multi-echelon, multi-cycle supply chain consisting of one supplier and one retailer with non-stationary stochastic demand were [...]. In the centralized model, the vendor-managed inventory replenishment policy was used by the supplier, and the retailer didn't keep any [...]. An improved exponential smoothing method was used by the supplier to forecast future demand. The EOQ model was used by the supplier to determine the replenishment quantity for the retailer, and an adaptive approach was used by the supplier to determine its safety stock to guard against demand [...]. A reinforcement learning algorithm was adopted to select a proper safety factor according to the stochastic [...]. On the contrary, in the decentralized model, both the supplier and the retailer hold their own inventory and safety stock for themselves [...]. That is, they control their own inventory [...]. In both cases, the aim is to satisfy the given target service level [...]. In our simulation study, two types of demand patterns, stationary and non-stationary demand, are considered [...]. The bullwhip effect generated in the course of forecasting and processing demand information was [...]. [...] results show that the proposed method can satisfy the given service level and mitigate the bullwhip effect to some extent.
ISBN: 9781479900961 (print)
In recent years, energy overconsumption problems arising from the rapid growth of Internet scale and service types have become increasingly serious. In this context, the Information and Communication Technology (ICT) sector has paid extensive attention to research on green networking. From an energy-saving point of view for the Internet, this paper designs a power consumption model and a QoS model for unicast. Furthermore, it proposes a unicast routing algorithm for the green Internet based on the Chandy-Misra algorithm and the Q-learning algorithm, which is compared with an ant colony algorithm based self-adaptive energy-saving routing scheme with respect to power consumption, routing success rate, and running time. Results show that the proposed intelligent unicast routing algorithm can effectively reduce network energy consumption while guaranteeing good performance.
Due to the periodicity and rhythm of flood disasters, the government needs to regulate reserve enterprises to ensure the regular maintenance and renewal of flood control supplies. However, due to information asymmetry, the regulation effect is not ideal. In this paper, a Markov game model of the government and the reserve enterprise is established, and the equilibrium solution of the government-enterprise game is obtained using the Q-learning algorithm. The feasibility of this method is demonstrated through the analysis of an example, and some suggestions are put forward for the substitute storage mechanism of flood control supplies.