ISBN: 9789881563811 (print)
Reinforcement learning aims to obtain an optimal or suboptimal strategy through trial and error and interaction with a dynamic environment. After an introduction to the basics of reinforcement learning, the TD, Q-learning, Dyna, and Sarsa algorithms based on the Markov decision model are discussed in turn. Reinforcement learning based on the partially observable Markov decision process and on the semi-Markov decision model for uncertain environments is then analyzed. The research status of Q-learning in the field of multi-robot systems is also presented. Finally, the main challenges and directions for further research are given.
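As an illustration of two of the algorithms named in this abstract (not code taken from the paper), the standard tabular Q-learning and Sarsa updates can be sketched as below. The function names, the dictionary-based Q-table, and the default values of `alpha` and `gamma` are assumptions chosen for the example.

```python
from typing import Dict, Hashable, Iterable, Tuple

# Hypothetical Q-table type: maps (state, action) to an estimated return.
QTable = Dict[Tuple[Hashable, Hashable], float]


def q_learning_update(q: QTable, state, action, reward: float, next_state,
                      actions: Iterable, alpha: float = 0.1,
                      gamma: float = 0.9) -> float:
    """Off-policy TD update: bootstrap from the best action in next_state."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def sarsa_update(q: QTable, state, action, reward: float, next_state,
                 next_action, alpha: float = 0.1,
                 gamma: float = 0.9) -> float:
    """On-policy TD update: bootstrap from the action actually taken next."""
    next_value = q.get((next_state, next_action), 0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * next_value - old)
    return q[(state, action)]
```

The only difference between the two is the bootstrap target: Q-learning uses the maximum over next actions, while Sarsa uses the value of the next action chosen by the current policy.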
As fertile ground for free speech, the Internet lets everyone express their emotions, cognition, and views, especially in the Web 2.0 era, in which blogs provide a wider space for personal speech. The Internet gives contemporary college students a broad platform for expressing thoughts and emotions, and at the same time gives educators a channel for understanding students' thinking and mental state. By monitoring information on the campus network in real time, we can obtain real-time status, mood, and other dynamic data that portray the emotional characteristics of the entity concerned; we can also abstract features that characterize psychological crisis information and use them for data mining on these data. In this way we can identify groups of students with a tendency toward psychological crisis and apply corresponding online and offline intervention methods to ease the crisis and prevent tragic events in time.
With the rapid growth of the Internet and its data scale, B2B (business-to-business) commerce, whose advantages in speed and high availability rest on the Internet, is taking more and more market share from traditional business. In recent years, new data processing technologies such as cloud computing have enhanced computing capacity, making it possible for researchers to process and analyze massive business data, including large-scale customer data and commodity information. A new B2B platform framework based on cloud computing, with less running time and better response efficiency, is proposed here to improve transaction handling efficiency. Drawing on complex network theory and cloud computing technology, the new platform has been evaluated and found to decrease transaction handling time and increase benefits.
ISBN: 9781467313988 (print)
Stripe-based wireless sensor networks can be deployed in many scenarios, such as roads, bridges, tunnels, and metros. The deployment of this type of network must be considered in detail because the network is characterized by long distance and narrow width. The paper analyzes deployment with balanced energy consumption for this kind of network. The discussion is organized according to two types of network width. The main factors that affect network lifetime are analyzed in detail and a simulation is presented. The simulation results show that a short network width has little effect on network lifetime, and that the maximum network lifetime cannot be reached when nodes have a short communication radius. A long network width, by contrast, can affect network lifetime. Network lifetime can be maximized when nodes have a long communication range.
This paper proposes a novel budget model based on a differential game to deal with budget allocation in competitive search advertising under a finite time horizon, with consideration of budget constraints. We extend the advertising response function with the dynamic advertising effort u and quality score q to fit search advertising scenarios. We also discuss the Nash equilibria of our model, and study some desirable properties of two kinds of equilibria in the case with budget constraints: the "budget-stable" open-loop Nash equilibrium (BS-OLNE) and the "budget-unstable" open-loop Nash equilibrium (BUS-OLNE). We evaluate our budget model and the identified properties with computational experiments. Experimental results show that budget strategies with dynamic advertising elasticity are superior to those with fixed elasticity, and that our findings on OLNEs help advertisers make budget decisions.
Emerging research on team science aims at better understanding the key contextual factors during the trans-disciplinary scientific collaboration process and enhancing the outcomes of large-scale collaborative research programs. Team science can range from a few participants working at the same site to numerous researchers dispersed across multiple geographic and organizational venues. This article outlines the fundamental conceptual framework of a next-generation team-science-enabling platform (TSEP).
Authors: Dong, X.-S.; Xiong, Gang; Fan, Dong; Zhu, F.-H.; Xie, Li
Affiliations:
Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
Chinese Acad Sci, Cloud Comp Ind Technol Innovat & Incubat Ctr, Dongguan Res Inst CASIA, Dongguan 523808, Peoples R China
Zhejiang Univ, Dept Informat Sci & Elect Engn, Zhejiang Prov Key Lab Informat Network Technol, Hangzhou 310027, Zhejiang, Peoples R China
ISBN: 9781467313988 (print)
Bus Rapid Transit (BRT) is an effective way to improve urban traffic conditions, but its complexity makes operation management and scheduling difficult. In this article, a Parallel BRT System is constructed based on the ACP approach. The system can detect real-time passenger flow at stations, traffic flow at stations and intersections, and vehicle queue length on the road; provide short-term passenger and traffic saturation prediction so that transportation management can be arranged in time and congestion relieved; assess, improve, and optimize emergency management for holidays, events, accidents, and other emergencies; and improve the quality of real-time scheduling based on measurements detected from video. The system has been applied to the Guangzhou Zhongshan Avenue BRT for monitoring, warning, forecasting, emergency management, real-time scheduling, and other needs, to help keep the BRT smooth, safe, efficient, and reliable.
The chemical industry is a complex, continuous process industry, and the control and management of its long-term safe operation involve a great deal of information and data on staff, management, equipment and techn...
Multiple-target tracking in complex scenes is one of the most complicated problems in computer vision. Handling the occlusion between objects is the key issue in multiple target tracking. This paper presents an occlus...
It is important for a chemical plant to find a suitable performance appraisal method. In this paper, based on the ACP (artificial system, computational experiment, and parallel execution) theory and the PageRank algor...