ISBN: (digital) 9781728162126
ISBN: (print) 9781728162133
Deep reinforcement learning (DRL) is capable of learning high-performing policies on a variety of complex high-dimensional tasks, ranging from video games to robotic manipulation. However, standard DRL methods often suffer from poor sample efficiency, partially because they aim to be entirely problem-agnostic. In this work, we introduce a novel approach to exploration and hierarchical skill learning that derives its sample efficiency from intuitive assumptions it makes about the behavior of objects, both in the physical world and in simulations that mimic physics. Specifically, we propose the Hypothesis Proposal and Evaluation (HyPE) algorithm, which discovers objects from raw pixel data, generates hypotheses about the controllability of observed changes in object state, and learns a hierarchy of skills to test these hypotheses. We demonstrate that HyPE can dramatically improve the sample efficiency of policy learning in two different domains: a simulated robotic block-pushing domain and a popular benchmark task, Breakout. In these domains, HyPE learns high-scoring policies an order of magnitude faster than several state-of-the-art reinforcement learning methods.
Chinese has generally been regarded as a Subject-Verb-Object (SVO) language, and its basic semantic unit is the Chinese word, which usually consists of two or more Chinese characters. However, word-centered...
In this paper, the behavior of teleoperation systems with modeling error and delay-time error in the Smith predictor is discussed. In teleoperation systems there is usually a large distance between the master system and the slave system. In this case there is always some error in the model of the system. A condition for the stability of teleoperation systems with modeling error is derived by introducing a theorem. This theorem can assist a designer in ensuring the stability of the teleoperation system. Delay-time error and the stability of teleoperation systems that use the Internet as the communication channel are also discussed. The effect of delay-time prediction on system stability and performance is studied, and it is shown that delay-time prediction can improve system performance. Simulation results are presented to verify the obtained results.
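To make the role of the delay-time estimate concrete, the following is a minimal sketch of a Smith predictor loop, not the paper's teleoperation model. Everything in it is an assumption chosen for illustration: a first-order discrete plant, a PI controller, the gains, and the function name simulate_smith_predictor. The predictor feeds back the measured output plus the difference between a delay-free model and a delayed model, so the controller effectively acts on the delay-free model when the assumed delay matches the true one.

```python
def delayed(history, k, delay):
    """Return the input issued `delay` steps before step k (0.0 before the start)."""
    return history[k - delay] if k - delay >= 0 else 0.0


def simulate_smith_predictor(true_delay, model_delay, steps=400,
                             a=0.9, b=0.1, kp=1.0, ki=0.1, setpoint=1.0):
    """Simulate a PI loop with a Smith predictor; all parameters are hypothetical."""
    y = 0.0            # plant output, subject to the true delay
    ym_fast = 0.0      # model output without delay
    ym_delayed = 0.0   # model output with the assumed (estimated) delay
    integral = 0.0
    u_hist, y_hist = [], []
    for k in range(steps):
        # Smith predictor feedback: measurement plus the model's delay-free correction.
        feedback = y + (ym_fast - ym_delayed)
        error = setpoint - feedback
        u = kp * error + ki * integral
        integral += error
        u_hist.append(u)

        # Advance the plant and both copies of the model by one step.
        y = a * y + b * delayed(u_hist, k, true_delay)
        ym_fast = a * ym_fast + b * u
        ym_delayed = a * ym_delayed + b * delayed(u_hist, k, model_delay)
        y_hist.append(y)
    return y_hist


matched = simulate_smith_predictor(true_delay=5, model_delay=5)
mismatched = simulate_smith_predictor(true_delay=5, model_delay=8)
```

Comparing the two returned trajectories (for example by plotting them) gives a feel for how an error in the assumed delay changes the response; the stability condition itself is the subject of the paper's theorem and is not reproduced here.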
Future humanoid robots, working beside humans in complex dynamic environments, will be required to perform a wide repertoire of tasks. To this end, traditional methods for deriving a control policy won't succeed, l...
The ultimate goal of humanoid robotics research is to develop humanoid robotic systems capable and flexible enough to handle the challenge of working alongside humans in complex natural environments performing everyda...
ISBN: (print) 0780382730
With the growing number of micro parts, much attention has been paid to the development of micro devices that can handle them. We present a novel wireless microrobot based on piezoelectric deformation. The microrobot can perform both long-distance transport and precise positioning. With three drive units arranged at the apexes of an equilateral triangle, omnidirectional motion can be achieved. The paper mainly discusses the driving principle and the wireless control strategies. Some experimental results are also given to examine the effects of related parameters on its kinematic characteristics.
Humanoid robots are required to perform a wide repertoire of tasks while working beside humans in complex dynamic environments. Learning mechanisms are important for building up this type of repertoire of robot skills, howev...
In this paper we present two advanced methods for evolutionary optimisation. One method is based on Parallel Genetic Algorithms. It is called Cooperating Populations with Different Evolution Behaviours (CoPDEB), and allows each population to exhibit a different evolution behaviour. Results from two problems show the advantage of using a different evolution behaviour on each population. The other method concerns the application of GAs to constrained optimisation problems. It is called the Varying Fitness Function (VFF) method and implements a fitness function with varying penalty terms, added to the objective function to penalise infeasible solutions, in order to help the GA locate the area of the global optimum more easily. Simulation results on two real-world problems show that the VFF method outperforms the classic static fitness function implementations.
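The abstract does not state how the penalty terms vary, so the following is only a sketch of the general idea under an assumed schedule: the penalty weight ramps linearly from zero to a maximum over the run, whereas a static fitness function applies the full weight from the first generation. The function names, the linear ramp, and the numbers are all hypothetical, and a minimisation problem is assumed.

```python
def static_fitness(objective, violation, weight=100.0):
    """Classic static penalty: objective plus a fixed-weight penalty term."""
    return objective + weight * violation


def varying_fitness(objective, violation, generation, max_generations,
                    max_weight=100.0):
    """Penalty weight grows linearly with the generation (assumed schedule)."""
    weight = max_weight * generation / max_generations
    return objective + weight * violation


# Hypothetical infeasible candidate: objective value 5.0, constraint violated by 0.2.
early = varying_fitness(5.0, 0.2, generation=10, max_generations=100)
late = varying_fitness(5.0, 0.2, generation=100, max_generations=100)
static = static_fitness(5.0, 0.2)
```

Under this assumed schedule the same infeasible candidate is penalised lightly early in the run and as heavily as in the static case by the final generation, which lets the population cross infeasible regions at first and tightens the constraint later.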
The ability of autonomous robots to precisely compute their spatial coordinates is an important attribute. In this regard, Visual Odometry (VO) is an appropriate tool for estimating the full pose of a camera placed onboard a robot by analyzing a sequence of images. The paper at hand proposes an accurate, computationally efficient VO algorithm relying exclusively on stereo vision. A non-iterative outlier detection technique capable of efficiently discarding outliers among matched features is suggested. The developed technique is combined with an incremental motion estimation approach to estimate the robot's trajectory. The accuracy of the proposed system has been evaluated both on simulated data and using a real robotic platform. Experimental results from rough terrain routes show remarkable accuracy, with positioning errors as low as 1.1%.
The paper examines the usefulness of Fuzzy Cognitive Maps (FCMs) in modeling complex systems, and specifically their use in modeling manufacturing systems and information from an abstract point of view. Aspects such as Fuzzy Cognitive Map representation and development are presented, and the use of FCMs to develop a behavioral model of the system is discussed. The applicability of Fuzzy Cognitive Maps to modeling complex systems and their use in aggregating different models of a complex system are also discussed. A hierarchical structure is proposed, in which a Fuzzy Cognitive Map models the supervisor of the system.
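As an illustration of how a Fuzzy Cognitive Map can act as a behavioral model, the sketch below iterates one commonly used FCM update rule: each concept's next value is a sigmoid of its current value plus the weighted sum of the other concepts. The abstract does not specify the update rule, so this rule, the three-concept weight matrix, and the function names are assumptions for illustration only.

```python
import math


def sigmoid(x, steepness=1.0):
    """Squash an activation into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(state, weights):
    """One update; weights[j][i] is the influence of concept j on concept i."""
    n = len(state)
    return [
        sigmoid(state[i] + sum(weights[j][i] * state[j] for j in range(n) if j != i))
        for i in range(n)
    ]


def run_fcm(state, weights, max_steps=100, tol=1e-5):
    """Iterate until the concept values stop changing or max_steps is reached."""
    for _ in range(max_steps):
        new_state = fcm_step(state, weights)
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state


# Hypothetical three-concept map: concept 0 promotes 1, 1 promotes 2, 2 inhibits 0.
example_weights = [
    [0.0, 0.6, 0.0],
    [0.0, 0.0, 0.7],
    [-0.5, 0.0, 0.0],
]
final_state = run_fcm([0.5, 0.2, 0.8], example_weights)
```

The values in final_state are the concept activations after the map has settled or the step limit is reached; in a hierarchical arrangement such as the one the paper proposes, values like these would be what a supervisor-level map reads or produces.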