Unlike autonomous ground vehicles (AGVs), unmanned aerial vehicles (UAVs) have a higher-dimensional configuration space, which makes multi-UAV motion planning a challenging task. In addition, uncertainty and noise are more significant in UAV scenarios, which increases the difficulty of autonomous multi-UAV navigation. In this letter, we propose a two-stage reinforcement learning (RL) based multi-UAV collision avoidance approach that does not explicitly model the uncertainty and noise in the environment. Our goal is to train a policy that plans a collision-free trajectory from local noisy observations. However, collision avoidance policies learned with RL usually suffer from high variance and low reproducibility because, unlike supervised learning, RL does not have a fixed training set with ground-truth labels. To address these issues, we introduce a two-stage training method for RL-based collision avoidance. In the first stage, we optimize the policy using a supervised training method with a loss function that encourages the agent to follow the well-known reciprocal collision avoidance strategy. In the second stage, we use policy gradient to refine the policy. We validate our policy in a variety of simulated scenarios, and extensive numerical simulations demonstrate that it generates time-efficient, collision-free paths under imperfect sensing and handles noisy local observations with unknown noise levels.
Among the main challenges in robotics, target-driven visual navigation has gained increasing interest in recent years. In this task, an agent has to navigate an environment to reach a user-specified target using vision alone. Recent successful approaches rely on deep reinforcement learning, which has proven to be an effective framework for learning navigation policies. However, current state-of-the-art methods require retraining, or at least fine-tuning, the model for every new environment and object. In real scenarios, this operation can be extremely challenging or even dangerous. For these reasons, we address generalization in target-driven visual navigation by proposing a novel architecture composed of two networks, both trained exclusively in simulation. The first explores the environment, while the second locates the target. They are specifically designed to work together but are trained separately to help generalization. In this article, we test our agent in both simulated and real scenarios, and validate its generalization capabilities through extensive experiments with previously unseen goals and unknown mazes, including mazes much larger than those used for training.
Despite decades of research, general purpose in-hand manipulation remains one of the unsolved challenges of robotics. One of the contributing factors that limit current robotic manipulation systems is the difficulty of precisely sensing contact forces - sensing and reasoning about contact forces are crucial to accurately control interactions with the environment. As a step towards enabling better robotic manipulation, we introduce DIGIT, an inexpensive, compact, and high-resolution tactile sensor geared towards in-hand manipulation. DIGIT improves upon past vision-based tactile sensors by miniaturizing the form factor to be mountable on multi-fingered hands, and by providing several design improvements that result in an easier, more repeatable manufacturing process, and enhanced reliability. We demonstrate the capabilities of the DIGIT sensor by training deep neural network model-based controllers to manipulate glass marbles in-hand with a multi-finger robotic hand. To provide the robotic community access to reliable and low-cost tactile sensors, we open-source the DIGIT design at ***.
Decentralized multi-agent reinforcement learning has been demonstrated to be an effective solution to large multi-agent control problems. However, agents typically can only make decisions based on local information, resulting in suboptimal performance in partially-observable settings. The addition of a communication channel overcomes this limitation by allowing agents to exchange information. Existing approaches, however, have required agent output size to scale exponentially with the number of message bits, and have been slow to converge to satisfactory policies due to the added difficulty of learning message selection. We propose an independent bitwise message policy parameterization that allows agent output size to scale linearly with information content. Additionally, we leverage aspects of the environment structure to derive a novel policy gradient estimator that is both unbiased and has a lower variance message gradient contribution than typical policy gradient estimators. We evaluate the impact of these two contributions on a collaborative multi-agent robot navigation problem, in which information must be exchanged among agents. We find that both significantly improve sample efficiency and result in improved final policies, and demonstrate the applicability of these techniques by deploying the learned policies on physical robots.
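The scaling argument in the abstract above can be illustrated with a minimal sketch. It assumes the independent bitwise parameterization means one Bernoulli output per message bit, so that the log-probability of a message is the sum of per-bit log-probabilities; that reading, and all function names below, are illustrative assumptions rather than the authors' implementation.

```python
import math
import random


def categorical_output_size(num_bits):
    """Outputs needed if every possible message gets its own logit."""
    return 2 ** num_bits


def bitwise_output_size(num_bits):
    """Outputs needed if each bit gets its own independent logit."""
    return num_bits


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_message(logits, rng):
    """Sample each bit independently from Bernoulli(sigmoid(logit))."""
    return [1 if rng.random() < sigmoid(l) else 0 for l in logits]


def message_log_prob(logits, bits):
    """Log-probability of a message: sum of independent per-bit terms."""
    total = 0.0
    for logit, bit in zip(logits, bits):
        p_one = sigmoid(logit)
        total += math.log(p_one if bit else 1.0 - p_one)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.5, -1.0, 2.0]  # hypothetical policy outputs for 3 bits
    msg = sample_message(logits, rng)
    print(msg, message_log_prob(logits, msg))
    print(categorical_output_size(16), "vs", bitwise_output_size(16))
```

Under this assumption, a 16-bit message needs 16 outputs instead of 65,536, and the summed log-probability plugs directly into a standard policy gradient estimator.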
Object manipulation performed by robots refers to the art of controlling the shape and location of an object through force constraints with robot end-effectors, both robot hands and grippers. The success of task execution usually depends on the sense of touch. In this work, we present an optical tactile sensor incorporating plastic optical fibers, transparent silicone rubber, and an off-the-shelf color camera that can detect translational and rotational shear forces as well as contact location and normal force. Contact localization is possible thanks to the shear strain. Specifically, one of the layers stretches so that its thickness decreases, and this decrease in thickness results in a color change at the point of contact. The elastic behavior of the sensing media provides a robust rotational and translational shear detection mechanism when torque and planar force, respectively, are applied to the sensing surface. Thanks to the plastic optical fibers, the signal processing electronics are placed away from the sensing surface, making the sensor immune to hazardous environments. Machine learning techniques were used to benchmark the sensing performance of the sensor. By implementing a multi-output CNN model, the contact type was classified into normal and shear or torsional deformation, and the corresponding continuous contact features were estimated.
Drifting is a complicated task for autonomous vehicle control. Most traditional methods in this area are based on motion equations derived from an understanding of vehicle dynamics, which are difficult to model precisely. We propose a robust drift controller without explicit motion equations, based on the latest model-free deep reinforcement learning algorithm, soft actor-critic. The drift control problem is formulated as a trajectory following task with an error-based state and reward. After being trained on tracks with different levels of difficulty, our controller is capable of making the vehicle drift through various sharp corners quickly and stably on an unseen map. The proposed controller is further shown to generalize well: it can directly handle unseen vehicle types with different physical properties, such as mass and tire friction.
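As a rough illustration of what an error-based state and reward for trajectory following could look like, the sketch below computes a signed cross-track error and a wrapped heading error against a reference pose, then maps them to a bounded reward. The specific error terms, the exponential reward shape, and the gains `k_pos` and `k_yaw` are hypothetical choices for illustration; the abstract does not specify the authors' actual state or reward.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def tracking_errors(pose, ref):
    """Return (cross_track_error, heading_error) for (x, y, yaw) tuples.

    The cross-track error is the vehicle's displacement from the reference
    point projected onto the reference heading's left-hand normal.
    """
    x, y, yaw = pose
    rx, ry, ryaw = ref
    dx, dy = x - rx, y - ry
    cross_track = -math.sin(ryaw) * dx + math.cos(ryaw) * dy
    heading_err = wrap_angle(yaw - ryaw)
    return cross_track, heading_err


def reward(cross_track, heading_err, k_pos=1.0, k_yaw=1.0):
    """Hypothetical reward: largest (2.0) when both errors are zero."""
    return math.exp(-k_pos * abs(cross_track)) + math.exp(-k_yaw * abs(heading_err))


if __name__ == "__main__":
    errors = tracking_errors(pose=(0.0, 1.0, 0.2), ref=(0.0, 0.0, 0.0))
    print(errors, reward(*errors))
```

Expressing the state as errors relative to the reference, rather than as absolute map coordinates, is one plausible reason such a controller could transfer to unseen tracks.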
In the pursuit of autonomous flying robots, the scientific community has been developing onboard real-time algorithms for localisation, mapping and planning. Despite recent progress, the available solutions still lack accuracy and robustness in many aspects. While mapping for autonomous cars has received a substantial boost from deep-learning techniques that enhance LIDAR measurements through image-based depth completion, the large viewpoint variations experienced by aerial vehicles still pose major challenges for learning-based mapping approaches. In this letter, we propose a depth completion and uncertainty estimation approach that better handles the challenges of aerial platforms, such as large viewpoint and depth variations, and limited computing resources. The core of our method is a novel compact network that performs both depth completion and confidence estimation using an image-guided approach. Real-time performance onboard a GPU suitable for small flying robots is achieved by sharing deep features between both tasks. Experiments demonstrate that our network outperforms the state of the art in depth completion and uncertainty estimation for single-view methods on mobile GPUs. We further present a new photorealistic aerial depth completion dataset that exhibits more challenging depth completion scenarios than the established indoor and car driving datasets. The dataset includes an open-source, visual-inertial UAV simulator for photorealistic data generation. Our results show that our network trained on this dataset can be directly deployed on real-world outdoor aerial public datasets without fine-tuning or style transfer.
Fast feedback control and safety guarantees are essential in modern robotics. We present an approach that achieves both by combining novel robust model predictive control (MPC) with function approximation via (deep) neural networks (NNs). The result is a new approach for complex tasks with nonlinear, uncertain, and constrained dynamics, as are common in robotics. Specifically, we leverage recent results in MPC research to propose a new robust setpoint tracking MPC algorithm, which achieves reliable and safe tracking of a dynamic setpoint while guaranteeing stability and constraint satisfaction. The presented robust MPC scheme constitutes a one-layer approach that unifies the often separated planning and control layers by directly computing the control command based on a reference and possibly obstacle positions. As a separate contribution, we show how the computation time of the MPC can be drastically reduced by approximating the MPC law with an NN controller. The NN is trained and validated on offline samples of the MPC, yielding statistical guarantees, and is used in lieu of the MPC at run time. Our experiments on a state-of-the-art robot manipulator are the first to show that both the proposed robust and approximate MPC schemes scale to real-world robotic systems.
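One common way to turn offline validation samples into a statistical guarantee is a one-sided Hoeffding bound on the fraction of samples where the approximation is accurate enough. The sketch below assumes independent, identically distributed validation samples and a pass/fail check per sample; whether the authors use this exact bound is an assumption, and the function names are hypothetical.

```python
import math


def hoeffding_lower_bound(num_ok, num_samples, delta):
    """Lower bound on the true pass probability, holding with
    confidence at least 1 - delta (one-sided Hoeffding inequality)."""
    if num_samples <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need num_samples > 0 and 0 < delta < 1")
    empirical = num_ok / num_samples
    margin = math.sqrt(math.log(1.0 / delta) / (2.0 * num_samples))
    return max(0.0, empirical - margin)


def samples_needed(margin, delta):
    """Smallest sample count whose Hoeffding margin is at most `margin`."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * margin ** 2))


if __name__ == "__main__":
    # Hypothetical validation run: every one of 1000 samples passes.
    print(hoeffding_lower_bound(num_ok=1000, num_samples=1000, delta=0.01))
    print(samples_needed(margin=0.05, delta=0.01))
```

The bound tightens only with the square root of the sample count, which is why cheap offline sampling of the MPC matters for this kind of validation.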
Humans can naturally learn to execute a new task by seeing it performed by other individuals once, and then reproduce it in a variety of configurations. Endowing robots with this ability to imitate humans from a third-person view is a very immediate and natural way of teaching new tasks. Only recently, through meta-learning, have there been successful attempts at one-shot imitation learning from humans; however, these approaches require a lot of human resources to collect the real-world data needed to train the robot. But is there a way to remove the need for real-world human demonstrations during training? We show that with Task-Embedded Control Networks, we can infer control policies by embedding human demonstrations that can condition a control policy and achieve one-shot imitation learning. Importantly, we do not use a real human arm to supply demonstrations during training, but instead leverage domain randomisation in an application that has not been seen before: sim-to-real transfer on humans. Upon evaluating our approach on pushing and placing tasks both in simulation and in the real world, we show that, compared with a system trained on real-world data, we are able to achieve similar results using only simulation data. Videos can be found here: https://***/view/tecnets-humans.
We investigated the application of haptic feedback control and deep reinforcement learning (DRL) to robot-assisted dressing. Our method uses DRL to simultaneously train human and robot control policies as separate neural networks using physics simulations. In addition, we modeled variations in human impairments relevant to dressing, including unilateral muscle weakness, involuntary arm motion, and limited range of motion. Our approach resulted in control policies that successfully collaborate in a variety of simulated dressing tasks involving a hospital gown and a T-shirt. In addition, our approach resulted in policies trained in simulation that enabled a real PR2 robot to dress the arm of a humanoid robot with a hospital gown. We found that training policies for specific impairments dramatically improved performance; that controller execution speed could be scaled after training to reduce the robot's speed without steep reductions in performance; that curriculum learning could be used to lower applied forces; and that multi-modal sensing, including a simulated capacitive sensor, improved performance.