Recently, end-to-end unsupervised deep learning methods have demonstrated impressive performance on visual depth and ego-motion estimation tasks. These data-driven learning methods do not rely on the limiting assumptions that geometry-based methods do. The encoder-decoder network has been widely used in depth estimation, and the RCNN has brought significant improvements in ego-motion estimation. Furthermore, the recent use of generative adversarial networks (GANs) in depth and ego-motion estimation has demonstrated that the estimates can be further improved by generating pictures in the adversarial game of the learning process. This paper proposes a novel unsupervised network system for visual depth and ego-motion estimation: a stacked generative adversarial network. It consists of a stack of GAN layers, of which the lowest layer estimates the depth and ego-motion while the higher layers estimate the spatial features. It can also capture temporal dynamics thanks to a recurrent representation across the layers. We select the most commonly used KITTI dataset for evaluation. The evaluation results show that our proposed method produces better or comparable results in depth and ego-motion estimation.
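The layered, recurrent generator stack described above can be sketched as follows. This is a minimal illustration, not the paper's network: the layer sizes, the tanh recurrence, and the 1 + 6 output split (a depth code plus a 6-DoF pose) are all assumptions, and the discriminators and adversarial training loop are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentGenerator:
    """Generator of one GAN layer, with a hidden state carried across
    video frames (discriminators and training are omitted here)."""
    def __init__(self, in_dim, hid_dim, out_dim):
        self.w_h = rng.normal(0.0, 0.1, (in_dim + hid_dim, hid_dim))
        self.b_h = np.zeros(hid_dim)
        self.w_o = rng.normal(0.0, 0.1, (hid_dim, out_dim))
        self.b_o = np.zeros(out_dim)
        self.h = np.zeros(hid_dim)

    def step(self, x):
        self.h = np.tanh(np.concatenate([x, self.h]) @ self.w_h + self.b_h)
        return self.h @ self.w_o + self.b_o

# Two-layer stack: the higher layer turns image features into spatial
# features; the lowest layer maps them to depth + 6-DoF ego-motion.
higher = RecurrentGenerator(in_dim=64, hid_dim=48, out_dim=32)
lowest = RecurrentGenerator(in_dim=32, hid_dim=48, out_dim=1 + 6)

frames = rng.normal(size=(5, 64))        # features of 5 consecutive frames
for x in frames:
    out = lowest.step(higher.step(x))
depth, ego_motion = out[0], out[1:]      # scalar depth code + 6-DoF pose
```

Because each layer keeps its own hidden state, the stack's output at frame t depends on the whole frame history, which is how the recurrent representation captures temporal dynamics.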
A common approach in the field of tactile robotics is the development of a new perception algorithm for each new application of existing hardware solutions. In this letter, we present a method for dimensionality reduction of an optical-based tactile sensor's image output using a convolutional neural network encoder structure. Instead of using various complex perception algorithms and/or manually choosing task-specific data features, this unsupervised feature extraction method allows simultaneous online deployment of multiple simple perception algorithms on a common set of black-box features. The method is validated on a set of benchmarking use cases. Contact object shape, edge position, orientation, and indentation depth are estimated using shallow neural networks and machine learning models. Furthermore, a contact force estimator is trained, affirming that the extracted features contain sufficient information on both the spatial and mechanical characteristics of the manipulated object.
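The pipeline above, one shared encoder feeding several shallow task heads, can be sketched as below. The encoder here is only a stand-in (mean pooling plus a fixed random projection), not the trained CNN from the letter, and the 16-dimensional feature size and the head shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

FEAT_DIM = 16  # size of the shared "black-box" feature vector (assumed)
proj = rng.normal(0.0, 0.1, (64, FEAT_DIM))  # fixed projection, 64x64 input

def encode(tactile_img):
    """Stand-in for the trained CNN encoder: mean-pool 8x8 patches of a
    64x64 tactile image, then project to FEAT_DIM features."""
    h, w = tactile_img.shape
    pooled = tactile_img.reshape(h // 8, 8, w // 8, 8).mean(axis=(1, 3))
    return pooled.ravel() @ proj

# Several simple perception heads share the same extracted features.
shape_head = rng.normal(size=(FEAT_DIM, 4))  # e.g. 4 contact-shape classes
force_head = rng.normal(size=(FEAT_DIM, 3))  # e.g. 3-axis contact force

z = encode(rng.random((64, 64)))
shape_logits = z @ shape_head
force = z @ force_head
```

The design point is that adding a new perception task means training only a small head on the fixed features, not a new end-to-end perception network.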
We learn end-to-end point-to-point and path-following navigation behaviors that avoid moving obstacles. These policies receive noisy lidar observations and output robot linear and angular velocities. The policies are trained in small, static environments with AutoRL, an evolutionary automation layer around reinforcement learning (RL) that searches for a deep RL reward and neural network architecture with large-scale hyper-parameter optimization. AutoRL first finds a reward that maximizes task completion and then finds a neural network architecture that maximizes the cumulative return under the found reward. Empirical evaluations, both in simulation and on-robot, show that AutoRL policies do not suffer from the catastrophic forgetting that plagues many other deep reinforcement learning algorithms; they generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks. Our path-following and point-to-point policies are, respectively, 23% and 26% more successful than comparison methods across new environments.
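The two-phase search can be illustrated with a toy (1+1) evolution strategy. The surrogate objectives below stand in for full RL training runs, which AutoRL actually evaluates; the quadratic proxies, mutation scales, and parameter counts are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)

def evolve(score_fn, init, iters=200, sigma=0.3):
    """Minimal (1+1) evolution strategy: mutate, keep the mutant if better."""
    best = np.asarray(init, dtype=float)
    best_score = score_fn(best)
    for _ in range(iters):
        cand = best + rng.normal(0.0, sigma, size=best.shape)
        s = score_fn(cand)
        if s > best_score:
            best, best_score = cand, s
    return best, best_score

# Phase 1: search reward weights against a task-completion proxy.
def completion_proxy(w):          # toy surrogate for a full RL run
    return -np.sum((w - np.array([1.0, -0.5, 0.2])) ** 2)

reward_w, _ = evolve(completion_proxy, np.zeros(3))

# Phase 2: with the reward fixed, search network layer widths.
def return_proxy(widths):         # toy surrogate for cumulative return
    return -np.sum((widths - np.array([64.0, 32.0])) ** 2)

widths, _ = evolve(return_proxy, np.array([16.0, 16.0]), sigma=5.0)
```

In AutoRL each "score" evaluation is an expensive policy training run, which is why the reward and the architecture are searched in sequence rather than jointly.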
Minimally invasive surgery (MIS) is increasingly becoming a vital method of reducing surgical trauma and significantly improving postoperative recovery. However, skillful handling of surgical instruments used in MIS, especially for laparoscopy, requires a long period of training and depends highly on the experience of surgeons. This letter presents a new robot-assisted surgical training system designed to improve the practical skills of surgeons through intra-practice feedback and demonstrations from both human experts and reinforcement learning (RL) agents. The system utilizes proximal policy optimization to learn the control policy in simulation. Subsequently, a generative adversarial imitation learning agent is trained on both expert demonstrations and the policies learned in simulation. This agent then generates demonstration policies on the robot-assisted device for trainees and produces feedback scores during practice. To further acquire surgical tool coordinates and encourage self-directed practice, a mask region-based convolutional neural network is trained to perform semantic segmentation of surgical tools and targets. To the best of our knowledge, this is the first robot-assisted laparoscopy training system that utilizes actual surgical tools and leverages deep reinforcement learning to provide demonstration training from both human expert perspectives and RL criteria.
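The feedback-score idea, a discriminator trained to separate expert from novice motions whose output then scores a trainee, can be sketched with a plain logistic-regression discriminator on toy (state, action) data. The 1-D task, the handcrafted deviation feature, and the expert law a = 2s are all assumptions; the letter's GAIL agent uses a neural discriminator on real demonstrations.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30, 30)))

# Toy (state, action) pairs: experts act near a = 2*s, novices act randomly.
s = rng.uniform(-1, 1, 200)
expert = np.stack([s, 2 * s + rng.normal(0, 0.1, 200)], axis=1)
novice = np.stack([s, rng.uniform(-2, 2, 200)], axis=1)

def features(sa):
    # state, action, squared deviation from the expert law, bias term
    return np.stack([sa[:, 0], sa[:, 1], (sa[:, 1] - 2 * sa[:, 0]) ** 2,
                     np.ones(len(sa))], axis=1)

X = features(np.vstack([expert, novice]))
y = np.concatenate([np.ones(200), np.zeros(200)])  # 1 = expert

w = np.zeros(X.shape[1])
for _ in range(500):  # plain logistic regression via gradient ascent
    p = sigmoid(X @ w)
    w += 0.1 * X.T @ (y - p) / len(y)

def practice_score(state, action):
    """Feedback score in (0, 1): how expert-like a trainee motion looks."""
    return float(sigmoid(features(np.array([[state, action]]))[0] @ w))
```

A trainee motion close to the expert behavior scores near 1; a motion far from it scores near 0, giving the intra-practice feedback signal.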
Collecting a human-annotated dataset for training deep convolutional neural networks is a very time-consuming and laborious process. To reduce this burden, we previously proposed automated annotation by placing one visual marker above the detection target object in the training phase. However, in this approach, the marker occasionally hides the object surface. To avoid this issue, we propose placing a pedestal with multiple markers at the bottom of the object. With multiple markers, the object can be annotated even when it hides some of the markers. Moreover, the simple modification of placing the markers at the bottom allows the use of simple background masking, which prevents the neural network from learning the remaining markers in the training image as a feature of the object; background masking can completely remove the markers during the training process. Experiments showed that the proposed vision system using our automatic object annotation outperformed a vision system using manual annotation in object detection, orientation estimation, and 2D position estimation, while reducing the time required for dataset collection from 16.1 to 7.3 hours.
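The bottom-pedestal annotation plus background masking might look like the sketch below. The geometry helper (an object box at a fixed offset above the marker centroid) and all dimensions are hypothetical; the point is that masking zeroes every pixel outside the object box, so any leftover markers cannot become object features.

```python
import numpy as np

def mask_background(image, bbox):
    """Zero every pixel outside the object's bounding box so the
    pedestal markers cannot be learned as features of the object."""
    x0, y0, x1, y1 = bbox
    masked = np.zeros_like(image)
    masked[y0:y1, x0:x1] = image[y0:y1, x0:x1]
    return masked

def bbox_from_markers(marker_centers, obj_w, obj_h):
    """Hypothetical geometry: the markers sit on a pedestal below the
    object, so the object box lies directly above their centroid."""
    cx, cy = np.mean(marker_centers, axis=0)
    x0 = int(cx - obj_w / 2)
    y1 = int(cy)                       # pedestal top = object bottom
    return x0, y1 - obj_h, x0 + obj_w, y1

img = np.arange(100 * 100, dtype=float).reshape(100, 100)
markers = [(40.0, 80.0), (60.0, 80.0)]  # two visible marker centers (px)
box = bbox_from_markers(markers, obj_w=30, obj_h=40)  # -> (35, 40, 65, 80)
train_img = mask_background(img, box)
```

Because the box can be computed from any subset of visible markers' centroid plus the known pedestal geometry, annotation survives partial marker occlusion by the object itself.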
Designing effective low-level robot controllers often entails platform-specific implementations that require manual heuristic parameter tuning, significant system knowledge, or long design times. With the rising number of robotic and mechatronic systems deployed across areas ranging from industrial automation to intelligent toys, the need for a general approach to generating low-level controllers is increasing. To address the challenge of rapidly generating low-level controllers, we argue for using model-based reinforcement learning (MBRL) trained on relatively small amounts of automatically generated (i.e., without system simulation) data. In this letter, we explore the capabilities of MBRL on a Crazyflie centimeter-scale quadrotor with rapid dynamics, predicting and controlling at up to 50 Hz. To our knowledge, this is the first use of MBRL for controlled hover of a quadrotor using only on-board sensors, direct motor input signals, and no initial dynamics knowledge. Our controller leverages rapid simulation of a neural network forward dynamics model on a graphics-processing-unit-enabled base station, which then transmits the best current action to the quadrotor firmware via radio. In our experiments, the quadrotor achieved hovering for up to 6 s with 3 min of experimental training data.
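The control loop, rolling candidate action sequences through the learned forward model and keeping the best, can be sketched as random-shooting model-predictive control. The 1-D toy "altitude" dynamics below stands in for the learned neural network model, and the horizon, sample count, thrust range, and quadratic cost are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def learned_dynamics(state, action):
    """Stand-in for the neural forward model f(s, a) -> s': a toy 1-D
    altitude system with thrust input and gravity, dt = 0.02 s."""
    pos, vel = state
    return np.array([pos + 0.02 * vel, vel + 0.02 * (action - 9.8)])

def plan(state, horizon=10, n_samples=500):
    """Random-shooting MPC: sample action sequences, roll each through
    the model, and return the first action of the cheapest sequence."""
    seqs = rng.uniform(0.0, 20.0, (n_samples, horizon))
    costs = np.zeros(n_samples)
    for i, seq in enumerate(seqs):
        s = state.copy()
        for a in seq:
            s = learned_dynamics(s, a)
            costs[i] += s[0] ** 2 + 0.1 * s[1] ** 2  # hover at pos = 0
    return seqs[np.argmin(costs)][0]

action = plan(np.array([0.5, 0.0]))   # next thrust command for hover
```

Evaluating all sampled rollouts is embarrassingly parallel, which is why the letter's controller runs the model on a GPU base station and radios only the chosen action to the firmware.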
We propose a novel temporal-attention-based neural network architecture for robotics tasks that involve the fusion of time series of sensor data, and evaluate the performance improvements in the context of autonomous navigation of unmanned ground vehicles (UGVs) in uncertain environments. The architecture generates feature vectors by fusing raw pixel and depth values collected by camera(s) and LiDAR(s), stores a history of the generated feature vectors, and combines the temporally attended history with the current features to predict a steering command. Experimental studies show robust performance in unknown and cluttered environments. Furthermore, the temporal attention is resilient to noise, bias, blur, and occlusions in the sensor signals. We trained the network on indoor corridor datasets (to be publicly released) from our UGV. The datasets contain LiDAR depth measurements, camera images, and human tele-operation commands.
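The temporally attended history can be sketched as a small dot-product attention over stored feature vectors, concatenated with the current features before a steering head. The dimensions, the tanh output, and the single-query form are assumptions; the paper's fusion network is larger.

```python
import numpy as np

rng = np.random.default_rng(5)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(history, current, w_q, w_k):
    """Score each stored feature vector against the current one, then
    mix the history into a context vector weighted by those scores."""
    q = current @ w_q
    scores = np.array([(h @ w_k) @ q for h in history])
    weights = softmax(scores)
    context = weights @ np.stack(history)
    return np.concatenate([current, context]), weights

FEAT, ATT = 32, 16                       # assumed dimensions
w_q = rng.normal(0.0, 0.1, (FEAT, ATT))
w_k = rng.normal(0.0, 0.1, (FEAT, ATT))
w_steer = rng.normal(0.0, 0.1, 2 * FEAT)

history = [rng.normal(size=FEAT) for _ in range(8)]  # past fused features
current = rng.normal(size=FEAT)          # current camera + LiDAR fusion
fused, weights = attend(history, current, w_q, w_k)
steering = np.tanh(fused @ w_steer)      # predicted steering command
```

Because the attention weights downweight history entries that do not match the current observation, a momentarily occluded or blurred frame contributes little, which is one intuition for the reported robustness.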
This letter addresses two challenges facing sampling-based kinodynamic motion planning: identifying good candidate states for local transitions, and the subsequent computationally intractable steering between these candidate states. By combining sampling-based planning via a Rapidly-exploring Random Tree (RRT) with an efficient machine-learned kinodynamic local planner, we propose an efficient solution to long-range kinodynamic motion planning. First, we use deep reinforcement learning to learn an obstacle-avoiding policy that maps a robot's sensor observations to actions; it serves as a local planner during planning and as a controller during execution. Second, we train a reachability estimator in a supervised manner to predict the RL policy's time to reach a state in the presence of obstacles. Lastly, we introduce RL-RRT, which uses the RL policy as a local planner and the reachability estimator as the distance function to bias tree growth towards promising regions. We evaluate our method on three kinodynamic systems, including physical robot experiments. Results across all three robots indicate that RL-RRT outperforms state-of-the-art kinodynamic planners in efficiency, and also yields a shorter path finish time than a steering-function-free method. The learned local planner policy and the accompanying reachability estimator transfer to previously unseen experimental environments, making RL-RRT fast because the expensive computations are replaced with simple neural network inference.
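The planner's core loop, an RRT whose nearest-neighbor rule is a learned time-to-reach estimate and whose extension step is the RL policy, can be sketched as follows. Both learned components are replaced by simple stand-ins (a distance heuristic and straight-line stepping), and the obstacle-free 10x10 world, goal bias, and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def time_to_reach(a, b):
    """Stand-in for the learned reachability estimator; the real model
    is a supervised network predicting the RL policy's travel time."""
    return np.linalg.norm(b - a)  # unit-speed heuristic

def local_policy(state, goal, step=0.5):
    """Stand-in for the RL local planner: step straight toward goal."""
    d = goal - state
    n = np.linalg.norm(d)
    return goal.copy() if n < step else state + step * d / n

def rl_rrt(start, goal, n_iter=1000, goal_tol=0.3):
    nodes = [start]
    for _ in range(n_iter):
        sample = goal if rng.random() < 0.1 else rng.uniform(0.0, 10.0, 2)
        nearest = min(nodes, key=lambda n_: time_to_reach(n_, sample))
        new = local_policy(nearest, sample)
        nodes.append(new)
        if np.linalg.norm(new - goal) < goal_tol:
            return nodes, True
    return nodes, False

nodes, found = rl_rrt(np.array([1.0, 1.0]), np.array([9.0, 9.0]))
```

Swapping the Euclidean heuristic for the learned estimator is what biases growth toward states the policy can actually reach quickly despite obstacles and dynamics.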
We introduce a general self-supervised approach to predict the future outputs of a short-range sensor (such as a proximity sensor) given the current outputs of a long-range sensor (such as a camera). We assume that the former is directly related to some piece of information to be perceived (such as the presence of an obstacle in a given position), whereas the latter is information-rich but hard to interpret directly. We instantiate and implement the approach on a small mobile robot to detect obstacles at various distances using the video stream of the robot's forward-pointing camera, by training a convolutional neural network on automatically acquired datasets. We quantitatively evaluate the quality of the predictions on unseen scenarios, qualitatively evaluate robustness to different operating conditions, and demonstrate usage as the sole input of an obstacle-avoidance controller. We additionally instantiate the approach on a different simulated scenario with complementary characteristics, to exemplify the generality of our contribution.
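The self-supervised pairing, labeling each camera frame with the short-range reading the robot records once it reaches the location the frame looked at, can be sketched on a toy 1-D odometry log. The 1 m look-ahead, the matching tolerance, and the log format are all assumptions for illustration.

```python
# Toy 1-D log: at each timestep the robot is at x = 0.1 * t meters and
# records a camera frame plus a proximity reading (an obstacle sits
# near x = 3.0, where the proximity sensor fires).
log = [(t, 0.1 * t, f"frame_{t}", 1.0 if 2.9 < 0.1 * t < 3.1 else 0.0)
       for t in range(50)]

LOOKAHEAD = 1.0  # label distance ahead of the camera, in meters (assumed)

def make_pairs(log, tol=0.06):
    """Pair each frame with the proximity value measured once the robot
    has advanced LOOKAHEAD meters past where the frame was taken."""
    pairs = []
    for _, x, frame, _ in log:
        for _, x2, _, prox in log:
            if abs(x2 - (x + LOOKAHEAD)) < tol:
                pairs.append((frame, prox))
                break
    return pairs

dataset = make_pairs(log)  # (camera frame id, future proximity label)
```

No human ever labels anything: driving around generates (image, future short-range reading) pairs automatically, and the CNN is trained to predict the label from the image alone.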
Rapid and reliable robot grasping for a diverse set of objects has applications from warehouse automation to home decluttering. One promising approach is to learn deep policies from synthetic training datasets of point clouds, grasps, and rewards sampled using analytic models with stochastic noise models for domain randomization. In this letter, we explore how the distribution of synthetic training examples affects the rate and reliability of the learned robot policy. We propose a synthetic data sampling distribution that combines grasps sampled from the policy action set with guiding samples from a robust grasping supervisor that has full state knowledge. We use this to train a robot policy based on a fully convolutional network architecture that evaluates millions of grasp candidates in 4-DOF (3D position and planar orientation). Physical robot experiments suggest that a policy based on fully convolutional grasp quality CNNs (FC-GQ-CNNs) can plan grasps in 0.625 s, considering 5000x more grasps than our prior policy based on iterative grasp sampling and evaluation. This computational efficiency improves rate and reliability, achieving 296 mean picks per hour (MPPH) compared to 250 MPPH for iterative policies. Sensitivity experiments explore the effect of the supervisor guidance level and the granularity of the policy action space. Code, datasets, videos, and supplementary material can be found at http://***/fcgqcnn.
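Fully convolutional evaluation of dense grasp candidates can be sketched as cross-correlating the depth image with one kernel per orientation bin and taking the argmax over the resulting (theta, y, x) quality volume. A single random conv layer stands in for the real multi-layer FC-GQ-CNN, and the bin count, kernel size, and image size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def quality_map(depth_img, kernels):
    """Cross-correlate the depth image with one kernel per gripper
    orientation bin, yielding a (theta, y, x) grasp-quality volume."""
    k = kernels.shape[1]
    h, w = depth_img.shape
    out = np.zeros((len(kernels), h - k + 1, w - k + 1))
    for t, kern in enumerate(kernels):
        for y in range(h - k + 1):
            for x in range(w - k + 1):
                out[t, y, x] = np.sum(depth_img[y:y + k, x:x + k] * kern)
    return out

depth = rng.random((20, 20))
kernels = rng.normal(size=(8, 5, 5))     # 8 planar orientation bins
q = quality_map(depth, kernels)
theta, y, x = np.unravel_index(np.argmax(q), q.shape)
grasp = (x + 2, y + 2, theta * np.pi / 8)  # kernel-center offset, bin -> rad
```

One forward pass scores every (position, orientation) cell at once, which is how the policy can consider thousands of times more candidates than iterative sample-and-evaluate pipelines in the same planning budget.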