Deep neural networks facilitate visuosensory inputs for robotic systems. However, the features encoded in a network without specific constraints have little physical meaning. In this research, we add constraints to the network so that the trained features are forced to represent the actual twist coordinates of interactive objects in a scene. The trained coordinates describe the 6D pose of the objects, and a transformation is applied to change the coordinate system. This algorithm is developed for a mobile service robot that imitates an object-oriented task by watching human demonstrations. As the robot has mobility, the video demonstrations are collected from different viewpoints. Our feature trajectories of twist coordinates are synthesized in the global coordinate frame after a transformation is applied according to robot localization. Then, the trajectories are trained as a probabilistic model and imitated by the robot with geometric dynamics of . Our main contribution is to develop a robot that can be trained from visually demonstrated human performances. Additionally, our algorithmic contribution is to design a scene interpretation network in which constraints are incorporated to estimate the 6D pose of objects.
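To make the coordinate change concrete, the following is a minimal sketch of how a six-element twist coordinate can be turned into a homogeneous transform and then re-expressed in a global frame. It is not the authors' implementation. The ordering `(v, w)` (translation part first, rotation part second), the helper names `hat`, `matmul`, `twist_to_transform`, and `to_global`, and the example poses are all assumptions made for illustration.

```python
import math

def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    wx, wy, wz = w
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def twist_to_transform(xi):
    """Exponential map from twist coordinates (v, w) to a 4x4 transform.

    Assumed ordering: xi = [vx, vy, vz, wx, wy, wz].
    """
    v, w = list(xi[:3]), list(xi[3:])
    theta = math.sqrt(sum(c * c for c in w))
    W = hat(w)
    W2 = matmul(W, W)
    if theta < 1e-9:
        # Small-angle limits of the coefficients below.
        a, b, c = 1.0, 0.5, 1.0 / 6.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rodrigues formula for the rotation, and the matching matrix V for translation.
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)] for i in range(3)]
    t = [sum(V[i][j] * v[j] for j in range(3)) for i in range(3)]
    return [R[0] + [t[0]], R[1] + [t[1]], R[2] + [t[2]], [0.0, 0.0, 0.0, 1.0]]

def to_global(T_world_robot, T_robot_object):
    """Express an object pose observed in the robot frame in the world frame."""
    return matmul(T_world_robot, T_robot_object)

# Hypothetical example: object rotated 90 degrees about z in the robot frame,
# robot localized 1 m along x in the world frame.
T_robot_object = twist_to_transform([0.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2])
T_world_robot = twist_to_transform([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
T_world_object = to_global(T_world_robot, T_robot_object)
```

Applying `to_global` to every pose of a demonstration, with the robot pose taken from localization at that time step, would place trajectories recorded from different viewpoints in one common frame.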
Deep reinforcement learning (RL) uses model-free techniques to optimize task-specific control policies. Despite having emerged as a promising approach for complex problems, RL is still hard to use reliably for real-world applications. Apart from challenges such as precise reward function tuning, inaccurate sensing and actuation, and non-deterministic response, existing RL methods do not guarantee behavior within the safety constraints that are crucial for real robot scenarios. In this regard, we introduce guided constrained policy optimization (GCPO), an RL framework based on our implementation of constrained proximal policy optimization (CPPO), for tracking base velocity commands while respecting the defined constraints. We introduce schemes that encourage state recovery into constrained regions in case of constraint violations. We present experimental results of our training method and test it on the real ANYmal quadruped robot. We compare our approach against the unconstrained RL method and show that guided constrained RL offers faster convergence close to the desired optimum, resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
This article proposes a learning method for denoising gyroscopes of Inertial Measurement Units (IMUs) using ground truth data, and for estimating in real time the orientation (attitude) of a robot in dead reckoning. The obtained algorithm outperforms the state-of-the-art on the (unseen) test sequences. This performance is achieved thanks to a well-chosen model, a proper loss function for orientation increments, and the identification of key points when training with high-frequency inertial data. Our approach builds upon a neural network based on dilated convolutions, without requiring any recurrent neural network. We demonstrate how efficient our strategy is for 3D attitude estimation on the EuRoC and TUM-VI datasets. Interestingly, we observe that our dead reckoning algorithm manages to beat top-ranked visual-inertial odometry systems in terms of attitude estimation although it does not use vision sensors. We believe this article offers new perspectives for visual-inertial localization and constitutes a step toward more efficient learning methods involving IMUs. Our open-source implementation is available at https://github.com/mbrossar/denoise-imu-gyro.
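As a rough illustration of why dilated convolutions can replace a recurrent network on high-frequency inertial data, the sketch below implements a one-dimensional dilated convolution and computes the receptive field of a stack of such layers. This is a generic sketch, not the architecture from the article. The function names, the stride of 1, and the kernel size and dilation values are hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid (no padding) 1D dilated convolution, written as a correlation."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

def receptive_field(kernel_sizes, dilations):
    """Input samples seen by one output of stacked stride-1 dilated layers."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf

# Hypothetical stack: four layers, kernel size 7, dilations growing by 4x.
rf = receptive_field([7, 7, 7, 7], [1, 4, 16, 64])

# A difference kernel with dilation 2 compares samples four steps apart.
out = dilated_conv1d(list(range(10)), [1, 0, -1], 2)
```

With these assumed values a single output depends on 511 consecutive input samples, so a few layers cover a long window of gyroscope measurements without any recurrence.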
Complete blood cell count, which indicates the density of different blood cells in the human body, is extremely important for evaluating the overall health of a person and for detecting a wide range of disorders, including anemia, infection and leukemia. Hence, automating this task will not only increase the speed of diagnosis, but also lower the overall treatment cost. In this paper, we focus on using a convolutional neural network to perform this complete blood cell count on blood smear images. The network is also trained to detect malarial pathogens in the blood, if present. Experiments show that the system achieves a mean average precision of over 0.95 when compared with the ground truth. Furthermore, the system predicts the images containing malarial parasites as infected 100% of the time. The software is also ported to a low-cost microcomputer for rapid prototyping.
In past robotics applications, Model Predictive Control (MPC) has often been limited to linear models and relatively short time horizons. In recent years, however, research in optimization, optimal control, and simulation has enabled some forms of nonlinear model predictive control which find locally optimal solutions. The limiting factor for applying nonlinear MPC in robotics remains the computation necessary to solve the optimization, especially for complex systems and for long time horizons. This letter presents a new solution method, called nonlinear Evolutionary MPC (NEMPC), which addresses the computational concerns related to nonlinear MPC, and compares it to several existing methods. These comparisons include simulations on torque-limited robots performing a swing-up task and demonstrate that NEMPC is able to discover complex behaviors to accomplish the task. Comparisons with state-of-the-art nonlinear MPC algorithms show that NEMPC finds high-quality control solutions very quickly using a global, instead of local, optimization method. Finally, an application in hardware (a 24-state pneumatically actuated continuum soft robot) demonstrates that this method is tractable for real-time control of high-degree-of-freedom systems.
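The general idea of searching over control sequences with an evolutionary method can be sketched as follows. This is a generic elitist evolutionary search on a toy double integrator, not the NEMPC algorithm from the letter. The dynamics, cost weights, population size, mutation scale, and function names are all assumptions chosen to keep the example small.

```python
import random

def rollout_cost(controls, x0=(1.0, 0.0), dt=0.1):
    """Cost of applying a control sequence to a toy double integrator."""
    pos, vel = x0
    cost = 0.0
    for u in controls:
        vel += u * dt
        pos += vel * dt
        cost += pos * pos + 0.1 * vel * vel + 0.01 * u * u
    return cost

def evolve_controls(horizon=20, pop_size=30, generations=40, u_max=1.0, seed=0):
    """Evolve a bounded control sequence; returns (best sequence, best-cost history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-u_max, u_max) for _ in range(horizon)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=rollout_cost)
        history.append(rollout_cost(population[0]))
        # The better half survives unchanged, so the best cost never gets worse.
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, horizon)
            # One-point crossover followed by Gaussian mutation, clipped to the limit.
            child = [min(u_max, max(-u_max, u + rng.gauss(0.0, 0.1)))
                     for u in a[:cut] + b[cut:]]
            children.append(child)
        population = parents + children
    population.sort(key=rollout_cost)
    history.append(rollout_cost(population[0]))
    return population[0], history

best, history = evolve_controls()
```

In a receding-horizon loop only the first element of `best` would be applied before the search is repeated from the new state. The clipping step is where an actuator limit such as a torque bound enters this sketch.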
Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn a compact and multimodal representation of our sensory inputs, which can then be used to improve the sample efficiency of our policy learning. Evaluating our method on a peg insertion task, we show that it generalizes over varying geometries, configurations, and clearances, while being robust to external perturbations. We also systematically study different self-supervised learning objectives and representation learning architectures. Results are presented in simulation and on a physical robot.
In this work, we present a data-driven simulation and training engine capable of learning end-to-end autonomous vehicle control policies using only sparse rewards. By leveraging real, human-collected trajectories through an environment, we render novel training data that allows virtual agents to drive along a continuum of new local trajectories consistent with the road appearance and semantics, each with a different view of the scene. We demonstrate the ability of policies learned within our simulator to generalize to, and navigate on, previously unseen real-world roads, without access to any human control labels during training. Our results validate the learned policy onboard a full-scale autonomous vehicle, including in previously unencountered scenarios, such as new roads and novel, complex, near-crash situations. Our methods are scalable, leverage reinforcement learning, and apply broadly to situations requiring effective perception and robust operation in the physical world.
In this work, we present an end-to-end network for fast panoptic segmentation. This network, called Fast Panoptic Segmentation Network (FPSNet), does not require computationally costly instance mask predictions or rule-based merging operations. This is achieved by casting the panoptic task into a custom dense pixel-wise classification task, which assigns a class label or an instance id to each pixel. We evaluate FPSNet on the Cityscapes and Pascal VOC datasets, and find that FPSNet is faster than existing panoptic segmentation methods, while achieving better or similar panoptic segmentation performance. On the Cityscapes validation set, we achieve a Panoptic Quality score of 55.1%, at prediction times of 114 milliseconds for images with a resolution of 1024 x 2048 pixels. For lower resolutions of the Cityscapes dataset and for the Pascal VOC dataset, FPSNet achieves prediction times as low as 45 and 28 milliseconds, respectively.
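For readers unfamiliar with the Panoptic Quality score reported above, the following sketch computes it for a single class using the commonly used definition: the sum of IoUs over matched segments, divided by the number of true positives plus half the false positives and half the false negatives. The function name, the input format, and the choice to return 0.0 for a class with no segments are assumptions made for this example. The overall score is normally the average of this value over classes.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Panoptic Quality for one class.

    matched_ious: IoU of each predicted/ground-truth pair counted as a match
    (a pair is conventionally matched when its IoU exceeds 0.5).
    """
    if any(iou <= 0.5 or iou > 1.0 for iou in matched_ious):
        raise ValueError("matched IoUs must lie in (0.5, 1.0]")
    denominator = (len(matched_ious)
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        # Assumption: treat a class with no segments at all as scoring 0.
        return 0.0
    return sum(matched_ious) / denominator

# Hypothetical class: two matches, one spurious prediction, one missed segment.
pq = panoptic_quality([0.8, 0.6], num_false_positives=1, num_false_negatives=1)
```

The numerator rewards how well matched segments overlap, while the denominator penalizes unmatched predictions and unmatched ground-truth segments equally.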
To achieve a successful grasp, gripper attributes such as geometry and kinematics play a role as important as the object geometry. The majority of previous work has focused on developing grasp methods that generalize over novel object geometry but are specific to a certain robot hand. We propose UniGrasp, an efficient data-driven grasp synthesis method that considers both the object geometry and gripper attributes as inputs. UniGrasp is based on a novel deep neural network architecture that selects sets of contact points from the input point cloud of the object. The proposed model is trained on a large dataset to produce contact points that are in force closure and reachable by the robot hand. By using contact points as output, we can transfer between a diverse set of multifingered robotic hands. Our model produces over 90% valid contact points in its top-10 predictions in simulation and more than 90% successful grasps in real-world experiments for various known two-fingered and three-fingered grippers. Our model also achieves 93%, 83% and 90% successful grasps in real-world experiments for an unseen two-fingered gripper and two unseen multi-fingered anthropomorphic robotic hands.
Ego-motion estimation is a core task in robotic systems as well as in augmented and virtual reality applications. It is often solved using visual-inertial odometry, which involves using one or more always-on cameras on mobile robots and wearable devices. As consumers increasingly use such devices in their homes and workplaces, which are filled with sensitive details, the role of privacy in such camera-based approaches is of ever-increasing importance. In this letter, we introduce the first solution to perform privacy-preserving ego-motion estimation. We recover camera ego-motion from an extremely low-resolution monocular camera by estimating dense optical flow at a higher spatial resolution (i.e., 4x super resolution). We propose SRFNet for directly estimating Super-Resolved Flow, a novel convolutional neural network model that is trained in a supervised setting using ground-truth optical flow. We also present a weakly supervised approach for training a variant of SRFNet on real videos where ground-truth flow is unavailable. On image pairs with known relative camera orientations, we use SRFNet to predict the auto-epipolar flow that arises from pure camera translation, from which we robustly estimate the camera translation direction. We evaluate our super-resolved optical flow estimates and camera translation direction estimates on the Sintel and KITTI odometry datasets, where our methods outperform several baselines. Our results indicate that robust ego-motion recovery from extremely low-resolution images can be viable when camera orientations and metric scale are recovered from inertial sensors and fused with the estimated translations.