Semantic scene completion (SSC) refers to the task of inferring the 3D semantic segmentation of a scene while simultaneously completing the 3D shapes. We propose PALNet, a novel hybrid network for SSC based on a single depth image. PALNet utilizes a two-stream network to extract both 2D and 3D features from multiple stages using fine-grained depth information to efficiently capture the context, as well as the geometric cues of the scene. Current methods for SSC treat all parts of the scene equally, causing unnecessary attention to the interior of objects. To address this problem, we propose Position Aware Loss (PA-Loss), which is aware of position importance while training the network. Specifically, PA-Loss considers Local Geometric Anisotropy to determine the importance of different positions within the scene. It is beneficial for recovering key details like the boundaries of objects and the corners of the scene. Comprehensive experiments on two benchmark datasets demonstrate the effectiveness of the proposed method and its superior performance. Code and demo are available at https://***/UniLauX/PALNet. A video demo can be found here: https://***/j-LAMcMh0yg.
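The idea of weighting positions by Local Geometric Anisotropy can be illustrated with a minimal sketch. It assumes, purely for illustration, that the anisotropy of a voxel is the number of its 6-connected neighbours carrying a different label, and that the loss weight grows linearly with that count. The function names and the `base` and `scale` parameters are hypothetical and are not taken from the PALNet code.

```python
def local_geometric_anisotropy(labels):
    """Count, for each voxel, the 6-connected neighbours with a different label.

    `labels` is a nested list indexed as labels[x][y][z].
    """
    nx, ny, nz = len(labels), len(labels[0]), len(labels[0][0])
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    lga = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                for dx, dy, dz in offsets:
                    i, j, k = x + dx, y + dy, z + dz
                    inside = 0 <= i < nx and 0 <= j < ny and 0 <= k < nz
                    if inside and labels[i][j][k] != labels[x][y][z]:
                        lga[x][y][z] += 1
    return lga


def position_weights(labels, base=1.0, scale=1.0):
    """Per-voxel loss weights (assumed form): base + scale * (LGA / 6)."""
    lga = local_geometric_anisotropy(labels)
    return [[[base + scale * v / 6.0 for v in row] for row in plane] for plane in lga]


# Usage: a single occupied voxel in the middle of an empty 3x3x3 grid.
grid = [[[0] * 3 for _ in range(3)] for _ in range(3)]
grid[1][1][1] = 1
weights = position_weights(grid)
```

Under these assumptions, voxels deep inside a uniform region keep the base weight, while voxels on boundaries and corners receive larger weights, which matches the stated goal of reducing attention to object interiors.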
As human motor learning is hypothesized to use the motor synergy concept, we investigate whether this concept can also be observed in deep reinforcement learning for robotics. From this point of view, we carried out a joint-space synergy analysis on multi-joint running agents in simulated environments trained using two state-of-the-art deep reinforcement learning algorithms. Although a synergy constraint has never been encoded into the reward function, the synergy emergence phenomenon could be observed statistically in the learning agent. To our knowledge, this is the first attempt to quantify the synergy development in detail and evaluate its emergence process during deep learning motor control tasks. We then demonstrate that there is a correlation between our synergy-related metrics and the performance and energy efficiency of a trained agent. Interestingly, the proposed synergy-related metrics reflected a better learning capability of SAC over TD3. This suggests that these metrics could serve as additional indices for evaluating deep reinforcement learning algorithms for motor learning. It also indicates that synergy is required for multi-joint robots to move energy-efficiently.
Deep learning has enabled remarkable advances in scene understanding, particularly in semantic segmentation tasks. Yet, current state-of-the-art approaches are limited to a closed set of classes, and fail when facing novel elements, also known as out-of-distribution (OoD) data. This is a problem because autonomous agents will inevitably come across a wide range of objects, not all of which can be included during training. We propose a novel method to distinguish any object (foreground) from empty building structure (background) in indoor environments. We use normalizing flow to estimate the probability distribution of high-dimensional background descriptors. Foreground objects are therefore detected as areas in an image for which the descriptors are unlikely given the background distribution. As our method does not explicitly learn the representation of individual objects, its performance generalizes well outside of the training examples. Our model provides an innovative solution to reliably segment foreground from background in indoor scenes, which opens the way to a safer deployment of robots in human environments.
Hysteresis causes difficulties in precisely controlling motion of flexible surgery robots and degrades the surgical performance. In order to reduce hysteresis, model-based feed-forward and feedback-based methods using endoscopic cameras have been suggested. However, model-based methods show limited performance when the sheath configuration is deformed. Although feedback-based methods maintain their performance regardless of the changing sheath configuration, these methods are limited in practical situations where the surgical instruments are obscured by surgical debris, such as blood and tissues. In this letter, a hysteresis compensation method using learning-based hybrid joint angle estimation (LBHJAE) is proposed to address both of these situations. This hybrid method combines image-based joint angle estimation (IBJAE) and kinematic-based joint angle estimation (KBJAE) using a Kalman filter. The proposed method can estimate an actual joint angle of a surgical instrument as well as reduce its hysteresis both in the face of partial obscuration and in different sheath configurations. We use a flexible surgery robot, K-FLEX, to evaluate our approach. The results indicate that the proposed method has effective performance in reducing hysteresis.
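How a Kalman filter might combine a kinematic estimate with an image-based one can be sketched with a scalar filter. This is an illustrative sketch only: it assumes the kinematic model supplies per-step joint angle increments for the prediction step and that the image-based estimate acts as the measurement, skipped whenever the instrument is obscured. The function name, the noise values `q` and `r`, and the use of `None` for obscured frames are all assumptions, not details from the letter.

```python
def kalman_fuse(kinematic_deltas, image_angles, q=0.01, r=0.05, x0=0.0, p0=1.0):
    """Fuse kinematic increments with image-based joint angle measurements.

    kinematic_deltas: per-step angle change predicted by the kinematic model.
    image_angles: per-step image-based angle, or None when the view is obscured.
    q, r: assumed process and measurement noise variances.
    """
    x, p = x0, p0
    estimates = []
    for delta, z in zip(kinematic_deltas, image_angles):
        # Prediction step driven by the kinematic model.
        x = x + delta
        p = p + q
        # Correction step, applied only when an image measurement exists.
        if z is not None:
            gain = p / (p + r)
            x = x + gain * (z - x)
            p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Usage: the second frame is obscured, so only the kinematic prediction is used there.
angles = kalman_fuse([0.10, 0.10, 0.10], [0.08, None, 0.27])
```

In this sketch the filter falls back on the kinematic prediction during obscuration and lets the uncertainty grow, so the next visible frame pulls the estimate back more strongly.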
Neural network predictions are unreliable when the input sample is out of the training distribution or corrupted by noise. Being able to detect such failures automatically is fundamental to integrating deep learning algorithms into robotics. Current approaches for uncertainty estimation of neural networks require changes to the network and optimization process, typically ignore prior knowledge about the data, and tend to make over-simplifying assumptions which underestimate uncertainty. To address these limitations, we propose a novel framework for uncertainty estimation. Based on Bayesian belief networks and Monte-Carlo sampling, our framework not only fully models the different sources of prediction uncertainty, but also incorporates prior data information, e.g. sensor noise. We show theoretically that this gives us the ability to capture uncertainty better than existing methods. In addition, our framework has several desirable properties: (i) it is agnostic to the network architecture and task; (ii) it does not require changes in the optimization process; (iii) it can be applied to already trained architectures. We thoroughly validate the proposed framework through extensive experiments on both computer vision and control tasks, where we outperform previous methods by up to 23% in accuracy. The video available at https://***/X7n-bRS5vSM shows qualitative results of our experiments. The project's code is available at: https://***/s3nygw7.
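The role of Monte-Carlo sampling and prior sensor noise can be illustrated with a generic sketch. It is not the authors' framework: it simply assumes that both the input noise and the model's own randomness are sampled repeatedly, and that the spread of the outputs is reported as the predictive variance. The names `mc_predictive_stats` and `toy_dropout_model`, and all parameter values, are hypothetical.

```python
import random
import statistics


def mc_predictive_stats(model, x, sensor_sigma, n_samples=1000, seed=0):
    """Monte Carlo estimate of the predictive mean and variance.

    Each sample perturbs the input with Gaussian sensor noise (assumed prior)
    and calls a possibly stochastic model with the shared random generator.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        noisy_x = [xi + rng.gauss(0.0, sensor_sigma) for xi in x]
        outputs.append(model(noisy_x, rng))
    return statistics.mean(outputs), statistics.pvariance(outputs)


def toy_dropout_model(x, rng, keep_prob=0.9):
    """Stand-in for a trained network with dropout kept active at test time."""
    kept = [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]
    return 2.0 * sum(kept)


# Usage: the model itself is left unchanged; only sampling is added around it.
mean, variance = mc_predictive_stats(toy_dropout_model, [1.0, 0.5], sensor_sigma=0.1)
```

Because the sampling wraps an existing model, a sketch like this needs no change to the architecture or training, which is in the spirit of properties (i) to (iii) listed in the abstract.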
Soft bodies made from flexible and deformable materials are popular in many robotics applications, but their proprioceptive sensing has been a long-standing challenge. In other words, there has hardly been a method to measure and model the high-dimensional 3D shapes of soft bodies with internal sensors. We propose a framework to measure the high-resolution 3D shapes of soft bodies in real-time with embedded cameras. The cameras capture visual patterns inside a soft body, and a convolutional neural network (CNN) produces a latent code representing the deformation state, which can then be used to reconstruct the body's 3D shape using another neural network. We test the framework on various soft bodies, such as a Baymax-shaped toy, a latex balloon, and some soft robot fingers, and achieve real-time computation (<= 2.5 ms/frame) for robust shape estimation with high precision (<= 1% relative error) and high resolution. We believe the method could be applied to soft robotics and human-robot interaction for proprioceptive shape sensing. Our code is available at: https://***/deepSoRo.
We demonstrate model-based, visual robot manipulation of deformable linear objects. Our approach is based on a state-space representation of the physical system that the robot aims to control. This choice has multiple advantages, including the ease of incorporating physics priors in the dynamics model and perception model, and the ease of planning manipulation actions. In addition, physical states can naturally represent object instances of different appearances. Therefore, dynamics in the state space can be learned in one setting and directly used in other visually different settings. This is in contrast to dynamics learned in pixel space or latent space, where generalization to visual differences is not guaranteed. Challenges in taking the state-space approach are the estimation of the high-dimensional state of a deformable object from raw images, where annotations are very expensive on real data, and finding a dynamics model that is accurate, generalizable, and efficient to compute. We are the first to demonstrate self-supervised training of rope state estimation on real images, without requiring expensive annotations. This is achieved by our novel self-supervised learning objective, which is generalizable across a wide range of visual appearances. With estimated rope states, we train a fast and differentiable neural network dynamics model that encodes the physics of mass-spring systems. Our method has a higher accuracy in predicting future states compared to models that do not involve explicit state estimation and do not use any physics prior, while only using 3% of training data. We also show that our approach achieves more efficient manipulation, both in simulation and on a real robot, when used within a model predictive controller.
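The physics prior mentioned above, a mass-spring system, can be sketched with a classical integrator. The following is not the learned neural network dynamics model from the abstract; it is a plain semi-implicit Euler step for a 2D chain of point masses joined by springs, with hypothetical parameter values (`k`, `mass`, `damping`, `dt`) chosen only for illustration.

```python
import math


def mass_spring_step(pos, vel, rest_len, k=100.0, mass=1.0, damping=0.5, dt=0.01):
    """Advance a 2D mass-spring chain by one semi-implicit Euler step.

    pos, vel: lists of (x, y) tuples, one per node along the rope.
    rest_len: rest length of every spring between consecutive nodes.
    """
    n = len(pos)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n - 1):
        dx = pos[i + 1][0] - pos[i][0]
        dy = pos[i + 1][1] - pos[i][1]
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # direction undefined for coincident nodes
        # Hooke's law: a stretched spring pulls the two nodes together.
        f = k * (length - rest_len)
        fx, fy = f * dx / length, f * dy / length
        forces[i][0] += fx
        forces[i][1] += fy
        forces[i + 1][0] -= fx
        forces[i + 1][1] -= fy
    new_pos, new_vel = [], []
    for (px, py), (vx, vy), (fx, fy) in zip(pos, vel, forces):
        # Update velocity first (with viscous damping), then position.
        vx = vx + dt * (fx - damping * vx) / mass
        vy = vy + dt * (fy - damping * vy) / mass
        new_vel.append((vx, vy))
        new_pos.append((px + dt * vx, py + dt * vy))
    return new_pos, new_vel


# Usage: a three-node rope whose last segment starts stretched.
positions = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0)]
velocities = [(0.0, 0.0)] * 3
positions, velocities = mass_spring_step(positions, velocities, rest_len=1.0)
```

A state-space model of this kind operates on node positions rather than pixels, which is why dynamics learned in one visual setting can be reused in another.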
Detecting and adapting to catastrophic failures in robotic systems requires a robot to learn its new dynamics quickly and safely to best accomplish its goals. To address this challenging problem, we propose probabilistically-safe, online learning techniques to infer the altered dynamics of a robot at the moment a failure (e.g., physical damage) occurs. We combine model predictive control and active learning within a chance-constrained optimization framework to safely and efficiently learn the new plant model of the robot. We leverage a neural network for function approximation in learning the latent dynamics of the robot under failure conditions. Our framework generalizes to various damage conditions while being computationally light-weight to advance real-time deployment. We empirically validate within a virtual environment that we can regain control of a severely damaged aircraft in seconds and require only 0.1 seconds to find safe, information-rich trajectories, outperforming state-of-the-art approaches.
In this letter, we formulate a novel Markov Decision Process (MDP) for safe and data-efficient learning for humanoid locomotion aided by a dynamic balancing model. In our previous studies of biped locomotion, we relied on a low-dimensional robot model, commonly used in high-level Walking Pattern Generators (WPGs). However, a low-level feedback controller cannot precisely track desired footstep locations due to the discrepancies between the full-order model and the simplified model. In this study, we propose mitigating this problem by complementing a WPG with reinforcement learning. More specifically, we propose a structured footstep control method consisting of a WPG, a neural network, and a safety controller. The WPG provides an analytical method that promotes efficient learning while the neural network maximizes long-term rewards, and the safety controller encourages safe exploration based on step capturability and the use of control-barrier functions. Our contributions include the following: (1) a structured learning control method for locomotion, (2) a data-efficient and safe learning process to improve walking using a physics-based model, and (3) the scalability of the procedure to various types of humanoid robots and walking.
There has been much recent interest in deep learning methods for monocular image-based object pose estimation. While object pose estimation is an important problem for autonomous robot interaction with the physical world, and the application space for monocular-based methods is expansive, there has been little work on applying these methods with fisheye imaging systems. Also, little exists in the way of annotated fisheye image datasets on which these methods can be developed and tested. The research landscape is even more sparse for object detection methods applied in the underwater domain, fisheye image-based or otherwise. In this work, we present a novel framework for adapting an ROI-based 6D object pose estimation method to work on full fisheye images. The method incorporates the gnomic projection of regions of interest from an intermediate spherical image representation to correct for the fisheye distortions. Further, we contribute a fisheye image dataset, called UWHandles, collected in natural underwater environments, with 6D object pose and 2D bounding box annotations.
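The projection from a spherical representation onto a flat region can be sketched with the standard gnomonic (also called gnomic) projection formulas, which map a point on the unit sphere onto the plane tangent at a chosen centre. This is a minimal sketch assuming angles in radians and a latitude/longitude parameterisation of the sphere; how the paper parameterises its spherical image and selects the tangent point for each region of interest is not stated in the abstract.

```python
import math


def gnomonic_project(lat, lon, lat0, lon0):
    """Project a sphere point (lat, lon) onto the plane tangent at (lat0, lon0).

    All angles are in radians. Returns planar (x, y) coordinates in units of
    the sphere radius. Points 90 degrees or more from the tangent point
    cannot be projected.
    """
    cos_c = (math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(lon - lon0))
    if cos_c <= 0.0:
        raise ValueError("point is 90 degrees or more from the tangent point")
    x = math.cos(lat) * math.sin(lon - lon0) / cos_c
    y = (math.cos(lat0) * math.sin(lat)
         - math.sin(lat0) * math.cos(lat) * math.cos(lon - lon0)) / cos_c
    return x, y


# Usage: a point 10 degrees east and 5 degrees north of the tangent point.
x, y = gnomonic_project(math.radians(5.0), math.radians(10.0), 0.0, 0.0)
```

Great circles on the sphere map to straight lines under this projection, so a region reprojected this way looks like a view from an ordinary perspective camera pointed at the tangent point.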