Electrical impedance tomographic (EIT) tactile sensing holds great promise for whole-body coverage of contact-rich robotic systems, offering extensive flexibility in sensor geometry. However, low spatial resolution restricts its practical use, despite existing deep-learning-based reconstruction methods. This study introduces EIT-GNN, a graph-structured, data-driven EIT reconstruction framework that achieves super-resolution in large-area tactile perception on unbounded robot form factors. EIT-GNN represents an arbitrary sensor shape as a mesh of connections, then employs a twofold architecture of a transformer encoder and a graph convolutional neural network to best exploit this geometrical prior knowledge, resulting in an accurate, generalizable, and parameter-efficient reconstruction procedure. As a proof of concept, we demonstrate its application using large-area face-shaped sensor hardware, which represents one of the most complex geometries in human/humanoid anatomy. An extensive set of experiments, including a simulation study, ablation analysis, single-touch indentation tests, and latent-feature analysis, confirms its superiority over alternative models. The benefits of the approach are demonstrated through its application to active tactile-servo control of humanoid head motion, paving a new way for integrating tactile sensors with intricate designs into robotic systems.
We propose a sample-based model predictive control (MPC) method for collision-free navigation that uses a normalizing flow as a sampling distribution, conditioned on the start, goal, environment, and cost parameters. This representation allows us to learn a distribution that accounts for both the dynamics of the robot and complex obstacle geometries. We propose a way to incorporate this sampling distribution into two sampling-based MPC methods, MPPI and iCEM. However, when deploying these methods, the robot may encounter an out-of-distribution (OOD) environment. To generalize our method to OOD environments, we also present an approach that performs projection on the representation of the environment. This projection shifts the environment representation to be more in-distribution while also optimizing trajectory quality in the true environment. Our simulation results on a 2-D double integrator, a 12-DoF quadrotor, and a seven-DoF kinematic manipulator suggest that using a learned sampling distribution with projection outperforms MPC baselines in both in-distribution and OOD environments over different cost functions, including OOD environments generated from real-world data.
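As a rough illustration of the sampling-based MPC machinery that a learned distribution plugs into, the sketch below shows the standard MPPI weighted-averaging step over sampled control trajectories. In the abstract's method the samples would come from the conditioned normalizing flow; here the samples, costs, and temperature are purely illustrative toy values, not from the paper.

```python
import numpy as np

def mppi_update(sample_trajs, costs, lam=1.0):
    """MPPI-style weighted averaging of sampled control trajectories.

    sample_trajs: (K, T, m) array of K sampled control sequences.
    costs: (K,) array of trajectory costs under the true environment.
    lam: temperature; lower values concentrate weight on low-cost samples.
    """
    beta = costs.min()                        # subtract min cost for numerical stability
    w = np.exp(-(costs - beta) / lam)
    w /= w.sum()
    # Weighted average over the sample axis -> one (T, m) control sequence.
    return np.einsum("k,ktm->tm", w, sample_trajs)

# Toy usage: 4 samples, horizon 3, 1-D control; the low-cost sample dominates.
trajs = np.array([[[0.], [0.], [0.]],
                  [[1.], [1.], [1.]],
                  [[2.], [2.], [2.]],
                  [[3.], [3.], [3.]]])
costs = np.array([10.0, 0.0, 10.0, 10.0])
u = mppi_update(trajs, costs, lam=0.1)
```

With a low temperature, nearly all weight falls on the zero-cost sample, so the returned sequence is close to the all-ones trajectory.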
During the operation of industrial robots, unusual events may endanger the safety of humans and the quality of production. When collecting data to detect such cases, it is not ensured that data from all potentially occurring errors is included, as unforeseeable events may happen over time. Therefore, anomaly detection (AD) delivers a practical solution, using only normal data to learn to detect unusual events. We introduce a dataset that allows training and benchmarking of anomaly detection methods for robotic applications based on machine data, which will be made publicly available to the research community. As a typical robot task, the dataset includes a pick-and-place application that involves movement, actions of the end effector, and interactions with the objects of the environment. Since several of the contained anomalies are not task-specific but general, evaluations on our dataset are transferable to other robotics applications as well. In addition, we present multivariate time-series flow (MVT-Flow) as a new baseline method for anomaly detection: it relies on deep-learning-based density estimation with normalizing flows, tailored to the data domain by taking its structure into account in the architecture. Our evaluation shows that MVT-Flow outperforms baselines from previous work by a large margin of 6.2% in area under the receiver operating characteristic.
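The anomaly-scoring principle behind density-based methods like MVT-Flow (fit a density model on normal-only data, then score test samples by negative log-likelihood) can be sketched with a simple stand-in: a diagonal Gaussian replaces the actual normalizing flow here, and all names and data are illustrative, not from the dataset or paper.

```python
import numpy as np

# Stand-in for the learned density: MVT-Flow uses a normalizing flow, but the
# scoring logic is the same for any density model fitted on normal-only data.
def fit_density(X_normal):
    mu = X_normal.mean(axis=0)
    sigma = X_normal.std(axis=0) + 1e-6
    return mu, sigma

def anomaly_score(x, mu, sigma):
    # Negative log-likelihood under a diagonal Gaussian:
    # higher score = lower density = more anomalous.
    z = (x - mu) / sigma
    return 0.5 * np.sum(z**2 + np.log(2 * np.pi * sigma**2))

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(1000, 8))   # "normal" machine-data windows
mu, sigma = fit_density(X_train)

score_normal = anomaly_score(np.zeros(8), mu, sigma)
score_anomaly = anomaly_score(np.full(8, 5.0), mu, sigma)
```

Thresholding such scores yields the detector whose quality the abstract reports as area under the receiver operating characteristic.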
Knowing when a trained segmentation model is encountering data that is different from its training data is important. Understanding and mitigating the effects of this distribution shift plays an important part in model deployment from a performance and assurance perspective; it is a safety concern in applications such as autonomous vehicles. This article presents a segmentation network that can detect errors caused by challenging test domains without any additional annotation, in a single forward pass. As annotation costs limit the diversity of labeled datasets, we use easy-to-obtain, uncurated, and unlabeled data to learn to perform uncertainty estimation by selectively enforcing consistency over data augmentation. To this end, a novel segmentation benchmark based on sense-assess-eXplain (SAX) is used, which includes labeled test data spanning three autonomous-driving domains, ranging in appearance from dense urban to off-road. The proposed method, named gamma-SSL, consistently outperforms uncertainty estimation and out-of-distribution techniques on this difficult benchmark, by up to 10.7% in area under the receiver operating characteristic curve and 19.2% in area under the precision-recall curve in the most challenging of the three scenarios.
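The training signal described above, consistency over data augmentation, can be approximated with a naive multi-view proxy. Note that gamma-SSL learns to produce its uncertainty estimate in a single forward pass, whereas this sketch averages explicit augmented passes; shapes and probabilities are illustrative assumptions.

```python
import numpy as np

def consistency_uncertainty(probs_views):
    """Per-pixel entropy of the view-averaged prediction: high where augmented
    views disagree or are individually unsure.

    probs_views: (V, H, W, C) class probabilities from V augmented passes.
    """
    mean_probs = probs_views.mean(axis=0)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=-1)

# Two views agreeing on class 0 vs. two views disagreeing on one pixel.
agree = np.tile([[[[0.9, 0.1]]]], (2, 1, 1, 1))        # shape (2, 1, 1, 2)
disagree = np.array([[[[0.9, 0.1]]], [[[0.1, 0.9]]]])  # shape (2, 1, 1, 2)
u_agree = consistency_uncertainty(agree)[0, 0]
u_disagree = consistency_uncertainty(disagree)[0, 0]
```

The disagreeing pixel averages to a uniform distribution and so receives the maximum entropy, which is the behavior a consistency-trained uncertainty head should reproduce.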
The perception of moving objects is crucial for autonomous robots performing collision avoidance in dynamic environments. LiDARs and cameras tremendously enhance scene interpretation but do not provide direct motion information and face limitations under adverse weather. Radar sensors overcome these limitations and provide Doppler velocities, delivering direct information on dynamic objects. In this article, we address the problem of moving instance segmentation in radar point clouds to enhance scene interpretation for safety-critical tasks. Our radar instance transformer enriches the current radar scan with temporal information without passing aggregated scans through a neural network. We propose a full-resolution backbone to prevent information loss in sparse point cloud processing. Our instance transformer head incorporates essential information not only to enhance segmentation but also to enable reliable, class-agnostic instance assignments. In sum, our approach shows superior performance on the new moving instance segmentation benchmarks, including diverse environments, and provides model-agnostic modules to enhance scene interpretation.
Grasping is a crucial task in robotics, necessitating tactile feedback and reactive grasping adjustments for robust grasping of objects under various conditions and with differing physical properties. In this article, we introduce LeTac-MPC, a learning-based model predictive control (MPC) method for tactile-reactive grasping. Our approach enables the gripper to grasp objects with different physical properties in dynamic and force-interactive tasks. We utilize a vision-based tactile sensor, GelSight (Yuan et al. 2017), which is capable of perceiving high-resolution tactile feedback that contains information on the physical properties and states of the grasped object. LeTac-MPC incorporates a differentiable MPC layer designed to model the embeddings extracted by a neural network from tactile feedback. This design facilitates convergent and robust grasping control at a frequency of 25 Hz. We propose a fully automated data collection pipeline and collect a dataset using only standardized blocks with different physical properties. Nevertheless, our trained controller can generalize to daily objects with different sizes, shapes, materials, and textures. The experimental results demonstrate the effectiveness and robustness of the proposed approach. We compare LeTac-MPC with two purely model-based tactile-reactive controllers (MPC and PD) and open-loop grasping. Our results show that LeTac-MPC achieves the best performance in dynamic and force-interactive tasks as well as the best generalizability.
Long-horizon dexterous robot manipulation of deformable objects, such as banana peeling, is a problematic task because of the difficulties in object modeling and a lack of knowledge about stable and dexterous manipulation skills. This article presents a goal-conditioned dual-action deep imitation learning (DIL) approach that can learn dexterous manipulation skills from human demonstration data. Previous DIL methods map the current sensory input to a reactive action, which often fails because of compounding errors in imitation learning caused by the recurrent computation of actions. Our method predicts a reactive action only when precise manipulation of the target object is required (local action) and generates the entire trajectory when precise manipulation is not required (global action). This dual-action formulation effectively prevents compounding errors in imitation learning through the trajectory-based global action, while still responding to unexpected changes in the target object through the reactive local action. The proposed method was tested on a real dual-arm robot and successfully accomplished the banana-peeling task.
Embodied visual navigation has witnessed significant advancements. However, most studies commonly assume that environments are static and contain at least one collision-free path. In human environments, agents frequently encounter challenges when navigating through scenes with disarranged objects. In this letter, we explore the interactive navigation problem, wherein agents possess the ability to physically interact with and modify the environment, such as moving obstacles aside, to improve their efficiency in reaching the target. To this end, we propose a novel cross-dimension scene representation module under the framework of reinforcement learning (RL) that provides a joint 2D and 3D scene representation for interactive agents. We first leverage 2D and 3D observation encoders to extract informative features from observations. Subsequently, a joint representation network is proposed to lift the dimension of 2D feature maps to 3D and align them with the 3D observation, enabling us to fuse information from different dimensions. This allows us to simultaneously harness the advantages of 2D and 3D observations, thereby yielding a more informative representation for interactive RL agents in addressing challenges arising from physical interactions. We validate our proposed approach in the iGibson environment, and experimental results demonstrate a significant improvement over baseline methods.
Our goal is to perform out-of-distribution (OOD) detection, i.e., to detect when a robot is operating in environments drawn from a different distribution than the ones used to train the robot. We leverage probably approximately correct (PAC)-Bayes theory to train a policy with a guaranteed bound on performance on the training distribution. Our idea for OOD detection relies on the following intuition: violation of the performance bound on test environments provides evidence that the robot is operating OOD. We formalize this via statistical techniques based on $p$-values and concentration inequalities. The approach provides guaranteed confidence bounds on OOD detection, including bounds on both the false-positive and false-negative rates of the detector, and is task-driven and only sensitive to changes that impact the robot's performance. We demonstrate our approach in simulation and hardware for a grasping task using objects with unfamiliar shapes or poses and a drone performing vision-based obstacle avoidance in environments with wind disturbances and varied obstacle densities. Our examples demonstrate that we can perform task-driven OOD detection within just a handful of trials.
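The detection rule, treating violation of a certified performance bound as statistical evidence of OOD operation, can be sketched with a Hoeffding-style p-value. This is a generic concentration-inequality sketch, not the paper's exact formalization, and all numbers below are illustrative.

```python
import math

def ood_p_value(successes, n_trials, guaranteed_rate):
    """Hoeffding-style p-value for the hypothesis that the policy still meets
    its certified success rate; a small p-value suggests OOD operation.

    successes: number of successful trials observed at test time.
    n_trials: total number of test trials.
    guaranteed_rate: certified lower bound on success rate in-distribution.
    """
    emp = successes / n_trials
    gap = guaranteed_rate - emp
    if gap <= 0:
        return 1.0  # empirical performance meets the bound: no evidence of OOD
    # Hoeffding: P(empirical mean falls `gap` below the true mean) <= exp(-2 n gap^2)
    return math.exp(-2.0 * n_trials * gap**2)

# Toy usage: a policy certified at 90% success succeeding only 2 out of 10 times
# yields a tiny p-value, i.e., strong evidence of OOD operation.
p = ood_p_value(2, 10, 0.90)
```

The exponential dependence on the number of trials is why only a handful of trials can suffice: a large gap between certified and observed performance drives the p-value down very quickly.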
LiDAR semantic segmentation (LSS) for autonomous driving has been a growing field of interest in recent years. Datasets and methods have appeared and expanded very quickly, but methods have not been updated to exploit this new data availability and still rely on the same classical datasets. Different ways of performing LSS training and inference can be divided into several subfields, including domain generalization, source-to-source segmentation, and pretraining. In this work, we aim to improve results in all of these subfields with the novel approach of multisource training. Multisource training relies on the availability of various datasets at training time. To overcome the common obstacles in multisource training, we introduce coarse labels and call the newly created multisource dataset COLA. We propose three applications of this new dataset that display systematic improvement over single-source strategies: COLA-DG for domain generalization (+10%), COLA-S2S for source-to-source segmentation (+5.3%), and COLA-PT for pretraining (+12%). We demonstrate that multisource approaches bring systematic improvement over single-source approaches.
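The coarse-label idea, mapping each dataset's native label set onto a shared coarse taxonomy so that heterogeneous datasets can be mixed at training time, can be sketched as a simple lookup. The actual COLA taxonomy is not reproduced here; all class names below are hypothetical.

```python
# Hypothetical mapping from dataset-specific fine labels to shared coarse labels.
COARSE = {
    "car": "vehicle", "truck": "vehicle",
    "person": "human", "cyclist": "human",
    "road": "ground", "sidewalk": "ground",
}

def to_coarse(labels):
    """Remap a sequence of fine labels onto the shared coarse label set,
    bucketing anything unmapped into a catch-all class."""
    return [COARSE.get(label, "other") for label in labels]

remapped = to_coarse(["car", "person", "building"])
```

Because every source dataset is remapped onto the same coarse classes, a single segmentation head can be trained on all of them jointly.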