Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. To foster manipulation with articulated objects in everyday life, this work explor...
详细信息
ISBN:
(纸本)9798350323658
Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. To foster manipulation with articulated objects in everyday life, this work explores building articulation models of indoor scenes through a robot's purposeful interactions in these scenes. Prior work on articulation reasoning primarily focuses on siloed objects of limited categories. To extend to room-scale environments, the robot has to efficiently and effectively explore a large-scale 3D space, locate articulated objects, and infer their articulations. We introduce an interactive perception approach to this task. Our approach, named Ditto in the House, discovers possible articulated objects through affordance prediction, interacts with these objects to produce articulated motions, and infers the articulation properties from the visual observations before and after each interaction. It tightly couples affordance prediction and articulation inference to improve both tasks. We demonstrate the effectiveness of our approach in both simulation and real-world scenes. Code and additional results are available at https://***/HouseDitto/
Waste disposal into water bodies is a serious concern for environmental engineers, often resulting in urban flooding, soil degradation in agricultural areas, and freshwater pollution. Additionally, trash accumulation ...
详细信息
This paper presents an adaptive online learning framework for systems with uncertain parameters to ensure safety-critical control in non-stationary environments. Our approach consists of two phases. The initial phase ...
详细信息
ISBN:
(纸本)9798350384581;9798350384574
This paper presents an adaptive online learning framework for systems with uncertain parameters to ensure safety-critical control in non-stationary environments. Our approach consists of two phases. The initial phase is centered on a novel sparse Gaussian process (GP) framework. We first integrate a forgetting factor to refine a variational sparse GP algorithm, thus enhancing its adaptability. Subsequently, the hyperparameters of the Gaussian model are trained with a specially compound kernel, and the Gaussian model's online inferential capability and computational efficiency are strengthened by updating a solitary inducing point derived from newly samples, in conjunction with the learned hyperparameters. In the second phase, we propose a safety filter based on high order control barrier functions (HOCBFs), synergized with the previously trained learning model. By leveraging the compound kernel from the first phase, we effectively address the inherent limitations of GPs in handling high-dimensional problems for real-time applications. The derived controller ensures a rigorous lower bound on the probability of satisfying the safety specification. Finally, the efficacy of our proposed algorithm is demonstrated through real-time obstacle avoidance experiments executed using both simulation platform and a real-world 7-DOF robot.
We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects directly in a 3D point cloud. Current methods for 3D instance segm...
详细信息
ISBN:
(纸本)9798350323658
We propose an interactive approach for 3D instance segmentation, where users can iteratively collaborate with a deep learning model to segment objects directly in a 3D point cloud. Current methods for 3D instance segmentation are generally trained in a fully-supervised fashion, which requires large amounts of costly training labels, and does not generalize well to classes unseen during training. Few works have attempted to obtain 3D segmentation masks using human interactions. Existing methods rely on user feedback in the 2D image domain. As a consequence, users are required to constantly switch between 2D images and 3D representations, and custom architectures are employed to combine multiple input modalities. Therefore, integration with existing standard 3D models is not straightforward. The core idea of this work is to enable users to interact directly with 3D point clouds by clicking on desired 3D objects of interest (or their background) to interactively segment the scene in an open-world setting. Specifically, our method does not require training data from any target domain and can adapt to new environments where no appropriate training sets are available. Our system continuously adjusts the object segmentation based on the user feedback and achieves accurate dense 3D segmentation masks with minimal human effort (few clicks per object). Besides its potential for efficient labeling of large-scale and varied 3D datasets, our approach, where the user directly interacts with the 3D environment, enables new AR/VR and human-robot interaction applications.
We present a novel framework using EnergyBased models (EBMs) for localizing a ground vehicle mounted with a range sensor against satellite imagery in the absence of GPS. Lidar sensors have become ubiquitous on autonom...
详细信息
ISBN:
(纸本)9798350323658
We present a novel framework using EnergyBased models (EBMs) for localizing a ground vehicle mounted with a range sensor against satellite imagery in the absence of GPS. Lidar sensors have become ubiquitous on autonomous vehicles for describing its surrounding environment. Map priors are typically built using the same sensor modality for localization purposes. However, these map building endeavors using range sensors are often expensive and time-consuming. Alternatively, we leverage the use of satellite images as map priors, which are widely available, easily accessible, and provide comprehensive coverage. We propose a method using convolutional transformers that performs accurate metric-level localization in a cross-modal manner, which is challenging due to the drastic difference in appearance between the sparse range sensor readings and the rich satellite imagery. We train our model end-to-end and demonstrate our approach achieving higher accuracy than the state-of-the-art on KITTI, Pandaset, and a custom dataset.
Weed detection and management is an intricate and formidable challenge, affecting the efficiency of human agricultural practices. Object detection has merged as a key technique in precision-based agriculture for comba...
详细信息
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks, providing universal function approximation with fewer parameters and reduced memory usage....
详细信息
ISBN:
(纸本)9798331517939;9788993215380
Kolmogorov-Arnold Networks (KANs) have shown potential as an alternative to Multi-Layer Perceptrons (MLPs) in neural networks, providing universal function approximation with fewer parameters and reduced memory usage. In this paper, we explore the use of KANs as function approximators within the Proximal Policy Optimization (PPO) algorithm. We evaluate this approach by comparing its performance to the original MLP-based PPO using the DeepMind Control Proprio robotics benchmark. Our results indicate that the KAN-based reinforcement learning algorithm can achieve comparable performance to its MLP-based counterpart, often with fewer parameters. These findings suggest that KANs may offer a more efficient option for reinforcement learning models. Our implementations can be found in the following link: https://***/victorkich/Kolmogorov-PPO.
Industrial anomaly detection plays a crucial role in modern manufacturing and production processes. In order to help researchers and professionals in the manufacturing industry understand the developing research resul...
详细信息
Obtaining the shape and size of a robot's workspace is essential for both its design and control. However, determining the accurate workspace of a multi-segment continuum robot by graphic or analytical methods is ...
详细信息
ISBN:
(纸本)9798350323658
Obtaining the shape and size of a robot's workspace is essential for both its design and control. However, determining the accurate workspace of a multi-segment continuum robot by graphic or analytical methods is a challenging task due to its inherent flexibility and complex structure. Existing numerical methods have limitations when applied to a continuum robot. This paper presents an Equivalent Two Section (ETS) method for calculating the workspace of multi-segment continuum robots. This method is based on the forward kinematics and a piecewise constant curvature (PCC) model to determine the boundaries of the workspace. In order to verify the proposed method, simulation experiments are conducted using six different maximum bending angles and seven different number of segments. Results of the ETS method are compared to the true workspaces of these configurations estimated by an exhaustive approach. The results show that the proposed ETS method is both efficient and accurate, and has small estimation errors. Discussions on the advantages and limitations of the proposed ETS method are also presented.
People with Visual Impairments (PVI) typically recognize objects through haptic perception. Knowing objects and materials before touching is desired by the target users but under-explored in the field of human-centere...
详细信息
ISBN:
(纸本)9798350384581;9798350384574
People with Visual Impairments (PVI) typically recognize objects through haptic perception. Knowing objects and materials before touching is desired by the target users but under-explored in the field of human-centered robotics. To fill this gap, in this work, a wearable vision-based robotic system, MATERobot, is established for PVI to recognize materials and object categories beforehand. To address the computational constraints of mobile platforms, we propose a lightweight yet accurate model MATEViT to perform pixel-wise semantic segmentation, simultaneously recognizing both objects and materials. Our methods achieve respective 40.2% and 51.1% of mIoU on COCOStuff-10K and DMS datasets, surpassing the previous method with +5.7% and +7.0% gains. Moreover, on the field test with participants, our wearable system reaches a score of 28 in the NASA-Task Load Index, indicating low cognitive demands and ease of use. Our MATERobot demonstrates the feasibility of recognizing material property through visual cues and offers a promising step towards improving the functionality of wearable robots for PVI. The source code has been made publicly available at MATERobot.
暂无评论