The cost aggregation strategy plays a crucial role in learning-based stereo matching, where 3D convolutional filters achieve state-of-the-art accuracy but require intensive computational resources, while 2D operations need less GPU memory but are sensitive to domain shift. In this letter, we decouple the 4D cubic cost volume used by 3D convolutional filters into sequential cost maps along the disparity direction and, instead of processing the volume all at once, aggregate them with a recurrent strategy. Furthermore, a novel recurrent module, the Stacked Recurrent Hourglass (SRH), is proposed to process each cost map. Our hourglass network is built from Gated Recurrent Units (GRUs) and down/upsampling layers, which give the GRUs larger receptive fields. Two hourglass networks are then stacked together, and multi-scale information is processed through skip connections to improve the performance of the pipeline in textureless areas. The proposed architecture is implemented in an end-to-end pipeline and evaluated on public datasets; it reduces GPU memory consumption by up to 56.1% compared with PSMNet using stacked hourglass 3D CNNs, without degrading accuracy. We further demonstrate the scalability of the proposed method on several high-resolution pairs, on which previous learning-based approaches often fail due to memory constraints. The code is released at https://***/hongzhidu/SRHNet.
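The idea of walking through the cost volume one disparity slice at a time can be illustrated with a toy recurrence. This is only a sketch under strong assumptions: a 1D scanline, a fixed scalar `gate` standing in for a learned GRU update gate, and a hypothetical function name. It is not the SRH module itself.

```python
def aggregate_recurrently(cost_maps, gate=0.5):
    """Aggregate cost maps one disparity at a time with a carried hidden state.

    cost_maps[d][x] is the matching cost of pixel x at disparity d.
    `gate` is a fixed stand-in for a learned GRU update gate (assumption).
    """
    hidden = [0.0] * len(cost_maps[0])
    aggregated = []
    for cost_map in cost_maps:  # only one disparity slice is processed per step
        hidden = [(1.0 - gate) * h + gate * c for h, c in zip(hidden, cost_map)]
        aggregated.append(list(hidden))
    return aggregated
```

At each step only the current slice and the hidden state are needed, which is the property that lets a recurrent scheme avoid holding the whole 4D volume in memory during aggregation.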
This article describes motion planning networks (MPNet), a computationally efficient, learning-based neural planner for solving motion planning problems. MPNet uses neural networks to learn general near-optimal heuristics for path planning in seen and unseen environments. It takes environment information, such as a raw point cloud from depth sensors, together with a robot's initial and desired goal configurations, and recursively calls itself to bidirectionally generate connectable paths. In addition to finding directly connectable and near-optimal paths in a single pass, we show that worst-case theoretical guarantees can be proven if we merge this neural network strategy with classical sample-based planners in a hybrid approach, while still retaining significant computational and optimality improvements. To train the MPNet models, we present an active continual learning approach that enables MPNet to learn from streaming data and actively ask for expert demonstrations when needed, drastically reducing the data required for training. We validate MPNet against gold-standard and state-of-the-art planning methods in a variety of problems, from two-dimensional to seven-dimensional robot configuration spaces, in challenging and cluttered environments. The results show significantly and consistently stronger performance, and motivate neural planning in general as a modern strategy for solving motion planning problems efficiently.
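The bidirectional generation described above can be sketched as two branches that take turns growing toward each other. Everything here is an illustrative assumption: `step_toward` is a hand-written stand-in for the learned network, `within` is a hypothetical connectivity check that ignores obstacles, and the loop is written iteratively rather than recursively.

```python
import math

def step_toward(current, target, step=1.0):
    """Stand-in for the learned planner (assumption): take one fixed-length
    step from `current` toward `target`."""
    dx, dy = target[0] - current[0], target[1] - current[1]
    dist = math.hypot(dx, dy)
    if dist <= step:
        return target
    return (current[0] + step * dx / dist, current[1] + step * dy / dist)

def within(a, b, radius=1.0):
    """Hypothetical connectivity check: two states connect if they are close."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= radius

def bidirectional_plan(start, goal, connectable=within, max_iters=100):
    """Grow one branch from the start and one from the goal, alternating,
    until their end points can be connected directly."""
    path_a, path_b = [start], [goal]
    forward = True  # True while path_a is the branch rooted at the start
    for _ in range(max_iters):
        if connectable(path_a[-1], path_b[-1]):
            first, second = (path_a, path_b) if forward else (path_b, path_a)
            return first + second[::-1]  # start branch, then reversed goal branch
        path_a.append(step_toward(path_a[-1], path_b[-1]))
        path_a, path_b = path_b, path_a  # swap roles so the other branch grows next
        forward = not forward
    return None  # no connection found within the iteration budget
```

In a hybrid scheme like the one the abstract mentions, a `None` result is the point where a classical sample-based planner could take over.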
In this article, we are interested in robotic visual object classification using a deep convolutional neural network (DCNN) classifier. We show that the correlation coefficient of the automatically learned DCNN features of two object images carries robust information on their similarity, and can be utilized to significantly improve the robot's classification accuracy, without additional training. More specifically, we first probabilistically analyze how the feature correlation carries vital similarity information and build a correlation-based Markov random field (CoMRF) for joint object labeling. Given query and motion budgets, we then propose an optimization framework to plan the robot's query and path based on our CoMRF. This gives the robot a new way to optimally decide which object sites to move close to for better sensing and for which objects to ask a remote human for help with classification, which considerably improves the overall classification. We extensively evaluate our proposed approach on two large datasets (e.g., drone imagery and indoor scenes) and several real-world robotic experiments. The results show that our proposed approach significantly outperforms the benchmarks.
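As a small illustration of the similarity signal mentioned above, the Pearson correlation coefficient of two feature vectors can be computed as follows. The function name and the toy vectors are hypothetical; real DCNN features would be much longer.

```python
import math

def feature_correlation(u, v):
    """Pearson correlation coefficient between two equal-length feature vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)
```

Values near 1 suggest the two images produce similar features, which is the kind of evidence a joint labeling model could use to tie their labels together.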
In this research, a global path planning method based on recurrent neural networks with a new loss function is presented, which generates the path in a relatively constant time regardless of the complexity of the configuration space. The new loss function is defined so that, in addition to fitting the network's training data, it creates an adjustable safety margin around the obstacles and ultimately yields a safe path. Moreover, a new global path planning method is introduced and used to create the dataset required to train the proposed neural network. The convergence of this method is mathematically proven, and it is shown that it can produce a suboptimal path in a much shorter time than the common global path planning methods reported in the literature. In short, the main purpose of this research is to provide a method that can create a suboptimal, fast and safe path for a mobile robot from any random starting point to any random destination in a known environment. The proposed methods are first implemented for different two-dimensional environments consisting of convex and non-convex obstacles, treating the robot as a point mass, and then in a simulation environment, AI2THOR. Compared to classical global path planning algorithms, such as RRT and A*, the proposed approach demonstrates better performance in complex and challenging environments. (c) 2023 Elsevier B.V. All rights reserved.
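One way a loss can combine data fitting with an adjustable safety margin is to add a penalty that grows as predicted waypoints enter the margin around an obstacle. The abstract does not give the actual formula, so the sketch below is an assumption throughout: obstacles are reduced to points, and `margin` and `weight` are hypothetical parameters.

```python
import math

def safe_path_loss(predicted, target, obstacles, margin=1.0, weight=10.0):
    """Mean squared waypoint error plus a quadratic penalty for waypoints
    closer than `margin` to any obstacle point (illustrative form only)."""
    imitation = sum(
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(predicted, target)
    ) / len(predicted)
    penalty = 0.0
    for point in predicted:
        for obstacle in obstacles:
            intrusion = margin - math.dist(point, obstacle)
            if intrusion > 0.0:  # waypoint lies inside the safety margin
                penalty += intrusion ** 2
    return imitation + weight * penalty
```

Raising `margin` widens the region that is penalized, which is one plausible reading of "adjustable" in the text.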
Electrical impedance tomography (EIT) based tactile sensors offer significant benefits for practical deployment because of their sparse electrode allocation, including durability, large-area scalability, and low fabrication cost, but the resulting degradation of tactile spatial resolution has remained a challenge. This article describes a deep neural network based EIT reconstruction framework, the EIT neural network (EIT-NN), which alleviates this tradeoff between tactile sensing performance and hardware simplicity. EIT-NN learns a computationally efficient, nonlinear reconstruction attribute, achieving high-resolution tactile sensation and a well-generalized reconstruction capability that handles arbitrary complex touch modalities. We train EIT-NN using a sim-to-real dataset synthesis strategy for computationally efficient generalizability. Furthermore, we propose a spatial sensitivity aware mean-squared error loss function, which uses the intrinsic spatial sensitivity of the sensor to guarantee a well-posed EIT operation. We validate that EIT-NN outperforms conventional EIT sensing methods through a simulation study, a single-touch indentation test, and a two-point discrimination test. The results show improved spatial resolution, sensitivity, and localization accuracy. The benefits of the generalized sensing of EIT-NN are demonstrated by examining touch modality discrimination performance.
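A sensitivity-aware mean-squared error can be pictured as an ordinary MSE in which each pixel's squared error carries its own weight. How the weights are derived from the sensor's spatial sensitivity is not stated in the abstract, so the sketch below simply takes them as an input; the function name is hypothetical.

```python
def weighted_mse(predicted, target, weights):
    """Weighted mean-squared error over flattened reconstruction pixels.

    `weights` is assumed to be derived from the sensor's spatial sensitivity
    map; the exact mapping is not specified in the source.
    """
    weighted_errors = sum(
        w * (p - t) ** 2 for p, t, w in zip(predicted, target, weights)
    )
    return weighted_errors / sum(weights)
```

With uniform weights this reduces to the plain MSE, so the weighting only changes which regions of the sensor dominate the training signal.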
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) - an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one - in dense and highly structured environments, typical of real-world warehouse operations. Effectively solving LMAPF in such environments requires expensive coordination between agents as well as frequent replanning abilities, a daunting task for existing coupled and decoupled approaches alike. With the purpose of achieving considerable agent coordination without any compromise on reactivity and scalability, we introduce PRIMAL(2), a distributed reinforcement learning framework for LMAPF where agents learn fully decentralized policies to reactively plan paths online in a partially observable world. We extend our previous work, which was effective in low-density sparsely occupied worlds, to highly structured and constrained worlds by identifying behaviors and conventions which improve implicit agent coordination, and enable their learning through the construction of a novel local agent observation and various training aids. We present extensive results of PRIMAL(2) in both MAPF and LMAPF environments and compare its performance to state-of-the-art planners in terms of makespan and throughput. We show that PRIMAL(2) significantly surpasses our previous work and performs comparably to these baselines, while allowing real-time re-planning and scaling up to 2048 agents.
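The two evaluation metrics named above can be written down in a few lines, using their usual definitions: makespan for one-shot MAPF is the time step at which the last agent reaches its goal, and throughput for LMAPF is the number of goals reached per time step. The function names are illustrative.

```python
def makespan(arrival_times):
    """Time step at which the last agent reaches its goal (one-shot MAPF)."""
    return max(arrival_times)

def throughput(goals_reached, timesteps):
    """Average number of goals reached per time step (lifelong MAPF)."""
    return goals_reached / timesteps
```

A lower makespan and a higher throughput both indicate better coordination, which is why planners are compared on one or the other depending on the problem variant.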
ISBN (digital): 9789887581581
ISBN (print): 9798350366907
The multi-agent path finding (MAPF) problem is crucial to improving the efficiency of warehouse systems. Compared with traditional centralized methods, whose computational complexity escalates with increasing scale, reinforcement learning-based methods have proven effective for solving the MAPF problem. Nevertheless, in complex and large-scale scenarios, the policies learned by existing reinforcement learning-based methods are generally inadequate. By leveraging the concepts of policy evaluation and policy evolution, this paper aims to improve performance and sample efficiency. To this end, we introduce an MAPF method based on evolutionary reinforcement learning. In particular, we design a collaborative policy network model based on reinforcement ***, a novel evolutionary reinforcement learning training framework is constructed. Policy evaluation is carried out through a quantitative evaluation mechanism, and an evolutionary algorithm is used for policy evolution, so that the collaborative policy can better guide the agents to complete the path finding task. We test on high-density warehouse environment instances of various map sizes, and the experimental results show that our method achieves a high success rate and a low average number of steps.
We design a multiscopic vision system that utilizes a low-cost monocular RGB camera to acquire accurate depth estimation. Unlike multi-view stereo with images captured at unconstrained camera poses, the proposed system controls the motion of a camera to capture a sequence of images in horizontally or vertically aligned positions with the same parallax. In this system, we propose a new heuristic method and a robust learning-based method to fuse multiple cost volumes between the reference image and its surrounding images. To obtain training data, we build a synthetic dataset with multiscopic images. The experiments on the real-world Middlebury dataset and real robot demonstration show that our multiscopic vision system outperforms traditional two-frame stereo matching methods in depth estimation. Our code and dataset are available at https://***/view/multiscopic.
We present a biomimetic framework for human neuromuscular and visuomotor control that promises to be of value to researchers developing humanoid robots. Our framework features a biomechanically simulated human musculoskeletal model, actuated by numerous skeletal muscles, with realistic eyes driven by extraocular and intraocular muscles, whose optic organs refract light, and whose retinas have many nonuniformly distributed photoreceptors. The humanoid's visuomotor control system comprises 24 trained deep neural networks (DNNs), 10 in its vision subsystem and 14 in its motor subsystem, plus an additional 4 trained shallow neural networks (SNNs) that control the irises and lenses of the eyes. Of the motor DNNs, a pair control the extraocular muscles, 6 per eye, responsible for eye movements; 2 control the 216 neck muscles of the cervicocephalic biomechanical complex, producing natural head movements; 2 control the 443 core muscles of the torso; and 2 control each limb, i.e., the 29 muscles of each arm and 39 muscles of each leg. Directly from the foveated retinal photoreceptor responses, a pair of foveation DNNs drive eye, head, and torso movements, while 8 limb vision DNNs extract the visual information needed to direct arm and leg actions. By synthesizing its own training data, our humanoid automatically learns efficient, online, active visuomotor control of its eyes, head, torso, and limbs in order to perform nontrivial tasks involving the foveation and visual pursuit of moving target objects, coupled with visually guided limb-reaching actions to intercept them. We also demonstrate that it can balance itself in an upright stance, take steps, and perform certain simulated sports activities.
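The network counts quoted above can be checked with a short tally. The dictionary keys are illustrative labels, and the only assumption is that "each limb" means two arms and two legs.

```python
LIMBS = 4  # two arms and two legs (assumption about "each limb")

motor_dnns = {
    "extraocular": 2,   # a pair controlling the eye muscles
    "neck": 2,
    "torso": 2,
    "limbs": 2 * LIMBS,  # 2 DNNs per limb
}
vision_dnns = {
    "foveation": 2,     # a pair driving eye, head, and torso movements
    "limb_vision": 8,
}

total_motor = sum(motor_dnns.values())
total_vision = sum(vision_dnns.values())
total_dnns = total_motor + total_vision
```

The per-subsystem sums agree with the 14 motor and 10 vision DNNs stated in the text, giving 24 in total.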
Deep neural networks facilitate visuosensory inputs for robotic systems. However, the features encoded in a network without specific constraints have little physical meaning. In this research, we add constraints to the network so that the trained features are forced to represent the actual twist coordinates of interactive objects in a scene. The trained coordinates describe the 6D pose of the objects, and a transformation is applied to change the coordinate system. This algorithm is developed for a mobile service robot that imitates an object-oriented task by watching human demonstrations. As the robot is mobile, the video demonstrations are collected from different viewpoints. Our feature trajectories of twist coordinates are synthesized in the global coordinate frame after a transformation is applied according to robot localization. The trajectories are then trained as a probabilistic model and imitated by the robot with geometric dynamics of . Our main contribution is to develop a robot that can be trained from visually demonstrated human performances. Additionally, our algorithmic contribution is to design a scene interpretation network in which constraints are incorporated to estimate the 6D pose of objects.