ISBN (print): 9781424442119
We present a vision-based control and interaction framework for mobile robots, and describe its implementation in a legged amphibious robot. The control scheme enables the robot to navigate, follow targets of interest, and interact with human operators. The visual framework presented in this paper enables deployment of the vehicle in underwater environments, with a human scuba diver as the operator, without requiring any external tethered control. We present the current implementation of this framework in our particular family of underwater robots, with a focus on the underlying software and hardware infrastructure. We examine the practical issues of system implementation as they apply to our framework, from the choice of operating system to the communication bus design. While our system has been used effectively in both open-ocean and closed-water environments, we also perform quantitative measurements in an effort to analyze the responsiveness and robustness of the complete architecture.
ISBN (print): 9781728198910
Depth is a valuable piece of information for robots and autonomous vehicles: it enables them to move through space and avoid obstacles. Nevertheless, depth alone is not enough for them to interact with their surroundings; they also need to locate the different objects present in their environment. In this paper, we propose a deep learning model that solves unsupervised monocular depth estimation and supervised instance segmentation at the same time with a common architecture. The first task is solved through novel view synthesis, while the second is solved by minimising an embedding loss function. Our approach is motivated by the idea that knowing where objects are in the scene can improve the depth estimates of unsupervised monocular depth models. We tested our architecture on two datasets, KITTI and Cityscapes, and reached state-of-the-art depth estimation results while solving the second task.
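The abstract above does not specify the embedding loss. One common discriminative formulation for instance embeddings (pull each pixel's embedding toward its instance mean, push instance means apart) can be sketched as follows; the function name and the margins `delta_v`/`delta_d` are illustrative assumptions, not necessarily the paper's exact objective.

```python
import numpy as np

def discriminative_embedding_loss(emb, labels, delta_v=0.5, delta_d=1.5):
    """Pull-push loss over per-pixel embeddings (illustrative sketch).

    emb    : (N, D) array of pixel embeddings
    labels : (N,) array of instance ids
    delta_v, delta_d : margin hyperparameters (assumed values)
    """
    ids = np.unique(labels)
    means = np.stack([emb[labels == k].mean(axis=0) for k in ids])

    # Variance term: pull each embedding within delta_v of its instance mean.
    var = 0.0
    for i, k in enumerate(ids):
        d = np.linalg.norm(emb[labels == k] - means[i], axis=1)
        var += np.mean(np.maximum(d - delta_v, 0.0) ** 2)
    var /= len(ids)

    # Distance term: push instance means at least 2 * delta_d apart.
    dist, pairs = 0.0, 0
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            d = np.linalg.norm(means[i] - means[j])
            dist += np.maximum(2 * delta_d - d, 0.0) ** 2
            pairs += 1
    if pairs:
        dist /= pairs
    return var + dist
```

For two tight, well-separated instances the loss is zero; overlapping instances drive it positive, which is what lets the embeddings be clustered into segments at inference time.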
ISBN (print): 9781538664810
Two-stream networks have been very successful for solving the problem of action detection. However, prior work using two-stream networks trains both streams separately, which prevents the network from exploiting regularities between the two streams. Moreover, unlike the visual stream, the dominant forms of optical flow computation typically do not maximally exploit GPU parallelism. We present a real-time end-to-end trainable two-stream network for action detection. First, we integrate the optical flow computation into our framework by using Flownet2. Second, we apply early fusion for the two streams and train the whole pipeline jointly end-to-end. Finally, for better network initialization, we transfer from the task of action recognition to action detection by pre-training our framework using the recently released large-scale Kinetics dataset. Our experimental results show that training the pipeline jointly end-to-end, with fine-tuning of the optical flow for the objective of action detection, improves detection performance significantly. Additionally, we observe an improvement when initializing with parameters pre-trained using Kinetics. Lastly, we show that by integrating the optical flow computation, our framework is more efficient, running at real-time speeds (up to 31 fps).
This paper proposes a novel algorithm developed specifically for removing colour cast from faded photographic materials. The algorithm is fully automated and requires no initial training. Quality of restored images is...
Automatic detection and segmentation of overlapping leaves in dense foliage can be a difficult task, particularly for leaves with strong textures and high occlusions. We present Dense-Leaves, an image dataset with gro...
ISBN (print): 0769523196
A fundamental task in computer vision is that of determining the position and orientation of a moving camera relative to an observed object or scene. Many such visual tracking algorithms have been proposed in the computer vision, artificial intelligence and robotics literature over the past 30 years. Predominantly, these remain unvalidated, since the ground-truth camera positions and orientations at each frame in a video sequence are not available for comparison with the outputs of the proposed vision systems. A method is presented for generating real visual test data with complete underlying ground-truth. The method enables the production of long video sequences, filmed along complicated six-degree-of-freedom trajectories, featuring a variety of objects, in a variety of different visibility conditions, for which complete ground-truth data is known, including the camera position and orientation at every image frame, intrinsic camera calibration data, a lens distortion model and models of the viewed objects.
We show that differentiation via fitting B-splines to the spatio-temporal intensity data comprising an image sequence provides 2D Lucas-Kanade optical flow that is at least as good as, and usually better than, that computed v...
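As a point of reference for the comparison in the teaser above, a minimal single-window Lucas-Kanade estimate can be sketched as follows; plain central differences stand in here for the B-spline derivatives the paper advocates, and the function name is illustrative.

```python
import numpy as np

def lucas_kanade_global(I1, I2, border=2):
    """Estimate one (u, v) translation between two frames by solving the
    Lucas-Kanade normal equations over a single whole-image window.
    Central differences are a simplified stand-in for spline derivatives.
    """
    Ix = np.gradient(I1, axis=1)   # horizontal spatial derivative
    Iy = np.gradient(I1, axis=0)   # vertical spatial derivative
    It = I2 - I1                   # temporal derivative
    s = (slice(border, -border), slice(border, -border))  # drop borders
    A = np.array([[np.sum(Ix[s] * Ix[s]), np.sum(Ix[s] * Iy[s])],
                  [np.sum(Ix[s] * Iy[s]), np.sum(Iy[s] * Iy[s])]])
    b = -np.array([np.sum(Ix[s] * It[s]), np.sum(Iy[s] * It[s])])
    return np.linalg.solve(A, b)   # least-squares flow (u, v)
```

On a smooth pattern shifted one pixel to the right, this recovers (u, v) close to (1, 0); the quality of the derivative estimates is exactly where the spline-based variant claims its advantage.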
ISBN (print): 9781509024919
Determining the viewpoint (pose) of rigid objects in monocular 2D images is a classic vision problem with applications to robotic grasping, augmented reality, semantic SLAM, autonomous navigation and scene understanding in general. Using only 3D CAD models of an object class as input, we demonstrate the ability to accurately predict viewpoint in real-world images even in the presence of clutter and occlusion. We report results on eight datasets, one of which is new, in the hope of providing the community with new viewpoint prediction baselines. We show that deep representations (from convolutional networks) can bridge the large divide between purely synthetic training data and real-world test data to achieve near state-of-the-art results in viewpoint prediction but without the need for labeled, real-world training data. Our general approach to viewpoint prediction is applicable to any object class where 3D models are available.
ISBN (print): 9780769527864
We show how a greedy approach to visual search - i.e., directly moving to the most likely location of the target - can be suboptimal if the target object is hard to detect. Instead, it is more efficient, and leads to higher detection accuracy, to first look for other related objects that are easier to detect. These provide contextual priors for the target that make it easier to find. We demonstrate this in simulation using POMDP models, focusing on two special cases: where the target object is contained within the related object, and where the target object is spatially adjacent to the related object.
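The intuition above can be illustrated with a back-of-the-envelope expected-cost calculation; this is a toy two-object model with made-up detection probabilities, not the paper's POMDP. If direct glimpses at a hard target succeed with probability 0.3, while an easy related object is detected with probability 0.9 and, once found, constrains the target's location well enough to raise its per-glimpse detection probability to 0.9, then the indirect strategy needs fewer glimpses in expectation.

```python
def expected_glimpses(p_detect):
    # Independent glimpses with success probability p per glimpse:
    # the number of glimpses until detection is geometric, mean 1/p.
    return 1.0 / p_detect

# Illustrative probabilities (assumptions, not taken from the paper).
P_TARGET_DIRECT = 0.3   # hard target, searched for directly
P_RELATED = 0.9         # easy related object
P_TARGET_PRIMED = 0.9   # target, once the related object fixes its location

greedy = expected_glimpses(P_TARGET_DIRECT)
indirect = expected_glimpses(P_RELATED) + expected_glimpses(P_TARGET_PRIMED)
print(f"greedy: {greedy:.2f} glimpses, indirect: {indirect:.2f} glimpses")
```

Here the greedy policy costs about 3.33 expected glimpses versus about 2.22 for the indirect one, mirroring the qualitative claim that contextual priors from easy-to-detect related objects pay for the detour.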
In this paper we discuss landmark-based absolute localization of tiny autonomous mobile robots in a known environment. Landmark features are naturally occurring, as it is not allowed to modify the environment with spec...