Search results - Inner Mongolia University Library

Towards a generic diver-following algorithm: Balancing robustness and efficiency in deep visual detection

arXiv, 2018

Authors: Islam, Md Jahidul; Fulton, Michael; Sattar, Junaed. Interactive Robotics and Vision Laboratory, Department of Computer Science and Engineering, University of Minnesota-Twin Cities, United States

This paper explores the design and development of a class of robust diver-following algorithms for autonomous underwater robots. By considering the operational challenges for underwater visual tracking in diverse real-world settings, we formulate a set of desired features of a generic diver-following algorithm. We attempt to accommodate these features and maximize general tracking performance by exploiting state-of-the-art deep object detection models. We fine-tune the building blocks of these models with the goal of balancing the trade-off between robustness and efficiency in an on-board setting under real-time constraints. Subsequently, we design an architecturally simple Convolutional Neural Network (CNN)-based diver-detection model that is much faster than the state-of-the-art deep models yet provides comparable detection performance. In addition, we validate the performance and effectiveness of the proposed diver-following modules through a number of field experiments in closed-water and open-water environments. Copyright © 2018, The Authors. All rights reserved.

Keywords: Efficiency

Deep representation of industrial components using simulated images

2017 IEEE International Conference on Robotics and Automation, ICRA 2017

Authors: Kim, Seong-Heum; Choe, Gyeongmin; Ahn, Byungtae; Kweon, In So. Robotics and Computer Vision Laboratory, School of Electrical Engineering, KAIST, Daejeon, Republic of Korea

ISBN: (print) 9781509046331

In this paper, we present a visual learning framework to retrieve a 3D model and estimate its pose from a single image. To increase the quantity and quality of training data, we define our simulation space in the near infrared (NIR) band, and utilize the quasi-Monte Carlo (MC) method for scalable photorealistic rendering of manufactured components. Two types of convolutional neural network (CNN) architectures are trained over these synthetic data and a relatively small amount of real data. The first CNN model seeks the most discriminative information and uses it to classify industrial components with fine-grained shape attributes. Once a 3D model is identified, one of the category-specific CNNs is tested for pose regression in the second phase. The mixed data for learning object categories is useful in domain adaptation and attention mechanism in our system. We validate our data-driven method with 88 component models, and the experimental results are qualitatively demonstrated. Also, the CNNs trained with various conditions of mixed data are quantitatively analyzed. © 2017 IEEE.

Keywords: Monte Carlo methods

Deployment of a low cost fuzzy controller using open source embedded hardware and software tools

2018 International MultiConference of Engineers and Computer Scientists, IMECS 2018

Author: Ponguillo, Ronald A. Basic Electronics Area, Vision and Robotics Center, Escuela Superior Politécnica del Litoral (ESPOL), Faculty of Electrical and Computer Engineering, Campus Gustavo Galindo, Km 30.5 Via Perimetral, P.O. Box 09-01-5863, Guayaquil, Ecuador

ISBN: (print) 9789881404886

This paper describes the implementation of a fuzzy logic controller on a Raspberry Pi 2B+ platform. The development used Octave and C++ to implement the mathematical model of the controller. The Linux distribution used was Raspbian. Several tests were performed to assess how well a control system can run on a low-cost platform. The tests measured the response of the controller when the reference is fixed and when it is variable. The run times of the implemented algorithm and the CPU consumption of the system were also measured. The test results show that it is possible to implement this type of control using this approach, but they raise a question: is it possible to implement any kind of controller from its mathematical model on a low-cost embedded platform? The experiments showed that the computational cost grows as the controller becomes more sophisticated. The initialization time was longer than for simpler types of controllers, but in the end control was possible. © 2018 Newswood Limited. All rights reserved.

Keywords: Controllers

LoST? Appearance-invariant place recognition for opposite viewpoints using visual semantics

arXiv, 2018

Authors: Garg, Sourav; Suenderhauf, Niko; Milford, Michael. Australian Centre for Robotic Vision, Robotics and Autonomous Systems, School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane, Australia

Human visual scene understanding is so remarkable that we are able to recognize a revisited place when entering it from the opposite direction it was first visited, even in the presence of extreme variations in appearance. This capability is especially apparent during driving: a human driver can recognize where they are when travelling in the reverse direction along a route for the first time, without having to turn back and look. The difficulty of this problem exceeds any addressed in past appearance- and viewpoint-invariant visual place recognition (VPR) research, in part because large parts of the scene are not commonly observable from opposite directions. Consequently, as shown in this paper, the precision-recall performance of current state-of-the-art viewpoint- and appearance-invariant VPR techniques is orders of magnitude below what would be usable in a closed-loop system. Current engineered solutions predominantly rely on panoramic camera or LIDAR sensing setups; an eminently suitable engineering solution but one that is clearly very different to how humans navigate, which also has implications for how naturally humans could interact and communicate with the navigation system. In this paper we develop a suite of novel semantic- and appearance-based techniques to enable for the first time high performance place recognition in this challenging scenario. We first propose a novel Local Semantic Tensor (LoST) descriptor of images using the convolutional feature maps from a state-of-the-art dense semantic segmentation network. Then, to verify the spatial semantic arrangement of the top matching candidates, we develop a novel approach for mining semantically salient keypoint correspondences. On publicly available benchmark datasets that involve both 180 degree viewpoint change and extreme appearance change, we show how meaningful recall at 100% precision can be achieved using our proposed system where existing systems often fail to ever reach 100% precision. We also

Keywords: Navigation systems

Testing SPARUS II AUV, an open platform for industrial, scientific and academic applications

arXiv, 2018

Authors: Carreras, Marc; Candela, Carles; Ribas, David; Palomeras, Narcís; Magí, Lluís; Mallios, Angelos; Vidal, Eduard; Vidal, Èric; Ridao, Pere. Computer Vision and Robotics Institute, Universitat de Girona, Parc Científic i Tecnològic UdG, 17003 Girona, Spain

This paper describes the experience of preparing and testing the SPARUS II AUV in different applications. The AUV was designed as a lightweight vehicle combining the classical torpedo-shape features with hovering capability. The robot has a payload area to allow the integration of different equipment depending on the application. The software architecture is based on ROS, an open framework that allows easy integration of many devices and systems. Its flexibility, easy operation and openness make the SPARUS II AUV a multipurpose platform that can adapt to industrial, scientific and academic applications. Five units were developed in 2014, and different teams used and adapted the platform for different applications. The paper describes some of the experiences in preparing and testing this open platform for different applications. Copyright © 2018, The Authors. All rights reserved.

Keywords: Autonomous underwater vehicles

Generative 3D hand tracking with spatially constrained pose sampling

28th British Machine Vision Conference, BMVC 2017

Authors: Roditakis, Konstantinos; Makris, Alexandros; Argyros, Antonis A. Computational Vision and Robotics Laboratory, Institute of Computer Science, FORTH, Greece; Computer Science Department, University of Crete, Greece

ISBN: (print) 190172560X

We present a method for 3D hand tracking that exploits spatial constraints in the form of end effector (fingertip) locations. The method follows a generative, hypothesize-and-test approach and uses a hierarchical particle filter to track the hand. In contrast to state of the art methods that consider spatial constraints in a soft manner, the proposed approach enforces constraints during the hand pose hypothesis generation phase by sampling in the Reachable Distance Space (RDS). This sampling produces hypotheses that respect both the hands’ dynamics and the end effector locations. The data likelihood term is calculated by measuring the discrepancy between the rendered 3D model and the available observations. Experimental results on challenging, ground truth-annotated sequences containing severe hand occlusions demonstrate that the proposed approach outperforms the state of the art in hand tracking accuracy. © 2017. The copyright of this document resides with its authors.

Keywords: End effectors

Conditional affordance learning for driving in urban environments

arXiv, 2018

Authors: Sauer, Axel; Savinov, Nikolay; Geiger, Andreas. Computer Vision and Geometry Group, ETH Zürich; Chair of Robotics Science and System Intelligence, Technical University of Munich; Autonomous Vision Group, MPI for Intelligent Systems and University of Tübingen

Most existing approaches to autonomous driving fall into one of two categories: modular pipelines, which build an extensive model of the environment, and imitation learning approaches, which map images directly to control outputs. A recently proposed third paradigm, direct perception, aims to combine the advantages of both by using a neural network to learn appropriate low-dimensional intermediate representations. However, existing direct perception approaches are restricted to simple highway situations, lacking the ability to navigate intersections, stop at traffic lights or respect speed limits. In this work, we propose a direct perception approach which maps video input to intermediate representations suitable for autonomous navigation in complex urban environments given high-level directional inputs. Compared to state-of-the-art reinforcement and conditional imitation learning approaches, we achieve an improvement of up to 68% in goal-directed navigation on the challenging CARLA simulation benchmark. In addition, our approach is the first to handle traffic lights and speed signs by using image-level labels only, as well as smooth car-following, resulting in a significant reduction of traffic accidents in simulation. Copyright © 2018, The Authors. All rights reserved.

Keywords: Autonomous vehicles

Real-Time Automatic License Plate Recognition through Deep Multi-Task Networks

Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI)

Authors: Gabriel Resende Gonçalves; Matheus Alves Diniz; Rayson Laroca; David Menotti; William Robson Schwartz. Department of Computer Science, Universidade Federal de Minas Gerais, Brazil; Laboratory of Vision, Robotics and Imaging, Universidade Federal do Paraná, Brazil

With the increasing number of cameras available in cities, video traffic analysis can provide useful insights for the transportation segment. One such analysis is Automatic License Plate Recognition (ALPR). Previous approaches divided this task into several cascaded subtasks, i.e., vehicle location, license plate detection, character segmentation and optical character recognition. However, since each subtask has its own accuracy, the error propagation between subtasks is detrimental to the final accuracy. Therefore, focusing on the reduction of error propagation, we propose a technique that is able to perform ALPR using only two deep networks: the first performs license plate detection (LPD) and the second performs license plate recognition (LPR). The latter does not execute explicit character segmentation, which significantly reduces the error propagation. As these deep networks need a large number of samples to converge, we develop new data augmentation techniques that allow them to reach their full potential, as well as a new dataset to train and evaluate ALPR approaches. According to experimental results, our approach is able to achieve state-of-the-art results on the SSIG-SegPlate dataset, reaching improvements of up to 1.4 percentage points when compared to the best baseline. Furthermore, the approach is also able to perform in real time even in scenarios where many plates are present in the same frame, reaching significantly higher frame rates when compared with previously proposed approaches.

Keywords: Licenses; Detectors; Task analysis; Character recognition; Proposals; Image segmentation; Real-time systems

6D pose estimation using an improved method based on point pair features

arXiv, 2018

Authors: Vidal, Joel; Lin, Chyi-Yeu; Martí, Robert. Department of Mechanical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan; Computer Vision and Robotics Group, University of Girona, Girona, Spain

The Point Pair Feature [4] has been one of the most successful 6D pose estimation methods among model-based approaches as an efficient, integrated and compromise alternative to the traditional local and global pipelines. In recent years, several variations of the algorithm have been proposed. Among these extensions, the solution introduced by Hinterstoisser et al. [6] is a major contribution. This work presents a variation of this PPF method applied to the SIXD Challenge datasets presented at the 3rd International Workshop on Recovering 6D Object Pose held at ICCV 2017. We report an average recall of 0.77 for all datasets and overall recall of 0.82, 0.67, 0.85, 0.37, 0.97 and 0.96 for the hinterstoisser, tless, tudlight, rutgers, tejani and doumanoglou datasets, respectively. Copyright © 2018, The Authors. All rights reserved.

Keywords: Object detection
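
Note on the recall figures in the abstract above: the reported average of 0.77 appears consistent with a simple unweighted mean of the six per-dataset recalls. The short Python check below is only a sketch under that assumption (the abstract does not say how the average was weighted); the dictionary and the helper name `mean_recall` are illustrative and not taken from the paper.

```python
# Per-dataset recall values as quoted in the abstract.
recalls = {
    "hinterstoisser": 0.82,
    "tless": 0.67,
    "tudlight": 0.85,
    "rutgers": 0.37,
    "tejani": 0.97,
    "doumanoglou": 0.96,
}


def mean_recall(values):
    """Unweighted mean over datasets (an assumption, not stated in the abstract)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_recall(recalls.values())
print(f"{average:.2f}")
```

If the paper weighted datasets by number of test instances instead, the result could differ from this simple mean.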