ISBN: 9781538680940 (print)
Human-robot interaction strongly benefits from fast, predictive action recognition. For humans this is relatively easy, but it is difficult for a robot. To address this problem, we present a novel prediction algorithm for manipulation action classes in video sequences. Manipulations are first represented using the Enriched Semantic Event Chain (ESEC) framework. This creates a temporal sequence of static and dynamic spatial relations between the objects that take part in the manipulation, by which an action can be quickly recognized. We measured performance on 32 ideal as well as real manipulations and also compared our method against a state-of-the-art trajectory-based HMM method for action recognition. We observe that manipulations can be correctly predicted after only 45% (on average) of the action's total time and that our method is almost twice as fast as the HMM-based method. Finally, we demonstrate the advantage of this framework in a simple robot demonstration comparing two different approaches.
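The idea of recognizing an action before it has finished can be illustrated with a toy prefix-matching sketch. This is not the ESEC algorithm from the paper: the relation labels, the three action models and the function names are hypothetical, chosen only to show how a growing sequence of relation events can single out one action class early.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical action models: each action is a chain of spatial-relation events.
# The labels ("N" not touching, "T" touching, "A" above) are illustrative only.
ACTION_MODELS: Dict[str, List[str]] = {
    "push": ["N", "T", "T", "N"],
    "pick": ["N", "T", "A", "A"],
    "put": ["A", "A", "T", "N"],
}


def candidates(prefix: Sequence[str], models: Dict[str, List[str]]) -> List[str]:
    """Return the actions whose event chain starts with the observed prefix."""
    return [name for name, chain in models.items()
            if chain[:len(prefix)] == list(prefix)]


def predict_early(observed: Sequence[str],
                  models: Dict[str, List[str]] = ACTION_MODELS
                  ) -> Optional[Tuple[str, float]]:
    """Return (action, fraction of its chain observed) at the first point where
    exactly one model is consistent with the events seen so far, else None."""
    for k in range(1, len(observed) + 1):
        remaining = candidates(observed[:k], models)
        if len(remaining) == 1:
            name = remaining[0]
            return name, k / len(models[name])
    return None
```

With these toy models, `predict_early(["N", "T", "A", "A"])` commits to one action after the third event, because the first two events are shared with another model.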
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
In this study, an embedded Pan-Tilt-Zoom (PTZ) tracker system design based on the NVIDIA Tegra K1 and X1 mobile GPU platforms is proposed. For this purpose, state-of-the-art correlation filter (CF) based video object tracking (VOT) algorithms are exploited for their high performance. Each algorithmic step is carefully implemented on the GPU, which further increases efficiency and decreases execution times. The PTZ control is designed to track human targets, which have limited speed but obvious appearance changes, by keeping them centered in the image. Incorporating the on-board decode and encode capability of the Tegra platform as well as angular position control, the presented approach enables target tracking at 50 and 100 fps for HD (1920x1080) videos on K1 and X1, respectively. To the best of our knowledge, this is the first efficient implementation of CF trackers on a mobile GPU platform that uses multiple features together with scale and background adaptation. This study extends the scope of accuracy-focused VOT research to platform-optimized, efficient implementations for real-time high-resolution video tracking.
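Keeping a target centered amounts to converting the tracker's bounding box into pan and tilt angle errors. The sketch below shows one common way to do this with a pinhole camera model. The 60 degree horizontal field of view, the proportional gain and the function names are assumptions for illustration, not values or interfaces from the paper.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def ptz_error_deg(box: Box,
                  image_size: Tuple[int, int] = (1920, 1080),
                  hfov_deg: float = 60.0) -> Tuple[float, float]:
    """Return the (pan, tilt) offset in degrees between the box centre and the
    image centre. hfov_deg is an assumed horizontal field of view."""
    width, height = image_size
    cx = box[0] + box[2] / 2.0
    cy = box[1] + box[3] / 2.0
    # Focal length in pixels derived from the horizontal field of view.
    focal_px = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    pan = math.degrees(math.atan2(cx - width / 2.0, focal_px))
    # Image rows grow downwards, so a target above the centre needs positive tilt.
    tilt = math.degrees(math.atan2(height / 2.0 - cy, focal_px))
    return pan, tilt


def ptz_command(box: Box, gain: float = 0.5, **camera) -> Tuple[float, float]:
    """Proportional step towards the target; gain is an assumed tuning value."""
    pan, tilt = ptz_error_deg(box, **camera)
    return gain * pan, gain * tilt
```

A gain below 1 moves the camera only part of the way per frame, which is a simple way to avoid overshoot when the target itself moves slowly.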
ISBN: 9781510617605 (print)
Night vision imaging in the Short-Wave Infra-Red (SWIR) has some unique advantages over visible, Near Infra-Red (NIR) or thermal imaging. It benefits from relatively high irradiance levels and intuitive reflective imaging. InGaAs/InP is the leading technology for two-dimensional (2D) SWIR detector arrays, offering low dark current, high efficiency and excellent uniformity. SCD's SWIR imager is a low Size, Weight and Power (SWaP) video engine based on a low-noise 640x512/15 µm InGaAs Focal Plane Array (FPA) embedded in a low-cost plastic package which includes a Thermo-Electric Cooler (TEC). The SWIR imager measures 31x31x32 mm³, weighs 50 g and consumes less than 1.4 W (excluding the TEC). It supports conventional video formats, such as Camera Link and BT.656. The video engine's image processing algorithms include Non-Uniformity Correction (NUC), Auto Exposure Control (AEC), Auto Gain Control (AGC), Dynamic Range Compression (DRC) and de-noising. The algorithms are specifically optimized for Low Light Level (LLL) conditions, enabling imaging from sub-mlux to 100 klux light levels. In this work we review the optimized video engine LLL architecture, its electro-optical performance and its applicability to night vision systems.
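Of the processing steps listed above, Non-Uniformity Correction is the easiest to illustrate. The sketch below shows the generic two-point NUC technique (per-pixel gain and offset from two uniformly illuminated calibration frames). It is an assumption that something of this kind is involved; the abstract does not say how SCD's video engine implements NUC, and the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]


def two_point_nuc(dark: Frame, bright: Frame) -> Tuple[Frame, Frame]:
    """Compute per-pixel gain and offset so that both calibration frames map
    onto their own spatial mean. Assumes every pixel responds (bright != dark);
    dead pixels would need separate handling."""
    count = len(dark) * len(dark[0])
    mean_dark = sum(map(sum, dark)) / count
    mean_bright = sum(map(sum, bright)) / count
    gains: Frame = []
    offsets: Frame = []
    for dark_row, bright_row in zip(dark, bright):
        gain_row = [(mean_bright - mean_dark) / (b - d)
                    for d, b in zip(dark_row, bright_row)]
        offset_row = [mean_dark - g * d for g, d in zip(gain_row, dark_row)]
        gains.append(gain_row)
        offsets.append(offset_row)
    return gains, offsets


def apply_nuc(frame: Frame, gains: Frame, offsets: Frame) -> Frame:
    """Apply corrected = gain * raw + offset to every pixel."""
    return [[g * p + o for p, g, o in zip(row, gain_row, offset_row)]
            for row, gain_row, offset_row in zip(frame, gains, offsets)]
```

After calibration, applying the correction to either calibration frame yields a flat image, which is the property the two-point method is built to guarantee.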
New applications related to robotic manipulation or transportation tasks, with or without physical grasping, are continuously being developed. To perform these activities, the robot takes advantage of different kinds of perception. One of the key perceptions in robotics is vision. However, some problems related to image processing make it difficult to use visual information within robot control algorithms. Camera-based systems have inherent errors that affect the quality and reliability of the information obtained. The need to correct image distortion slows down the computation of image parameters, which decreases the performance of control algorithms. In this paper, a new approach to correcting several sources of visual distortion in images in only one computing step is proposed. The goal of this system/algorithm is to compute the tilt angle of an object transported by a robot, minimizing inherent image errors and increasing computing speed. After capturing the image, the computer system extracts the angle using a Fuzzy filter that corrects all possible distortions at the same time, obtaining the real angle in only one processing step. This filter has been developed by means of Neuro-Fuzzy learning techniques, using datasets with information obtained from real experiments. In this way, the computing time has been decreased and the performance of the application has been improved. The resulting algorithm has been tested experimentally in robot transportation tasks on the humanoid robot TEO (Task Environment Operator) from the University Carlos III of Madrid.
ISBN: 9781538680940 (print)
In this paper, we propose a novel real-time method for tracking planar edge templates. This method tracks an edge template by estimating its homography transformations with respect to the sampled edge pixels detected in the incoming frames. In particular, we define a cost function based on a new feature map of the to-be-tracked edge template and optimize it with a Lucas-Kanade-like algorithm. The feature map is defined as the fourth root of the distance transform. Our method operates on edges only, so it is good at tracking low-textured targets, such as hollow targets (mug rim), thin targets (cable, ring) and non-Lambertian objects (disc). We validate our method and compare it with four other methods on five newly collected real-world video sequences. Our method achieves the lowest overall average error (1.58 pixels) and also outperforms the others in terms of success rate. The per-frame processing time of about 30 ms shows that our method is suitable for real-time applications. The code and dataset are publicly available at: http://***/~xuebin/.
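The feature map described above, the fourth root of the distance transform, can be sketched on a small binary edge image. The brute-force nearest-edge search and the function name are assumptions made for brevity; a real-time tracker would use a linear-time distance transform, and the abstract does not specify which one the authors use.

```python
import math
from typing import List


def fourth_root_distance_map(edges: List[List[int]]) -> List[List[float]]:
    """For a binary edge image (1 = edge pixel), return a map whose value at
    each pixel is the fourth root of the Euclidean distance to the nearest
    edge pixel. Brute force: suitable only for small illustrative images."""
    edge_points = [(r, c)
                   for r, row in enumerate(edges)
                   for c, value in enumerate(row) if value]
    if not edge_points:
        raise ValueError("the edge image contains no edge pixels")
    return [[min(math.hypot(r - er, c - ec) for er, ec in edge_points) ** 0.25
             for c in range(len(row))]
            for r, row in enumerate(edges)]
```

Taking the fourth root compresses large distances: a pixel 16 units from the nearest edge gets the value 2, only twice that of a pixel 1 unit away, so the map stays steep close to the edge and flattens out far from it.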
Automatic visual pattern recognition is a complex and highly researched area of image processing. This research aims to study various pattern recognition algorithms; cloth pattern recognition is presented as research pr...
Super-resolution reconstruction (SRR) aims to increase spatial resolution given a single image or multiple images of the same scene. The existing methods are underpinned by the premise that the observed ...
ISBN: 9783030000639; 9783030000622 (print)
Algorithms are powerful and necessary tools behind a large part of the information we use every day. However, they may introduce new sources of bias, discrimination and other unfair practices that affect people who are unaware of them. Greater algorithm transparency is indispensable for providing more credible and reliable services. Moreover, requiring developers to design transparent algorithm-driven applications allows them to keep the model accessible and human-understandable, increasing the trust of end users. In this paper we present EBANO, a new engine able to produce prediction-local explanations for a black-box model by exploiting interpretable feature perturbations. EBANO uses the hypercolumn representation together with cluster analysis to identify a set of interpretable features of images. Furthermore, two indices are proposed to measure the influence of input features on the final prediction made by a CNN model. EBANO has been preliminarily tested on a set of heterogeneous images. The results highlight the effectiveness of EBANO in explaining the CNN classification through the evaluation of the influence of interpretable features.
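The general principle of perturbation-based explanation can be sketched as follows: mask one interpretable feature at a time and record how much the predicted probability of the target class drops. This is a generic illustration, not EBANO's two indices (which the abstract does not define). The flat-list image, the toy model, the segment names and the fill value are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], List[float]]  # pixels -> class probabilities


def perturbation_influence(model: Model,
                           image: Sequence[float],
                           segments: Dict[str, List[int]],
                           target: int,
                           fill: float = 0.0) -> Dict[str, float]:
    """For each segment (a named group of pixel indices), return the drop in
    the target-class probability when that segment is replaced by `fill`."""
    base = model(image)[target]
    scores: Dict[str, float] = {}
    for name, indices in segments.items():
        perturbed = list(image)
        for i in indices:
            perturbed[i] = fill
        scores[name] = base - model(perturbed)[target]
    return scores


def toy_model(pixels: Sequence[float]) -> List[float]:
    """Hypothetical black box: class 1 probability is the mean of pixels 0 and 1."""
    p = (pixels[0] + pixels[1]) / 2.0
    return [1.0 - p, p]
```

For the toy model, masking the segment that covers pixels 0 and 1 removes all evidence for class 1, while masking any other segment changes nothing, so the scores separate influential from irrelevant features.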
I consider a number of methods of automatic quadratic features adjustment for digital textural images of biological tissues in order to improve the quality of classification. The proposed approaches are based on optim...
ISBN: 9781728144849 (digital)
ISBN: 9781728144856 (print)
Hyperspectral image registration is a relevant task for real-time applications such as environmental disaster management or search and rescue scenarios. Traditional algorithms for this problem were not designed for real-time performance. The HYFMGPU algorithm arose as a high-performance GPU-based solution to fill this gap. Nevertheless, a single-GPU solution is not enough, as sensors are evolving and generating images with finer resolutions and wider wavelength ranges. An MPI+CUDA multi-GPU implementation of HYFMGPU was previously presented. However, this solution exposes the programming complexity of combining MPI with an accelerator programming model. In this paper we present a new and more abstract programming approach for this type of application, which provides high efficiency while simplifying the programming of the multi-device parts of the code. The solution uses Hitmap, a library that eases the programming of parallel applications based on distributed arrays. It uses a more algorithm-oriented approach than MPI, including abstractions for the automatic partition and mapping of arrays at runtime with arbitrary granularity, as well as techniques to build flexible communication patterns that transparently adapt to the data partitions. We show how these abstractions apply to this application class. We present a comparison of development effort metrics between the original MPI implementation and the one based on Hitmap, with reductions of up to 95% in the Halstead score for specific work redistribution steps. Finally, we present experimental results showing that these abstractions are internally implemented in a highly efficient way that can reduce the overall execution time by up to 37% compared with the original MPI implementation.
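The kind of bookkeeping that an automatic array partition removes from application code can be sketched as a balanced block partition of an index range across devices. This is plain Python for illustration only; it is not Hitmap's API (Hitmap is a C library), and the function name is hypothetical.

```python
from typing import List, Tuple


def block_partition(length: int, parts: int) -> List[Tuple[int, int]]:
    """Split the index range [0, length) into `parts` contiguous half-open
    blocks whose sizes differ by at most one element."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(length, parts)
    ranges: List[Tuple[int, int]] = []
    start = 0
    for part in range(parts):
        # The first `extra` parts take one additional element each.
        size = base + (1 if part < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

In a hand-written MPI+CUDA code, each process would compute such ranges itself and derive its communication partners from them; a partition abstraction computes them at runtime so that communication patterns can adapt to the data layout.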