ISBN: 9781467388504 (print)
The proceedings contain 192 papers. The topics discussed include: feature vector compression based on least error quantization; a comprehensive analysis of deep learning based representation for face recognition; two-stream CNNs for gesture-based verification and identification: learning user style; CALIPER: continuous authentication layered with integrated PKI encoding recognition; latent fingerprint image segmentation using fractal dimension features and weighted extreme learning machine ensemble; a comparison of human and automated face verification accuracy on unconstrained image sets; offline signature verification based on bag-of-visual-words model using KAZE features and weighting schemes; implementation of fixed-length template protection based on homomorphic encryption with application to signature biometrics; seeing the forest from the trees: a holistic approach to near-infrared heterogeneous face recognition; a novel visualization tool for evaluating the accuracy of 3D sensing and reconstruction algorithms for automatic dormant pruning applications; a pointing gesture based egocentric interaction system: dataset, approach and application; multimodal multi-stream deep learning for egocentric activity recognition; sparse kernel machines for discontinuous registration and nonstationary regularization; and accurate small deformation exponential approximant to integrate large velocity fields: application to image registration.
ISBN: 9781467388504 (print)
The proceedings contain 640 papers. The topics discussed include: generation and comprehension of unambiguous object description; image question answering using convolutional neural network with dynamic parameter prediction; neural module networks; learning to assign orientations to feature points; affinity CNN: learning pixel-centric pairwise relations for figure/ground embedding; occlusion boundary detection via deep exploration of context; exploit bounding box annotations for multi-label object recognition; MCMC shape sampling for image segmentation with nonparametric shape priors; learning action maps of large environments via first-person vision; sample and filter: nonparametric scene parsing via efficient filtering; training region-based object detectors with online hard example mining; learning with side information through modality hallucination; HyperNet: towards accurate region proposal generation and joint object detection; macroscopic interferometry: rethinking depth estimation with frequency-domain time-of-flight; ASP vision: optically computing the first layer of convolutional neural networks using angle sensitive pixels; hierarchical recurrent neural encoder for video representation with application to captioning; and from keyframes to key objects: video summarization by representative object proposal selection.
ISBN: 9781509014378 (print)
This paper describes the datasets and computer vision challenges that form part of the PETS 2016 workshop. PETS 2016 addresses the application of on-board multi-sensor surveillance for the protection of mobile critical assets. The sensors (visible and thermal cameras) are mounted on the asset itself, and surveillance is performed around the asset. Two datasets are provided: (1) a multi-sensor dataset, as used for the PETS 2014 challenge, which addresses the protection of trucks (the ARENA Dataset); and (2) a new dataset, the IPATCH Dataset, addressing the application of multi-sensor surveillance to protect a vessel at sea from piracy. The dataset specifically addresses several vision challenges set in the PETS 2016 workshop, which correspond to different steps in a video understanding system: Low-Level Video Analysis (object detection and tracking), Mid-Level Video Analysis ('simple' event detection: the behaviour recognition of a single actor) and High-Level Video Analysis ('complex' event detection: the behaviour and interaction recognition of several actors).
ISBN: 9781509014378 (print)
We propose a two-level system for apparent age estimation from facial images. Our system first classifies samples into overlapping age groups. Within each group, the apparent age is estimated with local regressors, whose outputs are then fused for the final estimate. We use a deformable parts model based face detector, and features from a pre-trained deep convolutional network. Kernel extreme learning machines are used for classification. We evaluate our system on the ChaLearn Looking at People 2016 - Apparent Age Estimation challenge dataset, and report a normal score of 0.3740 on the sequestered test set.
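The fusion step of the two-level scheme above can be illustrated with a minimal sketch. The abstract does not say how the local regressor outputs are fused, so the score-weighted average below is an assumption, and the function name, group scores, and age values are hypothetical.

```python
from typing import Sequence


def fuse_local_estimates(group_scores: Sequence[float],
                         local_estimates: Sequence[float]) -> float:
    """Fuse per-group age estimates, weighting each by its group score.

    Assumption: score-weighted averaging. The abstract only says that the
    local regressor outputs are fused, not which rule is used.
    """
    if len(group_scores) != len(local_estimates):
        raise ValueError("need one estimate per age group")
    total = sum(group_scores)
    if total <= 0:
        raise ValueError("at least one group score must be positive")
    weighted = sum(s * e for s, e in zip(group_scores, local_estimates))
    return weighted / total


# Hypothetical classifier scores for four overlapping age groups, and the
# age predicted by each group's local regressor (illustrative numbers only).
scores = [0.0, 0.75, 0.25, 0.0]
estimates = [10.0, 30.0, 40.0, 70.0]
fused_age = fuse_local_estimates(scores, estimates)
```

Because the groups overlap, a sample near a group boundary receives non-zero scores from two neighbouring groups, and the fused value falls between their local estimates.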
ISBN: 9781509014378 (print)
Perceiving distance from two camera images, a task called stereo vision, is fundamental for many applications in robotics or automation. However, algorithms that compute this information at high accuracy have a high computational complexity. One such algorithm, Semi-Global Matching (SGM), performs well in many stereo vision benchmarks, while maintaining a manageable computational complexity. Nevertheless, CPU and GPU implementations of this algorithm often fail to achieve real-time processing of camera images, especially in power-constrained embedded environments. This work presents a novel architecture to calculate disparities through SGM. The proposed architecture is highly scalable and applicable for low-power embedded as well as high-performance multi-camera high-resolution applications.
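To make the computational load of SGM concrete, here is a minimal sketch of the standard SGM cost aggregation along a single path (one scanline, left to right). It is a plain software illustration of the textbook recurrence, not the hardware architecture proposed in the paper; the penalty values `p1` and `p2` and the toy cost volume are assumptions.

```python
from typing import List


def aggregate_path(cost: List[List[float]],
                   p1: float = 1.0, p2: float = 4.0) -> List[List[float]]:
    """Aggregate matching costs along one scanline.

    cost[x][d] is the matching cost of pixel x at disparity d. A disparity
    change of 1 between neighbours is penalised by p1, larger jumps by p2.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor on the path
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            best = prev[d]                          # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)  # disparity change of 1
            if d < n_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)         # larger jump
            # Subtracting prev_min keeps the values from growing along the path.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


def winner_takes_all(agg: List[List[float]]) -> List[int]:
    """Pick the disparity with the lowest aggregated cost at each pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in agg]


# Toy cost volume: the middle pixel is ambiguous on its own (all costs equal).
toy_cost = [[0, 9, 9], [4, 4, 4], [0, 9, 9]]
toy_agg = aggregate_path(toy_cost)
toy_disparities = winner_takes_all(toy_agg)
```

A full SGM implementation sums such aggregated costs over several path directions (commonly 8 or 16) for every pixel and disparity, which is the part that makes real-time operation demanding.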
ISBN: 9781509014378 (print)
In this paper, we present a distributed embedded vision system that enables surround scene analysis and vehicle threat estimation. The proposed system analyzes the surroundings of the ego-vehicle using four cameras, each connected to a separate embedded processor. Each processor runs a set of optimized vision-based techniques to detect surrounding vehicles, so that the entire system operates at real-time speeds. This setup has been demonstrated on multiple vehicle testbeds with high levels of robustness under real-world driving conditions and is scalable to additional cameras. Finally, we present a detailed evaluation which shows over 95% accuracy and operation at nearly 15 frames per second.
ISBN: 9781509014378 (print)
We describe an FPGA-based on-board control system for autonomous orientation of an aerial robot to assist aerial manipulation tasks. The system is able to apply yaw control to help an operator precisely position a drone when it is near a bar-like object. This is achieved by applying a parallel Hough transform enhanced with a novel image space separation method, enabling highly reliable results in various circumstances combined with high performance. The feasibility of this approach is shown by applying the system to a multi-rotor aerial robot equipped with an upward-directed robotic hand on top of the airframe, developed for high-altitude manipulation tasks. In order to grasp a bar-like object, the orientation of the bar is observed from the image data obtained by a monocular camera mounted on the robot. This data is then analyzed by the on-board FPGA system to control the yaw angle of the aerial robot. In experiments, reliable yaw-orientation control of the aerial robot is achieved.
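As an illustration of how a Hough transform yields the orientation of a bar-like object, the sketch below votes over (angle, distance) bins for a set of edge points and returns the direction of the strongest line. It is a plain sequential version under assumed bin sizes; it does not reproduce the paper's parallel FPGA design or its image space separation method, and the function name is hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Tuple


def dominant_line_direction(points: Iterable[Tuple[float, float]],
                            n_angles: int = 180,
                            rho_step: float = 1.0) -> float:
    """Return the direction (degrees, in [0, 180)) of the strongest line.

    Each point votes for every line x*cos(theta) + y*sin(theta) = rho that
    passes through it; theta is the angle of the line's normal.
    """
    votes: Counter = Counter()
    for x, y in points:
        for i in range(n_angles):
            theta = math.pi * i / n_angles
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(i, round(rho / rho_step))] += 1
    (best_i, _), _ = max(votes.items(), key=lambda kv: kv[1])
    normal_deg = 180.0 * best_i / n_angles
    # The line itself is perpendicular to its normal.
    return (normal_deg + 90.0) % 180.0


# Synthetic edge points: a horizontal bar and a vertical bar.
horizontal_bar = [(float(x), 5.0) for x in range(100)]
vertical_bar = [(7.0, float(y)) for y in range(100)]
horizontal_direction = dominant_line_direction(horizontal_bar)
vertical_direction = dominant_line_direction(vertical_bar)
```

A yaw controller could then use the difference between the detected direction and the desired gripper alignment as its error signal; that mapping is not specified in the abstract.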
ISBN: 9781509014378 (print)
Person-independent and pose-invariant estimation of eye-gaze is important for situation analysis and for automated video annotation. We propose a fast cascade regression based method that first estimates the location of a dense set of markers and their visibility, then reconstructs face shape by fitting a part-based 3D model. Next, the reconstructed 3D shape is used to estimate a canonical view of the eyes for 3D gaze estimation. The model operates in a feature space that naturally encodes local ordinal properties of pixel intensities, leading to photometric invariant estimation of gaze. To evaluate the algorithm in comparison with alternative approaches, three publicly available databases were used: the Boston University Head Tracking, Multi-View Gaze, and CAVE Gaze datasets. Precision for head pose and gaze averaged 4 degrees or less for pitch, yaw, and roll. The algorithm outperformed alternative methods in both datasets.
ISBN: 9781509014378 (print)
Background subtraction is a basic problem for change detection in videos and also the first step of high-level computer vision applications. Most background subtraction methods rely on color and texture features. However, due to illumination changes in different scenes and the effects of noisy pixels, those methods often produce many false positives in complex environments. To solve this problem, we propose an adaptive background subtraction model which uses a novel Local SVD Binary Pattern (LSBP) feature instead of simply depending on color intensity. This feature can describe the potential structure of the local regions in a given image, and thus enhances robustness to illumination variation, noise, and shadows. We use a sample consensus model which is well suited to our LSBP feature. Experimental results on the CDnet 2012 dataset demonstrate that our background subtraction method using the LSBP feature is more effective than many state-of-the-art methods.
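The sample consensus idea mentioned above can be sketched generically: each pixel keeps a set of past descriptor samples, and a new observation counts as background if enough samples are close to it. The sketch assumes binary descriptors compared by Hamming distance; the thresholds, the function names, and the 8-bit example patterns are hypothetical, and the LSBP feature computation itself is not shown.

```python
from typing import Iterable


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def is_background(descriptor: int, samples: Iterable[int],
                  max_dist: int = 2, min_matches: int = 2) -> bool:
    """Sample consensus test for one pixel.

    The pixel is labelled background when at least `min_matches` stored
    samples lie within `max_dist` bits of the observed descriptor.
    Both thresholds are illustrative assumptions.
    """
    matches = 0
    for sample in samples:
        if hamming(descriptor, sample) <= max_dist:
            matches += 1
            if matches >= min_matches:
                return True  # enough agreement, stop early
    return False


# Hypothetical background model for one pixel (8-bit binary patterns).
pixel_samples = [0b10110010, 0b10110011, 0b01001100]
similar_is_bg = is_background(0b10110000, pixel_samples)
different_is_bg = is_background(0b01001111, pixel_samples)
```

Here the first observation is within 2 bits of two stored samples and is accepted as background, while the second matches only one sample and would be labelled foreground.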