ISBN: 0818672587 (print)
This paper presents a prediction-and-verification segmentation scheme using attention images from multiple fixations. A major advantage of this scheme is that it can handle a large number of different deformable objects presented in complex backgrounds. The scheme is also relatively efficient, since the segmentation is guided by past knowledge through a prediction-and-verification scheme. The system has been tested on segmenting hands in sequences of intensity images, where each sequence represents a hand sign. Experimental results showed a 95% correct segmentation rate with a 3% false rejection rate.
ISBN: 0818672587 (print)
We present a surface radiance model for diffuse lighting that incorporates shadows, interreflections, and surface orientation. We show that, for smooth surfaces, the model is an excellent approximation of the radiosity equation. We present a new data structure and algorithm that uses this model to compute shape-from-shading under diffuse lighting. The algorithm was tested on both synthetic and real images, and performs more accurately than the only previous algorithm for this problem. Various causes of error are discussed, including approximation errors in image modelling, poor local constraints at the image boundary, and ill-conditioning of the problem itself.
ISBN: 0780342364 (print)
The purpose of this study is not only to recognize facial expressions associated with human emotions but also to estimate their degree. Our method is based on the idea that facial expression recognition can be achieved by extracting the variation from an expressionless face while treating the face area as a whole pattern. To extract subtle changes in the face, such as the degree of an expression, it is necessary to eliminate the individuality that appears in the facial image. Using an elastic net model, a variation in facial expression is represented as motion vectors of the deformed net obtained from a facial edge image. Then, applying the K-L expansion, the change in facial expression, represented as the motion vectors of the nodes, is mapped into a low-dimensional eigenspace, and estimation is achieved by projecting input images onto the Emotion Space. In this paper we have constructed three kinds of expression models (happiness, anger, and surprise), and experimental results are evaluated.
ISBN: 9781467388511 (print)
Recognizing an action from a sequence of 3D skeletal poses is a challenging task. First, different actors may perform the same action in various styles. Second, the estimated poses are sometimes inaccurate. These challenges can cause large variations between instances of the same class. Third, the datasets are usually small, with only a few actors performing a few repetitions of each action. Hence, training complex classifiers risks over-fitting the data. We address this task by mining a set of key-pose-motifs for each action class. A key-pose-motif contains a set of ordered poses, which are required to be close but not necessarily adjacent in the action sequences. The representation is robust to style variations. The key-pose-motifs are represented in terms of a dictionary using soft-quantization to deal with inaccuracies caused by quantization. We propose an efficient algorithm to mine key-pose-motifs that takes these probabilities into account. We classify a sequence by matching it to the motifs of each class and selecting the class that maximizes the matching score. This simple classifier obtains state-of-the-art performance on two benchmark datasets.
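The matching step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the scoring rule (product of soft-assignment probabilities), the `max_gap` parameter, and the names `motif_score` and `classify` are assumptions chosen only to show ordered, non-adjacent matching followed by a maximum-score decision. Motif mining itself is not shown.

```python
from typing import Dict, List, Sequence


def motif_score(probs: Sequence[Sequence[float]],
                motif: Sequence[int],
                max_gap: int = 2) -> float:
    """Best soft match of an ordered motif inside one pose sequence.

    probs[t][k] is the (assumed) soft-quantization probability that frame t
    belongs to dictionary pose k. Consecutive motif poses must occur in
    order and at most max_gap frames apart, but need not be adjacent.
    """
    n = len(probs)
    # best[t]: best score of the motif prefix that ends exactly at frame t
    best = [probs[t][motif[0]] for t in range(n)]
    for pose in motif[1:]:
        nxt = [0.0] * n
        for t in range(n):
            lo = max(0, t - max_gap)
            prev = max(best[lo:t], default=0.0)  # earlier frames within the gap
            nxt[t] = prev * probs[t][pose]
        best = nxt
    return max(best, default=0.0)


def classify(probs: Sequence[Sequence[float]],
             motifs_by_class: Dict[str, List[List[int]]],
             max_gap: int = 2) -> str:
    """Pick the class whose motifs give the highest total matching score."""
    scores = {
        label: sum(motif_score(probs, m, max_gap) for m in motifs)
        for label, motifs in motifs_by_class.items()
    }
    return max(scores, key=scores.get)
```

With `max_gap=1` the motif poses must be adjacent frames; larger values tolerate intermediate frames, which is one simple way to absorb style variation between actors.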
ISBN: 9781665469463 (digital)
ISBN: 9781665469463 (print)
In contrast to generic objects, aerial targets are often non-axis-aligned, with arbitrary orientations and cluttered surroundings. Unlike mainstream approaches that regress bounding box orientations, this paper proposes an effective adaptive points learning approach to aerial object detection that takes advantage of the adaptive points representation, which is able to capture the geometric information of arbitrary-oriented instances. To this end, three oriented conversion functions are presented to facilitate classification and localization with accurate orientation. Moreover, we propose an effective quality assessment and sample assignment scheme for adaptive points learning that selects representative oriented reppoints samples during training and is able to capture the non-axis-aligned features from adjacent objects or background noise. A spatial constraint is introduced to penalize outlier points for robust adaptive learning. Experimental results on four challenging aerial datasets, including DOTA, HRSC2016, UCAS-AOD and DIOR-R, demonstrate the efficacy of our proposed approach. The source code is available at: https://github.com/LiWentomng/OrientedRepPoints.
ISBN: 9781509014378 (print)
Video surveillance systems generated about 65% of the Universe Big Data in 2015. The development of systems for the intelligent analysis of such a large amount of data is among the most investigated topics in academia and the commercial world. Recent outcomes in knowledge management and computational intelligence demonstrate the effectiveness of semantic technologies in several fields, such as image and text analysis, handwriting recognition, and speech recognition. This paper presents a solution that, starting from the output of a people tracking algorithm, is able to recognize simple events (a person falling to the ground) and complex ones (an aggression against a person). The proposed solution uses semantic web technologies to automatically annotate the output produced by the tracking algorithm; a set of rules for reasoning on these annotated data is also proposed. Such rules make it possible to define complex analytics functions, demonstrating the effectiveness of hybrid approaches for event recognition.
ISBN: 9781538664209 (print)
Most human activity analysis works (i.e., recognition or prediction) focus on only a single granularity: either modelling global motion based on coarse-level movement such as human trajectories, or forecasting future detailed action based on the movement of body parts such as skeleton motion. In contrast, in this work we propose a multi-granularity interaction prediction network which integrates both global motion and detailed local action. Built on a bidirectional LSTM network, the proposed method has links between granularities which encourage feature sharing as well as cross-feature consistency between the global and local granularities (e.g., trajectory or local action), and in turn predicts the long-term global location and local dynamics of each individual. We validate our method on several public datasets with promising performance.
ISBN: 0818672587 (print)
In order to reduce false alarms and to improve the target detection performance of an automatic target detection and recognition system operating in a cluttered environment, it is important to develop models not only of man-made targets but also of natural background clutter. Because of the high complexity of natural clutter, this clutter model can only be reliably built through learning from real examples. If available, contextual information that characterizes each training example can be used to further improve the learned clutter model. In this paper, we present such a clutter-model-aided target detection system. Emphasis is placed on two topics: (1) learning the background clutter model from sensory data through a self-organizing process, and (2) reinforcing the learned clutter model using contextual information.
ISBN: 9781665445092 (print)
Recently, object detection in aerial images has gained much attention in computer vision. Unlike objects in natural images, aerial objects are often distributed with arbitrary orientation. Therefore, the detector requires more parameters to encode the orientation information, which are often highly redundant and inefficient. Moreover, as ordinary CNNs do not explicitly model the orientation variation, large amounts of rotation-augmented data are needed to train an accurate object detector. In this paper, we propose a Rotation-equivariant Detector (ReDet) to address these issues, which explicitly encodes rotation equivariance and rotation invariance. More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a huge reduction of model size. Based on the rotation-equivariant features, we also present Rotation-invariant RoI Align (RiRoI Align), which adaptively extracts rotation-invariant features from equivariant features according to the orientation of the RoI. Extensive experiments on several challenging aerial image datasets, DOTA-v1.0, DOTA-v1.5 and HRSC2016, show that our method can achieve state-of-the-art performance on the task of aerial object detection. Compared with previous best results, our ReDet gains 1.2, 3.5 and 2.6 mAP on DOTA-v1.0, DOTA-v1.5 and HRSC2016 respectively while reducing the number of parameters by 60% (313 Mb vs. 121 Mb).
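The two properties named in this abstract can be illustrated on a toy case. The sketch below is not ReDet or RiRoI Align; it assumes only four orientations (quarter turns), a single hand-written kernel, and hypothetical helper names (`lift_c4`, `align`). It shows that correlating an image with four rotated copies of a kernel gives features that rotate and cyclically shift when the input rotates (equivariance), and that undoing the shift and rotation for a known orientation recovers the original features (the idea behind extracting invariant features from equivariant ones).

```python
from typing import List

Grid = List[List[int]]


def rot90(m: Grid) -> Grid:
    """Rotate a 2-D grid by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*m)][::-1]


def correlate(x: Grid, k: Grid) -> Grid:
    """Valid cross-correlation of image x with kernel k."""
    kh, kw = len(k), len(k[0])
    return [
        [sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
         for j in range(len(x[0]) - kw + 1)]
        for i in range(len(x) - kh + 1)
    ]


def lift_c4(x: Grid, k: Grid) -> List[Grid]:
    """One response map per kernel orientation (0, 90, 180, 270 degrees)."""
    maps = []
    for _ in range(4):
        maps.append(correlate(x, k))
        k = rot90(k)
    return maps


def align(maps: List[Grid], r: int) -> List[Grid]:
    """Undo r quarter turns of the input on a stack of orientation maps."""
    r %= 4
    shifted = maps[r:] + maps[:r]  # channel c takes the old channel c + r
    out = []
    for m in shifted:
        for _ in range((4 - r) % 4):  # 4 - r more turns rotate back by r
            m = rot90(m)
        out.append(m)
    return out
```

For an image `x` rotated once with `rot90`, `lift_c4(rot90(x), k)` equals the maps of `lift_c4(x, k)` each rotated once and shifted by one orientation channel, and `align(lift_c4(rot90(x), k), 1)` returns the maps of the unrotated image.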
ISBN: 0780342364 (print)
We describe an approach to the classification of 3-D objects using a multi-scale representation. This approach starts with a smoothing algorithm for representing objects at different scales. Smoothing is applied in curvature space directly, thus avoiding the usual shrinkage problems and allowing for efficient implementations. A 3-D similarity measure that integrates the representations of the objects at multiple scales is introduced. Given a library of models, objects that are similar based on this multi-scale measure are grouped together into classes. The objects that are in the same class are combined into a single prototype object. Finally, the prototypes are used for hierarchical recognition by first comparing the scene representation to the prototypes and then matching it only to the objects in the most likely class rather than to the entire library of models. Beyond its application to object recognition, this approach provides an attractive implementation of the intuitive notions of scale and approximate similarity for 3-D shapes.
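The two-stage lookup described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: each object is assumed to be summarized by one feature vector per scale, the similarity measure is replaced by a weighted sum of per-scale Euclidean distances, prototypes are formed by simple averaging, and the class grouping is taken as given. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Descriptor = List[List[float]]  # one feature vector per scale (assumed form)


def multiscale_distance(a: Descriptor, b: Descriptor,
                        weights: Optional[List[float]] = None) -> float:
    """Weighted sum of per-scale Euclidean distances (stand-in measure)."""
    if weights is None:
        weights = [1.0] * len(a)
    total = 0.0
    for w, fa, fb in zip(weights, a, b):
        total += w * sum((p - q) ** 2 for p, q in zip(fa, fb)) ** 0.5
    return total


def make_prototype(members: List[Descriptor]) -> Descriptor:
    """Combine the objects of one class by averaging each scale's vector."""
    n = len(members)
    return [
        [sum(m[s][i] for m in members) / n for i in range(len(members[0][s]))]
        for s in range(len(members[0]))
    ]


def recognize(scene: Descriptor,
              classes: Dict[str, Dict[str, Descriptor]]) -> Tuple[str, str]:
    """Stage 1: nearest prototype. Stage 2: nearest object in that class."""
    prototypes = {c: make_prototype(list(objs.values()))
                  for c, objs in classes.items()}
    best_class = min(prototypes,
                     key=lambda c: multiscale_distance(scene, prototypes[c]))
    objs = classes[best_class]
    best_obj = min(objs, key=lambda o: multiscale_distance(scene, objs[o]))
    return best_class, best_obj
```

The scene is compared against one prototype per class and then only against the members of the winning class, so the number of comparisons grows with the number of classes plus the size of one class rather than with the size of the whole library.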