ISBN: 9780819478078 (print)
The proceedings contain 100 papers. The topics discussed include: human recognition in a video network; the guaranteed cost switch control of BTT vehicle based on RBF-NN compensation; a novel method for evaluating the validity of the visual attended regions based on SIFT descriptors; human body motion tracking based on quantum-inspired immune cloning algorithm; object detection with geometric context of keypoints described as lifetime; a combined feature latent semantic model for scene classification; spectral clustering with eigenvector selection based on ensemble ranking; to generate a finite element model of human thorax using the VCH dataset; three-dimensional building reconstruction using highly overlapped aerial images; study on human vision model of the multi-parameter correction factor; a hybrid registration approach in super-resolution reconstruction for visual surveillance application; and realistic generation of natural phenomena based on video synthesis.
The robustness of Mutual Information (MI), the most widely used multimodal dense stereo correspondence measure, is restricted by the size of the matching windows. However, obtaining appropriately sized MI windows for matching thermal-visible image pairs of multiple people with various poses, clothes, distances to the cameras, and different levels of occlusion is quite challenging. In this paper, we propose local self-similarity (LSS) as a multimodal dense stereo correspondence measure. We integrated LSS as a similarity metric with a disparity voting registration method to demonstrate the suitability of LSS for visible-thermal stereo registration. We have comparatively analyzed LSS and MI as multimodal correspondence measures and discussed the advantages of LSS over MI. We have also tested our LSS-based registration method on several indoor videos of multiple people and shown that it outperforms the most recent state-of-the-art MI-based registration method.
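For readers unfamiliar with MI as a window-matching score, the following is a minimal sketch of the standard joint-histogram estimate, not the registration method of the paper. The function name `mutual_information` and the flattened, already-quantized window representation are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length sequences of
    quantized intensities, estimated from their joint histogram.

    `a` and `b` stand for two flattened matching windows (hypothetical
    representation chosen for this sketch).
    """
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal size")
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram counts
    hist_a = Counter(a)          # marginal histogram of window a
    hist_b = Counter(b)          # marginal histogram of window b
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (hist_a[x] * hist_b[y]))
    return mi


# An inverted-contrast pair still carries full information about each other,
# which is why MI is attractive across modalities.
print(mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))
# Statistically independent windows score zero.
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the estimate is built from a histogram of the pixels inside the window, a small window populates the joint histogram sparsely, which is one way to see the window-size sensitivity the abstract describes.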
In this paper, a review of vision-based natural disaster warning methods is presented. Because natural disaster warning is receiving a lot of attention in recent research, a comprehensive review of the various disaster-warning techniques developed in recent years is needed. This paper surveys recent studies on warning systems for four different types of natural disaster, i.e., wildfire smoke and flame detection, water level detection for flood prevention, and coastal zone monitoring, using computer vision and pattern-recognition techniques. Finally, we conclude with some thoughts about future research directions.
Current OCR cannot segment words and characters from video images because of their complex backgrounds and low resolution. To improve accuracy, this paper presents a new gradient-based method for word and character segmentation from text lines of any orientation in video frames for recognition. We propose a Max-Min clustering concept to obtain the text cluster from the normalized absolute gradient feature matrix of the video text line image. The union of the text cluster with the output of the Canny operation on the input video text line is proposed to restore missing text candidates. Then a run-length algorithm is applied to the text candidate image to identify word gaps. We propose a new idea for segmenting characters from the restored word image based on the fact that the text height difference at a character boundary column is smaller than that of the other columns of the word image. We have conducted experiments on a large dataset at two levels (word and character level) in terms of recall, precision and F-measure. Our experimental setup involves 3527 English and Chinese characters, and this dataset is selected from the TRECVID databases of 2005 and 2006.
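The run-length step can be illustrated with a small sketch. It assumes the text candidate image has already been reduced to a per-column count of text pixels, and it separates word gaps from character gaps with a fixed width threshold; the abstract does not state how the paper makes that decision, so `min_gap`, `zero_runs`, and `word_gaps` are hypothetical names and choices.

```python
def zero_runs(profile):
    """Return (start, length) for each maximal run of zeros in a column profile."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i                      # a run of empty columns begins
        elif value != 0 and start is not None:
            runs.append((start, i - start))
            start = None                   # the run ended at column i - 1
    if start is not None:                  # run reaching the right border
        runs.append((start, len(profile) - start))
    return runs


def word_gaps(profile, min_gap):
    """Interior zero runs at least `min_gap` columns wide, treated as word gaps.

    `min_gap` is an assumed threshold, not a value taken from the paper.
    Runs touching the left or right border are margins and are skipped.
    """
    n = len(profile)
    return [(s, length) for s, length in zero_runs(profile)
            if length >= min_gap and s > 0 and s + length < n]


# Column profile of a text line: counts of text pixels per column.
profile = [3, 2, 0, 4, 0, 0, 0, 5, 1, 0]
print(zero_runs(profile))      # all empty-column runs
print(word_gaps(profile, 3))   # only the wide interior run
```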
ISBN: 9781457720086 (print)
Due to its invariance to monotonic grayscale transformation and its simple computation, the Local Binary Pattern (LBP) has been broadly used as a feature extractor in face recognition tasks in recent years [3]. In previous work, methods using Adaboost to select the most representative features in samples have been proposed. Zhang et al. proposed a method that applies the Adaboost algorithm to select the most distinctive features, from which they extract LBP features. Though LBP features selected by Adaboost represent local textures effectively, their method neglects the holistic spatial information inherent in image samples. To solve this problem, we propose spatially enhanced multi-level boosting using uniform LBP and a multi-level Adaboost algorithm. In this paper, we select the most distinctive features, which are then concatenated to represent spatial information using a multi-level boosting algorithm. Experiments on the ORL database yielded a recognition rate of 98.96%.
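As a reminder of what the basic and uniform LBP operators compute, here is a minimal sketch of the textbook 8-neighbor operator on a 3x3 patch. It is not the boosting pipeline of the paper; the neighbor ordering and the names `lbp_code` and `is_uniform` are assumptions made for illustration.

```python
def lbp_code(patch):
    """8-bit LBP code of the center pixel of a 3x3 patch (list of 3 rows).

    Each neighbor contributes a 1 bit when it is >= the center value.
    The clockwise ordering from the top-left is an arbitrary choice here.
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """True when the circular 8-bit pattern has at most two 0/1 transitions."""
    rotated = ((code >> 1) | ((code & 1) << 7)) & 0xFF  # rotate right by one bit
    transitions = bin(code ^ rotated).count("1")        # adjacent bits that differ
    return transitions <= 2


patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 8, 3]]
# A monotonically increasing grayscale map (here 2*v + 10) preserves the
# ordering of pixel values, so the code is unchanged.
brighter = [[2 * v + 10 for v in row] for row in patch]
print(lbp_code(patch), lbp_code(brighter))
print(sum(is_uniform(c) for c in range(256)))  # number of uniform 8-bit patterns
```

The comparison against the center pixel depends only on the ordering of gray values, which is the source of the invariance to monotonic grayscale transformation mentioned in the abstract.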
This paper deals with real time face detection and tracking by a video camera. The method is based on a simple and fast initializing stage for learning. The transferable belief model is used to deal with the prior mod...
State-of-the-art local stereo correspondence algorithms that adapt their supports to image content can infer very accurate disparity maps, often comparable to those of algorithms based on global disparity optimization methods. However, despite their effectiveness, accurate local approaches based on this methodology are also computationally expensive, and several simplifications aimed at reducing their computational load have been proposed. Unfortunately, compared to the original approaches, the effectiveness of most of these simplified techniques is significantly reduced. In this paper, we consider an efficient and accurate algorithm referred to as Fast Bilateral Stereo (FBS), which efficiently obtains results comparable to state-of-the-art local approaches, and describe its mapping on GPUs with CUDA. Experimental results on two NVIDIA GPUs show that our CUDA implementation delivers, on standard stereo pairs, accurate and dense disparity maps in near real-time, achieving a speedup greater than 100X with respect to the equivalent CPU-based implementation.
Textural patterns are often complex, exhibit scale-dependent changes in structure and are difficult to identify and describe. Lacunarity has been proposed as a general method for the analysis of several spatial patterns. Lacunarity data can designate a mathematical index of spatial heterogeneity; therefore, the corresponding feature vectors should possess the necessary inter-class statistical properties that would enable them to be used for pattern recognition purposes. The objective of this work is to construct a supervised classification model of binary lacunarity data - computed by Valous et al. (2009) - from pork ham slice (three qualities) surface images, with the aid of kernel principal component analysis (KPCA) and a multilayer perceptron (MLP) neural network, using a portion of informative salient features. According to the principle of parsimony, the smallest possible number of features should be used so as to give an adequate representation of the feature space. Therefore, the dimension of the initial space, comprising 510 features, was reduced by 90% in order to avoid any noise effects in the subsequent classification. Then, using KPCA, the first nineteen kernel principal components (99.04% of total variance) were extracted from the reduced feature space and were used as input to the MLP. The correct classification percentages for the training, test and validation sets using the neural classifier were 86.7%, 86.7%, and 85.0%, respectively. The binary lacunarity spatial metric captured relevant information that provided a good level of differentiation among pork ham slice images.
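To make the notion of lacunarity as a heterogeneity index concrete, the sketch below uses the common gliding-box estimator, the ratio of the second moment of box mass to the squared mean box mass. The abstract does not say which estimator Valous et al. (2009) used, so this choice and the function name `lacunarity` are assumptions for illustration only.

```python
def lacunarity(image, r):
    """Gliding-box lacunarity of a binary image for box size r.

    `image` is a list of rows of 0/1 values. An r x r box is slid over every
    position; the box mass M is the number of 1 pixels inside it, and the
    result is E[M^2] / E[M]^2.
    """
    rows, cols = len(image), len(image[0])
    if r < 1 or r > rows or r > cols:
        raise ValueError("box size must fit inside the image")
    masses = []
    for i in range(rows - r + 1):
        for j in range(cols - r + 1):
            mass = sum(image[i + di][j + dj]
                       for di in range(r) for dj in range(r))
            masses.append(mass)
    n = len(masses)
    mean = sum(masses) / n
    if mean == 0:
        raise ValueError("image has no foreground pixels")
    second_moment = sum(m * m for m in masses) / n
    return second_moment / (mean * mean)


checkerboard = [[(i + j) % 2 for j in range(4)] for i in range(4)]
clustered = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0],
             [0, 0, 0, 0]]
print(lacunarity(checkerboard, 2))  # every box has the same mass
print(lacunarity(clustered, 2))     # box masses vary widely
```

A pattern whose boxes all hold the same mass yields the minimum value of 1, while clustered patterns with large empty regions yield larger values. Repeating the computation over several box sizes gives one feature per scale, which is one plausible way a multi-scale feature vector could be assembled.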
Local space-time features and the bag-of-features (BOF) representation are often used for action recognition in previous approaches. For complicated human activities, however, the limitations of these approaches become pronounced because of the local properties of the features and the lack of context. This paper addresses the problem by exploiting the spatio-temporal context information between features. We first define a spatio-temporal context, which combines the scale-invariant spatio-temporal neighborhood of local features with the spatio-temporal relationships between them. Then, we introduce a spatio-temporal context kernel (STCK), which not only takes into account the local properties of features but also considers their spatial and temporal context information. STCK has a promising generalization property and can be plugged into SVMs for activity recognition. The experimental results on challenging activity datasets show that, compared to a context-free model, the spatio-temporal context kernel improves recognition performance.