A structured light system simplifies three-dimensional reconstruction by projecting a specially designed pattern onto the target object, thereby generating a distinct texture on it for imaging and further processing. The success of the system hinges upon what features are coded in the projected pattern, extracted in the captured image, and matched between the projector's display panel and the camera's image plane. The codes have to be such that they are largely preserved in the image data after illumination from the projector, reflection from the target object, and projective distortion in the imaging process. The features also need to be reliably extracted in the image domain. In this article, a two-dimensional pseudorandom pattern consisting of rhombic color elements is proposed, and the grid points between the pattern elements are chosen as the feature points. We describe how a type classification of the grid points plus the pseudorandomness of the projected pattern can equip each grid point with a unique label that is preserved in the captured image. We also present a grid point detector that extracts the grid points without the need to segment the pattern elements, and that localizes the grid points with subpixel accuracy. Extensive experiments are presented to illustrate that, with the proposed pattern feature definition and feature detector, more feature points can be reconstructed with higher accuracy than with existing pseudorandomly encoded structured light systems. (C) 2011 Society of Photo-Optical Instrumentation Engineers (SPIE). [DOI: 10.1117/1.3615649]
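The unique-label property of a pseudorandom pattern can be illustrated with a small check that every k-by-k window of a color-index array occurs only once. This is a minimal sketch under simplifying assumptions, not the pattern design from the article: it uses a plain square grid instead of rhombic elements, ignores the grid-point type classification, and the function names `windows_are_unique` and `random_unique_pattern` are hypothetical.

```python
import numpy as np

def windows_are_unique(pattern, k):
    """Return True if every k-by-k window of `pattern` occurs only once."""
    rows, cols = pattern.shape
    seen = set()
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            # Serialize the window so it can be stored in a set.
            key = pattern[r:r + k, c:c + k].tobytes()
            if key in seen:
                return False
            seen.add(key)
    return True

def random_unique_pattern(rows, cols, n_colors, k, seed=0, max_tries=1000):
    """Draw random color-index arrays until one has unique k-by-k windows.

    Brute-force rejection sampling; only practical for small patterns.
    """
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        candidate = rng.integers(0, n_colors, size=(rows, cols))
        if windows_are_unique(candidate, k):
            return candidate
    raise RuntimeError("no pattern with unique windows found")

pattern = random_unique_pattern(rows=6, cols=6, n_colors=4, k=3)
print(pattern)
```

When the windows are unique, observing the colors around a point in the captured image is enough to identify its position in the projected pattern, which is the role the pseudorandomness plays in the abstract above.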
Saliency detection has been applied to the target acquisition case. This paper proposes a two-dimensional hidden Markov model (2D-HMM) that exploits the hidden semantic information of an image to detect its salient regions. A spatial pyramid histogram of oriented gradient descriptors is used to extract features. After encoding the image with a learned dictionary, the 2D-Viterbi algorithm is applied to infer the saliency map. This model can predict fixations on the targets and further creates robust and effective depictions of the targets' changes in posture and viewpoint. To validate the model against the human visual search mechanism, two eye-tracking experiments are employed to train our model directly from eye movement data. The results show that our model achieves better performance than visual attention. Moreover, they indicate the plausibility of utilizing visual track data to identify targets. (c) 2018 SPIE and IS&T
Bag of visual words is a popular model in human action recognition, but it usually suffers from a loss of the spatial and temporal configuration information of local features, and from large quantization error in its feature coding procedure. In this paper, to overcome these two deficiencies, we combine sparse coding with a spatio-temporal pyramid for human action recognition, and regard this method as the baseline. More importantly, and this is the focus of the paper, we find that there is a hierarchical structure in the feature vector constructed by the baseline method. To exploit this hierarchical structure information for better recognition accuracy, we propose a tree regularized classifier that conveys it. The main contributions of this paper can be summarized as follows. First, we introduce a tree regularized classifier to encode the hierarchical structure information in the feature vector for human action recognition. Second, we present an optimization algorithm to learn the parameters of the proposed classifier. Third, the performance of the proposed classifier is evaluated on the YouTube, Hollywood2, and UCF50 datasets; the experimental results show that the proposed tree regularized classifier obtains better performance than SVM and other popular classifiers, and achieves promising results on the three datasets.
This paper proposes a high-performance image retrieval framework that combines an improved SIFT (Scale-Invariant Feature Transform) feature extraction algorithm, improved feature matching, improved Fisher feature coding and an improved Gaussian Mixture Model (GMM). To address the slow convergence of the traditional GMM algorithm, an improved GMM is proposed. This algorithm initializes the GMM using an on-line K-means clustering method, which improves the convergence speed. At the same time, when the model is updated, storage space is saved by improving the criteria for the matching rules and for generating new Gaussian distributions. To address the problems that the dimension of the SIFT descriptor is too high, the matching speed is too slow and the matching rate is low, an improved SIFT algorithm is proposed. It preserves the invariance of SIFT to blur, compression, rotation and scaling, and improves the matching speed; the correct match rate is increased by an average of 40% to 55%. Experiments on the recently released VOC 2012 database and on a database of 20 object categories containing 230,800 images show that the framework has high precision and recall rates and a shorter query time. Compared with the standard image retrieval framework, the improved framework can detect the moving target quickly and effectively and has better robustness.
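The idea of initializing a GMM from an on-line K-means pass can be sketched as follows. This is an illustrative sketch only, assuming a sequential (MacQueen-style) K-means update and diagonal covariances; the abstract does not specify the paper's exact procedure, and the function name `online_kmeans_init` is hypothetical.

```python
import numpy as np

def online_kmeans_init(X, k, seed=0):
    """Return (weights, means, variances) to start a diagonal-covariance GMM.

    Assumption: one streaming pass of sequential K-means, then a hard
    assignment of all samples to estimate weights and variances.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Seed the centers with k distinct samples.
    start = rng.choice(n, size=k, replace=False)
    centers = X[start].copy()
    counts = np.ones(k)

    # Stream over the remaining samples, updating one center per sample.
    rest = np.delete(np.arange(n), start)
    for x in X[rest]:
        j = np.argmin(((centers - x) ** 2).sum(axis=1))
        counts[j] += 1
        centers[j] += (x - centers[j]) / counts[j]  # running mean

    # Hard-assign every sample to its nearest center.
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)

    weights = np.bincount(labels, minlength=k) / n
    variances = np.ones((k, d))
    for j in range(k):
        members = X[labels == j]
        if len(members) > 0:
            # Small floor keeps the variances strictly positive.
            variances[j] = members.var(axis=0) + 1e-6
    return weights, centers, variances

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, (100, 2)),
                  rng.normal(5.0, 0.5, (100, 2))])
weights, means, variances = online_kmeans_init(data, k=2)
print(weights, means, variances, sep="\n")
```

The returned values would then be passed to an EM routine as its starting point. Starting EM near the cluster centers is what is expected to reduce the number of iterations needed to converge.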
Feature combination is an effective approach to image classification. Most work in this line considers feature combination based on different low-level image descriptors, while ignoring the complementary property of different higher-level image features derived from the same type of low-level descriptor. In this paper, we explore the complementary property of different image features generated from a single type of low-level descriptor for image classification. Specifically, we propose a soft salient coding (SSaC) method, which overcomes the information suppression problem in the original salient coding (SaC) method. We analyse the physical meaning of the SSaC feature and two other types of image features in the framework of Spatial Pyramid Matching (SPM), and propose using multiple kernel learning (MKL) to combine these features for classification tasks. Experiments on three image databases (Caltech-101, UIUC 8-Sports and 15-Scenes) not only verify the effectiveness of the proposed MKL combination method, but also reveal that collaboration is more important than selection for classification when limited types of image features are employed.
The dynamic and consistent association of information among various application activities in the full life cycle of a product is key to ensuring cooperation among different application domains. In order to establish and maintain this association, a design-process-based product association model is proposed. The model takes advantage of the generic naming mechanism, the private protocol for history-based form feature modeling, on which the Data Association Protocol is built. Hence the model provides a way of constructing and maintaining the information linkage among different product development stages naturally and dynamically, while keeping the feature coding private. A case study illustrates the utility of the model in linking data between the design model and the process planning model.
ISBN: (print) 9781479983407
The feature coding algorithm "Vector of Locally Aggregated Descriptors (VLAD)" can be used effectively for large-scale object instance retrieval. Despite its effectiveness and excellent performance, the existence of ambiguous cluster centers can reduce its performance. Although a solution to this problem has been proposed, it is not practical. In this paper, we analyze the possible situations that affect the results and propose a novel approach to improve the VLAD method. The proposed method mainly focuses on the similarity measure between each pair of images. For each pair of images, we adapt the original cluster centers to the VLAD vectors. As we illustrate, our method achieves promising results with a small vocabulary size on both the 15 Scenes and VOC2007 datasets.
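For context, the standard VLAD encoding that the paper sets out to improve can be sketched as below. This shows only the baseline formulation (nearest-center assignment, residual accumulation, L2 normalization), not the adapted cluster centers proposed in the paper; the function name `vlad_encode` is hypothetical.

```python
import numpy as np

def vlad_encode(descriptors, centers):
    """Encode local descriptors as a baseline VLAD vector.

    descriptors: (n, d) array of local descriptors from one image.
    centers:     (k, d) array of cluster centers (the visual vocabulary).
    Returns a flat, L2-normalized vector of length k * d.
    """
    descriptors = np.asarray(descriptors, dtype=float)
    centers = np.asarray(centers, dtype=float)
    k, d = centers.shape

    # Assign each descriptor to its nearest cluster center.
    dists = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)

    # Sum the residuals (descriptor minus center) per cluster.
    vlad = np.zeros((k, d))
    for j in range(k):
        members = descriptors[nearest == j]
        if len(members) > 0:
            vlad[j] = (members - centers[j]).sum(axis=0)

    # Flatten and L2-normalize; leave an all-zero vector unchanged.
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

centers = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [10.0, 12.0]])
print(vlad_encode(descriptors, centers))
```

A descriptor that lies almost equally close to two centers is still assigned to exactly one of them, which is one way to read the ambiguity of cluster centers mentioned in the abstract.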
ISBN: (print) 9781479983407
This paper proposes to employ a deep learning model to encode local descriptors for image classification. Previous works that use deep architectures to obtain higher-level representations often operate from the pixel level, and lack the power to generalize to large and complex images owing to the computational burden and the difficulty of capturing the internal essence of such images. Our method escapes this limitation by starting from local descriptors, which provide more semantic inputs. We investigate using two layers of Restricted Boltzmann Machines (RBMs) to encode different local descriptors with a novel group sparse learning (GSL) method inspired by the recent success of sparse coding. In addition, unlike most existing purely unsupervised feature coding strategies, we use another RBM corresponding to semantic labels to perform supervised fine-tuning, which makes our model more suitable for classification tasks. Experimental results on the Caltech-256 and Indoor-67 datasets demonstrate the effectiveness of our method.