We present a novel model for human action categorization. A video sequence is represented as a collection of spatial and spatial-temporal features by extracting static and dynamic interest points. We propose a hierarc...
ISBN: (print) 9781424407835
We introduce a method for recognizing four musical time patterns and three tempos generated by a human conductor of a robot orchestra, or by an operator of a computer-based music playing system, using hand gesture recognition. We use only a stereo vision camera with no extra special devices. We suggest a simple and reliable vision-based hand gesture recognition method with two naive features. One is the motion-direction code, which is a quantized code for motion directions. The other is the conducting feature point (CFP), which is the point where the motion changes suddenly. The proposed hand gesture recognition system operates as follows. First, it extracts the human hand region by segmenting the depth information generated by stereo matching of image sequences. Next, it follows the motion of the center of gravity (COG) of the extracted hand region and generates gesture features such as the CFP and the direction code. Finally, it obtains the current time pattern of the beat and the tempo of the music being played, using either CFP tracking or motion histogram matching. The experimental results on the test data set show that the musical time pattern and tempo recognition rate is over 86.42% for motion histogram matching and 79.75% for CFP tracking.
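The two features described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the number of direction codes (8), the turn threshold `min_turn`, and all function names are assumptions made for illustration, and the abstract does not say how a "sudden motion change" is detected. The input `track` is assumed to be a list of (x, y) COG positions, one per frame.

```python
import math
from collections import Counter


def direction_code(p_prev, p_curr, n_codes=8):
    """Quantize the motion direction between two points into one of n_codes sectors."""
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    if dx == 0 and dy == 0:
        return None  # no motion, so no direction to quantize
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = 2 * math.pi / n_codes
    # Shift by half a sector so that code 0 is centered on the +x direction.
    return int((angle + sector / 2) // sector) % n_codes


def direction_histogram(track, n_codes=8):
    """Normalized histogram of direction codes along a track of (x, y) points."""
    codes = [direction_code(a, b, n_codes) for a, b in zip(track, track[1:])]
    counts = Counter(c for c in codes if c is not None)
    total = sum(counts.values())
    return [counts[c] / total if total else 0.0 for c in range(n_codes)]


def conducting_feature_points(track, n_codes=8, min_turn=2):
    """Return points where the direction code turns by at least min_turn sectors.

    Assumption: a 'sudden motion change' is approximated by a large jump
    between consecutive direction codes.
    """
    codes = [direction_code(a, b, n_codes) for a, b in zip(track, track[1:])]
    points = []
    prev = None
    for i, code in enumerate(codes):
        if code is None:
            continue
        if prev is not None:
            diff = abs(code - prev)
            turn = min(diff, n_codes - diff)  # circular distance between codes
            if turn >= min_turn:
                points.append(track[i])  # the point at which the direction changed
        prev = code
    return points
```

A histogram computed this way could be compared against stored per-pattern histograms for matching, while the detected points could feed a CFP tracker. Note that in image coordinates the y axis usually points down, so the code-to-direction mapping is mirrored vertically.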
Informative Vector Machine (IVM) is an efficient fast sparse Gaussian process (GP) method previously suggested for active learning. It greatly reduces the computational cost of GP classification and makes the GP lear...
ISBN: (print) 9783540728481
This work focuses on the development of a computer vision system for the automatic on-line inspection and classification of Satsuma segments. During image acquisition the segments are moving, wet, and frequently in contact with other pieces. The segments are transported over six semi-transparent conveyor belts that advance at a speed of 1 m/s. During on-line operation, the system acquires images of the segments using two cameras connected to a single computer and processes the images in less than 50 ms. By extracting morphological features from the objects, the system automatically identifies pieces of skin and raw material and separates entire segments from broken ones, discriminating between those with a slight or a large degree of breakage. Combinations of morphological parameters were used to decide the quality of each segment, correctly classifying 95% of sound segments.
Author: Zitouni, Imed
IBM Corp, Thomas J Watson Res Ctr, Multilingual NLP, POB 21820-136, Yorktown Hts, NY 10598, USA
In this paper, we introduce backoff hierarchical class n-gram language models to better estimate the likelihood of unseen n-gram events. This multi-level class hierarchy language modeling approach generalizes the well-known backoff n-gram language modeling technique. It uses a class hierarchy to define word contexts. Each node in the hierarchy is a class that contains all the words of its descendant nodes. The closer a node is to the root, the more general the class (and context) is. We investigate the effectiveness of the approach in modeling unseen events in speech recognition. Our results illustrate that the proposed technique outperforms backoff n-gram language models. We also study the effect of the vocabulary size and the depth of the class hierarchy on the performance of the approach. Results are presented on the Wall Street Journal (WSJ) corpus using two vocabulary sets: 5000 words and 20,000 words. Experiments with the 5000-word vocabulary, which contains a small number of unseen events in the test set, show up to a 10% improvement in unseen event perplexity when using the hierarchical class n-gram language models. With a vocabulary of 20,000 words, characterized by a larger number of unseen events, the perplexity of unseen events decreases by 26%, while the word error rate (WER) decreases by 12% when using the hierarchical approach. Our results suggest that the largest gains in performance are obtained when the test set contains a large number of unseen events. (c) 2006 Elsevier Ltd. All rights reserved.
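The idea of backing off through progressively more general class contexts can be illustrated with a toy bigram scorer. This is a sketch under stated assumptions, not the estimator from the paper: it uses a fixed backoff weight in the style of "stupid backoff", so the scores are not normalized probabilities. The class name, the `parent` mapping, and the weight 0.4 are hypothetical choices for illustration.

```python
from collections import Counter


class HierarchicalBackoffBigram:
    """Toy bigram scorer that backs off from a word context to its ancestor classes."""

    def __init__(self, parent, backoff_weight=0.4):
        # parent maps each word or class to its parent class; the root has no entry.
        self.parent = parent
        self.backoff_weight = backoff_weight
        self.pair_counts = Counter()     # (context, word) -> count
        self.context_counts = Counter()  # context -> count
        self.word_counts = Counter()     # word -> count
        self.total = 0

    def ancestors(self, node):
        """Classes above a node, ordered from most specific to most general."""
        chain = []
        while node in self.parent:
            node = self.parent[node]
            chain.append(node)
        return chain

    def train(self, tokens):
        for prev, word in zip(tokens, tokens[1:]):
            # Credit the event to the word context and to every class containing it.
            for context in [prev] + self.ancestors(prev):
                self.pair_counts[(context, word)] += 1
                self.context_counts[context] += 1
        self.word_counts.update(tokens)
        self.total += len(tokens)

    def score(self, prev, word):
        """Unnormalized score for `word` following `prev`."""
        penalty = 1.0
        for context in [prev] + self.ancestors(prev):
            seen = self.pair_counts[(context, word)]
            if seen:
                return penalty * seen / self.context_counts[context]
            penalty *= self.backoff_weight  # move to a more general context
        # Last resort: penalized unigram relative frequency.
        return penalty * self.word_counts[word] / self.total if self.total else 0.0
```

For example, if "monday morning" occurs in training but "tuesday morning" does not, and both days share a parent class, the unseen bigram is scored from the class-level count rather than falling straight back to the unigram estimate.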
ISBN: (print) 9783540763895
This paper presents a novel real-time palmprint recognition system for cooperative user applications. The system is the first to achieve non-contact capture and recognition of palmprint images in unconstrained scenes. Its novelty lies in two aspects. The first is a novel design of the image capturing device. The hardware reduces the influence of background objects and segments out hand regions efficiently. The second is a process of automatic hand detection and fast palmprint alignment, which aims to obtain normalized palmprint images for subsequent feature extraction. The palmprint recognition algorithm used in the system is based on an accurate ordinal palmprint representation. By integrating the novel imaging device, the palmprint preprocessing approach, and the palmprint recognition engine, the proposed system provides a friendly user interface while achieving good performance in unconstrained scenes.
ISBN: (print) 9783540749332
Interest point detection in still images is a well-studied topic in computer vision. In the spatiotemporal domain, however, it is still unclear which features indicate useful interest points. In this paper we approach the problem by learning a detector from examples: we record eye movements of human subjects watching video sequences and train a neural network to predict which locations are likely to become eye movement targets. We show that our detector outperforms current spatiotemporal interest point architectures on a standard classification dataset.
Autonomous cars will likely play an important role in the future. A vision system designed to support outdoor navigation for such vehicles has to deal with large dynamic environments, changing imaging conditions, and ...
ISBN: (print) 9780819469533
This paper applies computer image processing and pattern recognition methods to the problem of automatic classification and counting of leukocytes (white blood cells) in peripheral blood. A new five-part leukocyte classification algorithm based on image processing and pattern recognition is presented, which realizes automatic classification of leukocytes. The first aim is to detect the leukocytes. A major requirement of the whole system is to classify these leukocytes into five classes. The algorithm is based on the visual attention mechanism of the eye: it processes the image sequentially, segments the leukocytes, and extracts their features. Using prior knowledge of cells and image shape information, the algorithm first segments the probable shape of each leukocyte with a new method based on Chamfer and then obtains the detailed features. This greatly reduces both the misclassification rate and the amount of computation. The algorithm also has a learning function. The paper also presents a new measure of the shape of the karyon (nucleus), which can provide more accurate information. The algorithm has great application value in clinical blood testing.
We study the influence of numerical conditioning on the accuracy of two closed-form solutions to the overconstrained relative orientation problem. We consider the well known eight-point algorithm and the recent five-p...