ISBN: 9781479951192 (print)
We present a new method for tracking the 3D position, global orientation and full articulation of human hands. Following recent advances in model-based, hypothesize-and-test methods, the high-dimensional parameter space of hand configurations is explored with a novel evolutionary optimization technique specifically tailored to the problem. The proposed method capitalizes on the fact that samples from quasi-random sequences such as the Sobol sequence have low discrepancy and cover the sampled space more uniformly than random samples drawn from the uniform distribution. The method has been tested on the problems of tracking the articulation of a single hand (27D parameter space) and two hands (54D space). Extensive experiments have been carried out with synthetic and real data, in comparison with state-of-the-art methods. The quantitative evaluation shows that, when computational resources are limited, the new approach achieves a speed-up of four (single-hand tracking) and eight (two-hand tracking) without compromising tracking accuracy. Interestingly, the proposed method is preferable to the state of the art either when computational resources are limited or when the problem is more complex (i.e., higher dimensional), thus improving the applicability of the method in a number of application domains.
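The coverage claim about quasi-random samples can be illustrated with a small sketch. It is not the authors' optimizer and it does not use the Sobol sequence: as a simpler stand-in it uses the 2D Halton sequence (bases 2 and 3), another low-discrepancy sequence, and compares how many cells of a grid over the unit square are hit by Halton points versus pseudo-random uniform points. All function names, the grid size and the seed are assumptions made for illustration.

```python
from fractions import Fraction
import random


def radical_inverse(index: int, base: int) -> Fraction:
    """Mirror the base-`base` digits of `index` around the radix point.

    Exact rational arithmetic is used so that cell counts below are not
    affected by floating-point rounding.
    """
    result = Fraction(0)
    place = Fraction(1, base)
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * place
        place /= base
    return result


def halton_2d(n: int) -> list:
    """First n points of the 2D Halton sequence (a stand-in for Sobol)."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3)) for i in range(n)]


def uniform_2d(n: int, seed: int = 0) -> list:
    """n pseudo-random points drawn uniformly from the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def occupied_cells(points, nx: int, ny: int) -> int:
    """Count the cells of an nx-by-ny grid over [0, 1)^2 that hold a point."""
    return len({(int(x * nx), int(y * ny)) for x, y in points})


if __name__ == "__main__":
    n, nx, ny = 72, 8, 9  # 72 points on a grid with 72 cells
    print("Halton cells hit: ", occupied_cells(halton_2d(n), nx, ny))
    print("uniform cells hit:", occupied_cells(uniform_2d(n), nx, ny))
```

With 72 points on an 8-by-9 grid, the Halton points land in every cell exactly once, whereas uniform random points typically leave a substantial fraction of the cells empty. This is the kind of even coverage the abstract attributes to quasi-random sampling.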
We have developed a small-scale four-layered neural network (NN) model for simple character recognition, which can recognize patterns transformed by affine transformations. In this study, 24 patterns are presented as input patterns. Each input pattern is divided into 64 local patterns and connected with the first hidden layer. After training, we investigated the recognition mechanism of the NN using the Alopex algorithm. The effectiveness of this method is demonstrated.
The Internet contains billions of images, freely available online. Methods for efficiently searching this incredibly rich resource are vital for a large number of applications. These include object recognition [2], computer graphics [11, 27], personal photo collections, and online image search tools. In this paper, our goal is to develop efficient image search and scene matching techniques that are not only fast, but also require very little memory, enabling their use on standard hardware or even on handheld devices. Our approach uses recently developed machine learning techniques to convert the Gist descriptor (a real-valued vector that describes orientation energies at different scales and orientations within an image) to a compact binary code, with a few hundred bits per image. Using our scheme, it is possible to perform real-time searches over millions of images from the Internet using a single large PC and obtain recognition results comparable to the full descriptor. Using our codes on high-quality labeled images from the LabelMe database gives surprisingly powerful recognition results using simple nearest neighbor techniques.
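To show why compact binary codes make nearest-neighbor search cheap, the following minimal sketch ranks database images by Hamming distance to a query code. It assumes the codes have already been computed and stored as Python integers. How the codes are learned from Gist descriptors is not shown, and the function names and the toy 8-bit codes are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def nearest(query: int, database: list, k: int = 1) -> list:
    """Indices of the k database codes closest to `query` in Hamming distance.

    Ties keep their original database order because sorted() is stable.
    """
    order = sorted(range(len(database)),
                   key=lambda i: hamming(query, database[i]))
    return order[:k]


if __name__ == "__main__":
    # Toy 8-bit codes; real codes would have a few hundred bits per image.
    database = [0b11110000, 0b00001111, 0b11110001]
    query = 0b11110011
    print(nearest(query, database, k=2))
```

Each comparison is one XOR and one bit count, and each image needs only a few dozen bytes, which is what allows a linear scan over millions of codes to fit in memory on ordinary hardware.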
ISBN: 9781467388528 (print)
Video concept learning often requires a large set of training samples. In practice, however, acquiring noise-free training labels with sufficient positive examples is very expensive. A plausible solution for training data collection is to sample from the vast quantities of images and videos on the Web. Such a solution is motivated by the assumption that the retrieved images or videos are highly correlated with the query. Still, a number of challenges remain. First, Web videos are often untrimmed. Thus, only parts of the videos are relevant to the query. Second, the retrieved Web images are always highly relevant to the issued query. However, thoughtlessly utilizing the images in the video domain may even hurt performance due to the well-known semantic drift and domain gap problems. As a result, a valid question is how Web images and videos interact for video concept learning. In this paper, we propose a Lead-Exceed Neural Network (LENN), which reinforces the training on Web images and videos in a curriculum manner. Specifically, the training proceeds by inputting frames of Web videos to obtain a network. The Web images are then filtered by the learnt network, and the selected images are additionally fed into the network to enhance the architecture and further trim the videos. In addition, Long Short-Term Memory (LSTM) can be applied to the trimmed videos to explore temporal information. Encouraging results are reported on UCF101 and on TRECVID 2013 and 2014 MEDTest, in the context of both action recognition and event detection. Without using human-annotated exemplars, our proposed LENN achieves 74.4% accuracy on the UCF101 dataset.
Image matching plays an important role in many aspects of computer vision. Our proposed method is based on the Scale Invariant Feature Transform (SIFT), one of the most popular image matching methods. The main ideas behind our method are removing excess keypoints, adding oriented patterns to the descriptor, and decreasing the size of the descriptors. With these changes to SIFT, we obtain oriented patterns of keypoints. In addition, the number of keypoints is reduced, keypoint locations are selected more accurately, and the descriptors are smaller.
This paper proposes a method for electrocardiogram (ECG) heartbeat pattern recognition using an adaptive wavelet network (AWN). ECG beat recognition can be divided into a sequence of stages, starting with feature extraction and conversion of QRS complexes, followed by identification of cardiac arrhythmias based on the detected features. The discrimination method for ECG beats is a two-subnetwork architecture, consisting of a wavelet layer and a probabilistic neural network (PNN). Morlet wavelets are used to extract the features from each heartbeat, and the PNN is then used to analyze the meaningful features and perform discrimination tasks. The AWN is suitable for application in a dynamic environment, with add-in and delete-off features using automatic target adjustment and parameter tuning. The experimental results obtained by testing on data from the MIT-BIH arrhythmia database demonstrate the efficiency of the proposed method.
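The discrimination stage can be sketched with a textbook probabilistic neural network: each class score is the average of Gaussian kernels centred on that class's training vectors, and the highest-scoring class wins. This is a generic PNN decision rule, not the paper's AWN. The feature vectors are assumed to come from an earlier wavelet stage that is not shown, and the class labels, the smoothing parameter `sigma` and the toy data are hypothetical.

```python
import math


def pnn_classify(x, training, sigma: float = 1.0):
    """Classify feature vector `x` with a basic probabilistic neural network.

    `training` maps each class label to a list of feature vectors.
    Returns (best_label, scores), where scores[label] is the mean Gaussian
    kernel response of that class's training vectors to `x`.
    """
    scores = {}
    for label, samples in training.items():
        total = 0.0
        for s in samples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        scores[label] = total / len(samples)
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical 2D features for two beat classes.
    training = {
        "normal": [(0.0, 0.0), (0.2, 0.1)],
        "abnormal": [(3.0, 3.0), (2.8, 3.1)],
    }
    label, scores = pnn_classify((0.1, 0.0), training, sigma=0.5)
    print(label, scores)
```

Because a PNN stores its training vectors directly, adding or removing a pattern only changes the per-class sample lists, which fits the add-in and delete-off behaviour described in the abstract.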
Image registration is an image processing method widely used in computer vision, pattern recognition, and medical imaging. In digital subtraction radiography, image registration is an important prerequisite for matching the reference and subsequent images. In this paper, we propose an automatic non-rigid registration method, namely curvature-based registration, which relies on a curvature-based penalizing term, and we apply it to dental radiography. The regularizing term of this intensity-based registration approach allows for affine linear transformations, so a pre-registration step is no longer necessary. This leads to faster and more reliable solutions. The implementation of this approach is based on the numerical solution of the underlying Euler-Lagrange equations. In addition, a comparison between this algorithm and the linear alignment method (LAM) on 20 image pairs is presented.
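The remark about affine transformations can be checked numerically with a small sketch. It assumes a common discrete form of a curvature penalty, half the sum of squared 5-point Laplacians of one displacement component over the interior grid points. That discretization, the function names and the toy fields are assumptions for illustration, not the paper's implementation.

```python
def laplacian(field, i: int, j: int):
    """5-point discrete Laplacian of a 2D grid at interior point (i, j)."""
    return (field[i - 1][j] + field[i + 1][j]
            + field[i][j - 1] + field[i][j + 1]
            - 4 * field[i][j])


def curvature_penalty(field) -> float:
    """Half the sum of squared Laplacians over all interior grid points."""
    rows, cols = len(field), len(field[0])
    return 0.5 * sum(laplacian(field, i, j) ** 2
                     for i in range(1, rows - 1)
                     for j in range(1, cols - 1))


def make_field(rows: int, cols: int, fn):
    """Sample one displacement component fn(i, j) on a rows-by-cols grid."""
    return [[fn(i, j) for j in range(cols)] for i in range(rows)]


if __name__ == "__main__":
    affine = make_field(6, 6, lambda i, j: 2 * i - 3 * j + 5)
    bent = make_field(6, 6, lambda i, j: i * i)
    print("affine penalty:", curvature_penalty(affine))
    print("bent penalty:  ", curvature_penalty(bent))
```

An affine displacement has zero Laplacian everywhere, so it incurs no penalty, while a curved displacement is penalized. Under this assumed discretization, that is why an affine misalignment need not be removed by a separate pre-registration step.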