ISBN: (Print) 0769523722
Fiducial marker systems consist of patterns that are mounted in the environment and automatically detected in digital camera images using an accompanying detection algorithm. They are useful for Augmented Reality (AR), robot navigation, and general applications where the relative pose between a camera and an object is required. Important parameters for such marker systems are their false detection rate (false positive rate), their inter-marker confusion rate, their minimal detection size (in pixels), and their immunity to lighting variation. ARTag is a marker system that uses digital coding theory to achieve very low false positive and inter-marker confusion rates with a small required marker size, and employs an edge linking method to give robust immunity to lighting variation. ARTag markers are bi-tonal planar patterns containing a unique ID number encoded with the robust digital techniques of checksums and forward error correction (FEC). The proposed system, ARTag, has very low and numerically quantifiable error rates, does not require a greyscale threshold as other marker systems do, and can encode up to 2002 unique IDs with no need to store patterns. Experimental results validating the system are presented.
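The role of the checksum in rejecting false detections can be illustrated with a minimal sketch. It is not ARTag's actual bit layout, which the abstract does not give: the 16-bit CRC from Python's `binascii.crc_hqx`, the packing of the ID above the checksum, and the names `encode_marker` and `decode_marker` are all assumptions made for illustration, and the forward error correction stage is omitted entirely.

```python
import binascii

MAX_IDS = 2002   # number of unique IDs stated in the abstract
CRC_BITS = 16    # assumption: a 16-bit CRC stands in for the checksum


def _crc(marker_id: int) -> int:
    # CRC-CCITT over the two-byte ID (hypothetical choice of checksum).
    return binascii.crc_hqx(marker_id.to_bytes(2, "big"), 0xFFFF)


def encode_marker(marker_id: int) -> int:
    """Pack an ID and its checksum into one integer bit pattern."""
    if not 0 <= marker_id < MAX_IDS:
        raise ValueError("marker ID out of range")
    return (marker_id << CRC_BITS) | _crc(marker_id)


def decode_marker(word: int):
    """Return the ID if the checksum matches, otherwise None (rejected)."""
    marker_id = word >> CRC_BITS
    checksum = word & ((1 << CRC_BITS) - 1)
    if marker_id >= MAX_IDS or _crc(marker_id) != checksum:
        return None
    return marker_id
```

In this sketch a bit pattern read from a non-marker region is accepted only if its checksum happens to match, which is what makes the false positive rate numerically quantifiable. A real system would add error correction so that a few misread bits could be repaired instead of rejected.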
ISBN: (Print) 0818672587
This paper presents a formal model of an active recognition system that can be programmed by learning. At each time step the system decides between producing an action to generate new data and stopping to issue the name of the object observed. The actions can be directed either towards the external environment or towards the internal perceptual system of the agent. The decision strategy is based on a quantitative evaluation of the system's learning experience. The problem studied is the recognition of chess pieces using a moving camera and a multiscale feature detector. The recognition is difficult because the objects are complex - neither polyhedral nor smooth - and rather similar between classes, especially in certain view configurations. The system uses the information obtained by observing internal state transitions when the camera is moved or when the feature detector scale is changed. A simulation of the agent and the environment is used to measure the model's performance experimentally.
Authors:
Wang, XG; Tang, XO
MIT Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
ISBN: (Print) 0769523722
In [1], three popular subspace face recognition methods, PCA, Bayes, and LDA, were analyzed under the same framework and a unified subspace analysis was proposed. However, since they are all based on a single Gaussian model, a global linear subspace often fails to deliver good performance on data sets with complex intrapersonal variation. These methods also face the problems caused by the high-dimensional face feature vector and the difficulty of finding optimal parameters for subspace analysis. In this paper, we develop a random mixture model to improve Bayes and LDA subspace analysis. By clustering the intrapersonal differences, the complex intrapersonal variation manifold is learned by a set of local linear intrapersonal subspaces. To boost system performance, we construct multiple low-dimensional subspaces by randomly sampling the high-dimensional feature vector and randomly selecting the parameters for subspace analysis. The effectiveness of our method is demonstrated by experiments on the AR face database containing 2340 face images.
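The random sampling step can be sketched as follows. This is only an illustration of building several low-dimensional subspaces from randomly chosen feature indices and combining their decisions by majority vote. A nearest-class-mean rule stands in for the Bayes and LDA classifiers of the paper, and the function names, the voting rule, and the fixed seed are assumptions.

```python
import random
from collections import Counter


def class_means(samples):
    """samples maps a class label to a list of equal-length feature vectors."""
    means = {}
    for label, vectors in samples.items():
        dim = len(vectors[0])
        means[label] = [sum(v[i] for v in vectors) / len(vectors)
                        for i in range(dim)]
    return means


def random_subspaces(dim, n_subspaces, subspace_dim, seed=0):
    """Draw n_subspaces index sets, each selecting subspace_dim features."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(dim), subspace_dim))
            for _ in range(n_subspaces)]


def classify(x, means, subspaces):
    """Vote over subspaces; each one picks the nearest class mean
    using only its own subset of feature indices."""
    votes = Counter()
    for idx in subspaces:
        nearest = min(
            means,
            key=lambda c: sum((x[i] - means[c][i]) ** 2 for i in idx),
        )
        votes[nearest] += 1
    return votes.most_common(1)[0][0]
```

Each subspace sees only a fraction of the full feature vector, so each individual classifier is cheap to train and evaluate, and the vote compensates for the weakness of any single subspace.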
ISBN: (Print) 0769523722
We propose a new technique for object class recognition, which learns a generative appearance model in a discriminative manner. The technique is based on the intermediate representation of an image as a set of patches, which are extracted using an interest point detector. The learning problem becomes an instance of supervised learning from sets of unordered features. To solve this problem, we designed a classifier based on a simple, part-based, generative object model. Only the appearance of each part is modeled. When learning the model parameters, we use a discriminative boosting algorithm which minimizes the loss of the training error directly. The models thus learnt have clear probabilistic semantics and also maintain good classification performance. The performance of the algorithm has been tested using publicly available benchmark data and shown to be comparable to other state-of-the-art algorithms for this task; our main advantage in these comparisons is speed (orders of magnitude faster) and scalability.
ISBN: (Print) 0769523722
In this paper, we propose a novel framework for more effective classifier training using unlabeled samples. By integrating a concept hierarchy for semantic image concept organization, a hierarchical mixture model is proposed to enable multi-level image concept modeling and hierarchical classifier training. To learn the base-level classifiers for the atomic image concepts at the first level of the concept hierarchy, we propose a novel adaptive EM algorithm that achieves higher prediction accuracy. To learn the classifiers for the higher-level semantic image concepts, we also propose a novel technique for combining classifiers using the hierarchical mixture model. Experimental results on two large-scale image databases are also provided.
ISBN: (Print) 0818672587
The efficiency of pattern recognition is particularly crucial in two scenarios: whenever there are a large number of classes to discriminate, and whenever recognition must be performed a large number of times. We propose a single technique, namely pattern rejection, that greatly enhances efficiency in both cases. A rejector is a generalization of a classifier that quickly eliminates a large fraction of the candidate classes or inputs. This allows a recognition algorithm to dedicate its efforts to a much smaller number of possibilities. Importantly, a collection of rejectors may be combined to form a composite rejector, which is shown to be far more effective than any of its individual components. A simple algorithm is proposed for the construction of each of the component rejectors. Its generality is established through close relationships with the Karhunen-Loeve expansion and Fisher's discriminant analysis. Composite rejectors were constructed for two representative applications, namely appearance-matching-based object recognition and local feature detection. The results demonstrate substantial efficiency improvements over existing approaches, most notably Fisher's discriminant analysis.
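The idea of a composite rejector can be sketched in a few lines. The construction below is an assumption chosen for brevity, not the algorithm of the paper: each component rejector projects the input onto a caller-supplied direction and keeps only the classes whose training projections span an interval containing that value. The names `Rejector`, `fit_rejector`, and `composite_reject` are hypothetical.

```python
from dataclasses import dataclass


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


@dataclass
class Rejector:
    direction: list   # projection direction (supplied by the caller)
    intervals: dict   # class label -> (min, max) of training projections

    def surviving(self, x, candidates):
        """Keep the candidate classes whose interval contains x's projection."""
        p = _dot(self.direction, x)
        return {c for c in candidates
                if self.intervals[c][0] <= p <= self.intervals[c][1]}


def fit_rejector(direction, samples):
    """samples maps a class label to a list of training vectors."""
    intervals = {}
    for label, vectors in samples.items():
        projections = [_dot(direction, v) for v in vectors]
        intervals[label] = (min(projections), max(projections))
    return Rejector(direction, intervals)


def composite_reject(rejectors, x, candidates):
    """Apply the rejectors in sequence, shrinking the candidate set each time."""
    remaining = set(candidates)
    for rejector in rejectors:
        remaining = rejector.surviving(x, remaining)
        if len(remaining) <= 1:
            break  # nothing left to eliminate
    return remaining
```

No single projection needs to separate all classes. Each one only has to discard some of them cheaply, and a full classifier is then run on whatever candidates remain.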
With the aim to design a general learning framework for detecting faces of various poses or under different lighting conditions, we are motivated to formulate the task as a classification problem over data of multiple...
ISBN: (Print) 0818672587
Focus of attention mechanisms for robot vision are discussed. A new method for neglecting low level filter responses from already modelled structures is presented. The method is based on a filtering technique termed normalized convolution. In one experiment, the robot is continuously moving its arm in the scene while tracking other objects. It is shown how the arm can be made 'invisible' so that only the moving object of interest is detected. This makes tracking of objects much simpler. In another experiment, the attention of the system is shifted between objects by simply cancelling the mask of the object to be attended to. With this strategy the low level processes do not need to know the difference between a new object entering the scene and a mask being cancelled, and thus a complex communication structure between high and low levels is avoided.
ISBN: (Print) 0769523722
We describe how certain tasks in the audio domain can be effectively addressed using computer vision approaches. This paper focuses on the problem of music identification, where the goal is to reliably identify a song given a few seconds of noisy audio. Our approach treats the spectrogram of each music clip as a 2-D image and transforms music identification into a corrupted sub-image retrieval problem. By employing pairwise boosting on a large set of Viola-Jones features, our system learns compact, discriminative, local descriptors that are amenable to efficient indexing. During the query phase, we retrieve the set of song snippets that locally match the noisy sample and employ geometric verification in conjunction with an EM-based "occlusion" model to identify the song that is most consistent with the observed signal. We have implemented our algorithm in a practical system that can quickly and accurately recognize music from short audio samples in the presence of distortions such as poor recording quality and significant ambient noise. Our experiments demonstrate that this approach significantly outperforms the current state of the art in content-based music identification.
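The first step, turning an audio clip into a 2-D image, can be sketched as a short-time Fourier transform. This is a minimal illustration using only the standard library. The frame length, hop size, Hann window, and the function name `spectrogram` are assumptions, and the learned descriptors, indexing, and verification stages described in the abstract are not shown.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of rows (time frames), each holding the magnitudes
    of frequency bins 0..frame_len/2, i.e. a 2-D time-frequency image."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    image = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of one frame at bin k (a real system would use an FFT).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        image.append(row)
    return image
```

Once the clip is in this form, rows and columns play the role of image coordinates, so patch-based descriptors and sub-image matching can be applied to audio in the same way as to pictures.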
In this paper we present an artificial vision algorithm for real-time obstacle detection in unstructured environments. The images have been taken using a stereoscopical vision system. The system uses a new approach, o...