ISBN (print): 0818672587
The efficiency of pattern recognition is particularly crucial in two scenarios: whenever there are a large number of classes to discriminate, and whenever recognition must be performed a large number of times. We propose a single technique, namely pattern rejection, that greatly enhances efficiency in both cases. A rejector is a generalization of a classifier that quickly eliminates a large fraction of the candidate classes or inputs. This allows a recognition algorithm to dedicate its efforts to a much smaller number of possibilities. Importantly, a collection of rejectors may be combined to form a composite rejector, which is shown to be far more effective than any of its individual components. A simple algorithm is proposed for the construction of each of the component rejectors. Its generality is established through close relationships with the Karhunen-Loeve expansion and Fisher's discriminant analysis. Composite rejectors were constructed for two representative applications, namely appearance-matching-based object recognition and local feature detection. The results demonstrate substantial efficiency improvements over existing approaches, most notably Fisher's discriminant analysis.
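The idea of chaining rejectors can be illustrated with a small sketch. This is not the construction algorithm from the paper: the projection directions, the per-class intervals, and the names `make_projection_rejector` and `compose` are all hypothetical, chosen only to show how each component narrows the candidate set left by the previous one.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]
# A rejector takes an input and a candidate set, and returns the candidates
# it could NOT rule out.
Rejector = Callable[[Vector, Set[str]], Set[str]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def make_projection_rejector(
    direction: Vector, intervals: Dict[str, Tuple[float, float]]
) -> Rejector:
    """Hypothetical rejector: project the input onto `direction` and keep
    only classes whose (assumed, precomputed) interval contains the value."""

    def reject(x: Vector, candidates: Set[str]) -> Set[str]:
        p = dot(direction, x)
        return {c for c in candidates if intervals[c][0] <= p <= intervals[c][1]}

    return reject


def compose(rejectors: List[Rejector]) -> Rejector:
    """Composite rejector: apply components in order, stopping early once
    at most one candidate remains."""

    def reject(x: Vector, candidates: Set[str]) -> Set[str]:
        remaining = set(candidates)
        for r in rejectors:
            if len(remaining) <= 1:
                break
            remaining = r(x, remaining)
        return remaining

    return reject


# Toy data (made up): four classes, two axis-aligned projections.
classes = {"A", "B", "C", "D"}
r1 = make_projection_rejector(
    (1.0, 0.0), {"A": (0, 1), "B": (0, 1), "C": (2, 3), "D": (2, 3)}
)
r2 = make_projection_rejector(
    (0.0, 1.0), {"A": (0, 1), "B": (2, 3), "C": (0, 1), "D": (2, 3)}
)
composite = compose([r1, r2])
survivors = composite((0.5, 2.5), classes)
```

In this toy setup each component alone leaves two candidates, while the composite leaves one, which is the qualitative effect the abstract describes. A full classifier would then only need to examine the survivors.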
ISBN (print): 0780342364
We present algorithms for coupling and training hidden Markov models (HMMs) to model interacting processes, and demonstrate their superiority to conventional HMMs in a vision task classifying two-handed actions. HMMs are perhaps the most successful framework in perceptual computing for modeling and classifying dynamic behaviors, popular because they offer dynamic time warping, a training algorithm, and a clear Bayesian semantics. However, the Markovian framework makes strong restrictive assumptions about the system generating the signal: that it is a single process having a small number of states and an extremely limited state memory. The single-process model is often inappropriate for vision (and speech) applications, resulting in low ceilings on model performance. Coupled HMMs provide an efficient way to resolve many of these problems, and offer superior training speeds, model likelihoods, and robustness to initial conditions.
ISBN (print): 0818672587
We present the Incremental Focus of Attention (IFA) architecture for adding robustness to software-based, real-time, motion trackers. The framework provides a structure which, when given the entire camera image to search, efficiently focuses the attention of the system into a narrow set of possible states that includes the target state. IFA offers a means for automatic tracking initialization and reinitialization when environmental conditions momentarily deteriorate and cause the system to lose track of its target. Systems based on the framework degrade gracefully as various assumptions about the environment are violated. In particular, multiple tracking algorithms are layered so that the failure of a single algorithm causes another algorithm of less precision to take over, thereby allowing the system to return approximate feature state information.
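The layering described above can be sketched as a simple fallback chain. This is only an illustration of the "less precise algorithm takes over" behavior, not the IFA architecture itself: the tracker functions, the frame representation, and the layer names are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Tuple[float, float]            # assumed: an (x, y) target position
Frame = Dict[str, State]               # assumed: precomputed cues per frame
Tracker = Callable[[Frame], Optional[State]]


def precise_tracker(frame: Frame) -> Optional[State]:
    """Hypothetical high-precision layer; fails (None) without an edge match."""
    return frame.get("edge_match")


def coarse_tracker(frame: Frame) -> Optional[State]:
    """Hypothetical low-precision layer; only needs a color blob."""
    return frame.get("color_blob")


def track(
    layers: List[Tuple[str, Tracker]], frame: Frame
) -> Optional[Tuple[str, State]]:
    """Try layers from most to least precise; return the first that succeeds,
    together with its name so callers know how approximate the state is."""
    for name, tracker in layers:
        state = tracker(frame)
        if state is not None:
            return name, state
    return None  # every layer failed: the target is lost


LAYERS = [("precise", precise_tracker), ("coarse", coarse_tracker)]
good_frame = {"edge_match": (120.0, 64.0), "color_blob": (118.0, 60.0)}
degraded_frame = {"color_blob": (119.0, 61.0)}
```

Returning the layer name alongside the state is one way to expose the graceful degradation to the caller; whether the original system reports precision this way is not stated in the abstract.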
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
Given a machine learning model, adversarial perturbations transform images such that the model classifies them as an attacker-chosen class. Most research in this area has focused on adversarial perturbations that are imperceptible to the human eye. However, recent work has considered attacks that are perceptible but localized to a small region of the image. Under this threat model, we discuss both defenses that remove such adversarial perturbations and attacks that can bypass these defenses.
ISBN (print): 0818672587
Motion of an observer relative to objects in a scene provides information about the structure of the scene. Changing patterns of shading due to motion relative to the light source provide information about surface structure, albedos, and light sources. One can stratify this photometric information into affine, unitary, and metric structure, much like the stratification of structure from motion [1]. For Lambertian surfaces, if either motion or photometry gives us more than affine structure, the two cues can be combined to yield full metric information. Edge constraints plus unitary photometry also give us full metric photometry. Affine structure alone contains much of the quantitative structure information, allowing us to judge such things as the ordinal relationships between the albedos.
ISBN (print): 0818684976
To create a more realistic soccer game derived from TV images, we developed an image synthesis system that generates an image sequence from the viewpoint of a player on the field. This system is based on camera calibration theory. The system first determines the camera parameters of a TV image by using the intersection points of the white lines drawn on the soccer field. It then extracts players from each image and estimates their positions in the world coordinate system. Finally, it applies a running motion to the players in their respective positions and generates computer graphics animation from the viewpoint of any player selected by a user. The system was tested over seven sequences of TV images and demonstrated satisfactory results.
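One common way to relate line intersections on a planar field to image pixels is a plane-to-image homography fitted from four correspondences. The sketch below assumes that simplification; the abstract does not say which calibration model the system uses, and the point coordinates, `H_true`, and function names are invented for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Gauss-Jordan elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_homography(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """Fit H (with h33 fixed to 1) so that H maps each src point to its dst
    point, from exactly four correspondences."""
    if len(src) != 4 or len(dst) != 4:
        raise ValueError("exactly four correspondences are expected")
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


# Made-up ground truth, used only to synthesize consistent test data.
H_true = [[20.0, 2.0, 100.0], [1.0, 15.0, 50.0], [0.001, 0.002, 1.0]]
field_corners = [(0.0, 0.0), (16.0, 0.0), (16.0, 40.0), (0.0, 40.0)]
image_corners = [apply_homography(H_true, p) for p in field_corners]

# Fit the image-to-field mapping directly by swapping source and destination.
H_image_to_field = estimate_homography(image_corners, field_corners)
player_pixel = apply_homography(H_true, (5.0, 7.0))
player_on_field = apply_homography(H_image_to_field, player_pixel)
```

Fitting the image-to-field direction directly avoids a separate matrix inversion. With real footage, more than four intersections and a least-squares fit would normally be preferable, since detected line intersections are noisy.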
ISBN (print): 0818684976
Different instances of a handwritten word consist of the same basic features (humps, cusps, crossings, etc.) arranged in a deformable spatial pattern. Thus, keywords in cursive text can be detected by looking for the appropriate features in the "correct" spatial configuration. A keyword can be modeled hierarchically as a set of word fragments, each of which consists of lower-level features. To allow flexibility, the spatial configuration of keypoints within a fragment is modeled using a Dryden-Mardia (DM) probability density over the shape of the configuration. In a writer-dependent test on a transcription of the Declaration of Independence (about 1300 words and 7500 characters), the method detected all eleven instances of the keyword "government" with only four false positives.
ISBN (print): 0818672587
We study the problem of estimating rigid motion from a sequence of monocular perspective images obtained by navigating around an object while fixating a particular feature point. We cast the problem in the framework of "epipolar geometry", and propose a filter based upon an implicit dynamical model for recursively estimating motion under the fixation constraint. This allows us to compare the quality of the estimates directly against the ones obtained assuming a general rigid motion simply by changing the geometry of the parameter space, while maintaining the same structure of the recursive estimator. We also present a closed-form static solution from two views, and a recursive estimator of the relative pose between the viewer and the scene.
ISBN (print): 0818672587
We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training the networks, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images. Comparisons with other state-of-the-art face detection systems are presented; our system has better performance in terms of detection and false-positive rates.
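The bootstrap loop described above (train, scan non-face data, add false detections as negatives, retrain) can be sketched with a deliberately trivial stand-in classifier. The one-dimensional threshold model, the scores, and the function names below are assumptions for illustration only; the paper trains neural networks on image windows.

```python
from typing import List, Tuple


def train_threshold(positives: List[float], negatives: List[float]) -> float:
    """Stand-in 'training': place the decision threshold midway between the
    lowest face score and the highest non-face score seen so far."""
    return (min(positives) + max(negatives)) / 2.0


def bootstrap_train(
    positives: List[float],
    initial_negatives: List[float],
    nonface_pool: List[float],
    max_rounds: int = 10,
) -> Tuple[float, List[float]]:
    """Retrain repeatedly, each round adding the pool items that the current
    model wrongly detects as faces to the negative training set."""
    negatives = list(initial_negatives)
    threshold = train_threshold(positives, negatives)
    for _ in range(max_rounds):
        false_detections = [
            x for x in nonface_pool if x >= threshold and x not in negatives
        ]
        if not false_detections:
            break  # the model no longer fires on any non-face sample
        negatives.extend(false_detections)
        threshold = train_threshold(positives, negatives)
    return threshold, negatives


# Toy scores (made up): faces score high, non-faces mostly low.
face_scores = [0.8, 0.9, 1.0]
seed_negatives = [0.1]
nonface_pool = [0.2, 0.5, 0.6, 0.3, 0.7]

final_threshold, used_negatives = bootstrap_train(
    face_scores, seed_negatives, nonface_pool
)
```

Only the non-face samples that actually fool the current model end up in the training set, so the negative set stays small while still covering the cases near the decision boundary.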
ISBN (print): 0818672587
Many presume that parsing the shadows out of an image is a high-level task, because of the global nature of the shadow formation process. But shape-from-shading algorithms are low-level, in the sense that they seek solutions (surface normals or depth values) directly from image intensities. A dilemma arises: since shape-from-shading involves an illumination term, shadows must first be identified. We show that a structure intermediate between intensities and surfaces - the shading flow field - provides a solution to this dilemma. Our analysis is based on the observation that the geometric information that can be derived from images supports different inferences than the photometric information, and our specific goal will be to articulate this geometric structure and to show how shading flow fields can be reliably computed.