ISBN: 0818672587 (print)
We present a new framework for recognizing planar object classes, which is based on local feature detectors and a probabilistic model of the spatial arrangement of the features. The allowed object deformations are represented through shape statistics, which are learned from examples. Instances of an object in an image are detected by finding the appropriate features in the correct spatial configuration. The algorithm is robust with respect to partial occlusion, detector false alarms, and missed features. A 94% success rate was achieved for the problem of locating quasi-frontal views of faces in cluttered scenes.
ISBN: 9781665448994 (print)
Self-attention is a cornerstone of transformer models. However, our analysis shows that self-attention in vision transformer inference is extremely sparse. When applying a sparsity constraint, our experiments on image (ImageNet-1K) and video (Kinetics-400) understanding show that we can achieve 95% sparsity on the self-attention maps while keeping the performance drop below 2 points. This motivates us to rethink the role of self-attention in vision transformer models.
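The abstract does not say how the sparsity constraint is imposed. As a minimal sketch of one possible reading, the following assumes a per-row top-k rule: after the softmax, only the largest weights in each row of the attention map are kept and the row is renormalized. The function name `sparsify_attention` and the top-k rule are illustrative assumptions, not the authors' method.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sparsify_attention(scores, sparsity):
    """Zero out the smallest attention weights in each row, then renormalize.

    scores   : list of rows of raw (pre-softmax) attention logits
    sparsity : fraction of entries per row to drop, in [0, 1)
    Assumption: a per-row top-k rule; the source does not specify the constraint.
    """
    out = []
    for row in scores:
        weights = softmax(row)
        # Always keep at least one entry so the row stays a valid distribution.
        keep = max(1, round(len(row) * (1.0 - sparsity)))
        order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        kept = set(order[:keep])
        pruned = [w if i in kept else 0.0 for i, w in enumerate(weights)]
        total = sum(pruned)
        out.append([w / total for w in pruned])
    return out


if __name__ == "__main__":
    # One query attending over four keys, dropping half of the entries.
    print(sparsify_attention([[0.0, 1.0, 2.0, 3.0]], sparsity=0.5))
```

With `sparsity=0.95`, a row of 20 keys would retain a single non-zero weight under this rule.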
ISBN: 9781665448994 (print)
In the context of variational auto-encoders, learning disentangled latent variable representations remains a challenging problem. In this abstract, we consider the semi-supervised setting, in which the factors of variation are labelled for a small fraction of our samples. We examine how the quality of learned representations is affected by the dimension of the unsupervised component of the latent space. We also consider a variational lower bound for the mutual information between the data and the semi-supervised component of the latent space, and analyze its role in the context of disentangled representation learning.
ISBN: 9781665487399 (print)
The aim of this paper is to demonstrate that a state-of-the-art feature matcher (LoFTR) can be made more robust to rotations by simply replacing the backbone CNN with a steerable CNN that is equivariant to translations and image rotations. It is experimentally shown that this boost is obtained without reducing performance on ordinary illumination and viewpoint matching sequences.
ISBN: 0818684976 (print)
A pulsed laser radar (ladar) based object recognition system with applications to automatic target recognition is reported. The approach is to fit the sensed range images to range templates extracted by laser-physics-based simulation of Computer-Aided Design target models. A projection-based pre-screener filters out more than 80 percent of candidate templates. An M-of-N pixel matching scheme for internal shape matching, combined with a silhouette matching scheme, is used for recognition. The system has been blind tested on a data set containing 276 real ladar images of military vehicles at various orientations and ranges. The system achieves above 90 percent recognition accuracy on ladar images with 0.4-meter resolution.
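The abstract names an M-of-N pixel matching scheme without defining it. A minimal sketch, assuming the common reading that a template matches when at least M of its N pixels agree with the sensed range values within a tolerance, might look like this. The function name, the flat pixel lists, and the `tolerance` parameter are hypothetical choices for illustration.

```python
def m_of_n_match(sensed, template, m, tolerance):
    """Return True if at least m of the N template pixels agree with the
    sensed range values to within `tolerance` (same units as the ranges).

    Assumption: both inputs are flat, equally long lists of range values
    that have already been aligned pixel by pixel.
    """
    if len(sensed) != len(template):
        raise ValueError("sensed and template must have the same length")
    hits = sum(1 for s, t in zip(sensed, template) if abs(s - t) <= tolerance)
    return hits >= m


if __name__ == "__main__":
    sensed = [10.0, 10.4, 11.0, 15.0]      # last pixel is an outlier
    template = [10.1, 10.5, 11.1, 11.2]
    print(m_of_n_match(sensed, template, m=3, tolerance=0.2))
    print(m_of_n_match(sensed, template, m=4, tolerance=0.2))
```

Requiring only M of N agreements, rather than all N, is what lets such a test tolerate a few corrupted or occluded pixels.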
ISBN: 9781424439942 (print)
We describe a framework for face recognition at a distance based on sparse-stereo reconstruction. We develop a 3D acquisition system that consists of two CCD stereo cameras mounted on pan-tilt units with adjustable baseline. We first detect the facial region and extract its landmark points, which are used to initialize an AAM mesh fitting algorithm. The fitted mesh vertices provide point correspondences between the left and right images of a stereo pair; stereo-based reconstruction is then used to infer the 3D information of the mesh vertices. We perform experiments regarding the use of different features extracted from these vertices for face recognition. The cumulative rank curves (CMC), which are generated using the proposed framework, confirm the feasibility of the proposed work for long-distance recognition of human faces with respect to the state of the art.
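To illustrate how a left/right point correspondence yields 3D information, here is a minimal sketch of triangulation for the simplest case: an ideal rectified stereo pair with identical pinhole cameras. The abstract does not state that the images are rectified, so this geometry, the function name, and all parameter names are assumptions.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in meters for one matched vertex.

    Assumes a rectified pair: the match lies on the same image row `y`,
    both cameras share focal length `focal_px` (pixels) and principal
    point (cx, cy), and are separated horizontally by `baseline_m`.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (x_left - cx) * z / focal_px        # back-project through the left camera
    y3 = (y - cy) * z / focal_px
    return (x, y3, z)


if __name__ == "__main__":
    print(triangulate_rectified(660.0, 640.0, 500.0,
                                focal_px=1000.0, baseline_m=0.5,
                                cx=640.0, cy=480.0))
```

Because depth is inversely proportional to disparity, a wider baseline gives larger disparities at long range, which is one reason an adjustable baseline is useful for recognition at a distance.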
ISBN: 0818684976 (print)
Two novel variants of Dynamic Link Architecture are developed. Both are based on mathematical morphology and incorporate coefficients that weigh the contribution of each node in elastic graph matching according to its discriminatory power. They are the so-called Morphological Dynamic Link Architecture and the Morphological Signal Decomposition-Dynamic Link Architecture. The proposed variants are tested for face authentication in a cooperative scenario where the candidates claim an identity to be checked. Their performance is evaluated in terms of their Receiver Operating Characteristic and the Equal Error Rate achieved on the M2VTS database. An Equal Error Rate in the range 3.7-6.8% is reported.
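For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a threshold over the observed scores. It assumes similarity scores where higher means "more likely genuine"; the function name and the sample scores are illustrative and unrelated to the M2VTS results.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping thresholds over the observed scores.

    genuine  : scores for true identity claims (higher = more similar)
    impostor : scores for false identity claims
    A claim is accepted when score >= threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for thr in sorted(set(genuine) | set(impostor)):
        frr = sum(1 for s in genuine if s < thr) / len(genuine)     # false rejections
        far = sum(1 for s in impostor if s >= thr) / len(impostor)  # false acceptances
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, thr)
    return best[1], best[2]


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.1, 0.2, 0.3, 0.6]
    print(equal_error_rate(genuine, impostor))
```

With distance-based matching costs, where lower means more similar, the two comparisons would be reversed.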
ISBN: 0818684976 (print)
We analyze the use of kinematic constraints for articulated object tracking. Conditions for the occurrence of singularities in 3-D models are presented and their effects on tracking are characterized. We describe a novel 2-D Scaled Prismatic Model (SPM) for figure registration. In contrast to 3-D kinematic models, the SPM has fewer singularity problems and does not require detailed knowledge of the 3-D kinematics. We fully characterize the singularities in the SPM and illustrate tracking through singularities using synthetic and real examples with 3-D and 2-D models. Our results demonstrate the significant benefits of the SPM in tracking with a single source of video.
ISBN: 0818684976 (print)
We present a method for matching curves which accommodates both large and small deformations. The method preserves geometric similarities in the case of small deformations, and loosens these geometric constraints when large deformations occur. The approach is based on the computation of a set of geodesic paths connecting the curves. These two curves are defined as a source area and a destination area, which can have an arbitrary number of connected components and different topologies. The application framework of the presented method is the study of crustal deformation from a set of iso-elevation curves. An experiment with real curves demonstrates that the approach can be successfully applied to characterize deformation of Digital Elevation Models.
ISBN: 0818684976 (print)
Two approaches for 3D curved object reconstruction using active sensor and illumination control are proposed and compared. In both cases, the highlight information is fully utilized rather than discarded, and knowledge of the object surface is not required. The first approach requires camera control only and recovers shape (depth) from highlights and occluding contours. The second approach requires both camera and illumination control and recovers 3D depth from highlights only.