ISBN: 9781424423392 (print)
For many applications, and particularly for medical intra-operative applications, the exploration of and navigation through 3-D image data provided by sensors such as ToF (Time-of-Flight) cameras, MUSTOF (Multisensor-Time-of-Flight) endoscopes or CT (Computed Tomography) [8] requires a user interface that avoids physical interaction with an input device. Thus, we propose a touchless user interface based on gestures classified from the data provided by a ToF camera. Reasonable and necessary user interactions are described. For those interactions a suitable set of gestures is introduced. A user interface is then proposed, which interprets the current gesture and performs the assigned functionality. To evaluate the quality of the developed user interface, we considered classification rate, real-time applicability, usability, intuitiveness and training time. The results of our evaluation show that our system, which provides a classification rate of 94.3% at a frame rate of 11 frames per second, satisfactorily addresses all these quality requirements.
ISBN: 9781424423392 (print)
Latent fingerprint identification is of critical importance to law enforcement agencies in forensic applications. While tremendous progress has been made in the field of automatic fingerprint matching, latent fingerprint matching continues to be a difficult problem because the challenges involved in latent print matching are quite different from those in plain or rolled fingerprint matching. Poor quality of friction ridge impressions, small finger area and large non-linear distortion are some of the main difficulties in latent fingerprint matching. We propose a system for matching latent images to rolled fingerprints that takes into account the specific characteristics of the latent matching problem. In addition to minutiae, features such as the orientation field and the quality map are also used in our system. Experimental results on the NIST SD27 latent database indicate that adding the orientation field and quality map to minutiae-based matching leads to good recognition performance despite the inherently difficult nature of the problem. We achieve a rank-20 accuracy of 93.4% in retrieving 258 latents from a background database of 2,258 rolled fingerprints.
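The rank-20 figure quoted above is a rank-k identification rate: the share of latent probes whose true mate appears among the k best-scoring gallery candidates. The sketch below illustrates that metric only. The function name `rank_k_accuracy`, the score-matrix layout and the toy scores are assumptions made for illustration, and nothing here reproduces the authors' matcher or the NIST SD27 data.

```python
def rank_k_accuracy(score_matrix, true_indices, k):
    """Fraction of probes whose true mate is among the k top-scoring gallery entries.

    score_matrix[i][j] is a similarity score between probe i and gallery entry j
    (higher means more similar); true_indices[i] is the gallery index of the
    true mate of probe i. Both layouts are assumptions for this sketch.
    """
    hits = 0
    for scores, true_idx in zip(score_matrix, true_indices):
        # Order gallery indices from highest to lowest similarity.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if true_idx in ranked[:k]:
            hits += 1
    return hits / len(true_indices)


# Toy example: 3 probes against a gallery of 3 entries (made-up scores).
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.5],
    [0.6, 0.7, 0.1],
]
toy_mates = [0, 2, 0]
rank1 = rank_k_accuracy(toy_scores, toy_mates, k=1)
rank2 = rank_k_accuracy(toy_scores, toy_mates, k=2)
```

In this toy data only the first probe has its mate at rank 1, while all three have it within the top two, which is why rank-k accuracy can only stay level or rise as k grows.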
ISBN: 9781424423392 (print)
We identify the social relationships between individuals in consumer photos. Consumer photos generally do not contain a random gathering of strangers but rather groups of friends and families. Detecting and identifying these relationships are important steps towards understanding consumer image collections. Similar to the approach that a human might use, we use a rule-based system to quantify the domain knowledge (e.g. children tend to be photographed more often than adults; parents tend to appear with their kids). The weight of each rule reflects its importance in the overall prediction model. Learning and inference are based on a sound mathematical formulation using the theory developed in the area of statistical relational models. In particular, we use the language called Markov Logic [14]. We evaluate our model using cross-validation on a set of about 4500 photos collected from 13 different users. Our experiments show the potential of our approach: it improves accuracy (as well as other statistical measures) over different baselines on two relationship prediction tasks. We conclude with directions for future work.
ISBN: 9781424423392 (print)
In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query and retrieval with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first component is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar, where we define a visual vocabulary from pixels, primitives, parts, objects and scenes, and specify their spatio-temporal or compositional relations; a bottom-up top-down strategy is used for inference. The second component is an event inference engine, where the Video Event Markup Language (VEML) is adopted for semantic representation, and a grammar-based approach is used for event analysis and detection. The third component is the text generation engine, which generates a text report using head-driven phrase structure grammar (HPSG). The main contribution of this paper is a framework for an end-to-end system that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
ISBN: 9781424423392 (print)
3D face reconstruction from a single 2D image is mathematically ill-posed. However, a variety of methods have been proposed to solve ill-posed problems in computer vision; some of the solutions estimate latent information or apply model-based approaches. In this paper, we propose a novel method to reconstruct a 3D face from a single 2D face image based on pose estimation and a deformable model of 3D face shape. For 3D face reconstruction from a single 2D face image, the first task is to estimate the depth lost in the 2D projection of 3D faces. Applying the EM algorithm to facial landmarks in a 2D image, we propose a pose estimation algorithm to infer the pose parameters of rotation, scaling, and translation. After estimating the pose, much denser points are interpolated between the landmark points by a 3D deformable model and barycentric coordinates. As opposed to previous literature, our method can locate facial feature points automatically in a 2D facial image. Moreover, we also show that the proposed method for pose estimation can be successfully applied to 3D face reconstruction. Experiments demonstrate that our approach can produce reliable results for reconstructing photorealistic 3D faces.
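The abstract does not say how barycentric coordinates are used to densify points between landmarks, so the following is only a generic sketch of the underlying idea: express a 2D point as weights over a triangle of landmarks, then reuse those weights to interpolate the corresponding 3D vertices. The function names `barycentric_coords` and `interpolate_3d`, and all coordinates, are hypothetical and chosen for illustration.

```python
def barycentric_coords(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    # The three weights always sum to one.
    return w_a, w_b, 1.0 - w_a - w_b


def interpolate_3d(weights, va, vb, vc):
    """Blend three 3D vertices using the given barycentric weights."""
    w_a, w_b, w_c = weights
    return tuple(w_a * x + w_b * y + w_c * z for x, y, z in zip(va, vb, vc))


# Hypothetical landmark triangle in the image plane and its 3D counterparts.
weights = barycentric_coords((0.25, 0.25), (0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
dense_point = interpolate_3d(weights, (0.0, 0.0, 0.0), (1.0, 0.0, 2.0), (0.0, 1.0, 4.0))
```

Because the weights depend only on the point's position inside the triangle, the same weights can be applied to any per-vertex attribute, here the 3D positions of the three landmarks.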
Research on 3D face recognition has been intensified in recent years. However, most research has focused on the 3D static data analysis. In this paper, we investigate the face recognition problem using dynamic 3D face...
Graph cuts have become a powerful and popular optimization tool for energies defined over an MRF and have found applications in image segmentation, stereo vision, image restoration, etc. The maxflow/mincut algorithm to...
This paper deals with the study of various implementations of the AdaBoost algorithm in order to address the issue of real-time pedestrian detection in images. We use gradient-based local descriptors and we combine th...
Vision-based Human Robot Interaction (HRI) in a crowded scene is a challenging research problem. The aim of this paper is to provide a reliable framework for simple gesture recognition for robotic navigation under par...