ISBN (print): 9781467379625
Facial expression recognition has been an active research area over the past ten years, with growing application areas such as avatar animation and neuromarketing. Recognizing facial expressions is not an easy problem for machine learning methods, since people vary in the way they show their expressions. Even images of the same person showing the same expression can vary in brightness, background, and position. Facial expression recognition therefore remains a challenging problem in computer vision. In this work, we propose a simple solution for facial expression recognition that combines standard methods, namely a Convolutional Network and specific image pre-processing steps. Convolutional networks, like most machine learning methods, achieve better accuracy depending on the given feature set. We therefore also present a study of image pre-processing operations that extract only the expression-specific features of a face image. The experiments were carried out using a widely used public database for this problem, and we study the impact of each image pre-processing operation on the accuracy rate. To the best of our knowledge, our method achieves the best result in the literature, 97.81% accuracy, and takes less time to train than state-of-the-art methods.
ISBN (print): 9781509035694; 9781509035687
Remote sensing images (RSI) have been used as a major source of data, particularly for the creation of thematic maps. This process is usually modeled as a supervised learning task in which the system learns the patterns of interest provided by the user and assigns a class to the remaining image regions. It is common to have images obtained from different sensors, which could improve the quality of thematic maps. However, this requires techniques that properly encode and combine the different properties of the images. This paper therefore proposes a boosting-based technique for classifying regions in RSI that encodes features extracted from different sources of data and from the spectral and spatial domains. The approach is evaluated in urban and coffee crop recognition scenarios, achieving statistically better results than the baselines in urban classification and better results than some baselines in coffee crop recognition.
ISBN (print): 9781509035694; 9781509035687
In this paper, we present a new approach for dynamic hand gesture recognition that uses intensity, depth, and skeleton joint data captured by a Kinect sensor. The method integrates global and local information about a dynamic gesture. First, we represent the 3D skeleton trajectory in spherical coordinates. Then, we select the most relevant points in the hand trajectory with our proposed keyframe detection method. Afterwards, we represent the joint movements by spatial, temporal, and hand position change information. Next, we use the definition of direction cosines to describe body positions by generating histograms of cumulative magnitudes from the depth data, which is converted into a point cloud. We evaluate our approach on different public gesture datasets and on a sign language dataset that we created. Our results outperform state-of-the-art methods, and the smooth, fast feature extraction makes the approach suitable for real-time implementation.
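The first step, representing a 3D trajectory in spherical coordinates, can be sketched as follows. This is only an illustration under assumptions: the abstract does not state the angle convention or the reference point, so the sketch uses the polar angle measured from the +z axis and a caller-supplied `origin` (for example a reference joint). The function names are hypothetical.

```python
import math

def to_spherical(x, y, z):
    """Convert a Cartesian point to (r, theta, phi).

    r     : distance from the origin
    theta : polar angle measured from the +z axis, in [0, pi]
    phi   : azimuth in the x-y plane measured from the +x axis
    The convention is an assumption; the source does not specify one.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; return zeros by convention.
        return 0.0, 0.0, 0.0
    # Clamp to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    return r, theta, phi

def trajectory_to_spherical(points, origin):
    """Express each 3D point relative to `origin` in spherical coordinates."""
    ox, oy, oz = origin
    return [to_spherical(x - ox, y - oy, z - oz) for x, y, z in points]
```

Calling `trajectory_to_spherical(hand_positions, reference_joint)` on a list of `(x, y, z)` tuples returns one `(r, theta, phi)` tuple per frame.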
ISBN (print): 9781509035694; 9781509035687
We present an object matching method that employs matches of local graphs of keypoints, called keygraphs, instead of simple keypoint matches. For a keygraph match to be valid, the vertex (keypoint) descriptors must be similar and both keygraphs must satisfy structural properties concerning keypoint orientation, scale, relative position, and cyclic ordering; as a result, the large majority of initial incorrect keypoint matches are filtered out. We introduce a novel approach for sampling keypoint triples (i.e., keygraphs) in a query image, based on complementary Delaunay triangulations; the number of triples generated is linear in the number of keypoints. Query keygraphs are then matched against the indexed model keypoints, and each established keygraph match is used to evaluate a candidate pose (an affine transformation). The proposed method has been evaluated for object recognition and pose estimation, achieving better performance than state-of-the-art methods.
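A matched pair of keypoint triples is enough to determine a 2D affine transformation, which is why each keygraph match can propose a candidate pose. The sketch below solves for the six affine parameters from three point correspondences using Cramer's rule. It is a minimal illustration, not the paper's implementation; the function names are hypothetical, and how the candidate pose is then evaluated is not covered here.

```python
def affine_from_triples(src, dst):
    """Return (a, b, c, d, e, f) such that
    x' = a*x + b*y + c  and  y' = d*x + e*y + f
    maps each of the three `src` points onto the matching `dst` point.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    # Determinant of [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]].
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear; no unique affine map")

    def solve(v1, v2, v3):
        # Cramer's rule: replace one column at a time with the target values.
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve(dst[0][0], dst[1][0], dst[2][0])
    d, e, f = solve(dst[0][1], dst[1][1], dst[2][1])
    return a, b, c, d, e, f

def apply_affine(params, point):
    """Apply the affine parameters returned above to a single (x, y) point."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f
```

Because the three source points must not be collinear, a triangle produced by a Delaunay triangulation is a natural input: it always has non-zero area.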
ISBN (print): 9781509035694; 9781509035687
Content-Based Image Retrieval (CBIR) aims to retrieve similar graphical objects from large databases based on their contents. CBIR requires the definition of descriptors, algorithms that condense information from an object in order to represent it, usually as a real number or a vector in R^n. This article presents the Spectral Descriptor, a new descriptor designed for retrieving three-dimensional geometric objects, applied to aid the diagnosis of Congestive Heart Failure (CHF). Our descriptor is based on compressive sensing techniques and rewrites the vertex coordinates of 3D objects in a basis in which they have a sparse representation. Tests with surfaces reconstructed from cardiac MRI images, specifically of the left ventricle, show that the descriptor performs well, reaching an average precision of approximately 85% for CHF cases and 71% for non-CHF cases. The results also show that the Spectral Descriptor can reduce the high dimensionality of feature vectors in CBIR systems.
ISBN (print): 9781467379625
We present a system for creating interactive exploded view diagrams in generalized 3D grids. The primary difference between our approach and existing ones is that our technique requires neither geometrical information about the whole model nor any information about the relationships among model parts; instead, our implementation depends only on which grid cells are considered objects of interest and on which view angle to use. To achieve this, we introduce the Explosion Tree, a data structure closely related to a BSP tree, which supports the exploded view diagram technique based on the relationship between disjoint convex polygons. In this paper we discuss the application of this technique to Corner-Point Grids, which have been used extensively for geological modeling and flow simulation. All the data presented in this work is real data currently used in the industry.
ISBN (print): 9781509035694; 9781509035687
Retinal vessel segmentation is an important step in the detection of numerous systemic diseases, such as glaucoma, diabetic retinopathy, and others. Retinal blood vessel analysis can thus be used to diagnose these diseases and to monitor their progress. Manual segmentation of fundus images is a long and tedious task that requires a specialist, so many algorithms have been developed for this purpose. This paper demonstrates an automated method for retinal blood vessel segmentation based on the combination of topological and morphological vessel extractors. Each extractor relies on different blood vessel features in order to increase detection robustness. The final segmentation is obtained by intersecting the two resulting images, smoothing the vessel borders, and removing the remaining spurious objects. The proposed method is tested on the DRIVE and STARE databases, achieving an average accuracy of 0.9565 and 0.9568, respectively, with good values of sensitivity and specificity.
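The combination step and the reported metrics can be illustrated with a small sketch. It assumes both extractors output binary masks of equal size (nested lists of 0/1 values) and that a ground-truth mask is available; it covers only the pixel-wise intersection and the standard definitions of accuracy, sensitivity, and specificity, not the border smoothing or spurious-object removal. All names are hypothetical.

```python
def intersect_masks(mask_a, mask_b):
    """Pixel-wise AND of two binary masks given as nested lists of 0/1."""
    return [
        [1 if a and b else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(mask_a, mask_b)
    ]

def segmentation_metrics(predicted, truth):
    """Return (accuracy, sensitivity, specificity) for binary masks."""
    tp = tn = fp = fn = 0
    for row_p, row_t in zip(predicted, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1      # vessel pixel correctly detected
            elif p and not t:
                fp += 1      # background pixel marked as vessel
            elif not p and t:
                fn += 1      # vessel pixel missed
            else:
                tn += 1      # background pixel correctly rejected
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity
```

Intersecting the two masks keeps only pixels that both extractors agree on, which tends to favor specificity over sensitivity.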
Electroencephalography (EEG) is a method that records the electrical activity of the brain. Reliable interpretation of its measurements relies on an accurate correspondence between the scalp electrodes and the underlying cortical surface. To standardize comparative studies, the international 10/20, 10/10, and 10/5 electrode placement systems have been proposed to specify the locations of scalp EEG sensors, under the assumption that there is a consistent correlation between these sites and the structure of the cerebral cortex. However, several studies have demonstrated that cranio-cerebral correlations vary greatly. To enhance electrophysiological analyses, this paper presents an algorithm that allows the multimodal visualization of EEG and magnetic resonance scans in the patient's native space. The key to our solution is twofold: an interactive image-based electrode placement algorithm and an extended GPU-based multimodal ray-casting algorithm. Experimental results show that with the presented tool one can easily assess cranio-cerebral correspondences even when the brain tissue is displaced by structural lesions.
We present an efficient ray-tracing technique to render bokeh effects produced by parametric aspheric lenses. In contrast to conventional spherical lenses, aspheric lenses generally do not permit a simple closed-form solution for ray-surface intersections. We propose a numerical root-finding approach that uses tight proxy surfaces to ensure good initialization and convergence behavior. Additionally, we simulate mechanical imperfections resulting from lens fabrication via a texture-based approach. Fractional Fourier transform and spectral dispersion add further realism to the synthesized bokeh effect. Our approach is well suited for execution on graphics processing units (GPUs), and we demonstrate complex defocus-blur and lens-flare effects.
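To make the root-finding idea concrete, the sketch below intersects a ray with a rotationally symmetric aspheric surface using Newton's method. It rests on several assumptions that are not taken from the paper: the surface follows the common even-asphere sag formula with its vertex at the origin and the optical axis along z, the derivative is approximated by central differences, and the initial guess is the ray's intersection with the vertex plane z = 0 rather than the tight proxy surfaces the authors describe. All function and parameter names are hypothetical.

```python
import math

def sag(r2, radius, conic, coeffs):
    """Surface height z at squared radial distance r2 (even-asphere formula).

    z = c*r^2 / (1 + sqrt(1 - (1 + k)*c^2*r^2)) + A4*r^4 + A6*r^6 + ...
    where c = 1/radius, k = conic, and coeffs = (A4, A6, ...).
    """
    c = 1.0 / radius
    arg = 1.0 - (1.0 + conic) * c * c * r2
    if arg < 0.0:
        raise ValueError("point lies outside the valid aperture of the surface")
    z = c * r2 / (1.0 + math.sqrt(arg))
    power = r2 * r2              # r^4
    for a in coeffs:
        z += a * power
        power *= r2              # next even power of r
    return z

def intersect_ray_asphere(origin, direction, radius, conic=0.0, coeffs=(),
                          tol=1e-12, max_iter=50):
    """Return (t, point) where origin + t*direction meets the surface, or None."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if dz == 0.0:
        raise ValueError("ray is parallel to the vertex plane; no initial guess")

    def f(t):
        # Signed axial distance between the ray point and the surface.
        x, y = ox + t * dx, oy + t * dy
        return (oz + t * dz) - sag(x * x + y * y, radius, conic, coeffs)

    t = -oz / dz                 # assumed initial guess: vertex plane z = 0
    h = 1e-6                     # step for the central-difference derivative
    for _ in range(max_iter):
        value = f(t)
        if abs(value) < tol:
            return t, (ox + t * dx, oy + t * dy, oz + t * dz)
        slope = (f(t + h) - f(t - h)) / (2.0 * h)
        if slope == 0.0:
            break
        t -= value / slope       # Newton update
    return None
```

With `conic=0.0` and no higher-order coefficients the surface reduces to a sphere, which gives a convenient check against the closed-form intersection. A better initial guess, such as one from a proxy surface, mainly matters for steep rays and strongly aspheric profiles, where starting from the vertex plane may converge slowly or fail.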
This paper presents a global point cloud descriptor for efficient object recognition and pose estimation. The proposed method estimates a reference frame for the whole point cloud representing an object instance and uses it to align the cloud with the canonical coordinate system. A descriptor is then computed for the aligned point cloud based on how its 3D points are spatially distributed. This descriptor is also extended with the color distribution throughout the aligned point cloud. The global alignment transforms of matched point clouds are used to compute the object pose. The proposed approach was evaluated on a publicly available dataset, showing that it outperforms major state-of-the-art global descriptors in recognition rate and performance and that it allows precise pose estimation.