Facial expression recognition has been an emerging research area over the last two decades. This paper proposes a new hybrid system for automatic facial expression recognition. The proposed method utilizes histograms of ori...
Aesthetics is concerned with the beauty and art of things in the world. Judging the aesthetics of images is a highly subjective task. Recently, deep learning-based approaches have achieved great success in image aesth...
ISBN: 9789898533524 (print)
This paper presents the idea of multimodal human aerobotic interaction. An overview of the aerobotic system and its application is given. The joystick-based controller interface and its limitations are discussed. Two techniques are suggested as emerging alternatives to the joystick-based controller interface used in human aerobotic interaction. The first technique is a multimodal combination of speech, gaze, gesture, and other non-verbal cues already used in regular human-human interaction. The second is telepathic interaction via brain-computer interfaces. The potential limitations of these alternatives are highlighted, and considerations for further work are presented.
ISBN: 9781467399616 (print)
Convolutional neural networks (CNNs) are widely used in computer vision, especially in image classification. However, the way in which information and invariance properties are encoded in deep CNN architectures is still an open question. In this paper, we propose to modify the standard convolutional block of a CNN in order to transfer more information layer after layer while keeping some invariance within the network. Our main idea is to exploit both positive and negative high scores obtained in the convolution maps. This behavior is obtained by modifying the traditional activation function step before pooling. We double the maps using specific activation functions, a strategy we call MaxMin, to build our pipeline. Extensive experiments on two classical datasets, MNIST and CIFAR-10, show that our deep MaxMin convolutional net outperforms a standard CNN.
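The idea of keeping both strongly positive and strongly negative convolution responses can be illustrated with a minimal sketch. The abstract does not give the exact activation functions, so the code below assumes one plausible reading: each convolution map is duplicated, with one copy rectified as-is and the other rectified after negation, before max pooling. The names `maxmin_activation` and `max_pool_2x2` are hypothetical and not taken from the paper.

```python
def relu(value):
    """Standard rectifier: keep positive values, zero out the rest."""
    return value if value > 0.0 else 0.0


def maxmin_activation(feature_maps):
    """Double the maps: one rectified copy of x and one of -x.

    Assumption: this is one plausible reading of "doubling the maps",
    not the paper's confirmed formulation.
    feature_maps is a list of 2D maps, each a list of rows of floats.
    """
    positive = [[[relu(v) for v in row] for row in fmap] for fmap in feature_maps]
    negative = [[[relu(-v) for v in row] for row in fmap] for fmap in feature_maps]
    return positive + negative


def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling; assumes even height and width."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# One 2x2 convolution map whose strongest response is negative (-3.0).
conv_map = [[1.0, -3.0], [0.5, -0.2]]
pooled = [max_pool_2x2(m) for m in maxmin_activation([conv_map])]
print(pooled)
```

With a plain rectifier followed by pooling, only the value 1.0 would survive from this map; in the sketch, the second pooled map also carries the magnitude of the strong negative response.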
ISBN: 9781509055104 (print)
We present the results of analyzing gait motion in first-person video taken from a commercially available wearable camera embedded in a pair of glasses. The video is analyzed with three different computer vision methods to extract motion vectors from different gait sequences from four individuals for comparison against a manually annotated ground truth dataset. Using a combination of signal processing and computer vision techniques, gait features are extracted to identify the walking pace of the individual wearing the camera and are validated against the ground truth dataset. Our preliminary results indicate that the extraction of activity from video in a controlled setting shows strong promise for use in different activity monitoring applications, such as eldercare environments and the monitoring of chronic healthcare conditions.
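The abstract does not say which signal processing step recovers walking pace. As a rough illustration only, the sketch below assumes the motion vectors have already been reduced to one value per frame (for example, mean vertical motion) and estimates the dominant gait period by autocorrelation. The function names, the lag bounds, and the synthetic signal are all hypothetical.

```python
import math


def dominant_period(signal, min_lag, max_lag):
    """Return the lag (in frames) with the highest mean autocorrelation.

    Assumption: the signal is roughly periodic and the true period lies
    inside [min_lag, max_lag].
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        score = sum(centered[i] * centered[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def cycles_per_minute(signal, fps, min_lag, max_lag):
    """Convert the dominant period in frames to gait cycles per minute."""
    return 60.0 * fps / dominant_period(signal, min_lag, max_lag)


# Synthetic per-frame motion signal: one cycle every 15 frames at 30 fps.
fps = 30
motion = [math.sin(2.0 * math.pi * i / 15.0) for i in range(150)]
pace = cycles_per_minute(motion, fps, min_lag=5, max_lag=25)
print(pace)
```

Whether one cycle corresponds to a step or a full stride depends on which motion component is tracked, so the result should be read as a repetition rate rather than a calibrated step count.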
Pathological examination is the most accurate method for the diagnosis of cancer. Breast cancer histopathology evaluation analyses the chemical and cellular characteristics of the cells of a suspicious breast tumor. A...
Animals are a common sight on roads in India, which leads to several accidents between them and vehicles every year. This makes it vital to develop a support system for driverless vehicles that assists in prev...
ISBN: 9781467399616 (print)
This paper investigates precise pupil center localization in low-resolution images. Because it is an essential preprocessing step in many applications such as gaze estimation, face alignment, and human-computer interaction, robust, precise, and efficient methods are necessary. We present a method for accurate eye center localization that operates on images from simple off-the-shelf hardware such as webcams. The proposed method utilizes the isophote representation, which allows pupil center candidates to be found by introducing a novel voting mechanism for pixel weights. To cope with multiple local maxima resulting from the isophote voting map, we combine this information with quasi-continuous responses of a modified cascade classifier framework utilizing appearance-based features. We conduct experiments on the BioID database and show that the presented method outperforms the results of existing methods within an error range of the pupil diameter while running at 10 fps on a standard 3.3 GHz CPU in a Matlab implementation.
ISBN: 9781467399616 (print)
We extract 3D curbs from a video sequence, using a single camera equipped with a fish-eye lens and located at the front or rear of the vehicle. The challenge in extracting curbs from images lies in their small size and their lack of texture. We show that by appropriately exploiting appearance features, 3D geometry, and temporal information, one can reliably detect and localize the curbs in the 3D scene. The main underlying assumption of our model is that the road surface is flat and that the curb is approximately orthogonal to the road plane. We collected nine videos with ground truth, under daytime sunny weather conditions, at up to 2 m range. Our experimental results compare favorably with the current state of the art on our database: a 90% precision rate on average and over 85% accuracy in curb height estimation.
Effective feature matching between images is key to many computer vision applications. Recently, binary descriptors have been attracting increasing attention for their low computational complexity and small memory require...