ISBN: 9781450366151 (print)
We present a reduced model based on position-based dynamics for real-time simulation of human musculature. We demonstrate our method on the muscles of the human arm. Co-simulation of all the muscles of the human arm allows us to accurately track the development of stresses and strains in the muscles as the arm moves. We evaluate the accuracy of our method by comparing it with gold-standard simulation models based on finite volume methods, and we demonstrate its stability under flexion, extension, and torsion.
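To make the position-based dynamics idea concrete, the following is a minimal sketch of a single distance-constraint projection between two particles, the basic building block of generic position-based dynamics. It is not the reduced muscle model of the abstract; the function name, the inverse masses `w1` and `w2`, and the `stiffness` parameter are assumptions chosen for illustration.

```python
import numpy as np

def project_distance_constraint(p1, p2, w1, w2, rest_length, stiffness=1.0):
    """Move two particles so that their separation approaches rest_length.

    w1 and w2 are inverse masses (hypothetical values); a particle with
    inverse mass 0 is treated as fixed.
    """
    delta = p2 - p1
    dist = np.linalg.norm(delta)
    if dist < 1e-12 or (w1 + w2) == 0.0:
        return p1, p2  # degenerate case: nothing to correct
    # Constraint C = |p2 - p1| - rest_length; its gradient is the unit direction.
    correction = stiffness * (dist - rest_length) / (w1 + w2) * (delta / dist)
    return p1 + w1 * correction, p2 - w2 * correction

# Two particles stretched to length 2 with a rest length of 1.
p1 = np.array([0.0, 0.0, 0.0])
p2 = np.array([2.0, 0.0, 0.0])
q1, q2 = project_distance_constraint(p1, p2, w1=1.0, w2=1.0, rest_length=1.0)
```

With equal inverse masses and full stiffness, both particles move by the same amount toward each other and the constraint is satisfied after one projection. A full solver would iterate such projections over many constraints per time step.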
Cosmetic makeup is an art form that people often use to enhance their appearance and express themselves. Applying makeup is tedious, so it is useful to preview oneself in makeup before physically applying it. In this work, we demonstrate the transfer of makeup from a reference makeup image to a subject image. Our technique involves generating and using 3D face models of these images. Notable features of our pipeline include: (1) the process adapts to the lighting conditions of the subject image; (2) after makeup is applied to the subject image, the result can be viewed under different indoor and outdoor lighting conditions; and (3) accessories can be added realistically with proper lighting so that they fit naturally into the subject's scene. Experiments show that these features also make our makeup results look very realistic and improved compared to many existing frameworks. The main advantage of our novel pipeline is that it requires only the subject and reference images as input, unlike other techniques using 3D models, which require extensive data collection.
Describing the contents of an image automatically has been a fundamental problem in artificial intelligence and computer vision. Existing approaches are either top-down, starting from a simple representation of an image and converting it into a textual description; bottom-up, deriving attributes that describe numerous aspects of an image to form the caption; or a combination of both. Recurrent neural networks (RNNs) enhanced by Long Short-Term Memory (LSTM) networks have become a dominant component of several frameworks designed for the image captioning task. Despite their ability to reduce the vanishing gradient problem and capture dependencies, they are inherently sequential across time. In this work, we propose two novel approaches, one top-down and one bottom-up, each of which dispenses with recurrence entirely by using a Transformer, a network architecture for generating sequences that relies entirely on the attention mechanism. We introduce adaptive positional encodings for the spatial locations in an image and a new regularization cost during training. We visually demonstrate the ability of our model to focus automatically on salient regions in the image. Experimental evaluation of the proposed architecture on the MS-COCO dataset is performed to exhibit the superiority of our method.
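The attention mechanism that a Transformer relies on can be sketched as standard scaled dot-product attention. The example below is only an illustration of that generic operation, assuming hypothetical region features and word queries; it does not reproduce the adaptive positional encodings or the regularization cost described in the abstract.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Return attended values and the attention weights (one row per query)."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights

rng = np.random.default_rng(0)
region_features = rng.normal(size=(5, 8))  # 5 hypothetical image regions, 8-dim each
word_queries = rng.normal(size=(3, 8))     # 3 hypothetical caption positions
context, attn = scaled_dot_product_attention(
    word_queries, region_features, region_features
)
```

Each row of `attn` is a probability distribution over image regions, which is the kind of quantity one would visualize to see which regions a caption word attends to. All positions are processed in one matrix product, with no recurrence over time.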
Sign language is the most expressive form of communication that speech- and hearing-impaired people can use to communicate with hearing people, but hearing people cannot understand sign language. So in order to break this barrie...
Egocentric activity recognition (EAR) is an emerging area of computer vision research. Motivated by the current success of Convolutional Neural Networks (CNNs), we propose a multi-stream CNN for multimodal egocentric activity recognition using visual streams (RGB videos) and sensor streams (accelerometer, gyroscope, etc.). To effectively capture the spatio-temporal information contained in RGB videos, two types of modalities are extracted from the visual data: the Approximate Dynamic Image (ADI) and the Stacked Difference Image (SDI). These image-based representations are generated at both the clip level and the entire-video level, and are then used to fine-tune a pretrained 2D-CNN called MobileNet, which is specifically designed for mobile vision applications. Similarly, for sensor data, each training sample is divided into three segments, and a deep 1D-CNN is trained from scratch for each type of sensor stream. During testing, the softmax scores of all the streams (visual and sensor) are combined by late fusion. Experiments performed on a multimodal egocentric activity dataset demonstrate that our proposed approach can achieve state-of-the-art results, outperforming the current best handcrafted and deep learning based techniques.
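Late fusion of per-stream softmax scores can be sketched as below. The abstract does not state how the scores are combined, so the weighted average used here (uniform by default) is an assumption, as are the three toy streams and their logits.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def late_fusion(stream_scores, weights=None):
    """Combine per-stream class probabilities by a weighted average.

    Averaging is an assumed fusion rule; the source only says "late fusion".
    """
    scores = np.stack(stream_scores)  # shape: (n_streams, n_classes)
    if weights is None:
        weights = np.full(len(stream_scores), 1.0 / len(stream_scores))
    fused = np.tensordot(weights, scores, axes=1)  # shape: (n_classes,)
    return int(np.argmax(fused)), fused

# Hypothetical logits from one visual stream and two sensor streams.
visual = softmax(np.array([2.0, 1.0, 0.1]))
accelerometer = softmax(np.array([0.5, 2.5, 0.3]))
gyroscope = softmax(np.array([0.2, 2.0, 0.1]))
label, fused = late_fusion([visual, accelerometer, gyroscope])
```

In this toy case the visual stream alone would vote for class 0, while the fused scores favor class 1 because both sensor streams agree on it.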
In this work, we propose a computationally efficient, compressive sensing based approach for very low bit rate lossy coding of hyperspectral (HS) image data that exploits the redundancy inherent in this imaging modality. We divide the HS datacube into subsets of adjacent bands, each of which is encoded into a coded snapshot using a random code matrix. These coded snapshot images are then compressed using the wavelet-based SPIHT technique. At the receiver, the coded snapshots are decompressed using orthogonal matching pursuit with the help of an overcomplete dictionary learned on a general-purpose training dataset. We provide ample experimental results and performance comparisons to substantiate the usefulness of the proposed method. In the proposed technique, the encoder does not contain any decoder, which offers a significant saving in computation while still yielding a much higher compression quality.
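The orthogonal matching pursuit step used at the receiver can be sketched generically as follows. This is a textbook version on a toy dictionary, assuming a known sparsity level `n_nonzero`; the learned dictionary, the random code matrix, and the SPIHT stage of the abstract are not modeled.

```python
import numpy as np

def orthogonal_matching_pursuit(dictionary, signal, n_nonzero):
    """Greedy sparse coding: find coeffs with dictionary @ coeffs close to signal."""
    residual = signal.astype(float).copy()
    support = []
    solution = np.zeros(0)
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        sub = dictionary[:, support]
        solution, *_ = np.linalg.lstsq(sub, signal, rcond=None)
        residual = signal - sub @ solution
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coeffs[support] = solution
    return coeffs

# Toy overcomplete dictionary: 3-dimensional signals, 4 unit-norm atoms.
dictionary = np.hstack([np.eye(3), np.ones((3, 1)) / np.sqrt(3.0)])
signal = np.array([2.0, 0.0, -1.0])
coeffs = orthogonal_matching_pursuit(dictionary, signal, n_nonzero=2)
```

Here the signal is an exact combination of the first and third atoms, so two greedy steps recover it. On real data the recovery quality depends on the dictionary and the sparsity of the signal.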
In practice, images can contain different amounts of noise in different color channels, which existing super-resolution approaches do not account for. In this paper, we propose to super-resolve noisy color images by considering the color channels jointly. Noise statistics are blindly estimated from the input low-resolution image and are used to assign different weights to different color channels in the data cost. The implicit low-rank structure of visual data is enforced via nuclear norm minimization with adaptive weights, which is added to the cost as a regularization term. Additionally, multi-scale details of the image are added to the model through another regularization term that involves projection onto a PCA basis constructed from similar patches extracted across different scales of the input image. The results demonstrate the super-resolving capability of the approach in real scenarios.
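A common way to impose a weighted nuclear norm penalty is to shrink the singular values of a matrix by per-value weights. The sketch below shows that shrinkage step in isolation; the weights, the toy matrix, and the function name are assumptions, and the abstract's adaptive weighting scheme and full cost function are not reproduced.

```python
import numpy as np

def weighted_singular_value_threshold(matrix, weights):
    """Soft-threshold each singular value by its own weight.

    weights must have one entry per singular value (hypothetical values here).
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - weights, 0.0)
    # Scale the columns of u by the shrunken singular values, then recombine.
    return (u * s_shrunk) @ vt

# Toy "patch matrix" with one dominant, one moderate, and one weak component.
patches = np.diag([5.0, 1.0, 0.2])
weights = np.array([0.5, 0.5, 0.5])
low_rank = weighted_singular_value_threshold(patches, weights)
```

The weakest component falls below its threshold and is removed, so the result has lower rank than the input while the dominant structure is only slightly attenuated.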
Face Recognition (FR) under adversarial conditions has been a major challenge for researchers in the computer vision and machine learning communities in the recent past. Most state-of-the-art face recognition systems have been designed to overcome degradations of a face due to variations in pose, illumination, contrast, and resolution, along with blur. However, none have addressed the issue of makeup as a spoof attack, which drastically changes the appearance of a face, making it difficult even for humans to detect and identify the impostor. In this paper, we propose a novel multi-component deep convolutional neural network (CNN) based architecture that performs the complex task of removing makeup from a disguised face to reveal the original mugshot image of the impostor (i.e., without makeup). By minimizing a novel multi-component objective function, the proposed network also performs the hard task of FR on a disguised face, in addition to recognizing the identity and generating the face of the spoofed target. A performance comparison with a few recent state-of-the-art FR methods on three benchmark datasets reveals the superiority of our proposed method for both synthesis and recognition (FR) tasks.
Machine learning models are known to be susceptible to small but structured changes to their inputs that can result in wrong inferences. It has been shown that such samples, called adversarial samples, can be created rather easily for standard neural network architectures. These adversarial samples pose a serious threat to deploying state-of-the-art deep neural network models in the real world. We propose a feature augmentation technique called BatchOut to learn models that are robust to such examples. The proposed approach is a generic feature augmentation technique that is not specific to any adversary and handles multiple attacks. We evaluate our algorithm on benchmark datasets and architectures to show that models trained using our method are less susceptible to adversaries created using multiple methods.
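To illustrate how easily an adversarial sample can be constructed, the sketch below applies the fast gradient sign method to a toy logistic-regression classifier. This shows only one well-known attack, not BatchOut itself; the weights, the input, and the step size `epsilon` are made-up values for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(weights, bias, x):
    """Binary prediction of a logistic-regression model."""
    return int(sigmoid(weights @ x + bias) >= 0.5)

def fgsm(weights, bias, x, label, epsilon):
    """Perturb x by epsilon in the direction that increases the logistic loss."""
    # Gradient of the cross-entropy loss with respect to the input x.
    grad_x = (sigmoid(weights @ x + bias) - label) * weights
    return x + epsilon * np.sign(grad_x)

weights = np.array([1.0, -2.0, 0.5])  # hypothetical trained parameters
bias = 0.0
x = np.array([0.2, -0.1, 0.4])        # correctly classified as class 1
x_adv = fgsm(weights, bias, x, label=1, epsilon=0.3)

original_prediction = predict(weights, bias, x)
adversarial_prediction = predict(weights, bias, x_adv)
```

Each input coordinate changes by at most `epsilon`, yet the prediction flips from class 1 to class 0. Defenses of the kind described in the abstract aim to reduce this sensitivity.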
Handwritten character recognition is an important problem in pattern recognition and machine learning research. In recent years, several techniques for handwritten character recognition have been propo...