ISBN: (print) 9789897583971
Traffic sign recognition is crucial for self-driving cars and Advanced Driver Assistance Systems. As a vehicle moves within a region or across regions, it encounters a variety of signs that need to be recognized with very high accuracy. Traffic signs are generally observed to have large intra-class variability and small inter-class variability, which makes the visual distinguishability between distinct classes extremely irregular. In this paper we propose a hierarchical classifier in which the number of coarse classes is determined automatically. This gives the advantage of dedicated classifiers trained for the classes that are more difficult to distinguish. This is an application-oriented work that systematically combines machine learning and computer vision algorithms, with the required modifications, to design a fully automated hierarchical classification framework for traffic sign recognition. The proposed solution is a real-time, scalable, machine-learning-based approach that efficiently handles wide intra-class variations without extracting handcrafted features beforehand. It eliminates the need to manually observe and group relevant features, thereby reducing human time and effort. The classifier's accuracy surpasses that achieved by humans on the publicly available GTSRB traffic sign dataset, with fewer parameters than existing solutions.
Fall detection holds immense importance in the field of healthcare, where timely detection allows for instant medical assistance. In this context, we propose a 3D ConvNet architecture which consists of 3D Inception mo...
ISBN: (print) 9781450366151
In this paper, a robust image hashing framework is presented using the discrete cosine transform and singular value decomposition. First, the input image is normalized using geometric moments, and the normalized coefficients are divided into non-overlapping blocks. Blocks selected by a piecewise non-linear chaotic map are transformed using the discrete cosine transform followed by singular value decomposition. A feature matrix is then constructed based on the Hessian matrix, and the final hash values are obtained. The proposed hashing system is resilient to different content-preserving image distortions such as geometric and filtering operations. The simulation results demonstrate the efficiency of the proposed framework in terms of security and robustness.
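The block-wise transform stage described above can be illustrated with a minimal sketch. It is not the authors' method: the keyed block selection uses a seeded NumPy generator as a stand-in for the piecewise non-linear chaotic map, the moment-based normalization and the Hessian-based feature matrix are omitted, and the low-frequency cut-off, the median binarization, and the names `dct_matrix` and `block_hash` are all assumptions made for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)
    return c

def block_hash(image, block=8, low=4, n_blocks=32, key=0):
    """Toy hash: keyed block selection -> 2D DCT -> SVD -> binarize.

    Assumptions (not from the paper): a seeded generator replaces the
    chaotic map, only the low x low low-frequency DCT coefficients are
    kept, and each bit compares a block's largest singular value with
    the median over all selected blocks.
    """
    h, w = image.shape
    h, w = h - h % block, w - w % block
    # Split the cropped image into non-overlapping block x block tiles.
    tiles = (image[:h, :w].astype(float)
             .reshape(h // block, block, w // block, block)
             .swapaxes(1, 2)
             .reshape(-1, block, block))
    rng = np.random.default_rng(key)  # stand-in for the chaotic map
    count = min(n_blocks, len(tiles))
    chosen = rng.choice(len(tiles), size=count, replace=False)
    c = dct_matrix(block)
    feats = []
    for j in chosen:
        coeffs = c @ tiles[j] @ c.T          # 2D DCT of the tile
        sv = np.linalg.svd(coeffs[:low, :low], compute_uv=False)
        feats.append(sv[0])                  # largest singular value
    feats = np.array(feats)
    return (feats > np.median(feats)).astype(np.uint8)
```

Calling `block_hash(img, key=42)` on a 2D grayscale array returns a short bit vector; the same key reproduces the same block selection, which is the role the secret key plays in the description above.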
Traditional 3D convolutions are computationally expensive and memory intensive, and because of their large number of parameters, they often tend to overfit. On the other hand, 2D CNNs are less computationally expensive and less me...
Person re-identification aims to associate images of the same person over multiple non-overlapping camera views at different times. Depending on the human operator, manual re-identification in large camera networks is...
ISBN: (print) 9781450366151
Cosmetic makeup is an art in itself, which people often use to enhance their beauty and express themselves. Putting on makeup is a tedious task, so who would not love to see themselves in makeup before physically applying it? In this work, we demonstrate the transfer of makeup from a reference makeup image to a subject image. Our technique involves generating and using 3D face models of these images. Notable features of our pipeline include: (1) the process adapts to the lighting conditions of the subject image; (2) after makeup is applied to the subject image, the result can be viewed under different indoor and outdoor lighting conditions; (3) accessories can be added realistically with proper lighting so that they naturally fit in the subject's *** show that these also make our makeup look very realistic and improved compared to many existing *** main advantage of our novel pipeline is the requirement of only the subject and reference images as input. This is unlike other techniques using 3D models, which require extensive data collection.
ISBN: (print) 9781450366151
In this work, we address the problem of dynamic gesture recognition using a pose-based video descriptor. The proposed approach takes video frames as input and extracts pose-specific image regions, which are further processed by a pre-trained Convolutional Neural Network (CNN) to derive a pose-based descriptor for each frame. A Long Short-Term Memory (LSTM) network is trained from scratch for dynamic gesture classification by learning long-term spatiotemporal relations among features. We also demonstrate that an effective model for recognizing dynamic gestures can be designed using only video data (RGB frames and optical flow). We use the ChaLearn multi-modal gesture challenge dataset [13] and the Cambridge hand gesture dataset [18] to evaluate the proposed algorithm, achieving accuracies of 91.27% and 96%, respectively, using only RGB data.
ISBN: (print) 9781450366151
Egocentric activity recognition (EAR) is an emerging area in the field of computer vision research. Motivated by the current success of Convolutional Neural Networks (CNNs), we propose a multi-stream CNN for multimodal egocentric activity recognition using visual (RGB video) and sensor (accelerometer, gyroscope, etc.) streams. To effectively capture the spatio-temporal information contained in RGB videos, two types of modalities are extracted from the visual data: the Approximate Dynamic Image (ADI) and the Stacked Difference Image (SDI). These image-based representations are generated at both the clip level and the entire-video level, and are then used to fine-tune MobileNet, a pretrained 2D-CNN specifically designed for mobile vision applications. Similarly, for sensor data, each training sample is divided into three segments, and a deep 1D-CNN is trained from scratch for each type of sensor stream. During testing, the softmax scores of all the streams (visual + sensor) are combined by late fusion. Experiments performed on a multimodal egocentric activity dataset demonstrate that our proposed approach achieves state-of-the-art results, outperforming the current best handcrafted and deep-learning-based techniques.
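The late-fusion step mentioned above can be sketched as follows. The abstract does not state the fusion rule, so the weighted average of per-stream softmax scores used here is an assumption, and the names `softmax` and `late_fusion` are hypothetical helpers rather than part of the described system.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def late_fusion(stream_scores, weights=None):
    """Combine per-stream class scores into one prediction.

    stream_scores: list of 1D arrays, one softmax vector per stream.
    weights: optional per-stream weights; uniform averaging is assumed
    when omitted (the fusion rule is not given in the abstract).
    """
    scores = np.stack(stream_scores, axis=0)   # (n_streams, n_classes)
    if weights is None:
        weights = np.full(len(stream_scores), 1.0 / len(stream_scores))
    weights = np.asarray(weights, dtype=float)
    fused = np.tensordot(weights, scores, axes=1)  # weighted sum over streams
    return int(np.argmax(fused)), fused

# Example: one visual stream and one sensor stream over three classes.
visual = softmax([2.0, 0.0, 0.0])
sensor = softmax([0.0, 3.0, 0.0])
label, fused = late_fusion([visual, sensor])
```

Because each stream is scored independently and merged only at the end, streams can be added or dropped without retraining the others, which is the usual motivation for choosing late over early fusion.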
ISBN: (print) 9789811393600
The proceedings contain 14 papers. The special focus in this conference is on Document Analysis and Recognition. The topics include: Word-Wise Handwriting Based Gender Identification Using Multi-Gabor Response Fusion; A Secure and Light Weight User Authentication System Based on Online Signature Verification for Resource Constrained Mobile Networks; Benchmark Datasets for Offline Handwritten Gurmukhi Script Recognition; Benchmark Dataset: Offline Handwritten Gurmukhi City Names for Postal Automation; Attributed Paths for Layout-Based Document Retrieval; Textual Content Retrieval from Filled-in Form Images; A Study on the Effect of CNN-Based Transfer Learning on Handwritten Indic and Mixed Numeral Recognition; Symbol Spotting in Offline Handwritten Mathematical Expressions; Online Handwritten Bangla Character Recognition Using Frechet Distance and Distance Based Features; An Efficient Multi Lingual Optical Character Recognition System for Indian Languages Through Use of Bharati Script; Telugu Word Segmentation Using Fringe Maps; An Efficient Character Segmentation Algorithm for Connected Handwritten Documents.
ISBN: (print) 9781450366151
We present a reduced model based on position-based dynamics for real-time simulation of human musculature. We demonstrate our methods on the muscles of the human arm. Co-simulation of all the muscles of the human arm allows us to accurately track the development of stresses and strains in the muscles when the arm is moved. We evaluate the accuracy of our method by comparing it with gold-standard simulation models based on finite volume methods, and demonstrate the stability of the method under flexion, extension, and torsion.
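For readers unfamiliar with position-based dynamics, the sketch below shows one generic time step with distance constraints only. It is not the reduced muscle model described above: the particle layout, the gravity vector, the iteration count, and the function name `pbd_step` are all assumptions chosen for illustration.

```python
import numpy as np

def pbd_step(x, v, inv_mass, edges, rest, dt=1.0 / 60.0, iters=10,
             gravity=(0.0, -9.81)):
    """One position-based dynamics step with distance constraints.

    x, v: (n, 2) positions and velocities; inv_mass: (n,) inverse
    masses, where 0 pins a particle; edges: list of (i, j) index pairs;
    rest: rest length of each edge.
    """
    g = np.asarray(gravity, dtype=float)
    # Apply external forces to movable particles, then predict positions.
    v = v + dt * g * (inv_mass > 0)[:, None]
    p = x + dt * v
    # Iteratively project each distance constraint onto the predictions.
    for _ in range(iters):
        for (i, j), d0 in zip(edges, rest):
            delta = p[i] - p[j]
            dist = np.linalg.norm(delta)
            w = inv_mass[i] + inv_mass[j]
            if dist == 0.0 or w == 0.0:
                continue
            corr = (dist - d0) / (dist * w) * delta
            p[i] -= inv_mass[i] * corr
            p[j] += inv_mass[j] * corr
    # Velocities are derived from the corrected positions.
    v_new = (p - x) / dt
    return p, v_new

# Example: a pinned particle holding a second one at unit distance.
x0 = np.array([[0.0, 0.0], [1.0, 0.0]])
v0 = np.zeros_like(x0)
inv_m = np.array([0.0, 1.0])
x1, v1 = pbd_step(x0, v0, inv_m, edges=[(0, 1)], rest=[1.0])
```

Because constraints are enforced directly on positions rather than through stiff forces, the step stays stable at large time steps, which is the property that makes the approach attractive for real-time simulation.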