ISBN: (print) 9781450366151
Conventional convolutional neural networks (CNNs) are trained on large-domain datasets and are hence typically over-represented and inefficient in limited-class applications. An efficient way to convert such large many-class pre-trained networks into small few-class networks is a hierarchical decomposition of their feature maps. We propose an automated four-step framework for this decomposition, the Hierarchically Self Decomposing CNN (HSD-CNN). HSD-CNN is derived automatically using a class-specific filter sensitivity analysis that quantifies the impact of specific features on a class prediction. The decomposed hierarchical network can be deployed directly to obtain sub-networks for a subset of classes, and these sub-networks are shown to perform better without retraining. Experimental results show that HSD-CNN generally does not degrade accuracy when the full set of classes is used. Interestingly, when operating on known subsets of classes, HSD-CNN improves accuracy with a much smaller model that requires far fewer operations. The HSD-CNN flow is verified on the CIFAR10, CIFAR100 and CALTECH101 datasets. We report accuracies of up to 85.6% (94.75%) on scenarios with 13 (4) classes of CIFAR100, using a VGG-16 network pre-trained on the full dataset. In this case, the proposed HSD-CNN requires 3.97x fewer parameters and saves 71.22% of operations in comparison to the baseline VGG-16, which contains features for all 100 classes.
ISBN: (print) 9781450366151
In this paper, we propose a novel, real-time dynamic hand gesture recognition framework using convolutional neural networks with depth and RGB data fusion. Hand gestures are a natural form of communication between humans as well as between human and machine. They also find important applications in areas such as sign language recognition, man-machine interaction and behavior understanding. Natural hand gestures are complex hand movements in space and time and are challenging to recognize. In our proposed framework, we use both RGB and depth data to automatically recognize dynamic hand gestures. Initially, we work with RGB and depth data separately. We compute the motion history of the performed gesture from the RGB data and, independently, from the depth data; the motion history stores rich information about the movement of the hands. Then, we apply transfer learning to two separate VGG16 networks, where one network is fine-tuned on RGB motion history and the other on depth motion history, to configure them for the dynamic hand gesture recognition problem. Using the two fine-tuned VGG16 networks, we extract features from the RGB and depth motion history images separately for each dynamic hand gesture. We then integrate the features obtained from the two networks using weighted summation to recognize the dynamic hand gesture accurately and robustly. We perform experiments on standard, publicly available dynamic hand gesture datasets and show that our method outperforms state-of-the-art methods.
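The abstract does not say how the motion history is computed or how the fusion weights are chosen. As a rough illustration only, the sketch below assumes the classical motion history image built from frame differencing with linear decay, and a convex weighting of the two feature vectors. The names `motion_history`, `fuse_features`, `tau`, `threshold` and `rgb_weight` are hypothetical and not taken from the paper.

```python
def motion_history(frames, tau, threshold):
    """Build a motion history image (MHI) from grayscale frames.

    frames: list of 2-D lists of pixel intensities, all the same size.
    tau: value assigned to a pixel that has just moved (assumed parameter).
    threshold: minimum absolute frame difference counted as motion (assumed).
    """
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        for y in range(height):
            for x in range(width):
                if abs(curr[y][x] - prev[y][x]) > threshold:
                    mhi[y][x] = tau                    # fresh motion
                else:
                    mhi[y][x] = max(0, mhi[y][x] - 1)  # older motion fades
    return mhi


def fuse_features(rgb_features, depth_features, rgb_weight):
    """Weighted summation of two equally long feature vectors.

    Assumes the weights sum to one; the paper's actual weights are not given.
    """
    depth_weight = 1.0 - rgb_weight
    return [rgb_weight * r + depth_weight * d
            for r, d in zip(rgb_features, depth_features)]
```

In this sketch the same `motion_history` routine would be run once on the RGB frames and once on the depth frames, and `fuse_features` would be applied to the feature vectors extracted by the two networks.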
Biometric systems play an important role in the field of information security, as they are essential for user authentication. Automatic signature recognition and verification is one of several biometric techniques used to verify the identities of individuals, and it is currently receiving renewed interest. Signatures provide a secure means of confirmation and authorization in legal documents. Signature identification and verification has therefore become an essential component in automating the rapid processing of documents containing embedded signatures. In this paper, a technique for a bi-script off-line signature identification system is proposed. The proposed system considers signatures written in English and Bengali (Bangla) for the identification process. Different features, namely under-sampled bitmaps, modified chain-code direction features and gradient features computed from both background and foreground components, are employed for this purpose. Support Vector Machines (SVMs) and Nearest Neighbour (NN) techniques are considered as classifiers for signature identification. A database of 1554 English signatures and 1092 Bengali signatures is used to generate the experimental results. Various results based on different features are calculated and analysed. The highest accuracies of 99.41%, 98.45% and 97.75% are obtained with the modified chain-code direction, under-sampled bitmap and gradient features respectively, using 1800 (1100 English + 700 Bengali) samples for training and 846 (454 English + 392 Bengali) samples for testing.
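To make the feature and classifier terms concrete, here is a minimal sketch of one plausible reading of under-sampled bitmaps (the fraction of ink pixels in each non-overlapping block of a binary image) paired with a plain 1-nearest-neighbour rule. The abstract gives no block size, normalisation or distance metric, so the block layout, the squared Euclidean distance and the names `undersampled_bitmap` and `nearest_neighbour` are all assumptions.

```python
def undersampled_bitmap(image, block):
    """Per-block ink fraction of a binary image, in row-major block order.

    image: 2-D list of 0/1 values, where 1 marks a signature (ink) pixel.
    block: side length of the square, non-overlapping blocks (assumed).
    Partial blocks at the right and bottom edges are ignored.
    """
    height, width = len(image), len(image[0])
    features = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            ink = sum(image[y][x]
                      for y in range(top, top + block)
                      for x in range(left, left + block))
            features.append(ink / (block * block))
    return features


def nearest_neighbour(query, labelled):
    """Return the label of the closest stored feature vector.

    labelled: list of (feature_vector, label) pairs.
    Uses squared Euclidean distance, which is an assumption of this sketch.
    """
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(labelled, key=lambda item: squared_distance(query, item[0]))[1]
```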
ISBN: (print) 9781450398473
Convolutional Neural Networks (CNNs) have grown tremendously in popularity and usage over the last few years, spanning tasks such as computer vision, natural language processing, video recognition, and recommender systems. Despite the algorithmic advancements that drove this growth, CNNs still have considerable computational and memory overhead, which poses challenges in achieving real-time performance. Each input image requires millions or even billions of elementary arithmetic operations before the network obtains the result. In CNNs, convolutional and pooling layers are followed by activation layers involving various activation functions. A lot of work has therefore been done in the last few years to reduce these costs, and numerous optimizations have been proposed at both hardware and software levels. In this paper, we propose a software-based solution for improving the inference performance of networks. We suggest a technique for the approximate computation of the convolution operation based on clustering and sharing of weights. We use Gaussian Mixture Models for clustering. We exploit weight sparsity to further reduce computations on top of the clustering method. We achieve a considerable reduction in MAC operations and an overall computation speedup on popular CNN architectures.
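The saving from weight sharing can be seen on a single dot product: inputs whose weights fall in the same cluster are added first, and each cluster sum is multiplied by the shared weight once. The sketch below illustrates that idea under stated assumptions and is not the paper's implementation. In particular, it swaps in a simple 1-D k-means for the Gaussian Mixture Model clustering named in the abstract, and the names `cluster_weights` and `shared_weight_dot` are hypothetical.

```python
def cluster_weights(weights, initial_centroids, iterations=10):
    """Cluster scalar weights with 1-D k-means (a stand-in for a GMM).

    Returns (centroids, assignments), where assignments[i] is the index
    of the cluster that weights[i] belongs to.
    """
    centroids = list(initial_centroids)

    def assign():
        return [min(range(len(centroids)), key=lambda k: abs(w - centroids[k]))
                for w in weights]

    for _ in range(iterations):
        assignments = assign()
        for k in range(len(centroids)):
            members = [w for w, a in zip(weights, assignments) if a == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return centroids, assign()


def shared_weight_dot(inputs, centroids, assignments):
    """Approximate sum(w_i * x_i) using one multiplication per cluster.

    Returns (value, multiplications). Clusters whose shared weight is
    exactly zero are skipped, which is how sparsity removes further work.
    """
    cluster_sums = [0.0] * len(centroids)
    for x, a in zip(inputs, assignments):
        cluster_sums[a] += x              # additions only
    value, multiplications = 0.0, 0
    for c, s in zip(centroids, cluster_sums):
        if c == 0.0:
            continue                      # zero-weight cluster: no work
        value += c * s
        multiplications += 1
    return value, multiplications
```

With six weights falling into three clusters, one of which is zero, this performs two multiplications instead of six. When the weights within a cluster are not identical the result is an approximation, and the error depends on how tightly the weights cluster.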
ISBN: (print) 9781450397056
Although there are advanced technologies for character recognition, automatic descriptive answer evaluation remains an open challenge for the document image analysis community because of the large diversity of handwritten text and of answers to a question. This paper presents a novel method for detecting anomalous handwritten text in students' written responses to questions. The method is based on the observation that students who are confident in their answers usually write legibly and neatly, whereas students who are not confident write sloppily, which may make the text hard for the reader to understand. To detect such anomalous handwritten text, we explore a new combination of the Fourier transform and a deep learning model for detecting edges. The resulting edge image preserves the structure of the handwritten text. To extract features for classifying anomalous and normal text, the proposed method studies the behavior of the writing style, especially the variation at ascenders and descenders. To this end, the proposed work draws the principal axis of the edge images, which is invariant to rotation and scaling and, to some extent, to distortion. With respect to the principal axis, the proposed method draws the medial axis using the uppermost and lowermost points. The distances between the medial-axis and principal-axis points are used as the feature vector. The feature vector is then passed to an Artificial Neural Network for classification of anomalous text. The proposed method is evaluated on our own dataset, a standard gender identification dataset (IAM) and a handwritten forgery detection dataset (ACPR 2019). The results on the different datasets show that the proposed work outperforms existing methods.
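The geometric features can be sketched as follows, with the caveat that the abstract leaves the construction open. This sketch assumes the principal axis is the dominant direction of the edge-pixel scatter (from the second-order moments), that a medial point is the midpoint of the uppermost and lowermost edge pixels in each column, and that the feature is the perpendicular distance from each medial point to the principal axis. The function names are hypothetical.

```python
import math


def principal_axis(points):
    """Centroid and unit direction of the principal axis of 2-D points."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points)
    syy = sum((y - cy) ** 2 for _, y in points)
    sxy = sum((x - cx) * (y - cy) for x, y in points)
    # Orientation of the dominant eigenvector of the 2x2 scatter matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(angle), math.sin(angle))


def medial_points(points):
    """Midpoint of the lowest and highest y value in each x column."""
    columns = {}
    for x, y in points:
        low, high = columns.get(x, (y, y))
        columns[x] = (min(low, y), max(high, y))
    return [(x, (low + high) / 2) for x, (low, high) in sorted(columns.items())]


def distances_to_axis(points, centroid, direction):
    """Perpendicular distance of each point to the line through centroid."""
    cx, cy = centroid
    dx, dy = direction
    return [abs((x - cx) * dy - (y - cy) * dx) for x, y in points]
```

A feature vector in this reading would be `distances_to_axis(medial_points(edge_pixels), *principal_axis(edge_pixels))`, where `edge_pixels` is the list of (x, y) coordinates of edge pixels.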