ISBN (print): 9781509034840
This paper presents an offline, text-independent writer identification system for a multi-language environment, using a sparse auto-encoder (SAE) based codebook (SAEC) and feature extraction methods that require no text segmentation. The proposed codebook is built from an SAE structure and clustering. The advantage of the framework is that features can be extracted effectively without a text-segmentation pre-processing step. The codebook can handle mixed languages at the same time. Handwritten texts in Chinese and English are considered in this study, using the HIT Chinese database and the IAM offline English database. The identification rate reaches 95.56% for top-1 and 99.17% for top-10 on the two databases combined. The study also evaluates how factors such as patch size, number of patches, and amount of text influence the identification results.
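The top-1 and top-10 figures quoted above are rank-based identification rates. As a minimal sketch of how such a rate is typically computed (the function name `top_k_rate` and the toy writer IDs are hypothetical, not taken from the paper), assuming each query document comes with a best-first list of candidate writers:

```python
from typing import Sequence


def top_k_rate(ranked: Sequence[Sequence[str]], truth: Sequence[str], k: int) -> float:
    """Fraction of queries whose true writer is among the first k ranked candidates."""
    if len(ranked) != len(truth):
        raise ValueError("ranked and truth must have the same length")
    if not truth:
        raise ValueError("need at least one query")
    hits = sum(1 for cands, t in zip(ranked, truth) if t in cands[:k])
    return hits / len(truth)


# Hypothetical toy data: three query documents, candidates sorted best-first.
ranked_lists = [
    ["w1", "w2", "w3"],
    ["w3", "w2", "w1"],
    ["w2", "w3", "w1"],
]
true_writers = ["w1", "w2", "w1"]

top1 = top_k_rate(ranked_lists, true_writers, 1)
top2 = top_k_rate(ranked_lists, true_writers, 2)
top3 = top_k_rate(ranked_lists, true_writers, 3)
```

The rate can only grow as k increases, which is why a top-10 figure is always at least as high as the top-1 figure on the same data.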
An RGB-D camera can easily record both color and depth images, and previous work has shown that combining them can dramatically improve RGB-D based object recognition accuracy. In this paper, a new method based on a subset approach is introduced to learn higher-level features from the raw data. The raw RGB and depth images are divided into several subsets according to their shapes and colors, so that any two different objects within a subset are clearly dissimilar. For each subset, an RGB-Subset-sparse auto-encoder is trained to extract features from RGB images and a Depth-Subset-sparse auto-encoder is trained to extract features from depth images. The learned features are then passed to recursive neural networks (RNNs) to reduce their dimensionality and learn robust hierarchical feature representations. The representations learned from RGB and depth images are concatenated into the final features and sent to a softmax classifier. The proposed method is evaluated on three benchmark RGB-D datasets: the RGB-D dataset of Lai et al., the 2D3D dataset of Browatzki et al., and the Aharon dataset of Aharon et al. Compared with other methods, it achieves state-of-the-art performance on the first two datasets. Furthermore, to validate the generality of the subset approach, additional experiments apply it to several previous methods, whose accuracies improve significantly. (C) 2015 Elsevier B.V. All rights reserved.
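The final step described above, concatenating the RGB and depth representations and passing them to a softmax classifier, can be sketched as follows. This is only an illustration: the feature vectors, weights, and biases are hypothetical placeholders, not values or dimensions from the paper.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(
    rgb_feat: Sequence[float],
    depth_feat: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Concatenate the two modality features, apply a linear layer, then softmax."""
    fused = list(rgb_feat) + list(depth_feat)
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical 2-dimensional features per modality and a 2-class linear layer.
rgb_features = [1.0, 0.0]
depth_features = [0.0, 2.0]
weights = [
    [1.0, 0.0, 0.0, 0.0],  # class 0 looks at the first RGB feature
    [0.0, 0.0, 0.0, 1.0],  # class 1 looks at the second depth feature
]
biases = [0.0, 0.0]

probs = classify(rgb_features, depth_features, weights, biases)
```

In practice the classifier weights would be learned from labeled data; they are fixed here only to keep the example self-contained.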
ISBN (print): 9781479983896
Knowledge discovery in databases (KDD) has made great progress in recent years, driven by the need to mine useful knowledge from ever-growing volumes of information. Advances in machine learning have effectively promoted KDD in feature extraction and data categorization. This paper introduces a framework that combines feature extraction and categorization of the collected data in order to recognize useful structured patterns that underlie the raw data. The framework consists of three modules: a data pre-processing module, a feature extraction module, and a feature classification module. We propose a four-layer deep neural network as the feature extraction architecture. Each layer is trained in an unsupervised way as an auto-encoder with a sparsity constraint. We employ a softmax classifier to assign a label to the extracted features. The supervised and unsupervised training strategies are discussed at the end of the paper to clarify the training procedure of the entire model.
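The abstract does not say which sparsity constraint is used. One common formulation, assumed here purely for illustration, penalizes the KL divergence between a target activation level `rho` and the mean activation of each hidden unit. The function names, the default `rho = 0.05`, and the toy batches below are all hypothetical.

```python
import math
from typing import List, Sequence


def mean_activations(hidden: Sequence[Sequence[float]]) -> List[float]:
    """Average activation of each hidden unit over a batch (rows are examples)."""
    n = len(hidden)
    return [sum(col) / n for col in zip(*hidden)]


def kl_sparsity_penalty(hidden: Sequence[Sequence[float]], rho: float = 0.05) -> float:
    """Sum over hidden units of KL(rho || rho_hat_j), for activations in (0, 1)."""
    penalty = 0.0
    for rho_hat in mean_activations(hidden):
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1 - rho) * math.log((1 - rho) / (1 - rho_hat))
    return penalty


# Hypothetical batches of sigmoid activations: 2 examples, 2 hidden units each.
sparse_batch = [[0.05, 0.05], [0.05, 0.05]]  # mean activation equals rho
dense_batch = [[0.9, 0.8], [0.7, 0.6]]       # units are active most of the time

sparse_penalty = kl_sparsity_penalty(sparse_batch)
dense_penalty = kl_sparsity_penalty(dense_batch)
```

Under this assumption the penalty is zero when every unit's mean activation matches `rho` and grows as units become more active; it would be added, with a weighting factor, to the reconstruction loss of each layer.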
ISBN (print): 9781479984671
Knowledge discovery in databases (KDD) has made great progress in recent years, driven by the need to mine useful knowledge from ever-growing volumes of information. Advances in machine learning have effectively promoted KDD in feature extraction and data categorization. This paper introduces a framework that combines feature extraction and categorization of the collected data in order to recognize useful structured patterns that underlie the raw data. The framework consists of three modules: a data pre-processing module, a feature extraction module, and a feature classification module. We propose a four-layer deep neural network as the feature extraction architecture. Each layer is trained in an unsupervised way as an auto-encoder with a sparsity constraint. We employ a softmax classifier to assign a label to the extracted features. The supervised and unsupervised training strategies are discussed at the end of the paper to clarify the training procedure of the entire model.
ISBN (print): 9781479969913
Deep learning algorithms such as convolutional neural networks (CNNs) have been successfully applied in computer vision. This paper attempts to adapt the CNN, originally designed for optical camera images, to its microwave counterpart, synthetic aperture radar (SAR). As a preliminary study, a single convolutional layer is used to learn features automatically from SAR images. Instead of using the classical backpropagation algorithm, the convolution kernels are trained on randomly sampled image patches using an unsupervised sparse auto-encoder. After convolution and pooling, an input SAR image is transformed into a series of feature maps, which are then used to train a final softmax classifier. Initial experiments on the MSTAR public data set show an accuracy of 90.1% on a three-class target classification task and 84.7% on a ten-class target classification task.
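The patch sampling, convolution, and pooling steps described above can be sketched in plain Python. This is an illustration under stated assumptions: the abstract does not specify the pooling type (max pooling is assumed here), and the fixed 2x2 kernel is a hypothetical stand-in for a filter that would actually be learned by the sparse auto-encoder.

```python
import random
from typing import List

Matrix = List[List[float]]


def sample_patches(image: Matrix, size: int, count: int, seed: int = 0) -> List[Matrix]:
    """Draw `count` random size x size patches (training data for the auto-encoder)."""
    rng = random.Random(seed)
    h, w = len(image), len(image[0])
    patches = []
    for _ in range(count):
        r = rng.randrange(h - size + 1)
        c = rng.randrange(w - size + 1)
        patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches


def convolve_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide a square kernel over the image without padding (cross-correlation form)."""
    k = len(kernel)
    out_h, out_w = len(image) - k + 1, len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(fmap: Matrix, size: int) -> Matrix:
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b] for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 5x5 "image" and a 2x2 kernel standing in for one learned filter.
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 0.0], [0.0, 1.0]]

patches = sample_patches(image, size=2, count=3)
feature_map = convolve_valid(image, kernel)  # 4x4 feature map
pooled = max_pool(feature_map, size=2)       # 2x2 pooled map
```

With several learned kernels, each one would yield its own pooled feature map, and the flattened maps together would form the input to the softmax classifier.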
ISBN (print): 9781479942190
Articulatory features (AFs) are viewed as universal speech attributes for cross-language speech recognition. They are usually detected using a bank of multi-layer perceptrons (MLPs) trained in a supervised manner. In this paper, we propose applying a deep learning method to detect AF-based speech attributes in a semi-supervised manner for cross-language speech recognition. Experimental results on Tibetan phone recognition show that the deep learning method detects the AF-based speech attributes more accurately and achieves higher phone recognition rates than MLPs.
ISBN (print): 9781479952083
Handwritten character recognition has long been a hard problem in pattern recognition and machine learning. Some special tasks, such as automatic check preprocessing, require recognition of handwritten Chinese legal amounts as a prerequisite. Because the goal is to have machines rather than humans process bank checks, the recognition rate in this task must be high. This paper proposes using a deep learning model, the auto-encoder, as an effective approach for obtaining hierarchical representations of isolated handwritten Chinese legal amounts. Experiments show that such representations are highly abstract and can be used for character recognition. In addition, a novel way of combining multiple neural networks is proposed, which proves able to improve the recognition rate significantly. The method achieves a 0.64% error rate on a large test set of over 10,000 samples and outperforms traditional methods using hand-crafted features as well as convolutional neural network committees (another deep learning model), narrowing the gap to human performance.
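The abstract does not describe how the multiple neural networks are combined. A simple and widely used scheme, assumed here only as an illustration and not as the paper's method, is to average the class probabilities produced by each network and pick the class with the highest average. The function names and the toy outputs are hypothetical.

```python
from typing import List, Sequence


def average_probabilities(per_model: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors produced by several models."""
    n = len(per_model)
    return [sum(col) / n for col in zip(*per_model)]


def predict(per_model: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    avg = average_probabilities(per_model)
    return max(range(len(avg)), key=avg.__getitem__)


# Hypothetical outputs of three networks for one character image (3 classes).
model_outputs = [
    [0.6, 0.3, 0.1],  # network 1 prefers class 0
    [0.2, 0.7, 0.1],  # network 2 prefers class 1
    [0.3, 0.6, 0.1],  # network 3 prefers class 1
]

combined = average_probabilities(model_outputs)
label = predict(model_outputs)
```

Averaging lets two moderately confident networks outvote a single network that disagrees, which is one reason ensembles often reduce error relative to their individual members.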
ISBN (print): 9783319093338; 9783319093321
An RGB-D image is multimodal data. Previous work has shown that using color and depth images together can dramatically increase RGB-D based object recognition accuracy, but most approaches either simply take all modalities as input, ignoring modality-specific information, or train a first-layer representation for each modality separately and concatenate them, ignoring correlations between modalities. In this paper, we use a variant of the sparse auto-encoder (SAE) that can specify how mode-sparse or mode-dense the features should be. A new deep learning network is proposed that combines this SAE variant with recursive neural networks (RNNs). With it, we obtain highly discriminative features and state-of-the-art performance on a standard RGB-D object dataset.