ISBN: (Print) 9781509055593
Bangla is one of the most widely used languages worldwide. This paper presents an application of image retrieval techniques to automatically judge the aesthetic quality of handwritten Bangla isolated characters. Retrieval techniques are also adapted to give improvement suggestions, with a plan to incorporate the methods into applications that assist in learning and teaching handwriting. The proposed method borrows key concepts from content-based image retrieval. Our method was tested on the BanglaLekha-Isolated dataset, which contains images of 84 Bangla characters, with nearly 2,000 samples per character. The dataset includes evaluations of the aesthetic quality of the handwriting, judged on a scale of 1-5. For this work, the dataset was partitioned, per Bangla character, into a test set of 400 images and a database set of approximately 1,600 images. Assuming that a scoring difference of 1 is acceptable, the proposed method achieves an accuracy of 77.39% when using features extracted by a convolutional neural network based autoencoder. Experiments were also done with the popular HOG feature; however, the autoencoder-based results were clearly superior to the HOG-based results. Our proposed method for improvement suggestions also shows that it is possible to present samples from the dataset that help users improve their handwriting while requiring only small changes to their own writing.
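The retrieval-based scoring idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature dimensionality, the neighborhood size k, and the random stand-ins for autoencoder features and human scores are all assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for autoencoder features: a database of 1600 feature
# vectors for one character, each with a human aesthetic score in 1..5.
db_feats = rng.normal(size=(1600, 64))
db_scores = rng.integers(1, 6, size=1600)

def predict_score(query_feat, k=5):
    """Score a query image by the mean score of its k nearest database
    samples (Euclidean distance in the learned feature space)."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    nearest = np.argsort(dists)[:k]
    return db_scores[nearest].mean()

query = rng.normal(size=64)
pred = predict_score(query)
# Under the paper's criterion, a prediction within +/-1 of the human
# score would count as correct.
print(pred)
```

The same nearest-neighbor machinery supports improvement suggestions: well-scored neighbors close to the query are, by construction, samples the writer could reach with small changes.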
ISBN: (Print) 9789897582479
We discuss how an autoencoder can detect system-level anomalies in a real-time gross settlement system by reconstructing a set of liquidity vectors. A liquidity vector is an aggregated representation of the underlying payment network of a settlement system for a particular time interval. Furthermore, we evaluate the performance of two autoencoders on real-world payment data extracted from the TARGET2 settlement system. We do this by generating different types of artificial bank runs in the data and determining how the autoencoders respond. Our experimental results show that the autoencoders are able to detect unexpected changes in the liquidity flows between banks.
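The reconstruction-error detection principle can be illustrated with a linear stand-in: an optimal linear autoencoder with tied weights learns the same subspace as PCA, so an SVD fit serves as a sketch of the idea. The liquidity vectors below are synthetic, not TARGET2 data, and the threshold rule is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "liquidity vectors": normal intervals follow a low-rank
# pattern; an anomalous interval (e.g. a simulated bank run) breaks it.
basis = rng.normal(size=(3, 20))
normal = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 20))

# Fit encoder/decoder as the top principal components of normal data
# (equivalent to the optimal tied-weight linear autoencoder).
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:3]                               # encoder weights

def recon_error(x):
    code = (x - mean) @ components.T              # encode
    recon = code @ components + mean              # decode
    return np.linalg.norm(x - recon)

# Flag intervals whose reconstruction error exceeds anything seen
# during normal operation (margin of 10% is an arbitrary choice).
threshold = max(recon_error(v) for v in normal) * 1.1
anomaly = 5.0 * rng.normal(size=20)               # off-pattern vector
print(recon_error(anomaly) > threshold)           # True
```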
ISBN: (Print) 9783319700878; 9783319700861
Malicious software is generated with ever more modifications of the very features that detection methods rely on. Automatic classification of malicious software is efficient because it does not need to store all of those characteristics. In this paper, we propose a transferred generative adversarial network (tGAN) for automatic classification and detection of zero-day attacks. Since the GAN is unstable in the training process, often resulting in a generator that produces nonsensical outputs, a method to pre-train the GAN with an autoencoder structure is proposed. We analyze the detector, and its performance is visualized by observing the clustering pattern of malicious software using the t-SNE algorithm. The proposed model achieves the best performance compared with conventional machine learning algorithms.
ISBN: (Print) 9783319525037; 9783319525020
Document classification is challenging due to the voluminous and highly non-linear data generated exponentially in the era of digitization. Proper representation of documents increases the efficiency and performance of classification, the ultimate goal being to retrieve information from a large corpus. Deep neural network models learn features for document classification, unlike engineered-feature approaches in which features are extracted or selected from the data. In this paper we investigate the performance of different classifiers based on features obtained using two approaches: we apply a deep autoencoder to learn features, while engineered features are extracted by exploiting semantic associations among the terms of the documents. Experimentally, we observe that classification based on the learned features consistently outperforms the proposed engineered-feature-based classifiers.
ISBN: (Print) 9783319618456; 9783319618449
The weather has a strong influence on food retailers' sales, as it affects customers' emotional state, drives their purchase decisions, and dictates how much they are willing to spend. In this paper, we introduce a deep learning based method which uses meteorological data to predict the sales of a Japanese supermarket chain. Specifically, our method combines a long short-term memory (LSTM) network and a stacked denoising autoencoder network, both of which learn how sales change with the weather from a large amount of historical data. We show that our method achieved initial success in predicting the sales of some weather-sensitive products such as drinks. In particular, our method outperforms traditional machine learning methods by 19.3%.
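The denoising autoencoder component can be sketched in a few lines: corrupt the input, then train the network to reconstruct the clean version. This is a single-layer tied-weight toy with manual gradients, not the paper's stacked model; the data, layer sizes, and learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy daily feature vectors; the paper's inputs would combine sales
# history with meteorological features.
X = rng.random((200, 10))

W = rng.normal(0.0, 0.1, size=(10, 6))   # encoder weights (decoder uses W.T)
b, c = np.zeros(6), np.zeros(10)
lr = 0.1
losses = []
for _ in range(300):
    noisy = X + 0.1 * rng.normal(size=X.shape)   # corruption step
    h = np.tanh(noisy @ W + b)                   # encode corrupted input
    out = h @ W.T + c                            # decode with tied weights
    err = out - X                                # target is the *clean* input
    losses.append((err ** 2).mean())
    dZ = (err @ W) * (1 - h ** 2)                # backprop through tanh
    W -= lr / len(X) * (err.T @ h + noisy.T @ dZ)
    b -= lr / len(X) * dZ.sum(axis=0)
    c -= lr / len(X) * err.sum(axis=0)

print(losses[0], "->", losses[-1])               # reconstruction error drops
```

Stacking repeats this recipe layer by layer, feeding each trained hidden representation to the next denoising layer.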
ISBN: (Print) 9781538627150
In an era when big data are becoming the norm, there is less concern with the quantity of data than with its quality and completeness. In many disciplines, data are collected from heterogeneous sources, resulting in multi-view or multi-modal datasets. The missing data problem has been challenging to address in multi-view data analysis. In particular, when certain samples miss an entire view of data, the missing view problem arises. Classic multiple-imputation or matrix-completion methods are hardly effective here, since the missing view offers no information on which to base imputation for such samples. The commonly used simple remedy of removing samples with a missing view can dramatically reduce the sample size, thus diminishing the statistical power of subsequent analyses. In this paper, we propose a novel approach for view imputation via generative adversarial networks (GANs), which we name VIGAN. This approach first treats each view as a separate domain and identifies domain-to-domain mappings via a GAN using randomly sampled data from each view, and then employs a multi-modal denoising autoencoder (DAE) to reconstruct the missing view from the GAN outputs based on paired data across the views. By optimizing the GAN and the DAE jointly, our model integrates knowledge of the domain mappings and the view correspondences to effectively recover the missing view. Empirical results on benchmark datasets validate the VIGAN approach against the state of the art, and an evaluation of VIGAN in a genetic study of substance use disorders further demonstrates the effectiveness and usability of this approach in the life sciences.
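The core imputation idea, learning a cross-view mapping from paired samples and applying it where one view is missing, can be reduced to a minimal linear sketch. A least-squares map stands in for the GAN and DAE stages here; the two-view data are synthetic, and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two synthetic views generated from the same underlying samples.
latent = rng.normal(size=(300, 4))
view_a = latent @ rng.normal(size=(4, 8))
view_b = latent @ rng.normal(size=(4, 6))

# Learn a view-A -> view-B mapping from the first 250 paired samples.
# (VIGAN learns this mapping with a GAN and refines it with a
# multi-modal DAE; ordinary least squares is a linear stand-in.)
M, *_ = np.linalg.lstsq(view_a[:250], view_b[:250], rcond=None)

# Impute the "missing" view B for held-out samples from view A alone.
imputed = view_a[250:] @ M
err = np.abs(imputed - view_b[250:]).mean()
print(err < 1e-6)   # True: the linear toy recovers view B exactly
```

The linear toy succeeds because the views share an exactly linear latent structure; the point of the GAN/DAE combination is to handle the nonlinear, noisy correspondences found in real multi-modal data.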
ISBN: (Print) 9781509041176
We propose to use a feature representation obtained by pairwise learning in a low-resource language for query-by-example spoken term detection (QbE-STD). We assume that word pairs identified by humans are available in the low-resource target language. The word pairs are parameterized by a multi-lingual bottleneck feature (BNF) extractor that is trained using transcribed data in high-resource languages. The multi-lingual BNFs of the word pairs are used as an initial feature representation to train an autoencoder (AE). We extract features from an internal hidden layer of the pairwise-trained AE to perform acoustic pattern matching for QbE-STD. Our experiments on the TIMIT and Switchboard corpora show that pairwise learning brings 7.61% and 8.75% relative improvements in mean average precision (MAP), respectively, over the initial feature representation.
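The acoustic pattern matching step in QbE-STD is typically done with dynamic time warping (DTW) over frame-level feature sequences. A minimal sketch follows; the random sequences stand in for the AE bottleneck features, and the plain unconstrained DTW recursion is an assumption (real systems often use subsequence or segmental variants).

```python
import numpy as np

def dtw(q, d):
    """DTW distance between feature sequences q (m, k) and d (n, k),
    using Euclidean frame-to-frame distances."""
    m, n = len(q), len(d)
    cost = np.full((m + 1, n + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            frame = np.linalg.norm(q[i - 1] - d[j - 1])
            cost[i, j] = frame + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
    return cost[m, n]

rng = np.random.default_rng(4)
query = rng.normal(size=(20, 32))       # spoken-query feature frames
match = np.repeat(query, 2, axis=0)     # same word, spoken twice as slowly
other = rng.normal(size=(40, 32))       # unrelated utterance
print(dtw(query, match) < dtw(query, other))   # True
```

Warping absorbs the tempo difference, so the slowed-down copy of the query scores far below the unrelated sequence.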
ISBN: (Print) 9781538625859
Selecting a set of features with the best discrimination is always a challenge in classification. In this paper we propose a method, named GLLC (General Locally Linear Combination), that extracts features using a deep autoencoder and reconstructs a sample from other samples in a low-dimensional space; the class with the minimum reconstruction error is then selected as the winner. Extracting features together with the discriminative characteristic of the sparse model creates a robust classifier that simultaneously reduces both samples and features. Although the main applications of GLLC are visual classification and face recognition, it can be used in other domains as well. We conduct extensive experiments demonstrating that the proposed algorithm attains high accuracy on various datasets and outperforms state-of-the-art methods.
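The minimum-reconstruction-error classification step can be sketched directly. The synthetic features below stand in for the deep-autoencoder output, and plain least squares replaces the sparse model of GLLC, so this illustrates only the decision rule, not the full method.

```python
import numpy as np

rng = np.random.default_rng(5)

# Two classes, each spanning its own low-dimensional subspace of the
# (assumed) autoencoder feature space.
basis0 = rng.normal(size=(3, 16))
basis1 = rng.normal(size=(3, 16))
class0 = rng.normal(size=(50, 3)) @ basis0
class1 = rng.normal(size=(50, 3)) @ basis1

def reconstruction_error(x, samples):
    """Residual of the best linear combination of one class's samples
    approximating x (GLLC would add a sparsity constraint here)."""
    coeffs, *_ = np.linalg.lstsq(samples.T, x, rcond=None)
    return np.linalg.norm(samples.T @ coeffs - x)

def classify(x):
    errs = [reconstruction_error(x, c) for c in (class0, class1)]
    return int(np.argmin(errs))        # class with minimum error wins

test = rng.normal(size=3) @ basis1     # a new class-1 sample
print(classify(test))                  # 1
```

A new sample lying in a class's subspace is reconstructed almost exactly by that class's samples, so its residual there is near zero while the other class's residual stays large.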
ISBN: (Print) 9781509021758
We tackle the problem of mobile visual search. The Moving Picture Experts Group (MPEG) has completed a standard named Compact Descriptors for Visual Search (CDVS) to provide a standardized syntax for image retrieval applications. CDVS applies principal component analysis (PCA) to reduce the dimension of the local feature descriptor used as input to the global descriptor pipeline, and uses the traditional Fisher vector as the aggregation algorithm for local feature descriptors. However, the descriptor components of SIFT and the Fisher vector (FV) have highly non-Gaussian statistics, and applying a single PCA transform can in fact hurt compression performance at high rates. We develop a net-based architecture that combines neural networks with an FV layer to obtain the Fisher vector. Our architecture has two advantages over the CDVS global descriptor pipeline: first, we employ autoencoder networks to reduce the dimensionality of the data; second, we exploit a trainable system that learns its parameters after the FV codebook is obtained. The experiments demonstrate a clear advantage of our proposed architecture on the CDVS retrieval task.
ISBN: (Print) 9781509063413
We aim to reduce the cost of sound monitoring for maintaining machinery by reducing the sampling rate, i.e., sub-Nyquist sampling. Monitoring based on sub-Nyquist sampling requires two sub-systems: an on-site sub-system for sampling machinery sounds at a low rate, and an off-site sub-system for detecting anomalies from the subsampled signal. This paper proposes a method for realizing both sub-systems. First, the proposed method uses non-uniform sampling to encode frequency components above the Nyquist frequency. Second, the method applies a long short-term memory (LSTM) based autoencoder network to detect anomalies. The novelty of the proposed network is that the subsampled time-domain signal is demultiplexed and received as input in an end-to-end manner, enabling anomaly detection directly from the subsampled signal. Experimental results indicate that our method is suitable for anomaly detection from the subsampled signal.