ISBN (print): 9781479914845
Deep networks are well known for their powerful function approximation. Training a deep network efficiently requires greedy layer-wise pre-training followed by fine-tuning. Pre-training, which initializes the deep network, is typically implemented via unsupervised feature learning and generates multiple feature representations. In general, however, only the last-layer representation is used, because it is more abstract and compact than those of the lower layers. To make full use of the representations of all layers, this paper proposes a feature ensemble learning method based on sparse autoencoders for image classification. Specifically, we train three softmax classifiers on the representations of different layers, instead of a single classifier on the last-layer representation. Of the three softmax classifiers, two are obtained by training stacked autoencoders with fine-tuning, and the third is obtained directly from a concatenation of two representations. To improve on the accuracy and stability of a single softmax classifier, an ensemble of multiple classifiers is considered, and several Naive Bayes combination rules are introduced to integrate the three classifiers. Experimental results on the MNIST and COIL datasets are presented, with comparisons to other classification methods.
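The abstract does not say which Naive Bayes combination rules are used, so the following is only a minimal sketch of one common choice, the product rule under a conditional-independence assumption. The function name `product_rule` and the softmax outputs are hypothetical and are not taken from the paper.

```python
from math import prod


def product_rule(posteriors, priors=None):
    """Fuse class posteriors from several classifiers with the product rule.

    posteriors: one probability vector per classifier, all of equal length.
    priors: optional class priors; uniform priors are assumed if omitted.
    """
    n_classes = len(posteriors[0])
    n_clf = len(posteriors)
    if priors is None:
        priors = [1.0 / n_classes] * n_classes
    # Assuming the classifiers are conditionally independent given the class:
    # P(c | x_1..x_K) is proportional to P(c)^(1-K) * prod_k P(c | x_k).
    scores = [
        priors[c] ** (1 - n_clf) * prod(p[c] for p in posteriors)
        for c in range(n_classes)
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical softmax outputs of three classifiers for one image (3 classes).
outputs = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
]
combined = product_rule(outputs)
predicted = max(range(len(combined)), key=combined.__getitem__)
```

In this toy case two classifiers mildly prefer class 0, yet the fused decision is class 1, because the third classifier is much more confident. This shows how an ensemble can override a simple majority.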
ISBN (print): 9781467395052
The human visual system is adept at extracting both global and local features. Can we design a similar mechanism for unsupervised feature learning? In this paper, we propose a novel pooling method within an unsupervised feature learning framework, named Rich and Robust Feature Pooling (R²FP), to better extract rich and robust representations from sparse feature maps of the input data. Both local and global pooling strategies are considered as instantiations of this method and studied intensively. The former selects the most conductive features in each sub-region and summarizes the joint distribution of the selected features, while the latter extracts features at multiple resolutions and fuses them with a feature balancing kernel to obtain a rich representation. Extensive experiments on several image recognition tasks demonstrate the superiority of the proposed techniques.
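To make the idea of pooling at multiple resolutions concrete, here is a small sketch. It uses ordinary max pooling over a pyramid of grids, not the proposed method itself. The per-resolution `weights` are only a plain stand-in for the feature balancing kernel, whose form the abstract does not give, and all names and values are hypothetical.

```python
def max_pool_grid(feature_map, grid):
    """Max-pool a 2D feature map over a grid x grid partition of sub-regions."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for gi in range(grid):
        r0, r1 = gi * rows // grid, (gi + 1) * rows // grid
        for gj in range(grid):
            c0, c1 = gj * cols // grid, (gj + 1) * cols // grid
            pooled.append(
                max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1))
            )
    return pooled


def multi_resolution_pool(feature_map, grids=(1, 2), weights=None):
    """Concatenate pooled features from several grid resolutions.

    weights: one scalar per resolution (an assumed, simplified balancing step).
    """
    if weights is None:
        weights = [1.0] * len(grids)
    descriptor = []
    for grid, weight in zip(grids, weights):
        descriptor.extend(weight * v for v in max_pool_grid(feature_map, grid))
    return descriptor


# Hypothetical 4x4 sparse feature map.
fmap = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.3],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.0],
]
descriptor = multi_resolution_pool(fmap)
```

The first entry of `descriptor` is the global (1x1) response, and the remaining four are the strongest activations in each quadrant.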
Thanks to its hierarchical and generative nature, the Deep Belief Network (DBN) is effective for feature representation and extraction in signal processing. In this paper, a DBN is investigated and applied to monaural speech separation. Two separate DBNs are trained to extract features from mixed noisy signals and from target clean speech. The two types of extracted features are then associated by training a BP neural network to obtain a mapping from the features of mixed signals to the features of target signals. By applying the DBNs and this mapping network, target speech can be estimated from the input mixed signals. Experiments are conducted on different kinds of mixed signals, including female/male speech mixtures, human-speech/Gaussian-noise audio mixtures, and human-speech/music audio mixtures. The PESQ scores of the extracted speech are 3.32, 2.59, and 3.42 respectively, which illustrates that the model performs well on speech separation tasks, especially on mixed signals where the interference signals have obvious spectral structures.
A variety of techniques based on numerical characteristics have been presented for mining time-series data. However, we find that time-series data generally contain curves that share a set of visual characteristics. These characteristics offer a deeper understanding of time-series data and open up a potential new technique for time-series analysis. Thanks to recent advances in deep neural networks, representations and features can be learnt automatically by deep learning architectures such as autoencoders. Building on this, our work proposes a novel method, named time-series visualization (TSV), to efficiently detect visual characteristics in the curves of time-series data and use them for intelligent analysis. The architecture and algorithm of TSV, based on stacked autoencoders, are introduced in this paper. Further, important factors affecting the performance of TSV are discussed based on empirical results. Empirical evaluation demonstrates that TSV achieves better efficiency and higher classification accuracy on datasets with significant curve features.
ISBN (print): 9781479983407
Learning the non-linear image upscaling process has previously been treated as a simple regression problem, in which various models have been used to describe the correlations between high-resolution (HR) and low-resolution (LR) images or patches. In this paper, we present a multitask learning framework based on a deep neural network for image super-resolution, in which we jointly consider the image super-resolution process and the image degeneration process. By sharing parameters between these two highly related tasks, the proposed framework effectively improves the learned neural-network mapping between HR and LR image patches. Experimental results demonstrate clear visual improvement and high computational efficiency, especially at large magnification factors.
ISBN (print): 9781479914821
Deep networks are well known for their powerful function approximation. Training a deep network efficiently requires greedy layer-wise pre-training followed by fine-tuning. Pre-training, which initializes the deep network, is typically implemented via unsupervised feature learning and generates multiple feature representations. In general, however, only the last-layer representation is used, because it is more abstract and compact than those of the lower layers. To make full use of the representations of all layers, this paper proposes a feature ensemble learning method based on sparse autoencoders for image classification. Specifically, we train three softmax classifiers on the representations of different layers, instead of a single classifier on the last-layer representation. Of the three softmax classifiers, two are obtained by training stacked autoencoders with fine-tuning, and the third is obtained directly from a concatenation of two representations. To improve on the accuracy and stability of a single softmax classifier, an ensemble of multiple classifiers is considered, and several Naive Bayes combination rules are introduced to integrate the three classifiers. Experimental results on the MNIST and COIL datasets are presented, with comparisons to other classification methods.
The gradual transformation of electric power SCADA (Supervisory Control and Data Acquisition) systems from separate private networks to open public networks seriously increases their vulnerability risk. To assess this risk, the paper first uses the Delphi method and AHP (Analytic Hierarchy Process) to build an index system for vulnerability risk assessment that fully represents the vulnerability of an electric power SCADA system. Because the index data for vulnerability risk assessment are strongly correlated and high-dimensional, an autoencoder is proposed to reduce their dimensionality by representing the high-dimensional data in a low-dimensional space. The autoencoder obtains good initial weights through pre-training and then back-propagates error derivatives to adjust the weights from that starting point, minimizing the reconstruction error and yielding the best reconstruction. The paper conducts simulation experiments in MATLAB on the reconstruction error during pre-training and fine-tuning, and the results show that the low-dimensional code obtained by dimensionality reduction can essentially represent the high-dimensional data. Using the low-dimensional code as input can significantly reduce the complexity of constructing the vulnerability risk assessment model for electric power SCADA systems in later work.
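As a rough illustration of the reconstruction-error idea, the sketch below trains a tiny linear autoencoder with plain full-batch gradient descent. It is not the paper's setup: it has no pre-training stage and is not MATLAB code. The index vectors, learning rate, and function names are all hypothetical.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def reconstruction_error(w_enc, w_dec, data):
    """Mean squared reconstruction error over all samples."""
    total = 0.0
    for x in data:
        x_hat = matvec(w_dec, matvec(w_enc, x))
        total += sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return total / len(data)


def train(data, code_dim, epochs=2000, lr=0.01, seed=0):
    """Fit encoder/decoder weights by back-propagating the reconstruction error."""
    rng = random.Random(seed)
    d, n = len(data[0]), len(data)
    w_enc = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(code_dim)]
    w_dec = [[rng.uniform(-0.1, 0.1) for _ in range(code_dim)] for _ in range(d)]
    for _ in range(epochs):
        g_enc = [[0.0] * d for _ in range(code_dim)]
        g_dec = [[0.0] * code_dim for _ in range(d)]
        for x in data:
            z = matvec(w_enc, x)  # low-dimensional code
            err = [xh - xi for xh, xi in zip(matvec(w_dec, z), x)]
            # Error propagated back through the decoder: W_dec^T * err.
            back = [sum(w_dec[i][k] * err[i] for i in range(d)) for k in range(code_dim)]
            for i in range(d):
                for k in range(code_dim):
                    g_dec[i][k] += 2 * err[i] * z[k]
            for k in range(code_dim):
                for j in range(d):
                    g_enc[k][j] += 2 * back[k] * x[j]
        for i in range(d):
            for k in range(code_dim):
                w_dec[i][k] -= lr * g_dec[i][k] / n
        for k in range(code_dim):
            for j in range(d):
                w_enc[k][j] -= lr * g_enc[k][j] / n
    return w_enc, w_dec


# Hypothetical, strongly correlated 4-dimensional index vectors.
data = [
    [1.0, 0.9, 0.1, 0.1],
    [0.8, 0.7, 0.2, 0.2],
    [0.2, 0.2, 0.9, 0.8],
    [0.1, 0.1, 1.0, 0.9],
    [0.5, 0.5, 0.5, 0.4],
    [0.9, 0.8, 0.3, 0.2],
]
initial_error = reconstruction_error(*train(data, code_dim=2, epochs=0), data)
final_error = reconstruction_error(*train(data, code_dim=2), data)
```

Comparing `initial_error` with `final_error` gives the kind of reconstruction-error check the abstract describes, here for a 4-to-2 dimensional reduction.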
ISBN (print): 9781479971299
In this study, we trained a deep autoencoder to build compact representations of short-term spectra of multiple speakers. Using this compact representation as mapping features, we then trained an artificial neural network to predict target voice features from source voice features. Finally, we constructed a deep neural network from the trained deep autoencoder and artificial neural network weights, which were then fine-tuned using back-propagation. We compared the proposed method to existing methods using Gaussian mixture models and frame-selection. We evaluated the methods objectively, and also conducted perceptual experiments to measure both the conversion accuracy and speech quality of selected systems. The results showed that, for 70 training sentences, frame-selection performed best, regarding both accuracy and quality. When using only two training sentences, the pre-trained deep neural network performed best, regarding both accuracy and quality.
ISBN (print): 9783319105932; 9783319105925
Videos exhibit various motion patterns, which can be modeled through the dynamics between adjacent frames. Previous methods based on linear dynamic systems can model dynamic textures but have limited capacity to represent sophisticated nonlinear dynamics. Inspired by the nonlinear expressive power of deep autoencoders, we propose a novel model named dynencoder, which has an autoencoder at the bottom and a variant of it, named dynpredictor, at the top. It generates hidden states from raw pixel inputs via the autoencoder and then encodes the dynamics of state transitions over time via the dynpredictor. A deep dynencoder can be constructed with a proper stacking strategy and trained by layer-wise pre-training and joint fine-tuning. Experiments verify that our model can describe sophisticated video dynamics and synthesize endless video texture sequences with high visual quality. We also design classification and clustering methods based on our model and demonstrate their efficacy on traffic scene classification and motion segmentation. ...