ISBN: 9781479911806 (print)
This paper addresses rotation- and scale-invariant texture classification at the machine learning level by modelling these variations in the texture data as a covariate shift. Covariate shift between the training and testing data is minimised by estimating importance weights for the training data, which are then incorporated into a standard machine learning algorithm such as support vector machines. The effectiveness of these importance-weighted support vector machines (IW-SVM) is tested on the Brodatz dataset. Comparative classification results against several other state-of-the-art methods demonstrate the effectiveness of the proposed covariate shift approach for rotation- and scale-invariant texture classification.
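The importance-weighting idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the weights are estimated, so the sketch assumes one-dimensional features and a ratio of fitted Gaussian densities, and it trains a linear classifier with a weighted hinge loss by subgradient descent as a stand-in for an SVM solver. All function names and the toy data are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def fit_gaussian(samples):
    """Return (mean, std) of the samples; std is floored to avoid division by zero."""
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / len(samples)
    return mean, max(math.sqrt(var), 1e-6)

def importance_weights(train_x, test_x):
    """Assumed estimator: w(x) = p_test(x) / p_train(x) using Gaussian fits."""
    tr_mean, tr_std = fit_gaussian(train_x)
    te_mean, te_std = fit_gaussian(test_x)
    return [gaussian_pdf(x, te_mean, te_std) / gaussian_pdf(x, tr_mean, tr_std)
            for x in train_x]

def train_weighted_linear_svm(xs, ys, weights, lam=0.01, lr=0.05, epochs=200):
    """Minimise lam/2 * w^2 + mean_i weights[i] * hinge(y_i * (w * x_i + b))."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y, c in zip(xs, ys, weights):
            if y * (w * x + b) < 1.0:  # sample violates the margin
                grad_w -= c * y * x / n
                grad_b -= c * y / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy data: the label is +1 for x > 0; the test inputs are shifted to the right.
train_x = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
train_y = [-1, -1, -1, -1, 1, 1, 1, 1]
test_x = [1.0, 1.5, 2.0, 2.5, 3.0]

weights = importance_weights(train_x, test_x)
w, b = train_weighted_linear_svm(train_x, train_y, weights)
```

Training points that lie where the test data is dense receive large weights, so their margin violations dominate the loss. In practice the per-sample weights would be passed to a full SVM solver that accepts them.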
ISBN: 9781479936946 (print)
A central concern for many learning algorithms is how to efficiently store what the algorithm has learned. An algorithm for the compression of Nonnegative Matrix Factorizations is presented. Compression is achieved by embedding the factorization in an encoding routine. Its performance is investigated using two standard test images, Peppers and Barbara. The compression ratio (18:1) achieved by the proposed Matrix Factorization improves the storability of Nonnegative Matrix Factorizations without significantly degrading accuracy (approximately 1-3 dB of degradation is introduced). We learn as before, but storage is cheaper.
ISBN: 9781509063413 (print)
This paper presents an automatic approach to parameter training for a previously published sparsity-based pitch estimation method. For this pitch estimation method, the harmonic dictionary is a key parameter that needs to be carefully prepared beforehand. In the original method, extensive human supervision and involvement are required to construct and label the dictionary. In this study, we propose to employ dictionary learning algorithms to learn the dictionary directly from training data. We apply and compare three typical dictionary learning algorithms, i.e., the method of optimized directions (MOD), K-SVD and online dictionary learning (ODL), and propose a post-processing method to label and adapt a learned dictionary for pitch estimation. Results show that MOD and properly initialized ODL (pi-ODL) can lead to dictionaries that exhibit the desired harmonic structures for pitch estimation, and that the post-processing method can significantly improve the performance of the learned dictionaries in pitch estimation. The dictionary obtained with pi-ODL and post-processing attained pitch estimation accuracy close to the optimal performance of the manual dictionary. The results show that dictionary learning is feasible and promising for this application.
ISBN: 9781728166629 (print)
Bayesian change-point detection, together with latent variable models, makes it possible to segment high-dimensional time series. We assume that change-points lie on a lower-dimensional manifold where we aim to infer subsets of discrete latent variables. For this model, full inference is computationally infeasible and pseudo-observations based on point estimates are used instead. However, if the estimation is not certain enough, change-point detection is affected. To circumvent this problem, we propose a multinomial sampling methodology that improves the detection rate and reduces the delay while keeping complexity stable and inference analytically tractable. Our experiments show results that outperform the baseline method, and we also provide an example oriented to a human behavior study.
ISBN: 9781538654774 (print)
A novel real-time acoustic feedback (RTAF) method based on machine learning is presented, which aims to reduce the duration of rehabilitation and to improve its progress. Wearable technology (WT) has emerged as a viable means to provide low-cost digital healthcare and therapy outside medical environments such as hospitals and clinics. In this paper we show that RTAF together with WT can offer an excellent solution for use in rehabilitation. The machine learning based RTAF method is presented, as well as a study demonstrating its effectiveness. The results show a faster recovery time using RTAF. The proposed RTAF shows great potential for deployment in support of digital healthcare, therapy and rehabilitation.
ISBN: 9781509063413 (print)
The objective of deep learning methods based on encoder-decoder architectures for music source separation is to approximate either ideal time-frequency masks or spectral representations of the target music source(s). The spectral representations are then used to derive time-frequency masks. In this work we introduce a method to directly learn time-frequency masks from an observed mixture magnitude spectrum. We employ recurrent neural networks and train them using prior knowledge only for the magnitude spectrum of the target source. To assess the performance of the proposed method, we focus on the task of singing voice separation. The results from an objective evaluation show that our proposed method provides results comparable to deep learning based methods that operate over complicated signal representations. Compared to previous methods that approximate time-frequency masks, our method improves the signal-to-distortion ratio by an average of 3.8 dB.
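To make the notion of a time-frequency mask concrete, the following minimal sketch shows how a mask is applied to a mixture magnitude spectrogram. It is an illustration only: the recurrent network described in the abstract is not reproduced, the ratio form of the ideal mask is an assumption chosen for simplicity, and the function names and toy values are hypothetical.

```python
def ideal_ratio_mask(target_mag, mixture_mag, eps=1e-8):
    """Ratio mask |S| / |X| per time-frequency bin, clipped to at most 1."""
    return [[min(1.0, s / (x + eps)) for s, x in zip(s_row, x_row)]
            for s_row, x_row in zip(target_mag, mixture_mag)]

def apply_mask(mask, mixture_mag):
    """Element-wise product of a mask with the mixture magnitude spectrogram."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture_mag)]

# Toy magnitude spectrograms with 2 frames and 3 frequency bins (made-up values).
mixture = [[1.0, 2.0, 4.0], [2.0, 2.0, 1.0]]
vocals = [[0.5, 2.0, 1.0], [0.0, 1.0, 1.0]]

mask = ideal_ratio_mask(vocals, mixture)
estimate = apply_mask(mask, mixture)
```

Here the mask is computed from a known target, which is only possible during training or evaluation. A learned model would instead predict the mask from the mixture alone and apply it in the same element-wise way.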
ISBN: 9781467310260 (print)
The aim of this paper is two-fold. First, we show that the newly developed spectral method known as kernel entropy component analysis (kernel ECA) captures cluster structure, which is very important in semi-supervised learning, and we provide an analysis showing how mixture weights influence kernel ECA in a mixture of cluster components setting. Second, we develop a semi-supervised kernel ECA classifier based on the Lasso framework, and report promising results compared to the state of the art.
ISBN: 9781509063413 (print)
Recent work has shown that convolutional neural networks (CNNs) trained in a supervised fashion for speaker identification are able to extract features from spectrograms which can be used for speaker clustering. These features are represented by the activations of a certain hidden layer and are called embeddings. However, previous approaches require plenty of additional speaker data to learn the embedding. Although the clustering results are then on par with more traditional approaches using MFCC features etc., there is room for improvement because these embeddings are trained with a surrogate task that is rather far away from segregating unknown voices - namely, identifying a few specific speakers. We address both problems by training a CNN to extract embeddings that are similar for the same speaker (regardless of the speaker's specific identity) using weakly labeled data. We demonstrate our approach on the well-known TIMIT dataset, which has often been used for speaker clustering experiments in the past. We exceed the clustering performance of all previous approaches, but require just 100 instead of 590 unrelated speakers to learn an embedding suited for clustering.
ISBN: 9781728166629 (print)
In this paper, regularized lightweight deep convolutional neural network models are proposed that are capable of operating effectively in real time on devices with restricted computational power for high-resolution video input. Furthermore, a novel regularization method motivated by the Quadratic Mutual Information is proposed in order to improve the generalization ability of the utilized models. Extensive experiments on various binary classification problems involved in autonomous systems are performed, indicating the effectiveness of the proposed models as well as of the proposed regularizer.
ISBN: 9781467310260 (print)
This paper presents the foundations of a novel method for chirplet signal decomposition. In contrast to basis-pursuit techniques on over-complete dictionaries, the proposed method uses a reduced set of adaptive parametric chirplets. The estimation criterion corresponds to the maximization of the likelihood of the chirplet parameters from redundant time-frequency marginals. The optimization algorithm that results from this scenario combines Gaussian mixture models and Huber's robust regression in an iterative fashion. Simulation results support the proposed avenue.
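For readers unfamiliar with chirplets, the sketch below generates one parametric chirplet. The abstract does not state the parametrization used, so this assumes a common form: a Gaussian envelope multiplying a cosine whose frequency changes linearly in time. The function and parameter names are hypothetical, and the estimation procedure from the abstract is not reproduced.

```python
import math

def gaussian_chirplet(t, amplitude, t_center, f_center, chirp_rate, duration):
    """Real-valued Gaussian chirplet evaluated at time t (seconds)."""
    tau = t - t_center
    envelope = math.exp(-0.5 * (tau / duration) ** 2)  # Gaussian window
    phase = 2.0 * math.pi * (f_center * tau + 0.5 * chirp_rate * tau * tau)
    return amplitude * envelope * math.cos(phase)

def instantaneous_frequency(t, t_center, f_center, chirp_rate):
    """Frequency in Hz at time t: the derivative of the phase divided by 2*pi."""
    return f_center + chirp_rate * (t - t_center)

# One second sampled at 1 kHz; the chirplet is centred at 0.5 s and 50 Hz
# and sweeps upward at 100 Hz per second.
fs = 1000.0
times = [n / fs for n in range(1000)]
signal = [gaussian_chirplet(t, 1.0, 0.5, 50.0, 100.0, 0.1) for t in times]
```

Each chirplet in this form is described by five parameters (amplitude, time centre, frequency centre, chirp rate and duration), which illustrates why a small adaptive set can stand in for a large over-complete dictionary.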