ISBN: (print) 9783037854624
This paper presents a novel non-negative matrix factorization algorithm based on double sparsity K-SVD. It retains the parts-based representation while achieving good sparsity, as in sparse coding. The influence of different initialization conditions is successfully overcome. Compared with other algorithms, the proposed algorithm is much faster. The paper demonstrates the advantages of the proposed algorithm through simulation experiments.
ISBN: (print) 9781728141909
Several real-world applications, notably in non-negative matrix factorization, graph-based clustering, and machine learning, require solving a convex optimization problem over the set of stochastic and doubly stochastic matrices. A common feature of these problems is that the optimal solution is generally a low-rank matrix. This paper suggests reformulating the problem by taking advantage of the low-rank factorization X = UV^T and develops a Riemannian optimization framework for solving optimization problems on the set of low-rank stochastic and doubly stochastic matrices. In particular, this paper introduces and studies the geometry of the low-rank stochastic multinomial and the doubly stochastic manifold in order to derive first-order optimization algorithms. Being carefully designed and of lower dimension than the original problem, the proposed Riemannian optimization framework presents a clear complexity advantage. The claim is attested through numerical experiments on real-world and synthetic data for non-negative matrix factorization (NMF) applications. The proposed algorithm is shown to outperform, in terms of running time, state-of-the-art methods for NMF.
ISBN: (print) 9781665412544
Non-negative matrix factorization (NMF) is a popular research problem in dimensionality reduction. Conventional NMF approaches cannot produce a subspace made up of binary codes from the high-dimensional data space. To address this problem, we propose a method based on non-negative matrix factorization that generates a low-dimensional subspace made up of binary codes from the high-dimensional data. The problem can be expressed mathematically as a mixed 0-1 integer optimization problem. To solve it, we put forward a method based on discrete cyclic coordinate descent that obtains a locally optimal solution. Experiments show that our method achieves better clustering performance than conventional non-negative matrix factorization and its variants.
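The abstract does not give the objective or the update rules, so the following is only a minimal sketch of what a discrete cyclic coordinate descent over binary codes could look like. It assumes the squared reconstruction error ||X - WB||^2 as the objective, a fixed non-negative basis W, and hypothetical names (`binary_code_sweep`, `X_demo`, `W_demo`, `B_demo`); it is not the authors' algorithm. Each bit is visited in turn and set to whichever of 0 or 1 gives the smaller error, so the error cannot increase from one sweep to the next.

```python
import numpy as np

def binary_code_sweep(X, W, B):
    """One cyclic pass over every bit of B with W held fixed (B is updated in place).

    Assumed objective (not taken from the paper): ||X - W @ B||_F^2, B in {0, 1}.
    """
    rank, n_points = B.shape
    for j in range(n_points):
        residual = X[:, j] - W @ B[:, j]
        for k in range(rank):
            # Residual of column j if bit k were switched off.
            base = residual + W[:, k] * B[k, j]
            cost_zero = base @ base
            cost_one = (base - W[:, k]) @ (base - W[:, k])
            new_bit = 1.0 if cost_one < cost_zero else 0.0
            residual = base - W[:, k] * new_bit
            B[k, j] = new_bit
    return B

# Synthetic demo data (hypothetical): X is built from known binary codes.
rng = np.random.default_rng(0)
W_demo = rng.random((6, 3))                           # non-negative basis
B_true = (rng.random((3, 5)) > 0.5).astype(float)     # ground-truth codes
X_demo = W_demo @ B_true

B_demo = np.zeros_like(B_true)                        # start from all-zero codes
sweep_errors = [np.linalg.norm(X_demo - W_demo @ B_demo)]
for _ in range(5):
    binary_code_sweep(X_demo, W_demo, B_demo)
    sweep_errors.append(np.linalg.norm(X_demo - W_demo @ B_demo))
```

A full method would also have to update W between sweeps; that step is omitted here because the abstract does not describe it.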
ISBN: (print) 9781577354635
Non-negative matrix factorization (NMF), as a useful decomposition method for multivariate data, has been widely used in pattern recognition, information retrieval and computer vision. NMF is an effective algorithm for finding the latent structure of the data and leads to a parts-based representation. However, NMF is essentially an unsupervised method and cannot make use of label information. In this paper, we propose a novel semi-supervised matrix decomposition method, called Constrained non-negative matrix factorization, which takes the label information as additional constraints. Specifically, we require that data points sharing the same label have the same coordinates in the new representation space. In this way, the learned representations have more discriminating power. We demonstrate the effectiveness of this novel algorithm through a set of evaluations on real-world applications.
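One way to enforce the stated requirement, that points with the same label share the same coordinates, is to write the representation as V = A Z. Here A is a 0/1 indicator matrix with one column per class plus one column per unlabeled point, and Z is a non-negative auxiliary matrix. The sketch below illustrates only this construction. The abstract does not spell out the formulation, so the function name `build_constraint_matrix`, the use of `None` for unlabeled points, and the random Z are assumptions made for illustration.

```python
import numpy as np

def build_constraint_matrix(labels):
    """Build a 0/1 matrix A of shape (n_points, n_classes + n_unlabeled).

    labels: list holding a class id for labeled points and None for unlabeled
    ones (a hypothetical input convention). Labeled points of one class share
    a column; every unlabeled point gets a column of its own.
    """
    classes = sorted({c for c in labels if c is not None})
    class_col = {c: j for j, c in enumerate(classes)}
    n_unlabeled = sum(c is None for c in labels)
    A = np.zeros((len(labels), len(classes) + n_unlabeled))
    next_free = len(classes)
    for i, c in enumerate(labels):
        if c is None:
            A[i, next_free] = 1.0
            next_free += 1
        else:
            A[i, class_col[c]] = 1.0
    return A

labels = [0, 0, 1, None, 1, None]        # toy labels, two classes, two unlabeled
A = build_constraint_matrix(labels)
rng = np.random.default_rng(0)
Z = rng.random((A.shape[1], 2))          # stand-in for the learned auxiliary matrix
V = A @ Z                                # rows of V are the new coordinates
```

Because rows 0 and 1 of A are identical, rows 0 and 1 of V are identical for any Z, so the constraint holds by construction rather than through a penalty term.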
ISBN: (print) 9781450350228
Document clustering is central in modern information retrieval applications. Among existing models, non-negative matrix factorization (NMF) approaches have proven effective for this task. However, NMF approaches, like other models in this context, exhibit a major drawback: they use the bag-of-words representation and thus do not account for the sequential order in which words occur in documents. This is an important issue, since it may result in a significant loss of semantics. In this paper, we aim to address this issue and propose a new model that integrates a word embedding model, word2vec, into an NMF framework so as to leverage the semantic relationships between words. Empirical results on several real-world datasets demonstrate the benefits of our model in terms of text document clustering as well as document/word embedding.
ISBN: (print) 9781728107080
We propose the approximation-theoretic technique of optimal recovery for imputing missing values in clustered data, specifically for non-negative matrix factorization (NMF), and develop an algorithm for implementation. Under certain geometric conditions, we prove tight upper bounds on NMF relative error, the first bounds of this type for missing values. Experiments on image data and biological data show that this technique performs as well as or better than other imputation techniques that account for local structure.
ISBN: (print) 9781424444199
This paper addresses the problem of segmenting low-level partial feature point tracks belonging to multiple motions. We show that the local velocity vectors at each instant of the trajectory are an effective basis for motion segmentation. We decompose the velocity profiles of point tracks into different motion components and corresponding non-negative weights using non-negative matrix factorization (NNMF). We then segment the different motions using spectral clustering on the derived weights. We test our algorithm on the Hopkins 155 benchmarking database and several new sequences, demonstrating that the proposed algorithm can accurately segment multiple motions at a speed of a few seconds per frame. We show that our algorithm is particularly successful on low-level tracks from real-world video that are fragmented, noisy and inaccurate.
ISBN: (print) 9781467300469
This paper proposes a multi-stream speech recognition system that combines information from three complementary analysis methods in order to improve automatic speech recognition in highly noisy and reverberant environments, as featured in the 2011 PASCAL CHiME Challenge. We integrate word predictions by a bidirectional Long Short-Term Memory recurrent neural network and non-negative sparse classification (NSC) into a multi-stream Hidden Markov Model using convolutive non-negative matrix factorization (NMF) for speech enhancement. Our results suggest that NMF-based enhancement and NSC are complementary despite their overlap in methodology, reaching up to 91.9% average keyword accuracy on the Challenge test set at signal-to-noise ratios from -6 to 9 dB, the best result reported so far on these data.
ISBN: (print) 9781665406017
Accurate estimation of delays in a network is crucial for its management. In real-world applications, it is not always possible to conduct on-demand measurements regularly on the overall network. Doing so is costly and time-consuming, and some equipment may not respond to the probes sent in the network. In this paper, we formulate the network delay prediction problem as a non-negative matrix factorization problem with piecewise constant coefficients of the approximate instantaneous representation of data. We choose this approach to exploit the strong spatial and temporal correlations that appear in network delay data. To solve this factorization problem, we consider two different algorithms: an alternating projected gradient algorithm and the NeNMF algorithm. Finally, we study the efficiency of our approach on two datasets. The first is a synthetic dataset produced by a simulator that we have designed, and the second is composed of RTT measurements from RIPE Atlas.
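For readers unfamiliar with alternating projected gradient methods, the sketch below shows the basic scheme on plain NMF: take a gradient step on one factor, clip negative entries to zero, then do the same for the other factor. It is a generic illustration under stated assumptions, not the paper's algorithm. The piecewise constant constraint on the coefficients is omitted, the step size 1/L (L being the spectral norm of the relevant Gram matrix) is one common choice, and `nmf_projected_gradient` and the demo variables are hypothetical names.

```python
import numpy as np

def nmf_projected_gradient(X, rank, n_iter=200, seed=0):
    """Alternating projected gradient for min 0.5 * ||X - W @ H||_F^2, W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    errors = []
    for _ in range(n_iter):
        # Gradient w.r.t. W is (W H - X) H^T; its Lipschitz constant is ||H H^T||_2.
        step_w = 1.0 / (np.linalg.norm(H @ H.T, 2) + 1e-12)
        W = np.maximum(W - step_w * ((W @ H - X) @ H.T), 0.0)
        # Gradient w.r.t. H is W^T (W H - X); its Lipschitz constant is ||W^T W||_2.
        step_h = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        H = np.maximum(H - step_h * (W.T @ (W @ H - X)), 0.0)
        errors.append(np.linalg.norm(X - W @ H))
    return W, H, errors

# Synthetic non-negative data of rank 3 (hypothetical stand-in for a delay matrix).
demo_rng = np.random.default_rng(1)
X_delay = demo_rng.random((8, 3)) @ demo_rng.random((3, 10))
W_fit, H_fit, fit_errors = nmf_projected_gradient(X_delay, rank=3)
```

With the 1/L step, each half-update is a projected gradient step on a convex subproblem, so the reconstruction error recorded in `fit_errors` should not increase between iterations.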
ISBN: (print) 9789464593617; 9798331519773
Liquid chromatography coupled to High-Resolution Mass Spectrometry (LC-HRMS) is the most widely used approach for the global detection of small molecules in biological samples (metabolomics). To complement such MS1 data, structural identification of metabolites requires the acquisition of fragmentation spectra through tandem mass spectrometry (MS2) experiments. To achieve both global detection and identification in a single run, the recently introduced acquisition mode called Sequential Window Acquisition of all THeoretical fragment ions (SWATH-type) Data Independent Acquisition (DIA) alternates MS1 detection and MS2 analysis of large and continuous m/z windows. The resulting MS2 data, however, contain a mixture of fragment ions originating from different precursor ions. To deconvolve these data and reconstruct pure individual MS2 spectra, the few existing software tools rely on determining a peak shape for each precursor ion. Such a strategy, however, may fail to separate co-eluting compounds. Here, we show how sparse non-negative matrix factorization (NMF) can separate pure spectral components successfully. We developed an end-to-end workflow called DIA-NMF to process SWATH DIA files and identify the detected compounds, and we show that it outperforms the reference algorithms MS-DIAL and DecoMetDIA, especially in the case of low-intensity or co-eluting compounds. Importantly, the reconstructed spectra include all the MS1 and MS2 ions related to the sought compounds and thus provide enriched chemical information that facilitates interpretation and identification.