Next-generation wireless networks are witnessing an increasing number of clustering applications and produce large amounts of non-linear, unlabeled data. Single-kernel methods face, to some degree, the challenging problem of kernel choice. To overcome this problem for non-linear data clustering, multiple kernel graph-based clustering (MKGC) has attracted intense attention in recent years. However, existing MKGC methods suffer from two common problems: (1) they mainly aim to learn a consensus kernel from multiple candidate kernels and pay little attention to affinity graph learning, so they cannot fully exploit the underlying graph structure of non-linear data; (2) they disregard the high-order correlations between all base kernels, so they cannot fully capture the consistent and complementary information of all kernels. In this paper, we propose a novel non-negative matrix factorization (NMF) tailored graph tensor MKGC method for non-linear data clustering, namely TMKGC. Specifically, TMKGC integrates NMF and graph learning in kernel space so as to learn multiple candidate affinity graphs. Afterwards, the high-order structure information of all candidate graphs is captured in a 3-order tensor kernel space by introducing a tensor nuclear norm based on tensor singular value decomposition, so that an optimal affinity graph can subsequently be obtained. Based on the alternating direction method of multipliers, effective local and distributed solvers are developed to solve the proposed objective function. Extensive experiments demonstrate the superiority of TMKGC over state-of-the-art MKGC methods.
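As a rough illustration of the tensor nuclear norm mentioned in the abstract above, the sketch below stacks candidate graphs into a 3-order tensor, applies an FFT along the third mode, and sums the singular values of each frontal slice. The function name `tensor_nuclear_norm`, the 1/n3 scaling, and the stacking order are assumptions made for illustration; the abstract does not specify the exact convention used by TMKGC.

```python
import numpy as np

def tensor_nuclear_norm(T):
    """t-SVD based tensor nuclear norm of an (n1, n2, n3) array.

    Assumed convention: FFT along the third mode, then the average
    over frontal slices of the matrix nuclear norm in the Fourier domain.
    """
    n3 = T.shape[2]
    T_hat = np.fft.fft(T, axis=2)  # frontal slices in the Fourier domain
    total = 0.0
    for k in range(n3):
        # nuclear norm of one slice = sum of its singular values
        total += np.linalg.svd(T_hat[:, :, k], compute_uv=False).sum()
    return total / n3

# Hypothetical usage: three random symmetric "affinity graphs" stacked as slices.
rng = np.random.default_rng(0)
graphs = [rng.random((6, 6)) for _ in range(3)]
graphs = [(G + G.T) / 2 for G in graphs]
stack = np.stack(graphs, axis=2)
print(tensor_nuclear_norm(stack))
```

Under this convention, a tensor whose slices are all the same matrix has the same norm as that single matrix, which is one way to see that the norm rewards agreement between candidate graphs.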
This paper presents a voice conversion (VC) technique for noisy environments based on a sparse representation of speech. Sparse representation-based VC using non-negative matrix factorization (NMF) is employed for noise-added spectral conversion between different speakers. In our previous exemplar-based VC method, source exemplars and target exemplars are extracted from parallel training data, in which the source and target speakers utter the same texts. The input source signal is represented using the source exemplars and their weights. The converted speech is then constructed from the target exemplars and the weights of the source exemplars. However, this exemplar-based approach needs to hold all training exemplars (frames), and it requires long computation times to obtain the weights of the source exemplars. In this paper, we propose a framework to train the basis matrices of the source and target exemplars so that they have a common weight matrix. By using the basis matrices instead of the exemplars, VC is performed with lower computation times than with the exemplar-based method. The effectiveness of this method was confirmed by comparing it, in speaker conversion experiments using noise-added speech data, with an exemplar-based method and a conventional Gaussian mixture model (GMM)-based method.
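The conversion step described in the abstract above (estimate weights against the source basis, then reuse them with the target basis) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean multiplicative update, and the random test matrices are all assumptions, and the noise handling and sparsity constraints of the actual method are omitted.

```python
import numpy as np

def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative weights H with V ~= W @ H, keeping W fixed.

    Uses the standard multiplicative update for the Euclidean cost
    (an assumed choice; the abstract does not name the cost function).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def convert(V_src, W_src, W_tgt):
    """Map source spectra to target spectra through the shared weights."""
    H = estimate_activations(V_src, W_src)  # weights w.r.t. the source basis
    return W_tgt @ H                        # same weights, target basis

# Hypothetical data: 20 frequency bins, 5 paired bases, 8 frames.
rng = np.random.default_rng(1)
W_src, W_tgt = rng.random((20, 5)), rng.random((20, 5))
V_src = W_src @ rng.random((5, 8))
print(convert(V_src, W_src, W_tgt).shape)
```

Because the basis matrices have far fewer columns than a full exemplar dictionary has frames, the update above works on much smaller matrices, which is consistent with the lower computation time the abstract reports.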
We compared non-negative matrix factorization (NMF) and convolution kernel compensation techniques for high-density electromyogram decomposition. The experimental data were recorded from nine healthy persons during controlled single degree of freedom (DOF) wrist flexion-extension, supination-pronation, and ulnar-radial deviation movements. We assembled the identified motor units and NMF components into three groups. Those active mostly during the first and the second movement direction per DOF were placed in the G1 and G3 groups, respectively. The remaining components were nonspecific for movement direction and were placed in the G2 group. In ulnar and radial deviation, the relative energies of identified cumulative motor unit spike trains (CSTs) and NMF components were similarly distributed among the groups. In the other two movement types, the energy of NMF components in the G2 group was significantly larger than the energy of CSTs. We further performed a coherence analysis between CSTs and sums of NMF components in each group. The two decompositions matched well, but only at frequencies below 3 Hz. At higher frequencies, the coherence rarely exceeded 0.5. Potential reasons for these discrepancies include the negative impact of motor unit action potential shapes and noise on NMF decomposition.
Feature extraction methods for sound events have traditionally been based on parametric representations specifically developed for speech signals, such as the well-known Mel Frequency Cepstrum Coefficients (MFCC). However, the discrimination capabilities of these features for Acoustic Event Classification (AEC) tasks could be enhanced by taking into account the spectro-temporal structure of acoustic event signals. In this paper, a new front-end for AEC that incorporates this specific information is proposed. It consists of two stages: short-time feature extraction and temporal feature integration. The first stage aims to provide a better spectral representation of the different acoustic events on a frame-by-frame basis, by automatically selecting the optimal set of frequency bands from which cepstral-like features are extracted. The second stage is designed to capture the most relevant temporal information in the short-time features by applying non-negative matrix factorization (NMF) to their periodograms computed over long audio segments. The whole front-end has been evaluated in clean and noisy conditions. Experiments show that removing certain frequency bands (mainly located in the medium region of the spectrum for clean conditions and in low frequencies for noisy environments) in the short-time feature computation, in conjunction with the NMF technique for temporal feature integration, significantly improves the performance of a Support Vector Machine (SVM) based AEC system with respect to the use of conventional MFCCs. (C) 2015 Elsevier Ltd. All rights reserved.
Social tagging, also known as collaborative tagging or folksonomy, is an important way for users themselves to describe resources on the Web. The tags that web users adopt to describe resources are called social tags, and they have been widely used and studied. However, in the absence of a centrally controlled vocabulary, the semantics of social tags are ambiguous because of constant changes in either the users' interests or the informal definitions, which makes it hard to use these social tags directly in web applications. In this paper, we propose a non-negative matrix factorization (NMF) based method to automatically induce topic senses from social tags, which can then be used for tag disambiguation. A novel automatic evaluation method is also proposed to evaluate our method. The experimental results show that the proposed topic sense induction method can help to provide precise resource search and recommendation, which is one of the key functionalities in social tagging systems. (C) 2014 Elsevier Inc. All rights reserved.
Diffuse reflectance spectrophotometry (DRS), often termed simply color scanning, is a technique commonly applied to marine sediments to provide records of compositional variations. Measured DRS spectra, however, represent the bulk response of a sediment's constituents and are therefore difficult to interpret in their raw form. A quantification technique is proposed and discussed which approaches the analysis of DRS data sets as a linear mixing problem and applies a non-negative matrix factorization (NMF) algorithm in their decomposition. The presented methodology allows the spectra of the end-member sediment constituents and their fractional abundances to be determined using only the measured data set. Unlike other DRS data processing techniques, NMF only allows additive combinations of end-members to explain the data set and will therefore return only positive abundances. Analysis of sediments from the eastern Mediterranean Sea demonstrates the applicability of the NMF approach, and the new method of Orbital Cycle Stacking establishes that the "unmixed" data are consistent with expected environmental change. (C) 2007 Elsevier B.V. All rights reserved.
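The linear mixing view in the abstract above can be illustrated with a basic NMF that factors a matrix of measured spectra X (samples by wavelengths) into abundances A and end-member spectra S. This is a minimal sketch using the classic multiplicative updates for the Euclidean cost; the function name `nmf_unmix`, the synthetic data, and the choice of update rule are assumptions, and the paper's own algorithm may differ.

```python
import numpy as np

def nmf_unmix(X, k, n_iter=500, eps=1e-12, seed=0):
    """Factor non-negative X (samples x wavelengths) as A @ S.

    A: (samples x k) abundances, S: (k x wavelengths) end-member spectra.
    Both stay non-negative because each update only multiplies by
    non-negative ratios.
    """
    rng = np.random.default_rng(seed)
    A = rng.random((X.shape[0], k)) + eps
    S = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        A *= (X @ S.T) / (A @ S @ S.T + eps)
        S *= (A.T @ X) / (A.T @ A @ S + eps)
    return A, S

# Hypothetical data: 30 samples mixed from 2 end-members over 40 wavelengths.
rng = np.random.default_rng(2)
X = rng.random((30, 2)) @ rng.random((2, 40))
A, S = nmf_unmix(X, k=2)
print(np.linalg.norm(X - A @ S) / np.linalg.norm(X))
```

NMF fixes A and S only up to a scaling between them, so the rows of A would need to be normalized (for example, to sum to one) before being read as fractional abundances.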
A recent theoretical analysis shows the equivalence between non-negative matrix factorization (NMF) and spectral-clustering-based approaches to subspace clustering. As NMF and many of its variants are essentially linear, we introduce a nonlinear NMF with explicit orthogonality and derive general kernel-based orthogonal multiplicative update rules to solve the subspace clustering problem. Within the nonlinear orthogonal NMF framework, we propose two subspace clustering algorithms, named kernel-based non-negative subspace clustering KNSC-Ncut and KNSC-Rcut, and establish their connection with spectral normalized cut and ratio cut clustering. We further extend the nonlinear orthogonal NMF framework and introduce a graph regularization to obtain a factorization that respects the local geometric structure of the data after the nonlinear mapping. The proposed NMF-based approach to subspace clustering takes into account the nonlinear nature of the manifold, as well as its intrinsic local geometry, which considerably improves the clustering performance compared with several recently proposed state-of-the-art methods. (C) 2018 Elsevier Ltd. All rights reserved.
Identification of the family to which a malware specimen belongs is essential in understanding the behavior of the malware and developing mitigation strategies. Solutions proposed by prior work, however, are often not practicable due to the lack of realistic evaluation factors. These factors include learning under class imbalance, the ability to identify new malware, and the cost of production-quality labeled data. In practice, deployed models face prominent, rare, and new malware families. At the same time, obtaining a large quantity of up-to-date labeled malware for training a model can be expensive. In this article, we address these problems and propose a novel hierarchical semi-supervised algorithm, which we call the HNMFk Classifier, that can be used in the early stages of the malware family labeling process. Our method is based on non-negative matrix factorization with automatic model selection, that is, with an estimation of the number of clusters. With the HNMFk Classifier, we exploit the hierarchical structure of the malware data together with a semi-supervised setup, which enables us to classify malware families under conditions of extreme class imbalance. Our solution can perform abstaining predictions (a reject option), which yields promising results in the identification of novel malware families and helps maintain the performance of the model when a low quantity of labeled data is used. We perform bulk classification of nearly 2,900 malware families, both rare and prominent, through static analysis, using nearly 388,000 samples from the EMBER-2018 corpus. In our experiments, we surpass both supervised and semi-supervised baseline models with an F1 score of 0.80.
Diffusion tensor imaging (DTI) offers rich insights into the physical characteristics of white matter (WM) fiber tracts and their development in the brain, facilitating a network representation of the brain's traffic pathways. Such a network representation of brain connectivity has provided a novel means of investigating brain changes arising from pathology, development or aging. The high dimensionality of these connectivity networks necessitates the development of methods that identify the connectivity building blocks or sub-network components that characterize the underlying variation in the population. In addition, the projection of the subject networks into the basis set provides a low-dimensional representation that teases apart different sources of variation in the sample, facilitating variation-specific statistical analysis. We propose a unified framework of non-negative matrix factorization and graph embedding for learning sub-network patterns of connectivity by their projective non-negative decomposition into a reconstructive basis set, as well as additional basis sets representing variational sources in the population such as age and pathology. The proposed framework is applied to a study of diffusion-based connectivity in subjects with autism that shows localized sparse sub-networks which mostly capture the changes related to pathology and developmental variations. (C) 2014 Elsevier B.V. All rights reserved.
With its unique geometric properties, non-negative matrix factorization (NMF) has become one of the widely used clustering methods in the field of data mining. Regrettably, most existing NMF methods are sensitive to super-noise (super-outliers). This paper proposes a novel robust clustering method to address this issue. Based on the Hx loss function, this method establishes a novel robust adaptive local structure learning strategy, reducing the interference of noise (outliers) on data reconstruction and space exploration. In addition, a new orthogonal regularization term is incorporated into the model, ensuring the orthogonality of the factor matrix and enhancing the discriminant ability. Finally, we develop an efficient algorithm to solve the resultant model and analyze its convergence from theoretical and experimental aspects. Experimental results on random synthetic data sets and benchmark databases demonstrate that the proposed method outperforms the existing robust NMF methods in terms of spatial structure learning, discriminant power, and robustness. (C) 2022 Elsevier Inc. All rights reserved.