Non-negative matrix factorization (NMF) is an efficient dimension reduction method and plays an important role in many pattern recognition and computer vision tasks. However, conventional NMF methods are not robust, since their objective functions are sensitive to outliers and do not consider the geometric structure of datasets. In this paper, we propose a correntropy graph regularized NMF (CGNMF) to overcome these problems. CGNMF maximizes the correntropy between the data matrix and its reconstruction to filter out large-magnitude noise, and expects the coefficients to preserve the intrinsic geometric structure of the data. We also propose a modified version of CGNMF that constructs the adjacency graph by using sparse representation to enhance its reliability. Experimental results on popular image datasets confirm the effectiveness of CGNMF.
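The robustness argument above can be illustrated with a minimal sketch. It is not the CGNMF objective itself, only the standard empirical correntropy with an unnormalized Gaussian kernel, compared against a squared-error loss. The function name `correntropy`, the kernel width `sigma`, and the toy data are assumptions made for illustration.

```python
import numpy as np

def correntropy(X, X_hat, sigma=1.0):
    """Empirical correntropy between X and X_hat: the mean over all
    entries of an (unnormalized) Gaussian kernel of the residual."""
    diff = X - X_hat
    return float(np.mean(np.exp(-diff ** 2 / (2.0 * sigma ** 2))))

# Toy data (hypothetical): a matrix and a near-perfect reconstruction.
rng = np.random.default_rng(0)
X = rng.random((20, 30))
X_hat = X + 0.01 * rng.standard_normal(X.shape)

# Corrupt a single entry of the data with a large-magnitude outlier.
X_outlier = X.copy()
X_outlier[0, 0] += 100.0

# Squared error is dominated by the single outlier ...
sq_clean = float(np.sum((X - X_hat) ** 2))
sq_out = float(np.sum((X_outlier - X_hat) ** 2))

# ... while correntropy changes by at most 1 / X.size, because each
# kernel value is bounded between 0 and 1.
c_clean = correntropy(X, X_hat)
c_out = correntropy(X_outlier, X_hat)

print(f"squared error: clean={sq_clean:.4f}, with outlier={sq_out:.1f}")
print(f"correntropy:   clean={c_clean:.6f}, with outlier={c_out:.6f}")
```

Because every kernel value is bounded, one corrupted entry can shift the correntropy only slightly, whereas it can dominate a squared-error objective.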
ISBN: 9781479947638 (print)
Data intensiveness and energy awareness are actively reshaping the design of high-performance computing (HPC) systems. Graph500 is a widely adopted benchmark for evaluating the performance of computing systems on data-intensive workloads. In this paper, we introduce a data-parallel implementation of Graph500 on the Intel Single-chip Cloud Computer (SCC). The SCC features a non-coherent many-core architecture and multi-domain on-chip DVFS support for dynamic power management. With our custom-made shared virtual memory programming library, threads share memory efficiently via the shared physical memory (SPM), while the library takes care of coherence. We conduct an in-depth study of the power and performance characteristics of the Graph500 workloads running on this system with varying system scales and power states. Our experimental results offer insight into the design of energy-efficient many-core systems for data-intensive applications.
Virtualization is the foundation of cloud computing, and it cannot be achieved without software-defined, elastic, flexible and scalable virtual layers. Unfortunately, if multiple virtual storage devices are chained together, the system may suffer severe performance degradation. While the read-ahead (RA) mechanism in storage devices plays a very important role in improving I/O performance, RA may not be as effective as expected across multiple virtualization layers, since it was originally designed for one layer only. When I/O requests pass through a long I/O path, they may trigger a chain reaction that leads to unnecessary data transmission and thus wasted bandwidth. In this paper, we study the dynamic behavior of RA through multiple I/O layers and demonstrate that, if controlled well, RA can greatly accelerate I/O. We present RAFlow, an RA control mechanism that effectively improves I/O performance by strategically expanding the RA window at each layer. Our real-world experiments show that it can achieve a 20% to 50% performance improvement in I/O paths with up to 8 virtualized storage devices.
ISBN: 9781479928941 (print)
Exploiting the non-negativity of the magnitude spectrogram of speech signals, nonnegative matrix factorization (NMF) has obtained promising performance for speech separation by independently learning a dictionary on the speech signals of each known speaker. However, traditional NMF fails to represent the mixture signals accurately because the dictionaries for the speakers are learned in the absence of the mixture signals. In this paper, we propose a new transductive NMF algorithm (TNMF) that jointly learns a dictionary on both the speech signals of each speaker and the mixture signals to be separated. Since TNMF, by encoding the mixture signals, learns a more descriptive dictionary than that learned by NMF, it significantly boosts the separation performance. Experimental results on the popular TIMIT dataset show that the proposed TNMF-based methods outperform traditional NMF-based methods for separating monophonic mixtures of speech signals of known speakers.
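For orientation, the traditional baseline that the abstract contrasts against can be sketched as follows. This is not TNMF: it is a minimal illustration of conventional supervised NMF separation, in which each speaker's dictionary is learned independently and the mixture is only encoded afterwards. It assumes multiplicative updates for the Frobenius loss and ratio-mask reconstruction. All names (`learn_dictionary`, `encode`, `separate`), the ranks, and the random toy "spectrograms" are hypothetical.

```python
import numpy as np

EPS = 1e-9  # guards against division by zero

def learn_dictionary(V, rank, n_iter=200, seed=0):
    """Factor V ~ W @ H with multiplicative updates; return the dictionary W."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W

def encode(V, W, n_iter=200, seed=0):
    """Find non-negative activations H with V ~ W @ H, keeping W fixed."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H

def separate(V_mix, W1, W2):
    """Encode the mixture on the stacked dictionaries, then split it
    between the two speakers with ratio masks."""
    H = encode(V_mix, np.hstack([W1, W2]))
    r1 = W1.shape[1]
    est1 = W1 @ H[:r1]
    est2 = W2 @ H[r1:]
    total = est1 + est2 + EPS
    return V_mix * est1 / total, V_mix * est2 / total

# Toy non-negative "magnitude spectrograms" (random data, illustration only).
rng = np.random.default_rng(1)
S1 = rng.random((16, 40))
S2 = rng.random((16, 40))
V_mix = S1[:, :10] + S2[:, :10]

# The dictionaries never see the mixture during training.
W1 = learn_dictionary(S1, rank=4, seed=2)
W2 = learn_dictionary(S2, rank=4, seed=3)
out1, out2 = separate(V_mix, W1, W2)
print(out1.shape, out2.shape)
```

In this baseline `W1` and `W2` are fixed before the mixture is observed, which is the limitation the abstract attributes to traditional NMF.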
User request trace-oriented monitoring is an effective method to improve the reliability of cloud systems. However, traces are difficult to obtain in practice, which hinders the development of trace-oriented monitoring research. In this paper, we release a fine-grained, user request-centric open trace data set, called Trace Bench, collected on a real-world cloud storage system deployed in a real environment. During collection, many aspects are varied to simulate different scenarios, including cluster size, request type, workload speed, etc. Besides recording traces while the monitored system is running normally, we also collect traces with faults injected. With a mature injection tool, 14 faults are introduced, including function faults and performance faults. The traces in Trace Bench are grouped into different files, where each file corresponds to a certain scenario. The whole collection work lasted for more than half a year, resulting in more than 360,000 traces in 361 files. In addition, we run several applications based on Trace Bench, which validate its usefulness for the field of trace-oriented monitoring.
An improved algorithm is proposed for reconstructing singular connectivity from the available pairwise connections during the preprocessing phase. To evaluate the performance of the algorithm, we use an in-house computational fluid dynamics (CFD) code that employs a high-order finite-difference method for spatial discretization and runs on the Tianhe-1A supercomputer. Test cases with varying numbers of mesh points are chosen, and the results indicate that the improved singular connection reconstruction algorithm achieves a speedup of at least 2000× compared with the naive search method adopted in the former version of our code. Moreover, parallel efficiency benefits from the local communication strategy based on the algorithm.
ISBN: 9781450323697 (print)
The "Internetware" paradigm is fundamentally changing the traditional way of software development. More and more software projects are developed, maintained and shared on the Internet. However, a large quantity of heterogeneous software resources have not been organized in a reasonable and efficient way. Software features are an ideal material for characterizing software resources. The effectiveness of feature-related tasks would be greatly improved if a multi-grained feature repository were available. In this paper, we propose a novel approach for organizing, analyzing and recommending software features. First, we construct a Hierarchical rEpository of Software feAture (HESA). Then, we mine the hidden affinities among the features and recommend relevant and high-quality features to stakeholders based on HESA. Finally, we conduct a user study to evaluate our approach quantitatively. The results show that HESA can organize software features in a more reasonable way than the traditional and state-of-the-art approaches, and that the feature recommendation is effective and interesting.
Categories and Subject Descriptors: D.2.9 [Software Engineering]: Mining Software Repository; H.3.3 [Information Storage and Retrieval]: Feature Model, Clustering, Query formulation.
General Terms: Algorithms, Human Factors.
Nowadays, the demand for software resources at different granularities is becoming prominent in the software engineering field. However, a large quantity of heterogeneous software resources have not been organized in a reas...
With the rapid development of the Internet, the de facto inter-domain routing protocol, Border Gateway Protocol (BGP), has become very vulnerable to many attacks. For this, several secure inter-domain protocols have b...
Wireless sensor networks (WSNs) are a key technology extensively applied in many fields, such as transportation, healthcare and environment monitoring. Despite rapid development, the exponentially increasing data emana...