Audio matching automatically retrieves, from given audio recordings, all excerpts that have the same content as a query audio clip. The extracted feature is critical for audio matching, and the Chroma Energy Normalized Statistics (CENS) feature is the state of the art. However, CENS may perform unsatisfactorily on some audio because it is a handcrafted feature. In this paper, we propose using features learned by a Convolutional Deep Belief Network (CDBN) to improve audio matching performance. Benefiting from the strong generalization ability of the CDBN, our method works better than CENS-based methods on most audio datasets. Since the features learned by the CDBN are binary-valued, we can exploit this property to develop a more efficient audio matching algorithm. Experimental results on both the TIMIT dataset and a simulated music dataset confirm the effectiveness of the proposed CDBN-based method compared with the traditional CENS-based algorithm.
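The abstract does not say how the binary-valued features are exploited. One common way to match binary features cheaply is to pack each frame into an integer and compare frames by Hamming distance (XOR plus bit count). The sketch below illustrates that idea only; the function names, the bit-packed frame layout, and the `max_distance` threshold are assumptions for illustration, not the authors' algorithm.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two bit-packed feature frames."""
    return bin(a ^ b).count("1")


def match_offsets(query, recording, max_distance):
    """Return every offset where the query aligns within max_distance bits.

    query and recording are lists of integers, one bit-packed frame each
    (a hypothetical layout chosen for this sketch).
    """
    hits = []
    for start in range(len(recording) - len(query) + 1):
        total = 0
        for q, r in zip(query, recording[start:start + len(query)]):
            total += hamming(q, r)
            if total > max_distance:
                break  # this alignment already exceeds the budget
        else:
            hits.append(start)
    return hits


recording = [0b1010, 0b1100, 0b0111, 0b1010, 0b1101, 0b0111]
query = [0b1100, 0b0111]
exact = match_offsets(query, recording, 0)  # exact matches only
near = match_offsets(query, recording, 1)   # tolerate one flipped bit
print(exact, near)
```

The early exit inside the inner loop is what makes binary features attractive here: most misaligned offsets are rejected after one or two frames.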
The interconnect network plays an important role in high-performance computing systems, and its manageability directly affects the RAS (Reliability, Availability, and Serviceability) of the whole system. The Tianhe-2 system, located at NSCC-GZ (the National Supercomputing Center of China in Guangzhou), uses a proprietary interconnect network that includes 5,856 high-radix network router chips (NRCs) and 18,304 network interface chips (NICs). For such a large-scale interconnect network, it is a great challenge to manage (that is, configure, monitor, and debug) the numerous network chips and their network ports efficiently. By implementing in-band management with very few hardware resources, the interconnect network in the Tianhe-2 system achieves highly efficient network management. In this paper, we introduce the design and implementation of in-band management for the Tianhe-2 interconnect network, emphasizing several key features: the set of management functions achieved, the architecture of network management, the format of management packets, and the data flow and processing of management packets. We also evaluate the performance of in-band management, mainly by comparing it with an out-of-band management scheme. The preliminary results demonstrate the efficiency of in-band management for the interconnect network in the Tianhe-2 system.
Advances in process technology raise growing concern about the Single Event (SE) sensitivity of Differential Cascode Voltage Switch Logic (DCVSL) circuits. Simulation results indicate that the Single Event Transient (SET) generated at a DCVSL gate is much larger than that at an ordinary CMOS gate, and that the SET variation of the two gates differs. In this paper, an effective collection time theory based on charge collection is proposed to explain the SET pulse generated at the DCVSL gate. Through 3D TCAD mixed-mode simulation in a 65 nm twin-well bulk CMOS process, we investigate how SET variation is affected by device parameters such as well contact size and by environment parameters such as voltage.
ISBN: 9781509053827 (print)
The big data era is characterized by the emergence of live data with high volume and fast arrival rates. This poses a new challenge to stream processing applications: how to process unbounded live data in real time with high throughput. The sliding window technique is widely used to handle unbounded live data by storing the most recent history of streams. However, existing centralized solutions cannot satisfy the requirements for high processing capacity and low latency because of the single-node bottleneck. Moreover, existing studies on distributed windows primarily focus on specific operators, while a general framework for processing various window-based operators is still lacking. In this paper, we first classify window-based operators into two categories: data-independent operators and data-dependent operators. Then, we propose GDSW, a general framework for distributed count-based sliding windows, which can handle both data-independent and data-dependent operators. In addition, to balance system load, we propose a dynamic load-balancing algorithm called DAD, which is based on buffer usage. Our framework is implemented on Apache Storm 0.10.0. Extensive evaluation shows that GDSW achieves sub-second latency and a 10x improvement in throughput compared with centralized processing when handling high data rates or large windows.
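For readers unfamiliar with count-based sliding windows, the following single-node sketch shows the basic mechanism: the window always holds the most recent `size` items, and an aggregate is updated incrementally as items arrive and expire. It is an illustration only; it does not reproduce GDSW's distributed design or the DAD algorithm, and the function name `sliding_sums` is hypothetical.

```python
from collections import deque


def sliding_sums(stream, size):
    """Yield the sum of the most recent `size` items once the window is full."""
    window = deque(maxlen=size)
    total = 0
    for item in stream:
        if len(window) == size:
            total -= window[0]  # the oldest item is about to be evicted
        window.append(item)     # deque(maxlen=...) drops the oldest item
        total += item
        if len(window) == size:
            yield total


sums = list(sliding_sums([1, 2, 3, 4, 5], size=3))
print(sums)  # [6, 9, 12]
```

On a single node, the whole window must fit in one process's memory and every arrival passes through one loop, which is the bottleneck the abstract attributes to centralized solutions.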
ISBN: 9781467386456 (print)
The traditional identifier-locator split network has many problems: it is inflexible, hard to innovate on, and difficult to deploy. SDN (Software Defined Networking) provides a new direction for designing flexible identifier-locator split networks. Recent SDN-based identifier-locator split networks use the OpenFlow switch directly by rewriting the address, which lacks scalability and uses locator addresses inefficiently. We propose an OpenFlow switch named IDOpenFlow to support identifier-based communication. The IDOpenFlow switch provides a communication mechanism based on packet encapsulation, which scales well and uses locator addresses effectively. It encapsulates and decapsulates packets according to flow entries installed by the SDN controller. A prototype system shows that IDOpenFlow effectively supports communication for both fixed and mobile nodes. To address the performance of software forwarding, we also propose a high-performance IDOpenFlow switch based on Intel DPDK, named A-IDOpenFlow. Results from the Ixia test tool show that: 1) for packets larger than 128 bytes, the A-IDOpenFlow switch supports identifier-based communication at a rate of 10 Gbit/s; 2) for small packets of 64 bytes, the rate of A-IDOpenFlow is 7.25 times that of IDOpenFlow.
Fault resilience has become a major issue for HPC systems, particularly in view of future E-scale systems, which will consist of millions of CPU cores and other components. MPI-level fault tolerant constru...
ISBN: 9781509053827 (print)
Anomaly detection over multi-dimensional data streams has recently attracted considerable attention in fields such as networking, finance, and aerospace. In many cases, an anomaly consists of a sequence of multi-dimensional data, and this type of anomaly must be detected accurately and efficiently over the data stream. Existing online anomaly detection methods focus only on single-dimensional sequences, and current studies of multi-dimensional sequences concentrate mainly on static databases. Anomaly detection for multi-dimensional sequences over data streams is much more difficult because of the complexity of multi-dimensional sequence processing, the dynamic nature of data streams, and the imbalance between normal and abnormal data. To address these challenges, we propose ADMS, an anomaly detection method for multi-dimensional sequences over data streams based on a cost-sensitive support vector machine (C-SVM). First, to improve accuracy and efficiency, ADMS transforms multi-dimensional sequences into feature vectors in a lossless way and prunes worthless features from these vectors. Then, ADMS detects abnormal sequences over the dynamically imbalanced data stream by testing these vectors in real time with the C-SVM. Experiments show that the false negative rate (FNR) of ADMS is lower than 5%, the false positive rate (FPR) is lower than 7%, and throughput improves by 42% when worthless features are pruned. In addition, ADMS performs well when there are concept drifts in the data stream.
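The two error metrics quoted above follow the standard definitions FNR = FN / (FN + TP) and FPR = FP / (FP + TN). The short sketch below computes both from true and predicted labels; the label encoding (1 for anomalous, 0 for normal), the function name, and the toy data are assumptions for illustration and are not taken from the paper's experiments.

```python
def error_rates(y_true, y_pred):
    """Return (FNR, FPR), treating label 1 as anomalous and 0 as normal."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1  # a missed anomaly
        elif truth == 0 and pred == 1:
            fp += 1  # a false alarm
        else:
            tn += 1
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return fnr, fpr


# Toy labels: two anomalies (one missed), four normal items (one false alarm).
fnr, fpr = error_rates([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 1, 0])
print(fnr, fpr)  # 0.5 0.25
```

With imbalanced data, FNR is computed over the rare anomalous class only, so a few missed anomalies move it far more than the same number of false alarms moves FPR.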
Dangling pointer errors are pervasive in C/C++ programs and very hard to detect. This paper introduces an efficient detector for dangling pointer errors in C/C++ programs. By selectively leaving some memory accesses unmonitored, our method reduces memory monitoring overhead and thus achieves better performance than previous methods. Experiments show that our method achieves an average speedup of 9% over the previous compiler-instrumentation-based method and more than 50% over the previous page-protection-based method.
ISBN: 9781509045181 (print)
Powering is an important operation in many computation-intensive workloads. This paper investigates, at the application level, the performance of different ways of computing powering operations. We design a series of small benchmark codes that compute powering operations in different ways and evaluate their performance on an Intel Xeon CPU under Intel compilation environments. The results show that the number of floating-point operations and the related runtime are sensitive to the value of the exponent Y and how it is used. When Y is an immediate integer whose value is known at compile time, the cost of powering is much lower than when Y is an integer variable whose value is known only at runtime. When Y is defined as a real variable, the cost of powering is always high, whether or not its value equals an integer. Based on these investigations, we apply performance optimizations to a kernel subroutine from a real-world supersonic combustion simulation code that makes intensive use of powering operations. The results show that the performance of that subroutine improves by 13.25 times on the Intel Xeon E5-2692 CPU.
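To make the cost difference concrete, the sketch below contrasts two textbook ways of computing a power: square-and-multiply for a non-negative integer exponent, and the identity x^y = exp(y ln x) for a real exponent with x > 0. This is an illustration in Python under those assumptions, not the paper's benchmark code, and it says nothing about what the Intel compilers actually generate; the function names are hypothetical.

```python
import math


def pow_by_squaring(x, n):
    """Compute x**n for a non-negative integer n.

    Returns (result, number of multiplications performed).
    """
    result = 1.0
    base = x
    mults = 0
    while n > 0:
        if n & 1:
            result *= base
            mults += 1
        n >>= 1
        if n:
            base *= base
            mults += 1
    return result, mults


def pow_real(x, y):
    """Compute x**y for x > 0 and real y via exp(y * log(x))."""
    return math.exp(y * math.log(x))


value, mults = pow_by_squaring(1.5, 10)
print(value, mults)        # x**10 with 5 multiplications
print(pow_real(2.0, 3.0))  # close to 8.0, up to rounding
```

An integer exponent needs only a handful of multiplications, while the real-exponent path goes through logarithm and exponential evaluations even when the exponent happens to hold an integer value.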
ISBN: 9781467390408 (print)
The Binary Exchange Algorithm (BEA) always introduces excessive shuffle operations when mapping FFTs onto vector SIMD DSPs, which can greatly restrict overall performance. We propose a novel mod (2P-1) shuffle function and the Mod-BEA algorithm (MBEA), which halve the shuffle operation count and unify the shuffle mode. This unified shuffle mode inspires a set of novel mod (2P-1) shuffle memory-access instructions, which eliminate the shuffle operations entirely. Experimental results show that combining MBEA with the proposed instructions brings performance improvements of 17.2% to 31.4% at reasonable hardware cost and reduces code size by about 30%.