Search results - Inner Mongolia University Library

Euromicro Conference on Parallel, Distributed and Network-Based Processing

Authors: Jijun Cao, Liquan Xiao, Zhengbin Pang, Kefei Wang, Jiaqing Xu. Affiliations: College of Computer, National University of Defense Technology, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, China

The interconnect network plays an important role in high-performance computing systems, and its manageability directly affects the RAS (Reliability, Availability, and Serviceability) of the whole system. The Tianhe-2 system at NSCC-GZ (the National Supercomputing Center of China in Guangzhou) uses a proprietary interconnect network that includes 5,856 high-radix network router chips (NRCs) and 18,304 network interface chips (NICs). For such a large-scale interconnect network, managing (that is, configuring, monitoring, and debugging) the numerous network chips and their ports efficiently is a great challenge. By implementing in-band management with very few hardware resources, the interconnect network in the Tianhe-2 system achieves highly efficient network management. In this paper, we introduce the design and implementation of in-band management for the interconnect network in the Tianhe-2 system, emphasizing several key features, among them the set of management functions achieved, the architecture of network management, the format of management packets, and the data flow and processing of management packets. We also evaluate the performance of in-band management, mainly by comparing it with an out-of-band management scheme. The preliminary results demonstrate the efficiency of in-band management for the interconnect network in the Tianhe-2 system.

Keywords: Routing; Ports (Computers); Network topology; Optical switches; Hardware; Topology; Registers

High-performance Audio Matching with Features Learned by Convolutional Deep Belief Network

2016 IEEE 13th International Conference on Signal Processing (ICSP 2016)

Authors: Weijiang Feng, Naiyang Guan, Zhigang Luo. Affiliations: Institute of Software, College of Computer, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology

Audio matching automatically retrieves, from given audio recordings, all excerpts that have the same content as the query audio clip. The extracted feature is critical for audio matching, and the Chroma Energy Normalized Statistics (CENS) feature is the state of the art. However, CENS may perform unsatisfactorily on some audio because it is a handcrafted feature. In this paper, we propose using features learned by a Convolutional Deep Belief Network (CDBN) to enhance the performance of audio matching. Benefiting from the strong generalization ability of the CDBN, our method works better than CENS-based methods on most audio datasets. Because the features learned by the CDBN are binary-valued, we can exploit this property to develop a more efficient audio matching algorithm. Experimental results on both the TIMIT dataset and a simulated music dataset confirm the effectiveness of the proposed CDBN-based method compared with the traditional CENS-feature-based algorithm.

Keywords: audio matching; convolutional deep belief network; content-based audio retrieval

QSobel: A novel quantum image edge extraction algorithm

Science China (Information Sciences), 2015, vol. 58, no. 1, pp. 107-119

Authors: ZHANG Yi, LU Kai, GAO YingHui. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; College of Computer, National University of Defense Technology; College of Electronic Science and Engineering, National University of Defense Technology

Edge extraction is an indispensable task in digital image processing. With the sharp increase in image data, the real-time problem has become a limitation of the state of the art of edge extraction *** this paper, QSobel, a novel quantum image edge extraction algorithm, is designed based on the flexible representation of quantum images (FRQI) and the well-known Sobel edge extraction algorithm. Because FRQI uses the superposition state of a qubit sequence to store all the pixels of an image, QSobel can calculate the Sobel gradients of the image intensity of all pixels simultaneously. This is the main reason QSobel can extract edges so quickly. By designing and analyzing the quantum circuit of QSobel, we demonstrate that QSobel can extract edges with a computational complexity of O(n^2) for an FRQI quantum image of size 2^n × 2^n. Compared with all classical edge extraction algorithms and the existing quantum edge extraction algorithms, QSobel can use quantum parallel computation to reach a significant and exponential ***, QSobel would resolve the real-time problem of image edge extraction.

Keywords: edge extraction; quantum image processing; FRQI; Sobel; computational complexity
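For readers unfamiliar with the classical Sobel operator that the QSobel abstract builds on, the following is a minimal sketch of the classical, sequential version only. It does not model FRQI or any quantum circuit. The function name `sobel_edges`, the `threshold` parameter, and the use of |Gx| + |Gy| as an approximation of the gradient magnitude are assumptions made for illustration, not details taken from the paper.

```python
# Classical Sobel edge detection on a small grayscale image (list of rows).
# Illustrative only: this is NOT the quantum QSobel algorithm.

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
GX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]
GY = [[-1, -2, -1],
      [0, 0, 0],
      [1, 2, 1]]


def sobel_edges(image, threshold):
    """Return a binary edge map; border pixels are left as 0.

    `threshold` is a hypothetical tuning parameter: a pixel is marked as an
    edge when |Gx| + |Gy| (a common approximation of the gradient
    magnitude) reaches it.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            # Visit the 3x3 neighbourhood one pixel at a time.
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += GX[dy][dx] * pixel
                    gy += GY[dy][dx] * pixel
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


# A 5x5 image with a vertical step from 0 to 100 between columns 1 and 2.
step_image = [[0, 0, 100, 100, 100] for _ in range(5)]
edge_map = sobel_edges(step_image, threshold=200)
```

The nested loops visit every interior pixel one after another; this per-pixel sequential work is what the abstract contrasts with computing all gradients simultaneously.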

GPU acceleration of subgraph isomorphism search in large scale graph

Journal of Central South University, 2015, vol. 22, no. 6, pp. 2238-2249

Authors: 杨博, 卢凯, 高颖慧, 王小平, 徐凯. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; College of Computer, National University of Defense Technology; Department of Electronic Science and Engineering, National University of Defense Technology

A novel framework for parallel subgraph isomorphism on GPUs, named GPUSI, is proposed; it consists of GPU region exploration and GPU subgraph matching. GPUSI iteratively enumerates subgraph instances and solves subgraph isomorphism in a divide-and-conquer fashion. The framework relies entirely on graph traversal and avoids explicit join operations. Moreover, to improve performance, a task-queue-based method and the virtual-CSR graph structure are used to balance the workload among warps, and a warp-centric programming model is used to balance the workload among threads in a warp. A prototype of GPUSI is implemented, and comprehensive experiments with various graph isomorphism operations are carried out on diverse large graphs. The experiments clearly demonstrate that GPUSI scales well and can achieve a speed-up of 1.4–2.6 compared with state-of-the-art solutions.

Keywords: parallel graph isomorphism; GPU; backtrack paradigm
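To make the terms "CSR" and "backtrack paradigm" in this entry concrete, here is a small sequential CPU sketch: a plain compressed sparse row (CSR) adjacency structure and a traversal-based backtracking search that counts embeddings of a query graph. It is not GPUSI, it does not implement the paper's virtual-CSR structure, task queues, or warp-centric scheduling, and all function names (`build_csr`, `neighbors`, `count_embeddings`) are hypothetical.

```python
# Plain CSR graph storage plus a sequential backtracking subgraph search.
# Illustrative only: no GPU code and no virtual-CSR.


def build_csr(num_vertices, edges):
    """Build CSR arrays (row_offsets, col_indices) for an undirected graph."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # row_offsets[v] .. row_offsets[v + 1] delimits v's neighbour slice.
    row_offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        row_offsets[v + 1] = row_offsets[v] + degree[v]
    col_indices = [0] * row_offsets[-1]
    cursor = row_offsets[:-1]  # next free slot per vertex (a copy)
    for u, v in edges:
        col_indices[cursor[u]] = v
        cursor[u] += 1
        col_indices[cursor[v]] = u
        cursor[v] += 1
    return row_offsets, col_indices


def neighbors(row_offsets, col_indices, v):
    """Return the neighbour list of vertex v from the CSR arrays."""
    return col_indices[row_offsets[v]:row_offsets[v + 1]]


def count_embeddings(query_size, query_edges, row_offsets, col_indices):
    """Count injective mappings of query vertices onto data vertices that
    preserve every query edge (non-induced matching; symmetric copies of
    the same subgraph are counted separately)."""
    num_data = len(row_offsets) - 1
    query_adj = [set() for _ in range(query_size)]
    for a, b in query_edges:
        query_adj[a].add(b)
        query_adj[b].add(a)
    data_adj = [set(neighbors(row_offsets, col_indices, v))
                for v in range(num_data)]
    mapping = []  # mapping[q] is the data vertex matched to query vertex q

    def extend():
        q = len(mapping)
        if q == query_size:
            return 1
        total = 0
        for candidate in range(num_data):
            if candidate in mapping:
                continue
            # Every already-matched query neighbour must be adjacent.
            if all(mapping[p] in data_adj[candidate]
                   for p in query_adj[q] if p < q):
                mapping.append(candidate)
                total += extend()
                mapping.pop()  # backtrack
        return total

    return extend()


# Data graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
offsets, columns = build_csr(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
triangle_count = count_embeddings(3, [(0, 1), (1, 2), (0, 2)], offsets, columns)
```

The search extends a partial match one query vertex at a time using only adjacency lookups, which loosely mirrors the abstract's point about relying on traversal rather than explicit joins.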

Simulation study of N-hit SET variation in differential cascade voltage switch logical circuits

Science China (Information Sciences), 2015, vol. 58, no. 2, pp. 165-173

Authors: HUANG PengCheng, CHEN ShuMing, CHEN JianJun, WU ZhenYu, LIANG ZhengFa, HU ChunMei, LIANG Bin, LIU BiWei. Affiliations: Micro-electronics and Microprocessor Institute, College of Computer Science, National University of Defense Technology; National Laboratory for Parallel and Distributed Processing, College of Computer Science, National University of Defense Technology

Advances in process technology have raised concern about the Single Event (SE) sensitivity of Differential Cascade Voltage Switch Logic (DCVSL) circuits. The simulation results indicate that the Single Event Transient (SET) generated at a DCVSL gate is much larger than that at an ordinary CMOS gate, and that their SET variation differs. Based on charge collection, this paper proposes the effective collection time theory to explain the SET pulse generated at the DCVSL gate. Through 3D TCAD mixed-mode simulation in a 65 nm twin-well bulk CMOS process, the effects on SET variation of device parameters such as well contact size and of environment parameters such as voltage are investigated.

Keywords: differential cascade voltage switch logic (DCVSL); single event transient (SET); effective collection time; pulse feedback feature (PFF); across-coupled structure

Service fault tolerance for highly reliable service-oriented systems: an overview

Science China (Information Sciences), 2015, vol. 58, no. 5, pp. 7-18

Authors: ZHENG ZiBin, LYU Michael Rung Tsong, WANG HuaiMin. Affiliations: Shenzhen Research Institute, The Chinese University of Hong Kong; National Laboratory for Parallel & Distributed Processing, National University of Defense Technology

Service-oriented systems are widely employed in e-business, e-government, finance, management systems, and so on. Service fault tolerance is one of the most important techniques for building highly reliable service-oriented systems. In this paper, we provide an overview of various service fault tolerance techniques, including sections on fault tolerance strategy design, fault tolerance strategy selection, and Byzantine fault tolerance. In the first section, we introduce the design of static and dynamic fault tolerance strategies, as well as the major problems that arise when designing fault tolerance strategies. In the second section, based on various fault tolerance strategies, we identify significant components of a complex service-oriented system and investigate algorithms for optimal fault tolerance strategy selection. Finally, in the third section, we discuss a special type of service fault tolerance technique, namely Byzantine fault tolerance.

Keywords: fault tolerance; software reliability; Web service; SOA

GDSW: A General Framework for Distributed Sliding Window over Data Streams

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Huan Chen, Yijie Wang, Yuan Wang, Xingkong Ma. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha, Hunan, P. R. China

ISBN: (print) 9781509053827

The big data era is characterized by the emergence of live data with high volume and fast arrival rates, which poses a new challenge to stream processing applications: how to process unbounded live data in real time with high throughput. The sliding window technique is widely used to handle unbounded live data by storing the most recent history of streams. However, existing centralized solutions cannot satisfy the requirements for high processing capacity and low latency because of the single-node bottleneck. Moreover, existing studies on distributed windows focus primarily on specific operators, while a general framework for processing various window-based operators is still lacking. In this paper, we first classify window-based operators into two categories: data-independent operators and data-dependent operators. We then propose GDSW, a general framework for distributed count-based sliding windows, which can handle both data-independent and data-dependent operators. In addition, to balance system load, we propose a dynamic load-balancing algorithm called DAD, based on buffer usage. Our framework is implemented on Apache Storm 0.10.0. Extensive evaluation shows that GDSW can achieve sub-second latency and a 10X improvement in throughput compared with centralized processing when handling high data rates or large windows.

Keywords: Distributed databases; Storms; Parallel processing; Semantics; Sparks; Distributed processing; Real-time systems
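The count-based sliding window mentioned in this abstract can be illustrated with a single-node sketch. This is the centralized baseline idea only; it does not reproduce GDSW's distribution scheme, its operator classification, or the DAD load-balancing algorithm. The class name `CountSlidingWindow` and the choice of a running sum as the window operator are assumptions made for illustration.

```python
from collections import deque


class CountSlidingWindow:
    """Keep the most recent `size` items of a stream and their running sum.

    Hypothetical single-node example; the sum stands in for any
    window-based aggregate and is updated incrementally per arrival.
    """

    def __init__(self, size):
        if size <= 0:
            raise ValueError("window size must be positive")
        self.size = size
        self.items = deque()
        self.total = 0

    def push(self, value):
        """Add one stream item, evict the oldest item if the window is
        full, and return the sum over the current window."""
        self.items.append(value)
        self.total += value
        if len(self.items) > self.size:
            self.total -= self.items.popleft()
        return self.total


# Feed a short stream through a window holding the last three items.
window = CountSlidingWindow(3)
running_sums = [window.push(v) for v in [1, 2, 3, 4, 5]]
```

Because every arrival passes through one object on one node, this sketch also shows where the single-node bottleneck described in the abstract would appear at high data rates.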

IDOpenFlow: An OpenFlow switch to support identifier-locator split communication

IEEE Conference on Industrial Electronics and Applications (ICIEA)

Authors: Yannan Yang, Yaping Liu, Zhihong Liu. Affiliations: School of Computer, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

ISBN: (print) 9781467386456

Traditional identifier-locator split networks suffer from many issues: they are inflexible, hard to innovate on, and difficult to deploy. SDN (Software Defined Network) provides a new direction for designing flexible identifier-locator split networks. Recent SDN-based identifier-locator split networks use the OpenFlow switch directly by rewriting the address, which lacks scalability and uses locator addresses ineffectively. An OpenFlow switch named IDOpenFlow is proposed to support identifier-based communication. The IDOpenFlow switch provides a communication mechanism that encapsulates packets, which scales well and uses locator addresses effectively. The IDOpenFlow switch encapsulates and decapsulates packets according to flow entries installed by the SDN controller. Moreover, the prototype system shows that IDOpenFlow effectively supports communication for both fixed and mobile nodes. To address software forwarding performance, a high-performance IDOpenFlow switch based on Intel DPDK, named A-IDOpenFlow, is proposed. Results from the Ixia test tool show that: 1) for packets of more than 128 bytes, the A-IDOpenFlow switch supports identifier-based communication at a rate of 10 Gbit/s; 2) for small packets of 64 bytes, the rate of A-IDOpenFlow is 7.25 times faster than the rate of IDOpenFlow.

Keywords: Switches; Protocols; Software; Routing; Servers; Scalability

A C-SVM Based Anomaly Detection Method for Multi-Dimensional Sequence over Data Stream

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Han Bao, Yijie Wang. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha, Hunan, P. R. China

ISBN: (print) 9781509053827

Anomaly detection over multi-dimensional data streams has recently attracted considerable attention in various fields, such as networking, finance, and aerospace. In many cases, anomalies are composed of a sequence of multi-dimensional data, and it is necessary to detect this type of anomaly accurately and efficiently over a data stream. Existing online anomaly detection methods focus only on single-dimensional sequences, while current studies of multi-dimensional sequences concentrate mainly on static databases. Anomaly detection for multi-dimensional sequences over data streams is much more difficult because of the complexity of multi-dimensional sequence processing, the dynamic nature of data streams, and the imbalance between normal and abnormal data. To address these challenges, we propose ADMS, an anomaly detection method for multi-dimensional sequences over data streams based on the cost-sensitive support vector machine (C-SVM). First, to improve accuracy and efficiency, ADMS transforms multi-dimensional sequences into feature vectors in a lossless way and prunes worthless features from these vectors. ADMS can then detect abnormal sequences over a dynamically imbalanced data stream by testing these vectors in real time with the C-SVM. Experiments show that the false negative rate (FNR) of ADMS is lower than 5%, the false positive rate (FPR) is lower than 7%, and throughput improves by 42% when worthless features are pruned. In addition, ADMS performs well when there are concept drifts in the data stream.

Keywords: Feature extraction; Training; Transforms; Heuristic algorithms; Testing; Hidden Markov models; Databases

NR-MPI: A non-stop and fault resilient MPI supporting programmer defined data backup and restore for E-scale super computing systems

Supercomputing Frontiers and Innovations, 2016, vol. 3, no. 1, pp. 4-21

Authors: Suo, Guang; Lu, Yutong; Liao, Xiangke; Xie, Min; Cao, Hongjia. Affiliations: State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, Hunan Province, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, Hunan Province, China

Fault resilience has become a major issue for HPC systems, particularly in view of future E-scale systems, which will consist of millions of CPU cores and other components. MPI-level fault-tolerant constructs, such as ULFM, are being proposed to support software-level fault tolerance. However, there are few systematic evaluations by application programmers using benchmarks or pseudo-applications. This paper proposes NR-MPI, a Non-stop and Fault Resilient MPI that supports programmer-defined data backup and restore. To help programmers write fault-tolerant programs, NR-MPI provides a set of friendly programming interfaces and a state transition diagram for data backup and restore. This paper focuses on the design, implementation, and evaluation of NR-MPI. Specifically, it emphasizes failure detection in the MPI library, the friendly programming interface extensions for NR-MPI, and examples of fault-tolerant programs based on NR-MPI. Furthermore, to support failure recovery of applications, NR-MPI implements data backup interfaces based on double in-memory checkpoint/*** conduct experiments with both NPB benchmarks and Sweep3D on the TH supercomputer in NSCC-TJ. Experimental results show that NR-MPI-based fault-tolerant programs can recover from failures online without restarting, and that the overhead is small even for applications with tens of thousands of cores. © The Authors 2016.

Keywords: Message passing
