Search Results - Inner Mongolia University Library

Scalability of 3D deterministic particle transport on the Intel MIC architecture

Nuclear Science and Techniques, 2015, Vol. 26, No. 5, pp. 88-97

Authors: 王庆林, 刘杰, 龚春叶, 邢座程 (Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Science and Technology on Space Physics Laboratory)

The key to large-scale parallel solutions of deterministic particle transport problems is single-node computation performance. Hence, single-node computation is often parallelized on multi-core or many-core computer architectures. However, the number of on-chip cores grows quickly with the scale-down of feature size in semiconductor technology. In this paper, we present a scalability investigation of one-energy-group, time-independent, deterministic discrete ordinates neutron transport in 3D Cartesian geometry (Sweep3D) on Intel's Many Integrated Core (MIC) architecture, which currently provides up to 62 cores with four hardware threads per core and will offer up to 72 cores in the future. The parallel programming model OpenMP and vector intrinsic functions are used to exploit thread parallelism and vector parallelism, respectively, for the discrete ordinates method. The results on a 57-core MIC coprocessor show that the implementation of Sweep3D on MIC has good scalability in performance. In addition, the application of the Roofline model to assess the implementation and a performance comparison between MIC and the Tesla K20C Graphics Processing Unit (GPU) are also reported.

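Keywords: computer architecture; scalability; particle transport; 3D geometry; Intel MIC; discrete ordinates method; computational performance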

DKNNS: Scalable and accurate distributed K nearest neighbor search for latency-sensitive applications

Science China (Information Sciences), 2013, Vol. 56, No. 3, pp. 123-139

Authors: FU YongQuan, WANG YiJie (National Key Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology)

To reduce the access latencies of end hosts, latency-sensitive applications need to choose suitably close service machines to answer the access requests from end *** K nearest neighbor search locates K service machines closest to end hosts, which can efficiently optimize the access latencies for end *** work has weakness in terms of the accuracy and *** to the scalable and accurate K nearest neighbor search problem, we propose a distributed K nearest neighbor search method called DKNNS in this *** machines are organized into a locality-aware multilevel *** first locates a service machine that starts the search process based on a farthest neighbor search scheme, then discovers K nearest service machines based on a backtracking approach within the proximity region containing the target in the latency *** analysis, simulation results and deployment experiments on the PlanetLab show that DKNNS can determine K approximately optimal service machines, with modest completion time and query ***, DKNNS is also quite stable and can be used for reducing frequent searches by caching found nearest neighbors.

Keywords: latency-sensitive network applications; K nearest neighbor search; network coordinate

Service fault tolerance for highly reliable service-oriented systems: an overview

Science China (Information Sciences), 2015, Vol. 58, No. 5, pp. 7-18

Authors: ZHENG ZiBin, LYU Michael Rung Tsong, WANG HuaiMin (Shenzhen Research Institute, The Chinese University of Hong Kong; National Laboratory for Parallel & Distributed Processing, National University of Defense Technology)

Service-oriented systems are widely employed in e-business, e-government, finance, management systems, and so on. Service fault tolerance is one of the most important techniques for building highly reliable service-oriented systems. In this paper, we provide an overview of various service fault tolerance techniques, including sections on fault tolerance strategy design, fault tolerance strategy selection, and Byzantine fault tolerance. In the first section, we introduce the design of static and dynamic fault tolerance strategies, as well as the major problems when designing fault tolerance strategies. After that, based on various fault tolerance strategies, in the second section, we identify significant components from a complex service-oriented system, and investigate algorithms for optimal fault tolerance strategy selection. Finally, in the third section, we discuss a special type of service fault tolerance technique, i.e., Byzantine fault tolerance.

Keywords: fault tolerance; software reliability; Web service; SOA

A Programming Language Approach to Internet-Based Virtual Computing Environment

Journal of Computer Science & Technology, 2011, Vol. 26, No. 4, pp. 600-615

Authors: 王戟, 沈锐, 王怀民 (National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology)

There is an increasing need to build scalable distributed systems over the Internet infrastructure. However, the development of distributed scalable applications suffers from the lack of a widely accepted virtual computing environment. Users have to expend great effort on the management and sharing of the involved resources over the Internet, whose characteristics are intrinsic growth, autonomy and diversity. To deal with this challenge, the Internet-based Virtual Computing Environment (iVCE) is proposed and developed to serve as a platform for distributed scalable applications over the open infrastructure, whose kernel mechanisms are on-demand aggregation and autonomic collaboration of resources. In this paper, we present a programming language for iVCE named Owlet. Owlet conforms with the conceptual model of iVCE, and exposes the iVCE to application developers. As an interaction language based on a peer-to-peer content-based publish/subscribe scheme, Owlet abstracts the Internet as an environment for the roles to interact, and uses roles to build a relatively stable view of resources for the on-demand resource aggregation. It provides language constructs to use 1) distributed event-driven rules to describe interaction protocols among different roles, 2) conversations to correlate events and rules into a common context, and 3) resource pooling to do fault tolerance and load balancing among networked nodes. We have implemented an Owlet compiler and its runtime environment according to the architecture of iVCE, and built several Owlet applications, including a peer-to-peer file sharing application. Experimental results show that, with iVCE, the separation of resource aggregation logic and business logic significantly eases the process of building scalable distributed applications.

Keywords: distributed architecture; distributed programming; on-demand aggregation; virtual computing

Surveying concurrency bug detectors based on types of detected bugs

Science China (Information Sciences), 2017, Vol. 60, No. 3, pp. 5-31

Authors: Zhendong WU, Kai LU, Xiaoping WANG (Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; College of Computer, National University of Defense Technology)

Concurrency bugs widely exist in concurrent programs and have caused severe failures in the real world. Researchers have made significant progress in detecting concurrency bugs, which improves software reliability. In this paper, we survey the most up-to-date and well-known concurrency bug detectors. We categorize the existing detectors based on the types of concurrency bugs. Accordingly, we analyze data race detectors, atomicity violation detectors, order violation detectors, and deadlock detectors, respectively. We also discuss some other techniques that are closely related to concurrency bug detection, including schedule bounding techniques, interleaving optimizing techniques, path expanding techniques, and deterministic replay techniques. Additionally, we statistically analyze the reviewed detectors and report some interesting findings: for instance, nearly 86% of previous detectors focus on data races and atomicity violations, and dynamic approaches are popular (74%). We also discuss the limitations of previous detectors, finding that 91% of previous detectors suffer from false negatives and 64% of previous detectors suffer from runtime overhead. Based on the reviewed detectors and statistical analysis, we identify some future research directions, including accuracy, performance, applicability, and integrality.

Keywords: concurrency bug detection; data race; atomicity violation; order violation; deadlock

A peer-to-peer IO buffering service based on RAM-grid

International Journal of Autonomous and Adaptive Communications Systems, 2009, Vol. 2, No. 4, pp. 382-396

Authors: Zhang, Yiming; Chu, Rui; Li, Dongsheng (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China)

The performance of IO-intensive applications is determined by the hit ratio of local disk cache and the IO latency of missed disk accesses. To improve the IO performance, in this paper we propose PIB, a peer-to-peer IO buffering service based on RAM-grid (a new grid system aiming at memory resource sharing in WAN). The PIB service acts as a two-level disk cache, which buffers obsolete blocks in idle nodes on the internet for IO-intensive applications. It reduces the IO latency of missed disk accesses by means of the speed advantage of network over disks, and improves the hit ratio of local cache based on accurate identification of IO patterns. The effectiveness of our proposals is demonstrated through the trace driven simulation studies. Copyright © 2009 Inderscience Enterprises Ltd.

Keywords: Random access storage

An Efficient Broadcast Authentication Protocol in Wireless Sensor Networks

电子学报 (English Edition), 2009, Vol. 18, No. 2, pp. 368-372

Authors: ZHAO Xin, WANG Xiaodong, YU Wanrong, ZHOU Xingming (National Key Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China)

Broadcast authentication is a critical security service in wireless sensor networks. A protocol named μTESLA [1] has been proposed to provide efficient authentication service for such networks. However, when applied to applications such as time synchronization and fire alarm in which broadcast messages are sent infrequently, μTESLA encounters problems of wasted key resources and slow message verification. This paper presents a new protocol named GBA (Generalized Broadcast Authentication) for efficient broadcast authentication in these applications. GBA utilises the one-way key chain mechanism of μTESLA, but modifies the association between keys and time intervals, and changes the key disclosure mechanism according to the message transmission model in these applications. The proposed technique can make full use of key resources, and shorten the message verification time to an acceptable level. The analysis and experiments show that GBA is more efficient and practical than μTESLA in applications with various message transmission models.

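Keywords: wireless sensor networks; authentication protocol; broadcast; applications; practical; security service; identity verification; time synchronization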

A coarse-grained reconfigurable computing architecture with loop self-pipelining

Science in China (Series F), 2009, Vol. 52, No. 4, pp. 575-587

Authors: DOU Yong, WU GuiMing, XU JinHui, ZHOU XingMing (National Laboratory for Parallel & Distributed Processing, National University of Defense Technology, Changsha 410073, China)

Reconfigurable computing tries to achieve a balance between the high efficiency of custom computing and the flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grained reconfigurable array, and proposes a speculative execution mechanism for dynamic loop scheduling with the goal of one iteration per cycle, along with implementation techniques to support decoupling synchronization between the token generator and the collector. This paper also introduces techniques for exploiting both intra- and inter-iteration data dependences, with the help of two instructions for special data reuse in loop-carried dependences. The experimental results show that the number of memory accesses is on average 3% of that of a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture achieves a speedup of about 34 times in execution cycles, compared with general-purpose processors.

Keywords: reconfigurable computing; loop pipelining; data driven; register promotion

SKY: Efficient peer-to-peer networks based on distributed Kautz graphs

Science in China (Series F), 2009, Vol. 52, No. 4, pp. 588-601

Authors: ZHANG YiMing, LU XiCheng, LI DongSheng (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China)

Many proposed P2P networks are based on traditional interconnection topologies. Given a static topology, the maintenance mechanism for node join/departure is critical to designing an efficient P2P network. Kautz graphs have many good properties such as constant degree, low congestion and optimal diameter. Due to the complexity of topology maintenance, however, to date no effective P2P networks have been proposed based on Kautz graphs with base > 2. To address this problem, this paper presents the "distributed Kautz (D-Kautz) graphs", which adapt Kautz graphs to the characteristics of P2P networks. Using the D-Kautz graphs, we further propose SKY, the first effective P2P network based on Kautz graphs with arbitrary base. The effectiveness of SKY is demonstrated through analysis and simulations.

Keywords: peer-to-peer network; Kautz graph; constant degree; topology maintenance; D-Kautz graph

Slicing hierarchical automata for model checking UML statecharts

4th International Conference on Formal Engineering Methods, ICFEM 2002

Authors: Ji, Wang; Wei, Dong; Qi, Zhi-Chang (National Laboratory for Parallel and Distributed Processing, China)

ISBN: 9783540000297 (print)

Hierarchical automata have been widely used in modeling dynamic aspects of reactive software, such as in UML Statecharts. At the same time, model checking is an automatic technique to ensure the correctness of software models, where state space explosion is the main obstacle to applying this technique in large-scale applications. The paper presents a method for slicing hierarchical automata with respect to properties to be verified. The considered formalism is Extended Hierarchical Automata (EHA), in which a set of dependence relations is specified after analyzing characteristics such as hierarchy, concurrency and synchronization. We present an algorithm for slicing EHA based on a slicing criterion in terms of states and transitions. The algorithm can remove the hierarchies and concurrent states that are irrelevant to the property, and reduce the state space efficiently in model checking UML Statecharts. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Model checking
