Search Results - Inner Mongolia University Library

MPtoStream: an OpenMP compiler for CPU-GPU heterogeneous parallel systems

Science China (Information Sciences), 2012, vol. 55, no. 9, pp. 1961-1971

Authors: YANG XueJun, TANG Tao, WANG GuiBin, JIA Jia & XU XinHai. National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

In light of GPUs’ powerful floating-point operation capacity, heterogeneous parallel systems incorporating general purpose CPUs and GPUs have become a highlight in the research field of high performance computing (HPC). However, due to the complexity of programming on GPUs, porting a large number of existing scientific computing applications to the heterogeneous parallel systems remains a big *** OpenMP programming interface is widely adopted on multi-core CPUs in the field of scientific *** effectively inherit existing OpenMP applications and reduce the transplant cost, we extend OpenMP with a group of compiler directives, which explicitly divide tasks among the CPU and the GPU, and map time-consuming computing fragments to run on the GPU, thus dramatically simplifying the *** have designed and implemented MPtoStream, a compiler of the extended OpenMP for AMD’s stream processing *** experimental results show that programming with the extended directives deviates from programming with OpenMP by less than 11% modification and achieves significant speedup ranging from 3.1 to 17.3 on a heterogeneous system, incorporating an Intel Xeon E5405 CPU and an AMD FireStream 9250 GPU, over the execution on the Xeon CPU alone.

Keywords: GPGPU; stream; OpenMP; compiler

Speeding up the MATLAB complex networks package using graphic processors

Chinese Physics B, 2011, vol. 20, no. 9, pp. 460-467

Authors: 张百达, 唐玉华, 吴俊杰, 李鑫. National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology; Department of Computer Science and Technology, School of Computer, National University of Defense Technology

The availability of computers and communication networks allows us to gather and analyse data on a far larger scale than previously. At present, it is believed that statistics is a suitable method to analyse networks with millions, or more, of vertices. The MATLAB language, with its mass of statistical functions, is a good choice to rapidly realize an algorithm prototype of complex networks. The performance of MATLAB code can be further improved by using graphics processing units (GPUs). This paper presents the strategies and performance of the GPU implementation of a complex networks package, for which the Jacket toolbox of MATLAB is used. Compared with some commercially available CPU implementations, the GPU achieves an average speedup of 11.3x. The experimental results show that the GPU platform combined with the MATLAB language is a good combination for complex network research.

Keywords: complex networks; graphic processors unit; MATLAB; Jacket Toolbox

A field-based service management and discovery method in multiple clouds context

Frontiers of Computer Science, 2019, vol. 13, no. 5, pp. 976-995

Authors: Shuai ZHANG, Xinjun MAO, Fu HOU, Peini LIU. College of Computer, National University of Defense Technology, Changsha 410073, China; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

In the context of diverse and self-governed multiple clouds, service management and discovery are greatly challenged by the dynamic and evolving features of services. How to manage the features of cloud services and support accurate and efficient service discovery has become an open problem in the area of cloud computing. This paper proposes a field model of multiple cloud services and a corresponding service discovery method to address the issue. Unlike existing research, our approach is inspired by the Bohr atomic model. We use the abstraction of energy level and jumping mechanism to describe the status and variations of services, and thereby to support service demarcation and discovery. The contributions of this paper are threefold. First, we propose the abstraction of service energy level to represent the status of services, and a service jumping mechanism to investigate the dynamic and evolving features as the variations and re-demarcation of cloud services according to their energy levels. Second, we present the user acceptable service region to describe the services satisfying users' requests and a corresponding service discovery method, which can significantly decrease the service search scope and improve the speed and precision of service discovery. Third, a series of algorithms is designed to implement the generation of the field model, user acceptable service regions, the service jumping mechanism, and user-oriented service discovery. We have conducted extensive experiments on the QWS dataset to validate and evaluate our proposed models and algorithms. The results show that the field model can well support the representation of dynamic and evolving aspects of services in the multiple clouds context and that the algorithms can improve the accuracy and efficiency of service discovery.

Keywords: service field; service energy level; service jumping; service management; service discovery; multiple clouds

Why API documentation is insufficient for developers: an empirical study

Science China (Information Sciences), 2021, vol. 64, no. 1, pp. 248-250

Authors: Qiang FAN, Yue YU, Tao WANG, Gang YIN, Huaimin WANG. Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology

Dear editor, Application programming interface (API) documentation plays an important role in software development and reuse [1] for both API maintainers and API *** documentation helps developers understand and reuse codes effectively [2] and focus their time on desired interfaces and functions instead of the entire system [3]. Most high-quality open source projects maintain complete and informative official *** documentation typically conveys detailed specifications, such as class/interface hierarchies and method descriptions, which can be of great benefit to developers [4]. However, despite its authoritativeness and thoroughness, single-sourced official documentation does not always meet the developers' requirements [5].

Keywords: Information Systems and Communication Service

Instance-Specific Algorithm Selection via Multi-Output Learning

Tsinghua Science and Technology, 2017, vol. 22, no. 2, pp. 210-217

Authors: Kai Chen, Yong Dou, Qi Lv, Zhengfa Liang. National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410037, China; College of Computer, National University of Defense Technology, Changsha 410037, China

Instance-specific algorithm selection technologies have been successfully used in many research fields, such as constraint satisfaction and planning. Researchers have been increasingly trying to model the potential relations between different candidate algorithms for algorithm selection. In this study, we propose an instance-specific algorithm selection method based on multi-output learning, which can manage these relations more *** kinds of multi-output learning methods are used to predict the performances of the candidate algorithms: (1) multi-output regressor stacking; (2) multi-output extremely randomized trees; and (3) hybrid single-output and multi-output trees. The experimental results obtained using 11 SAT datasets and 5 Max SAT datasets indicate that our proposed methods can obtain better performance than the state-of-the-art algorithm selection methods.

Keywords: algorithm selection; multi-output learning; extremely randomized trees; performance prediction; constraint satisfaction

Comparison of heavy-ion induced SEU for D- and TMR-flip-flop designs in 65-nm bulk CMOS technology

Science China (Information Sciences), 2014, vol. 57, no. 10, pp. 223-229

Authors: HE YiBai, CHEN ShuMing. School of Computer Science, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology

Heavy-ion experiments were performed on a D flip-flop (DFF) and a TMR flip-flop (TMRFF) fabricated in a 65-nm bulk CMOS process. The experimental results show that the TMRFF has about a 92% decrease in SEU cross-section compared to the standard DFF design in static test mode. In dynamic test mode, the TMRFF shows much stronger frequency dependency than the DFF design, which reduces its advantage over the DFF at higher operating frequencies. At 160 MHz, the TMRFF is only 3.2× harder than the standard DFF. Such a small improvement in the SEU performance of the TMR design may warrant reconsideration of its use in hardening design.

Keywords: SEU; flip-flop; TMR; heavy-ion; frequency
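
As a rough illustration of why triple modular redundancy (TMR) masks a single upset but not two, the sketch below models a TMR flip-flop as three stored bits read through a 2-of-3 majority voter. It is a software analogy under simplifying assumptions, not the circuit evaluated in the paper; the names `majority` and `tmr_read` are hypothetical.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote over three integers."""
    return (a & b) | (a & c) | (b & c)


def tmr_read(copies):
    """Return the voted value of a three-copy (TMR) storage element."""
    if len(copies) != 3:
        raise ValueError("TMR needs exactly three copies")
    return majority(copies[0], copies[1], copies[2])


# All three copies hold the logical value 1.
stored = [1, 1, 1]

# A single-event upset flips one copy; the voter still sees two good copies.
stored[1] ^= 1
single_upset = tmr_read(stored)

# A second upset in another copy before the first is corrected defeats the vote.
stored[2] ^= 1
double_upset = tmr_read(stored)
```

In this simplified model the voted value survives one flipped copy and is lost only when two copies are corrupted at once.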

SPICE modeling of memristors with multilevel resistance states

Chinese Physics B, 2012, vol. 21, no. 9, pp. 594-600

Authors: 方旭东, 唐玉华, 吴俊杰. National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology; Department of Computer Science and Technology, School of Computer, National University of Defense Technology

With CMOS technologies approaching the scaling ceiling, novel memory technologies have thrived in recent years, among which the memristor is a rather promising candidate for future resistive memory (RRAM). The memristor's potential to store multiple bits of information as different resistance levels allows its application in multilevel cell (MCL) technology, which can significantly increase the memory capacity. However, most existing memristor models are built for binary or continuous memristance switching. In this paper, we propose the simulation program with integrated circuits emphasis (SPICE) modeling of charge-controlled and flux-controlled memristors with multilevel resistance states based on the memristance versus state map. In our model, the memristance switches abruptly between neighboring resistance states. The proposed model allows users to easily set the number of the resistance levels as parameters, and provides the predictability of resistance switching time if the input current/voltage waveform is given. The functionality of our models has been validated in HSPICE. The models can be used in multilevel RRAM modeling as well as in artificial neural network simulations.

Keywords: memristor; multilevel cell; SPICE model
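
The idea of a memristance-versus-state map with abrupt switching between neighboring levels can be sketched as a staircase function of accumulated charge. The code below is a minimal sketch under assumed conditions, not the SPICE model from the paper: the evenly spaced levels, the evenly spaced charge thresholds, and all parameter values (`r_on`, `r_off`, `q_max`, `levels`) are hypothetical choices made for illustration.

```python
def memristance(q, r_on=100.0, r_off=16000.0, q_max=1e-4, levels=4):
    """Staircase memristance-versus-charge map with `levels` discrete states.

    All defaults are illustrative assumptions, not values from the paper.
    """
    if levels < 2:
        raise ValueError("need at least two resistance levels")
    # Clamp the state variable (charge) to its valid range.
    q = min(max(q, 0.0), q_max)
    # Quantize the charge into a level index; the top edge maps to the last level.
    index = min(int(q / q_max * levels), levels - 1)
    # Level 0 is the high-resistance state; the last level equals r_on.
    step = (r_off - r_on) / (levels - 1)
    return r_off - index * step


def switching_times(current, q_max=1e-4, levels=4):
    """Times at which a constant positive current, starting from q = 0,
    crosses each charge threshold and so triggers a level change."""
    if current <= 0:
        raise ValueError("current must be positive")
    return [k * q_max / (levels * current) for k in range(1, levels)]
```

Because the thresholds are fixed in charge, the switching instants for a known input current follow directly from q = I·t, which is the sense in which such a model makes switching time predictable.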

Betweenness-based algorithm for a partition scale-free graph

Chinese Physics B, 2011, vol. 20, no. 11, pp. 556-564

Authors: 张百达, 吴俊杰, 唐玉华, 周静. National Laboratory for Parallel and Distributed Processing, School of Computers, National University of Defense Technology; Department of Computer Science and Technology, School of Computers, National University of Defense Technology

Many real-world networks are found to be scale-free. However, graph partition technology, as a technology capable of parallel computing, performs poorly when scale-free graphs are provided. The reason for this is that traditional partitioning algorithms are designed for random networks and regular networks, rather than for scale-free networks. Multilevel graph-partitioning algorithms are currently considered to be the state of the art and are used extensively. In this paper, we analyse the reasons why traditional multilevel graph-partitioning algorithms perform poorly and present a new multilevel graph-partitioning paradigm, top-down partitioning, which derives its name from the comparison with the traditional bottom-up partitioning. A new multilevel partitioning algorithm, named the betweenness-based partitioning algorithm, is also presented as an implementation of the top-down partitioning paradigm. An experimental evaluation of seven different real-world scale-free networks shows that the betweenness-based partitioning algorithm significantly outperforms the existing state-of-the-art approaches.

Keywords: graph partitioning; betweenness-based partitioning algorithm; scale free network
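
The abstract does not say how betweenness enters the partitioning algorithm, so the sketch below only illustrates the metric itself: node betweenness centrality computed with Brandes' algorithm on an unweighted, undirected graph. The function name `betweenness` and the adjacency-dictionary input format are assumptions made for this example.

```python
from collections import deque


def betweenness(adj):
    """Node betweenness via Brandes' algorithm.

    `adj` maps each node to a list of its neighbors (undirected, unweighted).
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# A small star graph: the hub lies on every shortest path between leaves.
star = {"h": ["x", "y", "z"], "x": ["h"], "y": ["h"], "z": ["h"]}
star_scores = betweenness(star)
```

In the star graph the hub carries all of the betweenness and the leaves carry none, which mirrors how a few high-degree nodes dominate shortest paths in scale-free networks.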

Effect of supply voltage and body-biasing on single-event transient pulse quenching in bulk fin field-effect-transistor process

Chinese Physics B, 2016, vol. 25, no. 4, pp. 495-500

Authors: 于俊庭, 陈书明, 陈建军, 黄鹏程, 宋睿强. College of Computer, National University of Defense Technology, Changsha 410073, China; National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

Charge sharing is becoming an important topic as the feature size scales down in fin field-effect-transistor (FinFET) technology. However, studies of charge-sharing-induced single-event transient (SET) pulse quenching in bulk FinFET are rarely reported. Using three-dimensional technology computer aided design (3DTCAD) mixed-mode simulations, the effects of supply voltage and body-biasing on SET pulse quenching are investigated for the first time in a bulk FinFET process. Research results indicate that due to an enhanced charge sharing effect, the propagating SET pulse width decreases with reducing supply voltage. Moreover, compared with reverse body-biasing (RBB), the circuit with forward body-biasing (FBB) is vulnerable to charge sharing and can effectively mitigate the propagating SET pulse width up to 53% at least. This can provide guidance for radiation-hardened bulk FinFET technology, especially in low power and high performance applications.

Keywords: body-biasing; SET pulse quenching; charge sharing; bulk FinFET process

Configuration-oriented symbolic test sequence construction method for EFSM

29th Annual International Computer Software and Applications Conference, COMPSAC 2005

Authors: Li, Shuhao; Wang, Ji; Wang, Xin; Qi, Zhi-Chang. National Laboratory for Parallel and Distributed Processing, Changsha, China

ISBN: (print) 0769522092

This paper presents a new approach to generating configuration-oriented executable symbolic test sequences from Extended Finite State Machine (EFSM) models. The information about the values of the context variables and the domain intervals of the input parameters are exploited to guide the derivation of the test sequences. Meanwhile, the transition guards along the test sequences are continually used to reduce the domain intervals of the input parameters. Experiments indicate that this method significantly reduces the EFSM state space to be explored and the number of non-executable symbolic test sequences to be generated. Since parameterized input events are allowed to occur in EFSM cycles, this method is suitable for testing the open reactive systems that interact with the environments via parameterized input events. © 2005 IEEE.

Keywords: Computer software
