Search results - Inner Mongolia University Library

A compiler for exploiting nested parallelism in OpenMP programs

Parallel Computing, 2005, vol. 31, no. 10-12, pp. 960-983

Authors: Tian, XM; Hoeflinger, JP; Haab, G; Chen, YK; Girkar, M; Shah, S. Affiliations: Intel Corp Intel Compiler Labs Software & Solut Grp Santa Clara CA 95052 USA; Intel Corp Appl Res Lab Core Technol Grp Santa Clara CA 95052 USA; Intel Corp Parallel & Distributed Solut Div Software & Solut Grp Champaign IL 61820 USA

This paper presents the design and implementation of a parallelization framework and OpenMP runtime support in Intel (R) C++ & Fortran compilers for exploiting nested parallelism in applications using OpenMP pragmas or directives. We evaluate the performance of two multimedia applications parallelized with OpenMP pragmas and compiled with the Intel C++ compiler on Hyper-Threading Technology (HT) enabled multiprocessor systems. The performance results show that the multithreaded code generated by the Intel compiler achieved a speedup of up to 4.69 on 4 processors with HT enabled for five different input video sequences for the H.264 encoder workload, and a 1.28 speedup on an HT-enabled single-CPU system and a 1.99 speedup on an HT-enabled dual-CPU system for the audio visual speech recognition workload. The performance gain due to exploiting nested parallelism for leveraging Hyper-Threading Technology is up to 70% for two multimedia workloads under different multiprocessor system configurations. These results demonstrate that hyper-threading benefits can be achieved by exploiting nested parallelism through Intel compiler and runtime system support for OpenMP programs. (c) 2005 Elsevier B.V. All rights reserved.

Keywords: compiler parallelization nested parallelism OpenMP hyper-threading performance

A direct execution approach to simulating mobile agent algorithms

Journal of Supercomputing, 2004, vol. 29, no. 2, pp. 171-184

Authors: Li, XH; Cao, JN; He, YX. Affiliations: Hong Kong Polytech Univ Dept Comp Internet & Mobile Comp Lab Kowloon Hong Kong Peoples R China; Wuhan Univ State Key Lab Software Engn Data & Knowledge Engn Lab Wuhan 430072 Hubei Peoples R China; Wuhan Univ Sch Comp Parallel & Distributed Comp Lab Wuhan 430072 Hubei Peoples R China

Mobile agent technology has been applied to develop the solutions for various kinds of parallel and distributed computing problems. However, performance evaluation of mobile agent algorithms remains a difficult task, mainly due to the characteristics of mobile agents such as distributed and asynchronous execution, autonomy and mobility. This paper proposes a general approach based on direct execution simulation for evaluating the performance of mobile agent algorithms by collecting and analyzing the information about the agents during their execution. We describe the proposed generic simulation model, named MADES, the architecture of a software environment based on MADES, and a prototype implementation. A mobile agent-based distributed load balancing algorithm has been used for experiments with the prototype.

Keywords: mobile agent direct execution simulation parallel and distributed simulation MADES

A direct execution approach to simulating mobile agent algorithms

International Symposium on Parallel and Distributed Processing and Applications (ISPA 2003)

Keywords: mobile agent direct execution simulation parallel and distributed simulation MADES

On the performance of maestro2 high performance network equipment, using new improvement techniques

23rd IEEE International Performance, Computing, and Communications Conference, Conference Proceedings, IPCCC 2004

Authors: Yamagiwa, Shinichi; Ferreira, Kevin; Campos, Luis Miguel; Aoki, Keiichi; Ono, Masaaki; Wada, Koichi; Fukuda, Munehiro; Sousa, Leonel. Affiliations: PDM and FC Rua Latino Coelho 87 1050-134 Lisboa Portugal; Parallel/Distributed Comp. Lab. University of Tsukuba 1-1-1 Tennodai Tsukuba Ibaraki 305-8573 Japan; Comp. and Software Systems UW1-331 University of Washington Bothell 18815 Campus Way NE Bothell WA 98011-8246; IST/INESC-ID Rua Alves Redol 9 1000-029 Lisboa Portugal

Cluster computers have become the vehicle of choice to build high performance computing environments. To fully exploit the computing power of these environments, one must utilize high performance network and protocol technologies, since the communication patterns of parallel applications running on clusters require low latency and high throughput, not achievable by using off-the-shelf network technologies. We have developed a technology to build high performance network equipment, called Maestro2. This paper describes the novel techniques used by Maestro2 to extract maximum performance from the physical medium and studies the impact of software-level parameters. The results obtained clearly show that Maestro2 is a promising technology, presenting very good results both in terms of latency and throughput. The results also show the large impact of software overhead in the overall performance of the system and validate the need for optimized communication libraries for high performance computing.

Keywords: Computer networks

GigaE PM: a high performance communication facility using a Gigabit Ethernet

New Generation Computing, 2000, vol. 18, no. 2, pp. 177-186

Authors: Sumimoto, S; Tezuka, H; Hori, A; Harada, H; Takahashi, T; Ishikawa, Y. Affiliation: Real World Comp Partnership Parallel & Distributed Syst Software Lab Tsukuba Ibaraki 3050032 Japan

A high performance communication facility, called the GigaE PM, has been designed and implemented for parallel applications on clusters of computers using a Gigabit Ethernet. The GigaE PM not only provides reliable high-bandwidth, low-latency communication, but also supports existing network protocols such as TCP/IP. A reliable communication mechanism for a parallel application is implemented in the firmware on a NIC while existing network protocols are handled by an operating system kernel. A prototype system has been implemented using an Essential Communications Gigabit Ethernet card. The performance results show that a 58.3 μs round trip time for a four byte user message and 56.7 MBytes/sec bandwidth for a 1,468 byte message have been achieved on Intel Pentium II 400 MHz PCs. We have implemented MPICH-PM on top of the GigaE PM, and evaluated the NAS parallel benchmark performance. The results show that the IS class S performance on the GigaE PM is 1.8 times faster than that on TCP/IP.

Keywords: distributed and parallel computing high performance communication Gigabit Ethernet commodity network

Continuations for parallel logic programming

Proceedings of the 2nd International ACM SIGPLAN Conference on Principles and Practice of Declarative Programming (PPDP'00)

Authors: Todoran, Eneia; Papaspyrou, Nikolaos S. Affiliations: Technical University of Cluj-Napoca Dept. of Computer Science Parallel/Distributed Syst. Lab. Baritiu Str. 28 3400 Cluj-Napoca Romania; Natl. Technical University of Athens Dept. of Elec./Computer Engineering Software Engineering Laboratory 15780 Zografou Greece

ISBN: (print) 1581132654

This paper gives denotational models for three logic programming languages of progressive complexity, adopting the "logic programming without logic" approach. The first language is the control flow kernel of sequential Prolog, featuring sequential composition and backtracking. A committed-choice concurrent logic language with parallel composition (parallel AND) and don't care nondeterminism is studied next. The third language is the core of Warren's basic Andorra model, combining parallel composition and don't care nondeterminism with two forms of don't know nondeterminism (interpreted as sequential and parallel OR) and favoring deterministic over nondeterministic computation. We show that continuations are a valuable tool in the analysis and design of semantic models for both sequential and parallel logic programming. Instead of using mathematical notation, we use the functional programming language Haskell as a metalanguage for our denotational semantics, and employ monads in order to facilitate the transition from one language under study to another.

Keywords: Logic programming

Object-oriented run-time support for data-parallel applications

2nd International Symposium on Computing in Object-Oriented Parallel Environments

Authors: Bi, H; Kessler, M; Wilhelmi, M. Affiliation: GMD Inst Comp Architecture & Software Technol Parallel & Distributed Syst Lab D-12489 Berlin Germany

ISBN: (print) 3540653872

We present a C++ template run-time library, PROMOTER, and discuss run-time support for data-parallel applications. The PROMOTER run-time library provides a uniform framework for data-parallel applications, covering a broad spectrum of granularity, regularity and dynamicity. It supports user-defined data structures ranging from dense to sparse arrays, regular to irregular index structures and data distributions. The object-oriented design and implementation of the PROMOTER run-time library not only provides an easy data-parallel programming environment, but also leads to an efficient implementation of data-parallel applications through object reuse and object specialization.

Keywords: C++ (programming language)
