ISBN (print): 9780780384309
Advances in hardware technology enable the inclusion of SMP nodes in PC clusters, or even entire clusters of SMPs. These are becoming viable alternatives for high-performance computing. The challenge is exploiting the computational resources that these hardware platforms provide. A hybrid programming paradigm, which uses shared memory through multithreading within a node and a message-passing model for inter-node communication, is one alternative. However, programming in such a paradigm is very hard. This work presents CPAR-Cluster, a runtime system that provides a shared memory abstraction on top of a cluster composed of mono- and multiprocessor nodes. It is implemented at the library level and does not require special resources such as specific hardware or operating system modifications. Models, strategies, implementation aspects and some results are presented.
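The library-level approach described above can be pictured as an API that hides whether a datum lives in local memory or on a remote node. The sketch below is purely illustrative: the class and method names are hypothetical and are not CPAR-Cluster's actual interface; it only shows the shape of a shared-memory abstraction in which remote accesses hide behind ordinary method calls.

    // Hypothetical illustration of a library-level shared memory abstraction;
    // not the actual CPAR-Cluster API.
    import java.util.concurrent.ConcurrentHashMap;

    public class SharedArray {
        // In a real runtime, remote segments would be fetched over the network;
        // here a local map stands in for the cluster-wide address space.
        private final ConcurrentHashMap<Integer, Double> segments = new ConcurrentHashMap<>();

        public double get(int index) {
            // A real implementation would check ownership and, if the element is
            // remote, request it from the owning node before returning.
            return segments.getOrDefault(index, 0.0);
        }

        public void put(int index, double value) {
            // A real implementation would invalidate or update remote copies here.
            segments.put(index, value);
        }

        public static void main(String[] args) {
            SharedArray a = new SharedArray();
            a.put(42, 3.14);               // looks like a local write
            System.out.println(a.get(42)); // could transparently be a remote read
        }
    }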
ISBN (print): 9780780384309
The significant performance-to-cost ratio advantage of clusters, combined with recent advances in middleware (programming environment) and networking technologies, has made them the single most popular and fastest growing platform for high performance computing in recent years. While the message passing interface (MPI) still dominates as a means of parallel programming in clusters, it is nevertheless desirable for programmers to program in a single address space, not only across a cluster but also among multiple, likely heterogeneous, clusters, so as to significantly extend the computing power of a single cluster. In this paper we propose a distributed shared object (DSO) model based on a distributed hierarchical consistency model (DHCM) protocol for heterogeneous clusters. DHCM, inspired by, but significantly improved over, local consistency, is designed to help maintain coherence and consistency in a DSO programming environment and to adapt to different levels of consistency. The notion of adaptive consistency is proposed and partially implemented to improve the efficiency of consistency control, and scalability is addressed as well through the hierarchical structure of the protocol design. We implemented this model purely in Java for portability and heterogeneity. The performance of DHCM is evaluated by executing the LU application chosen from the SPLASH-2 benchmark suite on a 128-node Linux cluster. The experimental results show that the hierarchical protocol significantly outperforms a single-tier protocol in terms of execution time, indicating higher scalability.
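To make the DSO programming model concrete, the Java sketch below shows what a shared-object handle with acquire/release operations and a selectable consistency level might look like. The interface, the Consistency levels and the LocalObject stand-in are all hypothetical illustrations, not the paper's actual API; a real DHCM runtime would replicate the object across cluster nodes in a hierarchy.

    // Hypothetical sketch of a distributed shared object (DSO) handle with a
    // selectable consistency level; not the paper's actual interface.
    public class DsoSketch {

        enum Consistency { STRICT, RELAXED }   // illustrative levels only

        interface SharedObject<T> {
            T read(Consistency level);          // may return a cached copy under RELAXED
            void write(T value);                // update propagated by the runtime
            void acquire();                     // bring the local copy up to date
            void release();                     // make local updates visible to others
        }

        // A trivial single-JVM stand-in so the example runs; a real DSO runtime
        // would keep replicas consistent across nodes according to DHCM.
        static class LocalObject<T> implements SharedObject<T> {
            private volatile T value;
            LocalObject(T initial) { value = initial; }
            public T read(Consistency level) { return value; }
            public void write(T v) { value = v; }
            public void acquire() { /* no-op locally */ }
            public void release() { /* no-op locally */ }
        }

        public static void main(String[] args) {
            SharedObject<int[]> row = new LocalObject<>(new int[]{1, 2, 3});
            row.acquire();
            row.write(new int[]{4, 5, 6});
            row.release();                               // updates become visible
            System.out.println(row.read(Consistency.RELAXED)[0]);
        }
    }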
This paper describes the definition and implementation of an OpenMP-like set of directives and library routines for shared memory parallel programming in Java. A specification of the directives and routines is proposed and discussed. A prototype implementation, consisting of a compiler and a runtime library, both written entirely in Java, is presented, which implements most of the proposed specification. Some preliminary performance results are reported. Copyright (C) 2001 John Wiley & Sons, Ltd.
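The example below illustrates the general idea of comment-based, OpenMP-like directives in Java. The //omp syntax is a guess at what such an annotation could look like and is not taken from the paper; a source-to-source compiler would rewrite the annotated loop into multithreaded code, while without it the file remains ordinary sequential Java.

    // Illustrative only: the directive syntax below is an assumed example of an
    // OpenMP-like annotation for Java, not the paper's specified syntax.
    public class OmpLikeLoop {
        public static void main(String[] args) {
            double[] a = new double[1_000_000];

            //omp parallel for
            for (int i = 0; i < a.length; i++) {
                a[i] = Math.sqrt(i);   // each iteration is independent
            }

            // Because the directive lives in a comment, the file is still plain
            // Java: without the special compiler it simply runs sequentially.
            System.out.println(a[a.length - 1]);
        }
    }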
We present a framework for the parallelization of depth-first combinatorial search algorithms on a network of computers. Our architecture is intended for a distributed setting and uses a work stealing strategy coupled with a small number of primitives for the processors (which we call workers) to obtain new work and to communicate with other workers. These primitives are a minimal imposition and integrate easily with constraint programming systems. The main contribution is an adaptive architecture, which allows workers to join and leave incrementally and has good scaling properties as the number of workers increases. Our empirical results show that near-linear speedup for backtrack search is achieved for up to 61 workers. This suggests that near-linear speedup is possible with even more workers. The experiments also demonstrate where departures from linearity can occur: for small problems, and for problems where the parallelism can itself affect the search, as in branch and bound.
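The worker loop below sketches the general shape of such a work-stealing strategy: each worker pops subproblems from its own deque in depth-first order and, when idle, steals the oldest (and usually largest) subproblem from a random peer. The primitive names (steal, expand) and the termination test are illustrative assumptions, not the paper's actual primitives; real distributed termination detection is more involved.

    // Hypothetical work-stealing worker for depth-first search (illustrative only).
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ConcurrentLinkedDeque;
    import java.util.concurrent.ThreadLocalRandom;

    public class StealingWorker implements Runnable {
        private final ConcurrentLinkedDeque<int[]> localWork = new ConcurrentLinkedDeque<>();
        private List<StealingWorker> peers;

        public void run() {
            while (true) {
                int[] node = localWork.pollLast();   // depth-first: newest work first
                if (node == null) node = steal();    // idle: ask another worker
                if (node == null) break;             // simplified termination test
                expand(node);
            }
        }

        private int[] steal() {
            StealingWorker victim = peers.get(ThreadLocalRandom.current().nextInt(peers.size()));
            return victim == this ? null : victim.localWork.pollFirst();  // take the oldest subtree
        }

        private void expand(int[] node) {
            // Placeholder: a real search would branch here, pushing child subproblems
            // back onto localWork, or record a solution / prune the subtree.
        }

        public static void main(String[] args) throws InterruptedException {
            List<StealingWorker> workers = new ArrayList<>();
            for (int i = 0; i < 4; i++) workers.add(new StealingWorker());
            for (StealingWorker w : workers) w.peers = workers;
            workers.get(0).localWork.add(new int[0]);   // seed the root subproblem
            List<Thread> threads = new ArrayList<>();
            for (StealingWorker w : workers) { Thread t = new Thread(w); threads.add(t); t.start(); }
            for (Thread t : threads) t.join();
            System.out.println("search finished");
        }
    }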
Summary form only given. We describe the parallelization of the multizone code versions of the NAS parallel benchmarks employing multilevel OpenMP parallelism. For our study we use the NanosCompiler, which supports nesting of OpenMP directives and provides clauses to control the grouping of threads, load balancing, and synchronization. We report the benchmark results, compare the timings with those of different hybrid parallelization paradigms, and discuss OpenMP implementation issues which affect the performance of multilevel parallel applications.
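The multizone codes exploit two levels of parallelism: across zones and, within each zone, across grid points. The Java sketch below only mirrors that nesting structure with nested parallel streams; it is not OpenMP, and the NanosCompiler's thread-grouping and load-balancing clauses have no counterpart here.

    // A Java analogue of two-level (multizone) parallelism; illustrative only.
    import java.util.stream.IntStream;

    public class MultizoneSketch {
        public static void main(String[] args) {
            int zones = 16;
            int pointsPerZone = 100_000;
            double[][] field = new double[zones][pointsPerZone];

            // Outer level: zones processed in parallel (coarse grain).
            IntStream.range(0, zones).parallel().forEach(z ->
                // Inner level: points within a zone processed in parallel (fine grain).
                IntStream.range(0, pointsPerZone).parallel().forEach(p ->
                    field[z][p] = Math.sin(z) * Math.cos(p)
                )
            );

            System.out.println(field[0][0] + " " + field[zones - 1][pointsPerZone - 1]);
        }
    }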
Local area networks are now widely used to run parallel applications. They are particularly suitable for I/O intensive applications, because each node usually includes disk space. However, programming these applications is rather difficult. The programmer must partition data across disk nodes and, at run time, transfer data from disk space into the memory of each node that uses the data and vice versa. Also, such partitioning gives data a fixed location on disk, which is usually not adequate for performance, because processors mostly access data not in their local memory or disk but in the disk space of remote nodes. This paper presents a distributed parallel file system that both eases the programming and improves the performance of parallel I/O intensive applications. Our file system eases programming by mapping files of up to hundreds of gigabytes into memory. It improves performance by automatically diffusing (that is, migrating and replicating) file data to the local memory or local disk of the processors that use it. Data diffusion occurs under a multiple-readers-single-writer protocol. On the applications tested, the performance gain can be up to 20% compared to versions using the MPI file system.
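The "program against memory rather than explicit I/O" idea can be illustrated with Java's standard memory-mapped files. This is only an analogue of the programming model described above, not the paper's file system, which additionally migrates and replicates data across cluster nodes; note also that a single Java mapping is limited to 2 GiB, so very large files would need several mappings.

    // Memory-mapped file access: the file looks like an in-memory buffer,
    // with no explicit read()/write() calls (illustrative analogue only).
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MappedFileSketch {
        public static void main(String[] args) throws Exception {
            long size = 1L << 20;   // 1 MiB here; larger files need multiple mappings
            try (RandomAccessFile f = new RandomAccessFile("data.bin", "rw");
                 FileChannel ch = f.getChannel()) {
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);

                buf.putDouble(0, 42.0);            // a plain memory write
                System.out.println(buf.getDouble(0));
            }
        }
    }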
ISBN (print): 0780386477
A 2K×8 bit EEPROM memory, which operates from a single 3.3 V power supply and is based on the SMIC 0.35 μm EEPROM process, has been developed. Several key design techniques are summarized. An improved readout circuit, consisting of a sense amplifier (SA), bit-line decoding and an optimized logic circuit to minimize the read access time, is described in detail, as are the approaches used to optimize the program operation and to generate the on-chip high voltage. A 40 ns typical read access time and a 2 ms page programming time are achieved. The active and standby currents are 10 mA and 100 μA respectively.
ISBN (print): 0780386477
The LMS algorithm is commonly used in the optimum design of adaptive filters, because it is simple and easy to realize. However, the convergence behavior and misadjustment of the LMS algorithm are strongly affected by the step size, and the optimum step-size parameter cannot be calculated easily. Evolutionary programming is an optimization algorithm whose search objects are N-dimensional real-valued vectors. In this paper, an FIR filter is taken as an example, and a fast evolutionary programming algorithm is used in the design of the adaptive filter: Cauchy mutation takes the place of Gaussian mutation to improve the speed of convergence. The algorithm is not dependent on any parameter; a good result is obtained by simulation, indicating the validity of the algorithm.
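The sketch below illustrates the two building blocks named in the abstract: a single LMS weight update (w becomes w + mu*e*x, where e is the output error) and a Cauchy variate as used by fast evolutionary programming in place of a Gaussian mutation. How the paper combines these ingredients is not spelled out in the abstract, so this is only a sketch of the pieces, not the paper's algorithm.

    // LMS update step and Cauchy mutation sample (building blocks only).
    import java.util.Random;

    public class LmsAndCauchy {
        // One LMS step: w <- w + mu * e * x, where e = d - w'x is the output error.
        static double lmsStep(double[] w, double[] x, double d, double mu) {
            double y = 0.0;
            for (int i = 0; i < w.length; i++) y += w[i] * x[i];
            double e = d - y;
            for (int i = 0; i < w.length; i++) w[i] += mu * e * x[i];
            return e;
        }

        // Standard Cauchy variate via the inverse transform; its heavy tails give
        // occasional long jumps, which is why Cauchy mutation tends to escape
        // poor regions faster than Gaussian mutation.
        static double cauchy(Random rng) {
            return Math.tan(Math.PI * (rng.nextDouble() - 0.5));
        }

        public static void main(String[] args) {
            Random rng = new Random(1);
            double[] w = new double[4];
            double[] x = {1.0, 0.5, -0.3, 0.2};
            System.out.println("error = " + lmsStep(w, x, 0.7, 0.05));
            System.out.println("cauchy sample = " + cauchy(rng));
        }
    }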
ISBN (print): 3540008527
Parallel programming effort can be reduced by using high-level constructs such as algorithmic skeletons. Within the Magda toolset, which supports the programming and execution of mobile-agent-based distributed applications, we provide a skeleton-based parallel programming environment based on the specialization of algorithmic skeleton Java interfaces and classes. Their implementation includes mobile agent features for execution on heterogeneous systems, such as clusters of workstations and PCs, and supports reliability and dynamic workload balancing. The user can thus develop a parallel, mobile-agent-based application simply by specializing a given set of classes and methods and using a set of added functionalities.
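The sketch below shows the general pattern of specializing a skeleton class: the user overrides a single worker method and the skeleton handles the parallel coordination. The class names (Farm, SquareFarm) and the parallel-stream back end are illustrative assumptions, not Magda's actual interfaces, which would dispatch work to mobile agents on cluster nodes.

    // Hypothetical skeleton specialization; not Magda's actual classes.
    import java.util.List;
    import java.util.stream.Collectors;

    public class FarmSketch {
        // A generic "farm" skeleton: apply a worker function to every task.
        static abstract class Farm<I, O> {
            protected abstract O worker(I task);          // the only method users override

            public List<O> compute(List<I> tasks) {
                // A real skeleton would hand tasks to agents on remote nodes and
                // rebalance the load; here a parallel stream stands in.
                return tasks.parallelStream().map(this::worker).collect(Collectors.toList());
            }
        }

        // User code: specialize the skeleton by overriding the worker method.
        static class SquareFarm extends Farm<Integer, Long> {
            protected Long worker(Integer task) { return (long) task * task; }
        }

        public static void main(String[] args) {
            System.out.println(new SquareFarm().compute(List.of(1, 2, 3, 4)));
        }
    }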
A Functional Abstract Notation (FAN) is proposed for the specification and design of parallel algorithms by means of skeletons: high-level patterns with parallel semantics. The main weakness of current programming systems based on skeletons is that the user is still responsible for finding the most appropriate skeleton composition for a given application and a given parallel architecture. We describe a transformational framework for the development of skeletal programs which is aimed at filling this gap. The framework makes use of transformation rules, which are semantic equivalences among skeleton compositions. For a given problem, an initial, possibly inefficient skeleton specification is refined by applying a sequence of transformations. Transformations are guided by a set of performance prediction models which forecast the behavior of each skeleton and the performance benefits of different rules. The design process is supported by a graphical tool which locates applicable transformations and provides performance estimates, thereby helping the programmer to navigate the program refinement space. We give an overview of the FAN framework and exemplify its use with performance-directed program derivations for simple case studies. Our experience can be viewed as a first feasibility study of methods and tools for transformational, performance-directed parallel programming using skeletons.
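One classic rule of the kind such a framework manipulates is map fusion: map(f) after map(g) is semantically equivalent to map(f compose g). The Java example below is generic and is not claimed to be one of FAN's own rules; it only shows that the two compositions are semantic equivalences whose choice (a two-stage pipeline versus a single fused stage) would be left to the performance models.

    // Map fusion as an example transformation rule (illustrative, not FAN's rule set).
    import java.util.List;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    public class MapFusion {
        public static void main(String[] args) {
            List<Integer> input = List.of(1, 2, 3, 4);
            Function<Integer, Integer> g = x -> x + 1;
            Function<Integer, Integer> f = x -> x * x;

            // Two-stage composition: could run as a two-stage pipeline.
            List<Integer> twoStages = input.stream().map(g).map(f).collect(Collectors.toList());

            // Fused composition: one stage, less intermediate traffic, less pipelining.
            List<Integer> fused = input.stream().map(f.compose(g)).collect(Collectors.toList());

            // Semantic equivalence: both forms produce the same result; a performance
            // model decides which form is better on a given machine.
            System.out.println(twoStages.equals(fused));  // true
        }
    }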