ISBN (print): 0769500048
This paper proposes a novel queue-based programming abstraction, Parallel Dispatch Queue (PDQ), that enables efficient parallel execution of fine-grain software communication protocols. Parallel systems often use fine-grain software handlers to integrate a network message into computation. Executing such handlers in parallel requires access synchronization around resources. Much as a monitor construct in a concurrent language protects accesses to a set of data structures, PDQ allows messages to include a synchronization key protecting handler accesses to a group of protocol resources. By simply synchronizing messages in a queue prior to dispatch, PDQ not only eliminates the overhead of acquiring/releasing synchronization primitives but also prevents busy-waiting within handlers. In this paper, we study PDQ's impact on software protocol performance in the context of fine-grain distributed shared memory (DSM) on an SMP cluster. Simulation results running shared-memory applications indicate that: (i) parallel software protocol execution using PDQ significantly improves performance in fine-grain DSM, (ii) tight integration of PDQ and embedded processors into a single custom device can offer performance competitive with or better than an all-hardware DSM, and (iii) PDQ best benefits cost-effective systems that use idle SMP processors (rather than custom embedded processors) to execute protocols. On a cluster of 4 16-way SMPs, a PDQ-based parallel protocol running on idle SMP processors improves application performance by a factor of 2.6 over a system running a serial protocol on a single dedicated processor.
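To make the synchronization-key idea concrete, here is a minimal sketch (not the paper's embedded-processor implementation): a dispatcher that serializes handlers sharing a key before they ever reach a worker thread, so the handlers themselves acquire no locks and never busy-wait. The class and method names are illustrative assumptions.

```java
import java.util.*;
import java.util.concurrent.*;

/** Minimal sketch of a parallel dispatch queue: each message carries a
 *  synchronization key, and the dispatcher guarantees that at most one
 *  handler per key runs at a time, so handlers need no internal locks. */
public class ParallelDispatchQueue {
    /** A protocol message paired with the key of the resources it touches. */
    public record Message(int syncKey, Runnable handler) {}

    private final ExecutorService workers;
    // One serial lane per key: handlers with the same key run back to back.
    private final Map<Integer, ArrayDeque<Message>> lanes = new HashMap<>();
    private final Set<Integer> activeKeys = new HashSet<>();

    public ParallelDispatchQueue(int nWorkers) {
        workers = Executors.newFixedThreadPool(nWorkers);
    }

    /** Enqueue a message; dispatch immediately if its key is idle. */
    public synchronized void post(Message m) {
        if (activeKeys.add(m.syncKey())) {
            dispatch(m);                        // key was idle: run now
        } else {
            lanes.computeIfAbsent(m.syncKey(), k -> new ArrayDeque<>()).add(m);
        }
    }

    private void dispatch(Message m) {
        workers.submit(() -> {
            m.handler().run();                  // handler runs lock-free
            onHandlerDone(m.syncKey());
        });
    }

    /** When a handler finishes, dispatch the next message waiting on its key. */
    private synchronized void onHandlerDone(int key) {
        ArrayDeque<Message> lane = lanes.get(key);
        if (lane == null || lane.isEmpty()) {
            activeKeys.remove(key);             // key becomes idle
        } else {
            dispatch(lane.poll());
        }
    }

    public void shutdown() { workers.shutdown(); }
}
```

Messages with different keys still run fully in parallel; the queue only pays a short critical section at enqueue and completion time rather than per-resource locking inside each handler.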
ISBN (print): 0769500048
Much research has been done in fast communication on clusters and in protocols for supporting software shared memory across them. However, the end performance of applications that were written for the more proven hardware-coherent shared memory is still not very good on these systems. Three major layers of software (and hardware) stand between the end user and parallel performance, each with its own functionality and performance characteristics: the communication layer, the software protocol layer that supports the programming model, and the application layer. These layers provide a useful framework to identify the key remaining limitations and bottlenecks in software shared memory systems, as well as the areas where optimization efforts might yield the greatest performance improvements. This paper performs such an integrated study, using this layered framework, for two types of software distributed shared memory systems: page-based shared virtual memory (SVM) and fine-grained software systems (FG). For the two system layers (communication and protocol), we focus on the performance costs of basic operations rather than on their functionalities; this is possible because their functionalities are now fairly mature. The less mature application layer is treated through application restructuring. We examine the layers individually and in combination, understanding their implications for the two types of protocols and exposing the synergies among layers.
The Navy needs to use Multi-Level Security (MLS) techniques in an environment with an increasing amount of real-time computation brought about by increased automation requirements and new, more complex operations. NSWC-DD...
In this paper we present Dynamic Bisectioning (DBS), a simple but powerful, comprehensive scheduling policy for user-level threads, which unifies the exploitation of (multidimensional) loop and nested functional (or t...
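The abstract is truncated here, but the core idea of unifying loop parallelism and nested task parallelism over a single pool of user-level threads can be illustrated with a hedged sketch; the recursive bisection of the iteration range below mirrors the "bisectioning" name, while the grain size, class names, and use of a fork/join pool are illustrative assumptions rather than the paper's mechanism.

```java
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.ForkJoinPool;

/** Hedged sketch: one pool of lightweight tasks serves both loop
 *  parallelism (by recursively bisecting the iteration range) and nested
 *  functional parallelism (by forking independent subtasks). */
public class BisectLoop extends RecursiveAction {
    interface Body { void run(int i); }

    private final int lo, hi, grain;
    private final Body body;

    BisectLoop(int lo, int hi, int grain, Body body) {
        this.lo = lo; this.hi = hi; this.grain = grain; this.body = body;
    }

    @Override protected void compute() {
        if (hi - lo <= grain) {                 // small enough: run serially
            for (int i = lo; i < hi; i++) body.run(i);
        } else {                                // bisect and run both halves
            int mid = (lo + hi) >>> 1;
            invokeAll(new BisectLoop(lo, mid, grain, body),
                      new BisectLoop(mid, hi, grain, body));
        }
    }

    public static void main(String[] args) {
        double[] a = new double[1 << 20];
        ForkJoinPool pool = new ForkJoinPool();  // the shared user-level pool
        pool.invoke(new BisectLoop(0, a.length, 4096, i -> a[i] = Math.sqrt(i)));
        System.out.println(a[123456]);
    }
}
```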
Dynamic Load Balancing is an important system function intended to distribute workload among available processors to improve the throughput and/or execution times of parallel computer programs, whether uniform or non-uniform (jobs whose workload varies at run-time in unpredictable ways). Non-uniform computation and communication requirements may bog down a parallel computer if no efficient load distribution is effected. A novel distributed algorithm for load balancing is proposed, based on local Rate of Change observations rather than on global absolute load numbers. It is a totally distributed algorithm and requires no centralized trigger and/or decision makers. The strategy is discussed and analyzed by means of experimental simulation.
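A minimal sketch of the rate-of-change idea, assuming each node periodically samples its own load and exchanges only rate estimates with neighbors; the threshold and the neighbor-query interface are illustrative, not the paper's protocol.

```java
/** Hedged sketch of rate-of-change load balancing: each node compares its
 *  own load derivative with a neighbor's and offloads work when its load is
 *  growing markedly faster. No global absolute load numbers are exchanged. */
public class RateOfChangeBalancer {
    private double prevLoad;           // load observed at the previous tick
    private double rate;               // local dLoad/dt estimate

    /** Called once per balancing interval with the current local queue length. */
    public void observe(double currentLoad, double dtSeconds) {
        rate = (currentLoad - prevLoad) / dtSeconds;
        prevLoad = currentLoad;
    }

    /** Purely local decision: ship work only if my load is rising much
     *  faster than the neighbor's own rate estimate. */
    public boolean shouldOffloadTo(double neighborRate, double threshold) {
        return rate - neighborRate > threshold;
    }

    public static void main(String[] args) {
        RateOfChangeBalancer me = new RateOfChangeBalancer();
        me.observe(10, 1.0);           // first sample
        me.observe(25, 1.0);           // load grew by 15 tasks/s
        System.out.println(me.shouldOffloadTo(2.0, 5.0));   // true: offload
    }
}
```

Because every node reacts only to derivatives it can compute locally, there is no centralized trigger, matching the fully distributed character claimed in the abstract.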
This paper presents SKiPPER, a programming environment dedicated to the fast prototyping of parallel vision algorithms on MIMD-DM platforms. SKiPPER is based upon the concept of algorithmic skeletons, i.e. higher o...
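The abstract is truncated here, but the skeleton idea itself can be sketched: the application writer supplies only sequential worker functions, and a reusable skeleton (a simple "farm" below) takes care of distributing tasks and collecting results. The skeleton name and signature are illustrative and do not reproduce SKiPPER's API.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

/** Hedged sketch of an algorithmic skeleton: the caller provides sequential
 *  code, and the "farm" handles all parallel bookkeeping. */
public class FarmSkeleton {
    /** Apply a sequential worker to every task in parallel, keeping order. */
    public static <T, R> List<R> farm(List<T> tasks, Function<T, R> worker, int nWorkers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T t : tasks) futures.add(pool.submit(() -> worker.apply(t)));
            List<R> results = new ArrayList<>();
            for (Future<R> f : futures) results.add(f.get());
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Toy "vision" stage: per-tile count of pixels above a threshold.
        List<int[]> tiles = List.of(new int[]{1, 200, 30}, new int[]{250, 9, 180});
        List<Long> counts = farm(tiles,
                tile -> Arrays.stream(tile).filter(p -> p > 100).count(), 2);
        System.out.println(counts);   // [1, 2]
    }
}
```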
Multidimensional Analysis and On-Line Analytical Processing (OLAP) use summary information that requires aggregate operations along one or more dimensions of numerical data values. Query processing for these applications requires different views of data for decision support. The Data Cube operator provides multi-dimensional aggregates, used to calculate and store summary information on a number of dimensions. The multi-dimensionality of the underlying problem can be represented both in relational and multi-dimensional databases, the latter being a better fit when query performance is the criterion for judgment. Relational databases are scalable in size, and efforts are under way to make their performance acceptable. On the other hand, multi-dimensional databases perform well for such queries, although they are not very scalable. Parallel computing is necessary to address the scalability and performance issues for these data sets. In this paper we present a parallel and scalable infrastructure for OLAP and multidimensional analysis. We use chunking to store data either as a dense block using multidimensional arrays (md-arrays) or as a sparse set using a Bit-Encoded Sparse Structure (BESS). Chunks provide a multidimensional index structure for efficient dimension-oriented data accesses, much the same as md-arrays do. Operations within chunks and between chunks are a combination of relational and multi-dimensional operations, depending on whether the chunk is sparse or dense. We present performance results for data sets with 3, 5 and 10 dimensions for our implementation on the IBM SP-2, which show good speedup and scalability.
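A hedged sketch of the density-adaptive chunk storage described above: a chunk is kept as a dense md-array when enough of its cells are populated, and otherwise as a list of cell offsets plus values standing in for BESS; the 25 % density threshold and the class layout are illustrative assumptions, not the paper's encoding.

```java
import java.util.*;

/** Hedged sketch of density-adaptive chunk storage for a data cube. */
public class CubeChunk {
    private final int[] dims;          // chunk edge lengths, e.g. {8, 8, 8}
    private double[] dense;            // used when the chunk is dense
    private long[] sparseOffsets;      // cell offsets (simplified stand-in for BESS)
    private double[] sparseValues;

    public CubeChunk(int[] dims, Map<Integer, Double> cells) {
        this.dims = dims.clone();
        int capacity = Arrays.stream(dims).reduce(1, (a, b) -> a * b);
        if (cells.size() * 4 >= capacity) {          // >= 25 % full: go dense
            dense = new double[capacity];
            cells.forEach((off, v) -> dense[off] = v);
        } else {                                     // sparse: offsets + values
            sparseOffsets = new long[cells.size()];
            sparseValues = new double[cells.size()];
            int i = 0;
            for (Map.Entry<Integer, Double> e : cells.entrySet()) {
                sparseOffsets[i] = e.getKey();
                sparseValues[i++] = e.getValue();
            }
        }
    }

    /** Sum over the whole chunk: a dense scan or a sparse scan, chosen by layout. */
    public double sum() {
        if (dense != null) return Arrays.stream(dense).sum();
        return Arrays.stream(sparseValues).sum();
    }

    public static void main(String[] args) {
        CubeChunk c = new CubeChunk(new int[]{8, 8, 8}, Map.of(3, 2.5, 77, 4.0));
        System.out.println(c.sum());                 // 6.5, stored sparsely
    }
}
```

Aggregations then pick a dense or sparse kernel per chunk, which is the combination of relational and multi-dimensional operations the abstract refers to.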
ISBN (print): 0818691948
In order to provide Java the ability to support scientific parallel computing, we introduce a data-parallel extension to the Java language with runtime system support. We provide a distributed arrays extension to Java and discuss the related operations on, and control over, the new distributed arrays. Communication involving distributed arrays is handled through a standard collective communication library. We also adopt a Single Program Multiple Data (SPMD) programming model.
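A minimal sketch of the SPMD distributed-array idea, assuming ranks are simulated with threads and the collective reduction is a shared accumulator; a real system would sit on a collective communication library, and all names here are illustrative.

```java
import java.util.concurrent.atomic.DoubleAdder;
import java.util.ArrayList;
import java.util.List;

/** Hedged sketch of SPMD distributed arrays: each "process" owns a block of
 *  a global array and a collective reduction combines the local results. */
public class DistributedArrayDemo {
    static final int GLOBAL_SIZE = 1_000_000;
    static final int NPROCS = 4;
    static final DoubleAdder globalSum = new DoubleAdder();  // stand-in collective

    /** The per-rank SPMD body: allocate my block, fill it, contribute to the sum. */
    static void spmdBody(int rank) {
        int block = GLOBAL_SIZE / NPROCS;
        int lo = rank * block;
        int hi = (rank == NPROCS - 1) ? GLOBAL_SIZE : lo + block;
        double[] local = new double[hi - lo];          // my slice of the array
        double partial = 0;
        for (int i = 0; i < local.length; i++) {
            local[i] = lo + i;                         // global index as the value
            partial += local[i];
        }
        globalSum.add(partial);                        // "all-reduce" contribution
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> ranks = new ArrayList<>();
        for (int r = 0; r < NPROCS; r++) {
            final int rank = r;
            Thread t = new Thread(() -> spmdBody(rank));
            ranks.add(t);
            t.start();
        }
        for (Thread t : ranks) t.join();
        System.out.println(globalSum.sum());           // sum of 0..GLOBAL_SIZE-1
    }
}
```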
ISBN (print): 0818691948
The proceedings contain 61 papers. The topics discussed include: new number representation and conversion techniques on reconfigurable mesh; precise control of instruction caches; more on arbitrary boundary packed arithmetic; PERL - a registerless architecture; design alternatives for shared memory multiprocessors; a simple optimal list ranking algorithm; a parallel skeletonization algorithm and its VLSI architecture; improving error bounds for multipole-based treecodes; computation of penetration measures for convex polygons and polyhedra for graphics applications; extrapolation in distributed adaptive integration; and Java data parallel extensions with runtime system support.
ISBN (print): 0818685794
The complexity of parallel I/O systems imposes significant challenges in managing and utilizing the available system resources to meet application performance, portability and usability goals. We believe that a parallel I/O system that automatically selects efficient I/O plans for user applications is a solution to this problem. In this paper, we present such an automatic performance optimization approach for scientific applications performing collective I/O requests on multidimensional arrays. The approach is based on a high-level description of the target workload and execution environment characteristics, and applies genetic algorithms to select high-quality I/O plans. We have validated this approach in the Panda parallel I/O library. Our performance evaluations on the IBM SP show that this approach can select high-quality I/O plans under a variety of system conditions with low overhead, and that the genetic algorithm-selected I/O plans are in general better than the default plans used in Panda.
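A hedged sketch of genetic-algorithm plan selection, assuming a plan is encoded as a single gene (a stripe-unit choice) and scored by a synthetic cost model; the encoding, cost function, and GA parameters are illustrative and are not Panda's.

```java
import java.util.*;

/** Hedged sketch of GA-based I/O plan selection: plans are evolved by
 *  tournament selection plus mutation against a toy cost model. */
public class IoPlanGA {
    static final int[] STRIPE_UNITS_KB = {64, 128, 256, 512, 1024};
    static final Random rng = new Random(42);

    /** Lower is better: a toy cost trading per-request overhead against transfer time. */
    static double cost(int plan) {
        int unit = STRIPE_UNITS_KB[plan];
        return 1e6 / unit + unit * 3.0;
    }

    static int mutate(int plan) {
        return rng.nextInt(4) == 0 ? rng.nextInt(STRIPE_UNITS_KB.length) : plan;
    }

    public static void main(String[] args) {
        int popSize = 20, generations = 30;
        int[] pop = new int[popSize];
        for (int i = 0; i < popSize; i++) pop[i] = rng.nextInt(STRIPE_UNITS_KB.length);

        for (int g = 0; g < generations; g++) {
            int[] next = new int[popSize];
            for (int i = 0; i < popSize; i++) {
                int a = pop[rng.nextInt(popSize)], b = pop[rng.nextInt(popSize)];
                int winner = cost(a) <= cost(b) ? a : b;   // tournament selection
                next[i] = mutate(winner);
            }
            pop = next;
        }
        int best = Arrays.stream(pop).boxed()
                         .min(Comparator.comparingDouble(IoPlanGA::cost)).orElseThrow();
        System.out.println("Best stripe unit: " + STRIPE_UNITS_KB[best] + " KB");
    }
}
```

In a real library the fitness function would come from the workload and environment description rather than a fixed formula, but the selection loop has the same shape.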