We present a new parallel computational model, named LogGPS, which captures synchronization. The LogGPS model is an extension of the LogGP model, which abstracts communication on parallel platforms. Although the LogGP model captures long messages with one bandwidth parameter (G), it does not capture the synchronization that is needed before sending a long message by high-level communication libraries. Our model has one additional parameter, S, defined as the threshold for message length above which synchronous messages are sent. We also present some experimental results using both models. The results include (1) a verification of the LogGPS model, (2) an example of synchronization analysis using an MPI program, and (3) a comparison of the models. The results indicate that the LogGPS model is more accurate than the LogGP model, and that analyzing synchronization costs is important when improving parallel program performance.
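The parameters named in the abstract (L, o, G, and the new threshold S) suggest a simple cost picture: messages no longer than S go out eagerly at LogGP cost, while longer messages first perform a rendezvous handshake with the receiver. The sketch below is an illustrative approximation under that reading, not the paper's actual equations; the handshake term and the `recv_delay` parameter are assumptions.

```python
def loggps_send_time(k, L, o, G, S, recv_delay=0.0):
    """Rough point-to-point cost sketch in the spirit of LogGP/LogGPS.

    k: message length in bytes; L: network latency; o: per-message CPU
    overhead; G: gap per byte; S: synchronization threshold.  recv_delay:
    how long after the send starts the receiver posts its matching
    receive (only relevant for synchronous messages).  The exact formula
    is an illustrative assumption, not taken from the paper.
    """
    base = o + (k - 1) * G + L + o          # LogGP-style transfer cost
    if k <= S:
        return base                          # eager (asynchronous) send
    # Rendezvous: request/ack round trip, plus any time spent waiting
    # for the receiver before the data can actually move.
    handshake = 2 * (o + L + o)
    return max(handshake, recv_delay) + base
```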
This paper presents a new combined pointer and escape analysis for multithreaded programs. The algorithm uses a new abstraction called parallel interaction graphs to analyze the interactions between threads and extract precise points-to, escape, and action ordering information for objects accessed by multiple threads. The analysis is compositional, analyzing each method or thread once to extract a parameterized analysis result that can be specialized for use in any context. It is also capable of analyzing programs that use the unstructured form of multithreading present in languages such as Java and standard threads packages such as POSIX threads. We have implemented the analysis in the MIT Flex compiler for Java and used the extracted information to 1) verify that programs correctly use region-based allocation constructs, 2) eliminate dynamic checks associated with the use of regions, and 3) eliminate unnecessary synchronization. Our experimental results show that analyzing the interactions between threads significantly increases the effectiveness of the region analysis and region check elimination, but has little effect for synchronization elimination.
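As a rough illustration of how extracted escape information enables synchronization elimination, here is a hypothetical, heavily simplified record and check; the names `ObjectNode` and `can_elide_sync` are invented for this sketch, and real parallel interaction graphs carry far more information (points-to edges, action ordering) than shown.

```python
from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    site: str                                    # allocation site label
    points_to: set = field(default_factory=set)  # objects reachable from this one
    escapes_to_threads: set = field(default_factory=set)  # threads that may reach it

def can_elide_sync(obj: ObjectNode) -> bool:
    # If no other thread can ever reach the object, its locks are
    # uncontended and synchronization on it can be removed.
    return not obj.escapes_to_threads
```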
Current trends in high performance computing suggest that users will soon have widespread access to clusters of multiprocessors with hundreds, if not thousands, of processors. This unprecedented degree of parallelism will undoubtedly expose scalability limitations in existing applications, where scalability is the ability of a parallel algorithm on a parallel architecture to effectively utilize an increasing number of processors. Users will need precise and automated techniques for detecting the cause of limited scalability. This paper addresses this dilemma. First, we argue that users face numerous challenges in understanding application scalability: managing substantial amounts of experiment data, extracting useful trends from this data, and reconciling performance information with their application's design. Second, we propose a solution to automate this data analysis problem by applying fundamental statistical techniques to scalability experiment data. Finally, we evaluate our operational prototype on several applications and show that statistical techniques offer an effective strategy for assessing application scalability. In particular, we find that non-parametric correlation of the number of tasks to the ratio of the time for communication operations to overall communication time provides a reliable measure for identifying communication operations that scale poorly.
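The final sentence names the concrete statistic: a non-parametric (rank) correlation between task count and each operation's share of communication time. A minimal sketch of that computation, assuming per-operation ratios have already been gathered from the experiment data; the prototype described in the paper is not this script.

```python
from scipy.stats import spearmanr   # non-parametric (rank) correlation

def poorly_scaling_ops(task_counts, op_ratios, threshold=0.9):
    """op_ratios maps an operation name (e.g. 'MPI_Allreduce') to the
    fraction of total communication time it consumed at each task count
    in task_counts.  Flags operations whose share grows with scale."""
    flagged = {}
    for op, ratios in op_ratios.items():
        rho, _pvalue = spearmanr(task_counts, ratios)
        if rho >= threshold:         # share rises nearly monotonically with scale
            flagged[op] = rho
    return flagged
```

Rank correlation is a reasonable fit here because it does not assume a linear relationship between task count and the time ratio, only a monotone trend.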
Networks of workstations (NOWs), which are generally composed of autonomous compute elements networked together, are an attractive parallel computing platform since they offer high performance at low cost. The autonomous nature of the environment, however, often results in inefficient utilization due to load imbalances caused by three primary factors: 1) unequal load (compute or communication) assignment to equally-powerful compute nodes, 2) unequal resources at compute nodes, and 3) multiprogramming. These load imbalances result in idle waiting time on cooperating processes that need to synchronize or communicate data. Additional waiting time may result from local scheduling decisions in a multiprogrammed environment. In this paper, we present a combined approach of compile-time analysis, run-time load distribution, and operating system scheduler cooperation for improved utilization of available resources in an autonomous NOW. The techniques we propose allow efficient resource utilization by taking into consideration all three causes of load imbalance, in addition to locality of access, in the process of load distribution. The resulting adaptive load distribution and cooperative scheduling system allows applications to take advantage of parallel resources when available, providing better performance than when the loaded resources are not used at all.
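As a toy illustration of run-time load distribution that accounts for unequal node capability and multiprogramming, one can weight each node's share of the work by its recently observed effective speed. The function below is an assumed simplification of how such a distribution step might look, not the paper's algorithm.

```python
def partition_iterations(n_iters, effective_speeds):
    """effective_speeds[i]: recently measured rate (e.g. iterations/sec)
    of node i, which already folds in slower hardware and competing jobs.
    Returns shares[i], the number of iterations assigned to node i."""
    total = sum(effective_speeds)
    shares = [int(n_iters * s / total) for s in effective_speeds]
    shares[-1] += n_iters - sum(shares)   # hand the rounding remainder to the last node
    return shares
```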
In shared memory programs, contention often occurs at the transition between a sequential and a parallel section of the code. As all threads start executing the parallel section, they often access data just modified by the thread that executed the sequential section, causing a flurry of data requests to converge on that processor. We address this problem in a software distributed shared memory system by replicating the execution of the sequential sections on all processors. Communication during this replicated sequential execution is reduced by using multicast. We have implemented replicated sequential execution with multicast support in OpenMP/NOW, a version of OpenMP that runs on networks of workstations. We do not rely on compile-time data analysis, and therefore we can handle irregular and pointer-based applications. We show significant improvement for two pointer-based applications that suffer from severe contention without replicated sequential execution.
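A rough sketch of the idea, using an MPI broadcast to stand in for the multicast: any data the sequential section needs from a remote process is pushed once to everyone, and every process then executes the sequential section locally, so the following parallel section finds the freshly modified data already resident. This is an illustrative analogue, not the OpenMP/NOW implementation.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD

def run_phase(sequential_step, parallel_step, inputs_owner, local_inputs=None):
    # Inputs that live on one process are multicast once, instead of
    # being fetched separately by every replica during execution.
    inputs = comm.bcast(local_inputs, root=inputs_owner)
    state = sequential_step(inputs)          # replicated on every process
    # Each process now holds `state` locally; no flurry of requests
    # converges on the process that "owned" the sequential section.
    return parallel_step(state, comm.Get_rank(), comm.Get_size())
```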
Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in programs where data redistribution between computational phases is considered. The global data distribution problem is to find the optimal distribution in multi-phase parallel programs. Solving this problem requires accurate knowledge of data redistribution cost. We are investigating this problem in the context of a software distributed shared memory (SDSM) system, in which obtaining accurate redistribution cost estimates is difficult. This is because SDSM communication is implicit: it depends on access patterns, page locations, and the SDSM consistency protocol. We have developed integrated compile- and run-time analysis for SDSM systems to determine accurate redistribution cost estimates with low overhead. Our resulting system, SUIF-Adapt, can efficiently and accurately estimate execution time, including redistribution, to within 5% of the actual time in all of our test cases, and is often much closer. These precise costs enable SUIF-Adapt to find efficient global data distributions in multiple-phase programs.
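Given per-phase compute-cost and redistribution-cost estimates such as the ones SUIF-Adapt produces, picking a global distribution reduces to a small dynamic program over phases. The sketch below assumes those estimates are available as callables; it is not the SUIF-Adapt algorithm itself, and the hard part (computing the estimates) is omitted.

```python
def best_global_distribution(phases, dists, compute_cost, redist_cost):
    """compute_cost(phase, d): estimated time of `phase` under distribution d.
    redist_cost(d_from, d_to): estimated cost of switching distributions
    between consecutive phases.  Returns (total_time, per-phase plan)."""
    best = {d: (compute_cost(phases[0], d), [d]) for d in dists}
    for phase in phases[1:]:
        nxt = {}
        for d in dists:
            prev_t, prev_plan = min(
                ((t + redist_cost(plan[-1], d), plan) for t, plan in best.values()),
                key=lambda tp: tp[0],
            )
            nxt[d] = (prev_t + compute_cost(phase, d), prev_plan + [d])
        best = nxt
    return min(best.values(), key=lambda tp: tp[0])
```

With a handful of candidate distributions and phases, this search is cheap; the accuracy of the result rests entirely on the cost estimates fed into it.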
Recent proposals for multithreaded architectures allow threads with unknown dependences to execute speculatively in parallel. These architectures use hardware speculative storage to buffer uncertain data, track data dependences, and roll back incorrect executions. Because all memory references access the speculative storage, current proposals implement this storage using small memory structures for fast access. The limited capacity of the speculative storage causes considerable performance loss due to speculative storage overflow whenever a thread's speculative state exceeds the storage capacity. Larger threads exacerbate the overflow problem but are preferable to smaller threads, as larger threads uncover more parallelism. In this paper, we discover a new program property called memory reference idempotency. Idempotent references need not be tracked in the speculative storage and instead can directly access non-speculative storage (i.e., the conventional memory hierarchy). Thus, we reduce the demand for speculative storage space. We define a formal framework for reference idempotency and present a novel compiler-assisted speculative execution model. We prove the necessary and sufficient conditions for reference idempotency using our model. We present a compiler algorithm to label idempotent memory references for the hardware. Experimental results show that, for our benchmarks, over 60% of the references in non-parallelizable program sections are idempotent.
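A very conservative approximation of idempotency, shown only to illustrate the idea: a read whose location is never written anywhere in the speculative region sees the same value on any re-execution, so it need not occupy speculative storage. The paper's formal conditions admit more references than this toy check does.

```python
def label_idempotent_reads(references):
    """references: list of (kind, location) pairs, kind in {'read', 'write'},
    in program order for one speculative region.  Returns the same list
    with a boolean flag marking reads this toy rule deems idempotent."""
    written = {loc for kind, loc in references if kind == 'write'}
    return [(kind, loc, kind == 'read' and loc not in written)
            for kind, loc in references]
```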
The design of non-trace-based performance instrumentation techniques for threaded programs is investigated to provide detailed performance data while maintaining control of instrumentation costs. The design is based on low-contention data structures. Paradyn's dynamic instrumentation is extended to handle threaded programs. To associate data with individual threads, all threads share the same instrumentation code, and each thread is assigned its own private copy of performance counters or timers. The asynchrony in a threaded program poses a major challenge to dynamic instrumentation.
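A minimal sketch of the "shared instrumentation code, private per-thread counters" idea using thread-local storage. Paradyn instruments running binaries dynamically rather than wrapping functions like this, so the decorator below is purely an illustration of why per-thread copies avoid contention.

```python
import threading
import time

_tls = threading.local()

def _counters():
    # Each thread lazily creates its own private counter table; no
    # locking is needed because no other thread ever touches it.
    if not hasattr(_tls, "counters"):
        _tls.counters = {}
    return _tls.counters

def instrumented(func):
    # All threads execute this same instrumentation code path...
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # ...but each updates only its own copy of the timer.
            c = _counters()
            c[func.__name__] = c.get(func.__name__, 0.0) + (time.perf_counter() - start)
    return wrapper
```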
The efficiency of HPF with respect to irregular applications is still largely unproven. While recent work has shown that a highly irregular hierarchical n-body force calculation method can be implemented in HPF, we have found that the implementation contains inefficiencies that cause it to run up to three times slower than our hand-coded, explicitly parallel implementation. Our work examines these inefficiencies, determines that most of the extra overhead is due to a single aspect of the communication strategy, and demonstrates that fixing the communication strategy can bring the overheads of the HPF application to within 25% of those of the hand-coded version.