To exploit the full potential of GPGPUs for general purpose computing, DOACR parallelism abundant in scientific and engineering applications must be harnessed. However, the presence of cross-iteration data dependences in DOACR loops poses an obstacle to executing their computations concurrently using a massive number of fine-grained threads. This work focuses on iterative PDE solvers rich in DOACR parallelism to identify optimization principles and strategies that allow their efficient mapping to GPGPUs. Our main finding is that certain DOACR loops can be accelerated further on GPGPUs if they are algorithmically restructured (by a domain expert) to be more amenable to GPGPU parallelization, judiciously optimized (by the compiler), and carefully tuned (by a performance-tuning tool). We substantiate this finding with a case study by presenting a new parallel SSOR method that admits more efficient data-parallel SIMD execution than red-black SOR on GPGPUs. Our solution is obtained non-conventionally, by starting from a K-layer SSOR method and then parallelizing it by applying a non-dependence-preserving scheme consisting of a new domain decomposition technique followed by a generalized loop tiling. Despite its relatively slower convergence, our new method outperforms red-black SOR by striking a better balance between data reuse and parallelism and by trading off convergence rate for SIMD parallelism. Our experimental results highlight the importance of synergy between domain experts, compiler optimizations and performance tuning in maximizing the performance of applications, particularly PDE-based DOACR loops, on GPGPUs.
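For orientation, the red-black SOR baseline that the abstract compares against can be sketched as below. This is a minimal illustration, not the paper's K-layer SSOR method or its GPGPU implementation. It assumes a 2D Poisson problem with a five-point stencil and fixed boundary values; the function name `red_black_sor` and its parameters are hypothetical. Points of one colour depend only on points of the other colour, so each colour pass has no cross-iteration dependence and could be run as one data-parallel step.

```python
def red_black_sor(u, f, h, omega=1.5, sweeps=100):
    """Red-black SOR for -laplace(u) = f on a uniform grid (assumed model problem).

    u: list of rows holding the initial guess; boundary entries stay fixed.
    f: right-hand side on the same grid, h: grid spacing,
    omega: relaxation factor, sweeps: number of full red+black sweeps.
    """
    n_rows, n_cols = len(u), len(u[0])
    u = [row[:] for row in u]  # work on a copy, leave the caller's grid intact
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = "red" points, 1 = "black" points
            # All updates inside one colour pass are mutually independent,
            # because every neighbour of a point has the opposite colour.
            for i in range(1, n_rows - 1):
                for j in range(1, n_cols - 1):
                    if (i + j) % 2 != colour:
                        continue
                    gauss_seidel = 0.25 * (
                        u[i - 1][j] + u[i + 1][j]
                        + u[i][j - 1] + u[i][j + 1]
                        + h * h * f[i][j]
                    )
                    # Over-relaxed blend of the old value and the new one.
                    u[i][j] = (1.0 - omega) * u[i][j] + omega * gauss_seidel
    return u
```

The two colour passes are the only synchronization points per sweep, which is what makes the scheme attractive for SIMD execution, at the cost of touching every grid point from memory in each sweep.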
A class of integral inequalities is transformed into homogeneous symmetric polynomial inequalities beyond Tarski model, where the number of elements of the polynomial, say n, is also a variable and the coefficients are functions of *** is closely associated with some open problems formulated recently by Yang et al. *** Timofte's dimension-decreasing method for symmetric polynomial inequalities, combined with the inequality-proving package BOTTEMA and a program of implementing the method known as successive difference substitution, we provide a procedure for deciding the nonnegativity of the corresponding polynomial inequality such that the original integral inequality is mechanically decidable; otherwise, a counterexample will be *** effectiveness of the algorithm is illustrated by some more examples.
The International Workshop on the Web and Requirements Engineering (WeRE) was held in conjunction with the 18th International IEEE Requirements Engineering Conference (RE'10) in Sydney (Australia) on September 28, 2010. WeRE intends to be an international forum for exchanging ideas on both using Web technologies as a platform in the requirements engineering field, and applying requirements engineering in the development and use of websites. Papers focused on new domains and new experiences with the connection between requirements engineering and the Web were presented at WeRE.
ISBN: 9781450304276 (print)
It is my great pleasure and honor to welcome you to FoSER 2010: The FSE/SDP Workshop on the Future of Software Engineering Research. This workshop was organized in collaboration with and made possible by generous support from the Software Design and Productivity Coordinating Group (SDP) of the U.S. National Coordination Office (NCO) for Networking and Information Technology Research and Development (NITRD), and the National Science Foundation (NSF). This one-time, international working conference has brought together top academic and industrial researchers and government research funding agency personnel from around the world to engage in an extended discussion of consequential new ideas about the future of our field. The ideas produced by this workshop will be disseminated in two forms. First, the position papers accepted by the program committee will be published in a companion to the Proceedings of FSE-18. Second, the workshop findings will be published subsequently in a special report by NITRD/SDP. The call for papers for FoSER 2010 solicited 4-page position papers with new ideas about the future of software and software-reliant systems, and the research that will be needed to meet future needs. Papers were expected to be creative and thought-provoking, and to articulate compelling new perspectives, positions, problem formulations, assumptions and approaches. The workshop did not seek, and did not accept, technical research papers or abstracts. The workshop received a total of 139 position papers. Of these, 90 papers (65%) were accepted. Each paper was reviewed by at least two members of the workshop program committee. The committee was asked to accept all papers presenting significant new ideas about how our field should move forward. Copyright 2010 ACM.
One of the key factors in successful information security management is effective compliance with security policies and proper integration of "people", "process" and "technology". When ...
DRAM row buffer conflicts can increase memory access latency significantly. This paper presents a new page-allocation-based optimization that works seamlessly together with some existing hardware and software optimizations to eliminate significantly more row buffer conflicts. Validation in simulation using a set of selected scientific and engineering benchmarks against a few representative memory controller optimizations shows that our method can reduce row buffer miss rates by up to 76% (with an average of 37.4%). This reduction in row buffer miss rates translates into performance speedups of up to 15% (with an average of 5%).
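To make the notion of a row buffer conflict concrete, here is a toy model of an open-row policy. It is a sketch under stated assumptions and not the paper's simulator or its page-allocation algorithm: the address mapping (row index from integer division, bank as row index modulo the bank count), the 8 KiB row size, the 8 banks, and the helper names `row_buffer_miss_rate` and `interleave` are all hypothetical. It shows how two interleaved access streams thrash a shared bank, and how placing them in different banks removes almost all misses.

```python
def row_buffer_miss_rate(addresses, row_bytes=8192, num_banks=8):
    """Fraction of accesses that miss the open row of their bank (toy model)."""
    if not addresses:
        return 0.0
    open_row = {}  # bank index -> row currently held in that bank's row buffer
    misses = 0
    for addr in addresses:
        row_id = addr // row_bytes   # assumed mapping: consecutive rows ...
        bank = row_id % num_banks    # ... are spread round-robin over banks
        if open_row.get(bank) != row_id:
            misses += 1              # a different (or no) row is open
            open_row[bank] = row_id  # activate the requested row
    return misses / len(addresses)


def interleave(base_a, base_b, count, stride=64):
    """Alternate accesses between two streams starting at base_a and base_b."""
    trace = []
    for k in range(count):
        trace.append(base_a + k * stride)
        trace.append(base_b + k * stride)
    return trace


# Two streams whose rows fall into the same bank (rows 0 and 8 -> bank 0).
same_bank = interleave(0, 8 * 8192, count=64)
# The same streams with the second one placed in a neighbouring bank (row 1).
other_bank = interleave(0, 1 * 8192, count=64)
```

Comparing `row_buffer_miss_rate(same_bank)` with `row_buffer_miss_rate(other_bank)` shows the effect that data placement alone can have in this simplified model.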
This paper presents a novel method for coloring grayscale images. For this purpose, a combination of artificial neural networks and some image processing algorithms was developed to transfer colors from a user-sele...
IEEEXtreme is an IEEE Student Activities Committee initiative to create a worldwide programming contest for IEEE Student Branches. The success of the past editions and the way IEEEXtreme is evolving suggest that it ...
Maintainability, extendibility and reusability of components in the design of robot control architectures are major challenges. Parallel kinematic robots feature a wide variety of structures and applications. They are...