ISBN: 9781457719516 (print)
The proceedings contain 40 papers. The topics discussed include: hybrid algorithms for list ranking and graph connected components; parallel multiple precision division by a single precision divisor; scalable clustering using multiple GPUs; hybrid implementation of error diffusion dithering; porting irregular reductions on heterogeneous CPU-GPU configurations; building algorithmically nonstop fault tolerant MPI programs; high-level template for the task-based parallel wavefront pattern; coordination mechanisms for selfish multi-organization scheduling; maximizing throughput of jobs with multiple resource requirements; scheduling diverse high performance computing systems with the goal of maximizing utilization; a dynamic scheduling framework for emerging heterogeneous systems; improving graph coloring on distributed-memory parallel computers; and multi-model prediction for enhancing content locality in elastic server infrastructures.
ISBN: 9780769539393 (print)
In many numerical applications arising from computational science and engineering problems, the solution of sparse linear systems is the most compute-intensive task. Consequently, the linear solvers need to be carefully chosen and efficiently implemented in order to harness the available computing resources. Krylov subspace based iterative solvers have been widely used for solving large systems of linear equations. In this paper, we focus on the design of such iterative solvers to take advantage of the massive parallelism of general purpose Graphics Processing Units (GPUs). We consider the Stabilized BiConjugate Gradient (BiCGStab) and Conjugate Gradient Squared (CGS) methods for the solution of sparse linear systems with unsymmetric coefficient matrices. We discuss data structures and the efficient implementation of these solvers on NVIDIA's CUDA platform. We evaluate the scalability and performance of our implementations in the context of a financial engineering problem: solving multidimensional option pricing PDEs using the sparse grid combination technique.
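For readers unfamiliar with BiCGStab, the following is a minimal CPU-only sketch of the textbook unpreconditioned iteration over a matrix in compressed sparse row (CSR) form. It is not the authors' CUDA implementation: the CSR layout, the function names (`csr_matvec`, `dot`, `bicgstab`), the tolerance, and the 3x3 test matrix are all assumptions chosen for illustration. On a GPU, the sparse matrix-vector product and the dot products are the kernels that would typically be parallelised.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bicgstab(values, col_idx, row_ptr, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCGStab starting from x = 0 (illustrative sketch)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # initial residual r0 = b - A*0
    r_hat = list(r)      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            break        # breakdown: give up and return the current iterate
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 / b_norm < tol:
            return [xi + alpha * pi for xi, pi in zip(x, p)]
        t = csr_matvec(values, col_idx, row_ptr, s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 / b_norm < tol:
            break
        rho = rho_new
    return x


# Hypothetical unsymmetric test system: A = [[4,1,0],[2,5,1],[0,1,3]],
# with right-hand side b = A @ [1, 2, 3].
values = [4.0, 1.0, 2.0, 5.0, 1.0, 1.0, 3.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
x = bicgstab(values, col_idx, row_ptr, [6.0, 15.0, 11.0])
```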
ISBN: 9780769539393 (print)
Efficient processing of similarity joins is important for a large class of data analysis and data-mining applications. This primitive finds all pairs of records within a predefined distance threshold of each other. However, most of the existing approaches have been based on spatial join techniques designed primarily for data in a vector space. Treating data collections as metric objects brings a great advantage in generality, because a single metric technique can be applied to many specific search problems quite different in nature. In this paper, we concentrate on a special form of join, the Self Similarity Join, which retrieves pairs from the same dataset. In particular, we consider the case in which the dataset is split into subsets that are searched for self similarity join independently (e.g., as in a distributed computing environment). To this end, we formalize the abstract concept of epsilon-Cover, prove its correctness, and demonstrate its effectiveness by applying it to two real implementations on a real-life large dataset.
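To make the primitive concrete, here is a brute-force sketch of a self similarity join over a generic metric. It only illustrates the definition (all pairs from one dataset within a distance threshold) and does not implement the epsilon-Cover partitioning, which the abstract does not specify. The function names, the choice of edit distance as the metric, and the sample words are assumptions for illustration.

```python
from itertools import combinations


def edit_distance(s, t):
    """Levenshtein distance, a metric that needs no vector-space embedding."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (cs != ct)))    # substitution
        prev = cur
    return prev[-1]


def self_similarity_join(records, distance, epsilon):
    """Return index pairs (i, j), i < j, whose records lie within epsilon."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(records), 2)
            if distance(a, b) <= epsilon]


words = ["kitten", "sitten", "sitting", "mitten", "banana"]
pairs = self_similarity_join(words, edit_distance, epsilon=1)
```

This baseline evaluates every pair, so its cost grows quadratically with the dataset size; that cost is what motivates splitting the data into independently searched subsets.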
ISBN: 9781450304320 (print)
Cluster computing, Cloud computing and GPU computing play overlapping and complementary roles in parallel processing of geospatial data within the general HPC framework. The rapidly increasing hardware capacities of modern personal computers equipped with chip multiprocessor CPUs and massively parallel GPUs have made high performance computing of large-scale geospatial data possible in a personal computing environment. We discuss the framework of Personal HPC-G and compare it with traditional Cluster computing and the newly emerging Cloud computing. We argue that Personal HPC-G possesses many favorable features: low initial and operational costs, good support for data management, and excellent support for both numeric modeling and interactive visualization. A case study on developing a parallel spatial statistics module for visual explorations on top of Personal HPC-G is subsequently presented. Copyright 2010 ACM.
A very ambitious objective in the field of policy-based systems is the provision of an intuitive and transparent way to specify, refine and enforce policies. This is one of the key enabling technologies for simplified security management of complex networked environments. Currently, security policies are enforced by configuring the end devices by means of low-level, device-specific parameters manually derived from high-level specifications. This process, defined as policy translation, is still performed without a holistic view of the overall security requirements. This paper presents the Network Contextualization Tool (NCTool), software that supports administrators in performing network-dependent activities when configuring security-enabled devices. The tool simplifies network administration tasks and reduces the effort and responsibilities of administrators, thus decreasing the risk of misconfiguration in complex networks.
ISBN: 9781605585871 (print)
As the computing ability of high performance computers is improved by increasing the number of computing elements, how to utilize the available computing resources becomes an important issue. Different strategies for solving a problem on a multi-processing system can yield markedly different performance. In this paper, we propose a method to predict the performance of parallel applications. The method describes the parallel features of multi-processing systems hierarchically and evaluates candidate solutions against that description. In this way, programmers can identify the better solution for an application before writing the actual program.
ISBN: 9781424435968 (print)
In artificial neural networks (ANNs) and fuzzy inference systems (FISs), hardware implementation is highly effective in improving real-time performance because it exploits their parallel processing structures. Thus, numerous hardware solutions for ANNs and FISs have been provided for time-critical control applications. The ink drop spread (IDS) method is a modeling technique that has been proposed as a new paradigm of soft computing. The structure of IDS models is similar to that of ANNs: they comprise distributed intermediate units referred to as IDS units. In this paper, the hardware design of the IDS unit is presented, and it is demonstrated that the hardware implementation is effective in enhancing the real-time performance of IDS modeling systems.
ISBN: 9781605585871 (print)
This paper presents the concept of pluggable parallelisation, which allows scientists to develop "sequential like" codes that can take advantage of multi-core, cluster and grid systems. In this approach, parallel applications are developed by plugging parallelisation patterns/idioms into scientific codes (e.g., "sequential like" codes), easing the move from sequential to parallel programming and promoting the separation between domain specific code and parallelisation issues. Pluggable parallelisation combines three characteristics: 1) parallelisation is performed from "outside to inside", localising parallelisation concerns into well defined modules, reducing changes required to the domain specific code and avoiding invasive parallelisation of base code; 2) control view is separated from data view, promoting a stronger separation of concerns, which improves reuse of parallelisation concerns across platforms and enables fine-grained refinements; and 3) abstractions can be composed, supporting the development of more complex patterns based on fine-grained features. The paper also shows how some well-known parallelisation strategies can be implemented in this approach. Results show that this is a feasible approach and that performance is competitive with traditional parallel programming. Copyright 2009 ACM.
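The idea of parallelising from "outside to inside" can be loosely illustrated as follows. This is only a sketch of the general separation-of-concerns idea, not the mechanism used in the paper, which the abstract does not describe. All names (`simulate_cell`, `run_simulation`, `thread_pool_map`) are hypothetical, and a thread pool stands in for whatever multi-core, cluster or grid back end would be plugged in.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_cell(x):
    """Domain specific code: contains no parallelisation logic at all."""
    return x * x + 1


def run_simulation(cells, apply=map):
    """'Sequential like' driver; the iteration strategy is the plug point."""
    return list(apply(simulate_cell, cells))


def thread_pool_map(workers):
    """Parallelisation module: builds a drop-in replacement for map."""
    def apply(fn, items):
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map preserves input order, so results match the
            # sequential run element for element.
            return list(pool.map(fn, items))
    return apply


sequential = run_simulation(range(8))
parallel = run_simulation(range(8), apply=thread_pool_map(workers=4))
```

The domain function and the driver are unchanged between the two runs; only the module passed in from outside differs, which is the property the abstract attributes to localising parallelisation concerns.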