ISBN: 0769523129 (print)
Parallel random access memory, or PRAM, is a now-venerable model of parallel computation that still retains its usefulness for the design and analysis of parallel algorithms. Parallel computational models proposed after PRAM address shortcomings of PRAM in terms of modeling realism of actual machines. In this work, we propose a multiple instruction stream partitioned PRAM, or "stream PRAM." This model embodies the reality of a small number of parallel processors, each with local memory (which could also be small), where a problem is generally evenly distributed among all processing elements. Actual hardware configurations limit the number of shared memories that can be efficiently implemented. By allowing each shared memory to also act as an independent instruction stream, more functionality is possible at a small extra cost. The additional instruction streams provide limited asynchronous abilities and offer the flexibility of a reconfigurable network, as well as allowing the processing elements to perform independent actions. Because the proposed stream PRAM allows variable processor, memory, and problem sizes, it is valuable for present as well as future parallelism.
ISBN: 0769523129 (print)
The persistent mood of exhilaration in the research community over exponential increases in the capacity of computational resources has been tempered recently by the realization that a torrential influx of data from instruments, sensors, and simulations is growing even faster than the resources needed to analyze it. The impact of this "data deluge," challenging enough by itself, is exacerbated by the fact that many data-intensive projects today involve teams of collaborators spread out across geographically and organizationally distinct sites. A system that addresses these conditions must enable a community of collaborators, distributed throughout the wide area, to get responsive answers to dynamic queries and analyses applied to terascale or larger data sets. In this paper we describe the NetSolve/D system architecture, which is designed to achieve this goal of data-intensive on-line computing.
ISBN: 1424403073 (print)
Most parallel computing resources are controlled by batch schedulers that place requests for computation in a queue until access to compute nodes is granted. Queue waiting times are notoriously hard to predict, making it difficult for users not only to estimate when their applications may start, but also to pick, among multiple batch-scheduled resources, the one that will produce the shortest turnaround time. As a result, an increasing number of users resort to "redundant requests": several requests are simultaneously submitted to multiple batch schedulers on behalf of a single job; once one of these requests is granted access to compute nodes, the others are canceled. Using simulation as well as experiments with a production batch scheduler, we investigate whether redundant requests are harmful in terms of (i) schedule performance and fairness, (ii) system load, and (iii) system predictability. We find that the two main issues with redundant requests are load on the middleware and unfairness towards users who do not use redundant requests, both of which depend on the number of users who use redundant requests and on the amount of request redundancy these users employ.
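The redundant-request mechanism described in this abstract can be illustrated with a minimal sketch. It assumes the queue waiting time of each scheduler is already known, which is exactly what real users cannot know; the function names and the result layout are hypothetical and are not taken from the paper.

```python
def first_granted(queue_waits):
    """Return (index, wait) of the request that would be granted first.

    queue_waits[i] is the (assumed known) waiting time in scheduler i's queue.
    """
    if not queue_waits:
        raise ValueError("at least one request is required")
    winner = min(range(len(queue_waits)), key=lambda i: queue_waits[i])
    return winner, queue_waits[winner]


def submit_redundant(job_id, queue_waits):
    """Model one job submitted redundantly to several batch schedulers.

    The request that starts first runs the job; every other request is
    canceled, which is the extra middleware load the abstract refers to.
    """
    winner, wait = first_granted(queue_waits)
    canceled = [i for i in range(len(queue_waits)) if i != winner]
    return {"job": job_id, "runs_on": winner, "wait": wait, "canceled": canceled}


if __name__ == "__main__":
    print(submit_redundant("job-1", [30.0, 5.0, 12.0]))
```

In this toy model the number of canceled requests grows with the redundancy level, which hints at why the load and fairness effects reported above depend on how many requests each user submits.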
ISBN: 0769523129 (print)
Xilinx Virtex FPGAs offer the possibility of dynamic and partial run-time reconfiguration. When designing a system that includes this feature, it must be ensured that no signal lines cross the border to other reconfigurable regions. The complex modular design flow for generating partial bitstreams and the need for macros to physically interconnect IP cores make it necessary to investigate alternatives. This paper describes the design and implementation of a software reconfigurable multiprocessor system based on Xilinx MicroBlaze soft-core processors. A real application in the automotive domain, implemented on a Xilinx Virtex-II 3000 FPGA, is used to present results.
ISBN: 0769523129 (print)
One of the most important issues in Grid computing is to provide a secure environment that allows administrators to contribute their resources and users to utilize them. Currently, diverse methods are required to obtain certificates for the different Grids. In this paper we showcase a prototype of a tool that simplifies the tasks associated with maintaining a Grid certificate authority and simplifies the application process for users who interact with multiple certificate authorities.
ISBN: 0769523129 (print)
Harness is a pluggable heterogeneous distributed Virtual Machine (DVM) environment for parallel and distributed scientific computing. This paper describes recent improvements in the Harness kernel design. By using a lightweight approach and moving previously integrated system services into software modules, the software becomes more versatile and adaptable. This paper outlines these changes and explains the major Harness kernel components in more detail. A short overview is given of ongoing efforts in integrating RMIX, a dynamic heterogeneous reconfigurable communication framework, into the Harness environment as a new plug-in software module. We describe the overall impact of these changes and how they relate to other ongoing work.
ISBN: 0769523129 (print)
Our problem concerns the routing of a vehicle with pickup and delivery of products under time window constraints. The problem must be solved for medium-scale instances (nodes ≥ 100). A strong active time window exists (≥ 90%) with a large amplitude factor (≥ 75%). The problem is NP-hard, and for this reason the application of an exact method is limited by computational time. This paper proposes a specialized genetic algorithm. We report good solutions in computational times below 5 minutes. This finding allows its application in business environments where decision time is critical.
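The constraints named in this abstract (pickup before delivery, and service within a time window) can be made concrete with a small feasibility check, the kind of test a genetic algorithm might apply to each candidate route. This is only a sketch under stated assumptions: the vehicle may wait when it arrives early, service time is zero, and every name here (`route_is_feasible`, `windows`, `pairs`, `travel_time`) is hypothetical rather than taken from the paper.

```python
def route_is_feasible(route, travel_time, windows, pairs, start_time=0.0):
    """Check one single-vehicle route against precedence and time windows.

    route       -- list of node ids in visiting order
    travel_time -- function (a, b) -> travel time from node a to node b
    windows     -- dict node -> (earliest, latest) allowed arrival time
    pairs       -- list of (pickup, delivery) node pairs, all present in route
    """
    position = {node: i for i, node in enumerate(route)}

    # Precedence: every pickup must be visited before its delivery.
    for pickup, delivery in pairs:
        if position[pickup] > position[delivery]:
            return False

    # Time windows: arriving late is infeasible; arriving early means waiting
    # (an assumption of this sketch).
    clock = start_time
    previous = None
    for node in route:
        if previous is not None:
            clock += travel_time(previous, node)
        earliest, latest = windows[node]
        if clock > latest:
            return False
        clock = max(clock, earliest)
        previous = node
    return True


if __name__ == "__main__":
    windows = {"p": (0, 10), "d": (5, 20)}
    print(route_is_feasible(["p", "d"], lambda a, b: 3, windows, [("p", "d")]))
```

Whether infeasible routes are rejected, repaired, or penalized is a design choice of the particular genetic algorithm; the abstract does not say which option the authors take.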
ISBN: 0769523129 (print)
The Chained Lin-Kernighan algorithm (CLK) is one of the best heuristics for solving Traveling Salesman Problems (TSP). In this paper a distributed algorithm is proposed, where nodes in a network locally optimize TSP instances by using the CLK algorithm. We show that the distributed variant finds better tours than the original CLK given the same total amount of computation time. Hence, the cooperation of the processes in the distributed algorithm increases the effectiveness of the approach beyond the maximally achievable reduction in computation time due to parallelization. For example, for TSP instance fl3795, the original CLK got stuck in local optima in each of 10 runs, whereas the distributed algorithm found optimal tours in each run, requiring less than 10 CPU minutes per node on average in an 8-node setup.
ISBN: 0769523129 (print)
The Protein Folding Problem studies the way in which a protein, a chain of amino acids, will "fold" into its natural state. Predicting the way in which various proteins fold can be fundamental in developing treatments for diseases such as Alzheimer's and cystic fibrosis. Classical solutions for calculating the final conformation of a protein structure are resource-intensive. The Hydrophobic-Hydrophilic (HP) method is one way of simplifying the problem. We introduce a novel method of solving the HP protein folding problem in both two and three dimensions using Ant Colony Optimization and a distributed programming paradigm. Tests across a small number of processors indicate that the multiple colony distributed ACO (MACO) approach is scalable and outperforms single colony implementations.
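To show what the HP simplification optimizes, the sketch below scores a fold on a 2D square lattice. It assumes the usual HP lattice convention, which the abstract does not spell out: each pair of hydrophobic (H) residues that are lattice neighbors but not consecutive in the chain contributes -1 to the energy. The function name and input layout are hypothetical, and the ant colony search itself is not shown.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D lattice fold under the assumed HP convention.

    sequence -- string over {"H", "P"}, one letter per residue
    coords   -- list of (x, y) integer lattice positions, one per residue
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighboring pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


if __name__ == "__main__":
    # A four-residue chain folded into a unit square: the two H ends touch.
    print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))
```

A search method such as ACO would look for the coordinates that minimize this value; extending the sketch to three dimensions only requires adding a third coordinate and one more neighbor direction.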
ISBN: 0769523129 (print)
This paper introduces Peer-to-Peer Distributed Computing (P2PDisCo) software, which provides an interface for distributing the computation of Java programs to multiple workstations. P2PDisCo can be used to distribute any Java program that uses files for storing input and output parameters, without significant code modifications to the Java program itself. P2PDisCo has been built on top of the Chedar peer-to-peer middleware and is currently being used to speed up the training of neural networks with an evolutionary algorithm.