ISBN: 9780889868113 (print)
This work presents a parallel implementation of the implicitly restarted Arnoldi/Lanczos method for the solution of eigenproblems approximated by the finite element method. The implicitly restarted Arnoldi/Lanczos method uses a restart scheme to improve the convergence of the desired portion of the spectrum while maintaining the orthogonality of the Krylov basis. The implementation is suitable for distributed-memory architectures, especially PC clusters. The parallel solution follows a subdomain-by-subdomain approach and uses both overlapping and non-overlapping mesh partitions. Compressed data structures in the CSRC and CSRC/CSR formats store the coefficients of the global matrices. The parallelization of the numerical linear algebra operations present in both the Krylov and the implicitly restarted methods is discussed. A numerical example illustrates the efficiency and applicability of the proposed algorithms.
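As a rough illustration of the compressed storage mentioned in this abstract, the following minimal sketch computes a matrix-vector product over the plain CSR (compressed sparse row) layout, the kind of kernel a Krylov iteration applies repeatedly. It does not reproduce the CSRC variant or the subdomain partitioning of the work; the function name `csr_matvec` and the sample matrix are assumptions made for illustration only.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero coefficients, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix (hypothetical data):
# [[4, 1, 0],
#  [1, 3, 0],
#  [0, 0, 2]]
values = [4.0, 1.0, 1.0, 3.0, 2.0]
col_idx = [0, 1, 0, 1, 2]
row_ptr = [0, 2, 4, 5]
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

In a distributed-memory setting each process would typically hold only the rows of its own subdomain and exchange boundary entries of `x` before the product; that communication step is omitted here.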
ISBN: 9780889867741 (print)
The computational needs of scientific applications have grown to levels that require computers with a very high degree of parallelism. The IBM Blue Gene/L can hold in excess of 200K processors and has been designed for high performance. However, failures in such a large system are a major concern, since it has been demonstrated that a failure drastically decreases the performance of the system. Checkpointing and logging schemes have been used to overcome these failures, but they have been shown to be less effective than desired. Proactive failure detection and prediction has therefore gained interest in the research community. In this study, we collected the RAS event and job logs from a large IBM Blue Gene/L over a three-month period and investigated the relationship among fatal and non-fatal events with the aim of proactive failure prediction. Based on our observations, we developed a scheme that predicts fatal events from the spatial and temporal relation among fatal and non-fatal events. We show that with our scheme up to 84% of fatal events could be effectively predicted.
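The abstract does not spell out the prediction rule, so the following is only a hedged sketch of one way spatial and temporal relations between events could be turned into a warning: raise a flag when several non-fatal events hit the same location within a sliding time window. The function name `predict_fatal`, the severity labels, the window, and the threshold are all hypothetical and are not taken from the study.

```python
def predict_fatal(events, window, threshold):
    """Return (time, location) warnings raised by bursts of non-fatal events.

    events    : list of (timestamp, location, severity), sorted by timestamp
    window    : length of the sliding time window (assumed parameter)
    threshold : number of non-fatal events at one location that triggers
                a warning (assumed parameter)
    """
    recent = {}    # location -> timestamps of recent non-fatal events
    warnings = []
    for ts, loc, severity in events:
        if severity == "FATAL":
            continue
        times = recent.setdefault(loc, [])
        times.append(ts)
        # Drop events that have fallen out of the sliding window.
        while times and ts - times[0] > window:
            times.pop(0)
        if len(times) >= threshold:
            warnings.append((ts, loc))
            times.clear()
    return warnings


# Hypothetical log excerpt: three non-fatal events on R0 precede a fatal one.
log = [
    (0, "R0", "WARNING"),
    (5, "R0", "ERROR"),
    (8, "R1", "WARNING"),
    (9, "R0", "WARNING"),
    (12, "R0", "FATAL"),
]
alerts = predict_fatal(log, window=10, threshold=3)
```

Here the warning at time 9 on `R0` precedes the fatal event at time 12, which is the kind of lead time a proactive scheme would aim for.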
ISBN: 9780889866379 (print)
Operating system virtualization has recently become a popular technique for achieving better resource utilization in so-called "server farm" environments. This technique provides a virtual hardware interface on top of which one can run multiple instances of popular operating systems. The Xen Virtual Machine Monitor is an implementation of operating system virtualization that supports live migration, the transfer of a virtual operating system from one physical machine to another with minimal downtime. We have used this capability to implement a monitoring and dynamic reconfiguration daemon that attempts to equalize the load on all host nodes in a group of machines running Xen. We have also implemented a simulator for testing balancing algorithms. Experiments using these tools have provided insight into the redistribution of virtualized operating systems and how it differs from the more thoroughly studied problem of process-level load balancing.
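To make the idea of equalizing load through migration concrete, here is a minimal sketch of a single balancing decision: pick one virtual machine to move from the busiest host to the idlest one so that the gap between them shrinks the most. It is not the daemon described in the abstract and calls no Xen API; `plan_migration`, the host names, and the integer load figures are assumptions for illustration.

```python
def plan_migration(hosts):
    """Pick one VM move from the busiest host to the idlest host.

    hosts maps a host name to a dict of {vm_name: load}.
    Returns (vm, source, target), or None if no move narrows the gap.
    """
    totals = {h: sum(vms.values()) for h, vms in hosts.items()}
    source = max(totals, key=totals.get)
    target = min(totals, key=totals.get)
    gap = totals[source] - totals[target]
    # Moving a VM of load L changes the gap to |gap - 2L|,
    # so only VMs with 0 < L < gap reduce the imbalance.
    candidates = [
        (abs(gap - 2 * load), vm)
        for vm, load in hosts[source].items()
        if 0 < load < gap
    ]
    if not candidates:
        return None
    _, vm = min(candidates)
    return vm, source, target


# Hypothetical cluster state, loads in percent of one CPU.
cluster = {"a": {"vm1": 50, "vm2": 30}, "b": {"vm3": 10}}
move = plan_migration(cluster)
```

Unlike process-level balancing, each move here relocates a whole operating system image, so a real balancer would also weigh the migration cost before acting; that cost is ignored in this sketch.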
ISBN: 9780889866379 (print)
Passive testing is a technique suitable for continuous, non-intrusive, autonomous testing of qualitative behavioural properties (correctness) of a deployed distributed system. A real passive tester has to face observational uncertainty, which may lead to false verdicts. We propose the novel idea of a self-tuned passive tester that is able to adapt to a priori unknown and possibly changing delays in communication channels. We describe the structure and algorithms of a passive tester that "tunes itself" based solely on its own, locally issued verdicts. This seemingly counter-intuitive principle is shown by simulation to be viable and effective.
ISBN: 9780889866386 (print)
Dynamic allocation of server resources to services on networks, or utility computing, is a powerful technology for providing the required computer resources to multiple service providers at low cost. Virtual machine (VM) technology can be combined with utility computing to further improve server resource utilization. An important technical issue in VM-based utility computing is the optimal placement of VMs on physical server nodes, because the performance of services may be seriously affected by VM placement. We address this issue by introducing an on-line placement algorithm based on the performance influence among services. We have implemented this algorithm and evaluated it in a simulated environment. The results show that the proposed mechanism achieves a roughly 25% better score, as calculated by the placement rules, than a random placement algorithm.
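The abstract does not give the placement rules, so the sketch below only illustrates the general shape of an on-line placement driven by mutual performance influence: each arriving VM goes to the node where its summed interference with resident services is lowest. The function `place_vm`, the uniform per-node capacity, and the penalty table are hypothetical choices, not the authors' algorithm.

```python
def place_vm(service, nodes, capacity, interference):
    """Place `service` on the node where it interferes least.

    nodes        : dict node -> list of services already placed there
    capacity     : maximum number of VMs per node (assumed uniform)
    interference : dict frozenset({a, b}) -> penalty; missing pairs count as 0
    Returns the chosen node, or None if every node is full.
    """
    best_node, best_cost = None, None
    for node, residents in nodes.items():
        if len(residents) >= capacity:
            continue
        cost = sum(
            interference.get(frozenset((service, r)), 0) for r in residents
        )
        if best_cost is None or cost < best_cost:
            best_node, best_cost = node, cost
    if best_node is not None:
        nodes[best_node].append(service)
    return best_node


# Hypothetical penalties: a web VM disturbs a database far more than another web VM.
penalties = {frozenset(("web", "db")): 5, frozenset(("web",)): 1}
farm = {"n1": ["db"], "n2": ["web"]}
first = place_vm("web", farm, capacity=2, interference=penalties)
second = place_vm("web", farm, capacity=2, interference=penalties)
```

The first web VM lands next to the existing web service; the second is forced onto the database node because the preferred node is full, which shows how capacity limits and interference interact.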
ISBN: 9780889866386 (print)
Complex system design and analysis requires simulations with different views or domains. Multi-domain simulation models are typically a collection of simulation models integrated through various means. These concurrent models may be closely integrated, distributed as a parallel-distributed simulation, or a mix of the two. The performance of a coupled simulation between multiple domain models depends on various factors, including the integration backplane architecture, messaging schemes, sharing scheme, level of coupling, and fidelity of the domain models. Framework for Optimizing Multidomain Simulation (MSOMS) is a framework that can be used to estimate the performance and effectiveness of integration methods under various scenarios. The framework is referred to as a meta-simulation, since it is a "simulation of a simulation". This paper presents the formulation and demonstration of MSOMS. As a case study, we compare two possible implementations of a multi-domain model using MSOMS.
ISBN: 9780889866379 (print)
In this study we evaluate and compare the performance of our load balancing technique on irregular P2P systems embedded in two regular topologies: 1) Hypercube and 2) TreeP. Hypercube is one of the most studied interconnection topologies and exhibits powerful interconnection features, while TreeP is a tree-based P2P network architecture built on a tessellation of a 1-D space. The load balancing technique is a two-step strategy. In the first step, it maps any irregular network topology to a regular one. In the second step, the load is balanced among the nodes using the PSLB algorithm. The experimental results show that this strategy is efficient and does not introduce considerable overhead.
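For readers unfamiliar with the hypercube topology, the following sketch shows how neighbors are derived by flipping one bit of a node label, and how load can be balanced by averaging along one dimension at a time. This is the classic dimension-exchange scheme, used here as a stand-in: it is not the PSLB algorithm named in the abstract, and the function names are assumptions.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a dim-dimensional hypercube differ in one bit."""
    return [node ^ (1 << d) for d in range(dim)]


def dimension_exchange(loads):
    """Balance loads on a hypercube by pairwise averaging per dimension.

    loads[i] is the load of node i; len(loads) must be a power of two.
    Returns a new list of loads after one sweep over all dimensions.
    """
    n = len(loads)
    dim = n.bit_length() - 1
    if n == 0 or (1 << dim) != n:
        raise ValueError("number of nodes must be a power of two")
    loads = list(loads)
    for d in range(dim):
        for node in range(n):
            partner = node ^ (1 << d)
            if node < partner:  # handle each pair once
                avg = (loads[node] + loads[partner]) / 2
                loads[node] = loads[partner] = avg
    return loads


neighbors_of_5 = hypercube_neighbors(5, 3)
balanced = dimension_exchange([8, 0, 0, 0])
```

With arbitrarily divisible load, one sweep over all dimensions leaves every node with the global average, which is one reason the hypercube is an attractive target for the mapping step.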
ISBN: 9780889869073 (print)
In this paper, we present a new method for quantifying color information so as to detect edges in color images. Our method uses the volume of a pixel in the HSI color space, combined with noise reduction, thresholding, and edge thinning. We implement our algorithm using the NVIDIA Compute Unified Device Architecture (CUDA) for direct execution on Graphics Processing Units (GPUs). Our experimental results show that, compared to traditional edge detection methods, our method improves the accuracy of edge detection and withstands greater levels of noise in images, and that our GPU implementation achieves speedups over related CUDA implementations.
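The method operates in the HSI color space, so a conversion from RGB is a natural first step. The sketch below uses the standard geometric RGB-to-HSI formulas; it does not implement the pixel "volume" measure or the CUDA kernels of the paper, and the function name `rgb_to_hsi` and the convention of reporting hue 0 for grey pixels are assumptions.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert RGB components in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # black: hue and saturation undefined, use 0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0  # grey pixel: hue undefined, 0 by convention (assumption)
    else:
        # Clamp to guard against rounding slightly outside [-1, 1].
        ratio = max(-1.0, min(1.0, num / den))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity


red = rgb_to_hsi(1.0, 0.0, 0.0)
green = rgb_to_hsi(0.0, 1.0, 0.0)
blue = rgb_to_hsi(0.0, 0.0, 1.0)
```

Because each pixel is converted independently, this step maps naturally onto one GPU thread per pixel.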
ISBN: 9780889866386 (print)
In a grid computing environment, dynamicity and geographically distributed sites make task scheduling problems challenging to solve. It is hard for a local site to obtain precise real-time information about other sites, given that site-specific information such as load and computing resources may change rapidly. Moreover, in a data grid environment, large-scale data-intensive applications make scheduling even more challenging, since both computational and data storage resources must be taken into consideration. In this paper we propose an innovative peer-to-peer scheduler to solve these problems. The scheduler is distributed and scalable. We used simulation to evaluate its performance under different circumstances, such as different numbers of hops used to search for suitable sites and different numbers of incoming tasks. Results show that our scheduler can successfully schedule around 75% of incoming tasks within their deadlines on average. For computation-intensive tasks, it can successfully schedule more than 90% of incoming tasks.
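The hop-limited search for a suitable site can be pictured as a bounded breadth-first search over the peer graph. The sketch below is an assumption about how such a search might look, not the scheduler evaluated in the paper: `find_site`, the peer graph, and the per-site finish-time estimates are all hypothetical.

```python
from collections import deque


def find_site(start, neighbors, finish_time, deadline, max_hops):
    """Search outward from `start`, at most `max_hops` away, for a site
    whose estimated finish time meets the deadline.

    neighbors   : dict site -> list of directly connected peer sites
    finish_time : dict site -> estimated completion time of the task there
    Returns the first suitable site in breadth-first order, or None.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        site, hops = queue.popleft()
        if finish_time[site] <= deadline:
            return site
        if hops < max_hops:
            for peer in neighbors.get(site, []):
                if peer not in seen:
                    seen.add(peer)
                    queue.append((peer, hops + 1))
    return None


# Hypothetical three-site chain A - B - C with estimated finish times.
peers = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
estimates = {"A": 20, "B": 15, "C": 8}
near = find_site("A", peers, estimates, deadline=10, max_hops=1)
far = find_site("A", peers, estimates, deadline=10, max_hops=2)
```

Raising the hop limit widens the set of candidate sites at the price of more messages, which is the trade-off the simulation varies.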
ISBN: 9780889866386 (print)
A hardware/software co-reconfiguration technique is introduced to design a programmable instruction decoder for heterogeneous multiprocessor systems that do not employ programmable gate arrays. The technique consists of an off-chip static reconfiguration of heterogeneous multiple target instruction sets, followed by an on-chip dynamic reconfiguration of binary source instructions. It does not require modifying existing compilers to retarget them to new processors, nor does it require redesigning the processors to add new or extended instructions. The technique allows software developers to swiftly and accurately retarget their heterogeneous multiprocessor systems. To present the reconfiguration procedures and performance evaluations of the technique, a smart instruction decoder for Texas Instruments' TMS320C55 digital signal processors and ARM's ARM11 embedded processors was implemented and optimized.