One of the barriers that prevents the expansion and adoption of Grid technologies is the lack of a standard programming paradigm to port existing applications among different environments. The distributed Resource Man...
This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft-deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to ev...
ISBN: 0769521320 (print)
The exponentially increasing complexity of many scientific applications and the high cost of supercomputing force us to explore new, sustainable, and affordable high-performance computing platforms. Recent significant advances in FPGA technology and the inherent advantages of configurable logic have brought about new research efforts in the configurable computing field: parallel processing on configurable chips. We explore here parallel LU factorization of large sparse block-diagonal-bordered (BDB) matrices on a configurable multiprocessor that we have designed and implemented. A dynamic load balancing strategy is proposed and analyzed. Performance results for IEEE power test systems are provided. Our research provides evidence that configurable logic can be a viable alternative for high-performance scientific computing.
ISBN: 0769521150 (print)
A key challenge faced by large-scale, distributed applications in Grid environments is efficient, seamless data management. In particular, for applications that can benefit from access to data at variable granularities, data management can pose additional programming burdens to an application developer. This paper presents a case for the use of virtualized distributed file systems as a basis for data management for data-intensive, variable-granularity applications. The approach leverages on-demand transfer mechanisms of existing, de facto network file system clients and servers that support transfers of partial data sets in an application-transparent fashion, and complements them with user-level performance and functionality enhancements such as caching and encrypted communication channels. The paper uses a nascent application from the medical imaging field (Light Scattering Spectroscopy - LSS) as a motivation for the approach, and as a basis for evaluating its performance. Results from performance experiments that consider the 16-processor parallel execution of LSS analysis and database generation programs show that, in the presence of data locality, a virtualized wide-area distributed file system set up and configured by Grid middleware can achieve performance levels close (13% overhead or less) to that of a local disk, and superior (up to 680% speedup) to non-virtualized distributed file systems.
ISBN: 0769522564 (print)
This paper reports on the design, implementation and performance evaluation of a suite of GridRPC programming middleware called Ninf-G Version 2 (Ninf-G2). Ninf-G2 is a reference implementation of the GridRPC API, a proposed GGF standard. Ninf-G2 has been designed so that it provides 1) high performance in a large-scale computational Grid, 2) the rich functionalities which are required to adapt to and compensate for the heterogeneity and unreliability of a Grid environment, and 3) an API which supports easy development and execution of Grid applications. Ninf-G2 is implemented to work with basic Grid services, such as GSI, GRAM, and MDS in the Globus Toolkit version 2. The performance of Ninf-G2 was evaluated using a weather forecasting system which was developed using Ninf-G2. The experimental results indicate that high performance can be attained even in relatively fine-grained task-parallel applications on hundreds of processors in a Grid environment.
The application fields of bytecode virtual machines and VLIW processors overlap in the area of embedded and mobile systems, where the two technologies offer different benefits, namely high code portability, low power ...
The high price, long design and development cycles, programming difficulty and high maintenance cost of supercomputers limit their range of potential applications. Recent advances in Field-Programmable Gate Arrays (FPGAs) have made feasible the development of high-performance and programmable parallel systems on a programmable chip (PSOPC). PSOPCs yield high performance at low cost for many parallel applications. We present in this paper the design and implementation of our HERA (HEterogeneous Reconfigurable Architecture) machine that employs FPGAs to allow the simultaneous execution of a variety of parallel processing modes, including SIMD (Single-Instruction, Multiple-Data), MIMD (Multiple-Instruction, Multiple-Data) and M-SIMD (Multiple-SIMD). The processing element is centered on a single-precision IEEE 754 floating-point unit (FPU) and employs a 7-stage pipeline. To demonstrate the robustness and viability of our approach, we propose a data partitioning scheme and employ mixed-mode scheduling for Cannon's matrix-matrix multiplication algorithm with matrices of arbitrary size and shape. Performance results on our 64-PE machine that employs a dual-FPGA system are better than the optimized performance on a dual-Xeon PC.
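For readers unfamiliar with Cannon's algorithm, the following is a minimal sequential sketch of its data movement on a q x q grid of processing elements. It is not the HERA implementation or the paper's partitioning scheme: it assumes square matrices whose order is divisible by q, whereas the abstract targets arbitrary sizes and shapes. The function names (`cannon`, `split_blocks`, `block_matmul`, `add_into`) are hypothetical and chosen only for illustration.

```python
def block_matmul(x, y):
    """Dense product of two small blocks stored as lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def add_into(acc, x):
    """Accumulate block x into block acc in place."""
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            acc[i][j] += value


def split_blocks(m, q):
    """Cut an n x n matrix into a q x q grid of (n/q) x (n/q) blocks."""
    bs = len(m) // q
    return [[[row[j * bs:(j + 1) * bs] for row in m[i * bs:(i + 1) * bs]]
             for j in range(q)] for i in range(q)]


def cannon(a, b, q):
    """Simulate Cannon's algorithm; PE (i, j) owns one block of A, B and C.

    Assumption: a and b are square, of the same order n, with n % q == 0.
    """
    n = len(a)
    if n % q != 0:
        raise ValueError("matrix order must be divisible by the grid size q")
    bs = n // q
    A = split_blocks(a, q)
    B = split_blocks(b, q)

    # Initial skew: row i of A rotates left by i, column j of B rotates up by j.
    A = [[A[i][(j + i) % q] for j in range(q)] for i in range(q)]
    B = [[B[(i + j) % q][j] for j in range(q)] for i in range(q)]

    C = [[[[0] * bs for _ in range(bs)] for _ in range(q)] for _ in range(q)]
    for _ in range(q):
        # Every PE multiplies the two blocks it currently holds.
        for i in range(q):
            for j in range(q):
                add_into(C[i][j], block_matmul(A[i][j], B[i][j]))
        # Then A blocks rotate one step left and B blocks one step up.
        A = [[A[i][(j + 1) % q] for j in range(q)] for i in range(q)]
        B = [[B[(i + 1) % q][j] for j in range(q)] for i in range(q)]

    # Reassemble the block grid into a flat n x n result.
    return [[C[i // bs][j // bs][i % bs][j % bs] for j in range(n)]
            for i in range(n)]
```

Calling `cannon(a, b, 2)` on two 4 x 4 matrices should give the same result as an ordinary product. On real hardware the loops over `i` and `j` run concurrently, one iteration per PE, and the rotations become nearest-neighbour transfers.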
An ad hoc grid is a heterogeneous computing system composed of mobile devices. The problem studied here is to statically assign resources to the subtasks of an application, which has an execution time constraint, when...
ISBN: 0769521320 (print)
Advances in computational science are closely tied to developments in high-performance computing. We consider the case of shelf sea modelling where models have been growing in complexity and where model domains have been growing and grid resolutions shrinking in pace with the increasing storage capacity and computing power of high-end systems. Terascale systems are now readily available with performance levels measurable in TeraFlop/s and memories counted in TeraBytes. The scientific case is now being made for regional models at 1 km resolution, allowing the accurate representation of eddies, fronts and other regions containing steep gradients. The hydrodynamic model is increasingly being coupled with other models in multidisciplinary studies, e.g. ecosystem modelling and wave modelling. We show that the performance attainable from the POLCOMS hydrodynamic code is measurable at about 0.5 TeraFlop/s on an IBM p690 cluster with 1024 processors. The scalability on this system and others is excellent up to 1000 processors. We describe a wide range of optimisations which have together enabled this code to reach these performance levels.
Microelectronic wafer- and die-level testing has undergone significant changes in the past few years. This paper's first section describes today's leading-edge characteristics for numerous areas of this test technology, including the minimum I/O pad pitch, advances in contactor technologies, maximum number of I/Os probed, maximum number of die tested in parallel, the largest prober and substrates, and the maximum frequencies being tested at the wafer level. The second section discusses the leading-edge practices in three critical areas of wafer test: probe contactor cleaning, I/O pad damage minimization, and sorting good from bad die. The final section presents the communication methods between the design and the probe test organizations and some state-of-the-art examples of I/O pad designs.