ISBN: 0769516866 (print)
Contemporary computing systems, especially large-scale systems such as Grids, promise ultra-fast, ubiquitous utility computing that is always available at the flip of a switch. A major unresolved issue is the organization and efficient usage of such infrastructure in a commercial context where several entities compete for shared resources. This has long been resolved for conventional utility resources such as gas and electricity through commoditization, a variety of market designs, customization, and decision support for the resulting portfolios of assets and commitments. This paper reviews the state of Grid commercialization and compares it to the commercialization of conventional resources. We draw specific lessons for commercialized Grids and detail them as architecture requirements at each level of the architecture stack. We provide an example to illustrate the benefits of commercialized resources in terms of the financial clarity they bring to decisions for different user groups, namely application users and IT managers.
ISBN: 0769515797 (print)
The Internet is quickly evolving into a global computing platform. Internetworking of computers means that computing resources available for personal use need not be confined to the user's local environment. Instead, the Internet provides a delivery channel through which remote computing resources are easily accessible. Computer simulations often require high-performance computers (HPCs) for fast number crunching. However, HPCs remain inaccessible to ordinary users because of the high cost involved. The aim of this project is to develop a web-based simulation architecture in which users input a simulation model, based on the European Space Agency Simulation Language (ESL), through a Java graphical front-end application, and backend HPCs perform the intensive computation on behalf of the user. We have successfully implemented this system within an Intranet environment, with the possibility of actual implementation on the World Wide Web. In this report, we discuss the architectural merits of this framework, our design philosophy, and proposals for improvements and future development.
ISBN: 0769515258 (print)
Configurations of contemporary DRAM memory systems are becoming increasingly complex. A recent study [5] shows that application performance is highly sensitive to the choice of configuration, and suggests that tuning burst sizes and channel configurations is an effective way to optimize DRAM performance for a given memory-intensive workload. However, this approach is workload dependent. In this study we show that, by utilizing fine-grain priority access scheduling, we are able to find a workload-independent configuration that achieves optimal performance on a multi-channel memory system. Our approach makes good use of the high concurrency and high bandwidth available on such memory systems, and effectively reduces the memory stall time of memory-intensive applications. Conducting execution-driven simulation of a 4-way issue, 2 GHz processor, we show that the average performance improvement for fifteen memory-intensive SPEC2000 programs using an optimized fine-grain priority scheduling is about 13% and 8% for 2-channel and 4-channel Direct Rambus DRAM memory systems, respectively, compared with gang scheduling. Compared with burst scheduling, the average performance improvement is 16% and 14% for the 2-channel and 4-channel memory systems, respectively.
The BMI Eigenvalue Problem is an optimization problem whose goal is to minimize the greatest eigenvalue of a bilinear matrix function. This paper proposes a parallel algorithm to compute the ϵ-optimal solution of the BM...
ISBN: 0769516866 (print)
Performance models provide significant insight into the performance relationships between an application and the system used for execution. The major obstacle to developing performance models is the lack of knowledge about the performance relationships between the different functions that compose an application. This paper addresses the issue by using a coupling parameter, which quantifies the interaction between kernels, to develop performance predictions. The results, using three NAS Parallel Application Benchmarks, indicate that the predictions using the coupling parameter were greatly improved over the traditional technique of summing the execution times of the individual kernels in an application. In one case the coupling predictor had less than 1% relative error, in contrast to the summation methodology, which had over 20% relative error. Further, as the problem size and number of processors scale, the coupling values go through a finite number of major value changes that depends on the memory subsystem of the processor architecture.
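The contrast between summation and coupling-based prediction can be illustrated with a minimal sketch. The abstract does not define the coupling parameter, so the formulation below is an assumption: the coupling of two adjacent kernels is taken as the measured time of the pair run together divided by the sum of their isolated times, and each kernel's isolated time is weighted by the average coupling of the pairs it belongs to. All function names and timings are hypothetical and are not taken from the paper.

```python
def coupling(t_pair, t_a, t_b):
    """Assumed definition: time of two kernels run together over the sum of
    their isolated times. Values below 1 suggest constructive interaction
    (for example cache reuse); values above 1 suggest interference."""
    return t_pair / (t_a + t_b)


def summation_prediction(isolated):
    """Traditional estimate: add up the isolated kernel times."""
    return sum(isolated.values())


def coupling_prediction(isolated, pair_times):
    """Weight each kernel's isolated time by the mean coupling of every
    measured pair that contains it (an assumed weighting scheme)."""
    total = 0.0
    for kernel, t_kernel in isolated.items():
        values = [
            coupling(t_pair, isolated[a], isolated[b])
            for (a, b), t_pair in pair_times.items()
            if kernel in (a, b)
        ]
        weight = sum(values) / len(values) if values else 1.0
        total += weight * t_kernel
    return total


def relative_error(predicted, measured):
    """Relative error of a prediction against a measured execution time."""
    return abs(predicted - measured) / measured


# Hypothetical timings in seconds for three kernels executed in a loop.
isolated = {"A": 10.0, "B": 20.0, "C": 30.0}
pair_times = {("A", "B"): 27.0, ("B", "C"): 45.0, ("C", "A"): 36.0}
measured_total = 54.5  # hypothetical measured time of the full application

for name, predicted in [
    ("summation", summation_prediction(isolated)),
    ("coupling", coupling_prediction(isolated, pair_times)),
]:
    print(f"{name}: predicted {predicted:.2f} s, "
          f"relative error {relative_error(predicted, measured_total):.1%}")
```

With these made-up numbers every pair runs faster together than in isolation, so the summation estimate overshoots while the coupling-weighted estimate lands closer to the measured time. The actual weighting used in the paper may differ.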
ISBN: 0769515258 (print)
Speculative multithreading has recently been proposed to boost performance by exploiting thread-level parallelism in applications that are difficult to parallelize. The performance of these processors depends heavily on the partitioning policy used to split the program into threads. Previous work uses heuristics to spawn speculative threads based on easily detectable program constructs such as loops or subroutines. In this work we propose a profile-based mechanism to divide programs into threads by searching for those parts of the code that have certain features that could benefit from potential thread-level parallelism. Our profile-based spawning scheme is evaluated on a Clustered Speculative Multithreaded Processor, and results show large performance benefits. When the proposed spawning scheme is compared with traditional heuristics, it outperforms them by almost 20%. When a realistic value predictor and an 8-cycle thread initialization penalty are considered, the performance difference between them is maintained. The speed-up over single-thread execution is higher than 5x for a 16-thread-unit processor and close to 2x for a 4-thread-unit processor.
This paper presents the architecture, design and test of a unified arithmetic processor, developed based on recently proposed partial product bit-matrix decomposition and dynamic reconfiguration parallel processing me...
FPGAs allow the implementation of very complex designs (∼1 million gates); they are good candidates to host special-purpose systems designed to boost conventional computing architectures. Several computationally int...
ISBN: 0769516866 (print)
We present NeST, a flexible software-only storage appliance designed to meet the storage needs of the Grid. NeST has three key features that make it well suited for deployment in a Grid environment. First, NeST provides a generic data transfer architecture that supports multiple data transfer protocols (including GridFTP and NFS) and allows for the easy addition of new protocols. Second, NeST is dynamic, adapting itself on the fly so that it runs effectively on a wide range of hardware and software platforms. Third, NeST is Grid-aware, meaning that features necessary for integration into the Grid, such as storage space guarantees, mechanisms for resource and data discovery, user authentication, and quality of service, are part of the NeST infrastructure.
In this paper we present new approaches to high-performance protein database scanning on two novel massively parallel architectures to gain supercomputer power at low cost. The first architecture is built around a Beo...