ISBN: 0780390741 (print)
One of the challenges in high-performance computing is to provide users with reliable, remote data access in a distributed, heterogeneous environment. The increasing popularity of high-speed wide area networks and centralized data repositories leads to the possibility of direct high-speed access to remote data sets from within a parallel application. In this paper we describe SEMPLAR, a library for remote, parallel I/O that combines the standard programming interface of MPI-IO with the remote storage functionality of the SDSC Storage Resource Broker (SRB). SEMPLAR relies on parallel TCP streams to maximize the remote data throughput in a design that preserves the parallelism of the access all the way from the storage to the application. We provide I/O performance results for a high-performance computing workload on three different clusters. On the NCSA TeraGrid cluster, the ROMIO perf benchmark attained an aggregate read bandwidth of 291 Mbps with 18 processors. The NAS btio benchmark achieved an aggregate write bandwidth of 74 Mbps with 16 processors. The benchmark results are encouraging and show that SEMPLAR provides applications with scalable, high-bandwidth I/O across wide area networks.
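As a rough illustration of how a remote file might be divided so that each process moves its own portion over its own TCP stream, the sketch below splits a file into contiguous byte ranges, one per stream, and computes an aggregate bandwidth figure. The names `stream_ranges` and `aggregate_mbps` and the even-split policy are assumptions made for illustration; the abstract does not say how SEMPLAR divides data, and this is not the MPI-IO or SRB API.

```python
def stream_ranges(file_size: int, num_streams: int) -> list[tuple[int, int]]:
    """Split [0, file_size) into num_streams contiguous (offset, length) ranges.

    Hypothetical helper: an even split is assumed, with any remainder
    bytes spread over the first few streams.
    """
    if file_size < 0 or num_streams <= 0:
        raise ValueError("file_size must be >= 0 and num_streams > 0")
    base, extra = divmod(file_size, num_streams)
    ranges = []
    offset = 0
    for rank in range(num_streams):
        # The first `extra` streams each carry one additional byte.
        length = base + (1 if rank < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def aggregate_mbps(total_bytes: int, seconds: float) -> float:
    """Aggregate bandwidth in megabits per second (1 Mb = 10**6 bits)."""
    return total_bytes * 8 / seconds / 1e6
```

Each stream would then read or write only its own `(offset, length)` range, so the transfers can proceed independently of one another.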
ISBN: 0769523129 (print)
Disk power consumption is becoming an increasingly important issue in high-end servers that execute large-scale data-intensive applications. In particular, array-based scientific codes can spend a significant portion of their power budget on the disk subsystem. Observing this, prior research has proposed several strategies to reduce the power consumption of the disk subsystem, such as spinning disks down to low-power modes or lowering their rotational speed (RPM). A common characteristic of most of these techniques is that they are reactive, in the sense that they make their decisions based on the disk access patterns observed during execution. While such techniques are certainly useful, and the published studies reveal that they can be very effective in some cases, one can conceivably achieve better results by adopting a proactive scheme. Focusing on array-intensive scientific applications, this paper makes two important contributions. First, it presents a compiler-driven proactive approach to disk power management. In this approach, the compiler analyzes the application code and extracts the disk access pattern. It then uses this information to insert explicit disk power management calls in the appropriate places in the code. It also preactivates a disk that has been placed in a low-power mode before the disk is actually needed, eliminating the potential performance impact of disk power management. The second contribution of this paper is a code transformation approach that can be used to increase the savings coming from a disk power management scheme (whether reactive or proactive). Our experimental results with several scientific application codes show that both the proactive disk power management approach and the disk-layout-aware code transformations are beneficial from both power consumption and execution time perspectives.
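The proactive idea described above can be sketched as follows, assuming the access times of one disk are known ahead of execution. The function `plan_power_calls`, the `break_even` threshold, and the `spin_up_latency` parameter are hypothetical names chosen for this illustration; the abstract does not give the compiler's actual placement rule.

```python
def plan_power_calls(access_times, break_even, spin_up_latency):
    """Return (time, call) pairs for one disk, given its known access times.

    Assumed rule (not taken from the paper): spin the disk down right
    after an access if the idle gap that follows is longer than both the
    break-even time and the spin-up latency, and schedule the spin-up
    early enough that the disk is ready when the next access arrives.
    """
    calls = []
    for current, nxt in zip(access_times, access_times[1:]):
        gap = nxt - current
        if gap > break_even and gap > spin_up_latency:
            calls.append((current, "spin_down"))
            # Preactivate: start spinning up before the disk is needed.
            calls.append((nxt - spin_up_latency, "spin_up"))
    return calls
```

A reactive scheme would only notice the long idle gap after it had started; here the gap is known in advance, so both calls can be placed ahead of time.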
In this study, we have successfully developed a grid-enabled software distributed shared memory called Teamster-G. This system provides users with not only a shared memory programming interface but also a transparent ...
The computational cost of training Artificial Neural Network (ANN) algorithms limits the use of large systems capable of processing complex problems. Implementing ANNs on a parallel or distributed platform to improve ...
ISBN: 0769523129 (print)
The simulations used in the field of high energy physics are compute-intensive and exhibit a high level of data parallelism. These features make such simulations ideal candidates for Grid computing. We take as an example the GEANT4 detector simulation used for physics studies within the ATLAS experiment at CERN. One key issue in Grid computing is that of network and system security, which can potentially inhibit the widespread use of such simulations. Virtualization provides a feasible solution because it allows the creation of virtual compute nodes in both local and remote compute clusters, thus providing an insulating layer that can play an important role in satisfying the security concerns of all parties involved. However, it has performance implications. This study provides quantitative estimates of the virtualization and hyper-threading overhead for GEANT on commodity clusters. Results show that virtualization has less than 15% run-time overhead, and that the best run time (with the non-SMP licence of ESX VMware) is achieved by using one virtual machine per CPU. We also observe that hyper-threading does not provide an advantage in this application. Finally, the effect of virtualization on run time, throughput, mean response time and utilization is estimated using simulations.
HIPS-HPGC 2005 is a full-day workshop focusing on high-performance grid computing and high-level parallel programming models. The papers deal with component models and service-based systems for grids, with an emphasis on experience with existing systems. The papers also report on the state of the art in grid applications, for both academic and industrial problems.
The low cost and availability of networks of workstations have made them an attractive solution for high-performance computing. Striking progress in network technology is enabling high-performance global computing, with t...
The SAPIENT parallel analysis framework facilitates the efficient transformation of sequential applications into multilevel parallel applications that can be executed on polymorphic chip multiprocessor architectures. We demonstrate how application characteristics are used to detect thread- and data-level parallelism in sequential applications and to estimate parallel performance. We further demonstrate how SAPIENT determines the combination of application parallelism and polymorphic architecture configuration that maximizes performance. As an example, we present a detailed analysis of parallelism for an MPEG-2 decoder. We further summarize results for six other multimedia applications, identifying the presence of data- and thread-level parallelism, evaluating performance, and suggesting architecture configurations for each.
A programmable Java distributed system, which utilises the free resources of a heterogeneous set of computers linked together by a network, has been developed. The system has been successfully deployed on over 200 computers, which were distributed over a number of locations, and has been successfully used to process bioinformatics, biomedical engineering, and cryptography applications. We present two bioinformatics applications: DSEARCH, which performs sensitive database searching, and DPRml, which performs distributed phylogeny reconstruction by maximum likelihood.
We are interested in discovering the intrinsic dynamics of parallel applications, which are independent of the runtime environment, to aid in the development of appropriate tuning policies, especially dynamic load balancing policies. Based on the novel idea of profiling mesh-based applications at the fine granularity of individual mesh elements, this paper proposes a synthetic application simulator that is driven by a series of application signatures mapped to the mesh structure. By integrating the ZOLTAN library into the system, our simulator provides a convenient test bed for developing and evaluating load balancing policies.
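To make the evaluation of a load balancing policy concrete, the sketch below computes a simple imbalance factor from per-element costs and an element-to-partition assignment. The function `load_imbalance`, its arguments, and the max-over-mean metric are assumptions chosen for illustration; this is not the simulator's interface or the ZOLTAN API.

```python
def load_imbalance(element_costs, owner, num_parts):
    """Return max partition load divided by mean partition load.

    element_costs[i] is the (assumed) profiled cost of mesh element i,
    and owner[i] is the partition that element i is assigned to.
    A value of 1.0 means the partitions are perfectly balanced.
    """
    loads = [0.0] * num_parts
    for cost, part in zip(element_costs, owner):
        loads[part] += cost
    mean = sum(loads) / num_parts
    # With no work at all, treat the assignment as balanced.
    return max(loads) / mean if mean > 0 else 1.0
```

Two candidate assignments of the same per-element costs can then be compared by their imbalance factors, with the lower value indicating the more even distribution of work.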