ISBN (print): 1595936734
Grid computations require global access to massive data stores. To meet this need, the GridNFS project aims to provide scalable, high-performance, transparent, and secure wide-area data management as well as a scalable and agile name space. While parallel file systems give high I/O throughput, they are highly specialized, have limited operating system and hardware platform support, and often lack strong security mechanisms. Remote data access tools such as NFS and GridFTP overcome some of these limitations, but fail to provide universal, transparent, and scalable remote data access. As part of GridNFS, this paper introduces Direct-pNFS, which builds on the NFSv4.1 protocol to meet a key challenge in accessing remote parallel file systems: high-performance and scalable data access without sacrificing transparency, security, or portability. Experiments with Direct-pNFS demonstrate I/O throughput that equals or outperforms the exported parallel file system across a range of workloads. Copyright 2007 ACM.
ISBN (print): 1595936734
Resource availability in Grids is generally unpredictable due to the autonomous and shared nature of Grid resources and the stochastic nature of the workload, resulting in a best-effort quality of service. Resource providers optimize for throughput and utilization, whereas users optimize for application performance. We present a cost-based model where the providers advertise resource availability to the user community. We also present a multi-objective genetic algorithm formulation for selecting the set of resources to be provisioned that optimizes the application performance while minimizing the resource costs. We use trace-based simulations to compare the application performance and cost of the provisioned and best-effort approaches with a number of artificially generated workflow-structured applications and a seismic hazard application from the earthquake science community. The provisioned approach shows promising results when the resources are under high utilization and/or the applications have significant resource requirements. Copyright 2007 ACM.
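To illustrate the two competing objectives named above (application performance and resource cost), the following minimal sketch filters candidate provisioning plans down to the non-dominated (Pareto-optimal) ones. It is not the paper's genetic algorithm: the `Plan` type, the plan names, and all numbers are hypothetical and only show the kind of trade-off a multi-objective search has to rank.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    """A hypothetical provisioning plan with its estimated cost and runtime."""
    name: str
    cost: float      # total resource cost (arbitrary units, assumed)
    makespan: float  # estimated application completion time (arbitrary units, assumed)


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return (a.cost <= b.cost and a.makespan <= b.makespan
            and (a.cost < b.cost or a.makespan < b.makespan))


def pareto_front(plans: list[Plan]) -> list[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


# Made-up candidate plans, for illustration only.
plans = [
    Plan("A", cost=10.0, makespan=100.0),
    Plan("B", cost=20.0, makespan=60.0),
    Plan("C", cost=25.0, makespan=70.0),  # costs more and runs longer than B
    Plan("D", cost=40.0, makespan=30.0),
]
front = pareto_front(plans)
print([p.name for p in front])
```

A genetic algorithm would typically use a dominance test like this to rank candidate resource sets across generations rather than enumerating every plan up front.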
ISBN (print): 9780769533803
In this paper we describe QsNet(III), an adaptively routed network for high-performance computing (HPC) applications. We detail the structure of the network, the evolution of our adaptive routing algorithms from previous generations of the network, and new applications of these techniques. We describe other HPC-specific features, including hardware support for barrier and broadcast and for large numbers of small packets. We also describe the implementation of the network.
ISBN (print): 9781538666142
With the development of new technologies such as big data, data mining, and artificial intelligence, there is a great demand for distributed computing. Named Data Networking (NDN) is a promising future Internet architecture and is receiving more and more research attention. In this paper, we propose a new architecture to support distributed computing in NDN. Due to the nature of NDN, the computing nodes can be selected in proximity to consumers/users. In addition, we propose a new privacy protection scheme that allows users to anonymously access network resources, e.g., computing resources. The effectiveness and overhead of this architecture and the devised privacy protection scheme are evaluated. The results demonstrate the feasibility of the proposed architecture.
ISBN (print): 1595936734
Allgather is an important MPI collective communication. Most of the algorithms for allgather have been designed for homogeneous and tightly coupled systems. The existing algorithms for allgather on Grid systems do not efficiently utilize the bandwidths available on slow wide-area links of the grid. In this paper, we present an algorithm for allgather on grids that efficiently utilizes wide-area bandwidths and is also wide-area optimal. Our algorithm is also adaptive to grid load dynamics since it considers transient network characteristics for dividing the nodes into clusters. Our experiments on a real-grid setup consisting of 3 sites show that our algorithm gives an average performance improvement of 52% over existing strategies.
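As a rough illustration of what a cluster-aware allgather has to achieve, the sketch below simulates a generic two-level scheme in plain Python: gather within each cluster, exchange between cluster leaders, then broadcast locally. This is not the algorithm of the paper. The function name, the fixed cluster assignment, and the data are assumptions, and the paper's clusters are instead derived from transient network characteristics.

```python
def hierarchical_allgather(clusters: dict[str, dict[int, str]]) -> dict[int, list[str]]:
    """Simulate a two-level allgather (hypothetical helper, no real communication).

    clusters maps a cluster name to {rank: data block} for the ranks at that site.
    Returns {rank: all blocks ordered by rank}.
    """
    # Phase 1: within each cluster, gather local blocks at a leader
    # (models traffic on fast local links only).
    leader_buffers = {name: dict(members) for name, members in clusters.items()}

    # Phase 2: leaders exchange their aggregated buffers
    # (models the only traffic that uses wide-area links).
    combined: dict[int, str] = {}
    for buf in leader_buffers.values():
        combined.update(buf)

    # Phase 3: each leader broadcasts the complete result inside its cluster.
    result = [combined[rank] for rank in sorted(combined)]
    return {rank: list(result) for members in clusters.values() for rank in members}


# Three made-up sites with five ranks in total.
sites = {"site0": {0: "a", 1: "b"}, "site1": {2: "c"}, "site2": {3: "d", 4: "e"}}
gathered = hierarchical_allgather(sites)
print(gathered[3])
```

After the call, every rank holds the same rank-ordered list of blocks, which is the defining postcondition of allgather regardless of how the exchange is scheduled.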
ISBN (print): 9781538666142
High-performance computing systems are moving heavily towards many-core processors with a deep memory hierarchy. Accelerators like GPUs are widely used for general-purpose computing, and processor architectures are becoming increasingly complex to deliver performance gains. This trend towards complex heterogeneous architectures makes the job of scientific application developers difficult in terms of performance, portability, and productivity. With memory being distributed, this challenge becomes even more complex. Programming many-core shared-memory systems is most widely accomplished using OpenMP, while MPI is used to manage the communications in a distributed system. Even though MPI provides a rich set of features, it is too explicit, making users responsible for overlapping communication and computation. Task-based runtime systems have emerged as a solution to the challenge of programming these modern complex systems. This study surveys the landscape of task-based runtime systems that support distributed memory and presents a set of benchmarks for evaluating and understanding the runtime performance and overheads of these systems.
ISBN (print): 1595936734
Large-scale donation-based distributed infrastructures need to cope with the inherent unreliability of participant nodes. A widely used work scheduling technique in such environments is to redundantly schedule the outsourced computations to a number of nodes. We present the design and implementation of RIDGE, a reliability-aware system which uses a node's prior performance and behavior to make more effective scheduling decisions. We have implemented RIDGE on top of the BOINC distributed computing infrastructure and have evaluated its performance on a live testbed consisting of 120 PlanetLab nodes. Our experimental results show that RIDGE is able to match or surpass the throughput of the best vanilla BOINC configuration under different reliability environments, by automatically adapting to the characteristics of the underlying environment. In addition, RIDGE is able to provide much lower work unit makespans compared to BOINC, which indicates its desirability in service-oriented environments with time constraints. Copyright 2007 ACM.
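The idea of using a node's history to decide how much redundancy a work unit needs can be sketched as follows. This is a simplified model and not RIDGE's scheduler: it assumes each node has a known, independent success probability and that a single successful replica suffices. The function name, node names, and probabilities are hypothetical.

```python
def choose_replicas(reliability: dict[str, float], target: float) -> list[str]:
    """Pick the most reliable nodes until P(at least one succeeds) >= target.

    Assumes independent node failures (a simplification). If the target cannot
    be reached, all nodes are returned.
    """
    chosen: list[str] = []
    p_all_fail = 1.0  # probability that every chosen replica fails
    for node, r in sorted(reliability.items(), key=lambda kv: kv[1], reverse=True):
        if 1.0 - p_all_fail >= target:
            break
        chosen.append(node)
        p_all_fail *= 1.0 - r
    return chosen


# Made-up per-node success rates, e.g. estimated from past work units.
history = {"n1": 0.9, "n2": 0.5, "n3": 0.8, "n4": 0.6}
print(choose_replicas(history, target=0.95))
print(choose_replicas(history, target=0.85))
```

Under this model, a reliable pool needs fewer replicas per work unit than an unreliable one, which is the intuition behind adapting redundancy to the environment instead of fixing it in advance.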
ISBN (print): 9780769530147
Component-based programming has been applied to address the requirements of applications in high-performance computing (HPC). The usual service connectors of commercial component models do not fit some requirements of HPC, mainly regarding support for parallelism. This paper looks at extensions to the usual notion of service connector to meet such requirements, using the # component model as a substratum and evidencing its expressiveness.
ISBN (print): 9781538637906
Accelerating the training of Deep Neural Networks (DNNs) has been a focus of the deep learning field. There has been much research on accelerating deep learning on different platforms. Among them, the Intel Xeon Phi coprocessor is a many-core platform that provides both strong programmability and high performance. However, previous work on Intel Many Integrated Core (MIC) focused on parallel computing on the MIC alone. In this paper, we speed up the training of DNNs applied to automatic speech recognition with a CPU+MIC architecture. In this architecture, the training process of DNNs is executed on both the MIC and the CPU. We apply several optimization methods for I/O and calculation and set up experiments to validate these methods. With all methods combined, results show that our optimized algorithm achieves about a 20x speedup compared with the original sequential algorithm on a CPU using one core.
ISBN (print): 0769517722
Recent developments in the international arena have meant that the technology is now mature enough to bring together the elements required for the implementation of a grid computing facility. This paper examines the requirements and applications for an eScience infrastructure with particular reference to developments in Europe.