ISBN (print): 0769522564
Time sharing between cluster resources in Grid is a major issue in cluster and Grid integration. Classical Grid architecture involves a higher-level scheduler which submits non-overlapping jobs to the independent batch schedulers of each cluster of the Grid. The sequentiality induced by this approach does not fit with the expected number of users and job heterogeneity of Grids. Time sharing techniques address this issue by allowing simultaneous execution of many applications on the same resources. Co-scheduling and gang scheduling are the two best-known techniques for time sharing cluster resources. Co-scheduling relies on the operating system of each node to schedule the processes of every application. Gang scheduling ensures that the same application is scheduled on all nodes simultaneously. Previous work has proven that co-scheduling techniques outperform gang scheduling when physical memory is not exhausted. In this paper, we introduce a new hybrid sharing technique providing checkpoint-based explicit memory management. It consists of co-scheduling parallel applications within a set, until the memory capacity of the node is reached, and using gang-scheduling-related techniques to switch from one set to another. We compare experimentally the merits of the three solutions (co-scheduling, gang scheduling and hybrid scheduling) in the context of out-of-core computing, which is likely to occur in the Grid context, where many users share the same resources. The experiments show that the hybrid solution is as efficient as the co-scheduling technique when the physical memory is not exhausted, and is more efficient than gang scheduling and co-scheduling when physical memory is exhausted.
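The hybrid idea in the abstract above (co-schedule applications within a set until node memory is full, then switch between sets) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the paper's implementation: the `App` record, the per-node memory figures, the first-fit grouping rule, and the round-robin rotation between sets are all hypothetical, and checkpointing is not modelled.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    mem_mb: int  # assumed per-node memory footprint (hypothetical figure)


def build_sets(apps, node_mem_mb):
    """Group apps into sets whose combined memory fits on one node (first fit)."""
    sets = []
    for app in apps:
        if app.mem_mb > node_mem_mb:
            raise ValueError(f"{app.name} does not fit in node memory")
        for group in sets:
            if sum(a.mem_mb for a in group) + app.mem_mb <= node_mem_mb:
                group.append(app)  # co-scheduled with the other members
                break
        else:
            sets.append([app])  # no existing set has room: open a new one
    return sets


def hybrid_schedule(apps, node_mem_mb, slices):
    """Return the names of the apps active in each time slice.

    Apps inside a set run together (left to the node OS, as in co-scheduling);
    whole sets are swapped between slices (as in gang scheduling).
    """
    sets = build_sets(apps, node_mem_mb)
    if not sets:
        return []
    return [[a.name for a in sets[i % len(sets)]] for i in range(slices)]


apps = [App("A", 600), App("B", 300), App("C", 500), App("D", 400)]
print(hybrid_schedule(apps, node_mem_mb=1000, slices=3))
```

With these made-up footprints the four applications fall into two sets, so no slice ever asks a node for more than its physical memory.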
This paper gives an overview of the material to be discussed in the invited keynote presentation by H. J. Siegel; it summarizes our research in [1]. Performing computing and communication tasks on parallel and distribu...
We study hierarchical configuration of distributed systems for achieving optimized system performance. A distributed system consists of a collection of local processes which are distributed over a network of processors and work in a cooperative manner to fulfill various tasks. A hierarchical approach groups and organizes the distributed processes into a logical hierarchy of multiple levels, so as to coordinate the local computation and control activities and improve the overall system performance. It has been proposed as an effective way to solve various problems in distributed computing, such as distributed monitoring, resource scheduling, and network routing. The optimization problem considered in this paper is to find an optimal hierarchical partition of the processors, so that the total traffic flow over the network is minimized. The problem in its general form is known to be NP-hard. Therefore, we focus on distributed computing jobs which require collecting and processing information from all processors. By limiting the hierarchy to two levels, we establish the analytically optimal hierarchical configurations for two popular interconnection networks: mesh and hypercube. Based on the analytical results, partitioning algorithms are proposed to achieve minimal communication cost (network traffic flow). We also present and discuss heuristic algorithms for multiple-level hierarchical partitions.
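To make the two-level partitioning problem above concrete, here is a small Python sketch that evaluates a toy traffic cost on a square mesh. The cost model is an assumption chosen for illustration and is not taken from the paper: each processor sends one unit message to a leader at the top-left corner of its square block, each leader forwards one unit message to a root at position (0, 0), and traffic is counted in Manhattan hops. The function names are hypothetical.

```python
def two_level_traffic(n, b):
    """Total hop count on an n x n mesh split into b x b blocks (assumed model)."""
    if n % b != 0:
        raise ValueError("block size must divide the mesh side")
    # Level 1: every processor reports to the top-left leader of its block.
    intra = sum((i % b) + (j % b) for i in range(n) for j in range(n))
    # Level 2: every block leader reports to the root at (0, 0).
    inter = sum(i + j for i in range(0, n, b) for j in range(0, n, b))
    return intra + inter


def best_block_size(n):
    """Exhaustively pick the block side that minimizes the assumed traffic cost."""
    candidates = [b for b in range(1, n + 1) if n % b == 0]
    return min(candidates, key=lambda b: two_level_traffic(n, b))


for b in (1, 2, 4):
    print(b, two_level_traffic(4, b))
print("best block side for a 4 x 4 mesh:", best_block_size(4))
```

Under this toy model the intermediate block size wins on a 4 x 4 mesh, because both a flat layout and a single large block force long routes to one collection point.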
While resource management and task scheduling are identified challenges of Grid computing, current Grid scheduling systems mainly focus on CPU and network availability. Recent performance improvement of CPU and comput...
ISBN (print): 0769521320
In this paper we present a new decider mechanism for the self-tuning dynP job scheduler for modern resource management systems. This scheduler switches the active scheduling policy dynamically at run time, in order to reflect changing characteristics of waiting jobs. The new decider explicitly prefers a single scheduling policy instead of being fair to all available policies. We use discrete event simulations to evaluate the achieved slowdown and utilization, and compare the results with the fair decider mechanism and the static basic scheduling policies. The evaluation shows that the self-tuning dynP scheduler in combination with the preferred decider achieves good results and that it is superior to common static scheduling approaches, which use only a single policy.
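A decider that prefers one policy, as described in the abstract above, might look like the following Python sketch. It is a simplified illustration under stated assumptions, not the dynP implementation: the policy names (FCFS, SJF, LJF), the single-resource sequential execution model, the average-response-time score, and the function names are all hypothetical stand-ins for the scheduler's real policies and metrics.

```python
def avg_response_time(runtimes):
    """Average completion time when jobs run back to back on one resource."""
    if not runtimes:
        return 0.0
    clock = 0
    total = 0
    for r in runtimes:
        clock += r       # the job finishes at the current clock value
        total += clock
    return total / len(runtimes)


# Assumed basic policies, each mapping a waiting queue to an execution order.
POLICIES = {
    "FCFS": lambda queue: list(queue),
    "SJF": lambda queue: sorted(queue),
    "LJF": lambda queue: sorted(queue, reverse=True),
}


def score_policies(queue):
    """Score every policy on the current waiting queue (lower is better)."""
    return {name: avg_response_time(order(queue)) for name, order in POLICIES.items()}


def preferred_decider(scores, preferred):
    """Stay with the preferred policy unless another one is strictly better."""
    best = min(scores.values())
    if scores[preferred] <= best:
        return preferred
    return min(scores, key=scores.get)


print(preferred_decider(score_policies([1, 2, 3]), preferred="FCFS"))
print(preferred_decider(score_policies([3, 1, 2]), preferred="FCFS"))
```

The first queue is already sorted, so the preferred policy ties with the best score and is kept; the second queue makes another policy strictly better, so the decider switches.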
High performance computing systems and cluster computers are becoming so cost-effective that even small research groups can afford them. Hence, efforts to take advantage of these widely distributed resources are becoming popular. Although recent projects provide resource management and job scheduling to support groups of computational resources across the country working together on massive problems, they have not yet fully addressed how distributed parallel programs will communicate. Therefore, we propose a new paradigm to support cluster-to-cluster (C2C) communications, which handles run-time communications between parallel programs running on distributed clusters.
Administration of Grid resources is a time-consuming and often tedious job. Most administrative requests are predictable, and in general, handling them requires knowledge of the local resources and the requester. In t...
ISBN (print): 3540240136
The Computational Grid is a promising computing platform for solving the problem of large-scale resource allocation. In the effort to build large-scale grid computing systems, some obvious problems of grid computing, such as security, resource scheduling and management, and authentication and authorization, have attracted considerable attention and research. Charging and accounting is an important part of grid computing and will help drive the development of grid computing systems. The aim of this paper is to propose a service-oriented accounting architecture that involves the interaction of many services on the grid. We first describe the architecture of charging and accounting and its support system. Then, we give a brief introduction to pricing schemes and accounting policies.
ISBN (print): 0769521320
The major issue today in cluster and grid computing is efficient resource management. Evaluating scheduling strategies is hard because it requires generating jobs under realistic scenarios. This is true for rigid jobs (where the number of processors is fixed) and even more so for moldable ones. This paper presents an approach to generate realistic workloads for these kinds of jobs. The model we propose is based on the analysis of one year of utilization of the I-cluster, a 225-processor cluster. From this log we extract a typical load for this kind of parallel machine and introduce a way to generate realistic synthetic workloads automatically. This work was done as a way to test scheduling strategies that take into account both rigid and moldable jobs, so the workload generator can handle moldable jobs.
ISBN (print): 1932415262
The aim of this work is to design and then implement a Configuration Repository for a grid-based Problem Solving Environment (PSE), specialized to describe applications and data belonging to the remote sensing field. In a distributed environment the information system plays a central role in enhancing scheduling algorithms and, more generally, in addressing grid-aware application requirements. Taking into account that the grid is inherently dynamic, i.e. machine load and availability, network latency and bandwidth change continually, we present the design of a Configuration Repository for retrieving, storing and handling information. In particular, the proposed solution has been designed with the specific characteristics of remote sensing systems in mind.