ISBN (Print): 9780769534725
This article describes a system for run-time prediction of applications in heterogeneous environments. To exploit the power of computational grids, scheduling systems need profound information about the job to be executed. The run-time of a job depends, among other factors, not only on its kind and complexity but also on the suitability and load of the remote host where it will be executed. Accounting and billing are additional aspects that have to be considered when creating a schedule. Currently, predictions are obtained by using descriptive models of the applications or by applying statistical methods to former jobs, mostly neglecting the behaviour of users. Motivated by this, we propose a method that is not only based on the characteristics of a job but also takes the behaviour of single users and of groups of similar users into account. The basic idea of our approach is to cluster users, hosts and jobs and to apply multiple methods in order to detect similarities and create forecasts. This is achieved by tagging jobs with attributes and by deriving predictions for similarly attributed jobs, whereas the recent behaviour of a user determines which predictions are finally taken.
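The abstract gives no implementation details; the following minimal sketch only illustrates the general idea of tagging jobs with attributes and forecasting a new job's run-time from similarly attributed past jobs, with a group-level fallback. The attribute set, function names and fallback rule are illustrative assumptions, not the authors' system.

```python
# Minimal sketch: predict a job's run-time from the mean run-time of past
# jobs with the same attributes (user group, application, host class).
# Attributes and the fallback rule are illustrative assumptions.
from collections import defaultdict
from statistics import mean

history = defaultdict(list)  # (user_group, app, host_class) -> [run-times in s]

def record(user_group, app, host_class, runtime_s):
    """Tag a finished job with its attributes and store its run-time."""
    history[(user_group, app, host_class)].append(runtime_s)

def predict(user_group, app, host_class, default=3600.0):
    """Forecast from jobs with identical attributes; fall back to the
    user group's average, then to a default, when no match exists."""
    exact = history.get((user_group, app, host_class))
    if exact:
        return mean(exact)
    same_group = [t for (g, _, _), ts in history.items() if g == user_group for t in ts]
    return mean(same_group) if same_group else default

record("physics", "mc-sim", "cluster-A", 1200)
record("physics", "mc-sim", "cluster-A", 1500)
print(predict("physics", "mc-sim", "cluster-A"))   # 1350.0 (exact match)
print(predict("physics", "render", "cluster-B"))   # 1350.0 (group fallback)
```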
ISBN (Print): 9780769528335
Recently we proposed algorithms for concurrent execution on multiple clusters [11]. In this case, data partitioning is done at two levels: first, the data is distributed to a collection of heterogeneous parallel systems with different resources and startup times; then, on each system the data is evenly partitioned among the available nodes. In this paper we report on a simulation study of these algorithms.
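A minimal sketch of the two-level partitioning described above, assuming a simple speed-proportional split across systems followed by an even split across each system's nodes; the weights and sizes are illustrative and not taken from [11].

```python
# Sketch of two-level partitioning: data is first split across heterogeneous
# systems in proportion to an assumed capacity weight, then split evenly
# across each system's nodes. Weights and item counts are illustrative.
def partition(total_items, systems):
    """systems: list of (name, relative_speed, node_count)."""
    total_speed = sum(speed for _, speed, _ in systems)
    plan, assigned = {}, 0
    for i, (name, speed, nodes) in enumerate(systems):
        # the last system takes the remainder so every item is covered
        share = total_items - assigned if i == len(systems) - 1 \
            else int(round(total_items * speed / total_speed))
        assigned += share
        per_node, extra = divmod(share, nodes)
        plan[name] = [per_node + (1 if n < extra else 0) for n in range(nodes)]
    return plan

print(partition(1000, [("clusterA", 3.0, 4), ("clusterB", 1.0, 2)]))
# {'clusterA': [188, 188, 187, 187], 'clusterB': [125, 125]}
```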
ISBN (Print): 9780769527369
The centralized parallel packet switch algorithm and the distributed parallel packet switch algorithm are two typical scheduling algorithms for the parallel packet switch. This paper analyzes the two algorithms in detail, addresses several key problems in their implementation, and finally presents several available methods and suggestions to make the parallel packet switch more practical.
ISBN (Print): 0769521355
To provide fault tolerance to computer systems suffering from transient faults, checkpointing and rollback recovery is one of the most widely used techniques. Among others, two primary checkpointing schemes have been proposed: independent and coordinated schemes. However, most existing works address only the need to apply a single checkpointing and rollback recovery scheme to a target system. In this paper, the issues are discussed and a new algorithm is developed to address the need to integrate independent and coordinated checkpointing schemes for applications running in a hybrid distributed environment containing multiple heterogeneous subsystems. The required changes to the original checkpointing schemes for each subsystem and the unnecessary rollbacks prevented overall in the integrated system are presented. Also described is an algorithm for collecting garbage checkpoints in the combined hybrid system.
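The paper's integration algorithm cannot be reconstructed from the abstract; the toy sketch below only illustrates one plausible boundary rule for such a hybrid system (an independently checkpointing process takes a forced checkpoint before sending into the coordinated subsystem, so that a coordinated rollback cannot propagate across the boundary). The rule and all names are assumptions for illustration.

```python
# Toy sketch of a hybrid system with an independently checkpointing
# subsystem and a coordinated one. The boundary rule used here is an
# illustrative assumption, not the paper's algorithm.
class Process:
    def __init__(self, name, scheme):
        self.name, self.scheme = name, scheme
        self.checkpoints, self.state = [], 0

    def checkpoint(self, reason):
        self.checkpoints.append((self.state, reason))

    def send(self, dest, payload):
        if self.scheme == "independent" and dest.scheme == "coordinated":
            # checkpoint before the message crosses the subsystem boundary
            self.checkpoint("forced: cross-subsystem send")
        dest.state += payload

def coordinated_round(procs):
    """All coordinated processes checkpoint together (two-phase in practice)."""
    for p in procs:
        if p.scheme == "coordinated":
            p.checkpoint("coordinated round")

p1, p2 = Process("p1", "independent"), Process("p2", "coordinated")
p1.state = 5
p1.send(p2, 5)                 # forces a checkpoint on p1 first
coordinated_round([p1, p2])
print(p1.checkpoints, p2.checkpoints)
```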
ISBN (Print): 0818685794
This poster presentation describes our vision, goals and plans for HARNESS, a distributed, reconfigurable and heterogeneous computing environment that supports dynamically adaptable parallel applications. HARNESS builds on the core concept of the personal virtual machine as an abstraction for distributed parallel programming but fundamentally extends this idea, greatly enhancing dynamic capabilities. HARNESS is being designed to embrace dynamics at every level through a pluggable model that allows multiple distributed virtual machines (DVMs) to merge, split and interact with each other. It provides mechanisms for new and legacy applications to collaborate with each other using the HARNESS infrastructure, and defines and implements new plug-in interfaces and modules so that applications can dynamically customize their virtual environment. HARNESS fits well within the larger picture of computational grids as a dynamic mechanism to hide the heterogeneity and complexity of the nationally distributed infrastructure. HARNESS DVMs allow programmers and users to construct personal subsets of an existing computational grid and treat them as unified network computers, providing a familiar and comfortable environment with easy-to-understand scoping. Similarly, a particular site could use HARNESS to construct a virtual machine that is presented and utilized as a single resource for scheduling within the grid. Our research focuses on understanding and developing three key capabilities within the framework of a heterogeneous computing environment: 1) techniques and methods for creating an environment where multiple distributed virtual machines can collaborate, merge or split; 2) specification and design of plug-in interfaces to allow dynamic extensions to services and functionality within a distributed virtual machine; and 3) methodologies for distinct parallel applications to discover each other, dynamically attach, collaborate, and cleanly detach.
Computing the configuration space obstacles is an important problem in spatial planning for robotics applications. In this paper, we present a parallel algorithm for computing the configuration space obstacles using hypercube multiprocessors. The digitized images of the obstacles and the robot are stored in an N × N image plane. An algorithm for handling robots whose shapes are arbitrary convex polygons is presented. Our algorithms take O(log N) time and O(1) space, which is asymptotically optimal for hypercube computers.
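As context for the result above, the sketch below computes the configuration-space obstacle of a translating robot sequentially, as a Minkowski-style dilation of obstacle pixels by the reflected robot pixels on a small grid; the hypercube parallelization and the convex-polygon case are not reproduced, and the grid size and shapes are illustrative.

```python
# Sequential reference for the computation the parallel algorithm performs:
# grow each obstacle pixel by the reflected robot shape on an N x N grid.
N = 8
obstacle = {(3, 3), (3, 4), (4, 3), (4, 4)}          # 2x2 obstacle block
robot = {(0, 0), (1, 0), (0, 1)}                     # small L-shaped robot

def cspace_obstacle(obstacle, robot, n):
    grown = set()
    for (ox, oy) in obstacle:
        for (rx, ry) in robot:
            x, y = ox - rx, oy - ry                  # reflect robot about its reference point
            if 0 <= x < n and 0 <= y < n:
                grown.add((x, y))
    return grown

blocked = cspace_obstacle(obstacle, robot, N)
for y in range(N):                                   # '#' marks forbidden robot positions
    print("".join("#" if (x, y) in blocked else "." for x in range(N)))
```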
ISBN (Print): 9781665431682
Concurrency and consistency are the two inherent and complex characteristics of distributed systems. Their types, levels and implementation procedures determine the nature and efficiency of a distributed system. Concurrency and consistency are difficult concepts to understand; moreover, without a comprehensive understanding a complete system cannot be designed and built. Applying a comprehensive understanding of concurrency and consistency to the design of a distributed system will generate a system that is more closely aligned with the desired outcomes. This paper analyses both concurrency and consistency in distributed systems to present a comprehensive understanding of their requirements, types, levels, benefits and limitations. Initially, it analyses concurrency and compares it with parallelism to distinguish the two related but distinct terms. Subsequently, it analyses consistency and different consistency models, including a comparative analysis of strong and weak consistency models, and of data-centric and client-centric consistency models.
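The abstract compares strong and weak consistency at a conceptual level; the toy classes below make the distinction concrete. All class and method names are illustrative assumptions, not an API from the paper.

```python
# Toy illustration: a strongly consistent store applies every write to all
# replicas before acknowledging; an eventually consistent store acknowledges
# after one replica and propagates the write later.
class StrongStore:
    def __init__(self, replicas=3):
        self.replicas = [{} for _ in range(replicas)]

    def write(self, key, value):
        for r in self.replicas:          # synchronous replication: slower, always consistent
            r[key] = value

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)

class EventualStore(StrongStore):
    def __init__(self, replicas=3):
        super().__init__(replicas)
        self.pending = []

    def write(self, key, value):
        self.replicas[0][key] = value    # acknowledge after one replica
        self.pending.append((key, value))

    def sync(self):                      # background anti-entropy pass
        for key, value in self.pending:
            for r in self.replicas[1:]:
                r[key] = value
        self.pending.clear()

s = EventualStore()
s.write("x", 1)
print(s.read("x", replica=2))  # None: replica 2 has not converged yet
s.sync()
print(s.read("x", replica=2))  # 1 after synchronisation
```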
ISBN (Print): 0818685794
This paper introduces Strings, a high performance distributed shared memory system designed for clusters of symmetric multiprocessors (SMPs). The distinguishing feature of this system is the use of a fully multi-threaded runtime system, written using POSIX threads. Strings also allows multiple application threads to be run on each node in a cluster. Since most modern UNIX systems can multiplex these threads on kernel-level lightweight processes, applications written using Strings can use all the processors in an SMP machine. This paper describes some of the architectural details of the system and analyzes the performance improvements with two example programs and a few benchmark programs from the SPLASH-2 suite.
ISBN (Print): 0818685794
The paper focuses on fault-tolerant distributed computations where processes can take local checkpoints without coordinating with each other. Several distributed on-line algorithms are presented which avoid rollback propagation by forcing additional local checkpoints in processes. The effectiveness of the algorithms is evaluated in several application examples, showing their limited capability of bounding the number of additional checkpoints.
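The paper's specific on-line algorithms are not given in the abstract; the sketch below shows one well-known communication-induced rule of this general kind (force a checkpoint before delivering a message that piggybacks a higher checkpoint index than the receiver's own), purely for illustration.

```python
# Simplified communication-induced checkpointing rule: messages piggyback the
# sender's checkpoint index, and the receiver takes a forced checkpoint before
# delivery when that index exceeds its own. Not the paper's algorithms.
class Process:
    def __init__(self, name):
        self.name = name
        self.ckpt_index = 0      # index of the latest local checkpoint
        self.forced = 0

    def take_checkpoint(self, forced=False):
        self.ckpt_index += 1
        if forced:
            self.forced += 1

    def send(self):
        return self.ckpt_index   # piggyback the checkpoint index

    def receive(self, piggybacked_index):
        if piggybacked_index > self.ckpt_index:
            # forcing a checkpoint keeps the checkpoints mutually consistent
            self.take_checkpoint(forced=True)
        # ... deliver the message to the application ...

p, q = Process("p"), Process("q")
p.take_checkpoint()              # p's basic (local) checkpoint
q.receive(p.send())              # q is forced to checkpoint before delivery
print(q.ckpt_index, q.forced)    # 1 1
```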
ISBN (Print): 0818685794
Fine grained data distributions are widely used to balance computational loads across compute processes in parallel scientific applications. When a fine grained data distribution is used in memory, the performance of I/O-intensive applications can be limited not only by disk speed but also by message passing, because a large number of small messages may be generated by the implementation strategy used in the underlying parallel file system or parallel I/O library. Combining (or packetizing) a set of small messages into a large message is generally known to speed up parallel I/O. However, overall I/O performance is affected not only by small messages but also by other factors like cyclic block size and interconnect characteristics. We describe small message combination and communication scheduling for fine grained data distributions in the Panda parallel I/O library and analyze I/O performance on parallel platforms having different interconnects: an IBM SP2, an IBM workstation cluster connected by FDDI, and a Pentium II cluster connected by Myrinet.
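As a rough illustration of the message-combination idea (not Panda's actual implementation), the sketch below packs the small blocks of a fine grained distribution that are destined for the same I/O server into one buffer and sends the buffer once a threshold is reached; the threshold, block sizes and function names are assumptions.

```python
# Sketch of small-message combination: buffer the blocks bound for each
# I/O server and send one large message per server instead of many small ones.
from collections import defaultdict

COMBINE_THRESHOLD = 64 * 1024          # flush once 64 KiB is buffered (illustrative)

buffers = defaultdict(bytearray)       # destination server -> packed payload
sent_messages = []

def send(dest, payload):               # stand-in for the real transport layer
    sent_messages.append((dest, len(payload)))

def enqueue(dest, block):
    """Append a small block to the per-destination buffer, flushing when full."""
    buffers[dest] += block
    if len(buffers[dest]) >= COMBINE_THRESHOLD:
        flush(dest)

def flush(dest):
    if buffers[dest]:
        send(dest, bytes(buffers[dest]))
        buffers[dest].clear()

# 1024 cyclic blocks of 256 bytes spread round-robin over 4 I/O servers
for i in range(1024):
    enqueue(i % 4, b"\0" * 256)
for d in list(buffers):
    flush(d)
print(len(sent_messages))              # 4 combined messages instead of 1024
```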