Scheduling parallel tasks represented as a Directed Acyclic Graph (DAG) on a multiprocessor system has been an important research area in the past decades. One of the critical aspects of a class of scheduling algorithms called "List Scheduling" is how to decide which task is to be scheduled next. This is achieved by assigning priorities to the nodes or the edges of the input DAG, so that the task with the highest priority is scheduled next. This paper proposes a low-complexity scheduling algorithm to improve the priority node selection criteria in list scheduling algorithms. The worst-case performance of the proposed algorithm is analyzed for general input DAGs. Also, the worst-case performance and the optimality conditions are obtained for free structured input DAGs. The performance comparison study shows that the proposed algorithm outperforms existing scheduling algorithms, especially for input DAGs with high communication overheads. The performance improvement over existing algorithms becomes larger as the input DAG becomes denser and the level of parallelism in the DAG increases.
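To make the idea of priority-driven list scheduling concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: the priority used here is the static bottom level (longest computation path to an exit node), a common textbook choice, and the function names `b_level` and `list_schedule`, as well as the input layout, are assumptions made for illustration. Communication cost is charged only when a predecessor ran on a different processor.

```python
import heapq


def b_level(cost, succ):
    """Longest computation path from each task to an exit task (assumed priority)."""
    memo = {}

    def visit(t):
        if t not in memo:
            memo[t] = cost[t] + max((visit(s) for s in succ.get(t, ())), default=0)
        return memo[t]

    for t in cost:
        visit(t)
    return memo


def list_schedule(cost, edges, num_procs):
    """cost: {task: computation time}; edges: {(u, v): communication time}.

    Returns {task: (processor, start, finish)}.
    """
    succ, pred = {}, {}
    for (u, v) in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)

    prio = b_level(cost, succ)
    remaining = {t: len(pred.get(t, ())) for t in cost}

    # Ready list ordered by highest priority first (heapq is a min-heap).
    ready = [(-prio[t], t) for t in cost if remaining[t] == 0]
    heapq.heapify(ready)

    proc_free = [0] * num_procs
    placed = {}
    while ready:
        _, t = heapq.heappop(ready)
        best = None
        for p in range(num_procs):
            # Data from a predecessor on another processor arrives after the edge delay.
            data_ready = 0
            for u in pred.get(t, ()):
                up, _, ufinish = placed[u]
                comm = 0 if up == p else edges[(u, t)]
                data_ready = max(data_ready, ufinish + comm)
            start = max(proc_free[p], data_ready)
            if best is None or start < best[1]:
                best = (p, start)
        p, start = best
        finish = start + cost[t]
        placed[t] = (p, start, finish)
        proc_free[p] = finish
        # A successor becomes ready once all of its predecessors are placed.
        for s in succ.get(t, ()):
            remaining[s] -= 1
            if remaining[s] == 0:
                heapq.heappush(ready, (-prio[s], s))
    return placed
```

Calling `list_schedule(cost, edges, num_procs)` with a small DAG returns a processor and time slot for every task; swapping `b_level` for another priority function is the point at which different list scheduling algorithms diverge.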
parallel applications with inconstant usage patterns presents a big challenge to programmers in that the spawning of tasks and the communication between them may be conditional (named "conditional parallel programming"). Ideally, the programmer should not be burdened by operational issues which have little relationship to the application itself. This paper proposes a new parallel programming environment, ATME, to automate task scheduling in conditional parallel programming. By adaptively producing accurate estimates of the task model prior to execution, ATME modifies task distribution to improve the system and application performance.
We describe our experience in implementing an instrumentation system for post-mortem ParaGraph visualization of PVM programs. The instrumentation system can be enabled without adding explicit startup code to the original program. Furthermore, processes can enroll into the instrumentation system at any point during the execution of the application, allowing for algorithms with arbitrary parallelism. We also provide additional facilities like user task labeling and an intelligent process graph layout interface to help the programmer relate performance data to the profiled distributed application.
ISBN: (print) 0780336763
This paper presents two design patterns useful for parallel computations based on the master-slave model. These patterns are concerned with task management and with parallel and distributed data structures. They can be used to help address the issues of data partitioning and mapping, dynamic task allocation, and load balancing in parallel programming, with the benefit of less programming effort and better program structure. The patterns are described in object-oriented notation, accompanied by illustrative examples in C++. We also report our experience in applying these patterns to two scientific simulation programs, one simulating the Ising model and the other simulating plasma. Since the master-slave model is a widely used parallel programming paradigm, the design patterns presented in this paper have large potential application in parallel computations.
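The dynamic task allocation mentioned above can be sketched as a shared task queue from which workers pull work as they become free. The following is a minimal Python illustration using threads, not the C++ patterns from the paper; the function name `run_master_worker` and its parameters are assumptions made for this sketch.

```python
import queue
import threading


def run_master_worker(tasks, work_fn, num_workers=4):
    """Master hands out tasks through a queue; idle workers pull the next one."""
    tasks = list(tasks)
    task_q = queue.Queue()
    result_q = queue.Queue()

    def worker():
        while True:
            item = task_q.get()
            if item is None:  # sentinel: no more work for this worker
                break
            idx, payload = item
            result_q.put((idx, work_fn(payload)))

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # The master enqueues all tasks, then one sentinel per worker.
    for idx, payload in enumerate(tasks):
        task_q.put((idx, payload))
    for _ in workers:
        task_q.put(None)
    for w in workers:
        w.join()

    # Collect results in the original task order.
    results = [None] * len(tasks)
    while not result_q.empty():
        idx, value = result_q.get()
        results[idx] = value
    return results
```

Because each worker takes a new task only when it finishes the previous one, faster workers naturally process more tasks, which is the load-balancing effect the master-slave model relies on. Error handling inside `work_fn` is deliberately left out of this sketch.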
A high-performance, freely accessible medical image processing environment based on a distributed architecture is presented. MedIGrid is the result of a joint effort between scientists devoted to the design and deployment of new and efficient tomographic reconstruction techniques, researchers in the field of distributed and parallel architectures, and physicians interested in experimenting with new advances in the field of image reconstruction and analysis. The main goal of the project was to design an easily accessible and usable environment with which the medical community can experiment, and which research groups can use as a reference or as a basis for continuing research. The outcome of this work consists of a prototype grid infrastructure along with an open and distributed software environment. The grid computing architecture includes a storage server, a high-performance parallel computing unit, and two PCs that act as clients to the system and are located in geographically distant areas. The Globus Toolkit has been chosen to implement the middleware between hardware and software. The software consists of a set of tools and strategies to reconstruct, display, analyze, store, share, distribute, and organize medical images, addressing the major problems in the field of image reconstruction and processing. It is platform independent, remotely executable, freely downloadable and accessible, and based on open source code.
In the development of advanced information systems, such as parallel processing computers and ATM switching systems, the speed and capacity of board-mountable processors and switches will increase considerably through advances in VLSI technology. VCSEL based smart pixels will be the key devices to interconnect the VLSI chips, with high throughput at both the inter- and intra-board level. The technical issues involved in applying VCSEL based smart pixels to parallel optical interconnection systems are discussed from the viewpoint of their system applications.
In the usual treatment of impedances of beamline structures the electromagnetic response is computed under the assumption that the source charge trajectory is parallel to the propagation axis and is unaffected by the wake of the structure. For high energy beams of relatively low current this is generally a valid assumption. Under certain conditions the assumption of a parallel source charge trajectory is no longer valid and the effects of the changing trajectory must be included in the analysis. Here the usual transmission line analysis that has been applied to BPM type transverse kickers is extended to include the self-consistent motion of the beam in the structure.
ISBN: (print) 9781424457717
We present FastTrans - a parallel, distributed-memory simulator for transportation networks that uses a queue-based event-driven approach to traffic microsimulation. Queue-based simulation models have been shown to be significantly faster than cellular-automata type approaches, sacrificing spatial granularity for speed, while preserving link and intersection dynamics with high fidelity. Significant advances over previous work include the size of the simulated network, support for dynamic responses to congestion and the absence of precomputed routes - all routing calculations are executed online. We present initial results from a scalability study using a real-world network from the North-East region of the United States comprising over 1.5 million network elements and over 25 million vehicular trips. Simulation of an entire day's worth of realistic vehicular itineraries involving approximately five billion simulated events executes in less than an hour of wall-clock time on a distributed computing cluster. Initial results suggest almost linear speed-ups with cluster size.
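The queue-based, event-driven style of traffic simulation can be illustrated with a small sequential Python sketch. This is not FastTrans: it runs on a single process, uses fixed routes rather than online routing, and models each link only by a free-flow travel time and a minimum exit headway. The function name `simulate` and the input layout are assumptions made for illustration.

```python
import heapq


def simulate(links, vehicles):
    """links: {name: (travel_time, headway)}.
    vehicles: {vehicle_id: (depart_time, [link names in order])}.

    Returns {vehicle_id: arrival_time}.
    """
    last_exit = {name: float("-inf") for name in links}
    # Each event is (time, vehicle_id, index of the next link on its route).
    events = [(depart, vid, 0) for vid, (depart, _) in vehicles.items()]
    heapq.heapify(events)

    arrivals = {}
    while events:
        now, vid, hop = heapq.heappop(events)
        route = vehicles[vid][1]
        if hop == len(route):
            arrivals[vid] = now  # the vehicle has left its last link
            continue
        link = route[hop]
        travel_time, headway = links[link]
        # A vehicle leaves the link after the free-flow time, but no sooner
        # than one headway after the vehicle queued ahead of it.
        exit_time = max(now + travel_time, last_exit[link] + headway)
        last_exit[link] = exit_time
        heapq.heappush(events, (exit_time, vid, hop + 1))
    return arrivals
```

Only link entries and exits generate events, so the cost grows with the number of link traversals rather than with the number of simulated road cells and time steps, which is the trade of spatial granularity for speed described above.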
Owing to advances in terminal performance and network speed, communication-intensive distributed multimedia applications have become more common. Hence, end-to-end quality of service (QoS) management mechanisms are increasingly in demand for these applications. Since users' QoS requirements change and the resources available on the network and terminals fluctuate, QoS management mechanisms must be adaptable. In this paper we propose a multi-agent-based, decentralized, adaptive QoS management framework that is suited to a distributed multimedia environment. In this framework, the QoS management task is divided into subtasks, and agents are designed to take charge of those subtasks. Adaptability is realized through collaborative QoS adaptation among the agents. A layered multi-agent structure is introduced to separate long-term QoS adaptation from short-term QoS adaptation. Experimental and simulation results show the validity of layered QoS management using an Application Agent (AA) and a Stream Agent (SA).
This article presents an environment which supplies UNIX network and file-server services for a standalone single program multiple data (SPMD) programming environment running on a parallel machine. Networking services are accessed via a BSD socket-like interface on each node of the parallel machine, which provides access to a full UDP/IP protocol implementation residing on a custom-designed I/O node. An experimental SPMD NFS client system was developed which supports the standard UNIX file-system interface, implementing a fully dynamic NFS remote mount mechanism and an innovative caching scheme on each node. As a practical demonstration of the usability of this environment, a "real-world" file indexing application was successfully ported and parallelised.