A limited-area numerical weather prediction model specifically targeted for parallel computers has been successfully implemented on an IBM SP2 distributed-memory parallel computer. The model employs an explicit finite-difference scheme and was parallelised using a simple domain decomposition technique. On a twelve-processor SP2, a 24-hour forecast using archived operational data and including a sophisticated representation of physical processes was run at a range of resolutions between 150 km and 19 km, and near-linear speedups were achieved. Major weather centres have indicated a requirement for regional prediction models to be run at resolutions of approximately 5 km by the end of the decade. Based on this work, it appears that this target can be achieved through the use of scalable parallel computers.
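A minimal sketch of the kind of domain decomposition the abstract describes (not the paper's code): a periodic 1-D field is split into subdomains, each subdomain updates its interior with an explicit finite-difference step, and one-cell halos are exchanged between neighbours. On a real distributed-memory machine such as the SP2 the halo exchange would be done with message passing; here it is emulated serially for illustration.

```python
# Illustrative sketch only: explicit finite-difference step parallelised by
# simple 1-D domain decomposition with one-cell halo exchange (halo exchange
# emulated serially; a real implementation would use message passing).
import numpy as np

def decompose(field, nparts):
    """Split a periodic 1-D field into equal subdomains (assumes an even split)."""
    return np.split(field, nparts)

def exchange_halos(subdomains):
    """Attach left/right halo cells copied from the neighbouring subdomains."""
    padded, n = [], len(subdomains)
    for i, sub in enumerate(subdomains):
        left = subdomains[(i - 1) % n][-1]    # last cell of left neighbour
        right = subdomains[(i + 1) % n][0]    # first cell of right neighbour
        padded.append(np.concatenate(([left], sub, [right])))
    return padded

def explicit_step(padded, nu=0.1):
    """One explicit (diffusion-like) update on the interior cells of each subdomain."""
    return [p[1:-1] + nu * (p[:-2] - 2.0 * p[1:-1] + p[2:]) for p in padded]

field = np.sin(np.linspace(0.0, 2.0 * np.pi, 96, endpoint=False))
subs = decompose(field, nparts=12)            # e.g. twelve "processors"
for _ in range(100):                          # time-stepping loop
    subs = explicit_step(exchange_halos(subs))
print(np.concatenate(subs).shape)             # (96,)
```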
Distributed programs are much more difficult to design, understand and implement than sequential or parallel ones. This is mainly due to the uncertainty created by the asynchrony inherent to distributed machines. Appropriate concepts and tools therefore have to be devised to help the programmer of distributed applications in this task. This paper is motivated by the practical problem of distributed debugging. It presents concepts and tools that help the programmer analyze distributed executions. Two basic problems are addressed: replay of a distributed execution (how to reproduce an equivalent execution despite asynchrony) and the detection of a stable or unstable property of a distributed execution. The concepts and tools presented are fundamental when designing an environment for distributed program development. This paper is essentially a survey presenting the state of the art in replay mechanisms and in the detection of unstable properties on global states of distributed executions.
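Vector clocks are the usual building block for recording the causal order of a distributed execution, which both replay and property detection rely on. The sketch below is an assumption about a typical supporting tool, not the survey's own mechanism.

```python
# Minimal vector-clock sketch: timestamps record the causal (happened-before)
# relation between events, a prerequisite for replay and property detection.
class Process:
    def __init__(self, pid, nprocs):
        self.pid = pid
        self.clock = [0] * nprocs

    def send(self):
        self.clock[self.pid] += 1
        return list(self.clock)               # timestamp piggybacked on the message

    def receive(self, msg_clock):
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_clock)]
        self.clock[self.pid] += 1
        return list(self.clock)

def happened_before(c1, c2):
    """True iff the event stamped c1 causally precedes the event stamped c2."""
    return all(a <= b for a, b in zip(c1, c2)) and c1 != c2

p0, p1 = Process(0, 2), Process(1, 2)
e1 = p0.send()
e2 = p1.receive(e1)
print(happened_before(e1, e2))                # True
```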
This paper examines the latest DB2 Optimizer technologies employed by IBM for MVS (based on System R and R*) in both Versions 3 and 4, and compares them to the emerging implementation of the Starburst Optimizer on the Common Server (C/S) platforms, DB2/6000 and DB2/2, in Version 2. It reviews the optimization process, from the parsing of SQL statements through the computation of the various cost models available for MVS and C/S. The discussion includes the impact and use of different catalog statistics, the depth of predicate transformations, query rewrite opportunities, environmental features exploited, and other differences within their respective optimization approaches.
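As a hedged illustration of how catalog statistics feed a cost model, the toy below compares two access paths using made-up formulas; the functions, constants and parameter names are hypothetical and are not DB2's actual cost equations.

```python
# Hypothetical toy cost model: how an optimizer might compare access paths
# using catalog statistics such as row count, page count and selectivity.
def table_scan_cost(npages):
    return npages                             # sequential I/O: one unit per page

def index_scan_cost(nrows, selectivity, index_levels=3):
    # index traversal plus one page fetch per qualifying row (pessimistic)
    return index_levels + nrows * selectivity

def choose_access_path(nrows, npages, selectivity):
    costs = {
        "table scan": table_scan_cost(npages),
        "index scan": index_scan_cost(nrows, selectivity),
    }
    return min(costs, key=costs.get), costs

# A selective predicate favours the index; a non-selective one the scan.
print(choose_access_path(nrows=1_000_000, npages=20_000, selectivity=0.0001))
print(choose_access_path(nrows=1_000_000, npages=20_000, selectivity=0.5))
```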
This paper studies properties of message communication modes in distributed systems. It establishes a simple, hierarchical and homogeneous characterization of logically instantaneous, causally ordered and first-in-first-out communications. It is shown that a distributed computation obeys one of these communication modes iff a communication graph of its messages does not include a cycle. This characterization plays a key role when one is interested in designing, analyzing, testing or debugging asynchronous distributed computations. This graph-based approach shows there is some unity in the characterization of deadlock, concurrency control, memory consistency and communication modes.
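The test the abstract describes reduces to cycle detection on a graph of messages. The paper defines a specific communication graph for each mode; the sketch below hedges on that construction and only shows the generic acyclicity check it would plug into.

```python
# Generic cycle check on a directed message-precedence graph (the precise edge
# definition per communication mode is the paper's; this is only the test).
def has_cycle(graph):
    """Detect a cycle in a directed graph given as {node: [successors]}."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph.get(node, []):
            if colour.get(succ, WHITE) == GREY:
                return True                   # back edge => cycle
            if colour.get(succ, WHITE) == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)

# No cycle: the execution satisfies the corresponding communication mode.
print(has_cycle({"m1": ["m2"], "m2": ["m3"], "m3": []}))      # False
# A cycle among messages signals that the ordering property is violated.
print(has_cycle({"m1": ["m2"], "m2": ["m3"], "m3": ["m1"]}))  # True
```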
Distributed computer systems for real-time control require a global timebase with high precision. A small time skew between local clocks in the system is required to obtain good control performance through well-synchronised task execution, but it also provides a base for efficient communication. In distributed safety-critical applications, clocks have traditionally been synchronised with fault-tolerant clock synchronisation algorithms. With these methods, a limited number of erroneous clock readings are tolerated in each adjustment. On the other hand, readings from all clocks in the system are required before an adjustment can be made. In this paper an alternative approach, the Daisy Chain method, is proposed and compared with existing solutions. Daisy Chain synchronisation does not tolerate erroneous clock readings, but methods of avoiding them are described. Due to its simplicity, the method can be implemented with little hardware. Low-precision frequency sources are sufficient, and recovery after arbitrary failures is fast because no special start-up phase is required. The paper also discusses the effects of quantisation uncertainty and transmission delay, and outlines the implementation of a global timebase in an embedded distributed real-time architecture.
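A simplified illustration of chain-style synchronisation follows; the paper's actual Daisy Chain algorithm and its handling of quantisation and delay differ in detail, so the gain, delay model and update rule below are assumptions.

```python
# Simplified chain synchronisation sketch: each node adjusts its local clock
# toward the value received from its predecessor, so a correction propagates
# node by node; skew shrinks over successive rounds.
def daisy_chain_round(clocks, transmission_delay=0.0, gain=0.5):
    """One synchronisation round over a ring of local clock values."""
    adjusted, n = list(clocks), len(clocks)
    for i in range(n):
        received = adjusted[(i - 1) % n] + transmission_delay
        adjusted[i] += gain * (received - adjusted[i])
    return adjusted

clocks = [100.0, 100.8, 99.5, 100.3]          # skewed local clocks (ms)
for _ in range(5):
    clocks = daisy_chain_round(clocks, transmission_delay=0.01)
print([round(c, 3) for c in clocks])          # skew shrinks round by round
```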
Monitoring program execution in a distributed system can generate large quantities of data, and the collection and processing of the monitoring data is one of the primary factors that contribute to the complexity of distributed monitoring. In order to reduce this complexity, a hierarchical distributed performance monitoring system has been developed. In this paper we describe an optimization method to improve the efficiency of the monitoring system. By considering the topology used by the application program and the distribution of monitoring records, an optimized grouping can be determined that improves the performance of the monitoring system. The experiments presented in this paper demonstrate such an improvement in performance.
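The paper's grouping criterion is topology- and distribution-aware; the hypothetical sketch below only balances record volume across collector groups, as one concrete example of what "optimized grouping" could mean.

```python
# Hypothetical grouping sketch: greedily assign monitored processes to
# collector groups so that no group is overloaded by monitoring records.
import heapq

def group_processes(record_counts, ngroups):
    """Longest-processing-time style assignment of processes to groups."""
    groups = [(0, g, []) for g in range(ngroups)]   # (load, group id, members)
    heapq.heapify(groups)
    for proc, count in sorted(record_counts.items(), key=lambda kv: -kv[1]):
        load, gid, members = heapq.heappop(groups)
        members.append(proc)
        heapq.heappush(groups, (load + count, gid, members))
    return {gid: members for _, gid, members in groups}

record_counts = {"p0": 500, "p1": 120, "p2": 480, "p3": 90, "p4": 300}
print(group_processes(record_counts, ngroups=2))
```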
A transactional paradigm is suggested for computer-assisted parallelization of programs and register-cache scheduling. It can serve as a building tool for pipelining, data parallelism, or generic parallelism in a variety of architectures, and the cost of execution can be estimated realistically.
The use of a microkernel as the foundation for modern distributed operating systems is a common decision among system designers. This paper describes the RHODOS distributed operating system, its design decisions, its microkernel and its kernel servers. The different approaches to cooperation between the RHODOS microkernel and the kernel servers are presented, including an analysis of the trade-offs these approaches yield between modularity and performance.
This paper analyzes the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things it is shown that an N-node butterfly containing N^(1-ε) worst-case faults (for any constant ε > 0) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proven for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance.
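For concreteness, the sketch below builds the butterfly network in question, removes a set of faulty nodes, and measures the surviving connected component; it illustrates the network and the fault model only, not the paper's emulation argument.

```python
# Butterfly network under node faults: construct the graph, knock out faulty
# nodes, and report the size of the largest surviving connected component.
from collections import defaultdict, deque

def butterfly(n):
    """n-dimensional butterfly: levels 0..n, 2**n rows, as an adjacency dict."""
    adj = defaultdict(set)
    for level in range(n):
        for row in range(2 ** n):
            a = (level, row)
            b = (level + 1, row)                      # straight edge
            c = (level + 1, row ^ (1 << level))       # cross edge
            adj[a] |= {b, c}
            adj[b].add(a)
            adj[c].add(a)
    return adj

def largest_component(adj, faulty):
    alive, seen, best = set(adj) - set(faulty), set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in adj[node]:
                if nxt in alive and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, size)
    return best

adj = butterfly(4)                         # 5 levels x 16 rows = 80 nodes
print(len(adj), largest_component(adj, faulty={(2, 5), (3, 9)}))
```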
Chronolog is an extension of logic programming based on temporal logic. The paper presents a framework which can be used to exploit multiple levels of parallelism found in Chronolog programs: context parallelism, AND-parallelism and OR-parallelism. Based on an analysis of these modes of parallelism in Chronolog programs, a parallel execution mechanism for the language is discussed and a formal execution model is given. The inherent context parallelism in Chronolog programs occurs when more than one child computation is active at a time, and it is exploited through the dynamic tagging approach typically used in dataflow computers. At the level of clause arguments, we introduce an intermediate virtual machine (CVM), which is granulated to exploit argument parallelism through temporal unification. We also give the details of the CVM instruction set. The model is process-based and supports AND- and OR-parallelism in a highly distributed dataflow environment.
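As a toy illustration of one of the parallelism modes mentioned, the sketch below tries alternative clauses for a goal concurrently (OR-parallelism). It is not the CVM or its dataflow model; the clause representation and matching rule are invented for the example.

```python
# Toy OR-parallelism sketch: alternative clauses for the same goal are tried
# concurrently and every successful alternative is collected.
from concurrent.futures import ThreadPoolExecutor

def try_clause(clause, goal):
    """Pretend resolution step: a clause 'solves' the goal if its head matches."""
    head, answer = clause
    return answer if head == goal else None

clauses = [("ancestor", "carol"), ("parent", "alice"), ("parent", "bob")]

def or_parallel_solve(goal, clauses):
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda c: try_clause(c, goal), clauses)
    return [r for r in results if r is not None]

print(or_parallel_solve("parent", clauses))   # ['alice', 'bob']
```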