PVM, a message-passing software system for parallel processing, is used on a wide variety of processor platforms, but this portability restricts execution speed. The work here will address this problem mainly in the context of Ethernet-based systems, proposing two PVM enhancements for such systems. The first enhancement exploits the fact that an Ethernet has broadcast capability. Since unenhanced PVM must, to keep portability, avoid using broadcast, execution speed is sacrificed. In addition, the larger the system, the larger the sacrifice in speed. A solution to this problem is presented. The second enhancement is intended for use in applications in which many concurrent tasks finish at the same time, and thus simultaneously try to transmit to a master process. On an Ethernet, this produces excessively long random backoffs, reducing program speed. An enhancement, termed 'programmed backoff,' is proposed.
The RACE(R) parallel computer system provides a high-performance parallel interconnection network at low cost. This paper describes the architecture and implementation of the RACE system, a parallel computer for embedded applications. The topology of the network, which is constructed with 6-port switches, can be specified by the customer, and is typically a fat-tree, a Clos network, or a mesh. The network employs a preemptable circuit-switched strategy. The network and the processor-network interface work together to provide high performance: transfer rates of 160 megabytes per second with about 1 microsecond of latency. Priorities can be used to guarantee tight real-time constraints of a few microseconds through a congested network. A self-regulating circuit adjusts the impedance and output delay of the pin-driver pads.
Author: Arndt Bode, Institut für Informatik, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München, D-80290 München, Germany
This article covers research at Technische Universität München on distributed and parallel architectures and applications. First, an overview on the parallel processing research organization is given. The se...
The problem of bicriterion scheduling of jobs with identical processing times by uniform processors is considered. The first criterion is the minimization of either total or maximum costs, the second one is the minimization of maximum cost with different cost functions. Polynomial time algorithms are presented to determine all efficient solutions and the optimal solution for a given global criterion.
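As background on the term "efficient solutions": in bicriterion minimization, a solution is efficient (Pareto-optimal) when no other solution is at least as good on both criteria and different in value. The sketch below only filters a given list of criterion-value pairs; it is not the paper's polynomial-time algorithm, and the function name and sample values are hypothetical.

```python
from typing import List, Tuple

# (value of first criterion, value of second criterion); both are minimized.
Point = Tuple[float, float]


def efficient_solutions(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by the first criterion."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting, every earlier point is no worse on the first criterion,
    # so a point is efficient only if it strictly improves the second one.
    for p in sorted(set(points)):
        if p[1] < best_second:
            front.append(p)
            best_second = p[1]
    return front


# Hypothetical (total cost, maximum cost) pairs for candidate schedules.
candidates = [(3, 5), (4, 4), (4, 6), (5, 4), (6, 1), (3, 5)]
front = efficient_solutions(candidates)
```

Sorting first keeps the filter at O(n log n), which is enough when the candidate schedules have already been enumerated.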
Concurrency control based on conventional techniques requires additional effort for deadlock detection and elimination. The possibility of deadlock also introduces delays and repeated restarts of transactions caught in deadlock cycles. The proposed approach studies a technique for generating data flow precedence graphs among transactions at data sites. The local access graph approach is fully distributed: through local computations, it can prevent deadlocks in a distributed system.
ISBN (print): 0818671955
This paper proposes a distributed dynamic processor sharing scheme in torus-connected multicomputer systems. It is applicable to database query and on-line transaction processing applications. In such a system, each processor can process small transaction tasks locally and support parallel execution of large transaction tasks in a timesharing fashion. Distributed management of processors is achieved by our scheme based on the distributed submesh table (DST), which describes how processors are clustered and how many time slices each cluster can provide.
&ACE is a high-performance parallel Prolog system developed at the Laboratory for Logic, Databases, and Advanced Programming that exploits and-parallelism from Prolog programs. &ACE was developed to exploit MIMD parallelism. However, SPMD parallelism also arises naturally in many Prolog programs. In this paper we develop runtime techniques that allow systems that have primarily been designed to exploit MIMD parallelism (such as &ACE) to also efficiently exploit SPMD parallelism. These runtime techniques have been incorporated in the &ACE system. Performance of &ACE augmented with these techniques on programs containing SPMD parallelism is presented.
Although managing multiple copies of a database has been the subject of intensive research for quite some time now, it has yet to fulfill its promise in practical applications. In the current state of distributed database technology, data replication, if implemented at all, is typically enforced by the read-one-write-all protocol. More complicated but less restrictive replica control protocols, though a popular topic in research, are not implemented in any widely used systems. A major reason for this lack of acceptance is that the performance impact of these protocols cannot be easily quantified, since very few performance figures for replicated commercial database systems are available. This paper investigates the overhead of data replication, and compares the performance of a seldom-implemented protocol, quorum consensus, with the performance of a widely accepted protocol, primary copy.
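For readers unfamiliar with quorum consensus, the usual textbook conditions can be sketched as follows. This is a minimal illustration assuming one vote per copy; it is not taken from the paper, and the function name is hypothetical. A read quorum must overlap every write quorum, and any two write quorums must overlap each other.

```python
def quorums_are_valid(n_copies: int, read_quorum: int, write_quorum: int) -> bool:
    """Check the standard quorum intersection conditions (one vote per copy)."""
    # Every read quorum intersects every write quorum, so a read always
    # reaches at least one copy holding the most recent write.
    reads_see_latest_write = read_quorum + write_quorum > n_copies
    # Any two write quorums intersect, so conflicting writes cannot both
    # proceed on disjoint sets of copies.
    writes_overlap = 2 * write_quorum > n_copies
    return reads_see_latest_write and writes_overlap


# Majority quorums on five copies satisfy both conditions.
majority_ok = quorums_are_valid(5, 3, 3)
# Read-one-write-all is the special case r = 1, w = n.
rowa_ok = quorums_are_valid(5, 1, 5)
# A read quorum of 2 with a write quorum of 3 can miss the latest write.
too_small = quorums_are_valid(5, 2, 3)
```

Under this assumption, lowering the write quorum below n keeps writes available when some copies are down, at the cost of reading more than one copy.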
Author: A. Bode, Institut für Informatik, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München, München, Germany
This article covers research at Technische Universität München on distributed and parallel architectures and applications. First, an overview of the parallel processing research organization is given. The second main topic covers TOPSYS, an integrated hierarchical programming environment for parallel and distributed systems developed as part of the research grant.
This paper presents the control architecture and synchronization characteristics of an industrial application. The control procedure is implemented with a loosely coupled distributed real-time system, where parallel processing is possible. The network has five nodes: one master actuator, three slave actuators, and a machine controller. All nodes are implemented using Motorola's 68332 controller. The position and velocity of the master are transmitted as a command to the slaves. The network protocol used is CAN (Controller Area Network). On-line correction and synchronization are done through a serial-based network. The paper introduces the synchronization methods, the characteristics of CAN, the control architecture, and the electronics used.