ISBN: 0780320182 (print)
A systematic method for mapping single-assignment algorithms onto systolic arrays is presented. The method is based on a space-time mapping of the index sets. We present a method for generating and selecting a valid transform dependency matrix that yields an optimal or near-optimal systolic array once it is mapped. The proposed method increases the visibility of the architecture at the algorithmic level, in terms of processor delay and communication between processors, so that the designer can select a desired array at an early stage of the design. An example of the proposed method is given.
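As a rough illustration of what a space-time mapping of an index set looks like, the sketch below applies the textbook formulation to the three-deep loop nest of matrix multiplication. It is not the generation and selection procedure of the paper, which the abstract does not describe. The dependence vectors, the schedule vector `SCHEDULE`, the allocation matrix `ALLOCATION`, and all function names are assumptions chosen for the example. A mapping is treated as valid when every dependence is scheduled strictly later than its source and the combined transform is nonsingular.

```python
from itertools import product


def dot(u, v):
    """Inner product of two integer vectors."""
    return sum(a * b for a, b in zip(u, v))


def det3(m):
    """Determinant of a 3x3 integer matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_valid_mapping(schedule, allocation, dependencies):
    """Check causality (schedule . d > 0 for every dependence d) and
    nonsingularity of the transform formed by stacking schedule on allocation."""
    causal = all(dot(schedule, d) > 0 for d in dependencies)
    nonsingular = det3([schedule, *allocation]) != 0
    return causal and nonsingular


def map_index_set(schedule, allocation, n):
    """Map every index point (i, j, k) in an n x n x n cube to (time, processor)."""
    mapping = {}
    for point in product(range(n), repeat=3):
        time = dot(schedule, point)
        processor = tuple(dot(row, point) for row in allocation)
        mapping[point] = (time, processor)
    return mapping


# Assumed example: uniform dependencies of matrix multiplication.
DEPENDENCIES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
SCHEDULE = (1, 1, 1)                    # time = i + j + k
ALLOCATION = [(1, 0, 0), (0, 1, 0)]     # processor = (i, j)

if __name__ == "__main__":
    print(is_valid_mapping(SCHEDULE, ALLOCATION, DEPENDENCIES))
    mapping = map_index_set(SCHEDULE, ALLOCATION, 3)
    print(len({pe for _, pe in mapping.values()}), "processors")
```

With this kind of check in hand, a designer could enumerate candidate schedule and allocation pairs and compare processor count and total execution time before committing to an array, which is the sort of early visibility the abstract refers to.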
The key to simplifying the development and execution of large and complex distributed systems is providing an adequate development environment and runtime facilities for supporting cooperation, reliability and dynamic adaptation. This paper focuses on analytical tools that assist in the development of complex systems. The environment allows scalable specification of complex application behavior using mechanisms for abstracting group behavior and hierarchical composition of components. Behavior specifications of applications can be analyzed automatically for two classes of problems: (1) reachability and liveness, and (2) consistency during failure recovery and dynamic adaptation. Consistency is preserved by automatically analyzing dependencies from the behavior specification. This approach permits components of a complex system to cooperate in complex ways and to execute continuously for a long time with minimal disruption, despite failure or adaptation of some components.
This paper introduces the Doubly-Linked List (DLL) Protocol for Distributed Shared Memory (DSM) multiprocessor systems. The protocol makes use of two linked lists to keep track of valid copies of pages in the system, thus eliminating the use of copy-sets. Simulation studies show that the DLL protocol achieves considerable speed-up for common mathematical problems, including a linear equations solver and a matrix multiplier. A performance improvement of up to 51.9% over the Dynamic Distributed Manager algorithm is obtained. Further improvements and possible modifications of the protocol are also discussed.
ISBN: 0780320182 (print)
Distributed computer systems from time to time experience uneven loads on different resources. Dynamic load balancing aims to identify instances of uneven load and take appropriate action to restore the balance. This paper presents our experience in implementing a load balancing facility on the Amoeba system, which allows us to carry out a series of experiments with various algorithms. The results of a preliminary study of different load balancing algorithms are also presented. These results indicate that load balancing has a great impact on system performance: it reduces not only the average response time of processes but also the variation in response time. A comparison of these algorithms under various conditions is included, which indicates that with tens of computers in a system, a centralized algorithm outperforms a distributed one. The results further indicate that job initiation is an important part of a load balancing facility.
ISBN: 0780320182 (print)
A new loop scheduling scheme called multithreaded self-scheduling (MSS) for distributed shared memory multiprocessors is proposed. Based on the principles of multithreading, MSS attempts to hide remote memory access latencies by switching between multiple thread contexts. Consequently, loops scheduled using MSS can obtain better performance than with single-threaded approaches. In this paper, a series of simulation results corresponding to various parameter changes is presented, which provides a measure of the effectiveness of MSS under different boundary conditions and suggests ways for further improvement.
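For readers unfamiliar with loop self-scheduling, the following is a minimal sketch of the general idea only: idle workers repeatedly claim the next chunk of iterations from a shared counter until the loop is exhausted. It does not model MSS itself, whose context-switching mechanism for hiding remote memory latency is not detailed in the abstract. The function name `self_schedule` and its parameters `n_workers` and `chunk` are hypothetical.

```python
import threading


def self_schedule(n_iterations, n_workers, body, chunk=1):
    """Run body(i) for i in range(n_iterations) using plain self-scheduling.

    Each worker thread claims the next `chunk` iterations from a shared
    counter under a lock, so faster workers naturally take more work.
    """
    next_index = 0
    lock = threading.Lock()

    def worker():
        nonlocal next_index
        while True:
            # Claim a chunk atomically; the loop body runs outside the lock.
            with lock:
                start = next_index
                if start >= n_iterations:
                    return
                next_index = min(start + chunk, n_iterations)
                end = next_index
            for i in range(start, end):
                body(i)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    squares = [0] * 10

    def body(i):
        squares[i] = i * i  # each iteration writes only its own slot

    self_schedule(len(squares), n_workers=3, body=body, chunk=2)
    print(squares)
```

The chunk size trades scheduling overhead against load balance: larger chunks mean fewer trips to the shared counter but a coarser distribution of work.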
ISBN: 0780320182 (print)
Distributed computer systems for real-time control require a global timebase with high precision. A small time skew between local clocks in the system is required to obtain good control performance through well synchronised task execution, and it also provides a basis for efficient communication. In distributed safety-critical applications, clocks have traditionally been synchronised with fault-tolerant clock synchronisation algorithms. With these methods, a limited number of erroneous clock readings are allowed in each adjustment. On the other hand, readings from all clocks in the system are required before an adjustment can be made. In this paper an alternative approach, the Daisy Chain method, is proposed and compared with present solutions. Daisy Chain synchronisation does not allow erroneous clock readings, but methods of avoiding them are described. Due to its simplicity, the method can be implemented with little hardware. Low-precision frequency sources are sufficient, and recovery after arbitrary failures is fast because no special start-up phase is required. The paper also discusses the effects of quantisation uncertainty and transmission delay, and outlines the implementation of a global timebase in an embedded distributed real-time architecture.
Conformance testing of communication protocols has recently become a major issue in the context of OSI-based standardization of protocols. The aim of conformance testing is to assure that a protocol fulfills an OSI specification. In this paper, a performance study is presented for a distributed protocol test system that has been installed for conformance testing of the ISDN D-channel signalling protocol. Using a general approach for performance measurement and evaluation in distributed systems, a queueing model is developed and evaluated, based on runtimes obtained from measurements of the test system. It is demonstrated that significant performance improvements can be achieved once the process scheduling strategy at the ISDN protocol testers is properly adjusted.
Some problems are very difficult to solve by mathematical programming approaches. A genetic algorithm (GA) is an extremely powerful optimization technique that could be used to solve such problems, but its efficiency depends on its ability to perform a large number of evaluations in a reasonable amount of time. A classical GA contains three basic operators: reproduction, crossover, and mutation. To increase the efficiency of a genetic algorithm, the influence of migration in a multilevel distributed GA (MDGA) was tested. Several different structures of PCs connected in a local area network (LAN) were used for the MDGAs. MDGAs use the power of the computers better than one-level distributed GAs. The problem of communication between the computers in the MDGAs was dealt with in two different ways: with files on a server or by sending packets.
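To make the three classical operators and the notion of migration concrete, here is a small single-process sketch of a distributed GA in the island style. It is an illustration under stated assumptions, not the MDGA of the paper: the multilevel structure and the file or packet communication over a LAN are not modelled. The OneMax fitness function, the ring migration topology, the elitism, and every name and parameter value are assumptions chosen for the example.

```python
import random


def fitness(bits):
    """Assumed toy objective (OneMax): the number of 1 bits."""
    return sum(bits)


def step(population, rng, mutation_rate=0.02):
    """One generation: reproduction (tournament), crossover, mutation."""
    def pick():
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    # Elitism (an assumption): carry the best individual over unchanged.
    children = [max(population, key=fitness)[:]]
    while len(children) < len(population):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))          # one-point crossover
        child = p1[:cut] + p2[cut:]
        child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
        children.append(child)
    return children


def migrate(islands):
    """Ring migration: each island's worst member is replaced by a copy of
    the best member of the previous island in the ring."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for k, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
        pop[worst] = bests[k - 1]


def run(n_islands=4, pop_size=20, n_bits=32, generations=40, interval=5, seed=0):
    """Evolve the islands and return the best fitness after each generation."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    history = []
    for generation in range(1, generations + 1):
        islands = [step(pop, rng) for pop in islands]
        if generation % interval == 0:
            migrate(islands)
        history.append(max(fitness(ind) for pop in islands for ind in pop))
    return history


if __name__ == "__main__":
    print(run()[-1])
```

In a real deployment each island would run on its own machine and `migrate` would be replaced by an exchange over the network, which is where the choice between shared files and packets mentioned in the abstract would come in.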
In this paper, I analyze the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things, I show that an $N$-node butterfly containing $N^{1-\varepsilon}$ worst-case faults (for any constant $\varepsilon > 0$) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proven for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance.
New developments in the design and implementation of a distributed, real-time image processing system are presented in this work. The system uses an IBM personal computer as the front end to a remote computer via the Internet. The standard TCP/IP networking protocols are utilised to link the IBM PC to high-performance remote devices such as transputer networks and supercomputers. Access to the powerful remote computers enables the system to complete complex image processing tasks in real time. During processing, the image is transferred to the remote machine and then transferred back to the PC for display. The system serves as a prototype for a full-featured image processing and analysis package, as well as a programming platform for the research and development of new image processing algorithms.