ISBN: 0818620471 (print)
The issue of I/O device access in HARTS (Hexagonal Architecture for Real-Time Systems), a distributed real-time computer system under construction at the University of Michigan, is explicitly addressed. Several candidate solutions are introduced, explored, and evaluated according to cost, complexity, reliability, and performance: (1) 'node-direct' distribution with the intranode bus and a local I/O bus; (2) use of dedicated I/O nodes, which are placed in the hexagonal mesh as regular application nodes but provide I/O services rather than computing services; and (3) use of a separate I/O network, which has led to the proposal of an 'interlaced' I/O network. The interlaced I/O network is intended to provide both high performance, without burdening node processors with I/O overhead, and a high degree of reliability. Both static and dynamic multiownership protocols are developed for managing I/O device access in this I/O network. The relative merits of the two protocols are explored, and the performance and accessibility that each provides are simulated.
A scientific parallel processor called the R256 has been developed. The R256 is composed of 16×16 processing elements, and has the outstanding features of a "distributed parallel network" as well as IEEE 80-bit extended floating-point computation ability. The computational accuracy required by the very large number of iterations in scientific computations is provided by a dedicated 80-bit VLSI processor, which was developed for the R256. The innovative distributed parallel network was designed to effectively resolve the heavy communication problems found in applications based on the Monte Carlo simulation technique. The R256 network is very economical, at a hardware cost of √N-fold (16-fold in this case) relative to that of an ideal full-crossbar switch, while keeping transfer rates comparable to those of an ideal switch. The R256 demonstrates high performance, with 2-GB/s data transfer rates and 500-MFLOPS computation rates on a semiconductor device simulation application.
ISBN: 081860719X (print)
Recent work in microarchitecture has identified a new model of execution, restricted data flow, in which data-flow techniques are used to coordinate out-of-order execution of sequential instruction streams. It is believed that the restricted-data-flow model has great potential for implementing high-performance computing engines. A minimal-functionality variant of the model, called HPSm, is defined. The instruction set, data path, timing, and control of HPSm are described. A simulator of HPSm has been written, and some of the Berkeley RISC benchmarks have been executed on the simulator. Measurements obtained from these benchmarks, along with measurements obtained for the Berkeley RISC II, are reported.
ISBN: 0818607033 (print)
FTCX, a fault-tolerant computer architecture, is an experimental computer architecture intended to serve as a general-purpose real-time computing system for fault-sensitive supervisory and control applications. FTCX uses tightly synchronous triplex computation in its core to detect and mask all first faults. Synchronization, fault detection, and fault correction are all performed in hardware. Novel to this architecture are the means by which interrupt requests and data are exchanged between the simplex local or remote industry-standard bus (VMEbus) environments and the triplexed core environment. These exchanges are software transparent, yet fully implement all of the algorithms necessary to maintain data consistency and synchronization in the three channels of the core, even in the face of Byzantine faults.
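The idea of masking a first fault with triplex computation can be illustrated with a generic 2-of-3 majority vote. The sketch below is only an illustration of that general technique: the abstract states that FTCX performs voting and correction in hardware and does not describe its voter, so the function names `majority_vote` and `faulty_channels` and the choice of a bitwise vote over integer words are assumptions made here.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three channel outputs (hypothetical helper).

    Each result bit takes the value held by at least two of the three inputs,
    so a single faulty channel cannot change the voted word.
    """
    return (a & b) | (a & c) | (b & c)


def faulty_channels(a: int, b: int, c: int) -> list[int]:
    """Return the indices (0, 1, 2) of channels that disagree with the vote."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


if __name__ == "__main__":
    # Channel 2 delivers a corrupted word; the vote masks it and flags it.
    print(bin(majority_vote(0b1010, 0b1010, 0b0110)))
    print(faulty_channels(0b1010, 0b1010, 0b0110))
```

A plain majority vote of this kind covers only a single faulty channel. Handling Byzantine faults on data entering from the simplex side requires the additional exchange algorithms that the abstract mentions, which are not modeled here.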
Authors: Abu-Sufah, Walid; Kwok, Alex Y.
Affiliation: Univ of Illinois, Cent for Supercomputer Research & Development, Urbana, IL, USA
ISBN: 0818606347 (print)
The development of performance prediction tools for high-speed machine organizations has been recognized as a key problem facing the research community in parallel computing. A survey of the tools that have been developed for performance prediction of the Cedar multiprocessor supercomputer of the University of Illinois is presented. The system is deterministic, modular, and automatic. The hierarchical organization of the system provides the user with the ability to choose from a set of alternatives for predicting performance at different levels of accuracy and cost. Using 22 programs, the performance degradation due to conflicts in the shared memory, delay in the Cedar interconnection network, and synchronization overhead is measured. The results confirm that the architecture of Cedar is balanced. The performance of the Cedar interconnection network is very close to that of a crossbar. Synchronization overhead and shared memory conflicts could degrade performance considerably for some programs.
X-NODE is a single-chip VLSI processor to be realized in the mid-1980s and to be used as a building block for a tree-structured multiprocessor system (X-TREE). Three major trends influence the design of this proc...
The automatic coordination of instruction execution of SISD processors is examined in the context of minimizing the effects of branch execution. Three areas, instruction prefetch, branch resolution, and issuer organiz...
The paper describes the results of simulation experiments of a tree-organized multicomputer now being constructed in the Department of Computer Science at Stony Brook. First the structure of the multicomputer is intro...
Recent trends in computer technology and design favor the development of computer systems with low levels of multiprogramming. Performance measures such as throughput can be improved under these circumstances by overl...