ISBN: 0818626720 (print)
Petri nets are a versatile tool for modeling and analyzing parallel and distributed computing systems. However, state explosion is a major impediment to their analysis and practical application. To cope with this problem, this paper proposes a method for constructing a hierarchically organized state space (HOSS) of a bounded Petri net. Using the HOSS, we obtain necessary and sufficient conditions for reachability and deadlock, together with algorithms to test whether a given state (marking) is reachable from the initial state and whether there is a deadlock state (a state with no successor states).
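For orientation, the two questions the abstract poses (is a marking reachable, and does a deadlock state exist) can be illustrated with a plain breadth-first enumeration of the flat state space. This is not the HOSS construction, whose details the abstract does not give; it is the naive baseline that suffers from state explosion. The net encoding (markings as tuples, each transition as a pair of `pre` and `post` dictionaries) and the function names are assumptions made for this sketch, and the search terminates only because the net is assumed to be bounded.

```python
from collections import deque


def enabled(marking, pre):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[place] >= count for place, count in pre.items())


def fire(marking, pre, post):
    """Consume tokens from input places and produce tokens in output places."""
    new_marking = list(marking)
    for place, count in pre.items():
        new_marking[place] -= count
    for place, count in post.items():
        new_marking[place] += count
    return tuple(new_marking)


def explore(initial, transitions):
    """Enumerate all reachable markings and collect those with no successor."""
    seen = {initial}
    deadlocks = set()
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        successors = [
            fire(marking, pre, post)
            for pre, post in transitions
            if enabled(marking, pre)
        ]
        if not successors:
            deadlocks.add(marking)  # a state with no successor states
        for successor in successors:
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen, deadlocks


# Hypothetical toy net with places p0, p1, p2:
# t0 moves a token from p0 to p1, t1 moves a token from p1 to p2.
transitions = [({0: 1}, {1: 1}), ({1: 1}, {2: 1})]
reachable, deadlocks = explore((1, 0, 0), transitions)
print((0, 0, 1) in reachable)  # reachability test for one marking
print(deadlocks)               # markings with no enabled transition
```

The set `seen` grows with the number of reachable markings, which is the cost that a hierarchical organization of the state space is meant to reduce.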
ISBN: 0818626720 (print)
Nonreplicated shared data of distributed applications is optimally allocated to pre-specified multilevel memory partitions at the sites of a heterogeneous multicomputer network to minimize a weighted combination of systemwide mean time delay and mean communication cost per access request. Greedy and fast optimization algorithms are presented for nonqueueing lightly loaded as well as heavily loaded multiqueue system models with channel, I/O, and memory hierarchy queues. Extensions to data exhibiting nonuniform access demand rates and distinct query and update statistics are presented.
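As a rough illustration of what a greedy allocation under a weighted delay-and-cost objective can look like, the sketch below places each data item in the cheapest memory partition that still has room. It is not the paper's algorithm: the abstract does not specify it, and this simple heuristic is not guaranteed to be optimal. The per-partition `delay` and `comm_cost` figures, the weight `alpha`, the item tuples, and the function name `greedy_allocate` are all hypothetical, and queueing effects are ignored (closest to the nonqueueing, lightly loaded case).

```python
def greedy_allocate(items, partitions, alpha=0.5):
    """Assign each item to the feasible partition with the lowest weighted cost.

    items:      list of (name, size, access_rate) tuples (hypothetical format)
    partitions: dict name -> {"capacity": ..., "delay": ..., "comm_cost": ...}
    alpha:      weight on mean delay; (1 - alpha) weights communication cost
    """
    free = {name: spec["capacity"] for name, spec in partitions.items()}

    def weighted_cost(partition):
        spec = partitions[partition]
        return alpha * spec["delay"] + (1 - alpha) * spec["comm_cost"]

    placement = {}
    # Place the most frequently accessed items first so they get the best slots.
    for name, size, _rate in sorted(items, key=lambda item: item[2], reverse=True):
        candidates = [p for p in partitions if free[p] >= size]
        if not candidates:
            raise ValueError(f"no partition can hold {name}")
        best = min(candidates, key=weighted_cost)
        placement[name] = best
        free[best] -= size
    return placement


# Hypothetical two-level example: a small fast partition and a large slow one.
partitions = {
    "fast": {"capacity": 1, "delay": 1.0, "comm_cost": 1.0},
    "slow": {"capacity": 10, "delay": 5.0, "comm_cost": 2.0},
}
items = [("a", 1, 10.0), ("b", 1, 1.0)]
placement = greedy_allocate(items, partitions)
print(placement)
```

Ordering by access rate is one simple way to reflect nonuniform access demand; handling distinct query and update statistics would need separate cost terms, which this sketch omits.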
ISBN: 0818626720 (print)
The Vesicular Dataflow (VDF) model is presented in the paper. The VDF model has been formulated to introduce a way of storing and retrieving information and hence to reduce the main drawback of the basic dataflow (DF) model. Tokens can be stored in vesicles in the VDF model and then distributed in a non-deterministic way. State-dependent computations and global variables can be expressed in the dataflow manner. An informal definition of the VDF model and some simple applications are covered in the paper.
ISBN: 0818626720 (print)
Aroma simplifies the task of parallelizing large applications on multicomputers by providing applications with a shared object space. Aroma supports both traditional monolithic objects and aggregate objects that can be partitioned across multiple nodes. Aggregate objects support data parallelism efficiently. An Aroma program consists of tasks that operate on shared objects. Tasks typically execute on the node on which their input data is located, thus minimizing communication. Shared data objects have synchronization properties associated with them, making it possible to parallelize a large class of applications without using explicit locks and condition variables. In this paper, we present and justify the Aroma language features and give examples of Aroma programs. Aroma has been implemented on the Nectar multicomputer, and we give performance results for several applications.
A method is described for remotely sensed image segmentation based on the integration of edge detection and region growing approaches. The method is designed for full exploitation of parallel processing. The initial e...
A prototype of a machine that uses a parallel architecture and that was designed for a synthetic aperture radar (SAR) precision two-dimensional processing algorithm is discussed. The architecture allows a drastic reduction of the processing time involved in remote sensing applications while preserving the accuracy and flexibility of the processing. Experiments on the prototype are presented and discussed.
ISBN: 9780897913195 (print)
A scientific parallel processor called the R256 has been developed. The R256 is composed of 16 × 16 processing elements and has the outstanding features of a distributed parallel network as well as an IEEE 80-bit extended floating-point computation ability. The computation accuracy, required by an exhaustive number of iterations in scientific computations, is provided by the dedicated 80-bit VLSI processor, which was developed for the R256. The innovative distributed parallel network was designed to resolve heavy communication problems, which are found in applications based on the Monte Carlo simulation technique. The R256 network was very economical at a hardware cost of √N-fold (16-fold in this case) that of an ideal full-crossbar switch, while the rates were kept comparable to that of an ideal switch. The R256 demonstrates 2-Gb/s data transfer rates and 500-MFLOPS (million floating-point operations per second) computation rates on a semiconductor-device-simulation application.
A scientific parallel processor called the R256 has been developed. The R256 is composed of 16 × 16 processing elements and has the outstanding features of a "distributed parallel network" as well as an IEEE 80-bit extended floating-point computation ability. The computation accuracy, required by an exhaustive number of iterations in scientific computations, is provided by the dedicated 80-bit VLSI processor, which was developed here for the R256. The innovative distributed parallel network was designed to effectively resolve heavy communication problems, which are found in applications based on the Monte Carlo simulation technique. The R256 network was very economical at a hardware cost of √N-fold (16-fold in this case) that of an ideal full-crossbar switch, while keeping the rates comparable to that of an ideal switch. The R256 demonstrates high performance of 2-GB/s data transfer rates and 500-MFLOPS computation rates on a semiconductor device simulation application.
ISBN: 081860719X (print)
The authors proposed a computer with low-level parallelism as one of the basic computer architectures and built a large-scale experimental system called QA-2. Low-level parallelism means that a long-word instruction simultaneously controls many ALUs, buses, registers, and memories in a fine-grained parallel mode. The QA-2 uses a 256-bit instruction by which four different ALU operations, four memory accesses to different or contiguous locations, and one powerful sequence control are all specified and performed in parallel. If many simultaneously executable operations are detected and embedded in one instruction at compile time, this type of computer can provide a high degree of performance for a wide variety of applications. The architectural benefits and limitations of low-level parallelism in performing 3-D color image generation and interpreting Prolog/Lisp programs are described. The hardware organization with four ALUs, which has actually been implemented in the QA-2, is verified to be adequate. Nearly three out of four ALUs can work in parallel. Any architecture with more than four ALUs cannot achieve a significant degree of performance enhancement.
Multicomputer systems with distributed control form an architecture that simultaneously satisfies such design goals as high performance through parallel operation of VLSI processors, modular extensibility, fault toler...