We investigate synchronization activities in applications executing on distributed-memory MIMD architectures. Three applications are used to quantify the performance impact of synchronization as the number of processors is increased. We also investigate the performance improvement possible when synchronization is supported in hardware. The results show that significant performance improvement can be achieved. The hardware support should include barrier synchronization, operate-and-broadcast, and operations over subsets of processors.
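As a rough software illustration of two of the primitives named above, the sketch below simulates barrier synchronization followed by an operate-and-broadcast step using Python threads. It is not the hardware mechanism studied in the paper: the function name `operate_and_broadcast` and the shared-list layout are assumptions made only for this example, and "operate-and-broadcast" is interpreted here as "combine one value per processor with an operator and give every processor the result".

```python
import operator
import threading
from functools import reduce


def operate_and_broadcast(values, op):
    """Simulate one operate-and-broadcast step (hypothetical helper).

    Each simulated processor contributes values[rank]; after a barrier,
    every processor combines all contributions with `op`, so all of them
    end up holding the same result.
    """
    n = len(values)
    barrier = threading.Barrier(n)  # software stand-in for a hardware barrier
    slots = [None] * n              # one contribution per simulated processor
    results = [None] * n            # value each processor holds afterwards

    def worker(rank):
        slots[rank] = values[rank]          # publish the local value
        barrier.wait()                      # wait until every value is published
        results[rank] = reduce(op, slots)   # combine; identical on every rank

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(operate_and_broadcast([3, 1, 4, 1, 5], max))
    print(operate_and_broadcast([3, 1, 4, 1, 5], operator.add))
```

An operation over a subset of processors could be modelled in the same way by passing only that subset's values, which gives the subset its own barrier.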
The GRIP architecture allows efficient execution of functional programs on a multi-processor built from standard hardware components. State-of-the-art compilation techniques are combined with sophisticated runtime res...
This paper deals with the problem of task allocation, subject to precedence constraints, on multiprocessor architectures with interprocessor communication delays. Two kinds of scheduling are distinguished: the deter...
ISBN: 0818642009 (print)
In this paper, we describe a parallel search-and-learn technique for obtaining high quality solutions to the Traveling Salesperson Problem (TSP). The combinatorial search space is decomposed so that multiple processors can simultaneously look for locally optimal solutions in the subspaces. The local optima are then compared to 'learn' which moves are good: a move is defined to be good if all the search processes have voted in consensus for the move. Based on this learning, the original problem is transformed into a constrained optimization; a constraint requires a specific edge to be included in the final tour. The constrained optimization problem is modelled as a TSP of smaller size, and is again solved using the parallel search technique. This process is repeated until a TSP of manageable size is reached, which can be solved effectively; the tour obtained at this last stage is then expanded retrogressively until the tour for the original problem is obtained. The results of a parallel implementation on a 32-node transputer are described.
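The consensus step described above can be sketched as follows, assuming a "move" corresponds to including an undirected edge in the tour and that each search process reports its local optimum as a list of city indices. The helper names `tour_edges` and `consensus_edges` are hypothetical; the abstract does not specify the authors' data structures or voting procedure.

```python
def tour_edges(tour):
    """Return the set of undirected edges of a closed tour (list of cities)."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def consensus_edges(tours):
    """Return the edges that appear in every tour.

    These are the edges all search processes 'voted' for; in the scheme
    sketched here they would become constraints on the smaller TSP.
    """
    edge_sets = [tour_edges(t) for t in tours]
    return set.intersection(*edge_sets)


if __name__ == "__main__":
    # Three hypothetical local optima found by three search processes.
    local_optima = [
        [0, 1, 2, 3, 4],
        [0, 1, 2, 4, 3],
        [1, 0, 4, 3, 2],
    ]
    for edge in sorted(tuple(sorted(e)) for e in consensus_edges(local_optima)):
        print(edge)
```

Edges returned by `consensus_edges` would be fixed in the final tour, so each chain of fixed edges can be collapsed into a single node when the smaller TSP is built.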
ISBN: 0780308670 (print)
This conference proceedings contains 37 papers. The topics discussed are WSI devices, WSI architecture, multichip modules, the interconnection technologies for multichip assemblies project, massively parallel processing systems, associative string processors, hybrid WSI circuits, WSI applications, WSI design, optical interconnection networks, WSI technology, test and reconfiguration, and power, clock and signal distribution circuits.
ISBN: 9783540572978 (print)
The proceedings contain 26 papers. The special focus in this conference is on Computer Performance Modeling, Measurement and Evaluation. The topics include: parallel simulation; properties and analysis of queueing network models with finite capacities; performance analysis and optimization with the power-series algorithm; multiprocessor and distributed system design; response time distributions in queueing network models; fast simulation of rare events in queueing and reliability models; an introduction to modeling dynamic behavior with time series analysis; issues in trace-driven simulation; maximum entropy analysis of queueing network models; performance modeling using DSPN express; relaxation for massively parallel discrete event simulation; an overview of TES processes and modeling methodology; performance engineering of client-server systems; queueing networks with finite capacities; performance instrumentation techniques for parallel systems; a survey of bottleneck analysis in closed networks of queues; software performance engineering; performance measurement using system monitors; providing quality of service in packet switched networks; dependability and performability analysis; architectures and algorithms for digital multimedia on-demand servers; analysis and control of polling systems; modeling and analysis of transaction processing systems.
ISBN: 0780308670 (print)
MCM technology can be usefully exploited in a number of areas in the design of large scale distributed memory MIMD processors. Within the interconnection technologies for multichip assemblies (ITMA) project, two demonstrators have been selected for different applications within a parallel computer. The first demonstrator circuit is an extended switch element using a multichip module technology. The MCM uses four Elite switch devices connected on a silicon substrate. Flip chip bonding is used. A summary of device characteristics is given. The second demonstrator integrates a RISC CPU chip set along with a communications coprocessor. The resulting module contains all the components required for a MIMD processing node, with the exception of the DRAM and DRAM controller. The two demonstrator circuits are discussed.
The APPLAUSE ESPRIT Project is building major applications using the ElipSys parallel constraint logic programming system developed at ECRC. Two major aims of the project are to advance the state of the art in four co...
Total exchange is the densest parallel communication primitive and poses a severe test for the capability of any parallel architecture. This operation is very important and arises in many applications. We show that a reconfigurable parallel architecture called MICA can perform total exchange efficiently with simple algorithms. The mechanisms and structures of the reconfigurable switches, supporting simple total exchange algorithms, are presented. We introduce the basic operations the network supports, as they are issued by the host to efficiently manage the network reconfiguration. Applications in linear algebra are briefly mentioned, showing how they can benefit from our architecture.
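To make the total exchange primitive concrete: each of p processors holds a distinct message for every processor, and afterwards each processor holds the p messages addressed to it. The sketch below simulates this sequentially with a simple round-based schedule in which round k sends from processor `src` to processor `(src + k) mod p`, so every round is a permutation. This schedule is a common textbook choice and an assumption of this example; it is not claimed to be the algorithm used on MICA, and the function name `total_exchange` is hypothetical.

```python
def total_exchange(outboxes):
    """Simulate total exchange (all-to-all personalized communication).

    outboxes[src][dst] is the message processor `src` holds for `dst`.
    Returns inboxes, where inboxes[dst][src] is that same message after
    delivery.
    """
    p = len(outboxes)
    if any(len(row) != p for row in outboxes):
        raise ValueError("each processor needs exactly one message per processor")

    inboxes = [[None] * p for _ in range(p)]
    for i in range(p):
        inboxes[i][i] = outboxes[i][i]  # the self-addressed block stays local

    for k in range(1, p):               # p - 1 communication rounds
        for src in range(p):            # round k: permutation src -> (src + k) mod p
            dst = (src + k) % p
            inboxes[dst][src] = outboxes[src][dst]
    return inboxes


if __name__ == "__main__":
    p = 4
    outboxes = [[f"{s}->{d}" for d in range(p)] for s in range(p)]
    for dst, inbox in enumerate(total_exchange(outboxes)):
        print(dst, inbox)
```

Viewed as a p-by-p matrix of messages, the operation is a transpose, which is one reason it appears in linear algebra kernels.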
A parallel algorithm is developed which recognizes parity graphs in O(log² n) time using a linear number of processors. This improves previous results of G. Adhar and S. Peng (J. Algorithms, vol. 11, pp. 252-284, 1990) and of T. Przytycka and D. Corneil (J. Algorithms, vol. 12, pp. 96-109, 1991).