ISBN: 3540649522 (print)
Parallel logic programming (PLP) systems have obtained good performance on traditional bus-based shared-memory architectures. However, the scalable multiprocessors being developed today pose new challenges. Our experience with a sophisticated PLP system, Andorra-I, demonstrates that performance indeed suffers greatly on modern architectures. To improve performance, we perform a detailed analysis of the cache behaviour of all Andorra-I data structures via execution-driven simulation of a DASH-like multiprocessor. Based on this analysis, we optimise the Andorra-I code using five different techniques. Our results show that the techniques provide significant performance improvements, leading to the conclusion that PLP systems can and should perform well on modern scalable multiprocessors.
ISBN: 0818685727 (print)
At the MPPOI '96 and '97 conferences, a new way to dynamically control in-flight pulses with a shepherd pulse, enhancing the time alignment of co-propagating pulses in a bit-parallel WDM system for a single-mode fiber, was discussed, and the first experimental evidence that this pulse-shepherding effect can be observed in a commercially available DS (dispersion-shifted) fiber was also presented. Here, we discuss the initial results towards the realization of a multi-km × Gbytes/sec bit-parallel WDM single-fiber link. The distance-speed product of this single-fiber link is several orders of magnitude higher than that of a fiber ribbon link. The design of a 12-channel bit-parallel WDM system operating at 1 Gbit/sec per channel is presented first. Experimental results for a two-channel system operating at that rate are then given. Finally, new computer simulation results are presented and discussed, showing how a large-amplitude shepherd pulse may induce pulse compression on all the co-propagating data pulses, thereby improving the shaping of these pulses for a WDM system.
The nesting (or placement) problem is an NP-hard combinatorial problem with important industrial applications, e.g. in the apparel or footwear industry. This paper describes a hardware infrastructure to accelerate the pro...
ISBN: 0818685727 (print)
We experimentally demonstrate 100-m-long image fiber transmission of four-channel multiplexed two-dimensional signals. To upgrade the system throughput, we study a several-hundred-Gb/s-class 2-D optical parallel data link that uses an image fiber for transmission and 2-D arrays of vertical-cavity surface-emitting laser diodes (VCSELs) and p-i-n photodiodes (PDs) as the transmitter and the receiver, respectively. This system employs space code division multiple access (Space-CDMA) to multiplex 2-D optical parallel signals. To establish a multichannel optical link between the 2-D VCSEL array and the PD array with high alignment precision and good repeatability, we develop a novel visual alignment technique using a micro-optic image fiber coupler, which consists of a miniature cube beamsplitter and graded-index (GRIN) rod lenses. The effectiveness of the visual alignment with the image fiber coupler is experimentally demonstrated. This result will encourage the application of optical Space-CDMA using an image fiber and 2-D arrays of VCSELs and PDs to future high-throughput 2-D parallel data links connecting massively parallel processors.
The structural specification and modeling of time-critical real-time systems has become a major area of recent research. This is particularly relevant for computer music when sound computation is realized invo...
ISBN: 0818691948 (print)
Object-relational database management systems (OR-DBMS) extend the capabilities of relational databases by allowing the definition of new data types and of methods that operate on these data types, while retaining most of the relational model semantics. In this paper we examine issues related to parallel processing of queries in the object-relational model with respect to efficient storage and retrieval of large objects. We extend the concept of collective I/O and other related techniques, such as request merging and data sieving, to the database domain to achieve high performance in the retrieval of large objects. We deal with the I/O optimization problem in the query executor, the access methods, and the low-level runtime system. We also propose a new technique called pooled striping for efficient storage of large objects on multiple disks. The results presented in this paper clearly show the effectiveness of the proposed I/O optimization techniques in handling large amounts of data in a parallel object-relational database system.
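The abstract does not say how request merging and data sieving are realized in the authors' system. As a rough illustration of the two ideas as the terms are commonly used in parallel I/O, the following Python sketch may help. The function names `merge_requests` and `sieve_read`, the `max_gap` parameter, and the `(offset, length)` request format are assumptions made for this example, not part of the paper.

```python
import io


def merge_requests(requests, max_gap=0):
    """Request merging: coalesce (offset, length) byte ranges that overlap,
    touch, or are separated by at most max_gap bytes (hypothetical parameter)."""
    merged = []
    for off, length in sorted(requests):
        if merged and off <= merged[-1][0] + merged[-1][1] + max_gap:
            last_off, last_len = merged[-1]
            # Extend the previous range so that it also covers this request.
            merged[-1] = (last_off, max(last_len, off + length - last_off))
        else:
            merged.append((off, length))
    return merged


def sieve_read(f, requests):
    """Data sieving: issue one large read spanning all requests, then
    extract the requested pieces from the in-memory buffer."""
    if not requests:
        return []
    start = min(off for off, _ in requests)
    end = max(off + length for off, length in requests)
    f.seek(start)
    buf = f.read(end - start)
    return [buf[off - start: off - start + length] for off, length in requests]


if __name__ == "__main__":
    data = io.BytesIO(bytes(range(32)))  # stand-in for a large object on disk
    print(merge_requests([(0, 4), (4, 4), (20, 4)]))
    print(sieve_read(data, [(2, 3), (10, 2)]))
```

Both functions trade extra bytes read for fewer I/O calls: merging turns many small requests into a few larger ones, and sieving reads the gaps between requests as well and discards them in memory.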
The paper deals with a prototype digital hardware architecture implementing a "Cellular Fuzzy Processing (CFP)" array. The powerful collective behavior of such systems derives basically from the ability to s...
ISBN: 0818691948 (print)
Several new number representations based on a Residue Number System are presented which use the smallest prime numbers as moduli and are suited for parallel computations on a reconfigurable mesh architecture. It is shown how to convert in O(1) time any integer ranging between 0 and n-1 from any commonly used representation to any new representation proposed in this paper (and vice versa) using an n × O(log^2 n / log log n) reconfigurable mesh. In particular, some of the previously known conversion techniques are improved. Moreover, as a byproduct, it is shown how to compute in O(1) time the Prefix Sums of n bits, improving previously known results. Applications to the Summation and Prefix Sums of N h-bit integers are also considered. The Summation and the Prefix Sums can be computed in O(1) time using O(h log N + log^2 N / log log N) × Nh and O(h^2 + log^2 N / log(h + log N)) × O(N(h + log N)) reconfigurable meshes, respectively, improving all previously known results for most values of h including, for instance, h = O(log N).
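To make the underlying representation concrete, the following Python sketch shows a plain sequential Residue Number System that uses the smallest primes as moduli, with conversion back via the Chinese Remainder Theorem. It is only an illustration of the number representation: it is not the constant-time reconfigurable-mesh algorithm of the paper, and the function names are assumptions chosen for this example. It relies on `pow(a, -1, m)` for modular inverses, which requires Python 3.8 or later.

```python
from math import prod


def smallest_primes_covering(n):
    """Return the smallest primes whose product is at least n, so that
    every integer in 0..n-1 has a unique residue vector."""
    primes, candidate, product = [], 2, 1
    while product < n:
        # Trial division by the primes found so far.
        if all(candidate % p for p in primes):
            primes.append(candidate)
            product *= candidate
        candidate += 1
    return primes


def to_rns(x, moduli):
    """Convert an integer to its vector of residues."""
    return [x % m for m in moduli]


def from_rns(residues, moduli):
    """Reconstruct the integer from its residues (Chinese Remainder Theorem)."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the inverse of Mi modulo m; it exists because
        # the moduli are distinct primes.
        total += r * Mi * pow(Mi, -1, m)
    return total % M


if __name__ == "__main__":
    moduli = smallest_primes_covering(1000)
    residues = to_rns(123, moduli)
    print(moduli, residues, from_rns(residues, moduli))
```

Because the residues are independent of one another, addition and multiplication can be carried out separately per modulus, which is what makes such representations attractive for parallel hardware.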
ISBN: 0818691948 (print)
This paper introduces an optimized methodology for mapping instruction sequences (ISs) onto EPOM-processor arrays. The new features of this mapping methodology result from a systematic specification and exploitation of both instruction-level and processor-level parallelism: the ultra-low granularity of ISs requires the allocation and scheduling of individual instructions onto the given processor array. Moreover, this mapping methodology is complete in the sense that it considers both array bus-bandwidth and processor resource constraints. The mapping methodology is based on two concepts. 1. Instruction sequences (ISs): they represent a generalized form of directed cyclic graphs (DCGs) and allow algorithm parallelism to be specified efficiently. Graph nodes represent instructions from the instruction set of a target processor architecture [the96a] [the97a]. 2. The EPOM-processor architecture: it represents an optimized target VLIW-processor architecture (in terms of cost and performance) for the parallel implementation of ISs [the96a] and is especially suited for parallel image/multimedia processing [the95]. In this paper, special attention is paid to the optimization of the mapping process of ISs onto EPOM-processor arrays. Minimization of algorithm execution time is used as the optimization goal. The mapping methodology is partially based on integer linear programming and heuristic techniques. The solution time complexity is substantially reduced by developing a two-phase hierarchical model, decoupling processor-array allocation from subsequent scheduling. The efficiency of this mapping methodology has been validated through experimental results on ISs of well-known algorithm routines.