A time-driven, flit-based, wormhole-routed, parallel processor network simulator has been designed in C with a user-friendly Graphical User Interface (GUI). To accommodate the unique requirements of real-time networks,...
A candidate network function is accurately defined for real-time parallel computers. A concise, time-driven, flit-based, priority-driven, wormhole-routed network simulator has been designed. Experiments monitor latency and throughput as different network parameters are varied. Initially, the destination address, message length, and message priority are generated randomly with a uniform distribution. Then, various non-uniformities are introduced to mimic realistic applications. Results are plotted and analyzed.
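The uniform traffic generation described in this abstract could be sketched as follows. This is a minimal illustration, not the authors' simulator: the `Message` struct, the function names, the parameter names, and the choice to exclude the source node from the destination draw are all assumptions introduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical message descriptor; field names are assumptions. */
typedef struct {
    int src;           /* source node index */
    int dst;           /* destination node index */
    int length_flits;  /* message length in flits */
    int priority;      /* 0 .. num_priorities - 1 */
} Message;

/* Integer in [lo, hi] drawn with rand(); the slight modulo bias
   is ignored in this sketch. */
static int uniform_int(int lo, int hi) {
    assert(lo <= hi);
    return lo + rand() % (hi - lo + 1);
}

/* Draw destination, length, and priority uniformly at random.
   Assumption: a node never sends a message to itself. */
Message make_uniform_message(int src, int num_nodes,
                             int max_length, int num_priorities) {
    assert(num_nodes > 1 && src >= 0 && src < num_nodes);
    assert(max_length >= 1 && num_priorities >= 1);

    Message m;
    m.src = src;
    /* Pick one of the other num_nodes - 1 nodes, skipping over src. */
    m.dst = uniform_int(0, num_nodes - 2);
    if (m.dst >= src) {
        m.dst++;
    }
    m.length_flits = uniform_int(1, max_length);
    m.priority = uniform_int(0, num_priorities - 1);
    return m;
}
```

Non-uniform traffic of the kind the abstract mentions could then be modeled by replacing `uniform_int` in the destination draw with a biased choice, for example one that favors a few heavily used nodes.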
High-speed recoded parallel multipliers constitute an affordable improvement over serial-parallel add-shift designs. In this paper, we present a detailed discussion of the development of these recoded multiplier algorithms and their FPGA implementations. The various issues involved in the design process are highlighted. The cost-performance comparison of the various recoded multipliers is studied, and a discussion of the design methodology is also presented.
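The abstract does not say which recoding scheme is used. As one common example of multiplier recoding, the sketch below assumes radix-4 (modified) Booth recoding of a 16-bit signed multiplier, which reduces the sixteen partial products of an add-shift design to eight. The function name and the digit table are illustrative, and the code models the arithmetic in software rather than describing an FPGA datapath.

```c
#include <stdint.h>

/* Radix-4 Booth digit for each bit triplet (y[2i+1], y[2i], y[2i-1]). */
static const int BOOTH_DIGIT[8] = {0, 1, 1, 2, -2, -1, -1, 0};

/* Multiply two 16-bit signed values using eight recoded partial products. */
int32_t booth_radix4_multiply(int16_t x, int16_t y) {
    /* Append the implicit bit y[-1] = 0 below the least significant bit. */
    uint32_t bits = (uint32_t)(uint16_t)y << 1;
    int64_t acc = 0;

    for (int i = 0; i < 8; ++i) {
        uint32_t triplet = (bits >> (2 * i)) & 0x7u;
        int digit = BOOTH_DIGIT[triplet];  /* one of -2, -1, 0, 1, 2 */
        /* Partial product: digit * x, weighted by 4^i. */
        acc += (int64_t)digit * x * ((int64_t)1 << (2 * i));
    }
    return (int32_t)acc;  /* a 16x16 signed product always fits in 32 bits */
}
```

Each recoded digit only requires selecting 0, ±x, or ±2x, so in hardware the partial products can be formed with shifts and negation instead of full multiplications.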
Field programmable gate arrays (FPGAs) provide an innovative and flexible platform to implement and evaluate digital signal processing (DSP) applications. A CAD design methodology used to implement DSP algorithms is presented. An introduction is given to the various issues involved in the multi-chip partitioning of large DSP implementations, and approaches towards efficient auto-partitioners are also discussed in detail. As an illustrative example of a typical DSP algorithm, the paper presents the design and implementation of an 8-point 1D discrete cosine transform (DCT) and its inverse (IDCT) on a processor with FPGAs. The processor uses 16-bit precision, is implemented on six Xilinx 4000 type FPGAs, and operates at 40 MHz.
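For reference, an 8-point 1D DCT can be modeled in software with integer arithmetic. The sketch below is not the paper's hardware design: it assumes the orthonormal DCT-II definition, 16-bit input samples, and cosine coefficients stored in Q14 fixed point, and the names `COS_Q14`, `cos_q14`, and `dct8_q14` are made up for this illustration.

```c
#include <stdint.h>

/* cos(j * pi / 16) for j = 0..8, scaled by 2^14 (Q14) and rounded. */
static const int32_t COS_Q14[9] = {
    16384, 16069, 15137, 13623, 11585, 9102, 6270, 3196, 0
};

/* cos(m * pi / 16) in Q14 for any m >= 0, using the period of 32
   and the symmetries of the cosine. */
static int32_t cos_q14(int m) {
    m &= 31;                   /* cos has period 2*pi, i.e. 32 steps */
    if (m > 16) {
        m = 32 - m;            /* cos(2*pi - a) = cos(a) */
    }
    return (m <= 8) ? COS_Q14[m] : -COS_Q14[16 - m];  /* cos(pi - a) = -cos(a) */
}

/* X[k] = (c(k) / 2) * sum_n x[n] * cos((2n + 1) * k * pi / 16),
   with c(0) = 1/sqrt(2) and c(k) = 1 otherwise. */
void dct8_q14(const int16_t x[8], int32_t X[8]) {
    for (int k = 0; k < 8; ++k) {
        int64_t acc = 0;
        for (int n = 0; n < 8; ++n) {
            acc += (int64_t)x[n] * cos_q14((2 * n + 1) * k);
        }
        if (k == 0) {
            acc = acc * COS_Q14[4] / 16384;  /* multiply by 1/sqrt(2) in Q14 */
        }
        X[k] = (int32_t)(acc / 32768);       /* remove Q14 scale and the factor 1/2 */
    }
}
```

The outputs are truncated toward zero rather than rounded, and they are kept in 32 bits because the DC term can exceed the 16-bit range for full-scale inputs. A hardware implementation would likely make different scaling and rounding choices.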
The authors describe the model for a parallel divide-and-conquer algorithm incorporating both the symmetric and nonsymmetric overheads inherent in any parallel computing environment. An algorithm for computing optimal partitions is derived. This algorithm separates problem sizes into classes of problems that may use the same optimal partition size.