ISBN: (Print) 9780769543017
We present a parallelized and pipelined architecture for a generalized Laguerre-Volterra MIMO system to identify the time-varying neural dynamics underlying spike activities. The proposed architecture consists of a first stage containing a vector convolution and MAC (multiply-and-accumulate) component; a second stage containing a pre-threshold potential updating unit with an error approximation function component; and a third stage consisting of a gradient calculation unit. The result is a flexible and efficient architecture that can accommodate different design speed requirements. Simulation results are rigorously analyzed. A hardware IP library for versatile models and applications is proposed. The design runs on a Xilinx Virtex-6 FPGA, and the processing core produces data samples at a maximum clock rate of 357 MHz, which is 4.37 × 10^5 times faster than the corresponding software model running on an AMD Phenom 9750 Quad Core Processor. It occupies 216,766 LUTs, at most 12 block RAMs, and 2016 DSP blocks.
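The three stages named in the abstract can be mirrored in software roughly as follows. This is only a minimal sketch and not the authors' design: the abstract gives no equations, so the probit link (normal CDF via `math.erf`), the Bernoulli log-likelihood, and all function names here are assumptions chosen for illustration, and the basis functions are passed in as plain lists rather than computed as Laguerre functions.

```python
import math

def stage1_convolve(history, basis):
    """Stage 1 (assumed form): one MAC per basis function.

    history[0] is the newest input sample; basis[j][k] weights the
    sample k steps back. Returns one convolution output per basis function.
    """
    return [sum(b_k * x_k for b_k, x_k in zip(b, history)) for b in basis]

def stage2_potential(v, coeffs, bias):
    """Stage 2 (assumed form): pre-threshold potential u and firing probability p.

    p uses a probit link, i.e. the standard normal CDF written with erf.
    """
    u = bias + sum(c * vj for c, vj in zip(coeffs, v))
    p = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return u, p

def stage3_gradient(v, u, p, spike):
    """Stage 3 (assumed form): gradient of the Bernoulli log-likelihood
    with respect to the coefficients, for an observed spike (0 or 1)."""
    pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    scale = (spike - p) * pdf / max(p * (1.0 - p), 1e-12)  # guard against p near 0 or 1
    return [scale * vj for vj in v]

def log_likelihood(v, coeffs, bias, spike):
    """Bernoulli log-likelihood used to define the stage-3 gradient."""
    _, p = stage2_potential(v, coeffs, bias)
    return spike * math.log(p) + (1 - spike) * math.log(1.0 - p)
```

In hardware the three functions would run as concurrent pipeline stages on successive samples; here they are simply called in order for each sample.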
ISBN: (Print) 9781595939340
Stochastic simulations and other scientific applications that depend on random numbers are increasingly implemented in a parallelized manner in programmable logic. High-quality pseudo-random number generators (PRNGs), such as the Mersenne Twister, are often based on binary linear recurrences and have extremely long periods (more than 2^1024). Many software implementations of such PRNGs exist, but hardware implementations are rare. We have developed an optimized, resource-efficient parallel framework for this class of random number generators that exploits the underlying algorithm as well as FPGA-specific architectural features. The framework also incorporates fast "jump-ahead" capability for these PRNGs, allowing simultaneous, independent sub-streams to be generated in parallel by partitioning one long-period pseudo-random sequence. We demonstrate parallelized implementations of three types of PRNGs (the 32-, 64- and 128-bit SIMD Mersenne Twister) on Xilinx Virtex-II Pro FPGAs. Their area/throughput performance is impressive: for example, compared clock-for-clock with a previous FPGA implementation, a "two-parallelized" 32-bit Mersenne Twister uses 41% fewer resources. It can also scale to 350 MHz for a throughput of 22.4 Gbps, which is 5.5x faster than the older FPGA implementation and 7.1x faster than a dedicated software implementation. The quality of the generated random numbers is verified with the standard statistical test batteries Diehard and TestU01. We also present two real-world application studies with multiple RNG streams: the Ziggurat method for generating normal random variables and a Monte Carlo photon-transport *** availability of fast long-period random number generators with multiple streams accelerates hardware-based scientific simulations and allows them to scale to greater complexities.
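The jump-ahead idea can be illustrated on any generator whose state update is linear over GF(2). The sketch below is not the paper's framework and does not use the Mersenne Twister: it substitutes Marsaglia's much smaller xorshift32 generator as a stand-in, represents its state transition as a 32x32 bit matrix, and raises that matrix to the n-th power by repeated squaring. All function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def xorshift32_step(x):
    """One step of xorshift32, a small GF(2)-linear generator (stand-in only)."""
    x ^= (x << 13) & MASK32
    x ^= x >> 17
    x ^= (x << 5) & MASK32
    return x

def transition_matrix():
    """Columns of the transition matrix: the image of each unit bit-vector."""
    return [xorshift32_step(1 << i) for i in range(32)]

def mat_vec(cols, v):
    """Multiply a matrix (list of column ints) by bit-vector v over GF(2)."""
    out, i = 0, 0
    while v:
        if v & 1:
            out ^= cols[i]
        v >>= 1
        i += 1
    return out

def mat_mul(a, b):
    """Product a*b over GF(2): column j is a applied to column j of b."""
    return [mat_vec(a, col) for col in b]

def mat_pow(cols, n):
    """Matrix power by repeated squaring, O(log n) matrix products."""
    result = [1 << i for i in range(32)]  # identity matrix
    while n:
        if n & 1:
            result = mat_mul(cols, result)
        cols = mat_mul(cols, cols)
        n >>= 1
    return result

def jump_ahead(state, n):
    """State reached after n steps, without generating the n outputs."""
    return mat_vec(mat_pow(transition_matrix(), n), state)

def substream_seeds(seed, num_streams, stride):
    """Starting states for sub-streams spaced `stride` steps apart."""
    return [jump_ahead(seed, k * stride) for k in range(num_streams)]
```

Each entry of `substream_seeds(seed, 4, 2**20)` could seed one parallel generator instance, so the four instances walk disjoint segments of the same sequence as long as each draws fewer than `stride` numbers.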