To eliminate the conceptual distance between the hardware instruction set and the user interface, some architects advocate High Level Language (HLL) machines. To obtain simple, fast and cheap machines, some architects...
ISBN: 9780769543017 (print)
We present a parallelized and pipelined architecture for a generalized Laguerre-Volterra MIMO system to identify the time-varying neural dynamics underlying spike activities. The proposed architecture consists of a first stage containing a vector convolution and MAC (Multiply and Accumulation) component; a second stage containing a pre-threshold potential updating unit with an error approximation function component; and a third stage consisting of a gradient calculation unit. A flexible and efficient architecture that can accommodate different design speed requirements is generated. Simulation results are rigorously analyzed. A hardware IP library for versatile models and applications is proposed. The design runs on a Xilinx Virtex-6 FPGA, and the processing core produces data samples at a maximum clock rate of 357 MHz, which is 4.37 × 10^5 times faster than the corresponding software model running on an AMD Phenom 9750 Quad Core Processor. It occupies 216,766 LUTs, a maximum of 12 block RAMs, and 2016 DSP blocks.
Manycore architectures are expected to be the dominant trend in future general-purpose computing systems. With the number of on-chip processor cores rising to the hundreds, the problem of resource allocation cannot be addressed with the traditional methods employed by off-chip multiprocessor architectures. We propose the use of system-level bidding-based algorithms as an efficient, real-time on-chip mechanism for resource allocation in manycore architectures. We simulate and evaluate the proposed bidding-based algorithms on a hierarchical manycore platform connected by an on-chip network. Experimental results indicate a performance improvement of 15-25% compared to on-chip round-robin allocation, while achieving a balanced workload distribution. The obtained results encourage further investigation of the applicability of such system-level algorithms to other important problems in manycore architectures.
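The abstract does not spell out the bidding protocol, so the following is only a toy sketch of the general idea, under loudly stated assumptions: each core "bids" its projected finish time for an incoming task, and the lowest bid wins. The function names, the bid formula, and the task costs are all hypothetical and are not taken from the paper; a round-robin allocator is included for comparison.

```python
def allocate_by_bidding(task_costs, num_cores):
    """Assign each task to the core with the lowest bid.

    Assumption (not from the paper): a core's bid is its current load
    plus the cost of the new task, i.e. its projected finish time.
    """
    loads = [0] * num_cores
    assignment = []
    for cost in task_costs:
        bids = [load + cost for load in loads]
        # Lowest bid wins; ties go to the lowest core index.
        winner = min(range(num_cores), key=lambda c: (bids[c], c))
        loads[winner] += cost
        assignment.append(winner)
    return assignment, loads


def allocate_round_robin(task_costs, num_cores):
    """Baseline: hand tasks to cores in cyclic order, ignoring load."""
    loads = [0] * num_cores
    assignment = []
    for t, cost in enumerate(task_costs):
        core = t % num_cores
        loads[core] += cost
        assignment.append(core)
    return assignment, loads


if __name__ == "__main__":
    tasks = [5, 1, 1, 1, 5, 1]  # hypothetical task costs
    print(allocate_by_bidding(tasks, 2))
    print(allocate_round_robin(tasks, 2))
```

On this hypothetical workload the bidding allocator ends with a smaller maximum core load than round robin, which illustrates the load-balancing intuition; it says nothing about the 15-25% figure reported in the abstract.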
ISBN: 9781450392044 (print)
We present ParGeo, a multicore library for computational geometry algorithms. We describe two of the algorithms from ParGeo, convex hull and the smallest enclosing ball, and present a short evaluation of all implementations currently in ParGeo.
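For readers unfamiliar with the convex hull problem mentioned above, here is a minimal sequential sketch using Andrew's monotone chain algorithm. This is a textbook baseline swapped in for illustration only; it is not ParGeo's parallel implementation, and the function name `convex_hull` is hypothetical rather than part of the ParGeo API.

```python
def convex_hull(points):
    """Return the 2D convex hull in counter-clockwise order.

    Sequential Andrew's monotone chain (O(n log n)); an illustrative
    baseline, not the parallel algorithm used by ParGeo.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        # Pop points that would make a right turn or a straight line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    print(convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
```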
ISBN: 9781581135299 (print)
In this paper we present a coarse-grained parallel algorithm for solving the string edit distance problem for a string A and all substrings of a string C. Our method is based on a novel CGM/BSP parallel dynamic programming technique for computing all highest scoring paths in a weighted grid graph. The algorithm requires $\log p$ rounds/supersteps and $O(\frac{n^2}{p}\log m)$ local computation, where $p$ is the number of processors and $p^2 \leq m \leq n$. To our knowledge, this is the first efficient CGM/BSP algorithm for the alignment of all substrings of C with A. Furthermore, the CGM/BSP parallel dynamic programming technique presented is of interest in its own right and we expect it to lead to other parallel dynamic programming methods for the CGM/BSP.
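To make the problem statement concrete, the sketch below computes the edit distance between A and every substring of C with a plain sequential dynamic program, run once per start position. It is a naive baseline under the assumption of unit-cost insert, delete, and substitute operations; it is not the CGM/BSP algorithm of the paper, and the function name is hypothetical.

```python
def edit_distances_all_substrings(a, c):
    """Map (i, j) to the edit distance between a and c[i:j].

    Naive sequential baseline: one DP sweep per start index i,
    with unit costs for insertion, deletion, and substitution.
    """
    m, n = len(a), len(c)
    result = {}
    for i in range(n + 1):
        # col[k] = distance between a[:k] and the current substring c[i:j].
        col = list(range(m + 1))  # j == i: substring is empty
        result[(i, i)] = col[m]
        for j in range(i + 1, n + 1):
            ch = c[j - 1]
            new = [j - i] + [0] * m  # a[:0] vs c[i:j] costs j - i edits
            for k in range(1, m + 1):
                substitute = col[k - 1] + (0 if a[k - 1] == ch else 1)
                new[k] = min(col[k] + 1, new[k - 1] + 1, substitute)
            col = new
            result[(i, j)] = col[m]
    return result


if __name__ == "__main__":
    table = edit_distances_all_substrings("ab", "xaby")
    print(table[(1, 3)])  # distance between "ab" and "ab"
```

This baseline fills one DP table per start position, which is the redundancy that an all-highest-scoring-paths formulation is designed to avoid.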
ISBN: 9781450392044 (print)
Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms as well as various bioinformatics and data analytics applications for computing inner-product similarities. For an important class of algorithms, only a subset of the output entries is needed, and the resulting operation is known as Masked SpGEMM since a subset of the output entries is considered to be "masked out". In this work, we investigate various novel algorithms and data structures for this rather challenging and important computation, and provide guidelines on how to design a fast Masked-SpGEMM for shared-memory architectures.
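As a rough illustration of what Masked SpGEMM computes, the sketch below performs a sequential row-by-row (Gustavson-style) sparse product and discards any output entry not permitted by the mask. The dict-of-dicts storage, the function name `masked_spgemm`, and the convention that the mask lists the entries to keep are all assumptions made for this example; the paper's actual algorithms and data structures are not reproduced here.

```python
def masked_spgemm(a, b, mask):
    """Compute the entries of A @ B that the mask allows.

    a, b: sparse matrices as {row: {col: value}} (assumed format).
    mask: {row: set of cols} naming the output entries to keep
          (assumed convention; everything else is masked out).
    """
    out = {}
    for i, a_row in a.items():
        allowed = mask.get(i)
        if not allowed:
            continue  # the whole output row is masked out
        acc = {}
        for k, a_ik in a_row.items():
            for j, b_kj in b.get(k, {}).items():
                if j in allowed:  # skip work for masked-out entries
                    acc[j] = acc.get(j, 0) + a_ik * b_kj
        if acc:
            out[i] = acc
    return out


if __name__ == "__main__":
    a = {0: {0: 1, 1: 2}, 1: {1: 3}}
    b = {0: {0: 4, 1: 5}, 1: {0: 6, 1: 7}}
    mask = {0: {1}, 1: {0}}
    print(masked_spgemm(a, b, mask))
```

Checking the mask inside the inner loop avoids accumulating entries that would be thrown away, which is the basic saving that motivates treating the masked product as its own operation.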