ISBN: 9781450305549 (print)
The proceedings contain 37 papers. The topics discussed include: comparing FPGA vs. custom CMOS and the impact on processor microarchitecture; VEGAS: soft vector processor with scratchpad memory; LEAP scratchpads: automatic memory and cache management for reconfigurable logic; NetTM: faster and easier synchronization for soft multicores via transactional memory; LegUp: high-level synthesis for FPGA-based processor/accelerator systems; automatic SoC design flow on many-core processors: a software hardware co-design approach for FPGAs; Torc: towards an open-source tool flow; FPGASort: a high performance sorting architecture exploiting run-time reconfiguration on FPGAs for large problem sorting; a platform for high level synthesis of memory-intensive image processing algorithms; energy-efficient specialization of functional units in a coarse-grained reconfigurable array; and DEEP: an iterative FPGA-based many-core emulation system for chip verification and architecture research.
ISBN: 9781450356145 (print)
The proceedings contain 31 papers. The topics discussed include: memory-efficient fast Fourier transform on streaming data by fusing permutations; DeltaRNN: a power-efficient recurrent neural network accelerator; degree-aware hybrid graph traversal on FPGA-HMC platform; architecture exploration for HLS-oriented FPGA debug overlays; graph-theoretically optimal memory banking for stencil-based computing kernels; ADAM: automated design analysis and merging for speeding up FPGA development; high-performance QR decomposition for FPGAs; a HOG-based real-time and multi-scale pedestrian detector demonstration system on FPGA; combined spatial and temporal blocking for high-performance stencil computation on FPGAs using OpenCL; P4-compatible high-level synthesis of low latency 100 Gb/s streaming packet parsers in FPGAs; a scalable approach to exact resource-constrained scheduling based on a joint SDC and SAT formulation; dynamically scheduled high-level synthesis; and a customizable matrix multiplication framework for the Intel HARPv2 Xeon+FPGA platform.