This letter presents a graphics processing unit (GPU)-based non-binary low-density parity-check (LDPC) multi-codeword decoder with both kernel execution and data transfer optimizations. A novel multi-codeword data structure and its corresponding parallelism are proposed to speed up Compute Unified Device Architecture (CUDA) kernel execution. Moreover, practical methods of hiding data transfer latency are presented to improve data transfer efficiency. Experimental results demonstrate that the proposed decoder achieves throughput speedups ranging from 3.12x to 185x over various Galois fields compared with existing GPU-based works.
ISBN: 9781509024674 (print)
The implementation of quasi-cyclic (QC) low-density parity-check (LDPC) decoders on FPGA devices has attracted great interest both in wireless communication and in error correction for Flash memories. This paper presents an FPGA-based flooded LDPC decoder that processes multiple codewords to use memory efficiently. It is based on a partially parallel implementation, which relies on memory blocks for message passing between the processing units. Efficient memory utilization is obtained by packing multiple messages, corresponding to multiple codewords, into the same Block RAM word. Throughput increases linearly with the number of codewords processed. The proposed LDPC decoder can process up to 9 codewords in parallel with 4-bit message quantization, or up to 12 codewords with 3-bit message quantization, without introducing significant memory overhead.
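The packing idea can be illustrated with a minimal software sketch. It is not the paper's hardware design: the 36-bit word width is an assumption inferred from the stated limits (9 x 4 bits = 12 x 3 bits = 36), the two's-complement message encoding is also assumed, and the names `WORD_BITS`, `pack_messages`, and `unpack_messages` are hypothetical.

```python
WORD_BITS = 36  # assumed memory word width; not stated in the abstract


def pack_messages(messages, bits):
    """Pack one signed message per codeword into a single memory word.

    Slot i (bits i*bits .. (i+1)*bits - 1) holds the message of codeword i,
    stored in two's complement (an assumed encoding).
    """
    if len(messages) * bits > WORD_BITS:
        raise ValueError("messages do not fit in one memory word")
    mask = (1 << bits) - 1
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    word = 0
    for slot, msg in enumerate(messages):
        if not low <= msg <= high:
            raise ValueError("message out of range for the quantization")
        word |= (msg & mask) << (slot * bits)
    return word


def unpack_messages(word, bits, count):
    """Recover `count` signed messages of width `bits` from a packed word."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    messages = []
    for slot in range(count):
        raw = (word >> (slot * bits)) & mask
        # Sign-extend the two's-complement field back to a Python int.
        messages.append(raw - (1 << bits) if raw & sign_bit else raw)
    return messages


# Capacity under the assumed word width matches the figures in the abstract.
codewords_at_4_bits = WORD_BITS // 4
codewords_at_3_bits = WORD_BITS // 3
```

Under these assumptions, one memory access reads or writes the messages of all packed codewords at once, which is consistent with the abstract's claim that throughput grows linearly with the number of codewords without extra memory words.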