The success of Deep Learning (DL) algorithms in computer vision tasks have created an on-going demand of dedicated hardware architectures that could keep up with the their required computation and memory complexities....
详细信息
ISBN:
(纸本)9781450371896
The success of Deep Learning (DL) algorithms in computer vision tasks have created an on-going demand of dedicated hardware architectures that could keep up with the their required computation and memory complexities. This task is particularly challenging when embedded smart camera platforms have constrained resources such as power consumption, Processing Element (PE) and communication. This article describes a heterogeneous system embedding an FPGA and a GPU for executing CNN inference for computer vision applications. The built system addresses some challenges of embedded CNN such as task and data partitioning, and workload balancing. The selected heterogeneous platform embeds an Nvidia (R) Jetson TX2 for the CPU-GPU side and an Intel Altera (R) Cyclone10GX for the FPGA side interconnected by PCIe Gen2 with a MIPI-CSI camera for prototyping. This test environment will be used as a support for future work on a methodology for optimized model partitioning.
The success of Deep Learning (DL) algorithms in computer vision tasks have created an on-going demand of dedicated hardware architectures that could keep up with the their required computation and memory complexities....
详细信息
ISBN:
(纸本)9781450371896
The success of Deep Learning (DL) algorithms in computer vision tasks have created an on-going demand of dedicated hardware architectures that could keep up with the their required computation and memory complexities. This task is particularly challenging when embedded smart camera platforms have constrained resources such as power consumption, Processing Element (PE) and communication. This article describes a heterogeneous system embedding an FPGA and a GPU for executing CNN inference for computer vision applications. The built system addresses some challenges of embedded CNN such as task and data partitioning, and workload balancing. The selected heterogeneous platform embeds an Nvidia® Jetson TX2 for the CPU-GPU side and an Intel Altera® Cyclone10GX for the FPGA side interconnected by PCIe Gen2 with a MIPI-CSI camera for prototyping. This test environment will be used as a support for future work on a methodology for optimized model partitioning.
暂无评论