ISBN: (Print) 9781665406901
In this paper, we present an FPGA-based multi-memory controller for accelerating computationally intensive applications. Our architecture accepts multiple inputs and produces multiple outputs every clock cycle. It includes processor cores with pipelined functional units tailored to each application. We also present an approach that achieves a one to two orders-of-magnitude speedup over a traditional software implementation executing on a conventional multi-core processor. Even though the clock frequency of the Field-Programmable Custom Computing Machine (FCCM) is an order of magnitude lower than that of a conventional multi-core processor, the FCCM is significantly faster. We use the Power function as an application to demonstrate the merits of our FCCM. In our experiments, we executed the Power function in software and compared its execution time with that of the FCCM. We also compared the FCCM execution time with that of an OpenMP implementation of the function. Our experiments show that, for the Power function, our multi-memory architecture is 57X faster than the software implementation and 17X faster than the OpenMP implementation.
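The two reported speedups can be related with simple arithmetic. The sketch below is illustrative only: it assumes both figures were measured on the same workload and that "software implementation" means the non-OpenMP baseline, neither of which the abstract states explicitly. The function and variable names are hypothetical.

```python
def implied_speedup(speedup_vs_a, speedup_vs_b):
    """Speedup of baseline B over baseline A, given a third
    system's speedup over each (assumes the same workload)."""
    return speedup_vs_a / speedup_vs_b


# Figures quoted in the abstract.
fccm_vs_software = 57.0  # FCCM vs. software implementation
fccm_vs_openmp = 17.0    # FCCM vs. OpenMP implementation

# Assumption: both ratios share one workload, so they can be divided.
openmp_vs_software = implied_speedup(fccm_vs_software, fccm_vs_openmp)
print(f"Implied OpenMP speedup over software: {openmp_vs_software:.2f}X")
```

Under those assumptions, the OpenMP version would be roughly 3.35X faster than the plain software version, which puts the 17X figure in context.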
ISBN: (Print) 9781538668085
In this paper, we present AutoRARE, a Java-based automated design tool for generating Field-Programmable Gate Array (FPGA)-based hardware accelerators. AutoRARE automatically generates all VHDL models needed to build and synthesize a processor specifically tailored to each application. The user need only provide the VHDL description of a special-purpose floating-point Arithmetic Logic Unit (ALU) or function core. The tool generates the VHDL descriptions of the memory interface, memory controller, host processor interface, and application-specific processor. We also present details of an FPGA-based multi-memory hardware accelerator for computationally intensive applications, generated using AutoRARE. The multi-memory hardware accelerator is highly pipelined and can simultaneously read and write multiple floating-point values from multiple memories. The multi-memory architecture is the key to providing hardware accelerators that execute 10X-100X faster than typical multi-core processors. The Taylor Series expansion of the sine/cosine function is used as an application to demonstrate the merits of the multi-memory hardware accelerator. In our experiments, we executed the Taylor Series in software and compared its execution time with that of an FPGA-based hardware implementation. Our experiments show that the FPGA-based multi-memory Taylor Series hardware accelerator is 481X faster than software executing the Taylor Series on a typical server.
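For readers unfamiliar with the benchmark, the following is a minimal software sketch of a Taylor (Maclaurin) series evaluation of sine and cosine. It is not the authors' code: the function names, the default term count, and the choice of Python are assumptions made purely for illustration, and the abstract does not say how many terms or which language the software baseline used.

```python
import math


def taylor_sin(x, terms=10):
    """Approximate sin(x) using the first `terms` terms of
    x - x^3/3! + x^5/5! - ... (term count is an assumption)."""
    term = x
    total = x
    for n in range(1, terms):
        # Each term is the previous one times -x^2 / ((2n)(2n+1)).
        term *= -x * x / ((2 * n) * (2 * n + 1))
        total += term
    return total


def taylor_cos(x, terms=10):
    """Approximate cos(x) using the first `terms` terms of
    1 - x^2/2! + x^4/4! - ... (term count is an assumption)."""
    term = 1.0
    total = 1.0
    for n in range(1, terms):
        # Each term is the previous one times -x^2 / ((2n-1)(2n)).
        term *= -x * x / ((2 * n - 1) * (2 * n))
        total += term
    return total


if __name__ == "__main__":
    x = 0.5
    print(taylor_sin(x), math.sin(x))
    print(taylor_cos(x), math.cos(x))
```

Each term depends only on the previous term and a fixed multiply-divide step, which is the kind of regular floating-point work that a pipelined functional unit can stream over many inputs.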
ISBN: (Print) 9781538668092; 9781538668085
In this paper, we present AutoRARE, a Java-based automated design tool for generating Field-Programmable Gate Array (FPGA)-based hardware accelerators. AutoRARE automatically generates all VHDL models needed to build and synthesize a processor specifically tailored to each application. The user need only provide the VHDL description of a special-purpose floating-point Arithmetic Logic Unit (ALU) or function core. The tool generates the VHDL descriptions of the memory interface, memory controller, host processor interface, and application-specific processor. We also present details of an FPGA-based multi-memory hardware accelerator for computationally intensive applications, generated using AutoRARE. The multi-memory hardware accelerator is highly pipelined and can simultaneously read and write multiple floating-point values from multiple memories. The multi-memory architecture is the key to providing hardware accelerators that execute 10X-100X faster than typical multi-core processors. The Taylor Series expansion of the sine/cosine function is used as an application to demonstrate the merits of the multi-memory hardware accelerator. In our experiments, we executed the Taylor Series in software and compared its execution time with that of an FPGA-based hardware implementation. Our experiments show that the FPGA-based multi-memory Taylor Series hardware accelerator is 481X faster than software executing the Taylor Series on a typical server.