ISBN: (print) 3540408223
In this paper, we describe a new computation method for the 3D FCHC lattice gas model on an FPGA. The FCHC lattice gas model is a class of 3D cellular automata used for simulating fluid dynamics. Many FPGA-based approaches to cellular automata have been researched to date. However, practical three-dimensional cellular automata such as the FCHC lattice gas model could not be processed efficiently, because they require a large amount of data for each cell and very complex update rules. We implemented the new method on an FPGA board with one XC2V6000. The speed gain for the FCHC lattice gas model on a 128 x 128 x 128 lattice is about 200 times compared with an 1800 MHz Athlon processor.
ISBN: (print) 3540408223
Control, signal and image processing applications are complex in terms of algorithms, hardware architectures and real-time/embedded constraints. System-level CAD software is therefore useful to help the designer prototype and optimize such applications. These tools are often based on design flow methodologies. This paper presents a seamless design flow which transforms a data dependence graph specifying the application into an implementation graph containing both data and control paths. The proposed approach follows a set of rules based on the RTL model and on mechanisms of synchronized data transfers in order to transform the initial algorithmic graph into the implementation graph automatically. This transformation flow is part of the extension of our AAA (Algorithm-Architecture Adequation) rapid prototyping methodology to support the optimized implementation of real-time applications on reconfigurable circuits. It has been implemented in SynDEx, a system-level CAD software tool that supports AAA.
ISBN: (print) 3540408223
Intrusion Detection Systems such as Snort scan incoming packets for evidence of security threats. The most computation-intensive part of these systems is a text search against hundreds of patterns, which must be performed at wire speed. FPGAs are particularly well suited for this task and several such systems have been proposed. In this paper we expand on previous work in order to achieve and exceed a processing bandwidth of 11 Gbps. We employ a scalable, low-latency architecture, and use extensive fine-grain pipelining to tackle the fan-out, match, and encode bottlenecks and achieve operating frequencies in excess of 340 MHz for fast Virtex devices. To increase throughput, we use multiple comparators and allow for parallel matching of multiple search strings. We evaluate the area and latency cost of our approach and find that the match cost per search pattern character is between 4 and 5 logic cells.
ISBN: (print) 3540408223
This paper describes an FPGA implementation of a Connected Component Labelling (CCL) algorithm developed at Queen's University Belfast. The algorithm iteratively scans the input image, performing a non-zero maximum neighbourhood operation. It has been coded in the Handel-C language and targets the Celoxica RC1000-PP PCI board. The whole design was fully implemented and tested on real hardware in less than 24 man-hours. It uses a Virtex-E FPGA and two banks of off-chip memory. For 1024x1024 input images, the whole circuit consumes 583 FPGA slices and 5 Block RAMs and can run at 72 MHz, leading to a performance of 68 passes/sec. The FPGA implementation easily outperforms an equivalent software implementation running on a 1.6 GHz Pentium-IV PC. A 10-fold speed-up has been realised in many instances.
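The iterative scan with a non-zero maximum neighbourhood operation can be illustrated in software. The following is a minimal sketch only, not the authors' Handel-C design: the function name `label_components`, the 4-connected neighbourhood, the seeding of labels from pixel positions and the in-place raster scan are all assumptions made for illustration.

```python
def label_components(image):
    """Label foreground regions by repeated non-zero maximum propagation.

    image: list of equal-length rows; non-zero means foreground.
    Returns (labels, passes). Assumes 4-connectivity (not stated in the abstract).
    """
    height, width = len(image), len(image[0])
    # Seed each foreground pixel with a unique non-zero label; background stays 0.
    labels = [[(y * width + x + 1) if image[y][x] else 0 for x in range(width)]
              for y in range(height)]
    passes = 0
    changed = True
    while changed:  # keep scanning until a full pass changes nothing
        changed = False
        passes += 1
        for y in range(height):
            for x in range(width):
                if labels[y][x] == 0:
                    continue  # background pixels never receive a label
                best = labels[y][x]
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        best = max(best, labels[ny][nx])
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
    return labels, passes


if __name__ == "__main__":
    demo = [[1, 1, 0],
            [0, 0, 0],
            [0, 1, 1]]
    result, passes = label_components(demo)
    for row in result:
        print(row)
    print("passes:", passes)
```

When the loop terminates, every pixel in a connected region carries the largest seed label of that region. As a rough consistency check on the quoted figures, and assuming one pixel is processed per clock cycle, 72 MHz divided by 1024 x 1024 pixels gives about 68.7 passes per second.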
ISBN: (print) 3540408223
In this paper, we propose a design and implementation method for priority queuing mechanisms on FPGAs. First, we describe the behavior of WFQ (weighted fair queuing) with several parameters in a model called concurrent periodic EFSMs. Then, using a parametric model checking technique, we derive a parameter condition under which the concurrent EFSMs execute their transitions repeatedly, without deadlocks, within the specified time period and under the specified temporal constraints. From the derived parameter condition, we can choose adequate parameter values that satisfy the condition while considering the total cost of components. Based on the proposed method, highly reliable and high-performance WFQ circuits for gigabit networks can be synthesized on FPGAs.
ISBN: (print) 3540408223
This paper describes a high-performance single-chip FPGA implementation of the new Advanced Encryption Standard (AES) algorithm dealing with 128-bit data/key blocks and operating in Counter (CTR) mode. Counter mode has a proven tight security bound, and it enables the simultaneous processing of multiple blocks without losing the advantages of feedback modes. It also allows similar hardware to be used for both encryption and decryption. The proposed architecture is modular. Its basic module implements a single round of the algorithm with the required expansion hardware and control signals. This gives very high flexibility in choosing the degree of pipelining according to the throughput requirements and hardware limitations, and therefore the ability to reach the best compromise between them. The FPGA implementation presented is that of a pipelined single-chip Rijndael design which runs at a rate of 10.8 Gbits/sec for full pipelining on an ALTERA APEX-EP20KE platform.
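The two counter-mode properties mentioned in the abstract (blocks can be processed independently, and the same logic serves encryption and decryption) can be seen in a short software model. This is a sketch under stated assumptions, not the paper's design: `block_function` is a hypothetical stand-in built from SHA-256 and is NOT AES, and the 8-byte nonce plus 8-byte counter layout is chosen only for illustration.

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit blocks, as in the abstract


def block_function(key: bytes, block: bytes) -> bytes:
    """Stand-in for the AES forward cipher (assumption: NOT real AES).

    CTR mode only ever uses the forward direction of the block cipher,
    so any keyed function with a 16-byte output is enough for this model.
    """
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]


def keystream_block(key: bytes, nonce: bytes, index: int) -> bytes:
    """Keystream for block `index`; depends on the index only, not on data."""
    if len(nonce) != 8:
        raise ValueError("this sketch assumes an 8-byte nonce")
    counter_block = nonce + index.to_bytes(8, "big")
    return block_function(key, counter_block)


def ctr_transform(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt `data`; the operation is identical in both directions."""
    out = bytearray()
    block_count = (len(data) + BLOCK_BYTES - 1) // BLOCK_BYTES
    for index in range(block_count):  # iterations are mutually independent
        chunk = data[index * BLOCK_BYTES:(index + 1) * BLOCK_BYTES]
        stream = keystream_block(key, nonce, index)
        out.extend(a ^ b for a, b in zip(chunk, stream))
    return bytes(out)


if __name__ == "__main__":
    key = b"0123456789abcdef"
    nonce = b"\x00" * 8
    message = b"counter mode processes blocks independently"
    ciphertext = ctr_transform(key, nonce, message)
    print(ctr_transform(key, nonce, ciphertext) == message)
```

Because no loop iteration depends on the output of another, the per-block work could be spread across pipeline stages or parallel units, which is the property a fully pipelined hardware design relies on.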
This paper proposes a real-time bioinspired visual encoding system for multielectrode stimulation of the visual cortex, supported on field-programmable logic. This system includes the spatio-temporal preprocessing s...
ISBN: (print) 0769519571
This paper introduces the Multi-Micro Processor-Array (MMPA) as a kind of Evolvable Hardware (EHW) for an industrial control system. It first describes one of the traditional methods for reconfiguring a system, the logic method. It then applies an evolutionary algorithm to improve the reconfiguration so that the architecture of the control system can be configured dynamically and optimally. The evolutionary algorithm is executed within the structure of the MMPA. The relationships among the components and tasks are used to speed up the search for solutions. Physically, a bus connects the microprocessors, which form an array. Logically, the microprocessors form a ring, a Token Ring. The microprocessor that holds the token can send messages to any other microprocessor. Each microprocessor stores the overall data, so when it gets the token it can reconfigure the whole system if necessary.
A self-reconfiguring platform is reported that enables an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor. This platform has been implemented on Xilinx Virtex II™ and Virtex II ...
Globally Asynchronous Locally Synchronous (GALS) Systems have provoked renewed interest over recent years as they have the potential to combine the benefits of asynchronous and synchronous design paradigms. It has bee...