The problems of placement and routing are without doubt the most time-consuming part of the process of automatically synthesizing and configuring circuits for field-programmable gate arrays (FPGAs). FPGAs offer the ability to quickly reconfigure circuits to support rapid prototyping, emulation, or configurable computing, but the time to perform placement and routing, which can take many hours, has become a serious bottleneck. This problem is addressed here by showing that the negotiation-based routing paradigm, which has been applied successfully in several FPGA routers, can be parallelized to achieve increased performance without any significant decrease in the quality of the results. In this paper, we report several new findings related to the negotiation-based routing paradigm. We examine in depth the convergence of the negotiation-based routing algorithm. We illustrate that the negotiation-based algorithm can be parallelized. Finally, we demonstrate that a negotiation-based parallel FPGA router performs well in terms of delay and speedup with practical FPGA circuits.
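The abstract above does not give the router's cost function, so the following is only a minimal serial sketch of negotiation-based (negotiated-congestion) routing as it is commonly described: every net is ripped up and rerouted each iteration, and a node's cost grows with both its present overuse and its accumulated congestion history. The graph, the net names, the cost formula, and the helpers `shortest_path` and `negotiate_routes` are illustrative assumptions, not the paper's parallel implementation.

```python
import heapq

def shortest_path(graph, cost, src, dst):
    """Dijkstra over per-node costs; returns the node list from src to dst."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in graph[u]:
            nd = d + cost(v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def negotiate_routes(graph, nets, capacity=1, max_iters=50):
    """Reroute all nets each iteration until no node exceeds its capacity."""
    history = {n: 0.0 for n in graph}  # accumulated congestion penalty (assumed form)
    routes = {}
    for it in range(max_iters):
        usage = {n: 0 for n in graph}  # rip up every route
        pres_fac = 0.5 * (it + 1)      # present-congestion weight grows over time
        for name, (src, dst) in nets.items():
            def cost(n):
                over = max(0, usage[n] + 1 - capacity)
                return (1.0 + history[n]) * (1.0 + pres_fac * over)
            path = shortest_path(graph, cost, src, dst)
            routes[name] = path
            for n in path:
                usage[n] += 1
        overused = [n for n in graph if usage[n] > capacity]
        if not overused:
            return routes, it + 1
        for n in overused:
            history[n] += 1.0          # make contested nodes pricier next round
    return routes, max_iters

# Hypothetical toy graph: both nets prefer node "m", but only n2 needs it.
graph = {
    "a": ["m", "x"], "x": ["a", "y"], "y": ["x", "b"], "b": ["m", "y"],
    "c": ["m"], "d": ["m"], "m": ["a", "b", "c", "d"],
}
nets = {"n1": ("a", "b"), "n2": ("c", "d")}
routes, iterations = negotiate_routes(graph, nets)
```

In this toy case net n1 initially takes the short path through "m", and the growing history cost on "m" eventually pushes it onto the detour through "x" and "y", leaving "m" to n2, which has no alternative.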
In this paper we present a timing-driven router for symmetrical array-based FPGAs. The routing resources in the FPGAs consist of segments of various lengths. Researchers have shown that the number of segments, instead of wirelength, used by a net is the most critical factor in controlling routing delay in an FPGA. Thus, the traditional measure of routing delay on the basis of geometric distance of a signal is not accurate. To consider wirelength and delay simultaneously, we study a model of timing-driven routing trees, arising from the special properties of FPGA routing architectures. Based on the solutions to the routing-tree problem, we present a routing algorithm that is able to utilize various routing segments with global considerations to meet timing constraints. Experimental results show that our approach is very effective in reducing timing violations.
ISBN: 0780357434 (print)
This paper presents the design principle of an SDH Digital Cross-connect (SDXC) matrix implemented with a field programmable gate array (FPGA). The SDXC matrix enables construction of a flexible SDH network and reduces dependence on physical connection points and maintenance personnel.
ISBN: 0819437956 (print)
The video signal preprocessing unit (processor) for a thermovision camera, developed by the authors on the basis of a pyroelectric vidicon, is intended for: calculation of the difference between the "positive" and "negative" frames obtained in obturation mode; n-divisible accumulation of the resulting frame; non-volatile storage of the received images; and transmission of images to the computer. These functions are realized by the following units: a video signal digitizer; an arithmetic-logical unit; accumulation, display, and archive memory units; and a microcontroller. The processor is built using field programmable gate arrays (FPGAs). Its structure is described.
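The first two functions listed above can be sketched in software as follows. This is only an illustration, assuming that "n-divisible accumulation" means summing the difference frame over n positive/negative frame pairs; the function name `accumulate_difference` and the list-of-rows frame layout are hypothetical, and the actual unit performs this in FPGA hardware.

```python
def accumulate_difference(pairs):
    """Sum (positive - negative) pixel by pixel over a list of frame pairs.

    Each frame is a list of rows, each row a list of integer pixel values.
    All frames are assumed to have the same dimensions.
    """
    rows = len(pairs[0][0])
    cols = len(pairs[0][0][0])
    acc = [[0] * cols for _ in range(rows)]
    for positive, negative in pairs:
        for r in range(rows):
            for c in range(cols):
                acc[r][c] += positive[r][c] - negative[r][c]
    return acc

# Two hypothetical 1x2 frame pairs.
pairs = [
    ([[10, 12]], [[4, 5]]),
    ([[11, 12]], [[5, 5]]),
]
result = accumulate_difference(pairs)  # [[12, 14]]
```

Subtracting the negative frame removes the fixed background common to both frames, and accumulating over several pairs averages down uncorrelated noise.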
A fundamental feature of Dynamically Reconfigurable FPGAs (DRFPGAs) is that the logic and interconnect are time-multiplexed. Thus, for a circuit to be implemented on a DRFPGA, it needs to be partitioned such that each subcircuit can be executed at a different time. In this paper, the partitioning of sequential circuits for execution on a DRFPGA is studied. To determine how to correctly partition a sequential circuit and what the costs of doing so are, we propose a new gate-level model that handles time-multiplexed computation. We also introduce an enhanced force-directed scheduling (FDS) algorithm to partition sequential circuits that finds a correct partition with low logic and communication costs, under the assumption that maximum performance is desired. We use our algorithm to partition seven large ISCAS '89 sequential benchmark circuits. The experimental results show that the enhanced FDS reduces communication costs by 27.5 percent with only a 1.1 percent increase in the gate cost compared to traditional FDS.
This paper gives a hands-on example of how low-level optimization of the VHSIC Hardware Description Language (VHDL) code is extremely difficult within a contemporary field programmable gate array (FPGA) design flow. However, low-level optimization can be accomplished, and by changing the VHDL coding style, synthesis results can be improved. The design flow is considered from high-level descriptions (bubble diagrams), through logic synthesis, to the point where hand optimization is required. For performance benchmarking, a state machine from a contemporary computer bus, PCI, implemented in a Xilinx FPGA, is used. Practical design issues applied to time-critical implementations using FPGAs, especially the trade-offs of high-level versus low-level synthesis, are analyzed. Performance evaluation results of several PCI target state machines, coded using different styles and design methods, are given in terms of time and area efficiency. Based on these findings, improvements to the FPGA design methodology are proposed. (C) 1999 Elsevier Science B.V. All rights reserved.
A real-time, VanderLugt-type optical correlator using a single SLM is developed. A field programmable gate array is used to capture and process images obtained from a CCD camera at a rate of 60 video fields/s. During both enrollment and verification, a finger slides over a glass prism and is input to the system via the frustration of the total internal reflection process. An autoenrollment procedure captures the optimal image during each slide. An optimal composite filter is implemented. The correlation detection process comprises real-time tracking of the correlation peak while the finger is sliding and a decision process based on projective decision boundaries. Real-life tests yielded a false rejection rate of 1% and a false acceptance rate of 0.2%. (C) 1999 Society of Photo-Optical Instrumentation Engineers. [S0091-3286(99)00901-0].
In this article we introduce the use of a field programmable gate array (FPGA) within the central processing unit (CPU) as part of an arithmetic-logic unit (ALU). Although the concept of an in-system configurable FPGA inside the CPU is becoming more and more popular, it is currently used mainly for testing and evaluation. We suggest that using an FPGA as an extension to the ALU, with its functions implemented in the logic circuit (which we call logic-ware), can greatly increase the performance of the CPU. (C) 1999 Elsevier Science B.V. All rights reserved.
This paper describes an application in high-performance signal processing using reconfigurable computing engines: a 250-MHz cross correlator for radio astronomy. Experimental results indicate that complementary metal-oxide-semiconductor (CMOS) field programmable gate arrays (FPGA's) can perform useful computation at 250 MHz. The notion of an "event horizon" for FPGA's leads to clear design constraints for high-speed application developers, and can be applied to a variety of real-time signal processing algorithms. Recent estimates indicate that higher performance FPGA's available early in 1998 can attain speeds of over 300 MHz using 20% fewer logic elements than current designs. The results of this design work provide important clues on how to improve FPGA architectures for signal processing at hundreds of MHz. Direct routing channels between logic elements can significantly increase performance. Routing architectures with four-way symmetry allow for rotations and reflections of subcircuits needed for optimal packing density. Experimental results indicate that clock buffering often limits the top speed of the FPGA. Wave pipelining of the clock distribution network may improve FPGA performance.
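For readers unfamiliar with the operation such a correlator computes, here is a minimal software sketch of lag-domain cross-correlation. The abstract does not describe the sample format or the hardware architecture, so the function name `cross_correlate`, the equal-length integer sample lists, and the range of lags are all assumptions made for illustration.

```python
def cross_correlate(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for lag = 0..max_lag.

    x and y are assumed to be equal-length sequences of numeric samples.
    """
    out = {}
    for lag in range(max_lag + 1):
        out[lag] = sum(x[n] * y[n + lag] for n in range(len(x) - lag))
    return out

# y is x delayed by one sample, so the correlation should peak at lag 1.
x = [1, -1, 1, 1]
y = [0, 1, -1, 1]
lags = cross_correlate(x, y, max_lag=2)
peak_lag = max(lags, key=lags.get)
```

Each lag is an independent multiply-accumulate chain, which is the kind of regular, parallel structure that maps naturally onto FPGA logic elements.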
In this paper, we propose a target board architecture suitable for embedded signal processing applications based on hardware/software codesign. The target board, which serves as a system attached to a host PC via a PCI bus interface, contains a TMS320C30 DSP processor and up to four Xilinx XC5204 FPGAs. The software and hardware sections of the codesign can be easily implemented using C and VHDL programming in the C30 processor and FPGAs, respectively. Based on the proposed target board architecture, the interface circuitry and the communication protocols between the hardware (FPGAs) and software (C30) sections are first derived. The interface circuitry is described in VHDL code and will be added to the FPGA design for high-level synthesis. Five types of HW/SW communications are supported. A HW/SW codesign flow is also exploited, and a partitioning verification procedure is developed. To illustrate the merits of the proposed system, a HW/SW codesign implementation example based on the G.728 LD-CELP decoder for speech compression is described.