ISBN: 354067442X (print)
We present an FPGA-based parallel hardware-software architecture for computing the Discrete Wavelet Transform (DWT) using the Recursive Merge Filtering (RMF) algorithm. The DWT is built bottom-up in log N steps, successively building complete DWTs by "merging" two smaller DWTs and applying the wavelet filter to only the "smooth" or DC coefficient from the smaller DWTs. The main bottleneck of this algorithm is the data routing process, which can be reduced by separating the computations into two types to introduce parallelism. This is achieved by using a virtual mapping structure to map the input, which transforms the data routing bottleneck into simple arithmetic computations on the mapping structure. Because the mapping structure is held in FPGA RAM, the total number of data accesses to main memory is reduced. The architecture shows how data routing in this problem can be transformed into a series of index computations.
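The bottom-up merge step can be illustrated in software. The sketch below is only an illustration under stated assumptions: it uses the orthonormal Haar filter (the abstract does not name the wavelet), stores coefficients as `[smooth, coarsest detail, ..., finest details]`, and the function names `merge` and `rmf_dwt` are hypothetical. It does not model the FPGA mapping structure or the parallel split described in the abstract.

```python
import math

def merge(left, right):
    """Merge two Haar DWTs of equal length n into one DWT of length 2n.

    Only the two smooth (DC) coefficients are filtered; all detail
    coefficients are copied into their new positions, which is pure
    index arithmetic.
    """
    n = len(left)
    root2 = math.sqrt(2)
    out = [(left[0] + right[0]) / root2,   # new smooth coefficient
           (left[0] - right[0]) / root2]   # new coarsest detail
    width = 1
    while width < n:
        # Detail band of this level: left half's band, then right half's.
        out.extend(left[width:2 * width])
        out.extend(right[width:2 * width])
        width *= 2
    return out

def rmf_dwt(x):
    """Build the full Haar DWT of x in log2(len(x)) merge rounds."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    blocks = [[float(v)] for v in x]       # a length-1 DWT is the sample itself
    while len(blocks) > 1:
        blocks = [merge(blocks[i], blocks[i + 1])
                  for i in range(0, len(blocks), 2)]
    return blocks[0]
```

Each pass of the `while` loop in `rmf_dwt` halves the number of blocks, so an input of length N is processed in log2 N rounds, matching the step count stated in the abstract.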
ISBN: 9780769508436 (print)
New fast and highly complex field programmable gate arrays (FPGAs) allow the design of sophisticated decision logic within the trigger latency time of particle detectors. As an example we show the jet determination of the HERA-H1 detector at DESY (Deutsches Elektronen Synchrotron), Hamburg. It has to calculate all existing localized energy depositions (jets) in the calorimeter and deliver the result, sorted according to energy. The system is implemented by a network of 3 × 440 high-density FPGAs, which have to deliver the results in less than 1 μs. The computing power of the system is equivalent to 70 billion operations per second.
ISBN: 9781581132441 (print)
Recently a number of heuristic-based system-level synthesis algorithms have been proposed. Though these algorithms quickly generate good solutions, how close these solutions are to optimal is a question that is difficult to answer. While current exact techniques produce optimal results, they fail to produce them in reasonable time. This paper presents a synthesis algorithm that produces solutions of guaranteed quality (optimal in most cases or within a known bound) with practical synthesis times (a few seconds to minutes). It takes a unified look (the lack of which is one of the main sources of sub-optimality in the heuristic techniques) at different aspects of system synthesis such as pipelining, selection, allocation, scheduling and FPGA reconfiguration. Our technique can handle both time-constrained and resource-constrained synthesis problems. We present results of our algorithm implemented as part of the Match project at Northwestern University.
The proceedings contain 25 papers from the 1998 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA). Topics discussed include: new FPGA architectures; technology mapping for FPGAs; multi-FPGA systems and other reprogrammable architectures; partitioning and floor planning for FPGAs; fault detection and fault tolerance for FPGAs; fast computer aided design (CAD) tools for FPGAs; time multiplexed FPGAs; FPGAs with embedded memory; and programmable architectures with special features.
Dynamically reconfigurable FPGAs have the potential to dramatically improve logic density by time-sharing a physical FPGA device. This paper presents a network-flow based partitioning algorithm for dynamically reconfigurable FPGAs based on the architecture in [2]. Experiments show that our approach outperforms the enhanced force-directed scheduling method in [2] in terms of communication cost.
An approach for runtime mapping is proposed that utilizes self-reconfigurability of multicontext field programmable gate arrays (FPGAs) to achieve very high speedups over existing approaches. The idea is to design and map logic onto a multicontext FPGA that in turn maps problem instance dependent logic onto other contexts of the same FPGA. As a result, computer aided design tools need to be used just once for each problem and not once for every problem instance as is usually done.
This paper describes the hardware implementation of the Generalized Profile Search algorithm using online arithmetic and redundant data representation. This is part of the GenStorm project, aimed at providing a dedicated computer for biological sequence processing based on reconfigurable hardware using FPGAs. The serial evaluation of the result, made possible by a redundant data representation, leads to a significant increase in data throughput in comparison with standard non-redundant data coding.
A new search-based satisfiability (SAT) formulation that can handle an entire field programmable gate array (FPGA), routing all nets concurrently, is presented. The approach relies on a recently developed SAT engine that uses systematic search with conflict-directed nonchronological backtracking, capable of handling very large SAT instances. Preliminary experimental results suggest that this approach to FPGA routing is more viable than earlier binary decision diagram-based methods.
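To make the idea of routing all nets concurrently as one SAT instance concrete, here is a toy sketch. Everything in it is an assumption for illustration: the abstract does not give the encoding, so the sketch uses a simple track-assignment model (each net takes exactly one track, and nets that share a channel segment may not take the same track), and the hypothetical function `route_as_sat` checks the clauses by exhaustive enumeration rather than with the conflict-directed engine the abstract refers to.

```python
from itertools import product

def route_as_sat(nets, tracks, conflicts):
    """Return {net: track} satisfying all clauses, or None if unroutable.

    Boolean variable (n, t) is True when net n is assigned to track t.
    A clause is a list of (variable, required_value) literals.
    """
    variables = [(n, t) for n in nets for t in range(tracks)]
    clauses = []
    for n in nets:
        # Each net uses at least one track ...
        clauses.append([((n, t), True) for t in range(tracks)])
        # ... and at most one track.
        for t1 in range(tracks):
            for t2 in range(t1 + 1, tracks):
                clauses.append([((n, t1), False), ((n, t2), False)])
    for a, b in conflicts:
        # Nets sharing a channel segment cannot share a track.
        for t in range(tracks):
            clauses.append([((a, t), False), ((b, t), False)])
    # Exhaustive search: only feasible for tiny instances.
    for bits in product([False, True], repeat=len(variables)):
        assign = dict(zip(variables, bits))
        if all(any(assign[v] == want for v, want in c) for c in clauses):
            return {n: t for (n, t), used in assign.items() if used}
    return None
```

Because every net's constraints sit in the same clause set, a satisfying assignment routes all nets at once, and an unsatisfiable result shows that no assignment exists in this model for the given number of tracks.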
ISBN: 0769500137 (print)
In this paper, we address the routability and analog performance issues involved in routing for array-based FPAAs that have single-segment horizontal and vertical routing resources. We then present FAAR (Field-Programmable Analog Array Router) and describe a routing algorithm developed for the target array-based FPAA architecture. A sequential routing technique is used for routing multi-terminal nets as well as multiple nets. Multi-terminal nets are broken into two-terminal pairs and routed. We use the notion of resource demand as a measure of the effect of a net-route on the routing of the other nets, while the number of programmable switches and the net-crossings are used as the metrics of interconnect parasitics. We present experiments to study the effect of various parameters such as the number of nets, terminals, CABs and IO cells on the routing as well as the performance degradation. FAAR routes with high efficiency while keeping performance degradation small, and has considerably short execution times.
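The abstract does not say how FAAR chooses the two-terminal pairs. As a hedged illustration of one common way to break a multi-terminal net into pairs, the sketch below builds a minimum spanning tree over the terminals with Prim's algorithm under Manhattan distance. The function names and the grid-coordinate representation of terminals are assumptions, not FAAR's actual method.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) grid positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def two_terminal_pairs(terminals):
    """Split a multi-terminal net into len(terminals) - 1 two-terminal pairs.

    Prim's algorithm: repeatedly connect the closest unconnected terminal
    to any terminal already in the tree. Assumes terminals are distinct.
    """
    if len(terminals) < 2:
        return []
    in_tree = [terminals[0]]
    remaining = list(terminals[1:])
    pairs = []
    while remaining:
        src, dst = min(
            ((s, d) for s in in_tree for d in remaining),
            key=lambda p: manhattan(p[0], p[1]),
        )
        pairs.append((src, dst))
        in_tree.append(dst)
        remaining.remove(dst)
    return pairs
```

Each returned pair could then be handed to a two-terminal router in sequence, which is the general shape of the sequential technique the abstract describes.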
The placement phase of the compile process and an ultrafast placement algorithm targeted to field programmable gate arrays (FPGAs) are presented. The algorithm is based on a combination of multiple-level, bottom-up clustering and hierarchical simulated annealing. It provides superior area results over a known high-quality placement tool on a set of large benchmark circuits, when both are restricted to a short run time. In addition, operating in its fastest mode, this tool can provide an accurate estimate of the wirelength achievable with good quality placement. This can be used in conjunction with a routing predictor to determine the routability of a given circuit on a given FPGA device.