Dynamically reconfigurable FPGAs have the potential to dramatically improve logic density by time-sharing a physical FPGA device. This paper presents a network-flow-based partitioning algorithm for dynamically reconfigurable FPGAs based on the architecture in [2]. Experiments show that our approach outperforms the enhanced force-directed scheduling method in [2] in terms of communication cost.
Over the past decade, much progress has been made to advance the acceleration of sparse linear operators such as SpMM and SpMV on FPGAs. Nevertheless, few works have attempted to address sparse triangular solver (SpTR...
This paper describes the hardware implementation of the Generalized Profile Search algorithm using online arithmetic and redundant data representation. This is part of the GenStorm project, aimed at providing a dedicated computer for biological sequence processing based on reconfigurable hardware using FPGAs. The serial evaluation of the result, made possible by a redundant data representation, leads to a significant increase in data throughput compared with standard non-redundant data coding.
The placement phase of the compile process and an ultrafast placement algorithm targeted to field-programmable gate arrays (FPGAs) are presented. The algorithm is based on a combination of multiple-level, bottom-up clustering and hierarchical simulated annealing. It provides superior area results over a known high-quality placement tool on a set of large benchmark circuits when both are restricted to a short run time. In addition, operating in its fastest mode, this tool can provide an accurate estimate of the wirelength achievable with good-quality placement. This can be used in conjunction with a routing predictor to determine the routability of a given circuit on a given FPGA device.
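As a rough illustration of the annealing component only, the sketch below places cells on a small grid by flat simulated annealing, using half-perimeter wirelength (HPWL) as the cost. It is not the algorithm from the abstract: the multiple-level clustering and the hierarchical annealing schedule are omitted, and the function names, grid model, cost function, and cooling parameters are all assumptions chosen for the example.

```python
import math
import random


def hpwl(nets, pos):
    """Half-perimeter wirelength: sum of each net's bounding-box half-perimeter."""
    total = 0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_placement(cells, nets, width, height, seed=0,
                     t_start=5.0, t_min=0.01, cooling=0.9, moves_per_temp=100):
    """Flat simulated annealing over a width x height grid (hypothetical model)."""
    rng = random.Random(seed)
    slots = [(x, y) for x in range(width) for y in range(height)]
    if len(cells) > len(slots):
        raise ValueError("grid too small for the given cells")
    rng.shuffle(slots)
    # slots[i] is the site of cells[i] for i < len(cells); the rest are empty sites.
    pos = {cell: slots[i] for i, cell in enumerate(cells)}
    cost = hpwl(nets, pos)
    best_pos, best_cost = dict(pos), cost
    if len(slots) < 2:
        return best_pos, best_cost  # nothing to swap

    def swap(i, j):
        # Exchange the contents of two sites and refresh the affected cell positions.
        slots[i], slots[j] = slots[j], slots[i]
        for k in (i, j):
            if k < len(cells):
                pos[cells[k]] = slots[k]

    t = t_start
    while t > t_min:
        for _ in range(moves_per_temp):
            i, j = rng.sample(range(len(slots)), 2)
            swap(i, j)
            new_cost = hpwl(nets, pos)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                swap(i, j)  # undo the rejected move
        t *= cooling
    return best_pos, best_cost
```

A short schedule (fewer temperatures or fewer moves per temperature) trades placement quality for run time, which is the regime the abstract targets; the returned HPWL can then serve as a quick wirelength estimate.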
ISBN: 9783954048533 (print)
Experiments on Synthetic Aperture Radar require a data acquisition and storage system. Custom-made systems are well suited to this specific application, but developing such a system is both time- and resource-consuming. The use of an off-the-shelf FPGA evaluation board with interchangeable analogue-to-digital converter cards as a data acquisition and pre-processing system was proposed. Such an approach allows not only for operation with different analogue front-ends, but also for the implementation of data pre-processing, such as modulation and decimation. The modulation makes further processing easier, and the filtering and decimation reduce the size of the data stream. A complete system was tested thoroughly during a ground-based experiment and is now ready for airborne testing. The unused resources in the FPGA will be utilized in the future for the implementation of real-time processing.
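The pre-processing chain mentioned above can be sketched in software as follows. This is a minimal model under stated assumptions, not the FPGA design from the abstract: the modulation step is assumed to be a digital mix with a complex carrier, the filter is assumed to be a simple boxcar (moving-average) filter, and the function names and parameters are hypothetical.

```python
import cmath
import math


def mix_to_baseband(samples, f_carrier, f_sample):
    """Multiply real samples by a complex carrier to shift f_carrier down to 0 Hz."""
    return [s * cmath.exp(-2j * math.pi * f_carrier * n / f_sample)
            for n, s in enumerate(samples)]


def boxcar_decimate(samples, factor):
    """Average each non-overlapping block of `factor` samples and keep one output
    per block, so the output rate is the input rate divided by `factor`."""
    out = []
    for start in range(0, len(samples) - factor + 1, factor):
        block = samples[start:start + factor]
        out.append(sum(block) / factor)
    return out


# Usage: a real tone at the carrier frequency, sampled at 8 kHz.
f_sample = 8000.0
f_carrier = 1000.0
tone = [math.cos(2 * math.pi * f_carrier * n / f_sample) for n in range(64)]
baseband = boxcar_decimate(mix_to_baseband(tone, f_carrier, f_sample), 8)
```

Mixing moves the signal of interest to baseband, where a low-pass filter removes the image at twice the carrier frequency; decimating by 8 then cuts the stream from 64 samples to 8.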
A new search-based satisfiability (SAT) formulation that can handle an entire field-programmable gate array (FPGA), routing all nets concurrently, is presented. The approach relies on a recently developed SAT engine that uses systematic search with conflict-directed nonchronological backtracking and is capable of handling very large SAT instances. Preliminary experimental results suggest that this approach to FPGA routing is more viable than earlier binary decision diagram-based methods.
An approach for runtime mapping is proposed that utilizes the self-reconfigurability of multicontext field-programmable gate arrays (FPGAs) to achieve very high speedups over existing approaches. The idea is to design and map logic onto a multicontext FPGA that in turn maps problem-instance-dependent logic onto other contexts of the same FPGA. As a result, computer-aided design tools need to be used just once for each problem, not once for every problem instance as is usually done.
ISBN: 0769525237 (print)
Unlike their hard real-time counterparts, soft real-time applications are only expected to guarantee their "expected delay" over the input data space. This paradigm shift calls for customized statistical design techniques to replace the conventional pessimistic worst-case analysis methodologies. Statistical design methods can provide a realistic assessment of the design space and improve design quality by exploiting its stochastic behavior. We present a novel probabilistic time budgeting algorithm that translates the application's expected delay constraint into delay constraints for its components. Our algorithm, which is based on mathematical properties of the problem, determines the optimal maximum weighted timing relaxation of an application under an expected delay constraint. Experimental results on core-based synthesis of several multimedia applications on FPGAs show about 20% and 19% average energy and area improvement, respectively.
An FPGA configuration method named configuration cloning is developed to exploit spatial and temporal regularity and locality in algorithms and architectures by copying and operating on the configuration bit-stream already resident in an FPGA. The method yields speed and power improvements over off-chip partial reconfiguration techniques without requiring additional interconnects or control hardware. Cloning requires only a small amount of hardware overhead. Digital signal processing applications are discussed to demonstrate order-of-magnitude reductions in configuration time and power.
ISBN: 9781605581095 (print)
The design process for chip multiprocessors (CMPs) requires extremely long simulation times to explore performance, power, and thermal issues, particularly when operating system (OS) effects are included. In response, our novel FPGA-based emulation methodology models a full CMP design including applications and an OS. Activity counters programmed into the cores feed per-component microarchitectural power models. These models achieve under 10% error compared to detailed gate-level simulations. Our method retains software flexibility but offers up to 35x speedup compared to full-system software simulations. We present our approach by emulating a 2-core Leon3 cache-coherent multiprocessor running Linux and parallel benchmarks. In an example case study, our emulated system uses activity counts (a proxy for temperature) to guide process migration between the CMP cores. Overall, this paper's methodology makes possible detailed power and thermal studies of CMPs and their operating systems. Copyright 2008 ACM.
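One common way to turn activity counters into a power estimate is a linear model: each counted event is charged a fixed energy, and the total is divided by the sampling interval. The sketch below illustrates that idea together with a counter-guided migration choice. The abstract does not give the actual form of the paper's models, so the linear model, the component names, the energy coefficients, and both function names are assumptions made for illustration.

```python
def estimate_power_mw(counts, energy_per_event_nj, static_power_mw, interval_s):
    """Linear counter-based power estimate (hypothetical model).

    counts: events observed per component during the interval
    energy_per_event_nj: assumed energy cost of one event, in nanojoules
    """
    dynamic_energy_nj = sum(counts[name] * energy_per_event_nj[name]
                            for name in counts)
    # nJ -> J (1e-9), divide by seconds for watts, then W -> mW (1e3).
    dynamic_power_mw = dynamic_energy_nj * 1e-9 / interval_s * 1e3
    return static_power_mw + dynamic_power_mw


def pick_migration_target(activity_by_core):
    """Return the core with the lowest activity count, used here as a stand-in
    for the coolest core."""
    return min(activity_by_core, key=activity_by_core.get)


# Usage with made-up numbers for a 1 ms sampling interval.
power = estimate_power_mw(
    counts={"alu": 1_000_000, "cache": 500_000},
    energy_per_event_nj={"alu": 0.1, "cache": 0.4},
    static_power_mw=50.0,
    interval_s=0.001,
)
target = pick_migration_target({"core0": 900_000, "core1": 200_000})
```

In a real flow the per-event energies would be calibrated against gate-level simulation, which is where an error figure such as the one quoted above would be measured.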