We present a general technology-mapping methodology (TULIP) for field-programmable gate arrays (FPGAs) that can yield optimal results and is applicable to any FPGA with a logic block composed of lookup tables (LUTs). We introduce the concept of a virtual switch to model the internal connections of a logic block with multiple LUTs; each configuration of virtual switches is called a multiple-LUT block (MLB). A logic block can be precisely defined by a small but complete set of representative configurations called an MLB basis. The MLB bases for various commercial FPGA families are demonstrated. Given a logic block represented by its MLB basis, technology mapping is precisely formulated as a graph-covering problem, which is transformed into a mixed-integer linear programming (MILP) optimization problem in order to achieve our optimality and generality objectives. The MILP model is solved using a general-purpose MILP solver. The results of using TULIP for mapping some ISCAS-85 benchmark circuits to a variety of logic blocks are presented. Circuits of a few hundred gates can be mapped directly in a few minutes. To map larger circuits to complex logic blocks, some approximation techniques are proposed based on partitioning the input circuit and simplifying the MLB basis. We show that these approximations result in close-to-optimal mappings of the benchmark circuits.
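The idea of casting a covering problem as a 0-1 optimization can be illustrated with a deliberately simplified sketch. This is not TULIP's MILP model: it ignores MLB bases, virtual switches, and the connectivity constraints between chosen blocks, and it replaces the MILP solver with exhaustive enumeration. The gate names, the candidate groupings, and the function `min_cover` are all hypothetical.

```python
from itertools import product

def min_cover(nodes, candidates):
    """Pick the fewest candidate groups so that every node is covered.

    Each candidate j gets a binary variable x_j; we minimise sum(x_j)
    subject to every node lying in at least one chosen candidate.
    Exhaustive enumeration stands in for a real MILP solver (assumption).
    """
    required = set(nodes)
    best = None
    for choice in product((0, 1), repeat=len(candidates)):
        covered = set()
        for x, group in zip(choice, candidates):
            if x:
                covered |= group
        if covered >= required and (best is None or sum(choice) < sum(best)):
            best = choice
    return best

# Hypothetical four-gate circuit and candidate LUT groupings.
gates = {"g1", "g2", "g3", "g4"}
candidate_luts = [
    {"g1"}, {"g2"}, {"g3"}, {"g4"},       # one gate per LUT
    {"g1", "g2"}, {"g3", "g4"},           # two gates packed into one LUT
    {"g2", "g3"},
]
solution = min_cover(gates, candidate_luts)
chosen = [sorted(g) for x, g in zip(solution, candidate_luts) if x]
```

Enumeration grows exponentially with the number of candidates, which is one reason a practical flow would hand the same variables and constraints to a dedicated solver.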
The problem of finite state machine (FSM) encoding for low power in field-programmable gate arrays (FPGAs) is addressed. In this technology, one-hot encoding is typically recommended for large FSMs and binary encoding for small FSMs. A partitioned encoding approach is proposed which uses a combination of both binary encoding and zero-one-hot encoding with intermediate code size. Experimental results demonstrate that the proposed encoding approach can produce significant power savings.
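The encodings named above can be compared with a short sketch. It assumes the common definition of zero-one-hot encoding, in which one state takes the all-zero code and every other state sets exactly one bit, so n states need n - 1 flip-flops. The function names are hypothetical, and the partitioned scheme proposed in the paper is not reproduced here.

```python
def binary_encoding(num_states):
    """Minimum-width binary codes: ceil(log2(n)) bits, at least one."""
    width = max(1, (num_states - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(num_states)]

def one_hot_encoding(num_states):
    """One flip-flop per state; exactly one bit is set in each code."""
    return ["".join("1" if j == i else "0" for j in range(num_states))
            for i in range(num_states)]

def zero_one_hot_encoding(num_states):
    """Assumed definition: state 0 is all zeros, the rest are one-hot."""
    if num_states < 2:
        raise ValueError("need at least two states")
    width = num_states - 1
    rest = ["".join("1" if j == i else "0" for j in range(width))
            for i in range(width)]
    return ["0" * width] + rest

def bit_flips(code_a, code_b):
    """Hamming distance: flip-flops that toggle on a state transition."""
    return sum(a != b for a, b in zip(code_a, code_b))

# Code width (flip-flop count) for a hypothetical 5-state machine.
widths = {
    "binary": len(binary_encoding(5)[0]),
    "one-hot": len(one_hot_encoding(5)[0]),
    "zero-one-hot": len(zero_one_hot_encoding(5)[0]),
}
```

Bit flips per transition are only a rough proxy for switching activity; actual power also depends on the next-state logic and on how often each transition occurs.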
Reconfigurable architectures that tightly integrate a standard CPU core with a field-programmable hardware structure have recently been receiving increased attention. The design of such a hybrid reconfigurable processor involves a multitude of design decisions regarding the field-programmable structure as well as its system integration with the CPU core. Determining the impact of these design decisions on the overall system performance is a challenging task. In this paper, we first present a framework for the cycle-accurate performance evaluation of hybrid reconfigurable processors on the system level. Then, we discuss a reconfigurable processor for data-streaming applications, which attaches a coarse-grained reconfigurable unit to the coprocessor interface of a standard embedded CPU core. By means of a case study we evaluate the system-level impact of certain design features for the reconfigurable unit, such as multiple contexts, register replication, and hardware context scheduling. The results illustrate that a system-level evaluation framework is of paramount importance for studying the architectural trade-offs and optimizing design parameters for reconfigurable processors. (C) 2004 Elsevier B.V. All rights reserved.
As integrated circuits become increasingly complex and expensive, the ability to make post-fabrication changes will become much more attractive. This ability can be realized using programmable logic cores. Currently, such cores are available from vendors in the form of "hard" rectangular layouts. In this paper, we focus on an alternative approach for fine-grain programmability: vendors supply a synthesizable RTL version of their programmable logic core (a "soft" core) and the integrated circuit designer synthesizes the programmable logic fabric using standard cells. Although this technique suffers in terms of speed, density and power overhead, the task of integrating such cores is far easier than the task of integrating "hard" cores into an ASIC or SoC. When the required amount of programmable logic is small, this ease of use may be more important than the increased overhead. This paper presents two synthesizable "soft" programmable logic core architectures and describes their associated place and route issues. We compare the two architectures to each other and to a "hard" programmable logic core. We also show how these cores can be made more efficient by creating a nonrectangular architecture, an option not usually available to "hard" core vendors. Finally, a proof-of-concept integrated circuit containing one of these cores is described.
A sensorless induction spindle motor drive using synchronous PWM (SPWM) and a dead-time compensator with a recurrent fuzzy-neural network (RFNN) speed controller is proposed in this study for advanced spindle motor applications. First, the operating principles of a new type of SPWM technique and the dead-time compensator circuit, implemented using field-programmable gate arrays (FPGAs), are described. Then, a speed observer based on a modified Luenberger observer is adopted to estimate the rotor speed. Moreover, since the control characteristics and motor parameters for a high-speed induction spindle motor drive are time-varying, an RFNN speed controller is developed to reduce the influence of parameter uncertainties and external disturbances. In addition, the RFNN is trained on-line using a delta adaptation law. Finally, the performance of the proposed sensorless induction spindle motor drive system is demonstrated using simulated and experimental results. (C) 2003 Elsevier B.V. All rights reserved.
In this paper, we present two alternative architectures for implementing the Rivest-Shamir-Adleman (RSA) algorithm on reconfigurable hardware. Both architectures are innovative, especially with respect to the implementation of modular multiplication. With respect to the area-time trade-off, the two solutions lie at the extremes of the design space, since one adopts a word-serial approach, while the other has a fully parallel organization. Based on the analysis of these architectures for different values of the serialization factor, we explore the design space for the field-programmable gate array (FPGA)-based implementation of the RSA algorithm. We systematically analyze and compare the results of the two design processes with respect to two fundamental metrics, namely execution time and FPGA resource usage. We emphasize the pros and cons and comment on the trade-offs of the two design alternatives. (C) 2004 Elsevier B.V. All rights reserved.
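To show why modular multiplication dominates an RSA implementation, here is a minimal software sketch of modular exponentiation built entirely from a serial modular multiplier. The abstract does not say which multiplication algorithm the two architectures use; the bit-serial shift-and-add scheme below is a generic textbook choice, and the function names and the toy key (n = 3233, e = 17, d = 2753) are illustrative assumptions.

```python
def mod_mul_bit_serial(a, b, n):
    """Compute (a * b) mod n by scanning b one bit at a time, MSB first.

    Each step doubles the running result, conditionally adds a, and
    reduces. With a < n and result < n, the value stays below 3n, so at
    most two subtractions are needed per step.
    """
    a %= n
    result = 0
    for bit in bin(b)[2:]:
        result <<= 1
        if bit == "1":
            result += a
        while result >= n:
            result -= n
    return result

def mod_exp(base, exponent, n):
    """Left-to-right square-and-multiply using only mod_mul_bit_serial."""
    base %= n
    result = 1 % n
    for bit in bin(exponent)[2:]:
        result = mod_mul_bit_serial(result, result, n)
        if bit == "1":
            result = mod_mul_bit_serial(result, base, n)
    return result

# Toy RSA round trip with a textbook-sized key (far too small for real use).
n, e, d = 3233, 17, 2753
message = 65
ciphertext = mod_exp(message, e, n)
recovered = mod_exp(ciphertext, d, n)
```

A k-bit exponent costs up to 2k modular multiplications, each taking k serial steps in this sketch. Processing more bits of the operand per step trades area for time, which is the kind of serialization trade-off the abstract describes.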
The gate utilization of FPGAs and the speed of emulation in a multi-FPGA system are limited by the interconnection architecture and the number of pins. Time-multiplexing of interconnection wires is required for multi-FPGA systems incorporating several state-of-the-art FPGAs. This article proposes a circuit partitioning algorithm called SCheduling driven Algorithm for TOMi (SCATOMi) for multi-FPGA systems with an interconnection architecture called Time-multiplexed, Off chip, Multi-casting interconnection (TOMi). SCATOMi improves the performance of the TOMi architecture by limiting the number of inter-FPGA signal transfers on the critical path and considering the scheduling of inter-FPGA signal transfers. Partitions produced by SCATOMi run 5.5 times faster than those produced by traditional partitioning algorithms. Experiments on architecture comparison show that, by adopting the proposed TOMi interconnection architecture along with SCATOMi, the pin count is reduced to 15.2-81.3% while the critical path delay is reduced to 46.1-67.6% compared to traditional architectures. (C) 2004 Elsevier B.V. All rights reserved.
In this paper, we propose the idea of temporal logic replication in dynamically reconfigurable field-programmable gate array partitioning to reduce the communication cost. We show that this is a very effective means to reduce the communication cost by taking advantage of the slack logic capacity available. Given a K-stage temporal partition, the min-area min-cut replication problem is defined and we present an optimal algorithm to solve it. We also present a flow-based replication heuristic which is applicable when there is a tight area bound that limits the amount of possible replication. In addition, we show a correct network flow model for partitioning sequential circuits temporally and propose a new hierarchical flow-based performance-driven partitioner for computing initial partitions without replication.
Wearable computers are embedded into the mobile environment of their users. A design challenge for wearable systems is to combine the high performance required for tasks such as video decoding with the low energy consumption required to maximise battery runtimes and the flexibility demanded by the dynamics of the environment and the applications. In this paper, we demonstrate that reconfigurable hardware technology is able to answer this challenge. We present the concept and the prototype implementation of an autonomous wearable unit with reconfigurable modules (WURM). We discuss experiments that show the uses of reconfigurable hardware in WURM: ASICs-on-demand and adaptive interfaces. Finally, we present an experiment with an operating system layer for WURM.
The software-programmable multiprocessor architecture has been employed extensively over the past two decades for embedded signal-processing applications. However, the increased complexity of such systems has, in many cases, required the use of hardware acceleration to meet the growing time-critical aspects of the design. Today's field-programmable gate arrays (FPGAs) offer an alternative or additional acceleration platform, especially to an application-specific integrated circuit (ASIC). However, the traditional low-level development methods, such as schematic capture or hardware description languages (HDLs), employed to implement these hardware-accelerated parts of the design result in a design lifecycle mismatch with the rapid development techniques available for the software-programmable parts. This paper presents high-level design languages that enable users to generate netlists for FPGAs directly from high-level C-like languages, thereby offering an equivalent programming solution to that available with microprocessors. It details how one of these languages can be integrated into a high-level design flow for the rapid development of heterogeneous embedded signal-processing systems and presents results from a benchmark.