Asynchronous designs have been touted as having potential advantages in average performance, power consumption, modularity, and tolerance of metastability as compared to traditional synchronous logic. While delay-insensitive (DI) asynchronous circuits are theoretically the most desirable type of asynchronous logic because they make the weakest timing assumptions, the complexity of implementing DI circuits in CMOS or similar technologies may make them impractical to use. The fact that event-based DI circuits are ill matched to CMOS does not necessarily mean that they are inherently inefficient, however. In this paper, we show that using Rapid Single Flux Quantum (RSFQ) superconducting circuits, in which information is represented as discrete voltage pulses or magnetic flux quanta, many powerful DI circuit primitives can be implemented at least as efficiently as Boolean logic gates. Since DI logic also alleviates the severe clock skew problems that can be expected at switching speeds approaching a terahertz in this technology, it may well be a more practical basis for digital circuit design than the alternatives traditionally used for CMOS.
We propose a quantitative measure for evaluating the timing reliability of asynchronous circuits designed under a variety of delay models. Using the measure, we evaluate the timing reliability, as well as the speed performance and hardware cost, of various building blocks of asynchronous systems. Finally, we give a guideline for choosing valid delay models for the design of dependable asynchronous processors.
ISBN: (print) 0818683929
This paper presents the architecture and design of a high-performance asynchronous Huffman decoder for compressed-code embedded processors. In such processors, embedded programs are stored in compressed form in instruction ROM, then are decompressed on demand during instruction cache refill. The Huffman decoder is used as a code decompression engine. The circuit is non-pipelined, and is implemented as an iterative self-timed ring. It achieves a high-speed decode rate with very low area overhead. Simulations using Lsim show an average throughput of 32 bits/25 ns on the output side (or 163 MBytes/sec, or 1303 Mbit/sec), corresponding to about 889 Mbit/sec on the input side. The area of the design is extremely small: under 1 mm² in a 0.8 micron full-custom layout. The decoder is estimated to have higher throughput than any comparable synchronous Huffman decoder (after normalizing for feature size and voltage), yet is much smaller than synchronous designs. Its performance is also 83% faster than a recently published asynchronous Huffman decoder using the same technology.
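The throughput figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check on the quoted numbers. The function and variable names are illustrative and not taken from the paper, and reading the input/output ratio as a compression ratio is an assumption.

```python
def mbit_per_s(bits: int, ns: float) -> float:
    """Convert 'bits per interval of ns nanoseconds' to Mbit/s.

    One bit per nanosecond equals 1000 Mbit/s.
    """
    return bits / ns * 1000.0


# Figures quoted in the abstract.
quoted_out_mbit = 1303.0   # output-side throughput, Mbit/s
quoted_in_mbit = 889.0     # input-side throughput, Mbit/s

# Rate implied by the rounded "32 bits/25 ns" figure.
nominal_out_mbit = mbit_per_s(32, 25.0)

# Output rate in MBytes/s, for comparison with the quoted 163 MBytes/s.
quoted_out_mbytes = quoted_out_mbit / 8

# Average time per 32-bit output word implied by 1303 Mbit/s.
implied_word_time_ns = 32 / quoted_out_mbit * 1000.0

# Compressed input bits consumed per decompressed output bit
# (interpreting this as the compression ratio is an assumption).
in_out_ratio = quoted_in_mbit / quoted_out_mbit
```

Under these assumptions, the rounded "32 bits/25 ns" figure corresponds to 1280 Mbit/sec, so the quoted 1303 Mbit/sec suggests an average word time slightly under 25 ns (about 24.6 ns). The input/output ratio works out to roughly 0.68.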
ISBN: (print) 0769524982
A fast asynchronous shift register is used as the serializer and de-serializer in a novel bit-serial on-chip communication link. The link employs two-phase transition-based LEDR encoding. Acknowledgement is generated only at the word level, rather than bit by bit. The shift register is designed to achieve a bit time of a single gate delay. It is based on a wave-pipelined control path and on transition latches. The circuit achieved a 67 Gbps data rate when simulated in 65 nm CMOS technology and was immune to in-die process variations of up to 10 sigma.
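As background on the encoding named in this abstract, LEDR (level-encoded dual-rail) sends each bit on two wires so that exactly one wire toggles per bit. The following is a minimal behavioral sketch of the generic scheme, not the paper's circuit. The function names, the (data, parity) wire naming, and the choice of (0, 0) as the idle state are assumptions made for illustration.

```python
def ledr_encode(bits):
    """Encode bits as (data, parity) wire levels under LEDR.

    The data wire carries the bit value. The parity wire is set so that
    data XOR parity equals the phase, which alternates on every bit.
    Assumption: the link idles at (0, 0), i.e. phase 0, so the first
    bit is sent with phase 1.
    """
    symbols = []
    for i, bit in enumerate(bits):
        phase = (i + 1) % 2
        symbols.append((bit, bit ^ phase))
    return symbols


def ledr_decode(symbols):
    """Recover the bits: the data wire level is the bit value."""
    return [data for data, _parity in symbols]


def wires_changed(prev, cur):
    """Count how many of the two wires toggled between two symbols."""
    return int(prev[0] != cur[0]) + int(prev[1] != cur[1])


bits = [1, 1, 0, 1, 0, 0]
symbols = ledr_encode(bits)
# Include the assumed idle state so the first transition is checked too.
transitions = [wires_changed(a, b) for a, b in zip([(0, 0)] + symbols, symbols)]
```

In this sketch every entry of `transitions` is 1: repeating a bit toggles only the parity wire, and changing the bit toggles only the data wire. This is why a receiver can detect each new bit from a single transition without a separate clock.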
ISBN: (print) 9798350305760
The design of asynchronous circuits typically requires a judicious definition of signals and modules, combined with a proper specification of their timing constraints, which can be a complex and error-prone process using standard Hardware Description Languages (HDLs). In this paper we introduce Yak, a new dataflow description language for asynchronous bundled data circuits. Yak allows designers to generate Verilog and timing constraints automatically from a textual description of bundled data control flow structures and combinational logic blocks. The timing constraints are generated using the Local Clock Set (LCS) methodology and can be consumed by standard industry tools. Yak includes ergonomic language features such as structured bindings of channels undergoing fork and join operations, named value scope propagation along channels, and channel typing. Here we present Yak's language front-end and compare the automated synthesis and layout results of an example circuit with a manual constraint specification approach.
This paper presents the circuit design for phase alignment in a Digital Frequency Synthesizer (DFS), taking advantage of asynchronous level-mode state machines. An example of a real-case asynchronous design is presented that provides superior results to alternative solutions. The designs are implemented in the Xilinx Spartan™-III family, a field-programmable device in 90 nm technology. We explain the specific clock management application and the circuits for our designs, followed by a summary of the final results. Our silicon results indicate functionality improvement, area decrease, and jitter reduction compared to alternatives. In addition, taking advantage of novel asynchronous circuits saves engineering effort during silicon characterization and design of future generations of products.
Over the past decade, the design of low-power processors has been a primary requirement of emerging applications such as the Internet of Things (IoT) and neuromorphic chips. Therefore, there has been renewed interest in asynchronous circuits for their low power consumption and robustness. However, one of the main obstacles is the lack of commercial EDA tool support, which makes asynchronous design time-consuming and poorly suited to industrial adoption. This brief proposes a new methodology for implementing asynchronous phase-decoupled click-based circuits with traditional EDA tools. To perform static timing analysis in both the control and data paths, we capture asynchronous event propagation via generated clocks. Moreover, we present an adaptive-pipeline asynchronous RISC-V processor implemented on an FPGA, the Xilinx ZCU102 board. The implementation result shows that the asynchronous RISC-V processor achieves a 3x dynamic power improvement over the synchronous one with similar resource usage.
ISBN: (print) 9780769543833
This paper describes the realization of an interconnect Delay Insensitive (DI) FPGA architecture with distributed asynchronous control. This architecture maintains the basic block structure of traditional FPGAs, allowing the potential use of existing FPGA design tools in block design. This asynchronous FPGA architecture is mainly aimed at tolerating the unpredictable delay variations caused by process and environment variations in current and future VLSI technology nodes. It also targets power supply variations, including modes such as dynamic voltage scaling and variable Vdd, for example in applications featuring energy harvesting. This is achieved by making the longer inter-block interconnects DI, keeping the computational logic single-rail, and removing global clocks.
ISBN: (print) 9781479987153
This paper addresses power reduction techniques for ultra-low-power integrated circuits. We propose to implement non-volatile asynchronous circuits which will have quasi-zero leakage consumption and almost instant back-up and wake-up times, and will be robust to unstable supply environments. This paper presents the implementation of the non-volatile C-element and Half-Buffer, based on a hybrid technology incorporating 28 nm CMOS FD-SOI and 40 nm STT-MRAM magnetic technologies. We discuss our recent simulation results for the proposed non-volatile blocks as well as more complex structures based on them. We derive criteria for implementation efficiency and compare the conventional asynchronous blocks with the proposed non-volatile ones.
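For readers unfamiliar with the C-element mentioned in this abstract, its standard behavior is that the output follows the inputs when they agree and holds its previous value when they differ. The sketch below models only that conventional two-input behavior in software. It does not model the non-volatile STT-MRAM back-up described in the paper, and the class and method names are illustrative assumptions.

```python
class CElement:
    """Behavioral model of a conventional two-input Muller C-element."""

    def __init__(self, initial: int = 0):
        # The stored output is the element's only state.
        self.out = initial

    def update(self, a: int, b: int) -> int:
        """Apply new input levels and return the resulting output.

        When both inputs agree, the output takes their value;
        otherwise the previous output is held.
        """
        if a == b:
            self.out = a
        return self.out


c = CElement()
trace = [c.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]]
```

Here `trace` is `[0, 1, 1, 0]`: the output rises only after both inputs are high and falls only after both are low. This hold behavior is the state that a non-volatile variant would need to preserve across power loss.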
ISBN: (print) 9781424439331
Self-timed circuits present an attractive solution to the problem of process variation. However, implementing self-timed combinational logic is complex and expensive. This paper presents a novel method for synthesising indicating implementations of arbitrary encoded function blocks. The synthesis method reduces the cost of the implementations by distributing indication between the individual outputs of a function block. Covers are constructed by determining the minimal-cost set of Prime Indicants required to indicate all of the input transitions of the function block. The results of the procedure are demonstrated on a wide range of combinational logic blocks and show a reduction in literal count of between 38% and 99%.