ISBN: 9781424437696 (print)
As design complexity increases dramatically, the results of functional simulation are usually checked on only a subset of signals during design verification. It is therefore important to consider the observability of internal signals for effective checking. This paper proposes a static observability analysis method that automatically selects internal observation signals, improving the quality of functional verification. A series of formulas is defined to evaluate the observability of internal signals, and an algorithm is proposed to locate the sources of low observability. Such sources, rather than general hard-to-observe signals, are the desirable internal observation signals. Experimental results indicate that signals selected by this method improve the observability of designs more than signals randomly selected from among the hard-to-observe ones.
Statistical static timing analysis (SSTA) that accounts for process variation and aging effects is commonly used to analyze circuit lifetime reliability at the design phase. A key challenge for statistical lifetime reliability analysis is that it needs an accurate statistical timing model, one that captures both the practical variation distribution and delay correlation. In this work, P2CLRAF, a circuit lifetime reliability analysis framework, is proposed. It calibrates pre-silicon SSTA results by learning from data collected during path delay testing at the post-silicon timing validation phase. A neural network inside P2CLRAF is trained to learn the variation distribution and delay correlation from the statistics of path delay testing. The learned information is then fed back to SSTA to further improve the accuracy of circuit lifetime reliability analysis. Experimental results demonstrate the effectiveness of the proposed analysis framework.
An 8-bit bit-slice TEA cryptographic accelerator for 64-bit RSFQ secure coprocessors is proposed. The accelerator is based on the Tiny Encryption Algorithm (TEA) and mainly consists of bit-slice adders and bit-slice shifters. Synchronous concurrent-flow clocking is used to obtain a fully pipelined RSFQ logic design. To verify the algorithm and the logic design, the RSFQ logic circuits of the proposed accelerator have been simulated with a target operating frequency of 50 GHz. The accelerator consists of 21 stages. The throughput is 7.672 × 10^7 64-bit TEA encryptions per second.
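For readers unfamiliar with TEA, the sketch below is a plain software reference of the standard algorithm (64-bit block as two 32-bit halves, 128-bit key as four 32-bit words, 32 cycles). It illustrates why adders and shifters dominate such a design: each step uses only 32-bit additions, shifts and XORs. It is not the RSFQ bit-slice design from the abstract; the function names and the sample key are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF   # keep arithmetic within 32 bits, as in the C reference
DELTA = 0x9E3779B9    # TEA key-schedule constant


def tea_encrypt(v0, v1, key, cycles=32):
    """Encrypt one 64-bit block given as two 32-bit halves (v0, v1)."""
    k0, k1, k2, k3 = key
    s = 0
    for _ in range(cycles):
        s = (s + DELTA) & MASK32
        # Only additions, shifts and XORs are needed in each half-round.
        v0 = (v0 + (((v1 << 4) + k0) ^ (v1 + s) ^ ((v1 >> 5) + k1))) & MASK32
        v1 = (v1 + (((v0 << 4) + k2) ^ (v0 + s) ^ ((v0 >> 5) + k3))) & MASK32
    return v0, v1


def tea_decrypt(v0, v1, key, cycles=32):
    """Invert tea_encrypt by running the half-rounds in reverse order."""
    k0, k1, k2, k3 = key
    s = (DELTA * cycles) & MASK32
    for _ in range(cycles):
        v1 = (v1 - (((v0 << 4) + k2) ^ (v0 + s) ^ ((v0 >> 5) + k3))) & MASK32
        v0 = (v0 - (((v1 << 4) + k0) ^ (v1 + s) ^ ((v1 >> 5) + k1))) & MASK32
        s = (s - DELTA) & MASK32
    return v0, v1


# Hypothetical key and plaintext, chosen only for illustration.
SAMPLE_KEY = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)
SAMPLE_BLOCK = (0xDEADBEEF, 0x0BADF00D)
```

Calling `tea_decrypt(*tea_encrypt(*SAMPLE_BLOCK, SAMPLE_KEY), SAMPLE_KEY)` should return `SAMPLE_BLOCK` again, which is a quick sanity check of the two routines against each other.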
ISBN: 9781612843568 (print)
Dynamic Binary Translation (DBT) has been widely used in various applications. Although new architectures and micro-architectures often create performance opportunities for programmers and compilers, such performance opportunities may not be exploited by legacy executables. For example, the additional general-purpose and XMM registers in the Intel64 architecture do not benefit the IA-32 binaries. In this paper, we designed and developed a DBT system to dynamically promote stack variables in the source binaries to the additional registers of the target architecture. One of the most challenging problems is how to deal with the possible but rare memory aliases between promoted stack variables and other implicit memory references. We devised a runtime alias detection approach based on the page protection mechanism in Linux and a novel stack switching method to catch memory aliases at run-time. This approach is much less expensive than traditional approaches like inserting address checking instructions. On an Intel64 platform, our DBT system with speculative stack variable promotion has sped up several SPEC CPU2006 benchmarks in IA-32 code, with the largest performance gain over 45%.
An 8-bit bit-parallel RSFQ microprocessor, named HUTU, is proposed. It can execute 28 different instructions. Each instruction consists of eight bits. Harvard-type architecture is adopted for parallel processing between the control unit and the datapath. The control unit uses an asynchronous timing method to avoid pipeline flushing and to reduce the area. Concurrent-flow clocking is adopted in the datapath for high performance. The simulation result shows that the elements of HUTU run correctly.
Current QoS-aware automatic service composition queries over a network of web services are often one-time in nature. After a network of web services is built, such queries are issued once, and answers are found from t...
Strategies for partitioning an application's data and computation play a fundamental role in determining the efficiency of parallelization. This paper describes a sophisticated strategy for partitioning data and computation known as multi-partitioning, which can support the best parallelization for some applications, such as line-sweep computations. However, implementing multi-partitioning is very difficult and, to our knowledge, no automatic parallelizing compiler supports this partitioning strategy. Although the dHPF compiler implemented multi-partitioning as a special extension of block-style HPF partitioning, it still needs the programmer's participation to analyze the application and decide on the data distribution scheme. In this paper, we present a global tiling transformation algorithm and a tile-to-processor mapping strategy, called hyper-diagonal modular mapping, to implement the multi-partitioning strategy. Experiments with NPB2.3-serial SP show that the code generated by the compiler achieves scalable performance.
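The abstract does not give the hyper-diagonal modular mapping itself, so the sketch below uses the simplest textbook diagonal assignment as a stand-in: in a grid of p tiles per dimension, tile (i, j, ...) goes to processor (i + j + ...) mod p. It shows the property that makes multi-partitioning suit line sweeps, namely that every slab of tiles perpendicular to any axis is spread evenly over all processors. The function names and the p-tiles-per-dimension grid shape are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def diagonal_owner(tile, num_procs):
    """Assumed mapping: owner is the sum of the tile coordinates modulo p."""
    return sum(tile) % num_procs


def slab_loads(num_procs, dims, axis, index):
    """Count tiles per processor in the slab where tile[axis] == index."""
    loads = Counter()
    for tile in product(range(num_procs), repeat=dims):
        if tile[axis] == index:
            loads[diagonal_owner(tile, num_procs)] += 1
    return loads


def is_balanced(num_procs, dims):
    """Check that every slab along every axis gives each processor equal work.

    Requires dims >= 2; each slab holds p**(dims - 1) tiles, so a balanced
    mapping gives each of the p processors p**(dims - 2) tiles per slab.
    """
    expected = num_procs ** (dims - 2)
    for axis in range(dims):
        for index in range(num_procs):
            loads = slab_loads(num_procs, dims, axis, index)
            if any(loads[proc] != expected for proc in range(num_procs)):
                return False
    return True
```

With this mapping, a sweep along any axis keeps all processors busy at every step, because each step touches one slab and each slab contains the same number of tiles for every processor.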
The logic design of a 16-bit bit-slice shifter for 64-bit superconducting rapid single-flux-quantum (RSFQ) microprocessors is proposed. The shifter supports three types of shift operations: logical shift, arithmetic shift and rotate. Each 64-bit shift input operand is divided into four 16-bit slices. To simulate the digital function and timing of the proposed 16-bit bit-slice shifter, we design a logic-level simulation model based on the Open Dataset of CONNECT Cell Library for AIST ADP2. From the simulation results, circuit information such as the number of Josephson junctions, the area and the latency of the 16-bit bit-slice shifter can be obtained. The simulation results show that the proposed 16-bit bit-slice shifter works correctly.
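The slicing idea can be illustrated with a small behavioral model. The sketch below splits a 64-bit operand into four 16-bit slices and builds each output slice from two neighbouring input slices. It is a software illustration only, not the RSFQ logic from the abstract: it covers right shifts only, and the mode names, slice ordering (least significant first) and function names are assumptions.

```python
SLICE_BITS = 16
NUM_SLICES = 4
WORD_BITS = SLICE_BITS * NUM_SLICES
SLICE_MASK = (1 << SLICE_BITS) - 1
WORD_MASK = (1 << WORD_BITS) - 1


def split_slices(word):
    """Split a 64-bit word into four 16-bit slices, least significant first."""
    return [(word >> (SLICE_BITS * i)) & SLICE_MASK for i in range(NUM_SLICES)]


def join_slices(slices):
    """Reassemble a 64-bit word from its 16-bit slices."""
    return sum(s << (SLICE_BITS * i) for i, s in enumerate(slices))


def shift_right_sliced(word, amount, mode):
    """Right shift by 0..63 bits; mode is 'logical', 'arithmetic' or 'rotate'."""
    if not 0 <= amount < WORD_BITS:
        raise ValueError("amount must be in 0..63")
    if mode not in ("logical", "arithmetic", "rotate"):
        raise ValueError("unknown mode")
    slices = split_slices(word & WORD_MASK)
    sign = (word >> (WORD_BITS - 1)) & 1
    fill = SLICE_MASK if (mode == "arithmetic" and sign) else 0

    def source(idx):
        # Slices past the top either wrap around (rotate) or are fill bits.
        if idx < NUM_SLICES:
            return slices[idx]
        return slices[idx % NUM_SLICES] if mode == "rotate" else fill

    slice_shift, bit_shift = divmod(amount, SLICE_BITS)
    out = []
    for i in range(NUM_SLICES):
        low = source(i + slice_shift)
        high = source(i + slice_shift + 1)
        # Each output slice is a 16-bit window over two adjacent input slices.
        out.append((((high << SLICE_BITS) | low) >> bit_shift) & SLICE_MASK)
    return join_slices(out)
```

Splitting the shift amount into a whole-slice part and a within-slice part is the design choice that lets one 16-bit datapath handle a 64-bit operand one slice at a time.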
Recently, deep neural networks (DNNs) have been widely applied in mobile intelligent applications. The inference for the DNNs is usually performed in the cloud. However, it leads to a large overhead of transmitting da...
Rapid single-flux-quantum (RSFQ) logic is expected to be the next-generation integrated circuit technology because of its ultra-high speed and ultra-low power consumption. We propose datapath circuits for an 8-bit bit-parallel RSFQ microprocessor. The proposed datapath circuits process 8-bit data each clock cycle. Seven instructions are executed in the datapath: ADD, ADDI, IN, OUT, LOADI, SRL and MOV. The datapath circuits consist of eight input ports, eight output ports, five multiplexers (MUXs), two 8-bit data registers and one 8-bit bit-parallel arithmetic logic unit (ALU). Based on the Open Dataset of CONNECT Cell Library for AIST ADP2, the datapath circuits contain 12 pipeline stages and 2993 Josephson junctions (JJs), without counting wiring cells. We perform digital simulation of the proposed datapath circuits. The simulation results show correct operation at the assumed frequency of 20 GHz.