This work addresses a new problem, optimizing the boundary of buffered clock trees, which has not yet been addressed in design automation. Specifically, we show that the clock cells that directly drive flip-flops need not be buffers. By taking the internal structure of flip-flops into account, we gain the freedom to choose either buffers or inverters from the library for the cell implementation. This makes it possible to cancel out two inverters, one in the driving buffer and one in each flip-flop, thereby reducing the power consumed on the clock tree, including the flip-flops. We generalize this idea to co-optimize the driving buffers and flip-flops together to reduce clock power at the boundary of clock trees, and propose an effective four-step algorithm for low-power synthesis of the clock tree boundary. Applied to benchmark circuits, the proposed technique reduces clock power by a further 4.45% to 6.33% on average without timing violations.
Modern standard cells contain intercell margins at the left and right ends for better lithography. We introduce defect probability, which is the probability that a lithography defect occurs if the margins between two adjacent cells are missing. Computing the defect probability of all cell pairs is impractical due to lengthy lithography simulations and the huge number of cell pair combinations. Two approximate methods are employed to make this computation possible: reducing the range of optical proximity correction and grouping cell pairs of similar geometry at the cell boundary. We also present how the cell layout can be modified for a lower defect probability with no impact on the cell's electrical parameters. Defect probability is applied to two physical design optimization problems. In the first, automatic placement, we assume that all cells are initially without margins. Two cells are placed adjacent to each other if their defect probability is zero (or negligibly small), and a margin is inserted between them otherwise; this is achieved by using the average defect probability as one of the cost terms of the placement. Experiments with a 28-nm commercial library demonstrate an 8% reduction in area with a 4% shorter wirelength. In the second application, we assume that standard placement using cells with margins has already been performed. We want to identify redundant margins that can be removed while the defect probability is kept at zero. We take a step further and shuffle the locations of a few consecutive cells in the same row so that more redundant margins are identified. Once all the redundant margins are removed, the newly created whitespace is distributed to reduce routing congestion in highly congested areas. Experiments indicate a 48% reduction in the number of overflow routing grids.
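The margin decision described above can be illustrated with a minimal sketch. It assumes a precomputed lookup table of pairwise defect probabilities (in the abstract these come from lithography simulation) and scans one placement row to find the cell boundaries that still need a margin. The function name, the table layout, and the conservative default for unknown pairs are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

def margins_needed(
    row: List[str],
    defect_prob: Dict[Tuple[str, str], float],
    threshold: float = 0.0,
) -> List[int]:
    """Return indices i such that a margin is needed between row[i] and row[i+1].

    defect_prob maps an ordered (left_cell, right_cell) pair to its defect
    probability when the two cells abut without margins (assumed table).
    """
    needed = []
    for i in range(len(row) - 1):
        # Assumption: pairs missing from the table are treated as unsafe.
        p = defect_prob.get((row[i], row[i + 1]), 1.0)
        if p > threshold:
            needed.append(i)
    return needed


# Hypothetical cell names and probabilities, for illustration only.
table = {("INV", "NAND"): 0.0, ("NAND", "DFF"): 0.2}
boundaries = margins_needed(["INV", "NAND", "DFF"], table)
```

Every boundary not returned by this scan corresponds to a redundant margin in the sense of the second application; shuffling consecutive cells would amount to trying a few permutations of `row` and keeping the one with the fewest required margins.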
ISBN (print): 9781450307116
Clock power is the major contributor to dynamic power in modern IC design. A conventional single-bit flip-flop cell uses an inverter chain with a high drive strength to drive the clock signal. Clustering such cells to form a multi-bit flip-flop lets them share the drive strength, dynamic power, and area of the inverter chain, and can even save clock network power and facilitate skew control. Hence, in this paper, we focus on multi-bit flip-flop clustering at post-placement to gain these benefits. Utilizing the properties of Manhattan distance and coordinate transformation, we model the problem instance by two interval graphs and use a pair of linear-size sequences as our representation. Without enumerating all compatible combinations, we extract only the partial sequences that are necessary to cluster flip-flops at a time, thus leading to an efficient clustering scheme. Moreover, our coordinate transformation enables fast operations when executing our algorithm. Experimental results show the superior efficiency and effectiveness of our algorithm.
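One standard way to exploit the Manhattan-distance property is a 45-degree coordinate rotation: the diamond-shaped region |x - cx| + |y - cy| <= r becomes an axis-aligned square in (u, v) = (x + y, x - y) space, so two regions overlap exactly when their intervals overlap on both axes. The sketch below assumes this transformation and treats each flip-flop's feasible placement region as a single diamond; the abstract does not state the exact transform or data structures, and all names are hypothetical.

```python
from typing import NamedTuple, Tuple

Interval = Tuple[float, float]

class Region(NamedTuple):
    """Diamond-shaped feasible region: |x - cx| + |y - cy| <= r."""
    cx: float
    cy: float
    r: float

def to_intervals(reg: Region) -> Tuple[Interval, Interval]:
    """Map the diamond to its (u, v) intervals, with u = x + y and v = x - y."""
    u = reg.cx + reg.cy
    v = reg.cx - reg.cy
    return (u - reg.r, u + reg.r), (v - reg.r, v + reg.r)

def overlaps(a: Interval, b: Interval) -> bool:
    """Closed intervals overlap if each starts no later than the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]

def compatible(p: Region, q: Region) -> bool:
    """Two diamonds intersect iff both of their rotated intervals overlap."""
    pu, pv = to_intervals(p)
    qu, qv = to_intervals(q)
    return overlaps(pu, qu) and overlaps(pv, qv)


touching = compatible(Region(0, 0, 1), Region(2, 0, 1))
disjoint = compatible(Region(0, 0, 1), Region(1.5, 1.5, 1))
```

In the second call the x-y bounding boxes of the two diamonds overlap even though the diamonds themselves do not, which is why the check is done in the rotated space. Each axis then yields one interval graph, matching the two interval graphs mentioned in the abstract.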
ISBN (print): 9781595937094
As technology continues to shrink, leakage power becomes an important issue for modern FPGAs. In this paper, we address the leakage issue of partially dynamically reconfigurable FPGAs. We focus on eliminating the leakage waste caused by the delay between reconfiguration and task execution. We propose a post-placement leakage-aware scheduling algorithm that refines a placement generated by a performance-driven scheduler such that leakage waste is minimized and performance is not sacrificed. Experimental results on real and synthetic designs demonstrate the effectiveness and efficiency of our algorithm for leakage optimization.
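To make the notion of leakage waste concrete, the following sketch assumes that a configured region leaks at a constant power from the end of its reconfiguration until its task starts, and that a single configuration port handles one reconfiguration at a time. It keeps the execution start times fixed (so performance is unchanged) and pushes each reconfiguration as late as possible. This is an illustrative heuristic under those assumptions, not the algorithm proposed in the paper, and all names are hypothetical.

```python
from typing import Dict, List, NamedTuple

class Task(NamedTuple):
    name: str
    start: float     # execution start time, fixed by the performance-driven schedule
    reconfig: float  # reconfiguration duration
    leak: float      # leakage power of the region while configured but idle (assumed constant)

def alap_reconfig_ends(tasks: List[Task]) -> Dict[str, float]:
    """Choose reconfiguration end times as late as possible with one config port."""
    ends: Dict[str, float] = {}
    limit = float("inf")  # latest time at which the port is still free
    for t in sorted(tasks, key=lambda t: t.start, reverse=True):
        end = min(t.start, limit)   # must finish before execution and before later reconfigs
        ends[t.name] = end
        limit = end - t.reconfig    # the port is busy from here until `end`
    return ends

def leakage_waste(tasks: List[Task], ends: Dict[str, float]) -> float:
    """Sum of leakage power times idle time between reconfiguration and execution."""
    return sum(t.leak * (t.start - ends[t.name]) for t in tasks)


tasks = [Task("A", start=10, reconfig=4, leak=2), Task("B", start=12, reconfig=3, leak=1)]
ends = alap_reconfig_ends(tasks)
waste = leakage_waste(tasks, ends)
```

The sketch does not check that reconfigurations fit after time zero or that regions are free when reconfigured; a real refinement step would need both constraints from the placement.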