This article describes a surface micromachined cantilever beam-based resonator for biological sensing applications. The study used a novel microfabrication technique of merged epitaxial lateral overgrowth (MELO) and c...
Previous papers have shown that the slow scaling of wire delays compared to logic delays will prevent superscalar performance from scaling with technology. In this paper we show that, when wire delays are considered, the optimal pipeline for superscalar becomes shallower with technology, tightening previous results that deeper pipelines perform only as well as shallower pipelines. The key reason for the lack of performance scaling is that superscalar does not have sufficient parallelism to hide the relatively increased wire delays. However, Simultaneous Multithreading (SMT) provides the much-needed parallelism. We show that an SMT running a multiprogrammed workload with just 4-way issue not only retains the optimal pipeline depth over technology generations, enabling at least a 43% increase in clock speed every generation, but also achieves the remainder of the expected speedup of two per generation through IPC. As wire delays become more dominant in future technologies, the number of programs needs to be scaled modestly to maintain the scaling trends, at least until the near-future 50 nm technology. While this result ignores bandwidth constraints, using SMT to tolerate latency due to wire delays is not that simple because SMT causes bandwidth problems. Most of the stages of a modern out-of-order-issue pipeline employ RAM and CAM structures. Wire delays in conventional, latency-optimized RAM/CAM structures prevent them from being pipelined in a scaled manner. We show that this limitation prevents scaling of SMT throughput. We use bitline scaling to allow RAM/CAM bandwidth to scale with technology. Bitline scaling enables SMT throughput to scale at the rate of two per technology generation in the near future.
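The split between clock and IPC gains in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, assuming that per-generation speedup is the product of the clock-frequency gain and the IPC gain; the function name `required_ipc_gain` is illustrative and does not come from the paper.

```python
def required_ipc_gain(total_speedup: float, clock_gain: float) -> float:
    """IPC gain needed per generation, assuming speedup = clock_gain * ipc_gain."""
    if clock_gain <= 0:
        raise ValueError("clock_gain must be positive")
    return total_speedup / clock_gain


# Figures quoted in the abstract: a speedup of two per generation,
# with at least a 43% increase in clock speed.
ipc_gain = required_ipc_gain(total_speedup=2.0, clock_gain=1.43)
print(f"IPC must improve by about {100 * (ipc_gain - 1):.0f}% per generation")
```

Under that multiplicative assumption, a larger clock gain leaves less of the factor of two to be made up through IPC.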
In this paper we propose a probabilistic approach to retrieve video clips similar to a given query video clip. In our approach the video clips are partitioned into video segments based on their content homogeneity, and video segments in the database are connected to construct candidate clips, which are compared with the query clip for similarity (or distance) during the query process. An efficient scheme is developed to estimate the probability density functions of the distances between the candidate clips and the query clip, and based on these density functions, two methods are devised to reduce the number of candidate clips for comparison and so speed up the retrieval process. Experimental results show that our proposed approach can notably speed up the retrieval of similar video clips while maintaining high retrieval accuracy.
Inductive noise in high-performance microprocessors is a reliability issue caused by variations in processor current (di/dt) which are converted to supply-voltage glitches by impedances in the power-supply network. Inductive noise has been addressed by using decoupling capacitors to maintain low impedance in the power supply over a wide range of frequencies. However, even well-designed power supplies exhibit (a few) peaks of high impedance at resonant frequencies caused by RLC resonant loops. Previous architectural proposals adjust current variations by controlling instruction fetch and issue, trading off performance and energy for noise reduction. However, the proposals do not consider some conceptual issues and have implementation challenges. The issues include requiring fast response, responding to variations that do not threaten the noise margins, or responding to variations only at the resonant frequency while the range of high impedance extends to a resonance band around the resonant frequency. While previous schemes reduce the magnitude of variations, our proposal, called resonance tuning, changes the frequency of current variations away from the resonance band to a non-resonant frequency to be absorbed by the power supply. Because inductive noise is a resonance problem, resonance tuning reacts only to repeated variations in the resonance band, and not to isolated variations. Reacting after a few repetitions allows more time for the response and reduces unnecessary responses, decreasing performance and energy loss.
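To make the notion of a resonance band concrete, the sketch below computes the resonant frequency of an ideal LC loop and tests whether a given current-variation frequency falls inside a band around it. It is an illustration only: the component values, the band half-width, and the function names are hypothetical, and the code is not the paper's resonance tuning mechanism.

```python
import math


def resonant_frequency_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal LC loop: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def in_resonance_band(freq_hz: float, center_hz: float, half_width_hz: float) -> bool:
    """True if freq_hz lies within half_width_hz of the resonant frequency."""
    return abs(freq_hz - center_hz) <= half_width_hz


# Hypothetical power-supply loop: 10 pH of inductance, 250 nF of decoupling.
f_res = resonant_frequency_hz(10e-12, 250e-9)
half_width = 0.1 * f_res  # assumed band of +/-10% around resonance

for f in (0.5 * f_res, 0.95 * f_res, 2.0 * f_res):
    status = "inside" if in_resonance_band(f, f_res, half_width) else "outside"
    print(f"{f / 1e6:7.1f} MHz is {status} the resonance band")
```

Only variations that repeat at a frequency inside the band would be of concern under this picture; isolated variations or repetitions well outside the band would not.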
What makes a network architecture efficient? And how do we measure efficiency? In this paper, we study architectural issues in the context of the sensor reachback problem, from an information theoretic perspective. Specifically, we find that in an information-theoretically optimal reachback network, all of the following statements hold: (1) there exists a solution to the problem of transporting the sources over the channels if and only if a suitably defined multicommodity flow is feasible; (2) if a solution exists, then a solution exists based on separate source and channel coding only; (3) when multiple solutions exist, under a natural linear cost model defined in terms of Shannon information, an optimal solution is given by a minimum-cost multicommodity flow. Based on these results we can make a number of statements about what constitutes an optimal system architecture for an important class of communication networks, where optimality is defined in a pure information theoretic sense, but has a very clear and intuitive network flow interpretation.
The paper presents a method for improving the phase noise performance of a CMOS quadrature LC oscillator through parasitic compensation. Owing to the parasitic resistance in the inductor, the LC oscillator suffers from a low Q-value, which degrades its phase noise performance. In this design, through the parasitic-compensation method, the LC oscillator is made to oscillate at the frequency at which the effective impedance of the parallel LC resonator is at its peak. This increases the Q-value of the LC resonator, which improves the phase noise performance of the circuit. A 2.63 GHz quadrature CMOS LC oscillator with a phase noise of -112.3 dBc/Hz at 600 kHz offset is demonstrated, consuming 7.5 mW of power using an on-chip spiral inductor model.
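The idea of a peak in the effective impedance of a parallel LC resonator can be illustrated numerically. The sketch below models the tank as an inductor with series parasitic resistance in parallel with a capacitor and scans for the frequency of maximum impedance magnitude. The component values are hypothetical, chosen only so that the ideal resonance lands near 2.63 GHz; they are not taken from the paper, and the code does not implement the paper's compensation method.

```python
import math


def tank_impedance(freq_hz: float, l_h: float, c_f: float, r_ohm: float) -> complex:
    """Impedance of (R + jwL) in parallel with 1/(jwC)."""
    w = 2.0 * math.pi * freq_hz
    z_l = complex(r_ohm, w * l_h)      # lossy inductor branch
    z_c = 1.0 / complex(0.0, w * c_f)  # ideal capacitor branch
    return z_l * z_c / (z_l + z_c)


def peak_impedance_frequency(l_h: float, c_f: float, r_ohm: float, points: int = 4001) -> float:
    """Scan +/-20% around the ideal resonance and return the frequency of maximum |Z|."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))
    freqs = [f0 * (0.8 + 0.4 * k / (points - 1)) for k in range(points)]
    return max(freqs, key=lambda f: abs(tank_impedance(f, l_h, c_f, r_ohm)))


# Hypothetical values: 2 nH spiral inductor, 1.83 pF, 5 ohm series resistance.
L_H, C_F, R_OHM = 2e-9, 1.83e-12, 5.0
f_peak = peak_impedance_frequency(L_H, C_F, R_OHM)
q_inductor = 2.0 * math.pi * f_peak * L_H / R_OHM  # Q = wL/R for a series-loss inductor
print(f"peak |Z| near {f_peak / 1e9:.2f} GHz, inductor Q about {q_inductor:.1f}")
```

With a lossy inductor, the frequency of maximum impedance magnitude need not coincide exactly with the ideal value 1/(2π√(LC)), which is why the sketch scans for it instead of assuming it.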
ISBN: (print) 0780385152
This paper presents a novel optimization strategy based on the social algorithm and collective behaviors. The proposed algorithm incorporates information about the individuals within the society, introduced as their talent, and the collective behavior of the society in the civilization, called the liberty rate. The algorithm has been demonstrated on two benchmark problems and has shown promising results for further investigation.
ISBN: (print) 0769520936
In this paper, we propose a simultaneous scheduling and allocation algorithm for voltage-partitioned multiple-Vdd design. By considering voltage partition during scheduling and allocation, we may place the resources of the same voltage in one partition, thereby reducing additional power meshes. Also, the partitioned design reduces the energy dissipation of level converters by reducing the cut size between different-voltage partitions. The proposed algorithm starts from a random solution. Then, it performs scheduling and allocation simultaneously while trying to satisfy both resource and time constraints. By gradually changing the schedule and allocation, the algorithm effectively explores the solution space to achieve low power and better partitioning in terms of the supply voltages. Relative to the minimum single-voltage design, an energy saving of 36% was achieved. Also, improvements for interconnect, level-conversion energy, and voltage clusters were observed.
Accurate reference point detection is one of the first and most important signal processing steps in automatic fingerprint identification systems. The fingerprint reference point, which is also known as the core point except in the case of arch type fingerprints, is defined as the location where the concave ridge curvature attains a maximum. In this paper we introduce a multi-resolution reference point detection algorithm that calculates the Poincaré index in the modulation domain using an AM-FM model of the fingerprint image. We present experimental results where this new algorithm is tested against the FVC 2000 Database 2 and a second database from the University of Bologna. In both cases, we find that the modulation domain algorithm delivers accuracy and consistency that exceed those of a recent competing technique based on integration of sine components in two adjacent regions.
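For readers unfamiliar with the Poincaré index, the sketch below shows the classical pixel-domain computation on a ridge orientation field: orientation changes are summed around a closed ring of neighbours. This is not the modulation-domain AM-FM algorithm of the abstract above; the function names and the synthetic 3x3 test field are hypothetical, and the index is normalized by 2π so that a core-like pattern yields +1/2 and a delta-like pattern -1/2.

```python
import math


def wrap_orientation_diff(d: float) -> float:
    """Wrap an orientation difference into (-pi/2, pi/2]; orientations are modulo pi."""
    while d > math.pi / 2:
        d -= math.pi
    while d <= -math.pi / 2:
        d += math.pi
    return d


def poincare_index(orientation, row: int, col: int) -> float:
    """Sum wrapped orientation changes around the 8-neighbour ring, in units of 2*pi."""
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    angles = [orientation[row + dr][col + dc] for dr, dc in ring]
    total = 0.0
    for k in range(len(angles)):
        total += wrap_orientation_diff(angles[(k + 1) % len(angles)] - angles[k])
    return total / (2.0 * math.pi)


def synthetic_field(sign: int):
    """Hypothetical 3x3 orientation field: sign=+1 is core-like, sign=-1 is delta-like."""
    return [[(sign * math.atan2(r - 1, c - 1) / 2.0) % math.pi for c in range(3)]
            for r in range(3)]


print(poincare_index(synthetic_field(+1), 1, 1))  # close to +0.5
print(poincare_index(synthetic_field(-1), 1, 1))  # close to -0.5
```

A field of constant orientation gives an index of zero, so nonzero values single out candidate singular points such as the core.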
Multiple Input Translinear Element (MITE) networks can be used to implement a wide variety of static and dynamic functions. The basic product-of-power law MITE circuit is characterized by an input connectivity matrix and an output connectivity matrix. Given a desired product-of-power law function, we describe a synthesis procedure that generates the connectivity matrices. This multiple-input-multiple-output synthesis has distinct advantages over the existing multiple-input-single-output synthesis: it requires fewer MITEs (or at most the same number) and is suitable for automated synthesis. MITE networks in which not all the MITEs have the same number of gates are transformed into MITE networks in which the number of gates is the same by a procedure called completion. We describe a novel completion procedure for these product-of-power law circuits.