Recent years have witnessed an unprecedented surge of interest, from social networks to drug discovery, in learning representations of graph-structured data. However, graph neural networks, the machine learning models...
ISBN (digital): 9781728161495
ISBN (print): 9781728161501
In current convolutional neural network (CNN) accelerators, communication (i.e., memory access) dominates the energy consumption. This work provides comprehensive analysis and methodologies to minimize the communication for CNN accelerators. For the off-chip communication, we derive the theoretical lower bound for any convolutional layer and propose a dataflow that reaches the lower bound. This fundamental problem has never been solved by prior studies. The on-chip communication is minimized based on an elaborate workload and storage mapping scheme. We additionally design a communication-optimal CNN accelerator architecture. Evaluations in 65 nm technology demonstrate that the proposed architecture nearly reaches the theoretical minimum communication in a three-level memory hierarchy and that it is computation dominant. The gap between the energy efficiency of our accelerator and the theoretical best value is only 37-87%.
ISBN (digital): 9781728133201
ISBN (print): 9781728133218
This work proposes a high-speed pipelined two-step time-to-digital converter (TDC) with dynamic time amplification (DTA) to improve the resolution at low power. The key element of this TDC is the DTA. It samples the residual time errors as voltages held on MOM capacitors and discharges them to generate the amplified time difference. Thanks to the dynamic time-voltage-time conversion, the DTA realizes high linearity and power efficiency, and its sample-and-hold operation allows it to be used in a pipelined TDC architecture with a high sampling frequency. Moreover, the DTA maintains constant gain, so only a one-time foreground calibration for gain mismatch is required in this TDC. Simulations show that the TDC designed in 65 nm CMOS achieves 8-bit, 1.75 ps time resolution, with 1 LSB INL and 1.6 LSB DNL after one-time foreground calibration at a 400 MHz sampling frequency, while consuming just 726 μW, which corresponds to a FoM of 18.45 fJ/conv.
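The abstract does not state how its figure of merit is defined. As a rough plausibility check only, the sketch below assumes a commonly used Walden-style form, FoM = P / (2^ENOB · fs), with the effective number of bits derated by nonlinearity as ENOB = N − log2(1 + nonlinearity in LSB). Both the formula and the choice of the 1.6 LSB DNL as the derating term are assumptions, and the function name `tdc_fom` is hypothetical. Under these assumptions the reported power, resolution, and sampling rate give a value near the quoted 18.45 fJ/conv.

```python
import math


def tdc_fom(power_w: float, n_bits: int, fs_hz: float, nonlinearity_lsb: float) -> float:
    """Walden-style energy per conversion step in joules (assumed definition).

    ENOB is derated from the nominal resolution by the nonlinearity in LSB:
    ENOB = N - log2(1 + nonlinearity_lsb). This is an assumption, not the
    definition confirmed by the abstract.
    """
    enob = n_bits - math.log2(1.0 + nonlinearity_lsb)
    return power_w / (2.0 ** enob * fs_hz)


# Values quoted in the abstract: 726 uW, 8 bits, 400 MHz, 1.6 LSB DNL.
fom = tdc_fom(power_w=726e-6, n_bits=8, fs_hz=400e6, nonlinearity_lsb=1.6)
print(f"FoM ~ {fom * 1e15:.2f} fJ/conv")
```

Using the 1 LSB INL instead of the DNL in the same formula gives a noticeably lower value, which is why the DNL is used for the derating term here.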
Inspired by the natural bacterial flagellum, we demonstrate a magnetic polymer multilayer conical microrobot that achieves controllable propulsion under an external rotating magnetic field of uniform intensity. The magne...
ISBN (digital): 9781728174679
ISBN (print): 9781728174686
Bias Temperature Instability (BTI) is one of the dominant CMOS aging mechanisms. It causes time-dependent variation, threatening circuit lifetime reliability. BTI-induced circuit errors are not detectable at the fabrication stage. On-line monitoring schemes are therefore necessary to capture the degradation during operation. Traditional aging monitoring techniques exhibit high implementation complexity and low stability. In this paper, we propose a BTI monitoring approach that simply tracks the start-up behavior of SRAM cells. SRAM is a widely used on-chip device in many applications. We study the impact of BTI on SRAM start-up values and age some cells in a manipulated manner. The BTI degradation is evaluated based on the number of SRAM cells starting with a certain value. This technique can be used to estimate the degradation of on-chip logic circuits without introducing additional circuitry, and thus has very low implementation complexity. We use an SRAM array with 1024 cells to estimate the degradation of multiple logic circuits, and report an average mean absolute percentage error of 8.48%. In addition, this technique is robust to process, voltage, and temperature variations.
Energy system models underpin decisions by energy system planners and operators. Energy system modelling faces a transformation: accounting for changing meteorological conditions imposed by climate change. To enable t...
To boost energy saving in general delay-tolerant IoT networks, a two-stage, single-relay queueing communication scheme is investigated. Concretely, a traffic-aware $N$-threshold and gated-service policy are a...
Deep neural networks for medical image reconstruction are traditionally trained using high-quality ground-truth images as training targets. Recent work on Noise2Noise (N2N) has shown the potential of using multiple no...
Inspired by the geometric method proposed by Jean-Pierre MAREC, we first consider the Hohmann transfer problem between two coplanar circular orbits as a static nonlinear programming problem with an inequality constraint. By the Kuhn-Tucker theorem and a second-order sufficient condition for minima, we analytically prove the global minimum of the Hohmann transfer. Two sets of feasible solutions are found: one corresponding to the Hohmann transfer is the global minimum and the other is a local minimum. We next formulate the Hohmann transfer problem as boundary value problems, which are solved by the calculus of variations. The two sets of feasible solutions are also found by numerical examples. Via static and dynamic constrained optimizations, the solution to the Hohmann transfer problem is re-discovered, and its global minimum is analytically verified using nonlinear programming.
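For orientation, the quantity being minimized in the Hohmann transfer problem can be evaluated with the standard two-impulse formulas derived from the vis-viva equation. The sketch below is not the paper's nonlinear programming or variational formulation; it only computes the classical Hohmann cost. The function name `hohmann_delta_v`, the Earth gravitational parameter used as the default, and the example radii are illustrative assumptions.

```python
import math

# Earth's gravitational parameter in km^3/s^2 (assumed central body).
MU_EARTH = 398600.4418


def hohmann_delta_v(r1: float, r2: float, mu: float = MU_EARTH):
    """Return (dv1, dv2, total) in km/s for a Hohmann transfer between
    two coplanar circular orbits of radii r1 and r2 (km)."""
    a_transfer = 0.5 * (r1 + r2)          # semi-major axis of the transfer ellipse
    v_circ_1 = math.sqrt(mu / r1)         # circular speed on the initial orbit
    v_circ_2 = math.sqrt(mu / r2)         # circular speed on the final orbit
    # Vis-viva speeds on the transfer ellipse at radii r1 and r2.
    v_trans_1 = math.sqrt(mu * (2.0 / r1 - 1.0 / a_transfer))
    v_trans_2 = math.sqrt(mu * (2.0 / r2 - 1.0 / a_transfer))
    dv1 = abs(v_trans_1 - v_circ_1)       # first tangential impulse
    dv2 = abs(v_circ_2 - v_trans_2)       # second tangential impulse
    return dv1, dv2, dv1 + dv2


# Hypothetical example: a low Earth orbit to geostationary radius.
dv1, dv2, total = hohmann_delta_v(6678.0, 42164.0)
print(f"dv1 = {dv1:.3f} km/s, dv2 = {dv2:.3f} km/s, total = {total:.3f} km/s")
```

The total returned here is the value that the abstract's analysis identifies as the global minimum among two-impulse transfers between the two circular orbits.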