Spike detection plays a central role in neural data processing and brain-machine interfaces (BMIs). A challenge for future-generation implantable BMIs is to build a spike detector that features both low hardware cost ...
Neural processing units (NPUs) are becoming an integral part of all modern computing systems due to their substantial role in accelerating neural networks (NNs). The significant improvements in cost-energy-performance stem from the massive array of multiply-accumulate (MAC) units that remarkably boosts the throughput of NN inference. In this work, we are the first to investigate the thermal challenges that NPUs bring, revealing how MAC arrays, which form the heart of any NPU, impose serious thermal bottlenecks on on-chip systems due to their excessive power densities. For the first time, we explore: 1) the effectiveness of precision scaling and frequency scaling (FS) in reducing temperature and 2) how advanced on-chip cooling based on superlattice thin-film thermoelectrics (TE) opens the door to new tradeoffs between temperature, throughput, cooling cost, and inference accuracy in NPU chips. Our work reveals that hybrid thermal management, which combines different means of reducing the NPU temperature, is key. To achieve this, we propose and implement the PFS-TE technique, which couples precision scaling and FS with superlattice TE cooling for effective NPU thermal management. Using commercial signoff tools, we obtain accurate power and timing analyses of MAC arrays after a full-chip design is performed based on 14-nm Intel FinFET technology. Then, multiphysics simulations using finite-element methods are carried out for accurate heat simulations in the presence and absence of on-chip cooling. Afterward, a comprehensive design-space exploration is presented to demonstrate the Pareto frontier and the existing tradeoffs between temperature reductions, power overheads due to cooling, throughput, and inference accuracy. Using a wide range of NNs trained for image classification, experimental results demonstrate that our novel NPU thermal management increases the inference efficiency (TOPS/Joule) by $1.33\times$, $1.87\times$, and $2\times$ under different temperature constrain
ISBN: 9781728197104 (digital)
ISBN: 9781728197111 (print)
This paper presents MEPNTC, an energy-efficient standard-cell library design scheme targeting ultra-low-voltage near/sub-Vth operation. MEPNTC exploits an alternative logic style and the inverse narrow width effect (INWE) to extend minimum-energy-point operation. A carefully engineered design style is presented to improve the PVT and glitch immunity of the cells while preserving balanced noise margins across a wider VDD range. The reduced parasitics and performance boost from both techniques yield energy savings of 30%-60% at 0.5 V, a typical near-Vth level, for general-purpose hardware accelerator benchmarks (32-bit Booth multiplier, 25-tap FIR filter, forward discrete cosine transform, and JPEG image compression units) compared to standard CMOS and INWE-aware CMOS designs in 65-nm bulk CMOS technology.
ISBN: 9781728133201 (digital)
ISBN: 9781728133218 (print)
This paper presents two novel circuit arrangements for an ultra-low-voltage, low-power 4-to-2 compressor targeting the typical near-Vth application domain. A hybrid logic style is used to improve energy efficiency by reducing parasitics in the circuit blocks. The proposed structures are evaluated against prevalent compressors in terms of their typical figures of merit and noise immunity. In extensive post-layout simulations in a 65-nm bulk CMOS process technology, the best arrangement was found to be 35% more power efficient, 3.4% faster, 8% more area efficient, and 37% better in power-delay product (PDP) at a VDD of 0.4 V compared to the most appealing implementations in the literature.
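For readers unfamiliar with the block, the following is a minimal behavioral sketch of a generic 4-to-2 compressor, assuming the textbook construction from two cascaded full adders. It is not the hybrid-logic circuit proposed in the paper, and the function names are hypothetical. The last helper checks that the reported power and delay figures are consistent with the reported PDP gain, under the assumption that "3.4% faster" means a 3.4% lower delay.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) of three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4to2(x1, x2, x3, x4, cin):
    """Textbook 4-to-2 compressor: two cascaded full adders.

    Returns (sum, carry, cout). cout depends only on x1..x3, not on cin,
    which is what keeps carries from rippling across a compressor row.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def verify_compressor():
    """Exhaustively check x1+x2+x3+x4+cin == sum + 2*(carry + cout)."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4to2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True


def pdp_improvement(power_saving, delay_saving):
    """Relative PDP gain, assuming PDP = power * delay (fractions in 0..1)."""
    return 1.0 - (1.0 - power_saving) * (1.0 - delay_saving)


if __name__ == "__main__":
    print(verify_compressor())
    print(round(pdp_improvement(0.35, 0.034), 3))
```

Under that reading of the figures, a 35% power saving combined with a 3.4% delay saving gives a PDP gain of roughly 37%, which matches the number quoted in the abstract.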
ISBN: 9781728195353 (digital)
ISBN: 9781728195360 (print)
Vulnerabilities in privileged software layers have been exploited with severe consequences. Recently, technologies based on Trusted Execution Environments (TEEs) have emerged as a promising approach, since they claim strong confidentiality and integrity guarantees regardless of the trustworthiness of the underlying system software. In this paper, we consider one of the most prominent TEE technologies, Intel Software Guard Extensions (SGX). Despite many formal approaches, formal proofs are still lacking for some critical processes of Intel SGX, such as remote attestation. To fill this gap, we propose a fully automated, rigorous, and sound formal approach to specify and verify the Enhanced Privacy ID (EPID)-based remote attestation in Intel SGX, under the assumption that there are no side-channel attacks and no vulnerabilities inside the enclave. The evaluation indicates that the confidentiality of attestation keys is preserved against a Dolev-Yao adversary in this technology. We also present a few of the many inconsistencies in the existing literature on Intel SGX attestation that we found during formal specification.
One challenge imposed by the ubiquitous computing of embedded systems is the need for power- and energy-efficient implementations, particularly because many of these systems are battery operated. In this sense, tailored application-specific processors can meet the resource requirements of a specific application in the most efficient way. In this paper, we present TailoredCore, a design methodology to generate application-specific processors based on a core architecture implementation. This methodology analyzes the application to be executed and produces a customized RISC-V core with the required resources, while reducing the hardware overhead caused by, for instance, unneeded instructions and registers. Using TailoredCore, we achieve savings of up to 38% in registers and 12% in logic elements when generating cores for five CHStone benchmark applications and implementing them on an FPGA. These area savings also correspond to a reduction in the required power and energy.
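The application-analysis step can be illustrated with a small sketch. This is not the TailoredCore tool flow, which the abstract does not detail; it is a hypothetical, simplified profiler that scans RISC-V assembly text and collects the instruction mnemonics and register names that actually appear. The function name `profile_assembly` and the regular expression are assumptions made for illustration.

```python
import re

# Matches x0..x31 and the standard RISC-V ABI register names.
REGISTER = re.compile(
    r"\b(x(?:[12]?\d|3[01])|zero|ra|sp|gp|tp|fp|t[0-6]|s(?:[0-9]|1[01])|a[0-7])\b"
)


def profile_assembly(asm_text):
    """Return (mnemonics, registers) used in a RISC-V assembly listing.

    Simplified: skips comments, directives, and lines that are only a label;
    it does not expand pseudo-instructions or track implicit register use.
    """
    mnemonics, registers = set(), set()
    for line in asm_text.splitlines():
        code = line.split("#", 1)[0].strip()  # drop trailing comments
        if not code or code.endswith(":") or code.startswith("."):
            continue
        parts = code.split(None, 1)
        mnemonics.add(parts[0])
        if len(parts) > 1:
            registers.update(REGISTER.findall(parts[1]))
    return mnemonics, registers


if __name__ == "__main__":
    listing = """
    main:
      addi sp, sp, -16
      lw   a0, 8(sp)    # load argument
      add  a0, a0, x5
      ret
    """
    used_ops, used_regs = profile_assembly(listing)
    print(sorted(used_ops), sorted(used_regs))
```

A generator in this spirit could then omit decoder entries and register-file slots that never appear in the profile, which is the kind of overhead reduction the abstract describes.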
ISBN: 9783800749454 (print)
Embedded systems design has lately become particularly challenging due to rapidly increasing system complexity, real-time demands, and reliability requirements. At the same time, designs are constrained by stringent power budgets, limited memory capacity, and short time-to-market. Recently, various methods to automatically generate, analyze, and optimize different stages of the design flow have been investigated. The evaluation of these methods, however, often focuses on a single associated field of research and thus may not consider the cross-domain effects that stem from the interaction of different design methods. This paper presents a complete and fully automated workflow that covers model-based generation, analysis, and optimization aspects for embedded firmware and demonstrates it on a virtual system prototype of a typical inertial sensor node. We illustrate the integration of different design and evaluation methods on a realistic example and show potential opportunities for applying the gathered information to improve the design.
Authors: Dutt, Nikil; Regazzoni, Carlo S.; Rinner, Bernhard; Yao, Xin
Nikil Dutt (Fellow, IEEE) received the Ph.D. degree from the University of Illinois at Urbana–Champaign, Champaign, IL, USA, in 1989. He is currently a Distinguished Professor of computer science (CS), cognitive sciences, and electrical engineering and computer sciences (EECS) with the University of California at Irvine, Irvine, CA, USA. He is a coauthor of seven books. His research interests include embedded systems, electronic design automation (EDA), computer architecture, distributed systems, healthcare Internet of Things (IoT), and brain-inspired architectures and computing. Dr. Dutt is a Fellow of ACM. He was a recipient of the IFIP Silver Core Award. He has received numerous best paper awards. He serves as the Steering Committee Chair of the IEEE/ACM Embedded Systems Week (ESWEEK). He is also on the steering, organizing, and program committees of several premier EDA and embedded system design conferences and workshops. He has served on the Editorial Boards for the IEEE Transactions on Very Large Scale Integration (VLSI) Systems and the ACM Transactions on Embedded Computing Systems and also previously served as the Editor-in-Chief (EiC) for the ACM Transactions on Design Automation of Electronic Systems. He served on the Advisory Boards of the IEEE Embedded Systems Letters, the ACM Special Interest Group on Embedded Systems, the ACM Special Interest Group on Design Automation, and the ACM Transactions on Embedded Computing Systems.
Carlo S. Regazzoni (Senior Member, IEEE) received the M.S. and Ph.D. degrees in electronic and telecommunications engineering from the University of Genoa, Genoa, Italy, in 1987 and 1992, respectively. He is currently a Full Professor of cognitive telecommunications systems with the Department of Electrical, Electronics and Telecommunication Engineering and Naval Architecture (DITEN), University of Genoa, and a Co-Ordinator of the Joint Doctorate on Interactive and Cognitive Environments (JDICE) international Ph.D. course started initially as EU Erasmus Mundus Project and
Autonomous systems are able to make decisions and potentially take actions without direct human intervention, which requires some knowledge about the system and its environment as well as goal-oriented reasoning. In computer systems, one can derive such behavior from the concept of a rational agent with autonomy (“control over its own actions”), reactivity (“react to events from the environment”), proactivity (“act on its own initiative”), and sociality (“interact with other agents”) as fundamental properties [1]. Autonomous systems will undoubtedly pervade our everyday lives, and we will find them in a variety of domains and applications including robotics, transportation, health care, communications, and entertainment, to name a few. The articles in this month’s special issue cover concepts and fundamentals, architectures and techniques, and applications and case studies in the exciting area of self-awareness in autonomous systems.
This paper reports on the basic findings and future perspectives of a capacity-building project funded by the European Union. The International Master of Science on Cyber Physical Systems (MS@CPS) is a collaborative project that aims to establish a master program in cyber physical systems (CPS). The project was proposed by a consortium of nine partners. Three partners are European, from Germany, the UK, and Sweden, while the other six are from the South Mediterranean region: Palestine, Jordan, and Tunisia. The consortium is led by the University of Siegen in Germany, which also manages the implementation of the work packages. CPS is an emerging engineering subject with significant economic and societal implications, which motivated the consortium to propose a master program offering educational and training opportunities at graduate level in the fields of CPS. In this paper, CPS as a field of study is presented with an emphasis on its importance, especially with regard to meeting local needs. A brief description of the project is presented in conjunction with the methodology for developing the courses and their learning outcomes.