Search Results - Inner Mongolia University Library

An Ultra-Low Power, "Always-On" Camera Front-End for Posture Detection in Body Worn Cameras Using Restricted Boltzman Machines

IEEE TRANSACTIONS ON MULTI-SCALE COMPUTING SYSTEMS, 2015, Vol. 1, No. 4, pp. 187-194

Authors: Desai, Soham Jayesh; Shoaib, Mohammed; Raychowdhury, Arijit. Affiliations: Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA; Microsoft Corp, Microsoft Res, Redmond, WA 98052 USA

The Internet of Things (IoT) has triggered rapid advances in sensors, surveillance devices, wearables and body area networks with advanced Human-Computer Interfaces (HCI). One such application area is the adoption of Body Worn Cameras (BWCs) by law enforcement officials. The need to be 'always-on' puts heavy constraints on battery usage in these camera front-ends, thus limiting their widespread adoption. Further, the increasing number of such cameras is expected to create a data deluge, which requires large processing, transmission and storage capabilities. Instead of continuously capturing and streaming or storing videos, it is prudent to provide "smartness" to the camera front-end. This requires hardware assisted image recognition and template matching in the front-end, capable of making judicious decisions on when to trigger video capture or streaming. Restricted Boltzmann Machines (RBMs) based neural networks have been shown to provide high accuracy for image recognition and are well suited for low power and re-configurable systems. In this paper we propose an RBM based "always-on" camera front-end capable of detecting human posture. Aggressive behavior of the human being in the field of view will be used as a wake-up signal for further data collection and classification. The proposed system has been implemented on a Xilinx Virtex 7 XC7VX485T platform. A minimum dynamic power of 19.18 mW for a target recognition accuracy while maintaining real time constraints has been measured. The hardware-software co-design illustrates the trade-offs in the design with respect to accuracy, resource utilization, processing time and power. The results demonstrate the possibility of a true "always-on" body-worn camera system in the IoT environment.

Keywords: algorithms implemented in hardware; object recognition; reconfigurability; wearable computers
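
As a rough illustration of the RBM building block the abstract refers to, the sketch below computes the hidden-unit activation probabilities of a binary RBM for one visible vector. It uses the textbook formula only; the function names and the toy weights are assumptions made for this example and are not taken from the paper or its FPGA design.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used for RBM unit activations."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probabilities(visible, weights, hidden_bias):
    """Return P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * W[i][j]) for each j."""
    return [
        sigmoid(b_j + sum(v_i * weights[i][j] for i, v_i in enumerate(visible)))
        for j, b_j in enumerate(hidden_bias)
    ]


# Toy values chosen for illustration only (hypothetical, not from the paper).
visible = [1, 0, 1]
weights = [[0.5, -1.0], [0.3, 0.8], [-0.5, 1.0]]  # weights[i][j]: visible i -> hidden j
hidden_bias = [0.0, 0.0]
probs = hidden_probabilities(visible, weights, hidden_bias)
```

In a classifier built this way, the hidden probabilities would typically feed a further layer or a decision threshold; how the paper maps that step to hardware is not described in the abstract.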

FOURIER-TRANSFORMS IN VLSI

IEEE TRANSACTIONS ON COMPUTERS, 1983, Vol. 32, No. 11, pp. 1047-1057

Authors: THOMPSON, CD. Affiliation: Division of Computer Science, University of California

This paper surveys nine designs for VLSI circuits that compute N-element Fourier transforms. The largest of the designs requires O(N^2 log N) units of silicon area; it can start a new Fourier transform every O(log N) t...

Keywords: algorithms implemented in hardware; FFT; Fourier transform; VLSI; area-time complexity; computational complexity; mesh-connected computers; parallel algorithms; shuffle-exchange network
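
For readers unfamiliar with the computation these circuits perform, here is a minimal software sketch of an N-element Fourier transform using the recursive radix-2 Cooley-Tukey FFT. This is one common algorithm, assumed here for illustration; the abstract does not say which algorithms the nine surveyed designs use. The recursion has log2(N) levels with N/2 butterfly operations per level.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Butterfly: combine the two half-size transforms with a twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


impulse_spectrum = fft([1, 0, 0, 0])   # unit impulse
constant_spectrum = fft([1, 1, 1, 1])  # constant signal
```

A unit impulse transforms to a flat spectrum, and a constant signal puts all of its energy into bin 0, which makes both inputs convenient sanity checks.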

Hardware generation of arbitrary random number distributions from uniform distributions via the inversion method

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2007, Vol. 15, No. 8, pp. 952-962

Authors: Cheung, Ray C. C.; Lee, Dong-U; Luk, Wayne; Villasenor, John D. Affiliations: Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2AZ, England; Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA

We present an automated methodology for producing hardware-based random number generator (RNG) designs for arbitrary distributions using the inverse cumulative distribution function (ICDF). The ICDF is evaluated via piecewise polynomial approximation with a hierarchical segmentation scheme that involves uniform segments and segments with size varying by powers of two which can adapt to local function nonlinearities. Analytical error analysis is used to guarantee accuracy to one unit in the last place (ulp). Compact and efficient RNGs that can reach arbitrary multiples of the standard deviation sigma can be generated. For instance, a Gaussian RNG based on our approach for a Xilinx Virtex-4 XC4VLX100-12 field-programmable gate array produces 16-bit random samples up to 8.2 sigma. It occupies 487 slices, 2 block-RAMs, and 2 DSP-blocks. The design is capable of running at 371 MHz and generates one sample every clock cycle.

Keywords: algorithms implemented in hardware; automatic synthesis; Chebyshev approximation and theory; computer arithmetic; elementary function approximation; error analysis; gate arrays; piecewise polynomial approximation
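
The inversion method named in the title can be sketched in a few lines of software: draw a uniform sample and push it through the inverse cumulative distribution function (ICDF). The sketch below evaluates the ICDF exactly with library calls, whereas the paper approximates it with piecewise polynomials in hardware; the helper names are assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def exponential_icdf(u: float, rate: float = 1.0) -> float:
    """Inverse CDF of the exponential distribution, valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


def sample(icdf, rng, count):
    """Inversion method: map uniform samples on [0, 1) through an ICDF."""
    return [icdf(rng.random()) for _ in range(count)]


rng = random.Random(0)  # fixed seed so the run is repeatable
exp_samples = sample(exponential_icdf, rng, 10_000)
# NormalDist.inv_cdf rejects u == 0.0 exactly; that draw is vanishingly rare.
gauss_samples = sample(NormalDist().inv_cdf, rng, 10_000)
```

The same `sample` helper works for any distribution whose ICDF can be evaluated, which is the property that lets a single hardware architecture target arbitrary distributions.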

Pruning Binarized Neural Networks Enables Low-Latency, Low-Power FPGA-Based Handwritten Digit Classification

IEEE High Performance Extreme Computing Virtual Conference (HPEC)

Authors: Payra, Syamantak; Loke, Gabriel; Fink, Yoel; Steinmeyer, Joseph D. Affiliations: Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA; MIT, Dept Mat Sci & Engn, Cambridge, MA USA; MIT, Dept Mat Sci & Engn, Dept Elect Engn & Comp Sci, Cambridge, MA USA; MIT, Inst Soldier Nanotechnol, Cambridge, MA USA; MIT, Dept Elect Engn & Comp Sci, Cambridge, MA USA

ISBN: (print) 9798350308600

As neural networks are increasingly deployed on mobile and distributed computing platforms, there is a need to lower latency and increase computational speed while decreasing power and memory usage. Rather than using FPGAs as accelerators in tandem with CPUs or GPUs, we directly encode individual neural network layers as combinational logic within FPGA hardware. Utilizing binarized neural networks minimizes the arithmetic computation required, shrinking latency to only the signal propagation delay. We evaluate size-optimization strategies and demonstrate network compression via weight quantization and weight-model unification, achieving 96% of the accuracy of baseline MNIST digit classification models while using only 3% of the memory. We further achieve an 86% decrease in model footprint, 8 mW dynamic power consumption, and <9 ns latency, validating the versatility and capability of feature-strength-based pruning approaches for binarized neural networks to flexibly meet performance requirements amid application resource constraints.

Keywords: algorithms implemented in hardware; Combinational Logic; Cost/Performance; Neural Nets; Optical Character Recognition
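
To show why binarized networks reduce to simple logic, the sketch below evaluates one binarized neuron with the common XNOR-popcount formulation: inputs and weights are packed into integers, bit 1 encodes +1 and bit 0 encodes -1. This formulation, the function names and the threshold values are assumptions made for illustration; the abstract does not confirm that the paper uses exactly this encoding.

```python
def agreements(inputs: int, weights: int, n_bits: int) -> int:
    """Count bit positions where inputs and weights match (XNOR + popcount)."""
    mask = (1 << n_bits) - 1
    return bin(~(inputs ^ weights) & mask).count("1")


def binarized_neuron(inputs: int, weights: int, n_bits: int, threshold: int) -> int:
    """Fire (return 1) when the number of matching bits reaches the threshold."""
    return 1 if agreements(inputs, weights, n_bits) >= threshold else 0


def binarized_dot(inputs: int, weights: int, n_bits: int) -> int:
    """Equivalent +1/-1 dot product: matches minus mismatches."""
    return 2 * agreements(inputs, weights, n_bits) - n_bits


# Hypothetical 4-bit example.
fires = binarized_neuron(0b1011, 0b1001, n_bits=4, threshold=2)
silent = binarized_neuron(0b1011, 0b0100, n_bits=4, threshold=2)
```

Because each neuron is only an XNOR, a bit count and a comparison, a whole layer can in principle be expressed as a fixed logic function of its input bits.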

FPGA-Driven Pseudorandom Number Generators Aimed at Accelerating Monte Carlo Methods

7th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA-09)

Authors: Bachir, Tarek Ould; Brault, Jean-Jules. Affiliation: Ecole Polytech Montreal, Dept Elect Engn, Montreal, PQ H3T 1J4, Canada

ISBN: (print) 9781424438075

Hardware acceleration in the High Performance Computing (HPC) context is of growing interest, particularly in the field of Monte Carlo methods, where Field Programmable Gate Array (FPGA) technology has proven an effective medium, capable of speeding up the execution of stochastic processes by several orders of magnitude. The widespread use of reconfigurable hardware for stochastic simulation has prompted significant effort towards effective implementations of hardware pseudorandom number generators (PRNGs); these generators need to exhibit statistically proven random behaviour and to be characterized by a very long period. In this paper we present the state of the art of hardware pseudorandom number generation in the context of Monte Carlo acceleration. We highlight the emerging trends in the most recent publications and offer some insights into forthcoming work. Furthermore, we provide a complete hardware description of a new Gaussian variate generator (GVG) and an exponential variate generator (EVG) based on a decision-tree technique of ours, also presented herein. The prototypes, implemented on a Xilinx Virtex II Pro XC2VP100 FPGA, occupy from 150 to 417 slices and reach 280 MHz, while exhibiting good statistical behaviour with high p-values on the chi-squared test and offering a unitary Knuth ratio.

Keywords: Monte Carlo methods; FPGA acceleration; Pseudorandom number generation; Sampling methods; algorithms implemented in hardware; Exponential distribution; Normal distribution

A Methodology for Parabolic Synthesis of Unary Functions for Hardware Implementation

2nd International Conference on Signals, Circuits and Systems

Authors: Hertz, Erik; Nilsson, Peter. Affiliation: Lund Univ, Elect & Informat Technol Dept, S-22100 Lund, Sweden

ISBN: (print) 9781424426270

This paper introduces a parabolic synthesis methodology for developing approximations of unary functions, such as trigonometric functions and logarithms, which are specialized for efficient hardware-mapped VLSI design. The advantages of the methodology are a short critical path, fast computation and high throughput, enabled by a high degree of architectural parallelism. The feasibility of the methodology is shown by developing an approximation of the sine function for implementation in hardware.

Keywords: algorithms implemented in hardware; computer arithmetic; parabolic synthesis; parallel design style; VLSI

LSTM Cell Implementation on FPGAs

PARALLEL PROCESSING LETTERS, 2021, Vol. 31, No. 2

Authors: Dec, Grzegorz Rafal. Affiliation: Rzeszow Univ Technol, Dept Comp & Control Engn, W Pola 2, PL-35959 Rzeszow, Poland

This paper presents and discusses the implementation of an LSTM cell on an FPGA with an activation function inspired by the CORDIC algorithm. The realization is performed using both IEEE754 standard and 32-bit integer numbers. The case with floating-point arithmetic is analyzed with and without DSP blocks provided by the Xilinx design suite. The alternative implementation using integer arithmetic was optimized for a minimal number of clock cycles. The presented implementation uses xc6slx150t-2fgg900 and achieves high calculation accuracy for both cases.

Keywords: algorithms implemented in hardware; neural nets; reconfigurable hardware
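
For context, the computation inside an LSTM cell can be sketched as a single forward step. The version below is a single-unit cell with scalar input, written with the standard LSTM gate equations and the library `math.tanh`; it does not reproduce the paper's CORDIC-inspired activation, and the parameter layout (`params[name] = (w_x, w_h, b)`) is an assumption made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function used by the input, forget and output gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(x, h_prev, c_prev, params):
    """One forward step of a single-unit LSTM cell; returns (h, c)."""
    pre = {}
    for name in ("i", "f", "o", "g"):
        w_x, w_h, b = params[name]  # input weight, recurrent weight, bias
        pre[name] = w_x * x + w_h * h_prev + b
    i = sigmoid(pre["i"])    # input gate
    f = sigmoid(pre["f"])    # forget gate
    o = sigmoid(pre["o"])    # output gate
    g = math.tanh(pre["g"])  # candidate cell value
    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c


# Hypothetical parameters: all zeros, so every gate sits at 0.5.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("i", "f", "o", "g")}
h, c = lstm_cell_step(x=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

The sigmoid and tanh evaluations are the expensive part of each step, which is consistent with the abstract's focus on how the activation function is computed.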

FPGA-based Learning Acceleration for LSTM Neural Network

PARALLEL PROCESSING LETTERS, 2023, Vol. 33, No. 1N02, pp. 2350001-2350001

Authors: Dec, Grzegorz Rafal. Affiliation: Rzeszow Univ Technol, Dept Comp & Control Engn, W Pola 2, PL-35959 Rzeszow, Poland

This paper presents and discusses the implementation of a learning accelerator for an LSTM neural network that utilizes an FPGA. The accelerator consists of a backpropagation through time algorithm for an LSTM. The presented net performs a binary classification task and consists of an LSTM and a dense layer. The performance is then compared to both a hard-coded Python implementation and an implementation using Keras library and the GPU. The implementation is executed using the DSP blocks, available via the Vivado Design Suite, which is in compliance with the IEEE754 standard. The results of the simulation show that the FPGA implementation remains accurate and achieves higher speed than the other solutions.

Keywords: Backpropagation through time; algorithms implemented in hardware; neural nets; reconfigurable hardware

FPGA-based Neural Net for Failures Prediction in the Cold Forging Process

PARALLEL PROCESSING LETTERS, 2022, Vol. 32, No. 1N02, pp. 2150023-2150023

Authors: Dec, Grzegorz Rafal. Affiliation: Rzeszow Univ Technol, Dept Comp & Control Engn, W Pola 2, PL-35959 Rzeszow, Poland

This paper presents and discusses the implementation of a deep neural network for the purpose of failure prediction in the cold forging process. The implementation consists of an LSTM and a dense layer implemented on an FPGA. The network was trained beforehand on a desktop computer using the Keras library for Python, and the weights and biases were embedded into the implementation. The implementation is executed using the DSP blocks, available via the Vivado Design Suite, which are in compliance with the IEEE754 standard. The simulation of the network achieves 100% classification accuracy on the test data and high calculation speed.

Keywords: algorithms implemented in hardware; neural nets; reconfigurable hardware; Industry 4.0
