Details
ISBN (print): 9782839918442
We demonstrate a novel FPGA-based accelerator architecture that can tackle a range of standard computer vision (CV) problems, with scalable performance and attractive speedups. The architecture relies on multiple pipelined processing elements (PEs) that can be configured to support various belief propagation (BP) settings for different CV tasks. Inside each PE, an innovative implementation of Jump Flooding for efficient computation of BP solves the core configurability challenge. A novel block-parallel memory interface supports parallelization by distributing BP inference workloads across the PEs. Experimental results demonstrate that our accelerator achieves scalable performance with 11-41x speedup over standard sequential CPU implementations across a subset of well-known Middlebury and OpenGM benchmarks, with no compromise in quality of inference results. To the best of our knowledge, this is the first FPGA hardware implementation of BP capable of running a range of standard CV benchmarks with significant speedups.
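The Jump Flooding technique mentioned in the abstract originates in GPU computing for Voronoi diagrams and distance transforms. The following is a minimal pure-Python sketch of the classic Jump Flooding Algorithm (JFA), not the paper's BP-specific variant: each pass, every cell inspects neighbours at offsets of +/-step in each axis and adopts the closest known seed, with step halving from roughly the grid size down to 1.

```python
def jump_flood_voronoi(width, height, seeds):
    """Classic Jump Flooding Algorithm (JFA) for an approximate Voronoi
    diagram: each cell ends up holding the index of its nearest seed.
    This is the textbook JFA, not the paper's BP-specific adaptation."""
    # owner[y][x] = index of nearest known seed, or None if unreached
    owner = [[None] * width for _ in range(height)]
    for i, (sx, sy) in enumerate(seeds):
        owner[sy][sx] = i

    def dist2(x, y, i):
        sx, sy = seeds[i]
        return (x - sx) ** 2 + (y - sy) ** 2

    step = 1 << (max(width, height) - 1).bit_length()  # start near grid size
    while step >= 1:
        new = [row[:] for row in owner]
        for y in range(height):
            for x in range(width):
                # Inspect the 9 cells at offsets of +/-step in each axis.
                for dy in (-step, 0, step):
                    for dx in (-step, 0, step):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < width and 0 <= ny < height:
                            cand = owner[ny][nx]
                            if cand is not None and (
                                new[y][x] is None
                                or dist2(x, y, cand) < dist2(x, y, new[y][x])
                            ):
                                new[y][x] = cand
        owner = new
        step //= 2
    return owner
```

Because each cell's update depends only on the previous pass, all cells can be processed in parallel, which is what makes the scheme attractive for pipelined FPGA processing elements.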
Details
ISBN (print): 9781424438914
Capacity of FPGAs has grown significantly, leading to increased complexity of designs targeting these chips. Traditional FPGA design methodology using HDLs is no longer sufficient and new methodologies are being sought. An attractive possibility is to use streaming languages. Streaming languages group data into streams, which are processed by computational nodes called kernels. They are suitable for implementation in FPGAs because they expose parallelism, which can be exploited by implementing the application in FPGA logic. Designers can express their designs in a streaming language and target FPGAs without needing a detailed understanding of digital logic design. In this paper we show how the Brook streaming language can be used to simplify design for FPGAs, while providing reasonable performance compared to other methodologies. We show that throughput of streaming applications can be increased through automatic kernel replication. Using our compiler, the FPGA designer can trade off FPGA area and performance by changing the amount of kernel replication. We describe the details of our compiler and present performance and area of a set of benchmarks. We found that throughput scales well with increased replication for most applications.
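The kernel-replication idea can be modelled in a few lines: a stateless kernel is copied n times, the input stream is dealt round-robin across the copies (which run concurrently in FPGA logic), and outputs are re-interleaved. This is an illustrative sketch, not the Brook compiler's actual API.

```python
def replicate_kernel(kernel, n_replicas):
    """Model of automatic kernel replication: the input stream is dealt
    round-robin to n_replicas copies of a stateless kernel, and outputs
    are collected back in order. Names are illustrative only."""
    def replicated(stream):
        # Partition the stream across replicas (round-robin).
        lanes = [stream[i::n_replicas] for i in range(n_replicas)]
        # Each replica processes its lane independently
        # (concurrently, when mapped to FPGA logic).
        results = [[kernel(x) for x in lane] for lane in lanes]
        # Re-interleave to restore the original stream order.
        out = [None] * len(stream)
        for i, lane in enumerate(results):
            out[i::n_replicas] = lane
        return out
    return replicated

# Example: four replicas of a squaring kernel.
square4 = replicate_kernel(lambda x: x * x, 4)
```

The area/throughput trade-off the abstract describes corresponds to choosing n_replicas: each extra copy consumes more FPGA logic but, for stateless kernels, multiplies peak throughput.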
Details
ISBN (print): 9782839918442
Random sampling based path planning algorithms have shown their high efficiency in robotics, navigation and related fields. The Rapidly-Exploring Random Trees (RRT) algorithm is the typical method and works well in a variety of applications. Due to the sub-optimality of the original RRT, the more recent algorithm known as RRT* significantly improves solution optimality by adding a "cost review" procedure. However, the original RRT already suffers a bottleneck of complicated iterations, and this becomes worse in RRT*. This paper presents a hardware architecture for RRT* that fully exploits the parallel potential of the algorithm. Unlike the sequential execution in software, the "exploration" and "review" steps are identified as independent processes and executed in parallel. For the complicated operation of inserting vertices, a pipelined Kd-tree constructor is designed to rapidly rebuild the tree when a new vertex is generated. Furthermore, to speed up near-neighbor and nearest-neighbor searching, the vertices are stored in separate Kd-trees so that the search processes can be carried out concurrently in each tree. This work explores a feasible, power-efficient RRT* hardware architecture on FPGAs, compared against a PC implementation.
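The "exploration" and "cost review" steps the abstract parallelizes can be seen in a single software RRT* iteration. The sketch below (2D, Euclidean cost) is an illustration of the standard algorithm, not the paper's hardware datapath: steer toward a sample, choose the cheapest parent among near neighbours, then rewire neighbours through the new node.

```python
import math

def rrt_star_step(nodes, parents, costs, sample, step=0.5, radius=1.0):
    """One RRT* iteration in 2D (illustrative sketch of the standard
    algorithm, not the paper's FPGA datapath)."""
    d = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    # "Exploration": steer from the nearest node toward the sample.
    nearest = min(range(len(nodes)), key=lambda i: d(nodes[i], sample))
    dist = d(nodes[nearest], sample)
    t = min(1.0, step / dist) if dist > 0 else 0.0
    new = (nodes[nearest][0] + t * (sample[0] - nodes[nearest][0]),
           nodes[nearest][1] + t * (sample[1] - nodes[nearest][1]))
    # "Cost review": pick the lowest-cost parent among near neighbours.
    near = [i for i in range(len(nodes)) if d(nodes[i], new) <= radius]
    parent = min(near, key=lambda i: costs[i] + d(nodes[i], new))
    nodes.append(new)
    parents.append(parent)
    costs.append(costs[parent] + d(nodes[parent], new))
    # Rewire: route near neighbours through the new node if cheaper.
    for i in near:
        c = costs[-1] + d(new, nodes[i])
        if c < costs[i]:
            parents[i], costs[i] = len(nodes) - 1, c
    return new
```

The near-neighbour scan and the rewiring loop touch disjoint data from the steering step, which is the independence the paper exploits by running "exploration" and "review" in parallel hardware.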
Details
ISBN (print): 9781467381239
Field-Programmable Gate Arrays (FPGAs) are becoming pervasive in various kinds of computationally demanding applications. Working in a tightly coupled processor-coprocessor architecture, FPGAs are often anticipated to accelerate multiple fine-grained or coarse-grained tasks simultaneously. Single-context FPGAs are commonly used in such systems. With the recent development of emerging memory technologies, multi-context FPGAs that support dynamic reconfiguration with high-density non-volatile memories become feasible. Compared to single-context FPGAs, multi-context FPGAs are able to accelerate significantly more tasks with only moderate area and power overhead. However, the best way to utilize the computation capacity advantage of multi-context FPGAs for hardware task mapping remains an interesting and unexplored problem. In this paper, we first propose the framework of a processor-coprocessor architecture with a multi-context FPGA as the coprocessor for multiple-task acceleration. Under the framework, a hybrid placement strategy based on genetic and greedy algorithms is proposed to efficiently place a set of tasks onto the multi-context FPGA to achieve the best logic capacity utilization. Experiments on real and synthetic benchmarks demonstrate the efficiency of the proposed algorithm compared with other general approaches.
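The greedy half of such a hybrid placer is essentially multi-way bin packing: each context is a bin of logic capacity, and tasks are placed largest-first into the context with the most free capacity. The sketch below illustrates that general idea only; the paper's actual hybrid genetic/greedy algorithm and its cost model are not reproduced here.

```python
def greedy_place(task_areas, n_contexts, capacity):
    """Greedy largest-first placement of tasks onto FPGA contexts
    (a sketch of the general bin-packing idea, not the paper's exact
    hybrid genetic/greedy algorithm)."""
    free = [capacity] * n_contexts          # remaining logic per context
    placement = {}
    for task, area in sorted(task_areas.items(), key=lambda kv: -kv[1]):
        # Put each task into the context with the most free capacity.
        ctx = max(range(n_contexts), key=lambda c: free[c])
        if free[ctx] < area:
            raise ValueError(f"task {task} does not fit in any context")
        placement[task] = ctx
        free[ctx] -= area
    return placement, free
```

A genetic algorithm would then perturb such greedy placements (e.g. by swapping tasks between contexts) to escape the local optima a purely greedy pass can get stuck in.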
Details
ISBN (print): 9781424419609
Today, quasi-Monte Carlo (QMC) methods are widely used in finance to price derivative securities. The QMC approach is popular because for many types of derivatives it yields an estimate of the price, to a given accuracy, faster than other competitive approaches, like Monte Carlo (MC) methods. The calculation of the large number of underlying asset pathways consumes a significant portion of the overall run-time and energy of modern QMC derivative pricing simulations. Therefore, we present an FPGA-based accelerator for the calculation of asset pathways suitable for use in the QMC pricing of several types of derivative securities. Although this implementation uses constructs (recursive algorithms and double-precision floating point) not normally associated with successful FPGA computing, we demonstrate performance in excess of 50x that of a 3 GHz multi-core processor.
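To make the "asset pathway" workload concrete, here is a toy sketch of one QMC path under geometric Brownian motion: a base-2 van der Corput low-discrepancy point is mapped through the inverse normal CDF to drive each step. This is an illustration of the general QMC pathway idea only (a proper multi-dimensional QMC construction, and the paper's recursive formulation, are not reproduced).

```python
import math
from statistics import NormalDist

def van_der_corput(n, base=2):
    """n-th element of the van der Corput low-discrepancy sequence."""
    q, bk = 0.0, 1.0 / base
    while n > 0:
        n, r = divmod(n, base)
        q += r * bk
        bk /= base
    return q

def qmc_gbm_path(s0, mu, sigma, dt, n_steps, path_index):
    """One quasi-Monte Carlo asset pathway under geometric Brownian
    motion (toy sketch; parameter names are illustrative)."""
    inv = NormalDist().inv_cdf
    path = [s0]
    for k in range(n_steps):
        # Low-discrepancy point -> standard normal increment.
        u = van_der_corput(path_index * n_steps + k + 1)
        z = inv(min(max(u, 1e-12), 1 - 1e-12))  # clamp away from 0 and 1
        s = path[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt
                                + sigma * math.sqrt(dt) * z)
        path.append(s)
    return path
```

The per-step recursion (each price depends on the previous one) is exactly the kind of recursive, double-precision structure the abstract notes is unusual, yet still profitable, to map onto an FPGA pipeline.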
Details
ISBN (print): 9781424438914
Many applications in image processing have high inherent parallelism. FPGAs have shown very high performance in spite of their low operational frequency by fully extracting this parallelism. In recent microprocessors, it also becomes possible to exploit the parallelism using multi-cores that support improved SIMD instructions, though programmers have to use them explicitly to achieve high performance. Recent GPUs support a large number of cores, and have a potential for high performance in many applications. However, the cores are grouped, and data transfer between the groups is very limited. Programming tools for FPGAs, SIMD instructions on CPUs and the large number of cores on GPUs have been developed, but it is still difficult to achieve high performance on these platforms. In this paper, we compare the performance of FPGA, GPU and CPU using three applications in image processing: two-dimensional filters, stereo vision and k-means clustering, and make clear which platform is faster under which conditions.
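Of the three benchmarks, k-means clustering shows the data-parallel structure most plainly: every point's nearest-center assignment is independent, so the inner loop maps naturally onto FPGA pipelines, SIMD lanes, or GPU threads. Below is a plain pure-Python reference of Lloyd's k-means on 2D points (a generic reference implementation, not the benchmark code from the paper).

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on 2D points: the per-point assignment step
    is the data-parallel kernel that FPGA/GPU/SIMD implementations
    accelerate. Deterministic init (first k points) for simplicity."""
    centers = [points[i] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        # Assignment step: independent per point -> fully parallelizable.
        for p in points:
            j = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                          + (p[1] - centers[c][1]) ** 2)
            clusters[j].append(p)
        # Update step: recompute each centroid (a parallel reduction).
        for c, pts in enumerate(clusters):
            if pts:
                centers[c] = (sum(p[0] for p in pts) / len(pts),
                              sum(p[1] for p in pts) / len(pts))
    return centers
```

The update step is a reduction, which is where the platforms differ most: FPGAs can build custom accumulator trees, while GPUs must reduce within and then across core groups, the limited inter-group communication the abstract mentions.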
Details
ISBN (print): 9781424419609
Post-beamforming second order Volterra filter (SOVF) was previously introduced for decomposing the pulse-echo ultrasonic radio-frequency (RF) signal into its linear and quadratic components. Using singular value decomposition (SVD), an optimal minimum-norm least squares algorithm for deriving the coefficients of the linear and quadratic kernels of the SOVF was developed and verified. The "separable" implementation algorithm of a SOVF based on the eigenvalue decomposition (EVD) of the quadratic kernel was introduced and verified. In this paper, the "separable" version of a second order Volterra filter is implemented in a Xilinx Virtex-E FPGA. Parallel operation, efficient use of instructions per task, and the data streaming capability of the FPGA are identified. This implementation should allow for real-time quadratic filtering on commercial ultrasound scanners.
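The "separable" trick rests on a standard identity: a symmetric quadratic kernel Q has an eigendecomposition Q = sum_i lambda_i h_i h_i^T, so the quadratic output x^T Q x equals sum_i lambda_i (h_i^T x)^2, i.e. a bank of linear FIR branches, each squared and scaled. The sketch below verifies that identity numerically (an illustration of the separable structure, not the paper's FPGA code).

```python
def quadratic_direct(Q, window):
    """Full quadratic kernel: sum over i, j of Q[i][j] * x[i] * x[j]."""
    n = len(Q)
    return sum(Q[i][j] * window[i] * window[j]
               for i in range(n) for j in range(n))

def quadratic_separable(eigvals, eigvecs, window):
    """'Separable' form: each eigenvector is a linear FIR branch whose
    output is squared and scaled by its eigenvalue, then summed."""
    out = 0.0
    for lam, h in zip(eigvals, eigvecs):
        branch = sum(hk * xk for hk, xk in zip(h, window))  # linear filter
        out += lam * branch * branch                        # square & scale
    return out
```

On hardware, each branch is an independent FIR pipeline followed by one multiplier, which is far cheaper than the dense O(n^2) multiply of the direct quadratic form.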
Details
ISBN (print): 9781424438914
This paper advocates the use of 3D integration technology to stack a DRAM on top of an FPGA. The DRAM will store future FPGA contexts. A configuration is read from the DRAM into a latch array on the DRAM layer while the FPGA executes; the new configuration is loaded from the latch array into the FPGA in 60 ns (5 cycles). The latency between reconfigurations, 8.42 µs, is dominated by the time to read data from the DRAM into the latch array. We estimate that the DRAM can cache 289 FPGA contexts.
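The quoted figures can be sanity-checked with simple arithmetic: 60 ns over 5 cycles implies a 12 ns configuration-load cycle (roughly an 83 MHz clock), and subtracting the 60 ns load from the 8.42 µs total shows essentially all of the latency is the DRAM-to-latch-array read, as the abstract says.

```python
# Back-of-envelope check on the timing figures quoted above.
load_ns = 60.0                          # latch array -> FPGA load time
cycles = 5
cycle_ns = load_ns / cycles             # 12 ns per cycle (~83 MHz)
latency_us = 8.42                       # total time between reconfigurations
dram_read_us = latency_us - load_ns / 1000.0   # time spent on the DRAM read
print(cycle_ns, round(dram_read_us, 2))        # 12.0, 8.36
```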
Details
ISBN (print): 9781424419609
Due to continuous improvements in the resources available on FPGAs, it is becoming increasingly possible to accelerate floating point algorithms. The solution of a system of linear equations forms the basis of many problems in engineering and science, but its calculation is highly time consuming. The minimum residual (MINRES) algorithm is one method to solve this problem, and is highly effective provided the matrix exhibits certain characteristics. This paper examines an IEEE 754 single precision floating point implementation of the MINRES algorithm on an FPGA. It demonstrates that through parallelisation and heavy pipelining of all floating point components it is possible to achieve a sustained performance of up to 53 GFLOPS on the Virtex5-330T. This compares favourably to other hardware implementations of floating point matrix inversion algorithms, and corresponds to an improvement of nearly an order of magnitude compared to a software implementation.
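MINRES minimises the residual norm over a Krylov subspace for symmetric matrices. As a compact stand-in, the sketch below implements the closely related Conjugate Residual iteration for a symmetric system (same residual-minimising idea; this is not the paper's pipelined Lanczos-based MINRES implementation). Its inner products and matrix-vector products are the operations the paper pipelines in floating point.

```python
def conjugate_residual(A, b, iters=50, tol=1e-10):
    """Conjugate Residual iteration for a symmetric matrix A: a minimal
    relative of MINRES (both minimise the residual over a Krylov
    subspace). Dense pure-Python reference, not the FPGA datapath."""
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n))
                        for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = [0.0] * n
    r = b[:]                 # residual b - A x for x = 0
    p = r[:]
    Ar = matvec(r)
    Ap = Ar[:]
    rAr = dot(r, Ar)
    for _ in range(iters):
        alpha = rAr / dot(Ap, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) < tol * tol:
            break
        Ar = matvec(r)
        rAr_new = dot(r, Ar)
        beta = rAr_new / rAr
        rAr = rAr_new
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
    return x
```

Every step is dominated by one matrix-vector product plus a few dot products and vector updates, all of which pipeline well, which is why iterative Krylov solvers are attractive targets for FPGA floating-point acceleration.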
Details
ISBN (print): 9781424419609
This paper describes the integration of a thermally assisted switching magnetic random access memory (TAS-MRAM) in FPGA design. The non-volatility of the latter is achieved through the use of magnetic tunneling junctions (MTJ) in the MRAM cell. A thermally assisted switching scheme is used to write data in the MTJ device, which helps to reduce power consumption during write operation in comparison to the writing scheme in a classical MTJ device. Moreover, the non-volatility of such a design should reduce both power consumption and the configuration time required at each power-up of the circuit in comparison to classical SRAM-based FPGAs. A real time reconfigurable (RTR) micro-FPGA using TAS-MRAM allows dynamic reconfiguration mechanisms, while featuring a simple design architecture.