Search results - Inner Mongolia University Library

2018 IEEE International Conference on Advances in Computing, Communication Control and Networking, ICACCCN 2018

Authors: Kumari, Juhi; Rawat, Tarun K. (Electronics and Communication Department, Netaji Subhas Institute of Technology, New Delhi, India)

ISBN: (print) 9781538641194

This paper proposes an efficient implementation of a third-order differentiator based on the lattice wave digital filter (LWDF). The data flow graph (DFG) of the third-order lattice wave digital differentiator (LWDD) is presented and then transformed into an efficient design using an unfolding algorithm. The simulation results confirm that the proposed design takes less execution time to produce the required result. Further simulations suggest that execution time decreases as the order of unfolding increases. © 2018 IEEE.

Keywords: data flow graphs
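
The unfolding step mentioned in the abstract can be illustrated with the standard textbook transformation: for an edge U -> V carrying w delays, the J-unfolded graph contains edges U_i -> V_((i + w) mod J) with floor((i + w) / J) delays, for i = 0..J-1. The sketch below is a minimal illustration of that rule only; the `Edge` type, the `unfold` function, and the two-node example graph are hypothetical and are not taken from the paper's LWDD design.

```python
from collections import namedtuple

# Hypothetical edge record: source node, destination node, number of delays.
Edge = namedtuple("Edge", "src dst delays")

def unfold(edges, J):
    """Return the edge list of the J-unfolded data flow graph."""
    unfolded = []
    for e in edges:
        for i in range(J):
            j = (i + e.delays) % J    # index of the destination copy
            w = (i + e.delays) // J   # delays left on the unfolded edge
            unfolded.append(Edge(f"{e.src}{i}", f"{e.dst}{j}", w))
    return unfolded

# Assumed example: a loop A -> B (0 delays) and B -> A (3 delays).
dfg = [Edge("A", "B", 0), Edge("B", "A", 3)]
unfolded = unfold(dfg, 2)
for edge in unfolded:
    print(edge)
```

Unfolding preserves the total number of delays on each original edge, which is a quick sanity check on any implementation of the rule.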

Optimising loops in dynamic dataflow

IET Circuits, Devices & Systems, 2017, vol. 11, no. 2, pp. 113-122

Authors: Santiago, Leandro; Marzulo, Leandro A. J.; Sena, Alexandre C.; Alves, Tiago A. O.; Franca, Felipe M. G. (Univ Fed Rio de Janeiro, PESC COPPE, Programa Engn Sistemas & Comp, Rio De Janeiro, Brazil; Univ Estado Rio de Janeiro, IME, Rio De Janeiro, Brazil)

Dynamic dataflow allows simultaneous execution of instructions in different iterations of a loop, boosting parallelism exploitation. In this model, operands are tagged with their associated instance number, which is incremented as they go through the loop. Instruction execution is triggered when all input operands with the same tag become available. However, this traditional tagging mechanism often requires the generation of several control instructions to manipulate tags and guarantee the correct match. To address this problem, this work presents three dataflow loop optimisation techniques. The stack-tagged dataflow is a tagging mechanism that uses stacks of tags to reduce control overheads in dataflow. On the other hand, as nested loops may increase the overhead of stack-tag comparison, tag resetting can be used to set the tag to zero whenever it is safe, allowing a one-level reduction in the stack depth. Finally, loop skipping further avoids stack comparison overhead in loops when the number of iterations can be determined by the compiler. Experimental results show the overhead, drawbacks and benefits of the three optimisations presented. Moreover, the results suggest that a hybrid compiling approach can be used to get the best performance out of each technique.

Keywords: dynamic dataflow tagging mechanism dataflow overhead reduction data flow graphs data handling stack tagged dataflow nested loops dataflow loop optimisation parallel programming
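
The traditional firing rule described in the abstract (an instruction executes once all input operands with the same tag are available) can be sketched as a small matching store. This is a minimal illustration of plain tag matching under assumed names (`MatchingStore`, `run_add`, and the token tuples are hypothetical); it does not model the stack-tagged, tag-resetting, or loop-skipping optimisations the paper proposes.

```python
from collections import defaultdict

class MatchingStore:
    """Holds operands per tag until every input port of an instruction is filled."""

    def __init__(self, arity):
        self.arity = arity
        self.waiting = defaultdict(dict)  # tag -> {port: value}

    def receive(self, tag, port, value):
        """Store one operand; return the full operand tuple when the tag is complete."""
        slot = self.waiting[tag]
        slot[port] = value
        if len(slot) == self.arity:
            del self.waiting[tag]
            return tuple(slot[p] for p in range(self.arity))
        return None

def run_add(tokens):
    """Fire a two-input add once per tag, in whatever order operands arrive."""
    store = MatchingStore(arity=2)
    results = {}
    for tag, port, value in tokens:
        operands = store.receive(tag, port, value)
        if operands is not None:
            results[tag] = operands[0] + operands[1]
    return results

# Operands of iterations 0 and 1 arrive interleaved as (tag, port, value).
tokens = [(0, 0, 10), (1, 0, 20), (1, 1, 2), (0, 1, 1)]
results = run_add(tokens)
print(results)
```

Iteration 1 fires before iteration 0 here because its operands complete first, which is the out-of-order behaviour that lets different loop iterations overlap.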

RELATIONSHIPS BETWEEN COMMON GRAPHICAL REPRESENTATIONS USED IN SYSTEM ENGINEERING

Insight, 2018, vol. 21, no. 1, pp. 8-11

Author: Long, James E.

Most system engineers today use graphical representations of a system to communicate its functional and data requirements. The most commonly used representations are the Function Flow Block Diagram (FFBD), Data Flow Diagram (DFD), N2 Chart, IDEF0 Diagram, and Behavior Diagram (BD). This paper discusses the characteristics of each and shows how they are related. © 2018 International Council on Systems Engineering.

Keywords: data flow graphs

Detecting Hardware Trojans in Unspecified Functionality Through Solving Satisfiability Problems

22nd Asia and South Pacific Design Automation Conference (ASP-DAC)

Authors: Fern, Nicole; San, Ismail; Cheng, Kwang-Ting (Tim) (UC Santa Barbara, Santa Barbara, CA 93106, USA; Anadolu Univ, Eskisehir, Turkey; HKUST, Hong Kong, Hong Kong, Peoples R China)

ISBN: (print) 9781509015580

For modern complex designs it is impossible to fully specify design behavior, and only feasible to verify functionally meaningful scenarios. Hardware Trojans that modify only unspecified functionality cannot be detected using existing verification methodologies and Trojan detection strategies. We propose a detection methodology for these Trojans by 1) precisely defining "suspicious" unspecified functionality in terms of information leakage, and 2) formulating detection as a satisfiability problem that can take advantage of the recent advances in both Boolean and satisfiability modulo theory (SMT) solvers. The formulated detection procedure can be applied to a gate-level design using commercial equivalence checking tools, or directly to the Verilog/VHDL code by reasoning about the satisfiability of SMT expressions built by traversing the data-flow graph. We demonstrate the effectiveness of our approach on an adder coprocessor and a UART communication controller infected with Trojans that process information leaked from the on-chip bus during idle cycles using signals with only partially specified behavior.

Keywords: data flow graphs

A Unified Framework for Throughput Analysis of Streaming Applications under Memory Constraints

22nd International Conference on Engineering of Complex Computer Systems (ICECCS)

Author: Zhu, Xue-Yang (Chinese Acad Sci, Inst Software, State Key Lab Comp Sci, Beijing, Peoples R China)

ISBN: (print) 9781538624319

Streaming applications are an important class of applications in real-time embedded systems, which usually run under restricted resource constraints and with real-time requirements. They are often modeled with Synchronous Data Flow Graphs (SDFGs) or Cyclo-Static Data Flow Graphs (CSDFGs) at the design stage. A proper analysis of the models gives a predictable design for a system. In this paper, we focus on the throughput analysis of (C)SDFGs, taking into account memory constraints. Memory-related analysis needs to choose a memory abstraction that decides when the space of consumed data is released and when the required space is claimed. Different memory abstractions may lead to different achievable throughputs. The existing techniques, however, consider only a certain abstraction. If a model is implemented according to other abstractions, the analysis result may not truly evaluate the performance of the system. In this paper, we present a novel unified framework for throughput analysis of memory-constrained (C)SDFGs for different abstractions, aiming to provide evaluations that match the corresponding implementations. Our methods are exact. Experiments are carried out on several models of real streaming applications and hundreds of synthetic graphs to evaluate the effects and performance of our methods.

Keywords: data flow graphs iteration period memory abstractions self-timed execution time stamp
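
As background for the SDFG model named in the abstract: throughput is usually measured per iteration, and one iteration fires each actor as many times as the repetition vector prescribes. That vector is the smallest positive integer solution of the balance equations (firings of the producer times its production rate equals firings of the consumer times its consumption rate, on every channel). The sketch below is a minimal illustration of that standard calculation for a connected graph; the function name, the channel tuple layout, and the three-actor example are assumptions, and it is not the paper's memory-constrained analysis. It needs Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import lcm

def repetition_vector(actors, channels):
    """Solve the balance equations of a connected SDFG.

    channels: list of (src, production_rate, dst, consumption_rate).
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, derive the others
    pending = list(channels)
    progressed = True
    while pending and progressed:
        progressed = False
        for ch in list(pending):
            src, p, dst, c = ch
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * p / c
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * c / p
            elif src in rate and dst in rate:
                if rate[src] * p != rate[dst] * c:
                    raise ValueError("inconsistent SDFG: no repetition vector")
            else:
                continue  # neither endpoint reached yet; retry later
            pending.remove(ch)
            progressed = True
    # Scale the rational solution to the smallest integer one.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}

# Assumed example: A produces 2 tokens, B consumes 3; B produces 1, C consumes 2.
reps = repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])
print(reps)
```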

High Level Synthesis of Asynchronous Circuits from data flow graphs

21st International Workshop on Power and Timing Modeling, Optimization, and Simulation

Authors: van Leuken, Rene; van Leeuwen, Tom; Arriens, Huib Lincklaen (Delft Univ Technol, Fac Elect Engn Math & Comp Sci, Circuits & Syst Grp, Delft, Netherlands)

ISBN: (print) 9783642241536; 9783642241543

This paper presents a toolbox for the automatic generation of asynchronous circuits starting from a data flow graph description. The toolbox consists of a scheduling tool and a code generation tool. We use the same traditional scheduling algorithms as for synchronous circuits, but have replaced the implied synchronous controller with an asynchronous distributed control network. The control circuit allows for true asynchronous operation of all digital resources and, as a result of its scalable distributed topology, allows unlimited resource sharing. The distributed controllers can be created by connecting a small number of pre-designed sub-controllers, which are presented in this paper. Prototype IP blocks of these sub-controller circuits have been designed in a 90 nm ASIC design process. Our toolbox is capable of generating large, complex asynchronous solutions with up to 20 percent power savings and latency at least as good as that of synchronous solutions.

Keywords: data flow graphs

An Efficient Task-based All-Reduce for Machine Learning Applications

2017 Machine Learning in HPC Environments, MLHPC 2017

Authors: Li, Zhenyu; Davis, James; Jarvis, Stephen (Department of Computer Science, University of Warwick, Coventry, United Kingdom)

ISBN: (print) 9781450351379

All-Reduce is a collective-combine operation frequently utilised in synchronous parameter updates in parallel machine learning algorithms. The performance of this operation - and consequently of the algorithm itself - is heavily dependent on its implementation, configuration and on the supporting hardware on which it is run. Given the pivotal role of all-reduce, a failure in any of these regards will significantly impact the resulting scientific output. In this research we explore the performance of alternative all-reduce algorithms in data-flow graphs and compare these to the commonly used reduce-broadcast approach. We present an architecture and interface for all-reduce in task-based frameworks, and a parallelization scheme for object-serialization and computation. We present a concrete, novel application of a butterfly all-reduce algorithm on the Apache Spark framework on a high-performance compute cluster, and demonstrate the effectiveness of the new butterfly algorithm with a logarithmic speed-up with respect to the vector length compared with the original reduce-broadcast method - a 9x speed-up is observed for vector lengths in the order of 10^8. This improvement comprises both algorithmic changes (65%) and parallel-processing optimization (35%). The effectiveness of the new butterfly all-reduce is demonstrated using real-world neural network applications with the Spark framework. For the model-update operation we observe significant speedups using the new butterfly algorithm compared with the original reduce-broadcast, for both smaller (Cifar and Mnist) and larger (ImageNet) datasets. © 2017 Association for Computing Machinery.

Keywords: data flow graphs
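
The butterfly pattern named in the abstract can be sketched as a single-process simulation: in each of log2(n) rounds, every worker exchanges its partial result with the partner whose rank differs in one bit, so all workers hold the full sum at the end without a separate broadcast. This is a minimal sketch of the generic butterfly (recursive-doubling) exchange under assumed names; it is not the paper's Spark implementation, and it assumes a power-of-two worker count and element-wise addition as the combine operation.

```python
def butterfly_all_reduce(vectors):
    """Simulate a butterfly all-reduce (element-wise sum) across len(vectors) workers."""
    n = len(vectors)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two number of workers")
    state = [list(v) for v in vectors]
    step = 1
    while step < n:
        nxt = []
        for rank in range(n):
            partner = rank ^ step  # partner differs from rank in exactly one bit
            nxt.append([a + b for a, b in zip(state[rank], state[partner])])
        state = nxt                # all exchanges in a round happen "simultaneously"
        step *= 2
    return state

# Four hypothetical workers, each holding a length-2 partial gradient.
workers = [[1, 2], [3, 4], [5, 6], [7, 8]]
reduced = butterfly_all_reduce(workers)
print(reduced)
```

With four workers the loop runs two rounds, and every worker ends up holding the same reduced vector.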

Automatic generation of VHDL hardware code from data flow graphs

6th IEEE International Symposium on Applied Computational Intelligence and Informatics, SACI 2011

Authors: Necsulescu, Philip I.; Groza, Voicu (University of Ottawa, School of Information Technology and Engineering, Ottawa, Canada)

ISBN: (print) 9781424491094

The Software/Hardware Implementation and Research Architecture (SHIRA) is a C-to-hardware toolchain developed by the Computer Architecture Research Group (CARG) of the University of Ottawa. A framework and algorithms to generate the hardware from an Intermediate Representation (IR) of the C code are needed. This paper presents the conception, design, and development of a module that generates the hardware for custom instructions identified by specialized SHIRA components without the need for any user interaction. The module is programmed in Java and takes a Data Flow Graph (DFG) as an IR for input. It then generates VHDL code that targets Altera Field Programmable Gate Arrays (FPGAs). It is possible to use separate components for each operation or to set a maximum number for each component, which leads to component reuse and reduces chip area use. The performance improvement of the generated code is measured against using only the processor's standard instruction set. © 2011 IEEE.

Keywords: data flow graphs

Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures

Asia and South Pacific Design Automation Conference

Authors: Satyajit Das; Kevin J. M. Martin; Philippe Coussy; Davide Rossi; Luca Benini (Department of Electrical, Electronic and Information Engineering, University of Bologna, Italy; Univ. Bretagne-Sud, Lorient, France; ETH Integrated Systems Laboratory, Zurich, Switzerland)

ISBN: (print) 9781509015597

In the approaching era of IoT, flexible and low-power accelerators have become essential to meet aggressive energy efficiency targets. During the last few decades, Coarse Grain Reconfigurable Arrays (CGRAs) have demonstrated high energy efficiency as accelerators, especially for high-performance streaming applications. While existing CGRAs mostly rely on partial and full predication techniques to support conditional branches, inefficient architecture and mapping support for handling control flow limits the use of CGRAs to accelerating only inner loop bodies or transformed loops specifically adapted to the target CGRA. This paper proposes a novel CGRA architecture with support for jump and conditional jump instructions and a lightweight global synchronization mechanism to enable complete Control Data Flow Graph (CDFG) mapping in an ultra-low-power environment. The architecture is coupled with a complete design flow that efficiently maps applications with heavy control flow starting from a generic C language description. The proposed mapping approach reduces the impact of the wasteful instruction issues of conventional predication approaches, providing average energy improvements of 1.44× and 1.6× when compared to state-of-the-art partial and full predication techniques, respectively. Moreover, the proposed method achieves an average speed-up of up to 21× and an energy improvement of up to 50.42× when executing applications with heavy control flow, with respect to sequential execution on a low-power embedded CPU, demonstrating its suitability for next-generation IoT applications.

Keywords: reconfigurable architectures; data flow graphs

A Highly Efficient and Comprehensive Image Processing Library for C++-based High-Level Synthesis

FSP 2017; Fourth International Workshop on FPGAs for Software Programmers

Authors: M. Akif Oezkan; Oliver Reiche; Frank Hannig; Juergen Teich (Friedrich-Alexander University Erlangen-Nurnberg (FAU))

ISBN: (print) 9783800744435

Field Programmable Gate Arrays (FPGAs) have proved to be among the most suitable architectures for image processing applications. However, accelerating algorithms using FPGAs is a time-consuming task that requires expertise. Whereas the recent advancements in High-Level Synthesis (HLS) promise to solve this problem, today's HLS tools require apt hardware descriptions of algorithms to be able to provide favorable implementations. One solution is to develop highly parameterizable and optimized HLS libraries for the fundamental image processing components. Another is to provide a higher level of abstraction in the form of a Domain-Specific Language (DSL) and a corresponding efficient back end for hardware design. In this paper, we provide a highly efficient and parameterizable C++ library for image processing applications, which would be the cornerstone for both approaches. In our library, nodes of a stream-based data flow graph can be described as C++ objects for specified functions, and the whole application can be efficiently parallelized just by defining a global constant as the parallelization factor. Moreover, the key hardware design elements, i.e., line buffers and sliding windows with different border handling patterns, can be utilized individually to ease the design of more complicated applications.

Keywords: level of abstraction computer hardware design Image processing hue lightness saturation Libraries high level synthesis Field programmable gate arrays data flow graphs Cornerstones Abstraction algorithms
