ISBN:
(Print) 9781538655641
In this paper, we evaluate a partitioning and placement technique for mapping concurrent applications onto a globally asynchronous locally synchronous (GALS) multi-core architecture designed for simulating a spiking neural network (SNN) in real time. We designed a task placement pipeline capable of analysing the network of neurons and producing a placement configuration that reduces communication between computational nodes. The neuron-to-core mapping problem is formalised as a two-phase problem: Partitioning and Placement. The Partitioning phase groups together the most strongly connected network components, maximising the number of self-connections within each identified group. For this purpose we used a multilevel k-way graph partitioning strategy to generate the network partitions. The Placement phase places the groups of neurons over the chip mesh, minimising the communication between computational nodes. For this step, we designed and evaluated the performance of three placement variants. Our results highlight the importance of using a partitioning algorithm for the SNN graph. Using the simulated annealing placement technique, we achieved a 19% increase in self-connections and a 29% improvement in the final overall post-placement synaptic elongation, compared to 22% obtained without partitioning.
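The simulated annealing placement variant can be sketched as follows. This is a minimal illustration, not the paper's implementation: the traffic-weighted Manhattan cost function, the mesh encoding, and the cooling schedule are all illustrative assumptions.

```python
import math
import random

def comm_cost(placement, traffic, mesh_w):
    """Total traffic-weighted Manhattan distance between cores on the mesh."""
    cost = 0
    for (a, b), w in traffic.items():
        ax, ay = placement[a] % mesh_w, placement[a] // mesh_w
        bx, by = placement[b] % mesh_w, placement[b] // mesh_w
        cost += w * (abs(ax - bx) + abs(ay - by))
    return cost

def anneal_placement(groups, traffic, mesh_w, mesh_h, steps=20000, seed=0):
    """Simulated annealing: swap the cores of two groups and keep the swap
    if it lowers cost, or with a temperature-dependent probability otherwise."""
    rng = random.Random(seed)
    cores = list(range(mesh_w * mesh_h))
    rng.shuffle(cores)
    placement = dict(zip(groups, cores))
    cost = comm_cost(placement, traffic, mesh_w)
    temp = 1.0
    for _ in range(steps):
        a, b = rng.sample(groups, 2)
        placement[a], placement[b] = placement[b], placement[a]
        new_cost = comm_cost(placement, traffic, mesh_w)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / max(temp, 1e-9)):
            cost = new_cost
        else:
            # reject: undo the swap
            placement[a], placement[b] = placement[b], placement[a]
        temp *= 0.9995
    return placement, cost
```

On a 2x2 mesh with four groups connected in a chain, the annealer converges to the obvious snake-shaped placement where every connected pair sits on adjacent cores.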
ISBN:
(Print) 9781538649756
In this paper we investigate the relation between an energy-efficiency model and the type of workload executed on modern embedded architectures. From the energy-efficiency model obtained in our previous work, we select a few configuration points to verify that the prediction in terms of relative energy efficiency holds across different workload scenarios. A configuration point is defined as a set of platform-tunable metrics, such as DVFS point, DPM level, and utilization rate. As workloads, we use a combination of synthetic generators and real-world applications from the embedded domain. In our experiments we use two different architectures, providing examples of real systems, to test the model's generality. First, we compare the efficiency achieved by the two architecturally different chips (ARM and Intel) at different configuration points and under different workload scenarios. Second, we explain the differing results in terms of the thermal management performed by the two chips. Finally, we show that the results from the two architectures converge only for workloads dominated by integer instructions, which demonstrates the need for a specific model trained with integer operations.
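The notion of a configuration point described above lends itself to a small sketch. The field names and the performance-per-watt efficiency definition below are assumptions for illustration; the paper's actual model is not reproduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConfigPoint:
    """A platform configuration point as defined in the abstract:
    a DVFS operating point, a DPM (idle/sleep) level, and a target
    utilization rate. Field names are illustrative."""
    dvfs_mhz: int
    dpm_level: int
    utilization: float

def relative_efficiency(perf, power_w):
    """Energy efficiency as performance per watt (a common definition;
    the paper's exact efficiency model may differ)."""
    return perf / power_w
```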
ISBN:
(Digital) 9781728132839
ISBN:
(Print) 9781728120522
Model-based and simulation-supported engineering based on the formalism of synchronous block diagrams is among the best practices in software development for embedded and real-time systems. As the complexity of such models and the associated computational demands for their simulation steadily increase, efficient execution strategies are needed. Although there is an inherent concurrency in most models, tools are not always capable of taking advantage of multi-core architectures of simulation host computers to simulate blocks in parallel. In this paper, we outline the conceptual obstacles in general and discuss them specifically for the widely used simulation environment Simulink. We present an execution mechanism that harnesses multi-core hosts for accelerating individual simulation runs through parallelization. The approach is based on a model transformation. It does not require any changes in the simulation engine, but introduces minimal data propagation delays in the simulated signal chains. We demonstrate its applicability in an automotive case study.
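The core idea of the model transformation, decoupling consecutive blocks with unit delays so they could execute concurrently within a step, can be illustrated with a minimal sketch. This is plain Python rather than Simulink, and the block functions are placeholders: each block in the transformed model reads only the previous step's output of its predecessor, which is exactly the data propagation delay the abstract mentions.

```python
def simulate_serial(blocks, u, steps):
    """Reference execution: blocks run in order within one simulation step."""
    trace = []
    for k in range(steps):
        x = u(k)
        for f in blocks:
            x = f(x)
        trace.append(x)
    return trace

def simulate_decoupled(blocks, u, steps):
    """Transformed model: a unit delay is inserted between consecutive
    blocks, so each block only needs the *previous* step's value of its
    predecessor and all blocks could run concurrently within a step."""
    n = len(blocks)
    state = [0] * n                      # delayed outputs, one per block
    trace = []
    for k in range(steps):
        inputs = [u(k)] + state[:-1]     # each block reads last step's data
        state = [f(x) for f, x in zip(blocks, inputs)]
        trace.append(state[-1])
    return trace
```

For stateless blocks, the decoupled trace equals the serial trace shifted by one step per inserted delay, which is the "minimal data propagation delay" trade-off.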
ISBN:
(Print) 9781538658789
The Unified Modeling Language (UML) has been widely adopted for modeling different sorts of applications. Despite offering several kinds of diagrams, UML was not designed for verifying the execution of real-time embedded systems with time and energy constraints. There are UML profiles that capture this information, but it is then necessary to rely on a separate validation framework. The main approach to filling this gap is to translate the UML models into representations such as Petri nets. However, existing works have little support for addressing energy and time constraints at the same time. This paper presents a technique for transforming UML sequence diagrams with energy and time constraints into timed Petri net models. These Petri net models are then used as input to software verification tools such as Tina and GTT.
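A minimal timed Petri net interpreter gives a feel for the kind of target model such a translation produces. The single-server firing semantics and data layout here are simplifying assumptions, not the paper's translation rules: each transition has input places, output places, and a firing delay, and tokens are consumed at once and produced after the delay.

```python
import heapq

def simulate_tpn(marking, transitions, horizon):
    """Minimal timed Petri net: transitions is a dict
    name -> (input_places, output_places, delay)."""
    marking = dict(marking)
    clock = 0.0
    pending = []          # (completion time, transition name, outputs)
    log = []              # firing log: (time, transition name)
    while clock <= horizon:
        fired = False
        for name, (ins, outs, delay) in transitions.items():
            if all(marking.get(p, 0) >= 1 for p in ins):
                for p in ins:                         # consume input tokens
                    marking[p] -= 1
                heapq.heappush(pending, (clock + delay, name, outs))
                fired = True
                break
        if not fired:
            if not pending:
                break                                 # deadlock: nothing left
            clock, name, outs = heapq.heappop(pending)
            for p in outs:                            # produce output tokens
                marking[p] = marking.get(p, 0) + 1
            log.append((clock, name))
    return marking, log
```

Time constraints from the sequence diagram would map onto the transition delays; energy annotations would need an additional cost attached to each firing.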
ISBN:
(Print) 9781728101828
The proceedings contain 13 papers. The topics discussed include: a metric for evaluating supercomputer performance in the era of extreme heterogeneity; evaluating SLURM simulator with real-machine SLURM and vice versa; automated instruction stream throughput prediction for Intel and AMD microarchitectures; deep learning at scale on NVIDIA V100 accelerators; algorithm selection of MPI collectives using machine learning techniques; miniVite: a graph analytics benchmarking tool for massively parallel systems; and improving MPI reduction performance for manycore architectures with OpenMP and data compression.
ISBN:
(Print) 9781538653012
Discrete manufacturing systems are complex cyber-physical systems (CPS), and their availability, performance, and quality have a big impact on the economy. Smart manufacturing promises to improve these aspects. One key approach being pursued in this context is the creation of centralized software-defined control (SDC) architectures and strategies that use diverse sensors and data sources to make manufacturing more adaptive, resilient, and programmable. In this paper, we present SDCWorks, a modeling and simulation framework for SDC. It consists of the semantic structures for creating models, a baseline controller, and an open-source implementation of a discrete event simulator for SDCWorks models. We provide the semantics of such a manufacturing system in terms of a discrete transition system, which sets up the platform for future research on a new class of problems in formal verification, synthesis, and monitoring. We illustrate the expressive power of SDCWorks by modeling the realistic SMART manufacturing testbed of the University of Michigan. We show how our open-source SDCWorks simulator can be used to evaluate relevant metrics (throughput, latency, and load) for example manufacturing systems.
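A serial production line, the simplest kind of system such a framework models, can be simulated with a short queueing recurrence that yields the latency and throughput metrics mentioned above. This sketch is not the SDCWorks simulator; station service times and the FIFO single-job-per-station discipline are illustrative assumptions.

```python
def simulate_line(stations, jobs, interarrival):
    """Discrete simulation of a serial manufacturing line: each station
    processes one job at a time; jobs flow from station to station in
    FIFO order. Returns per-job latency and the makespan (from which
    throughput = jobs / makespan)."""
    free_at = [0.0] * len(stations)   # when each station is next free
    latency = []
    makespan = 0.0
    for j in range(jobs):
        arrival = j * interarrival
        t = arrival
        for i, service in enumerate(stations):
            start = max(t, free_at[i])  # wait for the station to free up
            t = start + service
            free_at[i] = t
        latency.append(t - arrival)
        makespan = t
    return latency, makespan
```

With a 2.0-time-unit bottleneck station fed every 1.0 time unit, latency grows by one unit per job, showing queueing load building up at the bottleneck.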
ISBN:
(Digital) 9781728144849
ISBN:
(Print) 9781728144856
Vector extensions are a popular means to exploit data parallelism in applications. Over recent years, the most commonly used extensions have grown in vector length and in the number of vector instructions. However, code portability remains a problem in the context of a compute continuum. Hence, vector length agnostic (VLA) architectures have been proposed for future generations of ARM and RISC-V processors. With these architectures, code is vectorized independently of the vector length of the target hardware platform. It is therefore possible to tune software to a generic vector length. To understand the performance impact of VLA code compared to vector-length-specific code, we analyze the current capabilities of code generation for ARM's SVE architecture. Our experiments show that VLA code reaches about 90% of the performance of vector-length-specific code, i.e. a 10% overhead is incurred due to global predication of instructions. Furthermore, we show that code performance does not increase proportionally with increasing vector length, due to higher memory demands.
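The VLA loop shape, in which the final partial iteration is handled by a predicate mask instead of a scalar remainder loop, can be sketched in scalar Python. The mask mimics the behaviour of SVE's while-less-than predicate, and `vl` stands in for the hardware vector length the code never hardcodes; the saxpy kernel itself is just a convenient example.

```python
def vla_saxpy(a, x, y, vl):
    """Vector-length-agnostic loop shape for out = a*x + y: the same code
    works for any vector length vl, with the tail masked off by a
    predicate rather than peeled into a scalar remainder loop."""
    n = len(x)
    out = [0.0] * n
    i = 0
    while i < n:
        # predicate: lane j is active while i + j < n
        pred = [i + j < n for j in range(vl)]
        for j in range(vl):
            if pred[j]:                      # inactive lanes do nothing
                out[i + j] = a * x[i + j] + y[i + j]
        i += vl
    return out
```

Because the predicate absorbs the tail, the result is identical for any `vl`, which is precisely what lets one binary run across hardware with different vector lengths.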
Human supervisory control (HSC) is a widely used knowledge-based control scheme, in which human operators are in charge of planning and making high-level decisions for systems with embedded autonomy. With the variability of operators' behaviors in such systems, the stability of an operator modeling technique, i.e., that a modeling approach produces similar results across repeated applications, is critical to the extensibility and utility of such a model. Using an unmanned vehicle simulation testbed where such vehicles can be hacked, we compared two operator behavioral models from two different experiments using a hidden Markov modeling (HMM) approach. The resulting HMM models revealed operators' dominant strategies when conducting hacking detection tasks. The similarity between these two models was measured via multiple aspects, including model structure, state distribution, divergence distance, and co-emission probability distance. The similarity measure results demonstrate the stability of modeling human operators in HSC scenarios using HMM models. These results indicate that even when operators perform differently on specific tasks, such an approach can reliably detect whether strategies change across different experiments.
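One of the similarity aspects mentioned, divergence distance, can be sketched as a symmetrised KL divergence between corresponding state-transition rows of two models. This is only an illustrative instance of that style of comparison; the paper's exact metric, and its handling of emission probabilities, may differ.

```python
import math

def transition_divergence(A, B):
    """Symmetrised KL divergence averaged over matching transition rows
    of two models' transition matrices (rows must be proper probability
    distributions with matching state order)."""
    def kl(p, q):
        # KL(p || q); terms with p_i = 0 contribute nothing
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    n = len(A)
    return sum(kl(A[i], B[i]) + kl(B[i], A[i]) for i in range(n)) / (2 * n)
```

The distance is zero for identical models and grows as operators' dominant transition patterns diverge, which is how such a measure supports the stability claim above.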
An accurate prediction of scheduling and execution of instruction streams is a necessary prerequisite for predicting the in-core performance behavior of throughput-bound loop kernels on out-of-order processor architec...