Search results - Inner Mongolia University Library

Performance evaluation of neural network hardware using time-shared bus and integer representation architecture

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1996, Vol. E79D, No. 6, pp. 888-896

Authors: Yasunaga, M; Ochiai, T (Institute of Information Science and Electronics, University of Tsukuba, Tsukuba-shi 305, Japan)

Neural network hardware using time-shared bus and integer representation architecture has already been fabricated and reported from the design viewpoint. However, nothing related to performance evaluation of the hardware has yet been presented. Computation speed, scalability and learning accuracy of the hardware are evaluated theoretically and experimentally using a Back Propagation (BP) algorithm. In addition, a mirror-weight assignment technique is proposed for high-speed computation in the BP. NETTalk, an English-pronunciation-reasoning task, has been chosen as the target application for the BP. In the experiment, recently developed neuro-hardware based on the above architecture and its parallel programming language are used. An outline of the language is described along with BP programming. Mirror-weight assignment allows a maximum speed of 55.0 MCUPS (Million Connections Updated Per Second) using 256 neurons in the hidden layer (the numbers of neurons in the input and output layers are fixed at 203 and 26 respectively in NETTalk). In addition, if scalability is defined as a function of the number of neurons in the hidden layer, the machine retains high scalability at 0.5 if such a maximum speed needs to be used. No degradation in learning accuracy occurs when experimental results computed using the neuro-hardware are compared with those obtained by floating-point representation architecture (workstation). The experiment indicates that the present integer representational design of the neuro-hardware is sufficient for NETTalk. Performance has also been evaluated theoretically. For evaluation purposes, it is assumed that most of the execution time is taken up by bus cycles. On the basis of this assumption, an analytical model of computation speed and scalability is proposed. Analytical predictions agreed well with experimental results.
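To put the MCUPS figure in perspective, the sketch below counts the weights in a three-layer network with the layer sizes quoted in the abstract (203, 256, 26) and converts 55.0 MCUPS into training patterns per second. It assumes full connectivity between adjacent layers and ignores bias weights; neither detail is stated in the abstract, and the function names are illustrative only.

```python
def connection_count(n_input: int, n_hidden: int, n_output: int) -> int:
    """Weights in a fully connected 3-layer network.

    Assumption: full connectivity between adjacent layers, no bias weights.
    """
    return n_input * n_hidden + n_hidden * n_output


def patterns_per_second(mcups: float, connections: int) -> float:
    """Training patterns per second if every connection is updated once per pattern."""
    return mcups * 1_000_000 / connections


# Layer sizes and speed taken from the abstract.
nettalk_connections = connection_count(203, 256, 26)
nettalk_rate = patterns_per_second(55.0, nettalk_connections)

print(nettalk_connections)      # 58624
print(round(nettalk_rate, 1))   # roughly 938 patterns per second
```

Under these assumptions, one BP update touches 58,624 connections, so 55.0 MCUPS corresponds to a little under a thousand pattern updates per second.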

Keywords: neural networks; parallel computing; parallel programming language; performance evaluation; scalability


Probabilistic guards: A mechanism for increasing the granularity of work-stealing programs


PARALLEL COMPUTING, 2019, Vol. 82, pp. 19-36

Authors: Yoritaka, Hiroshi; Matsui, Ken; Yasugi, Masahiro; Hiraishi, Tasuku; Umatani, Seiji (Kyushu Inst Technol, Iizuka, Fukuoka 8208502, Japan; Kyoto Univ, Kyoto 6068501, Japan; Ad Sol Nissin Corp, Tokyo, Japan; Nintendo Co Ltd, Kyoto, Japan)

We propose probabilistic guards and analyze their performance. To reduce the total task division cost, probabilistic guards can prevent thief workers from stealing small tasks from victim workers probabilistically. In this study, we have implemented probabilistic guards on a work-stealing framework called Tascell. Without an upper limit to the number of repeated probabilistically prevented steal attempts, a thief may repeat an unbounded number of probabilistically prevented steal attempts until success if a victim uses a probabilistic guard that rejects steal attempts with a non-zero probability. We measured the actual numbers of repeated attempts until success, and evaluated the performance of probabilistic guards with various upper limits. In this paper, we also propose virtual probabilistic guards that act as probabilistic guards without repeating probabilistically prevented steal attempts. Virtual probabilistic guards exhibit superior performance compared to probabilistic guards. Our evaluation is based on parallelized "highly serial" force calculation in a shared memory environment and five Tascell programs in a distributed memory environment. (C) 2018 Elsevier B.V. All rights reserved.
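The retry behaviour described in the abstract can be illustrated with a small simulation. This is only a sketch, not Tascell code: it assumes each steal attempt is rejected independently with a fixed probability `reject_prob`, and that a thief simply stops after `max_attempts` tries. What the paper actually does at the upper limit is not stated in the abstract, and all names here are hypothetical.

```python
import random


def steal_with_guard(reject_prob: float, max_attempts: int, rng: random.Random):
    """Simulate one thief; return (attempts_made, succeeded).

    Assumption: each attempt is rejected independently with reject_prob.
    """
    for attempt in range(1, max_attempts + 1):
        if rng.random() >= reject_prob:  # the guard lets this attempt through
            return attempt, True
    return max_attempts, False           # upper limit reached without success


def mean_attempts(reject_prob: float, max_attempts: int,
                  trials: int = 100_000, seed: int = 0) -> float:
    """Average number of attempts per thief over many simulated steals."""
    rng = random.Random(seed)
    total = sum(steal_with_guard(reject_prob, max_attempts, rng)[0]
                for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for limit in (1, 2, 4, 1000):
        print(limit, round(mean_attempts(0.5, limit), 2))
```

With a very large limit the attempt count follows a geometric distribution, so its mean approaches 1 / (1 - reject_prob); tightening the limit caps the number of repeated attempts at the price of more failed steals.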

Keywords: parallel programming language; Work stealing; Probability; Concurrency; Many-core; Barnes-Hut algorithm


Rendezvous Facilities in a Distributed Computer System


Journal of Computer Science & Technology, 1995, Vol. 10, No. 2, pp. 188-192

Authors: 廖先湜; 金兰 (Department of Computer Science, Tsinghua University, Beijing 100084; Department of Computer Science, CaliforniaSt)

The distributed computer system described in this paper is a set of computer nodes interconnected in an interconnection network via packet-switching *** nodes communicate with each other by means of message-passing protocols. This paper presents the implementation of rendezvous facilities as high-level primitives provided by a parallel programming language to support interprocess communication and synchronisation.

Keywords: Rendezvous; packet-switching interface; message-passing protocols; interprocess communication and synchronization; high-level primitive; parallel programming language; interconnection network


A parallel application programming and processing environment proposal for grid computing


15th IEEE International Conference on Computational Science and Engineering (CSE) / 10th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC)

Authors: Gomes Junior, Augusto Mendes; Sato, Liria Matsumoto; Massetto, Francisco Isidro (Anhembi Morumbi Univ, Engn & Technol Sch, Sao Paulo, Brazil; Univ Sao Paulo, Dept Comp & Digital Syst Engn, Sao Paulo, Brazil; Fed Univ ABC, Dept Comp Sci, Santo Andre, Brazil)

ISBN: (print) 9781467351652; 9780769549149

The execution of parallel applications using grid computing requires an environment that enables them to be executed, managed, scheduled and monitored. The execution environment must provide a processing model, consisting of programming and execution models, with the objective of appropriately exploiting grid computing characteristics. This paper proposes a parallel processing model, based on shared variables for grid computing, consisting of an execution model that is appropriate for the grid and a CPAR parallel language programming model. The environment is designed to execute parallel applications in grid computing, where all the characteristics present in grid computing are transparent to users. The results show that this environment is an efficient solution for the execution of parallel applications.

Keywords: Distributed systems; Grid computing; High performance computing; parallel programming language


Implementation of the EARTH programming model on SMP clusters: a multi-threaded language and runtime system


CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2003, Vol. 15, No. 9, pp. 821-844

Authors: Tremblay, G; Morrone, CJ; Amaral, JN; Gao, GR (Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada; Univ Delaware, Dept Elect & Comp Engn, Comp Architect & Parallel Syst Lab, Newark, DE, USA; Univ Quebec, Dept Informat, Montreal, PQ, Canada)

This paper describes the design and implementation of an Efficient Architecture for Running THreads (EARTH) runtime system for a multi-processor/multi-node cluster. The EARTH model was designed to support the efficient execution of parallel (multi-threaded) programs with irregular fine-grain parallelism using off-the-shelf computers. Implementing an EARTH runtime system requires an explicitly threaded runtime system. For portability, we built this runtime system on top of Pthreads under Linux and used sockets for inter-node communication. Moreover, in order to make the best use of the resources available on a cluster of symmetric multi-processors (SMP), this implementation enables the overlapping of communication and computation. We used Threaded-C, a language designed to implement the programming model supported by the EARTH architecture. This language allows the expression of various levels of parallelism and provides the primitives needed to manage the required communication and synchronization. The Threaded-C programming language supports irregular fine-grain parallelism through a two-level hierarchy of threads and fibers. It also provides various synchronization and communication constructs that reflect the nature of EARTH's fibers (non-preemptive execution with data-driven scheduling) as well as the extensive use of split-phase transactions on EARTH to execute long-latency operations. Copyright (C) 2003 John Wiley & Sons, Ltd.

Keywords: multi-threading; cluster computing; parallel programming language


Managing distributed shared arrays in a bulk-synchronous parallel programming environment


CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2004, Vol. 16, No. 2-3, pp. 133-153

Authors: Kessler, CW (Linkoping Univ, Inst Datavetenskap, PELAB, Dept Comp Sci, S-58183 Linkoping, Sweden)

NestStep is a parallel programming language for the BSP (bulk-synchronous parallel) programming model. In this article we describe the concept of distributed shared arrays in NestStep and its implementation on top of MPI. In particular, we present a novel method for runtime scheduling of irregular, direct remote accesses to sections of distributed shared arrays. Our method, which is fully parallelized, uses conventional two-sided message passing and thus avoids the overhead of a standard implementation of direct remote memory access based on one-sided communication. The main prerequisite is that the given program is structured in a BSP-compliant way. Copyright (C) 2004 John Wiley & Sons, Ltd.

Keywords: NestStep; BSP model; bulk synchronous parallelism; parallel programming language; distributed shared array; runtime scheduling of communication


Software to silicon [hardware compilation]


IEE Review, 2000, Vol. 46, No. 5, pp. 15-19

Authors: I. Page; R. Dettmer

The principles of hardware compilation could be set to rewrite the rule book of silicon design. This article describes Handel-C, a parallel programming language that is providing programmers with a route to FPGA-based VLSI design, not by offering them a familiar programming environment combined with access to the parallel constructs familiar to the hardware designer, but by expressing the parallelism at a totally different level of abstraction; it doesn't describe the hardware like an HDL does - it describes the computation.

Keywords: very high level language; hardware-software codesign; circuit layout CAD; computation description; firmware; parallel programming language; silicon design; FPGA-based VLSI design; Handel-C; field programmable gate arrays; parallel programming; computer-aided circuit analysis and design; abstraction level; hardware description languages; high level languages; electronic engineering computing; parallel languages; hardware compilation
