Search Results - Inner Mongolia University Library

An Improved Parallel Prefix Computation on 2D-Mesh Network

Procedia Technology, 2013, vol. 10, pp. 919-926

Author: Sudhanshu Kumar Jha (Department of Computer Applications, National Institute of Technology, Jamshedpur - 831 014, India)

Parallel prefix is an important technique that has been widely accepted in many areas of scientific and engineering research. In this paper we propose an improved parallel prefix computation algorithm on an n × n mesh network that requires 2n + 5 time. Our proposed algorithm can be compared with the traditional parallel prefix algorithm, which requires 3n + 2 time on the same architecture.

Keywords: prefix computation; parallel prefix computation; modified prefix; 2D-mesh network
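
For orientation, the following is a minimal sketch of the three-phase, row-major prefix scheme commonly described for a 2D mesh, simulated sequentially in Python. It is not the improved algorithm proposed in the paper, and it does not model the step counts quoted in the abstract; the function name `mesh_prefix` and the sample grid are illustrative assumptions.

```python
from itertools import accumulate
import operator

def mesh_prefix(grid, op=operator.add):
    """Row-major inclusive prefix over an n x n grid (hypothetical helper).

    The three phases mirror the usual mesh scheme, but run sequentially here.
    """
    # Phase 1: every row computes its own prefix, left to right.
    rows = [list(accumulate(row, op)) for row in grid]
    # Phase 2: the last column holds the row totals; scan them top to bottom.
    totals = list(accumulate((row[-1] for row in rows), op))
    # Phase 3: each row except the first combines the total of all rows
    # above it into every one of its elements.
    result = [rows[0]]
    for i in range(1, len(rows)):
        carry = totals[i - 1]
        result.append([op(carry, x) for x in rows[i]])
    return result

grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(mesh_prefix(grid))  # [[1, 3, 6], [10, 15, 21], [28, 36, 45]]
```

On a real mesh, phases 1 and 3 would run in all rows simultaneously, which is where the linear-in-n step counts come from.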

Energy efficient parallel and distributed simulation

Author: Biswas, Aradhya (Georgia Institute of Technology)

Degree level: Doctoral

New challenges and opportunities emerge as computing interacts with our surroundings in unprecedented ways. One of these challenges is the energy consumed by computations and communications. In large cloud-based computing systems, it is a major concern because it forms the largest proportion of the environmental and operational costs of data centers. In mobile systems, it directly impacts battery life. This work focuses on understanding and reducing power and energy consumption of the parallel and distributed execution of discrete event simulations, an area not extensively studied in the past. We first empirically characterize the energy consumption of widely used synchronization algorithms. Then a model and techniques are presented and exercised to create an energy profile of a distributed simulation system. These demonstrate that distributed execution and synchronization can incur a significant energy and power overhead. To study and optimize the energy required for distributed execution, a property termed zero-energy synchronization is proposed. A zero-energy synchronization algorithm based on an oracle is presented, and a practical implementation is discussed. A more generic synchronization algorithm termed Low Energy YAWNS (LEY) is also proposed. LEY represents the first attempt to design a synchronization algorithm for energy efficiency and, in principle, can achieve zero-energy synchronization for a large class of distributed simulation applications. To employ the energy efficiency of specialized computing hardware platforms, recurrence relations for simulating G/G/1 queueing networks, directly implementable using library primitives, are proposed. In addition to the optimizations and scalability they offer, the use of library primitives eases development and opens up avenues for adapting the simulation for custom hardware. Composition of parallel prefix scans further improves the energy efficiency of the proposed recurrences and similar sequences of parallel prefix sca

Keywords: Energy efficiency; Parallel computing; Distributed computing; Parallel and distributed simulation; Discrete event simulation; Energy profiling; Performance measurement; Synchronization algorithm; Dynamic data driven application system; Edge computing; Middleware; Queuing network simulation; Data parallel simulation; Parallel prefix computation

Generic Functional Parallel Algorithms: Scan and FFT

Proceedings of the ACM on Programming Languages (PACMPL), 2017, vol. 1, issue ICFP, pp. 1-25

Author: Elliott, Conal (Target, Minneapolis, MN 55403, USA)

Parallel programming, whether imperative or functional, has long focused on arrays as the central data type. Meanwhile, typed functional programming has explored a variety of data types, including lists and various forms of trees. Generic functional programming decomposes these data types into a small set of fundamental building blocks: sum, product, composition, and their associated identities. Definitions over these few fundamental type constructions then automatically assemble into algorithms for an infinite variety of data types, some familiar and some new. This paper presents generic functional formulations for two important and well-known classes of parallel algorithms: parallel scan (generalized prefix sum) and fast Fourier transform (FFT). Notably, arrays play no role in these formulations. Consequent benefits include a simpler and more compositional style, much use of common algebraic patterns, and freedom from the possibility of run-time indexing errors. The functional generic style also clearly reveals deep commonality among what otherwise appear to be quite different algorithms. Instantiating the generic formulations, two well-known algorithms for each of parallel scan and FFT naturally emerge, as well as two possibly new algorithms.

Keywords: generic programming; parallel prefix computation; fast Fourier transform
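
As a rough illustration of scanning without arrays or indices, here is a small Python sketch of an exclusive left scan over a binary tree built from nested pairs. It only mimics the "product" case of the generic formulation (scan both halves, then shift the right half by the left total); the paper itself works in a typed functional setting, and the names `scan` and `shift` are assumptions made for this sketch.

```python
import operator

def scan(tree, op=operator.add, zero=0):
    """Exclusive left scan of a binary tree given as nested 2-tuples.

    Returns (scanned_tree, total). Leaves are any non-tuple values.
    """
    if not isinstance(tree, tuple):
        # A leaf: nothing precedes it, and its total is itself.
        return zero, tree
    left, right = tree
    # The two halves are independent, so they could be scanned in parallel.
    left_scanned, left_total = scan(left, op, zero)
    right_scanned, right_total = scan(right, op, zero)
    # Shift every prefix in the right half by the total of the left half.
    shifted = shift(right_scanned, left_total, op)
    return (left_scanned, shifted), op(left_total, right_total)

def shift(tree, offset, op):
    """Combine `offset` on the left of every leaf in `tree`."""
    if not isinstance(tree, tuple):
        return op(offset, tree)
    return tuple(shift(child, offset, op) for child in tree)

print(scan(((1, 2), (3, 4))))  # ((((0, 1), (3, 6))), 10) i.e. prefixes 0, 1, 3, 6 and total 10
```

Because the recursion follows the shape of the data, no index arithmetic appears anywhere, which is the property the abstract highlights.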

Parallel Prefix Computation in the Recursive Dual-Net

10th International Conference on Algorithms and Architectures for Parallel Processing

Authors: Li, Yamin; Peng, Shietung; Chu, Wanming (Hosei Univ, Dept Comp Sci, Tokyo 1848584, Japan; Univ Aizu, Dept Comp Hardware, Aizu Wakamatsu, Fukushima, Japan)

ISBN: (print) 9783642131189

In this paper, we propose an efficient algorithm for parallel prefix computation in the recursive dual-net, a newly proposed network. The recursive dual-net $RDN^k(B)$ for $k > 0$ has $(2n_0)^{2^k}/2$ nodes and $d_0 + k$ links per node, where $n_0$ and $d_0$ are the number of nodes and the node degree of the base network $B$, respectively. Assuming that each node holds one data item, the communication and computation time complexities of the algorithm for parallel prefix computation in $RDN^k(B)$, $k > 0$, are $2^{k+1} - 2 + 2^k \cdot T_{comm}(0)$ and $2^{k+1} - 2 + 2^k \cdot T_{comp}(0)$, respectively, where $T_{comm}(0)$ and $T_{comp}(0)$ are the communication and computation time complexities of the algorithm for parallel prefix computation in the base network $B$, respectively.

Keywords: Interconnection networks; algorithm; parallel prefix computation

Monte Carlo Simulation on GPGPU using prefix computation method 1

IEEE International Conference on Electrical, Computer and Communication Technologies

Authors: Babu, P. Ravi; Shyamala, K.; Rao, K. Srinivasa (Univ Hyderabad, Sch Phys, Hyderabad, Andhra Pradesh, India; Osmania Univ, UCE, Dept CSE, Hyderabad, Andhra Pradesh, India; TRR Coll Engn, Dept ECE, Medak, India)

ISBN: (print) 9781479960859

Random probability estimation is one of the computationally intensive factors in Monte Carlo simulation. This paper presents a parallel implementation of random probability estimation for a Monte Carlo simulation. Parallel prefix computation is used to accelerate the parallel formulation of random probability estimation. The proposed work is implemented using the C++ AMP (Accelerated Massive Parallelism) programming language and tested on General Purpose computation on Graphics Processing Unit (GPGPU). The experimental results show that the average speedup achieved by the GPU-based implementation is 29.61% when compared to a sequential implementation of random probability estimation. The performance of the proposed work is also evaluated and compared with actual American option pricing values.

Keywords: C++ AMP; GPGPU; Monte Carlo simulation; parallel prefix computation; random probability estimation

Reconfigurable hardware solution to parallel prefix computation

Journal of Supercomputing, 2008, vol. 43, no. 1, pp. 43-58

Authors: Park, Jin Hwan; Dai, H. K. (SUNY Albany, Dept Comp Sci, New Paltz, NY 12561, USA; Oklahoma State Univ, Dept Comp Sci, Stillwater, OK 74078, USA)

This paper presents the design and implementation of efficient reconfigurable parallel prefix computation hardware on field-programmable gate arrays (FPGAs). The design is based on a pipelined dataflow algorithm, and control logic is added to reconfigure the system for an arbitrary degree of parallelism. The system receives multiple input streams of elements in parallel and produces output streams in parallel. It has the advantage of controlling the degree of parallelism explicitly at run time. The time complexity of the design is O(d+(N-d)/d), where d and N are the degree of parallelism and the stream size, respectively. When the stream size is sufficiently larger than the initial trigger time of the pipeline (d), the time complexity becomes O(N/d). Unlike the prefix computation circuits found in the literature, the design is scalable to different problem sizes, including data of unknown size. The design is modular, based on a finite state machine, and was implemented and tested for the target FPGA devices Xilinx Spartan2S XC2S300EFT256-6Q and XC2S600EFG676-6.

Keywords: parallel prefix computation; reconfigurable hardware; field-programmable gate arrays; dataflow; pipeline

Much ado about two (pearl): A pearl on parallel prefix computation

ACM SIGPLAN Notices, 2008, vol. 43, no. 1, pp. 29-35

Author: Voigtlaender, Janis (Tech Univ Dresden, Inst Theoret Informat, D-01062 Dresden, Germany)

This pearl develops a statement about parallel prefix computation in the spirit of Knuth's 0-1-Principle for oblivious sorting algorithms. It turns out that 0-1 is not quite enough here. The perfect hammer for th...

Keywords: algorithms; languages; verification; 0-1-principle; free theorems; parallel prefix computation; relational parametricity

Much Ado about Two (Pearl): A Pearl on Parallel Prefix Computation

35th ACM-SIGPLAN-SIGACT Symposium on Principles of Programming Languages

Author: Voigtlaender, Janis (Tech Univ Dresden, Inst Theoret Informat, D-01062 Dresden, Germany)

ISBN: (print) 9781595936899

This pearl develops a statement about parallel prefix computation in the spirit of Knuth's 0-1-Principle for oblivious sorting algorithms. It turns out that 0-1 is not quite enough here. The perfect hammer for the nails we are going to drive in is relational parametricity.

Keywords: 0-1-principle; free theorems; parallel prefix computation; relational parametricity

Hybrid Han-Carlson adder

2012 IEEE 55th International Midwest Symposium on Circuits and Systems, MWSCAS 2012

Authors: Muthyala Sudhakar, Sreenivaas; Chidambaram, Kumar P.; Swartzlander Jr., Earl E. (Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712, United States)

ISBN: (print) 9781467325264

This paper explores a variation of the Han-Carlson adder for large word sizes and compares the performance of the new design with the traditional design. This work introduces a second type of design with two Brent-Kung stages each at the beginning and at the end and with Kogge-Stone stages in the middle, henceforth referred to as the "Hybrid Han-Carlson design." With the new design, the Hybrid Han-Carlson adder, the delay increases slightly, but the complexity, silicon area and power are reduced significantly. © 2012 IEEE.

Keywords: area; complexity; delay; Han-Carlson adders; hybrid Han-Carlson adders; parallel prefix computation; power
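
To make the link between adders and prefix computation concrete, the sketch below simulates the carry computation of a plain Kogge-Stone prefix network in Python. It is not the hybrid Han-Carlson design from the paper (no Brent-Kung stages are modelled), and the function name `kogge_stone_add` and the default 8-bit width are assumptions chosen for illustration.

```python
def kogge_stone_add(a, b, width=8):
    """Add two unsigned integers with a simulated Kogge-Stone prefix network.

    Returns the sum modulo 2**width (the final carry-out is dropped).
    """
    # Bitwise generate and propagate signals, least significant bit first.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    half_sum = p[:]  # keep the bitwise XOR; p is overwritten below
    dist = 1
    while dist < width:
        # One prefix level: combine each position with the one `dist` below.
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
    # g[i] is now the carry out of bit i; the carry into bit i is g[i - 1].
    total = 0
    for i in range(width):
        carry_in = g[i - 1] if i > 0 else 0
        total |= (half_sum[i] ^ carry_in) << i
    return total

print(kogge_stone_add(100, 27))  # 127
```

Each pass of the `while` loop corresponds to one logic level of the network, so the number of levels grows with the logarithm of the word size; hybrid designs trade some of that depth for fewer cells and wires.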

Arithmetic coding in parallel

International Journal of Foundations of Computer Science, 2005, vol. 16, no. 6, pp. 1207-1217

Authors: Supol, J; Melichar, B (Czech Tech Univ, Fac Elect Engn, Dept Comp Sci & Engn, Prague 12135 2, Czech Republic)

We present an EREW PRAM cost-optimal parallel algorithm for arithmetic coding computation. We solve the problem in O(log n) time using n/log n processors. Each part of the algorithm as well as a well-known parallel pr...

Keywords: Arithmetic coding; EREW PRAM; NC algorithm; parallel prefix computation; parallel text compression
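
The processor count n/log n in the abstract matches the usual blocking scheme for cost-optimal prefix computation, which can be sketched as follows. This is a sequential Python simulation of that generic building block, not the arithmetic coding algorithm of the paper; the name `blocked_prefix` and the choice of block size are assumptions.

```python
from itertools import accumulate
import math
import operator

def blocked_prefix(values, op=operator.add):
    """Inclusive prefix using about n / log2(n) blocks of about log2(n) items."""
    n = len(values)
    if n == 0:
        return []
    block = max(1, int(math.log2(n)))
    chunks = [values[i:i + block] for i in range(0, n, block)]
    # Step 1: each (conceptual) processor scans its own block sequentially.
    local = [list(accumulate(chunk, op)) for chunk in chunks]
    # Step 2: prefix over the block totals; on a PRAM this would be a
    # logarithmic-depth parallel scan, here it is a plain sequential pass.
    totals = list(accumulate((part[-1] for part in local), op))
    # Step 3: each block folds in the total of all preceding blocks.
    out = list(local[0])
    for k in range(1, len(local)):
        out.extend(op(totals[k - 1], x) for x in local[k])
    return out

print(blocked_prefix([1, 2, 3, 4, 5, 6, 7, 8]))  # [1, 3, 6, 10, 15, 21, 28, 36]
```

With blocks of logarithmic size, steps 1 and 3 each take O(log n) time per processor and step 2 scans only n/log n totals, which is how the overall work stays linear.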
