Search results - Inner Mongolia University Library

5th International Conference on Embedded Software and Systems

Authors: Khanfir, Sami; Jemni, Mohamed (Ecole Super Sci & Tech Tunis Res Unit Technol Informat & Commun Tunis Tunisia)

ISBN: (print) 9780769532875

A novel fast scheme for the Discrete Wavelet Transform (DWT) was recently introduced under the name of the lifting scheme [4, 10]. This new scheme presents many advantages over the convolution-based approach [10, 11]. For instance, it is very suitable for parallelization. In this paper we present two new FPGA-based parallel implementations of the lifting-based DWT scheme. The first implementation uses pipelining, parallel processing and data reuse to increase the speedup of the algorithm. In the second architecture, a controller is introduced to dynamically deploy a suitable number of clones according to the hardware resources available in a targeted environment. These two architectures are capable of processing large incoming images or multi-frame images in real time. Simulations run on a Xilinx Virtex-5 FPGA environment have proven the practical efficiency of our contribution. In fact, the first architecture achieved an operating frequency of 289 MHz, and the second architecture demonstrated the controller's capability of determining the truly available resources needed for a successful deployment of independent clones over a targeted FPGA environment and of processing the task in parallel.

Keywords: Field programmable gate arrays (FPGA)
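
To make the lifting idea in the abstract above concrete, here is a minimal software sketch of one lifting step (split, predict, update) and its inverse. It uses the simple Haar wavelet as an assumption; the record does not say which wavelet filter or word length the FPGA designs use, and the function names are illustrative only.

```python
def lifting_forward(signal):
    """One level of a Haar-style lifting DWT on an even-length sequence."""
    even = signal[0::2]                                   # split: even samples
    odd = signal[1::2]                                    # split: odd samples
    detail = [o - e for e, o in zip(even, odd)]           # predict odd from even
    approx = [e + d / 2 for e, d in zip(even, detail)]    # update: pairwise mean
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order to recover the input."""
    even = [s - d / 2 for s, d in zip(approx, detail)]    # undo update
    odd = [d + e for e, d in zip(even, detail)]           # undo predict
    signal = []
    for e, o in zip(even, odd):                           # merge even/odd samples
        signal.extend([e, o])
    return signal


samples = [2, 4, 6, 8, 10, 12, 14, 16]
approx, detail = lifting_forward(samples)
restored = lifting_inverse(approx, detail)
```

Each output pair depends only on one input pair, which is one reason the scheme maps naturally onto pipelined or replicated hardware units.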

Design aspects of multi-level reconfigurable architectures

Journal of Signal Processing Systems for Signal Image and Video Technology, 2008, Vol. 51, No. 1, pp. 23-37

Authors: Lange, Sebastian; Middendorf, Martin (Univ Leipzig Dept Comp Sci Parallel Comp & Complex Syst Grp D-04009 Leipzig Germany)

Dynamically reconfigurable hardware has already been deployed for accelerating computationally demanding applications. Some of these hardware architectures allow run-time reconfiguration, but this usually leads to a large reconfiguration overhead. The advantage of run-time reconfiguration is that it allows new algorithmic solutions for many applications. To study the potential of frequent run-time reconfiguration, it is interesting to investigate its costs and benefits from an abstract point of view and to develop new architectural concepts. Multi-level reconfigurable architectures are one such concept that introduces several levels of reconfiguration. This paper deals with new types of multi-level reconfigurable architectures. The corresponding problem of finding the best granularity for different reconfiguration levels is formulated and investigated. Although this problem is shown to be NP-complete, an interesting restricted subcase is solved optimally in polynomial time. For the general case, a good heuristic is proposed that is based on solutions for the restricted case. Results on three example applications show that the reconfiguration cost can be reduced with the new architectures. Based on a proposed measure of relative efficiency, it is also shown that the new architectures are more efficient, in that they obtain a larger reconfiguration cost reduction with less additional hardware.

Keywords: reconfigurable architecture; dynamic reconfiguration; multi-level reconfiguration

A PARALLEL ARCHITECTURAL IMPLEMENTATION OF THE FAST THREE STEP SEARCH ALGORITHM FOR BLOCK MOTION ESTIMATION

5th International Multi-Conference on Systems, Signals and Devices

Authors: Srinivasarao, B. K. N.; Chakrabarti, Indrajit (Indian Inst Technol Dept Elect & Elect Commun Engn Kharagpur 721302 W Bengal India)

ISBN: (print) 9781424422050

This paper proposes a parallel architecture for a fast three step search (FTSS) algorithm, which is used in motion estimation. The FTSS algorithm involves a reduced number of search points and is thus less computationally expensive than the standard three step search (TSS) algorithm. The degradation of performance when applying the FTSS algorithm to several standard images has been shown to be insignificant compared to the standard TSS algorithm. The proposed architecture uses only three processing elements, accompanied by intelligent data arrangement and memory configuration. A technique for reducing external memory accesses has also been developed. The proposed architecture for FTSS provides an efficient solution for applications requiring real-time motion estimation, because it requires smaller area and less power than would be required to implement TSS. The proposed architecture provides a solution for low bit-rate video applications such as video telephony and teleconferencing.

Keywords: parallel architectures; teleconferencing; video processing
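
For orientation, the standard three step search (TSS) that the abstract above uses as its baseline can be sketched in software as follows. This is the classic TSS with a sum-of-absolute-differences (SAD) cost, not the paper's FTSS variant or its hardware design; the block size, initial step and function names are assumptions made for illustration.

```python
def sad(cur, ref, by, bx, dy, dx, size):
    """Sum of absolute differences between a block of `cur` and a shifted block of `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def three_step_search(cur, ref, by, bx, size=8, step=4):
    """Return ((dy, dx), cost) found by the classic three step search."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, by, bx, 0, 0, size)
    while step >= 1:
        cy, cx = best                      # centre stays fixed within one step
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                vy, vx = cy + dy, cx + dx
                inside = (0 <= by + vy and by + vy + size <= height
                          and 0 <= bx + vx and bx + vx + size <= width)
                if not inside:
                    continue               # skip candidates outside the frame
                cost = sad(cur, ref, by, bx, vy, vx, size)
                if cost < best_cost:
                    best_cost, best = cost, (vy, vx)
        step //= 2                         # step sizes 4, 2, 1
    return best, best_cost


# Synthetic frames: `cur` is `ref` displaced so the true vector is (4, -4).
ref = [[x * x + 3 * y * y + x * y for x in range(32)] for y in range(32)]
cur = [[ref[(y + 4) % 32][(x - 4) % 32] for x in range(32)] for y in range(32)]
vector, cost = three_step_search(cur, ref, by=12, bx=12)
```

The nine candidate SADs within one step are independent of each other, which is the kind of work a small number of processing elements can share.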

Parallel processing Puzzle N²-1 on cluster architectures performance analysis

30th International Conference on Information Technology Interfaces

Authors: Sanz, Victoria; de Giusti, Armando; Chichizola, Franco; Naiouf, Marcelo; De Giusti, Laura (Instituto de Investigación en Informática (III-LIDI) School of Computer Sciences UNLP)

ISBN: (print) 9789537138127

An analysis of a parallel solution of the N²-1 Puzzle using clusters is presented. This problem is interesting due to its complexity and related applications, particularly in the field of robotics. A variation of classic heuristics for forecasting the work to be done in order to reach a solution is analyzed, and it is shown that its use significantly improves the running time of the sequential A* algorithm. Then, a parallel solution on a distributed architecture is presented, and speedup is analyzed based on the number of processors, efficiency, and the possible superlinearity when scaling the problem.

Keywords: parallel algorithms; distributed processing; speedup; superlinearity; efficiency; scalability
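
As background for the abstract above, a sequential A* solver for the N²-1 puzzle can be sketched as follows. It uses the classic Manhattan-distance heuristic, not the variation the paper analyzes, and it is not the cluster implementation; the function names and the goal layout (tiles in order, blank last) are assumptions.

```python
import heapq


def manhattan(state, n):
    """Sum of tile distances to their goal cells (blank tile 0 is ignored)."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal = tile - 1
        total += abs(idx // n - goal // n) + abs(idx % n - goal % n)
    return total


def neighbors(state, n):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, n)
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        r, c = row + dr, col + dc
        if 0 <= r < n and 0 <= c < n:
            nxt = list(state)
            nxt[blank], nxt[r * n + c] = nxt[r * n + c], nxt[blank]
            yield tuple(nxt)


def astar_moves(start, n):
    """Return the minimum number of moves to the goal, or None if unreachable."""
    goal = tuple(list(range(1, n * n)) + [0])
    best_g = {start: 0}
    frontier = [(manhattan(start, n), 0, start)]          # (f, g, state)
    while frontier:
        _, g, state = heapq.heappop(frontier)
        if state == goal:
            return g
        if g > best_g[state]:
            continue                                      # stale queue entry
        for nxt in neighbors(state, n):
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + manhattan(nxt, n), g + 1, nxt))
    return None


moves = astar_moves((1, 2, 3, 4, 5, 6, 0, 7, 8), 3)
```

Speedup on p processors is conventionally the sequential time divided by the parallel time, and efficiency is speedup divided by p; superlinearity means a speedup greater than p.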

Parallel memory architecture for elliptic curve cryptography over GF(p) aimed at efficient FPGA implementation

Journal of Signal Processing Systems for Signal Image and Video Technology, 2008, Vol. 51, No. 1, pp. 39-55

Authors: Laue, Ralf; Huss, Sorin A. (Tech Univ Darmstadt Dept Comp Sci Integrated Circuits & Syst Lab Darmstadt Germany)

Parallelization of operations is of utmost importance for the efficient implementation of Public Key Cryptography algorithms. Starting with a classification of parallelization methods at different abstraction levels of public key algorithms, we propose a novel memory architecture for elliptic curve implementations with multiple modular multiplier units. This architecture is well suited for implementing different point addition and doubling algorithms over GF(p) on FPGAs. It allows the execution time to scale with the number of modular multipliers and exhibits nearly no overhead compared to the mere runtime of the multipliers. The advantages of this distributed memory architecture are demonstrated by means of two different point addition and doubling algorithms.

Keywords: elliptic curve cryptography; parallelization; memory architecture; FPGA
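
The point addition and doubling operations mentioned in the abstract above can be illustrated with the textbook affine formulas for a short Weierstrass curve y² = x³ + ax + b over GF(p). This is only a sketch: the record does not say which coordinate system or which two algorithms the paper uses, and the toy curve parameters below (a = 2, b = 2, p = 17) are an assumption chosen so the arithmetic can be checked by hand.

```python
def ec_add(P, Q, a, p):
    """Add two points on y^2 = x^3 + a*x + b over GF(p).

    Points are (x, y) tuples; None stands for the point at infinity.
    Requires Python 3.8+ for modular inverses via pow(x, -1, p).
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = infinity
    if P == Q:
        # Doubling: slope of the tangent line
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        # Addition: slope of the chord through P and Q
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)


A, P_MOD = 2, 17          # toy curve y^2 = x^3 + 2x + 2 over GF(17)
G = (5, 1)                # a point on that curve
doubled = ec_add(G, G, A, P_MOD)        # (6, 3)
tripled = ec_add(G, doubled, A, P_MOD)  # (10, 6)
```

Every step reduces to modular multiplications, squarings and one inversion, which is why the number of available modular multiplier units and the way operands reach them dominate the running time.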

Parallel and Distributed Visualization The State of the Art

5th International Conference on Computer Graphics, Imaging and Visualization (CGIV)

Author: Meligy, Ali (Middle E Univ Grad Studies Fac Informat Technol Amman Jordan)

ISBN: (print) 9780769533599

Visualization is one of the most important applications of computer graphics. To have a parallel infrastructure for visualization, some technologies would be needed. We identify the state-of-the-art technologies that have prepared for building such an infrastructure and examine a collection of applications that would benefit from it. We consider a broad range of scientific and technological advances in visualization, which are relevant to visual supercomputing. Mainly, we present the original abstracts from the cited papers.

Keywords: parallel processing; distributed processing; cluster computers; parallel rendering algorithms; visual supercomputing; visualization; autonomic computing; mobile visualization

Parallel simulated annealing for materialized view selection in data warehousing environments

8th International Conference on Algorithms and Architectures for Parallel Processing

Authors: Derakhshan, Roozbeh; Stantic, Bela; Korn, Othmar; Dehne, Frank (Swiss Fed Inst Technol Zurich Switzerland; Griffith Univ Inst Integrated & Intelligent Syst Brisbane Qld Australia; Carleton Univ Sch Comp Sci Ottawa ON K1S 5B6 Canada)

ISBN: (print) 9783540695004

In order to facilitate efficient query processing, the information contained in data warehouses is typically stored as a set of materialized views. Deciding which views to materialize in order to minimize view maintenance and query processing costs represents a challenge. Some existing approaches are applicable only to small problems, which are far from reality. In this paper we introduce a new approach for materialized view selection using Parallel Simulated Annealing (PSA) that selects views from an input Multiple View Processing Plan (MVPP). With PSA, we are able to perform view selection on MVPPs having hundreds of queries and thousands of views. Also, in our experimental study we show that our method provides a significant improvement in the quality of the obtained set of materialized views over existing heuristic and sequential simulated annealing algorithms.

Keywords: parallel simulated annealing; data warehousing; materialized view selection

Performance of OpenMP benchmarks on Multicore processors

8th International Conference on Algorithms and Architectures for Parallel Processing

Author: Marowka, Ami (Shenkar Coll Engn & Design 12 Anna Frank IL-52526 Ramat Gan Israel)

ISBN: (print) 9783540695004

The appearance of multicore processors brings high performance computing to the desktop and opens the doors of mainstream computing to parallel computing. This paradigm shift leads to the integration of parallel programming standards for high-end shared-memory machine architectures into desktop programming environments. In this paper we present a performance study of these new systems. We evaluate the performance of an OpenMP shared-memory programming model that is integrated into the Microsoft Visual Studio C++ 2005 and Intel C++ compilers on a multicore processor. We benchmarked using the NAS OpenMP high-level application benchmarks and the EPCC OpenMP low-level benchmarks. We report the basic timings, scalability, and run-time profiles of each benchmark and analyze the results.

Keywords: parallel programming

Parallel Computation Techniques for Ontology Reasoning

7th International Semantic Web Conference (ISWC 2008)

Author: Bock, Juergen (FZI Res Ctr Informat Technol Karlsruhe Germany)

ISBN: (print) 9783540885634

As current reasoning techniques are not designed for massive parallelisation, usage of parallel computation techniques in reasoning establishes a major research problem. I will propose two possibilities of applying parallel computation techniques to ontology reasoning: parallel processing of independent ontological modules, and tailoring the reasoning algorithms to parallel architectures.

Keywords: parallel architectures

Scheduling of QR factorization algorithms on SMP and multi-core architectures

16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing

Authors: Quintana-Orti, Gregorio; Quintana-Orti, Enrique S.; Chan, Ernie; de Geijn, Robert A. van; Van Zee, Field G. (Univ Jaume 1 Dept Ingn & Ciencia Computadores Castellon de La Plana 12071 Spain; Univ Texas Austin Dept Comp Sci Austin TX 78712 USA)

ISBN: (print) 9780769530895

This paper examines the scalable parallel implementation of the QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-blocks are presented. Each implementation views a block of a matrix as the fundamental unit of data and, likewise, operations over these blocks as the primary unit of computation. The first is a conventional blocked algorithm similar to those included in libFLAME and LAPACK, but expressed in a way that allows operations in the so-called critical path of execution to be computed as soon as their dependencies are satisfied. The second algorithm captures a higher degree of parallelism with an approach based on Givens rotations while preserving the performance benefits of algorithms based on blocked Householder transformations. We show that the implementation effort is greatly simplified by expressing the algorithms in code with the FLAME/FLASH API, which allows matrices stored by blocks to be viewed and managed as matrices of matrix blocks. The SuperMatrix run-time system utilizes FLASH to assemble and represent matrices but also provides out-of-order scheduling of operations that is transparent to the programmer. Scalability of the solution is demonstrated on a ccNUMA platform with 16 processors and an SMP architecture with 16 cores.

Keywords: Matrix algebra
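
To illustrate the Givens-rotation idea the abstract above refers to, here is a minimal scalar QR factorization built from plane rotations. It is a plain element-wise sketch under stated assumptions, not the paper's algorithm-by-blocks, and it does not use the FLAME/FLASH API or the SuperMatrix scheduler; the function name and the example matrix are illustrative.

```python
import math


def givens_qr(matrix):
    """Factor an m x n matrix (list of rows) as Q * R using Givens rotations."""
    m, n = len(matrix), len(matrix[0])
    R = [list(map(float, row)) for row in matrix]
    Q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        for i in range(m - 1, j, -1):        # zero column j from the bottom up
            a, b = R[i - 1][j], R[i][j]
            if b == 0.0:
                continue                     # entry is already zero
            r = math.hypot(a, b)
            c, s = a / r, b / r
            for k in range(n):               # rotate rows i-1 and i of R
                upper, lower = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * upper + s * lower
                R[i][k] = -s * upper + c * lower
            for k in range(m):               # accumulate the transpose into Q
                left, right = Q[k][i - 1], Q[k][i]
                Q[k][i - 1] = c * left + s * right
                Q[k][i] = -s * left + c * right
    return Q, R


A = [[6, 5, 0], [5, 1, 4], [0, 4, 3]]
Q, R = givens_qr(A)
```

Each rotation touches only two rows, so rotations acting on disjoint row pairs do not depend on each other; that locality is what makes rotation-based schemes attractive for exposing parallelism.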
