Search Results - Inner Mongolia University Library

Conference on Multimedia Hardware Architectures 1998

Authors: Freimann, A; Brune, T; Pirsch, P. Informat Technol Lab, D-30167 Hannover, Germany

ISBN: (print) 0819427519

When implementing today's video compression standards on programmable processors, it is essential to optimize the algorithms with respect to the underlying hardware. As an example, the core decoder functions of the H.263 hybrid coding scheme were implemented on the HiPAR-DSP, a SIMD-controlled processor with four parallel VLIW data paths. The decoder tasks were implemented employing local memory, parallelization on several levels, and data statistics. Special attention was paid to the computation-intensive tasks: IDCT and motion-compensated frame reconstruction. To speed up the IDCT computation, a data-dependent approach was chosen, which distinguishes different block types. The determination of the IDCT block type could be parallelized together with other tasks, so no additional overhead is required. Frame reconstruction mainly benefits from data-parallel operations and transparent DMA transfers to and from external memory.

Keywords: video decoding; algorithm mapping; SIMD; parallelization techniques; H.263; MPEG
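
The data-dependent IDCT idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which block types the authors distinguish, so the three categories used here (`zero`, `dc_only`, `full`) and all function names are assumptions made for illustration, not the HiPAR-DSP implementation. The sketch relies on one well-known property: for an orthonormal 8x8 DCT, a block whose only non-zero coefficient is the DC term inverts to a constant block of value DC/8.

```python
import math

N = 8  # block size used by H.263

def classify_block(coeffs):
    """Return 'zero', 'dc_only' or 'full' (hypothetical block types)."""
    ac_nonzero = any(coeffs[u][v] != 0
                     for u in range(N) for v in range(N) if (u, v) != (0, 0))
    if ac_nonzero:
        return "full"
    return "dc_only" if coeffs[0][0] != 0 else "zero"

def _scale(k):
    # Normalisation factor of the orthonormal DCT-II basis.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def idct_full(coeffs):
    """Direct (slow) orthonormal 2-D IDCT, used as the general fallback."""
    out = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (_scale(u) * _scale(v) * coeffs[u][v]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[x][y] = s
    return out

def idct_by_type(coeffs):
    """Pick the cheapest inverse transform for the detected block type."""
    kind = classify_block(coeffs)
    if kind == "zero":
        return [[0.0] * N for _ in range(N)]
    if kind == "dc_only":
        value = coeffs[0][0] / N  # DC basis function is the constant 1/8
        return [[value] * N for _ in range(N)]
    return idct_full(coeffs)
```

In this sketch the classification is a separate pass; the abstract states that on the HiPAR-DSP it is folded into other tasks so that it adds no overhead.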

Design of array processors for 2-D Discrete Fourier Transform

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1997, Vol. E80D, No. 4, pp. 455-465

Authors: Peng, ST; Sedukhin, I; Sedukhin, S. Department of Computer Software, Distributed Parallel Processing Laboratory, University of Aizu, Aizu-Wakamatsu-shi 965-80, Japan; R&D Group, Hiwada Electronic Corporation (Pioneer Group), Fukushima-ken 969-13, Japan

In this paper, the design of systolic array processors for computing the 2-dimensional Discrete Fourier Transform (2-D DFT) is considered. We investigated three different computational schemes for designing systolic array processors using a systematic approach. The systematic approach is guaranteed to find optimal systolic array processors from a large solution space in terms of the number of processing elements and I/O channels, the processing time, topology, pipeline period, etc. The optimal systolic array processors are scalable, modular and suitable for VLSI implementation. An application of the designed systolic array processors to the prime-factor DFT is also presented.

Keywords: algorithm mapping; 2-dimensional discrete Fourier transform; parallel processing; systolic array processors; VLSI architectures

Mapping 3-D IIR digital filter onto systolic arrays

MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 1996, Vol. 7, No. 1, pp. 7-26

Authors: ElGuibaly, F; Tawfik, A. Department of Electrical and Computer Engineering, University of Victoria, Victoria, Canada

We present an efficient systolic implementation for 3-D IIR digital filters. The systolic implementation is obtained by using an algebraic mapping technique. This new mapping technique allows pipelined variables and broadcast variables to be mixed. Through the mapping method, we also determine the buffer sizes, the directions of variable propagation, and the data feeding and extraction points. The resulting systolic array implementation is a modular structure composed of 2-D filter modules connected by simple buffers. This new systolic implementation is regular, modular and amenable to VLSI implementation.

Keywords: multidimensional digital filter; algorithm mapping; combinatorial geometry; systolic array design; digital filter design; task scheduling; processor assignment

SOME NEW DESIGNS OF 2-D ARRAY FOR MATRIX MULTIPLICATION AND TRANSITIVE CLOSURE

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1995, Vol. 6, No. 4, pp. 351-362

Authors: TSAY, JC; CHANG, PY. Institute of Computer Science and Information Engineering, College of Engineering, National Chiao Tung University, Hsinchu, Taiwan

In this paper, we present some new regular iterative algorithms for matrix multiplication and transitive closure. With these algorithms, 2-D arrays with execution times of 2N-1 and [(3N-1)/2] for matrix multiplication can be obtained by space-time mapping. Meanwhile, we can derive a 2-D array with an execution time of 4N-2 for transitive closure based on the sequential Warshall-Floyd algorithm. All these new 2-D arrays for matrix multiplication and transitive closure have the advantage of being faster and more regular than previous designs.

Keywords: algorithm mapping; MATRIX MULTIPLICATION; MESH ARRAY; SYSTOLIC ARRAY; SPHERICAL ARRAY; TRANSITIVE CLOSURE; VLSI
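
The abstract names the sequential Warshall-Floyd algorithm as the starting point for the transitive-closure array. For reference, a minimal sequential sketch in Python is given here. It is not the paper's regular iterative formulation or its space-time mapping, and the function name is hypothetical.

```python
def transitive_closure(adj):
    """Warshall's algorithm on a square boolean adjacency matrix.

    reach[i][j] ends up True when a path of one or more edges
    leads from node i to node j.
    """
    n = len(adj)
    reach = [[bool(x) for x in row] for row in adj]
    for k in range(n):              # allow node k as an intermediate stop
        for i in range(n):
            if not reach[i][k]:
                continue            # no path i -> k, nothing to extend
            for j in range(n):
                if reach[k][j]:
                    reach[i][j] = True
    return reach
```

The triple loop over (k, i, j) is the kind of N x N x N index space that array designs of this type project onto a 2-D grid of processing elements.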

DESIGN OF SPACE-OPTIMAL REGULAR ARRAYS FOR ALGORITHMS WITH LINEAR SCHEDULES

IEEE TRANSACTIONS ON COMPUTERS, 1995, Vol. 44, No. 5, pp. 683-694

Authors: TSAY, JC; CHANG, PY. Inst. of Comput. Sci. & Inf. Eng., Nat. Chiao Tung Univ., Hsinchu, Taiwan

The problem of designing space-optimal 2D regular arrays for N x N x N cubical mesh algorithms with linear schedule ai + bj + ck, 1 ≤ a ≤ b ≤ c, and N = nc, is studied. Three novel nonlinear processor allocation methods, each of which works by combining a partitioning technique (gcd-partition) with different nonlinear processor allocation procedures (traces), are proposed to handle different cases. In cases where a + b ≤ c, which are dealt with by the first processor allocation method, space-optimal designs can always be obtained in which the number of processing elements is equal to N^2/c. For other cases where a + b > c and either a = b or b = c, two other optimal processor allocation methods are proposed. In addition, closed-form expressions for the optimal number of processing elements are derived for these cases.

Keywords: algorithm mapping; DATA DEPENDENCY; LINEAR SCHEDULE; MATRIX MULTIPLICATION; OPTIMIZING COMPILER; SPACE-OPTIMAL; SYSTOLIC ARRAY
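
One way to see why a processing-element count can be called space-optimal: an array can never use fewer processing elements than the largest number of computations that the schedule places in the same time step. The Python sketch below counts that maximum by brute force for the schedule t = ai + bj + ck over an N x N x N index cube, and can be compared with the abstract's processing-element count read as N squared divided by c. It does not implement the gcd-partition or trace allocation methods, and the function name is hypothetical.

```python
from collections import Counter

def max_concurrency(a, b, c, size):
    """Largest number of index points (i, j, k) in a size^3 cube
    that share one time step under the schedule a*i + b*j + c*k."""
    counts = Counter(a * i + b * j + c * k
                     for i in range(size)
                     for j in range(size)
                     for k in range(size))
    return max(counts.values())

if __name__ == "__main__":
    a, b, c, size = 1, 1, 2, 4   # satisfies a + b <= c and size = 2 * c
    print(max_concurrency(a, b, c, size), size * size // c)
```

For the small cases checked in this sketch the two numbers agree, which is consistent with the claim for a + b ≤ c; brute force is only practical for small N.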

AN ALGEBRAIC-THEORY FOR MODELING DIRECT INTERCONNECTION NETWORKS

SUPERCOMPUTING 92 CONF

Authors: KAUSHIK, SD; SHARMA, S; HUANG, CH; JOHNSON, JR; JOHNSON, RW; SADAYAPPAN, P. Department of Computer and Information Science, Ohio State University, Columbus, OH 43210, United States; Department of Mathematics and Computer Science, Drexel University, Philadelphia, PA 19176, United States; Department of Computer Science, St. Cloud State University, St. Cloud, MN 56301, United States

ISBN: (print) 0818626305

We present an algebraic theory based on tensor products for modeling direct interconnection networks. This algebraic theory has been used for designing and implementing block recursive numerical algorithms on shared-memory vector multiprocessors. This theory can be used for mapping algorithms expressed in tensor product form onto distributed-memory architectures. In this paper, we focus on the modeling of direct interconnection networks. Rings, n-dimensional meshes, and hypercubes are represented in tensor product form. Algorithm mapping using tensor product formulation is demonstrated by mapping matrix transposition and matrix multiplication onto different networks. © 1992 IEEE.

Keywords: TENSOR PRODUCT; BLOCK RECURSIVE ALGORITHM; DIRECT INTERCONNECTION NETWORK; algorithm mapping
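
As a rough illustration of how a tensor (Kronecker) product can describe a direct network, the Python sketch below builds the adjacency matrix of a d-dimensional hypercube from the two-node link K2, using the standard identity that the adjacency matrix of a Cartesian product of graphs is A ⊗ I + I ⊗ B. The abstract does not give the paper's own notation, so this construction and all function names are assumptions for illustration only.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def kron_sum(a, b):
    """Adjacency matrix of the Cartesian product of two graphs."""
    return mat_add(kron(a, identity(len(b))), kron(identity(len(a)), b))

def hypercube(d):
    """Adjacency matrix of the d-dimensional hypercube (d >= 1)."""
    k2 = [[0, 1], [1, 0]]           # a single link between two nodes
    adj = k2
    for _ in range(d - 1):
        adj = kron_sum(adj, k2)     # add one more dimension
    return adj
```

Replacing K2 with the adjacency matrix of a path or a cycle gives meshes and tori in the same way.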

AN OPTIMAL SYSTOLIC ARRAY FOR THE ALGEBRAIC PATH PROBLEM

IEEE TRANSACTIONS ON COMPUTERS, 1991, Vol. 40, No. 1, pp. 100-105

Authors: LEWIS, PS; KUNG, SY. PRINCETON UNIV, DEPT ELECT ENGN, PRINCETON, NJ 08544

A new systolic array design for the Algebraic Path Problem (APP) is presented that is both simpler and more efficient than previously proposed configurations. This array uses N^2 orthogonally connected processing elements and requires 2N I/O connections. Total computation time is 5N - 2, which is the minimum time possible in a systolic implementation. The data pipelining rate is one, so no pipeline interleave is required. For multiple problem instances a block pipeline rate of N can be achieved, which is optimal for an array of N^2 processing elements.

Keywords: ALGEBRAIC PATH PROBLEM; algorithm mapping; MATRIX INVERSION; PARALLEL PROCESSING; SHORTEST PATH PROBLEM; SYSTOLIC ARRAYS; TRANSITIVE CLOSURE PROBLEM; VLSI ARCHITECTURES
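
The keywords list shortest path, transitive closure and matrix inversion as instances of the APP. A minimal sequential sketch in Python shows how one recurrence covers such instances by swapping the semiring operations. This is the textbook Kleene-style elimination, not the systolic design of the paper; the function name and the parameters `add`, `mul` and `star` are hypothetical.

```python
INF = float("inf")

def algebraic_path(a, add, mul, star):
    """Combine the weights of all non-empty paths between every node pair.

    add  -- semiring 'sum' used to combine alternative paths
    mul  -- semiring 'product' used to chain edges along a path
    star -- closure of a self-loop weight (x* = 1 + x + x.x + ...)
    """
    n = len(a)
    cur = [row[:] for row in a]
    for k in range(n):
        s = star(cur[k][k])
        # Build a fresh matrix so every entry uses the step k-1 values.
        cur = [[add(cur[i][j], mul(mul(cur[i][k], s), cur[k][j]))
                for j in range(n)] for i in range(n)]
    return cur

def shortest_paths(weights):
    """(min, +) instance; assumes non-negative weights, INF = no edge."""
    return algebraic_path(weights, min, lambda x, y: x + y, lambda x: 0.0)
```

With `add` as logical or, `mul` as logical and, and `star` returning True, the same function computes a transitive closure.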

RECONFIGURABLE SIMD MASSIVELY PARALLEL COMPUTERS

PROCEEDINGS OF THE IEEE, 1991, Vol. 79, No. 4, pp. 429-443

Authors: LI, HW; STOUT, QF. UNIV MICHIGAN, DEPT ELECT ENGN, ADV COMP ARCHITECTURE LAB, ANN ARBOR, MI 48109; UNIV MICHIGAN, DEPT ELECT ENGN, SCI COMP LAB, ANN ARBOR, MI 48109

The reconfigurable SIMD parallel processor is a member of the family of SIMD architectures. Its most distinguishing feature is the use of the reconfigurability of the interconnection network to 1) establish a network topology well mapped to the algorithm communication graph so that higher efficiency can be achieved, and 2) remove faulty processors from the network so that system operation continues uninterrupted while maintaining the same or slightly degraded efficiency. This paper describes several existing reconfigurable SIMD parallel architectures and their reconfiguration mechanisms, demonstrates the effectiveness of algorithm mapping through reconfiguration, and discusses fault-tolerant schemes via reconfiguration.

Keywords: algorithm communication graph; algorithm mapping; fault tolerant computing; fault-tolerant schemes; faulty processors; interconnection network; multiprocessor interconnection networks; network topology; parallel architectures; reconfigurable SIMD massively parallel computers; system operation

TIME OPTIMAL LINEAR SCHEDULES FOR ALGORITHMS WITH UNIFORM DEPENDENCIES

IEEE TRANSACTIONS ON COMPUTERS, 1991, Vol. 40, No. 6, pp. 723-742

Authors: SHANG, WJ; FORTES, JAB. PURDUE UNIV, SCH ELECT ENGN, W LAFAYETTE, IN 47907

An algorithm can be thought of as a set of indexed computations, and if one computation uses data generated by another, this data dependence can be represented by the difference of their indexes (called the dependence vector). Many important algorithms are characterized by the fact that data dependencies are uniform, i.e., the values of the dependence vectors are independent of the indexes of computations. Linear schedules are a special class of schedules described by a linear mapping of computation indexes into time. This paper addresses the problem of identifying optimal linear schedules for uniform dependence algorithms so that their execution time is minimized. Procedures are proposed to solve this problem based on the mathematical solution of a nonlinear optimization problem. The complexity of these procedures is independent of the size of the algorithm. In fact, the complexity is exponential in the dimension of the index set of the algorithm and, for all practical purposes, very small due to the limited dimension of the index set of algorithms of practical interest. The results reported in this paper can be used to derive time-optimal systolic designs and can be applied in optimizing compilers to restructure programs at compile time in order to maximally exploit available parallelism.

Keywords: algorithm mapping; DATA DEPENDENCY; LINEAR SCHEDULE; OPTIMIZING COMPILER; NESTED-LOOP PROGRAM; SYSTOLIC ARRAY; TIME-OPTIMAL
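
To make the notions of dependence vector and linear schedule concrete, the Python sketch below searches small integer schedule vectors by brute force. It is a simplified illustration under stated assumptions, not the optimization procedure of the paper: the index set is assumed to be a rectangular box, a computation with index j is assumed to run at time pi · j, a schedule is treated as valid when pi · d ≥ 1 for every dependence vector d, and the search bound and all names are hypothetical.

```python
from itertools import product

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def is_valid(pi, deps):
    """Every computation must run strictly after the ones it depends on."""
    return all(dot(pi, d) >= 1 for d in deps)

def execution_time(pi, sizes):
    """Number of time steps spanned over the box 0..size-1 per dimension."""
    return 1 + sum(abs(p) * (s - 1) for p, s in zip(pi, sizes))

def best_linear_schedule(deps, sizes, bound=3):
    """Try all integer vectors with entries in [-bound, bound]."""
    best = None
    for pi in product(range(-bound, bound + 1), repeat=len(sizes)):
        if is_valid(pi, deps):
            t = execution_time(pi, sizes)
            if best is None or t < best[1]:
                best = (pi, t)
    return best

if __name__ == "__main__":
    # Matrix-multiplication-style dependencies on a 4 x 4 x 4 index set.
    deps = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
    print(best_linear_schedule(deps, (4, 4, 4)))
```

The exhaustive search grows exponentially with the number of index dimensions, which echoes the abstract's remark that the cost depends on the dimension of the index set and not on the size of the algorithm.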
