Search results - Inner Mongolia University Library

16th Euromicro International Conference on Parallel, Distributed and Network-Based Processing

Authors: Kadidlo, Juergen; Strey, Alfred (Exasol AG, D-90411 Nurnberg, Germany; Univ Innsbruck, Inst Comp Sci, A-6020 Innsbruck, Austria)

ISBN: (print) 9780769530895

The correlation between two signals (cross correlation) is a standard approach to feature detection. The normalized form of cross correlation (normalized correlation coefficient) is particularly used for template matching. In this case, the two-dimensional correlation of images is considered. One of its biggest drawbacks is the need for a lot of computational power, especially when many correlation coefficients are computed. This paper presents a new method for a high performance thread- and data-parallel computation of normalized cross correlation in the spatial domain. It will be shown that a speedup of up to 5 can be achieved solely by a sophisticated programming of the SIMD unit of a standard microprocessor. Furthermore, the new data-parallel implementation in the spatial domain can even outperform an (also data-parallel) frequency domain implementation.

Keywords: Template matching
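
For readers unfamiliar with the quantity discussed in the abstract above, the following is a minimal sequential sketch of the normalized correlation coefficient used for template matching in the spatial domain. It is not the authors' thread- and data-parallel SIMD method; the function names `ncc` and `match_template` and the list-of-lists image layout are assumptions made only for illustration.

```python
import math


def ncc(image, template, top, left):
    """Normalized correlation coefficient between `template` and the
    window of `image` whose top-left corner is at (top, left)."""
    h, w = len(template), len(template[0])
    window = [row[left:left + w] for row in image[top:top + h]]
    n = h * w
    mean_w = sum(map(sum, window)) / n
    mean_t = sum(map(sum, template)) / n
    num = var_w = var_t = 0.0
    for win_row, tpl_row in zip(window, template):
        for a, b in zip(win_row, tpl_row):
            da, db = a - mean_w, b - mean_t
            num += da * db      # cross term
            var_w += da * da    # window energy around its mean
            var_t += db * db    # template energy around its mean
    denom = math.sqrt(var_w * var_t)
    # A flat window or template has zero variance; report 0 in that case.
    return num / denom if denom > 0 else 0.0


def match_template(image, template):
    """Exhaustively scan all valid positions; return (best_score, (top, left))."""
    h, w = len(template), len(template[0])
    best = (-2.0, (0, 0))  # any real coefficient is >= -1
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = ncc(image, template, top, left)
            if score > best[0]:
                best = (score, (top, left))
    return best
```

Every window position requires its own mean, variance, and cross term, which is why the cost grows quickly when many coefficients are computed and why the inner loops are natural candidates for data-parallel execution.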

Massively parallel computing techniques might improve highway maintenance

IEEE Concurrency, 1998, vol. 6, no. 1, pp. 58-+

Authors: Huang, YG; Tillotson, H; Snaith, M (Univ Birmingham, Sch Civil Engn, Birmingham B15 2TT, W Midlands, England)

Manual methods of measuring defects in roads show poor repeatability and reproducibility. Cracking is a principal indicator of defect progression in road pavements, and the authors' overall objective is to develop a practical, automatic, repeatable, and reproducible method of determining the extent of cracking. Their research aims at using a distributed array of processors (DAP) to achieve practical speeds for processing digitized images of road surfaces to detect cracks. The algorithms described here provide for two processes. The first converts a gray-scale image into a binary image that represents most of the cracks and eliminates most of the noise from the surface texture. This initial screening process might suffice for the bulk of a road having few cracks. The second process combines the crack fragments in the binary image into continuous cracks and gives the highway engineer an appropriate output. The article includes results in which individual images were judged to contain cracks or not contain cracks by eight independent observers and by processing on the DAP to the end of the initial screening process. The authors have found that single images can be processed to the initial screening stage within the 40-millisecond limit for real-time processing imposed by the British TV standard.

Keywords: Parallel processing; Road transportation; Velocity measurement; TV; Reproducibility of results; Surface cracks; Pixel; Cameras; Resource management; Data engineering

GPU Implementation of Bitplane Coding with Parallel Coefficient Processing for High Performance Image Compression

IEEE Transactions on Parallel and Distributed Systems, 2017, vol. 28, no. 8, pp. 2272-2284

Authors: Enfedaque, Pablo; Auli-Llinas, Francesc; Moure, Juan Carlos (Univ Autonoma Barcelona, Dept Informat & Commun Engn, E-08193 Barcelona, Spain; Univ Autonoma Barcelona, Dept Comp Architecture & Operating Syst, E-08193 Barcelona, Spain)

The fast compression of images is a requisite in many applications like TV production, teleconferencing, or digital cinema. Many of the algorithms employed in current image compression standards are inherently sequential. High performance implementations of such algorithms often require specialized hardware like field integrated gate arrays. Graphics Processing Units (GPUs) do not commonly achieve high performance on these algorithms because they do not exhibit fine-grain parallelism. Our previous work introduced a new core algorithm for wavelet-based image coding systems. It is tailored for massively parallel architectures and is called bitplane coding with parallel coefficient processing (BPC-PaCo). This paper introduces the first high performance, GPU-based implementation of BPC-PaCo. A detailed analysis of the algorithm aids its implementation in the GPU. The main insights behind the proposed codec are an efficient thread-to-data mapping, smart memory management, and the use of efficient cooperation mechanisms to enable inter-thread communication. Experimental results indicate that the proposed implementation matches the requirements for high resolution (4K) digital cinema in real time, yielding speedups of 30x with respect to the fastest implementations of current compression standards. Also, a power consumption evaluation shows that our implementation consumes 40x less energy for equivalent performance than state-of-the-art methods.

Keywords: image coding; SIMD computing; graphics processing unit (GPU); compute unified device architecture (CUDA)

Parallel image matching in a distributed system

Proceedings of the IEEE 1st International Conference on Algorithms and Architectures for Parallel Processing. Part 1 (of 2)

Authors: You, J.; Zhu, W.P.; Pissaloux, E.; Cohen, H.A. (Univ of South Australia)

Image matching based on image feature pixels involves heavily iterated computation and repeated memory access. In our previous work, the detection of interesting points was reported as an efficient pre-processing step to extract binary images for further matching in terms of a certain distance measurement. This paper presents our extension to a parallel implementation of the matching scheme for object recognition on a low cost heterogeneous PVM (Parallel Virtual Machine) network. While most of the sequential execution time is spent on image feature extraction, distance transform and matching measurement, our investigation shows that a distributed memory multicomputer can best meet the high computational and memory access demands in image processing. The performance is evaluated in terms of execution time. We conclude that parallel image processing can be implemented on a general distributed system to achieve speedup without specific hardware requirements.

Keywords: image processing

An object interface for interoperability of image processing parallel library in a distributed environment

13th International Conference on Image Analysis and Processing

Authors: Clematis, A; D'Agostino, D; Galizia, A (IMATI CNR, I-16149 Genoa, Italy)

ISBN: (print) 3540288694

Image processing applications are computationally demanding, and for a long time much attention has been paid to the use of parallel processing. Emerging distributed and Grid based architectures represent new and well suited platforms that promise the availability of the required computational power. In this direction, image processing has to evolve to heterogeneous environments, and a crucial aspect is the interoperability and reuse of available high performance code. This paper describes our experience in the development of PIMA(GE)^2, the Parallel IMAge processing GEnoa server, obtained by wrapping a library using the CORBA framework. Our aim is to obtain a high level of flexibility and dynamicity in the server architecture with as limited an overhead as possible. The design of a hierarchy of image processing operation objects and the development of the server interface are discussed.

Keywords: image processing

NPU-based image compositing in a distributed visualization system

IEEE Transactions on Visualization and Computer Graphics, 2007, vol. 13, no. 4, pp. 798-809

Authors: Pugmire, David; Monroe, Laura; Davenport, Carolyn Connor; DuBois, Andrew; DuBois, David; Poole, Stephen (Los Alamos Natl Lab, Los Alamos, NM 87545, USA; Oak Ridge Natl Lab, Oak Ridge, TN 37831, USA)

This paper describes the first use of a Network Processing Unit (NPU) to perform hardware-based image composition in a distributed rendering system. The image composition step is a notorious bottleneck in a clustered rendering system. Furthermore, image compositing algorithms do not necessarily scale as data size and number of nodes increase. Previous researchers have addressed the composition problem via software and/or custom-built hardware. We used the heterogeneous multicore computation architecture of the Intel IXP28XX NPU, a fully programmable commercial off-the-shelf (COTS) technology, to perform the image composition step. With this design, we have attained a nearly four-times performance increase over traditional software-based compositing methods, achieving sustained compositing rates of 22-28 fps on a 1,024 x 1,024 image. This system is fully scalable with a negligible penalty in frame rate, is entirely COTS, and is flexible with regard to operating system, rendering software, graphics cards, and node architecture. The NPU-based compositor has the additional advantage of being a modular compositing component that is eminently suitable for integration into existing distributed software visualization packages.

Keywords: hardware-assisted image compositing; high-performance computing; image compositing; Network Processing Unit; parallel rendering; PC clusters; visualization; volume rendering

A highly parallel design for fractal image compression

2005 International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA'05

Authors: Robinson, Patrick; Lee, Tai-Chi (Department of Computer Science, Saginaw Valley State University, University Center, MI 48710, United States)

ISBN: (print) 9781932415582

Image compression continues to be an important field of research. The ability to quickly and accurately compress images is beneficial to many areas, including high-speed imaging, space exploration, defense applications, and multimedia applications. This paper presents a proposed parallel implementation of fractal image compression on a grid of parallel computing elements. These processing nodes each perform the compression of a region of an image, and are arranged so as to provide a completely parallel search of the entire domain.

Keywords: image compression

Efficient parallel algorithms for hierarchical clustering on arrays with reconfigurable optical buses

Journal of Parallel and Distributed Computing, 2000, vol. 60, no. 9, pp. 1137-1153

Authors: Wu, CH; Horng, SJ; Tsai, HR (Natl Taiwan Univ Sci & Technol, Dept Elect Engn, Taipei, Taiwan; Chinese Naval Acad, Dept Informat Management, Kaohsiung, Taiwan; Ling Tung Coll, Dept Informat Management, Taichung, Taiwan)

Clustering is a basic operation in image processing and computer vision, and it plays an important role in unsupervised pattern recognition and image segmentation. While there are many methods for clustering, single-link hierarchical clustering is one of the most popular techniques. In this paper, with the advantages of both optical transmission and electronic computation, we design efficient parallel hierarchical clustering algorithms on arrays with reconfigurable optical buses (AROB). We first design three efficient basic operations: the matrix multiplication of two N x N matrices, finding the minimum spanning tree of a graph with N vertices, and identifying the connected component containing a specified vertex. Based on these three data operations, an O(log N) time parallel hierarchical clustering algorithm is proposed using N^3 processors. Furthermore, if the connectivity of the AROB with four-port connection is allowed, two constant time clustering algorithms can also be derived using N^4 and N^3 processors, respectively. These results improve on previously known algorithms developed on various parallel computational models. (C) 2000 Academic Press.

Keywords: cluster analysis; hierarchical clustering; image processing; pattern recognition; parallel algorithm; arrays with reconfigurable optical buses (AROB)
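
The abstract above relies on the link between single-link clustering, minimum spanning trees, and connected components. The sketch below illustrates that link with an ordinary sequential Kruskal-style procedure using a union-find structure. It is not the AROB algorithm from the paper; the function name `single_link_clusters`, the squared Euclidean distance, and the choice to stop at `k` clusters are assumptions made for illustration.

```python
def single_link_clusters(points, k):
    """Group `points` (tuples of coordinates) into k clusters by
    single-link agglomeration; returns one cluster label per point."""
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per cluster

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by squared Euclidean distance.
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )

    components = n
    for _, i, j in edges:
        if components <= k:
            break
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # This edge belongs to the minimum spanning tree:
            # it merges the two closest clusters.
            parent[root_i] = root_j
            components -= 1

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

Stopping the merge loop when `k` components remain is equivalent to building the full minimum spanning tree and deleting its `k - 1` longest edges; the remaining connected components are the single-link clusters.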

A highly scalable interconnection network for parallel image processing

Conference on Neural Network and Distributed Processing

Authors: Wang, HY; Gu, WK (Zhejiang Univ, Dept Informat & Elect Engn, Hangzhou 310027, Zhejiang, Peoples R China)

ISBN: (print) 0819442836

In this paper, we introduce a new hierarchical interconnection network for massively parallel systems, named Fully Connected Cubic Network (FCCN). FCCN is able to emulate the popular Hypercube. FCCN has a constant nodal degree of 4, and it therefore eliminates the problem of large fanout in the Hypercube. Moreover, constant degree is an important requirement for efficiently fabricating an architecture for parallel image processing. FCCN is also a highly scalable architecture in that the existing links remain intact when new nodes are introduced. FCCN is maximally fault tolerant, and it enjoys reasonably low diameter, growth in the number of links, and average internodal distance. Finally, FCCN is used as the interconnection network of a parallel image processing system. The computational results show that FCCN is a highly efficient interconnection network for parallel image processing.

Keywords: Fully Connected Cubic Network (FCCN); interconnection network (IN); self-routing algorithm; parallel image processing

Parallel data resampling and Fourier inversion by the scan-line method

IEEE Transactions on Medical Imaging, 1995, vol. 14, no. 3, pp. 454-463

Authors: Noll, DC; Webb, JA; Warfel, TE (Carnegie Mellon Univ, Sch Comp Sci, Pittsburgh, PA 15213; Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213)

Fourier inversion is an efficient method for image reconstruction in a variety of applications, for example, in computed tomography and magnetic resonance imaging. Fourier inversion normally consists of two steps: interpolation of data onto a rectilinear grid, if necessary, and inverse Fourier transformation. This paper presents interpolation by the scan-line method, in which the interpolation algorithm is implemented in a form consisting only of row operations and data transposes. The two-dimensional inverse Fourier transformation can also be implemented with only row operations and data transposes. Accordingly, Fourier inversion can easily be implemented on a parallel computer that supports row operations and data transposes on row-distributed data. The conditions under which the scan-line implementations are algorithmically equivalent to the original serial computer implementation are described, and methods for improving accuracy outside of those conditions are presented. The scan-line algorithm is implemented on the iWarp parallel computer using the Adapt language for parallel image processing. This implementation is applied to magnetic resonance data acquired along radial lines and spiral trajectories through Fourier transform space.

Keywords: Resampling Interpolation parallel computers image reconstruction Fourier Interpolation Arithmetic Fourier Transform method of lines Inversion
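
The abstract above states that the two-dimensional inverse Fourier transform can be built from row operations and data transposes alone. The following sketch shows that decomposition for a small grid using a naive one-dimensional inverse DFT. It covers only the transform step, not the scan-line interpolation, and is unrelated to the iWarp or Adapt implementation; the function names `idft_row`, `transpose`, and `idft2_rows_only` are assumptions made for illustration.

```python
import cmath


def idft_row(row):
    """Naive O(n^2) inverse DFT of a single row."""
    n = len(row)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * m / n) for k, x in enumerate(row)) / n
        for m in range(n)
    ]


def transpose(matrix):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(column) for column in zip(*matrix)]


def idft2_rows_only(spectrum):
    """2-D inverse DFT expressed as: row pass, transpose, row pass, transpose."""
    after_rows = [idft_row(row) for row in spectrum]               # row operation
    after_cols = [idft_row(row) for row in transpose(after_rows)]  # columns handled as rows
    return transpose(after_cols)                                   # restore orientation
```

Because each row pass touches one row at a time, rows can be assigned to different processors, and only the transposes require communication between them.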
