Search Results - Inner Mongolia University Library

International Symposium on Instrumentation and Measurement, Sensor Network and Automation

Authors: Bohu Huang; Haibin Zhang

Affiliation: Inst. of Computing Theory & Technology, Xidian University, Xi'an, China

ISBN: (print) 9781479927173

As FPGA devices grow in size, the long run-time of placement is becoming a great challenge for the FPGA design flow. Simulated annealing is the best-known method applied to this problem because of its good quality of result (QoR), but its computation time is unsatisfactory. In this paper, we propose a parallel placement algorithm named MPP-SA (Multi-core Parallel Placement algorithm based on Simulated Annealing). Our goal is to provide a fast placement algorithm with high QoR. MPP-SA has the same annealing schedule as traditional simulated annealing, but it uses a parallel approach in which multiple threads, running on different cores of the same processor, move blocks concurrently. To ensure the correctness of the results, MPP-SA also uses synchronization and a lock mechanism, which introduces some overhead. However, experimental results show that this overhead does not seriously affect the performance of our algorithm, especially for large circuits. Compared with the T_VPlace placement algorithm in VPR 5.0, MPP-SA decreases the run-time of 5 benchmark circuits of different sizes by an average of 32%-42% without losing QoR.
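For orientation, the sequential baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of simulated-annealing placement, not the MPP-SA algorithm and not VPR's placer: the half-perimeter wirelength cost, the swap-only move, the geometric cooling schedule, and all names (`wirelength`, `anneal_placement`, `alpha`, `moves_per_temp`) are assumptions chosen for brevity.

```python
import math
import random


def wirelength(placement, nets):
    """Sum of bounding-box half-perimeters over all nets (an assumed cost)."""
    total = 0
    for net in nets:
        xs = [placement[block][0] for block in net]
        ys = [placement[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_placement(blocks, nets, width, height, seed=0,
                     t_start=10.0, t_end=0.01, alpha=0.9, moves_per_temp=200):
    """Place blocks on a width x height grid; return the best placement seen."""
    rng = random.Random(seed)
    sites = [(x, y) for x in range(width) for y in range(height)]
    rng.shuffle(sites)
    # Random initial placement: one distinct site per block.
    placement = {block: sites[i] for i, block in enumerate(blocks)}
    cost = wirelength(placement, nets)
    best, best_cost = dict(placement), cost

    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose a move: swap the sites of two distinct blocks.
            a, b = rng.sample(blocks, 2)
            placement[a], placement[b] = placement[b], placement[a]
            new_cost = wirelength(placement, nets)
            delta = new_cost - cost
            # Metropolis criterion: always accept improvements, sometimes
            # accept a worse placement depending on the temperature.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = dict(placement), cost
            else:
                # Rejected: undo the swap.
                placement[a], placement[b] = placement[b], placement[a]
        t *= alpha  # geometric cooling (assumed schedule)
    return best, best_cost


blocks = ["a", "b", "c", "d", "e", "f"]
nets = [["a", "b"], ["b", "c", "d"], ["e", "f"], ["a", "f"]]
best, best_cost = anneal_placement(blocks, nets, width=3, height=3, seed=1)
print(best_cost, best)
```

In a multi-threaded variant of the kind the abstract describes, several threads would propose moves at once, so two concurrent moves touching the same block or site would need the locking the authors mention; that part is not shown here.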

Keywords: FPGA; multi-core; parallel algorithm; simulated annealing; design aids

A Hybrid Granularity Parallel Algorithm for Precise Integration of Structural Dynamic Responses

Acta Mechanica Solida Sinica, 2008, Vol. 21, No. 1, pp. 28-33

Authors: Yuanyin Li; Xianlong Jin; Genguo Li

Affiliations: High Performance Computing Center, Shanghai Jiaotong University, Shanghai 200240, China; Shanghai Supercomputer Center, Shanghai 201203, China

Precise integration methods solve structural dynamic responses with a time integration formula composed of two parts: the multiplication of an exponential matrix with a vector, and an integration term. The second term can be solved by a series solution. Two hybrid granularity parallel algorithms are designed: the exponential matrix and the first term are computed by a fine-grained parallel algorithm, and the second term is computed by a coarse-grained parallel algorithm. Numerical examples show that these two hybrid granularity parallel algorithms obtain higher speedup and parallel efficiency than two existing parallel algorithms.

Keywords: dynamic response; precise integration; hybrid granularity; parallel algorithm

Parallel Computing of Covariance Matrix and Its Application on Hyperspectral Data Process

IEEE International Geoscience and Remote Sensing Symposium (IGARSS)

Authors: Mao-zhi Wang; Da-ming Wang; Wen-xi Xu; Bin-yang Chen; Ke Guo

Affiliation: Chengdu Univ Technol, Geomath Key Lab Sichuan Prov, Chengdu 610059, Sichuan, Peoples R China

ISBN: (print) 9781467311595

A parallel algorithm for computing the covariance matrix, used to realize dimensionality reduction of hyperspectral images based on Principal Component Analysis (PCA) and Minimum Noise Fraction (MNF), is proposed in this paper. The performance of the parallel algorithm is discussed based on experiments in a cluster environment with the Message Passing Interface (MPI). Gustafson's law and Amdahl's law, which are commonly used to analyze parallel algorithm results, are also discussed for this experiment. Finally, some areas and questions for further research are listed.
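The two laws named in the abstract can be stated compactly. The sketch below uses their standard textbook forms; the function names and the sample serial fraction are illustrative assumptions and are not taken from the paper's experiments.

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's law: fixed problem size.

    serial_fraction is the share of the single-processor run time that
    cannot be parallelized.
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction, processors):
    """Gustafson's law: problem size scaled with the processor count.

    serial_fraction is the share of the parallel run time spent in
    serial code.
    """
    return processors - serial_fraction * (processors - 1)


# Illustrative values only: a 10% serial fraction on 8 processors.
for n in (2, 4, 8):
    print(n, amdahl_speedup(0.1, n), gustafson_speedup(0.1, n))
```

The two functions answer different questions: Amdahl's form bounds the speedup of a fixed workload by `1 / serial_fraction`, while Gustafson's form grows roughly linearly because the parallel part of the workload is assumed to grow with the machine.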

Keywords: covariance matrix; parallel algorithm; message passing interface; hyperspectral image

Parallel Multi-Temporal Remote Sensing Image Change Detection on GPU

26th IEEE International Parallel and Distributed Processing Symposium (IPDPS) / Workshop on High Performance Data Intensive Computing

Authors: Zhu, Huming; Cao, Yu; Zhou, Zhiqiang; Gong, Maoguo

Affiliation: Xidian Univ, Key Lab Intelligent Percept & Image Understanding, Minist Educ China, Xian, Peoples R China

ISBN: (print) 9780769546766

Change detection is an important technique in damage assessment. As the volume of remote sensing images and the complexity of algorithms rise, the demand for processing power increases. In this paper, we propose PLog-FLICM, a parallel algorithm for change detection. It is implemented with the AMD Accelerated Parallel Processing (APP) SDK v2, based on the Open Computing Language (OpenCL). The parallel characteristics and implementation details of the proposed PLog-FLICM algorithm are presented. Experiments on several Synthetic Aperture Radar (SAR) images demonstrate that the proposed algorithm outperforms other algorithms, and that the designed parallel algorithm can greatly reduce the computational time of change detection. It achieves speedups of between 63 and 145 times on an AMD Radeon HD 6870 Graphics Processing Unit (GPU).

Keywords: parallel algorithm; GPU; Remote Sensing Image; Change Detection; Multi-Temporal Analysis

Parallel Community Detection for Massive Graphs

9th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Riedy, E. Jason; Meyerhenke, Henning; Ediger, David; Bader, David A.

Affiliation: Georgia Inst Technol, Atlanta, GA 30332, USA

ISBN: (digital) 9783642314643

ISBN: (print) 9783642314636; 9783642314643

Tackling the current volume of graph-structured data requires parallel tools. We extend our work on analyzing such massive graph data with the first massively parallel algorithm for community detection that scales to current data sizes, scaling to graphs of over 122 million vertices and nearly 2 billion edges in under 7300 seconds on a massively multithreaded Cray XMT. Our algorithm achieves moderate parallel scalability without sacrificing sequential operational complexity. Community detection partitions a graph into subgraphs more densely connected within the subgraph than to the rest of the graph. We take an agglomerative approach similar to Clauset, Newman, and Moore's sequential algorithm, merging pairs of connected intermediate subgraphs to optimize different graph properties. Working in parallel opens new approaches to high performance. On smaller data sets, we find the output's modularity compares well with the standard sequential algorithms.
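The modularity score mentioned at the end of the abstract can be computed as sketched below. This uses the standard Newman-Girvan definition for an undirected, unweighted graph; it is an illustration of the quality measure only, not the authors' parallel agglomerative algorithm, and the function name and data layout are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every vertex to its community label
    """
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the community
    degree = defaultdict(int)  # sum of vertex degrees in the community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (intra_c / m) - (degree_c / 2m)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
print(modularity(edges, split))
```

An agglomerative method of the kind the abstract describes starts with every vertex in its own community and repeatedly merges connected pairs when the merge improves a score such as this one.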

Keywords: Community detection; parallel algorithm; graph analysis

A parallel algorithm for large systems of Volterra integral equations of Abel type

Journal of Computational and Applied Mathematics, 2008, Vol. 220, No. 1-2, pp. 749-758

Authors: Capobianco, G.; Cardone, A.

Affiliations: Univ Naples Federico II, Dipartimento Matemat & Applicaz, I-80126 Naples, Italy; Univ Molise, Dipartimento STAT, I-86090 Pesche, IS, Italy

A significant number of recent applications require the numerical solution of large systems of Abel-Volterra integral equations. Here we propose a parallel algorithm to numerically solve a class of these systems, designed for a distributed-memory MIMD architecture. To achieve good efficiency, we employ a fully parallel and fast-convergent waveform relaxation (WR) method and evaluate the lag term using FFT techniques. To accelerate the convergence of the WR method and to best exploit the parallel architecture, we develop special strategies. The performance of the resulting code, NSWR4, is illustrated on some examples. (c) 2008 Elsevier B.V. All rights reserved.

Keywords: 45D05; 45E10; 45F15; 65R20; 65T50; 65Y05; Volterra–Abel integral equations; Waveform relaxation methods; Fractional linear methods; Fast Fourier transform; parallel algorithm

A parallel algorithm for accurate dot product

Parallel Computing, 2008, Vol. 34, No. 6-8, pp. 392-410

Authors: Yamanaka, N.; Ogita, T.; Rump, S. M.; Oishi, S.

Affiliations: Waseda Univ, Grad Sch Sci & Engn, Tokyo 1698555, Japan; Tokyo Womans Christian Univ, Dept Math, Tokyo 1678555, Japan; Waseda Univ, Fac Sci & Engn, Tokyo 1698555, Japan; Hamburg Univ Technol, Inst Reliable Comp, D-21071 Hamburg, Germany

Parallel algorithms for accurate summation and dot product are proposed. They are parallelized versions of the fast and accurate algorithms for calculating sums and dot products using error-free transformations, recently proposed by Ogita et al. [T. Ogita, S.M. Rump, S. Oishi, Accurate sum and dot product, SIAM J. Sci. Comput. 26 (6) (2005) 1955-1988]. Ogita et al. have shown that their algorithms are fast in terms of measured computing time. However, because of the strong data dependence in their algorithms, they are difficult to parallelize. Like those algorithms, the parallel algorithms proposed in this paper are designed to achieve results as if computed in K-fold working precision, while retaining their speed. Numerical results are presented showing the performance of the proposed parallel algorithm for calculating dot products. (C) 2008 Elsevier B.V. All rights reserved.
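To make "error-free transformations" concrete, here is a minimal sketch of the sequential Dot2 scheme from the cited Ogita, Rump, and Oishi paper, which gives a result as if computed in twofold working precision (the K = 2 case). It is not the parallel algorithm proposed in this record. It assumes IEEE 754 double precision (which Python floats use on common platforms) and no overflow or underflow; the Python function names are chosen for illustration.

```python
def two_sum(a, b):
    """Return (x, y) with x = fl(a + b) and a + b = x + y exactly."""
    x = a + b
    z = x - a
    y = (a - (x - z)) + (b - z)
    return x, y


def split(a):
    """Veltkamp split of a double into two non-overlapping halves."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    lo = a - hi
    return hi, lo


def two_product(a, b):
    """Return (x, y) with x = fl(a * b) and a * b = x + y exactly."""
    x = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    y = a_lo * b_lo - (((x - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return x, y


def dot2(xs, ys):
    """Dot product accumulated with compensation for rounding errors."""
    p = 0.0  # running floating-point result
    s = 0.0  # running sum of the captured rounding errors
    for a, b in zip(xs, ys):
        h, r = two_product(a, b)  # r: rounding error of the product
        p, q = two_sum(p, h)      # q: rounding error of the addition
        s += q + r
    return p + s


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(sum(a * b for a, b in zip(xs, ys)))  # plain accumulation loses the 1.0
print(dot2(xs, ys))
```

The loop shows the data dependence the abstract refers to: each iteration needs the `p` produced by the previous one, so the accumulation cannot simply be split across processors without extra care in combining the partial results and their error terms.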

Keywords: parallel algorithm; accurate dot product; accurate summation; higher precision

Research on parallel computing model for Cubic-R architecture

4th International Conference on Multimedia Information Networking and Security (MINES)

Authors: Yu, Nan; Zheng, Shen

Affiliations: Wuhan Univ Technol, Sch Informat Engn, Wuhan, Hubei, Peoples R China; Natl Digital Switching Syst Engn & Technol R&D Ct, Zhengzhou, Henan, Peoples R China

ISBN: (print) 9780769548524; 9781467330930

Parallel computing models play a fundamental role in advanced computing. Based on a study of existing parallel computing models, this paper proposes a parallel computing model, Layer Forward Net, for the Cubic-R architecture, and describes the model's structure, parameters, and logic at an abstract level. Finally, for the typical N-body problem, the paper designs a parallel algorithm and analyzes its complexity. The comparison shows that the model has low computational complexity, grows layer by layer, and has other merits.

Keywords: Computing model; parallel algorithm; Layer Forward Net; N-Body model

A Parallel H.264 Encoder with CUDA: Mapping and Evaluation

IEEE 18th International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Wu, Nan; Wen, Mei; Su, Huayou; Ren, Ju; Zhang, Chunyuan

Affiliation: Natl Univ Def Technol, Comp Sch, Changsha, Hunan, Peoples R China

ISBN: (print) 9781467345651

Efficient mapping of a real-time HD video application to graphics hardware is challenging. Developers face the challenges of choosing the right parallelism model, balancing each thread's processing granularity across the massive computing resources on the GPU, and partitioning tasks between the CPU and GPU. The paper illustrates the mapping approaches through the case of an HD H.264 encoder based on the X264 reference code, and then evaluates it in depth on state-of-the-art CPUs and GPUs. We first split most of the computing task into Single-Instruction Multiple-Thread (SIMT) kernels, which are then chained into certain input/output data streams. We then implemented a complete H.264 encoder on the Compute Unified Device Architecture (CUDA) platform. Finally, we present methods for exploiting multi-level parallelism and memory efficiency when mapping H.264 code, which we use to increase the efficiency of execution on GPUs. Our experimental results show that efficient GPU computation, and in turn real-time encoding performance, are achieved with CUDA.

Keywords: GPU Programming; Video Processing; parallel algorithm; Real Time Encode; High-performance Media Computing

Parallel Search for Ramsey Grid Colorings

50th Annual Association-for-Computing-Machinery (ACM) Southeast Conference

Authors: Apon, Daniel; Li, Wing Ning

Affiliations: Univ Maryland, Dept Comp Sci, College Pk, MD 20742, USA; Univ Arkansas, Dept Comp Sci & Engn, Fayetteville, AR 72701, USA

ISBN: (print) 9781450312035

Monochromatic-square-free grid coloring is a challenging computational problem with connections to Ramsey-theoretic combinatorics and multiparty communication complexity for which no polynomial time algorithm is known. In this paper, we report on a parallel search for exact grid coloring solutions and its implementation on a large-scale cluster computer. We obtain the first-known 2-color solution for the 14 X 14 grid.
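The property being searched for can be checked with a short brute-force test. The sketch below assumes that a "monochromatic square" means four same-colored cells at the corners of an axis-aligned square; the abstract does not spell out the definition, and the function name and grid representation are illustrative. It only verifies a candidate coloring and does not perform the search itself.

```python
def has_monochromatic_square(grid):
    """Return True if four corners of some axis-aligned square share a color.

    grid is a list of equal-length rows; each cell holds a color label.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(n_rows):
        for c in range(n_cols):
            color = grid[r][c]
            # (r, c) is the top-left corner; size is the side offset.
            for size in range(1, min(n_rows - r, n_cols - c)):
                if (grid[r][c + size] == color
                        and grid[r + size][c] == color
                        and grid[r + size][c + size] == color):
                    return True
    return False


print(has_monochromatic_square([[0, 1], [1, 0]]))
print(has_monochromatic_square([[0, 1, 0], [1, 1, 0], [0, 0, 0]]))
```

Checking one coloring is cheap, cubic in the grid side; the difficulty the abstract describes lies in the number of candidate colorings, which is what makes a parallel search attractive.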

Keywords: grid coloring; Ramsey theory; parallel algorithm
