Search Results - Inner Mongolia University Library

AN OPTICAL MULTI-MESH HYPERCUBE - A SCALABLE OPTICAL INTERCONNECTION NETWORK FOR MASSIVELY-PARALLEL COMPUTING

JOURNAL OF LIGHTWAVE TECHNOLOGY, 1994, vol. 12, no. 4, pp. 704-716

Authors: LOURI, A; SUNG, H. Affiliation: UNIV ARIZONA, DEPT ELECT & COMP ENGN, TUCSON, AZ 85721

A new interconnection network for massively parallel computing is introduced. This network is called an Optical Multi-Mesh Hypercube (OMMH) network. The OMMH integrates positive features of both hypercube (small diameter, high connectivity, symmetry, simple control and routing, fault tolerance, etc.) and mesh (constant node degree and scalability) topologies and at the same time circumvents their limitations (e.g., the lack of scalability of hypercubes, and the large diameter of meshes). The OMMH can maintain a constant node degree regardless of the increase in the network size. In addition, the flexibility of the OMMH network makes it well suited for optical implementations. This paper presents the OMMH topology, analyzes its architectural properties and potentials for massively parallel computing, and compares it to the hypercube. Moreover, it also presents a three-dimensional optical design methodology based on free-space optics. The proposed optical implementation has totally space-invariant connection patterns at every node, which enables the OMMH to be highly amenable to optical implementation using simple and efficient large space-bandwidth product space-invariant optical elements.

Keywords: HYPERCUBE; INTERCONNECTION NETWORK; OPTICAL INTERCONNECT; PARALLEL COMPUTING; SCALABILITY; SPACE-INVARIANCE

Efficient construction of the medial axis for a CAD model using parallel computing

ENGINEERING WITH COMPUTERS, 2018, vol. 34, no. 3, pp. 413-429

Authors: Zhu, Housheng; Liu, Yusheng; Wang, Hongwei; Zhao, Jianjun. Affiliations: Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China; Zhejiang Univ, State Key Lab CAD&CG, Hangzhou 310027, Zhejiang, Peoples R China; Univ Portsmouth, Sch Engn, Portsmouth PO1 3DJ, Hants, England; HUST, Sch Mech Sci & Engn, Wuhan 430074, Hubei, Peoples R China

As a simplified representation of a geometric model, the medial axis (MA) has been used in a wide range of engineering applications. While obtaining the true MA of a complicated CAD model is known to be a difficult task, current research is predominantly focused on computing its approximate MA instead. To improve its quality, this work develops a novel and efficient method for obtaining a high-quality MA composed of MA faces for a CAD model. Specifically, an MA point is computed using a dual-normal-tracing algorithm for each sample point. This algorithm can be implemented through GPU-enabled parallel computing and be executed in an iterative manner until MA points have been found for all sample points. After the iteration is completed, the MA points generated are then converted into the resultant MA by evaluating the topological connectivities of their corresponding sample points. Finally, the resultant MA is converted into MA faces using the information of boundary CAD faces. The proposed method is evaluated by analyzing its complexity and robustness, discussing its applicability and testing its performance in a couple of computational experiments. As shown in the evaluation, this method is easy to implement through exploiting parallel computing and can support effective and high-quality MA generation for a CAD model.

Keywords: Medial axis; parallel computing; Mesh model; GPU; Dual-normal-tracing

Extending Parallel Computing with Constraint of Fixed Structure by Adjusting Graph

IETE JOURNAL OF RESEARCH, 2016, vol. 62, no. 4, pp. 453-467

Authors: Xiong, Huanliang; Zeng, Guosun; Ding, Chunling; Wu, Canghai; Wang, Wei. Affiliations: Tongji Univ, Dept Comp Sci & Technol, Shanghai, Peoples R China; Jiangxi Agr Univ, Software Coll, Nanchang, Peoples R China; Univ Wisconsin, Dept Elect & Comp Engn, 1415 Johnson Dr, Madison, WI 53706, USA

Increasing the number of computing nodes is a common approach to achieving higher performance in a parallel computing system. However, under the constraint of a fixed system architecture and a fixed algorithm structure, it is difficult to improve the performance of parallel computing merely by extending its scale. To realize such extension with a fixed structure, we analyze the key factors in the architecture and the parallel task that affect scalability, and then use weighted graphs to model both the architecture and the parallel task. For the case where the architecture graph and the parallel task graph are homogeneous, we propose a graph-similarity extension method; for the case where they are heterogeneous, we propose a critical-path-unchanged scaling method. Neither method changes the graph's structure; they only adjust the node weights and edge weights in the relevant graph. Furthermore, through mathematical derivation, some conclusions about the new scaling methods are drawn. Finally, to verify their effectiveness, simulation experiments are conducted on the SimGrid platform. The experimental results show that the proposed methods can realize iso-speed-efficiency extension and can guide practical extensions for parallel computing.

Keywords: Algorithm and machine; critical path; extending method; fixed structure; graph model; graph similarity; parallel computing

Numerical characterization of nonlinear dynamical systems using parallel computing: The role of GPUs approach

COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION, 2016, vol. 37, pp. 143-162

Authors: Fazanaro, Filipe I.; Soriano, Diogo C.; Suyama, Ricardo; Madrid, Marconi K.; de Oliveira, Jose Raimundo; Bravo Munoz, Ignacio; Attux, Romis. Affiliations: Univ Fed Abc, Ctr Engn Modelagem & Ciencias Sociais Aplicadas, Ave Estados 5001, BR-09210580 Santo Andre, SP, Brazil; Univ Estadual Campinas, Dept Comp Engn & Ind Automat DCA, Sch Elect & Comp Engn FEEC, Ave Albert Einstein 400, Cidade Univ Zeferino Vaz, BR-13083852 Sao Paulo, Brazil; Univ Estadual Campinas, DSE, Sch Elect & Comp Engn FEEC, Ave Albert Einstein 400, Cidade Univ Zeferino Vaz, BR-13083852 Sao Paulo, Brazil; Univ Alcala UAH, Dept Elect DEPECA, Carretera Madrid Barcelona Km 33-600, Madrid 28871, Spain

The characterization of nonlinear dynamical systems and their attractors in terms of invariant measures, basins of attractions and the structure of their vector fields usually outlines a task strongly related to the underlying computational cost. In this work, the practical aspects related to the use of parallel computing - especially the use of Graphics Processing Units (GPUs) and of the Compute Unified Device Architecture (CUDA) - are reviewed and discussed in the context of nonlinear dynamical systems characterization. In this work such characterization is performed by obtaining both local and global Lyapunov exponents for the classical forced Duffing oscillator. The local divergence measure was employed by the computation of the Lagrangian Coherent Structures (LCSs), revealing the general organization of the flow according to the obtained separatrices, while the global Lyapunov exponents were used to characterize the attractors obtained under one or more bifurcation parameters. These simulation sets also illustrate the required computation time and speedup gains provided by different parallel computing strategies, justifying the employment and the relevance of GPUs and CUDA in such extensive numerical approach. Finally, more than simply providing an overview supported by a representative set of simulations, this work also aims to be a unified introduction to the use of the mentioned parallel computing tools in the context of nonlinear dynamical systems, providing codes and examples to be executed in MATLAB and using the CUDA environment, something that is usually fragmented in different scientific communities and restricted to specialists on parallel computing strategies. (C) 2016 Elsevier B.V. All rights reserved.

Keywords: Chaos; CUDA; Duffing oscillator; GPU computing; Lagrangian Coherent Structures; Lyapunov bifurcation diagram; Lyapunov exponents; parallel computing; Parameter space

Numerical orbit integration based on Lie series with use of parallel computing techniques

ADVANCES IN SPACE RESEARCH, 2014, vol. 53, no. 1, pp. 77-89

Authors: Mai, Enrico; Geyer, Robin. Affiliations: Leibniz Univ Hannover, Inst Erdmessung, D-30167 Hannover, Germany; Tech Univ Dresden, Ctr Informat Serv & High Performance Comp ZIH, D-01187 Dresden, Germany

This article outlines necessary steps to perform numerical orbit integrations based on a Lie series approach. Its implementation requires an efficient evaluation of resulting series coefficients. As an example we treat the classical main problem in satellite orbit calculation (J2 only) and the case of a 4 x 4-gravity field. All calculations were performed in very high precision with up to 100 significant digits. In comparison to independent third party computations this approach led to superior results referring to the verifiable constancy of various integrals of motion. To achieve a performance similar to classical numerical integrations in terms of acceptable computing time, at least for non-Keplerian motion problems, we exploited parallel computing capabilities. For our examples, run times were improved by several orders of magnitude, depending on the actual chosen precision level (up to a factor of 50,000 in case of double precision). Here we present the mathematical framework of the proposed orbital integration scheme as well as the work flow for its application in a multi-core, parallel computing environment. (C) 2013 COSPAR. Published by Elsevier Ltd. All rights reserved.

Keywords: Numerical integration; Satellite orbits; Lie series; parallel computing; OpenMP

Parallel computing of multiobjective optimization of air bearing 71

71st Society of Tribologists and Lubrication Engineers Annual Meeting and Exhibition 2016

Authors: Chen, Hsin-Yi; Wang, Nenzi. Affiliation: Department of Mechanical Engineering, Chang Gung University, Taiwan

New multi-DSP parallel computing architecture for real-time image processing

Journal of Systems Engineering and Electronics, 2006, vol. 17, no. 4, pp. 883-889

Authors: Hu Junhong; Zhang Tianxu; Jiang Haoyang. Affiliations: Dept. of Electronics and Information Engineering, Central China Normal Univ., Wuhan 430079, P.R. China; Inst. for Pattern Recognition and Artificial Intelligence, Huazhong Univ. of Science and Technology, Wuhan 430074, P.R. China

The flexibility of traditional image processing systems is limited because those systems are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.

Keywords: parallel computing; image processing; real-time; computer architecture

Failure resilient heterogeneous parallel computing across multidomain clusters

INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2005, vol. 19, no. 2, pp. 143-155

Authors: Kurzyniec, D; Sunderam, V. Affiliation: Emory Univ, Dept Math & Comp Sci, Atlanta, GA 30322, USA

We propose lightweight middleware solutions that facilitate and simplify the execution of failure-resilient Message Passing Interface (MPI) programs across multidomain clusters. The system described in this paper leverages H2O, a distributed metacomputing framework, to route MPI message passing across heterogeneous aggregates located in different administrative or network domains. MPI communication is aided by a specially written H2O pluglet; messages that are destined for remote sites are intercepted and transparently forwarded to their final destinations. We demonstrate that the proposed technique is indeed effective in enabling communication by MPI programs across distinct clusters and across firewalls. Only marginally lowered performance was observed in our tests, and we believe the substantially increased functionality would compensate for this overhead in most situations. In addition to enabling multicluster communications, we note that with the increasing size and distribution of metacomputing environments, fault tolerance aspects become critically important. We argue that the fault tolerance model proposed by FT-MPI fits well in geographically distributed environments, even though its current implementation is confined to a single administrative domain. We describe extensions to overcome these limitations by combining FT-MPI with the H2O framework. Our holistic approach allows users to run fault-tolerant MPI programs on heterogeneous, geographically distributed shared machines, without sacrificing performance and with minimal involvement of resource providers.

Keywords: fault tolerance; grid-computing; parallel computing

An efficient radial basis functions mesh deformation with greedy algorithm based on recurrence Choleskey decomposition and parallel computing

JOURNAL OF COMPUTATIONAL PHYSICS, 2019, vol. 377, pp. 183-199

Authors: Fang, Hong; Hu, Yikun; Yu, Caihui; Tie, Ming; Liu, Jie; Gong, Chunye. Affiliations: Sci & Technol Space Phys Lab, Beijing 100076, Peoples R China; Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China; Natl Univ Def Technol, Comp Sch, Changsha 410073, Hunan, Peoples R China; Natl Univ Def Technol, Coll Aerosp Sci & Engn, Changsha 410073, Hunan, Peoples R China

The mesh deformation method based on radial basis functions (RBF) has many advantages and is widely used. The RBF-based mesh deformation method has two main steps: data reduction and displacement interpolation. The data reduction step includes solving for the interpolation weight coefficients and searching for the node with the maximum interpolation error. A data reduction scheme based on a greedy algorithm is used to select an optimum reduced set of surface mesh nodes. In this paper, a parallel mesh deformation method based on parallel data reduction and displacement interpolation is proposed. The proposed recurrence Choleskey decomposition method (RCDM) can decrease the computational cost of solving for the interpolation weight coefficients from $O(N_c^4)$ to $O(N_c^3)$, where $N_c$ denotes the number of support nodes. Parallel computing is used to accelerate both the search for the node with the maximum interpolation error and the displacement interpolation. The combination of parallel data reduction and parallel interpolation can greatly improve the efficiency of mesh deformation. Two typical deformation problems, the ONERA M6 and the DLR-F6 wing-body-nacelle-pylon configuration, are taken as test cases to validate the proposed approach, which achieves up to a 19.57-fold performance improvement. Finally, the aeroelastic response of the HIRENASD wing-body configuration is used to verify the efficiency and robustness of the proposed method. (C) 2018 Elsevier Inc. All rights reserved.

Keywords: Mesh deformation; Recurrence Choleskey decomposition method; Radial basis functions; Greedy algorithm; Data reduction; parallel computing

Cell-mapping orbit search for mission design at ocean worlds using parallel computing

JOURNAL OF THE ASTRONAUTICAL SCIENCES, 2021, vol. 68, no. 1, pp. 172-196

Authors: Koh, Dayung; Anderson, Rodney L.; Bermejo-Moreno, Ivan. Affiliations: CALTECH, Jet Prop Lab, Pasadena, CA 91125, USA; Univ Southern Calif, Aerosp & Mech Engn Dept, Los Angeles, CA 90007, USA

A cell-mapping approach is implemented and parallelized to analyze three-body problem orbits in the vicinity of icy moons (Europa and Enceladus). The cell-mapping method is developed for studying nonlinear dynamics with periodic motions. The method does not require previously known solutions as inputs, which is an essential requirement of continuation approaches, and does not impose symmetric constraints. As major strengths of the method, multiple-period periodic solutions and bifurcation studies can be easily performed. This method is especially applicable to a systematic periodic orbit search over a region of interest using an integration time of one period. The parallelized cell-mapping method facilitates a rapid understanding of the global dynamics.

Keywords: Cell-mapping; Periodic orbit; Ocean worlds; Icy moons; Europa; Enceladus; parallel computing; Three-body problem
