Search results - Inner Mongolia University Library

10th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)

Author: Wegrzyn, Agnieszka. Affiliation: Univ Zielona Gora, PL-65246 Zielona Gora, Poland

ISBN: (print) 078039402X

This paper presents a method for computing all deadlocks and traps in a Petri net. The method is based on the Thelen method [9] and was proposed in [10]. Computing all deadlocks and traps in Petri nets is very time consuming, so it is important to optimize the computation. A parallel computation method is proposed to reduce the running time. Experimental results for the presented method are also discussed.

Keywords: parallel algorithms
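The abstract does not restate what deadlocks and traps are. As a minimal sketch, assuming the usual definitions (a set of places S is a deadlock, also called a siphon, if every transition that outputs into S also takes input from S; S is a trap if every transition that takes input from S also outputs into S), a brute-force check could look as follows. The net encoding (`pre`, `post` dictionaries) and all function names are hypothetical, and the exhaustive enumeration is only illustrative; it is not the Thelen-based method of the paper.

```python
from itertools import combinations

def preset(places, post):
    """Transitions that have an output arc into some place of `places`."""
    return {t for t, outs in post.items() if outs & places}

def postset(places, pre):
    """Transitions that have an input arc from some place of `places`."""
    return {t for t, ins in pre.items() if ins & places}

def is_deadlock(places, pre, post):
    # Deadlock (siphon): preset(S) is a subset of postset(S).
    return preset(places, post) <= postset(places, pre)

def is_trap(places, pre, post):
    # Trap: postset(S) is a subset of preset(S).
    return postset(places, pre) <= preset(places, post)

def enumerate_sets(all_places, pre, post, predicate):
    """Exhaustively test every non-empty subset of places (exponential cost)."""
    found = []
    for size in range(1, len(all_places) + 1):
        for subset in combinations(sorted(all_places), size):
            if predicate(set(subset), pre, post):
                found.append(frozenset(subset))
    return found

# Hypothetical example net: p1 -> t1 -> p2 -> t2 -> p1, plus p2 -> t3 -> p3.
pre = {"t1": {"p1"}, "t2": {"p2"}, "t3": {"p2"}}    # input places of each transition
post = {"t1": {"p2"}, "t2": {"p1"}, "t3": {"p3"}}   # output places of each transition
places = {"p1", "p2", "p3"}

deadlocks = enumerate_sets(places, pre, post, is_deadlock)
traps = enumerate_sets(places, pre, post, is_trap)
```

The exponential number of subsets in this naive enumeration is one way to see why such computations are time consuming and why parallelization is attractive.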

Parallel Computation of RBF Kernels for Support Vector Classifiers

5th SIAM International Conference on Data Mining

Authors: Qiu, Shibin; Lane, Terran. Affiliation: Univ New Mexico, Dept Comp Sci, Albuquerque, NM 87131, USA

ISBN: (print) 9780898715934

While kernel support vector machines are powerful classification algorithms, their computational overhead can be significant, especially for large and high-dimensional data sets. A recent biomedical dataset, for instance, could take as long as 3 weeks to compute its RBF kernel matrix on a modern, single-processor workstation. In this paper, we develop methods for high-performance parallel computation of kernel matrices. There are two key components to a parallel implementation: distribution of the computation across nodes and communication to combine the results. To address the first, we employ a dimension-wise data partition that yields efficient computation and low communication overhead during the initial phase. This partition provides dramatic speedups on large and high-dimensional data, applies to a wide variety of kernel functions, and is an exact computation, producing the same kernel matrix as its sequential implementation. To address communication needs during the second phase, we introduce an approximation specific to the Gaussian RBF kernel that yields sparse partial kernel matrices and, thus, efficient communication. We analyze the approximation error of this method, demonstrating that it falls off exponentially with N, the parameter of the approximation. We also examine the positive definiteness of the approximation with respect to Mercer's condition and show that (a) in the limit of N our approximation becomes positive definite for any data set and (b) for a fixed data set, there exists a finite N yielding a positive definite kernel matrix. We also give a simple iterative method for selecting N to yield a positive definite kernel matrix on any fixed data set. In practice, we find that positive definiteness is achieved on all of the data sets we examine with very small N (2-5). Finally, we test the empirical performance of our two methods on a variety of large, real-world data sets, demonstrating large computational speedups with little or no impact on

Keywords: parallel algorithms; support vector machines; kernel approximation; SVM performance
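The claim that a dimension-wise partition is exact can be checked directly: the squared Euclidean distance is a sum over coordinates, so partial distance matrices computed on disjoint column subsets add up to the full one. The sketch below simulates the "nodes" with a plain loop; the function names, `gamma`, and `n_parts` are assumptions for illustration, and the sparse approximation used for the communication phase is not shown.

```python
import math

def split_columns(d, n_parts):
    """Split range(d) into n_parts contiguous, near-equal chunks."""
    base, extra = divmod(d, n_parts)
    chunks, start = [], 0
    for p in range(n_parts):
        size = base + (1 if p < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def partial_sq_dists(X, cols):
    """Squared-distance contribution of the coordinate subset `cols`."""
    n = len(X)
    return [[sum((X[i][c] - X[j][c]) ** 2 for c in cols) for j in range(n)]
            for i in range(n)]

def rbf_kernel_dimensionwise(X, gamma, n_parts):
    """RBF kernel matrix assembled from per-partition partial distances."""
    n, d = len(X), len(X[0])
    total = [[0.0] * n for _ in range(n)]
    for cols in split_columns(d, n_parts):      # one chunk per simulated node
        part = partial_sq_dists(X, cols)
        for i in range(n):
            for j in range(n):
                total[i][j] += part[i][j]       # the "combine" step
    return [[math.exp(-gamma * total[i][j]) for j in range(n)] for i in range(n)]

def rbf_kernel_sequential(X, gamma):
    """Reference: kernel matrix computed from full-dimensional distances."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]
```

Up to floating-point summation order, `rbf_kernel_dimensionwise(X, gamma, n_parts)` agrees with `rbf_kernel_sequential(X, gamma)` for any number of partitions.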

Communication optimization and auto load balancing in parallel OSEM algorithm for fully 3-D SPECT reconstruction

Nuclear Science Symposium/Medical Imaging Conference

Authors: Ma Tianyu; Zhou Rong; Jin Yongjie

ISBN: (print) 0780392213

To improve the computation speed of the ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, a parallelization scheme for the OSEM reconstruction algorithm was implemented on an experimental Beowulf-type cluster, and the factors affecting parallel efficiency were investigated. Two approaches were employed to improve the efficiency: (1) the communication cost was minimized by overlapping communication with computation and (2) the idle time of processes was reduced by auto load balancing. The performance of the optimized parallel algorithm was evaluated in terms of computation time, speedup factor and parallel efficiency. Improvements were observed after optimization: the efficiency rose from 83.86% to 92.07% in fully 3-D 128 x 128 x 128 SPECT reconstruction.

Keywords: load management; image reconstruction; concurrent computing; clustering algorithms; single photon emission computed tomography; reconstruction algorithms; parallel algorithms; optical computing; positron emission tomography; three dimensional displays

Parallel blocked algorithm for solving the algebraic path problem on a matrix processor

1st International Conference on High Performance Computing and Communications (HPCC 2005)

Authors: Takahashi, A; Sedukhin, S. Affiliation: Univ Aizu, Grad Sch Comp Sci & Engn, Fukushima 9658580, Japan

ISBN: (print) 3540290311

This paper presents a parallel blocked algorithm for the algebraic path problem (APP). It is known that the complexity of the APP is the same as that of classical matrix-matrix multiplication; however, solving the APP takes much more running time because its unique data dependencies drastically limit data reuse. We examine a parallel implementation of a blocked algorithm for the APP on the one-chip Intrinsity FastMATH adaptive processor, which consists of a scalar MIPS processor extended with a SIMD matrix coprocessor. The matrix coprocessor supports native matrix instructions on an array of 4 x 4 processing elements. Implementing with matrix instructions requires us to express algorithms in terms of matrix-matrix operations. Conventional vectorization for SIMD vector processing deals with only the innermost loop; however, on the FastMATH processor, we need to vectorize two or three nested loops in order to convert the loops to one equivalent matrix operation. Our experimental results show a peak performance of 9.27 GOPS and high usage rates of matrix instructions for solving the APP. Findings from our experimental results indicate that the SIMD matrix extension to a (super)scalar processor would be very useful for the fast solution of many matrix-formulated problems.

Keywords: parallel algorithms
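To make the "blocked" idea concrete, here is a minimal sketch for one well-known instance of the APP, all-pairs shortest paths over the (min, +) semiring, written as a tiled Floyd-Warshall iteration. It is an illustration under that assumption only: the function names and the block size `bs` are hypothetical, and nothing here reflects the FastMATH matrix instructions used in the paper. It assumes a zero diagonal and no negative cycles.

```python
INF = float("inf")

def update_tile(D, ib, jb, kb, bs):
    """Relax tile (ib, jb) through the pivots of block kb: a (min, +) multiply-accumulate."""
    n = len(D)
    for k in range(kb * bs, min((kb + 1) * bs, n)):
        for i in range(ib * bs, min((ib + 1) * bs, n)):
            for j in range(jb * bs, min((jb + 1) * bs, n)):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]

def blocked_shortest_paths(W, bs):
    """Tiled Floyd-Warshall with three phases per pivot block."""
    n = len(W)
    D = [row[:] for row in W]
    nb = (n + bs - 1) // bs
    for kb in range(nb):
        update_tile(D, kb, kb, kb, bs)              # phase 1: diagonal tile
        for b in range(nb):
            if b != kb:
                update_tile(D, kb, b, kb, bs)       # phase 2: pivot row tiles
                update_tile(D, b, kb, kb, bs)       # phase 2: pivot column tiles
        for ib in range(nb):
            for jb in range(nb):
                if ib != kb and jb != kb:
                    update_tile(D, ib, jb, kb, bs)  # phase 3: independent tiles
    return D

def plain_shortest_paths(W):
    """Unblocked Floyd-Warshall, used as a reference."""
    n = len(W)
    D = [row[:] for row in W]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

W = [[0, 3, INF, 7, INF],
     [8, 0, 2, INF, INF],
     [5, INF, 0, 1, INF],
     [2, INF, INF, 0, 4],
     [INF, 1, INF, INF, 0]]
D = blocked_shortest_paths(W, 2)
```

The dependency chain (phase 1 before phase 2 before phase 3, for every pivot block) is the kind of constraint that limits data reuse compared with ordinary matrix multiplication; only the phase 3 tiles are mutually independent.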

Parallel algorithms for LQ optimal control of discrete-time periodic linear systems

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2002, vol. 62, no. 2, pp. 306-325

Authors: Benner, P; Byers, R; Mayo, R; Quintana-Ortí, ES; Hernández, V. Affiliations: Univ Bremen, Fachbereich Math & Informat 3, Zentrum Technomath, D-28334 Bremen, Germany; Univ Kansas, Dept Math, Lawrence, KS 66045, USA; Univ Jaume 1, Dept Ingn & Ciencia Comp, Castellon de La Plana 12080, Spain; Univ Politecn Valencia, Dept Sistemas Informat & Computac, E-46071 Valencia, Spain

This paper analyzes the performance of two parallel algorithms for solving the linear-quadratic optimal control problem arising in discrete-time periodic linear systems. The algorithms perform a sequence of orthogonal reordering transformations on formal matrix products associated with the periodic linear system and then employ the so-called matrix disk function to solve the resulting discrete-time periodic algebraic Riccati equations needed to determine the optimal periodic feedback. We parallelize these solvers using two different approaches, based on a coarse-grain and a medium-grain distribution of the computational load. The experimental results report the high performance and scalability of the parallel algorithms on a Beowulf cluster. (C) 2002 Elsevier Science (USA).

Keywords: parallel algorithms; linear systems

Finding large independent sets in graphs and hypergraphs

SIAM JOURNAL ON DISCRETE MATHEMATICS, 2005, vol. 18, no. 3, pp. 488-500

Authors: Shachnai, H; Srinivasan, A. Affiliations: Technion Israel Inst Technol, Dept Comp Sci, IL-32000 Haifa, Israel; Bell Labs, Lucent Technol, Murray Hill, NJ 07974, USA; Univ Maryland, Dept Comp Sci, College Pk, MD 20742, USA; Univ Maryland, Inst Adv Comp Studies, College Pk, MD 20742, USA

A basic problem in graphs and hypergraphs is that of finding a large independent set: one of guaranteed size. Understanding the parallel complexity of this and related independent set problems on hypergraphs is a fundamental open issue in parallel computation. Caro and Tuza [J. Graph Theory, 15 (1991), pp. 99-107] have shown a certain lower bound $\alpha_k(H)$ on the size of a maximum independent set in a given k-uniform hypergraph H and have also presented an efficient sequential algorithm to find an independent set of size $\alpha_k(H)$. They also show that $\alpha_k(H)$ is the size of the maximum independent set for various hypergraph families. Here, we show that an RNC algorithm due to Beame and Luby [in Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 1990, pp. 212-218] finds an independent set of expected size $\alpha_k(H)$ and also derandomize it for certain special cases. (An intriguing conjecture of Beame and Luby implies that understanding this algorithm better may yield an RNC algorithm to find a maximal independent set in hypergraphs, which is among the outstanding open questions in parallel computation.) We also present lower bounds on independent set size for nonuniform hypergraphs using this algorithm. For graphs, we get an NC algorithm to find independent sets of size essentially that guaranteed by the general (degree-sequence based) version of Turán's theorem.

Keywords: independent sets; parallel algorithms; randomized algorithms
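For the graph case, the degree-sequence version of Turán's theorem is commonly stated as the bound sum over vertices v of 1/(d(v)+1). A minimal sketch of the textbook randomized argument behind it: order the vertices uniformly at random and keep each vertex that precedes all of its neighbours; each vertex survives with probability 1/(d(v)+1), so the expected size equals the bound. The adjacency-dictionary encoding and function names are assumptions, and this is neither the Beame-Luby hypergraph algorithm nor the NC algorithm of the paper.

```python
import random
from fractions import Fraction

def degree_sequence_bound(adj):
    """Sum of 1/(d(v)+1) over all vertices, as an exact fraction."""
    return sum(Fraction(1, len(nbrs) + 1) for nbrs in adj.values())

def random_order_independent_set(adj, rng):
    """Keep every vertex that comes before all its neighbours in a random order."""
    order = list(adj)
    rng.shuffle(order)
    rank = {v: i for i, v in enumerate(order)}
    return {v for v in adj if all(rank[v] < rank[u] for u in adj[v])}

def is_independent(adj, S):
    """True if no two vertices of S are adjacent."""
    return all(u not in S for v in S for u in adj[v])

# Path on four vertices: 0 - 1 - 2 - 3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
bound = degree_sequence_bound(adj)
sample = random_order_independent_set(adj, random.Random(0))
```

Each vertex decides membership from its own rank and its neighbours' ranks only, which is why schemes of this kind are natural starting points for parallel algorithms.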

Parallel algorithms for particles-turbulence two-way interaction direct numerical simulation

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2002, vol. 62, no. 1, pp. 38-60

Authors: Ling, W; Liu, J; Chung, JN; Crowe, CT. Affiliations: Lucent Technol Inc, Whippany, NJ 07981, USA; Univ Florida, Dept Comp Sci & Informat Engn, Gainesville, FL 32611, USA; Univ Florida, Dept Mech Engn, Gainesville, FL 32611, USA; Washington State Univ, Sch Mech & Mat Engn, Pullman, WA 99164, USA

Understanding the demixing effect on the dispersion of particles by large-scale turbulence is very important in practical applications. Using pseudospectral and Lagrangian approaches, we have simulated a three-dimensional particle-laden mixing layer under one-way coupling effect. However, the computer resource required to simulate such a two-phase flow with high Reynolds number and two-way momentum coupling effect exceeds the limit of the current single processor. In this paper, the computation of particles and the two-way momentum coupling terms is partitioned in the span-wise direction because particles are distributed most evenly in this direction. The computation of the three-dimensional flow field is first partitioned into three groups of processors because the computation is largely independent across the three spatial dimensions. In each group, the domain is then partitioned using two different schemes based on the property of the fast Fourier transformation. The first one, the master-slave scheme, is employed for Algorithm MS due to its simplicity and its overlapping of communication and computation. The second one, the transpose approach, is used for Algorithm TP in order to partition all of the flow field computation. An analysis shows that, compared to Algorithm MS, Algorithm TP can also reduce the amount of communication work by nearly half. Experiments show that Algorithm MS has obtained a speedup of 4.3 using 9 HP workstations and a speedup of 6.4 using 15 nodes of IBM SP-2 for a problem size on the order of 64^3, and Algorithm TP has achieved speedups 44% higher than Algorithm MS. (C) 2001 Elsevier Science.

Keywords: parallel algorithms; simulation methods & models
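The reported speedups can be put on a common scale with the usual definition of parallel efficiency, speedup divided by processor count. The short sketch below applies that definition to the two Algorithm MS figures quoted in the abstract; the function name is hypothetical, and the abstract itself does not report efficiencies.

```python
def parallel_efficiency(speedup, n_procs):
    """Parallel efficiency under the usual definition: speedup / processor count."""
    return speedup / n_procs

# Algorithm MS figures quoted in the abstract: (speedup, processors)
ms_hp = parallel_efficiency(4.3, 9)     # 9 HP workstations
ms_sp2 = parallel_efficiency(6.4, 15)   # 15 IBM SP-2 nodes

print(f"HP workstations: {ms_hp:.1%}")
print(f"IBM SP-2 nodes:  {ms_sp2:.1%}")
```

Under this definition both configurations run at a little under half of ideal linear speedup.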

Parallel algorithms for robust broadband MVDR beamforming

JOURNAL OF COMPUTATIONAL ACOUSTICS, 2002, vol. 10, no. 1, pp. 69-96

Authors: Sinha, P; George, AD; Kim, K. Affiliation: Univ Florida, Dept Elect & Comp Engn, High Performance Comp & Simulat Res Lab (HCS), Gainesville, FL 32611, USA

Rapid advancements in adaptive sonar beamforming algorithms have greatly increased the computation and communication demands on beamforming arrays, particularly for applications that require in-array autonomous operation. By coupling each transducer node in a distributed array with a microprocessor, and networking them together, embedded parallel processing for adaptive beamformers can significantly reduce execution time, power consumption and cost, and increase scalability and dependability. In this paper, the basic narrowband Minimum Variance Distortionless Response (MVDR) beamformer is enhanced by incorporating broadband processing, a technique to enhance the robustness of the algorithm, and speedup of the matrix inversion task using sequential regression. Using this Robust Broadband MVDR (RB-MVDR) algorithm as a sequential baseline, two novel parallel algorithms are developed and analyzed. Performance results are included, among them execution time, scaled speedup, parallel efficiency, result latency and memory utilization. The testbed used is a distributed system comprised of a cluster of personal computers connected by a conventional network.

Keywords: parallel algorithms; robust control
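As background for the narrowband baseline, the standard MVDR weight vector is w = R^{-1} a / (a^H R^{-1} a), where R is the sensor covariance matrix and a the steering vector for the look direction. The sketch below is a minimal illustration under those assumptions: it uses a direct linear solve rather than the sequential-regression inversion mentioned in the abstract, omits the broadband and robustness extensions, and all names and test values are hypothetical.

```python
import cmath

def solve(A, b):
    """Solve A x = b for complex A by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [[complex(v) for v in row] + [complex(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def steering_vector(n_sensors, phase_step):
    """Uniform linear array: inter-sensor phase shift `phase_step` (radians)."""
    return [cmath.exp(1j * phase_step * m) for m in range(n_sensors)]

def mvdr_weights(R, a):
    """w = R^{-1} a / (a^H R^{-1} a)."""
    u = solve(R, a)                                            # R^{-1} a
    denom = sum(ai.conjugate() * ui for ai, ui in zip(a, u))   # a^H R^{-1} a
    return [ui / denom for ui in u]

def response(w, v):
    """Beamformer output w^H v for an arriving wavefront v."""
    return sum(wi.conjugate() * vi for wi, vi in zip(w, v))

# Hypothetical scene: unit white noise plus one strong interferer.
n = 4
a = steering_vector(n, 0.3)          # look direction
v = steering_vector(n, 1.2)          # interferer direction
R = [[(1.0 if i == j else 0.0) + 10.0 * v[i] * v[j].conjugate()
      for j in range(n)] for i in range(n)]
w = mvdr_weights(R, a)
```

The distortionless constraint shows up as `response(w, a)` being 1, while `response(w, v)` is driven close to 0.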

Fast algorithms for the Bean critical-state model for superconductivity

NUMERICAL FUNCTIONAL ANALYSIS AND OPTIMIZATION, 2005, vol. 26, no. 2, pp. 177-192

Authors: Furati, KM; Siddiqi, AH. Affiliation: King Fahd Univ Petr & Minerals, Dept Math Sci, Dhahran 31261, Saudi Arabia

The Bean critical-state model describes the penetration of a magnetic field into type-II superconductors. Mathematically, it is a free boundary problem, and fast algorithms for its solution are needed in applied superconductivity. Existence and uniqueness of the solution, parallel algorithms, stability, and error estimation for this model are discussed.

Keywords: Bean model; parabolic variational and quasi-variational inequality; parallel algorithms; superconductivity

Time and Space Parallelization of the Navier-Stokes equations

COMPUTATIONAL & APPLIED MATHEMATICS, 2005, vol. 24, no. 3, pp. 417-438

Authors: Albarreal Nunez, Isidoro I.; Calzada Canalejo, M. Carmen; Cruz Soto, Jose Luis; Fernandez Cara, Enrique; Galo Sanchez, Jose R.; Marin Beltran, Mercedes. Affiliations: Univ Seville, Dept EDAN, E-41080 Seville, Spain; Univ Cordoba, Dept Informat & Analisis Numer, E-14071 Cordoba, Spain

In this paper, we are mainly concerned with a parallel algorithm (in time and space) for solving the incompressible Navier-Stokes problem. It relies on two main ideas: (a) a splitting of the main differential operator, which allows the most important difficulties (nonlinearity and incompressibility) to be treated independently, and (b) the approximation of the resulting stationary problems by a family of second-order one-dimensional linear systems. The same strategy can be applied to two-dimensional and three-dimensional problems and involves the same level of difficulty. It can also be useful for the solution of other, more complicated systems such as Boussinesq or turbulence models. The behavior of the method is illustrated with some numerical experiments.

Keywords: Navier-Stokes equations; numerical solution; parallel algorithms
