Search Results - Inner Mongolia University Library

Modelling rotating stratified flows at laboratory-scale using spectrally-based DNS

OCEAN MODELLING, 2012, vol. 49-50, pp. 47-59

Authors: Winters, Kraig B.; de la Fuente, Alberto. Affiliations: Univ Calif San Diego Scripps Inst Oceanog Dept Aerosp & Mech Engn San Diego CA 92103 USA; Univ Chile Dept Ingn Civil Santiago Chile

We describe the use of spectrally-based numerical methods in process studies of rotating stratified fluid dynamics relevant to oceans, lakes and the atmosphere. The objective is to take advantage of the well-known numerical properties of methods based on expansions in terms of trigonometric functions in applications for which inhomogeneous boundary conditions and/or irregular domains are desired. The underlying mathematical idea is the exchange of inhomogeneity from boundary conditions to forcing terms. The fundamental techniques for handling inhomogeneity in boundary conditions, symmetry mismatches between body forces and dependent variables at boundaries and the imposition of boundary conditions on internal or immersed boundaries are described and illustrated using simple idealized examples. These techniques are then combined to illustrate how these methods can be applied to several examples of flows from laboratory experiments. (C) 2012 Elsevier Ltd. All rights reserved.

Keywords: Spectral methods; Rotating stratified flow; Parallel algorithm
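
The exchange of inhomogeneity from boundary conditions to forcing terms, named in the abstract above, can be illustrated on a one-dimensional model problem. The sketch below is not taken from the paper: the model equation u'' - u = f on [0, 1], the linear lifting function g, and the function name `solve_helmholtz_sine` are all assumptions chosen for illustration. Writing u = v + g, where g matches the boundary values, leaves v with homogeneous boundary conditions and the modified forcing f + g, so v can be expanded in sines.

```python
import numpy as np

def solve_helmholtz_sine(f, a, b, n_modes=200, n_quad=4001):
    """Solve u'' - u = f on [0, 1] with u(0) = a, u(1) = b using a sine series.

    Illustrative model problem only (an assumption, not the paper's equations).
    """
    x = np.linspace(0.0, 1.0, n_quad)
    g = a + (b - a) * x              # lifting function: carries the boundary data
    rhs = f(x) + g                   # v'' - v = f - (g'' - g) = f + g, since g'' = 0

    k = np.arange(1, n_modes + 1)
    S = np.sin(np.pi * np.outer(k, x))   # S[k-1, j] = sin(k * pi * x_j)

    # Trapezoid weights for the sine coefficients 2 * integral(rhs * sin(k pi x)).
    dx = x[1] - x[0]
    w = np.full(n_quad, dx)
    w[0] = w[-1] = dx / 2.0
    rhs_hat = 2.0 * (S @ (w * rhs))

    # Each sine mode is an eigenfunction: (d^2/dx^2 - 1) sin(k pi x) = -(k^2 pi^2 + 1) sin(k pi x).
    v_hat = rhs_hat / (-(np.pi * k) ** 2 - 1.0)
    v = v_hat @ S                    # v vanishes at both ends by construction
    return x, v + g                  # add the lifting back to recover u
```

For example, with f = 0, a = 1 and b = e, the exact solution is u(x) = exp(x), which the returned values can be compared against.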

Accelerated Network Coding with Dynamic Stream Decomposition on Graphics Processing Unit

COMPUTER JOURNAL, 2012, vol. 55, no. 1, pp. 21-34

Authors: Lee, Sangpil; Ro, Won W. Affiliation: Yonsei Univ Sch Elect & Elect Engn Seoul 120749 South Korea

Network coding, a well-known technique for optimizing data flow in wired and wireless network systems, has attracted considerable attention in various fields. However, decoding complexity becomes a major performance bottleneck in practical network systems; thus, several studies have sought to improve the decoding performance of network coding. Nevertheless, previously proposed parallel network coding algorithms have shown limited scalability and performance imbalance for different-sized transfer units and multiple streams. In this paper, we propose a new parallel decoding algorithm for network coding using a graphics processing unit (GPU). This algorithm can simultaneously process multiple incoming streams and can maintain its maximum decoding performance irrespective of the size and number of transfer units. Our experimental results show that the proposed algorithm achieves a decoding bandwidth of 682.2 Mbps on a system with a GeForce GTX 285 GPU and speed-ups of up to 26 times compared to the existing single-stream decoding procedure with a 128 x 128 coefficient matrix and different-sized data blocks.

Keywords: network coding; progressive decoding; general-purpose computation on GPU; parallel algorithm

Backward error analysis of the AllReduce algorithm for householder QR decomposition

JAPAN JOURNAL OF INDUSTRIAL AND APPLIED MATHEMATICS, 2012, vol. 29, no. 1, pp. 111-130

Authors: Mori, Daisuke; Yamamoto, Yusaku; Zhang, Shao-Liang. Affiliations: Kobe Univ Dept Computat Sci Nada Ku Kobe Hyogo 6578501 Japan; Nagoya Univ Dept Computat Sci & Engn Chikusa Ku Nagoya Aichi 4648603 Japan

The AllReduce algorithm is a promising new algorithm for parallelizing the Householder QR decomposition A = QR of a tall and skinny matrix. It divides the input matrix A vertically in a recursive manner, computes the QR decomposition of each submatrix independently, and merges the results to obtain the QR decomposition of A. While this algorithm has been shown to achieve excellent speedup in various parallel environments, its rounding error properties have not yet been elucidated. In this paper, we present a theoretical error analysis of the AllReduce algorithm. Specifically, we derive bounds for the backward error of A and for the deviation from orthogonality of the computed Q factor. Our analysis shows that both of these bounds are smaller than their counterparts for the conventional Householder QR algorithm. Moreover, the bounds decrease as the number of submatrices increases. These results are supported by numerical experiments. We therefore conclude that the AllReduce algorithm can be used as a reliable method of orthogonalization in parallel environments.

Keywords: AllReduce algorithm; QR decomposition; Householder transformation; Error analysis; parallel algorithm
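
The divide, factor, and merge structure described in the abstract above can be sketched with NumPy. This is a minimal serial illustration that returns only the R factor; the function name `allreduce_qr_r` and the `min_rows` parameter are hypothetical, and the two recursive calls stand in for what would be independent parallel tasks. It is not the authors' implementation and says nothing about their error bounds.

```python
import numpy as np

def allreduce_qr_r(A, min_rows=64):
    """R factor of a tall-skinny matrix via recursive row splitting.

    Hypothetical helper for illustration; runs serially.
    """
    m, n = A.shape
    # Leaf: the block is small enough that splitting again would leave
    # halves with fewer than max(min_rows, n) rows.
    if m < 2 * max(min_rows, n):
        return np.linalg.qr(A, mode="r")

    half = m // 2
    # Independent step: the two row blocks could be factored in parallel.
    r_top = allreduce_qr_r(A[:half], min_rows)
    r_bottom = allreduce_qr_r(A[half:], min_rows)

    # Merge step: stack the two small n-by-n factors and factor once more.
    return np.linalg.qr(np.vstack([r_top, r_bottom]), mode="r")
```

The result agrees with the R factor from a direct `np.linalg.qr(A, mode="r")` up to the signs of its rows, since the R factor of a full-rank matrix is unique only up to those signs.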

Non-iterative domain decomposition methods for a non-stationary Stokes-Darcy model with Beavers-Joseph interface condition

APPLIED MATHEMATICS AND COMPUTATION, 2012, vol. 219, no. 2, pp. 453-463

Authors: Feng, Wenqiang; He, Xiaoming; Wang, Zhu; Zhang, Xu. Affiliations: Missouri Univ Sci & Technol Dept Math & Stat Rolla MO 65409 USA; Virginia Tech Dept Math Blacksburg VA 24061 USA

To solve a non-stationary Stokes-Darcy model with the Beavers-Joseph interface condition, two non-iterative domain decomposition methods are proposed. At each time step, results from previous time steps are used to approximate the information on the interface and decouple the two physics. Both methods are parallel. Numerical results suggest that the first method has accuracy of order O(h³ + Δt). To improve accuracy and efficiency, the second method uses a three-step backward differentiation formula to achieve accuracy of order O(h³ + Δt³), which is illustrated by a numerical example. (C) 2012 Elsevier Inc. All rights reserved.

Keywords: Stokes-Darcy flow; Beavers-Joseph interface condition; Domain decomposition method; parallel algorithm; Finite elements
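
The third-order accuracy in time that the abstract above attributes to three-step backward differentiation can be observed on a scalar test equation. The sketch below is an assumption-laden illustration, not the paper's coupled Stokes-Darcy scheme: it applies the standard three-step backward differentiation formula (BDF3) to y' = lam * y, uses exact values for the three starting levels, and the function name `bdf3_linear` is hypothetical.

```python
import math

def bdf3_linear(lam, y0, t_end, n_steps):
    """Integrate y' = lam * y from 0 to t_end with the three-step BDF.

    Hypothetical helper for illustration; needs n_steps >= 2.
    """
    h = t_end / n_steps
    # Assumption: the three starting levels are taken from the exact solution.
    y = [y0 * math.exp(lam * h * k) for k in range(3)]
    for _ in range(3, n_steps + 1):
        # BDF3: (11/6) y_new - 3 y_n + (3/2) y_{n-1} - (1/3) y_{n-2} = h * lam * y_new
        history = 3.0 * y[-1] - 1.5 * y[-2] + y[-3] / 3.0
        y.append(history / (11.0 / 6.0 - h * lam))
    return y[-1]
```

Halving the step size should reduce the error at the final time by roughly a factor of eight, which is the signature of a third-order method.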

An enhanced ACO algorithm to select features for text categorization and its parallelization

EXPERT SYSTEMS WITH APPLICATIONS, 2012, vol. 39, no. 5, pp. 5861-5871

Authors: Meena, M. Janaki; Chandran, K. R.; Karthik, A.; Samuel, A. Vijay. Affiliations: PSG Coll Technol Dept CSE Coimbatore 641004 Tamil Nadu India; PSG Coll Technol Dept IT Coimbatore 641004 Tamil Nadu India

Feature selection is an indispensable preprocessing step for effective analysis of high-dimensional data. It removes irrelevant features, improves predictive accuracy, and increases the comprehensibility of models built by feature-sensitive classifiers. Finding an optimal feature subset in a very large domain is intractable, and many such feature selection problems have been shown to be NP-hard. Optimization algorithms are frequently designed for NP-hard problems to find nearly optimal solutions with a practical time complexity. This paper formulates text feature selection as a combinatorial problem and proposes an Ant Colony Optimization (ACO) algorithm to find a nearly optimal solution. It differs from the earlier algorithm by Aghdam et al. by including a heuristic function based on statistics and a local search. The algorithm aims to determine a solution that includes 'n' distinct features for each category. Optimization algorithms based on wrapper models show better results, but the processes involved are time-intensive. The availability of parallel architectures, such as clusters of machines connected through fast Ethernet, has increased interest in the parallelization of algorithms. The proposed ACO algorithm was parallelized and demonstrated on a cluster of up to six machines. Documents from the 20 Newsgroups benchmark dataset were used for experimentation. Features selected by the proposed algorithm were evaluated using a Naive Bayes classifier and compared with standard feature selection techniques. The performance of the classifier improved with the features selected by the enhanced ACO and local search. The classifier's error decreased over iterations, and the number of positive features increased with the number of iterations. (C) 2011 Elsevier Ltd. All rights reserved.

Keywords: Bag of Words; Metaheuristic algorithms; Ant Colony Optimization; Heuristic information; Local search; CHIR; χ²; parallel algorithm; MapReduce; Distributed environment

Parallel c-means algorithm for image segmentation on a reconfigurable mesh computer

PARALLEL COMPUTING, 2011, vol. 37, no. 4-5, pp. 230-243

Authors: Bouattane, Omar; Cherradi, Bouchaib; Youssfi, Mohamed; Bensalah, Mohamed O. Affiliations: ENSET Mohammadia Morocco; Fac Sci & Tech Mohammadia Morocco; Univ Mohamed Agdal Fac Sci Rabat Morocco

In this paper, we propose a parallel algorithm for data classification and apply it to Magnetic Resonance Image (MRI) segmentation. The classification method studied is the well-known c-means method. A parallel architecture is introduced in the classification domain to reduce the complexity of the corresponding algorithms, so that they can be used as a pre-processing procedure. The proposed algorithm is designed to be implemented on a parallel machine, the reconfigurable mesh computer (RMC). The image of size (m x n) to be processed must be stored on an RMC of the same size, one pixel per processing element (PE). (C) 2011 Elsevier B.V. All rights reserved.

Keywords: Image segmentation; Classification; MRI image; parallel algorithm; c-means

Signal Reconstruction in Multi-Windows Spline-Spaces Using the Dual System

IEEE SIGNAL PROCESSING LETTERS, 2012, vol. 19, no. 11, pp. 729-732

Author: Onchis, Darian M. Affiliation: Univ Vienna Fac Math A-1090 Vienna Austria

The letter presents a non-massive parallel procedure to compute the biorthogonal dual system used for signal reconstruction in the case of spline-type spaces with multiple generators. The algorithm is based on the properties of the projection operator and the invertibility of the Gramian in the case of a Riesz basis. We use a parallel approach in both time and frequency to compute the dual system obtained by translation and sampling of a finite number of atoms. Since spline-type spaces (also known as shift-invariant spaces) play a central role in many signal and image processing applications, fast computing methods are needed, especially in the multi-window case, where the computations are expensive in both execution time and memory. We test the implementation on car crash data.

Keywords: Biorthogonal dual system; car crash data; Gramian; multi-window system; parallel algorithm; projection operator; signal reconstruction; spline-type spaces; Riesz basis
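
The role of the Gramian's invertibility mentioned in the abstract above can be shown in a small finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the generators are the linearly independent columns of a matrix B (a finite stand-in for a Riesz basis), the function name `dual_system` is hypothetical, and nothing here reproduces the letter's parallel time-frequency procedure.

```python
import numpy as np

def dual_system(B):
    """Biorthogonal dual of the columns of B, computed through the Gramian.

    Hypothetical helper; assumes the columns of B are linearly independent.
    """
    G = B.conj().T @ B                       # Gramian of the generators
    # D = B G^{-1}, obtained by solving G X = B^H and taking the adjoint of X
    # (valid because G is Hermitian).
    return np.linalg.solve(G, B.conj().T).conj().T

def reconstruct(B, D, f):
    """Rebuild f from its inner products with the generators."""
    coeffs = B.conj().T @ f                  # <f, b_k> for each generator b_k
    return D @ coeffs                        # sum_k <f, b_k> d_k
```

The dual vectors satisfy D^H B = I, and any signal in the span of the generators is recovered exactly; a signal outside the span is mapped to its orthogonal projection onto that span.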

A parallel multi-unit resource deadlock detection algorithm with O(log₂(min(m, n))) overall run-time complexity

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2011, vol. 71, no. 7, pp. 938-954

Authors: Xiao, Xiang; Lee, Jaehwan John. Affiliation: IUPUI Purdue Sch Engn & Technol ECE Dept Indianapolis IN 46202 USA

Current MPSoCs typically consist of fewer than a dozen processing units. Future MPSoCs are likely to integrate many more. With this trend, dozens of applications can run on an MPSoC concurrently, and application deadlock on MPSoCs will become a severe problem. To address the application deadlock problem in current and future MPSoCs, this article proposes a parallel multi-unit resource deadlock detection algorithm, incorporating four contributions: (1) a classification of resource events that enables each category of events to be handled efficiently, (2) a parallel node hopping mechanism that explores the entire graph exponentially in parallel to obtain information about the reachable processes of every resource, (3) an innovative hardware implementation of the node hopping mechanism using bit-wise computations, and (4) proofs of correctness and run-time complexity of the proposed algorithm. Based on information about reachable processes as well as sink nodes in the graph, the proposed algorithm detects deadlock in O(1) run-time. Compared with the worst-case run-time of any previous algorithm, which employs a single scheme to handle all resource events, ours is considerably reduced to O(log₂(min(m, n))) when implemented in hardware, where m and n are the number of processes and resources, respectively. (C) 2011 Elsevier Inc. All rights reserved.

Keywords: Deadlock detection; Deadlock detection in hardware; Multi-unit resource systems; Chip multiprocessor; Graph traversing; Reachability computation; parallel algorithm; Digital logic design; RTOS; Real-time embedded systems
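
The idea of exploring a graph "exponentially" so that reachability is known after a logarithmic number of rounds, as the abstract above describes for node hopping, resembles the general path-doubling technique. The sketch below shows that general technique only, on a plain directed graph given as a boolean adjacency matrix; the function name `reachability_by_doubling` is hypothetical, and the code does not model the paper's resource events, multi-unit resources, hardware design, or deadlock criterion.

```python
import numpy as np

def reachability_by_doubling(adj):
    """Return (reach, rounds) for a directed graph given by a boolean matrix.

    Hypothetical helper. reach[i, j] is True when some path of length >= 1
    leads from node i to node j.
    """
    n = adj.shape[0]
    reach = adj.astype(bool)
    covered = 1          # reach currently accounts for paths of length <= covered
    rounds = 0
    while covered < n:
        as_int = reach.astype(np.int64)
        # Compose reach with itself: i reaches j if i reaches some k that reaches j.
        # Every (i, j) entry is independent, so one round is a single parallel step.
        reach = reach | ((as_int @ as_int) > 0)
        covered *= 2     # each round doubles the path length accounted for
        rounds += 1
    return reach, rounds
```

On a five-node chain 0 -> 1 -> 2 -> 3 -> 4, three rounds suffice, compared with the four single-edge hops a sequential traversal would need from node 0.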

A new parallel finite element algorithm for the stationary Navier-Stokes equations

FINITE ELEMENTS IN ANALYSIS AND DESIGN, 2011, vol. 47, no. 11, pp. 1262-1279

Authors: Shang, Yueqiang; He, Yinnian; Kim, Do Wan; Zhou, Xiaojun. Affiliations: Inha Univ Dept Math Inchon 402751 South Korea; Guizhou Normal Univ Sch Math & Comp Sci Guiyang 550001 Peoples R China; Xi An Jiao Tong Univ Fac Sci Xian 710049 Peoples R China

Based on two-grid discretization, a new parallel finite element algorithm for the stationary Navier-Stokes equations is proposed and analyzed. This algorithm first solves the Navier-Stokes equations on a coarse grid, and then corrects the resulting residual on a fine grid by solving local Navier-Stokes equations in parallel with homogeneous boundary conditions. An existing sequential Navier-Stokes solver can be used for each subdomain problem, so the proposed parallel algorithm can be implemented on top of existing sequential software. Error bounds for the approximate solution are estimated. Moreover, the efficiency of the algorithm is demonstrated by numerical simulations of the lid-driven cavity flow, the backward-facing step flow, and the flow past a circular cylinder. (C) 2011 Elsevier B.V. All rights reserved.

Keywords: Navier-Stokes equations; Finite element; parallel computing; parallel algorithm; Two-grid method; Domain decomposition

Improvement of Two-step Parallel Thinning Algorithm for Military Map Contours

2009 Chinese Control and Decision Conference

Authors: Xie Jianhua; Cai Na; Jing Yuanwei. Affiliation: Information Science and Engineering, Northeastern University, Shenyang 110004, China

A fast parallel thinning algorithm is analyzed in this *** improved algorithm with two-step method is proposed to thin the contour of military *** make programming easy, the deleting array is given and the operation speed of the algorithm is *** algorithm is programmed with Visual C++ 6.0 and a good result is obtained. There is no distorted skeleton or excessive erosion, and connectedness is preserved.

Keywords: Thinning; parallel algorithm; Skeleton; Connectedness
