ISBN:
(print) 9783642148217
This paper deals with the parallelization of a free-surface three-dimensional oceanographic model. The model is based on the full nonlinear "primitive" equations of the ocean. A generalized vertical coordinate system is applied for better resolution of the main features of the simulated basin. The numerical model is conservative. The numerical integration procedure is based on a time-splitting method with Robert-Asselin filtering. The model code is implemented to run on cluster computers. Parallelization is achieved using the domain decomposition method and standard MPI to ensure portability of the code.
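As a rough illustration of the Robert-Asselin filtering mentioned in this abstract, the sketch below applies the filter to leapfrog integration of a scalar oscillation equation, dx/dt = iωx. The test equation, the function name `leapfrog_ra`, and the parameter values (`omega`, `dt`, `nu`) are assumptions chosen for illustration; they are not taken from the paper, whose model integrates the full primitive equations with time splitting.

```python
def leapfrog_ra(omega=1.0, dt=0.1, steps=1000, nu=0.1):
    """Integrate dx/dt = i*omega*x with leapfrog plus a Robert-Asselin filter.

    nu is the (assumed) filter coefficient; nu = 0 gives unfiltered leapfrog.
    Returns the solution at the final time level.
    """
    def rhs(x):
        return 1j * omega * x

    x_prev = 1.0 + 0.0j                 # time level n-1 (initial condition)
    x_curr = x_prev + dt * rhs(x_prev)  # level n, started with one forward Euler step

    for _ in range(steps):
        # Leapfrog step: centred difference across two time levels.
        x_next = x_prev + 2.0 * dt * rhs(x_curr)
        # Robert-Asselin filter: pull level n toward the mean of its neighbours.
        x_filt = x_curr + nu * (x_prev - 2.0 * x_curr + x_next)
        # The filtered value becomes level n-1 for the next step.
        x_prev, x_curr = x_filt, x_next
    return x_curr
```

With `nu = 0` the scheme reduces to plain leapfrog. A positive `nu` damps the spurious computational mode of leapfrog, at the cost of slightly damping the physical mode as well.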
In this paper we introduce a software system for carrying out and visualizing computational experiments that study parallel algorithms for solving complicated computational problems in imitation mode on a single sequential computer. The user can "assemble" a cluster-type parallel computational system consisting of multiprocessor and multicore nodes connected by a network, set up the problem to be solved, run the parallel solving algorithm, and collect and analyze the results of the computational experiments. To estimate the execution time of a parallel method on the current hardware system, we use sophisticated models. For every implemented parallel method, we verified the theoretical estimates of the execution time by comparing the real execution time on the NNSU high-performance cluster with the time calculated using the model.
Asynchronous cellular automata (CA) with probabilistic transition rules (kinetic CA) are used to simulate CO catalytic oxidation on platinum-group metals. Because of the properties of the surface, a kinetic CA requires a huge cellular array and a long evolution. Modeling such processes in real time is therefore feasible only with the help of a supercomputer. This paper investigates a parallel implementation in which a kinetic CA is approximated by a block-synchronous CA.
Many important scientific and engineering problems require the computation of a small number of eigenvalues of large nonsymmetric matrices. The biorthogonal Lanczos method is one of the methods for solving this problem. In this paper, we introduce the s-step biorthogonal Lanczos method, which generates reduction matrices similar to those generated by the standard biorthogonal Lanczos method. The s-step generalization of the biorthogonal Lanczos method enhances parallel properties by forming s simultaneous search direction vectors. The s-step method minimizes the number of synchronization points, which in turn minimizes global communication compared with the standard biorthogonal Lanczos method.
A parallel computing technique for simulating the deposition of diamond-like carbon thin films by molecular dynamics is proposed. The Tersoff potential, a multi-body potential, is adopted to determine the inter-atomic forces. The deposition of carbon thin films on diamond and silicon substrates under different incident kinetic energies and different substrate temperatures is investigated. The simulations run on a multiprocessor workstation with 8 cores using MPICH2, an implementation of the Message Passing Interface. The results show that the percentage of deposited sp3 carbon atoms ranges from 6.1% to 34.8%, depending on the type of substrate, the incident kinetic energy, and the substrate temperature.
The solution of three-dimensional diffraction problems is considered. The problems are formulated as weakly singular integral equations of the first kind with a single unknown density. These equations are discretized by means of a special smoothing method for the integral operators. Numerical solutions of the systems of linear algebraic equations that approximate the integral equations of the diffraction problems are found using a variational iterative method and parallel computing technology. Results of numerical experiments are presented.
This paper continues the development of the information-statistical approach to the minimization of multiextremal functions under non-convex constraints. The proposed approach is called the index method. Solving a multidimensional problem is reduced to solving an equivalent one-dimensional problem. The dimension reduction is based on Peano curves, which allow a multidimensional hypercube to be mapped onto a segment of the real axis. We also use rotating Peano curves, which allow the algorithm to be parallelized effectively across hundreds of processors. Special attention is paid to a mixed local-global strategy for accelerating the convergence of the algorithm.
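The dimension-reduction idea in this abstract can be illustrated with a minimal sketch. It uses a discrete Hilbert curve, a Peano-type space-filling curve, as a stand-in for the curves used in the paper, and a brute-force scan in place of the index method. Both substitutions are assumptions made for illustration, as are the names `hilbert_d2xy` and `minimize_along_curve` and the unit-square search domain.

```python
def hilbert_d2xy(order, d):
    """Map a 1-D index d in [0, 4**order) to a cell (x, y) on a 2**order-wide grid."""
    n = 1 << order
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or reflect the current sub-square so consecutive cells stay adjacent.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def minimize_along_curve(f, order=6):
    """Scan f over the unit square through a single 1-D index (brute force)."""
    n = 1 << order
    best = None
    for d in range(n * n):
        gx, gy = hilbert_d2xy(order, d)
        px, py = (gx + 0.5) / n, (gy + 0.5) / n  # centre of the grid cell
        value = f(px, py)
        if best is None or value < best[0]:
            best = (value, px, py)
    return best  # (minimum value found, x, y)
```

Because the two-dimensional search is driven by one integer index, any one-dimensional search strategy could replace the exhaustive loop; the scan here only shows that the curve visits every cell of the grid.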
Heterogeneous multi-core architectures are the mainstream of processor design for high-end embedded systems. Although such architectures promise high performance and low power consumption, they raise the challenge of how to program such devices. This paper presents "Multi-core Software APIs" (MSA) to address this issue. MSA is a library-based framework built on an asynchronous remote procedure call (RPC) mechanism. Aimed at distributed-memory architectures, which are common in embedded systems, MSA supplies a function-offloading programming model. MSA consists of three modules, the RPC module, the message module, and the streaming module, which provide task offloading, data transmission, and streaming data transmission, respectively. Furthermore, this paper provides two case studies, pi calculation and stereo vision, to show how MSA is used to build multi-core applications.
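The function-offloading model can be sketched in a few lines, using the pi calculation named in this abstract as the workload. This is not MSA's API, which the abstract does not describe. It uses the thread pool from Python's standard `concurrent.futures` module as a stand-in for remote cores, and the names `partial_pi` and `offload_pi` are hypothetical. Each call is submitted asynchronously and returns a future at once; the caller collects the results later.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_pi(start, stop, n):
    """Midpoint-rule sum over slices [start, stop) of the integral of 4/(1+x^2) on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(start, stop):
        x = (i + 0.5) * h
        total += 4.0 / (1.0 + x * x)
    return total * h


def offload_pi(n=100_000, workers=4):
    """Offload chunks of the pi integral asynchronously, then gather the results."""
    chunk = n // workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = []
        for w in range(workers):
            start = w * chunk
            stop = n if w == workers - 1 else start + chunk
            # submit() returns immediately; the work runs in the background.
            futures.append(pool.submit(partial_pi, start, stop, n))
        # result() blocks until each offloaded call has finished.
        return sum(f.result() for f in futures)
```

Threads share memory and, in CPython, do not run this CPU-bound loop in parallel, so the sketch shows only the submit-then-collect control flow of asynchronous offloading, not a distributed-memory speedup.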
Resource discovery, the task of finding and retrieving distributed resources rapidly, is a critical issue in distributed computing systems. Most previously proposed strategies focus on developing keyword-search approaches that preserve system scalability. In this paper, we propose a cluster-based hybrid overlay that supports efficient keyword searching under a high churn rate. The cluster-based hybrid overlay groups nodes with the same attributes into unstructured attribute-groups, and then clusters attribute-groups with similar attributes into attribute-clusters. The proposed hybrid overlay can provide efficient multi-attribute and range-query searches with load balancing in large-scale P2P networks. Experimental results show that the proposed overlay performs well.
The ubiquity of many-core architectures challenges software developers to write scalable software. To parallelize data-intensive applications on a many-core platform, one has to consider both the hardware architecture and the software characteristics when writing parallel code. In this paper, we take a Motion JPEG decoder as an example data-intensive application and TILE64 as an example many-core platform. We parallelize the decoder with two different strategies and observe their impact on program performance and scalability. We design two algorithms, READ and WRITE, which differ in the direction of data movement between processor cores. Experimental results show that the READ algorithm outperforms the WRITE algorithm by 217% when decoding 1080p video on the TILE64 platform. This indicates that the arrangement of data flows in a data-intensive parallel program can have a huge impact on program performance and scalability on a many-core platform.