In this study, we use the explicit difference method, as well as a combination of the explicit difference method with the implicit method, to numerically solve the third-order Korteweg-de Vries (KDV) and modified Korteweg-de Vries (mKDV) equations. These methods are applied to obtain soliton solutions for several problems, and the results are compared with each other and with the exact solution of each problem. Because these problems are severely ill-posed, the waveform of the solitons and the input parameters of the methods are carefully examined as time increases. The L∞, L2 and mean-square errors of the solutions show that these methods can be applied to different problems and have very good accuracy. These methods are widely used in combination with other algorithms, but their very high execution time, especially for this category of problems, is a major limitation. In this paper, a parallel approach for implementing these methods is presented, and very good results are obtained compared with the sequential implementation. This work serves as a reference for further study of solutions to other KDV and mKDV models, as well as for solving their inverse problems.
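As a rough illustration of what an explicit difference scheme for the KDV equation can look like, the sketch below uses the classical Zabusky-Kruskal leapfrog discretization of u_t + 6 u u_x + u_xxx = 0 on a periodic grid. The abstract does not say which explicit scheme the authors use, so the scheme, the function names (`kdv_step`, `solve_kdv`, `soliton`) and all grid parameters are assumptions chosen for illustration only.

```python
import math


def soliton(x, t, c=1.0, x0=0.0):
    """Exact one-soliton solution of u_t + 6 u u_x + u_xxx = 0 with speed c."""
    return 0.5 * c / math.cosh(0.5 * math.sqrt(c) * (x - c * t - x0)) ** 2


def kdv_step(u_prev, u_curr, dt, dx):
    """One leapfrog (Zabusky-Kruskal) step on a periodic grid.

    u_prev holds time level n-1, u_curr holds level n; returns level n+1.
    """
    n = len(u_curr)
    u_next = [0.0] * n
    for j in range(n):
        # Negative indices wrap around in Python; positive ones wrap via modulo.
        um2, um1 = u_curr[j - 2], u_curr[j - 1]
        up1, up2 = u_curr[(j + 1) % n], u_curr[(j + 2) % n]
        nonlinear = (up1 + u_curr[j] + um1) * (up1 - um1)
        dispersive = up2 - 2.0 * up1 + 2.0 * um1 - um2
        u_next[j] = (u_prev[j]
                     - (2.0 * dt / dx) * nonlinear
                     - (dt / dx ** 3) * dispersive)
    return u_next


def solve_kdv(n=200, length=40.0, dt=0.002, steps=250, c=1.0):
    """Advance a single soliton; parameters are illustrative, not from the paper."""
    dx = length / n
    xs = [-length / 2 + j * dx for j in range(n)]
    u_prev = [soliton(x, 0.0, c) for x in xs]
    # Calling the leapfrog update with u_prev == u_curr and half the step
    # is algebraically a forward-Euler step of size dt, used only to start.
    u_curr = kdv_step(u_prev, u_prev, dt / 2.0, dx)
    for _ in range(steps - 1):
        u_prev, u_curr = u_curr, kdv_step(u_prev, u_curr, dt, dx)
    return xs, u_curr, steps * dt
```

The inner loop over `j` touches only the two previous time levels, which is the property that makes explicit schemes of this kind natural candidates for parallelization. The time step must stay well below roughly 0.38 dx³ for the leapfrog update to remain stable, which is one reason such solvers are expensive on fine grids.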
Computer engineering is evolving at an accelerated pace, with hardware advancing towards new chip multiprocessor (CMP) architectures and supporting software moving towards new programming and abstraction paradigms, in order to obtain the maximum efficiency from the hardware at a low cost. In this context, Tilera Corporation has developed a new CMP architecture with 64 cores (tiles) called Tile64, and has launched several Peripheral Component Interconnect Express (PCIe) cards to be used and monitored from a host Personal Computer (PC). These cards can execute parallel applications written in C/C++ and compiled with the Tile-GCC compiler. We have previously demonstrated the usefulness of the Tile64 architecture for bioinformatics [S. Galvez, D. Diaz, P. Hernandez, F.J. Esteban, J.A. Caballero, G. Dorado, Next-generation bioinformatics: using many-core processor architecture to develop a web service for sequence alignment, Bioinformatics, 26 (2010) 683-686]. We have chosen a bioinformatics algorithm to test this many-core Tile64 architecture because of the challenging needs of current bioinformatics: data-intensive workloads, space- and time-consuming requirements, and massive calculation. This algorithm, known as Needleman-Wunsch/Smith-Waterman (NW/SW), obtains an optimal sequence alignment in quadratic time and space, yet needs to be optimized to take full advantage of parallel computing. In this paper we redesign, implement and fine-tune this algorithm, introducing key optimizations and changes that take advantage of specific Tile64 characteristics: RISC architecture, each tile's local cache, length of the memory word, shared memory usage, RAM file system, inter-tile communication and job selection from a pool. The resulting algorithm, named MC64-NW/SW for multicore64 Needleman-Wunsch/Smith-Waterman, achieves a gain of approximately 1000% when compared with the same algorithm on an x86 multi-core architecture. As far as we know, our NW/SW implementation is the fastes
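For readers unfamiliar with the quadratic-cost dynamic programme mentioned above, here is a minimal sequential sketch of Needleman-Wunsch global alignment. It is a textbook version, not the MC64-NW/SW implementation; the function name and the scoring parameters (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two symbols
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Each cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent of each other; this is the usual starting point for parallelizing the table fill.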
We present the hierarchical matrix (H-matrix) technique combined with adaptive cross-approximation (ACA), applied to a three-dimensional (3D) elastostatic problem using the boundary element method (BEM). This approach is used in structural geology and geomechanics to evaluate the deformation and perturbed stress field associated with surfaces of displacement discontinuity. Such optimization significantly reduces (i) the time and memory needed to solve the system of equations and, more importantly, (ii) the time needed for post-processing at observation points where the deformation and the perturbed stress field are evaluated. Specifically, it is shown that the H-matrix structure used with the ACA clearly captures the kernel smoothness during the post-processing stage according to the field point positions, and optimizes the computation accordingly. Combined with parallelization on multi-core processors, this technique allows intensive computations to be done on personal desktop and laptop computers. Numerical simulations are presented, showing the advantages of such optimizations compared to the standard method. (C) 2009 Elsevier Ltd. All rights reserved.
A selective encryption algorithm is proposed to improve the efficiency of high efficiency video coding (HEVC) video encryption and to ensure the security of HEVC videos. The algorithm adopts the integer dynamic coupling tent mapping optimization model as the pseudo-random sequence generator, and uses multi-core parallelization as the sequence generation mechanism. The binstrings produced during context adaptive binary arithmetic coding are selected for encryption, which satisfies the video encryption requirements of an invariable binstream and a compatible format. Performance tests were performed on six types of standard videos with different resolutions. The results indicate that the encryption algorithm has a large key space and a strong encryption effect.
ISBN: 9781467325851; 9781467325837 (print)
In this paper, a two-phase filter for removing "salt and pepper" noise is proposed. In the first phase, an adaptive median filter is used to identify the set of noisy pixels; in the second phase, these pixels are restored according to a regularization method, which contains a data-fidelity term reflecting the impulse noise characteristics. The algorithm, which exhibits good performance both in denoising and in restoration, can be easily and effectively parallelized to exploit the full power of multi-core CPUs and GPGPUs; the proposed implementation, based on the FastFlow library, achieves both close-to-ideal speedup and very good wall-clock execution figures.
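The two-phase structure described above can be sketched as follows. Phase 1 is a simple adaptive median detector; phase 2 is deliberately replaced by a much simpler stand-in (the median of nearby pixels not flagged as noise) instead of the regularization method the abstract refers to. The function names, the `max_k` window limit and the list-of-lists image representation are all assumptions made for illustration.

```python
from statistics import median_low


def window(img, r, c, k):
    """Values in the (2k+1)x(2k+1) neighbourhood of (r, c), clipped at borders."""
    h, w = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(0, r - k), min(h, r + k + 1))
            for j in range(max(0, c - k), min(w, c + k + 1))]


def detect_noisy(img, max_k=3):
    """Phase 1: flag pixels that an adaptive median filter would replace."""
    noisy = set()
    for r in range(len(img)):
        for c in range(len(img[0])):
            for k in range(1, max_k + 1):
                vals = window(img, r, c, k)
                lo, med, hi = min(vals), median_low(vals), max(vals)
                if lo < med < hi:
                    # The median is not an extreme value, so it is trustworthy:
                    # the pixel is noise unless it lies strictly inside (lo, hi).
                    if not lo < img[r][c] < hi:
                        noisy.add((r, c))
                    break
            else:
                # Largest window reached: flag only if the pixel differs
                # from the last median computed.
                if img[r][c] != med:
                    noisy.add((r, c))
    return noisy


def restore(img, noisy, max_k=3):
    """Phase 2 stand-in: median of nearby clean pixels (not the paper's method)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r, c in noisy:
        for k in range(1, max_k + 1):
            good = [img[i][j]
                    for i in range(max(0, r - k), min(h, r + k + 1))
                    for j in range(max(0, c - k), min(w, c + k + 1))
                    if (i, j) not in noisy]
            if good:
                out[r][c] = median_low(good)
                break
    return out
```

Both phases process each pixel independently of the results for other pixels in the same phase, which is consistent with the abstract's remark that the filter parallelizes easily.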
ISBN: 9781510860186 (print)
Advances in Cyber-Physical Systems (CPS) are also affecting aeronautics. The growth of the cyber layer in aircraft demands higher throughput, and multi-core systems are consequently becoming a topic of interest. The development of parallel real-time systems for multi-core processors requires new approaches in model-based design and simulation-based verification. The Enhanced Ground Proximity Warning System (EGPWS) is a terrain awareness system that creates aural and visual warnings for the pilot to prevent Controlled Flight into Terrain (CFIT). This paper presents a multi-core parallelization workflow and a corresponding x-in-the-loop testing pipeline for model-based development of an EGPWS.
ISBN: 9783319642031; 9783319642024 (print)
We propose PASCAL, a parallel unified algorithmic framework for generalized N-body problems. PASCAL utilizes tree data structures and user-controlled pruning or approximations to reduce the asymptotic runtime complexity from linear in the number of data points to logarithmic. In PASCAL, domain scientists express their N-body problem in terms of application-specific operations, and PASCAL generates the pruning and approximation conditions automatically from this high-level specification. To evaluate PASCAL, we generate solutions for six problems chosen from various domains: k-nearest neighbors, range search, Euclidean minimum spanning tree, kernel density estimation, expectation maximization (EM), and Hausdorff distance. We show that applying domain-specific optimizations and parallelizations to the algorithms generated by PASCAL achieves 10x to 230x speedup compared to state-of-the-art libraries on a dual-socket Intel Xeon processor with 16 cores on real-world datasets. We also obtain a novel out-of-the-box asymptotically optimal algorithm for Hausdorff distance calculation and an improved algorithm for EM. This shows the impact and potential of PASCAL in rapidly extending to a larger class of problems that are yet to be explored.
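To make one of the six problems concrete, the sketch below computes the Hausdorff distance between two point sets with a plain double loop plus an early-exit pruning test. It is a baseline illustration of the problem and of what "pruning" means here, not the tree-based algorithm PASCAL generates; the function names are assumptions.

```python
import math


def directed_hausdorff(A, B):
    """max over a in A of the distance from a to its nearest point in B."""
    cmax = 0.0
    for a in A:
        cmin = math.inf
        for b in B:
            d = math.dist(a, b)
            if d < cmax:
                # a already has a neighbour closer than the current maximum,
                # so it cannot increase the result: prune the rest of B.
                cmin = d
                break
            cmin = min(cmin, d)
        cmax = max(cmax, cmin)
    return cmax


def hausdorff(A, B):
    """Symmetric Hausdorff distance between point sets A and B."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))
```

For example, `hausdorff([(0, 0), (1, 0)], [(0, 0), (1, 0), (4, 0)])` evaluates to `3.0`, because the point `(4, 0)` lies 3 units from its nearest neighbour in the first set. Tree-based frameworks apply the same kind of bound to whole groups of points at once instead of one pair at a time.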
ISBN: 9781479965410 (print)
Along with the rapid advancement of convergent devices, increased software complexity paired with a contrastingly shortened software product lifecycle has introduced new challenges, from which the need to transform legacy single-core based systems into multi-core systems has emerged. Unfortunately, existing software development processes lag in providing adequate support for multi-core parallelization, failing to keep up with the speed of advancement in multi-core based hardware systems. To address this gap, in our previous work we proposed a software development process to support the transition of existing single-core based software to a multi-core equivalent. We also introduced a tool, the Architectural Decision Supporter (ADS), to assist in the selection of appropriate multi-core architectural patterns and in the search for proper construction components. In this paper, we introduce a selection method for choosing the most desirable candidate among various multi-core architectural patterns implemented using ADS. The proposed method provides the means to combine the contextual knowledge of domain applications with the technical knowledge of individual architectural patterns for multi-core processing.