ISBN (print): 9789492859280
A dynamical system is a mathematical model frequently used to describe moving phenomena in nature. Such systems are usually defined by special systems of differential equations. Computer simulations based on these models require solving the equations, and various numerical methods are often used for this purpose. Simulations of many dynamical natural phenomena parallelize very well, and parallel computing significantly speeds up the calculations. Using parallel algorithms for this purpose requires synchronizing the threads of the process. A lack of synchronization can produce inaccurate or even erroneous results. On the other hand, synchronization can significantly slow down the computation. In addition, numerical methods are imprecise anyway, and the mathematical model itself describes the natural phenomenon only to a certain extent. Hence the idea of synchronizing threads not at every step of the algorithm, which raises the problem of optimizing how often synchronization is used. In the paper, this issue is investigated on several example simulations. The results confirm that full synchronization control is often not necessary.
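The idea of synchronizing only every few steps can be illustrated with a small sketch. It is not taken from the paper, whose example simulations are not described here. It assumes an explicit 1D heat-diffusion scheme in which each thread owns a slice of cells and refreshes the boundary values of its neighbours only every `sync_every` steps. The names `simulate`, `sync_every` and `alpha` are hypothetical.

```python
import threading

def simulate(n_cells=64, n_threads=4, n_steps=200, sync_every=1, alpha=0.25):
    """Explicit 1D heat diffusion with zero boundary values.

    Each thread owns a contiguous slice of cells and exchanges boundary
    (halo) values with its neighbours only every `sync_every` steps.
    Assumes n_cells is divisible by n_threads.
    """
    u = [0.0] * n_cells
    u[n_cells // 2] = 1.0                      # initial heat spike
    barrier = threading.Barrier(n_threads)
    chunk = n_cells // n_threads

    def worker(tid):
        lo, hi = tid * chunk, (tid + 1) * chunk
        local = u[lo:hi]                       # private copy of the owned cells
        left = right = 0.0                     # halo values, possibly stale
        for step in range(n_steps):
            if step % sync_every == 0:
                u[lo:hi] = local               # publish own cells
                barrier.wait()                 # all threads have published
                left = u[lo - 1] if lo > 0 else 0.0
                right = u[hi] if hi < n_cells else 0.0
                barrier.wait()                 # all threads have read their halos
            padded = [left] + local + [right]
            local = [
                padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
                for i in range(1, len(padded) - 1)
            ]
        u[lo:hi] = local                       # final publish

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return u

if __name__ == "__main__":
    reference = simulate(n_threads=1, sync_every=1)
    for k in (1, 2, 4, 8):
        result = simulate(n_threads=4, sync_every=k)
        error = max(abs(a - b) for a, b in zip(reference, result))
        print(f"sync_every={k}: max deviation from reference = {error:.3e}")
```

With `sync_every=1` the threaded run reproduces the single-threaded reference. Larger values reduce the number of barrier waits at the cost of a deviation that can be measured and weighed against the accuracy the model needs.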
VORO++ is a software library written in C++ for computing the Voronoi tessellation, a technique in computational geometry that is widely used for analyzing systems of particles. VORO++ was released in 2009 and is based on computing the Voronoi cell for each particle individually. Here, we take advantage of modern computer hardware, and extend the original serial version to allow for multithreaded computation of Voronoi cells via the OpenMP application programming interface. We test the performance of the code, and demonstrate that it can achieve parallel efficiencies greater than 95% in many cases. The multithreaded extension follows standard OpenMP programming paradigms, allowing it to be incorporated into other programs. We provide an example of this using the VoroTop software library, performing a multithreaded Voronoi cell topology analysis of up to 102.4 million *** summary
Program title: VORO++
CPC Library link to program files: https://doi.org/10.17632/tddc4w4zkk.1
Developer's repository link: https://github.com/chr1shr/voro
Licensing provisions: BSD 3-clause (with LBNL modification)
Programming language: C++
External routines/libraries: OpenMP
Nature of problem: Multithreaded computation of the Voronoi tessellation in two and three dimensions
Solution method: The VORO++ library is built around several C++ classes that can be incorporated into other programs. The two largest components are the container... classes that spatially sort input particles into a grid-based data structure, allowing for efficient searches of nearby particles, and the voronoicell... classes that represent a single Voronoi cell as an arbitrary convex polygon or polyhedron. The Voronoi cell for each particle is built by considering a sequence of plane cuts based on neighboring particles, after which many different statistics (e.g. volume, centroid, number of vertices) can be computed. Since each Voronoi cell is calculated individually, the Voronoi cells can be computed u
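The per-particle construction described above, in which a cell is carved out by a sequence of cuts and each cell is computed independently, can be sketched in two dimensions. This is an illustration only and does not use the VORO++ API. It assumes a rectangular bounding box, tests every other particle by brute force instead of using a grid search, and uses a Python thread pool in place of OpenMP, so it shows the independence of the cells rather than real speed-up. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def clip(poly, nx, ny, c):
    """Keep the part of the convex polygon `poly` where nx*x + ny*y <= c."""
    out = []
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        d1 = nx * x1 + ny * y1 - c
        d2 = nx * x2 + ny * y2 - c
        if d1 <= 0:
            out.append((x1, y1))               # start vertex is inside
        if (d1 < 0 < d2) or (d2 < 0 < d1):     # edge crosses the cutting line
            t = d1 / (d1 - d2)
            out.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return out

def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    pairs = zip(poly, poly[1:] + poly[:1])
    return 0.5 * abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs))

def cell_area(i, pts, box):
    """Area of the Voronoi cell of particle i, built by successive cuts."""
    x0, y0, x1, y1 = box
    poly = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
    px, py = pts[i]
    for j, (qx, qy) in enumerate(pts):
        if j != i:
            # Half-plane of points closer to p than to q (perpendicular bisector).
            c = (qx * qx + qy * qy - px * px - py * py) / 2.0
            poly = clip(poly, qx - px, qy - py, c)
    return polygon_area(poly)

def voronoi_areas(pts, box=(0.0, 0.0, 1.0, 1.0), workers=4):
    """Compute every cell independently; no cell depends on another's result."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: cell_area(i, pts, box), range(len(pts))))

if __name__ == "__main__":
    print(voronoi_areas([(0.25, 0.5), (0.75, 0.5)]))
```

Because the cells partition the box, the areas should sum to the box area, which gives a simple consistency check for any set of distinct points inside the box.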
ISBN (print): 9781467329255; 9781467329224
Weather Forecast Models (WFMs) are often implemented as sequential programs. This usually leads to longer execution times and higher consumption of computer resources and power, because WFMs involve heavy computational tasks that process large amounts of weather forecast data. These become performance problems for weather forecast companies. The companies have already tried using multi-core systems to overcome them, but this does not always work because of poor selection and implementation of programming strategies. To address these problems, a research project was conducted as a case study for the weather production company Weather2 Ltd. The case study attempted multi-threaded programming on multi-core systems as an alternative implementation strategy for Weather2's WFM, as a solution to the problems of its sequential programs. The results showed that this new strategy could significantly improve WFM performance by reducing execution time and using less computer resources and power. This paper presents the case study and its results.
ISBN (print): 9781728105543
We propose a method for parallel multi-view graph matrix completion for predicting ratings in recommender systems. The missing ratings are computed from both a similarity matrix and a rating matrix. The rating matrix is sparse, and some items might not have any rating information available. The similarity matrix can be calculated from different item attributes available on e-commerce websites. As the input matrix becomes large, the need for more computationally efficient matrix completion increases. The main contribution of this paper is to show a speed-up in calculating the missing ratings by using multi-threaded programming. Simulation results are based on a large input matrix and show a reduction in RMSE for cold-start prediction.
ISBN (digital): 9781510634497
ISBN (print): 9781510634497
The catenary is an important part of an electrified railway, and its geometric parameters are important indicators of the safe and stable operation of locomotives. As operating speeds increase, the requirements for high-accuracy, real-time detection of catenary geometric parameters become stricter. Existing systems suffer from long sampling intervals, low real-time performance, and sensitivity to light. To meet the practical need for dynamic measurement of catenary geometric parameters, a non-contact detection system based on machine vision was developed. First, a measurement model based on high-power line lasers and high-resolution area cameras was established to meet the application requirements. The measurement principle of the system was analyzed and the detailed formulas were derived. Second, image differencing, laser spot roundness analysis and other image processing algorithms were used to detect the laser points on the contact line quickly and accurately against a complex background. Based on this measurement model and these algorithms, the hardware and software platform of the system was built, and fast image acquisition and processing was achieved with multi-thread programming on a high-performance industrial computer, which solved the problems of long sampling intervals and low real-time performance during measurement. Real-time image storage and display, and saving of the detection results, were implemented in the software. Finally, a preliminary experiment was performed on the prototype, and the accuracy of the measurement results was analyzed. The experimental results showed that the system works stably and with high accuracy, meeting the practical application requirements.
ISBN (print): 9781728120805
The network attack graph is a powerful tool for analyzing network security, but generating a large-scale graph is non-trivial. The main challenge comes from the explosion of the network state space, which greatly increases time and storage costs. In this paper, three parallel algorithms are proposed to generate scalable attack graphs. An OpenMP-based implementation is used to test their performance. Compared with the serial algorithm, the best of the proposed algorithms provides a 10X speedup.
ISBN (print): 9781538616000
Cryptographic technology is the basic, core technology for guaranteeing network and information security, and it is imperative to standardize cryptographic algorithms. Testing cryptographic algorithms is the most important research topic in cryptographic standardization. Symmetric cryptographic algorithms are the most widely studied and applied, so standardizing them has practical value. In a test tool for symmetric cryptographic algorithms, the test method components are the most important part of the design. The test method components in this paper mainly include 20 kinds of random test methods. To shorten the test time and improve test efficiency, the random test methods of the component are also implemented with multi-thread programming. The experiment verifies that the test tool obtains correct test results and improves the flexibility of the test method.
The paper presents investigations on the performance of the finite element numerical integration algorithm for first order approximations on three processor architectures popular in scientific computing: classical x86_64 CPU, Intel Xeon Phi and NVIDIA Kepler GPU. We base the discussion on theoretical performance models and our own implementations, for which we perform a range of computational experiments. For the latter, we consider a unifying programming model and a portable OpenCL implementation for all architectures. Variations of the algorithm due to different problems solved and different element types are investigated, and several optimizations aimed at properly mapping the algorithm to computer architectures are demonstrated. The experimental results show varying levels of performance for different architectures, but indicate that the algorithm can be effectively ported to all of them. The conclusions indicate the factors that limit the performance for different problems and types of approximation, and the performance ranges that can be expected for FEM numerical integration on different processor architectures. (C) 2016 Elsevier B.V. All rights reserved.
CSP# is a formal modeling language that emphasizes the design of communication in concurrent systems. The PAT framework provides a model checking environment for the simulation and verification of CSP# models. Although the desired properties can be formally verified at the design level, it is not always straightforward to ensure that the system's implementation conforms to the behaviors of the formal design model. To avoid human error and enhance productivity, it would be beneficial to have tool support for automatically generating executable programs from their corresponding formal models. In this paper, we propose such a solution for translating verified CSP# models into C# programs in the PAT framework. We encoded the CSP# operators in a C# library, "***", where the event synchronization is based on the "Monitor" class in C#. The precondition and choice layers are built on top of the CSP event synchronization to support language-specific features. We further developed a code generation tool to automatically transform CSP# models into multi-threaded C# programs. We proved that the generated C# program and the original CSP# model are equivalent on the trace semantics. This equivalence guarantees that the verified properties of the CSP# models are preserved in the generated C# programs. Furthermore, based on the existing implementation of the choice operator, we improved the synchronization mechanism by pruning unnecessary communications among the choice operators. The experimental results showed that the improved mechanism notably outperforms the standard JCSP library.
ISBN (print): 9783662480960; 9783662480953
Large-scale network and graph analysis has received considerable attention recently. Graph mining techniques often involve an iterative algorithm, which can be implemented in a variety of ways. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling. We investigate the impact of different algorithm design choices. Using these design axes, we design and test a variety of PageRank implementations, finding that data-driven, push-based algorithms are able to achieve more than 28x the performance of standard PageRank implementations (e.g., those in GraphLab). The design choices affect both single-threaded performance and parallel scalability. The implementation lessons not only guide efficient implementations of many graph mining algorithms, but also provide a framework for designing new scalable algorithms.
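To make "data-driven, push-based" concrete, the following is a minimal sequential sketch rather than the paper's implementation. It assumes a residual-based formulation of non-normalized PageRank in which only vertices on a worklist are processed (data-driven) and each processed vertex pushes its residual to its out-neighbours (push-based). A topology-driven power iteration that touches every vertex in every round is included for comparison. The function names, the `eps` threshold and the FIFO worklist order are assumptions made for illustration.

```python
from collections import deque

def pagerank_push(out_edges, damping=0.85, eps=1e-10):
    """Data-driven, push-based PageRank (non-normalized, sequential sketch).

    out_edges[v] lists the out-neighbours of v. A vertex is re-activated
    only when its residual exceeds eps.
    """
    n = len(out_edges)
    rank = [0.0] * n
    residual = [1.0 - damping] * n
    work = deque(range(n))                     # initially every vertex is active
    queued = [True] * n
    while work:
        v = work.popleft()
        queued[v] = False
        r = residual[v]
        residual[v] = 0.0
        rank[v] += r                           # absorb the pending residual
        if out_edges[v]:
            share = damping * r / len(out_edges[v])
            for w in out_edges[v]:
                residual[w] += share           # push to out-neighbours
                if residual[w] > eps and not queued[w]:
                    queued[w] = True
                    work.append(w)
    return rank

def pagerank_power_iteration(out_edges, damping=0.85, iters=200):
    """Topology-driven baseline: every vertex is processed in every round."""
    n = len(out_edges)
    rank = [1.0] * n
    for _ in range(iters):
        new = [1.0 - damping] * n
        for u, outs in enumerate(out_edges):
            for w in outs:
                new[w] += damping * rank[u] / len(outs)
        rank = new
    return rank

if __name__ == "__main__":
    graph = [[1, 2], [2], [0], [0, 2]]         # small example without dangling vertices
    print(pagerank_push(graph))
    print(pagerank_power_iteration(graph))
```

On a graph without dangling vertices both functions approximate the same fixed point. The worklist version performs work only where residual remains, which is the property the work-activation axis refers to.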