In cryptographic applications, operations on super long integers are used frequently. However, cryptographic algorithms generally run on a single-core CPU, where the related computation is executed serially. In this paper, we investigate how to parallelize the operations on super long integers in a multi-core computing environment. The significance of this study lies in the fact that, with the spread of multi-core computing devices and the growth of their computing power, the basic arithmetic of super long integers should run in parallel: the super long integers are split into blocks, the data blocks are processed on separate multi-core threads, the original serial execution is converted into multi-core parallel computation, and the multi-thread results are formatted and stored. Our experiments show that when the computation time exceeds the thread-scheduling time, the parallel algorithms execute faster; otherwise, the serial algorithms are better. On the whole, parallel algorithms can utilize the computing ability of multi-core hardware more efficiently.
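The blocking scheme described in this abstract can be sketched as follows. This is a minimal illustration of the idea (split into fixed-width blocks, process blocks on a thread pool, then merge), not the paper's implementation; note that in CPython the GIL limits real speedup, so a production version would use native threads or processes.

```python
from concurrent.futures import ThreadPoolExecutor

BITS = 32
MASK = (1 << BITS) - 1

def to_limbs(x, n):
    # Split a big integer into n base-2^32 blocks, least significant first.
    return [(x >> (BITS * i)) & MASK for i in range(n)]

def parallel_add(a, b):
    n = max(a.bit_length(), b.bit_length()) // BITS + 1
    la, lb = to_limbs(a, n), to_limbs(b, n)
    # Phase 1: limb-wise sums computed independently on a thread pool.
    with ThreadPoolExecutor() as pool:
        sums = list(pool.map(lambda i: la[i] + lb[i], range(n)))
    # Phase 2: a single sequential carry-propagation pass merges the blocks.
    carry, limbs = 0, []
    for s in sums:
        s += carry
        limbs.append(s & MASK)
        carry = s >> BITS
    limbs.append(carry)
    return sum(l << (BITS * i) for i, l in enumerate(limbs))
```

The sequential carry pass is the serial residue that the paper's trade-off (scheduling overhead versus block computation time) has to amortize.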
Various existing performance metrics within the parallel systems domain are analyzed. These include the different flavors of speedup, efficiency, and isoefficiency. Execution time still remains the most widely used metric. A new method to automatically estimate algorithmic cost is provided.
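The metrics surveyed here have simple standard definitions; a minimal sketch (textbook formulas, not the paper's estimation method):

```python
def speedup(t_serial, t_parallel):
    # Classical speedup: S(p) = T(1) / T(p).
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    # Efficiency: E(p) = S(p) / p, the fraction of ideal linear speedup.
    return speedup(t_serial, t_parallel) / p

# Hypothetical example: a job takes 120 s serially and 20 s on 8 cores.
s = speedup(120.0, 20.0)        # 6x speedup
e = efficiency(120.0, 20.0, 8)  # 75% efficiency
```

Isoefficiency then asks how fast the problem size must grow with p to hold E(p) constant.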
Simulation-based Inference (SBI) is a widely used set of algorithms for learning the parameters of complex scientific simulation models. While primarily run on CPUs in high-performance compute clusters, these algorithms have been shown to scale in performance when developed for massively parallel architectures such as GPUs. While parallelizing existing SBI algorithms provides performance gains, this might not be the most efficient way to utilize the achieved parallelism. This work proposes a new parallelism-aware adaptation of an existing SBI method, namely approximate Bayesian computation with Sequential Monte Carlo (ABC-SMC). The new adaptation is designed to utilize the parallelism not only for performance gains, but also for qualitative benefits in the learnt parameters. The key idea is to replace the notion of a single 'step-size' hyperparameter, which governs how the state space of parameters is explored during learning, with step sizes sampled from a tuned Beta distribution. This allows the new ABC-SMC algorithm to explore the state space of the parameters being learned more efficiently. We test the effectiveness of the proposed algorithm by learning parameters for an epidemiology model running on a Tesla T4 GPU. Compared to the parallelized state-of-the-art SBI algorithm, we obtain results of similar quality in ~100x fewer simulations and observe ~80x lower run-to-run variance across 10 independent trials.
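The key idea, per-particle step sizes drawn from a Beta distribution, can be sketched as a perturbation kernel. The hyperparameters `alpha`, `beta`, and `scale` below are hypothetical placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(particles, alpha=2.0, beta=5.0, scale=1.0):
    # Instead of one global step-size, draw an independent step-size for
    # each particle from a (tuned) Beta distribution, then use it as that
    # particle's Gaussian perturbation-kernel width.
    steps = scale * rng.beta(alpha, beta, size=len(particles))
    return particles + rng.normal(0.0, steps[:, None], size=particles.shape)

particles = rng.normal(size=(1000, 3))  # 1000 particles, 3 parameters each
moved = perturb(particles)
```

Because every particle gets its own width, some particles take cautious local steps while others make long exploratory jumps in the same SMC generation.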
ISBN:
(print) 9781479989386
Betweenness centrality is a metric that measures the relative importance of vertices within a graph. The computation of betweenness centrality is based on shortest paths, which requires O(n+m) space and O(nm) and O(nm + n² log n) time on unweighted and weighted graphs, respectively. It is time-consuming on large-scale graphs, which motivates us to resort to distributed computing and parallel algorithms. In this paper, we design a vertex-based parallel algorithm following the shortest-path approach (SPBC). Moreover, we propose a distributed algorithm based on message propagation (MPBC) to quantify the importance of vertices. MPBC takes into account the real situation of information diffusion in social networks. We implement our algorithms on GraphLab and evaluate them through comprehensive experiments. The results show that both SPBC and MPBC scale well with an increasing number of machines. SPBC on 2 machines outperforms the classical centralized algorithm by a factor of 1.59 in terms of running time. MPBC can handle graphs with tens of millions of vertices and edges within an acceptable time, where classical algorithms become infeasible.
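The shortest-path approach that SPBC parallelizes is Brandes' O(nm) algorithm for unweighted graphs; each source vertex's contribution is independent, which is what makes a vertex-based parallel split natural. A minimal sequential sketch (standard Brandes, not the paper's distributed code):

```python
from collections import deque

def betweenness(adj):
    # Brandes' algorithm on an unweighted graph given as {vertex: [neighbors]}.
    # Each outer iteration over a source s is independent, so SPBC-style
    # parallelization can assign sources to different workers.
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1     # shortest-path counts
        dist = {v: -1 for v in adj}; dist[s] = 0
        q = deque([s])
        while q:                                      # BFS from s
            v = q.popleft(); stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1; q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]; pred[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                                  # dependency accumulation
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

On an undirected graph each pair is counted in both directions; halve the scores if the conventional undirected normalization is wanted.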
This paper deals with a problem of optimal control of complex multi-stage chemical reactions, which often impose complicated restrictions on control variables such as temperature or time. Without taking those restrictions into account, the obtained optimal control can sometimes be useless, as it would not be possible to implement such a control strategy in practice. In this work we propose a novel parallel memetic algorithm that obtains feasible control strategies by monitoring the restrictions on the control variables. The proposed algorithm and its software implementation were used to find feasible controls for several industrial chemical processes, including the synthesis of benzyl butyl ether, the hydroalumination of olefins with diisobutylaluminium hydride, and the catalytic reforming of gasoline. In addition, the obtained results were compared with those obtained by several other methods. The paper presents the results of the conducted numerical experiments and the obtained controls for the specified chemical reactions.
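One common way to "monitor restrictions" inside an evolutionary or memetic search loop is a penalty function on the control variables; the paper's exact mechanism may differ, so the sketch below is only an assumed illustration:

```python
def penalized_fitness(objective, control, bounds, penalty=1e6):
    # Add a large penalty for every control value (e.g. a stage temperature)
    # outside its feasible interval, so infeasible control strategies are
    # driven out of the population during selection.
    violation = sum(max(lo - u, 0.0) + max(u - hi, 0.0)
                    for u, (lo, hi) in zip(control, bounds))
    return objective(control) + penalty * violation
```

A memetic algorithm would rank candidate controls by this penalized value during both the global (evolutionary) and local-search phases.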
We propose a new algorithm for the execution of Discrete Event System Specification (DEVS) simulations on parallel shared-memory architectures. Our approach executes parallel discrete-event simulations by executing all tasks in the PDEVS simulation protocol in parallel. The algorithm works by distributing the computations among different cores on shared-memory architectures. To show the benefits of our algorithm, we present the results of a set of experiments using a synthetic benchmark and a real-world scenario on two independent computer architectures. The results obtained show that our algorithm accelerates simulations by up to eight times, improving on previous approaches. In addition, we show that our approach scales as the number of CPU cores used increases.
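The core of the PDEVS protocol is that all imminent models compute their outputs and transitions concurrently at each step. A minimal sketch of that step (a thread pool stands in for the paper's per-core distribution; the model representation is an assumption for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_step(models, now):
    # One PDEVS-style step: every model whose next event time equals `now`
    # is imminent.  All imminent outputs are computed in parallel, then the
    # internal transitions are applied and next event times rescheduled.
    imminent = [m for m in models if m["next"] == now]
    with ThreadPoolExecutor() as pool:
        outputs = list(pool.map(lambda m: m["output"](m["state"]), imminent))
    for m in imminent:
        m["state"] = m["delta_int"](m["state"])
        m["next"] = now + m["ta"](m["state"])   # time-advance function
    return outputs
```

A full simulator would also route the collected outputs as external events to the receiving models before advancing the clock.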
ISBN:
(print) 9783959771801
The area of computing with uncertainty considers problems where some information about the input elements is uncertain, but can be obtained using queries. For example, instead of the weight of an element, we may be given an interval that is guaranteed to contain the weight, and a query can be performed to reveal the weight. While previous work has considered models where queries are asked either sequentially (adaptive model) or all at once (non-adaptive model), and the goal is to minimize the number of queries needed to solve the given problem, we propose and study a new model where k queries can be made in parallel in each round, and the goal is to minimize the number of query rounds. We use competitive analysis and present upper and lower bounds on the number of query rounds required by any algorithm in comparison with the optimal number of query rounds. Given a set of uncertain elements and a family of m subsets of that set, we present an algorithm for determining the value of the minimum of each of the subsets that requires at most (2 + ε)·opt(k) + O((1/ε)·lg m) rounds for every 0 < ε < 1, where opt(k) is the optimal number of rounds, as well as nearly matching lower bounds. For the problem of determining the i-th smallest value and identifying all elements with that value in a set of uncertain elements, we give a 2-round-competitive algorithm. We also show that the problem of sorting a family of sets of uncertain elements admits a 2-round-competitive algorithm and that this is the best possible.
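The round model can be illustrated on the single-set minimum problem: in each round, reveal up to k elements whose intervals could still contain the minimum, and stop once the minimum is certified. This sketch only demonstrates the model, not the paper's (2+ε)-competitive algorithm:

```python
def min_in_rounds(intervals, true_vals, k):
    # intervals[i] = (lo, hi) is guaranteed to contain true_vals[i], which
    # is hidden until element i is queried.  Each round queries up to k
    # elements whose intervals still dip below the best value revealed so
    # far (smallest lower endpoints first).  Returns (minimum, rounds used).
    known, rounds = {}, 0
    while True:
        best = min(known.values(), default=float("inf"))
        open_ids = [i for i, (lo, hi) in enumerate(intervals)
                    if i not in known and lo < best]
        if not open_ids:          # no unqueried interval can beat `best`
            return best, rounds
        open_ids.sort(key=lambda i: intervals[i][0])
        for i in open_ids[:k]:    # one batch of up to k parallel queries
            known[i] = true_vals[i]
        rounds += 1
```

With larger k the same adaptive strategy certifies the answer in fewer rounds, which is exactly the trade-off opt(k) captures.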
We present a randomized O(m log² n) work, O(polylog n) depth parallel algorithm for minimum cut. This algorithm matches the work bounds of a recent sequential algorithm by Gawrychowski, Mozes, and Weimann [ICALP'20], and improves on the previously best parallel algorithm by Geissmann and Gianinazzi [SPAA'18], which performs O(m log⁴ n) work in O(polylog n) depth. Our algorithm makes use of three components that might be of independent interest. First, we design a parallel data structure that efficiently supports batched mixed queries and updates on trees. It generalizes and improves the work bounds of a previous data structure of Geissmann and Gianinazzi and is work efficient with respect to the best sequential algorithm. Second, we design a parallel algorithm for approximate minimum cut that improves on previous results by Karger and Motwani. We use this algorithm to give a work-efficient procedure to produce a tree packing, as in Karger's sequential algorithm for minimum cuts. Last, we design an efficient parallel algorithm for solving the minimum 2-respecting cut problem.
Enormous amounts of river basin information have been collected for high-resolution, physically-based distributed hydrological models, while the scale of the computational domain is often restricted by the intensive calcula...
One of the important problems in the use of remote sensing from satellites is the three-dimensional modeling of surface fragments, both dynamic ones (e.g., the ocean surface) and slowly varying ones. Some researchers propose the u...