The multi-spin coding of the Monte Carlo simulation of the three-state Potts model on the simple cubic lattice is presented. The ferromagnetic (F) model, the antiferromagnetic (AF) model, and the random mixture of the F and AF couplings are treated. The multi-spin coding technique is also applied to the block-spin transformation. The block-spin transformation of the F Potts model is simply realized by the majority rule, whereas the AF three-state Potts model is transformed to a block spin having a six-fold symmetry.
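The majority rule mentioned for the ferromagnetic case can be illustrated with a minimal sketch. It is a plain, unpacked version and does not reproduce the multi-spin coded implementation of the paper. The function name `block_spin_majority`, the block size `b`, and the tie-breaking choice (smallest state wins) are assumptions made for illustration only.

```python
from collections import Counter


def block_spin_majority(spins, L, b=2):
    """Coarse-grain an L x L x L lattice of Potts states (0, 1, 2).

    spins[x][y][z] holds the state of site (x, y, z). Each b x b x b block
    is replaced by its most frequent state. Ties are resolved in favour of
    the smallest state, which is an arbitrary choice of this sketch.
    """
    Lb = L // b
    out = [[[0] * Lb for _ in range(Lb)] for _ in range(Lb)]
    for X in range(Lb):
        for Y in range(Lb):
            for Z in range(Lb):
                counts = Counter(
                    spins[b * X + dx][b * Y + dy][b * Z + dz]
                    for dx in range(b)
                    for dy in range(b)
                    for dz in range(b)
                )
                best = max(counts.values())
                out[X][Y][Z] = min(s for s, c in counts.items() if c == best)
    return out
```

Applied to a 2 x 2 x 2 lattice in which seven sites hold state 1 and one site holds state 0, the function returns a single block spin in state 1.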
A new method for large-scale numerical simulations of neural networks is proposed which reduces the computational effort by incrementally updating the local fields, thus restricting the operations to flipped spins only. A highly optimized multi-spin algorithm is described that employs words oriented along the columns of the coupling matrix, unlike the horizontal structure in existing high-speed algorithms. An effective rate of 35 × 10^9 couplings/s can be attained on a Cray-YMP, which is about five times as fast as the best existing multi-spin implementations.
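The idea of incrementally updating the local fields can be sketched as follows. This is an unpacked illustration with ±1 spins, not the multi-spin coded algorithm of the paper; the names `init_fields` and `flip_and_update` are hypothetical. When spin k flips, every local field changes by the k-th column of the coupling matrix times the change in that spin, so no full matrix-vector product is needed.

```python
def init_fields(J, s):
    """Compute all local fields h[i] = sum_j J[i][j] * s[j] from scratch."""
    n = len(s)
    return [sum(J[i][j] * s[j] for j in range(n)) for i in range(n)]


def flip_and_update(J, s, h, k):
    """Flip spin k in place and correct the fields using only column k of J.

    Cost is O(n) per flipped spin instead of O(n^2) for a full recomputation.
    """
    delta = -2 * s[k]  # change of s[k] when it flips from s[k] to -s[k]
    s[k] = -s[k]
    for i in range(len(s)):
        h[i] += J[i][k] * delta
```

After any sequence of calls to `flip_and_update`, the list `h` should agree with a fresh call to `init_fields(J, s)`.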
Simulating eight lattices for Pomeau's cellular automata simultaneously through bit-per-bit operations, a vectorized Fortran program reached 30 million updates per second and per Cray YMP processor. We give the full innermost loops.
Population annealing is a promising recent approach for Monte Carlo simulations in statistical physics, in particular for the simulation of systems with complex free-energy landscapes. It is a hybrid method, combining importance sampling through Markov chains with elements of sequential Monte Carlo in the form of population control. While it appears to provide algorithmic capabilities for the simulation of such systems that are roughly comparable to those of more established approaches such as parallel tempering, it is intrinsically much more suitable for massively parallel computing. Here, we tap into this structural advantage and present a highly optimized implementation of the population annealing algorithm on GPUs that promises speed-ups of several orders of magnitude as compared to a serial implementation on CPUs. While the sample code is for simulations of the 2D ferromagnetic Ising model, it should be easily adapted for simulations of other spin models, including disordered systems. Our code includes implementations of some advanced algorithmic features that have only recently been suggested, namely the automatic adaptation of temperature steps and a multi-histogram analysis of the data at different temperatures.
Program summary
Program Title: PAIsing
Program Files doi: http://***/10.17632/sgzt4b7b3m.1
Licensing provisions: Creative Commons Attribution license (CC BY 4.0)
Programming language: C, CUDA
External routines/libraries: NVIDIA CUDA Toolkit 6.5 or newer
Nature of problem: The program calculates the internal energy, specific heat, several magnetization moments, entropy and free energy of the 2D Ising model on square lattices of edge length L with periodic boundary conditions as a function of inverse temperature beta.
Solution method: The code uses population annealing, a hybrid method combining Markov chain updates with population control. The code is implemented for NVIDIA GPUs using the CUDA language and employs advanced techniques such as mu
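The population-control element of population annealing can be illustrated by a minimal CPU sketch of one resampling step. It is not taken from the PAIsing code; the function name `resample`, its parameters, and the particular choice of stochastic rounding for the number of copies are assumptions of this illustration. When the inverse temperature is raised by `dbeta`, each replica with energy E receives a weight exp(-dbeta * E) and is copied a number of times whose expectation is proportional to that weight.

```python
import math
import random


def resample(population, energies, dbeta, target_size, rng):
    """One population-control step for a temperature change dbeta.

    population  : list of replicas (each a list of spins)
    energies    : energy of each replica
    target_size : desired mean size of the new population
    rng         : random.Random instance
    Returns the new population and ln(Q), the log of the mean weight,
    which can be accumulated to estimate the free energy.
    """
    e0 = min(energies)  # shift energies to avoid overflow in exp
    weights = [math.exp(-dbeta * (e - e0)) for e in energies]
    total = sum(weights)
    new_population = []
    for replica, w in zip(population, weights):
        tau = target_size * w / total  # expected number of copies
        copies = int(tau)
        if rng.random() < tau - copies:  # stochastic rounding
            copies += 1
        new_population.extend(replica[:] for _ in range(copies))
    log_q = math.log(total / len(weights)) - dbeta * e0
    return new_population, log_q
```

In a full simulation each resampling step would be followed by Markov chain updates of every replica at the new temperature. Those updates are independent across replicas, which is the property that makes the method attractive for GPUs.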
An algorithm for the simulation of the 3-dimensional random field Ising model with a binary distribution of the random fields is presented. It uses multi-spin coding and simulates 64 physically different systems simultaneously. On one processor of a Cray YMP it reaches a speed of 184 million spin updates per second. For smaller field strengths we present a version of the algorithm that can perform 242 million spin updates per second on the same machine.
Neural networks composed of neurons with Q(N) states and synapses with Q(J) states are studied analytically and numerically. Analytically it is shown that these finite-state networks are much more efficient at information storage than networks with continuous synapses. In order to take the utmost advantage of networks with finite-state elements, a multineuron and multisynapse coding scheme is introduced which allows the simulation of networks having 1.0 × 10^9 couplings at a speed of 7.1 × 10^9 coupling evaluations per second on a single processor of the Cray-YMP. A local learning algorithm is also introduced which allows for the efficient training of large networks with finite-state elements.
Algorithms exhibiting parallelization on many different levels are discussed for short- and long-range cellular automata implemented on scalar, vector, SIMD and MIMD machines. Short-range cellular automata are commonly used for simulating hydrodynamic fluid flows, while long-range cellular automata are applicable to neural networks at zero temperature. A common programming approach, based upon multi-spin coding and including higher levels of parallelization when possible, has been used to implement these models on the SUN SPARC-1, the IBM-3090, the Alliant FX/2800, the NEC-SX3/11, the Cray-YMP/832 and the Connection Machine CM-2. Section 4 of the paper compares the performance of these computers for the algorithms discussed in the text. Additionally, the major subroutines for each computer type are given in the Appendix.
The vectorized Monte Carlo algorithm by multi-spin coding is extended to the +/-J Ising spin glass model on a simple cubic lattice in a magnetic field. An explicit logical expression is given for this algorithm. In addition, shorter logical expressions are found in some special cases. They are given for the heat-bath method under the general condition and for the Metropolis method under the condition H = 0.
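The kind of logical expression that multi-spin coding relies on can be illustrated with a small sketch. It is not the expression derived in the paper and it omits the magnetic field. The assumptions are: each bit lane of a 64-bit word belongs to a different system, a spin bit of 1 means spin up, and a bond bit of 1 marks an antiferromagnetic (-J) bond. Under this encoding a bond is unsatisfied exactly when the XOR of the two spin bits and the bond bit equals 1, and the number of unsatisfied bonds per lane can be accumulated with a bit-sliced adder. All function names are hypothetical.

```python
MASK = (1 << 64) - 1  # 64 bit lanes, one simulated system per lane


def unsatisfied(center, neighbor, af_bond):
    """Per-lane flag: 1 where the bond between the two spins is unsatisfied."""
    return (center ^ neighbor ^ af_bond) & MASK


def add_bit(planes, word):
    """Add a one-bit-per-lane word into a bit-sliced counter (ripple carry)."""
    carry = word
    for k in range(len(planes)):
        planes[k], carry = planes[k] ^ carry, planes[k] & carry
    return planes


def count_unsatisfied(center, neighbors, af_bonds):
    """Return three bit planes holding, per lane, the count of unsatisfied bonds.

    Three planes represent counts 0..7, enough for the six neighbours
    of a site on the simple cubic lattice.
    """
    planes = [0, 0, 0]
    for nb, bond in zip(neighbors, af_bonds):
        add_bit(planes, unsatisfied(center, nb, bond))
    return planes


def lane_count(planes, lane):
    """Decode the counter of a single lane (used only for checking)."""
    return sum(((p >> lane) & 1) << k for k, p in enumerate(planes))
```

With H = 0, a Metropolis update accepts a flip unconditionally whenever at least three of the six bonds are unsatisfied, so the acceptance test for all 64 lanes can be expressed as a short Boolean function of the three planes.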
In this paper, we present a parallel algorithm for Monte Carlo simulation of the 2D Ising model that runs efficiently on a cluster computer using MPI. The algorithm is implemented in C++. Every process creates a sub-lattice, and the energy is calculated after each Monte Carlo iteration. During the run, each process exchanges boundary spin variables with its two neighbor processes. Finally, the total energy of the lattice is calculated as a function of temperature by a map-reduce method. We use the multi-spin coding technique to reduce inter-process communication. The algorithm has been designed to provide appropriate load balancing and good scalability. It has been executed on the cluster computer of the Plasma Physics Research Center, which comprises 9 nodes, each with two quad-core CPUs. Our results show that the algorithm is more efficient for large lattices and larger numbers of iterations.
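The decomposition into sub-lattices with exchanged boundary rows can be emulated serially in a few lines. The sketch below does not use MPI and is not the authors' C++ code. The function names are hypothetical, the lattice is split into horizontal strips, the number of rows is assumed to be divisible by the number of strips, and the coupling is set to J = 1. Each strip owns the bonds to the right and below each of its sites, so for the energy it needs one ghost row from the next strip. A spin update would need ghost rows from both neighbours.

```python
def strip_energy(strip, ghost_below):
    """Energy of the bonds owned by one strip (periodic in the column direction).

    ghost_below is the first row of the next strip, i.e. the boundary
    data a process would receive from its neighbour.
    """
    rows, cols = len(strip), len(strip[0])
    energy = 0
    for r in range(rows):
        below = strip[r + 1] if r + 1 < rows else ghost_below
        for c in range(cols):
            energy -= strip[r][c] * (strip[r][(c + 1) % cols] + below[c])
    return energy


def total_energy(lattice, n_proc):
    """Split the lattice into n_proc strips and sum their partial energies."""
    step = len(lattice) // n_proc
    strips = [lattice[p * step:(p + 1) * step] for p in range(n_proc)]
    partial = [
        strip_energy(strips[p], strips[(p + 1) % n_proc][0])  # "halo exchange"
        for p in range(n_proc)
    ]
    return sum(partial)  # stands in for a reduction across processes
```

The result should not depend on the number of strips, which gives a simple consistency check for the decomposition.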
We report on performance tests of pair interaction lattice gas automata in two and three dimensions coded in FORTRAN and C. The programs have been run on ALLIANT/FX-80, ALLIANT/FX-2800, CONVEX C2, CRAY-YMP, NEC/SX3, and SUN/IPC. The maximum update rates are 200 million site updates per second on the NEC/SX3 (FORTRAN), 117 (2D version) and 29 (3D version) on the CRAY-YMP (C). As a byproduct we give results for the performance of integer arithmetic and bit operations. Usually the C-programs were somewhat faster than the FORTRAN-programs except on the NEC/SX3 where the C-compiler was not able to vectorize the main loops.