The problem of efficient parallelization of 2D Ising spin systems requires realistic algorithmic design and implementation based on an understanding of issues from computer science and statistical physics. In this work, we not only consider fundamental parallel computing issues but also ensure that the major constraints and criteria of 2D Ising spin systems are incorporated into our study. This realism in both parallel computation and statistical physics has rarely been reflected in previous research on this problem.
In this thesis, we designed and implemented a variety of parallel algorithms for both sweep spin selection and random spin selection. We analyzed our parallel algorithms on a portable and general parallel machine model, namely the LogP model, and obtained rigorous theoretical run-times on LogP for all the parallel algorithms. Moreover, a guiding equation was derived for choosing data layouts (blocked vs. striped) for sweep spin selection. For random spin selection, we developed parallel algorithms with efficient communication schemes, analyzed the randomness of these schemes using statistical methods, and compared the different schemes. Furthermore, the algorithms were implemented, and performance data were gathered and analyzed to identify further design issues and validate the theoretical analysis.
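A concrete way to picture the sweep-spin-selection parallelism described above is the standard checkerboard (red-black) decomposition, in which same-colored lattice sites share no bonds and can therefore be updated simultaneously. The sketch below is illustrative only, not the thesis's implementation; the function names, lattice size, and use of NumPy are assumptions, and the LogP communication costs are not modeled.

```python
import numpy as np

def checkerboard_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice using a checkerboard
    update: same-color sites do not interact, so each color class can
    be updated in parallel (here, vectorized with NumPy)."""
    for color in (0, 1):
        # Sum of the four nearest neighbors with periodic boundaries.
        nbr = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0) +
               np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
        dE = 2.0 * spins * nbr                        # energy cost of flipping each site
        accept = rng.random(spins.shape) < np.exp(-beta * dE)
        i, j = np.indices(spins.shape)
        mask = ((i + j) % 2 == color) & accept        # restrict to the current color
        spins[mask] *= -1
    return spins

rng = np.random.default_rng(0)
spins = rng.choice([-1, 1], size=(64, 64))
for _ in range(100):
    checkerboard_sweep(spins, beta=0.44, rng=rng)
```

In a distributed setting, each processor would own a blocked or striped sub-lattice and exchange boundary rows or columns before each half-sweep; a guiding equation such as the one derived in the thesis decides which layout minimizes that communication under LogP.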
Data accessed by many sites are replicated in distributed environments for performance and availability. In this paper, replication schemes are examined in parallel image convolution processing. This paper presents a system architecture that we have developed with CORBA (Common Object Request Broker Architecture) for this processing. Employing CORBA enables us to make use of a cluster of workstations, each of which has a different level of computing power. The paper also describes a parallel and distributed image convolution processing model using replicas stored in a network of workstations, and reports experimental results showing that our analytical model agrees well with practical measurements.
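The halo (ghost-region) replication underlying such a model can be sketched briefly: the image is split into horizontal strips, each strip is replicated with extra boundary rows so it can be convolved independently, and the results are stitched together. This is a hedged, serial illustration of the data layout only; the strip-based split, the function names, and NumPy are assumptions, and the paper's CORBA workstation cluster is not modeled.

```python
import numpy as np

def convolve_block(block, kernel):
    """Direct 2D 'valid' convolution (cross-correlation form, as is
    common in image processing) of one image block."""
    kh, kw = kernel.shape
    out = np.zeros((block.shape[0] - kh + 1, block.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(block[i:i + kh, j:j + kw] * kernel)
    return out

def parallel_convolve(image, kernel, n_workers=4):
    """Split the image into horizontal strips, replicate 'halo' rows on
    each strip so every worker can convolve independently, and stitch
    the per-strip results (equal to one 'valid' convolution)."""
    halo = kernel.shape[0] // 2
    row_groups = np.array_split(np.arange(image.shape[0]), n_workers)
    strips = []
    for rows in row_groups:
        lo = max(rows[0] - halo, 0)
        hi = min(rows[-1] + halo + 1, image.shape[0])
        strips.append(convolve_block(image[lo:hi], kernel))
    return np.vstack(strips)
```

Because each strip carries replicated halo rows from its neighbors, no communication is needed during the convolution itself; only the initial replication and the final gather touch the network.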
For any pair of distinct nodes in an n-pancake graph, we give an algorithm that constructs n - 1 internally disjoint paths connecting the nodes in time polynomial in n. The length of each path obtained and the time complexity of the algorithm are estimated theoretically and verified by computer simulation.
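For context, a node of the n-pancake graph is a permutation of {1, ..., n}, and its n - 1 neighbors are obtained by reversing a prefix of length 2 through n; this degree matches the n - 1 internally disjoint paths constructed above. A minimal sketch of the adjacency rule (the names and tuple representation are my own, not the paper's):

```python
def neighbors(perm):
    """Neighbors of a node in the n-pancake graph: all prefix reversals
    of length 2..n applied to the permutation (given as a tuple)."""
    n = len(perm)
    return [perm[:k][::-1] + perm[k:] for k in range(2, n + 1)]

# Every node has exactly n - 1 neighbors, one per reversal length,
# so n - 1 internally disjoint paths is the best possible.
print(neighbors((1, 2, 3, 4)))
# [(2, 1, 3, 4), (3, 2, 1, 4), (4, 3, 2, 1)]
```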
Big Data has become one of the major areas of research for cloud service providers due to the large amount of data produced every day and the inefficiency of traditional algorithms and technologies in handling it. Big Data, with characteristics such as volume, velocity, and variety (the 3 Vs), requires efficient technologies for real-time processing. To address this problem and to process and analyze this vast amount of data, there are powerful tools like Hadoop and Spark, which are mainly used in the context of Big Data and work on the principles of parallel computing. The challenge is to determine which Big Data tool is better suited to a given processing context. In this paper, we present and discuss a performance comparison between two popular Big Data frameworks deployed on virtual machines. Hadoop MapReduce and Apache Spark are used to efficiently process vast amounts of data in parallel and distributed mode on large clusters, and both are suited to Big Data processing. We also present the execution results of Apache Hadoop in Amazon EC2, a major cloud computing environment. To compare the performance of these two frameworks, we use the HiBench benchmark suite, an experimental approach for measuring the effectiveness of a computer system. The comparison is made based on three criteria: execution time, throughput, and speedup. We test the WordCount workload with different data sizes for more accurate results. Our experimental results show that the performance of these frameworks varies significantly with the use case. Furthermore, from our results we conclude that Spark is more efficient than Hadoop at dealing with large amounts of data in most cases. However, Spark requires a higher memory allocation, since it loads the data to be processed into memory and keeps it in cache for a while, much like a standard database. So the choice depends on the performance level required and the memory constraints.
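For reference, the WordCount workload used in the comparison has a very small kernel; in Spark's Python API it is essentially the classic three-step pipeline below. The input and output paths are placeholders, and HiBench drives its own generated datasets rather than a hand-rolled job like this one.

```python
from pyspark import SparkContext

sc = SparkContext(appName="WordCount")

counts = (sc.textFile("hdfs:///path/to/input")     # placeholder path
            .flatMap(lambda line: line.split())     # line -> words
            .map(lambda word: (word, 1))            # word -> (word, 1)
            .reduceByKey(lambda a, b: a + b))       # sum counts per word

counts.saveAsTextFile("hdfs:///path/to/output")     # placeholder path
sc.stop()
```

Because intermediate results stay in memory between stages, Spark avoids the per-job disk round-trips that MapReduce incurs, which is consistent with the memory trade-off noted above.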
The National Science Foundation released a set of application benchmarks that would be a key factor in selecting the next-generation high-performance computing environment. These benchmarks consist of six codes that require large amounts of memory and work with large data sets. Here we study the complexity, performance, and scalability of these codes on three SGI machines: a 512-processor Altix 3700, a 512-processor Altix 3700/BX2, and a 512-processor dual-core-based Altix 4700; and on a 128-processor Cray Opteron cluster interconnected by the Myrinet network. We evaluated these codes for two different problem sizes using different numbers of processors. Our results show that some of these codes scale up very well as we increase the number of processors, while others scale up poorly. Also, one code achieved about 2/3 of the peak rate of an SGI Altix processor. Moreover, the dual-core system achieved performance comparable to the single-core system.
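The scalability statements can be read against the standard definitions of speedup and parallel efficiency (the usual metrics in such studies; the paper's exact formulas are not quoted here):

```latex
S(p) = \frac{T_{1}}{T_{p}}, \qquad
E(p) = \frac{S(p)}{p}, \qquad
\text{fraction of peak} = \frac{R_{\text{achieved}}}{R_{\text{peak}}}
```

where T_p is the run time on p processors and R denotes the floating-point rate; "about 2/3 of the peak rate" corresponds to a fraction of peak of roughly 0.67 on one Altix processor.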
Due to their widespread use, the Internet, Web 2.0, and digital sensors create data in non-traditional volumes (at terabyte and petabyte scale). Big data, characterized by the four V's, has brought new challenges given the limited capabilities of traditional computing systems. This paper aims to provide solutions that can cope with very large data in Decision-Support Systems (DSSs). In the data integration phase, specifically, the authors propose a conceptual modeling approach for parallel and distributed Extracting-Transforming-Loading (ETL) processes. Among the complexity dimensions of big data, this study focuses on the volume of data to ensure good performance for ETL processes. The authors' approach makes it possible to anticipate parallelization/distribution issues at the early stage of Data Warehouse (DW) projects. They have implemented an ETL platform called parallel-ETL (P-ETL for short) and conducted experiments. Their performance analysis reveals that the proposed approach speeds up ETL processes by up to 33%, with a linear improvement rate.
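The partition-transform-merge principle behind such a parallel ETL pipeline can be sketched in a few lines. This is only a schematic of the idea, not the P-ETL platform; the record-level transformation, the partitioning scheme, and the use of Python's multiprocessing are all assumptions for illustration.

```python
from multiprocessing import Pool

def transform(record):
    # Hypothetical Transform step: cleanse and normalize one record.
    return record.strip().lower()

def transform_partition(records):
    # Each worker transforms one partition of the extracted data.
    return [transform(r) for r in records]

def parallel_etl(records, n_workers=4):
    """Extract is done upstream; here we partition the data, run the
    Transform step on the partitions in parallel, and merge the results
    for the Load step."""
    size = max(1, len(records) // n_workers)
    parts = [records[i:i + size] for i in range(0, len(records), size)]
    with Pool(n_workers) as pool:
        out = pool.map(transform_partition, parts)
    return [r for part in out for r in part]
```

On platforms that spawn worker processes (e.g., Windows), the call to parallel_etl must sit under an `if __name__ == "__main__":` guard.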
In this work, the idea of energy surfaces, designed by analogy to the flow field in fluid dynamics, is employed to build a controller for the truck-and-trailer back-upper problem in the presence of dynamic obstacles in arbitrary environments. We first demonstrate that the truck back-upper problem in free space, which is an unstable nonlinear control problem, can be naturally solved by a controller based on the flow field of the two-dimensional doublet in fluid dynamics. Then, parallel and distributed algorithms on two-dimensional lattices are further developed to compute the emergent flow field, abbreviated as EFF, for general obstacle-avoidance navigation with kinematic constraints. The new controller generally guides the navigation along the tangent direction of the streamlines in the developed emergent flow field. Under the kinematic constraints of the controller, a greedy search for the minimum of the orientation moments surrounding the motor guides the motor to the desired position. The efficiency and the direct use of raw images are the main advantages of our approach in many real-time applications.
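For reference, the two-dimensional doublet from classical potential flow theory (oriented along the x-axis, with strength mu) has the complex potential and stream function below; the paper's exact normalization may differ.

```latex
w(z) = \frac{\mu}{2\pi z}, \qquad
\psi(x, y) = -\frac{\mu}{2\pi}\,\frac{y}{x^{2} + y^{2}}
```

Its streamlines (the level sets of psi) are circles tangent to the x-axis at the origin, which is what allows a controller that follows streamline tangents to bring the vehicle smoothly into a target position and heading.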
ISBN (print): 9781509025985
This paper evaluates the speed-up and dependability of parallel differential evolutionary particle swarm optimization (DEEPSO) for on-line optimal operational planning of energy plants. The planning can be formulated as a mixed integer nonlinear optimization problem (MINLP). When optimal operational plans for many energy plants are calculated simultaneously in a data center, the plans must be generated as rapidly as possible, considering the control intervals and the number of plants involved. One solution to this challenge is speeding up the calculation with parallel and distributed processing (PDP). However, PDP utilizes many processes, and countermeasures against various process faults must be considered. On-line optimal operational planning requires successive calculation at every control interval to maintain customer services. Therefore, sustainable (dependable) calculation that maintains solution quality is required even if some calculation results cannot be returned from the distributed processes. Using the proposed parallel DEEPSO based method, it is observed that the calculation becomes about 3 times faster than sequential calculation, and high solution quality can be maintained even under high fault probabilities.
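As background, DEEPSO builds on the canonical particle swarm update, extending it with differential-evolution-style mutation of the strategy weights. The sketch below shows only the generic PSO step for orientation, with hypothetical names, not the paper's DEEPSO; in the parallel setting each particle's fitness is evaluated on a separate process, and a particle whose result never returns simply keeps its previous best, which is the dependability mechanism discussed above.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One canonical PSO update: inertia plus stochastic attraction
    toward each particle's personal best and the swarm's global best."""
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)
    r2 = rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v, v
```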
We propose the complete cubic network structure to extend the existing class of hierarchical cubic networks, and establish a general connectivity result which states that the surviving graph of a complete cubic network, when a linear number of vertices are removed, consists of a large (connected) component and a number of smaller components which altogether contain a limited number of vertices. As applications, we characterize several fault-tolerance properties for the complete cubic network, including its restricted connectivity, i.e., the size of a minimum vertex cut such that the degree of every vertex in the surviving graph has a guaranteed lower bound; its cyclic vertex-connectivity, i.e., the size of a minimum vertex cut such that at least two components in the surviving graph contain a cycle; its component connectivity, i.e., the size of a minimum vertex cut whose removal leads to a certain number of components in its surviving graph; and its conditional diagnosability, i.e., the maximum number of faulty vertices that can be detected via a self-diagnostic process, in terms of the common Comparison Diagnosis model.
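A toy version of the "large component plus bounded leftovers" phenomenon can be observed on a plain hypercube Q_n, the building block of these networks. The complete cubic network itself is the paper's construction and is not reproduced here; the code and the particular removal set are illustrative only.

```python
def hypercube_edges(n):
    # Edges of Q_n on vertices 0..2^n - 1: flip each of the n bits.
    return [(v, v ^ (1 << i))
            for v in range(1 << n) for i in range(n)
            if v < v ^ (1 << i)]

def components(n_vertices, edges, removed):
    """Connected components of the surviving graph after vertex removal."""
    alive = set(range(n_vertices)) - set(removed)
    adj = {v: [] for v in alive}
    for a, b in edges:
        if a in alive and b in alive:
            adj[a].append(b)
            adj[b].append(a)
    seen, comps = set(), []
    for s in alive:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v])
        seen |= comp
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)

# Removing the four neighbors of vertex 1 in Q4 isolates it: the
# survivors split into one large component and one singleton.
comps = components(16, hypercube_edges(4), removed={0, 3, 5, 9})
print([len(c) for c in comps])  # [11, 1]
```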