A distributed asynchronous algorithm that minimizes a functional whose minimum drifts with time is discussed. The communication delays among the processors are assumed to be stochastic with Markovian character. The au...
The author gives a summary of some of the arguments favoring the adoption of the bulk-synchronous parallel (BSP) model as a standard for parallel computing. First, he argues that for parallel computing to become a maj...
ISBN: 9780769552071 (print)
Many real-world problems in different industrial and economic fields are permutation combinatorial optimization problems. Solving large instances of these problems to optimality, such as the flowshop problem, is a challenge for multi-core computing. This paper proposes a multi-threaded factoradic-based branch-and-bound (B&B) algorithm to solve permutation combinatorial problems on multi-core processors. The factoradic, also called the factorial number system, is a mixed-radix numeral system suited to numbering permutations. In this new parallel algorithm, the B&B is based on a matrix of integers instead of a pool of permutations, and the work units exchanged between threads are intervals of factoradics instead of sets of nodes. Compared to a conventional pool-based approach, the results obtained on flowshop instances demonstrate that the new factoradic-based approach, on average, uses about 60 times less memory to store the pool of subproblems, generates about 1.3 times fewer page faults, waits about 7 times less time to synchronize access to the pool, requires about 9 times less CPU time to manage this pool, and performs about 30,000 times fewer context switches.
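To make the factoradic idea concrete, the following is a minimal sketch of how an integer index can be mapped to a permutation through its factorial-number-system digits, and how an index interval can be cut into work units. It is not the paper's implementation: the function names `index_to_factoradic`, `factoradic_to_permutation`, and `split_interval` are hypothetical, and the B&B bounding logic and integer matrix are omitted entirely.

```python
from math import factorial


def index_to_factoradic(index, n):
    """Return the n factoradic digits of index, most significant first.

    The digit at position i (from the left) lies in the range 0..n-1-i.
    """
    digits = []
    for radix in range(1, n + 1):
        index, digit = divmod(index, radix)
        digits.append(digit)
    if index:
        raise ValueError("index must be smaller than n!")
    return digits[::-1]


def factoradic_to_permutation(digits, items):
    """Decode factoradic digits into a permutation of items.

    Each digit selects one of the items not yet used, so consecutive
    indices yield permutations in lexicographic order.
    """
    pool = list(items)
    return [pool.pop(d) for d in digits]


def split_interval(lo, hi, parts):
    """Split the half-open index interval [lo, hi) into near-equal pieces.

    Illustrative stand-in for exchanging intervals as work units.
    """
    step, extra = divmod(hi - lo, parts)
    bounds, start = [], lo
    for p in range(parts):
        end = start + step + (1 if p < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    n = 4
    for lo, hi in split_interval(0, factorial(n), 4):
        first = factoradic_to_permutation(index_to_factoradic(lo, n), range(n))
        print(f"work unit [{lo}, {hi}) starts at permutation {first}")
```

Because an interval is described by two integers regardless of how many permutations it covers, a sketch like this suggests why interval-based work units can be cheaper to store and exchange than explicit sets of nodes.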
ISBN: 9780769546759 (print)
Clusters of GPUs are emerging as a new computational scenario. Programming them requires the use of hybrid models that increase the complexity of the applications, reducing the productivity of programmers. We present the implementation of OmpSs for clusters of GPUs, which supports asynchrony and heterogeneity for task parallelism. It is based on annotating a serial application with directives that are translated by the compiler. With it, the same program that runs sequentially on a node with a single GPU can run in parallel on multiple GPUs, either local (single node) or remote (cluster of GPUs). Besides performing a task-based parallelization, the runtime system moves the data as needed between the different nodes and GPUs, minimizing the impact of communication by using affinity scheduling, caching, and by overlapping communication with the computational task. We show several applications programmed with OmpSs and their performance with multiple GPUs in a local node and in remote nodes. The results show a good tradeoff between performance and programmer effort.
ISBN: 9781509021406 (print)
This paper describes how we solved 12 previously unsolved mixed-integer programming (MIP) instances from the MIPLIB benchmark sets. To achieve these results we used an enhanced version of ParaSCIP, setting a new record for the largest-scale MIP computation: up to 80,000 cores in parallel on the Titan supercomputer. In this paper we describe the basic parallelization mechanism of ParaSCIP, improvements to the dynamic load balancing, and novel techniques to exploit the power of parallelization for MIP solving. We give a detailed overview of computing times and statistics for solving open MIPLIB instances.
ISBN: 9781479966363 (print)
With the further development of Internet of Things (IoT) technology, the traditional storage architecture, burdened by growing data volumes and poor scalability, will become increasingly complex and lead to high energy consumption. Unlike the traditional storage system, a distributed cloud storage system can store massive amounts of information, manage files at large scale, and provide high query efficiency. This paper first presents the current problems in heterogeneous data processing, then introduces the cloud storage architecture and the MapReduce programming model as the basis for the proposed ClassifyMapReduce algorithm. Finally, considering the processing methods of distributed computing and cloud computing models, the advantages and disadvantages of the MapReduce programming model, and the characteristics of heterogeneous data in IoT systems, this paper proposes a parallel storage algorithm, ClassifyMapReduce, which is composed of three functions: a Classify function, a Map function, and a Reduce function. Our experiment shows that it classifies the original heterogeneous data flow according to data type to enable parallel processing, which greatly improves storage and access efficiency.
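The three-stage structure described above (classify by data type, then map, then reduce) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the record layout `(data_type, payload)`, the function names, and the choice of "total payload size per type" as the reduced quantity are all hypothetical, and a thread pool stands in for a real MapReduce cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def classify(records):
    """Classify stage: group heterogeneous records by their data type."""
    classes = defaultdict(list)
    for data_type, payload in records:
        classes[data_type].append(payload)
    return dict(classes)


def map_class(item):
    """Map stage: emit one (key, value) pair per payload in a class.

    The value here is the payload length, an assumed stand-in for
    whatever per-record processing a real system would perform.
    """
    data_type, payloads = item
    return [(data_type, len(p)) for p in payloads]


def reduce_pairs(pairs):
    """Reduce stage: sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def classify_map_reduce(records, workers=4):
    """Run the three stages; each class is mapped in its own task."""
    classes = classify(records)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_class, classes.items()))
    return reduce_pairs(pair for chunk in mapped for pair in chunk)


if __name__ == "__main__":
    stream = [("text", "abc"), ("image", b"\x00\x01"), ("text", "de")]
    print(classify_map_reduce(stream))
```

Classifying first means each map task handles records of a single type, which is the property the abstract credits for enabling parallel processing of the heterogeneous data flow.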
ISBN: 9781479970056 (print)
Power flow calculation for a shipboard power system aims to determine the operating conditions of the whole system, such as the voltage at every bus, the power distribution, and the power loss in the shipboard grid, from the given operating conditions and network structure. The ladder-shaped shipboard power system is a newly developed network with greater reliability and flexibility. In this paper, a parallel power flow calculation algorithm is proposed that specifically considers the features of the ladder shipboard power system. This algorithm divides the whole system into a concentrated power supply network and several distribution subnets, and applies the node potential method and the forward-backward sweep calculation to the different calculation levels, respectively. The test results indicate that this parallel algorithm has better convergence and greater flexibility.
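As an illustration of the forward-backward sweep mentioned above, the following minimal sketch solves a single radial feeder (a chain of buses fed from one source) in per-unit values. It is a textbook form of the method and does not reproduce the paper's ladder-network decomposition, its node potential stage, or its parallelization; the function name, the chain topology, and the constant-power load model are assumptions.

```python
def forward_backward_sweep(v_source, branch_z, loads, tol=1e-10, max_iter=100):
    """Solve a radial chain feeder by forward-backward sweep.

    v_source : complex voltage at the source bus (bus 0)
    branch_z : branch_z[k] is the impedance between bus k and bus k+1
    loads    : loads[k] is the complex power drawn at bus k+1
    Returns the list of complex bus voltages, source bus first.
    """
    n = len(loads)
    v = [complex(v_source)] * (n + 1)
    for _ in range(max_iter):
        # Backward sweep: accumulate branch currents from the far end.
        branch_i = [0j] * n
        downstream = 0j
        for k in range(n - 1, -1, -1):
            load_current = (loads[k] / v[k + 1]).conjugate()
            downstream += load_current
            branch_i[k] = downstream
        # Forward sweep: update voltages from the source outward.
        new_v = [complex(v_source)]
        for k in range(n):
            new_v.append(new_v[k] - branch_z[k] * branch_i[k])
        delta = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if delta < tol:
            break
    return v


if __name__ == "__main__":
    voltages = forward_backward_sweep(1 + 0j, [0.01 + 0.02j] * 3, [0.1 + 0.05j] * 3)
    for bus, volt in enumerate(voltages):
        print(f"bus {bus}: |V| = {abs(volt):.4f} p.u.")
```

Each distribution subnet could in principle be swept independently once its feeding-point voltage is known, which is one plausible reading of how the subnets lend themselves to parallel calculation.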
ISBN: 9781424441266 (print)
The mammalian immune system consists of vital tissue microenvironments that exhibit remarkable structural complexity and dynamic cellular behaviors. It is now possible to acquire time-lapse series of multi-channel three-dimensional images of multiple cell types and vasculature simultaneously, revealing dynamic events in their living tissue context. This talk will describe automated image analysis algorithms, parallel computation methods, and large-scale edit-based validation methods to detect and quantify key events such as homogeneous and heterogeneous cell-cell interactions, and methods to map events to their tissue context.
ISBN: 9780769527369 (print)
We describe a tool that implements a set of services to manipulate and store data from a radar network in a way that is transparent to end users. A major requirement of this system is data availability and reliability. Consequently, we have implemented a redundancy scheme based on the Information Dispersal Algorithm (IDA). Preliminary results show that IDA-based replication provides better reliability and lower storage cost than traditional replication.
ISBN: 0769524532 (print)
A parallel search technique for improving evolutionary algorithms is proposed. The method is based on a new philosophy of applying search operators. Two search operators compete to be applied. One is a hybrid operator (recombination plus mutation) and the other is a pure mutation operator. The aim of the proposed technique is to maintain a good equilibrium between the exploration and the exploitation of the search space. Experimental results show that the parallel search outperforms the standard way of applying search operators. A new quality measure for search operators is also proposed.