This paper presents the design philosophy and implementation of the BALANCE system. BALANCE is a flexible, network-independent and computer-architecture-independent load balancing system which allows the building of reusable parallel and distributed applications. By implementing related services as generic servers with their connection endpoints registered in BALANCE, clients can easily access the servers through server system calls. To demonstrate the flexibility of BALANCE, several widely different applications have been implemented and evaluated, including system servers, parallel and distributed applications, and a scheduling testbed. The use of generic servers to improve system modularity and code reuse is also discussed. (C) 1997 by John Wiley & Sons, Ltd.
The Alternating Direction Method of Multipliers (ADMM) is a popular and promising distributed framework for solving large-scale machine learning problems. We consider decentralized consensus-based ADMM in which nodes may only communicate with one-hop neighbors, which may cause slow convergence. We investigate the impact of network topology on the performance of ADMM-based learning of Support Vector Machines (SVMs) using expander and mean-degree graphs, as well as some common modern network topologies. In particular, we investigate to what degree the expansion property of the network influences convergence in terms of iterations, training time and communication time. We furthermore suggest which topology is preferable. Additionally, we provide an implementation that makes these theoretical advances easily available. The results show that the convergence of decentralized ADMM-based learning of SVMs improves on graphs with large spectral gaps and with higher, homogeneous degrees.
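The effect of topology on one-hop communication can be illustrated with a much simpler process than ADMM. The following is a minimal sketch, not the authors' implementation: it runs plain consensus averaging (each node repeatedly moves toward its neighbors' values) on a ring and on a complete graph and counts the iterations needed to agree on the mean. The function names, the step-size rule and the tolerance are assumptions chosen for illustration.

```python
def ring(n):
    """Ring topology: every node has exactly two one-hop neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def complete(n):
    """Complete topology: every node neighbors every other node."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def iterations_to_consensus(neighbors, values, tol=1e-6, max_iter=100_000):
    """Count rounds of neighbor averaging until all nodes are within tol of the mean.

    This is plain consensus averaging, used here only as a stand-in to show
    how topology limits information spread; it is not ADMM.
    """
    n = len(values)
    x = list(values)
    target = sum(x) / n
    # Assumed step-size rule: 1 / (max degree + 1) keeps the update stable.
    step = 1.0 / (max(len(v) for v in neighbors.values()) + 1)
    for k in range(max_iter):
        if max(abs(v - target) for v in x) < tol:
            return k
        # Synchronous update: each node uses only its one-hop neighbors' values.
        x = [x[i] + step * sum(x[j] - x[i] for j in neighbors[i]) for i in range(n)]
    return max_iter


if __name__ == "__main__":
    start = [float(i) for i in range(16)]
    print("ring:    ", iterations_to_consensus(ring(16), start))
    print("complete:", iterations_to_consensus(complete(16), start))
```

The ring, which is sparsely connected and has a small spectral gap, needs many more rounds than the complete graph. This is consistent with the abstract's observation that better-connected graphs converge faster.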
In a grid computing environment, network characteristics such as bandwidth and latency affect task performance. Bandwidth demand on wide-area networks has grown large, reaching more than 100 Gbps. In this article, we focus on parallel-route transmission, such as link aggregation, to realize a high-bandwidth network. The performance of grid computing with parallel-route transmission is evaluated on an emulated wide-area network.
Presented here is decryst, a software suite for structure determination from powder diffraction, which uses the direct-space method and is able to apply anti-bump constraints automatically and efficiently during global optimization, using the crystallographic collision detection algorithm by Liu [Acta Cryst. (2017), A73, 414-422]. decryst employs incremental computation in its global-optimization cycles, which results in a dramatic performance enhancement. It is also designed with parallel and distributed computing in mind, allowing for even better performance through the simultaneous use of multiple processors. Owing to the parallelized usage of the equivalent position combination method [Deng & Dong (2009). J. Appl. Cryst. 42, 953-958] in decryst, it is particularly suitable for the determination of structures with mostly unknown bonding relations, and offers some unprecedented opportunities for these structures. decryst is free and open-source software, and can be obtained at https://***/CasperVector/decryst/; it strives to be simple yet flexible, in the hope that the underlying techniques can be adopted in more crystallographic applications.
Because of the importance of Delaunay triangulation in science and engineering, researchers have devoted extensive attention to parallelizing this fundamental algorithm. However, generating unstructured meshes for extremely large point sets remains a barrier for scientists working with large-scale or high-resolution datasets. In our previous paper, we introduced a novel algorithm, Triangulation of Independent Partitions in parallel (TIPP), which divides the domain into many independent partitions that can be triangulated in parallel. However, using only a single master process introduced a performance bottleneck and inhibited scalability. In this paper, we refine our description of the original TIPP algorithm and extend TIPP to employ multiple master processes, distributing the computational load across several machines. This new design improves both performance and scalability, and can produce 20 billion triangles using only 10 commodity nodes in under 30 minutes.
In many applications using database systems, the conventional method of transaction processing cannot be used. This is due to a lack of integration and the existence of centralized solutions. Such situations arise in heterogeneous systems, mobile database transactions and time-critical applications that require priority admission for a select group of transactions. For example, conventional deadlock detection is based on the use of delays to cause and watch for deadlocks. This creates several difficulties, such as (a) the high overhead of periodic checking, (b) the non-deterministic nature of the delays, and (c) the difficulty of scaling up centralized solutions. Existing proposals lack local processing for distributed transactions. The proposed technique uses normal message communication among peers, gives resource sites an enhanced role, and introduces asynchronous operations into transaction processing. As a result, the detection processes do not wait for time-out delays to occur. In most cases, the technique eliminates the possibility of waiting delays.
This report discusses the balance between numerical and symbolic computing in beam physics. We first try to draw attention to basic conceptual and computational problems. It is known that the main problem in modern computational beam physics is connected with realizing high-performance computing. Most of the approaches in use are not suited to computing on multiprocessor systems. Here we give some possible solutions, which are based on a matrix representation of the necessary information and on modern information technologies.
Inference of phylogenetic trees comprising hundreds or even thousands of organisms based on the maximum likelihood method is computationally intensive. We present simple heuristics which yield accurate trees for synthetic as well as real data and significantly reduce execution time. These heuristics have been implemented in a sequential, parallel, and distributed program called RAxML-II, which is freely available as open source code. We compare the performance of the sequential program with PHYML and MrBayes which, to the best of our knowledge, are currently the fastest and most accurate programs for phylogenetic tree inference based on statistical methods. Experiments are conducted using 50 synthetic 100-taxon alignments as well as nine real-world alignments comprising 101 up to 1000 sequences. RAxML-II outperforms MrBayes for real-world data both in terms of speed and final likelihood values. Furthermore, for real data RAxML-II requires less time (a factor of 2-8) than PHYML to reach PHYML's final likelihood values and yields better final trees due to its more exhaustive search strategy. For synthetic data MrBayes is slightly more accurate than RAxML-II and PHYML but significantly slower. The non-deterministic parallel program shows good speedup values and has been used to infer a 10 000-taxon tree comprising organisms from the domains Eukarya, Bacteria, and Archaea. Copyright (c) 2005 John Wiley & Sons, Ltd.
We discuss the problem of finding a dominant sequence for sending input data items from a low-end client to a server for computationally intensive tasks under the realistic assumption of unpredictable communication behavior. Under this assumption, the client has to send the input data items in a specified sequence that maximizes the number of computations the server can perform at any time. The sequence-finding problem is NP-hard in the general case. In this paper, we address three fundamental and useful applications: the product of two polynomials, matrix multiplication and the Fast Fourier Transform. We show that the sequence-finding problems of the three applications can be solved optimally in linear time. However, we also show counterexamples that rule out any possibility of finding a dominant sequence for sparse cases of the three applications. Finally, a simulation is conducted to show the usefulness of our method.
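To make the notion of a send order concrete for the polynomial-product case, here is a minimal sketch under our own simplifying assumptions, not the paper's algorithm: the client sends one coefficient at a time, and the server can form a partial product a_i * b_j as soon as it holds both coefficients. The function `products_available` and the two example orders are hypothetical names introduced only for illustration.

```python
def products_available(order):
    """Return, after each arrival, how many partial products a_i * b_j
    the server can already compute.

    `order` is a sequence of "a" / "b" labels saying which polynomial the
    next coefficient belongs to. With p coefficients of a and q of b
    received, p * q pairwise products are computable (assumed dense case).
    """
    seen_a = seen_b = 0
    counts = []
    for item in order:
        if item == "a":
            seen_a += 1
        else:
            seen_b += 1
        counts.append(seen_a * seen_b)
    return counts


if __name__ == "__main__":
    n = 4  # coefficients per polynomial (illustrative size)
    alternating = ["a", "b"] * n       # interleave the two polynomials
    blocked = ["a"] * n + ["b"] * n    # send all of a, then all of b
    print("alternating:", products_available(alternating))
    print("blocked:    ", products_available(blocked))
```

In this toy model the alternating order never has fewer computable products than the blocked order at any point in time, which is the kind of pointwise comparison the term "dominant" suggests. If communication stalls midway, the server has more useful work available under the alternating order.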
The last decade has witnessed rapid growth in both the complexity and the scale of problem domains. A variety of techniques have been developed for fostering large and even hybrid simulations over the Internet. Existing work in this direction is normally suited only to coarse-grained models. In this paper, we present a Grid infrastructure which applies to hybrid simulations comprising models of different granularities and/or models of various types. A gateway approach is proposed to present the simulation models of an individual administrative domain for fine-grained problems; it also bridges simulation models operating in multiple administrative domains to form hybrid simulations for studying large and complicated problems. A prototype infrastructure has been realized with the support of federated simulation technology. Potential applications, such as the simulation of huge crowds, are also discussed.