ISBN: 9781424452910 (print)
This paper proposes a data race prevention scheme for the View-Oriented Parallel Programming (VOPP) model. VOPP is a novel shared-memory, data-centric parallel programming model that uses views to bundle mutual exclusion with data access. We have implemented the data race prevention scheme with a memory protection mechanism. Experimental results show that the extra overhead of memory protection is trivial in our applications. We also present a new VOPP implementation, Maotai 2.0, which has advanced features such as deadlock avoidance, producer/consumer views and system queues, in addition to the data race prevention scheme. The performance of Maotai 2.0 is evaluated and compared with modern programming models such as OpenMP and Cilk.
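As a rough illustration of what "bundling mutual exclusion with data access" can mean, the Python sketch below wraps shared data in a hypothetical View class that can only be read or written while the view is held. This is not the Maotai 2.0 API: the names View, acquire, add_many and run_demo are assumptions made for this example, and the runtime ownership check is only a stand-in for the memory protection mechanism the abstract mentions.

```python
import threading
from contextlib import contextmanager


class View:
    """Hypothetical view: shared data reachable only while the view is held."""

    def __init__(self, data):
        self._data = data
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread currently holding the view

    @contextmanager
    def acquire(self):
        # Holding the view implies holding its lock, so access is exclusive.
        with self._lock:
            self._owner = threading.get_ident()
            try:
                yield self
            finally:
                self._owner = None

    def _check(self):
        # Stand-in for memory protection: reject access outside acquire().
        if self._owner != threading.get_ident():
            raise RuntimeError("view accessed without being acquired")

    def read(self):
        self._check()
        return self._data

    def write(self, value):
        self._check()
        self._data = value


def add_many(view, n):
    # Each increment happens entirely inside the view, so no update is lost.
    for _ in range(n):
        with view.acquire():
            view.write(view.read() + 1)


def run_demo(num_threads=4, increments=1000):
    counter = View(0)
    threads = [
        threading.Thread(target=add_many, args=(counter, increments))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with counter.acquire():
        return counter.read()
```

In this sketch, a thread that touches the data without first acquiring the view fails immediately instead of silently racing, which is the general idea behind preventing (rather than merely detecting) data races.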
ISBN: 9783642030949 (print)
In the sequential model of programming, instructions in a program are executed sequentially. Existing programming languages are mainly designed for the sequential model. As the programming paradigm shifts from sequential to distributed computing, existing sequential programming languages show their limitations. Nevertheless, sequential languages are the ones most programmers are familiar with. One motivation of this research is to implement a framework that supports the implementation of distributed applications using sequential programming languages such as C/C++, COBOL, and Java. In this paper, we present an implementation of a framework for open distributed programming. Allowing programmers to write distributed programs in their favorite sequential programming languages sets this approach apart from existing programming paradigms.
ISBN: 9781424452910 (print)
Topology embedding enables us to execute a protocol designed for a specific (virtual) topology on another (real) topology by embedding the virtual topology on the real topology. In this paper, we propose a self-stabilizing emulation technique that provides reliable communication on a virtual topology in the presence of transient faults. The proposed protocol improves on the execution slowdown of previous protocols [7], [8] and provides adaptive message delivery delay on the emulated channels, which is a new type of adaptability against transient faults.
ISBN: 9781424452910 (print)
Fault tolerance is an important requirement for long-running parallel programs. This paper presents a different approach to fault-tolerance support in message-passing parallel programs, based on their structural and behavioral characteristics, commonly known as patterns. A classification of these patterns and their applicable fault-tolerance strategies is intended to help an application developer incorporate appropriate fault-tolerance strategies into an application. Fault-tolerance strategies for two of the patterns are discussed, and one specific strategy is elaborated and analyzed. The presented strategies have been incorporated into a fault-tolerance support framework called FT-PAS. One objective of the framework is to separate fault-tolerance details from an application developer's main objectives (separation of concerns). The paper presents the additional key features of the framework and concludes with a discussion of current and future research directions.
ISBN: 9781424452910 (print)
In a ubiquitous computing environment, contexts are initially acquired and stored on nodes scattered over the environment. However, traditional context reasoning applies a centralized approach, which increases the load on the reasoning server and the cost of communicating contexts. Rule-based reasoning, an important approach to context reasoning, can be easily decomposed into independent propositions. Therefore, rule-based context reasoning can be decomposed and distributed over the nodes of the ubiquitous environment to decrease the computing load of the reasoning server. In this paper, we propose a distributed fuzzy reasoning Petri net model (dFRPN) for the decomposition of distributed fuzzy reasoning. The dFRPN model is able to formalize the distribution of fuzzy reasoning over every node of the environment. It has a hierarchical structure, which makes the description of reasoning on nodes clearer. Considering the limited capabilities of some mobile nodes such as PDAs and smartphones, we add a special migrating transition to define load detection and the corresponding actions. At the end of the paper, the feasibility of the dFRPN model is validated through a case study of a context-aware personalized recommendation system.
ISBN: 9781424452910 (print)
We propose a new method that accelerates existing Byzantine Fault Tolerance (BFT) protocols for asynchronous distributed systems by parallelizing the consensus instances involved. BFT realizes a system that is reliable against Byzantine failures and is usually achieved by repeatedly executing a consensus for a set of requests. Our method parallelizes the consensus consistently by introducing an extra consensus on the order in which agreed requests are processed. We show the correctness of our method and analyze its performance in comparison with an existing non-parallelizing method and a naively parallelizing method. The results indicate that our parallelizing method is approximately 20% faster than those methods in configurations where many replicas are running in order to increase reliability.
ISBN: 9781424452910 (print)
General Purpose computing over Graphical Processing Units (GPGPUs) is a huge paradigm shift in parallel computing that promises a dramatic increase in performance. But GPGPUs also bring an unprecedented level of complexity in algorithmic design and software development. In this paper we describe the challenges and design choices involved in parallelizing the Bayesian Optimization Algorithm (BOA) to solve complex combinatorial optimization problems on nVidia commodity graphics hardware using the Compute Unified Device Architecture (CUDA). BOA is a well-known multivariate Estimation of Distribution Algorithm (EDA) that incorporates methods for learning a Bayesian Network (BN). It then uses the BN to sample new promising solutions. Our implementation is fully compatible with modern commodity GPUs, and therefore we call it gBOA (BOA on GPU). In the results section, we show several numerical tests and performance measurements obtained by running gBOA on an nVidia Tesla C1060 GPU. We show that in the best case we can obtain a speedup of up to 13x.
ISBN: 9781424452910 (print)
We have designed and implemented the Blue Whale File System (BWFS), a scalable distributed file system for large distributed data-intensive applications. Sharing many features with previous distributed file systems, BWFS has successfully met our storage needs and is widely deployed in many fields. Although excellent for high-bandwidth access to large files, BWFS's out-of-band data transfer mode is inefficient under small-file-intensive workloads. To improve the overall performance of the file system, we propose a novel data transfer scheme. In this scheme, BWFS transfers data with a hybrid policy: small files are transferred in in-band mode, while large files are transferred in out-of-band mode. The prototype design and implementation are described, and experiments are presented that demonstrate the significant performance benefits of our prototype under small-file-intensive workloads. For small-file-intensive applications, BWFS achieves significantly higher throughput, an increase of 60%.
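The hybrid policy described above amounts to a per-file choice between two transfer modes. The following Python sketch shows one way such a decision could be expressed. It is only an assumption-laden illustration: the function name choose_transfer_mode and the 64 KiB cutoff are hypothetical, since the abstract does not state how BWFS distinguishes small files from large ones.

```python
# Hypothetical cutoff in bytes; the abstract does not give the real value.
SMALL_FILE_THRESHOLD = 64 * 1024


def choose_transfer_mode(size_bytes, threshold=SMALL_FILE_THRESHOLD):
    """Pick a transfer mode for one file under a size-based hybrid policy.

    Small files go in-band (data travels with the request), which avoids
    the setup cost of a separate data path. Large files go out-of-band,
    which keeps high-bandwidth transfers on a dedicated data path.
    """
    if size_bytes < 0:
        raise ValueError("file size must be non-negative")
    return "in-band" if size_bytes <= threshold else "out-of-band"


def plan_transfers(file_sizes, threshold=SMALL_FILE_THRESHOLD):
    """Map each file name to its transfer mode."""
    return {
        name: choose_transfer_mode(size, threshold)
        for name, size in file_sizes.items()
    }
```

For example, plan_transfers({"log.txt": 4096, "video.bin": 10 * 1024 * 1024}) assigns the first file to in-band mode and the second to out-of-band mode under the assumed threshold.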
ISBN: 9781424447350 (print)
This paper deals with learning the membership functions for Mamdani Fuzzy Systems, namely the number of labels of the variables and their tuning, in order to obtain a set of Linguistic Fuzzy Systems with different trade-offs between accuracy and complexity, through the use of a two-level evolutionary multi-objective algorithm. The presented methodology employs a high-level main evolutionary multi-objective heuristic that searches for the number of labels, and several distributed low-level heuristics, also evolutionary, that tune the membership functions of the candidate variable partitions.
ISBN: 9781424452910 (print)
We propose an approximation algorithm for the problem of Fault-Tolerant Facility Location, which is implemented in a distributed and asynchronous manner within O(n) rounds of communication. Here n is the number of vertices in the network. As far as we know, the performance guarantee of similar (centralized) algorithms remains unknown except in a special case where all cities have a uniform connectivity requirement. In this paper, we assume that a shortest-path routing scheme is deployed and that the set R, which represents the distinct levels of fault-tolerant capability provided by the system (i.e., distinct connectivity requirements), has a constant (given) size. We prove that the cost of our solution is no more than |R| · F* + C* in the general case, where F* and C* are respectively the facility cost and connection cost in an optimal solution. Furthermore, extensive numerical experiments showed that the quality of our solutions is comparable to that of the optimal solutions when |R| is no more than 10.