A snapshot scan algorithm produces an "instantaneous" picture of a region of shared memory that may be updated by concurrent processes. Many complex shared memory algorithms can be greatly simplified by structuring them around the snapshot scan abstraction. Unfortunately, the substantial decrease in conceptual complexity is quite often counterbalanced by an increase in computational complexity. In this paper, we introduce the notion of a weak snapshot scan, a slightly weaker primitive that has a more efficient implementation. We propose the following methodology for using this abstraction: first, design and verify an algorithm using the more powerful snapshot scan; second, replace the more powerful but less efficient snapshot with the weaker but more efficient one, and show that the weaker abstraction nevertheless suffices to ensure the correctness of the enclosing algorithm. We give two examples of algorithms whose performance is enhanced while retaining a simple modular structure: bounded concurrent timestamping and bounded randomized consensus. The resulting timestamping protocol dominates all other currently known timestamping protocols: it matches the speed of the fastest known bounded concurrent timestamping protocol while reducing the register size by a logarithmic factor. The resulting randomized consensus protocol matches the computational complexity of the best known protocol that uses only bounded values.
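As a point of reference, the following is a minimal sketch of the snapshot-scan abstraction itself, using the classic single-writer double-collect construction; it is not the paper's weak snapshot or its bounded implementation, and the class and method names are illustrative. The scan may loop under heavy contention, which is exactly the kind of cost that motivates cheaper, weaker variants.

```python
class DoubleCollectSnapshot:
    """Toy snapshot object: k single-writer segments plus a double-collect scan.
    Each segment behaves like an atomic register (CPython list-element access
    is atomic under the GIL); scan() keeps re-reading all segments until two
    consecutive collects agree, so the returned view existed at some instant."""

    def __init__(self, k: int):
        self._segments = [(None, 0)] * k      # (value, sequence number) per writer

    def update(self, i: int, value) -> None:
        # only process i writes segment i (single-writer assumption)
        _, seq = self._segments[i]
        self._segments[i] = (value, seq + 1)  # one atomic list-element write

    def _collect(self):
        return [self._segments[i] for i in range(len(self._segments))]

    def scan(self):
        prev = self._collect()
        while True:
            cur = self._collect()
            if cur == prev:                    # no writer moved between collects:
                return [v for v, _ in cur]     # the view is a consistent snapshot
            prev = cur
```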
Recent advances in the Internet of Things show that devices are usually resource-constrained. To enable advanced applications on these devices, it is necessary to enhance their performance by leveraging external computing resources available in the network. This work presents a study of computational platforms that increase the performance of these devices based on the Mobile Cloud Computing (MCC) paradigm. The main contribution of this paper is to investigate the advantages and possibilities of architectures with multiple offloading options. To this end, a review is presented of architectures that combine the computing layers available in the infrastructure to realize this paradigm and outsource processing load. In addition, a proof-of-concept application is introduced to demonstrate its realization across all the network layers. The results of the simulations confirm the high flexibility of offloading numerous tasks across different layers and the ability to overcome unfavorable scenarios.
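To make the multi-layer offloading idea concrete, here is a hedged sketch of a textbook offloading decision (not the paper's architecture): pick the layer whose estimated completion time, compute time plus data-transfer time, is smallest. All speeds, bandwidths and round-trip times are illustrative placeholders.

```python
LAYERS = {
    #             CPU cycles/s  uplink bytes/s  round-trip s
    "device": dict(speed=5e8,   bandwidth=None, rtt=0.0),
    "edge":   dict(speed=5e9,   bandwidth=2e6,  rtt=0.01),
    "cloud":  dict(speed=5e10,  bandwidth=1e6,  rtt=0.08),
}

def estimated_time(layer, task_cycles, input_bytes):
    cfg = LAYERS[layer]
    compute = task_cycles / cfg["speed"]
    transfer = 0.0 if cfg["bandwidth"] is None else input_bytes / cfg["bandwidth"]
    return compute + transfer + cfg["rtt"]

def choose_layer(task_cycles, input_bytes):
    return min(LAYERS, key=lambda l: estimated_time(l, task_cycles, input_bytes))

if __name__ == "__main__":
    print(choose_layer(task_cycles=1e9, input_bytes=50_000))     # compute-heavy: offload to cloud
    print(choose_layer(task_cycles=1e7, input_bytes=5_000_000))  # data-heavy: stay on the device
```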
This paper discusses cyclic to acyclic transformations performed on graphs representing computational sequences. Such transformations are critical to the development of models of computations and computer systems for performance prediction. The nature of cycles in computer programs for parallel processors is discussed. Transformations are then developed which replace cyclic graph structures by mean-value equivalent acyclic structures. The acyclic equivalents retain the noncyclic part of the original graph's structure, applying a multiplicative factor that captures the mean time required for each vertex execution in the original graph. Bias introduced in the acyclic approximation is explored.
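The following is a small worked illustration of the mean-value idea (a sketch, not the paper's construction): a cycle whose body repeats with probability q per iteration executes 1/(1 - q) times on average, so it can be replaced by a single acyclic vertex whose weight carries that multiplicative factor.

```python
def mean_value_equivalent(t_body: float, repeat_prob: float) -> float:
    """Mean execution time of an acyclic vertex replacing a cycle,
    assuming geometric repetition with per-iteration loop-back probability."""
    if not 0.0 <= repeat_prob < 1.0:
        raise ValueError("repeat probability must be in [0, 1)")
    expected_iterations = 1.0 / (1.0 - repeat_prob)
    return expected_iterations * t_body

# Example: a 2 ms loop body that loops back 90% of the time behaves, on
# average, like a single 20 ms vertex in the acyclic approximation.
print(mean_value_equivalent(t_body=2.0, repeat_prob=0.9))   # 20.0
```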
This paper focuses on compact deterministic self-stabilizing solutions for the leader election problem. When the solution is required to be silent (i.e., when the state of each process remains fixed from some point in time during any execution), there exists a lower bound of Omega(log n) bits of memory per participating node, where n denotes the number of nodes in the system. This lower bound holds even in rings. We present a new deterministic (non-silent) self-stabilizing protocol for n-node rings that uses only O(log log n) memory bits per node and stabilizes in O(n log^2 n) rounds. Our protocol has several attractive features that make it suitable for practical purposes. First, it assumes an execution model that is used by existing compilers for real networks. Second, the size of the ring (or any upper bound on this size) does not need to be known by any node. Third, the node identifiers can be of various sizes. Finally, no synchrony assumption, besides weak fairness, is made. Our result shows that, perhaps surprisingly, silence can be traded for an exponential decrease in memory space without significantly increasing stabilization time or introducing restrictive assumptions.
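A quick numeric illustration of the "exponential decrease in memory space": a silent solution needs on the order of log n bits per node, whereas the non-silent protocol above needs on the order of log log n bits (constants ignored; numbers below are only to show the scale).

```python
import math

for n in (1_000, 1_000_000, 1_000_000_000):
    log_n = math.ceil(math.log2(n))
    log_log_n = math.ceil(math.log2(log_n))
    print(f"n={n:>13,}  log n = {log_n:2d} bits   log log n = {log_log_n} bits")
```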
Algebraic multigrid (AMG) solves linear systems based on multigrid principles, but in a way that depends only on the coefficients in the underlying matrix.
In this paper we address the problem of mobile agents searching for a highly harmful item (called a black hole) in a ring network. The black hole is a stationary process that destroys visiting agents upon their arrival without leaving any observable trace of such a destruction. The task is to have at least one surviving agent able to unambiguously report the location of the black hole. We consider different scenarios and in each situation we answer some computational as well as complexity questions. We first consider agents that start from the same home base (co-located). We prove that two such agents are necessary and sufficient to locate the black hole; in our algorithm the agents perform O(n log n) moves (where n is the size of the ring) and we show that such a bound is optimal. We also consider time complexity and show how to achieve the optimal bound of 2n - 4 units of time using n - 1 agents. We generalize our technique to establish a trade-off between time and number of agents. We then consider the case of agents that start from different home bases (dispersed) and we show that if the ring is oriented, two dispersed agents are still necessary and sufficient. Also in this case our algorithm is optimal in terms of number of moves (Theta(n log n)). We finally show that if the ring is unoriented, three agents are necessary and sufficient; an optimal algorithm follows from the oriented case.
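As a back-of-the-envelope illustration (this is not the paper's agent protocol), a divide-and-conquer exploration that halves the set of suspect nodes each phase pays at most about 2n moves per phase (walking out and back) over roughly log2(n) phases, which is where an O(n log n) move bound comes from.

```python
import math

def halving_move_bound(n: int) -> int:
    """Pessimistic move count for a binary-search-style exploration of a ring."""
    suspects, moves = n - 1, 0          # the home base is known to be safe
    while suspects > 1:
        moves += 2 * n                  # probe one half of the suspects and return
        suspects = math.ceil(suspects / 2)
    return moves

for n in (16, 256, 4096):
    print(n, halving_move_bound(n), round(2 * n * math.log2(n)))
```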
Software is a major source of reliability degradation in dependable systems. One of the classical remedies is to provide software fault tolerance by using N-Version Programming (NVP). However, due to requirements on non-standard hardware and the need for changes and additions at all levels of the system, NVP solutions are costly, and have only been used in special cases. In a previous work, a low-cost architecture for NVP execution was developed. The key features of this architecture are the use of off-the-shelf components, including communication standards, and that the fault tolerance functionality, including voting, error detection, fault-masking, consistency management, and recovery, is moved into separate redundancy management circuitry (one circuit for each redundant computing node). In this article we present an improved design of that architecture, specifically resolving some potential inconsistencies that were not treated in detail in the original design. In particular, we present novel techniques for enforcing replica determinism. Our improved architecture is based on using the Controller Area Network (CAN). This choice goes beyond the obvious interest of using standards in order to reduce the cost, since all the rest of the architecture is designed to take full advantage of the CAN standard features, such as data consistency, in order to significantly reduce the complexity and the cost of the resultant system while increasing its efficiency. Although initially developed for NVP, our redundancy management circuitry also supports other software replication techniques, such as active replication. (C) 2007 Elsevier B.V. All rights reserved.
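For readers unfamiliar with NVP, here is a minimal sketch of the voting step only (illustrative; in the architecture above the voter lives in dedicated redundancy management circuitry, not in application software): run N independently developed versions on the same input and accept a result only when a majority agrees.

```python
from collections import Counter

def nvp_vote(results):
    """Return the majority result, or raise if no majority exists."""
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 > len(results):
        return value                     # majority agreement masks faulty versions
    raise RuntimeError("no majority: error detected but cannot be masked")

# Three hypothetical versions of the same computation (version_b is faulty here).
version_a = lambda x: x * x
version_b = lambda x: x * x + 1          # injected fault for illustration
version_c = lambda x: x ** 2

print(nvp_vote([f(12) for f in (version_a, version_b, version_c)]))   # 144
```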
The author describes how parallel computing can be integrated into courses throughout the computer science undergraduate curriculum. First, he explains why parallel computing is important, observing that many large, computationally intensive problems are solved with parallel programs. He reasons that as the related areas of symmetric multiprocessing and distributed computing also become more important, practitioners will need to fully understand parallel computing. Next, he describes the current situation at most colleges and universities, noting that parallel computing is offered only as an upper-level elective. The author then describes how parallel computing can be integrated into each course. He begins with the introductory courses and continues with those on computer organization and computer architecture, operating systems, programming languages, theory of computation, standard algorithms, discrete event simulation, and numerical analysis. Finally, the author explains the requirements for successfully implementing this integrated approach.
This work proposes a sub-optimal method based on a two-layer structured meta-deep reinforcement learning (MDRL) approach to address the hardware impairment (HWI) optimization issue in large intelligent surface (LIS) systems. This method, designed for distributed LIS systems with reflection matrices, effectively enhances the system capacity and performance despite HWIs. Building upon existing techniques of dividing large-area LIS systems into multiple small-area subsystems, the proposed method is shown in simulation to achieve sub-optimal LIS performance with fewer samples in diverse dynamic wireless environments. This innovative approach enhances the adaptability of distributed LIS systems and offers an effective HWI management strategy, paving the way for future LIS system optimization.
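The toy calculation below shows why HWI matters in the first place (it is not the paper's MDRL method): modelling impairments as residual random phase errors on the reflecting elements, the coherent combining gain of the surface, and hence the capacity, degrades as the error grows. All parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def lis_capacity(n_elements, phase_error_std, snr_per_element=0.01):
    """Capacity (bit/s/Hz) of an ideally phase-aligned surface whose elements
    suffer a residual Gaussian phase error caused by hardware impairments."""
    errors = rng.normal(0.0, phase_error_std, n_elements)
    gain = np.abs(np.sum(np.exp(1j * errors))) ** 2      # coherent combining gain
    return np.log2(1.0 + snr_per_element * gain)

for std in (0.0, 0.3, 0.8):       # radians of residual phase error
    caps = [lis_capacity(n_elements=256, phase_error_std=std) for _ in range(200)]
    print(f"phase-error std {std:.1f} rad -> mean capacity {np.mean(caps):.2f} bit/s/Hz")
```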
In this paper we propose a distributed architecture to provide machine learning practitioners with a set of tools and cloud services that cover the whole machine learning development cycle: from model creation, training, validation and testing to serving the models as a service, sharing them and publishing them. In this respect, the DEEP-Hybrid-DataCloud framework allows transparent access to existing e-Infrastructures, effectively exploiting distributed resources for the most compute-intensive tasks of the machine learning development cycle. Moreover, it provides scientists with a set of Cloud-oriented services to make their models publicly available, adopting a serverless architecture and a DevOps approach that allow the developed models to be easily shared, published and deployed.
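For orientation, this is a minimal sketch of what "model serving as a service" can look like; it is illustrative only and is not the DEEP-Hybrid-DataCloud API. A trained model is wrapped behind a small HTTP endpoint so it can be deployed as a stateless, serverless-style function; `load_model` and the `EchoModel` stand-in are hypothetical placeholders.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
model = None          # loaded lazily so cold starts stay cheap

def load_model():
    """Placeholder: load the trained model artifact (e.g., from object storage)."""
    class EchoModel:                       # stand-in for a real trained model
        def predict(self, features):
            return [sum(row) for row in features]
    return EchoModel()

@app.route("/predict", methods=["POST"])
def predict():
    global model
    if model is None:
        model = load_model()
    features = request.get_json(force=True)["features"]
    return jsonify({"predictions": model.predict(features)})

if __name__ == "__main__":
    app.run(port=8080)
    # e.g.: curl -X POST localhost:8080/predict -d '{"features": [[1, 2, 3]]}'
```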