ISBN: 9798331530044; 9798331530037 (print)
Metastable failures in distributed databases, characterized by self-sustaining feedback loops that cause significant performance degradation, have become increasingly prevalent with the rise of complex distributed systems [1]. One of the main sustaining feedback loops in these failures is the retry storm. Such failures are triggered by temporary changes in load, which lead to a cascade of retried requests that overwhelms the system even after the initial load spike has subsided [2]. We have leveraged queuing theory to propose an analytical method for modeling metastable failures due to retry storms [3]. Building on our previous work, this proposal outlines a systematic approach to mitigating these failures by eliminating retried requests in distributed transaction systems. We focus on existing concurrency control mechanisms in which distributed transactions can be retried frequently. Specifically, we focus on two-phase locking (2PL) under high-contention workloads, where many distributed transactions can abort due to deadlocks and be retried, causing metastable failures. We propose that by preprocessing distributed transactions, we can reorder lock acquisition to avoid deadlocks and transaction retries under high-contention workloads. The behavior and correctness of this approach will be validated using the queuing model we developed.
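The idea of reordering lock acquisition can be illustrated with a minimal single-process sketch. It assumes that each transaction declares its full key set up front and that sorting the keys gives a global lock order; the abstract does not specify the preprocessing step, so `LockTable`, `transfer`, and `demo` are hypothetical names and this is not the authors' implementation.

```python
import threading


class LockTable:
    """Hypothetical in-memory lock table: one mutex per record key."""

    def __init__(self, keys):
        self._locks = {k: threading.Lock() for k in keys}

    def run(self, keys, body):
        # Assumed preprocessing step: sort the declared key set so that
        # every transaction acquires its locks in the same global order.
        ordered = sorted(set(keys))
        for k in ordered:
            self._locks[k].acquire()      # growing phase
        try:
            return body()
        finally:
            for k in reversed(ordered):
                self._locks[k].release()  # shrinking phase


def transfer(table, balances, src, dst, amount):
    def body():
        balances[src] -= amount
        balances[dst] += amount

    table.run([src, dst], body)


def demo():
    balances = {"a": 100, "b": 100}
    table = LockTable(balances)
    threads = []
    for i in range(200):
        # Opposite access orders would risk deadlock without sorting.
        src, dst = ("a", "b") if i % 2 == 0 else ("b", "a")
        threads.append(
            threading.Thread(target=transfer, args=(table, balances, src, dst, 1))
        )
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balances
```

Because no transaction ever waits for a lower-ordered key while holding a higher-ordered one, a wait-for cycle cannot form, so no transaction has to abort and retry because of a deadlock. A distributed system would additionally need the key set to be known before execution, which this sketch simply assumes.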
ISBN: 9781538655559 (print)
Major research topics on parallel and distributed frameworks focus on the reliability, performance and programmability of large-scale systems for, e.g., HPC or Big Data. The solutions proposed are often directly shaped by the large-scale nature of the problems. In contrast, high-throughput data stream generation is an important challenge for many scientific and industrial applications; it is typically well suited to small- to medium-scale systems and has to respect specific constraints on, e.g., speed, throughput or output location. In this paper we present a framework dedicated to this class of problems. We propose a performance-oriented runtime system architecture able to generate constrained data streams produced by jobs dynamically submitted by the user. Our architecture is designed to scale from a single host to a medium-sized cluster with high topology flexibility, achieving high throughput while remaining adaptable to a variety of problems. We provide experimental evidence of the ability of our framework to meet high-throughput constraints on an industrial use case, professional digital printing, which may require sustained output rates of tens of Gbit/s. Our measurements show that the system scales and reaches data rates close to the maximum throughput of our experimental cluster.
ISBN: 0818672676 (print)
We introduce and formalize the concept of two-level-hierarchies (t-l-hs) of objects. The t-l-hs are special partial directed acyclic graphs whose vertices are objects. An equivalence relation is also defined on their edges. They are used for modelling complex objects by supporting, in a uniform way, well-known abstractions such as association, aggregation and grouping, as well as specialization and generalization. Taking the t-l-h as the design unit, we develop a new design process (Uni-Design) for distributed Object-Oriented databases (DOODBs) that combines the main design approaches (Top-Down and Bottom-Up) in a consistent and complementary way, so as to capture and represent explicitly most of the semantics of a system. We also choose the t-l-h as the elementary unit of distribution in order to adopt a distribution design strategy (a DOODB is now a set of possibly interrelated t-l-hs), and we reconsider the notions of fragmentation and replication by integrating the distribution process into the last phase of the Uni-Design process.
ISBN: 0780377656 (print)
This paper presents an efficient, flexible resampling architecture for parallel particle filtering. The architecture incorporates distributed, delayed resampling mechanisms for fast resampling. It consists of up to four resampling units and 16 processing elements, whose interconnection can be dynamically reconfigured. The architecture is designed and evaluated for a bearing tracking example and targets 0.25 µm CMOS technology.
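For readers unfamiliar with the resampling step that such an architecture accelerates, the following is a minimal sequential sketch of systematic resampling, a common textbook scheme. It is swapped in purely for illustration: the abstract does not say which resampling scheme the hardware uses, and the distributed, delayed mechanisms are not modeled here. The function name is hypothetical.

```python
import random


def systematic_resample(weights, rng=random):
    """Return particle indices drawn in proportion to `weights`.

    Plain sequential systematic resampling; not the distributed,
    delayed variant described in the abstract.
    """
    n = len(weights)
    total = sum(weights)
    u0 = rng.random() / n           # single random offset in [0, 1/n)
    indices = []
    i = 0
    cumulative = weights[0] / total
    for j in range(n):
        u = u0 + j / n              # evenly spaced sampling points
        # Advance to the particle whose cumulative weight covers u.
        while u >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices
```

Particles with large weights are replicated and particles with negligible weight are dropped, which is the operation that the resampling units in a parallel particle filter have to perform across processing elements.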
The establishment of one logical database that spans countries and continents is increasingly becoming a realistic goal to achieve. This conceptual database would potentially consist of an ever growing number of component databases. In this paper, we propose a scheme to build a Worldwide Database using a two-level approach. In particular, we describe how conglomerations (small and large) of databases are formed, modified, and evolved.
ISBN: 9781450305525 (print)
We study text analysis algorithms that use global optimization methods to compute local characteristics that are consistent with properties of the entire corpus rather than computed locally based on exogenous parameters. In the iterative implementations that we consider, each step both reads and updates a database of parameter values. Motivated by a need for rapid analysis of large corpora, we have developed methods for efficient access to such databases on parallel computers. These methods combine Bloom filters, in-memory caches, and an HBase cluster to reduce communication costs greatly relative to simpler approaches that either fully distribute or fully replicate the database. Our design can achieve considerable run time, latency and storage space improvements relative to other methods. In one segmentation application, we improve performance by a factor of 3 relative to an HBase-based implementation.
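The way a Bloom filter and an in-memory cache can cut remote lookups may be sketched as follows. This is a read-path illustration only, under several assumptions: a plain dict stands in for the HBase cluster, the Bloom filter is built once from the known keys, and the cache is a small LRU. `BloomFilter`, `ParameterClient`, and all parameters are hypothetical and do not reflect the authors' actual design.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Small Bloom filter; may report false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class ParameterClient:
    """Read path: Bloom filter, then local LRU cache, then the remote store."""

    def __init__(self, remote, cache_size=1024):
        self.remote = remote            # dict standing in for the remote cluster
        self.bloom = BloomFilter()
        for key in remote:
            self.bloom.add(key)
        self.cache = OrderedDict()
        self.cache_size = cache_size
        self.remote_reads = 0

    def get(self, key, default=0.0):
        if not self.bloom.might_contain(key):
            return default              # definitely absent: no remote round trip
        if key in self.cache:
            self.cache.move_to_end(key) # mark as most recently used
            return self.cache[key]
        self.remote_reads += 1
        value = self.remote.get(key, default)
        self.cache[key] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used entry
        return value
```

Keys that were never stored are usually rejected locally by the filter, and repeated reads of hot keys are served from the cache, so only the remaining lookups reach the remote store. Handling updates, which the abstract says occur at every step, would require cache invalidation or write-through logic that this sketch omits.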
ISBN: 9781467325790 (print)
We study a basic information ranking problem in networks where each node holds an individual preference over a set of items and the goal for each node is to identify a sorted list of items with the largest aggregate preference. We would like to achieve this with a fully decentralized algorithm that uses limited per-node memory and limited pair-wise communication. We show how this problem can be reduced to a plurality selection problem, where the goal for each node is to identify an item with the largest aggregate ranking score, and show that solving the reduced problem solves the original ranking problem with high probability. We then introduce a simple and natural plurality selection algorithm for selection over m > 1 items that uses only log_2(m) + 1 bits of per-node memory and per pair-wise communication. We prove correctness of the algorithm with high probability as the number of nodes grows large for the case when each node communicates with any other node, and establish tight convergence time bounds. The information ranking problem studied in this paper is a basic ranking problem that arises in various applications, such as sorting elements in distributed computing systems and parallel databases, and may also serve as a model of decentralized inference and opinion formation in distributed environments.
ISBN: 9781479904464 (print)
The max-flow outer bound is achievable by regenerating codes for functional-repair distributed storage systems. However, the capacity of exact-repair distributed storage systems is an open problem. In this paper, the linear programming bound for exact-repair distributed storage systems is formulated. A notion of symmetrical sets for a set of random variables is given, and equalities of joint entropies for certain subsets of random variables in a symmetrical set are established. A concatenation coding scheme for exact-repair distributed storage systems is proposed, and it is shown that this scheme is sufficient to achieve any admissible rate for any exact-repair distributed storage system. Equalities of certain joint entropies of random variables induced by the concatenation scheme are shown. These equalities of joint entropies are new tools to simplify the linear programming bound and to obtain stronger converse results for exact-repair distributed storage systems.
ISBN: 0769518524 (print)
Since current information systems are characterized by dynamic changes in their configuration and scale while providing non-stop service, system management must inevitably rely on autonomic computing. Since fault tolerance is one of the important system management issues, it should also be incorporated into the autonomic computing environment. This paper discusses what should be taken into consideration and what approaches are available to realize fault tolerance in such environments.
ISBN: 9780769534725 (print)
A new parallel algorithm, based on the concept of antidiagonal wave pattern, for computing approximate inverses, is introduced for symmetric multiprocessor systems. The parallel normalized approximate inverses are used in conjunction with parallel normalized preconditioned conjugate gradient-type schemes, for the efficient solution of sparse finite element linear systems. The parallel implementation issues of the new algorithm are discussed and the parallel performance is presented, using OpenMP.
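The general notion of an antidiagonal wave pattern can be sketched as below. The sketch assumes a generic dependency structure in which entry (i, j) of an n x n grid depends only on entries from earlier antidiagonals, so that all entries with the same i + j can be computed concurrently. The abstract does not state the actual dependencies or sweep direction of the approximate inverse algorithm, and the paper's implementation uses OpenMP rather than Python; `antidiagonal_waves` is a hypothetical helper.

```python
def antidiagonal_waves(n):
    """Group the cells (i, j) of an n x n grid by antidiagonal d = i + j.

    Cells within one wave are mutually independent under the assumed
    dependency pattern and could be assigned to different threads.
    """
    waves = []
    for d in range(2 * n - 1):
        first_row = max(0, d - n + 1)
        last_row = min(d, n - 1)
        waves.append([(i, d - i) for i in range(first_row, last_row + 1)])
    return waves
```

A parallel implementation would process the waves in order, distribute the cells of each wave across processors, and synchronize between consecutive waves.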