Chandra and Toueg [J. ACM 43 (1996) 225] and Fromentin et al. [Proc. IEEE Internat. Conf. on Distrib. Comput., 1999, p. 470], respectively, stated that the weakest failure detector for either of non-blocking atomic commitment and terminating reliable broadcast is the perfect failure detector P. Recently, Guerraoui [IPL 79 (2001) 99] presented a counterexample to those results, exhibiting a failure detector called Marabout (M) that is incomparable to P and yet solves those problems. In this paper we present three new perfect failure detector classes as alternatives to P and M. All our classes are weaker than P. Furthermore, two of them are also weaker than M, and yet solve non-blocking atomic commitment and terminating reliable broadcast. Interestingly, our failure detector classes are implementable whenever P is implementable (e.g., in a synchronous system), which is not the case for M. (C) 2003 Elsevier B.V. All rights reserved.
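The abstract notes that P is implementable in a synchronous system. A minimal sketch of how such an implementation typically works, assuming bounded message delay and a periodic heartbeat (the class names, periods, and timeout values here are illustrative, not from the paper):

```python
import time

class TimeoutFailureDetector:
    """Sketch of a timeout-based perfect failure detector.

    Assumption: the system is synchronous, i.e. every correct process
    sends a heartbeat at least once per HEARTBEAT_PERIOD and message
    delay is bounded, so silence longer than TIMEOUT implies a crash
    (completeness) and no live process is ever suspected (accuracy).
    """
    HEARTBEAT_PERIOD = 1.0
    TIMEOUT = 3.0  # must exceed heartbeat period + max message delay

    def __init__(self, processes):
        now = time.monotonic()
        # Time we last heard from each monitored process.
        self.last_heard = {p: now for p in processes}

    def heartbeat(self, sender, at=None):
        # Record a heartbeat from `sender` (timestamp injectable for tests).
        self.last_heard[sender] = time.monotonic() if at is None else at

    def suspected(self, at=None):
        # Processes silent for longer than TIMEOUT are declared crashed.
        now = time.monotonic() if at is None else at
        return {p for p, t in self.last_heard.items() if now - t > self.TIMEOUT}
```

Both synchrony assumptions are essential: without the delay bound, a slow process could be wrongly suspected, violating the strong accuracy property that defines P.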
Kittyhawk represents our vision for a Web-scale computational resource that can accommodate a significant fraction of the world's computation needs and enable various parties to compete and cooperate in the provisioning of services on a consolidated platform. In this paper, we explain both the vision and the system architecture that supports it. We demonstrate these ideas by way of a prototype implementation on the IBM Blue Gene®/P platform. In the Kittyhawk prototype, we define a set of basic services that enable the allocation and interconnection of computing resources. Using examples, we show how higher layers of services can be built on our basic services and standard open-source software.
It is well known that the average case deterministic communication complexity is bounded below by an entropic quantity, which one would now call deterministic information complexity. In this paper we show a corresponding upper bound. We also improve known lower bounds for the public coin Las Vegas communication complexity by a constant factor. (c) 2006 Elsevier B.V. All rights reserved.
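One standard way to state the entropic lower bound referred to here (a sketch of the folklore argument, not this paper's notation): since the transcripts of a deterministic protocol $\Pi$ form a prefix-free set, Shannon's source-coding (Kraft inequality) argument gives, for any input distribution $\mu$,

```latex
\mathbb{E}_{(x,y)\sim\mu}\bigl[\,|\Pi(x,y)|\,\bigr]
  \;\ge\; H_{\mu}\bigl(\Pi(X,Y)\bigr),
```

i.e. the average communication cost is at least the entropy of the transcript distribution. The paper's contribution is a matching upper bound in the other direction.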
Modern scientific data mainly consist of huge data sets gathered by a very large number of techniques and stored in much diversified and often incompatible data repositories. More generally, in the e-science environment, integrating services across distributed, heterogeneous, dynamic "virtual organizations" formed by different resources within a single enterprise is considered a critical and urgent requirement. In the last decade, Astronomy has become an immensely data-rich field due to the evolution of detectors (plates to digital to mosaics), telescopes and space instruments. The Virtual Observatory approach consists of the federation under common standards of all astronomical archives available worldwide, as well as data analysis, data mining and data exploration applications. The main drive behind such an effort is that once the infrastructure is complete, it will allow a new type of multi-wavelength, multi-epoch science, which can as yet only barely be imagined. Data mining, or knowledge discovery in databases, while being the main methodology to extract the scientific information contained in such Massive Data Sets (MDS), poses crucial problems since it must orchestrate solutions to transparent access to different computing environments, scalability of algorithms, reusability of resources, etc. In the present paper we summarize the present status of the MDS in the Virtual Observatory and what is currently being done and planned to bring in advanced data mining methodologies, in the case of the DAME (DAta Mining and Exploration) project. (C) 2010 Elsevier B.V. All rights reserved.
The Media Accelerating Peer Services system extends P2P infrastructures to improve multimedia services across heterogeneous computing platforms. In this article, we present an architecture and a resource management and adaptation framework that transcend existing infrastructures to accommodate and accelerate multimedia peer applications and services. We also propose key technology components that support seamless adaptation of resources to enhance quality of service, and the building of better tools and applications that utilize the underlying power of the peer-computing network.
We present private secure coded computation that ensures both the master's privacy and data security against the workers, where the master aims to compute a function of its private data and a specific dataset in a library exclusively owned by the external workers. For privacy, the master should conceal from the workers the index of that dataset in the library. For security, the master should encrypt its private data. As an achievable scheme, we propose private secure polynomial codes and compare the proposed scheme with private polynomial codes in private coded computation and the optimal scheme of robust private information retrieval (RPIR).
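The security half of the scheme rests on polynomial (Shamir-style) masking of the master's data. A minimal sketch of that ingredient, not the paper's actual codes: the master encodes a secret value as a random degree-1 polynomial over a prime field, each worker computes on its share against a public library value, and the master interpolates the product. The field size, evaluation points, and scalar (rather than matrix) data are all simplifying assumptions, and the index-privacy (PIR) side of the scheme is omitted entirely.

```python
import random

Q = 2_147_483_647  # prime modulus (2^31 - 1); field size is an assumption

def encode(a, alphas):
    """Mask secret `a` as p(x) = a + r*x; worker i receives p(alpha_i).

    Any single share is uniform over GF(Q), so one worker learns
    nothing about `a` (information-theoretic security against one worker).
    """
    r = random.randrange(Q)
    return [(a + r * x) % Q for x in alphas]

def worker(share, b):
    # Each worker multiplies its share by the (public) library value b,
    # producing an evaluation of the degree-1 polynomial (a + r*x) * b.
    return (share * b) % Q

def decode(alphas, results):
    """Lagrange-interpolate at x = 0 to recover the constant term a*b."""
    total = 0
    for i, (xi, yi) in enumerate(zip(alphas, results)):
        num, den = 1, 1
        for j, xj in enumerate(alphas):
            if j != i:
                num = (num * (-xj)) % Q
                den = (den * (xi - xj)) % Q
        # pow(den, Q-2, Q) is the modular inverse (Fermat's little theorem).
        total = (total + yi * num * pow(den, Q - 2, Q)) % Q
    return total
```

Since the masked polynomial has degree 1, responses from any two workers suffice to decode, which is the same recovery-threshold mechanism that the full polynomial-code constructions generalize.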
Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are commonplace. How to store such large graphs efficiently? What are the core operations/queries on those graphs? How to answer the graph queries quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for a parallel, distributed setting and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using Mapreduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space and accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.
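A unifying idea in systems of this kind is to express graph queries as matrix-vector multiplications over the adjacency matrix, which parallelize naturally on MapReduce. A toy, single-machine sketch of that reduction (the function names and adjacency-list encoding are illustrative, not Gbase's API):

```python
def neighbors(adj, seeds):
    """One matvec step: nodes reachable in exactly one hop from `seeds`.

    `adj` maps node -> iterable of out-neighbors (a sparse adjacency
    matrix in adjacency-list form); `seeds` is the support of the
    0/1 indicator vector being multiplied.
    """
    out = set()
    for s in seeds:
        out.update(adj.get(s, ()))
    return out

def k_hop(adj, seeds, k):
    """k-hop neighborhood query as k repeated matvec steps."""
    frontier, seen = set(seeds), set(seeds)
    for _ in range(k):
        frontier = neighbors(adj, frontier) - seen
        seen |= frontier
    return seen
```

In the distributed setting each matvec step becomes a MapReduce join between blocks of the (compressed) adjacency matrix and the current frontier vector, which is where the storage and indexing schemes pay off.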
In the atomic snapshot system model, the processes of an asynchronous distributed system communicate by atomic write and atomic snapshot read operations on a shared memory consisting of single-writer multiple-reader registers. The processes may fail by crashing. It is shown that in this model, a wait-free full-information protocol complex is homotopy equivalent to the underlying input complex. A span in the sense of Herlihy and Shavit provides the homotopy equivalence. It follows that the protocol and input complexes are indistinguishable by ordinary homology or homotopy groups.
Subtrajectory query is a fundamental operator in mobility data management, useful in applications such as trajectory clustering, co-movement pattern mining and contact tracing in epidemiology. In this paper, we make the first attempt to study subtrajectory query in trillion-scale GPS databases, so as to support applications with urban-scale moving users and weeks-long historical data. We develop SQUID as a distributed subtrajectory query processing engine on Spark, with threefold technical contributions. First, we propose compact index and storage layers to handle massive trajectory datasets with trillion-scale GPS points. Second, we leverage hybrid partitioning, together with local indexes that are disk I/O friendly, to facilitate pruning. Third, we devise a novel filter-and-refine query processing framework to effectively reduce the number of trajectories for verification. Our experiments are conducted on huge trajectory datasets with up to 520 billion GPS points. The results validate the compactness of the storage mechanism and the scalability of the distributed query processing framework.
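The filter-and-refine pattern named above can be sketched on a single machine, assuming a simple grid index and a point-proximity match predicate (the cell size, match criterion, and all names here are illustrative simplifications, not SQUID's actual design):

```python
from collections import defaultdict

CELL = 10.0  # grid cell side length; an assumed tuning parameter

def cell(p):
    # Map a 2-D point to its coarse grid cell.
    return (int(p[0] // CELL), int(p[1] // CELL))

def build_index(trajs):
    """Coarse index: cell -> ids of trajectories that touch it."""
    idx = defaultdict(set)
    for tid, pts in trajs.items():
        for p in pts:
            idx[cell(p)].add(tid)
    return idx

def query(trajs, idx, qpts, eps):
    """Filter by grid cells, then refine by exact distance checks."""
    qcells = {cell(p) for p in qpts}
    cand = set().union(*(idx.get(c, set()) for c in qcells))   # filter
    hits = set()
    for tid in cand:                                            # refine
        pts = trajs[tid]
        if all(any((p[0]-q[0])**2 + (p[1]-q[1])**2 <= eps*eps for p in pts)
               for q in qpts):
            hits.add(tid)
    return hits
```

The point of the pattern is that the cheap filter step touches only the index, so the expensive point-by-point verification runs on a small candidate set; the distributed engine additionally partitions both steps across workers. (A production filter would also probe neighboring cells to avoid boundary misses; this sketch omits that.)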
The concept of the metaverse is still in its early stages, but many leading tech companies have invested heavily in research and development for this technology, and the development of metaverse smart cities is a significant trend. In the metaverse environment, integrating information systems is crucial for analyzing AI big data, and establishing an integrated platform for medical information systems is key to advancing information technology. In this context, creating an efficient and unified integration platform to eliminate medical information silos and reduce system integration costs has become a pressing issue in medical informatization. This paper proposes a medical information system integration method based on an integration platform and utilizing cloud computing technology as a data center. The core business layer uses the integration software "Ensemble" as the integration platform. The underlying data center employs a Hadoop storage cluster with distributed data storage and parallel computing technology, and existing scheduling algorithms are studied and analyzed in order to enhance resource scheduling for small medical files. The effectiveness of the algorithm is simulated and verified on an experimental platform, demonstrating improved efficiency in resource scheduling.
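The "small medical file" problem mentioned here is the well-known HDFS pathology: each tiny file costs a NameNode metadata entry and an underfilled block. A common mitigation, sketched below as a single-machine analogue (this is a generic packing technique, not the paper's specific scheduling algorithm), is to pack many small records into one container blob with an offset index:

```python
import io

def pack(files):
    """Pack small files into one blob.

    `files`: dict of name -> bytes. Returns (blob, index) where
    index maps name -> (offset, length), so one large object
    replaces thousands of tiny ones in the distributed store.
    """
    blob, index, offset = io.BytesIO(), {}, 0
    for name, data in files.items():
        blob.write(data)
        index[name] = (offset, len(data))
        offset += len(data)
    return blob.getvalue(), index

def read(blob, index, name):
    # Random access to one packed record via the offset index.
    off, length = index[name]
    return blob[off:off + length]
```

Hadoop's own SequenceFile and HAR formats embody the same idea; a scheduler can then operate on a small number of large, well-placed blocks instead of millions of scattered records.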