In a mediator, query usage should be carefully monitored to determine the optimized set of materialized sub-queries, since the integration schema of a mediator can be incrementally modified and the evaluation frequency of a global query can also be continuously varied. This paper proposes a theoretical basis for adaptive selection of materialized sub-queries such that available storage in a mediator can be highly utilized at any time. In order to differentiate the recent usage of a query from the past, the accumulated usage frequency of a query decays as time goes by. Consequently, it is possible to find the optimum set of materialized sub-queries which minimizes the total evaluation cost of global queries in linear search complexity.
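The decaying usage frequency described above can be illustrated with a minimal sketch. The exponential half-life form of the decay, the class name `DecayedCounter`, and the greedy `greedy_select` ranking are all assumptions made for illustration; the abstract does not give the paper's decay function or its linear-time selection procedure, and the greedy heuristic here is not claimed to be optimal.

```python
import math

class DecayedCounter:
    """Usage count whose weight halves every `half_life` time units (assumed decay form)."""

    def __init__(self, half_life):
        self.rate = math.log(2) / half_life
        self.value = 0.0
        self.last = 0.0

    def _decay_to(self, now):
        # Shrink the accumulated count by the time elapsed since the last update.
        self.value *= math.exp(-self.rate * (now - self.last))
        self.last = now

    def hit(self, now):
        """Record one evaluation of the query at time `now`."""
        self._decay_to(now)
        self.value += 1.0

    def read(self, now):
        """Return the decayed usage frequency as of time `now`."""
        self._decay_to(now)
        return self.value


def greedy_select(candidates, budget, now):
    """Pick sub-queries to materialize within a storage budget.

    candidates: list of (name, size, saving_per_use, counter) tuples (hypothetical layout).
    Ranks by decayed usage * saving per unit of storage; a heuristic, not the paper's method.
    """
    ranked = sorted(
        candidates,
        key=lambda c: c[3].read(now) * c[2] / c[1],
        reverse=True,
    )
    chosen, used = [], 0
    for name, size, _, _ in ranked:
        if used + size <= budget:
            chosen.append(name)
            used += size
    return chosen
```

With a half-life of 10, a query hit once at time 0 contributes 0.5 at time 10, so a sub-query used long ago ranks below one used recently even if both were used equally often.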
Today's common computing hardware (Internet-connected desktop PCs and inexpensive, commodity off-the-shelf sensors such as Webcams) is an ideal platform for a worldwide sensor web. IrisNet provides a software infrastructure for this platform that lets users query globally distributed collections of high-bit-rate sensors powerfully and efficiently.
Information integration systems allow users to express queries over high-level conceptual models. However, such queries must subsequently be evaluated over collections of sources, some of which are likely to be expensive to use or subject to periods of unavailability. As such, it would be useful if information integration systems were able to provide users with estimates of the consequences of omitting certain sources from query execution plans. Such omissions can affect both the soundness (the fraction of returned answers which are correct) and the completeness (the fraction of correct answers which are returned) of the answer set returned by a plan. Many recent information integration systems have used conceptual models expressed in description logics (DLs). This paper presents an approach to estimating the soundness and completeness of queries expressed in the ALCQI DL. Our estimation techniques are based on estimating the cardinalities of query answers. We have conducted some statistical evaluation of our techniques, the results of which are presented here. We also offer some suggestions as to how estimates for cardinalities of subqueries can be used to aid users in improving the soundness and completeness of query plans. (C) 2003 Elsevier B.V. All rights reserved.
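Reading soundness as the fraction of returned answers that are correct, and completeness as the fraction of correct answers that are returned, both measures reduce to ratios of cardinalities. The sketch below assumes the three cardinality estimates are already available; the function name `estimate_quality` and the convention of returning 1.0 for an empty denominator are illustrative choices, not taken from the paper.

```python
def estimate_quality(n_returned, n_correct, n_returned_and_correct):
    """Estimate (soundness, completeness) of a plan from cardinality estimates.

    n_returned:             estimated number of answers the plan returns
    n_correct:              estimated number of answers to the original query
    n_returned_and_correct: estimated size of the overlap of the two sets
    """
    # Assumed convention: an empty denominator yields 1.0 (nothing wrong, nothing missed).
    soundness = n_returned_and_correct / n_returned if n_returned else 1.0
    completeness = n_returned_and_correct / n_correct if n_correct else 1.0
    return soundness, completeness
```

For example, a plan estimated to return 100 answers, 80 of which are among the 160 correct answers, would have soundness 0.8 and completeness 0.5.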
Distributed query processing is important for distributed database systems. In past years, research on distributed query processing has focused on how to realize join operations with different operators such as the semi-join and the Bloom filter. Experiments show that the hash-semijoin, which uses Bloom filters, almost always does better than the semi-join for query processing. However, as long as a Bloom filter is used, collisions cannot be avoided. To make processing cheaper, some past work uses two or more Bloom filters to perform the hash-semijoin. Several factors still affect the cost and the optimization result: 1. It is unclear how to decide the best number of Bloom filters and which kind of Bloom filter to choose. 2. There is no way to avoid collisions when using Bloom filters. 3. A Bloom filter cannot keep the exact location information of the joining attributes (loss of join information). 4. A Bloom filter cannot incorporate useful composite semi-joins into the process. Taking the idea of the PERF join into account, why not keep the Bloom filter (hash-semijoin) concept but devise a new kind of filter, the "Complete Reducing Filter" (CRF), which avoids the disadvantages of the Bloom filter while inheriting its advantages? We propose and implement a new algorithm called Complete Reducing Filter (CRF), based on the PERF join, which keeps the join location information and lowers transmission cost (because it still uses the filter concept). At the same time, CRF can incorporate composite semi-joins into the process, which is impossible when only a Bloom filter is used. With this variation of the Bloom filter, we try to achieve better performance at lower cost.
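The collision problem mentioned above can be seen in a minimal Bloom-filter semijoin sketch. This is not the CRF algorithm, whose details the abstract does not give; the class `BloomFilter`, the function `hash_semijoin`, the SHA-256-based hashing, and the default sizes are all assumptions chosen for illustration.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter; one byte per bit for readability rather than compactness."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive `num_hashes` bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key):
        # True for every added key; may also be True for others (a collision).
        return all(self.bits[p] for p in self._positions(key))


def hash_semijoin(r_keys, s_rows, key_of, num_bits=64, num_hashes=2):
    """Reduce s_rows using a Bloom filter built from R's join keys.

    The site holding R ships only the filter; the site holding S keeps the rows
    whose key passes it. False positives can survive; matching rows are never dropped.
    """
    bf = BloomFilter(num_bits, num_hashes)
    for k in r_keys:
        bf.add(k)
    return [row for row in s_rows if bf.might_contain(key_of(row))]
```

The filter records only that some key hashed to each position, so it cannot say which tuple of R produced a match. That is the loss of join information listed as factor 3.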
Query processing in a mobile computing environment involves join processing among different sites, which include static servers and mobile computers. Because of the asymmetric features of a mobile computing environment, conventional query processing for a distributed database cannot be directly applied to a mobile computing system. In this paper, we first explore some unique features of a mobile environment and then, in light of these features, devise query processing methods for both join and query processing. Remote mobile joins are said to be effectual if, when interleaved into a join sequence, they reduce the data transmission cost required for distributed mobile query processing. Since mobile relations are employed as reducers in our proposed query processing cost model, more mobile joins in the query processing lead to less data transmitted through the network. With proper scheduling, interleaving effectual remote mobile joins into a query schedule can significantly reduce the total amount of data transmission among different sites. A simulator is developed to evaluate the performance of the algorithms devised. Our results show that the approach of interleaving the processing of distributed mobile queries with effectual remote mobile joins is not only efficient, but also effective in reducing the total data transmission cost required to process distributed mobile queries.
The optimization of general queries in a distributed database management system is an important and challenging research issue. The problem is to find an optimal evaluation strategy for a given general query. In this paper, we propose an approach based on a combination of join and parallel semijoin operations to minimize the amount of data transmission in distributed query processing. First, we describe an efficient distributed query processing strategy using only semijoins. This strategy selects an optimal set instead of a sequence of semijoins to be executed in parallel in three phases: a projection phase, a transmission phase and a reduction phase. Then, we apply a sequence of joins as reducers for query processing. Furthermore, we consider the problem of finding an optimal general sequence that fully reduces a general join query graph. We present a new method that combines parallel and sequential semijoins. We report on experiments that show that our approach based on parallel semijoins is not only efficient but also effective in reducing the total amount of data transmission required to process distributed queries. (C) 1999 Elsevier Science B.V. All rights reserved.
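The three phases named above (projection, transmission, reduction) can be sketched for a single semijoin. This is a generic textbook semijoin under assumed names (`semijoin`, dictionary-shaped rows), not the paper's selection of an optimal semijoin set; the count of distinct shipped values stands in for transmission cost.

```python
def semijoin(target_rows, target_key, source_rows, source_key):
    """Reduce target_rows to those that join with source_rows.

    Returns (reduced_rows, values_shipped), where values_shipped is the number of
    distinct join values sent from the source site (a stand-in for transmission cost).
    """
    # Projection phase: keep only the distinct join-column values of the source.
    projected = {row[source_key] for row in source_rows}
    # Transmission phase: `projected` is what would be shipped to the target site.
    values_shipped = len(projected)
    # Reduction phase: drop target tuples that cannot take part in the join.
    reduced = [row for row in target_rows if row[target_key] in projected]
    return reduced, values_shipped
```

A usage example with hypothetical relations:

```python
customers = [{"cid": 1}, {"cid": 2}]
orders = [{"oid": 10, "cid": 1}, {"oid": 11, "cid": 3}, {"oid": 12, "cid": 1}]
reduced, shipped = semijoin(orders, "cid", customers, "cid")
# `reduced` keeps orders 10 and 12; two join values were shipped.
```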
We propose a flexible and robust framework for distributed query processing based on mutant query plans (MQPs). An MQP is an XML representation of a query plan that can also include verbatim XML data, references to resource locations (URLs), or abstract resource names (URNs). Servers work using local, possibly incomplete knowledge, partially evaluate as much of the query plan as they can, incorporate the partial results into a new, mutated query plan and transfer it to some other server that can continue processing. We have implemented an initial version of this framework, and present preliminary performance results. (C) 2002 Elsevier Science B.V. All rights reserved.
We present the design of ObjectGlobe, a distributed and open query processor for Internet data sources. Today, data is published on the Internet via Web servers which have, if at all, very localized query processing capabilities. The goal of the ObjectGlobe project is to establish an open marketplace in which data and query processing capabilities can be distributed and used by any kind of Internet application. Furthermore, ObjectGlobe integrates cycle providers (i.e., machines) which carry out query processing operators. The overall picture is to make it possible to execute a query with (in principle) unrelated query operators, cycle providers, and data sources. Such an infrastructure can serve as enabling technology for scalable e-commerce applications, e.g., B2B and B2C marketplaces, to be able to integrate data and data processing operations of a large number of participants. One of the main challenges in the design of such an open system is to ensure privacy and security. We discuss the ObjectGlobe security requirements, show how basic components such as the optimizer and runtime system need to be extended, and present the results of performance experiments that assess the additional cost for secure distributed query processing. Another challenge is quality of service management so that users can constrain the costs and running times of their queries.
In some integration projects, complete integration of database instances may not be necessary. It may also be too costly or even impossible because of poor local data quality and insufficient instance-level knowledge. In this research, we study how multidatabases with global schemas should be represented and manipulated when the data instances from the local databases do not need to be fully integrated. We propose the tuple source (TS) relational model to represent multidatabases under such an integration requirement. This model extends the classical relational model by augmenting every relation with a source attribute to identify the local database that the tuples come from. The source attribute can also be used to specify the right context to interpret global data instances. To manipulate TS relations, we have developed a set of tuple source relational algebraic operations and an extended SQL query language known as TS-SQL. With the TS relational model, flexible multidatabase queries that involve instances from different local databases can be formulated easily. In this paper, we also report our distributed query processing and optimization strategies and their implementation. (C) 1999 Elsevier Science B.V. All rights reserved.
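The idea of augmenting every relation with a source attribute can be sketched in a few lines. This is only an illustration of tagging tuples with their origin; the function names `ts_relation` and `ts_select`, the attribute name `source`, and the dictionary row layout are assumptions, and the sketch does not reproduce the TS algebra or TS-SQL.

```python
def ts_relation(local_dbs):
    """Build a tuple-source relation from local relations without merging instances.

    local_dbs: dict mapping a local database name to its list of row dicts.
    Each output row carries an extra 'source' attribute naming its local database.
    """
    return [
        {**row, "source": src}
        for src, rows in local_dbs.items()
        for row in rows
    ]


def ts_select(ts_rows, predicate, sources=None):
    """Select rows satisfying `predicate`, optionally restricted to given sources."""
    return [
        row for row in ts_rows
        if predicate(row) and (sources is None or row["source"] in sources)
    ]
```

Because the two local databases may describe the same real-world entity differently, both tuples are kept and the `source` attribute tells the user which context each value should be read in.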