ISBN (digital): 9783540769255
ISBN (print): 9783540769231
With the explosive growth of the web, people often need to monitor fresh information about their areas of interest by browsing the same sites repeatedly. Even for a single local area, so many pages are created every day that could serve as valuable knowledge for daily decision support. To reduce this monitoring work and avoid missing critical information about a local area, we are developing a continuous Geographic Web Search System with Push-based web monitoring services. This paper describes the system architecture, which handles multiple user queries, each representing a user's geographic attention, over multiple data pages arriving from geographic web crawlers. If a newly arriving data page is relevant to a pre-registered query, the user who registered the query is informed automatically about the new information by our notification service. This paper focuses especially on the problem of how to match multiple data pages against multiple queries as quickly as possible. For this purpose, we propose a fast matching algorithm based on a spatial join and show a preliminary experimental result with synthetic data.
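The matching step can be illustrated with a minimal sketch. The abstract does not say which spatial join is used, so the uniform grid, the rectangle representation `(xmin, ymin, xmax, ymax)`, and the names `spatial_join`, `cells`, and `intersects` are all assumptions made for illustration, not the authors' algorithm.

```python
from collections import defaultdict

def intersects(a, b):
    """Axis-aligned rectangles (xmin, ymin, xmax, ymax); touching edges count."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells(rect, cell):
    """Yield every grid cell (gx, gy) that the rectangle overlaps."""
    x0, y0, x1, y1 = rect
    for gx in range(int(x0 // cell), int(x1 // cell) + 1):
        for gy in range(int(y0 // cell), int(y1 // cell) + 1):
            yield gx, gy

def spatial_join(pages, queries, cell=10.0):
    """Return the set of (page_id, query_id) pairs whose rectangles intersect.

    Hypothetical grid-partitioned join: queries are indexed by cell once,
    then each incoming page is tested only against queries sharing a cell.
    """
    grid = defaultdict(list)
    for qid, rect in queries.items():
        for c in cells(rect, cell):
            grid[c].append(qid)
    matches = set()  # a set removes duplicates found via several shared cells
    for pid, rect in pages.items():
        for c in cells(rect, cell):
            for qid in grid.get(c, ()):
                if intersects(rect, queries[qid]):
                    matches.add((pid, qid))
    return matches

pages = {"p1": (1, 1, 2, 2), "p2": (50, 50, 51, 51)}
queries = {"q1": (0, 0, 5, 5), "q2": (40, 40, 60, 60), "q3": (100, 100, 110, 110)}
print(sorted(spatial_join(pages, queries)))
```

Each matched pair would then be handed to the notification service; the cell size is a tuning parameter that trades index size against the number of candidate pairs tested.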
ISBN (print): 9783540735397
The one-shot shortest path query has been studied for decades. However, in applications on road networks, users are actually interested in the path with the minimum travel time (the fastest path), which varies over time. This motivates us to study the continuous evaluation of fastest path queries in order to capture the dynamics of road networks. Repeatedly evaluating a large number of fastest path queries at every moment is infeasible because of the high computational cost. We propose a novel approach that employs the concept of the affecting area and a tolerance parameter to avoid reevaluation while the travel time of the current answer is close enough to that of the fastest path. Furthermore, a grid-based index is designed to process multiple queries efficiently. Experiments on real datasets show a significant reduction in the total amount of reevaluation and therefore in the cost of reevaluating a query.
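The tolerance criterion can be sketched as follows. This only illustrates the acceptance test "the current answer is close enough to the fastest path"; it recomputes the fastest time with Dijkstra's algorithm, which is exactly the work the paper's affecting area is meant to avoid. The relative form `(1 + tol)` of the tolerance and the names `fastest_time`, `path_time`, and `within_tolerance` are assumptions, not taken from the paper.

```python
import heapq

def fastest_time(graph, src, dst):
    """Dijkstra over graph[u] = {v: travel_time}; returns inf if unreachable."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def path_time(graph, path):
    """Total travel time along a path given as a list of nodes."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def within_tolerance(graph, path, tol):
    """True if the current path is at most (1 + tol) times the fastest time.

    Assumed relative tolerance; the paper's exact definition may differ.
    """
    best = fastest_time(graph, path[0], path[-1])
    return path_time(graph, path) <= (1.0 + tol) * best

# Travel times after a hypothetical traffic update.
graph = {"a": {"b": 5, "c": 2}, "c": {"b": 2}, "b": {}}
current = ["a", "b"]
print(within_tolerance(graph, current, tol=0.3))
print(within_tolerance(graph, current, tol=0.1))
```

A larger tolerance keeps the reported path longer and triggers fewer reevaluations, at the price of a reported path that may be somewhat slower than the true fastest one.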
ISBN (print): 9783540735397
In the k-medoid problem, given a dataset P, we are asked to choose k points in P as the medoids. The optimal medoid set minimizes the average Euclidean distance between the points in P and their closest medoid. Finding the optimal k medoids is NP-hard, and existing algorithms aim at approximate answers, i.e., they compute medoids that achieve a small, yet not minimal, average distance. In this paper, we likewise aim at approximate solutions. We consider, however, the continuous version of the problem, where the points in P move and our task is to maintain the medoid set on-the-fly (trying to keep the average distance small). To the best of our knowledge, this work constitutes the first attempt at continuous medoid queries. First, we consider centralized monitoring, where the points issue location updates whenever they move. A server processes the stream of generated updates and constantly reports the current medoid set. Next, we address distributed monitoring, where we assume that the data points have some computational capabilities and take over part of the monitoring task. In particular, the server installs adaptive filters (i.e., permissible spatial ranges, called safe regions) on the points, which report their location only when they move outside their filters. The distributed techniques reduce the frequency of location updates (and, thus, the network overhead and the server load), at the cost of a slightly higher average distance compared to the centralized methods. Neither our centralized nor our distributed methods make any assumption about the data moving patterns (e.g., velocity vectors, trajectories, etc.), and both can be applied to an arbitrary number of medoids k. We demonstrate the efficiency and efficacy of our techniques through extensive experiments.
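The two quantities the abstract relies on, the average distance to the closest medoid and the safe-region test that decides whether a point must report, can be sketched as below. The abstract does not specify the shape of a safe region, so the circular region and the names `average_distance` and `must_report` are assumptions for illustration only.

```python
import math

def average_distance(points, medoids):
    """Average Euclidean distance from each point to its closest medoid."""
    total = 0.0
    for px, py in points:
        total += min(math.hypot(px - mx, py - my) for mx, my in medoids)
    return total / len(points)

def must_report(position, center, radius):
    """Distributed filter: report only after leaving the safe region.

    The safe region is assumed here to be a circle (center, radius).
    """
    dx = position[0] - center[0]
    dy = position[1] - center[1]
    return math.hypot(dx, dy) > radius

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
medoids = [(0, 0), (10, 0)]
print(average_distance(points, medoids))
print(must_report((3, 4), center=(0, 0), radius=5))
print(must_report((3, 4.5), center=(0, 0), radius=5))
```

Larger safe regions suppress more location updates but let the server's view of the data drift further from the truth, which is the update-frequency versus average-distance trade-off described above.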
The literature on skyline algorithms has so far dealt mainly with queries of static query points over static data sets. With the increasing number of mobile service applications and users, however, the need for continuous skyline query processing has become more pressing. A continuous skyline query involves not only static dimensions but also a dynamic one. In this paper, we examine the spatiotemporal coherence of the problem and propose a continuous skyline query processing strategy for moving query points. First, we distinguish the data points that are permanently in the skyline and use them to derive a search bound. Second, we investigate the connection between the spatial positions of data points and their dominance relationship, which provides an indication of where to find changes in the skyline and how to maintain the skyline continuously. Based on the analysis, we propose a kinetic-based data structure and an efficient skyline query processing algorithm. We concisely analyze the space and time costs of the proposed method and conduct an extensive experiment to evaluate the method. To the best of our knowledge, this is the first work on continuous skyline query processing.
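To make the static and dynamic dimensions concrete, here is a minimal brute-force sketch that recomputes the skyline from scratch for a given query position. It is not the kinetic data structure proposed in the paper; the choice of distance to the query point as the dynamic dimension, the single static attribute (a price), the "smaller is better" convention, and the names `dominates` and `skyline` are assumptions for illustration.

```python
import math

def dominates(a, b):
    """a dominates b if a is no worse in every dimension and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(data, query):
    """Naive O(n^2) skyline for one query position.

    data maps a name to ((x, y), static_attrs); each point is scored as
    (distance to query, *static_attrs), all dimensions minimized.
    """
    scored = {
        name: (math.hypot(pos[0] - query[0], pos[1] - query[1]), *attrs)
        for name, (pos, attrs) in data.items()
    }
    return {
        name
        for name, vec in scored.items()
        if not any(dominates(other, vec)
                   for o, other in scored.items() if o != name)
    }

# Hypothetical data: position plus one static attribute (price).
data = {
    "A": ((0, 0), (5,)),
    "B": ((1, 0), (3,)),
    "C": ((5, 0), (4,)),
}
print(sorted(skyline(data, query=(0, 0))))
print(sorted(skyline(data, query=(5, 0))))
```

Moving the query point changes only the distance dimension, yet the skyline membership changes with it; a continuous method aims to track such changes without recomputing everything at every position.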