Sequential pattern mining is an important method in data mining. Traditional mining algorithms are ill-suited to fast, unbounded, continuous, and dynamic data streams because they require multiple passes over the database. Several approximate sequential pattern mining algorithms have been proposed recently, but they consume too many system resources in the sequence comparison process. This paper proposes a sequence comparison method based on Levenshtein automata. The method builds a state transition model in a preprocessing step, after which the similarity of sequences can be computed in linear time. Because Levenshtein automata use too much memory, the method combines Levenshtein automaton computation with conventional edit distance computation, trading off time cost against space cost. The experimental results show that the method is effective and efficient.
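For readers unfamiliar with the "conventional edit distance computation" mentioned in this abstract, the following is a minimal sketch of the standard dynamic-programming (Wagner-Fischer) Levenshtein distance. It is not the automaton-based method of the paper, and the function names `edit_distance` and `within_distance` are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of items).

    Standard dynamic programming, keeping only two rows of the table.
    This is the textbook baseline, not the Levenshtein-automaton method.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence as the row to save memory
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(previous[j] + 1,          # drop x
                               current[j - 1] + 1,       # drop y
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def within_distance(a, b, k):
    """True if the two sequences are at most k edits apart (hypothetical helper)."""
    if abs(len(a) - len(b)) > k:
        return False  # the length gap alone already exceeds k
    return edit_distance(a, b) <= k
```

The baseline costs time proportional to the product of the two sequence lengths for every comparison, which is the cost that a precomputed automaton is meant to avoid.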
The skyline query is frequently used to find the set of non-dominated data points (called skyline points) in a multidimensional dataset. It is one of the most important query methods for databases, data streams, and P2P networks. However, it has not been implemented in sensor networks because of the limited energy of sensor nodes. This paper presents an energy-efficient approximate skyline query scheme for sensor networks. According to the experiments, this scheme can greatly extend the lifetime of sensor networks compared with the naive skyline query.
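As an illustration of what a skyline query returns, here is a minimal sketch of the exact, centralized computation by pairwise dominance checks. It assumes that smaller values are better in every dimension, and it is not the energy-efficient approximate scheme of the paper; the names `dominates` and `skyline` are illustrative.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly better in one.

    Assumption: smaller values are preferred in all dimensions.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that no other point dominates (quadratic-time baseline)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-dimensional readings, e.g. (price, distance).
readings = [(1, 9), (2, 4), (3, 3), (4, 4), (5, 1), (6, 6)]
print(skyline(readings))
```

In a sensor network every comparison between readings held on different nodes costs radio transmissions, which is why a naive evaluation of this kind drains node energy.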
In domain ontologies, there is usually no weight assigned to the link between two concepts. This has been considered one of the main obstacles to using ontologies. Semantic Association (SA) depicts the correlation between two concepts and can be measured as the weight of the link. In this paper, we define the Degree of Association (DOA) to measure the SA from a concept to a directly related concept in a domain ontology, and propose a Language-Model-Based Method (LMBM) to compute DOA. Our idea comes from the intuition that the semantic relationship between two concepts implies a certain semantic association between them. We use a probabilistic model to compute DOA and Maximum Likelihood Estimation to estimate its parameters. We tested the proposed method on two different domain ontologies and applied it in semantic query expansion experiments. The experimental results show the benefit of our approach and its promising effectiveness for semantic query expansion.
In contextual information retrieval, the retrieval of information depends on the time and place at which the query is submitted, the history of interaction, the task at hand, and many other factors that are not given explicitly but lie implicitly in the interaction and surroundings of the search, namely the context. The user's cognition is one of the important contextual factors for understanding his or her personal needs. We propose a model called DOSAM to obtain a user's individual cognitive structure of domain knowledge. DOSAM is developed from the spreading-activation model in psychology and is built on the domain ontology. A cost analysis of the algorithm shows that it is feasible to obtain the cognitive structure with DOSAM. Personalized search experiments on a digital library indicate that DOSAM can help improve search effectiveness and user satisfaction.
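The abstract does not describe DOSAM's internals, so the following is only a generic sketch of spreading activation over a weighted concept graph, the psychological model the abstract names as its starting point. The graph layout, the `decay` and `threshold` parameters, and the function name `spread_activation` are all assumptions made for illustration.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Generic spreading activation (not DOSAM itself).

    graph: {concept: {neighbour: link_weight}}
    seeds: {concept: initial_activation}
    Each step, every newly activated concept passes
    activation * link_weight * decay to its neighbours; amounts below
    `threshold` are dropped so the spread dies out.
    """
    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                passed = energy * weight * decay
                if passed < threshold:
                    continue
                next_frontier[neighbour] = next_frontier.get(neighbour, 0.0) + passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + energy
        frontier = next_frontier
    return activation


# Hypothetical fragment of a domain ontology with link weights.
ontology = {
    "database": {"query": 1.0, "index": 0.5},
    "query": {"XPath": 1.0},
}
print(spread_activation(ontology, {"database": 1.0}))
```

The resulting activation levels can be read as a rough profile of which concepts are salient for a user whose queries touched the seed concepts.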
In this paper, based on concept lattices and dual concept lattices, we introduce a pair of rough set approximation operators within formal contexts. The proposed approximation operators no longer require an equivalence relation. The properties of the proposed approximation operators are discussed in detail.
In this paper, we propose a new Dynamic Data-centric Storage (DDS) mechanism for wireless sensor networks. DDS is aware of the data distribution in the network and dynamically adjusts the mapping from sensor readings to storage points, both to reduce the cost of storing these readings and to balance storage and workload across the network. Moreover, it takes advantage of the GPSR routing protocol to store multiple copies of readings, improving the robustness of the network with little overhead. Simulation results show that the approach is more energy-efficient and robust than other data-centric schemes.
Load shedding has been widely used in data stream management systems (DSMSs) to keep them running steadily. One key problem in load shedding is determining how much system load to shed. Existing work tends to adopt a coarse algorithm (CA) to solve this problem. In this paper, we present an adaptive PI controller-based load shedding framework for data streams. The main contribution of this paper is the use of feedback control theory to design the load shedding scheme. In contrast to existing approaches, we first apply system identification to establish a dynamic model of the DSMS, which enables us to analyze the DSMS quantitatively. Then, based on the model, we use the Root Locus method to design a PI controller with proven performance guarantees. The adaptive framework has been implemented by modifying the Borealis system. Theoretical analysis and experimental results demonstrate that our approach is robust even when the system load changes frequently. Compared with existing strategies, our approach also achieves significantly better performance.
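To make the feedback idea concrete, the following is a minimal sketch of a discrete-time PI controller that maps a measured processing delay to a fraction of tuples to shed. It is not the controller from the paper: the class name `PIShedder`, the choice of delay as the controlled variable, and the gains are assumptions picked for illustration, not values obtained by system identification or Root Locus design.

```python
class PIShedder:
    """Toy discrete PI controller for load shedding (illustrative only)."""

    def __init__(self, target_delay, kp, ki):
        self.target_delay = target_delay  # desired processing delay
        self.kp = kp                      # proportional gain (assumed value)
        self.ki = ki                      # integral gain (assumed value)
        self.integral = 0.0               # accumulated error over time

    def update(self, measured_delay):
        """Return the fraction of incoming tuples to shed, clamped to [0, 1]."""
        error = measured_delay - self.target_delay
        self.integral += error
        raw = self.kp * error + self.ki * self.integral
        return min(1.0, max(0.0, raw))


controller = PIShedder(target_delay=1.0, kp=0.5, ki=0.25)
for delay in (1.0, 2.0, 0.0):
    print(delay, controller.update(delay))
```

The proportional term reacts to the current overload, while the integral term removes the steady-state error that a purely proportional rule would leave behind.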
An effective way to optimize XML queries is to minimize them. In this paper, we greatly improve redundancy elimination in XPath queries by incorporating two novel kinds of constraints, the parent constraint and the sibling constraint, and by extending the tractable fragment to include the descendant-or-self axis. The two novel kinds of constraints, together with the child constraint and the descendant constraint, form a family of constraints that complicates the problem but offers possibilities for further minimization. Two techniques, tree augmentation and simulation augmentation, are employed to cope with the constraints. We elaborate on the minimization algorithms and their running efficiency both in the absence and in the presence of various kinds of constraints.
We study the problem of answering queries given a set of mappings between peer ontologies. In addition to the schema mappings between peer ontologies, there are axioms that constrain classes and properties. We propose a set of rules to build graphs for the axioms. Because the axioms have different properties, the generated graphs are classified into four sets. The RDF/OWL query language of each peer can support regular expressions. For a query to be propagated transitively along semantic paths in peer knowledge management systems, conjunctive and disjunctive queries must be rewritten between peers. Because conjunctive queries are well understood, we focus on a novel algorithm that rewrites disjunctive queries along semantic paths based on the graphs. We treat the union of all atoms of a disjunctive query as a set and find the maximum rewritings over peers graphically. Finally, we conduct extensive simulation experiments. The simulation results show that our algorithm generates more rewritings than the naive rewriting algorithm at each distance.
In a domain ontology, semantic association (SA) is used to depict the correlation between two concepts. In this paper, we define the semantic association degree (SAD) for measuring SA in the domain ontology. We first present a method to measure the SAD of two directly related concepts by evaluating the semantic relationship between them, and then give another method to measure the SAD of two indirectly related concepts through the SAD of directly neighboring concepts. A set of comparison experiments shows the benefit of our approaches.