ISBN: 9780769533667 (print)
XML has become an important format for data exchange. The ranking of XML search results directly affects XML information retrieval performance. Most existing ranking models consider the statistical characteristics of words in an XML document, but not the position of the node a word belongs to; in effect, they treat all nodes in an XML document as equally important. However, different nodes play different roles in the document, so the same content appearing in different nodes should carry different weight. In other words, different nodes should have different node semantic weights. In this paper, we present a VSM-based method for XML node semantic weight (XNSW-VSM), in which a node's weight is scaled by the similarity between the node and the whole document. Experimental data were selected from Wiki data sets. The Pearson correlation coefficient between the semantic weights given by experts and the model results is 0.827. This shows that the node semantic weight model can analyze the importance of nodes in an XML document and can help improve ranking results.
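The idea of scaling a node's weight by its similarity to the whole document can be illustrated with a minimal sketch. It assumes raw term-frequency vectors, whitespace tokenization, and cosine similarity; the abstract does not specify the term weighting or similarity measure used by XNSW-VSM, so the function names and the toy document below are purely hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def node_weights(nodes):
    """Map each node name to its similarity with the whole document.

    `nodes` maps a (hypothetical) node name to the text it contains.
    The document vector is assumed to be the sum of all node vectors.
    """
    node_vecs = {name: Counter(text.lower().split()) for name, text in nodes.items()}
    doc_vec = Counter()
    for vec in node_vecs.values():
        doc_vec.update(vec)
    return {name: cosine(vec, doc_vec) for name, vec in node_vecs.items()}


# Toy document: three nodes with different amounts of on-topic text.
weights = node_weights({
    "title": "xml ranking",
    "body": "xml ranking uses node weight for xml search",
    "footer": "contact page",
})
```

In this toy example the node whose vocabulary overlaps most with the document as a whole receives the largest weight, which is the qualitative behavior the abstract describes.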
ISBN: 9780769533667 (print)
With growing numbers of suppliers and buyers and the introduction of new applications, simplified and centralized organization of data in e-commerce systems has become a focus for reducing operational and capital expenses. With the availability of faster server hardware, faster network connections, and high-performance databases, a common database hosting data for multiple applications has become technically feasible. However, this kind of deployment brings new challenges to database systems. One critical challenge is that the database should be optimized for real-time access to large volumes of data. In this report, a novel main memory database architecture based on solid state disks is proposed to meet this requirement. Experimental results show that solid state disks bring benefits in transaction throughput, average response time of write transactions, and database recovery time.
This article analyses two kinds of information filtering technologies used in e-commerce recommendation systems: content-based filtering and collaborative filtering. Building on collaborative filtering, we study the role of clustering techniques in improving recommendation quality, reducing computational complexity, and improving the system's real-time response speed.
With the continuing growth of XML data on the World Wide Web, XML document retrieval and the clustering of retrieval results face both challenges and opportunities. One of the challenges is how to improve the quality of XML retrieval results. First, based on the features of XML documents, a method for modeling XML retrieval result documents is proposed, which integrates both the structural semantic features and the content information of XML documents. Then, a method for computing the similarity between retrieval result documents, combining structural semantic similarity and keyword similarity, is suggested, and a strategy named Item Frequency in Cluster-Inverse Cluster Frequency for extracting labels from result clusters is presented. Experiments indicate that the clustering quality for XML retrieval results based on the hybrid similarity is clearly better than that based on content similarity alone.
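The abstract names the Item Frequency in Cluster-Inverse Cluster Frequency strategy but does not give its formula. The sketch below assumes a direct analogy with TF-IDF: an item scores highly for a cluster when it appears in many of that cluster's documents and in few other clusters. The scoring formula, function name, and sample data are assumptions for illustration, not the authors' definition.

```python
import math
from collections import Counter


def ifc_icf_labels(clusters, top_k=2):
    """Pick label items per cluster with an assumed TF-IDF-style score.

    `clusters` is a list of clusters; each cluster is a list of documents;
    each document is a list of items (for example, keywords).
    Assumed score: (share of cluster docs containing the item)
                   * log(number of clusters / clusters containing the item).
    """
    n_clusters = len(clusters)

    # Item frequency in cluster: how many documents of the cluster contain the item.
    per_cluster = []
    for docs in clusters:
        counts = Counter()
        for doc in docs:
            counts.update(set(doc))
        per_cluster.append(counts)

    # Cluster frequency: in how many clusters the item occurs at all.
    cluster_freq = Counter()
    for counts in per_cluster:
        cluster_freq.update(counts.keys())

    labels = []
    for docs, counts in zip(clusters, per_cluster):
        scores = {
            item: (c / len(docs)) * math.log(n_clusters / cluster_freq[item])
            for item, c in counts.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda item: (-scores[item], item))
        labels.append(ranked[:top_k])
    return labels


clusters = [
    [["xml", "query", "xpath"], ["xml", "xpath", "join"]],
    [["xml", "cluster", "label"], ["xml", "cluster"]],
]
labels = ifc_icf_labels(clusters, top_k=1)
```

Under this assumed scoring, an item that occurs in every cluster (here "xml") gets a score of zero and is never chosen as a label, while items concentrated in one cluster are preferred.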
Processing XML queries can require the evaluation of various structural relationships. Efficient algorithms for evaluating ancestor-descendant and parent-child relationships have been proposed. However, the problems of evaluating preceding-sibling/following-sibling and preceding/following relationships remain open. In this paper, we study the structural join and the staircase join for the sibling relationship. First, we introduce the idea of using the parent's structural information to filter out and minimize unnecessary reads of elements, which can be used to accelerate structural joins of parent-child and preceding-sibling/following-sibling relationships. Second, two efficient structural join algorithms for the sibling relationship are proposed. These algorithms lead to optimal join performance: nodes that do not participate in the join can be identified beforehand and then skipped using a B+-tree index. In addition, each element list being joined is scanned sequentially at most once, and the join results are output in document order. We also discuss the staircase join algorithm for sibling axes. Our study shows that the staircase join for sibling axes is close to the structural join for sibling axes and shares the same high efficiency. Our experimental results demonstrate the effectiveness of our optimization techniques for sibling axes and validate the efficiency of our algorithms. As far as we know, this is the first work specifically addressing this problem.
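To make the sibling relationship concrete, here is a naive baseline sketch of a following-sibling join. It is not the B+-tree-based algorithm of the paper: it simply groups candidates by parent in a hash map, so that elements under unrelated parents are never compared. The `Elem` labeling (a document-order position plus the parent's position) is a hypothetical encoding chosen for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Elem(NamedTuple):
    start: int   # position in document order (hypothetical label)
    parent: int  # start position of the parent element


def following_sibling_join(context, candidates):
    """Return (a, c) pairs where c is a following sibling of a.

    Both input lists are assumed to be sorted by `start` (document order),
    so the output is ordered by context element, then by candidate.
    """
    # Group candidates by parent; the parent's label acts as the filter.
    by_parent = defaultdict(list)
    for c in candidates:
        by_parent[c.parent].append(c)

    pairs = []
    for a in context:
        # Only candidates sharing a's parent are ever inspected.
        for c in by_parent.get(a.parent, ()):
            if c.start > a.start:
                pairs.append((a, c))
    return pairs


context = [Elem(2, 1), Elem(6, 1), Elem(11, 10)]
candidates = [Elem(4, 1), Elem(8, 1), Elem(12, 10), Elem(20, 19)]
pairs = following_sibling_join(context, candidates)
```

The candidate under parent 19 is never paired with anything, which mirrors, in a much simpler form, the idea of using the parent's structural information to avoid reading elements that cannot participate in the join.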
ISBN: 0769528740 (print)
An effective way to optimize XML queries is to minimize them. In this paper, we greatly improve redundancy elimination in XPath queries by incorporating two novel kinds of constraints, the parent constraint and the sibling constraint, and by extending the tractable fragment to include the descendant-or-self axis. The two novel kinds of constraints, together with the child constraint and the descendant constraint, form a family of constraints that complicates the problem but offers possibilities for further minimization. Two techniques, tree augmentation and simulation augmentation, are employed to cope with the constraints. We elaborate on the minimization algorithms and their running efficiency both in the absence and in the presence of various kinds of constraints.
In this paper, based on concept lattices and dual concept lattices, we introduce a pair of rough set approximation operators within formal contexts. The proposed approximation operators no longer require an equivalence relation. The properties of the proposed approximation operators are discussed in detail.
Ontology matching determines the correspondences between the concepts and relations of related ontologies. In this paper, we put forward an approach to matching ontology hierarchies based on lattice alignment. The proposed lattice-based matching algorithm can be used not only for matching two ontologies, but also for annotation between an ontology and its corresponding resources. Experiments on spatiotemporal ontology annotation have been carried out, and they show the applicability of the approach.
Both WordNet and Chinese Classified Thesaurus(CCT) are widely used in information retrieval and management systems. In this paper we propose a novel approach for building bilingual ontologies based on these existing k...
ISBN: 9783540734505 (print)
In this paper, we introduce two pairs of operators in fuzzy formal contexts. Based on the proposed operators, we present two types of generalized variable precision formal concepts: property-oriented crisp-fuzzy concepts and object-oriented fuzzy-crisp concepts. Different precision levels yield generalized formal concepts at different levels. Finally, we discuss in detail the relationship between the generalized concept lattices at different precision levels.