ISBN: 3540311424 (print)
As a developing P2P system, Gnutella has upgraded its protocol to version 0.6, which significantly changed the characteristics of its hosts. However, few previous studies have examined the new version of Gnutella on a wide scale. In addition, various P2P models are used to evaluate P2P systems or mechanisms, but the reliability of some hypotheses used in these models has not been carefully studied or proved. In this paper, we try to remedy this situation by performing a large-scale measurement study of Gnutella with the help of some new crawling approaches. In particular, we characterize Gnutella by its queries, shared files, and peer roles. Our measurements show that the assumption that query arrivals follow a Poisson distribution may not hold in Gnutella, and that most peers tend to share files of very limited types, even when MP3 files are excluded. We also find that many ultrapeers in Gnutella are not well selected. The statistical data provided in this paper can also be useful for P2P modeling and simulation.
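One simple way to probe the Poisson assumption mentioned in this abstract is the index of dispersion: for a Poisson process, the number of arrivals per fixed-length window has a variance equal to its mean, so the ratio is close to 1. The sketch below is illustrative only. The function name, the window length, and the sample timestamps are assumptions, not details from the paper, which may have used a different test.

```python
from statistics import mean, pvariance


def dispersion_index(arrival_times, window):
    """Variance-to-mean ratio of arrival counts per fixed-length window.

    Values near 1 are consistent with Poisson arrivals; values well above 1
    suggest bursty traffic, and values below 1 suggest overly regular traffic.
    """
    if not arrival_times:
        raise ValueError("need at least one arrival")
    start = min(arrival_times)
    n_windows = int((max(arrival_times) - start) // window) + 1
    counts = [0] * n_windows
    for t in arrival_times:
        counts[int((t - start) // window)] += 1
    return pvariance(counts) / mean(counts)


# Hypothetical timestamps in seconds: a perfectly regular stream and a bursty one.
regular = [float(i) for i in range(100)]
bursty = [0.0] * 50 + [99.0] * 50
```

With 10-second windows, the regular stream has no variation between windows, while the bursty stream concentrates all arrivals in the first and last window and scores far above 1.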
ISBN: 0769526799 (print)
The idea of building query-oriented routing indices has fundamentally changed how routing efficiency is improved, because such indices can learn the content distribution during the query routing process. The approach gradually improves routing efficiency without excessive network overhead for constructing and maintaining the routing index. However, the previously proposed mechanism is not effective in practice because routing efficiency improves slowly. In this paper, we propose a novel mechanism for query-oriented routing indices that quickly achieves high routing efficiency at low cost. The maintenance method employs reinforcement learning to exploit the behavior of large numbers of peers to construct and maintain routing indices. It explicitly uses the expected number of returned contents to describe the content distribution, which helps approximate the real distribution quickly. Meanwhile, the routing method aims to retrieve as many contents as possible, which further speeds up the learning process. The experimental evaluation shows that the mechanism has high routing efficiency, quick learning ability, and satisfactory performance under churn.
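The general idea of a routing index that tracks the expected number of returned contents can be sketched as a running estimate per keyword and neighbor, nudged toward each observed result count. This is a generic incremental-average update, not the mechanism proposed in the paper; the class name, the learning rate, and the fanout parameter are all hypothetical.

```python
from collections import defaultdict


class RoutingIndex:
    """Per-neighbor estimate of how many contents a keyword query returns."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        # expected[keyword][neighbor] -> estimated number of returned contents
        self.expected = defaultdict(lambda: defaultdict(float))

    def update(self, keyword, neighbor, returned):
        """Move the estimate toward the result count observed for this query."""
        old = self.expected[keyword][neighbor]
        self.expected[keyword][neighbor] = old + self.learning_rate * (returned - old)

    def route(self, keyword, neighbors, fanout=2):
        """Pick the neighbors with the highest expected result count."""
        scores = self.expected[keyword]
        ranked = sorted(neighbors, key=lambda n: scores.get(n, 0.0), reverse=True)
        return ranked[:fanout]


# Hypothetical feedback: neighbor B answered a "jazz" query twice, C once.
index = RoutingIndex(learning_rate=0.5)
index.update("jazz", "B", 4)
index.update("jazz", "B", 4)
index.update("jazz", "C", 1)
chosen = index.route("jazz", ["A", "B", "C"], fanout=2)
```

Neighbors without feedback default to an estimate of zero here, so a real system would also need some exploration policy; that choice is left open in this sketch.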
ISBN: 3540352252 (print)
Much work has been done on introducing word senses into retrieval to deal with word sense ambiguity, but most of it has achieved negative results. In this paper, we first implement a WSD system for nouns and verbs, and then propose the language sense model (LSM) for information retrieval. The LSM combines the terms and senses of a document seamlessly through an EM algorithm. Retrieval experiments on TREC collections show that the LSM significantly outperforms both the vector space model (BM25) and the traditional language model for both medium and long queries (7.53%-16.90%). Based on the experiments, we can also conclude empirically that fine-grained senses improve retrieval performance when they are properly used.
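The abstract does not say what the EM algorithm estimates. As a generic illustration of how EM can blend two probability models, the sketch below fits a single mixing weight between a term-based and a sense-based probability. The two-component form, the function name, and the sample probabilities are assumptions and should not be read as the LSM itself.

```python
def em_mixture_weight(term_probs, sense_probs, iterations=50, weight=0.5):
    """Estimate w in p(x) = w * p_term(x) + (1 - w) * p_sense(x) by EM."""
    if len(term_probs) != len(sense_probs) or not term_probs:
        raise ValueError("need two equal-length, non-empty probability lists")
    for _ in range(iterations):
        # E-step: posterior probability that each observation came from the term model.
        responsibilities = []
        for p_term, p_sense in zip(term_probs, sense_probs):
            total = weight * p_term + (1.0 - weight) * p_sense
            responsibilities.append(weight * p_term / total if total > 0 else 0.5)
        # M-step: the new weight is the average responsibility.
        weight = sum(responsibilities) / len(responsibilities)
    return weight


# Hypothetical per-word probabilities under each model.
balanced = em_mixture_weight([0.2, 0.3], [0.2, 0.3])
term_heavy = em_mixture_weight([0.9, 0.8], [0.1, 0.1])
```

When both models assign the same probabilities, the weight stays at its starting value; when the term model explains the observations much better, the weight moves toward 1.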
Since we can hardly get semantics from the low-level features of the image, it is much more difficult to analyze the image than textual information on the Web. Traditionally, textual information around the image is us...
Defining and using ontology to annotate web resources with semantic markups is generally perceived as the primary way to implement the vision of the Semantic Web. The ontology provides a shared and machine understanda...
Many previous studies that mine user queries in Peer-to-Peer systems focused on the distribution of query contents. However, little has been done towards a better understanding of the time series distr...
ISBN: 3540383298 (print)
With the increasing use of ontologies in the Semantic Web and enterprise knowledge management, it is critical to develop scalable and efficient ontology management systems. In this paper, we present Minerva, a storage and inference system for large-scale OWL ontologies on top of relational databases. It aims to meet the scalability requirements of real applications and to provide practical reasoning capability as well as high query performance. The method combines Description Logic reasoners for TBox inference with logic rules for ABox inference. Furthermore, it customizes the database schema based on inference requirements. User queries are answered by directly retrieving materialized results from the back-end database. The effective integration of ontology inference and storage is expected to improve reasoning efficiency, while querying without runtime inference guarantees satisfactory response time. Extensive experiments on the University Ontology Benchmark show the high efficiency and scalability of the Minerva system.
ISBN: 3540311424 (print)
In the Semantic Web context, formal representations of knowledge are scarce, while informal ones with uncertainty prevail. In order to provide an uncertainty reasoning service for Semantic Web applications, we propose a probabilistic extension of Description Logic, namely the Probabilistic Description Logic Program (PDLP). In this paper, we introduce the syntax and intensional semantics of PDLP, and present a fast reasoning algorithm that makes use of Logic Programming techniques. This extension is expressive, lightweight, and intuitive. Based on this extension, we implement a PDLP reasoner and put it to practical use in the Tourism Ontology Uncertainty Reasoning system (TOUR). The TOUR system uses the PDLP reasoner to make favorite travel plans on top of an integrated tourism ontology, which describes travel sites and services together with their evaluations.
ISBN: 3540292276 (print)
In order to lay a solid foundation for the emerging Semantic Web, effective and efficient management of large RDF(S) data is in high demand. In this paper, we propose an approach to the storage, query, manipulation, and inference of large RDF(S) data on top of relational databases. Specifically, RDF(S) inference is done on the database in advance instead of on the fly, so that query efficiency is maximized. To reduce the cost of inference, two inference modes, the batch mode and the incremental mode, are provided for different scenarios. In both modes, optimized strategies are applied for efficiency. In order to support efficient query and inference on the database, the storage schema is also specially designed. In addition, a powerful RDF(S) query and manipulation language, RQML, is provided for easy and uniform data access in a declarative way. Finally, we evaluate and report the performance of our approach on both query and inference. Experiments show that our approach achieves encouraging performance on million-scale real data.
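Inference done in advance can be pictured as forward chaining: derived triples are stored next to the asserted ones, so a query only has to look them up. The in-memory sketch below applies just two standard RDFS rules (rdfs9 and rdfs11) and handles insertions only. It is not the paper's storage schema or its optimized strategies; the function names and sample triples are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def consequences(store, delta):
    """Conclusions of rdfs9 and rdfs11 that use at least one triple from delta."""
    out = set()
    for new in delta:
        for old in store:
            # Try the pair in both orders so the new triple can fill either premise.
            for (a, p1, b), (c, p2, d) in ((new, old), (old, new)):
                if p2 == SUBCLASS and b == c:
                    if p1 == SUBCLASS:
                        out.add((a, SUBCLASS, d))  # rdfs11: subclass transitivity
                    elif p1 == TYPE:
                        out.add((a, TYPE, d))      # rdfs9: instance inherits superclass
    return out - store


def insert(store, triples):
    """Incremental mode: add triples to a closed store, deriving only from what is new."""
    delta = set(triples) - store
    while delta:
        store |= delta
        delta = consequences(store, delta)
    return store


def materialize(triples):
    """Batch mode: compute the closure of a whole triple set from scratch."""
    return insert(set(), triples)


# Hypothetical data: a small class hierarchy and one instance.
closed = materialize({
    ("Student", SUBCLASS, "Person"),
    ("Person", SUBCLASS, "Agent"),
    ("alice", TYPE, "Student"),
})
size_after_batch = len(closed)
insert(closed, {("bob", TYPE, "Student")})
```

After the batch step, a lookup such as `("alice", TYPE, "Agent") in closed` succeeds without any reasoning at query time. The later insert only pairs the new triple with existing ones instead of recomputing the whole closure.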