More and more people publish comments about products on the Web. Mining such unstructured data (product reviews) is an exciting and challenging topic for both research and applications. In this paper, we focus on mining product reviews written in Chinese. We aim to extract structural information from Chinese product reviews, by which we mean the product features and the corresponding opinion words expressed in each review text. Some work has already been done on reviews written in English, but less on reviews written in Chinese. We propose an effective method for extracting candidate features, together with pruning rules for filtering them. We also introduce a pattern extraction and matching step to improve the results. Experimental results show that our approach is effective and achieves good recall and precision.
The edit distance between two given strings X and Y is the minimum number of edit operations that transform X into Y. Ordinarily, string editing is based on character insert, delete, and substitute operations. It has been suggested that extending this model with block edits would be useful in applications such as DNA sequence comparison and sentence similarity computation. However, existing algorithms have generally focused on the normalized edit distance, and few of them consider block swap operations at a higher level. In this paper, we introduce an extended edit distance algorithm that permits insertions, deletions, and substitutions at the character level and also permits block swap operations. Experimental results on randomly generated strings verify the algorithm's validity and efficiency. The main contributions of this paper are an algorithm that computes the lowest edit cost for string transformation with block swaps in polynomial time, and a breaking-point selection algorithm that improves computation speed.
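As a point of reference, the character-level model that the abstract takes as its starting point can be sketched as below. This is the classic dynamic-programming edit distance with unit costs, not the paper's block-swap algorithm or its breaking-point selection; the function name `edit_distance` is chosen here for illustration only.

```python
def edit_distance(x: str, y: str) -> int:
    """Classic edit distance with unit-cost insert, delete, and substitute.

    Baseline only: block swap operations are NOT modeled here.
    """
    # prev[j] holds the distance between the processed prefix of x and y[:j].
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]  # distance from x[:i] to the empty prefix of y
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            curr.append(min(
                prev[j] + 1,         # delete cx from x
                curr[j - 1] + 1,     # insert cy into x
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(edit_distance("abcdef", "defabc"))
```

Under this baseline, swapping the two halves of a string such as "abcdef" is charged one character operation at a time, which is the kind of case a block swap operation is meant to handle more cheaply.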
The integer sign vulnerability is a comparatively new and subtle type of vulnerability that can compromise system security. In particular, if a sign vulnerability occurs in an operating system kernel, it may result in very serious invalid read/write operations on kernel memory. Unfortunately, little attention has been paid to detecting such vulnerabilities statically and automatically. This paper presents a novel approach to detecting sign vulnerabilities in the Linux kernel using the type qualifier technique. We introduce three pairs of type qualifiers and corresponding lattices to identify key kernel data and the relationships between them. Based on an extended type inference tool, we are able to effectively detect known and unknown sign vulnerabilities in carefully preprocessed Linux kernel files. Our experience demonstrates that the type qualifier technique can be applied to detect sign vulnerabilities effectively.
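To make the class of bug concrete, the following sketch simulates a sign error using 32-bit integers from Python's `ctypes` module. It only illustrates the underlying signed/unsigned mismatch; it is not the paper's type-qualifier analysis, and `MAX_LEN` and the function names are hypothetical.

```python
import ctypes

MAX_LEN = 64  # hypothetical size of a fixed buffer


def naive_check(requested_len: int) -> bool:
    """Upper-bound check done on a signed 32-bit value.

    A negative length passes, because -1 <= MAX_LEN.
    """
    signed_len = ctypes.c_int32(requested_len).value
    return signed_len <= MAX_LEN


def as_unsigned_size(requested_len: int) -> int:
    """The same bit pattern reinterpreted as an unsigned 32-bit size,
    as happens when the value is later passed to a copy routine."""
    return ctypes.c_uint32(requested_len).value


def safe_check(requested_len: int) -> bool:
    """Check both bounds so that negative lengths are rejected."""
    signed_len = ctypes.c_int32(requested_len).value
    return 0 <= signed_len <= MAX_LEN


if __name__ == "__main__":
    length = -1
    print(naive_check(length), as_unsigned_size(length), safe_check(length))
```

A length of -1 passes the naive check but denotes a very large copy size once it is treated as unsigned, which is how a sign error can turn into an out-of-bounds memory access.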
Monitoring data streams is an efficient way of capturing the characteristics of a data stream. However, the resources available for each data stream are limited, so how to use limited resources to process an unbounded data stream remains an open and challenging problem. In this paper, we adopt wavelet and sliding window methods to design a multi-resolution summarization data structure, the Multi-Resolution Summarization Tree (MRST), which can be updated incrementally as data arrives and supports point queries, range queries, and multi-point queries while maintaining query precision. We use both synthetic and real-world data to evaluate our algorithm. The experimental results indicate that MRST exceeds existing algorithms in query efficiency and adaptability, and that it is also simpler to implement.
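The abstract does not describe the internals of MRST, so the sketch below only illustrates the general idea of a wavelet-based multi-resolution summary of one window. It assumes an unnormalized Haar transform and a window whose length is a power of two; all function names are hypothetical.

```python
def haar_decompose(window):
    """Return [overall average, coarsest detail, ..., finest details]."""
    values = list(window)
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("window length must be a power of two")
    details = []
    while len(values) > 1:
        pairs = list(zip(values[0::2], values[1::2]))
        averages = [(a + b) / 2 for a, b in pairs]
        level_details = [(a - b) / 2 for a, b in pairs]
        details = level_details + details  # coarser levels go in front
        values = averages
    return values + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose."""
    values = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        # Each (average, detail) pair expands back into two values.
        values = [v for avg, d in zip(values, level) for v in (avg + d, avg - d)]
        pos += len(level)
    return values


def summarize(coeffs, keep):
    """Keep the first `keep` (coarsest) coefficients and zero the rest."""
    return list(coeffs[:keep]) + [0.0] * (len(coeffs) - keep)


if __name__ == "__main__":
    coeffs = haar_decompose([2, 4, 6, 8])
    print(coeffs)
    print(haar_reconstruct(summarize(coeffs, 2)))
```

Keeping only the coarsest coefficients gives a smaller summary from which point and range queries can be answered approximately; keeping more coefficients trades space for precision.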
WSMO (Web Service Modeling Ontology) has received great attention from the academic and business communities because of its potential to provide a dynamic and scalable infrastructure for web services. Ther...
Recently there has been growing interest in the applications of wireless sensor networks. Innovative techniques that improve energy efficiency to prolong network lifetime are highly desirable. Clustering is an ef...
Recent research has focused on density queries for moving objects in highly dynamic scenarios. An area is dense if the number of moving objects it contains is above some threshold. Monitoring dense areas has applicati...
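The density definition above can be illustrated with a minimal sketch. It assumes that areas are the cells of a uniform grid, which the text does not state; `dense_cells`, `cell_size`, and `threshold` are hypothetical names.

```python
from collections import Counter


def dense_cells(positions, cell_size, threshold):
    """Return the grid cells that hold more than `threshold` objects.

    positions: iterable of (x, y) coordinates of moving objects at one instant.
    Assumption: areas are square grid cells with side length `cell_size`.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in positions
    )
    return {cell for cell, n in counts.items() if n > threshold}


if __name__ == "__main__":
    snapshot = [(0.5, 0.5), (0.7, 0.2), (0.9, 0.9), (3.1, 3.2)]
    print(dense_cells(snapshot, cell_size=1.0, threshold=2))
```

In a monitoring setting this count would have to be maintained as objects move; the sketch recomputes it from a single snapshot for simplicity.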
ISBN: (print) 9783540717027
Advances in wireless networks and positioning technologies (e.g., GPS) have enabled new data management applications that monitor moving objects. In such applications, real-time data analysis such as clustering analysis is becoming one of the most important requirements. In this paper, we present the problem of clustering moving objects in spatial networks and propose a unified framework to address it. Because the positions of moving objects change continuously, the clustering results change dynamically. By exploiting the unique features of road networks, our framework first introduces the notion of a cluster block (CB) as the underlying clustering unit. We then divide the clustering process into the continuous maintenance of CBs and the periodic construction of clusters, under different criteria, from the CBs. We propose algorithms for efficiently maintaining and organizing the CBs to construct clusters. Extensive experimental results show that our clustering framework achieves high efficiency for clustering moving objects in real road networks.
Both WordNet and the Chinese Classified Thesaurus (CCT) are widely used in information retrieval and management systems. In this paper we propose a novel approach for building bilingual ontologies based on these existing k...
Ontology matching determines the correspondences between concepts and relations of related ontologies. In this paper, we put forward an approach to matching ontology hierarchies based on lattice alignment. The proposed lattice-based matching algorithm can be used not only in matching processes between two ontologies, but also in annotation processes between an ontology and its corresponding resources. Experiments on spatiotemporal ontology annotation have been carried out and show the applicability of the approach.