ISBN: 9781479902590 (print)
This paper presents a new text classification method, the Sparse Topic Model, which represents documents by sparse codes over topics. Topics carry more semantic information than individual words, so they are more effective features for representing documents. Topics are extracted from the documents by LDA in an unsupervised way, and sparse coding is then applied to these topics to obtain a higher-level representation. We compare the Sparse Topic Model with traditional methods such as SVM, and the experimental results show that the proposed method achieves better performance, especially when the number of training examples is limited. The effect of the number of topics and the number of words per topic on performance is also investigated. Because the Sparse Topic Model is unsupervised, it is well suited to real applications.
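The abstract does not give the exact formulation, so the following is only a minimal sketch of one reading of "sparse coding of topics": a document's word-frequency vector is approximated by a sparse combination of topic-word distributions. The topic matrix here is hand-written and hypothetical (in the paper it would come from LDA), and the solver (ISTA, iterative soft-thresholding), the `penalty` value, and all function names are assumptions made for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero; values within the threshold become exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(doc, topics, penalty=0.05, steps=200):
    """Approximately minimise
    0.5 * ||doc - sum_k code[k] * topics[k]||^2 + penalty * ||code||_1
    with ISTA (an assumed solver, not taken from the paper)."""
    n_topics, n_words = len(topics), len(doc)
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz constant.
    lipschitz = sum(w * w for topic in topics for w in topic)
    step = 1.0 / lipschitz
    code = [0.0] * n_topics
    for _ in range(steps):
        recon = [sum(code[k] * topics[k][j] for k in range(n_topics))
                 for j in range(n_words)]
        residual = [recon[j] - doc[j] for j in range(n_words)]
        grad = [sum(topics[k][j] * residual[j] for j in range(n_words))
                for k in range(n_topics)]
        code = [soft_threshold(code[k] - step * grad[k], step * penalty)
                for k in range(n_topics)]
    return code


# Hypothetical topic-word distributions over a 4-word vocabulary.
topics = [
    [0.5, 0.5, 0.0, 0.0],  # topic 0 uses words 0 and 1
    [0.0, 0.0, 0.5, 0.5],  # topic 1 uses words 2 and 3
]
doc = [0.6, 0.4, 0.0, 0.0]  # normalised word frequencies of one document
code = sparse_code(doc, topics)
```

In this toy case the document only uses words from topic 0, so the coefficient for topic 1 stays exactly zero. The resulting `code` vector is what a downstream classifier would take as the document's features.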
Wireless Sensor Networks (WSNs) can be viewed as a new type of distributed database. Data management technology is one of the core technologies of WSNs. In this demo we show a query processing system based on TinyOS ...
Recently there has been a lot of interest in graph-based analysis, with applications including social network analysis, recommendation systems, and document classification and clustering. A graph is an abstraction that naturally captures data objects as well as the relationships among them: objects are represented as nodes and relationships as edges. In many cases, similarities among nodes must be computed. SimRank is a simple and intuitive algorithm for this purpose, and it is rigorously grounded in random walk theory. Existing methods for SimRank computation suffer from one limitation: the computing cost can be very high in practice. A few techniques have been proposed to optimize the computation of SimRank, but the performance of these methods is still limited by the processing ability of a single computer. Ideally, we would like to develop new parallel solutions that offer improved processing power for computing SimRank on large data sets. In this paper, we propose parallel algorithms for SimRank computation on the MapReduce framework, and more specifically on its open source implementation, Hadoop. Two different parallel methods are proposed, and their performance is evaluated and compared. Furthermore, we employ the proposed methods for similarity computation in order to recommend appropriate products to users in social recommender systems.
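For orientation, the basic SimRank recurrence can be sketched on a single machine as follows. This is the naive iterative form, not either of the paper's two parallel methods, which the abstract does not describe. The decay factor of 0.8, the iteration count, and the toy user/item graph are assumptions chosen for illustration.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank on a directed graph.

    in_neighbors maps each node to the list of nodes that point to it.
    Returns a dict mapping (a, b) pairs to similarity scores.
    """
    nodes = list(in_neighbors)
    # Every node is fully similar to itself; all other pairs start at 0.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # Nodes without in-neighbors have similarity 0 to other nodes.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, scaled by decay.
            total = sum(sim[(i, j)] for i in in_a for j in in_b)
            new_sim[(a, b)] = decay * total / (len(in_a) * len(in_b))
        sim = new_sim
    return sim


# Hypothetical graph: two users who both point to (e.g. bought) two items.
graph = {
    "user1": [],
    "user2": [],
    "itemA": ["user1", "user2"],
    "itemB": ["user1", "user2"],
}
scores = simrank(graph)
```

Each iteration touches every node pair and every pair of their in-neighbors, which is the cost that motivates distributing the computation.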
Very recently, the study of social networks has received a great deal of attention, since it lets us learn and understand many hidden properties of our society. This paper investigates the potential of social network analysis to se...
Sensor fusion is the combining of sensory data from disparate sources such that the resulting information is in some sense better than would be possible when these sources were used individually. The natural uncertain...
A greenhouse gas remote sensing monitoring system is an implementation of applied greenhouse gas remote sensing technologies. This paper discusses the business application mode, operation scheme and application technol...
Recent years have witnessed the explosive growth of online social networks (OSNs), which provide a perfect platform for observing information propagation. Based on the theory of complex network analysis, consideri...
Schema summarization on large-scale databases is a challenge. In a typical large database schema, a great proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, because a schema can be very large, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social networks. Our approach has three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely, and that the generated abstract schema layers are very helpful for users exploring the database.
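The first and third steps can be sketched roughly as below. The abstract does not name the community detection algorithm, so this sketch substitutes Girvan-Newman style removal of high-betweenness edges. Choosing the highest-degree table as a group's representative is likewise an assumption, and the table names and function names are hypothetical. The second step (clustering groups into abstract domains) is omitted.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {}
    for source in adj:
        dist, paths, order = {source: 0}, {source: 1}, []
        parents = {v: [] for v in adj}
        queue = deque([source])
        while queue:  # BFS that also counts shortest paths
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], paths[w] = dist[v] + 1, 0
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    parents[w].append(v)
        credit = {v: 0.0 for v in order}
        for w in reversed(order):  # push path credit back toward the source
            for v in parents[w]:
                share = paths[v] / paths[w] * (1.0 + credit[w])
                edge = tuple(sorted((v, w)))
                score[edge] = score.get(edge, 0.0) + share
                credit[v] += share
    return score


def components(adj):
    """Connected components, listed in order of their smallest table name."""
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in group:
                group.add(v)
                stack.extend(adj[v] - group)
        seen |= group
        groups.append(group)
    return groups


def subject_groups(schema, k):
    """Remove the highest-betweenness edge until the schema splits into k groups."""
    adj = {table: set(links) for table, links in schema.items()}  # working copy
    while len(components(adj)) < k:
        scores = edge_betweenness(adj)
        if not scores:  # no edges left to remove
            break
        a, b = max(sorted(scores), key=scores.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Hypothetical schema graph: tables are nodes, foreign-key links are edges.
schema = {
    "film": {"actor", "director"},
    "actor": {"film", "director"},
    "director": {"film", "actor", "city"},
    "city": {"director", "country", "region"},
    "country": {"city", "region"},
    "region": {"city", "country"},
}
groups = subject_groups(schema, k=2)
# Assumed labelling rule: the table with the most links in the original schema.
labels = [max(sorted(g), key=lambda t: len(schema[t])) for g in groups]
```

On this toy schema the single link between `director` and `city` carries all cross-group shortest paths, so it is removed first and the two triangles become the subject groups.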
Using the correlation of GHZ triplet states, a broadcasting multiple blind signature scheme is proposed. Unlike classical multiple signature schemes and current quantum signature schemes, which can deliver either multiple signature or unconditional security but not both, our scheme guarantees both by adopting quantum key preparation, a quantum encryption algorithm and quantum entanglement. The proposed scheme has the properties of multiple signature, blindness, non-disavowal, non-forgery and traceability. To the best of our knowledge, we are the first to propose a broadcasting multiple blind signature scheme in quantum cryptography.
It is important to extract aspects from shoppers' comments about particular products. Product aspect descriptions often contain words with the same meaning, and discriminating these synonyms effectively can improve t...