Concept-cognitive learning (CCL) has been a hot topic in recent years, attracting much attention from the communities of formal concept analysis, granular computing and cognitive computing. However, the relationsh...
Information entropy and its extensions, which are important generalizations of entropy, have been applied in many research domains. In this paper, a novel generalized relative entropy is constructed to avoid some ...
With the rapid development of Internet technology, crowdsourcing, a flexible, effective and low-cost problem-solving method, has received growing attention, and using crowdsourcing to evaluate the quality of linked data has become a research hotspot. This paper proposes the Domain Specialization Test (DST), a set of domain-specific testing tasks used to evaluate the professionalism of workers. It also combines the idea of Mini-batch Gradient Descent (MBGD) with the EM algorithm, yielding the MBEM algorithm for efficient and accurate evaluation of task results. Experimental results show that the proposed method can screen out appropriate workers for linked data crowdsourcing tasks and improve both the accuracy and the iteration efficiency of the results.
With the extensive application of knowledge bases (KBs), how to complete them is a hot topic in the Semantic Web. Big data brings many problems, however, and event matching is one of them: finding the entities that refer to the same thing in the real world, which is also the key step in extending a KB. To enrich the emergency knowledge base (E-SKB) we constructed previously, we need to filter news from several web pages and identify reports of the same news to avoid data redundancy. In this paper, we propose a hierarchical blocking method that reduces the number of comparisons and narrows the search scope by extracting news properties as blocking keys. The method transforms the event matching problem into a clustering problem. Experimental results show that the proposed method outperforms the existing text clustering algorithm, with high precision and fewer comparisons.
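The effect of blocking keys on the number of comparisons can be illustrated with a minimal sketch. It is not the method from the paper: the record fields (`id`, `date`, `place`), the sample values, and the function name `candidate_pairs` are all hypothetical, and the abstract does not say which news properties serve as keys. The sketch only shows the general idea that each additional key splits the blocks further, so fewer pairs remain to be compared.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(records, keys):
    """Return the id pairs that share the same value for every blocking key.

    Applying the keys in order is equivalent to splitting each block
    again by the next key, so longer key lists give smaller blocks.
    """
    blocks = defaultdict(list)
    for record in records:
        block_key = tuple(record[key] for key in keys)
        blocks[block_key].append(record["id"])

    pairs = []
    for ids in blocks.values():
        # Only records inside the same block are compared with each other.
        pairs.extend(combinations(ids, 2))
    return pairs


# Hypothetical news records; the field names and values are placeholders.
news = [
    {"id": 1, "date": "2020-01-01", "place": "city_a"},
    {"id": 2, "date": "2020-01-01", "place": "city_a"},
    {"id": 3, "date": "2020-01-01", "place": "city_b"},
    {"id": 4, "date": "2020-02-01", "place": "city_a"},
]

for keys in ([], ["date"], ["date", "place"]):
    print(keys, len(candidate_pairs(news, keys)))
```

With no keys every pair is a candidate; each added key shrinks the candidate set, which is where the reduction in comparisons comes from.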
NBSVM is one of the most popular methods for text classification and has been widely used as a baseline for various text representation approaches. It uses Naive Bayes (NB) feature to weight sparse bag-of-n-grams repre...
On September 5, 2015, the State Council, China’s cabinet, formally announced its Action Framework for Promoting Big Data (***, 2015). This is a milestone for China to catch up with the global wave o...
The privacy-preserving data publication problem has attracted increasing attention in recent years. Much related research has addressed datasets with a single sensitive attribute. However, usually, ori...
Link-based similarity measures play a significant role in many graph-based applications. Consequently, measuring node similarity in a graph is a fundamental problem of graph data mining. Personalized PageRank (PPR) and SimRank (SR) have emerged as the most popular and influential link-based similarity measures. Recently, a novel link-based similarity measure, penetrating rank (P-Rank), which enriches SR, was proposed. In practice, PPR, SR and P-Rank scores are calculated by iterative methods, and the overhead of the calculation grows with the number of iterations. Ideally, the similarity would be computed in the minimum number of iterations sufficient to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. Therefore, we focus on designing accurate and tight upper bounds for PPR, SR, and P-Rank in this paper. Our upper bounds are designed based on the following intuition: the smaller the difference between two consecutive iteration steps is, the smaller the difference between the theoretical and iterative similarity scores becomes. Furthermore, we demonstrate the effectiveness of our upper bounds in the scenario of top-k similar-node queries, where they help accelerate the query. We also run a comprehensive set of experiments on real-world data sets to verify the effectiveness and efficiency of our upper bounds.
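The intuition about consecutive iteration steps can be sketched for PPR as follows. This is not the bound derived in the paper: it uses the classical contraction-mapping estimate, under which the L1 distance between the current iterate and the exact PPR vector is at most `damping / (1 - damping)` times the L1 difference between the last two iterates. The function name, the graph representation, the default parameters, and the choice to send the mass of dangling nodes back to the source are all assumptions made for illustration.

```python
def personalized_pagerank(out_links, source, damping=0.85, tol=1e-8, max_iter=1000):
    """Power iteration for PPR that stops once an a posteriori error bound is met.

    out_links maps every node to the list of nodes it links to.
    Returns (scores, iterations_used, error_bound).
    """
    nodes = list(out_links)
    scores = {node: 0.0 for node in nodes}
    scores[source] = 1.0
    error_bound = float("inf")

    for iteration in range(1, max_iter + 1):
        new_scores = {node: 0.0 for node in nodes}
        new_scores[source] += 1.0 - damping  # restart mass returns to the source
        for node, targets in out_links.items():
            if targets:
                share = damping * scores[node] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # Assumption: a dangling node sends its mass back to the source.
                new_scores[source] += damping * scores[node]

        # L1 difference between two consecutive iterates.
        step = sum(abs(new_scores[node] - scores[node]) for node in nodes)
        scores = new_scores

        # Classical contraction bound on the remaining L1 error.
        error_bound = damping / (1.0 - damping) * step
        if error_bound < tol:
            break

    return scores, iteration, error_bound


# Hypothetical three-node graph used only to exercise the function.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores, iterations, bound = personalized_pagerank(graph, "a")
print(iterations, bound, scores)
```

The loop stops as soon as the bound drops below `tol` instead of running for a fixed number of iterations, which is the kind of saving a tighter bound would increase.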
Publishing articles in high-impact English journals is difficult for scholars around the world, especially for non-native English-speaking scholars (NNESs), most of whom struggle with proficiency in English. In order ...
Sentiment analysis in the tourism domain has drawn much attention in the past few years, which calls for a more precise sentiment word embedding method. This article proposes a kernel optimization function for sentiment word embedding. The method integrates semantic, statistical and sentiment information, and maintains the similarity between sentiment words in terms of sentiment orientation. The experimental results show that the optimized sentiment vectors successfully capture features of sentiment information as well as the difference between the concreteness and abstractness of sentiment words.