Topic detection and tracking provides a flat, unorganized view of a document collection and cannot adequately reflect the content of the complete collection, because some information is lost in the process. Topic models account for more information and lead to a more organized view of the document collection. In this paper, we propose a more efficient model named Related topic Network with a new term weighting method. Empirical evaluation using two real-world datasets consisting of 953 and 5,550 news documents demonstrates the utility of the proposed model and shows that the new term weighting method improves performance.
To make the best use of the content of the stories, and motivated by the idea of word co-occurrence, we propose a dynamic co-occurrence relationship between words within a story, explore a story similarity computation method based on this dynamic co-occurrence, and apply it to a Chinese story link detection system. Experimental results show that the similarity computation method performs well and greatly improves the performance of the Chinese story link detection system.
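The abstract does not define its dynamic co-occurrence relationship, so the following is only a rough sketch of the general idea of comparing two stories by their word co-occurrence patterns. The sliding window, the function names, and the use of cosine similarity are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from math import sqrt


def cooccurrence_counts(tokens, window=3):
    """Count unordered pairs of distinct words fewer than `window` positions apart.

    The fixed window is an assumption; the abstract does not say how
    co-occurrence is measured within a story.
    """
    pairs = Counter()
    for i, word in enumerate(tokens):
        # Look only at the next (window - 1) tokens so each pair is counted once.
        for other in tokens[i + 1 : i + window]:
            if other != word:
                pairs[tuple(sorted((word, other)))] += 1
    return pairs


def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def story_similarity(tokens_a, tokens_b, window=3):
    """Similarity of two tokenized stories based on shared co-occurring word pairs."""
    return cosine(
        cooccurrence_counts(tokens_a, window),
        cooccurrence_counts(tokens_b, window),
    )
```

A link detection system along these lines would presumably declare two stories linked when `story_similarity` exceeds a tuned threshold. For Chinese text, the token lists would have to come from a word segmenter, which is outside this sketch.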
ISBN: 9781450320382 (print)
We present a data-driven study on which sources were the first to report on news events. For this, we implemented a news aggregator that included a large number of established news sources and covered one year of data. We present a novel framework that is able to retrieve a large number of events and not only the most salient ones, while at the same time making sure that they are not exclusively of local *** analysis then focuses on different aspects of the news cycle. In particular, we analyze which sources break most of the news. By looking at when certain events become bursty, we are able to perform a finer analysis of those events and the associated sources that dominate global news attention. Finally, we study the time it takes news outlets to report on these events and how this reflects different strategies for choosing which news to report. A general finding of our study is that big news agencies remain an important threshold to cross to bring global attention to particular news, but it also shows the importance of focused (by region or topic) outlets.
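As a minimal sketch of the "who reported first" question, assume the aggregator has already grouped articles into events and produced `(event_id, source, timestamp)` records. That record layout and the function name are hypothetical; the abstract does not describe the framework's data model.

```python
from collections import Counter


def first_reporters(reports):
    """Count, per source, how many events that source reported first.

    `reports` is an iterable of (event_id, source, timestamp) tuples; this
    layout is an assumption for illustration. When two sources share the
    earliest timestamp, the one seen first in the input is kept.
    """
    earliest = {}  # event_id -> (timestamp, source)
    for event_id, source, timestamp in reports:
        if event_id not in earliest or timestamp < earliest[event_id][0]:
            earliest[event_id] = (timestamp, source)
    return Counter(source for _, source in earliest.values())
```

Ranking the resulting counts, for example with `Counter.most_common`, gives a simple view of which sources break the most news in a dataset.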
Recently, there have been significant advances in several areas of language technology, including clustering, text categorization, and summarization. However, efforts to combine technology from these areas in a practical system for information access have been limited. In this paper, we present Columbia's Newsblaster system for online news summarization. Many of the tools developed at Columbia over the years are combined to produce a system that crawls the web for news articles, clusters them on specific topics, and produces multidocument summaries for each cluster.
In this paper, we build an on-line topic detection and state prediction system, which can automatically collect Internet web pages, cluster them into topics, and predict the hot topics' states. An HMM-based prediction model is proposed to predict the Internet hot topic's state, and the prediction method is tested in an actual network *** this system, we train the observations of the topics by the hidden Markov model and save the models in an HMM library for the topic's *** with similar life cycles are recorded and share the same model. Experimental results are shown.
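The abstract gives no details of the HMM's states, observations, or training procedure. The sketch below only illustrates how a discrete hidden Markov model with known parameters can be used to estimate a topic's current state and predict the next one, using the standard normalized forward pass. The function names and the state and observation encodings are assumptions.

```python
def forward_filter(obs, start, trans, emit):
    """Return P(current state | all observations so far) for a discrete HMM.

    obs:   list of observation indices (e.g. binned page counts; hypothetical)
    start: start[s] is the initial probability of state s
    trans: trans[p][s] is P(next state = s | current state = p)
    emit:  emit[s][o] is P(observation = o | state = s)
    """
    n = len(start)
    belief = _normalize([start[s] * emit[s][obs[0]] for s in range(n)])
    for o in obs[1:]:
        # Propagate the belief one step, then weight by the new observation.
        predicted = predict_next_state(belief, trans)
        belief = _normalize([predicted[s] * emit[s][o] for s in range(n)])
    return belief


def predict_next_state(belief, trans):
    """Return the state distribution one step ahead of `belief`."""
    n = len(belief)
    return [sum(belief[p] * trans[p][s] for p in range(n)) for s in range(n)]


def _normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("observation sequence has zero probability under the model")
    return [v / total for v in values]
```

With states standing for, say, the rising, peak, and fading phases of a hot topic, `predict_next_state(forward_filter(obs, start, trans, emit), trans)` would give the predicted distribution over the topic's next phase. Sharing one parameter set among topics with similar life cycles would correspond to the HMM library mentioned in the abstract.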
Social networks enable users to freely communicate with each other and share their recent news, ongoing activities, or views about different topics. As a result, they can be seen as a potentially viable source of information for understanding current emerging topics and events. The ability to model emerging topics is a substantial step toward monitoring and summarizing the information originating from social sources. Traditional event detection methods, which are often designed for large, formal, and structured documents, are less effective here because social posts are short, noisy, and informal. Recent event detection techniques address these challenges by exploiting the abundant information available in social networks. This article provides an overview of the state of the art in event detection from social networks.