ISBN: (print) 9780769540207
Product review mining is the process of extracting customer opinions from reviews written in natural language. As the first phase of product review mining, product feature extraction determines the quality of the subsequent phases. In this paper, we build a combined approach based on bootstrapping and ID3, in which ID3 serves as the feature selection algorithm within each bootstrapping iteration. Given a seed set and a classification feature set, the combined approach automatically extracts textual patterns with different structures, avoiding the need to design textual pattern structures or a similarity function among textual patterns by hand. We implement an automated product feature extraction system with the combined approach. Compared with previous studies, our system achieves higher precision and better portability.
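The abstract does not say how ID3 scores classification features, so the following is only a minimal sketch of the information-gain criterion that ID3 is generally built on, applied to ranking candidate features. The feature names (`pos`, `prev_pos`), the toy rows, and the labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels):
    """Order features from most to least informative (the ID3 split criterion)."""
    return sorted(rows[0].keys(),
                  key=lambda f: information_gain(rows, labels, f),
                  reverse=True)


# Hypothetical candidates: is each token a product feature (True) or not (False)?
rows = [
    {"pos": "NN", "prev_pos": "JJ"},
    {"pos": "NN", "prev_pos": "DT"},
    {"pos": "VB", "prev_pos": "JJ"},
    {"pos": "VB", "prev_pos": "DT"},
]
labels = [True, True, False, False]

ranked = rank_features(rows, labels)
print(ranked)
```

In a bootstrapping loop, a ranking like this could be recomputed on each iteration as newly extracted instances are added to the labeled set; how the paper actually couples the two steps is not stated in the abstract.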
ISBN: (print) 9781450348874
Mining textual patterns in news, tweets, papers, and many other kinds of text corpora has been an active theme in text mining and NLP research. Previous studies adopt a dependency parsing-based pattern discovery approach. However, the parsing results lose the rich context around entities in the patterns, and the process is costly for a large-scale corpus. In this study, we propose a novel typed textual pattern structure, called meta pattern, which is extended to a frequent, informative, and precise subsequence pattern in a certain context. We propose an efficient framework, called MetaPAD, which discovers meta patterns from massive corpora with three techniques: (1) it develops a context-aware segmentation method to carefully determine the boundaries of patterns with a learnt pattern quality assessment function, which avoids costly dependency parsing and generates high-quality patterns; (2) it identifies and groups synonymous meta patterns from multiple facets: their types, contexts, and extractions; and (3) it examines the type distributions of entities in the instances extracted by each group of patterns, and looks for appropriate type levels to make the discovered patterns precise. Experiments demonstrate that our proposed framework discovers high-quality typed textual patterns efficiently from different genres of massive corpora and facilitates information extraction.
ISBN: (print) 9781728174341
In the past few years, social media has become an integral part of modern society. It has also surfaced as an influential tool that helps a business or individual gain identity and reputation. Predicting the popularity of images before they are posted on social media may thus have a profound impact in revealing individual preference and public attention. However, accurate prediction is a challenging task, mainly because of the many factors that play a part in it. Previous studies, although they achieve favourable results, overlook one unique characteristic of the semantics in textual metadata, namely language modeling, which can better model the context information of a post. To that end, we propose to exploit language modeling features together with user profile and post metadata features. The language model features are extracted from the probability of word occurrence, while the user profile and post metadata features are provided as attributes by the original data source. Several state-of-the-art statistical modeling techniques are employed to investigate the performance of the proposed features under different estimation procedures. Experiments on a large-scale Flickr dataset demonstrate the benefits of the proposed features for predicting the popularity of social media posts.
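The abstract says only that the language model features come from "the probability of word occurrence". As one possible reading, the sketch below scores a post's text with a unigram model using add-one smoothing. The function names, the smoothing choice, and the toy corpus are assumptions made for illustration, not the authors' feature definition.

```python
import math
from collections import Counter


def train_unigram(corpus):
    """Count word occurrences over a list of texts (lowercased, whitespace tokens)."""
    counts = Counter(word for text in corpus for word in text.lower().split())
    return counts, sum(counts.values())


def avg_log_prob(text, counts, total):
    """Average add-one-smoothed unigram log-probability of the words in a text."""
    words = text.lower().split()
    if not words:
        return 0.0
    vocab = len(counts) + 1  # one extra slot reserved for unseen words
    return sum(math.log((counts[w] + 1) / (total + vocab)) for w in words) / len(words)


# Hypothetical post titles standing in for textual metadata.
corpus = ["sunset beach", "sunset city"]
counts, total = train_unigram(corpus)

seen_score = avg_log_prob("sunset", counts, total)
unseen_score = avg_log_prob("xyzzy", counts, total)
print(seen_score, unseen_score)
```

A score like this could be appended as one numeric column next to the user profile and post metadata attributes before fitting a regression model.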
Author: Frenz, Christopher M., CUNY New York City Coll Technol, Dept Comp Engn Technol, Brooklyn, NY 11201 USA
Background: While keyword-based queries of databases such as PubMed are frequently of great utility, the ability to use regular expressions in place of a keyword can often improve the results returned by such databases. Regular expressions allow the identification of element types that cannot be readily specified by a single keyword, and allow different words with similar character sequences to be distinguished. Results: A Perl-based utility was developed to allow the use of regular expressions in PubMed searches, thereby improving the accuracy of the searches. Conclusion: This utility was then used to create a comprehensive listing of all DFN deafness mutations discussed in PubMed records containing the keywords "human ear".
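The original utility is written in Perl and its code is not shown here, so the following Python sketch only illustrates the general idea: take records already retrieved by a keyword query and keep those whose text matches a regular expression. The pattern (the letters DFN, an optional A or B, then digits) and the in-memory records are assumptions made for illustration; no real PubMed data or API call is involved.

```python
import re

# Assumed pattern for DFN-style symbols such as DFNA3, DFNB1 or DFN3.
DFN_PATTERN = re.compile(r"\bDFN[AB]?\d+\b")


def find_matches(records, pattern=DFN_PATTERN):
    """Map each record ID to the sorted, de-duplicated pattern matches in its text."""
    hits = {}
    for record_id, text in records.items():
        found = sorted(set(pattern.findall(text)))
        if found:
            hits[record_id] = found
    return hits


# Made-up records standing in for the results of a keyword search.
records = {
    "rec1": "Mutations at the DFNB1 and DFNA3 loci were examined.",
    "rec2": "A review of the anatomy of the human ear.",
    "rec3": "The X-linked locus DFN3 is discussed.",
}

hits = find_matches(records)
print(hits)
```

A single keyword such as "DFN" would miss these symbols or match unrelated words, whereas the pattern captures the whole family of identifiers in one pass and skips records like `rec2` that contain none.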
This article aims to identify whether cultural factors influence EFL learners' English and Chinese written compositions. By comparing the participants' English and Chinese compositions entitled "How I deal with stress", the author tries to identify the different writing patterns in the learners' writing performance. Twenty-five EFL learners are involved in this study, and their written essays are analyzed from the perspectives of textual construction, cohesive devices and syntactic complexity. The data were collected and calculated by the Juku network system based on its internal corpus data. Four variables (ways of placing the subject, number of sentences, mean sentence length, verb phrases) were tested with the help of SPSS *** results tend to lead to the following conclusions. First, low-context (LC) culture casts some influence on EFL learners' English and Chinese written works. Second, the factors of high-context (HC) culture now have little effect on Chinese learners' English writing performance; on the contrary, LC culture has some impact on their essays. Third, the syntactic complexity of the Chinese and English compositions suggests little significance in the same testees' writing performance.
Text strategy, one part of discourse analysis, is the choice made by the producer of a text during the process of communication to realize certain communicative *** research on text strategy provides some advantageous evidence for foreign language teaching, especially English writing *** thesis, taking textual pattern, text type and text-strategic continuity as examples, expounds the connotation of text strategy and emphasizes the importance of introducing these theories into the teaching of college English writing. It shows how the theories inform that teaching, which not only strengthens students' text awareness but also improves their English writing ability and their ability to interpret different genres of texts effectively.