This paper focuses on improving aspect-level opinion mining for online customer reviews. We first propose a novel generative topic model, the Joint Aspect/Sentiment (JAS) model, to jointly extract aspects and aspect-dependent sentiment lexicons from online customer reviews. An aspect-dependent sentiment lexicon consists of the opinion words specific to an aspect together with their sentiment polarities with respect to that aspect. We then apply the extracted aspect-dependent sentiment lexicons to a series of aspect-level opinion mining tasks, including implicit aspect identification, aspect-based extractive opinion summarization, and aspect-level sentiment classification. Experimental results demonstrate the effectiveness of the JAS model in learning aspect-dependent sentiment lexicons and the practical value of the extracted lexicons when applied to these tasks.
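To illustrate what an aspect-dependent lexicon makes possible, the following is a minimal sketch of lexicon-based aspect-level sentiment classification. It does not implement the JAS model; the lexicon entries, the `classify` function, and the sum-of-polarities rule are all assumptions made for illustration only.

```python
# Hypothetical aspect-dependent sentiment lexicon (not taken from the paper):
# the same opinion word may carry a different polarity for different aspects.
LEXICON = {
    "battery": {"long": 1, "short": -1},
    "startup": {"long": -1, "fast": 1},
}


def classify(aspect, opinion_words, lexicon=LEXICON):
    """Sum the aspect-specific polarities and map the total to a label."""
    scores = lexicon.get(aspect, {})
    total = sum(scores.get(word.lower(), 0) for word in opinion_words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

Here "long" counts as positive for the battery aspect and negative for the startup aspect, which a single aspect-independent lexicon could not express.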
Support Vector Machines (SVMs) are an effective binary classification algorithm. To make better use of text structural features for information extraction, whose use is greatly restricted by the Hidden Markov Model (HMM), this paper proposes an SVM-based multi-class classification method based on Markov properties to extract information from a citation database. The proposed model extracts symbol characteristics as features and composes a binary tree of the transition probabilities. Experiments show that the proposed method outperforms HMM and basic SVM methods.
Text classification is a key technology for topic tracking, and the vector space model (VSM) is one of the simplest and most effective topic representation models. Feature selection in the VSM is an important data pre-processing step: it reduces the dimension of the vector space and improves the generalization ability of the algorithm. Feature selection algorithms therefore merit in-depth and extensive study. We develop a topic tracking system to study how the feature dimension and the value of K-neighbors affect topic tracking. We then derive how each of them affects topic tracking and determine their optimal values. Finally, TDT evaluation methods show that the optimal topic tracking performance obtained by adjusting the value of K-neighbors is 7.246% higher than that obtained by adjusting the feature dimension.
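The two parameters studied above, feature dimension and K, can be made concrete with a minimal sketch of a K-nearest-neighbor topic tracker over a vector space model. This is not the authors' system: the document-frequency feature selection, the cosine similarity, the majority vote, and all function names are assumptions chosen for illustration.

```python
from collections import Counter
import math


def select_features(docs, dim):
    """Keep the `dim` terms with the highest document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    df = Counter(term for doc in docs for term in set(doc.split()))
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:dim]


def vectorize(doc, features):
    """Represent a document as raw term counts over the selected features."""
    counts = Counter(doc.split())
    return [counts[f] for f in features]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_on_topic(story, train_docs, train_labels, dim, k):
    """Return True if a strict majority of the k most similar training
    stories are labeled on-topic (label True)."""
    features = select_features(train_docs, dim)
    query = vectorize(story, features)
    scored = sorted(
        ((cosine(query, vectorize(doc, features)), label)
         for doc, label in zip(train_docs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = sum(1 for _, label in scored[:k] if label)
    return votes * 2 > k
```

Sweeping `dim` and `k` over a labeled collection and scoring each setting is the kind of experiment the abstract describes; the sketch leaves the evaluation metric open.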
There exists a large and underutilized resource of archaeological literature, both formal, such as scholarly journals, and less formal, in the form of 'grey literature'. In the archaeological domain, the vast majority of this literature contains some geo-spatial element as well as the expected temporal information, so its ease of discovery would be greatly enhanced were it accessible via a geo-spatially enabled search mechanism. Consequently, geo-referencing these types of material and integrating them with other resources, such as monument inventories, is seen as a desirable enhancement for digital archives serving the archaeological research community. This paper provides an overview of several approaches to integrating such legacy literature into geo-spatial search mechanisms in an archaeological context. In particular, it discusses efforts to achieve this via the Archaeotools e-science project, with its use of natural language processing and a geo-spatial cross-walk service, as well as potential future enhancements to the process.
Transformation of a source schema to a target schema is an important activity in data integration. In XML data transformation for data integration purposes, when an XML source schema with its conforming data is transformed to the target XML schema, XML keys, an important class of XML constraints defined on the source schema to express semantics, can also be transformed. Thus, whether keys should be transformed and preserved and, if they are not preserved, whether they can be captured in another form of XML constraint are important questions. To answer these questions, we first define XML keys and XML functional dependencies (XFDs) on the document type definition (DTD). Second, we show key preservation in transformation. If keys are not preserved, we then show how to capture them as XFDs; we term this key transition. Our research on XML key preservation and transition is a step towards handling integrity constraints in XML data integration.
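As a rough illustration of what it means for a document to satisfy a key, the sketch below checks a simplified, path-based key: within each context element, every target element must have a field value, and no two targets may share one. This is an assumed, informal notion and not the paper's definition over DTDs; the function name `key_holds` and its parameters are hypothetical.

```python
import xml.etree.ElementTree as ET


def key_holds(root, context_path, target_path, field):
    """Check a simplified XML key (an assumption, not the paper's definition).

    Under every element matched by `context_path`, each element matched by
    `target_path` must have a `field` child, and the field texts must be
    pairwise distinct.
    """
    for context in root.findall(context_path):
        seen = set()
        for target in context.findall(target_path):
            value = target.findtext(field)
            if value is None or value in seen:
                return False
            seen.add(value)
    return True
```

Running the same check with a wider context (for example the document root instead of each department) shows how a key that holds locally can fail after a transformation that flattens the structure, which is the situation in which a key is not preserved.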
Referential integrity is one of the integrity constraints for any data model. In the relational data model, the inclusion dependency (ID) and the foreign key (FK) are well studied and widely used. In the last decade, with the growing use of XML as a data representation and exchange format over the web, the issue of integrity constraints in XML has received considerable attention from the database community. In this paper, we propose the XML inclusion dependency (XID) and the XML foreign key (XFK). In doing so, we show how both XID and XFK can be defined over the Document Type Definition (DTD) and satisfied by XML documents. We introduce a novel concept, the tuple, that produces semantically correct values in the XML documents when satisfaction is checked. We also show that XFK is defined by combining XID and the XML key.
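The basic idea of an inclusion dependency can be sketched at the level of plain values: every value found at a dependent path must also occur at a referenced path. The sketch below is an assumed simplification that ignores the paper's tuple concept and its DTD-based definition; `inclusion_holds` and the path arguments are hypothetical names.

```python
import xml.etree.ElementTree as ET


def inclusion_holds(root, dependent_path, referenced_path):
    """Value-level inclusion check (a simplification, not the paper's XID).

    Returns True if the text of every element at `dependent_path` also
    appears as the text of some element at `referenced_path`.
    """
    referenced = {elem.text for elem in root.findall(referenced_path)}
    return all(elem.text in referenced for elem in root.findall(dependent_path))
```

A foreign-key style constraint would additionally require the referenced values to form a key, in line with the abstract's remark that XFK combines XID with the XML key.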