Search Results - Inner Mongolia University Library

2nd International Conference on Information Technology and Computer Science, ITCS 2010

Authors: Tang, Yang; Li, Fangtao; Huang, Minlie; Zhu, Xiaoyan. Department of Computer Sci. and Tech., State Key Laboratory on Intelligent Technology and Systems, Tsinghua University, Beijing 100084, China

ISBN: (print) 9780769540740

As online community question answering (cQA) portals like Yahoo! Answers and Baidu Zhidao have attracted hundreds of millions of questions, how to utilize these questions and their corresponding answers becomes increasingly important for cQA websites. Prior approaches focus on using information retrieval techniques to provide a ranked list of questions based on their similarities to the query. Due to the high variance of question quality and answer quality, users have to spend a lot of time finding the truly best answers among the retrieved results. In this paper, we develop an answer retrieval and summarization system which directly provides an accurate and comprehensive answer summary, instead of a list of similar questions, in response to a user's query. To fully explore the relations between queries and questions, between questions and answers, and between answers and sentences, we propose a new probabilistic scoring model to distinguish high-quality answers from low-quality answers. By fully exploiting these relations, we summarize answers using a maximum coverage model. Experimental results on data extracted from Chinese cQA websites demonstrate the efficacy of our proposed method. © 2010 IEEE.

Keywords: Websites
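The abstract above names a maximum coverage model for summarization but gives no details. As a rough illustration only, the sketch below uses the standard greedy heuristic for maximum coverage: it repeatedly picks the sentence that adds the most not-yet-covered words. Treating lowercase whitespace-separated words as the units to cover, the function name `greedy_max_coverage`, and the sample answers are all assumptions made for this example, not the authors' formulation.

```python
def greedy_max_coverage(sentences, budget):
    """Greedily pick up to `budget` sentences covering the most distinct words.

    Assumption: the "units" to cover are lowercase whitespace-separated words.
    Returns the chosen sentence indices (in pick order) and the covered words.
    """
    covered = set()
    chosen = []
    remaining = list(range(len(sentences)))
    while remaining and len(chosen) < budget:
        # Marginal gain of a sentence = words it would add to the summary.
        best = max(
            remaining,
            key=lambda i: len(set(sentences[i].lower().split()) - covered),
        )
        gain = set(sentences[best].lower().split()) - covered
        if not gain:
            break  # nothing new can be covered; stop early
        covered |= gain
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered


# Hypothetical candidate answer sentences for one query.
answers = [
    "restart the router and check the power cable",
    "restart the router",
    "update the firmware",
    "update the network driver",
]
chosen, covered = greedy_max_coverage(answers, budget=2)
for i in chosen:
    print(answers[i])
```

The greedy rule is a common choice because it is simple and carries a known approximation guarantee for coverage objectives; a real system would likely weight units by the kind of quality scores the abstract describes.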

Grouping product features using semi-supervised learning with soft-constraints

23rd International Conference on Computational Linguistics, Coling 2010

Authors: Zhai, Zhongwu; Liu, Bing; Xu, Hua; Jia, Peifa. State Key Lab of Intelligent Tech. and Sys., Tsinghua National Lab for Info. Sci. and Tech., Tsinghua Univ., China; Dept. of Comp. Sci., University of Illinois, Chicago, United States

In opinion mining of product reviews, one often wants to produce a summary of opinions based on product features/attributes. However, people can express the same feature with different words and phrases. To produce a meaningful summary, these words and phrases, which are domain synonyms, need to be grouped under the same feature group. This paper proposes a constrained semi-supervised learning method to solve the problem. Experimental results using reviews from five different domains show that the proposed method is competent for the task. It outperforms the original EM and the existing state-of-the-art methods by a large margin.

Keywords: Sentiment analysis

Measuring the non-compositionality of multiword expressions

23rd International Conference on Computational Linguistics, Coling 2010

Authors: Bu, Fan; Zhu, Xiaoyan; Li, Ming. State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Sci. and Tech., China; David R. Cheriton School of Computer Science, University of Waterloo, Canada

Multiword Expressions (MWEs) appear frequently and ungrammatically in natural languages. Identifying MWEs in free text is a very challenging problem. This paper proposes a knowledge-free, training-free, and language-independent Multiword Expression Distance (MED). The new metric is derived from an accepted physical principle, measures the distance from an n-gram to its semantics, and outperforms other state-of-the-art methods on MWEs in two applications: question answering and named entity extraction.

Keywords: Semantics

Answering Opinion Questions with Random Walks on Graphs

Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and 4th International Joint Conference on Natural Language Processing of the AFNLP, ACL-IJCNLP 2009

Authors: Li, Fangtao; Tang, Yang; Huang, Minlie; Zhu, Xiaoyan. State Key Laboratory on Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Sci. and Tech., Tsinghua University, Beijing 100084, China

ISBN: (print) 9781617382581

Opinion Question Answering (Opinion QA), which aims to find the authors' sentimental opinions on a specific target, is more challenging than traditional fact-based question answering problems. To extract opinion-oriented answers, we need to consider both topic relevance and opinion sentiment. Current solutions to this problem are mostly ad hoc combinations of question topic information and opinion information. In this paper, we propose an Opinion PageRank model and an Opinion HITS model to fully explore the information from different relations among questions and answers, answers and answers, and topics and opinions. By fully exploiting these relations, our proposed algorithms outperform several state-of-the-art baselines on a benchmark data set. A gain of over 10% in F scores is achieved compared with many other systems. © 2009 ACL and AFNLP.

Keywords: Natural language processing systems
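The abstract above mentions an Opinion PageRank model without defining it. As a hedged illustration of the general idea of a random walk on an answer graph, the sketch below runs ordinary PageRank power iteration with a biased restart distribution. The function name `biased_pagerank`, the adjacency matrix, and the use of the restart vector as a stand-in for topic or opinion relevance are assumptions for this example; the paper's actual transition and scoring definitions are not given in this record.

```python
def biased_pagerank(adj, prior, damping=0.85, iters=100):
    """Power iteration for PageRank with a non-uniform restart distribution.

    adj[i][j] is a non-negative edge weight from node i to node j.
    prior[i] is a non-negative relevance score; it is normalized internally
    and used both as the restart distribution and for dangling nodes.
    """
    n = len(adj)
    total = sum(prior)
    prior = [p / total for p in prior]
    rank = prior[:]
    for _ in range(iters):
        # Restart mass: with probability (1 - damping) jump according to prior.
        new = [(1 - damping) * p for p in prior]
        for i in range(n):
            out = sum(adj[i])
            if out == 0:
                # Dangling node: spread its mass according to the prior.
                for j in range(n):
                    new[j] += damping * rank[i] * prior[j]
            else:
                for j in range(n):
                    if adj[i][j]:
                        new[j] += damping * rank[i] * adj[i][j] / out
        rank = new
    return rank


# Hypothetical graph of three answer sentences, fully connected.
similarity = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
# Hypothetical relevance scores: sentence 0 is the most on-topic.
scores = biased_pagerank(similarity, prior=[3, 1, 1])
print(scores)
```

With a uniform prior on this symmetric graph every node ends up with the same score; the biased prior is what pushes the first sentence ahead.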

Partially observed Maximum Entropy Discrimination Markov Networks

22nd Annual Conference on Neural Information Processing Systems, NIPS 2008

Authors: Zhu, Jun; Xing, Eric P.; Zhang, Bo. State Key Lab of Intelligent Tech and Sys., Tsinghua National TNList Lab., Dept. Comp Sci and Tech., China; Tsinghua University, Beijing, China; School of Comp. Sci., Carnegie Mellon University, Pittsburgh, PA 15213, United States

ISBN: (print) 9781605609492

Learning graphical models with hidden variables can offer semantic insights into complex data and lead to salient structured predictors without relying on expensive, sometimes unattainable, fully annotated training data. While likelihood-based methods have been extensively explored, to our knowledge, learning structured prediction models with latent variables based on the max-margin principle remains largely an open problem. In this paper, we present a partially observed Maximum Entropy Discrimination Markov Network (PoMEN) model that attempts to combine the advantages of Bayesian and margin-based paradigms for learning Markov networks from partially labeled data. PoMEN leads to an averaging prediction rule that resembles a Bayes predictor and is more robust to overfitting, but is also built on desirable discriminative laws that resemble those of the M3N. We develop an EM-style algorithm that uses existing convex optimization algorithms for M3N as a subroutine. We demonstrate competitive performance of PoMEN over existing methods on a real-world web data extraction task.

Keywords: Semantics

Model-based bridge recognition in high resolution SAR image

MIPPR 2009 - Automatic Target Recognition and Image Analysis: 6th International Symposium on Multispectral Image Processing and Pattern Recognition

Authors: Pei, Deli; Sun, Fuchun; Wang, Hongqiao; Chen, Ning. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; State Key Lab. of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China; Xi'an High Tech. Research Institute, Xi'an, Shanxi 710025, China

ISBN: (print) 9780819478061

A new bridge recognition method for Synthetic Aperture Radar (SAR) images using a bridge model and SVM is presented in this paper. First, the water region is extracted from the original SAR image by self-adaptive segmentation and a mathematical morphology method. Then, based on the bridge model, the bridge candidates in the water region can be found and their texture features can also be ***, with the usage of SVM, whether these objects are bridges or not can be identified, and the real bridges can be marked in the SAR image, along with corresponding information such as position. Experimental results show that, with the help of water region segmentation and the bridge model, the efficiency of the algorithm is greatly improved, while the recognition accuracy is guaranteed by the high-quality SVM classifier. © 2009 Copyright SPIE - The International Society for Optical Engineering.

Keywords: Synthetic aperture radar

StatSnowball: A statistical approach to extracting entity relationships

18th International World Wide Web Conference, WWW 2009

Authors: Zhu, Jun; Nie, Zaiqing; Liu, Xiaojing; Zhang, Bo; Wen, Ji-Rong. Dept. of Comp. Sci. and Tech., Tsinghua University, Beijing 100084, China; Microsoft Research Asia, No. 49 Zhichun Road, Beijing 100080, China; Dept. of EEIS, University of Sci. and Tech. of China, Hefei 230027, China; State Key Laboratory of Intelligent Technology and Systems, China; Tsinghua National Laboratory for Information Science and Technology, China

ISBN: (print) 9781605584874

Traditional relation extraction methods require pre-specified relations and relation-specific human-tagged examples. Bootstrapping systems significantly reduce the number of training examples, but they usually apply heuristic-based methods to combine a set of strict hard rules, which limits their ability to generalize and thus yields low recall. Furthermore, existing bootstrapping methods do not perform open information extraction (Open IE), which can identify various types of relations without requiring pre-specifications. In this paper, we propose a statistical extraction framework called Statistical Snowball (StatSnowball), which is a bootstrapping system and can perform both traditional relation extraction and Open IE. StatSnowball uses discriminative Markov logic networks (MLNs) and softens hard rules by learning their weights in a maximum likelihood estimate sense. MLN is a general model and can be configured to perform different levels of relation extraction. In StatSnowball, pattern selection is performed by solving an ℓ1-norm penalized maximum likelihood estimation, which enjoys well-founded theories and efficient solvers. We extensively evaluate the performance of StatSnowball in different configurations on both a small but fully labeled data set and large-scale Web data. Empirical results show that StatSnowball can achieve a significantly higher recall without sacrificing high precision during iterations with a small number of seeds, and that the joint inference of MLN can improve the performance. Finally, StatSnowball is efficient, and we have developed a working entity relation search engine called Renlifang based on it. Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Keywords: Search engines
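The abstract above says pattern selection is done through a 1-norm penalized estimation. One way to see why such a penalty selects patterns is the soft-thresholding operator, the proximal operator of the ℓ1 norm, which shrinks every weight toward zero and sets small ones to exactly zero. The sketch below shows only that operator; the function name, the weights, and the threshold are hypothetical, and this is not presented as the solver used in the paper.

```python
def soft_threshold(weights, lam):
    """Apply the proximal operator of lam * ||w||_1 to each weight.

    Weights with magnitude at most `lam` become exactly 0.0 (the pattern is
    dropped); larger weights are shrunk toward zero by `lam`.
    """
    shrunk = []
    for w in weights:
        if w > lam:
            shrunk.append(w - lam)
        elif w < -lam:
            shrunk.append(w + lam)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical pattern weights after an unpenalized update step.
pattern_weights = [1.0, -0.125, 0.25, -1.5]
selected = soft_threshold(pattern_weights, lam=0.25)
kept = [i for i, w in enumerate(selected) if w != 0.0]
print(selected, kept)
```

Proximal-gradient style solvers alternate a gradient step on the likelihood with this shrinkage step, which is why the resulting weight vector is sparse.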

The Chinese pinyin input method based on internet data

Journal of Computational Information Systems, 2009, Vol. 5, No. 3, pp. 1167-1173

Author: Yang, Lei. State Key Lab of Intelligent Tech. and Sys., Tsinghua University, Beijing 100084, China; Sohu Research and Development Center, Beijing 100084, China

The corpus used to train the language model is a key factor in the performance of a Chinese pinyin input method. Traditional products take public documents as the corpus. The Internet provides a cheap, massive, and living repository. Through word segmentation, frequency counting, word correction, and phonetic annotation, a better corpus is built. A typing experiment with several pinyin products shows that products applying Internet-based corpora all outperform the traditional products. 1553-9105/ Copyright © 2009 Binary Information Press.

Keywords: Computational linguistics

Incorporate web search technology to solve out-of-vocabulary words in Chinese word segmentation

PACLIC 23 - Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, 2009, Vol. 2, pp. 454-463

Authors: Qiao, Wei; Sun, Maosong. State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Sci. and Tech., Tsinghua University, Beijing 100084, China

ISBN: (print) 9789624423198

Chinese word segmentation (CWS) is the fundamental technology for many NLP-related applications. It is reported that more than 60% of segmentation errors are caused by out-of-vocabulary (OOV) words. Recent studies in CWS show that statistical machine learning methods are, to some extent, effective at handling OOV words. But labeled data is limited in size and unbalanced in content, which makes it impossible to obtain all the knowledge required to recognize OOV words. In this paper, large-scale web data is incorporated as a knowledge supplement. A framework that combines web search technology and machine learning is proposed. For each sentence, basic segmentation is performed using a linear-chain Conditional Random Fields (CRF) model. Substrings for which the CRF model gives low-confidence decisions are extracted and sent to a search engine to perform web-search-based word segmentation. The final decision is made by considering both the CRF-based segmentation result and the web-search-based result. Evaluations are conducted on the SIGHAN Bakeoff 2005 and 2006 datasets, showing the effectiveness of the proposed framework in dealing with OOV words. © 2009 by Wei Qiao and Maosong Sun.

Keywords: Search engines

Web image clustering based-on Multi-Instance

2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Workshops, WI-IAT Workshops 2008

Authors: Lu, Jing; Ma, Shaoping; Zhang, Min. State Key Lab. of Intelligent Tech. and Systems, Tsinghua National Lab. for Information Science and Tech., Department of Computer Science and Tech., Tsinghua Uni, Beijing 100084, China

ISBN: (print) 9780769534961

In image retrieval and annotation, Multi-Instance Learning (MIL) has been studied actively. Most methods solve the MIL problem in a supervised way. In this paper, we propose two unsupervised frameworks for clustering multi-instance objects, based on Expectation Maximization (EM) and iterative heuristic optimization respectively. For each framework, we introduce three new algorithms for finding users' interests in specific web images without any manually labeled data. Comparative studies have shown the effectiveness of the proposed algorithms. © 2008 IEEE.

Keywords: Maximum principle
