Search results - Inner Mongolia University Library

13th Annual Conference of the European Association for Machine Translation, EAMT 2009

Authors: Wu, Dekai; Fung, Pascale. Affiliations: Human Language Technology Center, HKUST, Department of Computer Science and Engineering, University of Science and Technology, Clear Water Bay, Hong Kong; Human Language Technology Center, HKUST, Department of Electronic and Computer Engineering, University of Science and Technology, Clear Water Bay, Hong Kong

We present a series of empirical studies aimed at illuminating more precisely the likely contribution of semantic roles in improving statistical machine translation accuracy. The experiments reported study several aspects key to success: (1) the frequencies of types of SMT errors where semantic parsing and role labeling could help, (2) if and where semantic roles offer more accurate guidance to SMT than merely syntactic annotation, and (3) the potential quantitative impact of realistic semantic role guidance to SMT systems, in terms of BLEU and METEOR scores. © 2009 European Association for Machine Translation.

Keywords: computer aided language translation

Are unaligned words important for machine translation?

13th Annual Conference of the European Association for Machine Translation, EAMT 2009

Authors: Zhang, Yuqi; Matusov, Evgeny; Ney, Hermann. Affiliation: Human Language Technology and Pattern Recognition, Lehrstuhl für Informatik 6 - Computer Science Department, RWTH Aachen University, D-52056 Aachen, Germany

In this paper, we deal with the problem of a large number of unaligned words in automatically learned word alignments for machine translation (MT). These unaligned words are the reason for ambiguous phrase pairs extracted by a statistical phrase-based MT system. In translation, this phrase ambiguity causes deletion and insertion errors. We present hard and optional deletion approaches to remove the unaligned words in the source language sentences. Improvements in translation quality are achieved both on large and small vocabulary tasks with the presented methods. © 2009 European Association for Machine Translation.

Keywords: Machine translation
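
The abstract above mentions a "hard deletion" approach that removes unaligned source words. One simple reading of that idea can be sketched as follows. The function name `drop_unaligned_source`, the representation of an alignment as a set of `(source_index, target_index)` pairs, and the example sentence are all assumptions made for illustration; the abstract does not specify the authors' actual procedure.

```python
def drop_unaligned_source(source_tokens, alignment):
    """Return the source tokens that have at least one alignment link.

    `alignment` is a set of (source_index, target_index) pairs.
    This is an illustrative simplification, not the paper's method.
    """
    aligned_positions = {src for src, _ in alignment}
    return [tok for i, tok in enumerate(source_tokens) if i in aligned_positions]


# Hypothetical example: the particle "ja" has no link to the target side.
source = ["das", "ist", "ja", "gut"]
links = {(0, 0), (1, 1), (3, 2)}
filtered = drop_unaligned_source(source, links)
print(filtered)
```

An "optional" deletion variant would presumably keep both the original and the filtered sentence as alternatives for the decoder, but that is not sketched here.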

Using citations to generate surveys of scientific paradigms

Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL HLT 2009

Authors: Mohammad, Saif; Dorr, Bonnie; Egan, Melissa; Hassan, Ahmed; Muthukrishan, Pradeep; Qazvinian, Vahed; Radev, Dragomir; Zajic, David. Affiliations: Institute for Advanced Computer Studies, University of Maryland, United States; Computer Science, University of Maryland, United States; Human Language Technology Center of Excellence, United States; Center for Advanced Study of Language, United States; Department of Electrical Engineering and Computer Science, University of Michigan, United States; School of Information, University of Michigan, United States

ISBN: (print) 9781932432411

The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey. Specifically we explore the combination of citation information and summarization techniques. Even though prior work (Teufel et al., 2006) argues that citation text is unsuitable for summarization, we show that in the framework of multi-document survey creation, citation texts can play a crucial role. © 2009 Association for Computational Linguistics.

Keywords: Surveys

Skill characterization based on betweenness

22nd Annual Conference on Neural Information Processing Systems, NIPS 2008

Authors: Şimşek, Özgür; Barto, Andrew G. Affiliations: Department of Computer Science, University of Massachusetts, Amherst, MA 01003, United States; Max Planck Institute for Human Development, Center for Adaptive Behavior and Cognition, Berlin, Germany

ISBN: (print) 9781605609492

We present a characterization of a useful class of skills based on a graphical representation of an agent's interaction with its environment. Our characterization uses betweenness, a measure of centrality on graphs. It captures and generalizes (at least intuitively) the bottleneck concept, which has inspired many of the existing skill-discovery algorithms. Our characterization may be used directly to form a set of skills suitable for a given task. More importantly, it serves as a useful guide for developing incremental skill-discovery algorithms that do not rely on knowing or representing the interaction graph in its entirety.

Keywords: Artificial intelligence
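
To make the betweenness measure in the abstract above concrete, here is a minimal sketch that computes betweenness centrality on a small unweighted, undirected graph using Brandes' algorithm. The two-room graph with a single "door" node is a made-up example of a bottleneck, and the function name `betweenness` is an assumption; the abstract does not say how the authors compute or use the scores beyond characterizing skills with them.

```python
from collections import deque


def betweenness(graph):
    """Betweenness centrality for an unweighted, undirected graph.

    `graph` maps each node to a list of its neighbours.
    Uses Brandes' accumulation scheme with one BFS per source node.
    """
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []                          # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}      # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}       # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:             # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                        # back-propagate dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical environment: two fully connected "rooms" joined by one doorway.
rooms = {
    "a1": ["a2", "a3"], "a2": ["a1", "a3"], "a3": ["a1", "a2", "door"],
    "door": ["a3", "b1"],
    "b1": ["door", "b2", "b3"], "b2": ["b1", "b3"], "b3": ["b1", "b2"],
}
scores = betweenness(rooms)
print(max(scores, key=scores.get))
```

In this toy graph every shortest path between the two rooms passes through the doorway, so it receives the highest score, which matches the intuition of a bottleneck state.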

Multi-class confidence weighted algorithms

2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009

Authors: Crammer, Koby; Dredze, Mark; Kulesza, Alex. Affiliations: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, United States; Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD 21211, United States

The recently introduced online confidence-weighted (CW) learning algorithm for binary classification performs well on many binary NLP tasks. However, for multi-class problems CW learning updates and inference cannot be computed analytically or solved as convex optimization problems as they are in the binary case. We derive learning algorithms for the multi-class CW setting and provide extensive evaluation using nine NLP datasets, including three derived from the recently released New York Times corpus. Our best algorithm outperforms state-of-the-art online and batch methods on eight of the nine tasks. We also show that the confidence information maintained during learning yields useful probabilistic information at test time. © 2009 ACL and AFNLP.

Keywords: Natural language processing systems

Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus

2009 Conference on Empirical Methods in Natural Language Processing, EMNLP 2009, Held in Conjunction with ACL-IJCNLP 2009

Authors: Mohammad, Saif; Dunne, Cody; Dorr, Bonnie. Affiliations: Laboratory for Computational Linguistics and Information Processing, University of Maryland, United States; Human-Computer Interaction Lab., University of Maryland, United States; Institute for Advanced Computer Studies, University of Maryland, United States; Department of Computer Science, University of Maryland, United States; Human Language Technology Center of Excellence, United States

Sentiment analysis often relies on a semantic orientation lexicon of positive and negative words. A number of approaches have been proposed for creating such lexicons, but they tend to be computationally expensive, and usually rely on significant manual annotation and large corpora. Most of these methods use WordNet. In contrast, we propose a simple approach to generate a high-coverage semantic orientation lexicon, which includes both individual words and multi-word expressions, using only a Roget-like thesaurus and a handful of affixes. Further, the lexicon has properties that support the Pollyanna Hypothesis. Using the General Inquirer as gold standard, we show that our lexicon has 14 percentage points more correct entries than the leading WordNet-based high-coverage lexicon (SentiWordNet). In an extrinsic evaluation, we obtain significantly higher performance in determining phrase polarity using our thesaurus-based lexicon than with any other. Additionally, we explore the use of visualization techniques to gain insight into our algorithm beyond the evaluations mentioned above. © 2009 ACL and AFNLP.

Keywords: Thesauri
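
The combination of "a handful of affixes" and a thesaurus described in the abstract above can be illustrated with a rough sketch. Everything concrete here is an assumption: the prefix list, the toy vocabulary, the thesaurus groups, the majority-vote rule, and the function names `affix_seeds` and `label_groups`. The abstract does not give these details, so this shows the general idea only.

```python
from collections import Counter

# Hypothetical list of negating prefixes; the abstract does not enumerate them.
NEGATING_PREFIXES = ("dis", "un", "im", "in")


def affix_seeds(vocabulary):
    """Seed polarities from overtly marked pairs such as honest/dishonest."""
    vocab = set(vocabulary)
    seeds = {}
    for word in vocab:
        for prefix in NEGATING_PREFIXES:
            marked = prefix + word
            if marked in vocab:
                seeds[word] = "positive"    # unmarked member of the pair
                seeds[marked] = "negative"  # overtly marked member
    return seeds


def label_groups(groups, seeds):
    """Give each thesaurus group the majority polarity of its seed words."""
    lexicon = dict(seeds)
    for group in groups:
        votes = Counter(seeds[w] for w in group if w in seeds)
        ranked = votes.most_common()
        if not ranked:
            continue                        # no seed word in this group
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            continue                        # tie: leave the group unlabeled
        for w in group:
            lexicon.setdefault(w, ranked[0][0])
    return lexicon


vocabulary = ["honest", "dishonest", "happy", "unhappy",
              "truthful", "deceitful", "glad"]
thesaurus_groups = [["honest", "truthful"],
                    ["dishonest", "deceitful"],
                    ["happy", "glad"]]
lexicon = label_groups(thesaurus_groups, affix_seeds(vocabulary))
print(lexicon["truthful"], lexicon["deceitful"], lexicon["glad"])
```

Words such as "truthful" never appear in a marked pair themselves; they inherit a label only because they share a thesaurus group with a seed word.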

Evaluating an intelligent tutoring system for making legal arguments with hypotheticals

International Journal of Artificial Intelligence in Education, 2009, vol. 19, no. 4, pp. 401-424

Authors: Pinkwart, Niels; Ashley, Kevin; Lynch, Collin; Aleven, Vincent. Affiliations: Department of Informatics, Clausthal University of Technology, Clausthal-Zellerfeld, Germany; Learning Research and Development Center, School of Law and Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States; Learning Research and Development Center, Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States; Human-Computer Interaction Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States

Argumentation is a process that occurs often in ill-defined domains and that helps deal with the ill-definedness. Typically a notion of "correctness" for an argument in an ill-defined domain is impossible to define or verify formally because the underlying concepts are open-textured and the quality of the argument may be subject to discussion or even expert disagreement. Previous research has highlighted the advantages of graphical representations for learning argumentation skills. A number of intelligent tutoring systems have been built that support students in rendering arguments graphically, as they learn argumentation skills. The relative instructional benefits of graphical argument representations have not been reliably shown, however. In this paper we present a formative evaluation of LARGO (Legal ARgument Graph Observer), a system that enables law students to represent graphically examples of legal interpretation with hypotheticals they observe while reading texts of U.S. Supreme Court oral arguments. We hypothesized that, compared to a text-based alternative, LARGO's diagramming language geared toward depicting hypothetical reasoning processes, coupled with non-directive feedback, helps students better extract the important information from argument transcripts and better learn argumentation skills. A first pilot study, conducted with volunteer first-semester law students, provided support for the hypothesis. The system especially helped lower-aptitude students learn argumentation skills, and LARGO improved the reading skills of students as they studied expert arguments. A second study with LARGO was conducted as a mandatory part of a first-semester University law course. Although there were no differences in the learning outcomes of the two conditions, the second study showed some evidence that those students who engaged more with the argument diagrams through the advice did better than the text condition. One lesson learned from these two studies is that gr

Keywords: Students

Adaptive regularization of weight vectors

23rd Annual Conference on Neural Information Processing Systems, NIPS 2009

Authors: Crammer, Koby; Kulesza, Alex; Dredze, Mark. Affiliations: Department of Electrical Engineering, Technion, Haifa 32000, Israel; Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104, United States; Human Language Tech. Center of Excellence, Johns Hopkins University, Baltimore, MD 21211, United States

ISBN: (print) 9781615679119

We present AROW, a new online learning algorithm that combines several useful properties: large margin training, confidence weighting, and the capacity to handle non-separable data. AROW performs adaptive regularization of the prediction function upon seeing each new instance, allowing it to perform especially well in the presence of label noise. We derive a mistake bound, similar in form to the second order perceptron bound, that does not assume separability. We also relate our algorithm to recent confidence-weighted online learning techniques and show empirically that AROW achieves state-of-the-art performance and notable robustness in the case of non-separable data.

Keywords: E-learning
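
The abstract above describes AROW only in words, so the following is a hedged sketch of what a confidence-weighted online update of this kind can look like. It uses a diagonal-covariance simplification (one variance per feature) rather than a full covariance matrix, and the update formulas are written as the algorithm is commonly stated, not taken from the abstract. The hyperparameter `r`, the function names, and the toy data are assumptions.

```python
def arow_train(examples, dim, r=1.0):
    """One pass of an AROW-style update with a diagonal covariance.

    `examples` is a list of (features, label) with label in {+1, -1}.
    `mu` holds the mean weights; `sigma` holds a per-feature variance
    that shrinks as a feature is observed, so later updates are smaller.
    """
    mu = [0.0] * dim
    sigma = [1.0] * dim
    for x, y in examples:
        margin = y * sum(m * xi for m, xi in zip(mu, x))
        if margin >= 1.0:
            continue                         # no hinge loss, no update
        # Confidence term: variance of the margin under the current sigma.
        v = sum(s * xi * xi for s, xi in zip(sigma, x))
        beta = 1.0 / (v + r)
        alpha = (1.0 - margin) * beta
        for i, xi in enumerate(x):
            mu[i] += alpha * y * sigma[i] * xi          # move the mean
            sigma[i] -= beta * sigma[i] ** 2 * xi * xi  # reduce the variance
    return mu, sigma


def predict(mu, x):
    """Sign of the linear score under the mean weight vector."""
    return 1 if sum(m * xi for m, xi in zip(mu, x)) >= 0 else -1


# Hypothetical toy data: feature 0 indicates +1, feature 1 indicates -1.
data = [([1.0, 0.0], 1), ([0.0, 1.0], -1), ([1.0, 0.2], 1), ([0.1, 1.0], -1)]
mu, sigma = arow_train(data, dim=2)
print([predict(mu, x) for x, _ in data])
```

Because the update is applied whenever the margin is below 1 and is damped by `r`, a single mislabeled example shifts the weights without forcing them to fit it exactly, which is the intuition behind the robustness to label noise mentioned in the abstract.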

Audio segmentation for speech recognition using segment features

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: David Rybach; Christian Gollan; Ralf Schluter; Hermann Ney. Affiliation: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, Germany

Audio segmentation is an essential preprocessing step in several audio processing applications, with a significant impact on, e.g., speech recognition performance. We introduce a novel framework which combines the advantages of different well-known segmentation methods. An automatically estimated log-linear segment model is used to determine the segmentation of an audio stream in a holistic way by a maximum a posteriori decoding strategy, instead of classifying change points locally. A comparison to other segmentation techniques in terms of speech recognition performance is presented, showing a promising segmentation quality of our approach.

Keywords: Speech recognition; Streaming media; Decoding; Broadcasting; Loudspeakers; Automatic speech recognition; Bayesian methods; Humans; Natural languages; Pattern recognition

Toward machine translation with statistics and syntax and semantics

IEEE Workshop on Automatic Speech Recognition and Understanding

Author: Dekai Wu. Affiliation: Department of Computer Science & Engineering, Human Language Technology Center, Hong Kong University of Science and Technology, Hong Kong, China

ISBN: (print) 9781424454785

In this paper, we survey some central issues in the historical, current, and future landscape of statistical machine translation (SMT) research, taking as a starting point an extended three-dimensional MT model space. We posit a socio-geographical conceptual disparity hypothesis that aims to explain why language pairs like Chinese-English have presented MT with so much more difficulty than others. The evolution from simple token-based to segment-based to tree-based syntactic SMT is sketched. For tree-based SMT, we consider language bias rationales for selecting the degree of compositional power within the hierarchy of expressiveness for transduction grammars (or synchronous grammars). This leads us to inversion transductions and the ITG model prevalent in current state-of-the-art SMT, along with the underlying ITG hypothesis, which posits a language universal. Against this backdrop, we enumerate a set of key open questions for syntactic SMT. We then consider the more recent area of semantic SMT. We list principles for successful application of sense disambiguation models to semantic SMT, and describe early directions in the use of semantic role labeling for semantic SMT.

Keywords: Statistics; Surface-mount technology; Space technology; Machine learning; Humans; Computer science; Labeling; Hardware; Speech recognition; Pattern recognition
