Search Results - Inner Mongolia University Library

Role identification from free text using hidden Markov models

2nd Hellenic Conference on Artificial Intelligence

Authors: Sigletos, G; Paliouras, G; Karkaletsis, V. Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, N.C.S.R. “Demokritos”, Greece

ISBN: (print) 3540434720

In this paper we explore the use of hidden Markov models on the task of role identification from free text. Role identification is an important stage of the information extraction process, assigning roles to particular types of entities with respect to a particular event. Hidden Markov models (HMMs) have been shown to achieve good performance when applied to information extraction tasks in both semistructured and free text. The main contribution of this work is the analysis of whether and how linguistic processing of textual data can improve the extraction performance of HMMs. The emphasis is on the minimal use of computationally expensive linguistic analysis. The overall conclusion is that the performance of HMMs is still worse than an equivalent manually constructed system. However, clear paths for improvement of the method are shown, aiming at a method, which is easily adaptable to new domains.

Keywords: Hidden Markov models

6th European Conference on Case-Based Reasoning, ECCBR 2002

Authors: Patterson, David W.; Rooney, Niall; Galushka, Mykola. The Northern Ireland Knowledge Engineering Laboratory, School of Information and Software Engineering, University of Ulster at Jordanstown, Newtownabbey, County Antrim, United Kingdom

ISBN: (print) 3540441093

In this paper, we present three techniques for knowledge discovery in case-based reasoning. The first two techniques D-HS and D-HS+SR are concerned with the discovery of similarity knowledge and operate on an uncompacted case-base while the third technique D-HS+PSR is concerned with the discovery of both similarity and case knowledge and operates on a compacted case-base. All three techniques provide a very efficient and competent means of similarity determination in CBR, which are empirically shown to be up to 25 times faster than k-NN without any loss in competency. D-HS+PSR proposes a novel approach to automatically engineering compact case-bases with a minimal overhead to the system, compared to other approaches such as case deletion/addition. Additionally as the approach provides a means for automatically reducing the number of cases required in the case-base without any loss in problem solving competency it has the greatest implication of the three techniques for reducing the effects of the utility problem in CBR. © Springer-Verlag Berlin Heidelberg 2002.

Keywords: Case based reasoning

PatEdit: An information extraction pattern editor for fast system customization

3rd International Conference on Language Resources and Evaluation, LREC 2002

Authors: Farmakiotou, Dimitra; Karkaletsis, Vangelis; Koutsias, Ioannis; Petasis, George; Spyropoulos, Constantine D. Demokritos Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, P.O. BOX 60228, Aghia Paraskevi, Athens GR-15310, Greece

This paper addresses the problem of Information Extraction (IE) system customization to new domains and extraction needs with the use of PatEdit, an IE Pattern Editor. PatEdit is a human-assisted knowledge engineering tool that facilitates the production of IE patterns. First, we present the problem of IE system customization and the use of human-assisted knowledge engineering tools. Then, we describe PatEdit with respect to the IE pattern language used and discuss its characteristics that facilitate rapid pattern writing. Finally, the exploitation of PatEdit in two information extraction projects is presented along with our plans for future work.

Keywords: knowledge engineering

Ellogon: A new text engineering platform

3rd International Conference on Language Resources and Evaluation, LREC 2002

Authors: Petasis, Georgios; Karkaletsis, Vangelis; Paliouras, Georgios; Androutsopoulos, Ion; Spyropoulos, Constantine D. Demokritos Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, P.O. BOX 60228, Aghia Paraskevi, Athens GR-15310, Greece

This paper presents Ellogon, a multi-lingual, cross-platform, general-purpose text engineering environment. Ellogon was designed in order to aid both researchers in natural language processing, as well as companies that produce language engineering systems for the end-user. Ellogon provides a powerful TIPSTER-based infrastructure for managing, storing and exchanging textual data, embedding and managing text processing components as well as visualising textual data and their associated linguistic information. Among its key features are full Unicode support, an extensive multi-lingual graphical user interface, its modular architecture and the reduced hardware requirements.

Keywords: Graphical user interfaces

Stacking classifiers for anti-spam filtering of e-mail

2001 Conference on Empirical Methods in Natural Language Processing, EMNLP 2001

Authors: Sakkis, Georgios; Androutsopoulos, Ion; Paliouras, Georgios; Karkaletsis, Vangelis; Spyropoulos, Constantine D.; Stamatopoulos, Panagiotis. Department of Informatics, University of Athens, TYPA Buildings, Panepistimiopolis, Athens GR-157 71, Greece; Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research "Demokritos", Ag. Paraskevi, Athens GR-153 10, Greece

We evaluate empirically a scheme for combining classifiers, known as stacked generalization, in the context of anti-spam filtering, a novel cost-sensitive application of text categorization. Unsolicited commercial email, or "spam", floods mailboxes, causing frustration, wasting bandwidth, and exposing minors to unsuitable content. Using a public corpus, we show that stacking can improve the efficiency of automatically induced anti-spam filters, and that such filters can be used in real-life applications. © 2021 Empirical Methods in Natural Language Processing, EMNLP 2001.

Keywords: Electronic mail

User-driven navigation pattern discovery from internet data

International Workshop on Web Usage Analysis and User Profiling, WEBKDD 1999

Authors: Baumgarten, Matthias; Büchner, Alex G.; Anand, Sarabjot S.; Mulvenna, Maurice D.; Hughes, John G. Northern Ireland Knowledge Engineering Laboratory, University of Ulster, United Kingdom; School of Information and Software Engineering, University of Ulster, United Kingdom; MINEit Software Ltd.; Faculty of Informatics, University of Ulster, United Kingdom

ISBN: (print) 9783540678182

Managers of electronic commerce sites need to learn as much as possible about their customers and those browsing their virtual premises, in order to maximise the return on marketing expenditure. The discovery of marketing related navigation patterns requires the development of data mining algorithms capable of the discovery of sequential access patterns from web logs. This paper introduces a new algorithm called MiDAS that extends traditional sequence discovery with a wide range of web-specific features. Domain knowledge is described as flexible navigation templates that can specify generic navigational behaviour of interest, network structures for the capture of web site topologies, concept hierarchies and syntactic constraints. Unlike existing approaches MiDAS supports sequence discovery from multidimensional data, which allows the detection of sequences across monitored attributes, such as URLs and http referrers. Three methods for pruning the sequences, resulting in three different types of navigational behaviour are presented. The experimental evaluation has shown promising results in terms of functionality as well as scalability. © Springer-Verlag Berlin Heidelberg 2000.

Keywords: HTTP

Data mining and XML: current and future issues

International Conference on Web Information Systems engineering

Authors: A.G. Buchner; M. Baumgarten; M.D. Mulvenna; R. Bohm; S.S. Anand. MINEit Software Limited, Belfast, UK; Northern Ireland Knowledge Engineering Laboratory, University of Ulster, UK

This paper describes potential synergies between data mining and XML, which include the representation of discovered data mining knowledge, knowledge discovery from XML documents, XML-based data preparation and XML-ba...

ISBN: (print) 0769505775

Keywords: Data mining; XML; Predictive models; Web mining; Markup languages; knowledge engineering; Laboratories; Mining industry; Humans; Information analysis

Evaluating modeling efficiency of a specific software architecture

Systems and Computers in Japan, 2000, Vol. 31, No. 11, pp. 1-11

Authors: Kawakami, Masumi; Yoshida, Atsushi; Isoda, Sadahiro. Knowledge-Based Info. Engineering, Toyohashi University of Technology, Toyohashi 441-8580, Japan; Systems Development Laboratory, Hitachi Ltd.; Toyohashi University of Technology; Wakayama University; Nagoya University; University of Tokyo; Department of Computer Science, University of Illinois, Urbana, IL, United States; Spec. Interest Grp. on Software Eng., Info. Processing Society of Japan

Software architectural styles that represent structural characteristics of software programs range from specific ones that can be applied to a particular domain to generic ones that can be applied to any domain. If a specific architectural style is available for the target system to be developed, it is appropriate to apply it together with its associated modeling method. However, no quantitative evaluation of the efficiency of specific architectural styles has as yet been reported. This paper presents a quantitative comparison of two architectural styles: specific and generic software architectural styles. The comparison shows that a specific architectural style combined with its associated modeling method allows us to reduce modeling cost by as much as a few scores of percent compared with the generic one combined with its modeling method. The improvement results from the characteristics that (1) a specific software architectural style requires less rewriting of modeling diagrams due to its inherent basic structure and (2) there is less redundant information among modeling diagrams.

Keywords: Client server computer systems

Selectional restrictions in HPSG

Proceedings of the 18th conference on Computational linguistics - Volume 1

Authors: Ion Androutsopoulos; Robert Dale. Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research, Athens, Greece; Macquarie University, Sydney, NSW, Australia

ISBN: (print) 9781558607170

Selectional restrictions are semantic sortal constraints imposed on the participants of linguistic constructions to capture contextually-dependent constraints on interpretation. Despite their limitations, selectional restrictions have proven very useful in natural language applications, where they have been used frequently in word sense disambiguation, syntactic disambiguation, and anaphora resolution. Given their practical value, we explore two methods to incorporate selectional restrictions in the HPSG theory, assuming that the reader is familiar with HPSG. The first method employs HPSG's BACKGROUND feature and a constraint-satisfaction component pipe-lined after the parser. The second method uses subsorts of referential indices, and blocks readings that violate selectional restrictions during parsing. While theoretically less satisfactory, we have found the second method particularly useful in the development of practical systems.

An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages

Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval

Authors: Ion Androutsopoulos; John Koutsias; Konstantinos V. Chandrinos; Constantine D. Spyropoulos. Software and Knowledge Engineering Laboratory, Institute of Informatics and Telecommunications, National Centre for Scientific Research 'Demokritos', 153 10 Ag. Paraskevi, Athens, Greece

ISBN: (print) 9781581132267

The growing problem of unsolicited bulk e-mail, also known as “spam”, has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in “encrypted” form contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures, investigating at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the Naive Bayesian filter is compared, in terms of performance, to a filter that uses keyword patterns, and which is part of a widely used e-mail reader.

Keywords: evaluation (general); machine learning and IR; filtering/routing; text categorization; test collections
