Search Results - Inner Mongolia University Library

Textual resource acquisition and engineering

IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2012, Vol. 56, No. 3-4, pp. 4:1-4:11

Authors: Chu-Carroll, J.; Fan, J.; Schlaefer, N.; Zadrozny, W. Affiliations: IBM Corp Div Res Thomas J Watson Res Ctr Yorktown Hts NY 10598 USA; Carnegie Mellon Univ Sch Comp Sci Pittsburgh PA 15213 USA

A key requirement for high-performing question-answering (QA) systems is access to high-quality reference corpora from which answers to questions can be hypothesized and evaluated. However, the topic of source acquisition and engineering has received very little attention so far. This is because most existing systems were developed under organized evaluation efforts that included reference corpora as part of the task specification. The task of answering Jeopardy!(TM) questions, on the other hand, does not come with such a well-circumscribed set of relevant resources. Therefore, it became part of the IBM Watson (TM) effort to develop a set of well-defined procedures to acquire high-quality resources that can effectively support a high-performing QA system. To this end, we developed three procedures, i.e., source acquisition, source transformation, and source expansion. Source acquisition is an iterative development process of acquiring new collections to cover salient topics deemed to be gaps in existing resources based on principled error analysis. Source transformation refers to the process in which information is extracted from existing sources, either as a whole or in part, and is represented in a form that the system can most easily use. Finally, source expansion attempts to increase the coverage in the content of each known topic by adding new information as well as lexical and syntactic variations of existing information extracted from external large collections. In this paper, we discuss the methodology that we developed for IBM Watson for performing acquisition, transformation, and expansion of textual resources. We demonstrate the effectiveness of each technique through its impact on candidate recall and on end-to-end QA performance.

Keywords: QUESTION-answering systems; TEXT Retrieval Conference; WATSON (Computer); TRANSMISSION of texts; ITERATIVE methods (Mathematics); ENCYCLOPEDIAS & dictionaries; web databases

On evaluating an approach for balancing the trade-off on XML schema design

INTERNATIONAL JOURNAL OF WEB INFORMATION SYSTEMS, 2012, Vol. 8, No. 4, pp. 371-389

Authors: Schroeder, Rebeca; Duarte, Denio; dos Santos Mello, Ronaldo. Affiliations: Univ Fed Parana Dept Informat Curitiba Parana Brazil; Fed Univ Fronteira Sul Dept Comp Sci Chapeco Brazil; Univ Fed Santa Catarina Dept Informat & Stat Florianopolis SC Brazil

Purpose - Designing efficient XML schemas is essential for XML applications that manage semi-structured data. When generating XML schemas, there are two opposing goals: avoiding redundancy and providing connected structures in order to achieve good query performance. In general, highly connected XML structures allow data redundancy, and redundancy-free schemas generate disconnected XML structures. The purpose of this paper is to describe and experimentally evaluate an approach that balances this trade-off through workload analysis. Additionally, it aims to identify the most accessed data based on the workload and suggest indexes to improve access performance. Design/methodology/approach - The paper applies and evaluates a workload-aware methodology to provide indexing and highly connected structures for data that are intensively accessed through paths traversed by the workload. Findings - The paper presents benchmarking results on a set of design approaches for XML schemas and demonstrates that the XML schemas generated by the approach provide high query performance and a low cost of data redundancy while balancing the trade-off in XML schema design. Research limitations/implications - Although an XML benchmark is applied in these experiments, further experiments are expected in a real-world application. Practical implications - The proposed approach may be applied in a real-world process for designing new XML databases as well as in a reverse engineering process to improve XML schemas from legacy databases. Originality/value - Unlike related work, the reported approach integrates the two opposing goals of XML schema design and generates suitable schemas according to a workload. An experimental evaluation shows that the proposed methodology is promising.

Keywords: web databases; Performance of web applications; XML logical design; Workload; Workload-driven methodology; Indexing and retrieval of XML data; Information retrieval; Extensible markup language

Databases on the Web: national web domain survey

15th International Database Engineering and Applications Symposium (IDEAS)

Author: Shestakov, Denis. Affiliation: Aalto Univ Dept Media Technol Konemiehentie 2 Espoo 02150 Finland

ISBN: (print) 9781450306270

The deep web, the part of the web consisting of web pages filled with information from myriads of online databases, is to date relatively unexplored. Even its basic characteristics such as, for instance, the number of searchable databases on the web are disputable. In this paper, we address the problem of accurate estimation of the deep web by sampling one national web domain. We report some of our results obtained when surveying the Russian web. The survey findings, namely the size estimates of the deep web, could be useful for further studies to handle data in the deep web.

Keywords: web databases; deep web; web characterization; web measurement; national web; structured data; cluster random sampling; virtual hosting
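
The estimation idea in the abstract above, inferring the number of web databases in a domain from a sample, can be illustrated with a simple proportion estimate. This is only a sketch under stated assumptions: it uses plain simple random sampling of hosts rather than the cluster random sampling named in the keywords, and the function name `estimate_total` and every number passed to it are hypothetical, not figures from the survey.

```python
import math


def estimate_total(population, sample_size, hits):
    """Scale the sample hit rate up to the whole population.

    population  -- number of hosts in the domain (assumed known)
    sample_size -- number of hosts inspected
    hits        -- inspected hosts found to expose a searchable database
    Returns (estimate, standard_error); the error term ignores the
    finite-population correction.
    """
    if not 0 < sample_size <= population or not 0 <= hits <= sample_size:
        raise ValueError(
            "need 0 < sample_size <= population and 0 <= hits <= sample_size"
        )
    p = hits / sample_size
    estimate = population * hits / sample_size
    std_error = population * math.sqrt(p * (1 - p) / sample_size)
    return estimate, std_error
```

Calling `estimate_total(1000, 100, 5)` scales a 5% hit rate in the sample to the full population; the standard error gives a rough sense of how far the estimate could be off.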

Web data management

2012

Authors: Serge Abiteboul ... [et al.].

WordPress Top Plugins

2010

Author: Brandon Corbin

ISBN: (digital) 9781849511414

ISBN: (print) 9781849511407

Time flies when you're having fun. This is the right way to describe this WordPress Top Plugins book by Brandon Corbin. With real world examples and by showing you the perks of having these plugins installed on your websites, the author is all set to captivate your interest from start to end. Regardless of whether this is your first time working with WordPress, or you’re a seasoned WordPress coding ninja, WordPress Top Plugins will walk you through finding and installing the best plugins for generating and sharing content, building communities and reader base, and generating real advertising revenue.

Keywords: web databases; Database management

PopulusLog: People Information Database

24th International Symposium on Computer and Information Sciences

Authors: Cakmak, Ali; Kirac, Mustafa; Ozsoyoglu, Gultekin. Affiliation: Case Western Reserve Univ Dept Elect Engn & Comp Sci Cleveland OH 44106 USA

ISBN: (print) 9781424450213

Information about individuals on publicly available web sites stands as a valuable, yet unorganized, data source. Turning such an enormous data source into a "database" is highly desirable as it has the potential to lead to novel ways of using the available information to the largest extent. In this paper, we present PopulusLog, a novel web data mining system. PopulusLog is a pioneering example of next generation search engines which produces and provides access to non-intuitive knowledge on the web. It involves a framework for tools that collect, extract, mine, query, browse, and visualize information about anonymous people.

Keywords: information extraction; search engines; machine learning; web databases; entity tagging

LABRADOR: Efficiently publishing relational databases on the web by using keyword-based query interfaces

INFORMATION PROCESSING & MANAGEMENT, 2007, Vol. 43, No. 4, pp. 983-1004

Authors: Mesquita, Filipe; da Silva, Altigran S.; de Moura, Edleno S.; Calado, Pavel; Laender, Alberto H. F. Affiliations: Univ Fed Amazonas Dept Comp Sci BR-69077000 Manaus Amazonas Brazil; IST INESC ID Oporto Portugal; Univ Fed Minas Gerais Dept Comp Sci BR-31270901 Belo Horizonte MG Brazil

A vast amount of valuable information, produced and consumed by people and institutions, is currently stored in relational databases. For many purposes, there is an ever increasing demand for having these databases published on the web, so that users can query the data available in them. An important requirement for this to happen is that query interfaces must be as simple and intuitive as possible. In this paper we present LABRADOR, a system for efficiently publishing relational databases on the web by using a simple text box query interface. The system operates by taking an unstructured keyword-based query posed by a user and automatically deriving an equivalent SQL query that fits the user's information needs, as expressed by the original query. The SQL query is then sent to a DBMS and its results are processed by LABRADOR to create a relevance-based ranking of the answers. Experiments we present show that LABRADOR can automatically find the most suitable SQL query in more than 75% of the cases, and that the overhead introduced by the system in the overall query processing time is almost insignificant. Furthermore, the system operates in a non-intrusive way, since it requires no modifications to the target database schema. (c) 2006 Elsevier Ltd. All rights reserved.

Keywords: keyword-based queries; web databases; Bayesian networks
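
The keyword-to-SQL step described in the abstract above can be sketched roughly as follows. This is not LABRADOR's method: the abstract and keywords point to a Bayesian network model for choosing and ranking queries, whereas this sketch simply counts substring matches per text column. The function names `candidate_queries` and `demo`, and the toy schema, are assumptions made for illustration.

```python
import sqlite3


def candidate_queries(conn, keyword):
    """Return (table, column, hits) for each TEXT column containing keyword.

    Candidates are ordered by hit count, a crude stand-in for relevance.
    """
    results = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'").fetchall()]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        for _, name, ctype, *_ in columns:
            if ctype.upper() != "TEXT":
                continue
            # Identifiers come from the schema itself; the keyword is bound
            # as a parameter so user input never enters the SQL text.
            sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{name}" LIKE ?'
            hits = conn.execute(sql, (f"%{keyword}%",)).fetchone()[0]
            if hits:
                results.append((table, name, hits))
    return sorted(results, key=lambda r: -r[2])


def demo():
    """Build a toy in-memory database and look up one keyword."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (title TEXT, publisher TEXT)")
    conn.execute("CREATE TABLE author (name TEXT)")
    conn.executemany(
        "INSERT INTO book VALUES (?, ?)",
        [("Web Data Management", "Cambridge"),
         ("Web Databases", "Example Press")],
    )
    conn.execute("INSERT INTO author VALUES ('Serge Abiteboul')")
    return candidate_queries(conn, "Web")
```

Like the system in the abstract, the sketch only reads the existing schema and requires no changes to the target database.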

We the curators

NATURE METHODS, 2008, Vol. 5, No. 9, pp. 754-755

Author: Doerr, Allison

Two groups describe wiki platforms for community-based curation of gene annotations or biological pathways.

Keywords: GENETICS -- Computer network resources; WIKIS (Computer science); web databases; DATABASE searching; ELECTRONIC information resources; INFORMATION science

A knowledge management system for Chinese language arts teachers

BRITISH JOURNAL OF EDUCATIONAL TECHNOLOGY, 2008, Vol. 39, No. 5, pp. 935-943

Authors: Lin, Janet Mei-Chuen; Wang, Pei-Yu. Affiliation: Natl Taiwan Normal Univ Grad Inst Informat & Comp Educ Taipei Taiwan

The article discusses a knowledge management system which has been developed for Chinese language arts teachers. Details about the system and the subject of Chinese language arts are provided. The complexity of Chinese language arts textbooks is examined and the management program is presented as a way to effectively organize class material. A diagram illustrating the schema of the database program is presented and several images illustrating the usefulness and functionality of the program are also provided.

Keywords: CHINESE language -- Data processing; KNOWLEDGE management; web databases; LANGUAGE arts; EDUCATIONAL technology; LANGUAGE arts teachers

Research on Middleware of Automatic Finding and Integration of Deep Web Query Interface

International Seminar on Future Information Technology and Management Engineering

Authors: Lin, Peiguang; Lv, Chao; Jin, Ku. Affiliation: Shandong Univ Finance Sch Comp & Informat Engn Jinan 250014 Peoples R China

ISBN: (print) 9780769534800

In this paper, Deep Web technologies are analyzed and discussed, and a middleware for automatically finding and integrating Deep Web query interfaces is proposed. The middleware extracts the attributes of query interfaces and determines whether they are interfaces to web databases by computing the similarity between them; it can also cluster query interfaces and construct an integrated query interface. The middleware thus provides a practical tool for automatically finding query interfaces and constructing integrated query interfaces.

Keywords: deep web integration query interface deep web Middleware web databases AUTOMATIC FINDINGS Interface Interfaces
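
The similarity and clustering steps mentioned in the abstract above can be sketched as follows. The abstract does not say which similarity measure or clustering procedure the middleware uses, so Jaccard similarity over attribute names and a single-pass greedy grouping are assumptions chosen for brevity; the names `jaccard` and `cluster`, the threshold value, and the sample interfaces are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of attribute names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(interfaces, threshold=0.5):
    """Greedily group query interfaces whose attributes overlap enough.

    interfaces -- mapping of interface name to its list of attribute names
    Each cluster keeps the union of its members' attributes, which can
    serve as a naive integrated query interface for that group.
    """
    clusters = []
    for name, attrs in interfaces.items():
        for c in clusters:
            if jaccard(attrs, c["attrs"]) >= threshold:
                c["members"].append(name)
                c["attrs"] |= set(attrs)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append({"members": [name], "attrs": set(attrs)})
    return clusters
```

Because each interface is compared against a cluster's merged attribute set, the result depends on input order; a production system would likely use a more robust clustering method.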
