Search Results - Inner Mongolia University Library

WSEMQT: a novel approach for quality-based evaluation of web data sources for a data warehouse

IET SOFTWARE, 2020, Vol. 14, No. 7, pp. 806-815

Authors: Bhutani, Priyanka; Saha, Anju; Gosain, Anjana (GGS Indraprastha Univ, USIC&T, Sec 16C, New Delhi, India)

The incorporation of suitable external data from the World Wide Web offers an effective solution for enriching the data in the data warehouse (DW). However, the main challenge is the quality-aware selection of web data sources to maintain the quality of the DW. In previous works, the quality of web sources is evaluated through expert evaluation only, which makes it a very lengthy process. Also, since the quality model consists of mixed quality factors from the diverse domains of the web, the DW and the underlying business, finding an expert with expertise in all these domains is a huge bottleneck in the evaluation process. To overcome these issues, this study proposes a novel multi-level approach, web source evaluation with multi-criteria decision-making and web quality testing tools (WSEMQT), and an underlying quality model, the web quality model, for evaluating web sources for the DW. The authors introduce automated web source quality evaluation at the first level, web source based evaluation, and multiple dimensions of quality evaluation at the second level, expert-based evaluation. At both levels, multi-criteria decision-making methods are applied to the evaluation scores obtained to produce the ranked list of web sources. The authors present a real-world academic web data case study which shows that the proposed approach can be executed successfully for real-world problems.

Keywords: Internet; decision making; web sites; data warehouses; data warehouse; World Wide Web; quality-aware selection; web data sources; mixed quality factors; evaluation process; expert-based evaluation; multicriteria decision-making methods; evaluation scores; quality-based evaluation; external data; multilevel approach web source evaluation; real-world academic web data case study; web source based evaluation; quality model web quality model; web quality testing tools; WSEMQT; automated web source quality evaluation

Empirical Validation of WebQMDW Model for Quality-based External Web Data Source Incorporation in a Data Warehouse

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, Vol. 12, No. 8, pp. 206-215

Authors: Bhutani, Priyanka; Saha, Anju; Gosain, Anjana (GGSIP Univ, USIC&T, New Delhi, India)

In recent years, the World Wide Web has emerged as the most promising external data source for organizations' data warehouses, providing valuable insights required in comprehensive decision making to gain a competitive edge. However, when a data warehouse uses external data sources from the web without quality evaluation, its quality can be adversely affected. Quality models have been proposed in the research literature to evaluate and select web data sources for integration into a data warehouse. However, these models have only been proposed conceptually and not empirically validated. Therefore, in this paper, the authors present the empirical validation conducted on a set of 57 subjects to thoroughly validate the set of 22 quality factors and the initial structure of the multi-level, multi-dimensional WebQMDW quality model. The validated and restructured WebQMDW model thus obtained can significantly enhance decision-making in the DW by selecting high-quality web data sources.

Keywords: data warehouse; external data sources; web data sources; quality evaluation model; quality model validation

Dealing with Data from Multiple Web Sources

24th Brazilian Symposium on Multimedia and the Web (WebMedia)

Authors: Batista, Natercia A.; Brandao, Michele A.; Pinheiro, Michele B.; Dalip, Daniel H.; Moro, Mirella M. (Univ Fed Minas Gerais, Belo Horizonte, MG, Brazil; Ctr Fed Educ Tecnol Minas Gerais, Belo Horizonte, MG, Brazil)

ISBN: (print) 9781450358675

Web data are heterogeneous and unstructured, which poses challenges for data crawling, integration and preprocessing. Different studies are "data-oriented" (i.e. based on the available data), but their results are restricted to their specific data. In contrast, for various problems one must first identify what data is needed to solve them, and often multiple data sources are needed. In this context, crawling, integrating and preprocessing data appropriately makes it possible to create datasets for solving such problems. Therefore, this short course addresses these three activities by discussing challenges and practical solutions.

Keywords: web data sources; practical perspective

DeXIN: An Extensible Framework for Distributed XQuery over Heterogeneous Data Sources

11th International Conference on Enterprise Information Systems

Authors: Ali, Muhammad Intizar; Pichler, Reinhard; Truong, Hong-Linh; Dustdar, Schahram (Vienna Univ Technol, Database & Artificial Intelligence Grp, Vienna, Austria; Vienna Univ Technol, Distributed Syst Grp, Vienna, Austria)

ISBN: (print) 9783642013461

In the Web environment, rich, diverse sources of heterogeneous and distributed data are ubiquitous. In fact, even the information characterizing a single entity (for example, the information related to a Web service) is normally scattered over various data sources using various languages such as XML, RDF, and OWL. Hence, there is a strong need for Web applications to handle queries over heterogeneous, autonomous, and distributed data sources. However, existing techniques do not provide sufficient support for this task. In this paper we present DeXIN, an extensible framework for providing integrated access over heterogeneous, autonomous, and distributed Web data sources, which can be utilized for data integration in modern Web applications and Service Oriented Architecture. DeXIN extends the XQuery language by supporting SPARQL queries inside XQuery, thus facilitating the querying of data modeled in XML, RDF, and OWL. DeXIN facilitates data integration in a distributed Web and Service Oriented environment by avoiding the transfer of large amounts of data to a central server for centralized data integration, and it removes the need to transform huge amounts of data into a common format for integrated access.

Keywords: data integration; distributed query processing; web data sources; heterogeneous data sources

Biological data integration: Wrapping data and tools

IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, 2002, Vol. 6, No. 2, pp. 123-128

Author: Lacroix, Z (Arizona State Univ, Tempe, AZ 85287, USA)

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, as well as data generated by software. We present an approach to wrapping Web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: first, it sends a query to the source to retrieve data and, second, it builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component, based on an intermediate object view mechanism called search views that maps the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, which perform these two tasks respectively. The originality of the approach consists of: 1) a generic view mechanism to seamlessly access data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.

Keywords: biological data integration; database view; eXtensible Markup Language (XML); mediation; web data sources
