Clustering of distributed databases facilitates knowledge discovery through learning of new concepts that characterise common features and differences between datasets. Hence, general patterns can be learned rather than restricting learning to specific databases from which rules may not be generalisable. We cluster databases that hold aggregate count data on categorical attributes that have been classified according to homogeneous or heterogeneous classification schemes. Clustering of datasets is carried out via the probability distributions that describe their respective aggregates. The homogeneous case is straightforward. For heterogeneous data we investigate a number of clustering strategies, of which the most efficient avoid the need to compute a dynamic shared ontology to homogenise the classification schemes prior to clustering. (c) 2004 Elsevier B.V. All rights reserved.
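The distribution-based comparison described in this abstract can be illustrated with a small sketch (ours, not the paper's algorithm): each database's aggregate counts over a shared (homogeneous) classification scheme are normalised to a probability distribution, and databases whose distributions lie within a Jensen-Shannon distance threshold are grouped together. The database names, counts, and threshold below are invented for illustration.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return math.sqrt((kl(p, m) + kl(q, m)) / 2)

def normalise(counts):
    total = sum(counts)
    return [c / total for c in counts]

def cluster(dbs, threshold=0.2):
    """Greedy single-link grouping of databases by distributional similarity."""
    clusters = []
    for name, counts in dbs.items():
        p = normalise(counts)
        for c in clusters:
            if any(js_distance(p, q) < threshold for _, q in c):
                c.append((name, p))
                break
        else:
            clusters.append([(name, p)])
    return [[name for name, _ in c] for c in clusters]

# Aggregate counts over the same three categories in each database.
dbs = {
    "db1": [90, 5, 5],
    "db2": [85, 10, 5],
    "db3": [10, 10, 80],
}
print(cluster(dbs))  # [['db1', 'db2'], ['db3']]
```

The heterogeneous case in the paper is harder precisely because the count vectors are not directly comparable until the classification schemes are reconciled.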
Privacy preservation in distributed databases is an active area of research. With the advancement of technology, massive amounts of data are continuously being collected and stored in distributed database applications. Temporal associations and correlations among items in the large transactional datasets of a distributed database can inform many business decision-making processes; mining frequent itemsets and computing their association rules is therefore a nontrivial issue. In a typical situation, multiple parties may wish to collaborate to extract interesting global information, such as frequent associations, without revealing their respective data to each other. This is particularly useful in applications such as retail market-basket analysis, medical research, and academia. The proposed work aims to find frequent items and to develop a global association-rule model based on a genetic algorithm (GA). The GA is used for its robustness with respect to local maxima/minima and its domain-independent large-space search, which finds exact or approximate solutions to optimization and search problems. For privacy preservation of the data, the concept of a trusted third party with two offsets is used: the data are first anonymized at the local party end, and the trusted third party then performs the aggregation and computes the global associations. The proposed algorithms address horizontal, vertical, and arbitrary partitions.
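The offset-based anonymisation can be sketched roughly as follows (a hypothetical simplification: the paper's exact two-offset protocol is not reproduced, and the party names and counts are invented). Each party masks the local support count of an itemset with a random offset before sending it, so only the trusted third party, which learns the offsets, can recover the aggregate:

```python
import random

# Hypothetical sketch of offset-based masking: each party hides its local
# itemset support count behind a random offset; the trusted third party
# removes the offsets and recovers only the global support.

def mask_count(local_count, rng):
    offset = rng.randrange(1_000_000)   # random offset hides the true count
    return local_count + offset, offset

def global_support(masked_counts, offsets):
    return sum(masked_counts) - sum(offsets)

rng = random.Random(42)
local_supports = {"party_A": 120, "party_B": 75, "party_C": 230}
masked, offsets = zip(*(mask_count(c, rng) for c in local_supports.values()))
print(global_support(masked, offsets))  # 425 = 120 + 75 + 230
```

The GA would then search the space of itemsets using such aggregated supports as its fitness signal, without any party seeing another party's raw counts.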
The problem of connecting together a number of different databases to produce an integrated information system has attracted a considerable amount of attention over the years and various approaches have been developed to handle this. However, the general problem of gathering related information from a number of existing heterogeneous databases is complex because of the differences in representation and meaning of data in different data sets. Many different approaches have been described to resolve this problem, and some prototype systems built. However, it is difficult to compare the effectiveness of different approaches and prototypes. This paper is aimed at addressing the specific issue of assessing the generality of different approaches. To this end it presents a framework for classifying the differences between data in different databases and a test-suite which can be used to evaluate and compare the extent to which different approaches handle different aspects of this heterogeneity. (C) 2000 Elsevier Science B.V. All rights reserved.
For distributed databases, checkpointing is used to ensure an efficient way to perform global reconstruction. However, the need for global reconstruction is infrequent. Most current checkpointing approaches for distributed databases are too expensive during run time. Some of them allow the checkpointing process to run in parallel with normal transactions at the cost of more data and resource contention, which in turn causes longer response time for normal transactions. Thus, an efficient way to checkpoint distributed databases is needed to avoid degrading the system performance. This paper presents a low-cost solution, called Loosely Synchronized Local Fuzzy Checkpointing (LSLFC), to these problems. LSLFC supports global reconstruction, and our performance study shows that LSLFC has little overhead during run time.
An integrated approach to concurrency control adaptively allows classical pessimistic (two-phase locking) or optimistic (using certification) approaches. The principles for a distributed integrated method controlling both locking and optimistic transactions are defined. The implementation of these principles leads to a method for constructing the serialization order of transactions, using their conflicts. This dynamic construction prevents the systematic rejection of old (long) readers, as in the multiversion methods. On the other hand, applying Thomas' rule to control the write conflicts permits the presence of old (long) writers.
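Thomas' write rule mentioned above can be sketched in a few lines (a generic timestamp-ordering illustration, not the paper's integrated method): a write arriving with a timestamp older than the item's current write timestamp is simply ignored rather than aborted, which is what lets old (long) writers survive.

```python
# Generic timestamp-ordering sketch of Thomas' write rule (illustrative only).
class Item:
    def __init__(self):
        self.value = None
        self.write_ts = 0   # timestamp of the latest applied write
        self.read_ts = 0    # timestamp of the latest read

def timestamped_write(item, ts, value):
    if ts < item.read_ts:
        return "abort"      # a later transaction already read the old value
    if ts < item.write_ts:
        return "ignored"    # Thomas' rule: obsolete write is skipped, not rejected
    item.value, item.write_ts = value, ts
    return "applied"

x = Item()
print(timestamped_write(x, 5, "a"))  # applied
print(timestamped_write(x, 3, "b"))  # ignored (old writer tolerated)
print(x.value)                       # a
```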
Intelligent routing control is defined as the process in which the network interrogates the databases containing the relationships between logical numbers, such as personal or information identifiers, and physical addresses in the transport network to find the terminal having the information required to process a user request. The routing control system presented uses distributed databases, each of which manages a switching system and all of which are connected through high-speed signalling networks separate from the transport network. If the requested physical address cannot be found in one database, search requests are distributed at the same time to all other databases. For up to 100 million subscribers, the routing control system can find a physical address within 1 s when each database uses ten memories accessed at 200 ns with an interdatabase linkage speed of 14 Mb/s.
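A back-of-envelope check of the quoted figures (our own arithmetic, not taken from the paper, and the parallel-search model is a simplifying assumption): ten memories cycling at 200 ns give one database 5 × 10⁷ accesses per second, so fanning the search out to two or more databases at once brings a 100-million-entry scan under the 1 s target.

```python
# Back-of-envelope arithmetic on the quoted figures; the parallel-scan
# model is our simplifying assumption, not the paper's exact design.
memories = 10
cycle_time_s = 200e-9                       # 200 ns per memory access
accesses_per_sec = memories / cycle_time_s
print(accesses_per_sec)                     # 50000000.0 accesses/s per database

subscribers = 100_000_000
scan_time_one_db = subscribers / accesses_per_sec
print(scan_time_one_db)                     # 2.0 s if a single database scanned all
# Distributing the search to all databases simultaneously, as the system does,
# divides this time by the number of databases searched in parallel.
```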
Distributed database performance is often unpredictable due to issues such as system complexity, network congestion, or imbalanced data distribution. These issues are difficult for users to assess in part due to the opaque mapping between declaratively specified queries and actual physical execution plans. Database developers currently must expend significant time and effort scanning log files to isolate and debug the root causes of performance issues. In response, we present Perfopticon, an interactive query profiling tool that enables rapid insight into common problems such as performance bottlenecks and data skew. Perfopticon combines interactive visualizations of (1) query plans, (2) overall query execution, (3) data flow among servers, and (4) execution traces. These views coordinate multiple levels of abstraction to enable detection, isolation, and understanding of performance issues. We evaluate our design choices through engagements with system developers, scientists, and students. We demonstrate that Perfopticon enables performance debugging for real-world tasks.
Distributed databases allow us to integrate data from different sources which have not previously been combined. In this article, we are concerned with the situation where the data sources are held in a distributed database. Integration of the data is then accomplished using the Dempster-Shafer representation of evidence. The weighted sum operator is developed and this operator is shown to provide an appropriate mechanism for the integration of such data. This representation is particularly suited to statistical samples which may include missing values and be held at different levels of aggregation. Missing values are incorporated into the representation to provide lower and upper probabilities for propositions of interest. The weighted sum operator facilitates combination of samples with different classification schemes. Such a capability is particularly useful for knowledge discovery when we are searching for rules within the concept hierarchy, defined in terms of probabilities or associations. By integrating information from different sources, we may thus be able to induce new rules or strengthen rules which have already been obtained. We develop a framework for describing such rules and show how we may then integrate rules at a high level without having to resort to the raw data, a useful facility for knowledge discovery where efficiency is of the essence. (C) 1997 John Wiley & Sons, Inc.
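The lower/upper-probability idea can be illustrated with a toy example (hypothetical counts and category names; the weighted sum operator itself is not reproduced here). Records whose category is missing contribute mass to the whole frame of discernment, so a proposition's probability is bracketed between a belief (lower) and a plausibility (upper) value:

```python
# Toy Dempster-Shafer bracketing: missing values put mass on the whole frame.
def lower_upper(counts, missing, proposition):
    """counts: observed category counts; missing: records of unknown category."""
    total = sum(counts.values()) + missing
    support = sum(n for cat, n in counts.items() if cat in proposition)
    belief = support / total                     # lower probability
    plausibility = (support + missing) / total   # missing records might match
    return belief, plausibility

counts = {"buys": 60, "does_not_buy": 30}        # invented sample
lo, hi = lower_upper(counts, missing=10, proposition={"buys"})
print(lo, hi)  # 0.6 0.7
```

A rule such as "customer buys" would then carry the interval [0.6, 0.7] rather than a point probability, and the paper's weighted sum operator combines such intervals across samples.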
Distributed databases on local area networks present additional considerations for query optimization over databases on geographically distributed, point-to-point networks. This paper surveys and evaluates the state of current research on distributed query optimization for local area networks. A classification taxonomy is presented and used to analyze the proposed query-optimization algorithms. The unique features of each algorithm are highlighted and a qualitative comparison of the algorithms is given. Future research directions are discussed.
In a one-copy distributed database, each data item is stored at exactly one site. In a replicated database, some data items may be stored at multiple sites. The main motivation is improved reliability: by storing important data at multiple sites, the DBS can operate even though some sites have failed. This paper describes an algorithm for handling replicated data, which allows users to operate on data so long as one copy is “available.” A copy is “available” when (i) its site is up, and (ii) the copy is not out-of-date because of an earlier failure. The algorithm handles clean, detectable site failures, but not Byzantine failures or network partitions.
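The availability rule can be sketched as a read-one / write-all-available discipline (a minimal illustration under our own simplifying assumptions, not the paper's full algorithm): writes go to every available copy, a copy that misses a write is marked out-of-date, and reads skip copies that are down or stale.

```python
# Minimal read-one / write-all-available sketch (illustrative assumptions only).
class Copy:
    def __init__(self):
        self.value = None
        self.up = True
        self.stale = False       # missed a write while its site was down
    @property
    def available(self):
        return self.up and not self.stale

def write(copies, value):
    if not any(c.available for c in copies):
        raise RuntimeError("no available copy")
    for c in copies:
        if c.available:
            c.value = value
        else:
            c.stale = True       # down copies miss the write and become stale

def read(copies):
    for c in copies:
        if c.available:          # skip down or out-of-date copies
            return c.value
    raise RuntimeError("no available copy")

a, b, c = Copy(), Copy(), Copy()
write([a, b, c], "v1")
b.up = False                     # clean, detectable site failure
write([a, b, c], "v2")
b.up = True                      # b recovers but is out-of-date
print(read([a, b, c]))           # v2 -- stale copy b is skipped until refreshed
```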