Search Results - Inner Mongolia University Library

Distributed multivariate regression using wavelet-based collective data mining

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2001, Vol. 61, No. 3, pp. 372-400

Authors: Hershberger, DE; Kargupta, H (Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164, USA)

This paper presents a method for distributed multivariate regression using wavelet-based collective data mining (CDM). The method seamlessly blends machine learning and the theory of communication with the statistical methods employed in parametric multivariate regression to provide an effective data mining technique for use in a distributed data and computation environment. The technique is applied to two benchmark data sets, producing results that are consistent with those obtained by applying standard parametric regression techniques to centralized data sets. Evaluation of the method in terms of model accuracy as a function of appropriateness of the selected wavelet function, relative number of nonlinear cross-terms, and sample size demonstrates that accurate parametric multivariate regression models can be generated from distributed, heterogeneous data sets with minimal data communication overhead compared to that required to aggregate a distributed data set. Application of this method to linear discriminant analysis, which is related to parametric multivariate regression, produced classification results on the Iris data set that are comparable to those obtained with centralized data analysis. (C) 2001 Academic Press.

Keywords: data mining; distributed data mining; collective data mining; knowledge discovery; wavelets; regression


Distributed mining of maximal frequent itemsets on a data Grid system

JOURNAL OF SUPERCOMPUTING, 2006, Vol. 37, No. 1, pp. 71-90

Authors: Luo, Congnan; Pereira, Anil L.; Chung, Soon M. (Wright State Univ, Dept Comp Sci & Engn, Dayton, OH 45435, USA)

In this paper, we propose a new algorithm, named Grid-based distributed Max-Miner (GridDMM), for mining maximal frequent itemsets from databases on a data Grid. A frequent itemset is maximal if none of its supersets is frequent. GridDMM is specifically suitable for use in Grid environments due to low communication and synchronization overhead. GridDMM consists of a local mining phase and a global mining phase. During the local mining phase, each node mines the local database to discover the local maximal frequent itemsets, then they form a set of maximal candidate itemsets for the top-down search in the subsequent global mining phase. A new prefix-tree data structure is developed to facilitate the storage and counting of the global candidate itemsets of different sizes. We built a data Grid system on a cluster of workstations using the open-source Globus Toolkit, and evaluated the GridDMM algorithm in terms of performance, scalability, and the overhead of communication and synchronization. GridDMM demonstrates better performance than other sequential and parallel algorithms, and its performance is scalable in terms of the database size and the number of nodes.

Keywords: data Grid; distributed data mining; maximal frequent itemsets; association rules; scalability
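
The definition in the abstract above (a frequent itemset is maximal if none of its supersets is frequent) can be illustrated with a small sketch. This is not the GridDMM algorithm or its prefix-tree structure. It is a brute-force, single-machine illustration, and the function names, the toy transactions, and the `min_support` threshold are assumptions made for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets (only suitable for tiny inputs)."""
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain every item of the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                frequent.append(cset)
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger one can be.
            break
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical toy database of five transactions.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
)]

frequent = frequent_itemsets(transactions, min_support=3)
maximal = maximal_itemsets(frequent)
```

With a threshold of 3, every single item and every pair is frequent, while the full triple appears in only two transactions, so the three pairs are the maximal frequent itemsets.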


Collective mining of Bayesian networks from distributed heterogeneous data

KNOWLEDGE AND INFORMATION SYSTEMS, 2004, Vol. 6, No. 2, pp. 164-187

Authors: Chen, R; Sivakumar, K; Kargupta, H (Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164, USA; Univ Maryland Baltimore Cty, Dept Comp Sci & Elect Engn, Baltimore, MD 21228, USA)

We present a collective approach to learning a Bayesian network from distributed heterogeneous data. In this approach, we first learn a local Bayesian network at each site using the local data. Then each site identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learnt at the central site using the data transmitted from the local site. The local and central Bayesian networks are combined to obtain a collective Bayesian network, which models the entire data. Experimental results and theoretical justification that demonstrate the feasibility of our approach are presented.

Keywords: Bayesian network; collective data mining; distributed data mining; heterogeneous data; web log mining


MINING OF ASSOCIATION RULES FROM DISTRIBUTED DATA USING MOBILE AGENTS

International Conference on e-Business (ICE-B 2009)

Authors: Hu, Gongzhu; Ding, Shaozhen (Cent Michigan Univ, Dept Comp Sci, Mt Pleasant, MI 48859, USA)

ISBN: (print) 9789896740061

In this paper, we propose an agent-based approach to mine association rules from data sets that are distributed across multiple locations while preserving the privacy of local data. This approach relies on the local systems to find frequent itemsets that are encrypted and the partial results are carried from site to site. In this way, the privacy of local data is preserved. We present a structural model that includes several types of mobile agents with specific functionalities and communication scheme to accomplish the task. These agents implement the privacy-preserving algorithms for distributed association rule mining.

Keywords: Mobile agent; distributed data mining; Privacy preserving


Creation of Data Mining Algorithms as Functional Expression for Parallel and Distributed Execution

13th International Conference on Parallel Computing Technologies (PaCT)

Authors: Kholod, Ivan; Petukhov, Ilya (St Petersburg Electrotech Univ LETI, St Petersburg, Russia)

ISBN: (print) 9783319219097; 9783319219080

The article describes an extension of lambda calculus for creating parallel data mining algorithms. The proposed approach represents the algorithm as a sequence of pure functions with unified interfaces. For parallel execution we use a special function that makes it possible to change the structure of the algorithm and to implement various strategies for processing the data set and the model.

Keywords: Parallel algorithms; data mining; Parallel data mining; distributed data mining; data mining algorithms
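
The idea in the abstract above, an algorithm expressed as a sequence of pure functions with a unified interface plus a special function that restructures it for parallel execution, might be sketched as follows. This is not the authors' formalism. The `(data, model) -> model` step signature, the names `compose`, `parallel`, `count_items`, and `merge_counts`, and the data-splitting strategy are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def compose(*steps):
    """Chain pure steps of the form (data, model) -> model into one function."""
    def run(data, model):
        return reduce(lambda m, step: step(data, m), steps, model)
    return run


def count_items(data, model):
    """Pure step: return a new model with item counts added; inputs are not mutated."""
    counts = dict(model)
    for row in data:
        for item in row:
            counts[item] = counts.get(item, 0) + 1
    return counts


def merge_counts(left, right):
    """Pure merge of two partial count models."""
    merged = dict(left)
    for key, value in right.items():
        merged[key] = merged.get(key, 0) + value
    return merged


def parallel(step, merge, n_parts):
    """Hypothetical combinator: split the data, run `step` on each part, merge the results."""
    def run(data, model):
        parts = [data[i::n_parts] for i in range(n_parts)]
        with ThreadPoolExecutor(max_workers=n_parts) as pool:
            # Each part starts from an empty model; partial models are merged afterwards.
            partials = list(pool.map(lambda part: step(part, {}), parts))
        return reduce(merge, partials, dict(model))
    return run


# Hypothetical toy data set: each row is a list of items.
data = [["a", "b"], ["a"], ["b", "c"], ["a", "c"]]

sequential = compose(count_items)(data, {})
distributed = compose(parallel(count_items, merge_counts, n_parts=2))(data, {})
```

Because `parallel(...)` returns a function with the same `(data, model) -> model` interface as any other step, it can be dropped into `compose` without changing the rest of the pipeline, and here both variants produce the same counts.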


DATA MINING VIA DISTRIBUTED GENETIC PROGRAMMING AGENTS

20th European Modeling and Simulation Symposium

Authors: Kronberger, Gabriel K.; Winkler, Stephan M.; Affenzeller, Michael; Wagner, Stefan (Upper Austria Univ Appl Sci, Sch Informat Commun & Media, Heurist & Evolutionary Algorithms Lab, A-4232 Hagenberg, Austria)

ISBN: (print) 9788890372407

Genetic programming is a powerful search method which can be applied to the typical data mining task of finding hidden relations in datasets. We describe the architecture of a distributed data mining system in which genetic programming agents create a large amount of structurally different models which are stored in a model database. A search engine for models that is connected to this database allows interactive exploration and analysis of models, and composition of simple models to hierarchical models. The search engine is the crucial component of the system in the sense that it supports knowledge discovery and paves the way for the goal of finding interesting hidden causal relations.

Keywords: distributed data mining; genetic programming; knowledge discovery


Data Mining Technique for Reduction of Association Rules in Distributed System

International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT)

Authors: Waghamare, Bhagyashri; Bodhe, Yogesh (Shree Ramchandra Coll Engn, Dept Comp Engn, Pune, Maharashtra, India; Dr DY Patil Sch Engn, Dept Comp Engn, Pune, Maharashtra, India)

ISBN: (print) 9781509020805

In today's world, a number of transactions can be performed on social media. In such a distributed environment, where timely access to data is important, it becomes difficult to generate strong association rules. So it is necessary to reduce these rules in order to increase the rule reduction rate. This paper uses the w-Tabular algorithm, which combines a weight assignment method and the Quine-McCluskey method, which increases data processing time in distributed system.

Keywords: Association Rule mining; data mining; distributed data mining; Frequent Item Sets mining; Reduction Framework


Preparation of Distributed Heterogeneous Data for Data Mining

18th International Conference on Soft Computing and Measurement (SCM)

Authors: Batasova, Svetlana; Efimova, Maria; Kholod, Ivan; Semenchenko, Alexey (St Petersburg Electrotech Univ LETI, Fac Comp Sci & Technol, St Petersburg, Russia)

This paper describes an approach to data preparation for the application of data mining algorithms. The approach integrates ETL tools for distributed heterogeneous data extraction and transformation and the DXelopes librar...

ISBN: (print) 9781467369619

Keywords: data mining; ETL; distributed data mining


Fast cryptographic privacy preserving association rules mining on distributed homogenous data base

12th International Conference on Knowledge-Based Intelligent Information and Engineering Systems

Authors: Hussein, Mahmoud; El-Sisi, Ashraf; Ismail, Nabil (Menofyia Univ, Fac Comp & Informat, CS Dept, Shibin Al Kawm 32511, Egypt)

ISBN: (print) 9783540855644

Privacy is one of the most important properties that an information system must satisfy. In systems where information must be shared among different, untrusted entities, the protection of sensitive information plays a relevant role. A relatively new trend shows that classical access control techniques are not sufficient to guarantee privacy when data mining techniques are used in a malicious way. Privacy-preserving data mining algorithms have recently been introduced with the aim of preventing the discovery of sensitive information. In this paper we propose a modification to a privacy-preserving association rule mining algorithm for distributed homogenous databases. Our algorithm is faster than the old one that it modifies, while preserving privacy and producing accurate results. The modified algorithm is based on a semi-honest model with negligible collision probability. It can be extended to any number of sites without any change in implementation. Moreover, adding sites does not add more time to the algorithm, because all client sites perform the mining at the same time, so the overhead is in communication time only. The total bit-communication cost of our algorithm is a function of the number of sites (N).

Keywords: association rule mining; apriori; cryptography; distributed data mining; privacy; security


From parallel data mining to grid-enabled distributed knowledge discovery

11th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing (RSFDGrC 2007)

Authors: Cesario, Eugenio; Talia, Domenico (CNRS ICAR, F-75700 Paris, France; Univ Calabria, DEIS, Calabria, Italy)

ISBN: (print) 9783540725299

Data mining is often a compute-intensive and time-consuming process. For this reason, several data mining systems have been implemented on parallel computing platforms to achieve high performance in the analysis of large data sets. Moreover, when large data repositories are coupled with geographical distribution of data, users and systems, more sophisticated technologies are needed to implement high-performance distributed KDD systems. Recently, computational Grids have emerged as privileged platforms for distributed computing, and a growing number of Grid-based KDD systems have been designed. In this paper we first outline different ways to exploit parallelism in the main data mining techniques and algorithms, then we discuss Grid-based KDD systems.

Keywords: rough set; parallel data mining; distributed data mining; grid
