ISBN: 9781450315685 (print)
The inaugural APKDD workshop presents state-of-the-art research and industry practices in the areas of analytic platforms and architectures for large-scale data collection and organization, comprehensive data analysis, improvement of collection and organization methods, and analysis of large data sets. Workshop participants will share presentations, empirical and theoretical studies, best practices, and algorithms related to data management platforms for heterogeneous data storage (including mobile), large-scale data analysis platforms, requirements methodology, and "big data" analytic software. Copyright is held by the author/owner(s).
The proceedings contain 12 papers. The topics discussed include: the methods of maximum flow and minimum cost flow finding in fuzzy network; language identification for texts written in transliteration; on parameter identification methods for Markov models applied to social networks; analyzing online social network data with biclustering and triclustering; term weighting in expert search task: analyzing communication patterns; semantic matching using concept lattice; self-tuning semantic image segmentation; a neural network-like combinatorial data structure for inferring classification tests; extraction of semantic relations between concepts with KNN algorithms on Wikipedia; computerized recognition system for historical manuscripts; an ontology-based approach to text-to-picture synthesis systems; and evaluating the quality level of projects, authors and experts.
A generalization of an algorithm is proposed for implementing the well-known effective inductive method of constructing sets of cardinality (q+1) ((q+1)-sets) from their subsets of cardinality q (q-sets). A new neural network-like combinatorial data-knowledge structure supporting this algorithm is advanced. This structure can drastically increase the efficiency of inferring functional and implicative dependencies as well as association rules from a given dataset.
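The inductive step named in this abstract, building (q+1)-sets from q-sets, can be illustrated with a minimal sketch. It is not the paper's algorithm or its neural network-like structure; the function name `extend_q_sets` and the rule that a (q+1)-set is kept only when all of its q-subsets are present are assumptions chosen for illustration.

```python
from itertools import combinations


def extend_q_sets(q_sets):
    """Return every (q+1)-set all of whose q-subsets occur in q_sets.

    Hypothetical helper for illustration only; assumes all input sets
    have the same cardinality q.
    """
    q_sets = {frozenset(s) for s in q_sets}
    if not q_sets:
        return set()
    q = len(next(iter(q_sets)))
    result = set()
    # Join pairs of q-sets that differ in exactly one element.
    for a, b in combinations(q_sets, 2):
        union = a | b
        if len(union) != q + 1:
            continue
        # Keep the candidate only if every q-subset is already known.
        if all(frozenset(sub) in q_sets for sub in combinations(union, q)):
            result.add(union)
    return result


pairs = [{1, 2}, {1, 3}, {2, 3}, {2, 4}]
triples = extend_q_sets(pairs)
```

Here only {1, 2, 3} survives, because the other candidate triples each lack one of their 2-subsets in the input.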
The paper offers simple, robust algorithms for checking the consistency of large volumes of measured data. The checks differentiate between data collected on a spatial grid at one time point and data collected on a spatial...
Multi-core CPUs are very efficient at executing multiple threads at the same time without significant performance penalty; this capability, however, results in increasing demand for the memory and the caches, which not...
The proceedings contain 10 papers. The topics discussed include: composition of L-fuzzy contexts; iterator-based algorithms in self-tuning discovery of partial implications; completing terminological axioms with formal concept analysis; a tool-based set theoretic framework for concept approximation; decision aiding software using FCA; closures and partial implications in educational data mining; attribute exploration in a fuzzy setting; on open problem - semantics of the clone items; computing the skyline of a relational table based on a query lattice; and using FCA for modeling conceptual difficulties in learning processes.
PDF is a widely used document format. By studying the structure of PDF files, we notice that the incremental update method used by PDF files can be used to embed information for covert communication. So in this paper, we p...
ISBN: 9789986342748 (print)
The proceedings contain 25 papers. The topics discussed include: workload-based heuristics for evaluation of physical database architectures; multiple-site distributed spatial query optimization using spatial semijoins; cost models for approximate query evaluation algorithms; processing multiple databases in the Estonian water information system; hypermodelling live: OLAP for code clone recommendation; integration of business modeling and IT modeling; using functional characteristics to analyze state changes of objects; electronic archive information system; performance measurement framework with indicator life-cycle support; multi-layered architecture of decision support system for monitoring of dangerous good transportation; formulating the enterprise architecture compliance problem; advanced e-learning environments and technologies; trends of the usage of adaptive learning in intelligent tutoring systems; and towards automatic structured web data extraction system.
ISBN: 9781450315685 (print)
Frequent itemset mining finds frequently occurring itemsets in transactional data. This is applied to diverse problems such as decision support, selective marketing, financial forecasting and medical diagnosis. The cloud, computation as a utility service, allows us to crunch large mining problems. There are a number of algorithms for frequent itemset mining, but none is suited to the cloud out of the box, because they require large data structures to be synchronized across the network. One of the best algorithms for frequent itemset mining is the well-known FP-growth (Frequent Pattern growth). We develop a cloud-enabled algorithmic variant for frequent itemset mining that scales with very little communication and computational overhead and, even with only one worker node, is faster than FP-growth. We develop the concept of a postfix path and show how this allows us to lower the communication cost and leads to adjustable work sizes. This concept provides a very flexible algorithmic solution that can be applied to a wide variety of problem sizes and setups. Copyright 2012 ACM.
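To make the task concrete, the sketch below shows what frequent itemset mining computes on a toy input. It is a brute-force support counter, not FP-growth and not the postfix-path variant described in the abstract. The function name `frequent_itemsets`, the sample baskets, and the absolute `min_support` threshold are all assumptions for illustration. The enumeration is exponential in transaction length, so it is only suitable for tiny examples.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Count every itemset occurring in the transactions and keep those
    that appear in at least min_support transactions.

    Brute-force illustration only (hypothetical helper, not FP-growth).
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {s: n for s, n in counts.items() if n >= min_support}


baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
result = frequent_itemsets(baskets, min_support=2)
```

With these baskets and a threshold of 2, the frequent itemsets are the three single items plus the pairs (bread, butter) and (bread, milk).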
A novel tensor decomposition is proposed to make it possible to identify replicating structures in complex data, such as textures and patterns in music spectrograms. In order to establish a computational framework for...