The papers in this special section were presented at the 13th International Workshop on Data Mining in Bioinformatics (BIOKDD’14), which was organized in conjunction with the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining and held on August 24, 2014 in New York, NY. The workshop brought together international researchers in the interacting disciplines of data mining, systems biology, and bioinformatics at the Bloomberg Headquarters venue. Its goal is to encourage knowledge discovery and data mining (KDD) researchers to take on the numerous challenges that bioinformatics offers.
ISBN: 9783642013065 (print)
High utility pattern mining extracts more useful and realistic knowledge from transaction databases than traditional frequent pattern mining because it considers the non-binary frequency values of items in transactions and a different profit value for every item. However, existing high utility pattern mining algorithms suffer from the level-wise candidate generation-and-test problem and need several database scans to mine the actual high utility patterns. In this paper, we propose a novel tree-based candidate pruning technique, HUC-Prune (high utility candidates prune), to efficiently mine high utility patterns without level-wise candidate generation-and-test. It exploits a pattern growth mining approach and needs at most three database scans, in contrast to the several database scans of the existing algorithms. Extensive experimental results show that our technique is very efficient for high utility pattern mining and that it outperforms the existing algorithms.
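To make the notion of utility concrete, the following is a minimal sketch of the standard definitions (quantity times profit, summed over the transactions that contain an itemset) together with the transaction-weighted utility (TWU) upper bound commonly used for pruning. It is a brute-force illustration, not the tree-based technique of the abstract; the toy data and all function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy data: profit per item and item quantities per transaction.
PROFIT = {"a": 2, "b": 6, "c": 3}
TRANSACTIONS = [
    {"a": 2, "b": 1},
    {"a": 4, "c": 3},
    {"b": 2, "c": 1},
    {"a": 1, "b": 1, "c": 2},
]


def transaction_utility(tx):
    """Total utility of one transaction: sum of quantity * profit."""
    return sum(qty * PROFIT[item] for item, qty in tx.items())


def utility(itemset, transactions):
    """Utility of an itemset, summed over transactions containing all its items."""
    return sum(
        sum(tx[i] * PROFIT[i] for i in itemset)
        for tx in transactions
        if all(i in tx for i in itemset)
    )


def twu(itemset, transactions):
    """Transaction-weighted utility: an upper bound on utility that
    never increases when the itemset is extended."""
    return sum(
        transaction_utility(tx)
        for tx in transactions
        if all(i in tx for i in itemset)
    )


def high_utility_itemsets(transactions, min_util):
    """Brute-force search that first prunes single items by TWU."""
    items = sorted({i for tx in transactions for i in tx})
    promising = [i for i in items if twu((i,), transactions) >= min_util]
    result = {}
    for size in range(1, len(promising) + 1):
        for itemset in combinations(promising, size):
            u = utility(itemset, transactions)
            if u >= min_util:
                result[itemset] = u
    return result
```

On this toy data, item `a` alone has utility 14 while `{a, c}` has utility 25, so an itemset can reach a threshold that its subsets miss. This is why utility itself cannot be used for pruning and an upper bound such as TWU is needed.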
Since 2006, the International Conference on Bioinformatics (InCoB) has been publishing selected papers in BMC Bioinformatics. Papers within the scope of the journal from the 13th InCoB, held July 31 to August 2, 2014 in Sydney, Australia, have been compiled in this supplement. These span protein and proteome informatics, structural bioinformatics, software development and bioimaging to pharmacoinformatics and disease informatics, representing the breadth of bioinformatics research in the Asia-Pacific.
ISBN: 9781538621653 (print)
The paper presents an exceptionally simple method that allows a researcher to interpret complicated revitalization processes on the basis of the collected data. It employs spatial data mining and a fuzzy inference system (FIS). By building the FIS knowledge base and selecting the fuzzy rules, it is possible to point out the important factors that influence the process under study. In this paper, that process is the dialogue between Polish local governments and their residents to solve the problem of urban revitalization in the context of smart city creation. Keywords: spatial data mining
Loop tiling is an effective loop transformation technique that tiles the iteration space of loop nests to improve the data locality. The appropriate data layout and transfer strategies are also important to assist loo...
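As an illustration of what tiling the iteration space means, here is a minimal sketch of a tiled matrix transpose. The function name and the tile size are assumptions for this example. In pure Python the locality benefit is not measurable, so the code only shows the loop structure that a compiler or a C implementation would use.

```python
def transpose_tiled(matrix, tile=2):
    """Transpose a rectangular list-of-lists matrix, visiting it tile by tile."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    out = [[None] * n_rows for _ in range(n_cols)]
    # Outer loops step over tile origins; inner loops stay inside one tile,
    # so each tile of the input and output is finished before moving on.
    for ii in range(0, n_rows, tile):
        for jj in range(0, n_cols, tile):
            for i in range(ii, min(ii + tile, n_rows)):
                for j in range(jj, min(jj + tile, n_cols)):
                    out[j][i] = matrix[i][j]
    return out
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size.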
ISBN: 9783642013065 (print)
A large portion of data collected by many organisations today is about people, and often contains personal identifying information, such as names and addresses. Privacy and confidentiality are of great concern when such data is being shared between organisations or made publicly available. Research in (privacy-preserving) data mining and data linkage is suffering from a lack of publicly available real-world data sets that contain personal information, and therefore experimental evaluations can be difficult to conduct. In order to overcome this problem, we have developed a data generator that allows flexible creation of synthetic data containing personal information with realistic characteristics, such as frequency distributions, attribute dependencies, and error probabilities. Our generator significantly improves on earlier approaches, and allows the generation of data for individuals, families and households.
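The three characteristics named above can be sketched in a few lines. This is a hypothetical miniature, not the authors' generator: the frequency tables, field names, and the single error type (swapping two adjacent characters) are all assumptions chosen for illustration.

```python
import random

# Hypothetical frequency tables as (value, weight) pairs; a real generator
# would load these from lookup files.
GIVEN_NAMES = {
    "female": [("emma", 5), ("olivia", 3)],
    "male": [("liam", 4), ("noah", 4)],
}
SURNAMES = [("smith", 6), ("nguyen", 3), ("kowalski", 1)]


def weighted_choice(rng, pairs):
    """Draw one value according to its frequency weight."""
    values, weights = zip(*pairs)
    return rng.choices(values, weights=weights, k=1)[0]


def corrupt(rng, value, error_prob):
    """With probability error_prob, swap two adjacent characters."""
    if len(value) < 2 or rng.random() >= error_prob:
        return value
    pos = rng.randrange(len(value) - 1)
    chars = list(value)
    chars[pos], chars[pos + 1] = chars[pos + 1], chars[pos]
    return "".join(chars)


def generate_records(n, error_prob=0.2, seed=0):
    """Generate n synthetic person records, reproducible for a given seed."""
    rng = random.Random(seed)
    records = []
    for rec_id in range(n):
        gender = rng.choice(["female", "male"])
        # Attribute dependency: the given name is drawn conditional on gender.
        given = weighted_choice(rng, GIVEN_NAMES[gender])
        surname = weighted_choice(rng, SURNAMES)
        records.append({
            "id": rec_id,
            "gender": gender,
            "given_name": corrupt(rng, given, error_prob),
            "surname": corrupt(rng, surname, error_prob),
        })
    return records
```

Seeding the random number generator makes a generated data set reproducible, which matters when the data is used for comparing linkage methods.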
Lots of data from different domains are published as Linked Open Data (LOD). While there are quite a few browsers for such data, as well as intelligent tools for particular purposes, a versatile tool for deriving additional knowledge by mining the Web of Linked Data is still missing. In this system paper, we introduce the RapidMiner Linked Open Data extension. The extension hooks into the powerful data mining and analysis platform RapidMiner, and offers operators for accessing Linked Open Data in RapidMiner, so that it can be used in sophisticated data analysis workflows without the need for expert knowledge in SPARQL or RDF. The extension allows for autonomously exploring the Web of Data by following links, thereby discovering relevant datasets on the fly, as well as for integrating overlapping data found in different datasets. As an example, we show how statistical data from the World Bank on scientific publications, published as an RDF data cube, can be automatically linked to further datasets and analyzed using additional background knowledge from ten different LOD datasets. (C) 2015 Elsevier B.V. All rights reserved.
ISBN: 9783642013065 (print)
Log-linear models have been widely used in text mining tasks because they can incorporate a large number of possibly correlated features. In text mining, these possibly correlated features are generated by conjunction of features. They are usually used with log-linear models to estimate robust conditional distributions. To avoid manual construction of conjunctions of features, we propose a new algorithmic framework called F-tree for automatically generating and storing conjunctions of features in text mining tasks. This compact graph-based data structure allows fast one-vs-all matching of features in the feature space, which is crucial for many text mining tasks. Based on this hierarchical data structure, we propose a systematic method for removing redundant features to further reduce memory usage and improve performance. We run large-scale experiments on three publicly available datasets and show that this automatic method can match the state-of-the-art performance achieved by manual construction of features.
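The idea of generating conjunctions of features automatically and storing them in a shared-prefix structure can be sketched as follows. This is not the F-tree of the abstract, whose details are not given here; it is a plain nested-dictionary trie, and the function names, the `"$count"` marker key, and the example features are assumptions.

```python
from itertools import combinations

COUNT_KEY = "$count"  # assumed marker key; must not collide with a feature name


def conjunctions(features, max_order=2):
    """All conjunctions of up to max_order atomic features, in canonical order."""
    feats = sorted(set(features))
    return [
        combo
        for order in range(1, max_order + 1)
        for combo in combinations(feats, order)
    ]


def insert(trie, conj):
    """Store one conjunction; conjunctions sharing a prefix share trie nodes."""
    node = trie
    for feat in conj:
        node = node.setdefault(feat, {})
    node[COUNT_KEY] = node.get(COUNT_KEY, 0) + 1


def build_trie(examples, max_order=2):
    """Insert every conjunction of every example into one trie."""
    trie = {}
    for feats in examples:
        for conj in conjunctions(feats, max_order):
            insert(trie, conj)
    return trie


def count(trie, conj):
    """How many examples contained this conjunction (0 if never seen)."""
    node = trie
    for feat in sorted(conj):
        if feat not in node:
            return 0
        node = node[feat]
    return node.get(COUNT_KEY, 0)
```

Sorting the features before insertion and lookup gives each conjunction a single canonical path, so `{w=bank, pos=NN}` and `{pos=NN, w=bank}` are stored once.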
ISBN: 9781467391900 (print)
The main objective of this research is to analyze information on customer contacts made to the call center section of a telecommunication company in Thailand. Using a role hierarchy mining approach (a process mining technique) enabled us to focus on the hierarchical perspectives of the collected event log with respect to the functional relationships amongst the originators (i.e., the administrators and operators who were in charge of handling incoming calls from customers/clients). Following a role hierarchy approach, we could better track and trace the interactions between different personnel/administrators in a functional organizational structure from top to bottom. One of the main advantages of the proposed approach was the ability to detect and identify potential discrepancies with, and breaches of, the rules and responsibilities previously defined by top management and human resources staff. Accordingly, the results of the study can help telecommunication companies improve their call center (customer service) section so as to increase customer/client satisfaction with the quality of service. Therefore, applying a role hierarchy mining approach will eventually lead to calls made to the call center section being handled in a more efficient, effective and timely manner.
ISBN: 9783642013065 (print)
Traditional sequential pattern mining deals only with positive correlation between sequential patterns, without considering negative relationships between them. In this paper, we present a notion of impact-oriented negative sequential rules, in which the left side is a positive sequential pattern or its negation, and the right side is a predefined outcome or its negation. Impact-oriented negative sequential rules are formally defined to show the impact of sequential patterns on the outcome, and an efficient algorithm is designed to discover both positive and negative impact-oriented sequential rules. Experimental results on both synthetic data and real-life data show the efficiency and effectiveness of the proposed technique.
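The four rule forms described above (pattern or its negation on the left, outcome or its negation on the right) can be illustrated by counting confidences over labelled sequences. This is a sketch under assumptions: the abstract does not give the formal definitions, so plain rule confidence is used here, a pattern is taken to match as a not necessarily contiguous subsequence, and the function names and data layout are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence as an ordered subsequence."""
    it = iter(sequence)
    # `item in it` consumes the iterator up to the match, preserving order.
    return all(item in it for item in pattern)


def rule_confidences(data, pattern):
    """Confidence of the four rule forms for one pattern.

    data is a list of (sequence, outcome) pairs with a boolean outcome.
    The result maps (pattern_present, outcome) to a confidence, e.g.
    (False, True) is the rule "NOT pattern -> outcome".
    """
    counts = {(p, t): 0 for p in (True, False) for t in (True, False)}
    for seq, outcome in data:
        counts[(contains(seq, pattern), bool(outcome))] += 1
    conf = {}
    for p in (True, False):
        total = counts[(p, True)] + counts[(p, False)]
        for t in (True, False):
            conf[(p, t)] = counts[(p, t)] / total if total else 0.0
    return conf
```

For example, `rule_confidences(data, ["a", "b"])[(False, False)]` is the fraction of sequences without the pattern `a` followed by `b` in which the outcome did not occur.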