ISBN: 9787302139225 (print)
The mechanism of gene regulation is of great interest to biologists, especially in genomics. Part of the machinery controlling gene expression is provided by transcription factors, proteins that can either repress or stimulate the transcription of a gene. In this paper, we propose a new data mining algorithm, based on boolean contexts, to extract a priori relevant frequent closed gensets, i.e., sets of tissues together with associated sets of genes and transcription factors, which are useful to the biologist. The key feature of our algorithm is a better compromise between the size of the search space and the amount of discovered knowledge conveyed, in a bioinformatics setting. To this end, the proposed algorithm, called MC(2)G for Mining Constraint Closed Gensets, uses the Frequent Pattern tree (FP-tree) structure, an extended prefix-tree structure, to prime the search space. Moreover, MC(2)G allows statistical and syntactic constraints to be defined on the desired frequent closed gensets and uses them during the extraction process. Experimental comparisons with other algorithms are carried out on real-world datasets. http://***/stamp/***?arnumber=4281879
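To make the notion of a constrained frequent closed genset more concrete, the following is a minimal brute-force sketch in Python. It is not the MC(2)G algorithm and it does not build an FP-tree; it only illustrates, on a toy boolean context, what "closed", "frequent" (a statistical constraint) and "must contain a transcription factor" (one possible syntactic constraint) could mean. The function name `closed_gensets`, the parameters `min_support` and `required`, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Genset = Tuple[FrozenSet[str], FrozenSet[str]]  # (tissues, genes/TFs)


def closed_gensets(
    context: Dict[str, Set[str]],
    min_support: int,
    required: FrozenSet[str] = frozenset(),
) -> List[Genset]:
    """Brute-force enumeration of frequent closed itemsets (illustrative only).

    context     : hypothetical boolean context, tissue -> items present in it
    min_support : statistical constraint, minimum number of supporting tissues
    required    : syntactic constraint, keep itemsets sharing at least one
                  item with this set (ignored when empty)
    """
    rows = {tissue: frozenset(items) for tissue, items in context.items()}

    # Every closed itemset is the intersection of one or more rows,
    # so we accumulate all such intersections row by row.
    closed: Set[FrozenSet[str]] = set()
    for items in rows.values():
        closed |= {items & c for c in closed} | {items}

    result: List[Genset] = []
    for itemset in closed:
        if not itemset:
            continue  # skip the empty itemset
        tissues = frozenset(t for t, items in rows.items() if itemset <= items)
        frequent = len(tissues) >= min_support
        syntactic_ok = not required or bool(itemset & required)
        if frequent and syntactic_ok:
            result.append((tissues, itemset))

    # Deterministic order: largest support first, then by item names.
    return sorted(result, key=lambda p: (-len(p[0]), sorted(p[1])))


# Toy data (invented): g* are genes, tf* are transcription factors.
context = {
    "t1": {"g1", "g2", "tf1"},
    "t2": {"g1", "g2", "g3", "tf1"},
    "t3": {"g1", "tf2"},
    "t4": {"g2", "tf1"},
}
result = closed_gensets(context, min_support=2, required=frozenset({"tf1", "tf2"}))
for tissues, items in result:
    print(sorted(tissues), sorted(items))
```

On this toy context, the itemset {g1} is closed and frequent but is dropped by the syntactic constraint because it contains no transcription factor. The enumeration is exponential in the worst case, which is the cost that a compact structure such as an FP-tree and constraint pushing during extraction are meant to reduce.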