ISBN: (print) 9781424496365
The computational identification of regulatory elements in genomic DNA is key to understanding the regulatory infrastructure of a cell. We present a tool for identifying Transcription Factor Binding Sites (TFBSs) in genomic sequences. We show that our positional pattern detection tool can attain high sensitivity and specificity of TFBS detection by capturing dependencies between nucleotide positions within the TFBS, thereby elucidating complex interactions that may be critical for TFBS activity. The tool combines two biologically realistic information processing methods: third-generation neural networks (spiking neural networks) represent the structure of TFBSs, and a genetic algorithm optimizes the network parameters. In the learning phase, the networks are trained to distinguish known TFBSs from negative examples. The evolved network is then used as a classifier to detect novel TFBSs in genomic sequences. We apply the method to GAL4 binding sites in yeast: a two-neuron network topology is trained with real data from TRANSFAC and SCPD and evaluated through simulation. We show how neuron and synapse parameters can be evolved to improve classification results, and we compare the networks' predictions against MAPPER, TFBIND and TFSEARCH. Our results indicate that the tool has the potential to attain very high classification accuracy with very few false positives. These results show that such information processing methods can capture important positional information in TFBSs and should be explored further to examine complex relationships underlying transcriptional and epigenetic regulation.
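The training loop described in the abstract (a spiking network whose parameters are evolved by a genetic algorithm, then used as a classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single leaky integrate-and-fire neuron rather than the two-neuron topology, position-specific input weights, a "spikes at least once" decision rule, and an elitist genetic algorithm with tournament selection. All names (`count_spikes`, `decode`, `fitness`, `evolve`, `TOY`) are hypothetical, and the sequences in `TOY` are synthetic, not TRANSFAC or SCPD data.

```python
import random

NUCLEOTIDES = "ACGT"

# Synthetic toy data (an assumption for illustration, not real binding sites):
# positives start with CGG and end with CCG, negatives lack one or both ends.
TOY = [
    ("CGGATCCG", True), ("CGGTACCG", True), ("CGGGACCG", True),
    ("ATTATCAA", False), ("CGGATTTA", False), ("TTAATCCG", False),
]

def count_spikes(seq, weights, threshold, leak):
    """Drive one leaky integrate-and-fire neuron with a sequence.

    weights[i][n] is the input current for nucleotide n at position i.
    """
    potential, spikes = 0.0, 0
    for i, nuc in enumerate(seq):
        potential = leak * potential + weights[i][nuc]  # decay, then integrate
        if potential >= threshold:
            spikes += 1
            potential = 0.0                             # reset after a spike
    return spikes

def decode(genome, length):
    """Split a flat genome into per-position weights, threshold, and leak."""
    weights = [
        {n: genome[4 * i + k] for k, n in enumerate(NUCLEOTIDES)}
        for i in range(length)
    ]
    threshold = abs(genome[-2]) + 0.1       # keep the threshold positive
    leak = min(max(genome[-1], 0.0), 1.0)   # clamp the leak factor to [0, 1]
    return weights, threshold, leak

def fitness(genome, examples):
    """Fraction of (sequence, label) pairs classified correctly."""
    weights, threshold, leak = decode(genome, len(examples[0][0]))
    correct = sum(
        (count_spikes(seq, weights, threshold, leak) > 0) == label
        for seq, label in examples
    )
    return correct / len(examples)

def evolve(examples, generations=40, pop_size=30, sigma=0.3, seed=0):
    """Elitist GA: tournament selection, one-point crossover, Gaussian mutation."""
    rng = random.Random(seed)
    n_genes = 4 * len(examples[0][0]) + 2
    score = lambda g: fitness(g, examples)
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        next_gen = [population[0]]          # elitism: best genome survives as-is
        while len(next_gen) < pop_size:
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n_genes)
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.1 else g
                     for g in a[:cut] + b[cut:]]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    history.append(score(best))
    return best, history
```

Calling `best, history = evolve(TOY)` returns the best evolved genome and the best training accuracy per generation; `decode(best, 8)` then yields the neuron parameters for classifying new 8-mers with `count_spikes`. Because the leak carries the membrane potential from one position to the next, evidence from both ends of a site can combine to trigger a spike, which is one simple way a spiking model can express a dependency between positions.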