ISBN: 9781424417780 (print)
The proceedings contain 39 papers. The topics discussed include: classifying synthetic and biological DNA sequences with side effect machines; a hybrid clustering/evolutionary algorithm for RNA folding; an information theoretic approach for the discovery of irregular and repetitive patterns in genomic data; PCA-based linear combinations of oligonucleotide frequencies for metagenomic DNA fragment binning; exploring chaos automata for protein sequences; cancer classification with incremental gene selection based on DNA microarray data; temporal and structural analysis of biological networks in combination with microarray data; a graph-based representation of gene expression profiles in DNA microarrays; network motifs in context: an exploration of the evolution of oscillatory dynamics in transcriptional networks; and evolution strategy with greedy probe selection heuristics for the non-unique oligonucleotide probe selection problem.
ISBN: 9781467362139; 9781467362146 (print)
Various types of genome-wide data, such as sequence and gene expression data, have been generated and are available from public databases. These genome-wide data present major computational challenges because the number of variables far exceeds the number of observations. Many computational tools have been developed for the analysis of these high-dimensional data, and these methods have led to an improved understanding of molecular biology. In particular, signature discovery (also known as variable selection or feature selection), a machine learning technique in which subsets of variables are selected to build robust models, is useful in mining these high-dimensional functional genomic data. In this paper, we review the applications of signature discovery methods in mining these high-dimensional data. Specifically, we focus on two applications: the identification of signature genes predictive of disease phenotypes and the inference of regulatory networks. Signature genes predictive of disease phenotypes can potentially be used in the diagnosis and prognosis of diseases. Regulatory networks that capture gene-to-gene influences can provide the context for therapeutic intervention.
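As an illustration of what selecting a subset of variables can look like when genes far outnumber samples, the following is a minimal sketch of a univariate filter that ranks genes by a Welch-style t statistic between two phenotype groups. It is not the method reviewed in the paper; the function names, the toy expression values, and the choice of statistic are all assumptions made for the example.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch-style t statistic; each group needs at least two values."""
    var_a, var_b = variance(group_a), variance(group_b)
    denom = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    diff = mean(group_a) - mean(group_b)
    if denom == 0.0:
        # Both groups are constant: perfectly separated or identical.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / denom


def select_signature(expr, labels, k):
    """Return the k genes whose expression differs most between classes.

    expr   : dict mapping a (hypothetical) gene name to one value per sample
    labels : list of 0/1 phenotype labels, one per sample
    """
    scores = {}
    for gene, values in expr.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (made up for illustration): 3 genes, 6 samples.
toy_expr = {
    "geneA": [1.0, 1.2, 0.9, 5.1, 4.8, 5.3],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneC": [3.0, 2.5, 3.5, 1.0, 0.5, 1.5],
}
toy_labels = [0, 0, 0, 1, 1, 1]
signature = select_signature(toy_expr, toy_labels, k=2)
print(signature)
```

A filter like this scores each gene independently, so it ignores gene-to-gene interactions; wrapper or embedded selection methods trade more computation for that extra context.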
ISBN: 9781467358743 (print)
Bioinformatics and computational biology are two fast-growing fields that require the direct application of computational intelligence. The American patent system is currently undergoing its biggest reform since the passage of the Patent Act of 1952, and thus intellectual property rights (IPR) and strategies are increasingly vital in these fields. To better understand the status quo of intellectual property (IP) in the fields of biology that apply computational intelligence, basic IP definitions, recent IP developments, and advanced protection strategies are presented and discussed.
ISBN: 9781479931637 (print)
Genomic annotations with functional controlled terms, such as those of the Gene Ontology (GO), are paramount in modern biology. Yet they are known to be incomplete, since current biological knowledge is far from definitive. In this scenario, computational methods that can support and speed up the curation of these annotations can be very useful. In a previous work, we discussed the benefits of using the Probabilistic Latent Semantic Analysis algorithm to predict novel GO annotations, compared to some Singular Value Decomposition (SVD) based approaches. In this paper, we propose a further enhancement of that method, which weights the available associations between genes and functional terms before using them as input to the predictive system. The tests that we performed on the annotations of human genes to GO functional terms showed the efficacy of our approach.
ISBN: 9781457702167 (print)
Classifying microarray data, which are high-dimensional, requires high computational power. The Support Vector Machine (SVM) classifier is among the most common and successful classifiers used in the analysis of microarray data, but it also requires high computational power due to its complex mathematical architecture. Implementing SVM in hardware exploits the parallelism available within the algorithm kernels to accelerate the classification of microarray data. In this work, a flexible, dynamically and partially reconfigurable implementation of the SVM classifier on a Field Programmable Gate Array (FPGA) is presented. The SVM architecture achieved up to 85x speed-up over an equivalent general purpose processor (GPP), showing the capability of FPGAs to enhance the performance of SVM-based analysis of microarray data as well as future bioinformatics applications.
Suffix trees, which are trie structures that present the suffixes of given sequences (e.g., strings), are widely used for sequence search in different application domains such as text data mining, web intelligence, b...
ISBN: 9781467358743 (print)
Finding the molecular features that cause halophilicity in halostable organisms is helpful for understanding halophilic adaptation. In this study, we propose a prediction method for halophilic proteins using machine learning. The study has six stages. First, we establish a non-redundant dataset of halophilic proteins collected from the NCBI, UniProtKB and EMBL-EBI databases. The dataset consists of 245 positive and negative proteins with sequence identity
ISBN: 9781467358743 (print)
Recently, next generation sequencing techniques have begun to produce huge amounts of sequencing data. To analyze these data, an efficient method that can handle large amounts of information is required. In this paper, we propose a method for classifying sets of DNA sequences using a hidden Markov model self-organizing map. For this purpose, a learning algorithm with low computational cost was developed. The applicability of this method was examined in experiments classifying DNA sequences of various types of genes.
In an extension of previous work, we introduce a second-order optimization method for determining optimal paths from the substrate to a target product of a metabolic network, along which the amount of the target is maximized. An objective function for this purpose, along with certain linear constraints, is considered and minimized. The basis vectors spanning the null space of the stoichiometric matrix that describes the metabolic network are computed, and their convex combinations satisfying the constraints are considered as flux vectors. A set of further constraints, incorporating weighting coefficients corresponding to the enzymes in the pathway, is also considered. These weighting coefficients appear in the objective function to be minimized. During minimization, the values of these weighting coefficients are estimated and learned. After minimization, these values represent an optimal pathway, depicting optimal enzyme concentrations, leading to overproduction of the target. The results on various networks demonstrate the usefulness of the methodology in the domain of metabolic engineering. A comparison with the standard gradient descent and the extreme pathway analysis technique is also performed. Unlike the gradient descent method, the present method,
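The null-space step mentioned in this abstract can be sketched as follows. This is only an illustration of computing basis vectors v with S v = 0 and forming one convex combination of them; it does not implement the second-order optimization or the enzyme weighting coefficients. The five-reaction toy network, the function name `null_space`, and the weight 3/4 are assumptions introduced for the example.

```python
from fractions import Fraction


def null_space(matrix):
    """Basis of {v : matrix * v = 0}, using exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][c]
        rows[r] = [x / scale for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(rows):
            break
    # Every non-pivot column is a free variable and yields one basis vector.
    basis = []
    for f in (c for c in range(n_cols) if c not in pivot_cols):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivot_cols):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


# Hypothetical toy network: rows are metabolites A, B, C; columns are
# reactions v1: -> A, v2: A -> B, v3: A -> C, v4: B -> out, v5: C -> out.
S = [
    [1, -1, -1, 0, 0],
    [0, 1, 0, -1, 0],
    [0, 0, 1, 0, -1],
]
basis = null_space(S)

# One convex combination of the two basis vectors (weights sum to 1).
w = Fraction(3, 4)
flux = [w * a + (1 - w) * b for a, b in zip(basis[0], basis[1])]
print(basis, flux)
```

In this toy case each basis vector corresponds to one route from the substrate to an exported product, so a convex combination describes how flux is split between the two routes while every internal metabolite stays balanced.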
ISBN: 9780769550138 (print)
Bioinformatics, or computational biology, is a field of science in which biology, computer science and information technology merge into a single discipline. In modern computational biology, protein secondary structure prediction is a major problem. Secondary structure prediction depends on the amino acid sequence. Current studies prefer machine learning techniques for classification and regression tasks. Recently, many researchers have used various data mining and machine learning tools for protein structure prediction. Our intention is to use a model-based (i.e., supervised learning) approach for protein secondary structure prediction, and our objective is to improve prediction of 2D protein structure using advanced machine learning techniques such as linear and non-linear support vector machines with different kernel functions. The datasets used for this problem are Protein Data Bank (PDB) sets based on the Structural Classification of Proteins (SCOP), RS126, and CB513.
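To make "different kernel functions" concrete, here is a minimal sketch of three kernels commonly paired with support vector machines: linear, polynomial, and radial basis function (RBF). The abstract does not say which kernels or parameters the authors used, so the choices of `degree`, `coef0`, and `gamma`, and the toy feature vectors, are assumptions for illustration only.

```python
import math


def linear_kernel(x, y):
    """Plain dot product; gives a linear decision boundary."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree; a non-linear kernel."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); similarity decays with distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, kernel):
    """Pairwise kernel values, the quantity an SVM solver works with."""
    return [[kernel(a, b) for b in samples] for a in samples]


# Toy feature vectors (hypothetical encodings of sequence windows).
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
linear_gram = gram_matrix(samples, linear_kernel)
rbf_gram = gram_matrix(samples, rbf_kernel)
print(linear_gram)
print(rbf_gram)
```

Swapping the kernel changes only how similarity between two encoded sequence windows is measured; the rest of the SVM training procedure stays the same.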