Suffix trees, which are trie structures that represent the suffixes of given sequences (e.g., strings), are widely used for sequence search in application domains such as text data mining, web intelligence, b...
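To make the idea of suffix-based sequence search concrete, here is a minimal sketch. It assumes a naive, uncompressed suffix trie built from nested dictionaries, not the compressed suffix tree or any construction used in the paper; the function names `build_suffix_trie` and `contains` are illustrative only.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a nested-dict trie (quadratic space)."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Reuse the child for ch if it exists, otherwise create it.
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern occurs as a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


trie = build_suffix_trie("banana")
print(contains(trie, "nan"))  # True
print(contains(trie, "nab"))  # False
```

Because every substring is a prefix of some suffix, a lookup only walks as many nodes as the pattern has characters. Practical suffix trees compress single-child chains to keep space linear, which this sketch does not attempt.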
ISBN: 9781467358743 (print)
Finding the molecular features that cause halophilicity in halostable organisms helps in understanding halophilic adaptation. In this study, we propose a prediction method for halophilic proteins using machine learning. The study has six stages. First, we establish a non-redundant dataset of halophilic proteins collected from the NCBI, UniProtKB and EMBL-EBI databases. The dataset consists of 245 positive and negative proteins with sequence identity
In an extension of previous work, we introduce a second-order optimization method for determining optimal paths from the substrate to a target product of a metabolic network, along which the amount of the target is maximized. An objective function for this purpose, along with certain linear constraints, is considered and minimized. The basis vectors spanning the null space of the stoichiometric matrix that describes the metabolic network are computed, and their convex combinations satisfying the constraints are taken as flux vectors. A further set of constraints, incorporating weighting coefficients corresponding to the enzymes in the pathway, is also considered. These weighting coefficients appear in the objective function to be minimized, and their values are estimated and learned during minimization. After minimization, these values represent an optimal pathway, with optimal enzyme concentrations, leading to overproduction of the target. The results on various networks demonstrate the usefulness of the methodology in metabolic engineering. A comparison with standard gradient descent and the extreme pathway analysis technique is also performed. Unlike the gradient descent method, the present method,
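The null-space step described in this abstract can be illustrated with a small sketch. It covers only that one step, under stated assumptions: the four-reaction toy network, the matrix `S`, the helper names, and the weights `3/4` and `1/4` are hypothetical, and the abstract's objective function and second-order optimizer are not reproduced here.

```python
from fractions import Fraction


def null_space_basis(matrix):
    """Return basis vectors v with matrix * v = 0, via exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivot_cols = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        # Find a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    # Each non-pivot (free) column yields one basis vector.
    basis = []
    for f in (c for c in range(n_cols) if c not in pivot_cols):
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)
        for row_idx, p in enumerate(pivot_cols):
            vec[p] = -rows[row_idx][f]
        basis.append(vec)
    return basis


def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]


# Hypothetical toy network: rows are metabolites A and B, columns are reactions
# v1 (uptake -> A), v2 (A -> B), v3 (B -> export), v4 (A -> byproduct).
S = [
    [1, -1, 0, -1],
    [0, 1, -1, 0],
]

basis = null_space_basis(S)

# A convex combination of the basis vectors (weights are non-negative and sum to 1).
weights = [Fraction(3, 4), Fraction(1, 4)]
flux = [sum(w * v[i] for w, v in zip(weights, basis)) for i in range(len(S[0]))]

# Any such combination satisfies the steady-state condition S * flux = 0.
print(mat_vec(S, flux) == [0, 0])  # True
```

Exact rational arithmetic is used so that the steady-state check is an equality, not a tolerance test; for networks of realistic size a numerical routine based on singular value decomposition would be the more usual choice.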
ISBN: 9781467358743 (print)
Recently, next-generation sequencing techniques have begun to produce huge amounts of sequencing data. Analyzing these data requires an efficient method that can handle large amounts of information. In this paper, we propose a method for classifying sets of DNA sequences using a hidden Markov model self-organizing map. For this purpose, a learning algorithm with low computational cost was developed. The usefulness of this method was examined in experiments classifying DNA sequences of various types of genes.
ISBN: 9780769550138 (print)
Bioinformatics, or computational biology, is a field of science in which biology, computer science and information technology merge into a single discipline. In modern computational biology, protein secondary structure prediction is a major problem. A protein's secondary structure depends on its amino acid sequence. Current studies prefer machine learning techniques for classification and regression tasks. Recently, many researchers have used various data mining and machine learning tools for protein structure prediction. Our intention is to use a model-based (i.e., supervised learning) approach for protein secondary structure prediction, and our objective is to improve the prediction of 2D protein structure using advanced machine learning techniques such as linear and non-linear support vector machines with different kernel functions. The datasets used for this problem are Protein Data Bank (PDB) sets based on the Structural Classification of Proteins (SCOP), RS126 and CB513.
Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed over more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to obtain approximate solutions for RNA sequences while accounting for kinetic effects, cotranscriptional folding, and the estimation of certain energy parameters. A brief description of some of the soft computing-based techniques developed for RNA secondary structure prediction is presented, along with their relevance. The basic concepts of RNA and its structural elements, such as the helix, bulge, hairpin loop, internal loop, and multiloop, are described. These are followed by different methodologies employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, such as simulated annealing, particle swarm optimization, ant colony optimization, and tabu search, is also discussed. As an example, a relative comparison of the different techniques in predicting 12 known RNA secondary structures is presented. Future challenges are then outlined.
ISBN: 9781467358743 (print)
Many computational methods have been developed to predict protein crystallization. Most methods use amino acid and dipeptide compositions as part of the informative features. To improve prediction accuracy, support vector machine (SVM) based classifiers and ensemble approaches have been effective and commonly used. However, these techniques offer little interpretable insight into crystallization. In this study, we use a newly developed scoring card method (SCM) with a dipeptide composition feature to predict protein crystallization. The SCM classifier achieves an accuracy of 74%, a sensitivity of 0.55 and a specificity of 0.83, which is comparable to the SVM classifier on the same benchmarks. The experimental results show that the SCM classifier has the advantages of simplicity, high interpretability, and high accuracy in predicting protein crystallization, compared with existing SVM-based ensemble classifiers.
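The dipeptide composition feature mentioned in this abstract can be sketched as follows. The composition itself is the standard 400-dimensional frequency vector over ordered amino acid pairs. The `score` function is only an assumption about how a scoring-card-style classifier might combine the composition with per-dipeptide propensity scores; the abstract does not give the scoring rule, and the `propensity` values below are made up for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered pairs of standard amino acids.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400 dipeptide frequencies of a protein sequence."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    return [counts[d] / total if total else 0.0 for d in DIPEPTIDES]


def score(sequence, propensity):
    """Hypothetical scoring rule: composition-weighted sum of propensity scores."""
    composition = dipeptide_composition(sequence)
    return sum(c * propensity.get(d, 0.0) for c, d in zip(composition, DIPEPTIDES))


composition = dipeptide_composition("MKVLAAGM")
print(len(composition))  # 400

# Made-up propensity scores, for illustration only.
propensity = {"AA": 700.0, "MK": 350.0}
print(score("MKVLAAGM", propensity))  # 150.0
```

A weighted sum of this kind is easy to inspect, since each dipeptide's contribution to the final score can be read off directly, which matches the interpretability advantage the abstract claims for SCM over SVM-based ensembles.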
ISBN: 9781467358743 (print)
In this study, we propose an early lung cancer detection methodology using nucleus-based features. First, sputum samples from patients are labeled with Tetrakis Carboxy Phenyl Porphine (TCPP), and fluorescent images of these samples are taken. TCPP is a porphyrin that can assist in labeling lung cancer cells through the increased number of low-density lipoproteins coating the surface of cancer cells. We study the performance of well-known machine learning techniques for lung cancer detection on the Biomoda dataset. In our previous work, we obtained an accuracy of 81% using 71 features related to shape, intensity and color. By adding the nucleus-segmented features, we improved the accuracy to 87%. Nucleus segmentation is performed using the seeded region growing segmentation method. Our results demonstrate the potential of nucleus-segmented features for detecting lung cancer.
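The seeded region growing step named in this abstract can be sketched in simplified form. This version assumes a single seed, 4-connectivity, and a fixed intensity tolerance measured from the seed pixel; the paper's actual seed selection and growth criterion are not given in the abstract, and the toy `image` below is invented.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from seed over pixels within tolerance of the seed intensity."""
    n_rows, n_cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < n_rows and 0 <= nc < n_cols
            if (inside and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Invented 4 x 4 grayscale patch with a bright 2 x 2 "nucleus" in the middle.
image = [
    [10, 12, 11, 90],
    [11, 200, 205, 95],
    [12, 198, 210, 92],
    [10, 11, 12, 91],
]

nucleus = region_grow(image, seed=(1, 1), tolerance=15)
print(sorted(nucleus))  # [(1, 1), (1, 2), (2, 1), (2, 2)]

# Two simple nucleus features of the kind a classifier could consume.
area = len(nucleus)
mean_intensity = sum(image[r][c] for r, c in nucleus) / area
print(area, mean_intensity)  # 4 203.25
```

Features such as the area and mean intensity of the segmented region are examples of nucleus-level descriptors; which features the authors actually added to their 71 existing ones is not stated in the abstract.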