ISBN: 9781848161092 (digital)
ISBN: 9781848161085 (print)
High-throughput sequencing and functional genomics technologies have given us the human genome sequence as well as those of other experimentally, medically, and agriculturally important species, thus enabling large-scale genotyping and gene expression profiling of human populations. Databases containing large numbers of sequences, polymorphisms, structures, metabolic pathways, and gene expression profiles of normal and diseased tissues are rapidly being generated for human and model organisms. Bioinformatics is therefore gaining importance in the annotation of genomic sequences; the understanding of the interplay among and between genes and proteins; the analysis of the genetic variability of species; the identification of pharmacological targets; and the inference of evolutionary origins, mechanisms, and relationships. This proceedings volume contains an up-to-date exchange of knowledge, ideas, and solutions to conceptual and practical issues of bioinformatics by researchers, professionals, and industry practitioners at the 6th Asia-Pacific Bioinformatics Conference held in Kyoto, Japan, in January 2008.
ISBN: 9781860948732 (digital)
ISBN: 9781860948725 (print)
This volume contains about 40 papers covering many of the latest developments in the fast-growing field of bioinformatics. The contributions span a wide range of topics, including computational genomics and genetics, protein function and computational proteomics, the transcriptome, structural bioinformatics, microarray data analysis, motif identification, biological pathways and systems, and biomedical applications. Abstracts from the keynote addresses and invited talks are also included. The papers not only cover theoretical aspects of bioinformatics but also delve into the application of new methods, with input from the computation, engineering, and biology disciplines. This multidisciplinary approach to bioinformatics gives these proceedings a unique viewpoint of the field.
ISBN: 1860946232 (print)
The analysis of time series data is of capital importance for pharmacogenomics, since experimental evaluations are usually based on observations of time-dependent reactions or behaviors of organisms. Data mining in time series databases is thus an important instrument for understanding the effects of drugs on individuals. However, the complex nature of time series poses a major challenge for effective and efficient data mining. In this paper, we focus on the detection of temporal dependencies between different time series: we introduce the novel analysis concept of threshold queries and its semi-supervised extension, which supports parameter setting by applying training datasets. Basically, threshold queries report those time series exceeding a user-defined query threshold at certain time frames. For semi-supervised threshold queries, the corresponding threshold is automatically adjusted to the characteristics of the data set or, respectively, the training dataset. In order to support threshold queries efficiently, we present a new access method which exploits the fact that only partial information about the time series is required at query time. In an extensive experimental evaluation we demonstrate the performance of our solution and show that semi-supervised threshold queries applied to gene expression data are very worthwhile.
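To make the idea of a threshold query concrete, here is a minimal Python sketch. It is not the access method from the paper: the function names, the half-open index intervals, and the reading of "at certain time frames" as "above the threshold throughout the frame" are all assumptions made for illustration. It does show why partial information can suffice, since each series is reduced to the intervals in which it exceeds the threshold.

```python
from typing import Dict, List, Sequence, Tuple


def threshold_intervals(series: Sequence[float], tau: float) -> List[Tuple[int, int]]:
    """Return half-open index intervals [start, end) in which series > tau."""
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(series):
        if value > tau and start is None:
            start = i                      # an above-threshold run begins
        elif value <= tau and start is not None:
            intervals.append((start, i))   # the run ended just before index i
            start = None
    if start is not None:                  # run reaches the end of the series
        intervals.append((start, len(series)))
    return intervals


def threshold_query(db: Dict[str, Sequence[float]], tau: float,
                    frame: Tuple[int, int]) -> List[str]:
    """Names of series that stay above tau during the whole frame [lo, hi).

    Assumption: 'exceeding the threshold at a time frame' means the frame is
    fully covered by one above-threshold interval.
    """
    lo, hi = frame
    return [name for name, s in db.items()
            if any(a <= lo and hi <= b for a, b in threshold_intervals(s, tau))]


# Hypothetical expression profiles, for illustration only.
db = {
    "g1": [0.1, 0.9, 1.2, 1.1, 0.2],
    "g2": [0.1, 0.2, 1.5, 0.3, 0.2],
}
hits = threshold_query(db, tau=0.5, frame=(1, 3))
```

A semi-supervised variant would choose `tau` from labelled training series rather than taking it from the user; that step is not sketched here.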
ISBN: 1860946232 (print)
In this paper, we give a complete characterization of the existence of a galled-tree network in the form of simple necessary and sufficient conditions. As a by-product, we obtain a simple algorithm for constructing galled-tree networks. We also introduce a new necessary condition for the existence of a galled-tree network similar to bi-convexity.
ISBN: 1860946232 (print)
The central role phylogeny plays in biology and its pervasiveness in comparative genomics studies have led researchers to develop a plethora of methods for its accurate reconstruction. Most phylogeny reconstruction methods, though, assume a single tree underlying a given sequence alignment. While a good first approximation in many cases, a tree may not always model the evolutionary history of a set of organisms. When events such as interspecific recombination occur, different regions in the alignment may have different underlying trees. Accurate reconstruction of the evolutionary history of a set of sequences requires recombination detection, followed by separate analyses of the non-recombining regions. Besides aiding accurate phylogenetic analyses, detecting recombination helps in understanding one of the main mechanisms of bacterial genome diversification. In this paper, we introduce RECOMP, an accurate and fast method for detecting recombination events in a sequence alignment. The method slides a fixed-width window across the alignment and determines the presence of recombination events based on a combination of topology and parsimony score differences in neighboring windows. On several synthetic and biological datasets, our method performs much faster than existing tools with accuracy comparable to the best available method.
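The sliding-window idea can be sketched in a few lines of Python. This is not RECOMP: instead of comparing tree topologies and parsimony scores, the sketch uses a much cruder stand-in, namely which reference sequence is closest to a query by Hamming distance in each window. The function names, the window parameters, and the toy sequences are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def nearest_by_window(query: str, refs: Dict[str, str],
                      width: int, step: int) -> List[Tuple[int, str]]:
    """For each fixed-width window, name the reference closest to the query."""
    calls: List[Tuple[int, str]] = []
    for start in range(0, len(query) - width + 1, step):
        window = slice(start, start + width)
        # sorted() makes ties resolve deterministically by name
        best = min(sorted(refs),
                   key=lambda name: hamming(query[window], refs[name][window]))
        calls.append((start, best))
    return calls


def candidate_breakpoints(calls: List[Tuple[int, str]]) -> List[int]:
    """Window starts at which the closest reference differs from the previous window."""
    return [cur_start
            for (_, prev), (cur_start, cur) in zip(calls, calls[1:])
            if prev != cur]


# Toy aligned sequences, for illustration only.
refs = {"A": "AAAAAAAA", "B": "CCCCCCCC"}
query = "AAAACCCC"
calls = nearest_by_window(query, refs, width=4, step=4)
breaks = candidate_breakpoints(calls)
```

A change of nearest reference between neighboring windows only flags a candidate region; a real analysis would replace the distance proxy with per-window tree inference.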
ISBN: 1860946232 (print)
Clustering is widely used in gene expression analysis to group genes with similar biological function together. Traditional clustering techniques are not suitable for direct application to gene expression time series data because of the inherent properties of local regulation and time shift. To cope with these two problems, local similarity and time shift, we have developed a new similarity measurement technique called Local Similarity Combination. Finally, we run our method on real gene expression data and show that it works well.
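The time-shift problem mentioned above can be illustrated with a short Python sketch. It is not the Local Similarity Combination measure, whose definition is not given here; it only shows a common stand-in, the best Pearson correlation over a small range of shifts. The function names, the `max_shift` and `min_overlap` parameters, and the toy profiles are assumptions.

```python
import math
from typing import Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_shifted_similarity(x: Sequence[float], y: Sequence[float],
                            max_shift: int, min_overlap: int = 3) -> Tuple[float, int]:
    """Return (best correlation, shift); a positive shift compares x[t + shift] with y[t]."""
    best = (-1.0, 0)
    for shift in range(-max_shift, max_shift + 1):
        if shift >= 0:
            a, b = x[shift:], y[:len(y) - shift]
        else:
            a, b = x[:len(x) + shift], y[-shift:]
        n = min(len(a), len(b))
        if n < min_overlap:
            continue                      # too few overlapping time points
        r = pearson(a[:n], b[:n])
        if r > best[0]:
            best = (r, shift)
    return best


# Toy profiles: x repeats y one time point later.
y = [0.0, 1.0, 4.0, 2.0, 0.0, 0.0]
x = [0.0, 0.0, 1.0, 4.0, 2.0, 0.0]
r, shift = best_shifted_similarity(x, y, max_shift=2)
```

Two genes whose profiles look unrelated at shift 0 can thus score as highly similar once a lag is allowed, which is the effect a shift-aware clustering measure has to capture.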
ISBN: 1860946232 (print)
The interaction between transcription factors and their DNA binding sites plays a key role in understanding gene regulation mechanisms. Recent studies revealed the presence of "functional polymorphism" in genes, defined as regulatory variation measured in transcription levels due to cis-acting sequence differences. These regulatory variants are assumed to contribute to modulating gene functions. However, computational identification of such functional cis-regulatory variants is a much greater challenge than just identifying consensus sequences, because cis-regulatory variants differ by only a few bases from the main consensus sequences while having important consequences for organismal phenotype. None of the previous studies has directly addressed this problem. We propose a novel discriminative detection method for precisely identifying transcription factor binding sites and their functional variants from both positive and negative samples (sets of upstream sequences of genes bound and unbound by a transcription factor) based on genome-wide location data. Our goal is to find discriminative substrings that best explain the location data, in the sense that the substrings precisely discriminate the positive samples from the negative ones, rather than substrings that are simply over-represented among the positive ones. Our method consists of two steps. First, we apply a decision tree learning method to discover discriminative substrings and a hierarchical relationship among them. Second, we extract a main motif and then a second motif as a cis-regulatory variant by utilizing functional annotations. Our genome-wide experimental results on the yeast Saccharomyces cerevisiae show that our method performs significantly better at detecting experimentally verified consensus sequences than current motif detection methods. In addition, our method has successfully discovered second motifs of putative functional cis-regulato
ISBN: 1860946232 (print)
The factors governing codon and amino acid usage in the predicted protein-coding sequences of the Tropheryma whipplei TW08/27 and Twist genomes have been analyzed. Multivariate analysis identifies replicational-transcriptional selection coupled with DNA strand-specific asymmetric mutational bias as a major driving force behind the significant inter-strand variations in synonymous codon usage patterns in T. whipplei genes, while a residual intra-strand synonymous codon bias is imparted by a selection force operating at the level of translation. The strand-specific mutational pressure has little influence on amino acid usage, for which mean hydropathy level and aromaticity are the major sources of variation, both having nearly equal impact. In spite of the intracellular life-style, amino acid usage in highly expressed gene products of T. whipplei follows the cost-minimization hypothesis. Both genomes under study are characterized by the presence of two distinct groups of membrane-associated genes, the products of which exhibit significant differences in primary and potential secondary structures as well as in the propensity for protein disorder.
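The two amino acid usage indices named above are simple to compute per protein. The Python sketch below assumes the Kyte-Doolittle hydropathy scale for mean hydropathy and the fraction of Phe, Trp, and Tyr residues for aromaticity; the abstract does not state which definitions were used, so both choices, and the function names, are assumptions.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not specified in the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = frozenset("FWY")


def mean_hydropathy(protein: str) -> float:
    """Average hydropathy over all residues of a protein sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in protein) / len(protein)


def aromaticity(protein: str) -> float:
    """Fraction of residues that are Phe, Trp, or Tyr."""
    return sum(aa in AROMATIC for aa in protein) / len(protein)


# Toy peptides, for illustration only.
hydro = mean_hydropathy("AAAA")
arom = aromaticity("FWYA")
```

Computed for every predicted protein, these two values can then be correlated with the axes of a multivariate analysis of amino acid usage.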
ISBN: 1860946232 (print)
Automated discovery and extraction of biological relations from online documents, particularly MEDLINE texts, has become essential and urgent because such literature data are accumulating at a tremendous rate. In this paper, we present an ontology-based framework for a biological relation extraction system. The framework is unified and able to extract several kinds of relations, such as gene-disease, gene-gene, and protein-protein interactions. The main contributions of this paper are a two-level pattern learning algorithm and a hierarchical organization of patterns.
ISBN: 1860946232 (print)
The gene tree and species tree problem remains a central problem in phylogenomics. To overcome this problem, gene concatenation approaches have been used to combine a certain number of genes, chosen at random from a set of widely distributed orthologous genes selected from genome data, for phylogenetic analysis. The random concatenation mechanism prevents further investigation of the inner structure of the gene data set used to infer the phylogenetic trees, and prevents locating the most phylogenetically informative genes. In this work, a phylogenomic mining approach is described that gains knowledge from a gene data set by clustering its genes with a self-organizing map (SOM) to explore the inner structure of the data set. From this, the most phylogenetically informative gene set is created by picking the maximum-entropy gene from each cluster, and this set is used to infer phylogenetic trees. On the same data set, the phylogenomic mining approach performs better than the random gene concatenation approach.
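The selection step, one maximum-entropy gene per cluster, can be sketched in Python as follows. The SOM clustering itself is not reproduced: cluster labels are taken as given. The entropy definition (mean Shannon entropy over alignment columns), the function names, and the toy alignments are all assumptions, since the abstract does not say how gene entropy is computed.

```python
import math
from collections import Counter
from typing import Dict, List


def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    n = len(column)
    return -sum(c / n * math.log2(c / n) for c in Counter(column).values())


def gene_entropy(alignment: List[str]) -> float:
    """Mean column entropy of a gene alignment (assumed definition)."""
    columns = ["".join(col) for col in zip(*alignment)]
    return sum(column_entropy(col) for col in columns) / len(columns)


def pick_max_entropy_genes(alignments: Dict[str, List[str]],
                           clusters: Dict[str, int]) -> Dict[int, str]:
    """For each cluster id, keep the gene with the highest entropy."""
    best: Dict[int, str] = {}
    for gene in sorted(alignments):       # sorted for deterministic tie-breaking
        cid = clusters[gene]
        if cid not in best or gene_entropy(alignments[gene]) > gene_entropy(alignments[best[cid]]):
            best[cid] = gene
    return best


# Toy alignments (one string per species) and hypothetical cluster labels.
alignments = {
    "g1": ["AAAA", "AAAA"],
    "g2": ["ACGT", "AAAA"],
    "g3": ["AC", "CA"],
}
clusters = {"g1": 0, "g2": 0, "g3": 1}
selected = pick_max_entropy_genes(alignments, clusters)
```

The selected genes would then be concatenated and passed to a standard tree inference program, in place of a random draw from the full gene set.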