Recent years have seen explosive growth in the amount of biological data available for analysis. The volume of data collected makes it necessary to classify and sort such data automatically on a very large scale. Typically, investigators use computational sequence analysis tools to assign functions to newly found gene products. The problem is to determine the functions of an unknown gene product given its amino acid sequence. In this work we search for functional similarity between gene products by matching the functional domains that they contain. The domain-based approach addresses the main problem of sequence-based similarity, namely that the region of a gene product matched by a query sequence may be unrelated to the function of that gene product. We use the hidden Markov representation of a gene product domain as described in the PFAM database, and then infer annotations that come from the Gene Ontology. To compute domain similarity between two gene products we introduce a fuzzy Jaccard similarity measure. We tested our domain-based similarity for the functional annotation of a set of 194 gene products extracted from the ENSEMBL Web site. We compared the domain similarity approach to the traditional way of performing functional annotation using sequence-based similarity (BLAST and Smith-Waterman). The annotation was performed in all cases using a fuzzy K-nearest neighbor algorithm. We found that our domain-based annotation was better than the most common BLAST approach, but not as good as the more complex Smith-Waterman technique. The domain-based annotation has a correct annotation rate of about 70% at a false annotation rate of 17%.
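The abstract names a fuzzy Jaccard similarity between the domain sets of two gene products but does not give its formula. The sketch below assumes the standard fuzzy-set generalization of the Jaccard index (sum of minimum memberships divided by sum of maximum memberships). The function name, the dictionary representation, and the membership values are hypothetical and are not taken from the paper.

```python
def fuzzy_jaccard(a, b):
    """Fuzzy Jaccard index between two fuzzy sets.

    Each argument maps a domain identifier to a membership degree in [0, 1].
    A domain missing from a mapping has membership 0. This is the textbook
    fuzzy-set form, assumed here; the paper's exact measure may differ.
    """
    keys = set(a) | set(b)
    intersection = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    union = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    # Two empty sets have no domains in common, so report zero similarity.
    return intersection / union if union > 0 else 0.0


# Illustrative domain memberships for two gene products (made-up values).
product_p = {"PF00069": 0.9, "PF00017": 0.4}
product_q = {"PF00069": 0.7, "PF07714": 0.5}

similarity = fuzzy_jaccard(product_p, product_q)
```

With crisp memberships (all values 0 or 1) this reduces to the ordinary Jaccard index, which is why it is a natural candidate when domain matches carry a confidence score.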
The concept of surprise is central to sensory processing, adaptation and learning, attention, and decision ***, no widely-accepted mathematical theory currently exists to quantitatively characterize surprise elicited by a stimulus or event, for observers that range from single neurons to complex natural or engineered *** describe a formal Bayesian definition of surprise that is the only consistent formulation under minimal axiomatic *** quantifies how data affects a natural or artificial observer, by measuring the difference between posterior and prior beliefs of the *** this framework we measure the extent to which humans direct their gaze towards surprising items while watching television and video *** are strongly attracted to locations of high Bayesian surprise, with 72% of all human gaze shifts directed towards locations more surprising than the average, a figure which rises to 84% when considering only gaze targets simultaneously selected by all *** resulting theory of surprise is applicable across different spatio-temporal scales, modalities, and levels of abstraction.
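The abstract describes surprise as a difference between posterior and prior beliefs without stating the measure used. The sketch below assumes the Kullback-Leibler divergence from prior to posterior over a finite set of hypotheses, which is one common way to quantify such a difference. The function names and the two-hypothesis example are hypothetical.

```python
import math


def bayes_posterior(prior, likelihood):
    """Apply Bayes' rule over a finite hypothesis set.

    prior[i] is P(h_i); likelihood[i] is P(data | h_i).
    """
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(unnormalized)
    if evidence == 0:
        raise ValueError("data has zero probability under every hypothesis")
    return [u / evidence for u in unnormalized]


def surprise_bits(prior, posterior):
    """KL divergence KL(posterior || prior) in bits (assumed measure).

    Terms with zero posterior mass contribute nothing to the sum.
    """
    return sum(q * math.log2(q / p) for q, p in zip(posterior, prior) if q > 0)


prior = [0.5, 0.5]

# Data equally likely under both hypotheses: beliefs do not move.
uninformative = surprise_bits(prior, bayes_posterior(prior, [0.5, 0.5]))

# Data possible only under the first hypothesis: beliefs collapse onto it.
decisive = surprise_bits(prior, bayes_posterior(prior, [1.0, 0.0]))
```

Under this assumed measure, data that leave beliefs unchanged carry zero surprise however improbable they are, which distinguishes it from measures based only on the probability of the data.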
An intelligent system is characterized both by its ability to interpret what it observes by discovering knowledge about the world in which it survives, and by its problem-solving skill in handling each issue encountered in the ***, the abilities and skills are obtained by two types of learning via evidence or data from the *** to noise in observation and a finite sample size, learning is statistical in nature, which faces two key *** is finding appropriate mathematical representations to suit the various dependence structures underlying the world. The other is obtaining a good theory to guide learning, such that dependence structures are not only learned into mathematical representations but also learned with an appropriate complexity that matches the sample size (i.e., learning reliable structures of the underlying world). This paper consists of part *** first two parts summarize typical dependence structures for tackling challenge one and typical learning theories for tackling challenge *** third part introduces the Bayesian Ying Yang (BYY) system as a general framework that unifies typical dependence structures, and BYY harmony learning for challenge two, with several favorable *** illustrate this BYY learning, in the fourth part we further introduce the fundamentals of independence subspaces and the advances obtained from BYY harmony learning on typical independence subspaces, including PCA, MCA, DCA, ICA, FA, TFA, NFA, BFA, and LMSER, as well as their temporal ***, a concluding remark is made and new results of BYY learning in other learning areas are also briefly listed.
ISBN: 0780382781 (print)
The paper introduces the knowledge engineering (KE) approach for the modeling and discovery of new knowledge in bioinformatics. This approach extends the machine learning approach with various rule extraction and other knowledge representation procedures. Examples are given of applying the KE approach, and especially one recently developed technique, evolving connectionist systems (ECOS), to challenging problems in bioinformatics, including DNA sequence analysis, microarray gene expression profiling, protein structure prediction, finding gene regulatory networks, medical prognostic systems, and computational neurogenetic modeling.
ISBN: 0769521940 (print)
Reverse engineering of genetic networks generally requires establishing correlative behavior within and between a very large number of genes. This becomes a difficult analytical problem for even a few hundred genes, and the difficulty tends to grow exponentially as more genes are examined. Using a hybrid data analysis method known as Fractal Genomics Modeling (FGM), this problem is reduced to examining correlative behavior within small gene groups that can then be compared and integrated to produce a picture of larger networks using a type of shotgun approach. We have applied FGM toward examining genetic networks involved in HIV infection in the brain. These networks have relevance both to processes related to HIV infection and to neurodegenerative disorders. Our preliminary findings have produced conjectures of related pathways and networks as well as new candidates for genetic markers in HIV brain infection. Evidence has also been produced which appears to show the presence of a hierarchical network structure within the genes studied. We will discuss the background and methodology of FGM as well as our recent findings.
The following topics are dealt with: genomes to life; microarray data analysis; pathways, networks, and systems biology; biomedical research and visualization; data mining; pattern recognition; sequence alignment; data integration; functional genomics; genomic annotation; genotyping and SNPs; molecular simulation; phylogeny and evolution; predictive methods; sequence comparison; strings, graphs, and algorithms; structural biology; text mining and ontologies; and systems biology.
ISBN: 0769521940 (print)
Proteases play a fundamental role in the control of intra- and extracellular processes by binding and cleaving specific amino acid sequences. Identifying these targets is extremely challenging. Current computational attempts to predict cleavage sites are limited, representing these amino acid sequences as patterns or frequency matrices. Here we present PoPS, a publicly accessible bioinformatics tool (http://***/) which provides a novel method for building computational models of protease specificity that, while still based on these amino acid sequences, can be built from any experimental data or expert knowledge available to the user. PoPS specificity models can be used to predict and rank likely cleavages within a single substrate and within entire proteomes. Other factors, such as the secondary or tertiary structure of the substrate, can be used to screen out unlikely sites. Furthermore, the tool provides facilities to infer, compare and test models, and to store them in a publicly accessible database.
ISBN: 0769521940 (print)
World-wide structural genomics initiatives are rapidly accumulating structures for which limited functional information is available. Additionally, state-of-the-art structural prediction programs are now capable of generating at least low-resolution structural models of target proteins. Accurate detection and classification of functional sites within both solved and modelled protein structures therefore represents an important challenge. We present a fully automatic site detection method, FuncSite, that uses neural network classifiers to predict the location and type of functionally important sites in protein structures. The method is designed primarily to require only backbone residue positions, without the need for specific side-chain atoms to be present. To highlight effective site detection in low-resolution structural models, FuncSite was used to screen model proteins generated using mGenTHREADER on a set of newly released structures. We found effective metal site detection even for moderate-quality protein models, illustrating the robustness of the method.
ISBN: 0780382781 (print)
The proceedings contain 63 papers from the 2004 2nd International IEEE Conference Intelligent Systems - Volume 1. The topics discussed include: intelligent decision making and information fusion; computational intelligence approach to real-world cooperative vehicle dispatching problem; bioinformatics: a knowledge engineering approach; local minima free neural network learning; and automatic text summarization with neural networks.
ISBN: 0769521738 (print)
The proceedings contain 73 papers from the Fourth IEEE Symposium on Bioinformatics and Bioengineering, BIBE 2004. The topics discussed include: techniques for enhancing computation of DNA curvature molecules; towards automating an interventional radiological procedure; reducing the computational load of energy evaluations for protein folding; segmentation of the sylvian fissure in brain MR images; biomedical ontologies in post-genomic information systems; identifying significant genes from microarray data; good spaced seeds for homology search; and estimating seed sensitivity on homogeneous alignments.