The breadth of biological data collected in the last decade has far outstripped the methods available to process it. To effectively investigate and explore this abundance of data, novel automated collection and analysis approaches must be devised. We have developed a new open software framework, the Open Genomic Analysis Platform (OGAP), to aid in the analysis of genomic data. It is capable of analyzing a variety of data sources and focuses on using information theory to characterize data. The framework can import a variety of genome-tied data and provides custom analysis and visualization of results. We then demonstrate the use of this framework by analyzing the organism Prochlorococcus marinus. We show a strong correlation between the information content of sequence data and upregulation of gene expression during lytic infection.
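
The abstract does not say which information-theoretic measure OGAP computes. As a minimal sketch, assuming Shannon entropy over overlapping k-mers in sliding windows, the idea of characterizing sequence data by its information content might look like the following. The function names and the toy sequence are hypothetical and are not part of OGAP.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str, k: int = 1) -> float:
    """Shannon entropy (in bits) of the overlapping k-mer distribution."""
    seq = sequence.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(sequence: str, window: int, step: int, k: int = 1) -> list[float]:
    """Entropy of each sliding window, giving a per-region information profile."""
    return [
        shannon_entropy(sequence[start:start + window], k)
        for start in range(0, len(sequence) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "ACGTACGTAAAAAAAA"  # made-up sequence, not real genome data
    print(windowed_entropy(toy, window=8, step=8))
```

A per-window profile of this kind could then be compared against expression measurements for the same regions, which is the sort of correlation the abstract reports.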
Background: Neuroscientists often need to access a wide range of data sets distributed over the Internet. These data sets, however, are typically neither integrated nor interoperable, resulting in a barrier to answering complex neuroscience research questions. Domain ontologies can enable the querying of heterogeneous data sets, but they are not sufficient for neuroscience since the data of interest commonly span multiple research domains. To this end, e-Neuroscience seeks to provide an integrated platform for neuroscientists to discover new knowledge through seamless integration of the very diverse types of neuroscience data. Here we present a Semantic Web approach to building this e-Neuroscience framework by using the Resource Description Framework (RDF) and its vocabulary description language, RDF Schema (RDFS), as a standard data model to facilitate both representation and integration of the data. Results: We have constructed a pilot ontology for BrainPharm (a subset of SenseLab) using RDFS and then converted a subset of the BrainPharm data into RDF according to the ontological structure. We have also integrated the converted BrainPharm data with existing RDF hypothesis and publication data from a pilot version of SWAN (Semantic Web Applications in Neuromedicine). Our implementation uses the RDF Data Model in Oracle Database 10g release 2 for data integration, query, and inference, while our Web interface allows users to query the data and retrieve the results in a convenient fashion. Conclusion: Accessing and integrating biomedical data that cut across multiple disciplines will be increasingly indispensable and beneficial to neuroscience researchers. The Semantic Web approach we undertook has demonstrated a promising way to semantically integrate data sets created independently. It also shows how advanced queries and inferences can be performed over the integrated data, which are hard to achieve using traditional data integration approaches. Our pilot results su
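
To illustrate the kind of query and inference that RDF and RDFS make possible over integrated data, here is a small pure-Python sketch. It does not use Oracle's RDF Data Model or any RDF library; it only applies the two standard RDFS subclass rules (rdfs9 and rdfs11) to an in-memory set of triples. All `ex:` identifiers are invented for illustration and are not taken from the BrainPharm or SWAN ontologies.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply RDFS rules rdfs9 and rdfs11 repeatedly until nothing new is inferred."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != RDFS_SUBCLASS_OF:
                continue
            for s2, p2, o2 in inferred:
                # rdfs11: subClassOf is transitive
                if p2 == RDFS_SUBCLASS_OF and s2 == o:
                    new.add((s, RDFS_SUBCLASS_OF, o2))
                # rdfs9: an instance of a subclass is an instance of the superclass
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new


def query(triples, s=None, p=None, o=None):
    """Return sorted triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    # Two hypothetical sources: a small class hierarchy and some instance data.
    ontology = {
        ("ex:Drug", RDFS_SUBCLASS_OF, "ex:Agent"),
        ("ex:Agent", RDFS_SUBCLASS_OF, "ex:Entity"),
    }
    data = {
        ("ex:drugA", RDF_TYPE, "ex:Drug"),
        ("ex:hypothesis1", "ex:mentions", "ex:drugA"),
    }
    graph = rdfs_closure(ontology | data)
    print(query(graph, p=RDF_TYPE, o="ex:Agent"))
```

The query for instances of `ex:Agent` returns `ex:drugA` even though no source states that fact directly; it follows from the class hierarchy once the two sets of triples are merged.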
The Generation Challenge Programme (GCP) is a global crop research consortium directed toward crop improvement through the application of comparative biology and genetic resources characterization to plant breeding. A key consortium research activity is the development of a GCP crop bioinformatics platform to support GCP research. This platform includes the following: (i) shared, public platform-independent domain models, ontology, and data formats to enable interoperability of data and analysis flows within the platform; (ii) web service and registry technologies to identify, share, and integrate information across diverse, globally dispersed data sources, as well as to access high-performance computational (HPC) facilities for computationally intensive, high-throughput analyses of project data; (iii) platform-specific middleware reference implementations of the domain model integrating a suite of public (largely open-access/-source) databases and software tools into a workbench to facilitate biodiversity analysis, comparative analysis of crop genomic data, and plant breeding decision making.
ISBN: 0780366573 (print)
Cluster analysis is the most important method for analyzing large-scale gene expression patterns. The matrix representation of microarray data and its successive 'optimal' incisional hyperplanes, which create a top-down hierarchical tree, are a useful platform for developing optimization algorithms to determine the 'optimal' clusters from a pairwise proximity matrix, which represents a completely connected and weighted graph. An evolution strategy is applied to determine the 'globally optimal' incisional hyperplanes for constructing the hierarchical tree structure and is tested with Fisher's iris and Golub's leukemia data sets. The results were compared with those of bottom-up hierarchical clustering, K-means, and SOM (Self-Organizing Map) algorithms, and were promising.
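
The abstract gives neither the objective function nor the variant of evolution strategy used. The sketch below therefore makes two assumptions: a cut is scored by the mean distance between its two sides, and the search is a simple (1+1) evolution strategy that flips one label per step. It also works on label vectors rather than on the hyperplane representation the paper describes. The function names and toy data are hypothetical.

```python
import random


def cut_score(dist, labels):
    """Mean distance between the two sides of a cut; -inf if one side is empty."""
    side_a = [i for i, lab in enumerate(labels) if lab == 0]
    side_b = [i for i, lab in enumerate(labels) if lab == 1]
    if not side_a or not side_b:
        return float("-inf")
    total = sum(dist[i][j] for i in side_a for j in side_b)
    return total / (len(side_a) * len(side_b))


def evolve_split(dist, iterations=500, seed=0):
    """(1+1) evolution strategy: flip one label per step, keep the child if not worse."""
    rng = random.Random(seed)
    n = len(dist)
    parent = [rng.randint(0, 1) for _ in range(n)]
    parent_score = cut_score(dist, parent)
    for _ in range(iterations):
        child = parent[:]
        child[rng.randrange(n)] ^= 1  # mutate one randomly chosen label
        child_score = cut_score(dist, child)
        if child_score >= parent_score:
            parent, parent_score = child, child_score
    return parent, parent_score


def divisive_tree(dist, items, min_size=2, seed=0):
    """Top-down tree: split `items` (indices into dist) until groups are small."""
    if len(items) <= min_size:
        return tuple(items)
    sub = [[dist[i][j] for j in items] for i in items]
    labels, _ = evolve_split(sub, seed=seed)
    left = [it for it, lab in zip(items, labels) if lab == 0]
    right = [it for it, lab in zip(items, labels) if lab == 1]
    return (
        divisive_tree(dist, left, min_size, seed),
        divisive_tree(dist, right, min_size, seed),
    )


if __name__ == "__main__":
    points = [0, 1, 2, 10, 11, 12]  # toy 1-D data with two obvious groups
    dist = [[abs(a - b) for b in points] for a in points]
    print(evolve_split(dist))
    print(divisive_tree(dist, list(range(len(points))), min_size=3))
```

Because each recursive call only sees the proximity sub-matrix of its own group, the tree is built top-down from the pairwise matrix alone, in contrast to bottom-up agglomerative clustering, which starts from single items and merges them.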