Data heterogeneity leads to low resource utilization and produces a large number of information islands. This paper presents a new method for converting a global query into a local query which can be executed by OG...
Background: Regulation of cellular events is often initiated via extracellular signaling. Extracellular signaling occurs when a circulating ligand interacts with one or more membrane-bound receptors. Identification of receptor-ligand pairs is thus an important and specific form of PPI prediction. Results: Given a set of disparate data sources (expression data, domain content, and phylogenetic profile), we seek to predict new receptor-ligand pairs. We create a combined kernel classifier and assess its performance against the Database of Ligand-Receptor Partners (DLRP) "gold standard" as well as against the method proposed by Gertz et al. Among our findings, our predictions for the TGF-beta family accurately reconstruct over 76% of the supported edges (0.76 recall and 0.67 precision) of the receptor-ligand bipartite graph defined by the DLRP "gold standard". In addition, for the TGF-beta family, the combined kernel classifier improves on the method of Gertz et al. by a factor of approximately 1.5: our method has an F-measure of 0.71, while that of Gertz et al. has an F-measure of 0.48. Conclusions: The prediction of receptor-ligand pairings is a difficult and complex task. We have demonstrated that kernel learning on multiple data sources provides a stronger alternative to the existing method for solving this task.
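The reported figures can be checked with a short calculation. The sketch below assumes that "F-measure" means the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported for the TGF-beta family.
ours = f_measure(precision=0.67, recall=0.76)

# F-measure reported in the abstract for the method of Gertz et al.
baseline = 0.48

ratio = ours / baseline
print(f"F-measure: {ours:.2f}")               # prints "F-measure: 0.71"
print(f"Relative improvement: {ratio:.2f}x")  # prints "Relative improvement: 1.48x"
```

Under this assumption, the stated precision and recall reproduce the F-measure of 0.71, and the ratio to 0.48 is consistent with the "approximately 1.5" factor quoted above.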
Background: Public health triangulation is a process for reviewing, synthesising and interpreting secondary data from multiple sources that bear on the same question in order to make public health decisions. It can be used to understand the dynamics of HIV transmission and to measure the impact of public health programs. While traditional intervention research and meta-analysis would be ideal sources of information for public health decision making, they are infrequently available, and often decisions can be based only on surveillance and survey data. Methods: The process involves examination of a wide variety of data sources, covering biological, behavioral and program data, and seeks input from stakeholders to formulate meaningful public health questions. Finally and most importantly, it uses the results to inform public health decision-making. There are 12 discrete steps in the triangulation process, which include identification and assessment of key questions, identification of data sources, refining questions, gathering data and reports, assessing the quality of those data and reports, formulating hypotheses to explain trends in the data, corroborating or refining working hypotheses, drawing conclusions, communicating results and recommendations, and taking public health action. Results: Triangulation can be limited by the quality of the original data, the potential for ecological fallacy and "data dredging", and the reproducibility of results. Conclusions: Nonetheless, we believe that public health triangulation allows for the interpretation of data sets that cannot be analyzed using meta-analysis and can be a helpful adjunct to surveillance, to formal public health intervention research, and to monitoring and evaluation, which in turn lead to improved national strategic planning and resource allocation.
Background: The development of high-throughput technologies has produced several large-scale protein interaction data sets for multiple species, and significant efforts have been made to analyze these data sets in order to understand protein activities. Considering that the basic units of protein interactions are domain interactions, it is crucial to understand protein interactions at the level of the domains. The availability of many diverse biological data sets provides an opportunity to discover the domain interactions underlying protein interactions through an integration of these data sets. Results: We combine protein interaction data sets from multiple species, molecular sequences, and gene ontology to construct a set of high-confidence domain-domain interactions. First, we propose a new measure, the expected number of interactions for each pair of domains, to score domain interactions based on protein interaction data in one species, and show that its performance is similar to that of the E-value defined by Riley et al. [1]. Our new measure is applied to the protein interaction data sets from yeast, worm, fruit fly and human. Second, information on pairs of domains that coexist in known proteins and on pairs of domains with the same gene ontology function annotations is incorporated to construct a high-confidence set of domain-domain interactions using a Bayesian approach. Finally, we evaluate the set of domain-domain interactions by comparing predicted domain interactions with those defined in the iPfam database [2,3], which were derived from protein structures. The accuracy of the predicted domain interactions is also confirmed by comparison with experimentally obtained domain interactions from H. pylori [4]. As a result, a total of 2,391 high-confidence domain interactions are obtained, and these domain interactions are used to unravel detailed protein and domain interactions in several protein complexes. Conclusion: Our study shows that integration of
We have developed an application, iVici, to analyze cellular networks represented as addressable symmetric or asymmetric two-dimensional matrices. iVici was designed to permit simultaneous visualization and correlation of multiple datasets representing any relationship between a set of genes, mRNAs, or proteins. Visual overlay of datasets and addressable access to gene annotations permit comparison of networks of different types (for example, protein-protein interactions and genetic networks) or investigation of the dynamic reorganization of a particular network.