The conduct of clinical and translational research regularly involves the use of a variety of heterogeneous and large-scale data resources. Scalable methods for the integrative analysis of such resources, particularly when attempting to leverage computable domain knowledge in order to generate actionable hypotheses in a high-throughput manner, remain an open area of research. In this report, we describe both a generalizable design pattern for such integrative knowledge-anchored hypothesis discovery operations and our experience in applying that design pattern in the experimental context of a set of driving research questions related to the publicly available Osteoarthritis Initiative data repository. We believe that this 'test bed' project and the lessons learned during its execution are both generalizable and representative of common clinical and translational research paradigms.
Objective: The conduct of investigational studies that involve large-scale data sets presents significant challenges related to the discovery and testing of novel hypotheses capable of supporting in silico discovery science. Conceptual Knowledge Discovery in Databases (CKDD) methods provide a potential means of scaling hypothesis discovery and testing approaches for large data sets. Such methods enable the high-throughput generation and evaluation of knowledge-anchored relationships between complexes of variables found in targeted data sets. Methods: The authors have conducted a multipart model formulation and validation process, focusing on the development of a methodological and technical approach to using CKDD to support hypothesis discovery for in silico science. The model the authors have developed is known as the Translational Ontology-anchored Knowledge Discovery Engine (TOKEn). This model utilizes a specific CKDD approach known as Constructive Induction to identify and prioritize potential hypotheses related to the meaningful semantic relationships between variables found in large-scale and heterogeneous biomedical data sets. Results: The authors have verified and validated TOKEn in the context of a translational research data repository maintained by the NCI-funded Chronic Lymphocytic Leukemia Research Consortium. These studies have shown that TOKEn is: (1) computationally tractable; and (2) able to generate valid and potentially useful hypotheses concerning relationships between phenotypic and biomolecular variables in that data collection. Conclusions: The TOKEn model represents a potentially useful and systematic approach to knowledge synthesis for in silico discovery science in the context of large-scale and multidimensional research data sets.
Tobacco use is increasingly prevalent among vulnerable populations, such as people living in rural Appalachian communities. Owing to limited access to reliable internet service in such settings, electronic data capture tools have not been widely adopted for community-based research. By integrating the REDCap data collection application with a custom synchronization tool, the authors have enabled a workflow in which field research staff located throughout the Ohio Appalachian region can electronically collect and share research data. In addition to allowing the study data to be exchanged in near-real-time among the geographically distributed study staff and the centralized study coordinator, the system architecture also ensures that the data are stored securely on encrypted laptops in the field and centrally behind the Ohio State University Medical Center enterprise firewall. The authors believe that this approach can be easily applied to other analogous study designs and settings.