Background Asia accounts for 36.6% of global renal cell carcinoma cases, and the incidence rate of renal cell carcinoma in Korea is increasing annually. Treatment options for renal cell carcinoma are diverse and depend on clinical stage and histologic characteristics. Hence, this study aims to develop a machine learning based clinical decision-support system that recommends personalized treatment tailored to the individual health condition of each *** reviewed the real-world medical data of 1,867 participants diagnosed with renal cell carcinoma between November 2008 and June 2021 at the Pusan National University Yangsan Hospital in South Korea. Data were manually divided into a follow-up group where the patients did not undergo surgery or chemotherapy (Surveillance), a group where the patients underwent surgery (Surgery), and a group where the patients received chemotherapy before or after surgery (Chemotherapy). Feature selection was conducted to identify the significant clinical factors influencing renal cell carcinoma treatment decisions from 2,058 features. The evaluated feature sets comprised subsets of 20, 50, 75, 100, and 150 features, the complete set, and an additional set of 50 expert-selected features. We applied representative machine learning algorithms, namely Decision Tree, Random Forest, and Gradient Boosting Machine (GBM). Among the three algorithms, GBM achieved an accuracy of 95% (95% CI, 92-98%) with the 100- and 150-feature sets. GBM with 100 or 150 features outperformed the same algorithm using the features selected by clinical experts (93%, 95% CI 89-97%). Conclusions We developed a preliminary personalized treatment decision-support system (TDSS) called "RCC-Supporter" by applying machine learning (ML) algorithms to determine personalized treatment for the various clinical situations of RCC
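The abstract reports accuracy with a 95% confidence interval but does not say how the interval was computed. As a rough illustration only, the sketch below uses the normal-approximation (Wald) interval for a proportion; the choice of method and the test-set size of 200 are assumptions, not details from the study.

```python
import math


def accuracy_wald_ci(correct: int, total: int, z: float = 1.96):
    """Return (accuracy, lower, upper) using the normal-approximation interval."""
    if total <= 0:
        raise ValueError("total must be positive")
    acc = correct / total
    half_width = z * math.sqrt(acc * (1.0 - acc) / total)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)


# Hypothetical held-out set: 190 of 200 cases classified correctly.
acc, lower, upper = accuracy_wald_ci(190, 200)
print(f"accuracy={acc:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The Wald interval is the simplest option; for accuracies close to 100% or for small test sets, a Wilson or bootstrap interval is usually preferred.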
Objectives Despite easy-to-use tools like the Cohort Builder, using All of Us Research Program data for complex research questions requires a relatively high level of technical expertise. We aimed to increase research and training capacity and reduce barriers to entry for the All of Us community through an R package, allofus. In this article, we describe functions that address common challenges we encountered while working with All of Us Research Program data, and we demonstrate this functionality with an example of creating a cohort of All of Us participants by synthesizing electronic health record and survey data with time *** audience All of Us Research Program data are widely available to health researchers. The allofus R package is aimed at a wide range of researchers who wish to conduct complex analyses using best practices for reproducibility and transparency, and who have a range of experience using R. Because the All of Us data are transformed into the Observational Medical Outcomes Partnership common data model (OMOP CDM), researchers familiar with existing OMOP CDM tools or who wish to conduct network studies in conjunction with other OMOP CDM data will also find value in the *** We developed an initial set of functions that solve problems we experienced across survey and electronic health record data in our own research and in mentoring student projects. The package will continue to grow and develop with the All of Us Research Program. The allofus R package can help build community research capacity by increasing access to the All of Us Research Program data, the efficiency of its use, and the rigor and reproducibility of the resulting research.
Background Real-world evidence (RWE) plays a key role in regulatory and healthcare decision-making, but the potentially fragmented nature of generated evidence may limit its utility for clinical decision-making. Heterogeneity and a lack of reproducibility in RWE resulting from inconsistent application of methodologies across data sources should be minimized through *** paper's aim is to describe and reflect upon a multidisciplinary research platform (FOUNTAIN; FinerenOne mUlti-database NeTwork for evidence generAtIoN) with coordinated studies using diverse RWE generation approaches and explore the platform's strengths and limitations. With guidance from an executive advisory committee of multidisciplinary experts and patient representatives, the goal of the FOUNTAIN platform is to harmonize RWE generation across a portfolio of research projects, including research partner collaborations and a common data model (CDM)-based program. FOUNTAIN's overarching objectives as a research platform are to establish long-term collaborations among pharmacoepidemiology research partners and experts and to integrate diverse approaches for RWE generation, including global protocol execution by research partners in local data sources and common protocol execution in multiple data sources through federated data networks, while ensuring harmonization of medical definitions, methodology, and reproducible artifacts across all studies. Specifically, the aim of the multiple studies run within the frame of FOUNTAIN is to provide insight into the real-world utilization, effectiveness, and safety of finerenone across its ***, the FOUNTAIN platform includes 9 research partner collaborations and 8 CDM-mapped data sources from 7 countries (United States, United Kingdom, China, Japan, The Netherlands, Spain, and Denmark). These databases and research partners were selected after a feasibility fit-for-purpose evaluation. Six multicountry, multidatabase
Background: Despite numerous past endeavors for the semantic harmonization of Alzheimer's disease (AD) cohort studies, an automatic tool has yet to be developed. Objective: As cohort studies form the basis of data-driven analysis, harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool. Methods: We created a common data model (CDM) through cross-mapping data from 20 cohorts, three CDMs, and ontology terms, which was then used to fine-tune a BioBERT model. Finally, we evaluated the model using three previously unseen cohorts and compared its performance to a string-matching baseline model. Results: Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables. Conclusion: AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.
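The abstract does not describe how the string-matching baseline was built. A minimal sketch of what such a baseline could look like, using Python's standard `difflib`, is shown below; the variable names, CDM terms, and the 0.6 threshold are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and unify separators so 'age_baseline' and 'Age-Baseline' compare equal."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def best_match(variable: str, cdm_terms: list, threshold: float = 0.6):
    """Return (term, score) for the closest CDM term, or (None, score) below the threshold."""
    scored = [
        (SequenceMatcher(None, normalize(variable), normalize(term)).ratio(), term)
        for term in cdm_terms
    ]
    score, term = max(scored)
    return (term, score) if score >= threshold else (None, score)


# Hypothetical CDM terms and cohort variable names.
cdm_terms = ["Mini-Mental State Examination", "Age at baseline", "APOE genotype"]
print(best_match("age_baseline", cdm_terms))  # close spelling: matched
print(best_match("MMSE", cdm_terms))          # abbreviation: no match above threshold
```

The second call shows the typical weakness of character-level matching: an abbreviation shares few characters with its expanded term, which is the kind of case a model trained on semantic similarity is meant to handle.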
Importance The Observational Health Data Sciences and Informatics (OHDSI) is the largest distributed data network in the world encompassing more than 331 data sources with 2.1 billion patient records across 34 countries. It enables large-scale observational research through standardizing the data into a common data model (CDM) (Observational Medical Outcomes Partnership [OMOP] CDM) and requires a comprehensive, efficient, and reliable ontology system to support data *** and methods We created the OHDSI Standardized Vocabularies, a common reference ontology mandatory to all data sites in the network. It comprises imported and de novo-generated ontologies containing concepts and relationships between them, and the praxis of converting the source data to the OMOP CDM based on these. It enables harmonization through assigned domains according to clinical categories, comprehensive coverage of entities within each domain, support for commonly used international coding schemes, and standardization of semantically equivalent *** The OHDSI Standardized Vocabularies comprise over 10 million concepts from 136 vocabularies. They are used by hundreds of groups and several large data networks. More than 8600 users have performed 50 000 downloads of the system. This open-source resource has proven to address an impediment of large-scale observational research: the dependence on the context of source data representation. With that, it has enabled efficient phenotyping, covariate construction, patient-level prediction, population-level estimation, and standard *** and conclusion OHDSI has made available a comprehensive, open vocabulary system that is unmatched in its ability to support global observational research. We encourage researchers to exploit it and contribute their use cases to this dynamic resource.
Background: Several studies have investigated the relationship between ursodeoxycholic acid (UDCA) and COVID-19 ***, complex and conflicting results have generated confusion in the application of these results. Objective: We aimed to investigate whether the association between UDCA and COVID-19 infection can also be demonstrated through the analysis of a large-scale cohort. Methods: This retrospective study used local and nationwide cohorts, namely, the Jeonbuk National University Hospital into the Observational Medical Outcomes Partnership common data model cohort (JBUH CDM) and the Korean National Health Insurance Service claim-based database (NHIS). We investigated UDCA intake and its relationship with COVID-19 susceptibility and severity using validated propensity score ***: Regarding COVID-19 susceptibility, the adjusted hazard ratio (aHR) value of the UDCA intake was significantly lowered to 0.71 in the case of the JBUH CDM (95% CI 0.52-0.98) and was significantly lowered to 0.93 (95% CI 0.90-0.96) in the case of the NHIS. Regarding COVID-19 severity, the UDCA intake was found to be significantly lowered to 0.21 (95% CI 0.09-0.46) in the case of JBUH CDM. Furthermore, the aHR value was significantly lowered to 0.77 in the case of NHIS (95% CI 0.62-0.95). Conclusions: Using a large-scale local and nationwide cohort, we confirmed that UDCA intake was significantly associated with reductions in COVID-19 susceptibility and severity. These trends remained consistent regardless of the UDCA dosage. This suggests the potential of UDCA as a preventive and therapeutic agent for COVID-19 infection.
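The significance statements above can be read directly off the reported intervals: a hazard ratio differs from 1 at the 5% level when its 95% CI excludes 1. The short sketch below checks this for the four reported estimates and back-calculates an approximate standard error on the log scale. It assumes the intervals are symmetric on the log scale, and it treats the 0.21 severity estimate for JBUH CDM as an aHR, which the abstract does not state explicitly.

```python
import math

# (cohort, outcome, estimate, CI lower, CI upper) as reported in the abstract.
# Assumption: the 0.21 severity estimate for JBUH CDM is an adjusted hazard ratio.
reported = [
    ("JBUH CDM", "susceptibility", 0.71, 0.52, 0.98),
    ("NHIS", "susceptibility", 0.93, 0.90, 0.96),
    ("JBUH CDM", "severity", 0.21, 0.09, 0.46),
    ("NHIS", "severity", 0.77, 0.62, 0.95),
]


def excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True when the confidence interval does not contain the null value."""
    return upper < null or lower > null


def log_scale_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate standard error of log(HR), assuming a log-symmetric 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for cohort, outcome, hr, lower, upper in reported:
    print(cohort, outcome, hr, excludes_null(lower, upper), round(log_scale_se(lower, upper), 3))
```

The much wider interval for JBUH CDM compared with NHIS reflects the difference in scale between a single-hospital cohort and a nationwide claims database.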
Purpose To describe the development of INSIGHT, a real-world data quality tool to assess completeness, consistency, and fitness-for-purpose of observational health data *** designed a three-level pipeline with data quality assessments (DQAs) to be performed in ConcePTION common data model (CDM) instances. The pipeline has been coded using *** is an open-source tool that identifies potential data quality issues in CDM-standardized instances through the systematic execution and summary of over 588 configurable DQAs. Level 1 focuses on conformance to the ConcePTION CDM specifications. Level 2 evaluates the temporal plausibility of events and uniqueness of records. Level 3 provides an overview of distributions, outliers, and trends over time to facilitate fit-for-purpose evaluation. Levels 1 and 2 therefore check that the data are properly standardized, while level 3 provides information about the study population and potential sub-populations. The DQAs are run locally and assessed centrally by a data quality revisor together with the data access provider's *** quality is the sum of several internal and external features of the data. While DQAs can provide reassurance about fitness-for-purpose for secondary-use data sources, improvements in data collection are essential to reduce errors and enhance overall data quality for Real World *** aims to support clinical and regulatory decision-making for medicines and vaccines by evaluating the quality of observational health data sources to support fit-for-purpose assessment. Assessing and improving data quality will enhance the reliability and quality of the generated *** Registration This research was registered in EU PAS registration with number EU50142.
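To make the three levels concrete, the sketch below runs one toy check per level on a handful of in-memory records. It is not INSIGHT's code and does not use the ConcePTION CDM table layout; the field names, the checks, and the sample records are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical required fields; not the ConcePTION CDM specification.
REQUIRED_FIELDS = {"person_id", "event_code", "event_date", "birth_date"}


def level1_conformance(records):
    """Level 1: indices of records with a missing or empty required field."""
    return [
        i for i, r in enumerate(records)
        if not REQUIRED_FIELDS <= r.keys() or any(r[f] is None for f in REQUIRED_FIELDS)
    ]


def level2_plausibility(records, today):
    """Level 2: events dated before birth or in the future, plus duplicated records."""
    implausible = [
        i for i, r in enumerate(records)
        if r["event_date"] < r["birth_date"] or r["event_date"] > today
    ]
    counts = Counter((r["person_id"], r["event_code"], r["event_date"]) for r in records)
    duplicates = [key for key, n in counts.items() if n > 1]
    return implausible, duplicates


def level3_distribution(records):
    """Level 3: number of events per calendar year, for trend inspection."""
    return dict(sorted(Counter(r["event_date"].year for r in records).items()))


records = [
    {"person_id": 1, "event_code": "A", "event_date": date(2020, 3, 1), "birth_date": date(1980, 1, 1)},
    {"person_id": 1, "event_code": "A", "event_date": date(2020, 3, 1), "birth_date": date(1980, 1, 1)},
    {"person_id": 2, "event_code": "B", "event_date": date(1970, 1, 1), "birth_date": date(1990, 5, 5)},
    {"person_id": 3, "event_code": "C", "event_date": None, "birth_date": date(1975, 7, 7)},
]

nonconformant = level1_conformance(records)
# Later levels only see records that passed level 1.
clean = [r for i, r in enumerate(records) if i not in nonconformant]
print(nonconformant, level2_plausibility(clean, date(2024, 1, 1)), level3_distribution(clean))
```

Running the levels in order mirrors the idea that conformance and plausibility establish proper standardization before distributions are interpreted.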
Background Lumbar spinal stenosis (LSS) and spondylolisthesis (SPL) are characterized as degenerative spinal pathologies and share considerable similarities. However, opinions vary on whether to recommend exercise or restrict it for these diseases. Few studies have objectively compared the effects of daily physical activity on LSS and SPL because it is impossible to restrict activities ethically and practically. We investigated the effect of restricting physical activity due to social distancing (SoD) on LSS and SPL, focusing on the aspect of healthcare burden changes during the pandemic period. Methods We included first-visit patients diagnosed exclusively with LSS and SPL in 2017 and followed them up for two years before and after the implementation of the SoD policy. As controls, patients who first visited in 2015 and were followed for four years without SoD were analyzed. The common data model was employed to analyze each patient's diagnostic codes and treatments. Hospital visits and medical costs were analyzed by regression discontinuity in time to control for temporal effects on dependent variables. Results Among 33,484 patients, 2,615 with LSS and 446 with SPL were included. A significant decrease in hospital visits was observed in the LSS (difference, -3.94 times per month per 100 patients; p = 0.023) and SPL (difference, -3.44 times per month per 100 patients; p = 0.026) groups after SoD. This decrease was not observed in the data from the control group. Concerning medical costs, the LSS group showed a statistically significant reduction in median copayment (difference, -$45 per month per patient; p < 0.001) after SoD, whereas a significant change was not observed in the SPL group (difference, -$19 per month per patient; p = 0.160). Conclusion Restricted physical activity during the SoD period decreased the healthcare burden for patients with LSS; conversely, it did not significantly affect patients with SPL. Under circumstances of physical inac
Background: Acute kidney injury (AKI) is a marker of clinical deterioration and renal toxicity. While there are many studies offering prediction models for the early detection of AKI, those predicting AKI occurrence using distributed research network (DRN)-based time series data are rare. Objective: In this study, we aimed to detect the early occurrence of AKI by applying an interpretable long short-term memory (LSTM)-based model to hospital electronic health record (EHR)-based time series data in patients who took nephrotoxic drugs using a DRN. Methods: We conducted a multi-institutional retrospective cohort study of data from 6 hospitals using a DRN. For each institution, a patient-based data set was constructed using 5 drugs for AKI, and an interpretable multivariable LSTM (IMVLSTM) model was used for training. This study used propensity score matching to mitigate differences in demographics and clinical characteristics. Additionally, the temporal attention values of the AKI prediction model's contribution variables were demonstrated for each institution and drug, with differences in highly important feature distributions between the case and control data confirmed using 1-way ANOVA. Results: This study analyzed 8643 and 31,012 patients with and without AKI, respectively, across 6 hospitals. When analyzing the distribution of AKI onset, vancomycin showed an earlier onset (median 12, IQR 5-25 days), and acyclovir was the slowest compared to the other drugs (median 23, IQR 10-41 days). Our temporal deep learning model for AKI prediction performed well for most drugs. Acyclovir had the highest average area under the receiver operating characteristic curve score per drug (0.94), followed by acetaminophen (0.93), vancomycin (0.92), naproxen (0.90), and celecoxib (0.89). Based on the temporal attention values of the variables in the AKI prediction model, verified lymphocytes and calcium had the highest attention, whereas lymphocytes, albumin, and hemoglobin
Background Sharing data across institutions is critical to improving care for children who are using long-term mechanical ventilation (LTMV). Mechanical ventilation data are complex and poorly standardized. This lack of data standardization is a major barrier to data sharing. Objective We aimed to describe current ventilator data in the electronic health record (EHR) and propose a framework for standardizing these data using a common data model (CDM) across multiple populations and sites. Methods We focused on a cohort of patients with LTMV dependence who were weaned from mechanical ventilation (MV). We extracted and described relevant EHR ventilation data. We identified the minimum necessary components, termed "Clinical Ideas," to describe MV from time of initiation to liberation. We then utilized existing resources and partnered with informatics collaborators to develop a framework for incorporating Clinical Ideas into the PEDSnet CDM based on the Observational Medical Outcomes Partnership (OMOP). Results We identified 78 children with LTMV dependence who weaned from ventilator support. There were 25 unique device names and 28 unique ventilation mode names used in the cohort. We identified multiple Clinical Ideas necessary to describe ventilator support over time: device, interface, ventilation mode, settings, measurements, and duration of ventilation usage per day. We used Concepts from the SNOMED-CT vocabulary and integrated an existing ventilator mode taxonomy to create a framework for CDM and OMOP integration. Conclusion The proposed framework standardizes mechanical ventilation terminology and may facilitate efficient data exchange in a multisite network. Rapid data sharing is necessary to improve research and clinical care for children with LTMV dependence.
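The six Clinical Ideas listed above (device, interface, ventilation mode, settings, measurements, and duration of use per day) can be pictured as one structured record per patient-day, with free-text mode names mapped to a single standard label. The sketch below is only an illustration of that idea: the class, the synonym table, and the labels are hypothetical and are not taken from PEDSnet, OMOP, or SNOMED-CT.

```python
from dataclasses import dataclass, field


@dataclass
class VentilationRecord:
    """One patient-day of ventilator support, one attribute per Clinical Idea."""
    device: str
    interface: str
    ventilation_mode: str
    settings: dict = field(default_factory=dict)
    measurements: dict = field(default_factory=dict)
    hours_per_day: float = 0.0


# Hypothetical mapping from local EHR mode names to one standard label each.
MODE_SYNONYMS = {
    "simv": "SIMV",
    "simv/ps": "SIMV",
    "ps": "pressure support",
    "pressure support": "pressure support",
}


def standardize_mode(raw_name: str) -> str:
    """Collapse case and whitespace, then look up the standard label."""
    key = " ".join(raw_name.lower().split())
    return MODE_SYNONYMS.get(key, "UNMAPPED")


record = VentilationRecord(
    device="home ventilator (hypothetical)",
    interface="tracheostomy",
    ventilation_mode=standardize_mode("  SIMV/PS "),
    settings={"rate": 20.0},
    hours_per_day=12.0,
)
print(record)
```

Returning an explicit `UNMAPPED` label rather than passing the raw string through keeps unmapped local names visible, which matters when 28 distinct mode names appear in a cohort of 78 children.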