In the era of data-driven machine learning, data is the new oil. Machine learning algorithms require large, heterogeneous datasets that, crucially, are correctly labeled. However, data collection and labeling are time-consuming and labor-intensive processes. One task we solve using machine learning is the segmentation of medical devices in echocardiographic images during minimally invasive surgery. The lack of data motivated us to develop an algorithm that generates synthetic samples from real datasets. The idea of the algorithm is to place a medical device (a catheter) in an empty cavity of an anatomical structure, for example a heart chamber, and then transform it. To create random transformations of the catheter, the algorithm uses a coordinate system that uniquely identifies each point regardless of the bend and shape of the object. We take a cylindrical coordinate system as a basis and modify it by replacing the Z-axis with a spline along which the h-coordinate is measured. Using the proposed algorithm, we generated new images with the catheter inserted into different heart cavities while varying its location and shape. We then compared deep neural networks trained on datasets of real and synthetic data. The network trained on both real and synthetic data performed more accurate segmentation than the model trained only on real data: a modified U-Net trained on the combined dataset achieved a Dice similarity coefficient of 92.6 ± 2.2%, while the same model trained only on real samples achieved 86.5 ± 3.6%. Using a synthetic dataset reduced the accuracy spread and improved the generalization of the model. Notably, the proposed algorithm reduces subjectivity, minimizes the labeling routine, increases the number of samples, and improves the heterogeneity of the dataset.
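The abstract describes the spline-based coordinate system only in prose. A minimal 2D sketch of the idea, with hypothetical names (`spline_frame`, `to_image_coords`) and assuming SciPy is available, could look like this: fit a spline through catheter control points, measure the h-coordinate as cumulative arc length along it, and place a point at offset r along the local normal.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_frame(control_points, n_samples=500):
    """Fit a spline through the catheter's control points and return sampled
    centerline points, cumulative arc length (the h-coordinate), and normals."""
    t = np.linspace(0.0, 1.0, len(control_points))
    cs = CubicSpline(t, control_points, axis=0)
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = cs(ts)                                  # (n_samples, 2) centerline
    tang = cs(ts, 1)                              # tangent = first derivative
    tang /= np.linalg.norm(tang, axis=1, keepdims=True)
    # h-coordinate: arc length accumulated along the sampled centerline
    h = np.concatenate([[0.0],
                        np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))])
    # 2D normal: tangent rotated by 90 degrees
    normal = np.stack([-tang[:, 1], tang[:, 0]], axis=1)
    return pts, h, normal

def to_image_coords(pts, h, normal, h_query, r_query):
    """Map spline coordinates (h, r) back to image coordinates:
    walk h along the centerline, then step r along the local normal."""
    i = min(np.searchsorted(h, h_query), len(pts) - 1)
    return pts[i] + r_query * normal[i]
```

Under these assumptions, a random transformation amounts to jittering the control points: every catheter pixel keeps its (h, r) pair regardless of the bend, and `to_image_coords` re-renders it against the new centerline.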
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multicenter study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses based on comprehensive descriptions of the submitted algorithms, linked to their ranks and the underlying participation strategies, revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and post-processing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Type 2 diabetes (T2D) is a heterogeneous disease that develops through diverse pathophysiological processes and molecular mechanisms that are often specific to cell type. Here, to characterize the genetic contribution to these processes across ancestry groups, we aggregate genome-wide association study data from 2,535,601 individuals (39.7% not of European ancestry), including 428,452 cases of T2D. We identify 1,289 independent association signals at genome-wide significance (P < 5 × 10⁻⁸) that map to 611 loci, of which 145 loci are, to our knowledge, previously unreported. We define eight non-overlapping clusters of T2D signals that are characterized by distinct profiles of cardiometabolic trait associations. These clusters are differentially enriched for cell-type-specific regions of open chromatin, including pancreatic islets, adipocytes, endothelial cells and enteroendocrine cells. We build cluster-specific partitioned polygenic scores in a further 279,552 individuals of diverse ancestry, including 30,288 cases of T2D, and test their association with T2D-related vascular outcomes. Cluster-specific partitioned polygenic scores are associated with coronary artery disease, peripheral artery disease and end-stage diabetic nephropathy across ancestry groups, highlighting the importance of obesity-related processes in the development of vascular outcomes. Our findings show the value of integrating multi-ancestry genome-wide association study data with single-cell epigenomics to disentangle the aetiological heterogeneity that drives the development and progression of T2D. This might offer a route to optimize global access to genetically informed diabetes care.
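Per individual, a cluster-specific partitioned polygenic score reduces to a weighted allele count restricted to one cluster's variants. A minimal sketch of that arithmetic, with hypothetical function and argument names and not the study's actual pipeline:

```python
import numpy as np

def partitioned_pgs(dosages, betas, cluster_ids, n_clusters):
    """Cluster-specific partitioned polygenic scores.

    dosages     : (n_individuals, n_variants) allele dosage matrix (0..2)
    betas       : (n_variants,) per-variant effect sizes from the GWAS
    cluster_ids : (n_variants,) cluster assignment (0 .. n_clusters-1)

    Each output column sums beta * dosage over the variants assigned to
    that cluster only, giving one score per cluster per individual.
    """
    scores = np.zeros((dosages.shape[0], n_clusters))
    for c in range(n_clusters):
        mask = cluster_ids == c
        scores[:, c] = dosages[:, mask] @ betas[mask]
    return scores
```

Each column of the result can then be tested separately against vascular outcomes such as coronary artery disease, which is what makes the partitioning informative.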
Background: In an era of shifting global agendas and expanded emphasis on non-communicable diseases and injuries along with communicable diseases, sound evidence on trends by cause at the national level is essential. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) provides a systematic scientific assessment of published, publicly available, and contributed data on incidence, prevalence, and mortality for a mutually exclusive and collectively exhaustive list of diseases and injuries. Methods: GBD estimates incidence, prevalence, mortality, years of life lost (YLLs), years lived with disability (YLDs), and disability-adjusted life-years (DALYs) due to 369 diseases and injuries, for two sexes, and for 204 countries and territories. Input data were extracted from censuses, household surveys, civil registration and vital statistics, disease registries, health service use, air pollution monitors, satellite imaging, disease notifications, and other sources. Cause-specific death rates and cause fractions were calculated using the Cause of Death Ensemble model and spatiotemporal Gaussian process regression. Cause-specific deaths were adjusted to match the total all-cause deaths calculated as part of the GBD population, fertility, and mortality estimates. Deaths were multiplied by standard life expectancy at each age to calculate YLLs. A Bayesian meta-regression modelling tool, DisMod-MR 2.1, was used to ensure consistency between incidence, prevalence, remission, excess mortality, and cause-specific mortality for most causes. Prevalence estimates were multiplied by disability weights for mutually exclusive sequelae of diseases and injuries to calculate YLDs. We considered results in the context of the Socio-demographic Index (SDI), a composite indicator of income per capita, years of schooling, and fertility rate in females younger than 25 years. Uncertainty intervals (UIs) were generated for every metric using the 25th and 975th ordered 1000 draw values of the posterior distribution.
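As a toy illustration of the arithmetic described in the methods, and not the GBD estimation machinery itself (all names are hypothetical), the burden metrics and the uncertainty intervals reduce to:

```python
import numpy as np

def burden_metrics(deaths, life_expectancy, prevalent_cases, disability_weight):
    """YLLs weight deaths by remaining standard life expectancy at the age of
    death; YLDs weight prevalent cases by a disability weight; DALYs sum both."""
    ylls = deaths * life_expectancy
    ylds = prevalent_cases * disability_weight
    return ylls, ylds, ylls + ylds

def uncertainty_interval(draws):
    """95% UI from 1000 posterior draws: the 25th and 975th ordered values,
    i.e. the 2.5th and 97.5th percentiles."""
    ordered = np.sort(np.asarray(draws))
    assert ordered.size == 1000, "UIs here are defined over 1000 draws"
    return ordered[24], ordered[974]   # 25th and 975th ordered draws (1-indexed)
```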
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrot...
A1 Highlights from the eleventh ISCB Student Council Symposium 2015 – Katie Wilkins, Mehedi Hassan, Margherita Francescatto, Jakob Jespersen, R. Gonzalo Parra, Bart Cuypers, Dan DeBlasio, Alexander Junge, Anupama Jigisha, Farzana Rahman
O1 Prioritizing a drug’s targets using both gene expression and structural similarity – Griet Laenen, Sander Willems, Lieven Thorrez, Yves Moreau
O2 Organism specific protein-RNA recognition: A computational analysis of protein-RNA complex structures from different organisms – Nagarajan Raju, Sonia Pankaj Chothani, C. Ramakrishnan, Masakazu Sekijima, M. Michael Gromiha
O3 Detection of Heterogeneity in Single Particle Tracking Trajectories – Paddy J Slator, Nigel J Burroughs
O4 3D-NOME: 3D NucleOme Multiscale Engine for data-driven modeling of three-dimensional genome architecture – Przemysław Szałaj, Zhonghui Tang, Paul Michalski, Oskar Luo, Xingwang Li, Yijun Ruan, Dariusz Plewczynski
O5 A novel feature selection method to extract multiple adjacent solutions for viral genomic sequences classification – Giulia Fiscon, Emanuel Weitschek, Massimo Ciccozzi, Paola Bertolazzi, Giovanni Felici
O6 A Systems Biology Compendium for Leishmania donovani – Bart Cuypers, Pieter Meysman, Manu Vanaerschot, Maya Berg, Hideo Imamura, Jean-Claude Dujardin, Kris Laukens
O7 Unravelling signal coordination from large scale phosphorylation kinetic data – Westa Domanova, James R. Krycer, Rima Chaudhuri, Pengyi Yang, Fatemeh Vafaee, Daniel J. Fazakerley, Sean J. Humphrey, David E. James, Zdenka Kuncic
Transcranial Doppler ultrasound can be used to detect circulating cerebral emboli. Embolic signals have characteristic transient chirps suitable for wavelet analysis. We have implemented and evaluated the first online selective wavelet transient enhancement filter to amplify embolic signals in a preprocessing system. Our approach is similar to wavelet de-noising for signal enhancement, but, in order to retain blood flow information, we do not use traditional thresholding methods. The selective wavelet amplifier uses the matched-filter properties of wavelets to significantly enhance embolic signals and improve classification performance using a novel noise-tolerant approach. Even the smallest embolic signals are enhanced. We show an increase of over 2 dB (on average) in embolic signal strength and a significant improvement in detection accuracy when our filter is applied to both a commercially available detection system and an in-house frequency-based detection system. The implementation of the filter is simplified by using an optimized matrix form; the block matrix form is significantly faster than the normal recursive discrete wavelet transform implementation.
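The filter itself is not reproduced here. A minimal sketch of the "amplify rather than threshold" idea using PyWavelets follows; the wavelet choice and the `gain` and noise-multiplier `k` parameters are illustrative assumptions, and the paper's matched-filter design and block matrix implementation are not reconstructed:

```python
import numpy as np
import pywt

def selective_wavelet_amplify(signal, wavelet="db4", level=4, gain=2.0, k=3.0):
    """Selective wavelet transient amplifier (sketch). Unlike de-noising,
    coefficients below the noise level are left unchanged, preserving the
    blood-flow signal, while coefficients that stand out above a robust
    noise estimate, as embolic chirps do, are multiplied by a gain."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    out = [coeffs[0]]                                # keep approximation band
    for d in coeffs[1:]:
        # per-band noise estimate from the median absolute deviation
        sigma = np.median(np.abs(d)) / 0.6745
        out.append(np.where(np.abs(d) > k * sigma, gain * d, d))
    return pywt.waverec(out, wavelet)
```

In this sketch a coefficient gain of 2 raises boosted components by about 6 dB in amplitude; in practice the gain and noise multiplier would be tuned against detection performance on labeled embolic recordings.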