Diffusion-weighted MRI (DWI) is essential for stroke diagnosis, treatment decisions, and prognosis. However, image and disease variability hinder the development of generalizable AI algorithms with clinical value. We address this gap by presenting a novel ensemble algorithm derived from the 2022 Ischemic Stroke Lesion Segmentation (ISLES) challenge. ISLES’22 provided 400 patient scans with ischemic stroke from various medical centers, facilitating the development of a wide range of cutting-edge segmentation algorithms by the research community. By assessing them against a hidden test set, we identified strengths, weaknesses, and potential biases. Through collaboration with leading teams, we combined top-performing algorithms into an ensemble model that overcomes the limitations of individual solutions. Our ensemble model combines the individual algorithms’ strengths and achieved superior ischemic lesion detection and segmentation accuracy (median Dice score: 0.82, median lesion-wise F1 score: 0.86) on our internal test set compared to individual algorithms. This accuracy generalized well across diverse image and disease variables. Furthermore, the model excelled in extracting clinical biomarkers like lesion types and affected vascular territories. Notably, in a Turing-like test, neuroradiologists consistently preferred the algorithm’s segmentations over manual expert efforts, highlighting increased comprehensiveness and precision. Validation using a real-world external dataset (N=1686) confirmed the model’s generalizability (median Dice score: 0.82, median lesion-wise F1 score: 0.86). The algorithm’s outputs also demonstrated strong correlations with clinical scores (admission NIHSS and 90-day mRS) on par with or exceeding expert-derived results, underlining its clinical relevance. This study offers two key findings. First, we present an ensemble algorithm that detects and segments ischemic stroke lesions on DWI across diverse scenarios on par with expert (neuro)rad
An electroencephalogram (EEG) is a test that detects brain activity using multiple electrodes placed on the scalp. Multiple channels of EEG signals are recorded through the electrodes and are widely used in applications such as neurological disease diagnosis, emotion recognition, and behavior modeling. Recently, deep learning methods have been applied to classify EEG signals, where the different EEG channels are mostly treated as a 2D grid input to the machine learning model. This data formation does not consider the complex connections among the EEG channels. In our work, we treat EEG signals as frames of graphs and propose an end-to-end edge-aware spatio-temporal graph convolutional neural network for EEG classification. Specifically, we iteratively apply a graph convolutional layer spatially and a standard convolutional layer temporally. Since there is no prior knowledge about the exact connections among EEG channels, our model initializes the connections as a complete graph and applies a learnable mask to capture graph structure at different levels. Furthermore, we propose an iterative method based on information aggregation in the graph convolution mechanism to reveal the latent graph structure. Empirical evaluation shows that our model achieves superior performance over state-of-the-art methods for EEG classification, and the learned and revealed latent EEG graph structure is verified to be meaningful by neuroscientists.
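The alternation of spatial graph convolution and temporal convolution described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the sigmoid gating of the mask, the row normalisation, the fixed (untrained) mask values, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def masked_adjacency(mask_logits):
    """Complete graph (every edge present) gated by a sigmoid mask, then row-normalised.

    In a trained model the logits would be learnable parameters; here they are fixed.
    """
    n = len(mask_logits)
    gated = [[1.0 / (1.0 + math.exp(-mask_logits[i][j])) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in gated]

def spatial_graph_conv(frame, adj, weight):
    """One graph-convolution step on a single frame: adj @ frame @ weight."""
    return matmul(matmul(adj, frame), weight)

def temporal_conv(frames, kernel):
    """'Valid' 1D convolution over time (no kernel flip), per channel and feature."""
    k = len(kernel)
    n_channels, n_features = len(frames[0]), len(frames[0][0])
    return [
        [[sum(kernel[d] * frames[t + d][c][f] for d in range(k)) for f in range(n_features)]
         for c in range(n_channels)]
        for t in range(len(frames) - k + 1)
    ]

# Toy input: 4 time steps, 3 EEG channels, 2 features per channel (all values hypothetical).
frames = [[[float(t + c), 1.0] for c in range(3)] for t in range(4)]
mask_logits = [[0.0] * 3 for _ in range(3)]   # zero logits give uniform neighbour averaging
weight = [[1.0, 0.0], [0.0, 1.0]]             # identity feature transform

adj = masked_adjacency(mask_logits)
spatial = [spatial_graph_conv(frame, adj, weight) for frame in frames]
output = temporal_conv(spatial, kernel=[0.5, 0.5])
```

With an all-zero mask every channel simply averages over all channels; non-zero mask entries would strengthen or weaken individual edges, which is the role the abstract assigns to the learnable mask.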
In folksonomies, users annotate items with abundant personalized tags. The tags can be used in recommendation systems to produce meaningful information. The Density-Based Spatial Clustering of Applications with Noise ...
Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for discovery of hidden semantic architecture of text datasets, and plays a fundamental role in many machine learning applications. However, like...
The Transformer is a fully attention-based alternative to recurrent networks that has achieved state-of-the-art results across a range of NLP tasks. In this paper, we analyze the structure of attention in a Transforme...
The K-means algorithm is arguably the most popular data clustering method, commonly applied to processed datasets in some "feature spaces", as is in spectral clustering. Highly sensitive to initializations, ...
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in *** intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving Findability, Accessibility, Interoperability and Reusability of digital *** has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR ***, it has also resulted in inconsistent interpretations that carry the risk of leading to incompatible ***, while the FAIR principles are formulated on a high level and may be interpreted and implemented in different ways, for true interoperability we need to support convergence in implementation choices that are widely accessible and (re)-*** introduce the concept of FAIR implementation considerations to assist accelerated global participation and convergence towards accessible, robust, widespread and consistent FAIR *** self-identified stakeholder community may either choose to reuse solutions from existing implementations, or when they spot a gap, accept the challenge to create the needed solution, which, ideally, can be used again by other communities in the ***, we provide interpretations and implementation considerations (choices and challenges) for each FAIR principle.
ISBN: (digital) 9781728108582
ISBN: (print) 9781728108599
The identification of tax evasion plays an important role in ensuring tax order, promoting the level of tax collection and management, and reducing tax losses. With the advancements in data mining technology, many machine learning techniques have yielded results in identifying tax evasion. However, to realize satisfactory performance, these models require large amounts of human annotated data. In the tax field, unlabeled tax data are abundant, data annotation in a single region is expensive, and the distributions of characteristics differ among regions; these factors pose substantial difficulties in the development of an identification model. Existing tax evasion detection methods are either trained for single-region tasks, in which case they perform poorly on inter-region tax evasion identification due to the discrepancies in feature distributions, or utilize labeled data from both the target-task field and different but related auxiliary fields to reuse and transfer knowledge of the target domain data, in which case they cannot deal with scenarios in which there are no labeled data in target audit tasks. Although current unsupervised transfer learning techniques can train models in labeled regions for unlabeled regions, large intra-class distribution discrepancies cannot be perfectly minimized in tax evasion detection scenarios. To better address the above challenges, this paper proposes a general architecture, namely, the unsupervised conditional adversarial networks (UCAN) for tax evasion detection, which is the first approach to solve audit tasks in unlabeled target domains via inter-region transfer. Our architecture establishes an adversarial neural network adding label information in the distribution adapter, which can granularly adapt the joint probability distribution (JPD) of the data. We introduce a constraint that is based on the conditional maximum mean discrepancy (CMMD) of the extracted features to align the conditional probability distribution (CPD)
This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health. Thus, finding those tweets in a user's timeline that mention specific health-related concepts such as medications requires addressing extreme imbalance. Task 3 called for detecting tweets in a user's timeline that mention a medication name and, for each detected mention, extracting its span. The organizers made available a corpus consisting of 182 049 tweets publicly posted by 212 Twitter users with all medication mentions manually annotated. The corpus exhibits the natural distribution of positive tweets, with only 442 tweets (0.2%) mentioning a medication. This task was an opportunity for participants to evaluate methods that are robust to class imbalance and go beyond simple lexical matching. A total of 65 teams registered, and 16 teams submitted a system run. This study summarizes the corpus created by the organizers and the approaches taken by the participating teams for this challenge. The corpus is freely available at https://***/tasks/biocreative-vii/track-3/. The methods and the results of the competing systems are analyzed with a focus on the approaches taken for learning from class-imbalanced data.
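As a rough illustration of the "simple lexical match" baseline the abstract contrasts against, and of why class imbalance matters for evaluation, here is a minimal sketch. The lexicon, the example tweets, and the function names are all hypothetical; none of them come from the task's corpus or from any participating system.

```python
import re

# Hypothetical toy lexicon; the lexicons actually used by participants are not given here.
LEXICON = ["ibuprofen", "tylenol", "metformin"]
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, LEXICON)) + r")\b", re.IGNORECASE)

def extract_spans(tweet):
    """Return (start, end, text) for every lexicon match; end is exclusive."""
    return [(m.start(), m.end(), m.group(0)) for m in PATTERN.finditer(tweet)]

def precision_recall_f1(gold, predicted):
    """Tweet-level scores for the positive class from parallel lists of booleans."""
    tp = sum(g and p for g, p in zip(gold, predicted))
    fp = sum((not g) and p for g, p in zip(gold, predicted))
    fn = sum(g and (not p) for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented example tweets with invented gold labels.
tweets = [
    "Took some Tylenol for this headache",
    "Traffic is terrible today",
    "my doc switched me to metformin last week",
    "advil is not working at all",   # missed: 'advil' is not in the toy lexicon
    "Great game last night!",
]
gold = [True, False, True, True, False]

predicted = [bool(extract_spans(t)) for t in tweets]
spans = extract_spans(tweets[0])
scores = precision_recall_f1(gold, predicted)

# Share of positive tweets reported in the abstract: 442 of 182 049.
positive_rate = 442 / 182049
```

At the positive rate reported above, a system that labels every tweet as negative would be about 99.8% accurate while finding no medication mentions at all, so positive-class precision, recall, and F1 are more informative than accuracy in this setting. The missed "advil" tweet shows the other weakness of pure lexicon matching: recall is bounded by lexicon coverage.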
Heterogeneous data sources produce different types of data that cannot be treated in the same way. In this paper, two sources of data are considered: image and human knowledge. The former is represented using visual ...