Objective: To report the development and performance of 2 distinct deep learning models trained exclusively on retinal color fundus photographs to classify Alzheimer disease (AD). Patients and Methods: Two independent...
ISBN (print): 9798350391558; 9798350379990
Accurate medical diagnosis remains elusive due to limitations in symptom analysis and personalized reports. Current systems often rely on keyword matching, leading to misdiagnosis and failing to adapt to nuanced medical narratives. This paper proposes a novel web application that addresses these challenges using state-of-the-art natural language processing techniques. Users input symptoms, triggers, and illness duration using a Likert scale. The proposed system employs a two-step process: initial diagnosis with a BM25 model followed by refinement using a pre-trained BERT model for a more contextual understanding. A comprehensive medical report is then generated, outlining the likely condition, proposing a treatment plan, and categorizing the report based on extracted keywords. Finally, K-means clustering suggests relevant specialist doctors based on the report category. This innovative approach enhances healthcare accessibility by providing tailored and swift recommendations, ultimately empowering users with informed decisions and fostering efficient doctor-patient connections.
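The first stage of the two-step process above, BM25 retrieval, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy condition descriptions, and the parameter values `k1=1.5` and `b=0.75` are all hypothetical, and the BERT refinement step is only indicated in a comment.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_candidates(query, docs, n=2):
    """Return indices of the n highest-scoring documents (the BM25 shortlist)."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Hypothetical toy descriptions of three conditions.
conditions = [
    "fever cough sore throat fatigue".split(),
    "headache nausea light sensitivity".split(),
    "itchy rash skin redness".split(),
]
symptoms = "cough fever".split()
shortlist = top_candidates(symptoms, conditions, n=2)
# A pre-trained BERT model would then re-rank only the shortlisted candidates.
```

Restricting the contextual model to a short BM25 shortlist is one common way to keep the expensive second stage fast; whether the paper uses exactly this split is not stated in the abstract.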
ISBN (digital): 9789819996148
ISBN (print): 9789819996131; 9789819996148
The field of electrical power encompasses a vast array of diverse information modalities, with textual data standing as a pivotal constituent of this domain. In this study, we harness an extensive corpus of textual data drawn from the electrical power systems domain, comprising regulations, reports, and other pertinent materials. Leveraging this corpus, we construct an Electrical Power Systems Corpus and proceed to annotate entities within this text, thereby introducing a novel Named Entity Recognition (NER) dataset tailored specifically for the electrical power domain. We employ an end-to-end deep learning model, the BERT-BiLSTM-CRF model, for named entity recognition on our custom electrical power domain dataset. This NER model integrates the BERT pre-trained model into the traditional BiLSTM-CRF model, enhancing its ability to capture contextual and semantic information within the text. Results demonstrate that the proposed model outperforms both the BiLSTM-CRF model and the BERT-softmax model in NER tasks across the electrical power domain and various other domains. This study contributes to the advancement of NER applications in the electrical power domain and holds significance for furthering the construction of knowledge graphs and databases related to electrical power systems.
ISBN (digital): 9781728153773
ISBN (print): 9781728153773
Sentiment analysis (also known as opinion mining or emotion AI) refers to the use of natural language processing (NLP) and text analysis to systematically identify, extract and quantify states and subjective information. Transfer learning (TL) [1] is an area of machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem. In NLP, recent results have also demonstrated the effectiveness of models pre-trained on a language modeling task [2], [3]. Transfer learning based models help to rapidly improve the understanding of word and sentence arrangement, so that semantics and connections are easily grasped. In this paper, we present the results of applying BERT [16], a transfer learning method, to a Vietnamese benchmark [17] for one text classification problem, Aspect Based Sentiment Analysis. The experiments were conducted on two data sets, named Hotel and Restaurant [17], in two tasks: (A) Aspect Detection and (B) Aspect Polarity. The obtained results outperform some previous systems [18]-[20] in precision, recall, and F1 measure.
ISBN (print): 9798400718144
Traditional methods often require a large number of parallel corpora, that is, texts in one language paired with corresponding texts in another language. Acquiring and constructing this data is very expensive and time-consuming, which limits the application scope and feasibility of the model. This article used mBERT and XLM-R to address the reliance on parallel corpora in traditional methods, and improved the quality of English-Chinese translation through the models' own multilingual representation ability. A publicly available English-Chinese parallel corpus was collected for translation tasks, and the collected corpus was preprocessed with segmentation, punctuation, and other techniques. mBERT and XLM-R were each used for English-Chinese translation, and an ensemble method was adopted to weight and fuse the results of mBERT and XLM-R. BLEU (Bilingual Evaluation Understudy), METEOR (Metric for Evaluation of Translation with Explicit Ordering), and ROUGE (Recall-Oriented Understudy for Gisting Evaluation) were used to evaluate the quality of the translations. According to the results, the average BLEU, METEOR, and ROUGE-L values of mBERT-XLM-R were 0.98, 0.97, and 0.98, respectively. Therefore, mBERT and XLM-R working together may significantly enhance the quality of English-Chinese translation.
ISBN (digital): 9789819996148
ISBN (print): 9789819996131; 9789819996148
Chinese Named Entity Recognition (NER) requires a model to identify entity boundaries in a sentence, i.e., entity segmentation, and at the same time assign entities to pre-defined categories, i.e., entity classification. Current NER approaches follow a sequence tagging scheme and assign characters to labels that encode both segmentation position and entity category. Under such a scheme, characters in the same entity are treated as different classes during training, depending on their positions. In fact, knowledge of entity segmentation is shared across entity categories, while entity category knowledge is relatively independent of entity segmentation. Such a labeling scheme entangles these two objectives, hindering effective knowledge acquisition by the model. To address the entanglement issue and comprehensively extract useful knowledge for the two objectives, we propose a novel framework that disentangles the original NER labels into two additional sets of training labels, one for entity segmentation and one for entity classification. We then introduce two dedicated expert models to extract specific knowledge from the disentangled labels. Afterwards, their predictions are integrated into the original model as auxiliary knowledge, further enhancing the learning process of the primary NER model. We conduct experiments on three publicly available datasets to demonstrate the effectiveness of our proposed method.
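The label disentangling step described above can be illustrated with a short sketch. It assumes a BIO tagging scheme with labels such as `B-PER` and `I-LOC`; the abstract does not say which tagging scheme the authors use, and the function name `disentangle` is hypothetical.

```python
def disentangle(labels):
    """Split BIO-style NER labels into segmentation and classification labels.

    Assumes each label is either "O" or "<position>-<category>", e.g. "B-PER".
    """
    seg, cls = [], []
    for label in labels:
        if label == "O":
            seg.append("O")
            cls.append("O")
        else:
            position, category = label.split("-", 1)
            seg.append(position)  # B or I: boundary knowledge shared by all entity types
            cls.append(category)  # entity type: independent of the position in the entity
    return seg, cls


# Hypothetical tag sequence for a six-character sentence.
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "I-LOC"]
seg_labels, cls_labels = disentangle(tags)
```

In this sketch, one expert model would be trained on `seg_labels` and another on `cls_labels`, while the primary model keeps the original `tags`.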
ISBN (print): 9798350386356; 9798350386349
Existing course recommendation systems around the world rely on students' past performance, grades, and CGPA when giving personalized suggestions. The goal of this project is to provide a customized course recommendation system for students using Machine Learning (ML) techniques. ML techniques are used to analyze feedback on teachers' teaching skills, teacher friendliness, and course difficulty from students who have already finished the course; this feedback is analyzed using semantic analysis. The software will be able to give alternative suggestions to students based on their academic performance. By combining past students' feedback with real-time suggestions, the program will give a clear perspective on the courses and the teachers. The ultimate purpose is to give students a better academic experience by offering more choices and better course recommendations to guide them through their college career.
ISBN (print): 9798350310085
Due to the rising number of sophisticated customer functionalities, electronic control units (ECUs) are increasingly integrated into modern automotive systems. However, the high connectivity between in-vehicle and external networks paves the way for hackers who could exploit vulnerabilities in in-vehicle network protocols. Among these protocols, the Controller Area Network (CAN), known as the most widely used in-vehicle networking technology, lacks encryption and authentication mechanisms, making the communications delivered by distributed ECUs insecure. Inspired by the outstanding performance of bidirectional encoder representations from transformers (BERT) in improving many natural language processing tasks, we propose in this paper "CAN-BERT", a deep learning based network intrusion detection system, to detect cyberattacks on the CAN bus protocol. We show that the BERT model can learn the sequence of arbitration identifiers (IDs) in the CAN bus for anomaly detection using the "masked language model" unsupervised training objective. The experimental results on the "Car Hacking: Attack & Defense Challenge 2020" dataset show that "CAN-BERT" outperforms state-of-the-art approaches. In addition to identifying in-vehicle intrusions in real time, within 0.8 ms to 3 ms depending on the CAN ID sequence length, it can also detect a wide variety of cyberattacks with an F1-score between 0.81 and 0.99.
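The masked language model objective mentioned above can be sketched for a window of CAN arbitration IDs. This is an illustration under assumptions rather than the CAN-BERT implementation: the function names, the 15% default masking ratio, and the top-k miss rate used as an anomaly score are all hypothetical choices, and the model that would produce the predictions is left out.

```python
import random

MASK = "[MASK]"


def mask_ids(id_sequence, mask_ratio=0.15, seed=None):
    """Replace a random subset of CAN IDs with a mask token.

    Returns the masked sequence and a dict mapping each masked position
    to the original ID the model should learn to predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(id_sequence) * mask_ratio))
    positions = sorted(rng.sample(range(len(id_sequence)), n_mask))
    masked = list(id_sequence)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = MASK
    return masked, targets


def anomaly_score(targets, top_k_predictions):
    """Fraction of masked positions whose true ID is missing from the top-k guesses.

    top_k_predictions maps each masked position to the candidate IDs a trained
    model considers most likely there (assumed scoring rule, not from the paper).
    """
    misses = sum(
        1 for pos, true_id in targets.items()
        if true_id not in top_k_predictions[pos]
    )
    return misses / len(targets)


# Hypothetical window of arbitration IDs from normal traffic.
window = ["0x130", "0x131", "0x140", "0x130", "0x131", "0x140", "0x130", "0x131"]
masked, targets = mask_ids(window, mask_ratio=0.25, seed=0)
```

A model trained only on normal traffic should predict masked IDs well, so a window with a high miss rate would be flagged as a possible intrusion.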
ISBN (print): 9781728153827
Natural language processing tasks in the health domain often deal with a limited amount of labeled data. Pretrained language models, such as bidirectional encoder representations from transformers (BERT), offer a promising way to compensate for the lack of training data. However, previous downstream tasks often used training data at a scale that is unlikely to be obtainable in the health domain. In this work, we conducted a learning curve analysis on a disease classification task to study the behavior of BERT and baseline models, and whether pretrained models can still benefit downstream tasks when training data are relatively small in the context of health NLP.
Purpose: To compare the performance of 3 phenotyping methods in identifying diabetic retinopathy (DR) and related clinical conditions. Design: Three phenotyping methods were used to identify clinical conditions including unspecified DR, nonproliferative DR (NPDR) (mild, moderate, severe), consolidated NPDR (unspecified DR or any NPDR), proliferative DR, diabetic macular edema (DME), vitreous hemorrhage, retinal detachment (RD) (tractional RD or combined tractional and rhegmatogenous RD), and neovascular glaucoma (NVG). The first method used only International Classification of Diseases, 10th Revision (ICD-10) diagnosis codes (ICD-10 Lookup System). The next 2 methods used a natural language processing (NLP) framework based on bidirectional encoder representations from transformers with a dense multilayer perceptron output layer. The NLP framework was applied either to the free text of provider notes (Text-Only NLP System) or to both free text and ICD-10 diagnosis codes (Text-and-International Classification of Diseases [ICD] NLP System). Subjects: Adults >= 18 years with diabetes mellitus seen at the Wilmer Eye Institute. Methods: We compared the performance of the 3 phenotyping methods in identifying the DR-related conditions against gold standard chart review. We also compared the estimated disease prevalence using each method. Main Outcome Measures: Performance of each method was reported as the macro F1 score. The agreement between the methods was calculated using the kappa statistic. Prevalence estimates were also calculated for each method. Results: A total of 91 097 patients and 692 486 office visits were included in the study. Compared with the gold standard, the Text-and-ICD NLP System had the highest F1 score for most clinical conditions (range 0.39-0.64). The agreement between the ICD-10 Lookup System and Text-Only NLP System varied (kappa of 0.21-0.81). The prevalence of DR and related conditions ranged from 1.1% for NVG to 17.9% for DME (using the Text-and-ICD NLP System)
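The two outcome measures named above can be computed as in the following sketch. It assumes that "kappa" means Cohen's kappa between two methods, which the abstract does not state explicitly. The function names and the four-visit toy labels are hypothetical and are not data from the study.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return sum(f1s) / len(f1s)


def cohen_kappa(a, b):
    """Chance-corrected agreement between two label sequences (Cohen's kappa)."""
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each method's label frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for four visits: 1 = condition present, 0 = absent.
chart_review = [1, 1, 0, 0]
nlp_system = [1, 0, 0, 0]
f1 = macro_f1(chart_review, nlp_system)
kappa = cohen_kappa(chart_review, nlp_system)
```

Macro averaging gives every class equal weight regardless of how common it is, which matters for a cohort in which most conditions are rare.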