This paper considers a method of syntactic analysis of Ukrainian-language text content for automatically detecting the significant keywords of input texts. It defines the role and formal features of the parser in identifying the keywords of a content topic and decomposes the procedures of the proposed method into 4 stages. Unlike well-known parsers, the proposed method lets the automated keyword identification system improve and learn on its own, through a mechanism that identifies significant statistical parameters within limits defined by the moderator. The experimental study confirmed the reliability of the method: across the various methods of processing the primary text, the average coincidence of the lists of identified keywords with the authors' lists ranges from 52.6% to 68.5%. The accuracy of matching keywords with the authors' keywords ranges from 43.6% to 62.9%. The average match of meaningful keywords compared to all keywords found by the system ranges from 38.9% to 75.8% across the stages of article text analysis, and the accuracy of matching keywords compared to all keywords found by the system ranges from 34.3% to 71.9% across the same stages. The reliability of the scientific and practical results is confirmed by materials on the implementation of the dissertation research and by comparing the practical results obtained on different samples of reliable input data. CLS was developed on the information resource http://*** using CMS Joomla! (for developing the e-framework of articles), PHP (for implementing text content processing methods), HTML (for implementing page markup), CSS (for describing page styles), and MySQL (for storing data and dictionaries). The experimental study confirmed the reliability of the method of determining keywords: for different algorithms for processing the primary text, the average coincidence of the lists of identified key...
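The abstract above reports how closely the system's keyword lists match the authors' lists but does not define the measures. The following is a minimal sketch of one plausible reading, assuming exact matching after case and whitespace normalisation. The function name `keyword_overlap` and both ratios are illustrative assumptions, not the paper's definitions.

```python
def keyword_overlap(system_keywords, author_keywords):
    """Compare a system keyword list with the author's list.

    Returns (coverage, precision):
      coverage  = share of the author's keywords that the system found
      precision = share of the system's keywords that the author also listed
    Both definitions are assumptions made for this sketch.
    """
    # Normalise so that "Keyword" and " keyword " count as the same entry.
    system = {k.strip().lower() for k in system_keywords}
    author = {k.strip().lower() for k in author_keywords}
    common = system & author
    coverage = len(common) / len(author) if author else 0.0
    precision = len(common) / len(system) if system else 0.0
    return coverage, precision


if __name__ == "__main__":
    found = ["parser", "keyword", "Ukrainian", "text"]
    given = ["Keyword", "parser", "NLP"]
    coverage, precision = keyword_overlap(found, given)
    print(f"coverage={coverage:.3f} precision={precision:.3f}")
```

Averaging either ratio over a collection of articles would give a percentage comparable in form to the ranges quoted above, though the paper may compute its figures differently.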
In order to comply with the trend of intelligent visual communication, this study proposed an innovative visual communication scenario based on image processing algorithms. The framework aims to optimize traditional k...
ISBN: (print) 9781665405409
Preventing unintentional leakage of information about the training set has high relevance for many machine learning tasks, such as medical image segmentation. While differential privacy (DP) offers mathematically rigorous protection, the high output dimensionality of segmentation tasks prevents the direct application of state-of-the-art algorithms such as Private Aggregation of Teacher Ensembles (PATE). In order to alleviate this problem, we propose to learn dimensionality-reducing transformations to map the prediction target into a bounded lower-dimensional space to reduce the required noise level during the aggregation stage. To this end, we assess the suitability of principal component analysis (PCA) and autoencoders. We conclude that autoencoders are an effective means to reduce the noise in the target variables.
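To illustrate why a lower-dimensional target helps, the sketch below adds Gaussian noise to the mean of teacher outputs and shows how the noise scale grows with the output dimension. It is a simplified stand-in rather than the paper's PATE variant. It assumes each teacher emits a vector clipped to [-bound, bound] per coordinate, that one training record influences only one teacher, and that the classic Gaussian-mechanism calibration (valid for 0 < epsilon < 1) is used. All function names are hypothetical.

```python
import math
import random


def l2_sensitivity(dim, n_teachers, bound=1.0):
    """Worst-case L2 change of the teacher mean when one teacher's
    output (a vector in [-bound, bound]^dim) is replaced."""
    return 2.0 * bound * math.sqrt(dim) / n_teachers


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classic Gaussian mechanism (0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def noisy_mean(teacher_outputs, epsilon, delta, bound=1.0, rng=None):
    """Clip each teacher output, average, and add calibrated Gaussian noise."""
    rng = rng or random.Random(0)
    n = len(teacher_outputs)
    dim = len(teacher_outputs[0])
    sigma = gaussian_sigma(l2_sensitivity(dim, n, bound), epsilon, delta)
    clipped = [[max(-bound, min(bound, v)) for v in out] for out in teacher_outputs]
    mean = [sum(col) / n for col in zip(*clipped)]
    return [m + rng.gauss(0.0, sigma) for m in mean]


if __name__ == "__main__":
    # Hypothetical sizes: a 256x256 mask versus a 64-dimensional code.
    full = gaussian_sigma(l2_sensitivity(256 * 256, 10), 0.5, 1e-5)
    code = gaussian_sigma(l2_sensitivity(64, 10), 0.5, 1e-5)
    print(f"noise scale ratio (full / code): {full / code:.1f}")
```

Under these assumptions the per-coordinate noise scale grows with the square root of the output dimension, which is why mapping targets into a bounded, lower-dimensional space (via PCA or an autoencoder) reduces the noise needed at aggregation time.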
Steganography is the practice of hiding information by embedding it as secret data within various types of digital media to strengthen security. Numerous algorithms have been proposed for image steganography with a co...
Intestinal parasitic infections in animals can cause a range of symptoms, including diarrhea, weight loss, anemia, and malnutrition. This project aims to classify parasitic eggs belonging to the Monezia and Strongyles...
The results of studies of the morphological analysis of the text are demonstrated. Application of the technology of automatic processing of Russian-language texts to determine the parts of speech presented in the digi...
Unlike diseases of the human body, plant diseases don't camouflage themselves within the body of the crop. The leaves reflect the infection with a change in color, shape, texture or a combination of the three. Hen...
Today, machine learning is regarded as one of the artificial intelligence technologies used in many ways. Its functions are very accurate, from receiving the given input data to calculating, measuring and outpu...
ISBN: (print) 9783031702419; 9783031702426
Manual data annotation of information related to geriatric syndrome (GS), which includes various conditions affecting older adults, is required to train machine learning (ML) models to classify patients' health data for information that is otherwise poorly coded in structured electronic health records. Such classification can support patients' healthcare in several ways, including early detection and diagnosis of geriatric syndrome, improved healthcare practices, and patient empowerment and education (leading to improved adherence to treatment plans and better overall health outcomes). This paper presents an annotation scheme used for labelling information related to GS (i.e. falls, frailty, dementia, etc.) from electronic health records and its application to a regional, Scottish National Health Service (NHS) dataset. The annotation scheme also captures contextual information on GS. An initial pilot, in which two annotators manually annotated 163 documents using this scheme, yielded very encouraging inter-annotator agreement (IAA) results (macro-average F1-score = 0.750). The pilot was then used to fine-tune the annotation guidelines with input from clinicians. The resulting annotation scheme is now being used for annotating patient discharge summaries, radiology reports and referral letters provided by a Scottish regional trusted research environment. So far 719 documents have been annotated, resulting in 1,684 GS annotations. We have also begun the annotation of MIMIC-IV data using the same annotation scheme. Our final aim is a cross-country comparison of GS-related concept detection using different algorithms and ML models, as well as of variations in language formulation by clinicians in the USA and Scotland/England.
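The macro-average F1-score used above as an agreement measure can be sketched as follows. This simplified version assumes each annotator assigns exactly one label per item and treats the first annotator as the reference. The paper's annotations are likely span-based with their own matching rules, so the function `macro_f1` and the toy labels are illustrative assumptions only.

```python
def macro_f1(reference, predicted):
    """Macro-average F1 between two equally long label sequences.

    F1 is computed per label, then averaged with equal weight per label,
    so rare labels count as much as frequent ones.
    """
    if len(reference) != len(predicted):
        raise ValueError("both annotators must label the same items")
    labels = sorted(set(reference) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(reference, predicted))
        tp = sum(r == label and p == label for r, p in pairs)
        fp = sum(r != label and p == label for r, p in pairs)
        fn = sum(r == label and p != label for r, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    annotator_a = ["falls", "frailty", "dementia", "falls"]
    annotator_b = ["falls", "frailty", "falls", "falls"]
    print(f"macro F1 = {macro_f1(annotator_a, annotator_b):.3f}")
```

Because per-label F1 is symmetric in false positives and false negatives, swapping which annotator is the reference leaves the score unchanged, which makes it usable as an agreement measure.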
Expensive fine-grained annotation and data scarcity have become the primary obstacles to the widespread adoption of deep learning-based whole slide image (WSI) classification algorithms in clinical practice. Unl...