ISBN (digital): 9798350362480
ISBN (print): 9798350362497
In an era of information overload, manually annotating the vast and growing corpus of documents and scholarly papers is increasingly impractical. Automated keyphrase extraction addresses this challenge by identifying representative terms within texts. However, most existing methods focus on short documents (up to 512 tokens), leaving a gap in processing long-context documents. In this paper, we introduce LongKey, a novel framework for extracting keyphrases from lengthy documents, which uses an encoder-based language model to capture extended text intricacies. LongKey uses a max-pooling embedder to enhance keyphrase candidate representation. Validated on the comprehensive LDKP datasets and six diverse, unseen datasets, LongKey consistently outperforms existing unsupervised and language model-based keyphrase extraction methods. Our findings demonstrate LongKey’s versatility and superior performance, marking an advancement in keyphrase extraction for varied text lengths and domains.
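The abstract does not say what the max-pooling embedder pools over. The sketch below assumes one plausible reading: every occurrence of a candidate phrase has its own embedding vector, and the candidate is represented by the element-wise maximum over those vectors. The function name and the toy vectors are hypothetical and are not taken from LongKey.

```python
def max_pool_candidate(occurrence_embeddings):
    """Element-wise max over all occurrence embeddings of one candidate phrase."""
    if not occurrence_embeddings:
        raise ValueError("candidate has no occurrences")
    # zip(*...) groups the values of each dimension across occurrences.
    return [max(values) for values in zip(*occurrence_embeddings)]


# Hypothetical 3-dimensional embeddings for two occurrences of the same phrase.
occurrences = [
    [0.1, 0.9, -0.3],
    [0.4, 0.2, -0.1],
]
pooled = max_pool_candidate(occurrences)
print(pooled)
```

Under this assumption, each dimension keeps the strongest signal from any occurrence, so the representation does not depend on how many times the phrase appears in a long document.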
Organizations face an urgent need to become faster, more efficient, and more successful in their commercial operations, given the current state of technological evolution and the arrival of Industry 4.0. Many business...
In this article, we present the mathematical analysis of the convergence of the linearized Crank-Nicolson Galerkin method for a nonlinear Schrödinger problem related to a domain with a moving boundary. The conver...
Micro, Small, and Medium Enterprises (MSMEs) have an important role in improving the economy of small communities and the stability of the Indonesian economy. However, they still face various problems such as capital,...
ISBN (digital): 9798350371499
ISBN (print): 9798350371505
Type 2 diabetes (T2D) is a prevalent chronic illness with many different options for treatment management. Continuous glucose monitors (CGM) offer physiological data that clinicians can access when making treatment decisions. However, the utility of CGM in management of T2D remains an active area of research. In our work, we demonstrate the feasibility of exploiting raw daily CGM data to estimate the physiological parameters of insulin sensitivity and beta-cell function that correlate with estimates derived from laboratory findings. We use a peak extraction algorithm to extract peaks from daily CGM data and implement a model-based approach to infer physiological parameters. We demonstrate that the inferred parameter estimates of insulin sensitivity and beta-cell function correlate to the ground truth measurements as determined by an oral glucose tolerance test (OGTT).
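The abstract does not describe the peak extraction algorithm. As a rough illustration only, the sketch below marks a reading as a peak when it is a local maximum and rises by at least `min_rise` above the lowest reading in a short lookback window. The function name, both thresholds, and the sample glucose values (mg/dL) are assumptions made for this example, not the authors' method or data.

```python
def extract_peaks(readings, min_rise=30.0, lookback=6):
    """Return indices of local maxima that rise at least min_rise above
    the lowest reading in the preceding lookback window."""
    peaks = []
    for i in range(1, len(readings) - 1):
        is_local_max = readings[i] > readings[i - 1] and readings[i] >= readings[i + 1]
        if not is_local_max:
            continue
        start = max(0, i - lookback)
        baseline = min(readings[start:i])
        if readings[i] - baseline >= min_rise:
            peaks.append(i)
    return peaks


# Hypothetical CGM trace with two meal-like excursions and one small bump.
cgm = [95, 98, 110, 140, 165, 150, 120, 100, 97, 99, 96, 130, 170, 160, 125, 105]
peaks = extract_peaks(cgm)
print(peaks)
```

The small bump at index 9 is rejected because it rises only 2 mg/dL above its local baseline; the two large excursions are kept and could then be passed to a model-based fitting step.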
ISBN (digital): 9798350389654
ISBN (print): 9798350389661
Eye disorders such as cataracts, glaucoma, and diabetic retinopathy can cause abnormalities in eye function, including blindness. This research aims to develop machine-learning models for detecting eye diseases and to identify the best-performing one. Deep learning models based on the Convolutional Neural Network (CNN) are trained with the RMSProp, SGD, and Adam optimizers. The dataset consists of 4217 fundus images in 4 classes: normal (1074 images), glaucoma (1007 images), cataract (1038 images), and diabetic retinopathy (1098 images). The evaluation shows that the CNN model with the RMSProp optimizer performs best, with 88% accuracy, 88% precision, 87% recall, and an 88% F1-score. The AUC for each class is above 0.90, which means that the model separates the classes well and falls in the Excellent classification category.
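The abstract reports accuracy, precision, recall, and F1-score but does not say how the per-class values were averaged. The sketch below assumes macro averaging over a confusion matrix whose rows are true classes and whose columns are predicted classes. The 4x4 matrix is an invented toy example, not the paper's results.

```python
def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall, and F1 from a square
    confusion matrix (rows: true class, columns: predicted class)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[r][k] for r in range(n))
        actual_k = sum(confusion[k])
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy matrix for four classes (normal, glaucoma, cataract, diabetic retinopathy).
toy_confusion = [
    [8, 1, 1, 0],
    [1, 8, 0, 1],
    [0, 1, 9, 0],
    [1, 0, 0, 9],
]
metrics = macro_metrics(toy_confusion)
print(metrics)
```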
The approaches that currently constitute the state-of-the-art for the task of regression on continuous data streams usually involve ensembles, regression trees, and regression rules. They have been found to work very well for certain situations but generally consume computational resources to a prohibitive extent. In this paper, we propose a new method based on an ensemble of linear regressions for the regression task adapted to handle continuous data streams. The technique has been named Adaptive Linear Regression (ALR). The algorithm combines strategies that contribute to high prediction accuracy using (i) distinct sliding window sizes for training each ensemble element, and (ii) a dynamic regressor selection method for final ensemble voting. After an extensive experimental study, ALR was found to exhibit high predictive performance and outperform state-of-the-art ensemble regressors on data streams for real and synthetic datasets. Moreover, it exhibits low processing time in its parallel version and is faster than ARF-Reg in its serial version. The paper also presents an analysis of how the choice of sliding window size for training favors accuracy.
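The two strategies named in the abstract, distinct sliding window sizes per ensemble member and dynamic regressor selection, can be illustrated with a much simplified sketch. It assumes a single input feature, ordinary least squares refit on each window, and selection of the one member with the lowest exponentially weighted recent error. All class names, the decay factor, and the synthetic stream are hypothetical; this is not the ALR algorithm as published.

```python
from collections import deque


class WindowedLinearRegressor:
    """One-feature least-squares line fitted on the last window_size points."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque(maxlen=window_size)
        self.recent_error = 0.0

    def predict(self, x):
        n = len(self.window)
        if n == 0:
            return 0.0
        mean_x = sum(px for px, _ in self.window) / n
        mean_y = sum(py for _, py in self.window) / n
        var_x = sum((px - mean_x) ** 2 for px, _ in self.window)
        if var_x == 0.0:
            return mean_y
        cov_xy = sum((px - mean_x) * (py - mean_y) for px, py in self.window)
        return mean_y + (cov_xy / var_x) * (x - mean_x)

    def update(self, x, y, decay=0.9):
        # Test-then-train: score the prediction before learning the new point.
        error = abs(self.predict(x) - y)
        self.recent_error = decay * self.recent_error + (1 - decay) * error
        self.window.append((x, y))


class WindowEnsemble:
    """Ensemble whose members differ only in sliding window size."""

    def __init__(self, window_sizes):
        self.members = [WindowedLinearRegressor(w) for w in window_sizes]

    def select(self):
        # Dynamic selection: the member with the lowest recent error.
        return min(self.members, key=lambda m: m.recent_error)

    def predict(self, x):
        return self.select().predict(x)

    def update(self, x, y):
        for member in self.members:
            member.update(x, y)


# Synthetic stream with a concept drift at x = 30.
ensemble = WindowEnsemble(window_sizes=[5, 40])
for x in range(60):
    y = 2 * x if x < 30 else 90 - x
    ensemble.update(x, y)

chosen_window = ensemble.select().window_size
prediction = ensemble.predict(60)
print(chosen_window, prediction)
```

After the drift, the short-window member recovers quickly, so the selector switches to it. This hints at why the choice of window size matters for accuracy.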
The Ministry of Trade of the Republic of Indonesia set a highest retail price for cooking oil and then cancelled it in early 2022 after people complained about the difficulty of getting cooking oil in the market; the policy drew many responses from the public on social media. This study conducts sentiment analysis using Word2Vec and Bidirectional Long Short-Term Memory (BLSTM) to study the public's response to the removal of the highest retail price policy, and to examine the effect of the architecture (skip-gram (SG) or continuous bag-of-words (CBOW)), window size, and vector dimension on word similarity and accuracy. Data were collected from Twitter in March 2022. The data then go through preprocessing, manual labeling by experts, and word embedding using Word2Vec before being classified with BLSTM. Labeling produces imbalanced data in which neutral sentiment is the largest class (53%), followed by negative (27%) and positive (20%). Some of the words, such as ‘mafia’, ‘mahal’, ‘langka’, ‘tangkap’, ‘bongkar’, and ‘mendag’, could merit the government's attention. The SG model tends to give higher similarity values than CBOW at lower word vector dimensions. The best-performing model is the Word2Vec model with 50 dimensions, SG architecture, and window size 5, which reaches the highest accuracy, 0.71, and shows only slight signs of overfitting. Signs of overfitting become more visible as the dimension increases.
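To make the SG/CBOW distinction and the role of window size concrete, the sketch below builds the training examples each architecture would see for one tokenized sentence. It is a simplified illustration: real Word2Vec implementations add details such as subsampling and negative sampling, which are omitted here. The function names and the four-token sentence are made up for the example.

```python
def context_bounds(i, n_tokens, window):
    """Index range [lo, hi) of the context around position i."""
    return max(0, i - window), min(n_tokens, i + window + 1)


def skipgram_pairs(tokens, window):
    """Skip-gram: one (center, context_word) pair per context word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = context_bounds(i, len(tokens), window)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window):
    """CBOW: one (context_words, center) example per position."""
    examples = []
    for i, center in enumerate(tokens):
        lo, hi = context_bounds(i, len(tokens), window)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:
            examples.append((context, center))
    return examples


# Made-up tokenized tweet, not taken from the study's data.
tokens = ["minyak", "goreng", "langka", "mahal"]
sg = skipgram_pairs(tokens, window=1)
cbow = cbow_examples(tokens, window=1)
print(sg)
print(cbow)
```

A larger window adds more distant words to each context, which increases the number of skip-gram pairs and widens the input that CBOW combines for each center word.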
Maintaining a healthy lifestyle has been proven to have significant benefits in cancer survivorship. Clinicians are expected to have a comprehensive understanding of publicly available cancer lifestyle guidelines and to convey that information effectively to patients. The objective of this paper is to develop an automatic text analysis method to assess how well the lifestyle information provided during medical visits complies with publicly available guidelines. Preliminary results show that selected lifestyle keywords appear an average of 3.54 times per medical visit, and 7% of medical notes pertain to patients' lifestyle. Semantic analysis and a word dictionary will be applied to evaluate the extent of information compliance and to inform strategies for improving lifestyle recommendations for cancer survivors.
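The two preliminary statistics can be computed with a simple keyword count. The abstract does not list the keywords or define when a note "pertains to" lifestyle, so the sketch below assumes a small hypothetical keyword set and treats a note as lifestyle-related when it contains at least one keyword. The notes are invented examples.

```python
import re


def keyword_stats(notes, keywords):
    """Return (mean keyword mentions per note, share of notes with any keyword)."""
    counts = []
    for note in notes:
        words = re.findall(r"[a-z]+", note.lower())
        counts.append(sum(1 for word in words if word in keywords))
    mean_mentions = sum(counts) / len(counts)
    share_with_keyword = sum(1 for c in counts if c > 0) / len(counts)
    return mean_mentions, share_with_keyword


# Hypothetical keyword dictionary and visit notes.
lifestyle_keywords = {"diet", "exercise", "smoking", "alcohol"}
notes = [
    "Patient advised on diet and exercise; discussed smoking cessation.",
    "Follow-up imaging scheduled. No new symptoms.",
    "Reviewed medication. Encouraged regular exercise.",
    "Lab results stable.",
]
mean_mentions, share_with_keyword = keyword_stats(notes, lifestyle_keywords)
print(mean_mentions, share_with_keyword)
```

Exact-match counting misses inflected or paraphrased mentions, which is one reason a semantic analysis step would be a natural complement.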
Affective assessment instruments have been developed, especially those related to attitudes. In engineering mathematics learning 1 through the MOOC platform, it was carried out with discussion. The development of atti...