Search results - Inner Mongolia University Library

Stylometry-driven framework for Urdu intrinsic plagiarism detection: a comprehensive analysis using machine learning, deep learning, and large language models

Neural Computing and Applications, 2025, vol. 37, no. 9, pp. 6479-6513

Authors: Manzoor, Muhammad Faraz; Farooq, Muhammad Shoaib; Abid, Adnan. Affiliations: Department of Computer Science, University of Management and Technology, Lahore, Pakistan; Department of Data Science, Faculty of Computing and Information Technology, University of the Punjab, Lahore, Pakistan

Detecting plagiarism in documents is a well-established task in natural language processing (NLP). Broadly, plagiarism detection falls into two types: (1) intrinsic, which checks whether the whole document, or all of its passages, was written by a single author; and (2) extrinsic, in which a suspicious document is compared with a given set of source documents to find sentences or phrases that appear in both. This study addresses the challenge of intrinsic plagiarism detection in Urdu texts, a language with limited resources for comprehensive language models. Acknowledging the absence of sophisticated large language models (LLMs) tailored for the Urdu language, this study explores the application of various machine learning, deep learning, and language models in a novel framework. A set of 43 stylometry features at six granularity levels was curated, capturing linguistic patterns indicative of plagiarism. The selected models include traditional machine learning approaches (logistic regression, decision trees, SVM, KNN, Naive Bayes, gradient boosting, and a voting classifier), deep learning approaches (GRU, BiLSTM, CNN, LSTM, MLP), and large language models (BERT and GPT-2). This research systematically categorizes these features and evaluates their effectiveness, addressing the inherent challenges posed by the limited availability of Urdu-specific language models. Two distinct experiments were conducted to evaluate the impact of the proposed features on classification accuracy. In experiment one, the entire dataset was used for classification into intrinsically plagiarized and non-plagiarized documents. Experiment two divided the dataset into three types based on topic: moral lessons, national celebrities, and national events. Both experiments are thoroughly evaluated through a fivefold cross-validation analysis. The results show that the random forest classifier achieved an ex

Keywords: Deep learning
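
The fivefold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `k_fold_indices` and the contiguous, unshuffled split are assumptions made only for illustration, and the paper does not state how its folds were formed.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split sample indices 0..n_samples-1 into k (train, test) pairs.

    Hypothetical helper: folds are contiguous and unshuffled; the first
    n_samples % k folds receive one extra sample.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]                   # held-out fold
        train = indices[:start] + indices[start + size:]     # remaining samples
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    for train, test in k_fold_indices(12, k=5):
        print(len(train), test)
```

Each document appears in exactly one test fold, so every sample is used for evaluation once and for training in the other four rounds.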

Oracle Inequality for Sparse Trace Regression Models with Exponential β-mixing Errors

Acta Mathematica Sinica, English Series, 2023, vol. 39, no. 10, pp. 2031-2053

Authors: Ling PENG; Xiang Yong TAN; Pei Wen XIAO; Zeinab RIZK; Xiao Hui LIU. Affiliations: School of Statistics and Data Science, Jiangxi University of Finance and Economics, Nanchang 330013, P.R. China; Key Laboratory of Data Science in Finance and Economics, Jiangxi University of Finance and Economics, Nanchang 330013, P.R. China

In applications involving, e.g., panel data, images, genomics microarrays, etc., trace regression models are useful *** address the high-dimensional issue of these applications, it is common to assume some sparsity *** the case of the parameter matrix being simultaneously low rank and element-wise sparse, we estimate the parameter matrix through the least-squares approach with the composite penalty combining the nuclear norm and the *** extend the existing analysis of the low-rank trace regression with *** to exponential β-mixing *** explicit convergence rate and the asymptotic properties of the proposed estimator are ***, as well as a real data application, are also carried out for illustration.

Keywords: Trace regression model; low-rank matrix; oracle inequality; exponential β-mixing errors

Inverse local time of one-dimensional diffusions and its comparison theorem

Science China Mathematics, 2025, vol. 68, no. 5, pp. 1201-1218

Authors: Zhen-Qing Chen; Lidan Wang. Affiliations: Department of Mathematics, University of Washington, Seattle, WA 98195, USA; School of Statistics and Data Science and KLMDASR, Nankai University, Tianjin 300071, China

In this paper, we study the inverse local times at 0 of one-dimensional reflected diffusions on [0,∞) and establish a comparison principle for these inverse local *** also provide applications to Green function estimate...

Keywords: diffusion; local time; inverse local time; subordinator; Lévy measure; Girsanov transform; comparison theorem; Green function estimate

A Systematic Comparison of Horizontal Federated Learning Algorithm Based on Random Forests in a Medical Setting

Machine Intelligence Research, 2025, vol. 22, no. 2, pp. 254-266

Authors: Andrew Cheng; Jingqing Zhang; Atri Sharma; Vibhor Gupta; Yike Guo. Affiliations: Pangaea Data Limited, London SE1 7LY, UK; Hong Kong University of Science and Technology, Hong Kong 999077, China

The medical industry generates vast amounts of data suitable for machine learning during patient-clinician interaction in ***, as a result of data protection regulations like the General Data Protection Regulation (GDPR), patient data cannot be shared freely across *** these cases, federated learning (FL) is a viable option where a global model learns from multiple data sites without moving the *** this paper, we focused on random forests (RFs) for their effectiveness in classification tasks and widespread use throughout the medical industry and compared two popular federated random forest aggregation algorithms on horizontally partitioned *** first provided necessary background information on federated learning, the advantages of random forests in a medical context, and the two aggregation algorithms. A series of extensive experiments using four public binary medical datasets (an excerpt of MIMIC III, the Pima Indian diabetes dataset from Kaggle, and the diabetic retinopathy and heart failure datasets from the UCI machine learning repository) were then performed to systematically compare the two on equal-sized, unequal-sized, and class-imbalanced clients. A follow-up investigation on the effects of more clients was also *** finally empirically analyzed the advantages of federated learning and concluded that the weighted merge algorithm produces models with, on average, a 1.903% higher F1 score and a 1.406% higher AUCROC value.

Keywords: Federated learning; horizontal federated learning; random forests; machine learning; medical diagnosis

Negation Triplet Extraction with Syntactic Dependency and Semantic Consistency

Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Authors: Shi, Yuchen; Yang, Deqing; Liu, Jingping; Xiao, Yanghua; Wang, Zongyu; Xu, Huimin. Affiliations: School of Data Science, Fudan University, Shanghai Key Laboratory of Data Science, Shanghai, China; East China University of Science and Technology, Shanghai, China; Meituan, China

ISBN: (print) 9782493814104

Previous work on negation understanding mainly focuses on negation cue detection and scope resolution, without identifying the negation subject, which is also significant to downstream tasks. In this paper, we propose a new negation triplet extraction (NTE) task, which aims to extract the negation subject along with the negation cue and scope. To achieve NTE, we devise a novel Syntax&Semantic-Enhanced Negation Extraction model, namely SSENE, which is built on a generative pretrained language model (PLM) of Encoder-Decoder architecture with a multi-task learning framework. Specifically, the given sentence's syntactic dependency tree is incorporated into the PLM's encoder to discover the correlations between the negation subject, cue, and scope. Moreover, the semantic consistency between the sentence and the extracted triplet is ensured by auxiliary task learning. Furthermore, we have constructed a high-quality Chinese dataset, NegComment, based on users' reviews from the real-world platform of Meituan, upon which our evaluations show that SSENE achieves the best NTE performance compared to the baselines. Our ablation and case studies also demonstrate that incorporating the syntactic information helps the PLM recognize the distant dependency between the subject and cue, and that the auxiliary task learning helps extract negation triplets with more semantic consistency. We further demonstrate that SSENE is also competitive on the traditional CDSR task. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords: Extraction

Process capability monitoring and change-point analysis for S-type quality characteristic

Quality Technology and Quantitative Management, 2024, vol. 21, no. 2, pp. 237-256

Authors: Liao, Mou-Yuan; Wu, Chien-Wei. Affiliations: School of Economics and Management, Huzhou College, Huzhou, China; Department of Data Science and Big Data Analytics, Providence University, Taichung, Taiwan; Department of Industrial Engineering and Engineering Management, National Tsing Hua University, Hsinchu, Taiwan

Among various quality assurance activities, process capability indices (PCIs) are recognized as the most effective tools to quantify and evaluate process performance. The one-sided capability index (Formula presented.) can adequately measure the process capability for processes with a smaller-is-better (S-type) quality characteristic. While focusing on the index (Formula presented.), this study constructs a capability control chart to monitor the short-term capability based on the exponentially weighted moving average (EWMA). Simulations have shown that the EWMA capability control chart is adaptive regarding average run length. Further, this study applies change-point analysis to determine whether and when the short-term capability-level changes. Using these two proposed methods, engineers can realize a variety of short-term process capabilities in good time and implement control measures efficiently against a poor long-term process capability. © 2023 International Chinese Association of Quantitative Management.

Keywords: Control charts
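
The exponentially weighted moving average (EWMA) underlying the control chart in this abstract follows the standard recursion z_t = λ·x_t + (1 − λ)·z_{t−1}. The sketch below shows only that recursion applied to a sequence of values. The function name `ewma`, the smoothing weight `lam`, and the sample values are hypothetical; the paper's capability index estimator and its control limits are not reproduced here.

```python
from typing import Iterable, List, Optional


def ewma(values: Iterable[float], lam: float = 0.2,
         start: Optional[float] = None) -> List[float]:
    """Return the EWMA sequence z_t = lam * x_t + (1 - lam) * z_{t-1}.

    Assumption for illustration: if no starting value is given, the first
    observation initializes the statistic.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    smoothed: List[float] = []
    z = start
    for x in values:
        # The first value seeds the recursion when no start is supplied.
        z = x if z is None else lam * x + (1.0 - lam) * z
        smoothed.append(z)
    return smoothed


if __name__ == "__main__":
    # Hypothetical short-term capability estimates, not data from the paper.
    estimates = [1.30, 1.28, 1.35, 1.10, 1.05, 1.02]
    print(ewma(estimates, lam=0.2))
```

A smaller `lam` gives more weight to history and reacts more slowly to a shift; a larger `lam` tracks recent observations more closely.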

Research on the Relationship Between New Energy Vehicles and Dual Carbon in Yangtze River Delta Based on GRA-LSTM Model

7th IEEE Advanced Information Technology, Electronic and Automation Control Conference, IAEAC 2024

Authors: Zhou, Jinxuan; Deng, Liang; Tong, Yi; Gao, Yan. Affiliations: Wuhan Business University, Department of Information Engineering, Hubei, China; Data Science and Big Data Technology, China; Internet of Things Engineering, China; Software Engineering; Big Data Technology

ISBN: (print) 9798350339161

The new energy automobile industry plays an important role in building a green, low-carbon, and recycling industrial system. In this paper, prediction simulation training and a comparison of prediction accuracy are carried out with the newly constructed GRA-LSTM model, a Biological Neural Network model, and the first-order one-variable gray GM(1,1) model. The LSTM model is created and trained with a Learning Rate Decay callback function, which gradually reduces the learning rate during training. The trained model is then used to make predictions: the Yangtze River Delta region is predicted to reach carbon peak in 2028 and carbon neutrality in 2060. The paper also finds that the higher the market ownership of new energy automobiles, the shorter the time to carbon peak and carbon neutrality. © 2024 IEEE.

Keywords: Forecasting

Spatio-Temporal Analysis to Inspect Infection Risk of Dengue Hemorrhagic Fever in Central Java from 2015 to 2022

11th International Conference on Computer, Control, Informatics and its Applications, IC3INA 2024

Authors: Wudlu, Sofiana; Riswantini, Dianadewi; Khotimah, Purnomo Husnul; Natari, Rifani Bhakti; Izzaturrahmi, Hafizh; Yanuar, Ferra. Affiliations: Department of Mathematics and Data Science, Andalas University, Padang, Indonesia; Research Center for Data and Information Science, National Research and Innovation Agency, Bandung, Indonesia

ISBN: (print) 9798331542313

Dengue hemorrhagic fever (DHF) is a serious public health issue worldwide, including in Central Java, Indonesia. Several analyses need to be conducted to serve as a reference for the government in taking action to reduce the number of DHF cases. This study used the number of DHF cases in Central Java and DI Yogyakarta provinces from 2015 to 2022, chosen because Wolbachia-infected Aedes aegypti mosquitoes were released in some areas of Yogyakarta province, with Central Java province as the surrounding area. The data were analyzed spatially with Moran’s I to see whether proximity between areas affects the number of DHF cases. The Standardized Incidence Ratio (SIR) method was used to identify the areas most at risk of dengue cases. Further, the Pearson correlation coefficient was used to see which climate variables have an important influence on the number of DHF cases. This study found that the number of DHF cases in Central Java and DI Yogyakarta provinces did not show spatial dependence between districts or cities in 2015-2022. However, when only cases in DI Yogyakarta Province were analyzed, there was spatial dependence between districts/cities in the number of DHF cases in 2015, 2018, 2020, and 2022. In addition, calculating the risk level of each district/city in Central Java and DI Yogyakarta from 2015 to 2022 showed that the most at-risk area was Magelang City, Central Java, with an SIR value of 15.611 in 2017. Among the three climatic variables (mean temperature, mean humidity, and total precipitation), the mean temperature and total precipitation significantly impacted the number of DHF cases, with p-values of 0.085 and 0.008, respectively. © 2024 IEEE.

Keywords: Risk assessment
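
The Standardized Incidence Ratio used in this abstract compares observed cases in an area with the cases expected from a reference rate. The sketch below is a minimal illustration, assuming the reference rate is the overall rate across all areas (the abstract does not state which reference the authors used). The function name and the district figures are hypothetical, not data from the study.

```python
from typing import Dict


def standardized_incidence_ratios(cases: Dict[str, float],
                                  populations: Dict[str, float]) -> Dict[str, float]:
    """Compute SIR = observed / expected for each area.

    Assumption for illustration: expected cases in an area equal its
    population times the overall incidence rate across all areas.
    """
    total_cases = sum(cases.values())
    total_population = sum(populations[area] for area in cases)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    overall_rate = total_cases / total_population
    sir = {}
    for area, observed in cases.items():
        expected = populations[area] * overall_rate
        # SIR > 1 means more cases than expected; SIR < 1 means fewer.
        sir[area] = observed / expected if expected > 0 else float("nan")
    return sir


if __name__ == "__main__":
    # Hypothetical districts, not figures from the paper.
    observed_cases = {"District A": 30, "District B": 10}
    population = {"District A": 1000, "District B": 1000}
    print(standardized_incidence_ratios(observed_cases, population))
```

An SIR above 1 marks an area with more cases than its population size alone would predict, which is how a value such as the one reported for Magelang City is read.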

Forward attention-based deep network for classification of breast histopathology image

Multimedia Tools and Applications, 2024, vol. 83, no. 40, pp. 88039-88068

Authors: Roy, Sudipta; Jain, Pankaj Kumar; Tadepalli, Kalyan; Reddy, Balakrishna Pailla. Affiliations: Artificial Intelligence & Data Science, Jio Institute, Navi Mumbai 410206, India; Artificial Intelligence & Data Science, Jio Institute, Maharashtra, Navi Mumbai 410206, India; Sir HN Reliance Foundation Hospital, Girgaon, Mumbai 400004, India; Hyderabad 500081, India

Breast cancer diagnosis via histopathology is clinically important, but challenges remain. We develop a Forward Attention-based deep network (FA-VGG16) for classifying breast histopathology images. For binary classification, FA-VGG16 achieves 90.4% accuracy, outperforming VGG16 (89.3%). Solving class imbalance boosts performance to 97.7% accuracy. For quaternary classification of benign subtypes, FA-VGG16 obtains individual accuracy between 77.1 and 88.5% and an overall accuracy of 77.8%. For malignant subtypes, individual accuracy ranges from 77.2 to 98.3%, with an overall accuracy of 92.4%. Receiver operating characteristic analysis yields area-under-the-curve values exceeding 95.7% for all benign and malignant subtypes. Paired t-testing of variants indicates FA-VGG16 significantly outperforms others (p © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Diseases

Neutrosophic Pseudo-t-Norm and Its Derived Neutrosophic Residual Implication

Neutrosophic Sets and Systems, 2023, vol. 57, pp. 18-32

Authors: Bu, Hongru; Hu, Qingqing; Zhang, Xiaohong. Affiliations: School of Mathematics and Data Science, Shaanxi University of Science & Technology, Xi’an 710021, China; School of Electrical and Control Engineering, Shaanxi University of Science & Technology, Xi’an 710021, China; School of Mathematics and Data Science, Shaanxi University of Science and Technology, Xi’an 710021, China

First, the concept of a neutrosophic pseudo-t-norm (NPT) is given on the basis of a complete lattice. Definitions and examples of representable neutrosophic pseudo-t-norms (RNPTs) are given, and unrepresentable neutrosophic pseudo-t-norms (UNPTs) are also given. Second, De Morgan neutrosophic triples (DMNTs) consist of three operators: NPTs, neutrosophic negators (NNs), and neutrosophic pseudo-s-norms (NPSs), where NPTs and NPSs are dual with respect to NNs. Third, we study the neutrosophic residual implications (NRIs) of NPTs, as well as their underlying properties. Finally, we give a method to obtain NPTs from neutrosophic implications (NIs) and construct non-commutative residuated lattices (NCRLs) based on NRIs and NPTs. © (2023). All Rights Reserved.

Keywords: Fuzzy logic
