Search Results - Inner Mongolia University Library

2022 International Conference on Data and Software Engineering, ICoDSE 2022

Authors: Shao, Wenting; Wang, Xi (Shanghai University, Department of Computer Engineering and Science, Shanghai, China)

ISBN (digital): 9798350397055

ISBN (print): 9798350397055

Machine learning (ML) technology is advancing rapidly, but the existing development process lacks standardization, and the quality of machine learning system development is difficult to guarantee. Requirement modeling is an important method to ensure software quality. Data is the key factor that distinguishes machine learning systems (MLS) from traditional systems. However, there is no effective modeling method or supporting tool to guide researchers in data modeling for MLS. To address this problem, this paper introduces a two-layer data requirements modeling method for MLS and develops a supporting tool to help users better model data. To illustrate the data requirements modeling method and the supporting tool, we present a self-driving system as a case study. © 2022 IEEE.

Keywords: Computer software selection and evaluation

Federated Learning on Non-Independent and Identically Distributed Data

3rd International Conference on Machine Learning and Computer Application, ICMLCA 2022

Authors: Li, Haowei; Luo, Like; Wang, Haolong (School of Foreign Languages, Yangtze University, Nanhuan Road, Jingzhou, China; Software College, Nanchang University, 235 Nanjing East Road, Qingshan Lake District, Nanchang, China; Software College, Northeastern University, 195 Chuangxin Road, Hunnan District, Shenyang, China)

ISBN (print): 9781510664814

The Federated Averaging (FedAvg) algorithm is the preferred algorithm for federated learning (FL) because of its simplicity and low communication cost. However, if the clients' local data are not independent and identically distributed (that is, non-IID), FedAvg suffers from client drift, which slows the convergence of the model, and cooperative training across multiple clients then faces the challenges of data heterogeneity and vulnerability to attack. This paper systematically summarizes three aspects: local model training improvement, server-side aggregation optimization, and personalized federated learning. The local model can be improved by adjusting the loss function and control variables. Servers can be optimized through asynchronous aggregation and hierarchical aggregation. Personalized federated learning can improve global model performance through both data-based and model-based approaches. This paper puts forward future research directions for federated learning from all of the above aspects and provides a reference for further research on non-IID data in federated learning. © 2023 SPIE.
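
For readers unfamiliar with the aggregation step that the abstract refers to, the following is a minimal sketch of server-side federated averaging: each client's parameters are averaged with a weight proportional to its local sample count. It is an illustration only, not code from the paper; the function name `fedavg`, the dictionary-of-lists parameter layout, and the toy values are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        aggregated[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


# Toy example: two clients holding 1 and 3 local samples respectively.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
sizes = [1, 3]
global_weights = fedavg(clients, sizes)
print(global_weights)
```

Because the second client holds three times as much data, the averaged parameters sit closer to its values. With non-IID data, the local updates being averaged can point in very different directions, which is the client drift the abstract describes.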

Keywords: Learning systems

Propagation tree says: dynamic evolution characteristics learning approach for rumor detection

International Journal of Machine Learning and Cybernetics, 2025, Vol. 16, No. 3, pp. 1589-1605

Authors: Zhao, Shouhao; Ji, Shujuan; Lv, Jiandong; Fang, Xianwen (Shandong Univ Sci & Technol, Sch Comp Sci & Engn, Qianwangang Rd, Qingdao, Shandong, Peoples R China; Anhui Univ Sci & Technol, Anhui Prov Engn Lab Big Data Anal & Early Warning, Huainan, Anhui, Peoples R China)

Due to the rapid spread of rumors on social media, which has a detrimental effect on our lives, it is becoming increasingly important to detect rumors. It has been proved that the study of dynamic graphs is helpful to capture the temporal change of information transmission and understand the evolution trend and pattern change of events. However, the dynamic learning methods currently studied do not fully consider the interaction characteristics of the evolutionary process. Therefore, it is difficult to fully capture the structural and semantic differences between them. In order to fully exploit the potential correlations of such temporal information, we propose a novel model named dynamic evolution characteristics learning (DECL) method for rumor detection. First, we partition the temporal snapshot sequences based on the propagation structure of rumors. Secondly, a multi-task graph contrastive learning method is adopted to enable the graph encoder to capture the essential features of rumors, and to fully explore the temporal structural differences and semantic similarities between true rumor and false rumor events. Experimental results on three real-world social media datasets confirm the effectiveness of our model for rumor detection tasks.

Keywords: Rumor detection; Propagation structure; Graph representation; Contrastive learning

Detection of Inconsistencies between Guidance Pages and Actual Data Collection of Third-party SDKs in Android Apps

IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems (MOBILESoft)

Authors: Inayoshi, Hiroki; Kakei, Shohei; Saito, Shoichi (Nagoya Inst Technol, Nagoya, Aichi, Japan)

ISBN (print): 9798400705946

Major app stores have introduced privacy labels (e.g., Google Play's data safety section since July 2022), requiring app developers to provide their privacy disclosures, including the data types collected and shared by their apps and the third-party SDKs they use. Third-party SDK providers have published guidance pages instructing app developers which data types their SDKs use and thus must be declared in the data safety section. The availability and correctness of the guidance pages are critical issues but have yet to receive any attention. This paper presents the first study of the guidance pages. We attempted to collect the guidance pages of 175 commercial SDKs widely used in Android apps and did not obtain them for 63% of the SDKs, suggesting that the majority of them have not provided guidance pages. Further, we develop a system that detects inconsistencies between the guidance pages and the actual data collection of SDKs. It uses machine learning and dynamic taint analysis to extract privacy practices from the guidance pages and SDKs and analyzes the outcomes to detect the critical gap. We construct datasets of 47 guidance pages and 43 SDKs' 159 sample apps and evaluate the system. The system uncovered discrepancies related to location and identifiers in the guidance pages of eight SDKs. We also evaluate the machine learning model's accuracy for unknown guidance page contents. The results show that the model performs satisfactorily for updated guidance pages, and the accuracy for newly posted ones increases as the model learns more. This study exposes the critical issues of the guidance pages and also contributes tools and datasets for facilitating further research on guidance pages and privacy labels.

Keywords: Android; third-party SDK; data safety section; consistency analysis

Detection of Financial Fraudulent Activities with Machine Learning: A Case Study of Detecting Potential Tax and Invoice Fraud

7th International Conference on Computer Science and Artificial Intelligence, CSAI 2023

Authors: Tian, Maohong; Liang, Jian; Zhang, Dequan; Zhang, Xintong; Wang, Zuo; Li, Hualin (ChongQing Institute of Engineering, Intelligent Application of Financial Big Data Chongqing Colleges Universities Engineering Research Center, ChongQing, China; Shu Yi Xin Credit Management Co LTD, ChongQing, China)

ISBN (print): 9798400708688

Financial fraud is a widespread problem that can cause significant economic losses. Traditional fraud detection methods often rely on manual audits and rules-based systems, which can be time-consuming and error-prone. In recent years, machine learning methods have emerged as a promising approach to automating fraud detection by leveraging large-scale data analysis. This article explores the use of machine learning methods to detect financial fraud by using tax, invoice, and big data. We first introduce the challenges and opportunities of using these data sources for fraud detection, and then survey various machine learning techniques that have been applied to this problem. We also discuss the evaluation metrics and case studies of these methods, and highlight the potential benefits and limitations of using machine learning for fraud detection. Finally, we identify some future research directions and challenges in this area. This article aims to provide a comprehensive overview of the state of the art in using machine learning methods for financial fraud detection, and to inspire further research and development in this important field. © 2023 ACM.

Keywords: Big data

Data-driven prediction of product yields and control framework of hydrocracking unit

Chemical Engineering Science, 2024, Vol. 283

Authors: Pang, Zheyuan; Huang, Pan; Lian, Cheng; Peng, Chong; Fang, Xiangcheng; Liu, Honglai (East China Univ Sci & Technol, Shanghai Engn Res Ctr Hierarch Nanomat, State Key Lab Chem Engn, Shanghai 200237, Peoples R China; East China Univ Sci & Technol, Sch Chem Engn, Shanghai 200237, Peoples R China; East China Univ Sci & Technol, Sch Chem & Mol Engn, Shanghai 200237, Peoples R China; Dalian Univ Technol, Sch Chem Engn, State Key Lab Fine Chem, Dalian 116024, Peoples R China; SINOPEC Dalian Res Inst Petr & Petrochem, Dalian 116024, Peoples R China)

In this study, the relationship between the operating conditions and the product yields of the hydrocracking process was modeled, and a control framework for the process was developed. The data were collected from a hydrocracking unit in a Chinese refinery. Principal component analysis was used to decrease the number of input variables. Then support vector machine, Gaussian process regression (GPR), and decision tree regression models were developed to establish the relationship above. The best model is GPR, whose Pearson correlation coefficient between the predicted value and the actual value is greater than 0.97 for all the product yields. Shapley additive explanations were used to interpret the results of the GPR models. A control framework for the hydrocracking unit was then proposed based on the results above. The results show that the machine learning method is a valuable tool for predicting the yield of hydrocracking products, and the proposed control framework helps optimize hydrocracking product yields.
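
The abstract evaluates its models by the Pearson correlation coefficient between predicted and actual yields. As a reminder of what that metric computes, here is a minimal sketch in plain Python. It is not the authors' code; the function name `pearson_r` and the numbers in the example are made up for illustration and are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(predicted) != len(actual) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_a)


# Illustrative values only (not from the paper).
r_linear = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])    # perfectly linear
r_inverse = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])   # perfectly inverse
print(r_linear, r_inverse)
```

A coefficient near 1 means predictions rise and fall with the measured values. It does not by itself show that the predictions are unbiased, since a constant offset or scale factor leaves the coefficient unchanged.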

Keywords: Hydrocracking; Machine learning; Yield prediction; Process control

Accuracy Comparison of Different Batch Size for a Supervised Machine Learning Task with Image Classification

9th International Conference on Electrical and Electronics Engineering (ICEEE)

Authors: Aldin, Noor Baha; Aldin, Shaima Safa Aldin Baha (Hasan Kalyoncu Univ, Elect & Elect Engn, Gaziantep, Turkey; Nahrain Univ, Continuing Educ Ctr, Baghdad, Iraq)

ISBN (print): 9781665467544

Machine learning is a type of artificial intelligence in which computers solve problems by learning from examples of real-world data. Within machine learning, there are various types of techniques or tasks, such as supervised, unsupervised, and reinforcement learning, and many hyperparameters have to be tuned to achieve high accuracy, especially in image classification. The batch size refers to the number of images used in a single forward and backward pass during training. It is one of the most essential hyperparameters. In our paper, we have studied the supervised task of image classification by varying the batch size and the number of epochs. The effect of increasing the batch size on training time, and how this relationship varies with the training model, has been studied, and the results show extremely large variation between them. According to our results, a larger batch size does not always result in higher accuracy.
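
To make the role of the batch size concrete, the sketch below splits a dataset into mini-batches and counts how many parameter updates one epoch takes. It is a plain-Python illustration under assumed names (`iter_batches`, `updates_per_epoch`) and a toy dataset of ten items; it is not taken from the paper and involves no actual model training.

```python
import math
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def iter_batches(samples: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive mini-batches; the last one may be smaller."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(samples), batch_size):
        yield list(samples[start:start + batch_size])


def updates_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of forward/backward passes needed to see every sample once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_samples / batch_size)


# Toy dataset of 10 "images", represented here by their indices.
images = list(range(10))
batches = list(iter_batches(images, batch_size=4))
print([len(b) for b in batches])
print(updates_per_epoch(len(images), 4), updates_per_epoch(len(images), 10))
```

A larger batch size means fewer updates per epoch, which is one reason training time and accuracy can both shift when this hyperparameter changes.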

Keywords: machine learning; supervised task; image classification; batch size

Automatic Text Recognition from Image Dataset Using Optical Character Recognition and Deep Learning Techniques

2nd International Conference on Computational Intelligence in Machine Learning, ICCIML 2022

Authors: Rao, Ishan; Shirgire, Prathmesh; Sanganwar, Sanket; Vyawhare, Kedar; Vispute, S.R. (Department of Computer Engineering, Pimpri Chinchwad College of Engineering, Pune, India; Computer Department, Pimpri Chinchwad College of Engineering, Pune, India)

ISBN (print): 9789819979530

Optical Character Recognition (OCR) has become quite well known in the last few years because it has applications in many sectors. In this paper, we look at the basics of OCR and discuss a few popular datasets that can help one get started with OCR. We aim to comprehensively analyze the research done on OCR with a variety of algorithms. We also look at some popular machine learning models that are used in building OCR systems, such as support vector machines and convolutional neural networks, and explain them briefly. We use the Tesseract tool to extract text. This paper serves as a basic guide to getting started with OCR. Using the datasets and algorithms described, one can start their journey in OCR. We provide our own analysis of a handwritten digit dataset using different machine learning models and then compare their accuracy. We show the use of the popular OCR tool Tesseract by extracting data from a report. The data extracted from the report is loaded into a database. This is a useful application that can be extended in the future to store details of different reports in medical, transportation, banking, and many other fields to create a paperless environment. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Support vector machines

Analysis and Comparison of Machine Learning Methods and Improved SVM Algorithm in Spam Classification

2022 International Conference on Computer Graphics, Artificial Intelligence, and Data Processing, ICCAID 2022

Authors: Zhu, Hongda (Faculty of Art and Science, University of Toronto, 27 King's College Circle, Toronto, ON M5S 1A1, Canada)

ISBN (print): 9781510663350

Email is the most popular form of official communication for business purposes. Despite the existence of other communication methods, email usage is still the largest. Today's environment necessitates automated email management due to the daily increase in email volume. More than 55% of the emails users receive nowadays are flagged as spam. This shows how spam squanders the time and resources of email users while creating nothing beneficial. Understanding the various spam email categorization strategies and how they operate is essential, since spammers employ complex and creative techniques to carry out their illicit operations through spam emails. The initial objective of this work is a comparison to find the most accurate machine learning-based spam categorization methods, such as Naive Bayes, SVM, and Random Forest; the paper then compares the initial result with the improved SVM algorithm. This study provides a comprehensive analysis and assessment of earlier studies on various machine learning methods, email properties, and methodologies. The results show that the improved support vector machine obtains a good email classification effect and can meet the requirements of spam processing. © 2023 SPIE.

Keywords: Support vector machines

Design and Practice of University Education Big Data Application Support System

3rd International Conference on Machine Learning and Computer Application, ICMLCA 2022

Authors: Gao, Jie; Li, Hongmei; Li, Yinpeng (Yunnan Agricultural University, Kunming 650201, China)

ISBN (print): 9781510664814

Big data is a new engine for advancing college education reform and development. Constructing a college education big data application is the premise and the core of fully utilizing the value of educational big data for the reform and development of college education. Deepening the application of big data in education, and supporting and leading education and teaching reform, is an important part of national education modernization. Big data is a fundamental support platform for data collection, storage, filtering, cleaning, fusion, analysis, and specific applications in college education. This paper presents a comprehensive analysis of the design of the higher education big data application support system, beginning with an examination of data management and analysis and including in-depth planning and discussion of the data exchange, data awareness, and data application layers. The application and realization of the big data application support system in higher education are then discussed. By thoroughly discussing and analyzing the support system for big data applications, the paper shows how the system can fulfill its role in higher education and support the growth and progress of higher education. © 2023 SPIE.

Keywords: Big data
