Search results - Inner Mongolia University Library

57th Hawaii International Conference on System Sciences (HICSS)

Authors: Zahn, Milena; Mueller, Tobias; Matthes, Florian. Affiliations: Tech Univ Munich, Munich, Germany; SAP SE, Weinheim, Germany

ISBN: (print) 9780998133171

The insufficient amount of training data is a persistent bottleneck of machine learning systems. A large portion of the world's data is scattered and locked in data silos. Breaking up these data silos could alleviate this problem. Federated machine learning is a novel model-to-data approach that enables the training of machine learning models on decentralized, potentially siloed data. Despite its promising potential, most federated machine learning projects never leave the prototype stage. This can be attributed to exaggerated expectations and an inappropriate fit between the technology and the use case. Current literature does not offer guidance for assessing the fit between federated machine learning and a given use case. Against this backdrop, we design a decision-support tool to aid decision-makers in the suitability and complexity assessment of FedML projects. Thereby, we aim to facilitate the technology selection process, avoid exaggerated expectations, and consequently facilitate the success of federated machine learning projects.

Keywords: Federated machine learning; Technology adoption; Design science research

Identification of cold rolling chatter states based on multi-source data fusion and Dempster-Shafer theory

International Journal of Advanced Manufacturing Technology, 2024, vol. 135, no. 7-8, pp. 3633-3647

Authors: Wang, Xiaoyong; Gao, Zhiying; Xin, Yanli. Affiliation: Univ Sci & Technol Beijing, Sch Mech Engn, Beijing 100083, Peoples R China

Cold rolling chatter is one of the bottlenecks in improving the production quality and efficiency of high-strength thin strip, so it is very important to predict and identify the chatter states. The accumulation of industrial data from the rolling process and the development of machine learning technology have opened up a path to solve this problem. However, due to the low density and uneven distribution of actual process data, knowledge learning and state identification of cold rolling chatter phenomena are limited. Therefore, based on the combination of actual production data and simulation data, a novel identification method is proposed and applied to identify the cold rolling chatter states. Firstly, the actual vibration signals are collected, and the simulation data generated from a chatter model are used to supplement data in chatter states. The sample space is constructed based on semi-supervised transfer component analysis (SSTCA) to realize the fusion of actual production data and simulation data. Then, different cold rolling states are identified by particle swarm optimization-support vector machine (PSO-SVM) and back propagation neural network (BPNN), respectively. Finally, the identification results of PSO-SVM and BPNN are combined based on the Dempster-Shafer (D-S) theory. It can be concluded that SSTCA can effectively solve the problems of low density and uneven distribution of industrial data by fusion of multi-source data, and that D-S theory can realize the connection of different machine learning methods. Furthermore, the presented method can more accurately identify different chatter states in the rolling process.

Keywords: Chatter identification; Data fusion; Machine learning; Semi-supervised transfer component analysis; Dempster-Shafer theory
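
The abstract above combines the outputs of two classifiers with Dempster-Shafer theory. As a rough illustration of that fusion step only, the sketch below applies Dempster's rule of combination to two mass functions. The state names, the mass values, and the function name `combine_dempster` are assumptions made for this example; they are not taken from the paper, and how the authors derive masses from PSO-SVM and BPNN outputs is not stated in the abstract.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            # Agreeing evidence accumulates on the shared hypotheses.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


# Hypothetical rolling states and hypothetical classifier beliefs.
STABLE = frozenset({"stable"})
CHATTER = frozenset({"chatter"})
EITHER = STABLE | CHATTER  # mass left on "unknown"

m_svm = {STABLE: 0.6, CHATTER: 0.3, EITHER: 0.1}
m_bpnn = {STABLE: 0.7, CHATTER: 0.2, EITHER: 0.1}

fused = combine_dempster(m_svm, m_bpnn)
decision = max(fused, key=fused.get)
print(sorted(decision), round(fused[decision], 3))
```

When both sources lean toward the same state, the fused mass on that state is higher than either source's own mass, which is the usual motivation for this kind of decision-level fusion.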

FedMPO: federated optimization based on multidimensional, especially 3-dimensional, proximal operator

International Journal of Machine Learning and Cybernetics, 2024, vol. 15, no. 3, pp. 1075-1085

Authors: Jiang, Fazhen; Yang, Xiaoyuan; Li, Yixiao; Li, Luxuan. Affiliations: Beihang Univ, Sch Cyber Sci & Technol, Beijing, Peoples R China; Beihang Univ, Sch Math Sci, Beijing, Peoples R China; Beihang Univ, Key Lab Math Informat & Behav, Minist Educ, Beijing, Peoples R China

Federated learning is a cutting-edge machine learning framework that enables multiple organizations to model data usage and conduct learning tasks while ensuring user privacy protection, data security, and compliance with government regulations. In this study, we propose a novel federated learning algorithm, FedMPO, to address the critical challenge of data heterogeneity among clients, which can lead to inconsistent optimized local models. FedMPO is a versatile multi-dimensional loss function that leverages the 3-dimensional proximal operator to fit a stationary and rapidly convergent loss function using Taylor expansion. As a general loss function, FedMPO can be applied to popular federated learning algorithms, such as FedAvg, FedProx, SCAFFOLD, FedDyn, and FedDC, to enhance the accuracy and stability of secure aggregation. Extensive experiments show that FedMPO can improve accuracy scores (almost 0.02-0.33 and 0.02-0.45 percent improvements with full and partial client participation, respectively) on some common evaluation data sets with various settings, and that it is also robust to partial participation, non-IID data, and heterogeneous clients at the same time.

Keywords: Machine learning; Federated learning; Multidimensional proximal operator; Data heterogeneity; Loss function; Security aggregation
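
The abstract above names FedAvg and FedProx as base algorithms that FedMPO can be applied to. The sketch below shows only those two well-known building blocks: FedAvg's sample-size-weighted averaging of client parameters, and a FedProx-style quadratic proximal penalty that pulls a local model toward the global one. It is not the 3-dimensional proximal operator of FedMPO, which the abstract does not specify; the function names, the flat-list parameter representation, and the value of `mu` are assumptions made for this example.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def proximal_penalty(local_w, global_w, mu):
    """FedProx-style term (mu / 2) * ||local_w - global_w||^2.

    A client adds this to its local loss so that its update
    does not drift far from the current global model.
    """
    squared_distance = sum((lw - gw) ** 2 for lw, gw in zip(local_w, global_w))
    return 0.5 * mu * squared_distance


# Two hypothetical clients with unequal amounts of data.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]

global_w = fedavg(clients, sizes)
penalty = proximal_penalty(clients[0], global_w, mu=0.1)
print(global_w, penalty)
```

The client with more samples dominates the average, and the penalty for the smaller client grows with its distance from that average, which is one simple way heterogeneity between clients shows up in these schemes.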

Early Diagnosis of Liver Disease Using Machine Learning Techniques

5th International Conference on Data Science, Machine Learning and Applications

Authors: Hegde, Nagaratna P.; Vikkurty, Sireesha; Sriperambuduri, Vinay Kumar; Gogune, Sruthi; Anish, Palabatla; Thanneru, Praneeth. Affiliation: Vasavi Coll Engn, Dept CSE, Hyderabad, Telangana, India

ISBN: (print) 9789819780334; 9789819780310; 9789819780303

Liver diseases are a global health concern, and early diagnosis is crucial for effective treatment. While traditional liver-function laboratory tests provide valuable information, they may not say much about any emerging or underlying illnesses. In this study, we explore the efficacy of machine learning algorithms in predicting the risk of liver disease using the Indian Liver Patient dataset. This could help patients concerned opt for timely and effective treatment.

Keywords: Data analysis; Confusion matrix; Data streamlining; Correlation matrix; False negatives; Accuracy

Machine learning for crack detection in an anisotropic electrically conductive nano-engineered composite interleave with realistic geometry

International Journal of Engineering Science, 2024, vol. 205

Authors: Akmanov, Iskander S.; Lomov, Stepan V.; Spasennykh, Mikhail Y.; Abaimov, Sergey G. Affiliation: Skolkovo Inst Sci & Technol, Ctr Petr Sci & Engn, Bolshoy Blvd 30, Bld 1, Moscow 121205, Russia

Engineering interleaves of composite laminates with carbon nanotubes (CNTs) improves interlaminar fracture toughness, creating also conductivity, which can be employed for damage identification. The paper explores machine learning (ML) solution of the inverse problem of the defect identification for interleaves with anisotropic conductivity (aligned CNTs). The electrical and geometrical properties of the interleave are assigned based on the synchrotron X-ray computer tomography of glass fibre / epoxy laminates with nanostitch. Several machine learning (ML) models are applied (XGBoost, fully connected (FCNN) and convolution neural (CNN) networks). XGBoost and FCNN algorithms performed poorly, failing to detect smaller defects and giving significant errors for larger ones. CNN algorithm detects defects well: It predicts the geometric characteristics of the defect with error below 16 %.

Keywords: A. laminates; A. nano-structures; B. defects; C. computational modelling; Machine learning

Common issues of data science on the eco-environmental risks of emerging contaminants

Environment International, 2025, vol. 196, p. 109301

Authors: Hu, Xiangang; Dong, Xu; Wang, Zhangjia. Affiliation: Nankai Univ, Minist Educ, Coll Environm Sci & Engn, Carbon Neutral Interdisciplinary Sci Ctr, Key Lab P, Tianjin 300350, Peoples R China

Data-driven approaches (e.g., machine learning) are increasingly used to replace or assist laboratory studies in the study of emerging contaminants (ECs). In the past ten years, an increasing number of models or approaches have been applied to ECs, and the datasets used are continuously enriched. However, there are large knowledge gaps between what we have found and the natural eco-environmental meaning. For most published reviews, the contents are organized by the types of ECs, but the common issues of data science, regardless of the type of pollutant, are not sufficiently addressed. To close or narrow the knowledge gaps, we highlight the following issues ignored in the field of data-driven EC research. Complicated biological and ecological data and ensemble models revealing mechanisms and spatiotemporal trends with strong causal relationships and without data leakage deserve more attention in the future. In addition, the matrix influence, trace concentration, and complex scenario have often been ignored in previous works. Therefore, an integrated research framework related to natural fields, ecological systems, and large-scale environmental problems, rather than relying solely on laboratory data-related analysis, is urgently needed. Beyond the current prediction purposes, data science can inspire the discovery of scientific questions, and mutual inspiration among data science, process and mechanism models, and laboratory and field research is a critical direction. Focusing on the above urgent and common issues related to data, frameworks, and purposes, regardless of the type of pollutant, data science is expected to achieve great advancements in addressing the eco-environmental risks of ECs.

Keywords: Big data; Emerging contaminants; Microplastics; Antibiotics; PFAS; Machine learning

Teaching Communication in Context: Rhetorical Moves in Data Science Reports

IEEE International Professional Communication Conference (ProComm) on Building Bridges - Connecting Ideas, People, and Possibilities

Author: Hutchison, Allison. Affiliation: Cornell Univ, Engn Commun Program, Ithaca, NY 14853, USA

ISBN: (print) 9798350384468; 9798350384451

This study involves a data science and machine learning course partnered with an engineering communication course, referred to here as an authentically integrated communication model, and offers insights into such a model for engineering educators. In these partnered courses, student teams apply data science and machine learning tools to conduct data analysis and write two data science reports. Through qualitative coding and corpus analysis methods, rhetorical moves that students make in the data science report genre were identified. Twelve out of 57 total final reports were randomly chosen and coded, then six corpora of excerpts related to two codes and subcodes were created to generate keyword lists. These codes were "results," "discussion," as well as "ineffective" and "effective" subcodes for each main code. The total codes were then compared to one another according to students' enrollment in both courses. Overall, students' reports in the engineering communication class more often contained effective results and effective discussion excerpts. Keywords along with example sentences are provided to demonstrate greater context for the use of language in the data science report genre.

Keywords: Authentic integration; Corpus analysis; Engineering communication; Genre analysis

Prediction of Student Academic Performance Utilizing a Multi-Model Fusion Approach in the Realm of Machine Learning

Applied Sciences-Basel, 2025, vol. 15, no. 7, p. 3550

Authors: Zou, Wei; Zhong, Wei; Du, Junzhen; Yuan, Lingyun. Affiliations: Yunnan Normal Univ, Sch Informat Sci & Technol, Kunming 650500, Peoples R China; Yunnan Normal Univ, Key Lab Educ Informatizat Nationalities, Minist Educ, Kunming 650500, Peoples R China; Yunnan Normal Univ, Yunnan Key Lab Smart Educ, Kunming 650500, Peoples R China

The digitization of college student management is a crucial approach for training institutions to decrease management costs while enhancing the quality of students' development. In this study, we focused on the students majoring in Computer Science at a certain university and conducted an exploration using their scores in multiple undergraduate courses. Initially, we selected the students' basic and core academic courses based on the training program and identified four groups of course combinations with strong positive correlations through correlation and cluster analysis. This finding helped the university optimize the arrangement and structure of the Computer Science major's course system. Next, we organized the students' overall course performance data in a sequential format based on the semester order. Multiple machine learning models were utilized to perform regression prediction for student performance and classification prediction tasks to determine the student's performance level. Finally, we integrated multiple machine learning models to create a practical framework for predicting student academic performance, which can be applied in student digital management. The framework can also provide effective decision support for academic early warning and guide the students' development.

Keywords: Machine learning; Multi-model fusion; Performance prediction

Let Me Generate That for You: Generative Data Augmentation for Misinformation Detection in Low-Resource Environments

IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA)

Authors: Toney-Wails, Autumn; Singh, Lisa. Affiliation: Georgetown Univ, Washington, DC 20057, USA

ISBN: (print) 9798350364941; 9798350364958

Misinformation detection is a rapidly moving target, as new topics emerge and evolve in high volume on social media platforms. Annotated and fact-checked datasets are necessary for detection model training, but are laborious to curate. Thus, many misinformation detection models are trained in low-resource environments and rely on machine learning techniques to improve performance with small ground-truth datasets. Generative data augmentation methods enable topic-specific examples that increase a model's training dataset without incurring the cost and time investment associated with manual annotation. In this work, we assess the value of using generative augmentation for different classes of learning models: a classic neural model, a fine-tuned deep learning model, a reinforcement learning model, and an active learning model. We find that generated training data is not effective for all learning paradigms for the misinformation detection task, highlighting the need to use different quality measures to assess its value for low-resource machine learning tasks.

Keywords: Misinformation detection; Data augmentation; Language models

Quantifying Data Difficulty with Polarized K-Entropy for Assessing Machine Learning Models

25th IEEE International Conference on Information Reuse and Integration for Data Science (IEEE IRI)

Authors: Afolabi, Ayomide; Aygun, Ramazan; Tran, Truong X. Affiliations: Kennesaw State Univ, Sch Data Sci & Analyt, Kennesaw, GA 30144, USA; Kennesaw State Univ, Comp Sci Dept, Kennesaw, GA 30144, USA; Penn State Univ, Penn State Harrisburg, Middletown, PA, USA

ISBN: (print) 9798350351194; 9798350351187

Data difficulty level measurement is a critical aspect of machine learning performance evaluation. Several measures have been used to assess the difficulty level of classifying data points in binary classification. However, these measures typically involve building a machine learning model first, which is then used to assess the data difficulty level. In this paper, we propose a novel model-agnostic measure named polarized K-entropy to evaluate the difficulty of classifying a data instance. Our measure leverages the computation of entropy based on the nearest neighbors of a data point. We conducted experiments to evaluate the effectiveness of our proposed method by analyzing how the accuracy of machine learning models changes with respect to data difficulty. We used Spearman's rank correlation coefficient to analyze this relationship for neural network, support vector machine, and random forest models. Our results show that our measure outperformed the non-conformity measure in all the experiments conducted for six datasets using the selected machine learning models.

Keywords: Data difficulty; Polarized K-entropy; Non-conformity
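
The abstract above describes a difficulty measure built on the entropy of a data point's nearest neighbors. The sketch below computes a plain Shannon entropy over the class labels of the k nearest neighbors, which is only the general idea; the "polarized" variant proposed in the paper is not defined in the abstract and is not reproduced here. The function name, the Euclidean distance, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def knn_label_entropy(points, labels, index, k):
    """Shannon entropy (in bits) of the labels of the k nearest neighbors.

    A value near 0 means the neighborhood is dominated by one class;
    for binary labels, 1.0 means the neighborhood is evenly split.
    """
    target = points[index]
    # Distance to every other point, paired with that point's label.
    neighbors = [
        (math.dist(target, p), labels[j])
        for j, p in enumerate(points)
        if j != index
    ]
    neighbors.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in neighbors[:k]]
    n = len(nearest_labels)
    counts = Counter(nearest_labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical 2-D binary classification data.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
labels = [0, 0, 1, 1, 1]

mixed = knn_label_entropy(points, labels, index=0, k=2)  # neighbors disagree
pure = knn_label_entropy(points, labels, index=3, k=1)   # neighbor agrees
print(mixed, pure)
```

Because the score depends only on the data and a distance function, it can be computed before any classifier is trained, which matches the model-agnostic framing in the abstract.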
