ISBN: 9798350303223 (print)
In distributed environments, data for machine learning (ML) applications may be generated from numerous sources and devices, and traverse a cloud-edge continuum via a variety of protocols, using multiple security schemes and equipment types. While ML models typically benefit from large training sets, not all data can be equally trusted. In this work, we examine data trust as a factor in creating ML models, and explore an approach in which annotated trust metadata contributes to data weighting when generating ML models. We assess the feasibility of this approach using well-known datasets for both linear regression and classification problems, demonstrating the benefit of including trust as a factor when using heterogeneous datasets. We discuss the potential benefits of this approach, and the opportunity it presents for improved data utilisation and processing.
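The abstract does not say how trust metadata is turned into weights, so the following is only a minimal sketch of one possible reading: each sample carries a trust score in (0, 1] that is used directly as its weight in a weighted least-squares fit. The function name `weighted_least_squares`, the trust values, and the synthetic data are all assumptions made for illustration, not the authors' method or datasets.

```python
import numpy as np

def weighted_least_squares(X, y, trust):
    """Fit y ~ intercept + X @ coef, weighting each sample by its trust score.

    Hypothetical helper: the trust-to-weight mapping is an assumption.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(trust, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    # Scaling rows by sqrt(w) makes ordinary least squares minimise
    # sum_i w_i * (y_i - A_i @ beta)^2.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return beta  # [intercept, slope, ...]

# Synthetic data: a trusted source follows y = 2x + 1,
# while a less trusted source reports values offset by +5.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.concatenate([x, x]).reshape(-1, 1)
y = np.concatenate([2.0 * x + 1.0, 2.0 * x + 6.0])
trust = np.array([1.0] * 4 + [0.05] * 4)  # assumed trust annotations

beta_plain = weighted_least_squares(X, y, np.ones(len(y)))
beta_trust = weighted_least_squares(X, y, trust)
print("unweighted intercept:", beta_plain[0])
print("trust-weighted intercept:", beta_trust[0])
```

With uniform weights the fitted intercept sits halfway between the two sources; with the assumed trust weights it moves toward the trusted source's intercept of 1.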
There is a high surge in usage of online e-learning platforms due to the current ongoing COVID-19 scenario. There are specific problems that persist in the current e-learning online models, i.e., validations and track...
ISBN: 9781728104041 (print)
The proceedings contain 17 papers. The topics discussed include: Spark-based machine learning pipeline construction method; implementation of Chinese reader aid for visually-impaired by using neural network and text summarization technologies; a new percentage of sales method for forecasting additional funds needed; an artificially intelligent wearable device for dementia patients; development of IoT-based safety management method through an analysis of structural characteristics and risk factors for industrial valves; analysis of machine learning techniques for credit card fraud detection; social content mining in social networks; using knowledge discovery techniques to support tutoring in an open world intelligent game-based learning environment; and a clustering approach for outliers detection in a big point-of-sales database.
Aim: To recognize human facial expressions accurately using machine learning algorithms and compare the image features against the criminal records. Methods and Materials: The study contains 2 groups i.e., Unsupervised ...
ISBN: 9781665410144 (print)
Data preprocessing is an important prerequisite for data mining and machine learning. In this paper, we introduce Preprocessy, a Python framework that provides customisable data preprocessing pipelines for processing structured data. Preprocessy pipelines come with sane defaults, and the framework also provides low-level functions to build custom pipelines. The paper gives a brief overview of the features and the high-level APIs of Preprocessy, along with a performance comparison against Scikit-learn and Pandas on two datasets. Preprocessy provides functions for handling missing data and outliers, data normalisation, feature selection and data sampling. The goal of Preprocessy is to be easy to use, flexible and performant. Preprocessy helps beginners and experts alike by making data preprocessing an easier and faster task.
The Person Re-identification (Re-ID) task has gained popularity in recent times. Researchers are continuously looking to improve the accuracy of the existing person Re-ID systems. Identifying the person from the surve...
The power industry has achieved rapid development with the strong support of national policies, which makes the relevant data information show geometric growth. With the further promotion of the power market reform, t...
ISBN: 9781728190549 (print)
Due to the characteristics of open links, immense coverage and dynamic network topology, access authentication is crucial and challenging in integrated satellite-terrestrial backhaul networks (ISTBNs). Based on the access behavior of users, this paper proposes an ISTBN access authentication approach to prevent spoofing attacks, the most prominent security threat in ISTBNs. Concretely, we model access authentication as a machine learning classification task with spatial-temporal characteristics. This avoids dependence on identification information, which adversaries can easily copy in spoofing attacks. Then, we design an ensemble learning classifier to judge the legality of access to ISTBNs. By integrating tree-based models and deep learning through the soft voting method, the detection performance for unauthorized access can be improved. Experimental results validate the effectiveness of our approach for preventing unauthorized access to ISTBNs.
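Soft voting, as named in the abstract, combines classifiers by averaging their predicted class probabilities and choosing the class with the highest average. The sketch below illustrates only that combination step. The probability arrays stand in for the outputs of a tree-based model and a neural network; the values, the class encoding (0 = legitimate, 1 = unauthorized), and the helper name `soft_vote` are hypothetical and not taken from the paper.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Average per-model class probabilities and pick the most likely class.

    prob_list: sequence of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights (uniform if None).
    """
    # Stack to shape (n_models, n_samples, n_classes).
    probs = np.stack([np.asarray(p, dtype=float) for p in prob_list])
    avg = np.average(probs, axis=0, weights=weights)
    return avg.argmax(axis=1), avg

# Hypothetical predicted probabilities for three access attempts.
# Columns: [P(legitimate), P(unauthorized)].
tree_probs = np.array([[0.90, 0.10], [0.40, 0.60], [0.45, 0.55]])
net_probs = np.array([[0.70, 0.30], [0.20, 0.80], [0.70, 0.30]])

labels, avg = soft_vote([tree_probs, net_probs])
print(labels)
print(avg)
```

For the third attempt the two stand-in models disagree. Because soft voting averages probabilities instead of counting hard votes, the more confident model determines the outcome.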
Given the rapid growth of emerging systems and technologies, cloud systems of computing, IoT, and other massive data applications, as viewed in this case, are trying to cross thresholds in value realization and real-t...
The use of massive clinical data to support medical decision-making is an inevitable development trend. Medical decision support is based on a variety of data sources accumulated and acquired...