A key challenge in data cleaning is estimating which of the tuples in a given database are correct and which are not. However, the output of such systems typically includes both false positives and false negatives, i....
ISBN: 9798350319439 (print)
Various data mining techniques, such as prediction and clustering, can be applied to educational data to study student performance and behavior. Predicting academic results is one method for monitoring student progress and anticipating which students are at risk of failure in their academic career. In this paper, we propose a machine learning (ML) based Educational Data Mining (EDM) approach, named ARSITUN, for the identification of at-risk students. Using ARSITUN, early intervention can be carried out for the detected students in order to lower their risk of failure. The proposed approach was developed and tested using student data collected from the Tunisian administration system for bachelor's and master's programs, called "Salima". We created a new dataset, named GCSD, covering 358 students from the Faculty of Sciences of Gafsa over the academic years 2014-2022. The experimental results show that our EDM model reaches an accuracy of 90.44% for grade prediction among computer science bachelor's students (Tunisian case study).
To improve the accuracy of urban rail transit passenger flow prediction, this paper proposes a prediction model for urban rail transit passenger flow based on data mining. Firstly, factor analysis (FA) is used to mode...
Federated learning has been widely researched and applied in various scenarios. In the context of mobile computing, federated learning protects users from exposing their private data while enabling collaborative train...
ISBN: 9798350372977; 9798350372984 (print)
Digital twins of structures are virtual representations of mechanical systems that are continuously updated using sensor data. However, correlating the millions of degrees of freedom in the numerical models for the digital twin to the test data is challenging. Thus, proper expansion techniques are desirable. This paper provides an analysis of a cantilever beam's dynamic response through the integration of experimental and numerical approaches. There are reduction/expansion approaches to match these degrees of freedom. However, the traditional approaches can only expand displacement, velocity, and acceleration. Sometimes, only strain data is available on the structure. Some researchers have developed approaches for strain expansion and reduction. This paper describes a global expansion approach for test data that is capable of expanding strain to displacement or displacement to strain. To show the merit of the approach, a digital twin of a beam is created and tested in the current work.
The Internet of Things (IoT) stands as a revolutionary technology which impacts smart agriculture and different sectors beyond it. The research describes an intelligent IoT-enabled monitoring system which uses sensors...
ISBN: 9798350399462 (print)
Traffic data serves as a fundamental component in both research and applications within intelligent transportation systems. However, real-world transportation data, collected from loop detectors or similar sources, often contains missing values (MVs), which can adversely impact associated applications and research. Instead of discarding this incomplete data, researchers have sought to recover these missing values through numerical statistics, tensor decomposition, and deep learning techniques. In this paper, we propose an innovative deep learning approach for imputing missing data. A graph attention architecture is employed to capture the spatial correlations present in traffic data, while a bidirectional neural network is utilized to learn temporal information. Experimental results indicate that our proposed method outperforms all other benchmark techniques, thus demonstrating its effectiveness.
Cooling costs account for a significant part of the total energy consumption in data centers, and previous researchers mainly focused on investigating thermal-aware workload distribution strategies for CPU-intensive wor...
ISBN: 9798350309461 (print)
Blockchain-based supply chain traceability systems are characterized by decentralization, transparency, and immutability. However, if all nodes replicate the entire ledger, the result is significant storage overhead, and some nodes might be forced to exit the system due to inadequate storage resources. Additionally, the chained structure of blockchain implies that query delays will increase significantly as the volume of data grows. Furthermore, the immutable nature of blockchain demands a higher level of accuracy in the original data within the traceability system, because once erroneous data is uploaded to the blockchain, it cannot be altered or removed. In this paper, we first analyze the requirements of supply chain business scenarios and the characteristics of traceability data. We then propose a semantic multi-chain storage architecture that categorizes data from different business entities into respective semantic side-chains, thereby addressing issues such as insufficient node storage and on-chain data explosion. Second, we introduce a semantic aggregation storage optimization technique that consolidates data with identical semantic keys into a single block, thereby reducing the number of blocks accessed during queries and improving query efficiency. Lastly, for the first time, we introduce a pre-onchain data reliability verification mechanism based on data characteristics. The results indicate that the proposed solution can offer reduced on-chain storage space, faster query speeds, and enhanced reliability of the original data sources.
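The semantic aggregation idea in the abstract above (placing records with identical semantic keys in a single block so that a query touches fewer blocks) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the key names such as `batch-001`, and all function names are hypothetical, and a block is modeled as a plain Python list with no hashing or chaining.

```python
from collections import defaultdict

def store_one_record_per_block(records):
    """Baseline layout (assumed for comparison): each record gets its own block."""
    return [[rec] for rec in records]

def aggregate_by_semantic_key(records):
    """Group records that share a semantic key into one block."""
    groups = defaultdict(list)
    for key, payload in records:
        groups[key].append((key, payload))
    return list(groups.values())

def blocks_touched(blocks, key):
    """Count the blocks that hold at least one record with the given key."""
    return sum(1 for block in blocks if any(k == key for k, _ in block))

# Hypothetical traceability records: (semantic key, event)
records = [
    ("batch-001", "harvested"),
    ("batch-002", "harvested"),
    ("batch-001", "shipped"),
    ("batch-001", "delivered"),
]

baseline = store_one_record_per_block(records)
aggregated = aggregate_by_semantic_key(records)

print(blocks_touched(baseline, "batch-001"))    # blocks scanned without aggregation
print(blocks_touched(aggregated, "batch-001"))  # blocks scanned with aggregation
```

In this toy layout, a query for one key reads a single aggregated block instead of one block per event, which is the effect the abstract attributes to the technique.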
ISBN: 9798350381641 (print)
Often, recommendation systems employ continuous training, leading to a self-feedback loop bias in which the system becomes biased toward its previous recommendations. Recent studies have attempted to mitigate this bias by collecting small amounts of unbiased data. While these studies have successfully developed less biased models, they ignore the crucial fact that the recommendations generated by the model serve as the training data for subsequent training sessions. To address this issue, we propose a framework that learns an unbiased estimator using a small amount of uniformly collected data and focuses on generating improved training data for subsequent training iterations. To accomplish this, we view recommendation as a contextual multi-armed bandit problem and emphasize exploring items that the model has a limited understanding of. We introduce a new offline sequential training schema that simulates real-world continuous training scenarios in recommendation systems, offering a more appropriate framework for studying self-feedback bias. We demonstrate the superiority of our model over state-of-the-art debiasing methods by conducting extensive experiments using the proposed training schema.
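The notion of favoring items the model has a limited understanding of can be illustrated with a small bandit sketch. Note the substitution: the abstract describes a contextual bandit, whereas the code below uses the simpler context-free UCB1 rule as a stand-in, so it is not the authors' method. The function names, the exploration constant `c`, and the example counts are assumptions chosen for illustration.

```python
import math

def ucb_scores(counts, rewards, total_pulls, c=2.0):
    """UCB1 score per item: observed mean reward plus an uncertainty bonus.

    counts[i]  -- how many times item i has been recommended
    rewards[i] -- summed reward (e.g. clicks) observed for item i
    """
    scores = []
    for n, r in zip(counts, rewards):
        if n == 0:
            # An item never shown has unbounded uncertainty, so try it first.
            scores.append(math.inf)
        else:
            bonus = math.sqrt(c * math.log(total_pulls) / n)
            scores.append(r / n + bonus)
    return scores

def choose_item(counts, rewards):
    """Return the index of the item with the highest UCB1 score."""
    total = max(sum(counts), 1)
    scores = ucb_scores(counts, rewards, total)
    return max(range(len(scores)), key=scores.__getitem__)

# Item 0 is well explored with a higher mean; item 1 has barely been shown.
print(choose_item([100, 2], [60.0, 1.0]))
```

Because the bonus shrinks as an item is shown more often, a rarely shown item can be selected even when its observed mean is lower, which yields training data on items the model knows little about.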