Extracting skill information for students in online learning environments has been a challenging topic across different domains. Predicting the number of skills is the first step towards estimating students’ skills. ...
ISBN: (print) 9781665457019
Testing machine learning (ML) projects is challenging due to the inherent non-determinism of various ML algorithms and the lack of reliable ways to compute reference results. Developers typically rely on their intuition when writing tests to check whether ML algorithms produce accurate results. However, this approach leads to conservative choices in selecting assertion bounds for comparing actual and expected results in test assertions. Because developers want to avoid false positive failures in tests, they often set the bounds too loosely, potentially missing critical bugs. We present FASER, the first systematic approach for balancing the trade-off between the fault-detection effectiveness and flakiness of non-deterministic tests by computing optimal assertion bounds. FASER frames this trade-off as an optimization problem between these competing objectives by varying the assertion bound. FASER leverages 1) statistical methods to estimate the flakiness rate, and 2) mutation testing to estimate the fault-detection effectiveness. We evaluate FASER on 87 non-deterministic tests collected from 22 popular ML projects. FASER finds that 23 out of 87 studied tests have conservative bounds and proposes tighter assertion bounds that maximize the fault-detection effectiveness of the tests while limiting flakiness. We have sent 19 pull requests to developers, each fixing one test, out of which 14 pull requests have already been accepted.
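The flakiness side of this trade-off can be illustrated with a minimal sketch. It is not FASER's actual procedure: it assumes the test's output has been recorded over repeated runs and uses the plain empirical failure fraction as a stand-in for the statistical estimate, and it omits the mutation-testing side entirely. The names `flaky_rate` and `tightest_bound` are hypothetical.

```python
def flaky_rate(samples, expected, bound):
    """Fraction of recorded runs that would fail `abs(actual - expected) <= bound`."""
    failures = sum(1 for s in samples if abs(s - expected) > bound)
    return failures / len(samples)


def tightest_bound(samples, expected, max_flaky_rate):
    """Smallest observed deviation that keeps the empirical failure rate
    at or below `max_flaky_rate` (an assumed, user-chosen tolerance)."""
    deviations = sorted(abs(s - expected) for s in samples)
    for bound in deviations:
        if flaky_rate(samples, expected, bound) <= max_flaky_rate:
            return bound
    # Unreachable for max_flaky_rate >= 0: the largest deviation fails no run.
    return deviations[-1]


# Hypothetical outputs of one non-deterministic test over ten runs.
runs = [100, 101, 99, 102, 98, 105, 100, 101, 100, 90]
bound = tightest_bound(runs, expected=100, max_flaky_rate=0.1)
```

A tighter bound catches more faulty behavior but fails more often on correct code, so the tolerance passed as `max_flaky_rate` is the knob that expresses how much flakiness is acceptable.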
Implementing an algorithmically-informed policy represents a significant intervention into existing social structures. How such an intervention will affect society is a "naive", but arguably central, questio...
ISBN: (print) 9798400712296
Controller Area Network (CAN), despite facilitating electronic control unit (ECU) communications, lacks built-in mechanisms for secure transmission, exposing its messages to cyber-attacks due to unsecured broadcasting. Current Intrusion Detection Systems (IDSs) for CAN rely predominantly on rule-based, statistical, or supervised machine learning (ML) models, which require significant human intervention for tasks such as reconfiguration, gathering labeled data samples, and retraining with newly released vehicle models. These manual dependencies highlight the critical need for an autonomous capability in IDSs that can adapt independently, thus mitigating practical deployment challenges in real-world scenarios. In this paper, we propose an autonomous cybersecurity IDS named Auto-CIDS, designed to minimize human intervention and enable active learning utilizing past experiences. By applying Deep Reinforcement Learning (DRL) with the advantages of unsupervised algorithms, we train Deep Q-network (DQN) agents in a self-supervised manner using their own past experiences. We develop three standalone autonomous methods. The first method, Single-Task Self-Supervised, uses an autoencoder to supervise DQN agents in each environment, which includes both normal and specific attack data without needing labeled datasets. The second method, Multi-Environment Self-Supervised, enhances the generalization ability of the first by training a DQN agent across multiple environments, allowing knowledge transfer from varied settings into a single agent. The third method, Multi-Task Multi-Agent, increases the robustness of our proposed Auto-CIDS by employing a combination of modified unsupervised methods, including autoencoder, k-means, and isolation forest algorithms, each tailored for a specific type of attack. This approach builds attack-specific DQNs that periodically and cooperatively train a global DQN agent based on their predictions, facilitating ongoing active learning. We conducted experim
Model-based deep learning solutions to inverse problems have attracted increasing attention in recent years as they bridge state-of-the-art numerical performance with interpretability. In addition, the incorporated pr...
ISBN: (digital) 9781665494557
ISBN: (print) 9781665494557
Site-specific radio frequency (RF) propagation prediction increasingly relies on models built from visual data captured by sensors such as cameras and LIDAR. When operating in dynamic settings, the environment may only be partially observed. This paper introduces a method to extract statistical channel models, given partial observations of the surrounding environment. We propose a simple heuristic algorithm that performs ray tracing on the partial environment and then uses machine-learning-trained predictors to estimate the channel and its uncertainty from features extracted from the partial ray tracing results. It is shown that the proposed method can interpolate between fully statistical models when no partial information is available and fully deterministic models when the environment is completely observed. The method can also capture the degree of uncertainty of the propagation predictions depending on how much of the region has been explored. The methodology is demonstrated in a robotic navigation application simulated on a set of indoor maps with detailed models constructed using state-of-the-art navigation, simultaneous localization and mapping (SLAM), and computer vision methods.
System identification, system inversion and its multiple-signal extension to source separation, including non-blind and blind (i.e. unsupervised) configurations, are major topics in classical, i.e. nonquantum, signal ...
This tutorial focuses on efficient methods for predictive monitoring (PM), the problem of detecting at runtime future violations of a given requirement from the current state of a system. While performing model checkin...
Creating a system for complex disease diagnosis based on gene expression data using modern data mining and machine learning techniques is one of the topical areas of recent bioinformatics. The main problem in this subj...
ISBN: (print) 9781728163383
The fundamental task of classification given a limited number of training data samples is considered for physical systems with known parametric statistical models. As a solution, a hybrid classification method, termed HYPHYLEARN, is proposed that exploits both the physics-based statistical models and the learning-based classifiers. The proposed solution is based on the conjecture that HYPHYLEARN would alleviate the challenges associated with the individual approaches of learning-based and statistical model-based classifiers by fusing their respective strengths. The proposed hybrid approach first estimates the unobservable model parameters using the available (suboptimal) statistical estimation procedures, and subsequently uses the physics-based statistical models to generate synthetic data. Next, the training data samples are combined with the synthetic data in a learning-based classifier that is based on domain-adversarial training of neural networks. Numerical results on multiuser detection, a concrete communication problem, demonstrate that HYPHYLEARN leads to major classification improvements compared to the existing stand-alone and hybrid classification methods.
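The first two stages described above (estimate the unknown parameters, then generate synthetic data from the model) can be sketched as follows. This is only an illustration under strong assumptions, not the HYPHYLEARN implementation: the model is taken to be a one-dimensional Gaussian per class with unknown mean and a known noise level, the estimator is the sample mean, and the domain-adversarial classifier is left out. All function names are hypothetical.

```python
import random
from statistics import mean


def estimate_class_means(train):
    """Estimate the unknown mean of each class from labelled (x, label) pairs."""
    by_label = {}
    for x, label in train:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def synthesize(class_means, noise_std, n_per_class, rng):
    """Draw synthetic (x, label) pairs from the assumed model x ~ N(mean_label, noise_std^2)."""
    return [(rng.gauss(mu, noise_std), label)
            for label, mu in class_means.items()
            for _ in range(n_per_class)]


def augmented_training_set(train, noise_std, n_per_class, seed=0):
    """Real samples followed by synthetic samples drawn from the fitted model."""
    rng = random.Random(seed)
    class_means = estimate_class_means(train)
    return train + synthesize(class_means, noise_std, n_per_class, rng)


# Hypothetical scarce training data: two classes, two samples each.
train = [(-1.2, 0), (-0.8, 0), (0.9, 1), (1.1, 1)]
augmented = augmented_training_set(train, noise_std=0.5, n_per_class=50)
```

In the abstract's approach the real and synthetic samples are treated as two domains by the classifier; the sketch simply concatenates them, which loses that distinction.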