Learners with a limited budget can use supervised data subset selection and active learning techniques to select a smaller training set, reducing the cost of acquiring data and training machine learning (ML) models. However, the resulting high model performance, measured by a data utility function, may not be preserved when some data owners, exercising the GDPR's right to erasure, request that their data be deleted from the ML model. This raises an important question for learners who are temporarily unable or unwilling to acquire data again: during the initial acquisition of a training set of size k, can we proactively maximize the data utility after future unknown deletions? We propose that the learner anticipate (estimate) the probability that (i) each data owner in the feasible set will independently delete its data or (ii) a given number of deletions occur out of k, and we justify this proposal with concrete real-world use cases. Then, instead of directly maximizing the data utility function, the learner can maximize the expected or risk-averse post-deletion utility based on the anticipated probabilities. We further propose how to construct these deletion-anticipative data selection (DADS) maximization objectives so as to preserve monotone submodularity and the near-optimality of greedy solutions and how to optimize the objectives, and we empirically evaluate DADS' performance on real-world datasets. Copyright 2024 by the author(s).
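To make the idea of maximizing expected post-deletion utility concrete, here is a minimal sketch of case (i), where each data owner deletes independently. It is an illustration under stated assumptions, not the authors' implementation: the utility is assumed to be a simple set-coverage function (a standard monotone submodular example), and the names `expected_coverage`, `greedy_select`, `covers`, and `p_delete` are hypothetical. For coverage, the expectation has a closed form: an element stays covered unless every selected owner that covers it deletes.

```python
from math import prod


def expected_coverage(selected, covers, p_delete):
    """Expected number of covered elements after independent deletions.

    Assumption: utility is set coverage; covers[i] is the set of elements
    owner i covers and p_delete[i] is the anticipated deletion probability.
    """
    universe = set()
    for i in selected:
        universe |= covers[i]
    total = 0.0
    for e in universe:
        # e is lost only if every selected owner covering it deletes.
        p_all_gone = prod(p_delete[i] for i in selected if e in covers[i])
        total += 1.0 - p_all_gone
    return total


def greedy_select(covers, p_delete, k):
    """Greedily pick up to k owners by marginal gain in expected coverage."""
    selected = []
    remaining = set(covers)
    for _ in range(min(k, len(covers))):
        base = expected_coverage(selected, covers, p_delete)
        # sorted() makes tie-breaking deterministic.
        best = max(
            sorted(remaining),
            key=lambda i: expected_coverage(selected + [i], covers, p_delete) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    covers = {"a": {1, 2, 3}, "b": {1, 2, 3}, "c": {4, 5}}
    p_delete = {"a": 0.9, "b": 0.1, "c": 0.1}
    print(greedy_select(covers, p_delete, k=2))
```

In this toy instance, owners "a" and "b" cover the same elements, but "a" is far more likely to delete, so the deletion-anticipative greedy prefers "b" and then adds "c" for its new elements rather than "a" as a redundant backup.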
We describe a novel construction of arbitrary read-modify-write (RMW) primitives in a persistent shared memory model with process failures. Our construction uses blocking synchronization, in the form of recoverable mu...
Recent growth in the number of drones has made traffic management unworkable, particularly in urban areas. The safe operation and optimized navigation of drone swarms are now growing concerns. In this article, we use ...
A novel synthesis method for wideband bandpass filter (BPF) with two in-band conjugate complex transmission zeros is proposed for realizing frequency- and attenuation-reconfigurable in-band notch. A new characteristic...
Traditional visual communication design suffers from unclear image details, which can reduce the effectiveness of the design work. This article proposes a computer-aided design method based on image detail...
Grid-connected photovoltaic (PV) systems are crucial to modern renewable energy strategies, but various types of faults can significantly impact their performance. Understanding the behavior of these faults is essenti...
In this work, we introduce novel information-theoretic generalization bounds using the conditional f-information framework, an extension of the traditional conditional mutual information (MI) framework. We provide a g...
This paper presents a review on methods for class-imbalanced learning with the Support Vector Machine (SVM) and its variants. We first explain the structure of SVM and its variants and discuss their inefficiency in le...
Generative artificial intelligence systems such as large language models (LLMs) exhibit powerful capabilities that many see as the kind of flexible and adaptive intelligence that previously only humans could exhibit. ...
Lightweight yet reliable depth estimation models that can be deployed on edge devices are crucial for practical applications in fields such as autonomous driving, robot navigation, and augmented reality. However, prev...