Uncertainty quantification, by means of confidence interval (CI) construction, has been a fundamental problem in statistics and is also important in risk-aware decision-making. In this paper, we revisit the basic problem...
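For background (not part of the abstract; the helper name mean_ci is ours), the textbook normal-approximation CI that such work takes as its starting point can be sketched in a few lines of Python:

import numpy as np
from scipy.stats import norm

def mean_ci(x, alpha=0.05):
    # Normal-approximation (1 - alpha) confidence interval for the mean.
    x = np.asarray(x, dtype=float)
    m = x.mean()
    se = x.std(ddof=1) / np.sqrt(len(x))
    z = norm.ppf(1 - alpha / 2)
    return m - z * se, m + z * se

rng = np.random.default_rng(0)
print(mean_ci(rng.normal(1.0, 2.0, size=500)))  # over repeated samples, covers the true mean 1.0 about 95% of the time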
We consider the problem of optimal portfolio delegation between an investor and a portfolio manager under a random default time. We focus on a novel variation of the principal-agent problem adapted to this fram...
We initiate the study of statistical inference and A/B testing for two market equilibrium models: linear Fisher market (LFM) equilibrium and first-price pacing equilibrium (FPPE). LFM arises from fair resource allocat...
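For background (standard material, not from this abstract), an LFM equilibrium with buyer budgets B_i, valuations u_{ij}, and allocations x_{ij} is characterized by the Eisenberg-Gale convex program:

\[
\max_{x \ge 0} \; \sum_i B_i \log\Bigl(\sum_j u_{ij} x_{ij}\Bigr)
\quad \text{s.t.} \quad \sum_i x_{ij} \le 1 \ \text{for every item } j,
\]

whose optimal dual variables are the equilibrium prices.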
An important goal of modern scheduling systems is to efficiently manage power usage. In energy-efficient scheduling, the operating system controls the speed at which a machine is processing jobs with the dual objectiv...
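For background, in the classical speed-scaling model (an assumption on our part, since the abstract is truncated), a machine running at speed s draws power s^\alpha for some \alpha > 1, so a job of size w processed at constant speed s finishes in w/s time at energy cost

\[
E(w, s) = \frac{w}{s}\, s^{\alpha} = w\, s^{\alpha - 1},
\]

which makes the dual objective explicit: higher speed reduces delay but increases energy superlinearly.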
Hindsight experience replay and goal relabeling are successful in reinforcement learning (RL) since they enable agents to learn from failures. Despite their successes, we lack a theoretical understanding, such as (i) why hindsight experience replay improves sample efficiency and (ii) how to design a relabeling method that achieves sample efficiency. To this end, we construct an example to show the information-theoretical improvement in sample efficiency achieved by goal relabeling. Our example reveals that goal relabeling can enhance sample efficiency and exploit the rich information in observations through better hypothesis elimination. Based on these insights, we develop an RL algorithm called GOALIVE. To analyze the sample complexity of GOALIVE, we introduce a complexity measure, the goal-conditioned Bellman-Eluder (GOAL-BE) dimension, which characterizes the sample complexity of goal-conditioned RL problems. Compared to the Bellman-Eluder dimension, the goal-conditioned version offers an exponential improvement in the best case. To the best of our knowledge, our work provides the first characterization of the theoretical improvement in sample efficiency achieved by goal relabeling.
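As a concrete illustration of the relabeling mechanism analyzed here (a minimal sketch in our own notation, not the paper's GOALIVE algorithm), HER-style "final" relabeling rewrites an episode's transitions against the goal the agent actually reached:

import numpy as np

def relabel_with_hindsight(episode, reward_fn):
    # HER "final" strategy: substitute the goal actually achieved at the
    # end of the episode and recompute each transition's reward.
    achieved_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for t in episode:
        new_t = dict(t)
        new_t["goal"] = achieved_goal
        new_t["reward"] = reward_fn(t["achieved_goal"], achieved_goal)
        relabeled.append(new_t)
    return relabeled

# Sparse goal-reaching reward (our assumption, not from the paper).
reward_fn = lambda achieved, goal: float(np.allclose(achieved, goal, atol=0.05))

episode = [
    {"achieved_goal": np.array([0.1]), "goal": np.array([1.0]), "reward": 0.0},
    {"achieved_goal": np.array([0.3]), "goal": np.array([1.0]), "reward": 0.0},
]
print(relabel_with_hindsight(episode, reward_fn))  # the final transition now earns reward 1.0

A failed episode thus becomes a successful one for the substituted goal, which is the informational gain the abstract quantifies.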
This paper develops and provides a rigorous treatment of the problem of entropy-regularized fine-tuning in the context of continuous-time diffusion models, which was recently proposed by Uehara et al. (arXiv:240...
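For orientation (our paraphrase of the usual reward fine-tuning template, not the paper's exact continuous-time objective), entropy/KL-regularized fine-tuning of a pretrained model p_pre against a reward r solves

\[
\max_{\theta} \; \mathbb{E}_{x \sim p_{\theta}}\bigl[r(x)\bigr] - \alpha\,\mathrm{KL}\bigl(p_{\theta} \,\|\, p_{\mathrm{pre}}\bigr),
\]

where \alpha > 0 controls how far the fine-tuned model may drift from the pretrained one.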
Mean field variational inference (VI) is the problem of finding the closest product (factorized) measure, in the sense of relative entropy, to a given high-dimensional probability measure ρ. The well-known Coordinate...
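As a worked example (ours; we assume the truncated sentence refers to the Coordinate Ascent Variational Inference (CAVI) algorithm), the coordinate updates are available in closed form when ρ is Gaussian:

import numpy as np

def cavi_gaussian(mu, Sigma, n_sweeps=50):
    # Mean-field CAVI for a Gaussian target N(mu, Sigma): each factor q_i is
    # Gaussian with variance 1/Lambda_ii; iterate the closed-form mean updates.
    Lam = np.linalg.inv(Sigma)  # precision matrix
    d = len(mu)
    m = np.zeros(d)             # variational means
    for _ in range(n_sweeps):
        for i in range(d):
            off = np.delete(Lam[i], i) @ (np.delete(m, i) - np.delete(mu, i))
            m[i] = mu[i] - off / Lam[i, i]
    return m, 1.0 / np.diag(Lam)

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
print(cavi_gaussian(mu, Sigma))  # means converge to mu; variances underestimate Sigma's diagonal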
ISBN (print): 9798331314385
Cross-Validation (CV) is the default choice for estimating the out-of-sample performance of machine learning models. Despite its wide usage, its statistical benefits have remained only half-understood, especially in challenging nonparametric regimes. In this paper we fill this gap and show that, in terms of estimating out-of-sample performance, for a wide spectrum of models, CV does not statistically outperform the simple "plug-in" approach, in which one reuses training data for testing evaluation. Specifically, in terms of both the asymptotic bias and the coverage accuracy of the associated interval for out-of-sample evaluation, K-fold CV provably cannot outperform plug-in, regardless of the rate at which the parametric or nonparametric models converge. Leave-one-out CV can have a smaller bias than plug-in; however, this bias improvement is negligible compared to the variability of the evaluation, and in some important cases leave-one-out again does not outperform plug-in once this variability is taken into account. We obtain our theoretical comparisons via a novel higher-order Taylor analysis that dissects the limit theorems of testing evaluations, and which applies to model classes that are not amenable to previously known sufficient conditions. Our numerical results demonstrate that plug-in indeed performs no worse than CV in estimating model performance across a wide range of examples.
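A minimal numerical sketch of the two estimators being compared (our toy OLS example, not the paper's experiments): plug-in reuses training residuals, while K-fold CV averages held-out errors.

import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)

def fit(X, y):
    # Ordinary least squares via the normal equations.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Plug-in: reuse the training data for testing evaluation.
plug_in = np.mean((y - X @ fit(X, y)) ** 2)

# K-fold CV: average the squared error over held-out folds.
K = 5
folds = np.array_split(rng.permutation(n), K)
cv = np.mean([np.mean((y[f] - X[f] @ fit(np.delete(X, f, 0), np.delete(y, f))) ** 2)
              for f in folds])
print(f"plug-in: {plug_in:.3f}   {K}-fold CV: {cv:.3f}")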
Diffusion probabilistic models (DPMs) have emerged as a promising technique in generative modeling. The success of DPMs relies on two ingredients: time reversal of diffusion processes and score matching. In view of po...
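For background (standard notation, not specific to this paper): the forward noising SDE, its time reversal driven by the score \nabla_x \log p_t, and the score-matching objective that learns this score are

\[
\mathrm{d}X_t = f(X_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t,
\qquad
\mathrm{d}\bar{X}_t = \bigl[f(\bar{X}_t, t) - g(t)^2\,\nabla_x \log p_t(\bar{X}_t)\bigr]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{W}_t,
\]

\[
\min_{\theta} \; \mathbb{E}\Bigl[\bigl\| s_{\theta}(X_t, t) - \nabla_x \log p_t(X_t) \bigr\|^2\Bigr].
\]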