Empirical risk minimization in which the loss function depends on a pair of data points covers a wide range of applications in statistics, including pairwise ranking and survival analysis. The common empirical risk estimator, obtained by averaging the loss over all possible pairs of observations, is essentially a U-statistic. A well-known problem with minimizing U-statistic type empirical risks is that the computational cost of a U-statistic grows quadratically with the sample size. With big data, the sheer number of observation pairs virtually prohibits centralized computing on a single machine. This paper addresses the problem by developing two computationally and statistically efficient methods based on the divide-and-conquer strategy on a decentralized computing system, in which the data are distributed among machines that each perform part of the task. One method is based on a surrogate of the empirical risk; the other extends the one-step updating scheme of classical M-estimation to the case of pairwise loss. We show that the proposed estimators are asymptotically as efficient as the benchmark global U-estimator obtained under centralized computing. We also introduce two distributed iterative algorithms to facilitate implementation of the proposed methods, and conduct extensive numerical experiments to demonstrate their merit.
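To make the quadratic cost concrete, here is a minimal sketch of a U-statistic empirical risk and a naive divide-and-conquer alternative. Everything in it is an illustrative assumption rather than the paper's procedure: the pairwise squared loss, the simulated data, and the function names are hypothetical, and the distributed estimator shown is plain averaging of local minimizers, not the surrogate-based or one-step method the abstract describes.

```python
import numpy as np


def pairwise_risk(theta, x, y):
    """U-statistic empirical risk: mean pairwise squared loss over all pairs i < j."""
    i, j = np.triu_indices(len(x), k=1)  # n(n-1)/2 index pairs
    dx, dy = x[i] - x[j], y[i] - y[j]
    return float(np.mean((dy - theta * dx) ** 2))


def pairwise_minimizer(x, y):
    """Closed-form minimizer of pairwise_risk for this (assumed) quadratic loss."""
    i, j = np.triu_indices(len(x), k=1)
    dx, dy = x[i] - x[j], y[i] - y[j]
    return float(np.sum(dx * dy) / np.sum(dx ** 2))


def averaged_local_minimizer(x, y, n_machines):
    """Naive divide-and-conquer: minimize on each data block, then average.

    Each block only forms pairs among its own observations, so the total
    number of pairs drops from about n^2/2 to about n^2/(2 * n_machines).
    """
    x_blocks = np.array_split(x, n_machines)
    y_blocks = np.array_split(y, n_machines)
    local = [pairwise_minimizer(a, b) for a, b in zip(x_blocks, y_blocks)]
    return float(np.mean(local))


# Simulated data with a known slope (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 1000
true_theta = 1.5
x = rng.normal(size=n)
y = true_theta * x + rng.normal(scale=0.5, size=n)

theta_global = pairwise_minimizer(x, y)         # uses all 499,500 pairs
theta_dc = averaged_local_minimizer(x, y, 10)   # uses 10 * 4,950 pairs
```

With 10 blocks the sketch evaluates roughly a tenth of the pairs that the global estimator needs, which is the saving that motivates distributing the data across machines.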
In this work, we consider an algorithm for (nonlinear) regression problems with an ℓ0 penalty. The existing algorithms for ℓ0-based optimization problems are often carried out with a fixed step size, and the selection of ...
View synthesis is usually done by an autoencoder, in which the encoder maps a source view image into a latent content code, and the decoder transforms it into a target view image according to the condition. However, t...
For a univariate variable x ∈ R, the Lévy measures for the two real-valued continuous negative definite functions ln and ln cosh(x) are shown in Böttcher et al. (2018). In this note, by utilizing the Fourier t...
Knowledge distillation has emerged as a highly effective method for bridging the representation discrepancy between large-scale models and lightweight models. Prevalent approaches involve leveraging appropriate metric...
Recent observations, especially in cancer immunotherapy clinical trials with time-to-event outcomes, show that the commonly used proportional hazard assumption is often not justifiable, hampering an appropriate analys...
In non-life insurance, it is essential to understand the serial dynamics and dependence structure of the longitudinal insurance data before using them. Existing actuarial literature primarily focuses on modeling, whic...
Symmetric positive definite (SPD) matrices have proven valuable in statistics and machine learning, with applications such as fMRI analysis and traffic prediction. Previous works on SPD matrices mostly focus on d...
Existing GAN inversion methods fail to provide latent codes for reliable reconstruction and flexible editing simultaneously. This paper presents a transformer-based image inversion and editing model for pretrained Sty...