This paper studies variable selection using the penalized likelihood method for distributed sparse regression with large sample size n under a limited memory *** is a much-needed research problem to be solved in the big data era. A naive divide-and-conquer method for solving this problem is to split the whole data into N parts and run each part on one of N machines, aggregate the results from all machines via averaging, and finally obtain the selected ***, it tends to select more noise variables, and the false discovery rate may not be well *** improve it by a specially designed weighted average in *** the alternating direction method of multipliers can be used to deal with massive data in the literature, our proposed method substantially reduces the computational burden and performs better by mean square error in most ***, we establish asymptotic properties of the resulting estimators for the likelihood models with a diverging number of *** some regularity conditions, we establish oracle properties in the sense that our distributed estimator shares the same asymptotic efficiency as the estimator based on the full ***, a distributed penalized likelihood algorithm is proposed to refine the results in the context of general ***, the proposed method is evaluated by simulations and a real example.
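The naive divide-and-conquer baseline described in the abstract above (split the data into N parts, fit each part separately, average the results) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's weighted-average method: the L1 (lasso) penalty, the ISTA solver, the tuning value `lam`, the synthetic data, and the function names `lasso_ista` and `naive_divide_and_conquer` are all hypothetical choices made for the example.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso via proximal gradient (ISTA).

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1.
    The L1 penalty is an assumption; the abstract only says "penalized likelihood".
    """
    n, p = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-thresholding produces exact zeros for unselected variables.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

def naive_divide_and_conquer(X, y, n_machines, lam):
    """Fit the lasso on each data block and average the coefficient vectors."""
    blocks = zip(np.array_split(X, n_machines), np.array_split(y, n_machines))
    local = np.array([lasso_ista(Xk, yk, lam) for Xk, yk in blocks])
    return local.mean(axis=0), local

# Synthetic example (hypothetical data): 3 true signals among 20 variables.
rng = np.random.default_rng(0)
n, p = 2000, 20
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((n, p))
y = X @ beta_true + rng.standard_normal(n)

beta_avg, local = naive_divide_and_conquer(X, y, n_machines=4, lam=0.1)
selected = np.flatnonzero(beta_avg)  # variables with nonzero averaged coefficient
print("selected variables:", selected)
```

Because a plain average is nonzero whenever any single machine selects a variable, the averaged estimator's support is in effect the union of the local supports, which is consistent with the abstract's remark that the naive method tends to select more noise variables.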
In tasks aiming for long-term returns, planning becomes essential. We study generative modeling for planning with datasets repurposed from offline reinforcement learning. Specifically, we identify temporal consistency...
We present a new supervised learning technique for the Variational AutoEncoder (VAE) that allows it to learn a causally disentangled representation and generate causally disentangled outcomes simultaneously. We call t...
From a broader perspective, the objective of visual speech recognition (VSR) is to comprehend the speech spoken by an individual using visual deformations. However, some of the significant limitations of existing solu...
Identifying health conditions from facial images is crucial for the early detection of certain diseases and provides crucial information for timely intervention. This study introduces a novel ensemble convolutional ne...
Conditional dependence plays a crucial role in various statistical procedures, including variable selection, network analysis and causal inference. However, there remains a paucity of relevant research in the context of high-dimensional conditioning variables, a common challenge encountered in the era of big data. To address this issue, many existing studies impose certain model structures, yet high-dimensional conditioning variables often introduce spurious correlations in these models. In this paper, we systematically study the estimation biases inherent in widely used measures of conditional dependence when spurious variables are present under high-dimensional settings. We discuss the estimation inconsistency both intuitively and theoretically, demonstrating that the conditional dependencies can be either overestimated or underestimated under different scenarios. To mitigate these biases and attain consistency, we introduce a measure based on data splitting and refitting techniques for high-dimensional conditional dependence. A conditional independence test is also developed using the newly advocated measure, with a tuning-free asymptotic null distribution. Furthermore, the proposed test is applied to generating high-dimensional network graphs in graphical modeling. The superior performance of the newly proposed methods is illustrated both theoretically and through simulation studies. We also use the method to construct gene-gene networks from a dataset of breast invasive carcinoma, which yields interesting discoveries that are worth further scientific exploration.
Federated learning (FL) is an emerging privacy-preserving distributed computing paradigm, enabling numerous clients to collaboratively train machine learning models without the necessity of transmitting clients’ private datasets to the central *** most existing research where the local datasets of clients are assumed to be unchanged over time throughout the whole FL process, our study addresses scenarios where clients’ datasets need to be updated periodically, and the server can incentivize clients to employ datasets that are as fresh as possible for local model *** primary objective is to design a client selection strategy to minimize the loss of the global model for FL within a constrained *** this end, we introduce the concept of “Age of Information” (AoI) to quantitatively assess the freshness of local datasets and conduct a theoretical analysis of the convergence bound in our AoI-aware FL *** on the convergence bound, we further formulate our problem as a restless multi-armed bandit (RMAB) ***, we relax the RMAB problem and apply the Lagrangian dual approach to decouple it into multiple ***, we propose a Whittle’s Index Based Client Selection (WICS) algorithm to determine the set of selected *** addition, comprehensive simulations substantiate that the proposed algorithm can effectively reduce training loss and enhance the learning accuracy compared with some state-of-the-art methods.
Despite the abundance of research on reducing carbon emissions, there is a significant gap in understanding the influence of macroeconomic factors on carbon dioxide (CO2) emissions from a spatial-structural perspectiv...
In this paper, we advocate a new technique to determine inner bounds for the extreme eigenvalues of real symmetric matrices. Our method involves the matrix elements and compares favourably with existing methods. We al...
Rigorously establishing the safety of black-box machine learning models concerning critical risk measures is important for providing guarantees about model behavior. Recently, Bates et al. (JACM'24) introduced th...