This paper studies variable selection using the penalized likelihood method for distributed sparse regression with large sample size n under a limited memory *** is a much needed research problem to be solved in the big data era. A naive divide-and-conquer method for solving this problem is to split the whole data set into N parts, run each part on one of N machines, aggregate the results from all machines via averaging, and finally obtain the selected ***, it tends to select more noise variables, and the false discovery rate may not be well *** improve it by a specially designed weighted average in *** the alternating direction method of multipliers can be used to deal with massive data in the literature, our proposed method reduces the computational burden considerably and performs better in terms of mean squared error in most ***, we establish asymptotic properties of the resulting estimators for the likelihood models with a diverging number of *** some regularity conditions, we establish oracle properties in the sense that our distributed estimator shares the same asymptotic efficiency as the estimator based on the full ***, a distributed penalized likelihood algorithm is proposed to refine the results in the context of general ***, the proposed method is evaluated by simulations and a real example.
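The naive divide-and-conquer baseline described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the paper's proposed weighted-average estimator: it assumes a linear model with a lasso penalty as the penalized likelihood, solves each local problem by proximal gradient descent, and uses hypothetical names (`lasso_ista`, `naive_divide_and_conquer`, `lam`) that do not come from the paper.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso for a linear model via proximal gradient descent (ISTA).

    Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1.
    The lasso penalty is an assumption made for this sketch.
    """
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the gradient's Lipschitz constant.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        # Soft-thresholding is the proximal operator of the L1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return beta

def naive_divide_and_conquer(X, y, n_machines, lam):
    """Split the rows into n_machines parts, fit each part, average the fits."""
    parts = zip(np.array_split(X, n_machines), np.array_split(y, n_machines))
    # Each iteration stands in for one machine working on its own data part.
    local = np.array([lasso_ista(Xk, yk, lam) for Xk, yk in parts])
    beta_avg = local.mean(axis=0)
    # A variable is selected if its averaged coefficient is nonzero.
    selected = np.flatnonzero(beta_avg != 0)
    return beta_avg, selected
```

In this sketch an averaged coefficient is generally nonzero whenever at least one machine selects that variable, so the selected set behaves like the union of the local selections. That is one way to see why plain averaging tends to pick up more noise variables than a fit on the full data would.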