Search results - Inner Mongolia University Library

Regularized robust estimation in binary regression models

JOURNAL OF APPLIED STATISTICS, 2022, vol. 49, no. 3, pp. 574-598

Authors: Tang, Qingguo; Karunamuni, Rohana J.; Liu, Boxiao. Affiliations: Nanjing Univ Sci & Technol Sch Econ & Management Nanjing Peoples R China; Univ Alberta Dept Math & Stat Sci Edmonton AB T6G 2G1 Canada; Bank Montreal Toronto ON Canada

In this paper, we investigate robust parameter estimation and variable selection for binary regression models with grouped data. We investigate estimation procedures based on the minimum-distance approach. In particular, we employ minimum Hellinger and minimum symmetric chi-squared distance criteria and propose regularized minimum-distance estimators. These estimators appear to possess a certain degree of automatic robustness against model misspecification and/or potential outliers. We show that the proposed non-penalized and penalized minimum-distance estimators are efficient under the model and simultaneously have excellent robustness properties. We study their asymptotic properties such as consistency, asymptotic normality and oracle properties. Using Monte Carlo studies, we examine the small-sample and robustness properties of the proposed estimators and compare them with traditional likelihood estimators. We also study two real-data applications to illustrate our methods. The numerical studies indicate the satisfactory finite-sample performance of our procedures.

Keywords: binary regression; maximum likelihood; minimum-distance methods; variable selection; efficiency; robustness
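
As a rough illustration of the minimum-distance idea in the abstract above, the sketch below fits a one-covariate logistic model to grouped data by minimizing a weighted squared Hellinger distance between the observed and model success proportions. It is not the authors' estimator: the function names, the n_i/N group weights, the crude grid search, and the omission of any penalty term are all assumptions made for illustration.

```python
import math


def logistic(z):
    """Logistic response function."""
    return 1.0 / (1.0 + math.exp(-z))


def hellinger_objective(beta, groups):
    """Weighted sum of squared Hellinger distances (up to a constant factor).

    groups: list of (x, n_trials, n_successes) with a scalar covariate x.
    The weights n_trials / N are an illustrative assumption.
    """
    b0, b1 = beta
    total_n = sum(n for _, n, _ in groups)
    value = 0.0
    for x, n, y in groups:
        p_hat = y / n                      # observed success proportion
        p = logistic(b0 + b1 * x)          # model success probability
        h2 = ((math.sqrt(p_hat) - math.sqrt(p)) ** 2
              + (math.sqrt(1.0 - p_hat) - math.sqrt(1.0 - p)) ** 2)
        value += (n / total_n) * h2
    return value


def grid_search(groups, lo=-3.0, hi=3.0, steps=61):
    """Return the (intercept, slope) grid point with the smallest objective."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    candidates = ((b0, b1) for b0 in grid for b1 in grid)
    return min(candidates, key=lambda b: hellinger_objective(b, groups))
```

For example, `grid_search([(-1.0, 50, 3), (0.0, 50, 14), (1.0, 50, 36)])` returns the grid point closest to the minimum-Hellinger fit for three hypothetical groups of 50 trials each. A real implementation would replace the grid search with a numerical optimizer.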

Least squares moment identification of binary regression mixture models

METRIKA, 2021, vol. 84, no. 4, pp. 561-593

Authors: Auder, Benjamin; Gassiat, Elisabeth; Loum, Mor Absa. Affiliation: Univ Paris Saclay Lab Math Orsay CNRS F-91405 Orsay France

We consider finite mixtures of generalized linear models with binary output. We prove that cross moments (between the output and the regression variables) up to order three are sufficient to identify all parameters of the model. We propose a least-squares estimation method based on those moments and we prove the consistency and the Gaussian asymptotic behavior of the estimator. We provide simulation results and comparisons with likelihood methods. Numerical experiments were conducted using the R-package morpheus that we developed for our least-squares moment method and with the R-package flexmix for likelihood methods. We then give some possible extensions to finite mixtures of regressions with binary output including both continuous and categorical covariates, and possibly longitudinal data.

Keywords: Generalized linear model; Mixture model; Moment method; Spectral method; binary regression

Is distribution-free inference possible for binary regression?

ELECTRONIC JOURNAL OF STATISTICS, 2020, vol. 14, no. 2, pp. 3487-3524

Author: Barber, Rina Foygel. Affiliation: Univ Chicago Dept Stat Chicago IL 60637 USA

For a regression problem with a binary label response, we examine the problem of constructing confidence intervals for the label probability conditional on the features. In a setting where we do not have any information about the underlying distribution, we would ideally like to provide confidence intervals that are distribution-free, that is, valid with no assumptions on the distribution of the data. Our results establish an explicit lower bound on the length of any distribution-free confidence interval, and construct a procedure that can approximately achieve this length. In particular, this lower bound is independent of the sample size and holds for all distributions with no point masses, meaning that it is not possible for any distribution-free procedure to be adaptive with respect to any type of special structure in the distribution.

Keywords: Distribution-free nonparametric inference; binary regression; adaptive inference

Integrating random walk and binary regression to identify novel miRNA-disease association

BMC BIOINFORMATICS, 2019, vol. 20, no. 1, pp. 1-13

Authors: Niu, Ya-Wei; Wang, Guang-Hui; Yan, Gui-Ying; Chen, Xing. Affiliations: Shandong Univ Sch Math Jinan 250100 Shandong Peoples R China; Chinese Acad Sci Acad Math & Syst Sci Beijing 100190 Peoples R China; China Univ Min & Technol Sch Informat & Control Engn 1 Daxue Rd Xuzhou 221116 Jiangsu Peoples R China

Background: In the last few decades, cumulative experimental research has demonstrated and verified the important roles of microRNAs (miRNAs) in the development of complex human diseases. Benefitting from the rapid growth both in the availability of miRNA-related data and in the development of various analysis methodologies, some computational models have recently been developed to predict human disease-related miRNAs, efficiently and *** this work, we proposed a computational model of Random Walk and binary regression-based MiRNA-Disease Association prediction (RWBRMDA). RWBRMDA extracted features for each miRNA from random walk with restart on the integrated miRNA similarity network for binary logistic regression to predict potential miRNA-disease associations. RWBRMDA obtained an AUC of 0.8076 in leave-one-out cross validation. Additionally, we carried out three different patterns of case studies on four complex human diseases. Specifically, esophageal cancer and prostate cancer were examined in one kind of case study based on known miRNA-disease associations in the HMDD v2.0 database. Out of the top 50 predicted miRNAs, 94% and 90%, respectively, were confirmed by recent experimental reports. To simulate a new disease without known related miRNAs, the information on known breast cancer-related miRNAs was removed. As a result, 98% of the top 50 predicted miRNAs for breast cancer were confirmed. Lymphoma, the verified ratio of which was 88%, was used to assess the prediction robustness of RWBRMDA based on the association records in HMDD v1.0 *** anticipated that RWBRMDA could benefit future experimental investigations of the relation between human disease and miRNAs by generating promising and testable top-ranked miRNAs, and by significantly reducing the effort and cost of identification work.

Keywords: microRNA; Disease; miRNA-disease association; Random walk; binary regression
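
The random walk with restart step mentioned in the abstract above can be sketched generically as follows. This is a minimal illustration of the standard iteration p <- (1 - r) W p + r p0 on a small similarity network, not the RWBRMDA implementation; the function name, the restart probability of 0.5, and the column normalization are assumptions.

```python
def random_walk_with_restart(adjacency, seeds, restart_prob=0.5,
                             tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the L1 change is below tol.

    adjacency: square list of lists of non-negative similarity weights.
    seeds: indices of the nodes where the walker restarts.
    """
    n = len(adjacency)
    # Column sums turn the weights into a column-stochastic transition matrix W.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        p_next = []
        for i in range(n):
            walk = sum(adjacency[i][j] / col_sums[j] * p[j]
                       for j in range(n) if col_sums[j] > 0)
            p_next.append((1.0 - restart_prob) * walk + restart_prob * p0[i])
        if sum(abs(a - b) for a, b in zip(p_next, p)) < tol:
            return p_next
        p = p_next
    return p
```

The resulting probability vector (one per seed set) is the kind of per-node feature that could then be passed to a binary logistic regression, as the abstract describes.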

Performance of asymmetric links and correction methods for imbalanced data in binary regression

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2019, vol. 89, no. 9, pp. 1694-1714

Authors: Huayanay, Alex de la Cruz; Bazan, Jorge L.; Cancho, Vicente G.; Dey, Dipak K. Affiliations: USP UFSCar Interinst Grad Stat Sao Carlos SP Brazil; Univ Sao Paulo Dept Appl Math & Stat Sao Carlos SP Brazil; Univ Connecticut Dept Stat Mansfield CT USA

In binary regression, imbalanced data result from the presence of values equal to zero (or one) in a proportion that is significantly greater than the corresponding real values of one (or zero). In this work, we evaluate two methods developed to deal with imbalanced data and compare them to the use of asymmetric links. The results of the simulation study show that correction methods do not adequately correct bias in the estimation of regression coefficients and that the models with the power and reverse power links considered produce better results for certain types of imbalanced data. Additionally, we present an application for imbalanced data, identifying the best model among the various ones proposed. The parameters are estimated using a Bayesian approach based on the Hamiltonian Monte Carlo method with the No-U-Turn Sampler algorithm, and the models were compared using different criteria for model comparison, predictive evaluation and quantile residuals.

Keywords: Asymmetric link; binary regression; imbalanced data; predictive evaluation; quantile residuals; similarity measures

Optimal adaptive inference in random design binary regression

BERNOULLI, 2018, vol. 24, no. 1, pp. 699-739

Authors: Mukherjee, Rajarshi; Sen, Subhabrata. Affiliation: Stanford Univ Dept Stat Sequoia Hall, 390 Serra Mall Stanford CA 94305 USA

We construct confidence sets for the regression function in nonparametric binary regression with an unknown design density, a nuisance parameter in the problem. These confidence sets are adaptive in L2 loss over a continuous class of Sobolev-type spaces. Adaptation holds in the smoothness of the regression function, over the maximal parameter spaces where adaptation is possible, provided the design density is smooth enough. We identify two key regimes: one where adaptation is possible, and one where some critical regions must be removed. We address related questions about goodness-of-fit testing and adaptive estimation of relevant infinite-dimensional parameters.

Keywords: adaptive confidence sets; binary regression; U-statistics

Binary quantile regression and variable selection: A new approach

ECONOMETRIC REVIEWS, 2019, vol. 38, no. 6, pp. 679-694

Authors: Aristodemou, Katerina; He, Jian; Yu, Keming. Affiliations: Brunel Univ London Uxbridge Middx England; Shihezi Univ Shihezi Xinjiang Weiwue Peoples R China

In this paper, we propose a new estimation method for binary quantile regression and variable selection which can be implemented by an iteratively reweighted least squares approach. In contrast to existing approaches, this method is computationally simple, guaranteed to converge to a unique solution, and can be implemented with standard software packages. We demonstrate our methods using Monte Carlo experiments and then apply the proposed method to the widely used work trip mode choice dataset. The results indicate that the proposed estimators work well in finite samples.

Keywords: Adaptive lasso; binary regression; iteratively reweighted least squares; quantile regression; smoothed maximum score estimator; variable selection; work trip mode choice
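
To make the phrase "iteratively reweighted least squares" concrete, here is a minimal sketch of IRLS for ordinary logistic regression with one covariate. This is a deliberately swapped-in, textbook use of IRLS, not the binary quantile regression method proposed in the paper above; the function name and the fixed iteration count are assumptions.

```python
import math


def irls_logistic(xs, ys, n_iter=25):
    """Fit P(y = 1 | x) = logistic(b0 + b1 * x) by IRLS (Newton steps).

    Each pass solves the 2x2 weighted normal equations (X'WX) beta = X'Wz.
    Assumes the data are not perfectly separable.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)          # working weight
            z = eta + (y - p) / w      # working response
            s_w += w
            s_wx += w * x
            s_wxx += w * x * x
            s_wz += w * z
            s_wxz += w * x * z
        det = s_w * s_wxx - s_wx * s_wx
        b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        b1 = (s_w * s_wxz - s_wx * s_wz) / det
    return b0, b1
```

Each iteration is just a weighted least-squares fit, which is why methods of this type can run on standard regression software.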

How to Take Both Non-Linearity and Asymmetry (Skewness) into Account in Binary Decision Making: Skew-Probit and Skew-Logit in Binary Kink Regression

INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2020, vol. 28, no. Sup1, pp. 39-49

Author: Maneejuk, Paravee. Affiliation: Chiang Mai Univ Fac Econ Ctr Excellence Econometr Chiang Mai Thailand

In many practical situations, it is desirable to predict binary ("yes"-"no") decisions made by people. The traditional approach to this prediction assumes that the utility depends linearly on the corresponding parameters, and that the distribution of the difference between predicted and actual utility is symmetric - usually normal or logistic; the corresponding techniques are known as probit and logit, respectively. In real life, utility often depends non-linearly on the parameters, and the corresponding distributions are asymmetric (skewed). There are techniques for dealing with non-linearity; the most widely used such technique - called kink regression - uses piece-wise linear approximations to the utility. There are also techniques that take into account the distribution's asymmetry; usually, they are based on using special asymmetric distributions: skew-normal and skew-logistic. In this paper, we show how these two techniques can be combined to take into account both non-linearity and asymmetry. On a real-life example, we show that the new technique indeed leads to a better description of human binary decision-making.

Keywords: binary decisions; binary regression; kink regression; logit; probit; skew-normal distribution; skew-logistic distribution
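
The two ingredients named in the abstract above, a symmetric link (logit or probit) and a piece-wise linear "kink" utility, can be sketched as follows. This shows only the traditional symmetric case; the skew-normal and skew-logistic versions studied in the paper are not implemented here, and all function and parameter names are hypothetical.

```python
import math


def logit_prob(u):
    """P(yes) under a logistic error distribution (logit model)."""
    return 1.0 / (1.0 + math.exp(-u))


def probit_prob(u):
    """P(yes) under a standard normal error distribution (probit model)."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def kink_utility(x, kink, intercept, slope_below, slope_above):
    """Continuous piece-wise linear utility with a single kink at `kink`."""
    return (intercept
            + slope_below * min(x - kink, 0.0)
            + slope_above * max(x - kink, 0.0))


def choice_probability(x, kink, intercept, slope_below, slope_above,
                       link=logit_prob):
    """Probability of a "yes" decision for covariate value x."""
    return link(kink_utility(x, kink, intercept, slope_below, slope_above))
```

Swapping `link=probit_prob` for the default changes only the assumed error distribution; the utility stays linear on each side of the kink point.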

HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION

ANNALS OF STATISTICS, 2015, vol. 43, no. 1, pp. 352-381

Authors: Mukherjee, Rajarshi; Pillai, Natesh S.; Lin, Xihong. Affiliations: Stanford Univ Dept Stat Stanford CA 94305 USA; Harvard Univ Dept Stat Cambridge MA 01880 USA; Harvard Univ Dept Biostat Boston MA 02115 USA

In this paper, we study the detection boundary for minimax hypothesis testing in the context of high-dimensional, sparse binary regression models. Motivated by genetic sequencing association studies for rare variant effects, we investigate the complexity of the hypothesis testing problem when the design matrix is sparse. We observe a new phenomenon in the behavior of the detection boundary which does not occur in the case of Gaussian linear regression. We derive the detection boundary as a function of two components: a design matrix sparsity index and signal strength, each of which is a function of the sparsity of the alternative. For any alternative, if the design matrix sparsity index is too high, any test is asymptotically powerless irrespective of the magnitude of signal strength. For binary design matrices with a sparsity index that is not too high, our results are parallel to those in the Gaussian case. In this context, we derive detection boundaries for both dense and sparse regimes. For the dense regime, we show that the generalized likelihood ratio is rate optimal; for the sparse regime, we propose an extended Higher Criticism Test and show it is rate optimal and sharp. We illustrate the finite sample properties of the theoretical results using simulation studies.

Keywords: Minimax hypothesis testing; binary regression; detection boundary; Higher Criticism; sparsity

'Does God toss logistic coins?' and other questions that motivate regression by composition

JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, 2024, vol. 187, no. 3, pp. 636-655

Authors: Daniel, Rhian M.; Farewell, Daniel M.; Huitfeldt, Anders. Affiliations: Cardiff Univ Sch Med Div Populat Med Cardiff CF14 4YS Wales; Cardiff Univ Sch Med Div Populat Med Cardiff Wales

Regression by composition is a new and flexible toolkit for building and understanding statistical models. Focusing here on regression models for a binary outcome conditional on a binary treatment and other covariates, we motivate the need for regression by composition. We do this first by exhibiting, using L'Abbé plots, the families of relationships between untreated and treated conditional outcome risks that emerge from generalized linear models for many different link functions. These are compared with the relationships (between untreated and treated risks) that arise from mechanistic sufficient component cause models, which are first principles causal models for binary outcomes. By considering mechanistic models that allow for non-monotone causal effects and by allowing sufficient causes to be associated, we expand upon similar discussions in the recent literature. We discuss conditions under which commonly used statistical models for binary data, such as logistic regression, arise from mechanistic models where the sufficient causes are associated in a particular way, as well as other situations in which the statistical models arising do not correspond to a generalized linear model but can be naturally expressed as a regression by composition model.

Keywords: binary regression; causal models; generalized linear models; mechanistic models; regression by composition
