In medical studies, composite indices and scores are routinely used for predicting the medical conditions of patients. These indices are usually developed from observed data on certain disease risk factors, and it has been demonstrated in the literature that single-index models can provide a powerful tool for this purpose. In practice, the observed data on disease risk factors are often longitudinal, in the sense that they are collected at multiple time points for individual patients, and there are often multiple aspects of a patient's medical condition that are of concern. However, most existing single-index models are developed for cases with independent data and a single response variable, which is inappropriate for the problem just described, in which within-subject observations are usually correlated and multiple mutually correlated response variables are involved. This paper aims to fill this methodological gap by developing a single-index model for analyzing longitudinal data with multiple responses. Both theoretical and numerical justifications show that the proposed method provides an effective solution to this research problem. The method is also demonstrated using a dataset from the English Longitudinal Study of Aging.
Length-biased data occur often in many scientific fields, including clinical trials, epidemiological surveys and genome-wide association studies, and many methods have been proposed for their analysis under various situations. In this article, we consider the situation where one faces length-biased and partly interval-censored failure time data under the proportional hazards model, for which no established method seems to exist. For estimation, we propose an efficient nonparametric maximum likelihood method that incorporates the distribution information of the observed truncation times. For the implementation of the method, a flexible and stable EM algorithm via two-stage data augmentation is developed. By employing empirical process theory, we establish the asymptotic properties of the resulting estimators. A simulation study conducted to assess the finite-sample performance of the proposed method suggests that it works well and is more efficient than the conditional likelihood approach. An application to an AIDS cohort study is also provided.
We study the statistical properties of an estimator derived by applying a gradient ascent method with multiple initializations to a multi-modal likelihood function. We derive the population quantity that is the target of this estimator and study the properties of confidence intervals (CIs) constructed from asymptotic normality and from the bootstrap. In particular, we analyze the coverage deficiency due to the finite number of random initializations. We also investigate CIs obtained by inverting the likelihood ratio test, the score test, and the Wald test, and we show that the resulting CIs may be very different. We propose a two-sample test procedure that applies even when the maximum likelihood estimator is intractable. In addition, we analyze the performance of the EM algorithm under random initializations and derive the coverage of a CI with a finite number of initializations. Supplementary materials for this article are available online.
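The multi-start strategy this abstract analyzes can be sketched in a few lines. The objective below is a toy multi-modal function standing in for a log-likelihood, and the step size, iteration count, and number of starts are all invented for illustration; none of this reproduces the paper's setup:

```python
import numpy as np

def loglik(theta):
    # Toy multi-modal objective: two bumps, the left one (at -2) is higher.
    return np.exp(-0.5 * (theta - 1.0) ** 2) + 1.5 * np.exp(-0.5 * (theta + 2.0) ** 2)

def grad(theta, eps=1e-6):
    # Numerical gradient via central differences.
    return (loglik(theta + eps) - loglik(theta - eps)) / (2 * eps)

def gradient_ascent(theta0, lr=0.5, n_steps=2000):
    theta = theta0
    for _ in range(n_steps):
        theta += lr * grad(theta)
    return theta

rng = np.random.default_rng(0)
starts = rng.uniform(-5, 5, size=20)                    # random initializations
local_optima = np.array([gradient_ascent(t0) for t0 in starts])
best = local_optima[np.argmax(loglik(local_optima))]    # keep the highest mode found
```

With finitely many random starts, the estimator returns the best *visited* mode; the paper's coverage analysis concerns exactly the event that no start lands in the basin of the global maximum.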
The mixture cure model is widely used to analyze survival data in the presence of a cured subgroup. Standard logistic regression-based approaches to modeling the incidence may lead to poor predictive accuracy of cure, specifically when the covariate effect is non-linear. Supervised machine learning techniques can serve as a better classifier than logistic regression due to their ability to capture non-linear patterns in the data. However, interpretability hangs in the balance due to the trade-off between interpretability and predictive accuracy. We propose a new mixture cure model where the incidence part is modeled using a decision tree-based classifier and the proportional hazards structure for the latency part is preserved. The proposed model is very easy to interpret, closely mimics the human decision-making process, and provides flexibility to gauge both linear and non-linear covariate effects. For the estimation of model parameters, we develop an expectation-maximization algorithm. A detailed simulation study shows that the proposed model outperforms the logistic regression-based and spline regression-based mixture cure models, both in terms of model fitting and predictive accuracy. An illustrative example with data from a leukemia study is presented to further support our conclusion.
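Several abstracts in this listing fit their models by expectation-maximization. As a generic illustration of the E-step/M-step alternation (a plain two-component Gaussian mixture, not the tree-based cure model above; data and initial values are invented), one might write:

```python
import numpy as np

rng = np.random.default_rng(1)
# Simulated data: a minority component near 0 and a majority component near 4.
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 1.0, 700)])

# EM for a two-component Gaussian mixture with unit variances, for brevity.
pi, mu0, mu1 = 0.5, -1.0, 1.0
for _ in range(200):
    # E-step: posterior probability that each point belongs to component 1.
    d0 = (1 - pi) * np.exp(-0.5 * (x - mu0) ** 2)
    d1 = pi * np.exp(-0.5 * (x - mu1) ** 2)
    w = d1 / (d0 + d1)
    # M-step: update mixing proportion and component means.
    pi = w.mean()
    mu0 = ((1 - w) * x).sum() / (1 - w).sum()
    mu1 = (w * x).sum() / w.sum()
```

In a mixture cure model the latent indicator plays the role of `w` here (cured versus susceptible), with the incidence and latency submodels updated in the M-step.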
In this paper, we formulate and estimate a flexible model of job mobility and wages with two-sided heterogeneity. The analysis extends the finite mixture approach of Bonhomme, Lamadon, and Manresa (2019) and Abowd, McKinney, and Schmutte (2019) to develop a new Classification Expectation-Maximization algorithm that ensures both worker and firm latent-type identification using wage and mobility variations in the data. Workers receive job offers in worker-type segmented labor markets. Offers are accepted according to a logit form that compares the value of the current job with that of the new job. In combination with flexibly estimated layoff and job finding rates, the analysis quantifies four different sources of sorting: job preferences, segmentation, layoffs, and job finding. Job preferences are identified through job-to-job moves in a revealed preference argument. In the model, they are structurally independent of the identified job wages, possibly as a reflection of the presence of amenities. We find evidence of a strong pecuniary motive in job preferences. While the correlation between preferences and current job wages is positive, the net present value of the future earnings stream given the current job correlates much more strongly with preferences for it. This is more so for short- than long-tenure workers. In the analysis, we distinguish between type sorting and wage sorting. Type sorting is quantified by means of the mutual information index. Wage sorting is captured through the correlation between identified wage types. While layoffs are less important than the other channels, we find all channels to contribute substantially to sorting. As workers age, job arrival processes are the key determinant of wage sorting, whereas job preferences dictate type sorting. Over the life cycle, job preferences intensify, type sorting increases, and pecuniary considerations wane.
Spatial transcriptomics is a groundbreaking technology that allows gene activity to be measured while retaining the information about where in the tissue the activity occurs. This technology has enabled the study of the spatial variation of genes across the tissue. Comprehending gene functions and interactions in different areas of the tissue is of great scientific interest, as it might lead to a deeper understanding of several key biological mechanisms, such as cell-cell communication or tumor-microenvironment interaction. To do so, one can group cells of the same type and genes that exhibit similar expression patterns. However, adequate statistical tools that exploit the previously unavailable spatial information to more coherently group cells and genes are still lacking. In this work we introduce SPARTACO, a new statistical model that clusters the spatial expression profiles of the genes according to a partition of the tissue. This is accomplished by performing a co-clustering, that is, inferring the latent block structure of the data and inducing two types of clustering: of the genes, using their expression across the tissue, and of the image areas, using the gene expression in the spots where the RNA is collected. Our proposed methodology is validated with a series of simulation experiments, and its usefulness in responding to specific biological questions is illustrated with an application to a human brain tissue sample processed with the 10X-Visium protocol.
The mixture cure model has become increasingly popular in biostatistics, where some individuals may never experience the event of interest during a study. In most cases, the effects of continuous covariates are assumed to be linear. However, the traditional linear assumption often fails in practice because real-life effects are usually nonlinear. We propose a linear spline Cox cure model in which a spline approximates the unknown smooth functional form of a continuous covariate's effect, in order to identify a nonlinear functional relationship. The justification and estimation procedure start from a Laplace approximation of the marginal log-likelihood function and lead to a penalized log-likelihood. The expectation-maximization algorithm is used to estimate the model parameters, and the proposed methodology can then be used to assess the linearity of the continuous covariate effect via a likelihood ratio procedure. An extensive simulation study is conducted to investigate the performance of the proposed lack-of-fit test for the linearity of the continuous covariate effect. The practical use of the methodology is illustrated with fibrous histiocytoma data from the Surveillance, Epidemiology, and End Results (SEER) program database.
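The linear-spline idea underlying this abstract can be illustrated outside the Cox cure setting. The sketch below builds a truncated power basis (intercept, slope, and one hinge per knot) and fits a smooth nonlinear effect by ordinary least squares; the target function, knot placement, and grid are all invented for demonstration:

```python
import numpy as np

def linear_spline_basis(x, knots):
    # Truncated power basis for a linear spline: [1, x, (x - k)_+ for each knot k].
    cols = [np.ones_like(x), x]
    for k in knots:
        cols.append(np.maximum(x - k, 0.0))
    return np.column_stack(cols)

# Approximate a smooth nonlinear effect f(x) = sin(x) with a linear spline.
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)
knots = np.linspace(0.5, 2 * np.pi - 0.5, 8)

B = linear_spline_basis(x, knots)                 # 200 x (2 + 8) design matrix
coef, *_ = np.linalg.lstsq(B, y, rcond=None)      # least-squares spline fit
fit = B @ coef
max_err = np.abs(fit - y).max()                   # piecewise-linear approximation error
```

In the paper's setting the same basis would enter the Cox cure model's linear predictor, with a penalty on the hinge coefficients and the linearity test corresponding to all hinge coefficients being zero.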
In this paper, we propose a Bayesian additive Cox model for analyzing current status data based on the expectation-maximization variable selection method. This model concurrently estimates unknown parameters and identifies risk factors, which improves both model interpretability and predictive ability. To identify risk factors, we assign appropriate priors to the indicator variables that denote whether each risk factor is included. By assuming partially linear effects of the covariates, the proposed model offers flexibility in accounting for the relationship between risk factors and survival time. The baseline cumulative hazard function and the nonlinear effects are approximated via penalized B-splines to reduce the dimension of the parameters. An easy-to-implement expectation-maximization algorithm is developed using a two-stage data augmentation procedure involving latent Poisson variables. Finally, the performance of the proposed method is investigated through simulations and a real data analysis, which show promising results for the proposed Bayesian variable selection method.
Recent evidence highlights the usefulness of DNA methylation (DNAm) biomarkers as surrogates for exposure to risk factors for noncommunicable diseases in epidemiological studies and randomized trials. DNAm variability has been demonstrated to be tightly related to lifestyle behavior and exposure to environmental risk factors, ultimately providing an unbiased proxy of an individual's state of health. At present, the creation of DNAm surrogates relies on univariate penalized regression models, with the elastic-net regularizer being the gold standard for the task. Nonetheless, more advanced modeling procedures are required in the presence of multivariate outcomes with a structured dependence pattern among the study samples. In this work we propose a general framework for mixed-effects multitask learning in the presence of high-dimensional predictors to develop a multivariate DNAm biomarker from a multicenter study. A penalized estimation scheme, based on an expectation-maximization algorithm, is devised in which any penalty criterion for fixed-effects models can be conveniently incorporated into the fitting process. We apply the proposed methodology to create novel DNAm surrogate biomarkers for multiple correlated risk factors for cardiovascular diseases and comorbidities. We show that the proposed approach, modeling multiple outcomes together, outperforms state-of-the-art alternatives both in predictive power and in the biomolecular interpretation of the results.
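The univariate elastic-net baseline this abstract refers to can be sketched with scikit-learn. This is not the authors' multitask mixed-effects estimator, only the single-outcome "gold standard" it is compared against, run on invented toy data (500 CpG-like predictors, 5 truly associated):

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(2)
# Toy stand-in for DNAm data: 100 samples, 500 CpG-like predictors,
# with only the first 5 truly associated with the outcome.
X = rng.normal(size=(100, 500))
beta = np.zeros(500)
beta[:5] = 2.0
y = X @ beta + rng.normal(scale=0.5, size=100)

# Elastic net mixes L1 (sparsity) and L2 (grouping of correlated predictors).
model = ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000).fit(X, y)
selected = np.flatnonzero(model.coef_)            # indices with nonzero coefficients
```

A surrogate biomarker is then the fitted linear score `X @ model.coef_ + model.intercept_` evaluated on new methylation profiles; the paper's contribution is to fit such scores for several correlated outcomes jointly while modeling the multicenter dependence.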
One-way layouts of count data with over/under dispersion arise in many practical situations. For example, in the mice toxicology data, Barnwal and Paul (1988, Biometrika, 75(2), 215-222) sought to assess whether the means of several groups of count data are equal in the presence of such over/under dispersion. Specifically, they developed and studied five statistics, two of which are score tests, while the other three are based on data transformed to normality. After an extensive simulation study they recommended the score tests. Saha (2008, J. Stat. Plan. Inference, 138(7), 2067-2081) developed two similar test statistics for the homogeneity of the means in over/under dispersed count data situations in which no likelihood exists. Again through extensive simulations, Saha recommended a score-type statistic using a double extended quasi-likelihood (Lee and Nelder 2001, Biometrika, 88(4), 987-1006). However, as in continuous and some other discrete data situations, some observations might be missing in the one-way layout of count data. The purpose of this paper is to (i) develop estimation procedures for the parameters involved in the one-way layout of count data under different missing data scenarios, (ii) study the comparative behaviour of the score tests developed by Barnwal and Paul (1988, Biometrika, 75(2), 215-222) and the score-type statistic developed by Saha (2008, J. Stat. Plan. Inference, 138(7), 2067-2081) for complete data, and (iii) study the comparative effect of missing data on the score and score-type statistics under different missing data scenarios. Extensive Monte Carlo simulations and real-life data analysis show that for complete data as well as for data under different missing data scenarios, the score-type statistic (Saha 2008, J. Stat. Plan. Inference, 138(7), 2067-2081) has some edge in terms of power over the score test statistic (Barnwal and Paul 1988, Biometrika, 75(2), 215-222), showing that the estimation under missing data met