My dissertation consists of three chapters that evaluate the social welfare effects of antitrust policy and industrial transition, all using discrete choice model estimation as the front end for counterfactual analysis. In the first chapter, I investigate the economic impact of the merger that created the world's largest hotel chain, Marriott's acquisition of Starwood, thereby shedding light on the antitrust authorities' performance in protecting competitive markets for the benefit of consumers. Unlike traditional merger analysis, which focuses on the tradeoff between upward pricing pressure and cost synergies among the merging parties while holding the market structure fixed, I endogenize firms' entry decisions in an oligopoly price competition model. To tackle the associated multiple-equilibria issue, I use moment inequality estimation and propose a novel lower probability bound that reduces the computational burden from exponential to linear in the number of players. The chapter also adds to the scant empirical evidence on post-merger cost synergies by showing that each additional affiliated hotel in the local market reduces a hotel's marginal cost by up to 2.3%. A comparison between the simulated with-merger and without-merger equilibria then indicates that the merger enhances social welfare. In particular, among markets that were previously unprofitable for any firm to enter, the post-merger cost savings would lead Marriott or Starwood to enter 6%-24% of them, which provides a new perspective for merger reviews. The second chapter, joint with Mingli Chen, Marc Rysman, and Krzysztof Wozniak, studies the determinants of the US payment system's shift from paper payment instruments, namely cash and check, to digital instruments, such as debit cards and credit cards. With a five-year panel of transaction-level data, for the first time in the literature, we can distinguish the short-term effects of transaction size from the long-term changes in household...
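All three chapters lean on discrete choice demand estimation, so a minimal sketch of its central object, multinomial logit choice probabilities with an outside option, may help fix ideas. This is a generic illustration rather than the dissertation's actual specification; the three mean utilities below are hypothetical.

```python
import numpy as np

def logit_shares(delta):
    """Multinomial logit market shares with an outside good.

    delta: mean utilities of the J inside goods; the outside
    good's utility is normalized to 0.
    """
    expd = np.exp(delta - delta.max())          # guard against overflow
    denom = np.exp(-delta.max()) + expd.sum()   # outside good + inside goods
    return expd / denom

# Hypothetical example: three hotels with mean utilities reflecting
# price and quality; shares sum to less than 1 (rest is the outside good).
delta = np.array([1.0, 0.5, -0.2])
print(logit_shares(delta))
```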
After the 2016 double dissolution election, the 45th Australian Parliament was formed. At the time of its swearing in, the Senate of the 45th Australian Parliament consisted of nine political parties, the largest number in the history of the Australian Parliament. Given the breadth of the political spectrum that the Senate represented, the situation presented an interesting opportunity for the study of political interactions in the Australian context. Using publicly available Senate voting data from 2016, we quantitatively analyzed two aspects of the Senate. First, we analyzed the degree to which each of the non-government parties of the Senate is pro- or anti-government. Second, we analyzed the degree to which the votes of the non-government Senate parties are in concordance or discordance with one another. We utilized the fully visible Boltzmann machine (FVBM) model to conduct these analyses. The FVBM is an artificial neural network that can be viewed as a multivariate generalization of the Bernoulli distribution. Via a maximum pseudolikelihood estimation approach, we conducted parameter estimation and constructed hypothesis tests that revealed the interaction structures within the Australian Senate. The conclusions that we drew are well supported by external sources of information.
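As a rough sketch of the estimation machinery (not the authors' implementation), the FVBM assigns a binary vector x probability proportional to exp(x'Mx/2 + b'x), with M symmetric and zero on the diagonal, so that each coordinate given the others is logistic; maximum pseudolikelihood maximizes the product of these conditionals. The toy 0/1 voting data below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_log_pseudolik(theta, X):
    """Negative log pseudolikelihood of a fully visible Boltzmann machine.

    X is an (n, d) binary matrix (e.g., votes coded 1 = yes, 0 = no).
    theta packs the bias vector b (length d) and the strict upper
    triangle of the symmetric, zero-diagonal interaction matrix M.
    """
    n, d = X.shape
    b, M = theta[:d], np.zeros((d, d))
    iu = np.triu_indices(d, k=1)
    M[iu] = theta[d:]
    M = M + M.T
    field = X @ M + b                  # b_i + sum_j M_ij x_j for each row
    p = expit(field)                   # P(x_i = 1 | all other coordinates)
    eps = 1e-12
    return -np.sum(X * np.log(p + eps) + (1 - X) * np.log(1 - p + eps))

# Hypothetical toy data: 200 divisions, 4 parties, votes coded 0/1.
rng = np.random.default_rng(0)
X = (rng.random((200, 4)) < 0.5).astype(float)
d = X.shape[1]
theta0 = np.zeros(d + d * (d - 1) // 2)
fit = minimize(neg_log_pseudolik, theta0, args=(X,), method="BFGS")
print(fit.x[:d])   # estimated biases b (party-level voting tendencies)
```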
The generalized case-cohort design is widely used in large cohort studies to reduce cost and improve efficiency. Taking prior information about the parameters into account in the modeling process can further improve inference efficiency. In this paper, we consider fitting the proportional hazards model with parameter constraints for generalized case-cohort studies. We establish a working likelihood function for the estimation of the model parameters. The asymptotic properties of the proposed estimator are derived via the Karush-Kuhn-Tucker conditions, and its finite-sample properties are assessed by simulation studies. A modified minorization-maximization algorithm is developed for the numerical calculation of the constrained estimator. An application to a Wilms tumor study demonstrates the utility of the proposed method in practice.
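The paper's working likelihood and modified MM algorithm are specific to the generalized case-cohort design; as a simpler stand-in, the sketch below fits a constrained proportional hazards model on a full cohort by minimizing the Cox negative log partial likelihood subject to a linear inequality constraint with an off-the-shelf solver. The simulated data and the particular constraint (beta_1 >= 0) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_partial_lik(beta, T, delta, Z):
    """Cox negative log partial likelihood (continuous times, no ties)."""
    eta = Z @ beta
    order = np.argsort(T)                  # sort so risk sets are suffixes
    eta, d = eta[order], delta[order]
    m = eta.max()                          # stabilized reverse log-cum-sum-exp
    log_risk = m + np.log(np.cumsum(np.exp(eta[::-1] - m)))[::-1]
    return -np.sum(d * (eta - log_risk))

# Hypothetical data: 300 subjects, 2 covariates, constraint beta_1 >= 0.
rng = np.random.default_rng(1)
Z = rng.normal(size=(300, 2))
T = rng.exponential(scale=np.exp(-Z @ np.array([0.5, -0.3])))
C = rng.exponential(scale=2.0, size=300)          # censoring times
delta, T = (T <= C).astype(float), np.minimum(T, C)

cons = [{"type": "ineq", "fun": lambda b: b[0]}]  # enforce beta_1 >= 0
fit = minimize(neg_log_partial_lik, np.zeros(2), args=(T, delta, Z),
               method="SLSQP", constraints=cons)
print(fit.x)   # constrained estimate of (beta_1, beta_2)
```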
Incomplete categorical data often occur in fields such as biomedicine, epidemiology, psychology, and sports. In this paper, we first introduce a novel minorization-maximization (MM) algorithm to calculate the maximum likelihood estimates (MLEs) of parameters and the posterior modes for the analysis of general incomplete categorical data. Although the data augmentation (DA) algorithm and Gibbs sampling, the stochastic counterparts of the expectation-maximization (EM) and ECM algorithms, are well developed, little work has so far been done on creating stochastic versions of existing MM algorithms. This is the first paper to propose a mode-sharing method in Bayesian computation for general incomplete categorical data, by developing a new acceptance-rejection (AR) algorithm aided by the proposed MM algorithm. The key idea is to construct a class of envelope densities indexed by a working parameter and to identify a specific envelope density that can overcome the four drawbacks associated with the traditional AR algorithm. The proposed mode-sharing AR algorithm has three significant characteristics: (I) it automatically establishes a family of envelope densities $\{g_\lambda(\cdot)\colon \lambda \in S_\lambda\}$ indexed by a working parameter $\lambda$, where each member of the family shares its mode with the posterior density; (II) searching over the finite interval $S_\lambda$ with a one-dimensional grid method, it identifies an optimal working parameter $\lambda_{\mathrm{opt}}$ by maximizing the theoretical acceptance probability, yielding an easy-to-sample envelope density $g_{\lambda_{\mathrm{opt}}}(\cdot)$ that is more dispersed than the posterior density; (III) it obtains the optimal envelope constant $c_{\mathrm{opt}}$ either by using the mode-sharing theorem (indicating that high-dimensional optimization can be completely avoided) or by using the proposed MM algorithm again. Finally, a toy model and three real data sets are used to illustrate the proposed methods.
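The mode-sharing envelope construction is the paper's contribution; the sketch below only illustrates the classic acceptance-rejection step that it refines: draw y from an envelope density g and accept with probability f(y)/(c g(y)). The Beta(2,2) target and uniform envelope are a toy example.

```python
import numpy as np

rng = np.random.default_rng(2)

def accept_reject(f, sample_g, g, c, size):
    """Classic acceptance-rejection: draw y ~ g, accept w.p. f(y)/(c g(y))."""
    out = []
    while len(out) < size:
        y = sample_g()
        if rng.random() <= f(y) / (c * g(y)):
            out.append(y)
    return np.array(out)

# Toy target: Beta(2, 2) density; envelope g = Uniform(0, 1).
f = lambda x: 6.0 * x * (1.0 - x)     # Beta(2,2) density
g = lambda x: 1.0                      # Uniform(0,1) density
c = 1.5                                # sup f/g attained at x = 0.5
draws = accept_reject(f, lambda: rng.random(), g, c, size=10_000)
print(draws.mean())                    # should be near 0.5
```

In the paper's scheme, the uniform envelope would instead be a member of the mode-sharing family $g_\lambda$, with $\lambda_{\mathrm{opt}}$ and $c_{\mathrm{opt}}$ chosen as described in the abstract.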
When the observed data set contains outliers, it is well known that the classical least squares method is not robust. To overcome this difficulty, Wang et al. (J Am Stat Assoc 108(502): 632-643, 2013) proposed a robust variable selection method based on the exponential squared loss (ESL) function with a tuning parameter. Although the ESL function has been applied to many important statistical models, to date no work has studied the partially nonlinear model with it in the presence of outliers. To fill this gap, in this paper we propose a robust and efficient estimation method for the partially nonlinear model based on the ESL function. Under certain conditions, we show that the proposed estimators achieve the best convergence rates, and we establish their asymptotic normality. In addition, we develop a new minorization-maximization algorithm to calculate the estimates for both the nonparametric and parametric parts and present a procedure for deriving initial values. Finally, we provide a data-driven approach for selecting the tuning parameters. Numerical simulations and a real data analysis illustrate that, in the presence of outliers, the proposed ESL method is more robust and efficient for partially nonlinear models than the existing linear approximation method and the composite quantile regression method.
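To make the ESL idea concrete, the sketch below applies the loss to a plain linear model with injected outliers and compares it with least squares; the partially nonlinear structure, the MM algorithm, the initial-value procedure, and the data-driven tuning of the paper are all omitted, and gamma = 1 is an arbitrary choice here.

```python
import numpy as np
from scipy.optimize import minimize

def esl_loss(beta, X, y, gamma):
    """Exponential squared loss: sum of 1 - exp(-r^2 / gamma).

    Each residual contributes at most 1, which bounds the influence
    of gross outliers; gamma trades robustness against efficiency.
    """
    r = y - X @ beta
    return np.sum(1.0 - np.exp(-r**2 / gamma))

rng = np.random.default_rng(3)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = X @ np.array([1.0, 2.0]) + 0.3 * rng.normal(size=100)
y[:5] += 15.0                                    # inject gross outliers

ols = np.linalg.lstsq(X, y, rcond=None)[0]
esl = minimize(esl_loss, ols, args=(X, y, 1.0), method="Nelder-Mead").x
print("OLS:", ols)   # pulled toward the outliers
print("ESL:", esl)   # close to the true (1, 2)
```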
To reduce cost and improve efficiency in cohort studies, the case-cohort design is a widely used biased-sampling scheme for time-to-event data. In the modeling process, case-cohort studies can benefit further from incorporating prior information about the parameters, such as the histological type and disease stage of a cancer in medicine, or the liquidity and market demand of an enterprise in finance. We study regression analysis of the proportional hazards model with parameter constraints under the case-cohort design. Asymptotic properties are derived by applying the Lagrangian method based on the Karush-Kuhn-Tucker conditions, and the consistency and asymptotic normality of the constrained estimator are established. A modified minorization-maximization algorithm is developed for the calculation of the constrained estimator. Simulation studies are conducted to assess the finite-sample performance of the proposed method, and an application to a Wilms tumor study demonstrates its utility in practice.
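Several of the papers above rely on modified minorization-maximization algorithms. As a self-contained illustration of the MM principle itself (not this paper's constrained estimator), the sketch below minimizes sum |y_i - m| by repeatedly minimizing a quadratic surrogate that majorizes the objective at the current iterate, so the true objective decreases monotonically, which is the numerical property these algorithms inherit.

```python
import numpy as np

def mm_median(y, iters=50):
    """Majorize-minimize for the sample median, minimizing sum |y_i - m|.

    The surrogate |r| <= r^2 / (2|r_t|) + |r_t| / 2 touches the objective
    at the current iterate, so each surrogate minimization (a weighted
    mean) drives the true objective monotonically downward.
    """
    m = y.mean()
    for _ in range(iters):
        w = 1.0 / np.maximum(np.abs(y - m), 1e-10)  # guard zero residuals
        m = np.sum(w * y) / np.sum(w)
    return m

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
print(mm_median(y), np.median(y))   # both near 3.0
```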
Mixture of Linear Experts (MoLE) models provide a popular framework for modeling nonlinear regression data. The majority of applications of MoLE models utilize a Gaussian distribution for the regression error, an assumption known to be sensitive to outliers. We investigate the use of a Laplace-distributed error instead, and name the resulting model the Laplace MoLE (LMoLE). Links are drawn between the Laplace error model and the least absolute deviations regression criterion, which is known to be robust among a wide class of criteria. Through application of the minorization-maximization algorithm framework, we derive an algorithm that monotonically increases the likelihood in the estimation of the LMoLE model parameters, and we prove that the maximum likelihood estimator (MLE) for the parameter vector of the LMoLE is consistent. Through simulation studies, we demonstrate the robustness of the LMoLE model over the Gaussian MoLE model and provide support for the consistency of the MLE. An application of the LMoLE model to the analysis of a climate science data set is described.
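A minimal sketch of the LMoLE likelihood may clarify the model: softmax gating weights select among linear experts, each contributing a Laplace density around its own regression line. The parameter values and simulated data below are hypothetical, and the paper's MM estimation algorithm is not reproduced here.

```python
import numpy as np
from scipy.special import logsumexp

def lmole_loglik(y, X, alphas, betas, scales):
    """Log-likelihood of a mixture of linear experts with Laplace errors.

    alphas: (K, p) gating coefficients (softmax over experts),
    betas:  (K, p) expert regression coefficients,
    scales: (K,)   Laplace scale parameters b_k.
    """
    gate_logits = X @ alphas.T                       # (n, K)
    log_gates = gate_logits - logsumexp(gate_logits, axis=1, keepdims=True)
    mu = X @ betas.T                                 # (n, K) expert means
    # log Laplace density: -log(2 b_k) - |y - mu| / b_k
    log_comp = -np.log(2.0 * scales) - np.abs(y[:, None] - mu) / scales
    return logsumexp(log_gates + log_comp, axis=1).sum()

# Hypothetical two-expert evaluation on simulated data.
rng = np.random.default_rng(4)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([1.0, 2.0]) + rng.laplace(scale=0.5, size=50)
print(lmole_loglik(y, X,
                   alphas=np.zeros((2, 2)),
                   betas=np.array([[1.0, 2.0], [0.0, 0.0]]),
                   scales=np.array([0.5, 1.0])))
```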
Understanding how aquatic species grow is fundamental in fisheries because stock assessment often relies on growth-dependent statistical models. Length-frequency-based methods become important when more suitable data for growth model estimation are either unavailable or prohibitively expensive. In this article, we develop a new framework for growth estimation from length-frequency data, using a generalized von Bertalanffy growth model (VBGM) that allows time-dependent covariates to be incorporated. A finite mixture of normal distributions is used to model the length-frequency cohorts of each month, with the means constrained to follow a VBGM. The variances of the mixture components are constrained to be a function of mean length, reducing the number of parameters and allowing the variance to be estimated at any length. To optimize the likelihood, we use a minorization-maximization (MM) algorithm with a Nelder-Mead sub-step. This work was motivated by the decline in catches of the blue swimmer crab (BSC) (Portunus armatus) off the east coast of Queensland, Australia. We test the method with a simulation study and then apply it to the BSC fishery data.
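A stripped-down version of the likelihood is sketched below: cohort means follow a von Bertalanffy curve L(t) = Linf(1 - exp(-K(t - t0))), and the observed lengths are a normal mixture over cohorts. As simplifying assumptions not in the paper, the mixing weights are treated as known and the standard deviation is taken proportional to mean length (cv * mu); the optimizer is plain Nelder-Mead rather than the paper's MM algorithm with a Nelder-Mead sub-step.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm
from scipy.special import logsumexp

def neg_loglik(theta, lengths, ages, weights):
    """Normal-mixture likelihood for length-frequency data with cohort
    means tied to a von Bertalanffy growth curve.

    theta = (Linf, K, t0, cv); cohort a has mean
    mu_a = Linf * (1 - exp(-K (age_a - t0))) and sd cv * mu_a.
    """
    Linf, K, t0, cv = theta
    mu = Linf * (1.0 - np.exp(-K * (ages - t0)))     # cohort means
    sd = np.maximum(cv * mu, 1e-6)
    logpdf = norm.logpdf(lengths[:, None], mu, sd)   # (n, A)
    return -logsumexp(logpdf + np.log(weights), axis=1).sum()

# Hypothetical setup: three age cohorts with known mixing weights.
rng = np.random.default_rng(5)
ages = np.array([0.5, 1.5, 2.5])
weights = np.array([0.5, 0.3, 0.2])
true_mu = 180 * (1 - np.exp(-0.8 * (ages + 0.1)))    # Linf=180, K=0.8, t0=-0.1
comp = rng.choice(3, size=400, p=weights)
lengths = rng.normal(true_mu[comp], 0.12 * true_mu[comp])

fit = minimize(neg_loglik, x0=np.array([150.0, 1.0, 0.0, 0.1]),
               args=(lengths, ages, weights), method="Nelder-Mead")
print(fit.x)   # estimates of (Linf, K, t0, cv)
```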
Autoregressive (AR) models are an important tool in the study of time series data. However, the standard AR model only allows for unimodal marginal and conditional densities and cannot capture conditional heteroscedasticity. The Gaussian mixture AR (GMAR) model was previously proposed to remedy these shortcomings by using a Gaussian mixture conditional model. We introduce the Laplace mixture AR (LMAR) model, which utilizes a Laplace mixture conditional model, as an alternative to the GMAR model. We characterize the LMAR model and provide conditions for stationarity. An MM (minorization-maximization) algorithm is then proposed for maximum pseudolikelihood (MPL) estimation of an LMAR model. Conditions for asymptotic inference and a rule for model selection for the MPL estimator are considered. An example analysis of data arising from calcium imaging of a zebrafish brain is performed.
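To see what the model produces, the sketch below simulates a first-order LMAR series: at each step a mixture component is drawn and the next value follows that component's autoregression with Laplace noise, giving multimodal conditional densities and heavier tails than the GMAR model. The two-component parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)

def simulate_lmar(n, pis, phis, scales, burn=200):
    """Simulate a first-order Laplace mixture autoregressive series.

    At each step a component k is drawn with probability pi_k, and
    y_t = phi_k0 + phi_k1 * y_{t-1} + Laplace(0, b_k) noise.
    """
    y = np.zeros(n + burn)
    for t in range(1, n + burn):
        k = rng.choice(len(pis), p=pis)
        mean = phis[k, 0] + phis[k, 1] * y[t - 1]
        y[t] = mean + rng.laplace(scale=scales[k])
    return y[burn:]                      # discard burn-in

# Hypothetical two-component LMAR(1); |phi_k1| < 1 for stationarity.
y = simulate_lmar(1000,
                  pis=np.array([0.6, 0.4]),
                  phis=np.array([[1.0, 0.5], [-1.0, 0.3]]),
                  scales=np.array([0.4, 0.8]))
print(y.mean(), y.std())
```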
The Gaussian mixture model (GMM) is a popular tool for multivariate analysis, in particular cluster analysis. The expectation-maximization (EM) algorithm is generally used to perform maximum likelihood (ML) estimation for GMMs, because its M-step exists in closed form and it has desirable numerical properties, such as monotonicity. However, the EM algorithm has been criticized as slow to converge, and thus computationally expensive, in some situations. In this article, we introduce the linear regression characterization (LRC) of the GMM. We show that the parameters of an LRC of the GMM can be mapped back to the natural parameters, and that a minorization-maximization (MM) algorithm can be constructed which retains the desirable numerical properties of the EM algorithm without the use of matrix operations. We prove that the ML estimators of the LRC parameters are consistent and asymptotically normal, like their natural counterparts. Furthermore, we show that the LRC allows for simple handling of singularities in the ML estimation of GMMs. Using numerical simulations in the R programming environment, we then demonstrate that the MM algorithm can be faster than the EM algorithm in various large-data situations, with sample sizes ranging from the tens to the hundreds of thousands, for models with up to 16 mixture components on multivariate data with up to 16 variables.
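For reference, a compact version of the standard EM baseline that the MM/LRC algorithm is benchmarked against is sketched below; the small ridge term added to the covariances for numerical stability is an implementation choice of this sketch, not of the paper.

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, iters=100, seed=0):
    """Standard EM for a Gaussian mixture: monotonically increases the
    log-likelihood, with closed-form M-step updates."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, K, replace=False)]          # random initial means
    Sigma = np.stack([np.cov(X.T) + 1e-6 * np.eye(d)] * K)
    for _ in range(iters):
        # E-step: responsibilities r[n, k]
        dens = np.column_stack(
            [pi[k] * multivariate_normal.pdf(X, mu[k], Sigma[k])
             for k in range(K)])
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: closed-form weighted updates
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        for k in range(K):
            Xc = X - mu[k]
            Sigma[k] = (r[:, k, None] * Xc).T @ Xc / nk[k] + 1e-6 * np.eye(d)
    return pi, mu, Sigma

# Two well-separated clusters in 2-D.
rng = np.random.default_rng(7)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
pi, mu, _ = em_gmm(X, K=2)
print(pi, mu, sep="\n")
```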