This paper proposes a method to assess the local influence of minor perturbations in the Sharpe model when the normal distribution is replaced by normal/independent (NI) distributions. The NI family is an attractive class of symmetric heavy-tailed densities that includes the normal, Student-t, slash, and contaminated normal distributions as special cases. Since the returns of the market are not observable, the statistical analysis is carried out in the context of an errors-in-variables model. An influence analysis for detecting influential observations (atypical returns) is developed to investigate the sensitivity of the maximum likelihood estimators. Diagnostic measures are obtained based on the conditional expectation of the complete-data log-likelihood function. The results are illustrated using a set of shares of companies traded in the Chilean stock market.
In this paper, we propose a novel and mathematically tractable frailty model for clustered survival data by assuming a generalized exponential (GE) distribution for the latent frailty effect. Both parametric and semiparametric versions of the GE frailty model are studied, with the main focus on the semiparametric case, for which an EM algorithm is proposed. Our EM-based estimation for the GE frailty model is simpler, faster, and immune to the flat-likelihood issue that affects, for example, the semiparametric gamma model, as illustrated in this paper through simulated and real data. We also show that the GE model is at least competitive with the gamma frailty model under misspecification. A broad analysis, with simulation results explored via Monte Carlo replications, is developed to evaluate and compare the models. A real application to clustered kidney catheter data demonstrates the practical potential of the GE frailty model.
When sensitive issues are surveyed, collecting truthful data and obtaining reliable estimates of population parameters is a persistent problem in many fields of applied research, mostly in sociological, economic, demographic, ecological, and medical studies. In this context, and starting from the so-called negative survey, we consider the problem of estimating the proportion of population units belonging to the categories of a sensitive variable when the collected data are affected by measurement errors produced by untruthful responses. An extension of the negative survey approach is proposed that allows respondents to release a true response. The proposal rests on modelling the released data as a mixture of truthful and untruthful responses, which allows researchers to estimate both the proportions and the probability of receiving the true response by implementing the EM algorithm. We describe the estimation procedure and carry out a simulation study to assess the performance of the EM estimates against certain benchmark values and against the estimates obtained under the traditional data-collection approach based on direct questioning, which ignores misreporting due to untruthful responding. The simulation findings provide evidence on the accuracy of the estimates and show the improvements that our approach can produce in public surveys, particularly in election opinion polls, when the hidden vote problem is present.
Using play-by-play data from the very beginning of the professional football league in Turkey, a semi-Markov model is presented for describing the performance of football teams. The official match results of the selected teams over 55 football seasons are used, and winning, drawing, and losing are considered as Markov states. The semi-Markov model is constructed with transition rates inferred from the official match results. The duration between the last match of a season and the first match of the following season is much longer than any other duration during the season. These values are therefore treated as missing and estimated using the expectation-maximization algorithm. The effect of the sojourn time in a state on the performance of a team is discussed, and the mean sojourn times after losing and after winning are estimated. The limiting probabilities of winning, drawing, and losing are calculated. Some insights into the performance of the selected teams are presented.
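As a rough illustration of the limiting probabilities mentioned in this abstract, the sketch below uses the standard semi-Markov result that the long-run fraction of time spent in state i is proportional to the embedded chain's stationary probability times the mean sojourn time in that state. The transition matrix and sojourn times are hypothetical numbers chosen for illustration and are not taken from the paper; the function name is likewise an assumption.

```python
import numpy as np

def limiting_probabilities(P, mean_sojourn):
    """Long-run fraction of time a semi-Markov process spends in each state.

    P            : transition matrix of the embedded Markov chain (rows sum to 1)
    mean_sojourn : mean sojourn time in each state
    """
    P = np.asarray(P, dtype=float)
    mu = np.asarray(mean_sojourn, dtype=float)
    n = P.shape[0]
    # Stationary distribution of the embedded chain:
    # solve pi P = pi together with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([P.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Weight each state by its mean sojourn time and renormalize.
    weighted = pi * mu
    return weighted / weighted.sum()

# Hypothetical example: states are (win, draw, loss); sojourn times in days.
P = [[0.50, 0.25, 0.25],
     [0.40, 0.30, 0.30],
     [0.35, 0.30, 0.35]]
mu = [7.0, 7.5, 8.0]
probs = limiting_probabilities(P, mu)
print(dict(zip(["win", "draw", "loss"], probs.round(3))))
```

When all mean sojourn times are equal, the result reduces to the stationary distribution of the embedded chain, which is a convenient sanity check.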
In this article, we propose a general class of INteger-valued Generalized AutoRegressive Conditional Heteroskedastic (INGARCH) models based on a flexible family of mixed Poisson (MP) distributions. Our proposed class of count time series models contains the negative binomial (NB) INGARCH process as a particular case and opens the possibility of introducing new models such as the Poisson-inverse Gaussian (PIG) and Poisson generalized hyperbolic secant processes. In particular, the PIG INGARCH model is an interesting and robust alternative to the NB model. We explore first-order and second-order stationarity properties of our MPINGARCH models and provide expressions for the autocorrelation function and the marginal mean and variance. Conditions ensuring strict stationarity and ergodicity for our class of INGARCH models are established. We propose an Expectation-Maximization algorithm to estimate the parameters and obtain the associated information matrix. Further, we discuss two additional estimation methods. Monte Carlo simulation studies are used to evaluate the finite-sample performance of the proposed estimators. We illustrate the flexibility and robustness of the MPINGARCH models through two real-data applications on the number of cases of Escherichia coli and Campylobacter infections. This article contains Supporting Information.
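To make the mixed Poisson construction concrete, here is a minimal simulation sketch of a negative binomial INGARCH(1,1) process, obtained by mixing a Poisson count with a unit-mean gamma variable. The recursion form, the parameter names (`omega`, `alpha`, `beta`, `phi`), and the function name are assumptions made for illustration; the abstract does not give the authors' parameterization.

```python
import numpy as np

def simulate_nb_ingarch(n, omega, alpha, beta, phi, seed=0, burn_in=500):
    """Simulate counts y_t with conditional mean
    lam_t = omega + alpha * y_{t-1} + beta * lam_{t-1}  (assumed form),
    where y_t | past ~ Poisson(lam_t * z_t) and z_t ~ Gamma(phi, 1/phi).
    Since E[z_t] = 1, the conditional law is negative binomial with mean lam_t.
    """
    if not (alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need alpha, beta >= 0 and alpha + beta < 1")
    rng = np.random.default_rng(seed)
    total = n + burn_in
    y = np.zeros(total, dtype=np.int64)
    lam = np.zeros(total)
    lam[0] = omega / (1.0 - alpha - beta)   # start at the stationary mean
    y[0] = rng.poisson(lam[0] * rng.gamma(shape=phi, scale=1.0 / phi))
    for t in range(1, total):
        lam[t] = omega + alpha * y[t - 1] + beta * lam[t - 1]
        z = rng.gamma(shape=phi, scale=1.0 / phi)  # latent mixing variable
        y[t] = rng.poisson(lam[t] * z)
    return y[burn_in:], lam[burn_in:]

# Hypothetical parameter values.
y, lam = simulate_nb_ingarch(n=20000, omega=1.0, alpha=0.3, beta=0.4, phi=5.0)
print("sample mean:", y.mean(), "stationary mean:", 1.0 / (1.0 - 0.3 - 0.4))
print("sample variance:", y.var())
```

Swapping the gamma draw for another positive, unit-mean mixing distribution (for example an inverse Gaussian) would give other members of the mixed Poisson family under the same assumed recursion.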
In this work, a flexible class of linear mixed models is introduced by assuming that the random effects and model errors follow a skew-normal-Cauchy distribution. The likelihood function and the information matrix based on the observed data are computed. An EM-type algorithm is also proposed for estimating the parameters, which seems to provide some advantages over direct maximization of the likelihood function. Finally, the performance of the proposed model is evaluated numerically using simulated and real data.
Methods for the separation of a mixture of three-parameter lognormal distributions are investigated theoretically and empirically in the context of modeling message transmission delays in a computer cluster communicat...
ISBN: 9781728106069 (print)
The current stage of development of information and communication networks requires solving the crucial tasks of statistical analysis of network traffic and traffic simulation modeling. The predominance of non-Poisson traffic makes it impossible to analyze multichannel communication systems with the queuing-theory methods used to describe telephone networks. Over the last decade, much attention has been paid to research on traffic that shows signs of self-similarity. The main purpose of this work is a statistical analysis of non-Poisson traffic represented by multimodal non-standard Pascal (negative binomial) and Rice distributions. The degree of self-similarity is studied using R/S analysis and the aggregation method. In addition, we propose using the EM algorithm together with an algorithm for determining the optimal number of clusters to approximate atypical multimodal distributions.
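The R/S analysis named in this abstract can be sketched as follows: the series is split into windows of increasing size, the rescaled range (range of cumulative deviations divided by the standard deviation) is averaged per window size, and the Hurst exponent is taken as the slope of a log-log regression. This is a generic textbook version, not the authors' implementation; the function names, the doubling window schedule, and the test series are assumptions.

```python
import numpy as np

def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over std."""
    dev = np.cumsum(chunk - chunk.mean())
    s = chunk.std()
    return (dev.max() - dev.min()) / s if s > 0 else np.nan

def hurst_rs(x, min_window=8):
    """Estimate the Hurst exponent as the slope of log(R/S) against log(window)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes, rs_means = [], []
    w = min_window
    while w <= n // 2:
        # Non-overlapping windows of length w; the trailing remainder is dropped.
        chunks = x[: (n // w) * w].reshape(-1, w)
        rs = np.array([rescaled_range(c) for c in chunks])
        rs = rs[np.isfinite(rs)]
        if rs.size:
            sizes.append(w)
            rs_means.append(rs.mean())
        w *= 2  # doubling schedule (an arbitrary choice for this sketch)
    slope, _intercept = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return slope

# Hypothetical test series: independent Gaussian noise has no long memory.
rng = np.random.default_rng(1)
noise = rng.normal(size=4096)
print("estimated Hurst exponent:", hurst_rs(noise))
```

For independent noise the estimate should land near 0.5, although the plain R/S estimator is known to be biased upward on short windows; values well above 0.5 on real traffic traces would point to self-similarity.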
This paper introduces and investigates an online recurrent fuzzy clustering procedure that forms hyperellipsoidal clusters with an arbitrary orientation of the axes. The proposed clustering system generalizes a number of known algorithms and is intended for tasks within the general problems of Medical Data Mining, where information is fed sequentially for processing in online mode.
ISBN: 9789897583513 (print)
The article presents the results of developing a machine learning approach to the problem of object identification (recognition) in images (data) recorded by photo-counting sensors. Such images differ significantly from traditional ones, which are taken with conventional sensors through time exposure and spatial averaging of the incident radiation. The result of radiation registration by photo-counting sensors (the image) is instead a continuous stream of data, whose time frame is characterized by a relatively small number of photocounts. This leads to a low signal-to-noise ratio, low contrast, and fuzzy object shapes. For this reason, the well-known methods designed for traditional image recognition are not effective enough in this case, and new recognition approaches oriented to low-count images are required. In this paper we propose such an approach. It is based on the machine learning paradigm and designed for identifying (low-count) objects given by point sets. Consistently using a discrete set of photocount coordinates rather than a reconstructed continuous image, we formalize the problem as that of best fitting this set of counts, considered as the realization of a certain point process, to the statistical description of one of the previously registered point processes, which we call precedents. It is shown that, by applying the Poisson point process model to formalize the registration process in photo-counting sensors, the problem of object identification can be reduced to the problem of maximizing the likelihood of the tested point set with respect to the classes of modelling object distributions, up to shape size and position. It is also demonstrated that these procedures can be brought to an algorithmic realization analogous in structure to the popular EM algorithms. At the end of the paper we, for the sake of illustration, present some results of applying the developed algorithms to the identifica