In this paper, we propose a control chart to monitor the Weibull shape parameter when the observations are censored due to competing risks. We assume that failure occurs due to two independent competing risks, each following a Weibull distribution with its own shape and scale parameters. The control charts are proposed to monitor one or both of the shape parameters of the competing risk distributions and are established based on conditional expected values. The proposed control chart for both shape parameters is used in certain situations and allows both shape parameters to be monitored in a single chart. The control limits depend on the sample size, the number of failures due to each risk, and the desired stable average run length (ARL). We also consider the problem of estimating the target parameters when the Phase I sample is incomplete. We assume that, for some of the products that fail during life testing, the cause of failure is only known to belong to a certain subset of all possible causes. This case is known as masking. In the presence of masking, the expectation-maximization (EM) algorithm is proposed to estimate the parameters. For both cases, with and without masking, the behaviour of the ARLs of the charts is studied through numerical methods. The influence of masking on the performance of the proposed charts is also studied through a simulation study. An example illustrates the applicability of the proposed charts.
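The competing-risks setting described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the charting procedure from the abstract: the function names `simulate_competing_risks` and `survival_of_minimum` and all parameter values are hypothetical. Each unit draws two independent Weibull latent lifetimes, and only the smaller one and its cause are observed.

```python
import math
import random


def simulate_competing_risks(n, shape1, scale1, shape2, scale2, rng):
    """Return n records (observed_time, cause) under two independent Weibull risks.

    Hypothetical helper: the observed time is the minimum of the two latent
    lifetimes, and the cause (1 or 2) says which risk produced the failure.
    """
    records = []
    for _ in range(n):
        # random.weibullvariate takes (scale, shape) in that order.
        t1 = rng.weibullvariate(scale1, shape1)
        t2 = rng.weibullvariate(scale2, shape2)
        if t1 <= t2:
            records.append((t1, 1))
        else:
            records.append((t2, 2))
    return records


def survival_of_minimum(t, shape1, scale1, shape2, scale2):
    """P(min(T1, T2) > t) for independent Weibull lifetimes T1 and T2."""
    return math.exp(-((t / scale1) ** shape1) - ((t / scale2) ** shape2))


if __name__ == "__main__":
    rng = random.Random(1)
    data = simulate_competing_risks(20000, 1.5, 1.0, 2.5, 1.2, rng)
    empirical = sum(1 for t, _ in data if t > 0.5) / len(data)
    print(empirical, survival_of_minimum(0.5, 1.5, 1.0, 2.5, 1.2))
```

Because the risks are independent, the survival function of the observed time is the product of the two Weibull survival functions, which is what the empirical fraction should approximate.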
In this paper, we analyze two probability distributions formed by compounding the inverse Weibull distribution with the zero-truncated Poisson and geometric distributions. The distributions can be used to model the lifetime of a series system in which the component lifetimes follow an inverse Weibull distribution and the subgroup size is random, following either a geometric or a zero-truncated Poisson distribution. Some of the important statistical and reliability properties of each distribution are derived. The distributions are found to exhibit both monotone and non-monotone failure rates. The parameters of the distributions are estimated using the expectation-maximization algorithm and the method of minimum distance estimation. The potential of the distributions is explored through three real-life data sets, and they are compared with similar compounded distributions, viz. the Weibull-geometric, Weibull-Poisson, exponential-geometric and exponential-Poisson distributions.
Utilization data are defined as time series data consisting of the fractions of time spent busy in fixed time intervals, and are used in practice to represent server conditions such as CPU utilization. In general, estimating model parameters from utilization data is challenging, since the exact job arrival times and service times cannot be recovered from the data. In this paper, we consider an approach that estimates the model parameters from utilization data under a few modeling assumptions. In particular, we assume an M_t/M/1/K queueing system whose job arrivals follow a non-homogeneous Poisson process (NHPP), and we propose an approximate parameter estimation method for the NHPP from utilization data, based on maximum likelihood estimation (MLE) via the expectation-maximization (EM) algorithm. In numerical experiments, we generate simulated utilization data from an M_t/M/1/K queueing system and investigate the effectiveness of our method. We also evaluate its performance on real CPU utilization data.
In health care, multilevel models are typically used to evaluate hospitals' performance and to rank hospitals accordingly. While multilevel models capture the hierarchical structure in the data, such as the grouping of patients into hospitals, these models do not account for additional latent structures. In this paper, we develop a novel multilevel logistic cluster-weighted model which can predict a binary outcome, such as mortality within 30 days of discharge, while accounting for both known and latent structures of the data. We develop an Expectation-Maximization algorithm for parameter estimation and a parametric bootstrap approach for assessing the variability of the estimators. Using a rich data set from the Lombardy (Italy) health care system and focussing on the two wards of cardiosurgery and medicine, we show how the proposed model detects, in both cases, two well-defined clusters within the patient-to-hospital hierarchical structure of the data. A comparison with standard multilevel and cluster-weighted approaches reveals a better fit of the proposed model and a greater insight into the structure of the data. We show how this can affect the resulting league tables, and thus how the proposed model can be a useful tool for policy-makers and healthcare managers conducting hospital evaluations.
Emerging spatial crowdsourcing (SC) provides an approach for collecting and analyzing spatiotemporal information from intelligent transportation systems. However, exposing large amounts of private location data to potential adversaries for the purpose of quality control makes workers more vulnerable. To protect workers' location privacy, an obfuscation scheme is proposed that incorporates uncertainty into the SC quality control problem by obfuscating the standard location data in both space and time. Two measures, location entropy and results accuracy, are used to evaluate the performance of location privacy protection. We theoretically and experimentally confirm the security and accuracy of the obfuscation approach. The experimental results show that: a) hiding workers' locations from the requester reduces the quality of SC; and b) obfuscation arithmetic with appropriate obfuscation coefficients protects workers' location privacy with little effect on SC quality. Under the protection of this obfuscation scheme, the new system provides better security and similar quality compared to the existing SC system.
An intelligent online modeling approach using immune mechanisms for the fiber stretching process is proposed in this paper. A linear parameter varying model is used as the process model, and its parameters are estimated within the framework of the expectation-maximization algorithm. The proposed approach is composed of offline modeling and online modeling. In offline modeling, the parameters of the local models and the weighting functions are estimated simultaneously; this corresponds to producing the first antibodies. Since the process is dynamic and continuous, when a new operating point differs from the historical data, the online modeling approach is applied to estimate the parameters of new local models and weighting functions; this corresponds to producing the subsequent antibodies. The proposed intelligent online modeling approach is capable of learning and evolution, and can update the parameters adaptively. The proposed approach is applied to the stretching process and compared with other modeling methods, and the Friedman and Nemenyi post-hoc tests are used to assess the statistical significance of the differences in performance. The feasibility and efficiency of the approach are demonstrated. (C) 2019 Elsevier Inc. All rights reserved.
Authors: He, Zhilin; Ho, Chun-Hsing
Affiliations: Yuncheng Univ, Math & Informat Technol Sch, 1155 Fudan West St, Yuncheng, Shanxi, Peoples R China; No Arizona Univ, Dept Civil Engn Construct Management & Environm E, POB 15600, Flagstaff, AZ 86011, USA
The Finite Gaussian Mixture Model (FGMM) is the most commonly used model for describing mixed density distributions in cluster analysis. An important feature of the FGMM is that it can approximate any continuous distribution arbitrarily well, as long as the model contains enough components. In clustering analysis based on the FGMM, the EM algorithm is usually used to estimate the parameters of the model. Its advantages are stable computation and fast convergence. However, the EM algorithm relies heavily on the estimation of incomplete data and does not use any information to reduce the uncertainty of the missing data. To solve this problem, an EM algorithm based on entropy-penalized maximum likelihood estimation is proposed. The novel algorithm constructs a conditional entropy model between the incomplete data and the missing data, and reduces the uncertainty of the missing data through the incomplete data. Theoretical analysis and experimental results show that the novel algorithm adapts effectively to the FGMM, improves the clustering results, and improves the efficiency of the algorithm.
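For orientation, the standard EM iteration for a one-dimensional Gaussian mixture can be sketched as follows. This is the plain, unpenalized algorithm that the abstract takes as its starting point, not the entropy-penalized variant it proposes; the function names `normal_pdf` and `em_gmm_1d` and the starting values are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=100):
    """Plain EM for a 1-D Gaussian mixture (hypothetical helper, no entropy penalty)."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) + 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # keep the variance strictly positive
    return means, variances, weights


if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(300)]
    sample += [random.gauss(5.0, 1.0) for _ in range(300)]
    print(em_gmm_1d(sample, [1.0, 4.0], [1.0, 1.0], [0.5, 0.5]))
```

The responsibilities computed in the E-step are the "missing data" estimates; an entropy penalty of the kind described above would modify the objective so that these assignments become less uncertain.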
Many of the methods that deal with clustering in matrices of data are based on mathematical techniques such as distance-based algorithms or matrix decomposition and eigenvalues. In general, it is not possible to use statistical inference or to assess the appropriateness of a model via information criteria with these techniques, because there is no underlying probability model. This article summarizes some recent model-based methodologies for matrices of binary, count, and ordinal data, which are modelled under a unified statistical framework using finite mixtures to group the rows and/or columns. The model parameter can be constructed from a linear predictor of parameters and covariates through link functions. This likelihood-based one-mode and two-mode fuzzy clustering provides maximum likelihood estimation of parameters and the option of using likelihood information criteria for model comparison. Additionally, a Bayesian approach is presented in which the parameters and the number of clusters are estimated simultaneously from their joint posterior distribution. Visualization tools focused on ordinal data, the fuzziness of the clustering structures, and analogies of various standard plots used in multivariate analysis are presented. Finally, a set of future extensions is enumerated.
Many real-world networks, known as attributed networks, contain two types of information: topology information and node attributes. How to use these two types of information to explore structural regularities is a challenging task. In this paper, by characterizing the potential relationship between communities of links and node attributes, a principled statistical model named PSB_PG that generates link topology and node attributes is proposed. The model for generating links is based on stochastic blockmodels following a Poisson distribution. It is therefore capable of detecting a wide range of network structures, including community structures, bipartite structures, and other mixture structures. The model for generating node attributes assumes that node attributes are high-dimensional, sparse, and also follow a Poisson distribution. This makes the model uniform, and the model parameters can be estimated directly by the expectation-maximization (EM) algorithm. Experimental results on artificial and real networks containing various structures show that the proposed model PSB_PG is not only competitive with state-of-the-art models, but also provides a good semantic interpretation of each community via the learned relationship between the community and its related attributes. (C) 2019 Elsevier Inc. All rights reserved.
We introduce a spectrum-adapted expectation-maximization (EM) algorithm for high-throughput analysis of a large number of spectral datasets, which takes into account the weight of the intensity corresponding to the measurement energy steps. The proposed method was applied to synthetic data in order to evaluate its analysis accuracy and calculation time. Moreover, the proposed method was applied to spectral data collected from graphene and MoS2 field-effect transistor devices. The calculation completed in less than 13.4 s per set and successfully detected systematic peak shifts of the C 1s peak in graphene and the S 2p peak in MoS2. This result suggests that the proposed method can support the investigation of peak shifts with two advantages: (1) a large amount of data can be processed at high speed; and (2) stable and automatic calculation can be performed easily.
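One way to read "considering the weight of the intensity" is as an EM iteration in which each energy step contributes in proportion to its measured intensity. The sketch below follows that reading for a Gaussian peak model; it is an assumption-laden illustration, not the authors' published algorithm, and the name `weighted_em_spectrum`, the Gaussian peak shape, and the synthetic spectrum are all hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def weighted_em_spectrum(energies, intensities, means, variances, weights, n_iter=200):
    """EM for Gaussian peaks where each energy step is weighted by its intensity.

    Hypothetical sketch: the spectrum is treated as a histogram, so an energy
    step with intensity w counts as w observations at that energy.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    total_intensity = sum(intensities)
    for _ in range(n_iter):
        # E-step: intensity-weighted responsibilities per energy step.
        resp = []
        for x, w in zip(energies, intensities):
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) + 1e-300  # guard against division by zero
            resp.append([w * d / total for d in dens])
        # M-step: weighted updates of peak areas, positions and widths.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / total_intensity
            means[j] = sum(r[j] * x for r, x in zip(resp, energies)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, energies)) / nj
            variances[j] = max(var_j, 1e-6)  # keep the variance strictly positive
    return means, variances, weights


if __name__ == "__main__":
    grid = [i * 0.05 for i in range(201)]  # energy axis from 0 to 10
    spectrum = [100 * normal_pdf(x, 3.0, 0.25) + 50 * normal_pdf(x, 7.0, 0.25) for x in grid]
    print(weighted_em_spectrum(grid, spectrum, [2.5, 7.5], [1.0, 1.0], [0.5, 0.5]))
```

Working on the binned spectrum keeps the cost per iteration proportional to the number of energy steps rather than the number of detected counts, which is consistent with the high-throughput motivation stated above.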