Search results - Inner Mongolia University Library

Quantile modeling through multivariate log-normal/independent linear regression models with application to newborn data

BIOMETRICAL JOURNAL, 2021, Vol. 63, No. 6, pp. 1290-1308

Authors: Moran-Vasquez, Raul Alejandro; Mazo-Lopera, Mauricio A.; Ferrari, Silvia L. P.

Affiliations: Univ Antioquia, Inst Matemat, Calle 67 53-108, Medellin 050010, Colombia; Univ Nacl Colombia, Escuela Estadist, Medellin, Colombia; Univ Sao Paulo, Dept Estat, Sao Paulo, Brazil

In this article, we propose and study the class of multivariate log-normal/independent distributions and linear regression models based on this class. The class of multivariate log-normal/independent distributions is very attractive for robust statistical modeling because it includes several heavy-tailed distributions suitable for modeling correlated multivariate positive data that are skewed and possibly heavy-tailed. Besides, expectation-maximization (EM)-type algorithms can be easily implemented for maximum likelihood estimation. We model the relationship between quantiles of the response variables and a set of explanatory variables, compute the maximum likelihood estimates of parameters through EM-type algorithms, and evaluate the model fitting based on Mahalanobis-type distances. The satisfactory performance of the quantile estimation is verified by simulation studies. An application to newborn data is presented and discussed.

Keywords: EM algorithm; multivariate linear regression; multivariate normal independent distribution; newborn; quantile modeling

Approximate Inference and Learning of State Space Models With Laplace Noise

IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, Vol. 69, pp. 3176-3189

Authors: Neri, Julian; Depalle, Philippe; Badeau, Roland

Affiliations: McGill Univ, Montreal, PQ H3A 1E3, Canada; Telecom Paris, Inst Polytech Paris, LTCI, F-91764 Palaiseau, France

State space models have been extensively applied to model and control dynamical systems in disciplines including neuroscience, target tracking, and audio processing. A common modeling assumption is that both the state and data noise are Gaussian because it simplifies the estimation of the system's state and model parameters. However, in many real-world scenarios where the noise is heavy-tailed or includes outliers, this assumption does not hold, and the performance of the model degrades. In this paper, we present a new approximate inference algorithm for state space models with Laplace-distributed multivariate data that is robust to a wide range of non-Gaussian noise. Exact inference is combined with an expectation propagation algorithm, leading to filtering and smoothing that outperforms existing approximate inference methods for Laplace-distributed data, while retaining a fast speed similar to the Kalman filter. Further, we present a maximum posterior expectation-maximization (EM) algorithm that learns the parameters of the model in an unsupervised way, automatically avoids over-fitting the data, and provides better model estimation than existing methods for the Gaussian model. The quality of the inference and learning algorithms is exemplified through a diverse set of experiments and an application to non-linear tracking of audio frequency.

Keywords: Data models; Inference algorithms; Biological system modeling; Mathematical model; Heuristic algorithms; Kalman filters; Approximation algorithms; Bayesian inference; time series; heavy-tailed noise; EM algorithm; machine learning; expectation propagation

Multi-node Expectation-Maximization algorithm for finite mixture models

STATISTICAL ANALYSIS AND DATA MINING, 2021, Vol. 14, No. 4, pp. 297-304

Authors: Lee, Sharon X.; McLachlan, Geoffrey J.; Leemaqz, Kaleb L.

Affiliations: Univ Adelaide, Sch Math Sci, Adelaide, SA, Australia; Univ Queensland, Dept Math, Brisbane, Qld, Australia; Univ New South Wales, UNSW Business Sch, Sydney, NSW, Australia

Finite mixture models are powerful tools for modeling and analyzing heterogeneous data. Parameter estimation is typically carried out using maximum likelihood estimation via the Expectation-Maximization (EM) algorithm. Recently, the adoption of flexible distributions as component densities has become increasingly popular. Often, the EM algorithm for these models involves complicated expressions that are time-consuming to evaluate numerically. In this paper, we describe a parallel implementation of the EM algorithm suitable for both single-threaded and multi-threaded processors and for both single machine and multiple-node systems. Numerical experiments are performed to demonstrate the potential performance gain in different settings. Comparison is also made across two commonly used platforms, R and MATLAB. For illustration, a fairly general mixture model is used in the comparison.

Keywords: EM algorithm; mixture model; parallel computing
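
Illustrative note (not part of the catalog record): the abstract above refers to maximum likelihood estimation of finite mixture models via the EM algorithm. The sketch below shows the basic E-step/M-step loop for a two-component univariate Gaussian mixture. It is a minimal serial example and is not the authors' parallel implementation; the function name `em_gaussian_mixture`, the starting values, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def em_gaussian_mixture(x, n_iter=200, tol=1e-8):
    """Fit a two-component univariate Gaussian mixture by EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Crude starting values (an assumption): equal weights, quartile-based means.
    pi = np.array([0.5, 0.5])
    mu = np.quantile(x, [0.25, 0.75])
    var = np.array([x.var(), x.var()])
    prev_ll = -np.inf
    ll = prev_ll
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        weighted = pi * dens
        total = weighted.sum(axis=1, keepdims=True)
        resp = weighted / total
        # Log-likelihood at the parameters used in this E-step.
        ll = np.log(total).sum()
        # M-step: weighted updates of proportions, means, and variances.
        nk = resp.sum(axis=0)
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, var, ll

# Simulated data from two well-separated components (hypothetical values).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
pi_hat, mu_hat, var_hat, loglik = em_gaussian_mixture(sample)
print(pi_hat, mu_hat, var_hat)
```

The E-step is a sum over independent observations, which is the part a parallel implementation would most naturally split across threads or nodes; how the authors actually partition the work is not stated in the abstract.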

A multivariate student-t process model for dependent tail-weighted degradation data

IISE TRANSACTIONS, 2024

Authors: Xu, Ancha; Fang, Guanqi; Zhuang, Liangliang; Gu, Cheng

Affiliations: Zhejiang Gongshang Univ, Sch Stat & Math, Hangzhou, Zhejiang, Peoples R China; Zhejiang Gongshang Univ, Collaborat Innovat Ctr Stat Data Engn Technol & Ap, Hangzhou, Zhejiang, Peoples R China

Traditionally, the Gaussian assumption, implied by the Wiener process, is widely adopted for modeling degradation processes. However, when degradation data exhibit heavy tails, this assumption is not suitable. To overcome this limitation, this article proposes a novel class of tail-weighted multivariate degradation models built upon the Student-t process. The model is able to account for both between-unit variability and process dependency, while allowing the tail heaviness to be adjusted through the degrees-of-freedom parameter. For reliability assessment, we derive the system reliability function and present an efficient Monte Carlo method for its evaluation. Further, we introduce an expectation-maximization algorithm for parameter estimation and design a bootstrap method for interval estimation. Comprehensive simulation studies are conducted to validate the effectiveness of the inference method. Finally, the proposed methodology is applied to analyze two real-world degradation datasets.

Keywords: Reliability; bootstrap; EM algorithm; heavy tail; multivariate degradation

Sequential estimation for mixture of regression models for heterogeneous population

COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2024, Vol. 194

Authors: You, Na; Dai, Hongsheng; Wang, Xueqin; Yu, Qingyun

Affiliations: Sun Yat Sen Univ, Sch Math, Guangzhou 510275, Guangdong, Peoples R China; Univ Essex, Sch Math Stat & Actuarial Sci, Colchester CO4 3SQ, England; Newcastle Univ, Sch Math Stat & Phys, Newcastle NE1 7RU, England; Univ Sci & Technol China, Sch Management, Hefei, Anhui, Peoples R China; Newcastle Univ, Sch Math Stat & Phys, Newcastle upon Tyne NE1 7RU, England

Heterogeneity among patients commonly exists in clinical studies and leads to challenges in medical research. It is widely accepted that there exist various sub-types in the population and that they are distinct from each other. The approach of identifying the sub-types and thus tailoring disease prevention and treatment is known as precision medicine. The mixture model is a classical statistical model to cluster the heterogeneous population into homogeneous subpopulations. However, for a highly heterogeneous population with multiple components, its parameter estimation and clustering results may be ambiguous due to the dependence of the EM algorithm on the initial values. For sub-typing purposes, the finite mixture of regression models with concomitant variables is considered and a novel statistical method is proposed to identify the main components with large proportions in the mixture sequentially. Compared to existing typical statistical inferences, the new method not only requires no pre-specification of the number of components for model fitting, but also provides more reliable parameter estimation and clustering results. Simulation studies demonstrated the superiority of the proposed method. Real data analysis on drug response prediction illustrated its reliability in parameter estimation and its capability to identify the important subgroup.

Keywords: EM algorithm; Heterogeneous population; Mixture model; Sub-type

Regression analysis for Dependent current status data

Asia Conference on Algorithms, Computing and Machine Learning (CACML)

Authors: Yan, Hanwen; Zhou, Yuting; Yang, Xuemei

Affiliations: Northeast Normal Univ, Changchun, Peoples R China; China Acad Engn Phys, Beijing, Peoples R China; China Acad Engn Phys, CAEP, Beijing, Peoples R China

ISBN (digital): 9781665482905

ISBN (print): 9781665482905

In current status data, each individual is observed only once, and the only available information is whether the failure event of interest occurred by the observation time. In other words, the specific survival or failure time of an individual is never observed, so current status data differ significantly from the usual right-censored data. In this paper, we use the Cox model to model the failure time of interest and the observation time. Because the model contains not only finite-dimensional regression coefficients but also an unknown infinite-dimensional function, and because some covariates cannot be observed, it is difficult to maximize the likelihood function directly. Therefore, an unobservable latent variable is introduced to describe the dependence between the two kinds of time, and a step function is used to approximate the unknown function, which simplifies the nonparametric part. The parameter estimates are then obtained by the EM algorithm, and the consistency and asymptotic properties of the estimators are also established. Simulation studies show that the proposed method performs well in finite samples. A set of mouse experiments is then analyzed, showing that the sterile environment has no significant effect on tumor inhibition. This paper considers only current status data and the Cox model; in the future, statistical inference under more general and more complex models can be considered.

Keywords: EM algorithm; current state data; survival analysis
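
Illustrative note (not part of the catalog record): the abstract above describes data in which each individual is examined once and only the event status at that time is recorded. The sketch below simulates such data to make the observation scheme concrete. The exponential failure times, the uniform observation times, and their independence are assumptions made for illustration; the paper itself concerns the dependent case, which this sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical setup: exponential failure times T and, independently,
# uniform observation (monitoring) times C. Neither choice is from the paper.
failure_time = rng.exponential(scale=2.0, size=n)
observation_time = rng.uniform(0.0, 5.0, size=n)

# Event indicator: 1 if the failure has already occurred at the single
# observation time, 0 otherwise. T itself is never seen by the analyst.
delta = (failure_time <= observation_time).astype(int)

# What the analyst actually sees: one (C, delta) pair per individual.
observed = np.column_stack([observation_time, delta])
print(observed[:5])
```

Unlike right-censored data, no row of `observed` ever contains an exact failure time, which is why the likelihood involves the unknown distribution function at every observation time.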

Modelling Multiple Regimes in Economic Growth by Mixtures of Generalised Nonlinear Models

ECONOMETRICS AND STATISTICS, 2022, Vol. 22, pp. 124-135

Authors: Omerovic, Sanela; Friedl, Herwig; Gruen, Bettina

Affiliations: Austrian Financial Market Author FMA, Otto Wagner Pl 5, A-1090 Vienna, Austria; Graz Univ Technol, Kopernikusgasse 24-3, A-8010 Graz, Austria; Vienna Univ Econ & Business, Welthandelspl 1, A-1020 Vienna, Austria

The new model class of mixtures of generalised nonlinear models (GNMs) is introduced. The model is specified, identifiability issues discussed, the fitting in a maximum likelihood framework using the expectation-maximisation (EM) algorithm outlined and an appropriate computational implementation introduced. The new model class is applied to capture cross-country heterogeneity when considering the augmented Solow model including human capital accumulation as underlying model structure. The inherent heterogeneity is attributed to multiple regimes being present within the selected country data set. The results highlight that country-specific differences lead to distinct components. Countries belonging to the same component exhibit convergence to a homogeneous steady state. The components differ in the initial technological endowment and the contribution of the economic variables to economic growth. (c) 2021 EcoSta Econometrics and Statistics. Published by Elsevier B.V. All rights reserved.

Keywords: Finite mixture model; Generalised nonlinear model; Solow model; EM algorithm

A new class of zero-truncated counting models and its application

COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024

Authors: Tang, Xian-Ping; Tian, Yu-Zhu; Wu, Chun-Ho; Wang, Yue; Mian, Zhi-Bao

Affiliations: Northwest Normal Univ, Sch Math & Stat, Lanzhou, Peoples R China; Gansu Prov Res Ctr Basic Disciplines Math & Stat, Lanzhou, Peoples R China; Hang Seng Univ Hong Kong, Sch Decis Sci, Shatin, Hong Kong, Peoples R China; Educ Univ Hong Kong, Dept Math & Informat Technol, Hong Kong, Peoples R China; Univ Hull, Sch Comp Sci, Kingston Upon Hull, England

Count data record the number of times an event occurs per unit of time, and zero-truncated count data are count data in which zeros cannot be observed; such data appear in many fields. In this paper, a new zero-truncated Bell (ZTBell) distribution is proposed based on the Bell distribution. We studied its statistical properties, exploring methods such as maximum likelihood estimation (MLE), the expectation-maximization (EM) algorithm, and the minimization-maximization (MM) algorithm for parameter estimation, as well as conducting likelihood ratio tests. In addition, we used the bootstrap method to calculate the standard errors and confidence intervals of the parameters. The simulation results show that the MLE, the MM algorithm, and the EM algorithm are all effective and that, as the sample size increases, the parameter estimates move closer to the true values and the root mean square error decreases. Finally, applying the model to a set of factory accident data, we found that the ZTBell distribution fits better than the other models and is close to the fitting results of the zero-truncated generalized Poisson distribution. However, the ZTBell distribution has only one parameter, so it is simpler than the latter. Therefore, the ZTBell distribution can be a good alternative to other zero-truncated distributions, which provides more options for statistical analysis in this domain.

Keywords: Bootstrap; Count data; EM algorithm; MM algorithm; ZTBell
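
Illustrative note (not part of the catalog record): the abstract above builds a zero-truncated distribution from the Bell distribution. The sketch below shows how zero truncation works in general: the probability of zero is removed and the remaining mass is rescaled. It assumes the commonly used one-parameter Bell probability mass function, theta^x * exp(1 - e^theta) * B_x / x!, where B_x is the x-th Bell number; the abstract does not state the parametrization, so this form, the helper names, and the value theta = 0.8 are assumptions.

```python
import math

def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def ztbell_pmf(x, theta, bells):
    """Zero-truncated Bell pmf at integer x, under the assumed Bell pmf."""
    if x < 1:
        return 0.0
    # exp(1 - e^theta) is both the normalizing factor and P(X = 0).
    p_zero = math.exp(1.0 - math.exp(theta))
    bell_pmf = theta ** x * p_zero * bells[x] / math.factorial(x)
    # Truncation: drop the zero class and rescale the rest by 1 - P(X = 0).
    return bell_pmf / (1.0 - p_zero)

theta = 0.8
bells = bell_numbers(60)
probs = [ztbell_pmf(x, theta, bells) for x in range(1, 61)]
total = sum(probs)
print(total)
```

The truncated probabilities sum to one (up to a negligible tail beyond x = 60) because the exponential generating function of the Bell numbers is exp(e^theta - 1).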

Model Error Estimation Using the Expectation Maximization algorithm and a Particle Flow Filter

SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2021, Vol. 9, No. 2, pp. 681-707

Authors: Magdalena Lucini, Maria; van Leeuwen, Peter Jan; Pulido, Manuel

Affiliations: UNNE, CONICET, RA-3400 Corrientes, Argentina; Univ Reading, Dept Meteorol, Reading RG6 6AH, Berks, England; Colorado State Univ, Dept Atmospher Sci, Ft Collins, CO 80523, USA

Model error covariances play a central role in the performance of data assimilation methods applied to nonlinear state-space models. However, these covariances are largely unknown in most applications. A misspecification of the model error covariance has a strong impact on the computation of the posterior probability density function, leading to unreliable estimations and even to a total failure of the assimilation procedure. In this work, we propose the combination of the expectation maximization (EM) algorithm with an efficient particle filter to estimate the model error covariance using a batch of observations. Based on the EM algorithm principles, the proposed method encompasses two stages: the expectation stage, in which a particle filter is used with the present updated value of the model error covariance as given to find the probability density function that maximizes the likelihood, followed by a maximization stage, in which the expectation under the probability density function found in the expectation step is maximized as a function of the elements of the model error covariance. The novel algorithm presented here combines the EM algorithm with a fixed point algorithm and does not require a particle smoother to approximate the posterior densities. We demonstrate that the new method accurately and efficiently solves the linear model problem. Furthermore, for the chaotic nonlinear Lorenz-96 model the method is stable even for an observation error covariance 10 times larger than the estimated model error covariance matrix, and it is also successful in moderately large dimensional situations where the dimension of the estimated matrix is 40 x 40.

Keywords: particle filters; state-space models; model error covariance; EM algorithm

A Lightweight Graph-based Method to Detect Pornographic and Gambling Websites with Imperfect Datasets

21st IEEE International Conference on Trust, Security and Privacy in Computing and Communications (IEEE TrustCom)

Authors: Ma, Xiaoqing; Zheng, Chao; Li, Zhao; Yin, Jiangyi; Liu, Qingyun; Chen, Xunxun

Affiliations: Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China; Natl Engn Lab Informat Secur Technol, Beijing, Peoples R China; Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China; Geedge Networks, Beijing, Peoples R China; Natl Comp Network Emergency Response Tech Team C, Beijing, Peoples R China

ISBN (print): 9781665494250

With the widespread abuse of information technology, pornographic and gambling websites develop rapidly. They affect the physical and mental health of children and endanger personal property. Therefore, it is necessary to detect them. However, existing detection methods ignore that imperfect datasets are common for pornographic and gambling websites, which hampers detection. Those imperfections specifically include sparse samples, mismatched datasets, and imbalanced datasets. In addition, over-reliance on visual features incurs high overhead. To overcome these shortcomings, we propose a lightweight graph-based method to detect pornographic and gambling websites through semi-supervised learning of textual content. The semi-supervised learning addresses sparse samples and mismatched datasets, while the graph-based approach combines the semi-supervised part with community discovery to deal with imbalanced datasets. Specifically, we perform the detection process using modified TF-IDF and Louvain, with iteration and updating carried out by the EM algorithm. The experimental results show that our method achieves the best Macro-Avg-F1 of 92.01% with the shortest CPU time and outperforms all baselines. We also illustrate that the designed components in our model do contribute to the detection.

Keywords: Pornographic and Gambling Websites; Imperfect Datasets; Semi-Supervised; EM algorithm; Modified TF-IDF
