Binary observations are often repeated to improve data quality, creating technical replicates. Several scoring methods are commonly used to infer the actual individual state and obtain a probability for each state. Th...
K nearest neighbor and Bayesian methods are effective machine learning methods, and expectation maximization is an effective Bayesian classifier. In this work, a data elimination approach is proposed to improve data clustering. The proposed method is based on hybridization of the k nearest neighbor and expectation maximization algorithms. The k nearest neighbor algorithm serves as a preprocessor for the expectation maximization algorithm, reducing the amount of training data that makes learning difficult. The suggested method is tested on the well-known machine learning data sets iris, wine, breast cancer, glass and yeast. Simulations are run in the MATLAB environment and performance results are reported. (C) 2011 Elsevier Ltd. All rights reserved.
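The abstract above does not spell out the elimination rule or the mixture model, so the following is only a minimal sketch of the general idea, under stated assumptions: points are dropped when their mean distance to their k nearest neighbours is largest, and the remaining points are clustered with EM for a spherical Gaussian mixture. The function names `knn_filter` and `em_spherical_gmm`, the `keep_fraction` parameter, and the use of Python with NumPy (the paper used MATLAB) are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np


def knn_filter(X, k=5, keep_fraction=0.9):
    """Return sorted indices of the points to keep.

    Assumed rule (not from the paper): rank points by their mean distance
    to their k nearest neighbours and keep the closest-packed fraction.
    """
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Column 0 of each sorted row is the point itself (distance 0), so skip it.
    knn_mean = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)
    n_keep = int(np.ceil(keep_fraction * len(X)))
    return np.sort(np.argsort(knn_mean)[:n_keep])


def em_spherical_gmm(X, n_components, n_iter=50, seed=0):
    """EM for a Gaussian mixture with one variance per component."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, n_components, replace=False)].copy()
    variances = np.full(n_components, X.var())
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_p = (np.log(weights)
                 - 0.5 * d * np.log(2 * np.pi * variances)
                 - 0.5 * sq / variances)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means and variances from responsibilities.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq).sum(axis=0) / (d * nk) + 1e-6
    return weights, means, variances, resp
```

A typical call would be `keep = knn_filter(X)` followed by `em_spherical_gmm(X[keep], n_components)`, so that EM only ever sees the retained points. The filter compares all pairs of points, which is fine for small benchmark sets such as those named in the abstract but would need a spatial index for larger data.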
The idea of representing data in negatively curved manifolds has recently attracted a lot of attention and given rise to a new research direction named hyperbolic machine learning (ML). In order to unveil the...
We introduce a computationally efficient and general approach for utilizing multiple, possibly interval-censored, data streams to study complex biomedical endpoints using multistate semi-Markov models. Our motivating ...
In social online platforms, identifying influential seed users to maximize influence spread is crucial, as it can greatly reduce the cost and effort required for information dissemination. While effective, traditi...
Repeated waves of emerging variants during the SARS-CoV-2 pandemic have highlighted the urgent need to collect longitudinal genomic data and to develop statistical methods based on time series analyses for detecting new threatening lineages and estimating their fitness early in time. Most models study the evolution of the prevalence of particular lineages over time and require a prior classification of sequences into lineages. Such a process is prone to delays and bias. More recently, a few authors have studied the evolution of the prevalence of mutations over time with alternative clustering approaches, avoiding specific lineage classification. Most of the aforementioned methods are, however, either nonparametric or unsuited to pooled data such as wastewater samples. The analysis of wastewater samples has recently been pointed out as a valuable complementary approach to clinical sample analysis; however, the pooled nature of the data involves specific statistical challenges. In this context, we propose an alternative unsupervised method for clustering mutations according to their frequency trajectory over time and estimating group fitness from time series of pooled mutation prevalence data. Our model is a mixture of observed count data and latent group assignment, and we use the expectation-maximization algorithm for model selection and parameter estimation. The application of our method to time series of SARS-CoV-2 sequencing data collected from wastewater treatment plants in France from October 2020 to April 2021 shows its ability to agnostically group mutations according to their probability of belonging to the B.1.160, Alpha, Beta and B.1.177 variants, with selection coefficient estimates per group consistent with the viral dynamics in France reported by Nextstrain. Moreover, our method detected the Alpha variant as threatening as early as supervised methods (which track specific mutations over time), with the notable difference that, since unsupervis
The failure of a system can result from the simultaneous effects of multiple causes, where assigning a specific cause may be inappropriate or unavailable. Examples include contributing causes of death in epidemiology ...
Video synthetic aperture radar (ViSAR) has attracted substantial attention in the moving target detection (MTD) field due to its ability to continuously monitor changes in the target area. In ViSAR, the moving targets...
In this work, we develop a scalable approach for a flexible latent factor model for high-dimensional dynamical systems. Each latent factor process has its own correlation and variance parameters, and the orthogonal fa...
We introduce an approach based on mirror descent and sequential Monte Carlo (SMC) to perform joint parameter inference and posterior estimation in latent variable models. This approach is based on minimisation of a fu...