The EM algorithm has been successfully applied to obtain maximum likelihood estimates for state-space models. The usual formulation of the algorithm is based on a shift-operator model for the discrete-time (or sampled-data) system. More recently, it has been shown that an equivalent formulation of the algorithm in terms of incremental discrete-time models has better numerical properties, in particular at fast sampling rates. In this paper we explore the correspondence between the parameter estimates given by the EM algorithm applied to incremental models and those corresponding to a purely continuous-time formulation.
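To make the contrast between the two discrete-time parameterisations concrete, here is a minimal sketch for a scalar system dx/dt = a x sampled with period dt. It is not taken from the paper: the function names and the pole value a = -2 are assumptions chosen for illustration. The shift-operator coefficient tends to 1 as dt shrinks, whatever a is, while the incremental coefficient tends to the continuous-time parameter a.

```python
import math

def shift_param(a: float, dt: float) -> float:
    """Shift-operator form x[k+1] = a_q * x[k], with a_q = exp(a * dt)."""
    return math.exp(a * dt)

def delta_param(a: float, dt: float) -> float:
    """Incremental form (x[k+1] - x[k]) / dt = a_delta * x[k]."""
    # expm1 avoids the cancellation in exp(a*dt) - 1 when a*dt is tiny
    return math.expm1(a * dt) / dt

a = -2.0  # hypothetical continuous-time pole
for dt in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"dt={dt:g}  a_q={shift_param(a, dt):.6f}  a_delta={delta_param(a, dt):.6f}")
```

At fast sampling the shift-operator coefficients of different systems all crowd towards 1, which is one informal way to see why the incremental form is better conditioned.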
ISBN: 9783319687650; 9783319687643 (print)
Crowdsourcing has emerged as a cheap and fast solution for distributed labor networks. Since workers have varying levels of expertise, several approaches to measuring annotator reliability have been proposed. In some settings, annotators who give random answers are abundant and only a few experts are available. We therefore propose an iterative algorithm for crowdsourcing problems in which expert annotators are hard to find; it selects expert annotators based on an EM-Bayesian algorithm, an entropy measure, and Condorcet's jury theorem. Experimental results on eight datasets show that the proposed algorithm outperforms previous approaches.
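Condorcet's jury theorem, one of the ingredients named above, can be illustrated with a short calculation. The sketch below is not the paper's selection algorithm; it only computes the probability that a strict majority of n independent annotators is correct, assuming every annotator has the same accuracy p and n is odd. The function name is hypothetical.

```python
from math import comb

def majority_correct(p: float, n: int) -> float:
    """Probability that a strict majority of n independent voters is correct.

    Assumes each voter is correct with the same probability p and n is odd.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that a strict majority always exists")
    need = n // 2 + 1  # smallest number of correct votes that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))

for n in (1, 3, 11):
    print(n, round(majority_correct(0.6, n), 4), round(majority_correct(0.4, n), 4))
```

With p above 0.5 the majority becomes more reliable as n grows, and with p below 0.5 it becomes less reliable, which is why filtering out random or adversarial annotators matters.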
ISBN: 9781450347228 (print)
The advent of RNA-Seq has made it possible to quantify transcript expression on a large scale simultaneously. This technology generates small fragments of each transcript sequence, known as sequencing reads. As the first step of data analysis towards expression quantification, most existing methods align these reads to a reference genome or transcriptome to establish their origins. However, read alignment is computationally costly. Recently, a series of methods have been proposed to perform lightweight quantification in an alignment-free manner. These methods use the notion of k-mers, short consecutive sequences that represent the signatures of each transcript, to estimate relative abundance from RNA-Seq reads. Current k-mer based approaches use a set of fixed-size k-mers; however, the true signatures of each transcript may not have a fixed size. In this paper, we demonstrate the importance of k-mer selection in transcript abundance estimation. We propose a novel method, Fleximer, to efficiently discover and select an optimal set of k-mers with flexible lengths. Using both simulated and real datasets, we show that, with fewer k-mers, Fleximer covers a similar number of reads to Sailfish and Kallisto. The selected k-mers have more distinguishing features and thus substantially reduce the errors in transcript abundance estimation.
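The idea of a k-mer acting as a transcript signature can be sketched in a few lines. This is not Fleximer's selection procedure; it is a toy illustration that lists the k-mers occurring in exactly one transcript, for two hypothetical sequences, and shows that the set of distinguishing k-mers depends on k.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def unique_kmers(transcripts: dict, k: int) -> dict:
    """For each transcript, the k-mers that occur in no other transcript."""
    sets = {name: kmers(seq, k) for name, seq in transcripts.items()}
    result = {}
    for name, own in sets.items():
        # union of the k-mer sets of every other transcript
        others = set().union(*(s for n, s in sets.items() if n != name))
        result[name] = own - others
    return result

# Hypothetical toy transcripts that differ in a single base
toy = {"t1": "ACGTACGA", "t2": "ACGTTCGA"}
for k in (3, 5):
    print(k, {name: sorted(s) for name, s in unique_kmers(toy, k).items()})
```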
ISBN: 9781509041176 (print)
We consider the problem of learning the underlying tree structure from noisy, mixed data obtained from a linear model. To achieve this, we use the expectation maximization algorithm combined with the Chow-Liu minimum spanning tree algorithm. This algorithm is sub-optimal, but it has low complexity and is applicable to model selection problems through any linear model.
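The Chow-Liu step on its own can be sketched as follows. The sketch covers only the classical tree-learning part for fully observed discrete data: it computes the empirical mutual information between every pair of variables and keeps a maximum-weight spanning tree (equivalently, a minimum spanning tree on negated weights). The EM handling of noisy, mixed data described in the abstract is not reproduced, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())

def chow_liu_edges(columns):
    """Edges of a maximum-weight spanning tree over pairwise mutual information.

    columns: list of equal-length sequences, one per variable.
    """
    d = len(columns)
    weighted = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest for Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    edges = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))
    return edges
```

Called on three binary columns in which the second depends on the first and the third on the second, the function should return the two chain edges and skip the weakest pair.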
ISBN: 9781538610725 (print)
The purchase decisions of customers on e-commerce platforms are strongly influenced by product ratings and reviews. Driven by profit, review spammers post fake reviews to promote their products or demote their competitors' products. Unlike individual spammers, spammer groups manipulate reviews together and can be more damaging. Existing work on spammer group detection extracts candidate groups from review data and identifies the spammer groups using unsupervised spamicity ranking methods. However, labeled and unlabeled data exist simultaneously in practice, and no method makes good use of both in spammer group detection. In this paper, we propose a semi-supervised learning based spammer group detection method (Semi-SGD), which trains a Naive Bayes classifier on a small set of labeled data as an initial classifier and then incorporates unlabeled data with the Expectation Maximization (EM) algorithm to improve the initial classifier iteratively. Experiments on *** datasets show that the proposed Semi-SGD is efficient and effective.
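The general pattern of training Naive Bayes on labeled data and refining it with EM over unlabeled data can be sketched as below. This is not the Semi-SGD implementation: binary features, two classes, Laplace smoothing, and all function names are assumptions made for a compact illustration.

```python
import math

def fit_nb(X, resp):
    """M-step: Bernoulli Naive Bayes from soft labels resp[i] = P(class 1 | x_i)."""
    n, d = len(X), len(X[0])
    params = {}
    for c in (0, 1):
        w = [r if c == 1 else 1.0 - r for r in resp]
        total = sum(w)
        prior = (total + 1.0) / (n + 2.0)  # Laplace-smoothed class prior
        theta = [
            (sum(wi * x[j] for wi, x in zip(w, X)) + 1.0) / (total + 2.0)
            for j in range(d)
        ]  # smoothed P(feature j = 1 | class c)
        params[c] = (prior, theta)
    return params

def posterior(params, x):
    """E-step for one sample: P(class 1 | x) under the current parameters."""
    logp = {}
    for c, (prior, theta) in params.items():
        logp[c] = math.log(prior) + sum(
            math.log(t) if xj else math.log(1.0 - t) for t, xj in zip(theta, x)
        )
    m = max(logp.values())  # subtract the max for numerical stability
    e0, e1 = math.exp(logp[0] - m), math.exp(logp[1] - m)
    return e1 / (e0 + e1)

def semi_supervised_nb(X_lab, y_lab, X_unl, iters=10):
    """Initial classifier from labeled data, then EM over labeled + unlabeled data."""
    hard = [float(y) for y in y_lab]
    params = fit_nb(X_lab, hard)
    for _ in range(iters):
        soft = [posterior(params, x) for x in X_unl]   # E-step on unlabeled data
        params = fit_nb(X_lab + X_unl, hard + soft)    # M-step on all data
    return params
```

The labeled samples keep their hard labels in every M-step, so the unlabeled data can only refine, not replace, the supervision.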
ISBN: 9781538631423 (print)
A concatenated code based on a nonbinary LDPC code and a Hadamard code is used for a noncoherent underwater acoustic communication system. A 32-ary (620, 320) regular LDPC code and an irregular LDPC code are constructed by the quasi-cyclic extension method and the progressive edge-growth (PEG) algorithm, respectively. Under a non-Gaussian noise model, a Gaussian mixture model (GMM) is used to fit the noise; the GMM parameters are estimated by the Expectation Maximization (EM) algorithm, and the probability density of the noise is then estimated. In a Rayleigh fading channel, posterior probabilities of the Hadamard codewords are calculated based on the GMM, and the nonbinary LDPC code is then decoded by the belief propagation (BP) algorithm on the Tanner graph. Simulation shows that the concatenated irregular LDPC and Hadamard code has a 0.4 dB gain over the concatenated regular LDPC and Hadamard code under white Gaussian noise. Under Gaussian mixture noise, the GMM-based EM algorithm can accurately estimate the probability density of the noise and improve the error-correcting performance of the concatenated code; the performance gap is 0.1 dB compared with the case of known probability density. Noise samples were acquired in experiments carried out in the deep sea and in a shallow lake. Under the actual noise, the advantages of the GMM-based concatenated code in practical applications are verified.
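The noise-fitting step can be illustrated with a textbook EM loop for a one-dimensional Gaussian mixture. This is a minimal sketch, not the paper's estimator: real-valued scalar noise samples, the variance floor, and the function names are all assumptions.

```python
import math

def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, weights, means, variances, iters=50):
    """Fit a 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, trace), where trace holds the
    log-likelihood evaluated at the start of each iteration.
    """
    weights, means, variances = list(weights), list(means), list(variances)
    k, n = len(weights), len(data)
    trace = []
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp, ll = [], 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            ll += math.log(total)
            resp.append([p / total for p in joint])
        trace.append(ll)
        # M-step: re-estimate weights, means, and variances from responsibilities
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor guards against a collapsing component
    return weights, means, variances, trace
```

The fitted density is then the weighted sum of the component densities, and the log-likelihood trace should be non-decreasing, which is a convenient sanity check when fitting measured noise.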
ISBN: 9781538636817 (print)
A Marshall-Olkin type distribution is defined as the distribution of the maximum or minimum of N i.i.d. (secondary) random variables, where N is a geometrically distributed random variable. Gopal and Damondaran (2011) and Xiao (2015) considered the special cases where the underlying random variables are exponentially and Weibull distributed, respectively, and investigated the applicability of these specific Marshall-Olkin type distributions to software reliability modeling with a non-homogeneous Poisson process. In this paper, we generalize the existing software reliability growth models (SRGMs) by assuming that the secondary probability distribution is one of eleven representative distributions with positive support. For these new SRGMs, we develop efficient parameter estimation algorithms based on the EM (Expectation-Maximization) principle, and provide two stable statistical inference schemes for the cases where software fault-detection time data and grouped data are available, respectively. In numerical examples with sixteen data sets (eight each for detection time data and grouped data), the resulting Marshall-Olkin type SRGMs are compared with the eleven existing SRGMs in terms of goodness-of-fit performance and reliability prediction.
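The definition in the first sentence leads to a closed form by summing a geometric series. The derivation below is a sketch under an assumed parameterisation, which may differ from the paper's: the secondary variables X_1, X_2, ... have distribution function F and survival function \bar F = 1 - F, and N is independent of them with P(N = n) = p(1-p)^{n-1} for n = 1, 2, ...

```latex
\begin{align*}
P\Bigl(\min_{1 \le i \le N} X_i > x\Bigr)
  &= \sum_{n=1}^{\infty} p\,(1-p)^{n-1}\,\bar F(x)^{n}
   = \frac{p\,\bar F(x)}{1 - (1-p)\,\bar F(x)}, \\
P\Bigl(\max_{1 \le i \le N} X_i \le x\Bigr)
  &= \sum_{n=1}^{\infty} p\,(1-p)^{n-1}\,F(x)^{n}
   = \frac{p\,F(x)}{1 - (1-p)\,F(x)}.
\end{align*}
```

Setting p = 1 gives N = 1 with probability one, and both expressions reduce to the secondary distribution itself.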
ISBN: 9781538610091 (print)
To address the poor performance of moving-target tracking when the statistical characteristics of the process noise are unknown and the observation noise is non-Gaussian, a method combining the EM algorithm and a particle filter is proposed. The method applies the EM algorithm to estimate the process noise parameters accurately; the particle filter is then used to obtain a high-precision estimate of the target motion state. Simulation results demonstrate that the method can effectively suppress filter divergence and significantly improve tracking accuracy.
The penalised least squares estimator with non-convex penalties such as the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP) is highly nonlinear and has many local *** a local solution to achieve the so-called oracle property is a challenging *** show that the orthogonalising EM (OEM) algorithm can indeed find such a local solution with the oracle property under some regularity conditions for a moderate but diverging number of variables.
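For reference, the two penalties named above can be written as short functions. This sketch uses the standard textbook definitions of SCAD and MCP and does not implement the OEM algorithm; a = 3.7 is the conventional SCAD default, while gamma = 3.0 is only an illustrative choice.

```python
def scad_penalty(t: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty at t for tuning parameter lam > 0 and shape a > 2."""
    t = abs(t)
    if t <= lam:
        return lam * t                      # lasso-like near zero
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))  # quadratic blend
    return lam * lam * (a + 1) / 2          # constant: large coefficients are not shrunk further

def mcp_penalty(t: float, lam: float, gamma: float = 3.0) -> float:
    """MCP penalty at t for tuning parameter lam > 0 and shape gamma > 1."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2            # constant beyond gamma * lam

for t in (0.5, 1.0, 2.0, 5.0):
    print(t, round(scad_penalty(t, 1.0), 4), round(mcp_penalty(t, 1.0), 4))
```

Both penalties flatten out for large |t|, which is the source of the non-convexity and of the multiple local solutions that the abstract refers to.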
ISBN: 9781538629185 (print)
The parameter estimation of a wide-sense auto-regressive moving-average (ARMA) model, which is widely applied in a variety of fields, is an extremely important research *** research is conducted with the known driving environment noise or assuming that the driving noise consists of unknown *** the driving noise is really complex in *** now, little attention has been paid to parameter estimation for a wide-sense stationary hidden ARMA process with unknown noise, although it is very common in complex control *** paper presents a parameter estimation method for hidden wide-sense ARMA processes with known model order. A dual particle filter-based method is adopted to jointly estimate states and *** method can be divided into two *** first step uses the particle filter (PF) algorithm to estimate the state of the ARMA model, and the parameters are then estimated with the PF algorithm on the basis of the state estimate in the second *** the noise model is entirely unknown, the Gaussian mixture model is adopted to approximate the posterior probability function in the above dual PF algorithm according to the EM *** results verify the effectiveness of the proposed scheme.