ISBN: 9781509047888 (print)
Sequence-level losses are commonly used to train deep neural network acoustic models for automatic speech recognition. The forward-backward algorithm is used to efficiently compute the gradients of the sequence loss with respect to the model parameters. Gradient-based optimization is used to minimize these losses. Recent work has shown that the forward-backward algorithm can be efficiently implemented as a series of matrix operations. This paper further improves the forward-backward algorithm via batched computation, a technique commonly used to improve training speed by exploiting the parallel computation of matrix multiplication. Specifically, we show how batched computation of the forward-backward algorithm can be efficiently implemented using TensorFlow to handle variable-length sequences within a mini-batch. Furthermore, we also show how the batched forward-backward computation can be used to compute the gradients of the connectionist temporal classification (CTC) and maximum mutual information (MMI) losses with respect to the logits. We show, via empirical benchmarks, that the batched forward-backward computation can speed up the CTC loss and gradient computation by about 183 times when run on a GPU with a batch size of 256 compared to using a batch size of 1, and by about 22 times for lattice-free MMI using a trigram phone language model for the denominator.
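The idea of batching the forward-backward recursions over variable-length sequences can be sketched as follows. This is a minimal illustration in NumPy, not the paper's TensorFlow implementation: it assumes a plain HMM with a shared transition matrix, zero-padded likelihoods, and a length mask, and the function name `batched_forward_backward` and its arguments are hypothetical.

```python
import numpy as np

def batched_forward_backward(init, trans, obs_lik, lengths):
    """Scaled forward-backward over a padded mini-batch (illustrative sketch).

    init:    (S,)      initial state probabilities
    trans:   (S, S)    trans[i, j] = P(state j at t+1 | state i at t)
    obs_lik: (B, T, S) per-frame observation likelihoods, padded to length T
    lengths: (B,)      true length of each sequence (each >= 1)
    Returns state posteriors (B, T, S) and log-likelihoods (B,).
    """
    B, T, S = obs_lik.shape
    mask = np.arange(T)[None, :] < lengths[:, None]      # (B, T), True on real frames

    # Forward pass: one matrix product per time step covers the whole batch.
    alpha = np.zeros((B, T, S))
    scale = np.ones((B, T))
    a = init[None, :] * obs_lik[:, 0]
    scale[:, 0] = a.sum(axis=1)
    alpha[:, 0] = a / scale[:, :1]
    for t in range(1, T):
        a = (alpha[:, t - 1] @ trans) * obs_lik[:, t]
        live = mask[:, t]
        c = np.where(live, a.sum(axis=1), 1.0)           # scale 1 on padded frames
        alpha[:, t] = np.where(live[:, None], a / c[:, None], alpha[:, t - 1])
        scale[:, t] = c

    # Backward pass: padded frames leave the running beta untouched.
    beta = np.ones((B, T, S))
    b = np.ones((B, S))
    for t in range(T - 2, -1, -1):
        nxt = ((b * obs_lik[:, t + 1]) @ trans.T) / scale[:, t + 1, None]
        b = np.where(mask[:, t + 1][:, None], nxt, b)
        beta[:, t] = b

    gamma = alpha * beta * mask[:, :, None]              # zero posteriors on padding
    log_lik = np.log(scale).sum(axis=1)                  # padded frames add log(1) = 0
    return gamma, log_lik
```

The mask is the only piece that makes variable lengths work: each sequence stops updating once its own length is reached, so short and long sequences can share the same per-step matrix product.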
Using Bayes's theorem, we derive a unit-wise recurrence as well as a backward recursion similar to the forward-backward algorithm. The resulting Bayesian recurrent units can be integrated as recurrent neural networks within deep learning frameworks, while retaining a probabilistic interpretation from the direct correspondence with hidden Markov models. Whilst the contribution is mainly theoretical, experiments on speech recognition indicate that adding the derived units at the end of state-of-the-art recurrent architectures can improve the performance at a very low cost in terms of trainable parameters.
ISBN: 9781424414833 (print)
This paper proposes an efficient approximation of the forward-backward (FB) algorithm, for the purpose of estimating missing features, based on downsampling statistical models. The paper discusses the role of Hidden Markov Models (HMMs) in the estimation process, and presents an approximation to the FB method by developing HMMs based on lower resolution quantizers, which are obtained through a tree-structure mapping of quantizer centroids. To illustrate the effectiveness of the proposed method, we apply it to the problem of error concealment in remote speech recognition, using the Aurora-2 database. The FB approximation provides comparable word recognition accuracy results relative to the standard FB method, while reducing the computational load by a large factor (> 250 in this case).
In this note, we examine the forward-backward algorithm from the computational viewpoint of the underflow problem inherent in Baum's (1972) original formulation. We demonstrate that the conversion of Baum's co...
ISBN: 9781467350518 (print)
De-interlacing is revisited as the problem of assigning a sequence of interpolation methods (interpolators) to a sequence of missing pixels of an interlaced frame (field). Under this assumption, our algorithm undergoes transitions from one interpolator to another as it moves from one missing pixel position to the next. We assume that the next state depends only on the current state, which implies a first-order Markov chain on the sequence of interpolators. To estimate the optimum sequence of interpolators, our algorithm introduces a novel cost function and then uses the forward-backward algorithm to find the globally optimal sequence of interpolators. Simulation results show that the proposed method outperforms the well-known de-interlacing algorithms proposed in this field.
Second-order hidden Markov models (HMM2) have been used for a long time in pattern recognition, especially in speech *** main advantages are their capabilities to model noisy temporal signals of variable length. In this paper a new HMM2 for modeling state duration is *** In order to describe the state duration of HMM2, the probability of state duration is defined, and the structure of the state-duration HMM2 is *** Using Bayesian statistical methods, a new forward-backward algorithm is given based on the traditional *** In addition, re-estimation formulae for the parameters of the new model are put forward using the new forward-backward algorithm.
ISBN: 9781509066315 (digital)
ISBN: 9781509066315 (print)
Over the last decades, reweighted procedures have shown high efficiency in computational imaging. They aim to handle non-convex composite penalization functions by iteratively solving multiple approximated subproblems. Although the asymptotic behaviour of these methods has recently been investigated in several works, they all require the subproblems to be solved accurately, which can be sub-optimal in practice. In this work we present a reweighted forward-backward algorithm designed to handle non-convex composite functions. Unlike existing convergence studies in the literature, the weighting procedure is directly included within the iterations, avoiding the need to solve any subproblem. We show that the resulting reweighted forward-backward algorithm converges to a critical point of the initial objective function. We illustrate the good behaviour of the proposed approach on a Fourier imaging example borrowed from radio-astronomical imaging.
The aim of this paper is to investigate the asymptotic behavior of the forward-backward algorithm for solving null-point problems governed by two maximal monotone operators. An application to the split feasibility problem is also stated.
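For orientation, the iteration in question is usually written in the following textbook form. The notation is an assumption, not taken from the abstract: H is a real Hilbert space, A and B are the two monotone operators with A single-valued, and the step sizes are positive.

```latex
\text{find } x \in H \text{ such that } 0 \in A x + B x, \qquad
x_{k+1} = (I + \lambda_k B)^{-1}\bigl(x_k - \lambda_k A x_k\bigr), \quad \lambda_k > 0 .
```

The explicit step on A is the "forward" part and the resolvent of B is the "backward" part.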
We study the variable metric forward-backward splitting algorithm for convex minimization problems without the standard assumption of the Lipschitz continuity of the gradient. In this setting, we prove that, by requiring only mild assumptions on the smooth part of the objective function and using several types of line search procedures for determining either the gradient descent stepsizes or the relaxation parameters, one still obtains weak convergence of the iterates and convergence in the objective function values. Moreover, the o(1/k) convergence rate in the function values is obtained if slightly stronger differentiability assumptions are added. We also illustrate several applications including problems that involve Banach spaces and functions of divergence type.
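How a line search can stand in for a known Lipschitz constant is easiest to see in a small example. The sketch below is not the paper's variable metric method: it assumes the Euclidean metric, a lasso objective chosen purely for illustration, and a simple backtracking rule on the gradient step size. The names `soft_threshold` and `forward_backward_linesearch` are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (the backward step for the lasso)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def forward_backward_linesearch(A, b, lam, x0, step=1.0, shrink=0.5,
                                iters=200, max_backtracks=50):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by forward-backward splitting.

    The step size is found by backtracking, so no Lipschitz constant of the
    gradient has to be supplied in advance.
    """
    def f(x):
        r = A @ x - b
        return 0.5 * (r @ r)

    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = A.T @ (A @ x - b)                 # gradient of the smooth part
        t = step
        for _ in range(max_backtracks):
            z = soft_threshold(x - t * g, t * lam)   # forward then backward step
            d = z - x
            # Accept t once the quadratic upper bound holds at the trial point.
            if f(z) <= f(x) + g @ d + (d @ d) / (2.0 * t) + 1e-12:
                break
            t *= shrink
        x = z
    return x
```

For example, with `A` the identity the minimizer is the soft-thresholded data, so `forward_backward_linesearch(np.eye(3), np.array([3.0, -0.5, 1.0]), 1.0, np.zeros(3))` gives a vector close to `[2, 0, 0]`.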
In this research, we study monotone inclusion problems in real Hilbert spaces using an inertial forward-backward splitting algorithm. In addition, we discuss an application of this algorithm.