A major challenge for Arabic Large Vocabulary Continuous Speech Recognition (LVCSR) is the rich morphology of Arabic, which leads to high out-of-vocabulary (OOV) rates and poor Language Model (LM) probabilities. In such cases, the use of morphemes rather than full words is considered a better choice for LMs, because it yields higher lexical coverage and lower LM perplexities. On the other hand, an effective way to increase the robustness of LMs is to incorporate features of words into LMs. In this paper, we investigate the use of features derived for morphemes rather than words. Thus, we combine the benefits of both morpheme-level and feature-rich modeling. We compare the performance of stream-based, class-based and Factored LMs (FLMs) estimated over sequences of morphemes and their features for performing Arabic LVCSR. A relative reduction of 3.9% in Word Error Rate (WER) is achieved compared to a word-based system.
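The coverage argument in this abstract can be illustrated with a small sketch. It assumes a toy affix stripper and invented transliterated word lists; `segment`, the affix lists and the data are hypothetical and stand in for a real Arabic morphological analyzer, which the abstract does not specify.

```python
def oov_rate(train_tokens, test_tokens):
    """Fraction of test tokens that never occur in the training tokens."""
    vocab = set(train_tokens)
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


def segment(word):
    """Toy affix stripper (hypothetical, not a real morphological analyzer)."""
    prefixes, suffixes = ("wa", "al"), ("hum", "ha")
    head, tail = [], []
    # Strip each known prefix once, in order, keeping a stem of at least 3 letters.
    for p in prefixes:
        if word.startswith(p) and len(word) > len(p) + 2:
            head.append(p + "+")
            word = word[len(p):]
    # Strip at most one known suffix.
    for s in suffixes:
        if word.endswith(s) and len(word) > len(s) + 2:
            tail.append("+" + s)
            word = word[:-len(s)]
            break
    return head + [word] + tail


def to_morphemes(words):
    """Flatten a word sequence into a morpheme sequence."""
    return [m for w in words for m in segment(w)]


# Invented toy data: the test set reuses known stems with new affix combinations.
train_words = ["kitab", "alkitab", "wakitab", "qalam", "qalamhum"]
test_words = ["waalkitab", "kitabhum", "alqalam", "qalam"]

word_oov = oov_rate(train_words, test_words)
morph_oov = oov_rate(to_morphemes(train_words), to_morphemes(test_words))
print(f"word-level OOV rate:     {word_oov:.2f}")
print(f"morpheme-level OOV rate: {morph_oov:.2f}")
```

On this toy data, unseen full words decompose into morphemes that were all seen in training, so the morpheme vocabulary covers the test set even though the word vocabulary does not.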
ISBN (print): 9781479903573
Log-linear models find a wide range of applications in pattern recognition. The training of log-linear models is a convex optimization problem. In this work, we compare the performance of stochastic and batch optimization algorithms. Stochastic algorithms are fast on large data sets but cannot be parallelized well. In our experiments on a broadcast conversations recognition task, stochastic methods yield competitive results after only a short training period, but when enough computational resources are spent on parallelization, batch algorithms are competitive with stochastic algorithms. We obtained slight improvements by using a stochastic second-order algorithm. Our best log-linear model outperforms the maximum-likelihood-trained Gaussian mixture model baseline despite being ten times smaller.
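The difference between the two families of optimizers can be sketched on a two-class log-linear model (logistic regression). This is a minimal illustration on synthetic data, not the setup of the paper: the data, step sizes and epoch counts are assumptions, and plain gradient descent stands in for whichever batch algorithm the authors used.

```python
import math
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nll(w, data):
    """Average negative log-likelihood (convex in w)."""
    total = 0.0
    for x, y in data:
        s = dot(w, x) if y == 1 else -dot(w, x)
        # log(1 + exp(-s)), computed without overflow
        total += max(-s, 0.0) + math.log1p(math.exp(-abs(s)))
    return total / len(data)


def batch_gd(data, epochs, lr):
    """One update per pass; the gradient sum over examples could be parallelized."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(dot(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w


def sgd(data, epochs, lr, seed=0):
    """One update per example; inherently sequential."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    order = list(data)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            err = sigmoid(dot(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


# Synthetic data (assumption): bias feature plus two Gaussian features.
rng = random.Random(0)
data = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    y = 1 if a + b + 0.3 * rng.gauss(0, 1) > 0 else 0
    data.append(([1.0, a, b], y))

w_batch = batch_gd(data, epochs=50, lr=0.5)
w_sgd = sgd(data, epochs=5, lr=0.1)
print("initial loss:", nll([0.0, 0.0, 0.0], data))
print("batch, 50 passes:", nll(w_batch, data))
print("SGD, 5 passes:", nll(w_sgd, data))
```

The stochastic variant makes 200 updates per pass over the data, while the batch variant makes one; the batch gradient, however, is a sum over independent examples and can be distributed across workers.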
We use neural network based features extracted by a hierarchical multilayer-perceptron (MLP) network either in a hybrid MLP/HMM approach or to discriminatively retrain a Gaussian hidden Markov model (GHMM) system in a tandem approach. MLP networks have been successfully used to model long-term and non-linear feature dependencies in automatic speech and optical character recognition. In offline handwriting recognition, MLPs have been mostly used for isolated character and word recognition in hybrid approaches. Here we analyze MLPs within an LVCSR framework for continuous handwriting recognition using discriminative MMI/MPE training. In particular, hybrid MLP/HMM and discriminatively retrained MLP-GHMM tandem approaches are evaluated. Significant improvements and competitive results are reported for a closed-vocabulary task on the IfN/ENIT Arabic handwriting database and for a large-vocabulary task using the IAM English handwriting database.
This work systematically analyzes the smoothing effect of vocabulary reduction for phrase translation models. We extensively compare various word-level vocabularies to show that the performance of smoothing is not sig...
ISBN (print): 9781479919611
Social circles detection is a special case of community detection in social networks that is currently attracting growing interest in the research community. In this paper, we propose a two-step technique that emphasizes the mapping of the data by Restricted Boltzmann Machines (RBMs). Social circles are subsequently inferred by k-means over the preprocessed data. We define different vectorial representations from both structural egonet information and user profile features, and perform a set of tests to tune the parameters of the RBMs. We study and compare the performance on the ego-Facebook dataset of social circles from the Stanford Large Network Dataset Collection. We compare our results with several different baselines.
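The second step of the technique, k-means over vector representations, can be sketched as follows. This is a simplified illustration only: the RBM mapping step is omitted, the binary feature vectors are invented, and a deterministic farthest-first initialization is an assumption made here for reproducibility rather than something stated in the abstract.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialization: start at points[0], then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [list(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    centers = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        new_labels = [
            min(range(k), key=lambda c: sqdist(p, centers[c])) for p in points
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centers


# Invented data: one row per friend in an ego network, one column per
# binary profile feature (hypothetical stand-in for the preprocessed vectors).
friends = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]
labels, centers = kmeans(friends, k=2)
print(labels)
```

Each resulting cluster is read as one candidate social circle. Note that k-means assigns every user to exactly one cluster, so overlapping circles are not represented in this sketch.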
We have recently proposed an EM-style algorithm to optimize log-linear models with hidden variables. In this paper, we use this algorithm to optimize a hidden conditional random field, i.e., a conditional random field with hidden variables. Similar to hidden Markov models, the alignments are the hidden variables in the examples considered. Here, EM-style algorithms are iterative optimization algorithms which are guaranteed to improve the training criterion in each iteration without the need for tuning step sizes, sophisticated update schemes or numerical line optimization (with hardly predictable complexity). This is a rather strong property which conventional gradient-based optimization algorithms do not have. We present experimental results for a grapheme-to-phoneme conversion task and compare the convergence behavior of the EM-style algorithm with L-BFGS and Rprop.
Checkpoint averaging is a simple and effective method to boost the performance of converged neural machine translation models. The calculation is cheap to perform and the fact that the translation improvement almost c...
Mathematical expression recognition is a research field that aims to develop algorithms and systems capable of interpreting mathematical content. The recognition of MEs requires handling two-dimensional symbol relatio...
ISBN (print): 9781424442959
Recently, there have been many papers studying discriminative acoustic modeling techniques like conditional random fields or discriminative training of conventional Gaussian HMMs. This paper will give an overview of the recent work and progress. We will strictly distinguish between the type of acoustic model on the one hand and the training criterion on the other hand. We will address two issues in more detail: the relation between conventional Gaussian HMMs and conditional random fields, and the advantages of formulating the training criterion as a convex optimization problem. Experimental results for various speech tasks will be presented to carefully evaluate the different concepts and approaches, including both digit string and large vocabulary continuous speech recognition tasks.
ISBN (print): 9781538622131; 9781538622124
The emissions consequences of smart grid technologies can be significant but are not always intuitive. This is particularly true in the implementation of energy storage (ES) to enable the installation of solar photovoltaic (PV) systems. Using the web-based calculator at and prototypical distribution feeders, this paper explores the CO_2, SO_2 and NO_x impacts of ES deployed with solar PV, where the energy storage system is operated to minimize load variation assuming hourly dispatch. Five regions of the country were explored using 15 prototypical distribution feeders and 2015 historical data. Impacts vary in direction, magnitude, and trend, and require a context-dependent screening method for faithful representation.