Hidden properties of social media users, such as their ethnicity, gender, and location, are often reflected in their observed attributes, such as their first and last names. Furthermore, users who communicate with eac...
In this paper, three different voicing features are studied as additional acoustic features for continuous speech recognition. The harmonic product spectrum based feature is extracted in the frequency domain, while the autocorrelation and the average magnitude difference based methods operate in the time domain. Each algorithm produces a measure of voicing for every time frame. The voicing measure was combined with the standard Mel Frequency Cepstral Coefficients (MFCC) using linear discriminant analysis to choose the most relevant features. Experiments were performed on small- and large-vocabulary tasks. The three voicing measures combined with MFCCs resulted in similar improvements in word error rate: up to 14% on the small-vocabulary task and up to 6% on the large-vocabulary task, relative to using MFCC alone with the same overall number of parameters in the system.
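The two time-domain measures mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the lag range, and both normalizations (peak of the energy-normalized autocorrelation, and one minus the ratio of the minimum to the mean AMDF) are assumptions chosen only to show the idea of a per-frame voicing score.

```python
import math
import random


def autocorr_voicing(frame, min_lag, max_lag):
    """Peak of the energy-normalized autocorrelation over a lag range.

    Values near 1 suggest a periodic (voiced) frame. The normalization
    is an assumption, not taken from the paper.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    best = -1.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        best = max(best, r / energy)
    return best


def amdf_voicing(frame, min_lag, max_lag):
    """One minus (minimum AMDF / mean AMDF) over a lag range.

    A deep AMDF valley indicates periodicity, so the score approaches 1
    for voiced frames. This scaling is a hypothetical choice.
    """
    n = len(frame)
    values = []
    for lag in range(min_lag, max_lag + 1):
        diff = sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag))
        values.append(diff / (n - lag))
    mean_amdf = sum(values) / len(values)
    if mean_amdf == 0.0:
        return 0.0
    return 1.0 - min(values) / mean_amdf


# Synthetic frames: a sine with a 40-sample period and uniform noise.
voiced = [math.sin(2.0 * math.pi * i / 40.0) for i in range(400)]
rng = random.Random(0)
unvoiced = [rng.uniform(-1.0, 1.0) for _ in range(400)]

for name, frame in (("voiced", voiced), ("unvoiced", unvoiced)):
    print(name, autocorr_voicing(frame, 20, 100), amdf_voicing(frame, 20, 100))
```

In a recognizer, a score like this would be computed per frame and appended to the MFCC vector before the linear discriminant analysis step described in the abstract.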
This work investigates the alignment problem in state-of-the-art multi-head attention models based on the transformer architecture. We demonstrate that alignment extraction in transformer models can be improved by aug...
We present an approach to automatically recover hidden attributes of scientific articles, such as whether the author is a native English speaker, whether the author is a male or a female, and whether the paper was pub...
This paper describes the statistical machine translation system developed at RWTH Aachen University for the English→German and German→English translation tasks of the EMNLP 2017 Second Conference on Machine Translatio...
Twitter has been shown to be a fast and reliable method for disease surveillance of common illnesses like influenza. However, previous work has relied on simple content analysis, which conflates flu tweets that report...
In this paper, we propose novel extensions of hierarchical phrase-based systems with a discriminative lexicalized reordering model. We compare different feature sets for the discriminative reordering model and investi...
We introduce a lexicalized reordering model for hierarchical phrase-based machine translation. The model scores monotone, swap, and discontinuous phrase orientations in the manner of the one presented by Tillmann (200...
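The three orientation classes named in this abstract can be sketched as follows. This uses the common phrase-based definition over source-side spans of consecutively translated phrases; how the paper adapts it to hierarchical rules is not visible in the truncated text, so the function name and the span representation here are assumptions for illustration only.

```python
def orientation(prev_span, cur_span):
    """Classify the orientation of the current phrase relative to the previous one.

    Spans are inclusive (start, end) source-side word indices. This follows the
    usual phrase-based convention and is an illustrative assumption.
    """
    prev_start, prev_end = prev_span
    cur_start, cur_end = cur_span
    if cur_start == prev_end + 1:
        return "monotone"       # current phrase directly follows the previous one
    if cur_end == prev_start - 1:
        return "swap"           # current phrase directly precedes the previous one
    return "discontinuous"      # any other relative placement


print(orientation((0, 1), (2, 3)))
print(orientation((2, 3), (0, 1)))
print(orientation((0, 1), (4, 5)))
```

A lexicalized model would then estimate the probability of each class conditioned on the phrase pair, and the decoder would add the corresponding score as a feature.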
ISBN: (print) 9789728865979
We present an unsupervised, linguistically based approach to discourse relation recognition. It uses publicly available resources such as manually annotated corpora (Discourse Graphbank, Penn Discourse Treebank, RST-DT), as well as empirically derived data from "causally" annotated lexica like LCS, to produce a rule-based algorithm. We adopt the subdivision of discourse relations into four subsets (CONTRAST, CAUSE, CONDITION, ELABORATION) proposed by [7], who report results obtained with a machine-learning approach in a similar experiment, against which we compare our results. Our approach is fully symbolic and is partially derived from GETARUNS, a system for text understanding, adapted to a specific task: recognition of causality relations in free text. We show that, to achieve better accuracy in both the general task and the specific one, semantic information needs to be used in addition to syntactic structural information. Our approach outperforms results reported in previous papers [9].
Currently, in speech translation, the straightforward approach - cascading a recognition system with a translation system - delivers state-of-the-art results. However, fundamental challenges such as error propagation ...