We investigate approaches for building large vocabulary continuous speech recognition (LVCSR) systems for new languages or new domains using limited amounts of transcribed training data. In these low-resource conditions, the pe...
ISBN: 9781424442959 (print)
Extractive summarization of conference and lecture speech is useful for online learning and reference. We show for the first time that deep(er) rhetorical parsing of conference speech is possible and helpful for the extractive summarization task. This type of rhetorical structure is evident in the corresponding presentation slide structures. We propose using a Hidden Markov SVM (HMSVM) to iteratively learn the rhetorical structure of the speeches and summarize them. We show that a system based on HMSVM gives a 64.3% ROUGE-L F-measure, a 10.1% absolute increase in lecture speech summarization performance compared with the baseline system without rhetorical information. Our method equally outperforms the baseline with a conventional discourse feature. Our proposed approach is more efficient than, and also improves upon, a previous method that uses shallow rhetorical structure parsing [1].
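For readers unfamiliar with the metric reported above, the following is a minimal sketch of a sentence-level ROUGE-L F-measure, computed from the longest common subsequence (LCS) of whitespace tokens. The function names, the whitespace tokenization, and the default `beta=1.0` are assumptions made for illustration; the abstract does not state which ROUGE configuration the authors used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure (illustrative; beta is an assumed default)."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    lcs = lcs_length(cand_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    # Weighted harmonic mean of LCS-based precision and recall.
    return (1 + beta ** 2) * precision * recall / (recall + beta ** 2 * precision)


if __name__ == "__main__":
    score = rouge_l_f("the cat sat on the mat", "the cat is on the mat")
    print(round(score, 4))
```

Here the LCS is "the cat on the mat" (five tokens), so precision and recall are both 5/6 and the F-measure is also 5/6.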
As a high-level feature, prosody may be an effective feature when it is modeled over longer ranges than the typical range of a syllable. This paper is about language recognition with the high-level prosodic attributes...
We present results on a novel hybrid semantic SMT model that incorporates the strengths of both semantic role labeling and phrase-based statistical machine translation. The approach avoids major complexity limitations...
We present a series of empirical studies aimed at illuminating more precisely the likely contribution of semantic roles in improving statistical machine translation accuracy. The experiments reported study several asp...
ISBN: 9781615677122 (print)
Frequency domain linear prediction (FDLP) uses autoregressive models to represent Hilbert envelopes of relatively long segments of speech/audio signals. Although the basic FDLP audio codec achieves good quality of the reconstructed signal at high bit-rates, there is a need for scaling to lower bit-rates without degrading the reconstruction quality. Here, we present a method for improving the compression efficiency of the FDLP codec by applying the modified discrete cosine transform (MDCT) to encode the FDLP residual signals. In subjective and objective quality evaluations, the proposed FDLP codec provides reconstructed signal quality competitive with state-of-the-art audio codecs in the 32-64 kbps range.
ISBN: 9781424454785 (print)
In this paper, we propose an active learning approach for feature-based extractive summarization of lecture speech. Most state-of-the-art speech summarization systems are trained using a large number of human reference summaries. Active learning aims to minimize human annotation effort by automatically selecting a small number of unlabeled examples for labeling. Our method chooses the unlabeled examples according to a combination of an informativeness criterion and a robustness criterion. Our summarization results show an increasing learning curve of ROUGE-L F-measure, from 0.44 to 0.54, consistently higher than that obtained with randomly chosen training samples. We also show that, by following the rhetorical structure in presentation slides, it is possible for humans to produce "gold standard" reference summaries with very high inter-labeler agreement.
Frequency Domain Linear Prediction (FDLP) represents an efficient technique for representing the long-term amplitude modulations (AM) of speech/audio signals using autoregressive models. For the proposed analysis tech...
The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of tech...
ISBN: 9781424454785 (print)
In this paper, we survey some central issues in the historical, current, and future landscape of statistical machine translation (SMT) research, taking as a starting point an extended three-dimensional MT model space. We posit a socio-geographical conceptual disparity hypothesis that aims to explain why language pairs like Chinese-English have presented MT with so much more difficulty than others. The evolution from simple token-based to segment-based to tree-based syntactic SMT is sketched. For tree-based SMT, we consider language bias rationales for selecting the degree of compositional power within the hierarchy of expressiveness for transduction grammars (or synchronous grammars). This leads us to inversion transductions and the ITG model prevalent in current state-of-the-art SMT, along with the underlying ITG hypothesis, which posits a language universal. Against this backdrop, we enumerate a set of key open questions for syntactic SMT. We then consider the more recent area of semantic SMT. We list principles for successful application of sense disambiguation models to semantic SMT, and describe early directions in the use of semantic role labeling for semantic SMT.
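The restricted compositional power of inversion transductions mentioned above can be illustrated with word reorderings: a binary ITG can generate a permutation only if it can be built by repeatedly combining adjacent blocks in straight or inverted order. The sketch below is not taken from the paper; `is_itg_reachable` is a hypothetical helper name, and the shift-reduce check is one standard way to test this property, assuming the input is a permutation of 1..n.

```python
def is_itg_reachable(perm):
    """Return True if `perm` (a permutation of 1..n) can be produced by
    binary straight/inverted combinations of adjacent blocks.

    Illustrative shift-reduce check: each stack item is a (low, high)
    range of target positions covered by a contiguous source block.
    """
    stack = []
    for value in perm:
        stack.append((value, value))
        # Reduce while the top two blocks cover adjacent target ranges.
        while len(stack) >= 2:
            lo2, hi2 = stack[-1]
            lo1, hi1 = stack[-2]
            if hi1 + 1 == lo2 or hi2 + 1 == lo1:  # straight or inverted
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    return len(stack) == 1


if __name__ == "__main__":
    for p in ([1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3], [2, 4, 1, 3], [3, 1, 4, 2]):
        print(p, is_itg_reachable(p))
```

Of the 24 permutations of four elements, only the two "inside-out" orders, [2, 4, 1, 3] and [3, 1, 4, 2], fail this check; the other 22 are reachable.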