Search Results - Inner Mongolia University Library

61.3: Invited Paper: A Virtual Display for Mobile Use

SID Symposium Digest of Technical Papers, 2012, Vol. 35, No. 1

Author: Jukka Häkkinen (Audio-Visual Systems Laboratory, Nokia Research Center, Helsinki, Finland)

Mobile devices have small displays, so a virtual display that would bring the benefits of a large computer display to users would be an interesting mobile accessory. However, sickness symptoms are a problem that should be solved before a virtual display can be successful. To explore the symptom levels induced by virtual displays, we have tested several head-worn virtual display types in various contexts. Our results indicate that monocular and stereoscopic displays induce a significant amount of adverse symptoms. On the other hand, the symptom levels induced by biocular displays do not differ from those induced by direct-view displays. The results suggest that a biocular display might be the best alternative as a mobile accessory.


P-28: Image Adjustment with the Tone Rendering Curve


SID Symposium Digest of Technical Papers, 2012, Vol. 35, No. 1

Authors: Tero Vuori, Kristina Björknäs, Joni Oja, Markku Lamberg (Nokia Research Center, Audio-Visual Systems Laboratory, FIN-00045 Nokia Group, Finland)

This paper investigates the suitability of different tone rendering curves (gamma functions) at different external illumination levels and with different display content. It is difficult to find an optimum gamma value because image quality strongly depends on the image content and the external illumination. Tuning the tone rendering curve based on an ambient light sensor and/or the image content of the display would significantly increase the perceived image quality.
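As a rough illustration of the idea in this abstract, the sketch below maps gray levels through a power-law (gamma) tone rendering curve and chooses the exponent from an ambient illuminance reading. The function names, the lux breakpoints, and the gamma values are hypothetical placeholders chosen for the example; none of them come from the paper.

```python
def tone_curve(value, gamma):
    """Map a normalized input level in [0, 1] through a power-law (gamma) curve."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be in [0, 1]")
    return value ** gamma


def gamma_for_ambient(lux):
    """Pick a gamma exponent from ambient illuminance.

    The breakpoints and exponents are illustrative assumptions only.
    """
    if lux < 50:        # dim room
        return 2.4
    if lux < 1000:      # typical indoor lighting
        return 2.2
    return 1.8          # bright surroundings: a smaller exponent lifts dark tones


def adjust_image(pixels, lux):
    """Apply the ambient-dependent curve to a list of 8-bit gray levels."""
    gamma = gamma_for_ambient(lux)
    return [round(255 * tone_curve(p / 255, gamma)) for p in pixels]
```

With these placeholder settings, black and white stay fixed while mid-tones come out brighter under high ambient light than in a dim room. A content-dependent variant would choose the exponent from image statistics instead of, or in addition to, the sensor reading.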


Speech Database Compacted for an Embedded Mandarin TTS System


International Symposium on Chinese Spoken Language Processing

Authors: Qing Guo, Bin Wang, Nobuyuki Katae (Fujitsu Research and Develop Center, China, Internet Application Laboratory, Beijing, China; Audio-Visual Systems Laboratory, Fujitsu Laboratories Limited, Akashi, Japan)

ISBN: (print) 9781424429424

In recent years, unit-selection-based concatenative speech synthesis systems that use a large speech database have become popular because they can produce high-quality synthesized speech. However, such a large speech database is impractical for many applications, such as those ported to embedded devices, because of the storage it requires and the computational complexity of searching it. This paper proposes a context-based pruning algorithm and a waveform-adjustment-effect-based pruning algorithm to compact the speech database. Finally, it presents experimental results and discussion.

Keywords: Databases; Speech synthesis; Laboratories; Clustering algorithms; Computational complexity; Acoustic measurements; Cepstrum; Synthesizers; Optical computing; Telecommunication computing


Analysis of the meter of acoustic musical signals


Authors: Klapuri, Anssi P.; Eronen, Antti J.; Astola, Jaakko T. (IEEE; Institute of Signal Processing, Tampere University of Technology, FIN-33720 Tampere, Finland; Nokia Research Center, Audio-Visual Systems Laboratory, FIN-33721 Tampere, Finland; TUT Institute of Signal Processing; Nokia Research Center, Tampere, Finland; Department of Signal Processing; Tampere International Center for Signal Processing)

A method is described which analyzes the basic pattern of beats in a piece of music, the musical meter. The analysis is performed jointly at three different time scales: at the temporally atomic tatum pulse level, at the tactus pulse level which corresponds to the tempo of a piece, and at the musical measure level. Acoustic signals from arbitrary musical genres are considered. For the initial time-frequency analysis, a new technique is proposed which measures the degree of musical accent as a function of time at four different frequency ranges. This is followed by a bank of comb filter resonators which extracts features for estimating the periods and phases of the three pulses. The features are processed by a probabilistic model which represents primitive musical knowledge and uses the low-level observations to perform joint estimation of the tatum, tactus, and measure pulses. The model takes into account the temporal dependencies between successive estimates and enables both causal and noncausal analysis. The method is validated using a manually annotated database of 474 music signals from various genres. The method works robustly for different types of music and improves over two state-of-the-art reference methods in simulations. © 2006 IEEE.
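To make the comb filter resonator stage mentioned in this abstract more concrete, here is a minimal sketch of a bank of feedback comb filters used to pick the dominant period of an accent signal. It is a simplified illustration under stated assumptions: a single frequency band, a constant feedback gain, no energy normalization, and no probabilistic model. The function names, the gain value, and the synthetic input are hypothetical and do not reproduce the paper's formulation.

```python
def comb_filter_energy(accent, delay, gain=0.8):
    """Mean output energy of the feedback comb filter
    y[n] = gain * y[n - delay] + (1 - gain) * x[n]."""
    y = [0.0] * len(accent)
    for n, x in enumerate(accent):
        feedback = y[n - delay] if n >= delay else 0.0
        y[n] = gain * feedback + (1.0 - gain) * x
    return sum(v * v for v in y) / len(y)


def estimate_period(accent, candidate_delays, gain=0.8):
    """Return the candidate delay whose resonator responds most strongly."""
    energies = {d: comb_filter_energy(accent, d, gain) for d in candidate_delays}
    return max(energies, key=energies.get)


# Synthetic accent signal (an assumption for the demo): one unit pulse
# every 8 samples.
accent = [1.0 if n % 8 == 0 else 0.0 for n in range(256)]
period = estimate_period(accent, range(2, 16))
print(period)
```

The candidate range stops below 16 on purpose: a resonator whose delay is an integer multiple of the true period also resonates, so a practical system needs extra logic to choose among related periods, which is the kind of role the abstract assigns to its probabilistic model.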

Keywords: Signal processing


Source signal based rate adaptation for GSM AMR speech codec


International Conference on Information Technology - Coding and Computing

Authors: Makinen, J; Vainio, J (Audio and Visual Systems Laboratory, Nokia Research Center, Finland)

ISBN: (print) 0769521088

The Adaptive Multi-Rate (AMR) codec [1] was standardised for GSM in 1999. AMR offers substantial improvement over previous GSM speech codecs [6] in error robustness by adapting speech and channel coding to channel conditions. However, the current standard does not exploit the multi-rate capability of the AMR codec for source-signal-based adaptation that would optimise the average bit-rate vs. quality trade-off. This paper presents a source-signal-based rate adaptation algorithm for the AMR codec in the GSM system. Together with fast power control, it can be used to increase the system capacity and further increase the robustness of the GSM AMR codec.

Keywords: Global system for mobile communications


Data-driven approaches for automatic detection of syllable boundaries


8th International Conference on Spoken Language Processing, ICSLP 2004

Author: Tian, Jilei (Audio-Visual Systems Laboratory, Nokia Research Center, Tampere, Finland)

Syllabification is an essential component of many speech and language processing systems. The development of automatic speech recognizers frequently requires working with subword units such as syllables. More importantly, syllabification is an indispensable part of a speech synthesis system. In this paper we present data-driven approaches to supervised learning and automatic detection of syllable boundaries. The generalization capability of the learning is investigated on the assignment of syllable boundaries to phoneme sequence representations in English. A rule-based self-correction algorithm is also proposed to automatically correct some syllabification errors. In a series of experiments, the neural network approach was clearly better in terms of generalization performance and complexity.

Keywords: Speech synthesis


Efficient compression method for pronunciation dictionaries


8th International Conference on Spoken Language Processing, ICSLP 2004

Author: Tian, Jilei (Audio-Visual Systems Laboratory, Nokia Research Center, Tampere, Finland)

Pronunciation dictionaries are often used with other data-driven methods to model pronunciations in phoneme-based automatic speech recognition (ASR) and text-to-speech (TTS) systems. The dictionaries usually take a great amount of memory, which is a limiting factor in portable handheld devices. Compressing the pronunciation dictionaries reduces both the transmission bandwidth and the storage memory required. In this paper we present a new procedure to efficiently compress pronunciation dictionaries. First, a novel method transforms the dictionary to a lower-entropy representation. Second, the variability in the aligned pronunciation dictionary is reduced to further lower its entropy. Finally, generic lossless compression is applied to the transformed dictionary. Experiments were carried out on English names and words from the US English CMU dictionary. The proposed scheme achieved a 37.5% improvement over general-purpose lossless text compression.

Keywords: Digital storage


Multilingual E-mail text processing for speech synthesis


8th International Conference on Spoken Language Processing, ICSLP 2004

Authors: Oria, Daniela; Vetek, Akos (Audio-Visual Systems Laboratory, Nokia Research Center, Helsinki, Finland)

An integrated method of text pre-processing and language identification is introduced to deal with the problem of mixed-language e-mail messages in a speech-enabled e-mail reading system. Our method can confidently distinguish between the supported languages and switch between several TTS engines or languages to read the portions of the text in the appropriate language. This is achieved by making use of the combined information from a text pre-processor and a language identifier that relies on both statistical information and linguistic features indicative of a particular language.

Keywords: Speech synthesis


Beginning of utterance detection algorithm for low complexity ASR engines


8th International Conference on Spoken Language Processing, ICSLP 2004

Author: Lahti, Tommi (Audio-Visual Systems Laboratory, Nokia Research Center, Tampere, Finland)

In this paper, a novel method for beginning of utterance detection is proposed for low complexity ASR systems. Assuming MFCC calculations in the ASR front-end, the additional computational load due to the algorithm is negligible. The algorithm makes use of the delay between the MFCC calculation and decoding process, which is typical in front-ends with feature normalization. The main steps of the algorithm involve LDA projection of MFCC features, mean calculation over the projected features, simple implicit SNR estimation and weighting of the decision statistics according to the estimate. Our experimental results show that high performance is obtained down to fairly low SNR conditions as the beginning of utterance detection starts to fail in a safe way at about 5 dB SNR. These properties make the algorithm an attractive choice for low complexity ASR engines.

Keywords: Engines


Segmental speech coding model for storage applications


8th International Conference on Spoken Language Processing, ICSLP 2004

Authors: Rämö, Anssi; Nurminen, Jani; Himanen, Sakari; Heikkinen, Ari (Audio-Visual Systems Laboratory, Nokia Research Center, Tampere, Finland)

This paper introduces a novel speech coder structure for storage applications operating at low bit rates. The coder exploits the inherent segmental nature of speech signals by dividing the input into segments of variable length. Quite often the length of the segment is the same as the length of the phoneme. The individual segments are coded using adaptive techniques that take into account the relative perceptual importance of different types of speech, e.g. voiced and unvoiced speech. These main features of the proposed approach are enabled by the fact that many of the design constraints related to real-time conversational speech can be relaxed in storage applications. A practical implementation containing the speech-adaptive segmentation is described and its performance is verified in a listening test at average bit rates of about 1.0 kbps and 2.4 kbps respectively. The results show that the segmental model significantly improves the coding efficiency.

Keywords: Speech coding

