Joint matrix quantization of face parameters and LPC coefficients for low bit rate audiovisual speech coding

Author: Girin, L

Affiliation: Univ Grenoble 3, CNRS, INPG, Inst Commun Parlee, F-38031 Grenoble, France

Publication: IEEE Transactions on Speech and Audio Processing (IEEE Trans Speech Audio Process)

Year, volume, issue: 2004, Vol. 12, No. 3

Pages: 265-276

Subjects: audiovisual lip parameters low-bit-rate speech coding LPC parameters matrix quantization speech processing

Abstract: A key problem for videophony, that is, telephony that processes images of the speaker's face in addition to acoustic speech, concerns signal compression for transmission. In such systems, audio and video compression are achieved separately, using distinct audio and video coders. In this paper, an audio-visual approach to this problem is considered, since we claim that the fundamental property of coherence (redundancy) between the two modalities of speech should be exploited by coding systems. We consider the framework of parametric analysis, modeling and synthesis of talking faces, which allows efficient representation of video information. Thus, we propose to jointly encode several face parameters, namely lip shape geometric descriptors, together with sets of audio coefficients, namely the commonly used LPC parameters. Defining an audiovisual distance between vectors of concatenated audio and video parameters makes it possible to generate audiovisual single-stage vector and matrix quantizers by using the generalized Lloyd algorithm. Calculation of video and audio mean distortion measures shows a significant gain in quantization accuracy and/or resolution compared to separate video and audio quantization. An alternative sub-optimal tree-like structure for audiovisual joint coding is also tested and yields interesting results while decreasing the computational complexity of the quantization process.
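
The joint quantization idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the audiovisual distance, so the version below assumes a weighted sum of squared Euclidean distances over the audio and video parts of each concatenated vector. The function names, the weights `w_audio` and `w_video`, and the random initialization are all hypothetical choices for illustration, not the paper's method. The training loop is a plain generalized Lloyd iteration (nearest-code assignment followed by a centroid update).

```python
import numpy as np


def av_distance(x, codebook, n_audio, w_audio=1.0, w_video=1.0):
    """Assumed audiovisual distance: weighted squared Euclidean distance.

    x has shape (N, D), codebook has shape (K, D); the first n_audio
    columns are audio parameters, the rest are video (lip) parameters.
    Returns an (N, K) matrix of distances.
    """
    diff = x[:, None, :] - codebook[None, :, :]
    d_audio = np.sum(diff[:, :, :n_audio] ** 2, axis=2)
    d_video = np.sum(diff[:, :, n_audio:] ** 2, axis=2)
    return w_audio * d_audio + w_video * d_video


def train_codebook(train, n_codes, n_audio, w_audio=1.0, w_video=1.0,
                   n_iter=20, seed=0):
    """Generalized Lloyd iteration on concatenated audio+video vectors."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook with randomly chosen training vectors.
    idx = rng.choice(len(train), size=n_codes, replace=False)
    codebook = train[idx].astype(float)
    for _ in range(n_iter):
        # Assignment step: nearest code under the audiovisual distance.
        dist = av_distance(train, codebook, n_audio, w_audio, w_video)
        labels = np.argmin(dist, axis=1)
        # Update step: for a weighted squared distance with positive
        # weights, the optimal centroid of a cell is its arithmetic mean.
        for k in range(n_codes):
            members = train[labels == k]
            if len(members) > 0:
                codebook[k] = members.mean(axis=0)
    return codebook


def quantize(x, codebook, n_audio, w_audio=1.0, w_video=1.0):
    """Return the index of the nearest code for each row of x."""
    dist = av_distance(x, codebook, n_audio, w_audio, w_video)
    return np.argmin(dist, axis=1)
```

In this sketch a single index is transmitted for each concatenated audio and video vector, which is what lets a joint quantizer exploit redundancy between the two modalities. A matrix quantizer in the sense of the abstract could be approximated in the same code by stacking several consecutive frames into one longer vector, again as an assumption rather than the published design.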
