ISBN: (print) 0780374029
In this paper, we propose a rate-distortion optimal algorithm for sinusoidal modeling of audio and speech. The algorithm uses a variable-length analysis window, and the total number of sinusoids needed to model the source signal is optimally distributed over the resulting segments. To account for human auditory perception, we use a new perceptually relevant distortion measure, which is combined with the psychoacoustical matching pursuit algorithm to select the desired sinusoidal components. We discuss the encoding of the segmentation information and show how to reduce this overhead by restricting the minimum and maximum size of the constituent segments. Although this restricts the number of possible partitionings of the input signal, new segments can still start with high accuracy in time. By doing so, we can decrease the segmentation overhead by 50%, almost without loss of coding efficiency and without introducing pre-echoes.
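The abstract does not spell out the optimization itself, so the following is only a minimal sketch of one common way to search for a best segmentation under minimum and maximum segment-size constraints: dynamic programming over candidate segment boundaries. It is not the authors' algorithm. The function name `optimal_segmentation`, the unit-based time grid, and the `toy_cost` function (a stand-in for a per-segment rate-distortion cost such as distortion plus a Lagrange multiplier times rate) are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple


def optimal_segmentation(
    n_units: int,
    min_len: int,
    max_len: int,
    cost: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split [0, n_units) into segments with lengths in [min_len, max_len]
    so that the sum of cost(start, end) over all segments is minimal.

    `cost` is a hypothetical per-segment cost supplied by the caller.
    """
    best = [math.inf] * (n_units + 1)  # best[e]: minimal cost to cover [0, e)
    back = [-1] * (n_units + 1)        # back[e]: start of the last segment
    best[0] = 0.0

    for end in range(1, n_units + 1):
        for length in range(min_len, max_len + 1):
            start = end - length
            if start < 0:
                break  # longer segments would begin before the signal
            if best[start] == math.inf:
                continue  # [0, start) cannot be covered with valid segments
            candidate = best[start] + cost(start, end)
            if candidate < best[end]:
                best[end] = candidate
                back[end] = start

    if best[n_units] == math.inf:
        raise ValueError("no segmentation satisfies the length constraints")

    # Trace the chosen boundaries back from the end of the signal.
    segments: List[Tuple[int, int]] = []
    end = n_units
    while end > 0:
        start = back[end]
        segments.append((start, end))
        end = start
    segments.reverse()
    return best[n_units], segments


def toy_cost(start: int, end: int) -> float:
    """Illustrative cost only: a fixed per-segment overhead plus a large
    penalty when the segment straddles a hypothetical onset at unit 5."""
    onset = 5
    straddles_onset = start < onset < end
    return 1.0 + (10.0 if straddles_onset else 0.0)


if __name__ == "__main__":
    total, segments = optimal_segmentation(12, 2, 8, toy_cost)
    print(total, segments)
```

In this toy setup the length bounds shrink the set of admissible partitionings, yet a boundary can still be placed exactly at the onset, which mirrors the trade-off described in the abstract. A real coder would replace `toy_cost` with a measured perceptual distortion and bit cost for each candidate segment.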