Predictive coding has justifiably become a highly influential theory in neuroscience. However, the possibility that it is unfalsifiable has been raised. We argue that unfalsifiability would indeed be a problem for predictive coding, but that there are patterns of behavioural and neuroimaging data that would count against the theory. Contra (vanilla) predictive patterns are those in which the more expected stimulus generates the largest evoked response. Basic formulations of predictive coding, however, mandate that an expected stimulus should generate little, if any, prediction error and thus little, if any, evoked response. It has nevertheless been argued that contra (vanilla) predictive patterns can be obtained if precision is higher for expected stimuli. Certainly, precision can be used to increase the amplitude of an evoked response, turning a predictive pattern into a contra (vanilla) predictive one. We demonstrate that, while this is true, it does not present an absolute barrier to falsification, because increasing precision also reduces the latency and increases the frequency of the response. These properties can be used to determine whether precision-weighting justifiably explains a contra (vanilla) predictive pattern, ensuring that predictive coding is falsifiable.
As developed by Atal and Schroeder [1], conventional adaptive predictive coding (APC) of speech employs both vocal tract and pitch prediction to achieve a low-energy, spectrally flattened residual. Errors in the pitch predictor can result in clipping errors, which can propagate through the system for relatively long periods and degrade the quality of the synthesized speech. Makhoul and Berouti [2] have developed a high-quality 16 kbps APC system that eliminates the pitch predictor by using a multi-level variable-rate quantizer. To achieve comparable quality at even lower data rates, a split-band APC (SBAPC) structure is proposed that employs the multi-level quantizer on the low-frequency portion of the residual and a 1-bit quantizer on the high-frequency portion.
A new form of predictive quantization, dubbed quantized predictive coding (QPC), is presented. One version uses variable-length coding and another uses fixed-length coding. For autoregressive sources, it is shown both analytically and experimentally that the performance of QPC with (without) entropy coding is essentially the same as that of DPCM (differential pulse code modulation) with uniform quantization with (without) entropy coding. The principal advantage of QPC is that, unlike DPCM, it is designed a priori for digital implementation. As a result, its digital implementation is simpler than that of DPCM and suffers no loss in performance due to round-off errors.
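For readers unfamiliar with the DPCM baseline that QPC is compared against, the following is a minimal sketch of closed-loop DPCM with a uniform quantizer on a first-order autoregressive source. It is not the QPC scheme from the abstract; the function names, the predictor coefficient `a`, and the step size `step` are illustrative assumptions.

```python
import random


def uniform_quantize(value, step):
    """Mid-tread uniform quantizer: round to the nearest multiple of step."""
    return step * round(value / step)


def ar1_source(n, a, seed=0):
    """Generate n samples of x[t] = a * x[t-1] + w[t] with unit-variance Gaussian w."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def dpcm(samples, a, step):
    """Closed-loop DPCM: predict from the previous reconstruction, quantize the residual.

    Returns (quantized_residuals, reconstruction). The decoder can rebuild the
    reconstruction from the quantized residuals alone, since the predictor only
    uses past reconstructed values.
    """
    residuals, reconstruction = [], []
    previous = 0.0
    for x in samples:
        prediction = a * previous
        residual_q = uniform_quantize(x - prediction, step)
        previous = prediction + residual_q
        residuals.append(residual_q)
        reconstruction.append(previous)
    return residuals, reconstruction
```

Because the predictor runs on reconstructed values, the reconstruction error at each sample equals the quantization error of the residual, so it stays within half a step and does not accumulate.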
ISBN: 0780385543 (print)
Theoretical analysis of differential predictive coding (DPC) has, for reasons of tractability, focused almost exclusively on scalar quantizers and the high-rate regime. As a result, the role of noncausal decoding in improving quality has been largely ignored in the literature. In this work we conduct a rigorous performance analysis of DPC-based schemes under a simple independent, vector-Gaussian, AR-1 source model and large-block (as opposed to high-rate) asymptotics. This analysis reveals that noncausal decoding can offer a significant relative improvement in the mean squared error (by as much as 3 dB) at medium to low rates (0.1-0.5 bit per sample) for sources with strong temporal correlation. Furthermore, most of this relative improvement can be attained with modest decoder latency. At very high and very low rates, the gains are negligible.
We consider ADPCM coding techniques for speech and voiceband data signals that do not require previous identification of the type of signal to be encoded. After a review of the frequently conflicting requirements for prediction and quantization of these two types of signals, coders with backward adaptive prediction and backward (AQB) or forward (AQF) quantizer adaptation are discussed. We find that ADPCM-AQF possesses significant performance advantages relative to ADPCM-AQB for both speech and voiceband data signals at the cost of additional signal delay. 4800 b/s voiceband data can be transmitted satisfactorily at 32 kb/s even with multiple stages of tandeming and infrequent transmission errors, but 40 kb/s are required for reliable 9600 b/s voiceband data transmission with intermediate analog conversion.
The feasibility and performance of an embedded RPE (ERPE) scheme based on multistage coding are investigated. The coding efficiency of the second and subsequent stages depends on the spectral envelope difference between the original speech and the error signal at each stage, whereas re-use of LPC parameters derived from the original speech depends on the corresponding LPC spectral difference. Suitable measures of spectral difference are defined, and simulation shows that both decrease with the perceptual weighting factor. The ERPE system requires little extra coding complexity and can be simplified further by using a partial phase adaptation procedure with marginal loss of SNR performance. The simulated ERPE system shows graceful reduction of reconstructed speech quality for bit rates from 14.8 to 6.4 kb/s in 4.2 kb/s steps.
An adaptive predictive coding system with adaptive bit allocation (APC-AB) is presented for speech encoding at low to medium bit rates (6.4-24 kb/s). In this system, a split-band predictive coding scheme and a bit allocation scheme are employed to remove the redundancies due to the periodic concentration of the prediction residual energy and the nonuniform nature of the speech spectrum. Quantization bits are dynamically allocated both over the sub-bands (frequency domain) and over the subintervals (time domain) in accordance with the distribution of the residual energies in the time-frequency domain. The optimum bit allocation is derived from the mean square error criterion on the speech waveform, and the SNR gain is presented in relation to the prediction gain of the full-band signal. The system is evaluated in terms of segmental SNR and speech quality. The results show that the APC-AB system has an advantage over the conventional full-band APC system in segmental SNR and in the stability of the prediction loop. They also show that the system can provide speech quality subjectively equivalent to 7-bit log-PCM at 16 kb/s and to 6-bit log-PCM at 9.6 kb/s.
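As an illustration of energy-driven bit allocation under a mean square error criterion, the sketch below applies the classical high-resolution rule, in which each band receives the average bit budget plus half the base-2 logarithm of its variance relative to the geometric mean of all variances. This is a textbook rule used here as an assumption; it is not necessarily the allocation derived in the paper, and it ignores the non-negativity and integer constraints a real coder would impose. The function name and the example variances are hypothetical.

```python
import math


def allocate_bits(variances, average_bits):
    """Classical high-resolution bit allocation for minimum total MSE.

    bits[k] = average_bits + 0.5 * log2(variances[k] / geometric_mean(variances))

    The result may contain fractional or negative values; clamping and
    rounding are deliberately left out of this sketch.
    """
    log_variances = [math.log2(v) for v in variances]
    mean_log = sum(log_variances) / len(log_variances)
    return [average_bits + 0.5 * (lv - mean_log) for lv in log_variances]


# Hypothetical residual energies for four sub-bands, 2 bits per band on average.
example_bits = allocate_bits([16.0, 4.0, 1.0, 0.25], 2.0)
```

The allocations always sum to the total budget (number of bands times `average_bits`), with higher-energy bands receiving more bits.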
An alternative to scalable predictive coding of first-order Gauss-Markov processes is proposed in this paper. It is shown that conventional scalable predictive coding is inherently suboptimal. An alternative that achieves the rate-distortion performance of predictive coding for first-order Gauss-Markov processes is then proposed. The approach is posed as a variant of the well-known Wyner-Ziv (1976) problem. By using coset codes with nested lattices, the paper proves that the proposed approach achieves the predictive coding bound asymptotically at all scales while simultaneously providing the functionality of scalable coding.
Hybrid predictive/transform coding is studied. The usual formulation is to first apply a unitary transform and then code the transform coefficients with independent DPCM coders, i.e., the prediction is performed in the transform domain. This structure is compared to spatial domain prediction, where a difference signal is formed in the spatial domain and then coded by a transform coder. A linear spatial domain predictor which minimizes the mean square prediction error also minimizes the mean square of each transform coefficient. The two structures are equivalent if the transform domain prediction scheme is extended to a more general predictor. Hence, the structure that gives the easiest implementation can be chosen. The spatial domain structure is preferred for motion compensation and for line interlaced video signals. Interframe hybrid coding experiments are performed on interlaced videophone scenes using an adaptive transform coder. Motion compensation gives a rate reduction of 25-35 percent compared to frame difference prediction with the same mean square error. The subjective advantage is even greater, since the "dirty window" effect is not present with motion compensation. It is important to perform the motion estimation with fractional pel accuracy. Field coding with a switched predictor using previous field in moving areas is an interesting alternative to frame coding with frame difference prediction.
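One way to see why the two hybrid structures can coincide is that a linear transform commutes with subtraction: transforming a spatial-domain difference gives the same coefficients as differencing in the transform domain, provided the transform-domain predictor is the transform of the spatial prediction. The sketch below checks this numerically with an orthonormal DCT-II, chosen here only as an example of a unitary transform; the block values and function names are illustrative assumptions.

```python
import math


def dct_matrix(n):
    """Rows of the orthonormal n-point DCT-II matrix."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def transform(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


T = dct_matrix(4)
block = [10.0, 12.0, 9.0, 7.0]        # hypothetical current samples
prediction = [9.5, 11.0, 9.5, 8.0]    # hypothetical spatial prediction

# Spatial-domain structure: form the difference first, then transform it.
spatial_first = transform(T, [x - p for x, p in zip(block, prediction)])

# Transform-domain structure: transform both signals, then take the difference.
transform_first = [c - d for c, d in zip(transform(T, block),
                                         transform(T, prediction))]
```

The two coefficient lists agree up to floating-point rounding, and because the transform is orthonormal the mean square prediction error is the same in both domains.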
ISBN: 9810475241 (print)
We present a new architecture called the Modular Neural Predictive Coding architecture (Modular NPC). This architecture is used for speech discriminant feature extraction (DFE). We present an application of the Modular NPC architecture to a phoneme recognition task. The phonemes, which are extracted from the DARPA-TIMIT speech database, are vowels and the /b/-/d/-/g/ and /p/-/t/-/k/ phonemes. Comparisons with other coding methods (LPC, MFCC, PLP) are presented.