A major source of audible distortion in current low-bit-rate speech coding algorithms is an inaccurately reproduced degree of periodicity in the voiced speech signal. If the correlations between neighboring pitch cycles are accurately reproduced, these audible distortions can be reduced significantly. To this end, a novel method of coding voiced speech is introduced, which transmits an encoded prototype waveform at 20-30 ms intervals. The prototype waveform describes a pitch cycle representative of the interval and is quantized using analysis-by-synthesis methods. The speech signal is reconstructed by concatenating interpolated prototype waveforms. The short-term and long-term correlations between pitch cycles can be controlled explicitly. Speech reconstructed from unquantized prototypes is virtually indistinguishable from the original signal. The method results in excellent speech quality at rates between 3.0 and 4.0 kb/s.
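The reconstruction step described above (concatenating interpolated prototype waveforms) can be sketched as follows. This is only an illustration under strong simplifying assumptions that are not in the abstract: both prototypes have the same length (constant pitch), interpolation is linear, and it is applied once per cycle rather than continuously. The function name `interpolate_cycles` is hypothetical.

```python
def interpolate_cycles(proto_a, proto_b, n_cycles):
    """Concatenate n_cycles pitch cycles that morph linearly from proto_a to proto_b.

    Assumption: both prototypes have equal length, i.e. the pitch period
    does not change over the interval.
    """
    if len(proto_a) != len(proto_b):
        raise ValueError("this sketch assumes equal-length prototypes")
    signal = []
    for k in range(n_cycles):
        # Interpolation weight runs from 0 (first cycle) to 1 (last cycle).
        w = k / (n_cycles - 1) if n_cycles > 1 else 0.0
        signal.extend((1.0 - w) * a + w * b for a, b in zip(proto_a, proto_b))
    return signal


if __name__ == "__main__":
    print(interpolate_cycles([0.0, 1.0], [2.0, 3.0], 3))
```

Because consecutive cycles differ only by a small change in the interpolation weight, the similarity between neighboring pitch cycles is set directly by how far apart the two prototypes are, which is the kind of explicit control over periodicity that the abstract refers to.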
ISBN (print): 9781467390064
Predictive coding, once used in only a small fraction of legal and business matters, is now widely deployed to quickly cull increasingly vast amounts of data and reduce the need for costly and inefficient human document review. Previously, the sole front-end input used to create a predictive model was the exemplar documents (training data) chosen by subject-matter experts. Many predictive coding tools require users to rely on static preprocessing parameters and a single machine learning algorithm to develop the predictive model. Little research has been published on the impact that preprocessing parameters and learning algorithms have on the effectiveness of the technology. A closer look at how a predictive model is generated shows that the settings and algorithm can have a strong effect on the accuracy and efficacy of a predictive coding tool. Understanding how these input parameters affect the output will give legal teams the information they need to implement predictive coding as efficiently and effectively as possible. This paper applies different preprocessing parameters and algorithms to multiple real-world data sets to understand the influence of the various approaches.
In this paper, an online differential compression algorithm with reset columns, integrated with a digital pixel sensor (DPS) array, is proposed. The proposed sensor array architecture reduces the silicon area of the DPS by more than half by sampling and storing the differential values between each pixel and its prediction. These values have a compressed dynamic range and hence require limited precision (typically 2-3 bits, compared to 8-bit full precision). A column-based reset technique is proposed to overcome the error accumulation problem inherent in predictive coding. While the concept of predictive coding has been discussed extensively in the literature, this is the first time it has been used to reduce the storage requirement at the pixel level and hence drastically improve both the pixel size and the fill-factor, a key problem in DPS implementation. System-level simulation results show the importance of the proposed reset scheme, while VLSI implementation results illustrate a pixel-level implementation of the whole predictive coding scheme, featuring a pixel size reduction of more than 40% with a fill-factor of more than 15%.
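A minimal software sketch of the idea, not the paper's circuit: each pixel is predicted from its reconstructed left neighbor, the difference is clipped to a few bits, and every `reset_period`-th column stores the full-precision value so that clipping errors cannot propagate past the next reset column. The left-neighbor predictor, the clipping rule, the parameter names, and the function names are all assumptions made for illustration.

```python
def encode_row(row, bits=3, reset_period=8):
    """Encode one row of pixels as (is_reset, value) pairs.

    Reset columns carry the full pixel value; all other columns carry a
    difference clipped to a signed `bits`-bit range.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    encoded, recon = [], 0
    for col, pixel in enumerate(row):
        if col % reset_period == 0:
            encoded.append((True, pixel))   # full-precision reset column
            recon = pixel
        else:
            diff = max(lo, min(hi, pixel - recon))  # limited-precision residual
            encoded.append((False, diff))
            recon += diff                   # track what the decoder will see
    return encoded


def decode_row(encoded):
    """Rebuild the row by accumulating differences between reset columns."""
    out, recon = [], 0
    for is_reset, value in encoded:
        recon = value if is_reset else recon + value
        out.append(recon)
    return out


if __name__ == "__main__":
    row = [10, 12, 20, 40, 41, 39, 38, 37, 100, 101]
    print(decode_row(encode_row(row, bits=3, reset_period=4)))
```

In the example row, the sharp edge at columns 2 and 3 exceeds the 3-bit range and is reconstructed with an error, but the reset at column 4 restores the exact value, so the error stays confined to one segment.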
The Predictive Coding Network (PCN) is an important neural network inspired by visual processing models in neuroscience. It combines feedforward and feedback processing and has the architecture of a recurrent neural network (RNN). This type of network is usually trained with backpropagation through time (BPTT). With infinitely many recurrent steps, a PCN is a dynamical system. However, stability, one of the most important properties of such systems, is rarely studied for this type of network. Inspired by reservoir computing, we investigate the stability of hierarchical RNNs from the perspective of dynamical systems and propose a sufficient condition for their echo state property (ESP). Our study shows that global stability is determined by the stability of the local layers and the feedback between neighboring layers. Based on this result, we further propose Weight Norm Supervision, a new algorithm that controls the stability of PCN dynamics by imposing different weight norm constraints on different parts of the network. We compare our approach with other training methods in terms of stability and prediction capability. The experiments show that our algorithm learns stable PCNs with reliable prediction precision in the most effective and controllable way.
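The abstract does not spell out the constraints used by Weight Norm Supervision, so the following is not that algorithm. It is only a generic sketch of what a weight norm constraint can look like: the spectral norm of a weight matrix is estimated by power iteration, and the matrix is rescaled when the norm exceeds a bound. For a single recurrent layer with a 1-Lipschitz activation such as tanh, a spectral norm below 1 is a well-known sufficient condition for the echo state property. The function names and the bound of 0.9 are hypothetical.

```python
import math


def spectral_norm(W, iters=100):
    """Estimate the largest singular value of W by power iteration on W^T W."""
    rows, cols = len(W), len(W[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iters):
        u = [sum(W[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(W[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        sigma = math.sqrt(norm)  # ||W^T W v|| approaches sigma_max squared
    return sigma


def clip_spectral_norm(W, bound=0.9):
    """Return W rescaled so that its spectral norm does not exceed `bound`."""
    sigma = spectral_norm(W)
    if sigma <= bound:
        return [row[:] for row in W]
    scale = bound / sigma
    return [[scale * x for x in row] for row in W]


if __name__ == "__main__":
    W = [[2.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(W), spectral_norm(clip_spectral_norm(W)))
```

In a hierarchical network, one would presumably apply separate bounds to the within-layer weights and to the feedback weights between layers, in line with the abstract's statement that both determine global stability.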
Deep-predictive-coding networks (DPCNs) are hierarchical, generative models. They rely on feed-forward and feedback connections to modulate latent feature representations of stimuli in a dynamic and context-sensitive manner. A crucial element of DPCNs is a forward-backward inference procedure to uncover sparse, invariant features. However, this inference is a major computational bottleneck. It severely limits the network depth due to learning stagnation. Here, we prove why this bottleneck occurs. We then propose a new forward-inference strategy based on accelerated proximal gradients. This strategy has faster theoretical convergence guarantees than the one used for DPCNs. It overcomes learning stagnation. We also demonstrate that it permits constructing deep and wide predictive-coding networks. Such convolutional networks implement receptive fields that capture well the entire classes of objects on which the networks are trained. This improves the feature representations compared with our lab's previous nonconvolutional and convolutional DPCNs. It yields unsupervised object recognition that surpasses convolutional autoencoders and is on par with convolutional networks trained in a supervised manner.
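To illustrate what an accelerated proximal gradient method looks like, here is a textbook FISTA iteration for the sparse coding problem of minimizing 0.5 * ||y - D x||^2 + lam * ||x||_1. It is not the paper's inference procedure, which involves the DPCN state and cause variables; the dictionary `D`, the penalty `lam`, and the use of the squared Frobenius norm as a step-size bound are assumptions chosen to keep the sketch self-contained.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fista(D, y, lam, iters=200):
    """Accelerated proximal gradient for 0.5*||y - D x||^2 + lam*||x||_1."""
    m, n = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    L = sum(d * d for row in D for d in row)
    x = [0.0] * n
    z = x[:]          # extrapolated point
    t = 1.0           # momentum parameter
    for _ in range(iters):
        resid = [sum(D[i][j] * z[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(D[i][j] * resid[i] for i in range(m)) for j in range(n)]
        x_new = [soft_threshold(z[j] - grad[j] / L, lam / L) for j in range(n)]
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = [x_new[j] + ((t - 1.0) / t_new) * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(fista(identity, [3.0, 0.5], lam=1.0))
```

The momentum step on `z` is what separates this from plain proximal gradient descent and gives the O(1/k^2) objective convergence rate, compared with O(1/k) without acceleration.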
An adaptive predictive coding scheme with dynamic quantization adjustment (APC-DQA) is proposed for speech coding at 16 kbit/s. It aims to reduce processing delay and hardware complexity while attaining "toll quality". The proposed scheme uses time-domain processing to lessen the processing delay. Moreover, it employs backward processing in both prediction and quantization, which requires less side information than forward processing. It also incorporates adaptive bit allocation in sub-intervals of each frame so as to remove redundancies due to the periodic concentration of the prediction residual energy. The performance evaluation results show that the processing delay of APC-DQA is two-thirds that of an adaptive predictive coding scheme with adaptive bit allocation (APC-AB) [1]. Moreover, its hardware complexity is approximately 70% to 80% of that of APC-AB. It was also shown that this scheme can provide speech quality subjectively equivalent to 6.6-bit log-PCM.
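The point that backward processing needs less side information can be illustrated with a classic backward-adaptive (Jayant-style) quantizer: the step size is updated from the previously transmitted code alone, so the decoder can track it without receiving it. This is a generic sketch, not the APC-DQA quantizer; the 2-bit code range, the multipliers 0.8 and 1.6, and the step limits are hypothetical values.

```python
import math

MIN_STEP, MAX_STEP = 0.01, 100.0  # hypothetical limits on the step size


def _next_step(step, code):
    """Grow the step after an outer level, shrink it after an inner level."""
    multiplier = 1.6 if code in (-2, 1) else 0.8
    return max(MIN_STEP, min(MAX_STEP, step * multiplier))


def encode(samples, step=1.0):
    """Quantize samples to 2-bit codes in {-2, -1, 0, 1}."""
    codes = []
    for x in samples:
        code = max(-2, min(1, math.floor(x / step)))
        codes.append(code)
        step = _next_step(step, code)  # uses only what the decoder also sees
    return codes


def decode(codes, step=1.0):
    """Reconstruct samples; the step size is re-derived from the codes alone."""
    out = []
    for code in codes:
        out.append((code + 0.5) * step)  # mid-rise reconstruction level
        step = _next_step(step, code)
    return out


if __name__ == "__main__":
    codes = encode([0.2, 3.0, 3.0, 3.0])
    print(codes, decode(codes))
```

Only the codes cross the channel. A forward-adaptive design would instead compute the step size from a buffered block of input and transmit it as side information, which also adds buffering delay.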
ISBN (print): 9781424413973; 1424413974
This paper investigates distributed predictive coding of correlated sources with memory, which are communicated to a central receiver. This is the setting typically encountered in sensor networks. While source memory may be exploited by distributed coding of large source blocks (vectors), the growth in complexity (and delay) is often unacceptable in practice, hence the interest in a low-complexity predictive approach. We first consider the inherent "conflict" between distributed and predictive coding due to the impact of distributed quantization on the prediction loop. This is coupled with the effects of closed-loop prediction, which destabilize standard Lloyd-like code design methods. An iterative algorithm is derived that optimizes the overall system while imposing zero decoder drift due to distributed quantization. The approach circumvents the convergence and stability issues of traditional predictive quantizer design by employing an "asymptotic closed loop" framework adapted for distributed predictive system design. The scheme efficiently exploits both temporal and inter-source correlations and subsumes, as extreme special cases, both separate source predictive coding and distributed coding of memoryless correlated sources.
Adaptive predictive coding of speech signals at bit rates lower than 10 kbits/sec often requires the use of 2-level (1 bit) quantization of the samples of the prediction residual. Such a coarse quantization of the prediction residual can produce audible quantizing noise in the reproduced speech signal at the receiver. This paper describes a new method of quantization for improving the speech quality. The improvement is obtained by center clipping the prediction residual and by fine quantization of the high-amplitude portions of the prediction residual. The threshold of center clipping is adjusted to provide encoding of the prediction residual at a specified bit rate. This method of quantization not only improves the speech quality by accurate quantization of the prediction residual when its amplitude is large but also allows encoding of the prediction residual at bit rates below 1 bit/sample.
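A minimal sketch of center clipping as described above: residual samples whose magnitude is at or below a threshold are set to zero, and the remaining high-amplitude samples are quantized with a fine uniform step. The abstract does not say how the threshold is tied to the bit rate, so the helper below uses a stand-in rule (keep at most a given fraction of samples non-zero); that rule, the step size, and the function names are assumptions.

```python
def center_clip_quantize(residual, threshold, step):
    """Zero out low-amplitude samples; finely quantize the rest."""
    out = []
    for r in residual:
        if abs(r) <= threshold:
            out.append(0.0)                      # center-clipped sample
        else:
            out.append(step * round(r / step))   # fine uniform quantization
    return out


def threshold_for_fraction(residual, keep_fraction):
    """Pick a threshold that leaves at most keep_fraction of samples non-zero.

    Stand-in for bit-rate control: fewer surviving samples means fewer bits.
    """
    mags = sorted((abs(r) for r in residual), reverse=True)
    keep = int(keep_fraction * len(mags))
    if keep >= len(mags):
        return 0.0
    return mags[keep]


if __name__ == "__main__":
    residual = [0.1, -0.2, 0.9, -1.3, 0.05, 0.3, -0.15, 0.6]
    thr = threshold_for_fraction(residual, 0.25)
    print(thr, center_clip_quantize(residual, thr, step=0.25))
```

Since most output samples are zero, an entropy or run-length code can represent them with well under one bit per sample on average, which is how a scheme of this kind can operate below 1 bit/sample.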
The dichotomy between the challenging nature of obtaining annotations for activities and the more straightforward nature of data collection from wearables has resulted in significant interest in the development of techniques that utilize large quantities of unlabeled data for learning representations. Contrastive predictive coding (CPC) is one such method, learning effective representations by leveraging properties of time-series data to set up a contrastive future timestep prediction task. In this work, we propose enhancements to CPC by systematically investigating the encoder architecture, the aggregator network, and the future timestep prediction, resulting in a fully convolutional architecture. Across sensor positions and activities, our method shows substantial improvements on four of six target datasets, demonstrating its ability to empower a wide range of application scenarios. Further, in the presence of very limited labeled data, our technique significantly outperforms both supervised and self-supervised baselines, positively impacting situations where collecting only a few seconds of labeled data may be possible. This is promising, as CPC does not require specialized data transformations or reconstructions for learning effective representations.
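The contrastive future timestep prediction task in CPC is commonly trained with the InfoNCE loss, sketched below for a single prediction. The sketch assumes that `prediction` is the context vector already mapped to the future step and that similarity is a plain dot product; the encoder, the aggregator, and the function name `info_nce` are not taken from this paper.

```python
import math


def info_nce(prediction, candidates, positive_index):
    """InfoNCE loss for one prediction.

    `candidates` holds the true future embedding (at positive_index) and
    negatives drawn from other timesteps or windows.
    """
    scores = [sum(p * c for p, c in zip(prediction, cand)) for cand in candidates]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum - scores[positive_index]


if __name__ == "__main__":
    pred = [1.0, 0.0]
    cands = [[5.0, 0.0], [0.0, 1.0]]
    print(info_nce(pred, cands, 0), info_nce(pred, cands, 1))
```

The loss is small when the prediction scores the true future embedding higher than the negatives, and it equals log(N) when all N candidates are indistinguishable. No labels, hand-designed transformations, or reconstructions are needed, which matches the abstract's closing remark.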