Binocular rivalry occurs when the eyes are presented with different stimuli and subjective perception alternates between them. Though recent years have seen a number of models of this phenomenon, the mechanisms behind binocular rivalry are still debated and we still lack a principled understanding of why a cognitive system such as the brain should exhibit this striking kind of behaviour. Furthermore, psychophysical and neurophysiological (single cell and imaging) studies of rivalry are not unequivocal and have proven difficult to reconcile within one framework. This review takes an epistemological approach to rivalry that considers the brain as engaged in probabilistic unconscious perceptual inference about the causes of its sensory input. We describe a simple empirical Bayesian framework, implemented with predictive coding, which seems capable of explaining binocular rivalry and reconciling many findings. The core of the explanation is that selection of one stimulus, and subsequent alternation between stimuli in rivalry, occur when: (i) there is no single model or hypothesis about the causes in the environment that enjoys both high likelihood and high prior probability and (ii) when one stimulus dominates, the bottom-up driving signal for that stimulus is explained away while, crucially, the bottom-up signal for the suppressed stimulus is not, and remains as an unexplained but explainable prediction error signal. This induces instability in perceptual dynamics that can give rise to perceptual transitions or alternations during rivalry. (C) 2008 Elsevier B.V. All rights reserved.
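Condition (i) can be made concrete with a toy Bayesian calculation. The sketch below is only an illustration under invented numbers: the hypothesis names, priors and likelihoods are hypothetical and are not taken from the review. It assumes a blended percept would fit the two-eye input well but is improbable a priori, while each single-stimulus hypothesis is plausible a priori but accounts for only one eye's input.

```python
def posterior(priors, likelihoods):
    """Normalise prior * likelihood over a discrete set of hypotheses."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: value / total for h, value in unnormalised.items()}


# Hypothetical numbers, chosen only to illustrate condition (i).
priors = {"stimulus_a": 0.45, "stimulus_b": 0.45, "blend": 0.10}
likelihoods = {"stimulus_a": 0.30, "stimulus_b": 0.30, "blend": 0.90}

post = posterior(priors, likelihoods)
for hypothesis, probability in post.items():
    print(f"{hypothesis}: {probability:.3f}")
```

With these made-up values no hypothesis combines a high prior with a high likelihood, and the two single-stimulus hypotheses tie, which is the kind of situation in which the review argues that perception becomes unstable.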
ISBN: 9781728124858 (print)
Predictive coding can be regarded as a function which reduces the error between an input signal and a top-down prediction. If reducing the error is equivalent to reducing the influence of stimuli from the environment, predictive coding can be regarded as stimulation avoidance by prediction. Our previous studies showed that action and selection for stimulation avoidance emerge in spiking neural networks through spike-timing dependent plasticity (STDP). In this study, we demonstrate that spiking neural networks with random structure spontaneously learn to predict temporal sequences of stimuli based solely on STDP.
ISBN: 0818680555 (print)
We propose a new approach to document image layout extraction using rapid feature analysis, preclassification and predictive coding. First, a set of layout features is used to render the image profile information. A knowledge base is then used to classify these early regions into layout labels by rule. The regions found are given a classification tag and a degree of membership in the background, text, picture and line-drawing classes. A predictive coding method is used with the preclassification information to raise the confidence of each label and to integrate the regional domain and the labels into a uniform class without any shape assumption. We have tested our technique on three different databases that comprise over 1000 document images. The results show a high degree of confidence in region separation and extraction. The main benefits include robust classification, shape independence and rapid computation.
This paper describes a 15/30 Mbit/s TV codec with a new approach to high-efficiency coding for TV signals, i.e., median adaptive prediction. The 15/30 Mbit/s codec, applicable to NTSC, PAL, and SECAM (525/60 and 625/50) systems alike, uses adaptive prediction incorporating a motion-compensated interframe, an interfield, and an intrafield predictor. Its performance for digital transmission is presented. This universal codec is designed on the basis of CCIR recommendations concerning digital TV coding parameters for studios (Rec. 601) and general principles on long-distance digital TV transmission (Rec. 604). A field trial of 15 Mbit/s digital TV transmission using this codec between earth stations with a 30 m diameter antenna and a 5 m diameter antenna is reported.
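As a rough illustration of median adaptive prediction, the sketch below assumes that the three candidate predictions named in the abstract (motion-compensated interframe, interfield, intrafield) are available for each sample and that the predictor outputs their median. The function and parameter names are hypothetical, and the actual codec may differ in detail.

```python
def median_adaptive_predict(interframe, interfield, intrafield):
    """Return the median of the three candidate predictions for one sample."""
    return sorted((interframe, interfield, intrafield))[1]


def prediction_residual(sample, interframe, interfield, intrafield):
    """Residual that would be quantised and transmitted for this sample."""
    return sample - median_adaptive_predict(interframe, interfield, intrafield)


# Hypothetical 8-bit sample values for a single pixel.
residual = prediction_residual(105, interframe=100, interfield=120, intrafield=90)
print(residual)
```

Taking the median needs no side information: the decoder can form the same three predictions from already decoded samples and so arrives at the same choice.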
We review autoregressive predictive coding (APC), an approach to learning speech representations by predicting a future frame given the past frames. We present three different views of interpreting APC, and provide a historical account of the approach. To study the speech representation learned by APC, we use common speech tasks, such as automatic speech recognition and speaker verification, to demonstrate the utility of the learned representation. In addition, we design a suite of fine-grained tasks, including frame classification, segment classification, fundamental frequency tracking, and duration prediction, to probe the phonetic and prosodic content of the representation. The three views of the APC objective welcome various generalizations and algorithms to learn speech representations. Probing on the suite of fine-grained tasks suggests that APC makes a wide range of high-level speech information accessible in its learned representation.
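The prediction objective can be sketched in a few lines. The example below assumes an L1 distance between the predicted frame and the frame `shift` steps ahead, which the abstract does not state. The `predict` callable stands in for the autoregressive network, and `repeat_last_frame` is a hypothetical baseline used only to make the example runnable.

```python
def apc_l1_loss(frames, predict, shift=1):
    """Mean absolute error between predicted and actual future frames.

    frames:  list of feature vectors (lists of floats), one per time step
    predict: callable mapping the history frames[:t + 1] to a predicted frame
    shift:   how many steps ahead the target frame lies
    """
    if len(frames) <= shift:
        raise ValueError("need more frames than the prediction shift")
    total, count = 0.0, 0
    for t in range(len(frames) - shift):
        predicted = predict(frames[: t + 1])
        target = frames[t + shift]
        total += sum(abs(p - y) for p, y in zip(predicted, target))
        count += len(target)
    return total / count


def repeat_last_frame(history):
    """Hypothetical baseline predictor: copy the most recent frame."""
    return history[-1]


frames = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
loss = apc_l1_loss(frames, repeat_last_frame, shift=1)
print(loss)
```

A trained model would replace `repeat_last_frame`, and its hidden states, not its predictions, would serve as the learned representation.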
Self-supervised methods such as Contrastive Predictive Coding (CPC) have greatly improved the quality of unsupervised representations. These representations significantly reduce the amount of labeled data needed for downstream task performance, such as automatic speech recognition. CPC learns representations by learning to predict future frames given current frames. Based on the observation that the acoustic information, e.g., phones, changes more slowly than the feature extraction rate in CPC, we propose regularization techniques that impose slowness constraints on the features. Here we propose two regularization techniques: Self-expressing constraint and Left-or-Right regularization. We evaluate the proposed model on ABX and linear phone classification tasks, acoustic unit discovery, and automatic speech recognition. The regularized CPC trained on 100 hours of unlabeled data matches the performance of the baseline CPC trained on 360 hours of unlabeled data. We also show that our regularization techniques are complementary to data augmentation and can further boost the system's performance. In monolingual, cross-lingual, or multilingual settings, with/without data augmentation, regardless of the amount of data used for training, our regularized models outperformed the baseline CPC models on the ABX task.
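The abstract does not spell out the training loss. For orientation, the sketch below shows the InfoNCE objective commonly associated with CPC, in which the score of the true future frame is contrasted against scores of negative samples. It covers only the baseline contrastive term, not the two proposed regularizers, and the function name and score values are illustrative assumptions.

```python
import math


def info_nce(positive_score, negative_scores):
    """Negative log-probability of the positive under a softmax over all scores."""
    scores = [positive_score] + list(negative_scores)
    peak = max(scores)  # subtract the maximum for numerical stability
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_denominator - positive_score


# Hypothetical similarity scores between a prediction and candidate frames.
print(info_nce(2.0, [0.5, -1.0, 0.0]))
```

The loss falls as the positive score rises relative to the negatives, which pushes the predictor to distinguish the true future frame from distractors.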
In this paper, we propose adversarial predictive coding (APC), a novel method for detecting abnormal events. Abnormal event detection (AED) aims to identify events that were not observed in a given training dataset consisting of normal events, and it is considered one of the most important objectives in developing intelligent surveillance systems. Given videos and motion flows of normal events, the APC derives a normal event model by applying an adversarial prediction approach on the jointly learnt latent feature space from the videos and motion flows. Since the latent space requires more abstract and noise-free information than the raw data space, the APC can derive a more discriminative model for normal events compared with other deep learning-based AED methods which directly apply uni-modal losses such as mean square error and cross-entropy to low-level data such as video frames. We demonstrate the effectiveness of our method in detecting abnormal events using UCSD-Ped, Avenue, and UCF-Crime datasets. The experimental results show that the APC surpasses the existing state-of-the-art AED methods by deriving a more discriminative model for normal events. (c) 2021 Published by Elsevier Inc.
The 3D video extension of High Efficiency Video Coding (3D-HEVC) exploits texture-depth redundancies in 3D videos using intercomponent coding tools. It also inherits the same quadtree coding structure as HEVC for both components. The current software implementation of 3D-HEVC includes encoder shortcuts that speed up the quadtree construction process, but those are always accompanied by coding losses. Furthermore, since the texture and its associated depth represent the same scene, at the same time instant and view point, their quadtrees are closely linked. In this paper, an intercomponent tool is proposed in which this link is exploited to save both runtime and bits through a joint coding of the quadtrees. If depth is coded before the texture, the texture quadtree is initialized from the coded depth quadtree. Otherwise, the depth quadtree is limited to the coded texture quadtree. A 31% encoder runtime saving, a -0.3% gain for coded and synthesized views and a -1.8% gain for coded views are reported for the second method.
In this paper, we propose a novel, discrete wavelet transform (DWT) domain implementation of our previously proposed, pioneering block-based disparity compensated predictive coding algorithm for stereo image compression. Under the present research context we perform predictive coding in the form of pioneering block search in the sub-band domain. The resulting transform domain predictive error image is subsequently converted to a so-called wavelet-block representation, before being quantized and entropy coded by a JPEG-like CODEC. We show that the proposed novel implementation is able to effectively transfer the inherent advantages of DWT-based image coding technology to efficient stereo image pair compression. At equivalent bit rates, the proposed algorithm achieves peak signal to noise ratio gains of up to 5.5 dB, for reconstructed predicted images, as compared to traditional and state of the art DCT and DWT-based predictive coding algorithms. (C) 2003 Elsevier B.V. All rights reserved.
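For readers unfamiliar with disparity-compensated prediction, the sketch below shows generic block matching on a single scanline in the pixel domain. It is not the pioneering-block search in the sub-band domain that the paper proposes. The sum-of-absolute-differences cost, the search direction and all names are assumptions made for illustration.

```python
def best_disparity(left_row, right_row, start, block, max_disp):
    """Find the shift d in [0, max_disp] at which the block of left_row
    starting at start + d best matches the block of right_row at start.

    Returns (disparity, sad), where sad is the sum of absolute differences.
    """
    target = right_row[start:start + block]
    best_d, best_sad = 0, None
    for d in range(max_disp + 1):
        begin = start + d
        if begin + block > len(left_row):
            break  # candidate block would run past the end of the row
        candidate = left_row[begin:begin + block]
        sad = sum(abs(a - b) for a, b in zip(candidate, target))
        if best_sad is None or sad < best_sad:
            best_d, best_sad = d, sad
    return best_d, best_sad


# Hypothetical scanlines: the same feature appears two pixels apart.
left_row = [0, 0, 5, 9, 5, 0, 0, 0]
right_row = [5, 9, 5, 0, 0, 0, 0, 0]
print(best_disparity(left_row, right_row, start=0, block=3, max_disp=4))
```

In a predictive coder, only the disparity and the residual between the target block and its best match need to be coded, which is where the bit-rate saving comes from.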
Distributed coding of correlated sources with memory poses a number of considerable challenges that threaten its practical application, particularly (but not only) in the context of sensor networks. This problem is strongly motivated by the obvious observation that most common sources exhibit temporal correlations that may be at least as important as spatial or intersource correlations. This paper presents an analysis of the underlying tradeoffs, paradigms for coding systems, and approaches for distributed predictive coder design optimization. Motivated by practical limitations on both complexity and delay (especially for dense sensor networks) the focus here is on predictive coding. From the source coding perspective, the most basic tradeoff (and difficulty) is due to conflicts that arise between distributed coding and prediction, wherein "standard" distributed quantization of the prediction errors, if coupled with imposition of zero decoder drift, would drastically compromise the predictor performance and hence the ability to exploit temporal correlations. Another challenge arises from instabilities in the design of closed-loop predictors, whose impact has been observed in the past, but is greatly exacerbated in the case of distributed coding. In the distributed predictive coder design, we highlight the fundamental tradeoffs encountered within a more general paradigm where decoder drift is allowable or unavoidable, and must be effectively accounted for and controlled. We derive an overall design optimization method for distributed predictive coding that avoids the pitfalls of naive distributed predictive quantization and produces an optimized low complexity and low delay coding system. The proposed iterative algorithms for distributed predictive coding subsume traditional single-source predictive coding and memoryless distributed coding as extreme special cases.
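The single-source special case mentioned at the end of the abstract can be sketched as a closed-loop DPCM coder. The example below is a minimal illustration with a first-order predictor and a uniform quantiser. The step size, predictor coefficient and function names are hypothetical, and nothing here models the distributed setting or the paper's design algorithm.

```python
def dpcm_encode(samples, step=4, coeff=1.0):
    """Closed-loop DPCM: predict from the previous reconstruction, not the
    previous input, so that the decoder can form the same prediction."""
    indices = []
    previous_reconstruction = 0.0
    for x in samples:
        prediction = coeff * previous_reconstruction
        index = round((x - prediction) / step)  # uniform quantiser index
        indices.append(index)
        previous_reconstruction = prediction + index * step
    return indices


def dpcm_decode(indices, step=4, coeff=1.0):
    """Mirror of the encoder's reconstruction loop."""
    reconstructed = []
    previous_reconstruction = 0.0
    for index in indices:
        previous_reconstruction = coeff * previous_reconstruction + index * step
        reconstructed.append(previous_reconstruction)
    return reconstructed


samples = [9, 12, 15, 11]
codes = dpcm_encode(samples)
print(codes, dpcm_decode(codes))
```

Because encoder and decoder run the same reconstruction loop, quantisation errors do not accumulate, i.e. there is no decoder drift. The abstract's point is that enforcing this property in the distributed case comes at a severe cost to predictor performance.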