Modern feedforward convolutional neural networks (CNNs) can now solve some computer vision tasks at super-human levels. However, these networks only roughly mimic human visual perception. One difference from human vision is that they do not appear to perceive illusory contours (e.g. Kanizsa squares) in the same way humans do. Physiological evidence from visual cortex suggests that the perception of illusory contours could involve feedback connections. Would recurrent feedback neural networks perceive illusory contours like humans? In this work we equip a deep feedforward convolutional network with brain-inspired recurrent dynamics. The network was first pretrained with an unsupervised reconstruction objective on a natural image dataset, to expose it to natural object contour statistics. Then, a classification decision head was added and the model was finetuned on a form discrimination task: squares vs. randomly oriented inducer shapes (no illusory contour). Finally, the model was tested with the unfamiliar "illusory contour" configuration: inducer shapes oriented to form an illusory square. Compared with feedforward baselines, the iterative "predictive coding" feedback resulted in more illusory contours being classified as physical squares. The perception of the illusory contour was measurable in the luminance profile of the image reconstructions produced by the model, demonstrating that the model really "sees" the illusion. Ablation studies revealed that natural image pretraining and feedback error correction are both critical to the perception of the illusion. Finally, we validated our conclusions in a deeper network (VGG): adding the same predictive coding feedback dynamics again leads to the perception of illusory contours. (C) 2021 Elsevier Ltd. All rights reserved.
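The abstract above does not give the update equations of its feedback dynamics, so the following is only a generic sketch of iterative feedback error correction, not the authors' model. It assumes a single linear layer: a latent vector `r` generates a top-down prediction `W r`, and the prediction error is fed back through the transposed weights to refine `r`. The names `predictive_coding_steps`, `W`, `rate` and `steps` are hypothetical and chosen purely for illustration.

```python
def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def transpose(W):
    """Return the transpose of W as a list of rows."""
    return [list(col) for col in zip(*W)]


def predictive_coding_steps(x, W, steps=50, rate=0.1):
    """Refine a latent r so that the top-down prediction W r matches the input x.

    Each iteration computes the prediction error x - W r and feeds it back
    through W^T, which is gradient descent on half the squared error.
    Returns the final latent and the squared error recorded at every step.
    """
    r = [0.0] * len(W[0])
    Wt = transpose(W)
    errors = []
    for _ in range(steps):
        prediction = matvec(W, r)                      # top-down prediction
        error = [xi - pi for xi, pi in zip(x, prediction)]
        errors.append(sum(e * e for e in error))       # squared prediction error
        feedback = matvec(Wt, error)                   # error sent back to the latent
        r = [ri + rate * fi for ri, fi in zip(r, feedback)]
    return r, errors


# Toy input (an assumption): x lies exactly in the span of W's columns, with r = [1, 2].
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x = [1.0, 2.0, 3.0]
r, errors = predictive_coding_steps(x, W)
print(r, errors[0], errors[-1])
```

In this toy setting the squared error shrinks at every iteration because the step size is small relative to the largest eigenvalue of `W^T W`; a deep convolutional network with learned feedback weights, as in the abstract, would need considerably more machinery than this.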
Application of a type of predictive coding to the channel signals of a homomorphic vocoder has produced sizable bit rate reductions. With only slight degradation in speech quality, reduction (for the spectral envelope information) from 7800 to 4000 bits/s was achieved. A technique for obtaining the formant frequencies from the predictive coding parameters is described; this approach promises further bit rate reductions. As a by-product of this study of predictive coding, direct and cascade form speech synthesizers are compared on the basis of differing quantization effects.
The brain is constantly generating predictions of future sensory input to enable efficient adaptation. In the auditory domain, this applies also to the processing of speech. Here we aimed to determine whether the brain predicts the following segments of speech input on the basis of language-specific phonological rules that concern non-adjacent phonemes. Auditory event-related potentials (ERP) were recorded in a mismatch negativity (MMN) paradigm, where the Finnish vowel harmony, determined by the first syllables of pseudowords, either constrained or did not constrain the phonological composition of pseudoword endings. The phonological rule of vowel harmony was expected to create predictions about phonologically legal pseudoword endings. Results showed that MMN responses were larger for phonologically illegal than legal pseudowords, and P3a was elicited only for illegal pseudowords. This supports the hypothesis that speech input is evaluated against context-dependent phonological predictions that facilitate speech processing. (C) 2016 Elsevier Inc. All rights reserved.
Pain is a complex, multidimensional experience that involves dynamic interactions between sensory-discriminative and affective-emotional processes. Pain experiences have a high degree of variability depending on their context and prior anticipation. Viewing pain perception as a perceptual inference problem, we propose a predictive coding paradigm to characterize evoked and non-evoked pain. We record the local field potentials (LFPs) from the primary somatosensory cortex (S1) and the anterior cingulate cortex (ACC) of freely behaving rats, two regions known to encode the sensory-discriminative and affective-emotional aspects of pain, respectively. We further use predictive coding to investigate the temporal coordination of oscillatory activity between the S1 and ACC. Specifically, we develop a phenomenological predictive coding model to describe the macroscopic dynamics of bottom-up and top-down activity. Supported by recent experimental data, we also develop a biophysical neural mass model to describe the mesoscopic neural dynamics in the S1 and ACC populations, in both naive and chronic pain-treated animals. Our proposed predictive coding models not only replicate important experimental findings, but also provide new predictions about the impact of the model parameters on the physiological or behavioral read-out, thereby yielding mechanistic insight into the uncertainty of expectation, placebo or nocebo effect, and chronic pain.
According to the predictive coding theory of cognition (PCT), brains are predictive machines that use perception and action to minimize prediction error, i.e. the discrepancy between bottom-up, externally-generated sensory signals and top-down, internally-generated sensory predictions. Many consider PCT to have an explanatory scope that is unparalleled in contemporary cognitive science and see in it a framework that could potentially provide us with a unified account of cognition. It is also commonly assumed that PCT is a representational theory of sorts, in the sense that it postulates that our cognitive contact with the world is mediated by internal representations. However, the exact sense in which PCT is representational remains unclear; neither is it clear that it deserves such status, that is, whether it really invokes structures that are truly and nontrivially representational in nature. In the present article, I argue that the representational pretensions of PCT are completely justified. This is because the theory postulates cognitive structures (namely action-guiding, detachable, structural models that afford representational error detection) that play genuinely representational functions within the cognitive system.
The discovery of mirror neurons in the ventral premotor cortex (area F5) and inferior parietal cortex (area PFG) in the macaque monkey brain has provided the physiological evidence for direct matching of the intrinsic motor representations of the self and the visual image of the actions of others. The existence of mirror neurons implies that the brain has mechanisms reflecting shared self and other action representations. This may further imply that the neural basis of self-body representations may also incorporate components that are shared with other-body representations. It is likely that such a mechanism is also involved in predicting others' touch sensations and emotions. However, the neural basis of shared body representations has remained unclear. Here, we propose a neural basis of body representation of the self and of others in both human and non-human primates. We review a series of behavioral and physiological findings which together paint a picture that the systems underlying such shared representations require integration of conscious exteroception and interoception subserved by a cortical sensory-motor network involving parieto-inner perisylvian circuits (the ventral intraparietal area [VIP]/inferior parietal area [PFG]-secondary somatosensory cortex [SII]/posterior insular cortex [pIC]/anterior insular cortex [aIC]). Based on these findings, we propose a computational mechanism of the shared body representation in the predictive coding (PC) framework. Our mechanism proposes that processes emerging from generative models embedded in these specific neuronal circuits play a pivotal role in distinguishing a self-specific body representation from a shared one. The model successfully accounts for normal and abnormal shared body phenomena such as mirror-touch synesthesia and somatoparaphrenia. In addition, it generates a set of testable experimental predictions. (C) 2014 Elsevier Ltd. All rights reserved.
Predictions of upcoming movements are based on several types of neural signals that span the visual, somatosensory, motor and cognitive systems. Thus far, pre-movement signals have been investigated while participants viewed the object to be acted upon. Here, we studied the contribution of information other than vision to the classification of preparatory signals for action, even in the absence of online visual information. We used functional magnetic resonance imaging (fMRI) and multivoxel pattern analysis (MVPA) to test whether the neural signals evoked by visual, memory-based and somato-motor information can be reliably used to predict upcoming actions in areas of the dorsal and ventral visual stream during the preparatory phase preceding the action, while participants were lying still. Nineteen human participants (nine women) performed one of two actions towards an object with their eyes open or closed. Despite the well-known role of ventral stream areas in visual recognition tasks and the specialization of dorsal stream areas in somato-motor processes, we decoded action intention in areas of both streams based on visual, memory-based and somato-motor signals. Interestingly, we could reliably decode action intention in the absence of visual information based on neural activity evoked when visual information was available, and vice versa. Our results show a similar visual, memory and somato-motor representation of action planning in dorsal and ventral visual stream areas that allows predicting action intention across domains, regardless of the availability of visual information.
The amplitude of auditory components of the event-related potential (ERP) is attenuated when sounds are self-generated compared to externally generated sounds. This effect has been ascribed to internal forward models predicting the sensory consequences of one's own motor actions. Auditory potentials are also attenuated when a sound is accompanied by a video of anticipatory visual motion that reliably predicts the sound. Here, we investigated whether the neural underpinnings of prediction of upcoming auditory stimuli are similar for motor-auditory (MA) and visual-auditory (VA) events using a stimulus omission paradigm. In the MA condition, a finger tap triggered the sound of a handclap whereas in the VA condition the same sound was accompanied by a video showing the handclap. In both conditions, the auditory stimulus was omitted in either 50% or 12% of the trials. These auditory omissions induced early and mid-latency ERP components (oN1 and oN2, presumably reflecting prediction and prediction error), and subsequent higher-order error evaluation processes. The oN1 and oN2 of MA and VA were alike in amplitude, topography, and neural sources, even though the prediction originates in different brain areas (motor versus visual cortex). This suggests that MA and VA predictions activate a sensory template of the sound in auditory cortex. This article is part of a Special Issue entitled SI: Prediction and Attention. (C) 2015 Elsevier B.V. All rights reserved.
Predictive processing has recently been advanced as a global cognitive architecture for the brain. I argue that its commitments concerning the nature and format of cognitive representation are inadequate to account for two basic characteristics of conceptual thought: first, its generality, that is, the fact that we can think and flexibly reason about phenomena at any level of spatial and temporal scale and abstraction; second, its rich compositionality, that is, the specific way in which concepts productively combine to yield our thoughts. I consider two strategies for avoiding these objections and I argue that both confront formidable challenges.
In Part I predictive coding was defined and messages, prediction, entropy, and ideal coding were discussed. In the present paper the criterion to be used for predictors for the purpose of predictive coding is defined: that predictor is optimum in the information theory (IT) sense which minimizes the entropy of the average error-term distribution. Ordered averages of distributions are defined and it is shown that if a predictor gives an ordered average error term distribution it will be a best IT predictor. Special classes of messages are considered for which a best IT predictor can easily be found, and some examples are given. The error terms which are transmitted in predictive coding are treated as if they were statistically independent. If this is indeed the case, or a good approximation, then it is still necessary to show that sequences of message terms which are statistically independent may always be coded efficiently, without impractically large memory requirements, in order to show that predictive coding may be practical and efficient in such cases. This is done in the final section of this paper.
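The criterion stated in the abstract above, that a predictor is better in the information-theory sense when its error terms have lower entropy, can be illustrated with a small sketch. The message (a triangle wave), the two candidate predictors, and all function names below are assumptions made for illustration and do not come from the paper. The entropy is the empirical zeroth-order entropy, which treats error terms as statistically independent, as the abstract does.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Empirical zeroth-order entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def error_terms(message, predictor):
    """Error term e[t] = x[t] - predictor(x[:t]) for every position t."""
    return [x - predictor(message[:t]) for t, x in enumerate(message)]


def decode(errors, predictor):
    """Rebuild the message from its error terms using the same predictor."""
    message = []
    for e in errors:
        message.append(e + predictor(message))
    return message


def predict_zero(history):
    """Trivial predictor: always predicts 0, so the error is the message itself."""
    return 0


def predict_previous(history):
    """Predicts that the next term equals the previous one (0 at the start)."""
    return history[-1] if history else 0


# Hypothetical slowly varying message: a triangle wave with period 16.
message = [abs((t % 16) - 8) for t in range(160)]

h_raw = entropy_bits(error_terms(message, predict_zero))
h_prev = entropy_bits(error_terms(message, predict_previous))
restored = decode(error_terms(message, predict_previous), predict_previous)

print(f"entropy with zero predictor:     {h_raw:.3f} bits/term")
print(f"entropy with previous predictor: {h_prev:.3f} bits/term")
print("lossless round trip:", restored == message)
```

For this message the previous-value predictor concentrates the error terms on two values, so an ideal code for the errors needs fewer bits per term than one for the raw message, while the receiver, running the same predictor, still recovers the message exactly.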