A given pattern of optical stimulation can arise from countless possible real-world sources, creating a dilemma for vision: What in the world actually gives rise to the current pattern? This dilemma was pointed out centuries ago by the astronomer and mathematician Ibn Al-Haytham and was forcefully restated 150 years ago when von Helmholtz characterized perception as unconscious inference. To buttress his contention, von Helmholtz cited multistable perception: recurring changes in perception despite unchanging sensory input. Recent neuroscientific studies have exploited multistable perception to identify brain areas uniquely activated in association with these perceptual changes, but the specific roles of those activations remain controversial. This article provides an overview of theoretical models of multistable perception, a review of recent neuroimaging and brain stimulation studies focused on mechanisms associated with these perceptual changes, and a synthesis of available evidence within the context of current notions about Bayesian inference that find their historical roots in von Helmholtz's work.
Speech recognition from visual-only faces is difficult, but can be improved by prior information about what is said. Here, we investigated how the human brain uses prior information from auditory speech to improve visual-speech recognition. In a functional magnetic resonance imaging study, participants performed a visual-speech recognition task, indicating whether the word spoken in visual-only videos matched the preceding auditory-only speech, and a control task (face-identity recognition) containing exactly the same stimuli. We localized a visual-speech processing network by contrasting activity during visual-speech recognition with the control task. Within this network, the left posterior superior temporal sulcus (STS) showed increased activity and interacted with auditory-speech areas if prior information from auditory speech did not match the visual speech. This mismatch-related activity and the functional connectivity to auditory-speech areas were specific for speech, i.e., they were not present in the control task. The mismatch-related activity correlated positively with performance, indicating that posterior STS was behaviorally relevant for visual-speech recognition. In line with predictive coding frameworks, these findings suggest that prediction error signals are produced if visually presented speech does not match the prediction from preceding auditory speech, and that this mechanism plays a role in optimizing visual-speech recognition by prior information. (C) 2012 Elsevier Inc. All rights reserved.
Interoception, which refers to the perception of internal body signals, has been consistently associated with emotional processing and with the sense of self. However, its influence on the subjective appraisal of affectively neutral and body-unrelated stimuli is still largely unknown. Across two experiments we sought to investigate this issue by asking participants to detect changes in the flashing rhythm of a simple stimulus (a circle) that could either be pulsing synchronously with their own heartbeats or following the pattern of another person's heart. While overall task performance did not vary as a function of cardio-visual synchrony, participants were better at identifying trials in which no change occurred when the flashes were synchronous with their own heartbeats. This study adds to the growing body of research indicating that we use our body as a reference point when perceiving the world, and extends this view by focusing on the role that signals coming from inside the body, such as heartbeats, may play in this referencing process. Specifically, we show that private interoceptive sensations can be combined with affectively neutral information unrelated to the self to influence the processing of a multisensory percept. Results are discussed in terms of both standard multisensory integration processes and predictive coding theories. (C) 2015 Elsevier B.V. All rights reserved.
Major depressive disorder negatively impacts the sensitivity and adaptability of the brain's predictive coding framework. The current electroencephalography study into the antidepressant properties of ketamine investigated the downstream effects of ketamine on predictive coding and short-term plasticity in thirty patients with depression using the auditory roving mismatch negativity (rMMN). The rMMN paradigm was run 3-4 h after a single 0.44 mg/kg intravenous dose of ketamine or active placebo (remifentanil infused to a target plasma concentration of 1.7 ng/mL) in order to measure the neural effects of ketamine in the period when an improvement in depressive symptoms emerges. Depression symptomatology was measured using the Montgomery-Asberg Depression Rating Scale (MADRS); 70% of patients demonstrated at least a 50% reduction in their MADRS global score. Ketamine significantly increased the MMN and P3a event-related potentials, directly contrasting literature demonstrating ketamine's acute attenuation of the MMN. This effect was only reliable when all repetitions of the post-deviant tone were used. Dynamic causal modelling showed greater modulation of forward connectivity in response to a deviant tone between right primary auditory cortex and right inferior temporal cortex, which significantly correlated with antidepressant response to ketamine at 24 h. This is consistent with the hypothesis that ketamine increases sensitivity to unexpected sensory input and restores deficits in sensitivity to prediction error that are hypothesised to underlie depression. However, the lack of repetition suppression evident in the MMN evoked data compared to studies of healthy adults suggests that, at least within the short term, ketamine does not improve deficits in adaptive internal model calibration. (C) 2020 Elsevier B.V. and ECNP. All rights reserved.
One of the most challenging questions regarding the nature and neural basis of consciousness is the embodied dimension of the phenomenon, that is, feeling located within the body and viewing the world from that spatial perspective. Current theories in neurophysiology highlight the active role of multisensory and sensorimotor integration in supporting self-location and self-perspective, and propose the right temporal-parietal junction (rTPJ) as a key area for such function. These theories are based mainly on findings from two experimental paradigms: manipulation of bottom-up multisensory information integration regarding one's body location (full-body illusion), or direct and invasive manipulation disrupting brain activity at the rTPJ. In this study we take a different approach by using hypnotic suggestion, a non-invasive top-down technique, to manipulate the subjective experience of self-location. The brain activity of 18 right-handed participants was recorded using magnetoencephalography (MEG) while their subjective experience of self-location was hypnotically manipulated. Spectral analyses were conducted on the spontaneous MEG data before and during an induction of an out-of-body experience (OBE) by a trained psychiatrist. The results indicate high correlations between power at alpha and high-gamma frequency bands and the degree of perceived change in self-location. Regions exhibiting such correlations include temporal-occipital regions, the rTPJ, as well as frontal and midline regions. These findings are in line with an oscillatory-based predictive coding framework. (C) 2017 Elsevier Ltd. All rights reserved.
This paper studies low-delay Wyner-Ziv coding, i.e., lossy source coding with side information at the decoder, with emphasis on the extreme of zero delay. To achieve zero delay, a scalar quantizer is followed by scalar coding of quantization indices. In the fixed-length coding scenario, under high-resolution assumptions and appropriately defined decodability constraints, the optimal quantization level density is conjectured to be periodic. This conjecture, which is provable when the correlation is high, allows for a precise analysis of the rate-distortion tradeoff. The performance of variable-length coding with periodic quantization is also characterized. The results are then incorporated in predictive Wyner-Ziv coding for Gaussian sources with memory, and optimal prediction filters are numerically designed so as to strike a balance between maximally exploiting both temporal and spatial correlation and limiting the propagation of distortion due to occasional decoding errors. Finally, the zero-delay schemes are also employed in transform coding with small block lengths, where the Gaussian source and side information are transformed separately with the premise that corresponding transform coefficient pairs exhibit good spatial correlation and minimal temporal correlation. For the specific source-side information pairs studied, it is shown that transform coding, even with a small block-length, outperforms predictive coding. Performances of both predictive and transform coding are also compared with the asymptotic rate-distortion bounds.
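As a rough illustration of why zero-delay scalar coding with decoder side information involves "decodability constraints", the following minimal sketch uses a uniform scalar quantizer whose index is sent modulo a small alphabet, with the decoder resolving the ambiguity using the side information. This is a generic textbook-style binning scheme, not the optimized periodic quantizer design from the abstract; the function names and the parameters `step` and `num_bins` are assumptions chosen for illustration.

```python
def wz_encode(x: float, step: float, num_bins: int) -> int:
    """Quantize x uniformly and transmit only the index modulo num_bins."""
    q = round(x / step)          # uniform scalar quantizer index
    return q % num_bins          # fixed-length symbol in {0, ..., num_bins - 1}


def wz_decode(index: int, y: float, step: float, num_bins: int) -> float:
    """Pick the reconstruction consistent with `index` that lies closest to
    the side information y (assumed to be correlated with the source)."""
    # Candidates are (index + k * num_bins) * step for integer k;
    # choose the k that brings the candidate nearest to y.
    k = round((y / step - index) / num_bins)
    return (index + k * num_bins) * step


if __name__ == "__main__":
    symbol = wz_encode(3.2, step=0.5, num_bins=4)
    # Side information close to the source: decoding lands in the right cell.
    print(wz_decode(symbol, y=3.0, step=0.5, num_bins=4))
    # Side information far from the source: the decoder picks the wrong
    # period, which is the kind of occasional decoding error the abstract
    # mentions.
    print(wz_decode(symbol, y=1.0, step=0.5, num_bins=4))
```

A larger `num_bins` lowers the chance of picking the wrong period at the cost of a higher rate, which is one simple way to see the rate-distortion tradeoff the abstract analyzes.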
Networks of "conscious agents" (CAs) as defined by Hoffman and Prakash (2014) are shown to provide a robust and intuitive representation of perceptual and cognitive processes in the context of the Interface Theory of Perception (Hoffman, Singh and Prakash, 2015). The behavior of the simplest CA networks is analyzed exhaustively. The construction of short- and long-term memories and the implementation of attention, categorization and case-based planning are demonstrated. These results show that robust perception and cognition can be modelled independently of any ontological assumptions about the world in which an agent is embedded. Any agent-world interaction can, in particular, also be represented as an agent-agent interaction. (C) 2017 Elsevier B.V. All rights reserved.
Among a context of three pixels, the present JPEG-LS produces predicted values based on whether a vertical edge or a horizontal edge is detected. When a diagonal edge exists, however, experiments and observation reveal that such a prediction will generate large predictive errors. By applying the triangle inequality to the analysis of predictive templates in JPEG-LS, we propose a diagonal-edge detection scheme to reduce the predictive error and hence provide an improvement on the prediction accuracy. Experiments are carried out to test the proposed scheme for a group of sample images. In comparison with the current JPEG-LS prediction, our scheme produces lower prediction errors, in terms of both MSE measurement and visual comparison of error images. (C) 2002 Society of Photo-Optical Instrumentation Engineers.
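For orientation, the baseline three-pixel prediction that the abstract refers to can be sketched as below. This is the standard JPEG-LS median edge detector (MED) over the left, above, and above-left neighbours; the diagonal-edge scheme proposed in the paper is not described in the abstract and is not reproduced here. The function name and argument names are assumptions chosen for illustration.

```python
def med_predict(a: int, b: int, c: int) -> int:
    """Standard JPEG-LS MED prediction for the current pixel.

    a: left neighbour, b: neighbour above, c: neighbour above-left.
    """
    if c >= max(a, b):
        # The above-left pixel is at least as bright as both other
        # neighbours: treat as an edge and predict the smaller one.
        return min(a, b)
    if c <= min(a, b):
        # The above-left pixel is at most as dark as both other
        # neighbours: treat as an edge and predict the larger one.
        return max(a, b)
    # No edge detected: planar prediction.
    return a + b - c


if __name__ == "__main__":
    print(med_predict(10, 200, 200))  # edge case: falls back to the left pixel
    print(med_predict(50, 60, 55))    # smooth case: planar prediction
```

Because only the horizontal and vertical cases are tested, a context whose left and above neighbours agree with each other but differ sharply from the above-left pixel is still resolved by one of the two edge branches, which is the situation the abstract reports as producing large errors on diagonal edges.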
Deep learning, such as convolutional neural networks, has achieved great success in image processing, computer vision tasks, and image compression. This paper designs a multiple description coding framework based on a symmetric convolutional auto-encoder, which can achieve high-quality image reconstruction. First, the image is input into the convolutional auto-encoder, and the extracted features are obtained. Then, the extracted features are encoded by multiple description coding and split into two descriptions for transmission to the decoder. The side information is obtained by the side decoder and the central information by the central decoder. Finally, the side information and the central information are deconvolved by the convolutional auto-encoder. The experimental results validate that the proposed scheme outperforms the state-of-the-art methods.
Biological vision systems inspire processing methods in computer vision applications. This paper employs the insights of vision systems in hardware and presents a pixel-parallel, reconfigurable, and layer-based hierarchical architecture for smart image sensors. The architecture aims to bring computation close to the sensor to achieve high acceleration for different machine vision applications while consuming low power. We logically divide the image into multiple regions and perform pixel-level and region-level processing after removing spatiotemporal redundancy. These processors use bio-inspired algorithms to activate the regions that contain a region of interest of the scene. The hierarchical processing breaks with traditional sequential image processing and introduces parallelism for machine vision applications. We also make the hardware design reconfigurable even after fabrication so that the hardware is reusable for different applications. Simulation results show that the area overhead and power penalty for adding reconfigurable features stay in an acceptable range. We emphasize maximizing the operating speed and obtain 800 MHz. In addition, the design saves 84.01% and 96.91% dynamic power at the first and second stages of the hierarchy by removing redundant information. Furthermore, deploying high-level reasoning sequentially on only the selected regions of the image makes it computationally inexpensive to execute a complex task in real time.