Authors: Liu, L. K.; Feig, E.
IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
A block-based gradient descent search (BBGDS) algorithm is proposed in this paper to perform block motion estimation in video coding. The BBGDS evaluates the values of a given objective function starting from a small centralized checking block. The minimum within the checking block is found, and the gradient descent direction where the minimum is expected to lie is used to determine the search direction and the position of the new checking block. The BBGDS is compared with full search (FS), three-step search (TSS), one-at-a-time search (OTS), and new three-step search (NTSS). Experimental results show that the proposed technique provides competitive performance with reduced computational complexity.
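The search loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the 3 x 3 checking block, the sum of absolute differences (SAD) as the objective function, the stopping rule (stop when the minimum stays at the center), and the names `sad` and `bbgds` are all assumptions made for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref` (assumed objective)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def bbgds(cur, ref, bx, by, n=8, max_disp=7):
    """Gradient-descent style block search (sketch).

    Starts with a 3 x 3 checking block centered on displacement (0, 0).
    If the minimum lies at the center, the search stops; otherwise the
    checking block is re-centered on the minimum and the step repeats.
    Assumes the zero-displacement block lies inside `ref`.
    Returns ((dx, dy), cost).
    """
    height, width = len(ref), len(ref[0])

    def in_bounds(dx, dy):
        # Candidate must stay within the search range and inside the frame.
        return (abs(dx) <= max_disp and abs(dy) <= max_disp
                and 0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height)

    cache = {}  # avoid re-evaluating points shared by overlapping checking blocks

    def cost(dx, dy):
        if (dx, dy) not in cache:
            cache[(dx, dy)] = sad(cur, ref, bx, by, dx, dy, n)
        return cache[(dx, dy)]

    cx, cy = 0, 0
    while True:
        best_cost, best_x, best_y = cost(cx, cy), cx, cy
        for oy in (-1, 0, 1):
            for ox in (-1, 0, 1):
                px, py = cx + ox, cy + oy
                if in_bounds(px, py) and cost(px, py) < best_cost:
                    best_cost, best_x, best_y = cost(px, py), px, py
        if (best_x, best_y) == (cx, cy):
            return (cx, cy), best_cost  # minimum is at the center: stop
        cx, cy = best_x, best_y         # move the checking block downhill
```

Calling `bbgds(cur, ref, bx, by)` with two frames given as lists of rows returns the estimated displacement for the block whose top-left corner is `(bx, by)`. Because the cost must strictly decrease at every move, the loop always terminates, though like any descent method it may settle in a local minimum.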
Interframe motion estimation of subblocks based on improved search techniques is developed. These techniques are based on minimizing the mean difference between the subblock in question in the present frame and the displaced subblock in the previous frame. The performance of the motion compensated prediction developed here is investigated for various block sizes and is compared to other techniques.
This paper presents a lossy image compression scheme that employs a generalized rank-ordered prediction filter for pyramidal image coding. The proposed prediction method significantly reduces the variance of the quantizer input. Consequently, the quality of the decompressed image is much enhanced due to the greatly reduced quantization distortion. Both analytical and simulation results show that the proposed scheme yields high-quality performance.
Humans have evolved an elaborate system of self-consciousness, self-identity, self-agency, and self-embodiment that is grounded in specific neurological structures including an expanded insula. Instantiation of the bodily self has been most extensively studied via the 'rubber hand illusion', whereby parallel stimulation of a hidden true hand, and a viewed false hand, leads to the felt belief that the false hand is one's own. Autism and schizophrenia have both long been regarded as conditions centrally involving altered development of the self, but they have yet to be compared directly with regard to the self and embodiment. Here, we synthesize the embodied cognition literature for these and related conditions, and describe evidence that these two sets of disorders exhibit opposite susceptibilities from typical individuals to the rubber hand illusion: reduced on the autism spectrum and increased in schizophrenia and other psychotic-affective conditions. Moreover, the opposite illusion effects are mediated by a consilient set of associated phenomena, including empathy, interoception, anorexia risk and phenotypes, and patterns of genetic correlation. Taken together, these findings: (i) support the diametric model of autism and psychotic-affective disorders, (ii) implicate the adaptive human system of self-embodiment, and its neural bases, in neurodevelopmental disorders, and suggest new therapies, and (iii) experimentally ground Bayesian predictive coding models with regard to autism compared with psychosis. Lay summary: Humans have evolved a highly developed sense of self and perception of one's own body. The 'rubber hand illusion' can be used to test individual variation in sense of self, relative to connection with others. We show that this illusion is reduced in autism spectrum disorders, and increased in psychotic and mood disorders. These findings have important implications for understanding and treatment of mental disorders.
Neural oscillations subserve a broad range of functions in speech processing and language comprehension. On the one hand, speech contains somewhat repetitive trains of air pressure bursts that occur at three dominant amplitude modulation frequencies, physically marking the linguistically meaningful progressions of phonemes, syllables and intonational phrase boundaries. To these acoustic events, neural oscillations of isomorphous operating frequencies are thought to synchronise, presumably resulting in an implicit temporal alignment of periods of neural excitability to linguistically meaningful spectral information on the three low-level linguistic description levels. On the other hand, speech is a carrier signal that codes for high-level linguistic meaning, such as syntactic structure and semantic information, which cannot be read from stimulus acoustics, but must be acquired during language acquisition and decoded for language comprehension. Neural oscillations subserve the processing of both syntactic structure and semantic information. Here, I synthesise a mapping from each linguistic processing domain to a unique set of subserving oscillatory mechanisms; the mapping is plausible given the role ascribed to different oscillatory mechanisms in different subfunctions of cortical information processing and faithful to the underlying electrophysiology. In sum, the present article provides an accessible and extensive review of the functional mechanisms that neural oscillations subserve in speech processing and language comprehension.
In predictive coding, experience generates predictions that attenuate the feeding forward of predicted stimuli while passing forward unpredicted "errors." Different models have suggested that distinct cortical layers and rhythms implement predictive coding. We recorded spikes and local field potentials from laminar electrodes in five cortical areas (visual area 4 [V4], lateral intraparietal [LIP], posterior parietal area 7A, frontal eye field [FEF], and prefrontal cortex [PFC]) while monkeys performed a task that modulated visual stimulus predictability. During predictable blocks, there was enhanced alpha (8 to 14 Hz) or beta (15 to 30 Hz) power in all areas during stimulus processing and prestimulus beta (15 to 30 Hz) functional connectivity in deep layers of PFC to the other areas. Unpredictable stimuli were associated with increases in spiking and in gamma-band (40 to 90 Hz) power/connectivity that fed forward up the cortical hierarchy via superficial-layer cortex. Power and spiking modulation by predictability was stimulus specific. Alpha/beta power in LIP, FEF, and PFC inhibited spiking in deep layers of V4. Area 7A uniquely showed increases in high-beta (~22 to 28 Hz) power/connectivity to unpredictable stimuli. These results motivate a conceptual model, predictive routing. It suggests that predictive coding may be implemented via lower-frequency alpha/beta rhythms that "prepare" pathways processing predicted inputs by inhibiting feedforward gamma rhythms and associated spiking.
The visual cues involved in auditory speech processing are not restricted to information from lip movements but also include head or chin gestures and facial expressions such as eyebrow movements. The fact that visual gestures precede the auditory signal implies that visual information may influence the auditory activity. As visual stimuli are very close in time to the auditory information for audiovisual syllables, the cortical response to them usually overlaps with that for the auditory stimulation; the neural dynamics underlying the visual facilitation for continuous speech therefore remain unclear. In this study, we used a three-word phrase to study continuous speech processing. We presented video clips with even (without emphasis) phrases as the frequent stimuli and with one word visually emphasized by the speaker as the non-frequent stimuli. Negativity in the resulting ERPs was detected after the start of the emphasizing articulatory movements but before the auditory stimulus, a finding that was confirmed by the statistical comparisons of the audiovisual and visual stimulation. No such negativity was present in the control visual-only condition. The propagation of this negativity was observed between the visual and fronto-temporal electrodes. Thus, in continuous speech, the visual modality evokes predictive coding for the auditory speech, which is analysed by the cerebral cortex in the context of the phrase even before the arrival of the corresponding auditory signal.
The auditory system is tuned to detect rhythmic regularities in the environment which can occur on different timescales. Event-related potentials such as mismatch negativity (MMN) and P3b are thought to index local and global deviance, respectively. However, it is not clear how these hierarchical levels interact and to what extent attention modulates this interaction. In this EEG study with 17 healthy young adults, we used a hierarchical oddball paradigm with local (sequence-level) and global (block-level) violations in attended and unattended conditions. Amplitudes of N2 and P3b were analyzed in a 2 × 2 × 2 factorial model (local status, global status, attention condition). We found a significant interaction between the local and global status on the N2 amplitude, while there was no significant three-way interaction with attention, together demonstrating that lower-level prediction error is modulated by detection of higher-order regularity but expressed independently of attention. By contrast, higher-level prediction error, indexed by P3b, was sensitive to global regularity violations if the auditory stream was attended. The results demonstrate the capacity of our auditory perception to preattentively resolve conflicts between different levels of predictive hierarchy even across longer time intervals as indexed by MMN modulation, while P3b represents a different, attention-dependent system.
The question of how the mind works is at the heart of cognitive science. It aims to understand and explain the complex processes underlying perception, decision-making and learning, three fundamental areas of cognition. Bayesian Brain Theory, a computational approach derived from the principles of Predictive Processing (PP), offers a mechanistic and mathematical formulation of these cognitive processes. This theory assumes that the brain encodes beliefs (probabilistic states) to generate predictions about sensory input, then uses prediction errors to update its beliefs. In this paper, we present an introduction to the fundamentals of Bayesian Brain Theory. We show how this innovative theory hybridizes concepts inherited from the philosophy of mind and experimental data from neuroscience, and how it models complex cognitive processes such as perception, action, emotion, and belief, and even psychiatric symptomatology. (c) 2021 L'Encephale, Paris.
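The predict-then-update cycle mentioned in this abstract can be illustrated with a toy example. The sketch below assumes the simplest possible case, a Gaussian belief about a single quantity and a Gaussian noisy observation; this is a standard textbook formulation chosen for illustration, not necessarily the one used in the paper, and the function name `update_belief` is hypothetical.

```python
def update_belief(mean, var, observation, obs_var):
    """One Bayesian update of a Gaussian belief (illustrative assumption).

    mean, var    -- current belief: predicted value and its uncertainty
    observation  -- the sensory input actually received
    obs_var      -- assumed noise variance of the sensory input
    Returns the updated (mean, var).
    """
    prediction_error = observation - mean   # mismatch between input and prediction
    gain = var / (var + obs_var)            # weight on the error: relative reliability
    new_mean = mean + gain * prediction_error
    new_var = (1.0 - gain) * var            # uncertainty shrinks after each observation
    return new_mean, new_var
```

With equal uncertainty in belief and input, for example `update_belief(0.0, 1.0, 2.0, 1.0)`, the belief moves halfway toward the observation. A very noisy input (large `obs_var`) yields a small gain, so the prediction error barely changes the belief.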
Visual perception involves the grouping of individual elements into coherent patterns, such as object representations, that reduce the descriptive complexity of a visual scene. The computational and physiological bases of this perceptual grouping remain poorly understood. We discuss recent fMRI evidence from our laboratory where we measured activity in a higher object processing area (LOC), and in primary visual cortex (V1) in response to visual elements that were either grouped into objects or randomly arranged. We observed significant activity increases in the LOC and concurrent reductions of activity in V1 when elements formed coherent shapes, suggesting that activity in early visual areas is reduced as a result of grouping processes performed in higher areas. In light of these results we review related empirical findings of context-dependent changes in activity, recent neurophysiology research related to cortical feedback, and computational models that incorporate feedback operations. We suggest that feedback from high-level visual areas reduces activity in lower areas in order to simplify the description of a visual image, consistent with both predictive coding models of perception and probabilistic notions of 'explaining away.' (C) 2004 Elsevier Ltd. All rights reserved.