ISBN: 9781479983391 (print)
To support effective and efficient image matching and retrieval, the emerging MPEG standard Compact Descriptors for Visual Search (CDVS) defines compact descriptors for still images, consisting of compressed local and global descriptors. However, coding CDVS descriptors frame by frame for a video sequence does not address inter-frame redundancy, and may therefore consume considerable bandwidth and storage. In this work, we propose an efficient coding framework that generates compact CDVS descriptors for video sequences. For local descriptors, we propose a multiple-reference predictive technique that exploits the temporal correlation of local descriptors and location coordinates over a sequence of frames. To further improve prediction performance, keypoint tracking is applied to identify temporally repeated keypoints. For global descriptors, a propagation coding scheme is employed to compress the global descriptors of adjacent frames. Empirical evaluation shows that the proposed coding approach yields a low bit rate of less than 40 kbps on average while maintaining comparable matching and retrieval performance. Compared with the sequence of original frame-level CDVS descriptors, the proposed approach achieves over 25x bit rate reduction.
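As a rough illustration of the multiple-reference prediction idea in the abstract above, the sketch below encodes one descriptor as the index of its best-matching reference plus a residual. It is not the paper's codec: the function names, the integer-list descriptor format, and the L1 cost used to pick the reference are all assumptions made for illustration, and entropy coding of the residual is omitted.

```python
def encode_descriptor(current, references):
    """Pick the reference with the smallest L1 residual (hypothetical cost).

    current    -- list of ints, descriptor from the current frame
    references -- non-empty list of candidate descriptors from earlier frames
    Returns (reference index, residual list).
    """
    if not references:
        raise ValueError("at least one reference descriptor is required")
    best_idx, best_res, best_cost = None, None, None
    for idx, ref in enumerate(references):
        residual = [c - r for c, r in zip(current, ref)]
        cost = sum(abs(v) for v in residual)
        if best_cost is None or cost < best_cost:
            best_idx, best_res, best_cost = idx, residual, cost
    return best_idx, best_res


def decode_descriptor(ref_idx, residual, references):
    """Rebuild the descriptor by adding the residual to the chosen reference."""
    return [r + d for r, d in zip(references[ref_idx], residual)]


# Two candidate references, e.g. the tracked keypoint in two earlier frames.
references = [[1, 0, -1, 1], [0, 0, 1, 1]]
current = [1, 0, -1, 0]
ref_idx, residual = encode_descriptor(current, references)
restored = decode_descriptor(ref_idx, residual, references)
```

When a keypoint repeats over time, the residual is mostly zeros, which is what makes it cheaper to code than the descriptor itself.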
This paper considers communication in terms of inference about the behaviour of others (and our own behaviour). It is based on the premise that our sensations are largely generated by other agents like ourselves. This means, we are trying to infer how our sensations are caused by others, while they are trying to infer our behaviour: for example, in the dialogue between two speakers. We suggest that the infinite regress induced by modelling another agent - who is modelling you - can be finessed if you both possess the same model. In other words, the sensations caused by others and oneself are generated by the same process. This leads to a view of communication based upon a narrative that is shared by agents who are exchanging sensory signals. Crucially, this narrative transcends agency and simply involves intermittently attending to and attenuating sensory input. Attending to sensations enables the shared narrative to predict the sensations generated by another (i.e. to listen), while attenuating sensory input enables one to articulate the narrative (i.e. to speak). This produces a reciprocal exchange of sensory signals that, formally, induces a generalised synchrony between internal (neuronal) brain states generating predictions in both agents. We develop the arguments behind this perspective, using an active (Bayesian) inference framework and offer some simulations (of birdsong) as proof of principle. (C) 2014 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://***/by/3.0/).
Traditional studies of human categorization often treat the processes of encoding features and cues as peripheral to the question of how stimuli are categorized. However, in domains where the features and cues are less transparent, how information is encoded prior to categorization may constrain our understanding of the architecture of categorization. This is particularly true in speech perception, where acoustic cues to phonological categories are ambiguous and influenced by multiple factors. Here, it is crucial to consider the joint contributions of the information in the input and the categorization architecture. We contrasted accounts that argue for raw acoustic information encoding with accounts that posit that cues are encoded relative to expectations, and investigated how two categorization architectures (exemplar models and back-propagation parallel distributed processing models) deal with each kind of information. Relative encoding, akin to predictive coding, is a form of noise reduction, so it can be expected to improve model accuracy; however, like predictive coding, the use of relative encoding in speech perception by humans is controversial, so results are compared to patterns of human performance, rather than on the basis of overall accuracy. We found that, for both classes of models, in the vast majority of parameter settings, relative cues greatly helped the models approximate human performance. This suggests that expectation-relative processing is a crucial precursor step in phoneme categorization, and that understanding the information content is essential to understanding categorization processes.
In a recent functional magnetic resonance imaging study, Kok and de Lange (2014) observed that BOLD activity for a Kanizsa illusory shape stimulus, in which pacmen-like inducers elicit an illusory shape percept, was either enhanced or suppressed relative to a nonillusory control configuration depending on whether the spatial profile of BOLD activity in early visual cortex was related to the illusory shape or the inducers, respectively. The authors argued that these findings fit well with the predictive coding framework, because top-down predictions related to the illusory shape are not met with bottom-up sensory input and hence the feedforward error signal is enhanced. Conversely, for the inducing elements, there is a match between top-down predictions and input, leading to a decrease in error. Rather than invoking predictive coding as the explanatory framework, the suppressive effect related to the inducers might be caused by neural adaptation to perceptually stable input due to the trial sequence used in the experiment.
ISBN: 9781479983391 (print)
Distributed visual analysis applications, such as mobile visual search or Visual Sensor Networks (VSNs), require the transmission of visual content on a bandwidth-limited network, from a peripheral node to a processing unit. Traditionally, a "Compress-Then-Analyze" approach has been pursued, in which sensing nodes acquire and encode the pixel-level representation of the visual content, which is subsequently transmitted to a sink node in order to be processed. This approach might not represent the most effective solution, since several analysis applications leverage a compact representation of the content, thus resulting in an inefficient usage of network resources. Furthermore, coding artifacts might significantly impact the accuracy of the visual task at hand. To tackle such limitations, an orthogonal approach named "Analyze-Then-Compress" has been proposed [1]. According to such a paradigm, sensing nodes are responsible for the extraction of visual features, which are encoded and transmitted to a sink node for further processing. In spite of improved task efficiency, such a paradigm implies that the central processing node cannot reconstruct a pixel-level representation of the visual content. In this paper we propose an effective compromise between the two paradigms, namely "Hybrid-Analyze-Then-Compress" (HATC), that aims at jointly encoding visual content and local image features. Furthermore, we show how a target tradeoff between image quality and task accuracy might be achieved by accurately allocating the bitrate to either visual content or local features.
According to predictive coding models of perception, what we see is determined jointly by the current input and the priors established by previous experience, expectations, and other contextual factors. The same input can thus be perceived differently depending on the priors that are brought to bear during viewing. Here, I show that expected (diagnostic) colors are perceived more vividly than arbitrary or unexpected colors, particularly when color input is unreliable. Participants were tested on a version of the 'Spanish Castle Illusion' in which viewing a hue-inverted image renders a subsequently shown achromatic version of the image in vivid color. Adapting to objects with intrinsic colors (e.g., a pumpkin) led to stronger afterimages than adapting to arbitrarily colored objects (e.g., a pumpkin-colored car). Considerably stronger afterimages were also produced by scenes containing intrinsically colored elements (grass, sky) compared to scenes with arbitrarily colored objects (books). The differences between images with diagnostic and arbitrary colors disappeared when the association between the image and color priors was weakened by, e.g., presenting the image upside-down, consistent with the prediction that color appearance is being modulated by color knowledge. Visual inputs that conflict with prior knowledge appear to be phenomenologically discounted, but this discounting is moderated by input certainty, as shown by the final study which uses conventional images rather than afterimages. As input certainty is increased, unexpected colors can become easier to detect than expected ones, a result consistent with predictive-coding models. (C) 2015 Elsevier B.V. All rights reserved.
ISBN: 9783319257518; 9783319257501 (print)
Signal level quantization, a fundamental component in digital sampling of continuous signals such as DPCM, or in near-lossless predictive-coding based compression schemes of digital data such as JPEG-LS, often produces visible banding artifacts in regions where the input signals are very smooth. Traditional techniques for dealing with this issue include dithering, where the encoder contaminates the input signal with a noise function (which may be known to the decoder as well) prior to quantization. We propose an alternative way of avoiding banding artifacts, where quantization is applied in an interleaved fashion, leaving a portion of the samples untouched, following a known pseudo-random Bernoulli sequence. Our method, which is sufficiently general to be applied to other types of media, is demonstrated on a modified version of JPEG-LS, resulting in a significant reduction in visible artifacts in all cases, while producing a graceful degradation in compression ratio.
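The interleaved quantization described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' modified JPEG-LS: it applies a plain uniform scalar quantizer directly to samples rather than to prediction residuals, and the names `bernoulli_mask`, `interleaved_quantize`, the `step` size, the probability `p`, and the shared `seed` are all hypothetical.

```python
import random


def bernoulli_mask(n, p, seed):
    """Pseudo-random Bernoulli(p) sequence; encoder and decoder can both
    regenerate it from the shared seed (an assumption of this sketch)."""
    rng = random.Random(seed)
    return [rng.random() < p for _ in range(n)]


def quantize(x, step):
    """Uniform scalar quantizer: snap x to the nearest multiple of step."""
    return step * round(x / step)


def interleaved_quantize(samples, step, p, seed):
    """Quantize only the samples selected by the mask; leave the rest untouched."""
    mask = bernoulli_mask(len(samples), p, seed)
    return [quantize(s, step) if m else s for s, m in zip(samples, mask)]


samples = [5, 7, 9, 11]
all_quantized = interleaved_quantize(samples, step=4, p=1.0, seed=0)
none_quantized = interleaved_quantize(samples, step=4, p=0.0, seed=0)
mixed = interleaved_quantize(samples, step=4, p=0.5, seed=42)
```

Here `p` sets the fraction of samples that are quantized, so it trades compression ratio against how many exact samples remain to break up banding in smooth regions.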
ISBN: 9781479986880 (print)
With the increasing spatial and temporal resolutions of acquired remote sensing (RS) images, effective image compression is becoming more and more important. RS image compression technologies have been extensively studied in the past few decades, and various algorithms have been developed accordingly. In this paper, we provide an overview of practically deployed RS image compression approaches, including predictive coding and transform coding approaches that have been adopted in different satellite systems. In addition, some newly derived RS image compression methods are discussed, with highlights on the new trends in the ongoing design and development of RS image compression.
ISBN: 9781450336215 (print)
Continuous active learning achieves high recall for technology-assisted review, not only for an overall information need, but also for various facets of that information need, whether explicit or implicit. Through simulations using Cormack and Grossman's TAR Evaluation Toolkit (SIGIR 2014), we show that continuous active learning, applied to a multi-faceted topic, efficiently achieves high recall for each facet of the topic. Our results assuage the concern that continuous active learning may achieve high overall recall at the expense of excluding identifiable categories of relevant information.
Although prediction plays a prominent role in mental processing, we have only limited understanding of how the brain generates and employs predictions. This paper develops a theoretical framework in three steps. First, I propose a process model that describes how predictions are produced and are linked to behavior. Subsequently, I describe a generative mechanism, consisting of the selective amplification of neural dynamics in the context of boundary conditions. I hypothesize that this mechanism is active as a process engine in every mental process, and that therefore each mental process proceeds in two stages: (i) the formation of process boundary conditions; (ii) the bringing about of the process function by the operation within the boundary conditions of a relatively 'blind' generative process. Thirdly, from this hypothesis I derive a strategy for describing processes formally. The result is a multilevel framework that may also be useful for studying mental processes in general. (C) 2015 The Author. Published by Elsevier Ltd.