In this paper we present a new lossless image compression algorithm. To achieve high compression speed we use linear prediction, a modified Golomb-Rice code family, and a very fast prediction error modeling method. We compare the algorithm experimentally with others for medical and natural continuous-tone grayscale images with depths of up to 16 bits. Its results are especially good for large images, for natural images of high bit depths, and for noisy images. The average compression speed on an Intel Xeon 3.06 GHz CPU is 47 MB/s. For large images the speed exceeds 60 MB/s, i.e. the algorithm needs fewer than 50 CPU cycles per byte of image. Copyright (c) 2006 John Wiley & Sons, Ltd.
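The abstract above refers to a modified Golomb-Rice code family for prediction errors. A plain, unmodified Golomb-Rice coder can be sketched as follows; the error-mapping convention and the parameter k here are generic illustrations, not the paper's specific modification:

```python
def map_error(e):
    # Interleave signed prediction errors into non-negative integers:
    # 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
    return 2 * e if e >= 0 else -2 * e - 1

def rice_encode(n, k):
    # Golomb-Rice code with parameter k (divisor m = 2**k):
    # quotient n >> k in unary (q ones then a terminating zero),
    # followed by the remainder in k binary bits.
    q = n >> k
    rem = format(n & ((1 << k) - 1), "0{}b".format(k)) if k > 0 else ""
    return "1" * q + "0" + rem
```

For example, rice_encode(map_error(-2), 1) yields "101": the mapped value 3 gives a unary quotient of 1 followed by a one-bit remainder.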
Group activity analysis has attracted remarkable attention recently due to its widespread applications in security, entertainment, and the military. This article targets learning group activity representations with self-supervision, which differs from the majority of approaches that rely heavily on manually annotated labels. Moreover, existing Self-Supervised Learning (SSL) methods for videos are sub-optimal for generating such representations because of the complex context dynamics in group activities. In this article, an end-to-end framework termed the Contextualized Relation predictive Model (Con-RPM) is proposed for self-supervised group activity representation learning with predictive coding. It involves the Serial-Parallel Transformer Encoder (SPTrans-Encoder) to model the context of spatial interactions and temporal variations, and the Hybrid Context Transformer Decoder (HConTrans-Decoder) to predict future spatio-temporal relations guided by holistic scene context. Additionally, to improve the discriminability and consistency of prediction, we introduce a united loss integrating group-wise and person-wise contrastive losses at the frame level as well as an adversarial loss at the global sequence level. Consequently, our Con-RPM learns robust group representations by explicitly describing the temporal evolution of individual relationships and scene semantics. Extensive experimental results on downstream tasks indicate the effectiveness and generalization of our model in self-supervised learning, with state-of-the-art performance on the Volleyball, Collective Activity, VolleyTactic, and Choi's New datasets.
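The group-wise and person-wise contrastive losses mentioned in this abstract are, at their core, variants of the standard InfoNCE objective. A minimal generic version (an illustrative sketch, not the paper's exact formulation) looks like:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    # Generic InfoNCE contrastive loss over cosine similarities:
    # pulls the anchor toward its positive and away from negatives.
    def sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # for numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

The loss is low when the anchor is more similar to its positive than to any negative, which is the property the frame-level contrastive terms exploit.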
A rectangular transform is defined, and an algorithm for it has been developed and implemented. The results show that a reasonable amount of data compression is possible. An additional amount of data compression can also be obtained by modifying the kernel.
The aim of the study was to investigate how emotion information processing factors, such as alexithymia and emotional intelligence, modulate body ownership and influence multisensory integration during the 'rubber hand illusion' (RHI) task. It was previously shown that alexithymia correlates with the RHI, and we hypothesized that emotional intelligence should also act as a top-down factor of body ownership, since this had not been examined in previous experiments. We extended the study of Grynberg and Pollatos [Front. Hum. Neurosci. 9 (2015) 357] with an additional measure of emotional intelligence, and propose an explanation for the interrelation of emotion and body ownership processing. Eighty subjects took part in the RHI experiment and completed the Toronto Alexithymia Scale and the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT). Only the MSCEIT was found to be a significant predictor of the subjective measure of the RHI. There were no significant correlations between alexithymia scores and the test statements of the RHI or the proprioceptive drift; thus we did not replicate the results of Grynberg and Pollatos. However, alexithymia correlated with the control statements of subjective reports of the illusion, which might be explained as a disruption of the ability to discriminate and describe bodily experience. Therefore, (1) alexithymia seems to be connected with difficulties in conscious or verbal processing of body-related information, and (2) higher emotional intelligence might improve multisensory integration of body-related signals and reflect better predictive models of self-processing.
Author: Friston, KJ
UCL Inst Neurol, Wellcome Dept Imaging Neurosci, London WC1N 3BG, England
This article is about how the brain data mines its sensory inputs. There are several architectural principles of functional brain anatomy that have emerged from careful anatomic and physiologic studies over the past century. These principles are considered in the light of representational learning to see if they could have been predicted a priori on the basis of purely theoretical considerations. We first review the organisation of hierarchical sensory cortices, paying special attention to the distinction between forward and backward connections. We then review various approaches to representational learning as special cases of generative models, starting with supervised learning and ending with learning based upon empirical Bayes. The latter predicts many features seen in the real brain, such as a hierarchical cortical system, prevalent top-down backward influences, and functional asymmetries between forward and backward connections. The key points made in this article are: (i) hierarchical generative models enable the learning of empirical priors and eschew prior assumptions about the causes of sensory input that are inherent in non-hierarchical models. These assumptions are necessary for learning schemes based on information theory and efficient or sparse coding, but are not necessary in a hierarchical context. Critically, the anatomical infrastructure that may implement generative models in the brain is hierarchical. Furthermore, learning based on empirical Bayes can proceed in a biologically plausible way. (ii) Backward connections are essential if the processes generating inputs cannot be inverted, or the inversion cannot be parameterised. Because these processes involve many-to-one mappings and are non-linear and dynamic in nature, they are generally non-invertible. This enforces an explicit parameterisation of generative models (i.e. backward connections) to afford recognition and suggests that forward architectures, on their own, are insufficient.
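A minimal sketch of the hierarchical generative model this argument rests on (standard predictive-coding notation, not reproduced verbatim from the article): the causes at each level are generated nonlinearly from the level above, so the priors at every level are empirical, supplied by the level above it,

```latex
% Hierarchical generative model: level i+1 furnishes empirical
% priors on the causes v_i at level i.
v_i = g_i(v_{i+1}, \theta_i) + \varepsilon_i, \qquad i = 1, \dots, n
```

Recognition inverts this hierarchy: backward connections convey the predictions g_i, while forward connections convey the prediction errors used to update each level's estimates.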
Motivated by a growing demand for an efficient motion compensated (MC) coder operating in real time, the authors propose a VLSI architecture based on parallel and pipelined processing for implementing the pel-recursive motion estimation algorithm for predictive coding of time-varying images. In order to maximize the processing concurrency, the displacement estimation process is divided into its integer and fractional part calculations, and the displacement estimation and the interpolation calculations are decoupled so that each calculation can be computed on a separate processor. The proposed architecture, which exploits pipelining, parallelism, and simple adjacent-neighbor interprocessor wiring, is appropriate for VLSI implementation. The performance of the proposed architecture on real image sequences is evaluated. Issues regarding fixed-point arithmetic and coding are discussed.
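Pel-recursive estimation refines a displacement estimate pixel by pixel with a gradient-descent step on the displaced frame difference (DFD). The sketch below is a generic Netravali-Robbins style update, simplified to nearest-integer sampling in place of the interpolation the architecture decouples onto a separate processor; the function name and step size are illustrative:

```python
import numpy as np

def pel_recursive_step(cur, prev, x, y, d, eps=0.01):
    # One steepest-descent update of the displacement estimate
    # d = (dx, dy) at pixel (x, y), using nearest-integer sampling.
    dx, dy = int(round(d[0])), int(round(d[1]))
    # Displaced frame difference between the current pixel and its
    # motion-compensated match in the previous frame.
    dfd = cur[y, x] - prev[y - dy, x - dx]
    # Central-difference spatial gradient of the previous frame
    # at the displaced position.
    gx = (prev[y - dy, x - dx + 1] - prev[y - dy, x - dx - 1]) / 2.0
    gy = (prev[y - dy + 1, x - dx] - prev[y - dy - 1, x - dx]) / 2.0
    # Descend on the squared DFD.
    return (d[0] - eps * dfd * gx, d[1] - eps * dfd * gy)
```

Because each update touches only a small neighborhood, the integer/fractional split and the estimation/interpolation decoupling described above map naturally onto separate pipeline stages.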
This paper presents a complete general-purpose method for still-image compression called adaptive prediction trees. Efficient lossy and lossless compression of photographs, graphics, textual, and mixed images is achieved by ordering the data in a multicomponent binary pyramid, applying an empirically optimized nonlinear predictor, exploiting structural redundancies between color components, then coding with hex-trees and adaptive runlength/Huffman coders. Color palettization and order statistics prefiltering are applied adaptively as appropriate. Over a diverse image test set, the method outperforms standard lossless and lossy alternatives. The competing lossy alternatives use block transforms and wavelets in well-studied configurations. A major result of this paper is that predictive coding is a viable and sometimes preferable alternative to these methods.
There is an urgent need from various multimedia applications to efficiently compress point clouds. The Moving Picture Experts Group has released a standard platform called geometry-based point cloud compression (G-PCC). However, its k-nearest neighbor (k-NN) based attribute prediction has limited efficiency for point clouds with rich texture and directional information. To overcome this problem, we propose a texture-aware attribute predictive coding framework in a point cloud diffusion model. In our work, attribute intra prediction is solved as a diffusion-based interpolation problem, and a general attribute predictor is developed. It is theoretically proven that the G-PCC k-NN based predictor is a degraded case of the proposed diffusion-based solution. First, a point cloud is represented as two levels of detail, with seeds as the inpainting mask and non-seed points to be predicted. Second, we design point cloud partial difference operators to perform energy-minimizing attribute inpainting from seeds to unknowns. Smooth attribute interpolation can be achieved via an iterative diffusion process, and an adaptive early termination is proposed to reduce complexity. Third, we propose a structure-adaptive attribute predictive coding scheme, where edge-enhancing anisotropic diffusion is employed to perform texture-aware attribute prediction. Finally, attributes of seeds are encoded first, and prediction residuals of the remaining points are progressively encoded into the bitstream. Experiments show the proposed scheme surpasses the state of the art by an average of 14.14%, 17.52%, and 17.87% BD-BR gains on the coding of Y, U, and V components, respectively. Subjective results on attribute reconstruction quality also verify the advantage of our scheme.
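The diffusion-based interpolation above can be illustrated in its simplest, isotropic form: hold the seed attributes fixed and repeatedly replace each unknown attribute by the average of its k nearest neighbors. This sketch deliberately omits the paper's partial difference operators, edge-enhancing anisotropic weighting, and early termination; the function name and parameters are illustrative:

```python
import numpy as np

def diffuse_attributes(points, attrs, seed_mask, k=4, iters=50):
    # Isotropic diffusion inpainting of point attributes:
    # seeds keep their attributes; non-seeds are iteratively set to
    # the mean attribute of their k nearest neighbors.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)               # exclude self
    nbrs = np.argsort(dist, axis=1)[:, :k]       # brute-force k-NN
    a = np.where(seed_mask, attrs, attrs[seed_mask].mean())
    for _ in range(iters):
        a = np.where(seed_mask, attrs, a[nbrs].mean(axis=1))
    return a
```

On a 1-D chain of points this iteration converges to linear interpolation between the seeds, which matches the intuition that the k-NN averaging predictor is a special (isotropic) case of the diffusion formulation.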
Context modeling is an extensively studied paradigm for lossless compression of continuous-tone images. However, without careful algorithm design, high-order Markovian modeling of continuous-tone images is too expensive in both computational time and space to be practical. Furthermore, the exponential growth of the number of modeling states in the order of a Markov model can quickly lead to the problem of context dilution; that is, an image may not have enough samples for good estimates of the conditional probabilities associated with the modeling states. In this paper, new techniques for context modeling of DPCM errors are introduced that can exploit context-dependent DPCM error structures to the benefit of compression. New algorithmic techniques of forming and quantizing modeling contexts are also developed to alleviate the problem of context dilution and reduce both time and space complexities. By innovative formation, quantization, and use of modeling contexts, the proposed lossless image coder has highly competitive compression performance and yet remains practical.
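Two standard building blocks of DPCM context modeling can make the above concrete. The median edge detector (MED) predictor and LOCO-I/JPEG-LS style gradient quantization shown here are well-known techniques of this family, used for illustration; they are not this paper's specific context formation scheme:

```python
def med_predict(w, n, nw):
    # Median edge detector: given the west, north, and north-west
    # neighbors, pick the horizontal or vertical neighbor near an
    # edge, otherwise use the planar estimate w + n - nw.
    if nw >= max(w, n):
        return min(w, n)
    if nw <= min(w, n):
        return max(w, n)
    return w + n - nw

def quantize_context(d):
    # Quantize a local gradient into one of 9 signed bins
    # (thresholds as in LOCO-I: 0, [1,2], [3,6], [7,14], 15+),
    # so that few modeling states avoid context dilution.
    s = -1 if d < 0 else 1
    mag = abs(d)
    if mag == 0:
        return 0
    if mag <= 2:
        return s * 1
    if mag <= 6:
        return s * 2
    if mag <= 14:
        return s * 3
    return s * 4
```

Quantizing each gradient to a handful of bins keeps the number of contexts small, so every context sees enough samples for reliable conditional probability estimates, exactly the dilution problem described above.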
Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications, including language modeling and speech processing. However, training these models often relies on backpropagation through time (BPTT), which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of backpropagation itself does not permit the use of nondifferentiable activation functions and is inherently sequential, making parallelization of the underlying training process difficult. Here, we propose the parallel temporal neural coding network (P-TNCN), a biologically inspired model trained by the learning algorithm we call local representation alignment. It aims to resolve the difficulties and problems that plague recurrent networks trained by BPTT. The architecture requires neither unrolling in time nor the derivatives of its internal activation functions. We compare our model and learning procedure with other BPTT alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization. We show that it outperforms these alternatives on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we denote as Bouncing NotMNIST, and Penn Treebank. Notably, our approach can, in some instances, outperform full BPTT as well as variants such as sparse attentive backtracking. Significantly, the hidden unit correction phase of P-TNCN allows it to adapt to new data sets even if its synaptic weights are held fixed (zero-shot adaptation) and facilitates retention of prior generative knowledge when faced with a task sequence. We present results that show the P-TNCN's ability to conduct zero-shot adaptation and online continual sequence modeling.
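The appeal of avoiding BPTT is that each layer can learn from a purely local signal. The following is a generic predictive-coding weight update of that flavor, not the actual P-TNCN/local-representation-alignment rule: the layer predicts its target from its state, and the update uses only the local prediction error, with no unrolling in time and no derivatives of activation functions:

```python
import numpy as np

def predictive_coding_step(W, z, x, lr=0.01):
    # Local predictive-coding update: predict x from state z,
    # then apply a Hebbian-like correction driven solely by the
    # local prediction error e (no backpropagation through time).
    e = x - W @ z                   # local prediction error
    W = W + lr * np.outer(e, z)     # outer-product weight update
    return W, e
```

Because the update for each layer depends only on quantities available at that layer, updates across layers (and across time steps) can in principle run in parallel, which is the structural advantage the abstract attributes to the P-TNCN.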