ISBN: (digital) 9781728150239
ISBN: (print) 9781728150239
The objective of this paper is self-supervised learning of spatio-temporal embeddings from video, suitable for human action recognition. We make three contributions. First, we introduce the Dense Predictive Coding (DPC) framework for self-supervised representation learning on videos. This learns a dense encoding of spatio-temporal blocks by recurrently predicting future representations. Second, we propose a curriculum training scheme to predict further into the future with progressively less temporal context. This encourages the model to encode only slowly varying spatio-temporal signals, therefore leading to semantic representations. Third, we evaluate the approach by first training the DPC model on the Kinetics-400 dataset with self-supervised learning, and then fine-tuning the representation on a downstream task, i.e. action recognition. With a single stream (RGB only), DPC pretrained representations achieve state-of-the-art self-supervised performance on both UCF101 (75.7% top-1 acc) and HMDB51 (35.7% top-1 acc), outperforming all previous learning methods by a significant margin, and approaching the performance of a baseline pre-trained on ImageNet. The code is available at https://***/TengdaHan/DPC.
ISBN: (print) 0819459763
Standard hybrid video coding systems are based on motion-compensated prediction with fractional-pel displacement vector resolution. In H.264/AVC, a fixed 6-tap interpolation filter is used to generate the half-pel-resolution reference blocks. Considering the non-stationary statistical properties of picture sequences, adaptive interpolation schemes have been introduced in the literature. One practical scheme uses a three-parameter 1D filter for a whole frame, adapted to the statistical properties of the source pictures. The problem with such an approach is that a single filter for the whole picture cannot adapt to local changes, so it is not an optimal solution for prediction. In this paper, a locally adaptive filter is proposed that adapts to the local statistics of the picture structure. 2D Wiener-Hopf filters of different sizes are simulated to show the possibility of decreasing the prediction error. Compared with 1D adaptive filtering, a lower prediction error as well as better overall coding performance can be achieved.
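To make the Wiener-Hopf idea concrete, here is a minimal 1D sketch in Python, not the paper's 2D local filter. It estimates interpolation taps by least squares, solving the normal equations R h = p built from a reference signal and the signal to be predicted. The function names (`solve`, `wiener_taps`), the 2-tap filter length, and the synthetic test signal are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def wiener_taps(ref, tgt, taps):
    """Least-squares taps h minimizing sum_n (tgt[n] - sum_k h[k] * ref[n + k])**2.

    Requires len(ref) >= len(tgt) + taps - 1.
    """
    n_samples = len(tgt)
    # Autocorrelation matrix of the reference samples under the filter window.
    R = [[sum(ref[n + j] * ref[n + k] for n in range(n_samples))
          for k in range(taps)] for j in range(taps)]
    # Cross-correlation between the target and the shifted reference.
    p = [sum(tgt[n] * ref[n + j] for n in range(n_samples)) for j in range(taps)]
    return solve(R, p)


# Synthetic example (assumed data): the target is an exact half-sample average.
ref = [float((7 * i * i + 3 * i) % 11) for i in range(41)]
tgt = [0.5 * ref[n] + 0.5 * ref[n + 1] for n in range(40)]
h = wiener_taps(ref, tgt, 2)
```

Because the synthetic target lies exactly in the span of the two shifted reference signals, the recovered taps match the averaging filter up to rounding. A locally adaptive scheme would repeat such an estimate per region rather than once per frame.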
ISBN: (print) 9781467361507; 9781467361491
The future of healthcare delivery systems and telemedical applications will undergo a radical change due to developments in wearable technologies, medical sensors, mobile computing and communication techniques. E-health, which was born from the integration of networks and telecommunications, deals with applications that collect, sort and transfer medical data from distant locations for remote medical collaboration and diagnosis. In recent years, healthcare systems have come to rely on images acquired in two-dimensional domains, in the case of still images, or three-dimensional domains, for volumetric images and video sequences. Images are acquired with many modalities, including X-ray, magnetic resonance imaging, ultrasound, positron emission tomography and computed axial tomography. Medical information is in either multidimensional or multi-resolution form, which creates an enormous amount of data. Retrieval, efficient storage, management and transmission of this voluminous data are highly complex. One solution to reduce this complexity is to compress the medical data without any loss (i.e. losslessly), so that diagnostic capabilities are not compromised. The proposed technique combines integer transforms and predictive coding to enhance the performance of lossless compression. The proposed techniques can be evaluated for performance using compression quality measures.
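As a rough illustration of how an integer transform and predictive coding can be combined losslessly, the Python sketch below applies a reversible integer Haar step (the S-transform) and then codes the low-pass band as differences from the previous value. This pairing is an assumption chosen for illustration; the abstract does not say which transform or predictor the authors use, and all function names here are hypothetical.

```python
def s_transform(x):
    """Reversible integer Haar step on an even-length list of ints."""
    low = [(x[i] + x[i + 1]) >> 1 for i in range(0, len(x), 2)]
    high = [x[i] - x[i + 1] for i in range(0, len(x), 2)]
    return low, high


def inverse_s_transform(low, high):
    """Exactly undo s_transform using integer arithmetic only."""
    out = []
    for l, h in zip(low, high):
        a = l + ((h + 1) >> 1)
        b = a - h
        out.extend([a, b])
    return out


def predict_encode(values):
    """Previous-sample prediction: keep only the integer residuals."""
    prev, residuals = 0, []
    for v in values:
        residuals.append(v - prev)
        prev = v
    return residuals


def predict_decode(residuals):
    """Rebuild the values by accumulating the residuals."""
    prev, values = 0, []
    for r in residuals:
        prev += r
        values.append(prev)
    return values


def encode(samples):
    low, high = s_transform(samples)
    return predict_encode(low), high


def decode(low_residuals, high):
    return inverse_s_transform(predict_decode(low_residuals), high)


# Assumed toy data standing in for one row of image samples.
samples = [100, 102, 105, 105, 103, 98, 97, 97]
low_residuals, high = encode(samples)
restored = decode(low_residuals, high)
```

After the first residual, the remaining values are small integers clustered around zero, which is what makes a subsequent entropy coder effective. Every step uses integer arithmetic, so the round trip reproduces the input exactly.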
ISBN: (print) 9781509059904
Sonar is commonly used in the underwater environment for navigation and obstacle detection. The large volume of data, generated by an array of acoustical sensors, has to be transmitted and stored, giving a great motivation for applying compression methods. This paper concentrates on source coding, based on the well-known linear predictive coding methods, optimized for such sources.
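For readers unfamiliar with linear predictive coding, the following Python sketch shows the textbook autocorrelation method with the Levinson-Durbin recursion. It is generic LPC, not the sonar-optimized variant the paper describes, and the function names and the synthetic decaying-exponential signal are assumptions for illustration.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) with prediction x_hat[n] = sum_k a[k-1] * x[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc(x, order):
    return levinson_durbin(autocorr(x, order), order)


def residual(x, coeffs):
    """Prediction error; samples before the start are treated as zero."""
    out = []
    for n in range(len(x)):
        pred = sum(c * x[n - 1 - k] for k, c in enumerate(coeffs)
                   if n - 1 - k >= 0)
        out.append(x[n] - pred)
    return out


# Assumed test signal: each sample is 0.9 times the previous one.
signal = [0.9 ** n for n in range(200)]
coeffs, err = lpc(signal, 2)
res = residual(signal, coeffs)
```

For this signal the first coefficient comes out very close to 0.9 and the second close to zero, so the residual after the first sample is nearly zero. Only the coefficients and the small residual then need to be coded, which is where the compression gain comes from.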
Predictive coding is a widely accepted hypothesis on how our internal visual perceptions are generated. Dynamical predictive coding with reservoir computing (PCRC) models have been proposed, but how they work remains to be clarified. Therefore, we first construct a simple PCRC network and analyze its underlying nonlinear dynamics. Since the influence of context is another important factor in visual perception, we also construct PCRC networks for a context-dependent task and observe their attractor landscapes in each context. (C) 2019 The Authors. Published by Atlantis Press SARL.
ISBN: (print) 9781538613115
Pain is a complex multidimensional experience, and pain perception is still incompletely understood. Here we combine animal behavior, electrophysiology, and computer modeling to dissect mechanisms of evoked and spontaneous pain. We record the local field potentials (LFPs) from the primary somatosensory cortex (S1) and anterior cingulate cortex (ACC) of freely behaving rats during pain episodes, and develop a predictive coding model to investigate the temporal coordination of oscillatory activity between the S1 and ACC. Our preliminary results from computational simulations support the experimental findings and provide new predictions.
ISBN: (print) 9781665476119
The neurobiological predictive coding model proposed by Rao and Ballard is one of the best-known and most carefully tested models in the current research space. The manifestation of predictive coding in animals' visual cortices (such as those of cats and monkeys) has been adequately demonstrated; however, due to the lack of analytical equipment for the nuanced study of the human brain, it has not been demonstrated comparably in humans. Recently, there has been a growing variety of opinions in neurobiology research about applying machine learning and artificial intelligence to further understand and investigate predictive coding theory. In this paper, we place the predictive coding neural network model (PredNet) in adversarial settings of the Wasserstein and conditional Wasserstein kind. Our experiments include approximately 60 combinatorial variants of the neural networks and two datasets. The results seem to substantiate a new perspective on predictive coding theory. In addition to presenting this perspective, we provide the performance profile of PredNet in an adversarial setting through extensive experimentation and analysis.
Component coding of the NTSC color TV signal is investigated. This coding involves digital demodulation of the composite signal sampled at three times the color subcarrier frequency, the implementation of the compress...
ISBN: (print) 9781479935901
We provide a theoretical analysis of the performance of differential predictive coding using fixed-lag smoothing of the standard decoder output. This performance is compared to related results for coding using latency at the encoder, and causal encoding with delayed decoding, as well as with some prior theoretical analyses of these methods. Surprisingly, it is shown that fixed-lag smoothing of the standard decoder output with causal encoding achieves the asymptotic and finite lag performance promised by a completely reoptimized decoder.
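As background for the "standard decoder output" mentioned above, here is a minimal Python sketch of differential predictive coding (DPCM) with a first-order predictor and a uniform quantizer inside the prediction loop. The predictor coefficient `a`, the step size, and the sinusoidal test signal are assumptions for illustration. The fixed-lag smoother analyzed in the paper is not implemented here; it would post-process `decoded` using a few future samples.

```python
import math


def dpcm_encode(samples, step, a=0.9):
    """Causal DPCM encoder: quantize the error of predicting from the last reconstruction."""
    indices = []
    recon_prev = 0.0
    for x in samples:
        pred = a * recon_prev
        q = round((x - pred) / step)  # uniform quantizer index
        indices.append(q)
        recon_prev = pred + q * step  # track what the decoder will see
    return indices


def dpcm_decode(indices, step, a=0.9):
    """Standard DPCM decoder: mirror the encoder's reconstruction loop."""
    out = []
    recon_prev = 0.0
    for q in indices:
        recon_prev = a * recon_prev + q * step
        out.append(recon_prev)
    return out


# Assumed test signal and quantizer step.
step = 0.1
signal = [math.sin(0.2 * n) for n in range(50)]
indices = dpcm_encode(signal, step)
decoded = dpcm_decode(indices, step)
max_error = max(abs(x - y) for x, y in zip(signal, decoded))
```

Because the encoder predicts from its own reconstruction rather than from the clean input, quantization errors do not accumulate, and with this unbounded uniform quantizer each decoded sample stays within half a step of the input.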
An action's end state can be anticipated by considering the agent's goal, or simply by projecting the movement trajectory. Theories suggest that individuals with autism spectrum condition (ASC) have difficulties anticipating others' goal-directed actions, caused by an impairment in using prior information. We examined whether children, adolescents and adults with and without ASC visually anticipate another's action based on its goal or movement trajectory by presenting participants with an agent repeatedly taking different paths to reach the same one of two targets. The ASC group anticipated the goal and not just the movement pattern, but needed more time to perform goal-directed anticipations. Results are in line with predictive coding accounts, which claim that the use of prior information is impaired in ASC.