Compressed Sensing (CS) has emerged as an alternative method for effectively acquiring high-dimensional signals by exploiting the sparsity assumption. However, owing to their non-sparse and non-stationary nature, it is extremely difficult to process Electroencephalogram (EEG) signals using the CS paradigm. The success of Bayesian algorithms in recovering non-sparse signals has triggered research into CS-based models for neurophysiological signal processing. In this paper, we address the problem of temporal modeling of EEG signals using a Block Sparse Variational Bayes (SVB) framework. The temporal correlation of EEG signals is modeled blockwise using normal variance scale mixtures parameterized via some random and deterministic parameters. Variational inference is exploited to infer the random parameters, and Expectation Maximization (EM) is used to estimate the deterministic parameters. To validate the framework, we present experimental results on a benchmark Steady-State Visual Evoked Potential (SSVEP) dataset with a 40-target Brain-Computer Interface (BCI) speller using two frequency recognition algorithms, viz. Canonical Correlation Analysis (CCA) and L1-regularized Multiway CCA. Results show that the proposed temporal model is highly useful in processing SSVEP-EEG signals irrespective of the recognition algorithm used.
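For reference, the sketch below shows the standard CCA-based SSVEP frequency recognition step mentioned in the abstract: each candidate stimulus frequency is scored by the largest canonical correlation between the EEG segment and sinusoidal reference templates. It illustrates only the well-known CCA classifier, not the proposed block SVB temporal model; the sampling rate, harmonic count, and candidate frequencies are placeholder assumptions.

```python
import numpy as np

def max_canonical_corr(X, Y):
    # Largest canonical correlation between X (T x p) and Y (T x q), obtained from
    # the singular values of Qx^T Qy after centering and QR factorization.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)[0]

def ssvep_cca_classify(eeg, candidate_freqs, fs=250.0, n_harmonics=2):
    # Score each candidate stimulus frequency by CCA against sin/cos reference
    # templates and return the best-matching frequency.
    # fs and n_harmonics are assumed values, not taken from the paper.
    t = np.arange(eeg.shape[0]) / fs
    scores = []
    for f in candidate_freqs:
        ref = np.column_stack([g(2 * np.pi * f * (h + 1) * t)
                               for h in range(n_harmonics) for g in (np.sin, np.cos)])
        scores.append(max_canonical_corr(eeg, ref))
    return candidate_freqs[int(np.argmax(scores))], scores
```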
Image co-segmentation is the task of jointly segmenting two or more images that share common foreground objects. In this paper, we propose a novel graph convolutional neural network (graph CNN) based end-to-end model for performing co-segmentation. First, each input image is over-segmented into a set of superpixels. Next, a weighted graph is formed from the over-segmented images by exploiting spatial adjacency as well as intra-image and inter-image feature similarities among the image superpixels (nodes). Subsequently, the proposed network, consisting of graph convolution layers followed by node classification layers, classifies each superpixel either into the common foreground or its complement. During training, along with the co-segmentation network, an additional network is introduced to exploit the corresponding semantic labels, and the two networks share the same weights in the graph convolution layers. The whole model is learned in an end-to-end fashion using a novel cost function comprising a superpixel-wise binary cross-entropy and a multi-label cross-entropy. We also use empirical class probabilities in the loss function to deal with class imbalance. Experimental results show that the proposed technique is very competitive with state-of-the-art methods on two challenging datasets, Internet and Pascal-VOC.
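For intuition about the node-classification machinery, the following is a minimal sketch of a generic graph convolution layer with symmetric normalization (Kipf-and-Welling-style) applied to superpixel node features. The layer form, feature dimensions, and random weights are illustrative assumptions, not the paper's exact architecture or training setup.

```python
import numpy as np

def normalized_adjacency(A):
    # Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(H, A_norm, W):
    # One graph convolution: aggregate neighbour features, project, apply ReLU.
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy forward pass: 6 superpixel nodes with 8-d features -> per-node foreground probability.
rng = np.random.default_rng(0)
A = np.triu(rng.random((6, 6)) > 0.6, 1).astype(float)
A = A + A.T                               # undirected superpixel adjacency
H = rng.standard_normal((6, 8))           # per-superpixel feature vectors
A_norm = normalized_adjacency(A)
H1 = gcn_layer(H, A_norm, rng.standard_normal((8, 16)))
logits = H1 @ rng.standard_normal((16, 1))
probs = 1.0 / (1.0 + np.exp(-logits))     # superpixel-wise foreground probability
```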
In this work, we estimate ball possession statistics from the video of a soccer match. The ball possession statistics are calculated based on the valid pass counts of the two playing teams. We propose a player-ball interaction energy function to detect ball pass events. Based on the positions and velocities of the ball and the players, a model of the interaction energy is defined. The energy increases when the ball is close to and about to collide with a player; low energy denotes that the ball is moving freely and is not near any player. The interaction energy generates a binary state sequence which determines whether a pass is valid or a miss-pass. We assess the performance of our model on publicly available soccer videos and achieve close to 83% accuracy.
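The abstract does not give the functional form of the energy, so the sketch below is one plausible choice: a Gaussian proximity term scaled by the closing speed between ball and player, thresholded into the binary engaged/free state sequence. The parameters sigma and threshold, and the function names, are hypothetical.

```python
import numpy as np

def interaction_energy(ball_pos, ball_vel, player_pos, player_vel, sigma=2.0):
    # Hypothetical form: energy rises when the ball is close to a player and is
    # moving towards them (positive closing speed).
    rel_pos = player_pos - ball_pos                     # (n_players, 2)
    dist = np.linalg.norm(rel_pos, axis=-1)
    rel_vel = ball_vel - player_vel
    closing = np.maximum(np.einsum('...d,...d->...', rel_vel, rel_pos) / (dist + 1e-6), 0.0)
    return (1.0 + closing) * np.exp(-dist ** 2 / (2.0 * sigma ** 2))

def engagement_states(energy_per_frame, threshold=0.5):
    # Binary state sequence: 1 = ball engaged with some player, 0 = ball moving freely.
    return (np.asarray(energy_per_frame).max(axis=1) > threshold).astype(int)
```

A valid pass can then be counted when two consecutive "engaged" episodes in this state sequence involve players of the same team, and a miss-pass otherwise.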
Popular events are often video recorded simultaneously by a general crowd using smartphones. In the present work, we propose a robust recurrent neural network (RNN) based approach for geo-localizing these events using sensor data collected by user smartphones while recording such events. For this task we use GPS and compass sensors, which are commonly available on smartphones. The circular nature (modulo 2π) of the orientation data from the compass limits the ability of classical neural networks (NNs) to geo-localize these events. We mitigate this issue by incorporating circular nodes in our network and show the resulting performance improvements. We train the proposed NN model using simulated data and apply it directly to real data. We train several RNN models using this strategy and present our analyses. The proposed approach outperforms all previous approaches in terms of event geo-localization accuracy.
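A common way to cope with the wrap-around of compass data is to place headings on the unit circle and score errors by the smallest angular difference; the helper functions below sketch that idea. They are generic utilities under that assumption, not the paper's circular-node formulation.

```python
import numpy as np

def encode_heading(theta):
    # Map a compass heading (radians, modulo 2*pi) onto the unit circle so that
    # headings just below 2*pi and just above 0 become nearby network inputs.
    return np.stack([np.cos(theta), np.sin(theta)], axis=-1)

def angular_error(pred_theta, true_theta):
    # Smallest signed angular difference between two headings, respecting wrap-around.
    return np.angle(np.exp(1j * (pred_theta - true_theta)))
```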
Optimization of the tradeoff between computation time and image quality is essential for reconstructing high-quality magnetic resonance (MR) images from a limited number of acquired samples in a short time using compressed sensing (CS) algorithms. In this paper, we achieve this for edge-preserving non-linear diffusion reconstruction (NLDR), which eliminates the critical step-size tuning of total variation (TV) based CS-MRI. Based on optimization of the contrast parameter that controls noise and signal in sensitivity-modulated channel images, we propose a contrast-parameter-switching NLDR technique for a faster approximation of the reconstructed image without affecting image quality. The proposed algorithm exploits the difference in the extent of undersampling artifacts in the signal and background regions of the channel images to arrive at different estimates of the contrast parameter, leading to an effective optimization of speed and quality. While maintaining better image quality than conventional TV reconstruction, the switched NLDR also achieves a 25-35% gain in convergence time over NLDR without switching. This makes the switched NLDR a better candidate for fast reconstruction than traditional TV and NLDR approaches. In detailed numerical experiments, we compare and optimize the tradeoff for various state-of-the-art choices of the contrast parameter.
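For context, a single explicit step of classical Perona-Malik nonlinear diffusion is sketched below to show the role the contrast parameter plays in separating edges from noise. This is a generic illustration with assumed parameter names (step, kappa), not the paper's CS-MRI reconstruction, which additionally involves a data-fidelity term and multi-channel images.

```python
import numpy as np

def perona_malik_step(img, step=0.1, kappa=0.02):
    # One explicit Perona-Malik update: gradients much larger than the contrast
    # parameter kappa are treated as edges (diffusivity near 0, so they are kept),
    # while smaller gradients are treated as noise and smoothed away.
    def diffusivity(g):
        return np.exp(-(np.abs(g) / kappa) ** 2)
    gN = np.roll(img, -1, axis=0) - img
    gS = np.roll(img,  1, axis=0) - img
    gE = np.roll(img, -1, axis=1) - img
    gW = np.roll(img,  1, axis=1) - img
    return img + step * (diffusivity(gN) * gN + diffusivity(gS) * gS
                         + diffusivity(gE) * gE + diffusivity(gW) * gW)
```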
In this work, a memory-efficient topological map generation algorithm is proposed using local descriptors. A topological map is a graphical data structure in which each node signifies an area within an environment. These nodes are connected by links which ensure the presence of a physical path between the corresponding pair of areas. Experiments have been conducted with feature descriptors using a vocabulary-based approach; such approaches require a large amount of memory and time. To address this, a KD-tree based map generation algorithm is proposed in which each node of the tree stores a descriptor and a table of occurrence. This table stores the node ids of the locations where the corresponding descriptor is present. The map generation algorithm works in two stages. In the first stage, visual-similarity-based position identification is conducted to check for loop closures. This is followed by a corrective step that validates the loop-closure decision, if any. The table of occurrence keeps track of the presence of each descriptor, and the least occurring descriptors are pruned at regular intervals, making the algorithm memory-efficient. The approach has been evaluated on several benchmark datasets.
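As a rough illustration of the descriptor index with an occurrence table, the sketch below keeps descriptors in a SciPy KD-tree, records which map nodes each descriptor has been seen in, and prunes rarely observed descriptors. The class and method names, the match radius, and the pruning threshold are hypothetical, and rebuilding the tree after every observation is a simplification made for brevity.

```python
import numpy as np
from scipy.spatial import cKDTree

class DescriptorMap:
    # Hypothetical sketch of a descriptor index plus table of occurrence.
    def __init__(self, match_radius=0.4):
        self.descriptors = []        # stored feature vectors
        self.occurrences = []        # occurrences[i] = set of node ids containing descriptor i
        self.match_radius = match_radius
        self._tree = None

    def _rebuild(self):
        self._tree = cKDTree(np.asarray(self.descriptors)) if self.descriptors else None

    def add_observation(self, node_id, descs):
        # Match each new descriptor against the tree; update the table of occurrence
        # on a hit, otherwise store it as a new descriptor.
        for d in descs:
            if self._tree is not None:
                dist, idx = self._tree.query(d)
                if dist < self.match_radius:
                    self.occurrences[idx].add(node_id)
                    continue
            self.descriptors.append(np.asarray(d, dtype=float))
            self.occurrences.append({node_id})
        self._rebuild()

    def prune(self, min_occurrences=2):
        # Drop rarely seen descriptors at regular intervals to bound memory.
        keep = [i for i, occ in enumerate(self.occurrences) if len(occ) >= min_occurrences]
        self.descriptors = [self.descriptors[i] for i in keep]
        self.occurrences = [self.occurrences[i] for i in keep]
        self._rebuild()
```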
This article presents an algorithm for salient object detection by leveraging the Bayesian surprise of the Restricted Boltzmann Machine (RBM). Here, an RBM is trained on patches sampled randomly from the input image. Due to this random sampling, the RBM is likely to be exposed more to background patches than to object patches. Thus, the trained RBM will minimize the free energy of its hidden states with respect to the background patches rather than the object. According to the free energy principle, this implies minimizing Bayesian surprise, a measure of saliency based on the Kullback-Leibler divergence between the input and reconstructed patch distributions. Hence, when the trained RBM is exposed to patches from the object region, it yields a high divergence and, in turn, a high Bayesian surprise; pixels with high Bayesian surprise can therefore be considered salient. For each pixel, a neighborhood (of the same size as the training patches) is fed to the trained RBM to obtain the reconstructed patch. Thereafter, the Kullback-Leibler divergence between the input and reconstructed neighborhood of each pixel is computed to measure the Bayesian surprise and is stored in the corresponding position of a matrix to form the saliency map. Experiments are carried out on three datasets, namely MSRA-10K, ECSSD and DUTS. The results obtained demonstrate promising performance of the proposed approach.
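The per-pixel score can be pictured roughly as follows: reconstruct each neighborhood with the trained Bernoulli RBM and measure how far the reconstruction is from the input with a KL divergence over the normalized patches. This is a simplified sketch with assumed weight/bias names and a simple normalization; it does not reproduce the paper's exact patch distributions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_reconstruct(v, W, b_vis, b_hid):
    # One mean-field pass of a Bernoulli RBM: visible -> hidden means -> reconstructed visible means.
    h = sigmoid(v @ W + b_hid)
    return sigmoid(h @ W.T + b_vis)

def bayesian_surprise(patch, W, b_vis, b_hid, eps=1e-8):
    # KL divergence between the normalized input neighbourhood and its RBM
    # reconstruction, used here as the saliency score of the centre pixel.
    v = patch.ravel().astype(float)
    r = rbm_reconstruct(v, W, b_vis, b_hid)
    p = v / (v.sum() + eps) + eps
    q = r / (r.sum() + eps) + eps
    return float(np.sum(p * np.log(p / q)))
```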
Bayesian Sparse Signal Recovery (SSR) for Multiple Measurement Vectors (MMV), when the elements of each row of the solution matrix are correlated, is addressed in this paper. We propose a standard linear Gaussian observation model and a three-level hierarchical estimation framework, based on a Gaussian Scale Mixture (GSM) model with some random and deterministic parameters, to model each row of the unknown solution matrix. This hierarchical model induces a heavy-tailed marginal distribution over each row which encompasses several choices of distributions, viz. the Laplace distribution, Student's t-distribution and the Jeffreys prior. The Automatic Relevance Determination (ARD) mechanism introduces sparsity into the model. It is interesting to see that the Block Sparse Bayesian Learning framework is a special case of the proposed framework when the induced marginal is the Jeffreys prior. Experimental results for synthetic signals are provided to demonstrate its effectiveness. We also explore the possibility of using Multiple Measurement Vectors to model the Dynamic Hand Posture Database, which consists of temporally correlated hand posture sequences. By exploiting the temporal correlation present in successive image samples, the proposed framework can reconstruct the data from fewer random linear measurements with high fidelity.
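For orientation, the snippet below sketches a plain M-SBL/ARD iteration for the MMV observation model Y = ΦX + V, in which rows whose hyperparameters shrink toward zero are effectively pruned. It is a generic row-sparse baseline, without the GSM hierarchy or intra-row temporal correlation modeling proposed in the paper; the noise variance and iteration count are placeholders.

```python
import numpy as np

def msbl(Phi, Y, noise_var=1e-3, n_iter=50):
    # Basic M-SBL (ARD) iteration for the MMV model Y = Phi @ X + V.
    # Phi: (M, N) dictionary, Y: (M, L) measurement vectors.
    M, N = Phi.shape
    L = Y.shape[1]
    gamma = np.ones(N)                                   # one ARD hyperparameter per row of X
    for _ in range(n_iter):
        Gamma = np.diag(gamma)
        Sigma_y = noise_var * np.eye(M) + Phi @ Gamma @ Phi.T
        Sigma_y_inv = np.linalg.inv(Sigma_y)
        Mu = Gamma @ Phi.T @ Sigma_y_inv @ Y             # posterior mean of X
        Sigma_x_diag = gamma - np.einsum('ij,ji->i',
                                         Gamma @ Phi.T @ Sigma_y_inv, Phi @ Gamma)
        gamma = (Mu ** 2).sum(axis=1) / L + Sigma_x_diag  # EM/ARD update; small gamma => pruned row
    return Mu, gamma
```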
Monocular SLAM refers to using a single camera to estimate robot ego-motion while building a map of the environment. While Monocular SLAM is a well-studied problem, automating Monocular SLAM by integrating it with trajectory planning frameworks is particularly challenging. This paper presents a novel formulation based on Reinforcement Learning (RL) that generates fail-safe trajectories wherein the SLAM-generated outputs do not deviate significantly from their true values. In essence, the RL framework successfully learns the otherwise complex relation between perceptual inputs and motor actions and uses this knowledge to generate trajectories that do not cause failure of SLAM. We show systematically in simulations how the quality of SLAM improves dramatically when trajectories are computed using RL. Our method scales effectively across Monocular SLAM frameworks, both in simulation and in real-world experiments with a mobile robot.
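The coupling of perception and action can be caricatured with a tabular Q-learning update in which the reward penalizes states that put SLAM at risk (few tracked features, growing pose error). The state/action discretization and reward shape below are invented purely for illustration and are not the paper's formulation.

```python
import numpy as np

# Hypothetical sketch: tabular Q-learning over coarse "SLAM health" states.
N_STATES, N_ACTIONS = 10, 3            # e.g. binned tracked-feature count; forward/left/right

def reward(tracked_features, pose_error):
    # Encourage feature-rich, low-drift motion; heavily penalize near-failure of tracking.
    return 0.01 * tracked_features - pose_error - (10.0 if tracked_features < 20 else 0.0)

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    # Standard one-step Q-learning update.
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

Q = np.zeros((N_STATES, N_ACTIONS))
q_update(Q, s=3, a=0, r=reward(tracked_features=120, pose_error=0.2), s_next=4)
```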
In this paper, we propose a geometric feature and frame segmentation based approach for video summarization. Video summarization aims to generate a summarized video with all the salient activities of the input video. ...