ISBN (Print): 9781450366151
Compressed Sensing (CS) has emerged as an alternative method to acquire high-dimensional signals efficiently by exploiting the sparsity assumption. However, owing to their non-sparse and non-stationary nature, it is extremely difficult to process Electroencephalograph (EEG) signals using the CS paradigm. The success of Bayesian algorithms in recovering non-sparse signals has triggered research in CS-based models for neurophysiological signal processing. In this paper, we address the problem of temporal modeling of EEG signals using a block Sparse Variational Bayes (SVB) framework. The temporal correlation of EEG signals is modeled blockwise using normal variance scale mixtures parameterized via some random and deterministic parameters. Variational inference is exploited to infer the random parameters, and Expectation Maximization (EM) is used to estimate the deterministic parameters. To validate the framework, we present experimental results for a benchmark Steady-State Visual Evoked Potential (SSVEP) dataset with a 40-target Brain-Computer Interface (BCI) speller using two frequency recognition algorithms, viz. Canonical Correlation Analysis (CCA) and L1-regularized Multiway CCA. Results show that the proposed temporal model is highly useful in processing SSVEP-EEG signals irrespective of the recognition algorithm used.
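The standard CCA frequency recognition step mentioned above can be sketched as follows: compute canonical correlations between the multichannel EEG and sinusoidal reference signals at each candidate stimulation frequency, and pick the frequency with the highest correlation. This is an illustrative sketch of plain CCA, not the paper's SVB temporal model; the function names and the QR-based canonical correlation computation are our own choices.

```python
import numpy as np

def max_canonical_corr(X, Y):
    # Largest canonical correlation between column spaces of X and Y.
    # Canonical correlations are the singular values of Qx^T Qy, where
    # Qx, Qy are orthonormal bases of the centered data matrices.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)[0]

def ssvep_cca_classify(eeg, fs, candidate_freqs, n_harmonics=2):
    # eeg: (samples, channels). Score each candidate frequency against
    # sine/cosine references at the fundamental and its harmonics.
    t = np.arange(eeg.shape[0]) / fs
    scores = []
    for f in candidate_freqs:
        refs = np.column_stack([fn(2 * np.pi * f * (h + 1) * t)
                                for h in range(n_harmonics)
                                for fn in (np.sin, np.cos)])
        scores.append(max_canonical_corr(eeg, refs))
    return candidate_freqs[int(np.argmax(scores))], scores
```

In practice the reference set, harmonic count, and window length are tuned per dataset; here they are fixed for illustration.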
ISBN (Print): 9781450366151
Machine learning models are known to be susceptible to small but structured changes to their inputs that can result in wrong inferences. It has been shown that such samples, called adversarial samples, can be created rather easily for standard neural network architectures. These adversarial samples pose a serious threat to deploying state-of-the-art deep neural network models in the real world. We propose a feature augmentation technique called BatchOut to learn models that are robust to such samples. The proposed approach is a generic feature augmentation technique that is not specific to any adversary and handles multiple attacks. We evaluate our algorithm on benchmark datasets and architectures to show that models trained using our method are less susceptible to adversaries created using multiple methods.
ISBN (Print): 9781450398220
State-of-the-art empirical work has shown that visual representations learned by deep neural networks are robust in nature and capable of performing classification tasks on diverse datasets. For example, CLIP demonstrated zero-shot transfer performance on multiple datasets for classification tasks in a joint embedding space of image and text pairs. However, it showed negative transfer performance on standard datasets, e.g., Birdsnap, RESISC45, and MNIST. In this paper, we propose ContextCLIP, a contextual and contrastive learning framework for the contextual alignment of image-text pairs by learning robust visual representations on the Conceptual Captions dataset. Our framework was observed to improve image-text alignment by aligning text and image representations contextually in the joint embedding space. ContextCLIP showed good qualitative performance for text-to-image retrieval tasks and enhanced classification accuracy. We evaluated our model quantitatively with zero-shot transfer and fine-tuning experiments on the CIFAR-10, CIFAR-100, Birdsnap, RESISC45, and MNIST datasets for the classification task.
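The contrastive objective underlying CLIP-style joint embedding spaces can be sketched as a symmetric cross-entropy over an image-text similarity matrix, where the matched pair of each batch row sits on the diagonal. This is a generic InfoNCE-style sketch, not ContextCLIP's contextual loss; `clip_style_loss` and the temperature value are illustrative assumptions.

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    # img_emb, txt_emb: (batch, dim), rows assumed L2-normalized and
    # paired by index (image i matches caption i).
    logits = img_emb @ txt_emb.T / temperature        # pairwise similarities

    def xent_diag(l):
        l = l - l.max(axis=1, keepdims=True)          # stable log-softmax
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.diag(logp).mean()                  # matched pairs on diagonal

    # symmetric loss: image-to-text and text-to-image directions
    return 0.5 * (xent_diag(logits) + xent_diag(logits.T))
```

With correctly matched pairs the loss is near zero; shuffling the captions against the images drives it up, which is what the training signal exploits.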
ISBN (Print): 9798400716256
Skeleton data plays an important role in human action recognition due to the compact and distinct information about human poses it provides. Skeleton-based action recognition is gaining interest due to the availability of Kinect cameras at a reasonable price. With the growing popularity of geometric deep learning, Graph Convolutional Networks (GCNs) are extensively used for processing skeleton data, due to their ability to model data topology. It has been found that Spatial-Temporal Graph Convolutional Networks (ST-GCN) are efficient in learning both spatial and temporal dependencies on non-Euclidean data, such as skeleton graphs. However, state-of-the-art ST-GCN models lack flexibility in feature extraction and do not explicitly consider the high-order spatio-temporal significance of the spatial connection topology and the intensity of the joints. They also lack an attention module that can jointly learn when and where to attend within an action sequence. Transformer-based methods can effectively capture long-distance dependencies; however, a purely traditional transformer approach overlooks the graph structure in the data, which can provide valuable information about the inter-relationships among the joint points during actions. To address this problem, we propose an architecture named Spatial-Temporal Transformer Network with Graph Convolution (STTGC-Net), which can flexibly capture local and global contexts. Spatial-Temporal Graph Convolutional Networks and Spatial-Temporal Transformer Attention Modules are sequentially fused to create the proposed framework. The ST-GCNs capture temporal dynamics, hierarchy, and local topological information at several levels, and the transformer attention module models the correlations between joint pairs in the global topology via dynamic attention, resolving the aforementioned constraints of GCNs and transformers. We validate our model by conducting tests on the NTU 60 and NTU 12
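The basic spatial graph convolution that ST-GCN builds on can be sketched as a symmetrically normalized neighborhood aggregation over the skeleton adjacency matrix. This is a minimal single-layer sketch with no temporal convolution or attention; the function name and tensor shapes are our own simplifications.

```python
import numpy as np

def spatial_graph_conv(X, A, W):
    # X: (joints, in_feats) per-frame joint features
    # A: (joints, joints) binary skeleton adjacency (bones)
    # W: (in_feats, out_feats) learnable weights
    A_hat = A + np.eye(len(A))                     # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))         # symmetric normalization
    # aggregate each joint's neighborhood, then project and apply ReLU
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
```

A full ST-GCN stacks such layers with temporal 1D convolutions across frames; the attention module in STTGC-Net would operate on the resulting joint features.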
We present a semi-automatic method for extracting the 3D boundary of cells in a compact tissue cross-section photographed by a confocal microscope. The confocal microscope provides images at different depths of the cells, which can be considered the image slices of the tissue section. Segmenting the cell boundary in the different image slices and combining them automatically to obtain a 3D surface is a difficult task. We have developed a layered approach in which, given one segmented image slice, the remaining image slices can be segmented automatically. The idea is to use the information from the previously segmented image slice to segment the current one.
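The layered propagation idea can be sketched as restricting the segmentation of the current slice to a band around the previous slice's mask. This is a simplified sketch that assumes a plain intensity threshold as the per-slice segmenter; `dilate` and `propagate_slice_segmentation` are illustrative names, not the paper's actual method.

```python
import numpy as np

def dilate(mask, iterations=1):
    # Binary dilation with a 3x3 structuring element via array shifts
    # (avoids external dependencies; scipy.ndimage would also work).
    m = mask.copy()
    H, W = mask.shape
    for _ in range(iterations):
        p = np.pad(m, 1)
        m = np.zeros_like(mask)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                m |= p[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return m

def propagate_slice_segmentation(prev_mask, curr_slice, band=3, thresh=0.5):
    # Keep only pixels that both look like cell interior in the current
    # slice and lie within a band around the previous slice's segmentation.
    search_region = dilate(prev_mask, iterations=band)
    return (curr_slice > thresh) & search_region
```

Iterating this over the stack, each slice's result seeds the next, so one manual segmentation propagates through the whole volume.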
ISBN (Print): 9781467377591
Real-time object tracking is the process of locating moving objects over time using a camera in video sequences in real time. The objective of object tracking is to associate target objects across consecutive video frames. Object tracking requires the location and the shape or features of objects in the video frames. Every tracking algorithm first needs to detect the moving object, so object detection is the step preceding object tracking in computer vision applications. The features of the detected object can then be extracted to track the moving object through the video scene. Tracking objects across consecutive frames is a challenging task in image processing. Various challenges can arise due to complex object motion, irregular object shape, object-to-object and object-to-scene occlusion, and real-time processing requirements. Object tracking has a variety of uses, including surveillance and security, traffic monitoring, video communication, robot vision, and animation.
ISBN (Print): 9781450366151
In this work, we estimate ball possession statistics from the video of a soccer match. The ball possession statistics are calculated based on the valid pass counts of the two playing teams. We propose a player-ball interaction energy function to detect ball pass events. A model for the interaction energy is defined based on the positions and velocities of the ball and players. The energy increases when the ball is close to and about to collide with a player; lower energy denotes that the ball is moving freely and is not near any player. The interaction energy generates a binary state sequence that determines a valid pass or a miss-pass. We assess the performance of our model on publicly available soccer videos and achieve close to 83% accuracy.
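One plausible form of such an interaction energy, rising when the ball is near a player and closing in on them, can be sketched as below. The Gaussian distance term, the closing-speed factor, and the threshold are our own illustrative choices, not the paper's actual energy function.

```python
import numpy as np

def interaction_energy(ball_pos, ball_vel, player_pos, sigma=2.0):
    # High when the ball is close to the player AND moving toward them.
    r = player_pos - ball_pos
    dist = np.linalg.norm(r)
    # closing speed: positive component of ball velocity toward the player
    approach = max(0.0, float(np.dot(ball_vel, r)) / (dist + 1e-9))
    return np.exp(-dist ** 2 / (2 * sigma ** 2)) * (1.0 + approach)

def possession_state(ball_pos, ball_vel, player_positions, threshold=0.5):
    # Binary state per frame: 1 if some player is interacting with the
    # ball, 0 if the ball is moving freely. Transitions in this sequence
    # are what a pass/miss-pass detector would segment.
    e = max(interaction_energy(ball_pos, ball_vel, p) for p in player_positions)
    return int(e > threshold)
```

Running `possession_state` per frame yields the binary state sequence mentioned above; a valid pass would correspond to a 1-0-1 pattern ending at a teammate.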
We integrate the advantages of SOM- and snake-based ACMs in order to extract the desired contour from images. We employ: (i) feature points to guide the contour, as in SOM-based ACMs; (ii) the gradient and intensity variations in a local region to control the contour movement. However, in contrast with snake-based ACMs, we do not use an explicit energy functional based on gradient or intensity. The algorithm is tested on synthetic, binary, and gray-level images, and the results show the superiority of the proposed algorithm over other conventional SOM- and snake-based ACM algorithms.
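The SOM-style update, in which contour points are attracted toward nearby feature points, can be sketched as below. This is a bare-bones illustration; the fixed step size and nearest-feature rule are our own simplifications, and it omits the local gradient/intensity control described above.

```python
import numpy as np

def contour_step(contour, feature_pts, lr=0.5):
    # contour: (n, 2) current contour vertices
    # feature_pts: (m, 2) detected feature points (e.g. edge points)
    # Move each contour vertex a fraction lr toward its nearest feature point.
    d = np.linalg.norm(contour[:, None, :] - feature_pts[None, :, :], axis=2)
    nearest = feature_pts[d.argmin(axis=1)]
    return contour + lr * (nearest - contour)
```

Iterating the step pulls the contour onto the feature points; a full SOM-based ACM would also update neighboring vertices to keep the contour smooth.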
ISBN (Print): 9781450366151
Optimization of the tradeoff between computation time and image quality is essential for reconstructing a high-quality magnetic resonance image (MRI) from a limited number of acquired samples in a short time using compressed sensing (CS) algorithms. In this paper, we achieve this for edge-preserving non-linear diffusion reconstruction (NLDR), which eliminates the critical step-size tuning of total variation (TV) based CS-MRI. Based on optimization of the contrast parameter that controls noise and signal in sensitivity-modulated channel images, we propose a contrast-parameter-switching NLDR technique for a faster approximation of the reconstructed image without affecting image quality. The proposed algorithm exploits the difference in the extent of undersampling artifacts between the signal and background regions of the channel images to arrive at different estimates of the contrast parameter, leading to an effective optimization of speed and quality. While maintaining better image quality than conventional TV reconstruction, the switched NLDR also achieves a 25-35% gain in convergence time over NLDR without switching. This makes the switched NLDR a better candidate for fast reconstruction than traditional TV and NLDR approaches. In detailed numerical experiments, we compare and optimize the tradeoff for various state-of-the-art choices of the contrast parameter.
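The underlying nonlinear diffusion step, with a contrast parameter kappa separating edges (preserved) from noise (smoothed), can be sketched as a classic Perona-Malik iteration. This is a generic sketch with periodic boundaries for brevity, not the paper's switched, sensitivity-modulated reconstruction; the switching idea would amount to using different kappa estimates in signal and background regions.

```python
import numpy as np

def nldr_step(u, kappa, dt=0.15):
    # One explicit Perona-Malik nonlinear diffusion step on image u.
    # Gradients larger than kappa are treated as edges and diffused less.
    # np.roll gives periodic boundaries, kept only for simplicity.
    gN = np.roll(u, 1, axis=0) - u
    gS = np.roll(u, -1, axis=0) - u
    gE = np.roll(u, -1, axis=1) - u
    gW = np.roll(u, 1, axis=1) - u
    c = lambda g: np.exp(-(g / kappa) ** 2)   # edge-stopping function
    return u + dt * (c(gN) * gN + c(gS) * gS + c(gE) * gE + c(gW) * gW)
```

With dt below 0.25 the explicit 4-neighbor scheme is stable; repeated steps smooth noise while the exponential edge-stopping function suppresses diffusion across strong gradients.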
ISBN (Print): 9781450366151
Image co-segmentation is the task of jointly segmenting two or more images sharing common foreground objects. In this paper, we propose a novel graph convolutional neural network (graph CNN) based end-to-end model for performing co-segmentation. First, each input image is over-segmented into a set of superpixels. Next, a weighted graph is formed from the over-segmented images, exploiting spatial adjacency and both intra-image and inter-image feature similarities among the image superpixels (nodes). Subsequently, the proposed network, consisting of graph convolution layers followed by node classification layers, classifies each superpixel either into the common foreground or its complement. During training, along with the co-segmentation network, an additional network is introduced to exploit the corresponding semantic labels, and the two networks share the same weights in the graph convolution layers. The whole model is learned in an end-to-end fashion using a novel cost function comprising a superpixel-wise binary cross-entropy and a multi-label cross-entropy. We also use empirical class probabilities in the loss function to deal with class imbalance. Experimental results show that the proposed technique is very competitive with state-of-the-art methods on two challenging datasets, Internet and Pascal-VOC.
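The weighted-graph construction over superpixels can be sketched as mixing Gaussian feature similarity with spatial adjacency, so that intra-image edges reflect both terms while inter-image edges rely on feature similarity alone. The mixing scheme, parameter names, and sigma/alpha values are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def superpixel_affinity(features, adjacency, sigma=1.0, alpha=0.5):
    # features: (n, d) mean feature vector per superpixel, pooled over
    #           all input images
    # adjacency: (n, n) binary spatial adjacency within each image
    #            (zero between superpixels of different images)
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian feature similarity
    # spatially adjacent pairs get an extra boost; all pairs (including
    # inter-image ones) keep the pure feature-similarity term
    W = alpha * sim * adjacency + (1 - alpha) * sim
    np.fill_diagonal(W, 0.0)
    return W
```

The resulting symmetric weight matrix is what the graph convolution layers would operate on when classifying each node as common foreground or background.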