ISBN: 9781450366151 (print)
Person re-identification (ReID) is an important problem in computer vision, especially for video surveillance applications. The problem focuses on identifying people across different cameras or across different frames of the same camera. The main challenge lies in recognising the same person under large appearance and structure variations while differentiating between individuals. Recently, deep networks trained with the triplet loss have become a common framework for person ReID. However, the triplet loss focuses only on obtaining correct orderings on the training set, and we demonstrate that it performs poorly in a clustering task. In this paper, we design a cluster loss, which leads to model outputs with a larger inter-class variation and a smaller intra-class variation than the triplet loss. As a result, our model generalises better and achieves higher accuracy on the test set, especially for a clustering task. We also introduce a batch-hard training mechanism that improves the results and speeds up convergence.
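The abstract does not state the cluster loss in closed form. A minimal sketch follows, assuming a centroid-based formulation: samples are pulled toward their identity centroid, centroids of different identities are pushed apart, and the batch-hard variant keeps only the worst sample per identity. The margins and the helper name cluster_loss are illustrative assumptions, not the paper's exact recipe.

import torch

def cluster_loss(features, labels, intra_margin=0.5, inter_margin=1.5):
    """features: (N, D) embeddings; labels: (N,) identity ids.
    Assumes the batch contains several samples per identity and at
    least two identities."""
    classes = labels.unique()
    centroids = torch.stack([features[labels == c].mean(dim=0) for c in classes])
    # Intra-class term: pull samples toward their own centroid,
    # batch-hard style (penalize only the farthest sample per identity).
    intra = torch.zeros((), device=features.device)
    for i, c in enumerate(classes):
        d = (features[labels == c] - centroids[i]).norm(dim=1)
        intra = intra + torch.clamp(d - intra_margin, min=0).max()
    # Inter-class term: push centroids of different identities apart.
    cd = torch.cdist(centroids, centroids)
    mask = ~torch.eye(len(classes), dtype=torch.bool, device=features.device)
    inter = torch.clamp(inter_margin - cd[mask], min=0).mean()
    return intra / len(classes) + inter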
ISBN: 9781450366151 (print)
In practice, images can contain different amounts of noise in different color channels, which existing super-resolution approaches do not take into account. In this paper, we propose to super-resolve noisy color images by considering the color channels jointly. Noise statistics are blindly estimated from the input low-resolution image and used to assign different weights to the color channels in the data cost. The implicit low-rank structure of visual data is enforced via nuclear norm minimization with adaptive weights, added as a regularization term to the cost. Additionally, multi-scale details of the image are added to the model through another regularization term that involves projection onto a PCA basis, constructed from similar patches extracted across different scales of the input image. The results demonstrate the super-resolving capability of the approach in real scenarios.
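The weighted nuclear-norm regularizer described above is commonly minimized by weighted singular-value thresholding. The sketch below assumes a stack of similar patches as the low-rank matrix and an inverse-magnitude weight rule (small, noise-dominated singular values are shrunk more); both are standard choices, not necessarily the paper's.

import numpy as np

def weighted_svt(patch_matrix, lam=1.0, eps=1e-6):
    """patch_matrix: similar patches stacked as rows (approximately low-rank)."""
    U, s, Vt = np.linalg.svd(patch_matrix, full_matrices=False)
    w = lam / (s + eps)                # adaptive weights: larger for small singular values
    s_shrunk = np.maximum(s - w, 0.0)  # weighted soft-thresholding of the spectrum
    return (U * s_shrunk) @ Vt         # low-rank estimate of the patch stack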
ISBN: 9781450366151 (print)
In this work, we propose a computationally efficient compressive sensing based approach for very low bit rate lossy coding of hyperspectral (HS) image data by exploiting the redundancy inherent in this imaging modality. We divide the HS datacube into subsets of adjacent bands, each of which is encoded into a coded snapshot using a random code matrix. These coded snapshot images are then compressed using the wavelet-based SPIHT technique. Decompression from the coded snapshots at the receiver is performed using orthogonal matching pursuit with an overcomplete dictionary learned on a general-purpose training dataset. We provide ample experimental results and performance comparisons to substantiate the usefulness of the proposed method. In the proposed technique the encoder is free of any decoding loop, offering a significant saving in computation while still yielding much higher compression quality.
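For orientation, a minimal sketch of the coded-snapshot step described above: each group of adjacent bands is collapsed into a single image by a per-pixel random binary code. The group size and the Bernoulli code are illustrative assumptions; SPIHT coding of the snapshot and OMP recovery with the learned dictionary are omitted.

import numpy as np

def coded_snapshot(band_group, seed=0):
    """band_group: (K, H, W) adjacent HS bands; returns one (H, W) snapshot
    plus the code needed for recovery at the receiver."""
    rng = np.random.default_rng(seed)
    K, H, W = band_group.shape
    code = rng.integers(0, 2, size=(K, H, W)).astype(np.float64)  # random binary code matrix
    snapshot = (code * band_group).sum(axis=0)  # per-pixel coded sum across the K bands
    return snapshot, code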
ISBN: 9781510630543 (digital)
ISBN: 9781510630543 (print)
In dermatology, image processing allows non-contact and non-invasive metrological measurements. Psoriasis is an incurable skin disease of unknown origin. One of the most important tasks in the treatment of psoriasis is to evaluate the degree of the illness according to a severity score. Dermatologists use visual and tactile senses to assess lesion severity. In this article, we propose an automated methodology for objectively assessing the severity of psoriasis by measuring physical parameters of the skin. From the colorimetry and geometry obtained by photometric stereo, we determine the level of erythema and the skin thickness. Our results show that, for a low acquisition time, the scores obtained are highly correlated with those of dermatologists.
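The abstract does not detail the photometric-stereo step, so the following is a sketch of the classical Lambertian least-squares formulation: it recovers per-pixel albedo (usable for colorimetry) and surface normals (usable for geometry, from which relief such as lesion thickness can be integrated). Known light directions and Lambertian reflectance are illustrative assumptions.

import numpy as np

def photometric_stereo(images, light_dirs):
    """images: (L, H, W) grayscale shots under L lights; light_dirs: (L, 3) unit vectors."""
    L, H, W = images.shape
    I = images.reshape(L, -1)  # one column of intensities per pixel
    # Solve light_dirs @ G = I in the least-squares sense, G = albedo * normal.
    G, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)
    G = G.T.reshape(H, W, 3)
    albedo = np.linalg.norm(G, axis=2)
    normals = G / np.maximum(albedo[..., None], 1e-8)
    return albedo, normals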
ISBN: 9781450366151 (print)
Haze during bad weather drastically degrades the visibility of a scene. The degradation varies with the transmission coefficient/map (Tc) of the scene, so accurate estimation of Tc is the key step in reconstructing the haze-free scene. Previously, local as well as global priors were proposed to estimate Tc. We instead propose to integrate the local and global approaches to learn both point-level and object-level Tc. The proposed local encoder-decoder network (LEDNet) estimates the scene transmission map in two stages. In the first stage, the network estimates the point-level Tc using parallel convolutional filters and spatially invariant filtering. The second stage comprises a two-level encoder-decoder architecture that estimates the object-level Tc. We also propose a local air-light estimation (LAE) algorithm that obtains the air-light component of the outdoor scene. The combination of LEDNet and LAE improves the accuracy of the haze model in recovering the scene radiance. Structural similarity index, mean square error and peak signal-to-noise ratio are used to evaluate the proposed approach for single-image haze removal. Experiments on benchmark datasets show that LEDNet outperforms existing state-of-the-art methods for single-image haze removal.
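Once LEDNet has produced the transmission map and LAE the air-light, the scene radiance follows from inverting the standard haze imaging model I = J*t + A*(1 - t). A minimal sketch, assuming those two outputs are given (the flooring constant t_min is an illustrative safeguard, not from the paper):

import numpy as np

def recover_radiance(hazy, transmission, airlight, t_min=0.1):
    """hazy: (H, W, 3) image in [0, 1]; transmission: (H, W); airlight: (3,)."""
    t = np.clip(transmission, t_min, 1.0)[..., None]  # floor t to avoid amplifying noise
    J = (hazy - airlight) / t + airlight              # invert I = J*t + A*(1 - t)
    return np.clip(J, 0.0, 1.0)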
ISBN: 9781450366151 (print)
Recent trends in image segmentation have produced various large-scale networks with impressive performance on natural scene images. However, most of these networks come with costly overheads, such as large memory requirements or dependence on a huge number of parallel processing units; in most cases, costly graphics processing units (GPUs) are used to boost computational capability. For real-world products, however, we need to consider speed and performance as well as deployment cost. We propose a novel "spark" module, which is a combination of the "fire" module of SqueezeNet and depth-wise separable convolutions. Alongside this modified SqueezeNet encoder, we also propose the use of depth-wise separable transposed convolutions for the decoder. The resulting encoder-decoder network has approximately 49 times fewer parameters than SegNet and almost 223 times fewer parameters than fully convolutional networks (FCN). Even on a CPU, the network completes a forward pass for a single sample in approximately 0.39 seconds, which is almost 5.1 times faster than SegNet and almost 8.7 times faster than FCN.
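A minimal sketch of what a "spark" module could look like, combining the fire module's squeeze/expand structure with a depthwise separable 3x3 branch as described above. The channel counts, activations and exact wiring are illustrative assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class Spark(nn.Module):
    def __init__(self, in_ch, squeeze_ch, expand_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        # Depthwise separable replacement for the fire module's 3x3 expand branch:
        self.depthwise = nn.Conv2d(squeeze_ch, squeeze_ch, kernel_size=3,
                                   padding=1, groups=squeeze_ch)
        self.pointwise = nn.Conv2d(squeeze_ch, expand_ch, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        s = self.act(self.squeeze(x))
        e1 = self.act(self.expand1x1(s))
        e3 = self.act(self.pointwise(self.depthwise(s)))
        return torch.cat([e1, e3], dim=1)  # concatenate the two expand branches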
ISBN: 9781450366151 (print)
Video object segmentation aims to segment objects in a video sequence, given some user annotation indicating the object of interest. Although Convolutional Neural Networks (CNNs) have been used in the recent past for foreground segmentation in videos, adversarial training methods have not been used effectively for this problem, despite their extensive use for many other problems in computer vision. Earlier, flow features and motion trajectories were extensively used to capture the temporal consistency between subsequent frames when segmenting moving objects in videos. However, we show that our proposed framework, which processes the video frames independently using a deep generative adversarial network (GAN), is able to maintain temporal coherency across frames without any explicit trajectory-based information, and provides superior results. Our main contribution lies in introducing a GAN-based framework along with a novel Intersection-over-Union score based cost function for training the model, to solve the problem of foreground object segmentation in videos. The proposed method, when evaluated on popular real-world video segmentation datasets, viz. DAVIS, SegTrack-v2 and YouTube-Objects, exhibits substantial performance gains over recent state-of-the-art methods.
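A minimal sketch of a differentiable (soft) Intersection-over-Union objective of the kind the abstract adds to GAN training; the exact cost function in the paper may differ, and the adversarial generator/discriminator terms are omitted.

import torch

def soft_iou_loss(pred, target, eps=1e-6):
    """pred: (N, H, W) sigmoid outputs in [0, 1]; target: (N, H, W) binary masks."""
    inter = (pred * target).sum(dim=(1, 2))
    union = (pred + target - pred * target).sum(dim=(1, 2))
    return 1.0 - ((inter + eps) / (union + eps)).mean()  # minimize 1 - soft IoU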
ISBN: 9781450366151 (print)
Cloud computing is a highly promising paradigm in which computational resources from third parties are used to process outsourced data. Nonetheless, the distributed architecture of this concept poses many security and privacy threats to data owners. Shamir's secret sharing is an effective technique for distributing and processing secret images in the encrypted domain. However, it has some critical limitations, primarily due to the correlated information between image pixels. Our study addresses this problem by proposing a perfectly secure Shamir's secret sharing scheme for images. Our work builds upon the formal notion of perfect secrecy, encoding the Shamir shares in a particular manner such that the encoded shares do not reveal any additional information about the original image. Importantly, we provide both theoretical and empirical validation of our proposed approach. We have also performed several image filtering operations on the stored shares and found the resulting PSNR and NCC values to be similar in the plain and encrypted domains. Hence our work provides a privacy-preserving and secure framework for working with images over a cloud-based architecture.
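For reference, plain (k, n) Shamir sharing of a single pixel over the prime field GF(257) is sketched below; the paper's contribution, the additional perfectly secret encoding of these shares, is not reproduced here. Note that shares may take the value 256 and so need slightly more than 8 bits of storage.

import random

P = 257  # smallest prime above the 8-bit pixel range

def share_pixel(pixel, k=2, n=4):
    """Return n shares; any k of them reconstruct the pixel."""
    coeffs = [pixel] + [random.randrange(P) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the pixel from k shares."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        secret = (secret + yj * num * pow(den, P - 2, P)) % P  # pow(..., P-2, P) = inverse
    return secret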
Crypto-currency is a decentralized digital currency that can be used as a medium of exchange. As it has no central authority, it is unregulated and a volatile asset. Globally, the crypto market is worth a few trillion dollars. The Indian crypto market suffered greatly due to a ban imposed by the Reserve Bank of India (RBI), an Indian regulatory agency, which lasted around two years. Even after the ban and the cautions issued by the RBI, the crypto market in India reached the billion-dollar mark in no time. Currently, the crypto market has more than 10 million Indian investors, a number that is growing fast as exchange apps in the Indian market multiply and trading in crypto becomes easier. In the current research work, a Comparative Analysis of Crypto App in India (CACAI) has been performed, and the results exceed expectations.
ISBN: 9781450366151 (print)
Machine learning models are known to be susceptible to small but structured changes to their inputs that can result in wrong inferences. It has been shown that such samples, called adversarial samples, can be created rather easily for standard neural network architectures. These adversarial samples pose a serious threat to deploying state-of-the-art deep neural network models in the real world. We propose a feature augmentation technique called BatchOut to learn models that are robust to such examples. The proposed approach is a generic feature augmentation technique that is not specific to any adversary and handles multiple attacks. We evaluate our algorithm on benchmark datasets and architectures to show that models trained using our method are less susceptible to adversaries created using multiple methods.
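The abstract does not spell out BatchOut's exact operation. The sketch below shows a generic batch-level feature augmentation, shifting each sample's intermediate features toward a randomly chosen other sample in the batch; the convex mixing rule and the strength parameter are illustrative assumptions only.

import torch

def batch_feature_augment(features, strength=0.1):
    """features: (N, D) intermediate activations for one batch.
    Returns augmented features mixed within the batch."""
    perm = torch.randperm(features.size(0))
    return (1 - strength) * features + strength * features[perm]  # convex mix within batch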