ISBN: 9781479981311 (print)
Various saliency detection algorithms for color images have been proposed to mimic the eye fixations or attentive object detection responses of human observers viewing the same scenes. Developments in hyperspectral imaging systems, however, make it possible to obtain redundant spectral information about the observed scenes from the light reflected by objects. A few studies using low-level features on hyperspectral images have demonstrated that salient object detection can be achieved. In this work, we propose a salient object detection model for hyperspectral images that applies manifold ranking (MR) to self-supervised Convolutional Neural Network (CNN) features (high-level features) obtained from an unsupervised image segmentation task. Self-supervision of the CNN continues until the clustering loss or the saliency map converges to within a defined error between iterations. The final saliency estimate is the saliency map at the last iteration, when the self-supervision procedure terminates on convergence. Experimental evaluations demonstrate that the proposed saliency detection algorithm outperforms state-of-the-art hyperspectral saliency models, including the original MR-based saliency model.
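As a rough illustration of the manifold ranking step mentioned in this abstract, the sketch below uses the standard closed-form solution f* = (I - αS)^(-1) y on a fully connected Gaussian-affinity graph. It is not the authors' pipeline: the function name `manifold_ranking`, the parameters `alpha` and `sigma`, and the fully connected graph are assumptions made for the example, and the feature vectors stand in for whatever CNN features the paper extracts.

```python
import numpy as np


def manifold_ranking(features, queries, alpha=0.99, sigma=0.5):
    """Rank graph nodes by relevance to the query nodes (hypothetical helper).

    features: (n, d) array, one feature vector per node (e.g. per region).
    queries:  (n,) indicator vector, 1 for query nodes and 0 otherwise.
    """
    features = np.asarray(features, dtype=float)
    y = np.asarray(queries, dtype=float)

    # Gaussian affinity on pairwise squared distances (fully connected graph,
    # an assumption made to keep the sketch short).
    sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetrically normalised affinity S = D^(-1/2) W D^(-1/2).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # Closed-form ranking scores: solve (I - alpha * S) f = y.
    return np.linalg.solve(np.eye(len(y)) - alpha * S, y)
```

Nodes that lie close to a query node in feature space receive high scores, which is the property saliency models built on MR rely on when they use seed regions as queries.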
The segmentation task refers to the preliminary stage of image preprocessing. Further object detection, feature recognition, scene analysis and prediction of situations depend on its results. Modern segmentation ...
ISBN: 9781728180687 (digital)
ISBN: 9781728180694 (print)
This paper addresses the problem of image-based localization. The goal is to quickly and accurately find the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. We introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. The method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric-visual descriptor can be used with several image-based localization algorithms based on visual words. We test the approach on different datasets (indoor and outdoor) and show experimentally that the new geometric-visual descriptor improves standard image-based localization approaches.
Restoring blurred images to clear images is a challenging problem. Most previous methods only analyzed the single image. However, for motion blurring, this method missed the key trajectory description process. Based o...
ISBN: 9781509066315 (digital)
ISBN: 9781509066322 (print)
Building nonexpansive Convolutional Neural Networks (CNNs) is a challenging problem that has recently gained a lot of attention from the image processing community. In particular, it appears to be the key to obtaining convergent Plug-and-Play algorithms. This problem, which relies on accurate control of the Lipschitz constant of the convolutional layers, has also been investigated for Generative Adversarial Networks to improve robustness to adversarial perturbations. However, to the best of our knowledge, no efficient method has yet been developed to build nonexpansive CNNs. In this paper, we develop an optimization algorithm that can be incorporated into the training of a network to ensure the nonexpansiveness of its convolutional layers. This is shown to allow us to build firmly nonexpansive CNNs. We apply the proposed approach to train a CNN for an image denoising task and show its effectiveness through simulations.
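To make the Lipschitz-constant idea concrete: a linear layer is nonexpansive (1-Lipschitz in the Euclidean norm) exactly when its spectral norm is at most 1. The sketch below estimates that norm by power iteration and rescales the weights when it exceeds 1. This is a simple, commonly used stand-in and not the optimization algorithm proposed in the paper; it treats the layer as a dense matrix rather than a convolution, and the names `spectral_norm` and `make_nonexpansive` are hypothetical.

```python
import numpy as np


def spectral_norm(weight, n_iter=200, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    weight = np.asarray(weight, dtype=float)
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(weight.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = weight @ v
        u_norm = np.linalg.norm(u)
        if u_norm == 0.0:
            # The current vector lies in the null space; report zero.
            return 0.0
        u /= u_norm
        v = weight.T @ u
        v /= np.linalg.norm(v)
    return float(np.linalg.norm(weight @ v))


def make_nonexpansive(weight):
    """Rescale the matrix so that its estimated spectral norm is at most 1."""
    weight = np.asarray(weight, dtype=float)
    # Matrices that are already nonexpansive are left unchanged.
    return weight / max(1.0, spectral_norm(weight))
```

Power iteration approaches the true norm from below, so the rescaled matrix can exceed norm 1 by a small numerical margin. Firm nonexpansiveness, which the abstract also mentions, is a stronger property that this rescaling does not guarantee.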
ISBN: 9781728140698 (print)
Blind image steganalysis is the classification problem of determining whether or not an image contains hidden data. The blind process needs no prior information about the embedding algorithm used to hide data in the examined images. Recently, Convolutional Neural Networks (CNNs) have been applied to the blind image steganalysis classification problem, but most CNN-based image steganalysis approaches cannot cope with low payloads. The Improved Gaussian Convolutional Neural Network (IGNCNN) was presented with a transfer learning method to deal with stego-images with low payloads. IGNCNN contains a pre-processing layer consisting of a high-pass filter (HPF) with fixed, dataset-independent coefficients, and it uses a fixed learning rate. In this paper, a dynamic learning rate-based CNN approach is proposed in order to minimize the detection error cost. Moreover, instead of a fixed filter, the proposed approach uses a dataset-dependent Gaussian HPF as its preprocessing layer, so that the cutoff frequency can be chosen according to the training dataset. Experiments are performed on graphics processing units (GPUs) with the standard BOSSbase 1.01 dataset exposed to the S-UNIWARD and WOW image steganographic algorithms. Results show that the proposed approach outperforms the competing approaches GNCNN, improved GNCNN, SRM and SRM+EC, with average accuracy increases of 7.4%, 5.3%, 4.1% and 2.8%, respectively.
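The Gaussian high-pass preprocessing described here can be sketched in the frequency domain with the textbook transfer function H(u, v) = 1 - exp(-D(u, v)^2 / (2 D0^2)), where D0 is the cutoff. How the paper picks the cutoff from the training set is not stated in the abstract, so `cutoff` is left as a free parameter; the function name `gaussian_high_pass` is an assumption for this example.

```python
import numpy as np


def gaussian_high_pass(image, cutoff):
    """Frequency-domain Gaussian high-pass filter with cutoff D0 (in frequency samples)."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape

    # Integer frequency indices matching the unshifted FFT layout.
    fy = np.fft.fftfreq(rows) * rows
    fx = np.fft.fftfreq(cols) * cols
    dist_sq = fy[:, None] ** 2 + fx[None, :] ** 2

    # Zero gain at DC, gain approaching 1 at high frequencies.
    transfer = 1.0 - np.exp(-dist_sq / (2.0 * cutoff ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))
```

A high-pass stage like this suppresses image content and keeps the high-frequency residual, which is where embedding changes are typically easiest to detect.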
ISBN: 9781728141084 (digital)
ISBN: 9781728141091 (print)
In general, biomedical images contain a large amount of information, and handling such images in wireless communication is very challenging. Image retrieval is one of the fastest-emerging technologies in telemedicine for handling medical image data (MRI and CT). Brain tumor segmentation is a vital method for portraying the early stage of a tumor. Magnifying the tumor is an enormous challenge because of the complex characteristics of MRI images, which present high-intensity, divergent and uncertain boundaries. In this paper, the tumor in a brain MRI is segmented using a new formulation of the fuzzy C-means (FCM) algorithm, called the PIGFCM algorithm, combined with an image de-noising technique, which is an unavoidable pre-processing step in image processing. De-noising is performed with the PSNLM filter (Pre-Smooth Non-Local Means filter), which removes Rician noise, a kind of image noise that is difficult to remove. The PIGFCM algorithm is a reformulation of FCM that improves segmentation quality by utilizing prior information about the tumor classes. The proposed technique yields better de-noising results, substantially higher segmentation accuracy and good speed. The PSNR and the NMSE of the results are also calculated.
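For orientation, the baseline that PIGFCM reformulates can be sketched as standard fuzzy C-means on pixel intensities. The code below is the textbook FCM update only; it does not include the prior tumor-class information or the PSNLM de-noising step from the paper, and the function name, the linspace initialisation and the default parameters are assumptions.

```python
import numpy as np


def fuzzy_c_means(values, n_clusters=2, m=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy C-means on 1-D intensities; returns (centers, memberships)."""
    x = np.asarray(values, dtype=float).ravel()

    # Deterministic initialisation: spread the centres over the intensity range.
    centers = np.linspace(x.min(), x.max(), n_clusters)
    for _ in range(n_iter):
        # Distances of every sample to every centre, shape (clusters, samples).
        dist = np.abs(x[None, :] - centers[:, None]) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=0, keepdims=True)  # memberships sum to 1 per sample

        # Centre update: membership-weighted mean with fuzzifier m.
        um = u ** m
        new_centers = (um @ x) / um.sum(axis=1)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    # Memberships are those computed in the final iteration.
    return centers, u
```

Taking the cluster with the highest membership for each pixel gives a hard segmentation, while the soft memberships keep the uncertainty at blurred boundaries.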
ISBN: 9781728159706 (digital)
ISBN: 9781728159713 (print)
This paper presents a Field Programmable Gate Array (FPGA) implementation of audio and video processing with the ZedBoard as the target platform. The audio system design was implemented and verified using various audio files. Verification was done by listening through a speaker: a male-to-male stereo cable connected the line-in jack (blue plug) of the board to the audio port of a laptop or phone, and the speaker was connected to the line-out jack (green plug) of the board and to the USB port of the laptop. The video/monitoring system was implemented by interfacing the OV7670 CMOS camera module through the peripheral modules of the board and a VGA monitor through the VGA port of the board. The algorithms for the audio processing system were implemented in Verilog, a hardware description language, and the video processing system was implemented in VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL), with Vivado 2017.4 as the software platform. A step-by-step approach to designing real-time audio and video processing systems is discussed in this paper.
Pansharpening is the process of fusing multispectral (MS) images with a panchromatic (PAN) image to improve the spatial resolution of the MS images. The key to pansharpening is how to extract the lost detail from ...
ISBN: 9781728168968 (digital)
ISBN: 9781728168975 (print)
Stereoscopic image quality assessment (SIQA) has always been challenging because of the marked difference between human monocular and binocular vision. This paper proposes a novel gradient-based dictionary learning method for SIQA, which builds on the observation that gradients are sparser than the image itself. Specifically, we first compute the gradient map of each view of the stereopair, using the contrast sensitivity of the human visual system (HVS) and neighborhood gradient information to weight the gradient magnitudes in a locally adaptive manner. The gradient binocular perceptual information (GBPI) is then represented by the distribution statistics of visual primitives in the gradient maps of the left and right views, which are extracted by sparse representation. Furthermore, the entropy of the gradient map of each view is used to represent the monocular cue, and their mutual information is used to represent the binocular cue. The difference between the GBPIs of the reference and distorted images is taken as the quality-aware feature. Finally, kernel ridge regression (KRR) is used to model the nonlinear relationship between the quality-aware features and human opinions. The performance of the proposed metric is evaluated on the LIVE 3D Phase II asymmetric datasets and shown to be competitive with state-of-the-art SIQA algorithms.
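The final regression stage can be illustrated with a minimal kernel ridge regression in its dual form, α = (K + λI)^(-1) y, using an RBF kernel. The abstract does not name the kernel or the hyperparameters, so the RBF choice, `gamma`, `lam` and all function names below are assumptions; the feature rows stand in for the quality-aware features and the targets for subjective scores.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Pairwise RBF kernel exp(-gamma * ||a_i - b_j||^2) between rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def krr_fit(features, scores, gamma=1.0, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam * I) alpha = scores."""
    features = np.asarray(features, dtype=float)
    scores = np.asarray(scores, dtype=float)
    K = rbf_kernel(features, features, gamma)
    return np.linalg.solve(K + lam * np.eye(len(features)), scores)


def krr_predict(train_features, alpha, new_features, gamma=1.0):
    """Predict scores for new feature rows from the fitted dual coefficients."""
    train_features = np.asarray(train_features, dtype=float)
    new_features = np.asarray(new_features, dtype=float)
    return rbf_kernel(new_features, train_features, gamma) @ alpha
```

The same `gamma` must be passed to `krr_fit` and `krr_predict`; in practice both `gamma` and `lam` would be chosen by cross-validation on the training split.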