ISBN (digital): 9781728180687
ISBN (print): 9781728180694
This paper addresses the problem of image-based localization. The goal is to quickly and accurately find the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. We introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. The method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric-visual descriptor can be used with several image-based localization algorithms built on visual words. We test the approach on different datasets (indoor and outdoor) and show experimentally that the new geometric-visual descriptor improves standard image-based localization approaches.
Restoring blurred images to clear images is a challenging problem. Most previous methods analyze only a single image. For motion blur, however, this approach misses the key process of describing the motion trajectory. Based o...
ISBN (digital): 9781509066315
ISBN (print): 9781509066322
Building nonexpansive Convolutional Neural Networks (CNNs) is a challenging problem that has recently gained a lot of attention from the image processing community. In particular, it appears to be the key to obtaining convergent Plug-and-Play algorithms. This problem, which relies on accurate control of the Lipschitz constant of the convolutional layers, has also been investigated for Generative Adversarial Networks to improve robustness to adversarial perturbations. However, to the best of our knowledge, no efficient method has yet been developed to build nonexpansive CNNs. In this paper, we develop an optimization algorithm that can be incorporated into the training of a network to ensure the nonexpansiveness of its convolutional layers. This is shown to allow us to build firmly nonexpansive CNNs. We apply the proposed approach to train a CNN for an image denoising task and show its effectiveness through simulations.
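As a rough illustration of what controlling the Lipschitz constant of a convolutional layer means, the sketch below computes the spectral norm of a 1-D circular convolution from the DFT of its kernel and rescales the kernel so that the layer becomes nonexpansive (Lipschitz constant at most 1). This is plain rescaling, not the optimization algorithm proposed in the paper. The function names and the 1-D circular setting are assumptions made for illustration only.

```python
import cmath


def circular_conv_lipschitz(kernel, n):
    """Spectral norm of 1-D circular convolution with `kernel` on length-n signals.

    The operator is a circulant matrix, so its singular values are the
    magnitudes of the DFT of the zero-padded kernel. Requires n >= len(kernel).
    """
    padded = list(kernel) + [0.0] * (n - len(kernel))
    magnitudes = []
    for k in range(n):
        coeff = sum(padded[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(coeff))
    return max(magnitudes)


def rescale_to_nonexpansive(kernel, n):
    """Divide the kernel by its Lipschitz constant when that constant exceeds 1."""
    lipschitz = circular_conv_lipschitz(kernel, n)
    if lipschitz <= 1.0:
        return list(kernel)
    return [w / lipschitz for w in kernel]


# Hypothetical example: a box kernel of three ones on length-8 signals.
box = [1.0, 1.0, 1.0]
print(circular_conv_lipschitz(box, 8))
print(rescale_to_nonexpansive(box, 8))
```

Rescaling only guarantees that this single layer is nonexpansive. It does not by itself give the firm nonexpansiveness discussed in the abstract.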
ISBN (print): 9781728140698
Blind image steganalysis is the classification problem of determining whether or not an image contains hidden data. The blind process needs no prior information about the embedding algorithm used to hide data in the examined images. Recently, Convolutional Neural Networks (CNNs) have been applied to the blind image steganalysis classification problem. Most CNN-based image steganalysis approaches cannot cope with low payloads. The Improved Gaussian Convolutional Neural Network (IGNCNN) was presented with a transfer learning method to deal with stego-images with low payloads. IGNCNN contains a pre-processing layer consisting of a high-pass filter (HPF) with fixed, dataset-independent coefficients. IGNCNN is also a CNN with a fixed learning rate. In this paper, a CNN approach with a dynamic learning rate is proposed in order to substantially reduce the detection error cost. In addition, the proposed approach uses a dataset-dependent Gaussian HPF as the pre-processing layer, so that the cutoff frequency is chosen according to the training dataset. Experiments are performed on graphics processing units (GPUs) with the standard BOSSbase 1.01 dataset, embedded with the S-UNIWARD and WOW image steganographic algorithms. Results show that the proposed approach outperforms the competing approaches GNCNN, improved GNCNN, SRM and SRM+EC by an average increase in accuracy of 7.4%, 5.3%, 4.1% and 2.8%, respectively.
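To make the role of the cutoff frequency concrete, the sketch below builds the textbook frequency-domain Gaussian high-pass mask H(u, v) = 1 - exp(-D^2 / (2 * D0^2)), where D is the distance from the centre of the spectrum and D0 is the cutoff. The abstract does not give the exact filter or how the cutoff is selected from the training data, so the formula, the function name and the parameter values here are assumptions.

```python
import math


def gaussian_highpass(rows, cols, cutoff):
    """Return a centred Gaussian high-pass mask as a list of lists.

    The mask is 0 at the centre (DC component) and approaches 1 at
    frequencies far from the centre. `cutoff` is D0 in frequency bins.
    """
    centre_u, centre_v = rows // 2, cols // 2
    mask = []
    for u in range(rows):
        row = []
        for v in range(cols):
            dist_sq = (u - centre_u) ** 2 + (v - centre_v) ** 2
            row.append(1.0 - math.exp(-dist_sq / (2.0 * cutoff ** 2)))
        mask.append(row)
    return mask


# Hypothetical 8x8 mask with a cutoff of 2 bins.
mask = gaussian_highpass(8, 8, 2.0)
print(mask[4][4], mask[0][0])
```

A smaller cutoff passes more of the low-frequency image content, while a larger one keeps only fine detail, which is why a cutoff tuned to the dataset could matter for weak stego signals.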
ISBN (digital): 9781728159706
ISBN (print): 9781728159713
This paper presents a Field Programmable Gate Array (FPGA) implementation of audio and video processing with the ZedBoard as the target platform. The audio system design was verified with various audio files. Verification was done by listening on a speaker, connecting a male-to-male stereo cable from the line-in (blue plug) of the board to the audio port of a laptop or phone, and the speaker from the line-out (green plug) of the board to the USB port of the laptop. The video/monitoring system was implemented by interfacing the OV7670 CMOS camera module through the peripheral modules of the board and a VGA monitor through the VGA port of the board. The audio processing algorithms were implemented in Verilog, a hardware description language, and the video processing system was implemented in VHSIC (Very High Speed Integrated Circuit) Hardware Description Language (VHDL), using Vivado 2017.4 as the software platform. A step-by-step approach to designing real-time audio and video processing systems is discussed in this paper.
ISBN (digital): 9781728141084
ISBN (print): 9781728141091
In general, biomedical images contain a large amount of information, which makes them very challenging to handle in wireless communication. Image retrieval is one of the most rapidly emerging technologies in telemedicine for handling medical image data (MRI and CT). Brain tumor segmentation is a vital method for depicting the initial stage of a tumor. Magnifying the tumor is an enormous challenge due to the complex characteristics of MRI images, which exhibit highly intense, divergent and uncertain boundaries. In this paper, the tumor in a brain MRI is segmented using PIGFCM, a new reformulation of the fuzzy C-means (FCM) algorithm, together with an image de-noising technique, which is an indispensable pre-processing step in image processing. De-noising uses the PSNLM filter (Pre-Smooth Non-Local Means filter) to remove Rician noise, which is present in the image and difficult to remove. The PIGFCM algorithm uses prior information about the tumor classes, which improves the quality of the segmentation. The proposed technique gives better de-noising results, substantially higher segmentation accuracy and good speed. The PSNR and the NMSE of the results are also calculated.
ISBN (print): 9781510626263
High-speed gated imaging methods such as time-of-flight or fluorescence lifetime imaging are key enablers for various applications such as gesture recognition, safety instrumentation, health monitoring and materials characterization. In these applications, short light pulses are used to generate and accumulate a photocurrent. Assuming linearity and time-invariance, this system can be modeled by a convolution of the incoming photon stream with an impulse response function (IRF), followed by a time-gated integration. Knowing the IRF allows for further improved signal analysis and sensor design. The IRF can be measured by employing light sources resembling a delta distribution or broadband-tunable sinusoidal waveforms. Both of these methods are difficult to realize for increasingly fast detectors. This paper discusses a deconvolution-based approach in which the signal shape of the employed light source is considered and corrected for. The IRF reconstruction schemes introduced in this paper are based on a preprocessing step that inverts the integration, followed by denoising and deconvolution. Different deconvolution algorithms have been investigated and compared. In particular, we investigated direct deconvolution, Wiener deconvolution and parametric estimation of a pre-defined IRF model using optimization. In order to evaluate the error of the different reconstruction methods in the presence of jitter and shot noise, a ground truth needs to be generated against which the deconvolution result can be compared. For this, example IRFs that resemble typical sensor behavior were defined using analytical models. A low normalized root-mean-square error (< 0.05) can be achieved with the parametric estimation. The advantages and disadvantages of each scheme are also discussed.
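The Wiener deconvolution step mentioned above can be sketched as follows. This is a minimal example under stated assumptions: circular convolution, a known light-source shape, a constant noise-to-signal ratio `nsr`, and a naive O(n^2) DFT to stay within the standard library. It does not include the integration-inverting preprocessing or the denoising described in the abstract, and all names and test signals are hypothetical.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def wiener_deconvolve(measured, source, nsr=1e-3):
    """Estimate h from measured = source (*) h via H = Y S* / (|S|^2 + nsr)."""
    Y = dft(measured)
    S = dft(source)
    H = [y * s.conjugate() / (abs(s) ** 2 + nsr) for y, s in zip(Y, S)]
    return [v.real for v in dft(H, inverse=True)]


# Hypothetical ground truth: an exponentially decaying IRF and a short pulse.
n = 16
source = [1.0, 0.5] + [0.0] * (n - 2)
irf = [math.exp(-t / 2.0) for t in range(n)]
measured = circular_convolve(source, irf)
estimate = wiener_deconvolve(measured, source, nsr=1e-9)
print(max(abs(a - b) for a, b in zip(estimate, irf)))
```

With noisy measurements, a larger `nsr` trades reconstruction sharpness for noise suppression. As `nsr` approaches zero the formula reduces to direct deconvolution.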
Pansharpening is a process of fusing the multispectral (MS) images with the panchromatic (PAN) image to improve the spatial resolution of the MS images. The key of pansharpening is how to extract the lost detail from ...
ISBN (digital): 9781728168968
ISBN (print): 9781728168975
Stereoscopic image quality assessment (SIQA) has always been challenging due to the marked difference between human monocular and binocular vision. This paper proposes a novel gradient-based dictionary learning method for SIQA, which exploits the fact that gradients are sparser than the image itself. Specifically, we first compute the gradient map of each view of the stereo pair, applying the contrast sensitivity of the human visual system (HVS) and neighborhood gradient information to weight the gradient magnitudes in a locally adaptive manner. Afterwards, the binocular perceptual information of the gradient (GBPI) is represented by the distribution statistics of visual primitives in the gradient maps of the left and right views, which are extracted by sparse representation. Furthermore, the entropy of each view's gradient map is used to represent the monocular cue, and their mutual information is used to represent the binocular cue. The difference between the GBPIs of the reference and distorted images is taken as the quality-aware features. Finally, kernel ridge regression (KRR) is used to model the nonlinear relationship between the quality-aware features and human opinions. The performance of the proposed metric is evaluated on the LIVE 3D Phase II asymmetric datasets and shown to be competitive with state-of-the-art SIQA algorithms.
ISBN (print): 9781728126036
In the past three years, deep convolutional neural networks (DCNNs) have achieved promising results in detecting skin cancer. However, improving the accuracy and efficiency of the automatic detection of melanoma remains urgent because benign and malignant dermoscopic images are visually similar. There is also a need for fast and computationally efficient systems for mobile applications targeting caregivers and homes. This paper presents the You Only Look Once (YOLO) algorithms, which are based on DCNNs, applied to the detection of melanoma. The YOLO algorithms comprise YOLOv1, YOLOv2 and YOLOv3, whose methodology first resizes the input image and then divides it into several cells. According to the position of the detected object in the cell, the network tries to predict the bounding box of the object and the class confidence score. Our test results indicate that the mean average precision (mAP) of YOLO can exceed 0.82 with a training set of only 200 images, indicating that this method has great advantages for detecting melanoma in lightweight system applications.
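The grid-cell step described above can be sketched as follows: the image is divided into a grid, and the cell that contains the centre of an object is the one responsible for predicting its bounding box. The 7x7 grid is the YOLOv1 default and is an assumption here, since the abstract does not state the grid size. The function names and the example box are hypothetical.

```python
def responsible_cell(cx, cy, img_w, img_h, grid=7):
    """Return (row, col) of the grid cell that contains the box centre."""
    col = min(int(cx / img_w * grid), grid - 1)
    row = min(int(cy / img_h * grid), grid - 1)
    return row, col


def encode_box(cx, cy, w, h, img_w, img_h, grid=7):
    """Encode a box as (row, col, x_offset, y_offset, w_norm, h_norm).

    Offsets are the centre position relative to the top-left corner of the
    responsible cell, in cell units. Width and height are normalised by the
    image size.
    """
    row, col = responsible_cell(cx, cy, img_w, img_h, grid)
    x_offset = cx / img_w * grid - col
    y_offset = cy / img_h * grid - row
    return row, col, x_offset, y_offset, w / img_w, h / img_h


# Hypothetical lesion box centred in a 448x448 image.
print(encode_box(224, 224, 100, 50, 448, 448))
```

In training, only the responsible cell is penalised for the box coordinates of that object, which is what ties the prediction to the object's position in the grid.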