Deep learning architectures have shown great success in various computer vision applications. In this study, we investigate several of the most popular convolutional neural network (CNN) architectures, namely GoogleNet, AlexNet, VGG19 and ResNet. Furthermore, we show possible early feature fusion strategies for visual object classification tasks. Concatenation of features, average pooling and maximum pooling are among the investigated fusion strategies. We obtain state-of-the-art results on the well-known Caltech-101, Caltech-256 and Pascal VOC 2007 image classification datasets.
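A minimal sketch of the three early-fusion strategies named above, assuming feature vectors have already been extracted from two CNN backbones; the array shapes and variable names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical per-image feature vectors from two CNN backbones,
# e.g. penultimate-layer activations (dimensions are illustrative).
feat_a = np.random.rand(4096)   # e.g. a VGG19 fully-connected layer
feat_b = np.random.rand(1024)   # e.g. a GoogleNet pooling layer

# 1) Concatenation: keep all dimensions from both networks.
fused_concat = np.concatenate([feat_a, feat_b])   # shape (5120,)

# Average and maximum pooling need a common dimensionality; truncating
# to the smaller size is one possible (assumed) choice.
d = min(feat_a.size, feat_b.size)
stacked = np.stack([feat_a[:d], feat_b[:d]])      # shape (2, d)

# 2) Element-wise average pooling across networks.
fused_avg = stacked.mean(axis=0)                  # shape (d,)

# 3) Element-wise maximum pooling across networks.
fused_max = stacked.max(axis=0)                   # shape (d,)

# Any of the fused vectors can then be fed to a classifier (e.g. an SVM).
```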
ISBN (print): 9781509017478
In this paper we present a novel framework for compressing images using saliency maps and KAZE features. The method adapts the quality factor of the JPEG compression scheme for each block instead of using the same quality factor for the entire image. This is achieved by adapting the JPEG quality parameter based on visual saliency and KAZE keypoints. Subsequently, a piecewise function is used to compress the least important image blocks at a higher compression ratio while maintaining the overall perceptual quality and avoiding blocking artifacts. This work introduces the use of KAZE keypoints for image compression for the first time in the literature. We show that the proposed method outperforms JPEG compression in terms of PSNR and FSIMc, especially at high compression ratios.
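The abstract does not give the piecewise mapping itself; the sketch below shows one plausible way to assign a per-block JPEG quality factor from a combined saliency/keypoint-density score. The thresholds, weights and quality values are assumptions for illustration only:

```python
import numpy as np

def block_quality(saliency_block, kaze_count, q_high=90, q_mid=60, q_low=30):
    """Map a block's importance to a JPEG quality factor.

    saliency_block : 2-D array of per-pixel saliency in [0, 1] for one block
    kaze_count     : number of KAZE keypoints falling inside the block
    The piecewise thresholds below are illustrative, not the paper's values.
    """
    score = saliency_block.mean() + 0.1 * kaze_count  # combined importance
    if score > 0.6:        # highly salient / feature-rich block
        return q_high
    elif score > 0.3:      # moderately important block
        return q_mid
    else:                  # least important block: compress more aggressively
        return q_low

# Example: a flat, non-salient block with no keypoints gets the low quality factor.
print(block_quality(np.zeros((8, 8)), kaze_count=0))   # -> 30
```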
ISBN (print): 9781467399623
Measuring visual quality, as perceived by human observers, is becoming increasingly important in the many applications in which humans are the ultimate consumers of visual information. This paper assesses the visual quality of mapping high dynamic range (HDR) images to standard dynamic range (SDR) images with 8 bits/color/pixel. In previous work, the Tone-Mapped image Quality Index (TMQI) compares the original HDR image with the rendered SDR image. TMQI quantifies distortions locally and pools them by uniform averaging, in addition to measuring the naturalness of the SDR image. For SDR images, perceptual pooling strategies have improved the correlation of image quality assessment (IQA) algorithms with subjective scores. The primary contributions of this paper are: (1) integrating local information-based pooling strategies into the TMQI IQA algorithm, (2) measuring image naturalness using mean-subtracted contrast-normalized pixels, and (3) testing the proposed methods, using subjective scores, on JPEG-compressed tone-mapped images and tone-mapped images for SDR displays.
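Mean-subtracted contrast-normalized (MSCN) coefficients are typically computed by local Gaussian normalization; a brief sketch under that assumption (the window parameter and stabilizing constant are illustrative, not necessarily the paper's choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7/6, eps=1.0):
    """Mean-subtracted contrast-normalized pixels of a grayscale image.

    Local mean and local standard deviation are estimated with a Gaussian
    window; eps stabilizes the division in flat regions.
    """
    image = image.astype(np.float64)
    mu = gaussian_filter(image, sigma)                              # local mean
    sigma_map = np.sqrt(np.abs(gaussian_filter(image**2, sigma) - mu**2))
    return (image - mu) / (sigma_map + eps)                         # MSCN coefficients

# The empirical distribution of these coefficients is what naturalness
# measures of this kind are built on.
```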
ISBN (print): 9781509049998
Visual hyperacuity is the capability of the human eye to see beyond the acuity defined by the number and size of its photoreceptors. Optical imaging systems suffer from diffraction because light passes through an aperture and a lens system. Common diffraction-limited devices are designed to limit this effect and capture a sharp image. Nevertheless, the human eye produces a noteworthy diffraction effect which exceeds these limits, projecting a blurred image onto the retina, where the photoreceptors are located. On this basis, it is difficult to understand the eye as operating as a diffraction-limited system. Assuming that diffraction could be helpful in achieving visual hyperacuity, we present a method that intends to simulate and explain it: by introducing a controlled diffraction, we are able to enhance image resolution using post-processing techniques such as interpolation and inverse filtering. Our approach uses diffraction, far from treating it as a limiting factor, to improve the resolution of images captured with a reduced number of sensors.
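A toy sketch of the pipeline described above: deliberately blur a scene (a stand-in for controlled diffraction), sample it coarsely, then recover detail by interpolation and a Wiener-style inverse filter. The Gaussian PSF, sampling factor and SNR value are assumptions for illustration only:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def simulate_and_restore(scene, psf_sigma=2.0, factor=4, snr=100.0):
    """Blur a scene (stand-in for controlled diffraction), sample it
    coarsely, then upsample and apply a Wiener-style inverse filter."""
    blurred = gaussian_filter(scene, psf_sigma)        # controlled "diffraction"
    sensed = blurred[::factor, ::factor]               # reduced number of sensors
    upsampled = zoom(sensed, factor, order=3)          # cubic interpolation

    # Build a centered Gaussian PSF and shift its peak to the origin for the FFT.
    psf = np.zeros_like(upsampled)
    psf[psf.shape[0] // 2, psf.shape[1] // 2] = 1.0
    psf = gaussian_filter(psf, psf_sigma)
    H = np.fft.fft2(np.fft.ifftshift(psf))

    G = np.fft.fft2(upsampled)
    W = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)      # Wiener inverse filter
    return np.real(np.fft.ifft2(G * W))

restored = simulate_and_restore(np.random.rand(64, 64))
```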
ISBN (print): 9781509017478
We present a no-reference (NR) image quality assessment (IQA) algorithm that is inspired by the representation of visual scenes in the primary visual cortex of the human visual system. Specifically, we use the sparse coding model of area V1 to construct an overcomplete dictionary for sparsely representing pristine (undistorted) natural images. First, we empirically demonstrate that the distribution of the sparse representation coefficients of natural images has sharp peaks and heavy tails, and can therefore be modeled using a Univariate Generalized Gaussian Distribution (UGGD). We then show that the UGGD model parameters form good features for distortion estimation and formulate our no-reference IQA algorithm based on this observation. Subsequently, we find UGGD model parameters that are representative of the class of pristine natural images. This is achieved using a training set of undistorted natural images. The perceptual quality of a test image is then defined as the likelihood of its sparse coefficients being generated from the pristine UGGD model. We show that the proposed algorithm correlates well with subjective evaluation over several standard image databases. Further, the proposed method allows us to construct a distortion map that has several useful applications, such as distortion localization and adaptive rate allocation. Finally and importantly, the proposed NR-IQA algorithm does not make use of any distortion information or subjective scores during the training process.
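One standard way to fit a UGGD is moment matching on the ratio of the second moment to the squared first absolute moment; the sketch below assumes the sparse-coding step has already produced a coefficient vector, and uses a simple grid search over the shape parameter (the grid range is an assumption):

```python
import numpy as np
from scipy.special import gamma

def fit_uggd(coeffs):
    """Moment-matching fit of a zero-mean univariate generalized Gaussian.

    Returns (alpha, beta): scale and shape such that p(x) ~ exp(-(|x|/alpha)^beta).
    """
    coeffs = np.asarray(coeffs, dtype=np.float64)
    sigma_sq = np.mean(coeffs**2)
    rho = sigma_sq / (np.mean(np.abs(coeffs))**2 + 1e-12)

    # Invert r(b) = Gamma(1/b) * Gamma(3/b) / Gamma(2/b)^2 by grid search.
    betas = np.arange(0.2, 10.0, 0.001)
    r = gamma(1.0 / betas) * gamma(3.0 / betas) / gamma(2.0 / betas)**2
    beta = betas[np.argmin((r - rho)**2)]
    alpha = np.sqrt(sigma_sq * gamma(1.0 / beta) / gamma(3.0 / beta))
    return alpha, beta

def uggd_loglik(coeffs, alpha, beta):
    """Log-likelihood of coefficients under a fitted UGGD (a stand-in for
    the 'pristine model' likelihood described in the abstract)."""
    norm = beta / (2.0 * alpha * gamma(1.0 / beta))
    return np.sum(np.log(norm) - (np.abs(coeffs) / alpha)**beta)
```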
ISBN (print): 9789811020520
The proceedings contain 63 papers. The special focus of this conference is on Social Computing. The topics include: a context-aware model using distributed representations for Chinese zero pronoun resolution; a hierarchical learning framework for steganalysis of JPEG images; a multi-agent organization approach for developing social-technical software of autonomous robots; a novel approach for the identification of morphological features from low quality images; a novel filtering method for infrared image; a personalized recommendation algorithm with user trust in social network; a preprocessing method for gait recognition; a real-time fraud detection algorithm based on usage amount forecast; a self-determined evaluation method for science popularization based on IOWA operator and particle swarm optimization; a strategy for small files processing in HDFS; an SVM-based feature extraction for face recognition; a transductive support vector machine algorithm based on ant colony optimization; an optimized buffer replacement algorithm for flash storage devices; an approach for automatically generating R2RML-based direct mapping from relational databases; an improved asymmetric bagging relevance feedback strategy for medical image retrieval; an incremental graph pattern matching based dynamic cold-start recommendation method; an optimized load balancing algorithm of dynamic feedback based on simulated annealing; application progress of signal clustering algorithm; automated artery-vein classification in fundus color images; clarity corresponding to contrast in visual cryptography; a community-oriented group recommendation framework; and improvement for the LEACH algorithm in wireless sensor network.
ISBN (print): 9781509028610
We propose an effective and efficient local decolorization method in this paper. It is an extension of the global decolorization method [6], which robustly reproduces the visual appearance of a color image in the grayscale output. The improvement of the local extension is the effective preservation of local color contrast, which may diminish in the global method. Meanwhile, the proposed local extension is efficient in that its computational complexity is O(1) per pixel, independent of the local kernel size. Quantitative evaluation against existing decolorization methods shows that our local extension performs favorably in both image quality and time cost. Moreover, our method can be extended into the temporal domain for robust video decolorization.
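The O(1)-per-pixel claim (cost independent of kernel size) is the kind of property obtained with summed-area tables; below is a brief, generic sketch of that idea for a local mean, offered only as an illustration of the complexity argument and not as the authors' actual decolorization algorithm:

```python
import numpy as np

def box_mean(img, radius):
    """Local mean over a (2*radius+1)^2 window in O(1) per pixel
    using an integral image (summed-area table), regardless of radius."""
    img = img.astype(np.float64)
    padded = np.pad(img, radius + 1, mode='edge')
    sat = padded.cumsum(axis=0).cumsum(axis=1)           # summed-area table
    k = 2 * radius + 1
    h, w = img.shape
    # Four-corner lookup gives every window sum with constant work per pixel.
    window_sum = (sat[k:k+h, k:k+w] - sat[:h, k:k+w]
                  - sat[k:k+h, :w] + sat[:h, :w])
    return window_sum / (k * k)
```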
ISBN (print): 9781509028610
The deep Convolutional Neural Network (CNN) is one of the most popular methods for image processing and recognition. Many research works aim to improve the performance of CNNs. However, the convolution kernel, an important part of CNNs, has rarely been discussed. Since one Original Convolution Kernel (OCK) can only detect one type of visual feature with a fixed deformation, networks using OCKs may learn many duplicate kernels with multiple deformations for one feature. In this paper, we propose a Complex Convolution Kernel (CCK), which can put the duplicate kernels together. Experiments on four popular datasets show that the performance of the networks is greatly improved by using CCKs, and suggest how to choose convolution kernels while designing networks.
ISBN (print): 9781509028610
Aiming to improve the limited robustness of single-feature tracking algorithms in visual object tracking, we put forward an object tracking algorithm based on game theory via multi-feature fusion. Under the Mean Shift tracking framework, the color features and the motion features expressed by optical flow are treated as two players; by searching for the Nash equilibrium of their game, each feature's contribution reaches an optimal balance, so the advantages of feature fusion are better exploited. Experimental results show that the algorithm is robust to drastic object motion, obstacle occlusion and background interference. In short, this study proposes a new algorithm that builds on the traditional Mean Shift algorithm and fuses multiple features through game theory, and the algorithm shows good tracking performance.
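The abstract does not specify the payoff structure; as a toy illustration of how a Nash equilibrium between two "players" (color and motion cues) can yield balanced fusion weights, here is a mixed-strategy equilibrium computation for an assumed 2x2 game, with invented payoff values:

```python
import numpy as np

# Toy payoff matrices (invented values): rows = color feature's strategies,
# columns = motion feature's strategies.
A = np.array([[3.0, 1.0],    # payoffs to the color-feature player
              [0.0, 2.0]])
B = np.array([[2.0, 0.0],    # payoffs to the motion-feature player
              [1.0, 3.0]])

def mixed_nash_2x2(A, B):
    """Interior mixed-strategy Nash equilibrium of a 2x2 bimatrix game,
    obtained from the standard indifference conditions."""
    p = (B[1, 1] - B[1, 0]) / (B[0, 0] - B[1, 0] - B[0, 1] + B[1, 1])
    q = (A[1, 1] - A[0, 1]) / (A[0, 0] - A[0, 1] - A[1, 0] + A[1, 1])
    return p, q   # probability each player assigns to its first strategy

p, q = mixed_nash_2x2(A, B)
# p and q could then be read as relative weights for the color and motion cues.
print(f"color weight ~ {p:.2f}, motion weight ~ {q:.2f}")
```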
ISBN (print): 9781509017478
An improved model for additive spread spectrum (SS) watermark detection on compressed sensing (CS) reconstructed images is presented in this paper. The mathematical form of the detection threshold in the log-likelihood ratio model is derived first, and it is seen that the detection probability depends on the embedding strength, the watermark power, and the host signal variance under CS, along with the noise variance in the observations/measurements. An optimization framework is then developed to minimize the visual distortion, which includes reconstruction and embedding distortion, while satisfying a detection reliability constraint. An approximate closed-form solution to the optimization problem, in terms of the embedding strength and the selection of appropriate host samples for a given number of CS measurements, is derived and validated by a large set of simulations.
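For reference, the textbook form of the log-likelihood ratio test for additive SS watermark detection under an i.i.d. Gaussian host-plus-noise model is given below; this is a standard derivation, not necessarily the exact expression obtained in the paper, but it shows the same dependence on embedding strength, watermark power and the combined host/noise variance:

```latex
% Hypotheses for a reconstructed sample y_i with host x_i, noise n_i,
% watermark chip w_i and embedding strength \alpha:
%   H_0: y_i = x_i + n_i, \qquad H_1: y_i = x_i + \alpha w_i + n_i,
% with x_i + n_i \sim \mathcal{N}(0,\sigma^2) i.i.d.  The log-likelihood
% ratio reduces to a correlation statistic compared against a threshold:
\Lambda(\mathbf{y}) \;=\; \ln \frac{p(\mathbf{y}\mid H_1)}{p(\mathbf{y}\mid H_0)}
 \;=\; \frac{\alpha}{\sigma^2}\sum_{i=1}^{N} y_i w_i
     \;-\; \frac{\alpha^2}{2\sigma^2}\sum_{i=1}^{N} w_i^2
 \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \tau ,
\qquad
P_D \;=\; Q\!\left(Q^{-1}(P_{FA}) - \frac{\alpha \lVert \mathbf{w} \rVert}{\sigma}\right).
```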