ISBN: 9781665475921 (print)
This paper presents VASR, a concise end-to-end super-resolution model motivated by visual analysis, for image reconstruction. Compatible with the existing machine vision feature coding framework, VASR upscales the features extracted from the machine vision task model via super-resolution to reconstruct the original image for human vision. Experimental results show that, without additional bitstreams, VASR can reconstruct images from the extracted machine features, and that it achieves good results on the COCO, Open Images, TVD, and DIV2K datasets.
ISBN: 9781728180687 (print)
This paper proposes Graph Grouping (GG) loss for metric learning and applies it to face verification. GG loss encourages image embeddings of the same identity to be close to each other, and those of different identities to be far from each other, by constructing and optimizing graphs that represent the relations between images. Further, to reduce the computational cost, we propose an efficient way to compute GG loss for cases where the embeddings are L2-normalized. In experiments, we demonstrate the effectiveness of the proposed method for face verification on the VoxCeleb dataset. The results show that the proposed GG loss outperforms conventional losses for metric learning.
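The abstract does not say how the efficient computation works. As a hedged illustration only, the sketch below shows the general identity that shortcuts for L2-normalized embeddings commonly rely on: for unit vectors, the squared Euclidean distance equals 2 minus twice the dot product. The helper names are hypothetical and this is not the GG loss itself.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

# Two toy embeddings (illustrative values, not from the paper).
a = l2_normalize([3.0, 4.0])
b = l2_normalize([1.0, 0.0])

# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b),
# so pairwise distances can be read off a matrix of dot products.
direct = squared_distance(a, b)
via_dot = 2.0 - 2.0 * dot(a, b)
```

Under this identity, all pairwise distances in a batch follow from one matrix of dot products, which may be the kind of saving the authors have in mind.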
ISBN: 9781728185514 (print)
In the age of digital content creation and distribution, steganography, that is, the hiding of secret data within other data, is needed in many applications, such as secret communication between two parties and piracy protection. In image steganography, secret data is generally embedded in the image through an additional step after a mandatory image enhancement process. In this paper, we propose embedding the data during the image enhancement process itself. This saves the additional work required to encode the data separately inside the cover image. We use the alpha-trimmed mean filter for image enhancement and the XOR of the 6 MSBs to embed two bits of the bitstream in the 2 LSBs; extraction is the reverse process. Our quantitative and qualitative results are better than those of a methodology presented in a very recent paper.
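The abstract leaves the exact bit-level rule unspecified. The sketch below is one possible reading, stated as an assumption: each pixel is first smoothed with an alpha-trimmed mean, the 6 MSBs of the result are folded into a 2-bit key by XOR-ing three 2-bit groups, and the two secret bits are XOR-ed with that key before being written to the 2 LSBs. The function names and the folding rule are hypothetical, not the authors' specification.

```python
def alpha_trimmed_mean(window, d):
    """Mean of a window after dropping the d/2 smallest and d/2 largest values."""
    if d % 2 or d >= len(window):
        raise ValueError("d must be even and smaller than the window size")
    ordered = sorted(window)
    kept = ordered[d // 2 : len(ordered) - d // 2]
    return sum(kept) / len(kept)

def msb_key(pixel):
    """Fold the 6 MSBs of an 8-bit pixel into 2 bits (assumed folding rule)."""
    msbs = pixel >> 2
    return (msbs >> 4) ^ ((msbs >> 2) & 0b11) ^ (msbs & 0b11)

def embed(pixel, two_bits):
    """Write two secret bits, masked by the MSB key, into the 2 LSBs."""
    return (pixel & 0b11111100) | (two_bits ^ msb_key(pixel))

def extract(stego_pixel):
    """Recover the two secret bits; the MSBs are unchanged, so the key matches."""
    return (stego_pixel & 0b11) ^ msb_key(stego_pixel)

# Enhance one pixel from its 3x3 neighbourhood, then embed during the same pass.
window = [1, 2, 3, 4, 100, 5, 6, 7, 0]   # 100 is an impulse-noise outlier
enhanced = round(alpha_trimmed_mean(window, 2))
stego = embed(enhanced, 0b10)
recovered = extract(stego)
```

Because embedding touches only the 2 LSBs, the key derived from the 6 MSBs is identical on both sides, which is what makes extraction a simple reversal in this reading.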
ISBN: 9781728180687 (print)
A digital fish provenance and quality tracking system is essential for the seafood supply chain. As part of such a system, we develop a vision-based fish processing system that automatically performs fish freshness estimation, size measurement, and species classification. Under a constrained illumination environment, our system automates fish selection, thus greatly reducing human labour and bringing trust and efficiency to the seafood supply chain from catch to market.
Residential real estate prices are a key component of economic development and have long been a major concern of the public, the banking industry, government, and investors. The accurate estimation of the sale pr...
RDPlot is an open-source GUI application for plotting rate-distortion (RD) curves and calculating Bjøntegaard Delta (BD) statistics [1]. It supports parsing the output of commonly used reference software packages, par...
ISBN: 0819421030 (print)
In motion-compensated processing of image sequences, e.g. frame interpolation, frame rate conversion, deinterlacing, motion blur correction, image sequence restoration, and slow-motion replay, knowledge of the motion is essential. In these applications, motion information has to be determined from the image sequence. Most motion estimation algorithms use only a simple motion model and assume linear, constant-speed motion. The contribution of our paper is an algorithm for modeling and estimating accelerated motion trajectories, based on a second-order motion model. This model is more general and much closer to the real motion present in natural image sequences. The parameters of the accelerated motion are determined from two consecutive motion fields, which have been estimated from three consecutive image frames using a multiresolution pel-recursive Wiener-based motion estimation algorithm. The proposed algorithm was successfully tested on artificial image sequences with synthetic motion, as well as on natural real-life videophone and videoconferencing sequences, in a frame interpolation environment.
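The abstract does not give the model equations. A minimal sketch of a second-order fit, assuming three frames at t = -1, 0, 1 and a trajectory x(t) = x0 + v·t + a·t²/2, is shown below: the two displacement vectors then satisfy d_prev = v - a/2 and d_next = v + a/2, so v is their mean and a is their difference. The function names are hypothetical, and the paper's own estimator may differ.

```python
def accelerated_motion(d_prev, d_next):
    """Fit velocity and acceleration (per frame interval) to two displacements.

    d_prev: displacement from frame t-1 to frame t
    d_next: displacement from frame t to frame t+1
    """
    velocity = tuple((p + n) / 2 for p, n in zip(d_prev, d_next))
    acceleration = tuple(n - p for p, n in zip(d_prev, d_next))
    return velocity, acceleration

def position(x0, velocity, acceleration, t):
    """Position on the second-order trajectory at time t (frame t is time 0)."""
    return tuple(
        x + v * t + 0.5 * a * t * t
        for x, v, a in zip(x0, velocity, acceleration)
    )

# One pixel at (10, 10) in the middle frame, with illustrative displacements.
v, a = accelerated_motion((2.0, 0.0), (4.0, 1.0))
halfway = position((10.0, 10.0), v, a, 0.5)   # interpolated frame at t = 0.5
```

For frame interpolation, a constant-speed model would place the pixel halfway along d_next; the second-order model shifts it according to the estimated acceleration.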
ISBN: 9781728185514 (print)
In recent years, with the popularization of 3D technology, stereoscopic image quality assessment (SIQA) has attracted extensive attention. In this paper, we propose a two-stage binocular fusion network for SIQA, which takes binocular fusion, binocular rivalry, and binocular suppression into account to imitate the complex binocular visual mechanism of the human brain. In addition, saliency generating layers (SGLs) are applied in the network to extract spatial saliency features of the left view, the right view, and the fusion view. The SGLs apply multi-scale dilated convolution to emphasize the essential spatial information of the input features. Experimental results on four public stereoscopic image databases demonstrate that the proposed method outperforms state-of-the-art SIQA methods on both symmetrically and asymmetrically distorted stereoscopic images.
ISBN: 9781728185514 (print)
Learning-based image compression has reached the performance of classical methods such as BPG. One common approach is to use an autoencoder network to map the pixel information to a latent space and then approximate the symbol probabilities in that space with a context model. During inference, the learned context model provides the symbol probabilities, which the entropy encoder uses to produce the bitstream. Currently, the most effective context models use autoregression, but autoregression results in very high decoding complexity because the data must be processed serially. In this work, we propose a method to parallelize the autoregressive process used for image compression. In our experiments, we achieve a decoding speed more than 8 times faster than the standard autoregressive context model, with almost no reduction in compression performance.
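The abstract does not describe how the parallelization is done. Purely as a generic illustration of why a causal context permits parallel decoding, and not as the authors' method, the sketch below builds a wavefront schedule under the assumption that each latent position depends only on its left, top-left, top, and top-right neighbours. Positions sharing the same value of x + 2y then have no mutual dependency and can be decoded in one step. All names are hypothetical.

```python
def serial_steps(height, width):
    """Number of decoding steps when positions are processed one at a time."""
    return height * width

def wavefront_schedule(height, width):
    """Group positions (y, x) into steps that can each run in parallel.

    Assumes the context of (y, x) is limited to (y, x-1), (y-1, x-1),
    (y-1, x) and (y-1, x+1); all of these have a smaller x + 2*y.
    """
    steps = {}
    for y in range(height):
        for x in range(width):
            steps.setdefault(x + 2 * y, []).append((y, x))
    return [steps[t] for t in sorted(steps)]

schedule = wavefront_schedule(4, 4)
parallel_steps = len(schedule)          # compare with serial_steps(4, 4)
```

On larger latent grids the gap between the serial and the wavefront step counts grows quickly, which illustrates why serialized autoregression dominates decoding time.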
ISBN: 9781728185514 (print)
With the development of stereoscopic imaging technology, stereoscopic image quality assessment (SIQA) has become increasingly important, and designing a method in line with human visual perception is challenging because of the complex relationship between the binocular views. In this article, we first build a convolutional neural network (CNN) based on the visual pathway of the human visual system (HVS), which simulates different parts of the visual pathway such as the optic chiasm, the lateral geniculate nucleus (LGN), and the visual cortex. Second, the two pathways of our method simulate the 'what' and 'where' visual pathways, respectively, and are endowed with different feature extraction capabilities. Finally, we find a different application for 3D convolution, employing it to fuse the information from the left and right views rather than only to extract temporal features in video. The experimental results show that the proposed method agrees more closely with subjective scores and generalizes well.