ISBN: 9781665475921 (print)
Glass reflection is a problem when taking photos through glass windows or showcases. Because removing reflections enhances the visual quality of a captured image, we develop an intelligent reflection-elimination imaging device based on a polarizer to minimize the effect of reflections on images. The system mainly consists of a polarizing module, an image analysis module and a reflection removal module. Users can hold the device and capture images with minimal reflection by day or by night. The demo video is available at: https://***/10.6084/***.19687830.v1.
ISBN: 9781728180687 (print)
This paper addresses the problem of image-based localization. The goal is to quickly and accurately find the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. In this paper we introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. This method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric visual descriptor can be used with several image-based localization algorithms based on visual words. We test the approach on different datasets (indoor and outdoor) and show experimentally that the new geometric visual descriptor improves standard image-based localization approaches.
ISBN: 9781728185514 (print)
This paper presents a deep learning-based audio-in-image watermarking scheme. Audio-in-image watermarking is the process of covertly embedding audio watermarks in a cover image and extracting them from it. Using audio watermarks can open up possibilities for different downstream applications. To implement audio-in-image watermarking that adapts to the demands of increasingly diverse situations, a neural network architecture is designed to automatically learn the watermarking process in an unsupervised manner. In addition, a similarity network is developed to recognize the audio watermarks under distortions, thereby providing robustness to the proposed method. Experimental results show high fidelity and robustness of the proposed blind audio-in-image watermarking scheme.
ISBN: 9781728185514 (print)
Learning-based compression systems have shown great potential for multi-task inference from their latent-space representation of the input image. In such systems, the decoder is supposed to be able to perform various analyses of the input image, such as object detection or segmentation, besides decoding the image. At the same time, privacy concerns around visual analytics have grown in response to the increasing capabilities of such systems to reveal private information. In this paper, we propose a method to make latent-space inference more privacy-friendly using mutual information-based criteria. In particular, we show how organizing and compressing the latent representation of the image according to task-specific mutual information can make the model maintain high analytics accuracy while becoming less able to reconstruct the input image and thereby reveal private information.
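As a rough illustration of a mutual information-based criterion, the sketch below ranks quantized latent channels by their empirical mutual information with a task label. The abstract does not say how mutual information is estimated or how the latent representation is organized, so the plug-in estimator, the toy data, and the names `mutual_information` and `rank_channels` are all assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_channels(channels, labels):
    """Return channel names sorted from most to least informative about the labels."""
    scores = {name: mutual_information(values, labels) for name, values in channels.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy data (hypothetical): channel "a" determines the label, channel "b" is independent of it.
labels = [0, 0, 1, 1]
channels = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
order, scores = rank_channels(channels, labels)
```

Under this toy criterion, channels that score high would be kept for the analytics task, while low-scoring channels (which mainly help reconstruction) could be compressed more aggressively.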
ISBN: 9798331529543; 9798331529550 (print)
Quanta image sensors are a novel paradigm in image sensor technology. Their direct application to quanta image sensors-based imaging systems is challenging because a bit-plane image is a set of binary images. In this paper, we introduce spatiotemporal priors based on the intensity invariance and smoothness characteristics of the motion vector. Specifically, we model the observation that the spatiotemporal structure becomes more consistent when the image sequences are aligned with the correct motion vector. Moreover, the spatial smoothness prior is incorporated by applying a smoothing filter to the evaluation metrics of the motion vector candidates. The experimental results show that the proposed method is more effective than conventional methods.
ISBN: 9781665475921 (print)
This paper presents VASR, a concise end-to-end super-resolution model for image reconstruction that is motivated by visual analysis. Compatible with the existing machine vision feature coding framework, it upscales the features extracted from the machine vision task model by super-resolution to reconstruct the original image for human vision. The experimental results show that, without additional bit-streams, VASR can successfully reconstruct images from the extracted machine features, and achieves good results on the COCO, Openimages, TVD, and DIV2K datasets.
ISBN: 9781728180687 (print)
This paper proposes Graph Grouping (GG) loss for metric learning and its application to face verification. GG loss predisposes image embeddings of the same identity to be close to each other, and those of different identities to be far from each other, by constructing and optimizing graphs representing the relations between images. Further, to reduce the computational cost, we propose an efficient way to compute GG loss for cases where embeddings are L2-normalized. In experiments, we demonstrate the effectiveness of the proposed method for face verification on the VoxCeleb dataset. The results show that the proposed GG loss outperforms conventional losses for metric learning.
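The abstract does not describe how GG loss is computed, so the following is not the authors' method. It only illustrates a standard identity that often makes distance-based losses cheaper on L2-normalized embeddings: for unit vectors a and b, the squared Euclidean distance equals 2 - 2(a · b), so pairwise distances can be read off dot products. The helper names are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance computed directly from coordinates."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Two toy embeddings (hypothetical values), normalized to unit length.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([1.0, 0.0])

direct = squared_distance(a, b)
via_dot = 2.0 - 2.0 * dot(a, b)  # valid only because both vectors are unit length
```

Both expressions give the same value, which is why a batch of pairwise distances between normalized embeddings can be obtained from a single matrix of dot products.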
ISBN: 9781728180687 (print)
A digital fish provenance and quality tracking system is essential for the seafood supply chain. As a part of this system, we develop a vision-based fish processing system that automatically performs fish freshness estimation, size measurement and species classification. Under a constrained illumination environment, our system is able to automate fish selection, thus greatly reducing human labour and bringing trust and efficiency to the seafood supply chain from catch to market.
ISBN: 9781728185514 (print)
In the age of digital content creation and distribution, steganography, that is, the hiding of secret data within other data, is needed in many applications, such as secret communication between two parties and piracy protection. In image steganography, secret data is generally embedded within the image through an additional step after a mandatory image enhancement process. In this paper, we propose the idea of embedding data during the image enhancement process. This saves the additional work required to separately encode the data inside the cover image. We use the alpha-trimmed mean filter for image enhancement and the XOR of the 6 MSBs to embed two bits of the bitstream in the 2 LSBs, while extraction is the reverse process. Our quantitative and qualitative results are better than those of a methodology presented in a very recent paper.
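The sketch below shows one possible reading of this pipeline on a single pixel neighbourhood. The alpha-trimmed mean is the standard filter (sort the window, discard the extremes, average the rest). The abstract does not specify how the 6 MSBs are XORed, so the folding used in `key_bits` (bits 7, 6, 5 into one key bit and bits 4, 3, 2 into the other) is purely an assumption, as are all function names.

```python
def alpha_trimmed_mean(window, d):
    """Sort the window, drop d // 2 values from each end, and average the rest."""
    if d % 2 or d >= len(window):
        raise ValueError("d must be even and smaller than the window size")
    ordered = sorted(window)
    trim = d // 2
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) // len(kept)  # integer result so it fits an 8-bit pixel


def key_bits(pixel):
    """Assumed folding: XOR bits 7,6,5 into one key bit and bits 4,3,2 into the other."""
    high = ((pixel >> 7) ^ (pixel >> 6) ^ (pixel >> 5)) & 1
    low = ((pixel >> 4) ^ (pixel >> 3) ^ (pixel >> 2)) & 1
    return (high << 1) | low


def embed(pixel, two_bits):
    """Write two secret bits, masked by the key, into the 2 LSBs; the 6 MSBs are untouched."""
    return (pixel & 0xFC) | (two_bits ^ key_bits(pixel))


def extract(pixel):
    """Reverse the embedding: unmask the 2 LSBs with the same key from the 6 MSBs."""
    return (pixel & 0x03) ^ key_bits(pixel)


# A 3x3 neighbourhood (hypothetical values) flattened into a list, with two outliers.
window = [10, 12, 200, 11, 13, 0, 12, 14, 11]
filtered = alpha_trimmed_mean(window, d=2)
stego = embed(filtered, 0b10)
recovered = extract(stego)
```

Because the key depends only on the 6 MSBs, which embedding never modifies, the receiver can recompute it from the stego pixel alone and recover the two secret bits without the original image.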
RDPlot is an open source GUI application for plotting Rate-Distortion (RD)-curves and calculating Bjontegaard Delta (BD) statistics [1]. It supports parsing the output of commonly used reference software packages, par...