ISBN: (print) 0780339061
A new approach to three-dimensional measurement of an object surface is introduced. A single TV camera, fitted with an apparatus that adds a circular bias to the image, records the three-dimensional information of the measuring points as circular streaks on a single image. The shape of each circular streak on the image plane is related to the position of the corresponding measuring point. The information is extracted from the image using an image processing technique.
ISBN: (print) 9781728185514
Multimodal image registration has received increasing attention in computer vision and computational photography. However, nonlinear intensity variations prevent accurate feature point matching between image pairs of different modalities. Thus, a robust image descriptor for multi-modal image registration is proposed, named the shearlet-based modality robust descriptor (SMRD). The anisotropic features of edge and texture information at multiple scales are encoded to describe the region around a point of interest based on the discrete shearlet transform. We conducted experiments to compare the proposed SMRD with several state-of-the-art multi-modal/multispectral descriptors on four different multi-modal datasets. The experimental results showed that SMRD outperforms the other methods in terms of precision, recall and F1-score.
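For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how precision, recall and F1-score are commonly computed for feature matching. It assumes matches are represented as index pairs and that a predicted match counts as correct only if it appears in the ground-truth set. The function name `match_scores` and the sample data are hypothetical, and the abstract does not state the exact correctness criterion used in the paper.

```python
def match_scores(predicted, ground_truth):
    """Precision, recall and F1 for a set of predicted point matches.

    Both arguments are iterables of (index_in_image_a, index_in_image_b)
    pairs. Assumption: a match is correct only if it is in ground_truth.
    """
    predicted = set(predicted)
    ground_truth = set(ground_truth)
    true_pos = len(predicted & ground_truth)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only: four predicted matches, five true correspondences.
pred = [(0, 0), (1, 1), (2, 5), (3, 3)]
truth = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
precision, recall, f1 = match_scores(pred, truth)
print(precision, recall, round(f1, 4))
```

In practice, descriptor benchmarks often accept a match when the reprojected point falls within a pixel threshold rather than requiring exact index equality; that variant only changes how the set of correct matches is built.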
ISBN: (print) 9798331529543; 9798331529550
The increasing demand for high-quality, real-time visual communication and growing user expectations, coupled with limited network resources, necessitate novel approaches to semantic image communication. This paper presents a method to enhance semantic image communication that combines a novel lossy semantic encoding approach with spatially adaptive semantic image synthesis models. By developing a model-agnostic training augmentation strategy, our approach substantially reduces susceptibility to distortion introduced during encoding, effectively eliminating the need for lossless semantic encoding. Comprehensive evaluation across two spatially adaptive conditioning methods and three popular datasets indicates that this approach enhances semantic image communication in very low bit-rate regimes.
ISBN: (print) 9781665475921
Ultra-high resolution image segmentation has attracted increasing attention recently due to its wide applications in scenarios such as road extraction and urban planning. An ultra-high resolution image captures more detailed information but also poses great challenges to the image understanding system. For memory efficiency, existing methods preprocess the global image and local patches into the same size, and so can only exploit local patches of a fixed resolution. In this paper, we empirically analyze the effect of different patch sizes and input resolutions on segmentation accuracy and propose a multi-scale collective fusion (MSCF) method to exploit information from multiple resolutions, which is end-to-end trainable for more efficient training. Our method achieves very competitive performance on the widely used DeepGlobe dataset while training on a single GPU.
ISBN: (print) 9781728180687
Reflection removal is a long-standing problem in computer vision. In this paper, we consider the reflection removal problem for stereoscopic images. By exploiting the depth information of stereoscopic images, a new background edge estimation algorithm based on the Wasserstein Generative Adversarial Network (WGAN) is proposed to distinguish the edges of the background image from the reflection. The background edges are then used to reconstruct the background image. We compare the proposed approach with the state-of-the-art reflection removal methods. Results show that the proposed approach can outperform the traditional single-image based methods and is comparable to the multiple-image based approach while having a much simpler imaging hardware requirement.
ISBN: (print) 9781728180687
This paper presents a novel near infrared (NIR) image colorization approach for the Grand Challenge held by the 2020 IEEE International Conference on Visual Communications and Image Processing (VCIP). A Cycle-Consistent Generative Adversarial Network (CycleGAN) with cross-scale dense connections is developed to learn the color translation from the NIR domain to the RGB domain based on both paired and unpaired data. Due to the limited number of paired NIR-RGB images, data augmentation via cropping, scaling, contrast and mirroring operations has been adopted to increase the variation of the NIR domain. An alternating training strategy has been designed, such that CycleGAN can efficiently and alternately learn the explicit pixel-level mappings from the paired NIR-RGB data, as well as the implicit domain mappings from the unpaired data. Based on the validation data, we have evaluated our method and compared it with the conventional CycleGAN method in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and angular error (AE). The experimental results validate the proposed colorization framework.
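Two of the metrics named above have short closed forms, sketched here in plain Python. This is an illustration under stated assumptions, not the challenge's evaluation code: pixel values are assumed to lie in the 0 to 255 range, images are passed as flat sequences, and the function names `psnr` and `angular_error_deg` are hypothetical. SSIM is omitted because it requires windowed statistics.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def angular_error_deg(rgb_a, rgb_b):
    """Angle in degrees between two RGB colour vectors."""
    dot = sum(a * b for a, b in zip(rgb_a, rgb_b))
    norm_a = math.sqrt(sum(a * a for a in rgb_a))
    norm_b = math.sqrt(sum(b * b for b in rgb_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a black pixel as having no hue error
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


print(psnr([100, 100], [110, 90]))
print(angular_error_deg((1, 0, 0), (0, 1, 0)))
```

The angular error ignores brightness and measures only the direction of the colour vector, which is why it complements PSNR when judging colorization results.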
ISBN: (print) 9798331529543; 9798331529550
Supported by powerful generative models, low-bitrate learned image compression (LIC) models utilizing perceptual metrics have become feasible. Some of the most advanced models achieve high compression rates and superior perceptual quality by using image captions as sub-information. This paper demonstrates that using a large multi-modal model (LMM), it is possible to generate captions and compress them within a single model. We also propose a novel semantic-perceptual-oriented fine-tuning method applicable to any LIC network, resulting in a 41.58% improvement in LPIPS BD-rate compared to existing methods. Our implementation and pre-trained weights are available at https://***/tokkiwa/imageTextCoding.
ISBN: (print) 9781728180687
Saliency prediction can be treated as the activity of the human visual system (HVS). The most effective method should closely approximate the response of the HVS to the perceived information. Motivated by the fact that the orientation selectivity (OS) mechanism occurring in the primary visual cortex (PVC) reveals how the HVS extracts visual information for scene understanding, we propose a novel saliency model that combines an orientation selectivity based local feature called the "excitement" map and a visual acuity based global feature called the "acuity" map. Further, a saliency augmentation operator based on visual error sensitivity is designed to enhance the saliency map. Experimental results on three benchmark databases demonstrate the superior performance of the proposed method compared to ten classical/state-of-the-art algorithms.
ISBN: (print) 9781728185514
Observers differ in their visual attention when viewing the same scene. Inter-observer visual congruency (IOVC) describes the dispersion between different people's visual attention areas when they observe the same stimulus. Research on the IOVC of video is interesting but lacking. In this paper, we first introduce a measure for calculating the IOVC of video. An eye-tracking experiment is then conducted in a realistic movie-watching environment to establish a movie scene dataset. We then propose a method to predict the IOVC of video, which employs a dual-channel network to extract and integrate content and optical flow features. The effectiveness of the proposed prediction model is validated on our dataset. The correlation between inter-observer congruency and video emotion is also analyzed.
ISBN: (print) 0819424358
This paper presents an approach to realistic motion field estimation. In this approach, an image is first segmented into homogeneous regions using a new multiscale gradient algorithm followed by watershed transformation. The multiscale gradient algorithm efficiently solves the over-segmentation problem of watershed transformation, increases segmentation accuracy and reduces the computational cost. The motion field is then estimated using block-matching with a consistency constraint. The consistency constraint function is defined by the neighboring motion vectors and the segmentation map. Simulation results show that the motion fields generated by the block-matching with consistency constraint are very smooth within each object, approaching realistic motion fields, even when a small block size is used.
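The idea of block-matching with a consistency constraint can be sketched as follows. The abstract does not give the form of the constraint function, so this example assumes a simple one: the matching cost is the sum of absolute differences (SAD) plus a weight `lam` times the L1 distance to the motion vectors of already-estimated left and top neighbour blocks that share the same segmentation label. All names (`estimate_motion`, `lam`, `labels`) and the toy data are hypothetical.

```python
def sad(prev, curr, y, x, dy, dx, size):
    """SAD between the block at (y, x) in curr and its displaced block in prev."""
    return sum(abs(curr[y + i][x + j] - prev[y + i + dy][x + j + dx])
               for i in range(size) for j in range(size))


def estimate_motion(prev, curr, labels, size=2, radius=1, lam=1.0):
    """Return {(block_y, block_x): (dy, dx)} for non-overlapping blocks.

    Assumed constraint: penalise disagreement with left/top neighbour
    vectors, but only for neighbours inside the same segmented region.
    """
    height, width = len(curr), len(curr[0])
    field = {}
    for by in range(0, height - size + 1, size):
        for bx in range(0, width - size + 1, size):
            label = labels[by][bx]
            neighbours = [field[k] for k in ((by, bx - size), (by - size, bx))
                          if k in field and labels[k[0]][k[1]] == label]
            best = None
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Skip displacements that leave the previous frame.
                    if not (0 <= by + dy and by + dy + size <= height
                            and 0 <= bx + dx and bx + dx + size <= width):
                        continue
                    penalty = sum(abs(dy - ny) + abs(dx - nx)
                                  for ny, nx in neighbours)
                    cost = sad(prev, curr, by, bx, dy, dx, size) + lam * penalty
                    # Ties are broken in favour of the shorter vector.
                    candidate = (cost, abs(dy) + abs(dx), dy, dx)
                    if best is None or candidate < best:
                        best = candidate
            field[(by, bx)] = (best[2], best[3])
    return field


# Toy frames: curr is prev shifted one pixel to the right (left column repeated).
prev = [[10 * y + x * x for x in range(6)] for y in range(6)]
curr = [[prev[y][max(x - 1, 0)] for x in range(6)] for y in range(6)]
labels = [[0] * 6 for _ in range(6)]  # a single region
field = estimate_motion(prev, curr, labels)
print(field[(2, 2)])
```

Because the penalty applies only within a region, vectors are smoothed inside an object while still being free to change across segmentation boundaries, which matches the behaviour the abstract describes for small block sizes.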