ISBN: 9781728185514 (print)
Stereo image super-resolution (SR) has achieved great progress in recent years. However, existing methods have two major problems: parallax correction is insufficient, and cross-view information fusion occurs only at the beginning of the network. To address these problems, we propose a two-stage parallax correction and multi-stage cross-view fusion network for better stereo image SR results. Specifically, the two-stage parallax correction module consists of horizontal parallax correction and refined parallax correction. The first stage corrects horizontal parallax by parallax attention. The second stage uses deformable convolution to refine the horizontal parallax and correct vertical parallax simultaneously. Then, multiple cascaded enhanced residual spatial feature transform blocks are developed to fuse cross-view information at multiple stages. Extensive experiments show that our method achieves state-of-the-art performance on the KITTI2012, KITTI2015, Middlebury and Flickr1024 datasets.
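The first-stage idea of parallax attention can be illustrated with a minimal sketch. It assumes the common formulation in which each left-view pixel attends to every position on the same row of the right view; the learned projections, the deformable-convolution refinement, and the fusion blocks of the abstract are omitted, and the function names are hypothetical rather than taken from the paper.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def parallax_attention_row(left_row, right_row):
    """Attend along one image row (hypothetical helper).

    left_row, right_row: lists of feature vectors, one per pixel.
    Returns the right-view features warped to the left view and the
    attention map (one weight vector per left pixel).
    """
    dim = len(right_row[0])
    warped, attention = [], []
    for query in left_row:
        # Similarity between this left pixel and every right pixel on the row.
        scores = [sum(a * b for a, b in zip(query, key)) for key in right_row]
        weights = softmax(scores)
        attention.append(weights)
        # Weighted sum of right features = soft horizontal warp.
        warped.append([sum(w * key[d] for w, key in zip(weights, right_row))
                       for d in range(dim)])
    return warped, attention

# Toy row: the right view is the left view shifted by one pixel (cyclically).
left = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
right = [[0.0, 0.0, 10.0], [10.0, 0.0, 0.0], [0.0, 10.0, 0.0]]
warped, attention = parallax_attention_row(left, right)
peaks = [row.index(max(row)) for row in attention]
```

In this toy case each attention row peaks at the shifted position, so the warped right features line up with the left view without an explicit disparity estimate.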
ISBN: 0819424358 (print)
Region segmentation of images is a well-known 'ill-posed problem', for which a specific technique such as regularization appears to be applicable. In this paper, an active region segmentation algorithm based on a regularization approach using the Hopfield neural network is proposed. The objective function to be minimized by the network is defined based on criteria that integrate region growing and edge detection for image segmentation. The energy of the network tends to converge to a local minimum, so pyramid images are used to avoid such local minima and to achieve fast convergence. Moreover, the active region segmentation algorithm is applied to a sequence of color images to track an object region that changes in appearance through complex and nonstationary background/foreground situations. Experimental results show that it is possible to segment images and track the object region using the minimization principle of the energy function of the Hopfield neural network.
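The energy-minimization principle mentioned here can be sketched with a generic discrete Hopfield network. The segmentation objective of the paper is not given in the abstract, so the weights below are an arbitrary toy example, and the pyramid (coarse-to-fine) strategy is not modeled; all names are hypothetical.

```python
def energy(weights, bias, state):
    """Hopfield energy E = -1/2 * sum_ij w_ij s_i s_j - sum_i b_i s_i."""
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(n))
    return -0.5 * pair - sum(bias[i] * state[i] for i in range(n))

def relax(weights, bias, state, max_sweeps=10):
    """Asynchronous updates until no unit changes (a local minimum).

    Assumes symmetric weights with a zero diagonal and states in {-1, +1},
    in which case the energy never increases after an update.
    """
    state = list(state)
    trace = [energy(weights, bias, state)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j]
                        for j in range(len(state))) + bias[i]
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
                trace.append(energy(weights, bias, state))
        if not changed:
            break
    return state, trace

# Toy network: units 0-1 and 2-3 prefer equal labels, units 1-2 prefer
# different labels (a crude stand-in for "regions" and an "edge").
weights = [[0, 1, 0, 0],
           [1, 0, -1, 0],
           [0, -1, 0, 1],
           [0, 0, 1, 0]]
bias = [0, 0, 0, 0]
final_state, trace = relax(weights, bias, [1, -1, 1, -1])
```

The recorded trace decreases monotonically, which is the property the abstract relies on; because the descent stops at the first local minimum, a coarse-to-fine initialization such as the pyramid described above is one way to start closer to a good solution.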
ISBN: 0819427497 (print)
Error-resilience is an important feature of any image or video coding algorithm associated with transmission over noisy or multipath channels. In this paper, we present a robust coding algorithm based on a modified version of the zerotree coding technique. The algorithm provides significantly improved error-resilience with minimum added redundancy while still retaining the efficiency and scalability of the original technique.
ISBN: 9781665475921 (print)
To speed up image classification, which conventionally takes reconstructed images as input, compressed-domain methods use the compressed images directly, without decompression. As a consequence, accuracy declines to some extent. Our goal in this paper is to raise the accuracy of compressed-domain classification using the compressed images output by NN-based image compression networks. First, we design a hybrid objective loss function that contains the reconstruction loss of the deep feature map. Second, one image reconstruction layer is integrated into the image classification network to up-sample the compressed representation. These methods greatly increase compressed-domain image classification accuracy and add no extra computational complexity. Experimental results on the ImageNet benchmark show that our design outperforms the latest work, ResNet-41, with a large accuracy gain of about 4.49% in top-1 classification accuracy. In addition, the accuracy gap relative to the method using reconstructed images is reduced to 0.47%. Moreover, our classification network has the lowest computational complexity and model complexity.
ISBN: 0819424358 (print)
Model-supported exploitation is a new paradigm in Image Understanding research. In this paradigm, three main technical areas have been identified: semi-automatic or automatic construction of site models, automated positioning of images to the sites, and monitoring of movable objects and construction activities. In this paper, we summarize recent progress in the detection and counting of vehicles in selected locales, and in the monitoring and characterization of vehicle groupings. We present the algorithms used and the results obtained. The detection and counting method employs geometrical models and uses a spatial contour matching approach. The configuration detection method exploits knowledge of geometrical models in the frequency domain. The issues of parameter learning, as well as the sensitivity of the detection performance to misspecification of model and tuning parameters, are briefly examined.
ISBN: 9798331529543; 9798331529550 (print)
Lookup tables (LUTs) are commonly used to speed up image processing by handling complex mathematical functions such as sine and exponential calculations. They are used in various applications such as camera image processing, high-dynamic-range imaging, and edge-preserving filtering. However, due to the increasing gap between computing and input/output performance, LUTs are becoming less effective. Even though specific circuits such as SIMD can improve LUT efficiency, they do not yet fully bridge the performance gap. The gap makes it difficult to choose between direct numerical calculation and LUT calculation. To address this problem, a register-LUT method with nearest-neighbor lookup was proposed; however, it is limited to functions with narrow-range values approaching zero. In this paper, we propose a method for using register LUTs to process images efficiently over a wide range of values. Our contributions include proposing a register LUT with linear interpolation for efficient computation, using a smaller data type for further efficiency, and suggesting an efficient data retrieval method.
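The difference between nearest-neighbor lookup and linear interpolation on a very small table can be sketched as follows. This is a scalar illustration only: the table size of 16, the exponential test function, and the helper names are assumptions, and the register-level SIMD implementation and reduced data type described in the abstract are not modeled.

```python
import math

def build_lut(func, lo, hi, size):
    """Sample func at `size` evenly spaced points on [lo, hi]."""
    step = (hi - lo) / (size - 1)
    return [func(lo + i * step) for i in range(size)]

def lut_nearest(table, lo, hi, x):
    """Nearest-neighbor lookup: return the closest stored sample."""
    pos = (x - lo) / (hi - lo) * (len(table) - 1)
    index = min(max(int(pos + 0.5), 0), len(table) - 1)
    return table[index]

def lut_linear(table, lo, hi, x):
    """Linear interpolation between the two samples surrounding x."""
    pos = (x - lo) / (hi - lo) * (len(table) - 1)
    pos = min(max(pos, 0.0), len(table) - 1.0)
    i0 = min(int(pos), len(table) - 2)
    frac = pos - i0
    return table[i0] + frac * (table[i0 + 1] - table[i0])

# A deliberately tiny table, standing in for one that fits in registers.
table = build_lut(math.exp, -4.0, 0.0, 16)
x = -3.9
err_nearest = abs(lut_nearest(table, -4.0, 0.0, x) - math.exp(x))
err_linear = abs(lut_linear(table, -4.0, 0.0, x) - math.exp(x))
```

For this input the interpolated lookup is noticeably closer to the true value than the nearest-neighbor lookup, at the cost of one extra table read and a multiply-add, which is the trade-off a small interpolated table is meant to exploit.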
ISBN: 9781665475921 (print)
Learned image compression (LIC) has shown superior compression ability. Quantization is an unavoidable stage that generates the quantized latent for entropy coding. To solve the non-differentiability of quantization in the training phase, many differentiable approximate quantization methods have been proposed. However, the derivative of the quantized latent with respect to the non-quantized latent is set to one in most previous methods. As a result, the quantization error between the non-quantized and quantized latent is not taken into consideration in gradient descent. To address this issue, we exploit a gradient scaling method to scale the gradient of the non-quantized latent in back-propagation. The experimental results show that our method outperforms recent LIC quantization methods.
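The contrast between a unit derivative and a scaled gradient can be made concrete with a small sketch. The abstract does not state the scaling rule, so the factor used below (shrinking the gradient as the quantization error grows) is purely a hypothetical placeholder, as are the function names and the `alpha` parameter.

```python
import math

def quantize(latent):
    """Forward pass: round each latent value to the nearest integer."""
    return [float(math.floor(v + 0.5)) for v in latent]

def backward_identity(grad_quantized):
    """Backward pass with the derivative fixed at one (pass-through)."""
    return list(grad_quantized)

def backward_scaled(latent, grad_quantized, alpha=1.0):
    """Backward pass with a HYPOTHETICAL error-dependent scale factor.

    The factor 1 - alpha * |v - round(v)| is an illustrative assumption,
    not the rule used in the paper.
    """
    grads = []
    for v, g in zip(latent, grad_quantized):
        error = abs(v - math.floor(v + 0.5))  # quantization error in [0, 0.5]
        grads.append(g * (1.0 - alpha * error))
    return grads

latent = [0.1, 1.4, -0.3]
upstream = [1.0, 1.0, 1.0]
quantized = quantize(latent)
plain = backward_identity(upstream)
scaled = backward_scaled(latent, upstream)
```

With the pass-through rule every latent receives the same gradient regardless of how far it sits from its quantized value; any scaled variant makes the update depend on that distance, which is the information the abstract says is otherwise ignored.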
ISBN: 9781728185514 (print)
Pixel recovery with deep learning has been shown to be very effective for a variety of low-level vision tasks such as image super-resolution, denoising, and deblurring. Most existing works operate in the spatial domain, and few exploit the transform domain for image restoration tasks. In this paper, we present a transform-domain approach for image deblocking using a deep neural network called DCTResNet. Our application is compressed video motion deblur, where the input video frame has blocking artifacts that make the deblurring task very challenging. Specifically, we use a block-wise Discrete Cosine Transform (DCT) to decompose the image into its low- and high-frequency sub-band images and exploit the strong sub-band-specific features for more effective deblocking. Since JPEG also uses the DCT for image compression, using DCT sub-band images for image deblocking helps to learn the JPEG compression prior and thus to correct the blocking artifacts effectively. Our experimental results show that DCTResNet achieves better PSNR and SSIM than other state-of-the-art (SOTA) methods, while being significantly faster in inference time.
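One way to form DCT sub-band images is sketched below: each 8x8 block is transformed with an orthonormal DCT-II, and the coefficient at frequency (u, v) from every block is gathered into its own small image. The regrouping scheme and the helper names are assumptions made for illustration; the abstract does not specify how DCTResNet arranges its sub-bands.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct_subbands(image, n=8):
    """Block-wise DCT followed by regrouping by frequency index.

    Returns subbands[u][v], an (H/n) x (W/n) image that holds the
    (u, v) coefficient of every n x n block.
    """
    height, width = len(image), len(image[0])
    assert height % n == 0 and width % n == 0
    basis = dct_matrix(n)
    basis_t = [list(col) for col in zip(*basis)]
    bh, bw = height // n, width // n
    subbands = [[[[0.0] * bw for _ in range(bh)] for _ in range(n)]
                for _ in range(n)]
    for by in range(bh):
        for bx in range(bw):
            block = [row[bx * n:(bx + 1) * n]
                     for row in image[by * n:(by + 1) * n]]
            coeffs = matmul(matmul(basis, block), basis_t)  # C * X * C^T
            for u in range(n):
                for v in range(n):
                    subbands[u][v][by][bx] = coeffs[u][v]
    return subbands

# A flat 16x16 image: all energy should land in the DC sub-band.
flat = [[100.0] * 16 for _ in range(16)]
bands = dct_subbands(flat)
```

For the flat test image the DC sub-band holds 800 in every position (8 times the block mean under the orthonormal scaling) and all other sub-bands are zero up to rounding, which is the low/high-frequency separation the abstract refers to.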
ISBN: 9798331529543; 9798331529550 (print)
Generative models have significantly advanced generative AI, particularly in image and video generation. Recognizing their potential, researchers have begun exploring their application to image compression. However, existing methods face two primary challenges: limited performance improvement and high model complexity. In this paper, to address these two challenges, we propose a perceptual image compression solution based on a conditional diffusion model. Given that compression performance depends heavily on the decoder's generative capability, we base our decoder on the diffusion transformer architecture. To address the model complexity problem, we implement the diffusion transformer architecture with the Swin transformer. Having enhanced the decoder's generative capability, we further augment it with informative features using a multi-scale feature fusion module. Experimental results demonstrate that our approach surpasses existing perceptual image compression methods while achieving lower model complexity.
We propose a novel method of arbitrarily focused image generation using multiple differently focused images. First, we describe our previously proposed select-and-merge method for all-focused image acquisition. This method gives good results, but it is not easy to extend to the generation of arbitrarily focused images. Then, based on the assumption that the depth of a scene changes stepwise, we derive a formula for reconstruction that relates the desired arbitrarily focused image to the multiple acquired images; we can reconstruct the arbitrarily focused image by iterative use of the formula. We also introduce coarse-to-fine estimation of the point spread functions (PSFs) of the acquired images. We reconstruct arbitrarily focused images for a natural scene. In other words, we simulate virtual cameras and generate images focused on arbitrary depths. (C) 1998 SPIE and IS&T. [S1017-9909(98)02201-6].