A seam is a set of pixels with minimum energy forming a continuous line in an image. By eliminating or duplicating seams iteratively, an input image can be retargeted. However, this process often results in blurring, stretching, or distortion around the seams, especially when extending a target image. We propose a novel approach for image extension using content-aware seam restoration to solve this problem. First, we design CSR-Net, which employs features from the horizontal region of target pixels to restore the seams. Second, we develop an image extension scenario based on the seam restoration and the training methodology of CSR-Net. Experimental results demonstrate that the proposed algorithm provides more accurate expanded results at the seam pixels than conventional algorithms.
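For readers unfamiliar with seams, the following minimal sketch shows how a minimum-energy vertical seam is commonly found with dynamic programming in classic seam carving. It does not reproduce CSR-Net or the proposed restoration step; the function names `find_vertical_seam` and `remove_seam` and the toy energy map are illustrative assumptions.

```python
def find_vertical_seam(energy):
    """Return, for each row, the column of the minimum-energy vertical seam.

    `energy` is a list of rows of per-pixel energies (hypothetical input;
    in practice it would come from e.g. a gradient-magnitude map).
    """
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = minimum cumulative energy of any seam ending at (r, c)
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest pixel in the last row
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row along the seam, narrowing the image by one."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


toy_energy = [[1, 4, 3],
              [5, 2, 6],
              [7, 8, 1]]
toy_seam = find_vertical_seam(toy_energy)
```

Duplicating the pixels along a seam instead of deleting them widens the image, which is the extension case where the abstract reports blurring and stretching.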
The image sequences captured by Unmanned Aerial Vehicles (UAVs) can be applied to many computer vision tasks. However, due to the instability of UAV flight, the captured image sequences deviate from the preset trajectory and pose, which reduces the quality of subsequent applications such as panoramic image stitching. In this paper, a novel method is proposed to rectify UAV-captured image sequences by transforming the images to a regular trajectory with a uniform pose. First, to minimize the total transformation deviation, a virtual regular camera trajectory is derived by minimizing the global error of coordinates between the actual and virtual camera trajectories. Then, a camera-pose-relevant local homography is proposed, which inserts the camera pose into the local homography to transform the images to the derived virtual trajectory with a uniform pose and correct translation parallax. The experimental results demonstrate the effectiveness of the proposed rectification algorithm at both the theoretical and application levels.
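As a rough illustration of deriving a regular trajectory from noisy camera positions, the sketch below fits a straight line to 2D positions by ordinary least squares. This is an assumption made for illustration only: the abstract does not state the trajectory model or the error measure, and the code minimizes vertical deviation in a plane rather than the paper's global coordinate error. The names `fit_line_trajectory` and `virtual_positions` are hypothetical.

```python
def fit_line_trajectory(positions):
    """Least-squares fit of y = slope * x + intercept to (x, y) positions."""
    n = len(positions)
    mean_x = sum(x for x, _ in positions) / n
    mean_y = sum(y for _, y in positions) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in positions)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in positions)
    slope = sxy / sxx  # assumes the x coordinates are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def virtual_positions(positions, slope, intercept):
    """Move each camera onto the fitted line, keeping its x coordinate."""
    return [(x, slope * x + intercept) for x, _ in positions]


# Hypothetical camera positions that happen to lie exactly on y = 2x + 1
actual = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
slope, intercept = fit_line_trajectory(actual)
virtual = virtual_positions(actual, slope, intercept)
```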
ISBN (digital): 9798331529543; ISBN (print): 9798331529550
Plenoptic cameras are light field capturing devices able to acquire large amounts of angular and spatial information. The lenslet video produced by such cameras presents, on each frame, a distinctive hexagonal pattern of micro-images. Due to the particular structure of lenslet images, traditional video codecs perform poorly on lenslet video. Previous works have proposed a preprocessing scheme that cuts and realigns the micro-images on each lenslet frame. While effective, this method introduces high-frequency components into the processed image. In this paper, we propose an additional step to the aforementioned scheme by applying an invertible smoothing transform. We evaluate the enhanced scheme on lenslet video sequences captured with single-focused and multi-focused plenoptic cameras. On average, the enhanced scheme achieves a 9.85% bitrate reduction compared to the existing scheme.
ISBN (digital): 9798331529543; ISBN (print): 9798331529550
This paper focuses on the Referring Image Segmentation (RIS) task, which aims to segment objects from an image based on a given language description and has significant potential in practical applications such as food safety detection. Recent advances using the attention mechanism for cross-modal interaction have achieved excellent progress. However, current methods tend to lack explicit interaction design principles as guidelines, leading to inadequate cross-modal comprehension. Additionally, most previous works use a single-modal mask decoder for prediction, losing the advantage of full cross-modal alignment. To address these challenges, we present a Fully Aligned Network (FAN) that follows four cross-modal interaction principles. Under the guidance of these principles, our FAN achieves state-of-the-art performance on the prevalent RIS benchmarks (RefCOCO, RefCOCO+, G-Ref) with a simple architecture.
ISBN (print): 9783030968786; 9783030968779
In recent years, special emphasis has been placed on visual-based gait recognition due to its unique characteristics, such as not requiring a special user action and its long-distance recognizability. In general, there exist two kinds of methods - model-based and appearance-based - both of which come with their own advantages and disadvantages. In an effort to harness the best of both worlds, we create a compact 3D human model-based gait representation out of 2D images with the help of the DensePose algorithm. We design a simple CNN and train several instances to show that the obtained gait representation can in fact be used to improve gait recognition accuracy. Experimental results are based on the publicly available CASIA-B dataset.
Learning-based image compression methods have emerged as the state of the art, showcasing higher performance than conventional compression solutions. These data-driven approaches aim to learn the parameters of a neural network model through iterative training on large amounts of data. The optimization process typically involves minimizing the distortion between the decoded and the original ground truth images. This paper focuses on perceptual optimization of learning-based image compression solutions and proposes: i) a novel loss function to be used during training and ii) a novel subjective test methodology that aims to evaluate the decoded image fidelity. According to experimental results from the subjective test conducted with the new methodology, the optimization procedure can enhance image quality at low rates while offering no advantage at high rates.
ISBN (print): 9781665462198
The ever-evolving nature of the Internet and wireless communications, as well as the production of huge amounts of multimedia every day, has created a dire need for their security. In this paper, an image encryption technique based on three stages is proposed. The first stage makes use of DNA encoding. The second stage proposes and utilizes a novel S-box that is based on the Mersenne Twister and a linear descent algorithm. The third stage employs the Tent chaotic map. The computed performance evaluation metrics exhibit a high level of achieved security.
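To make the third stage more concrete, here is a minimal sketch of how a Tent chaotic map can drive a byte keystream that is XORed with pixel data. It is not the paper's scheme: the DNA encoding and S-box stages are omitted, and the initial value `x0`, the parameter `mu`, and the byte quantization are assumptions chosen for illustration. The parameter is kept slightly below 2 because iterating the map with `mu = 2` in floating point tends to collapse to zero.

```python
def tent_keystream(x0, mu, n):
    """Generate n key bytes by iterating the Tent map from x0 in (0, 1)."""
    x = x0
    out = []
    for _ in range(n):
        # Tent map: x -> mu * x on the left half, mu * (1 - x) on the right
        x = mu * x if x < 0.5 else mu * (1.0 - x)
        out.append(int(x * 256) % 256)  # quantize the state to one byte
    return bytes(out)


def xor_cipher(data, x0=0.37, mu=1.99):
    """XOR data with the keystream; applying it twice restores the input."""
    keystream = tent_keystream(x0, mu, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


plain = b"pixels"
cipher = xor_cipher(plain)
restored = xor_cipher(cipher)
```

Here the pair (`x0`, `mu`) plays the role of the key, and a small change in either is expected to produce a different keystream after a few iterations.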
Deep learning has been performing reasonably well in computer vision tasks that call for a high volume of photos, although gathering images is often expensive and challenging. Different picture augmentation techniques...
Melanoma is a deadly kind of skin cancer that can spread to other parts of the body. Therefore, it is necessary to identify melanoma at an early stage. Visual examination at the time of medical examination of ski...
ISBN (digital): 9781728190549; ISBN (print): 9781728190556
Online image sharing on social media platforms faces information leakage due to deep learning-aided privacy attacks. To avoid these attacks, this paper proposes a privacy protection mechanism for image sharing that does not change the visual effect and is based on reversible adversarial examples. Specifically, social media platform users can change the class activation feature to convert the original image into an adversarial image before sharing. When users want to restore the adversarial image to the original image, they can use an improved generative adversarial network model. The experimental results show that the proposed conversion model can effectively prevent privacy attacks from analyzing and stealing users' private information while having no visual impact. At the same time, the proposed restoration model can restore the adversarial examples with high accuracy.