Single image super-resolution (SISR) is the process of reconstructing a high-resolution (HR) image from only a single low-resolution (LR) image, compensating for the lost high-frequency information. Blind image super-resolution attempts to reconstruct the HR image when the blur kernel is unknown, which is an ill-posed inverse problem. We propose an alternating-optimization-based self-supervised blind image super-resolution method (Self-SR), which formulates a joint optimization problem over the blur kernel and the HR image and estimates them by alternating iteratively between a deep network and a regularization model. The deep convolutional neural network learns complicated features to represent the HR image without requiring smoothness regularization, since data fitting is inherently free from noise amplification. The simple blur kernel is modeled using a regularized least-squares model, which admits a direct closed-form solution for the blur kernel. Self-SR combines the learning ability of the deep network with the generalizability of the optimization-based model. With the blur kernel estimated by the regularization model, the data fidelity loss function, supervised by the LR image, helps the deep network solve image super-resolution tasks with a more accurate blur kernel. Experimental results on synthetic and real LR images show that Self-SR achieves better super-resolution performance than most blind and non-blind methods.
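The closed-form kernel step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact model, so the following assumes a Tikhonov (ridge) regularizer, circular convolution, and no downsampling between the sharp and blurred images. The function name `estimate_kernel` and the parameters `ksize` and `lam` are hypothetical.

```python
import numpy as np

def estimate_kernel(x, y, ksize=5, lam=1e-3):
    """Closed-form regularized least-squares blur kernel estimate.

    Assumed model (not taken from the paper): y = k (*) x with circular
    convolution, solved as min_k ||k (*) x - y||^2 + lam * ||k||^2.
    In the Fourier domain the minimizer is conj(X) * Y / (|X|^2 + lam).
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    K = np.conj(X) * Y / (np.abs(X) ** 2 + lam)
    k_full = np.real(np.fft.ifft2(K))
    # Move the kernel origin from index [0, 0] to the array centre, then crop.
    k_full = np.fft.fftshift(k_full)
    c0, c1 = x.shape[0] // 2, x.shape[1] // 2
    r = ksize // 2
    k = k_full[c0 - r:c0 + r + 1, c1 - r:c1 + r + 1]
    # Project onto non-negative kernels that sum to one.
    k = np.clip(k, 0.0, None)
    return k / k.sum()

# Synthetic check: blur a random image with a known Gaussian kernel.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
ax = np.arange(5) - 2
g = np.exp(-(ax ** 2) / 2.0)
k_true = np.outer(g, g)
k_true /= k_true.sum()

# Embed the kernel in a full-size array with its centre at index [0, 0].
k_pad = np.zeros_like(x)
k_pad[:5, :5] = k_true
k_pad = np.roll(k_pad, (-2, -2), axis=(0, 1))
y = np.real(np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(k_pad)))

k_est = estimate_kernel(x, y, ksize=5, lam=1e-3)
```

In an alternating scheme of the kind the abstract describes, `x` would be the current HR estimate produced by the network rather than a known image, and the kernel estimate would be refreshed after each network update.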
Learned image compression research has achieved state-of-the-art compression performance with auto-encoder based neural network architectures, where the image is mapped via convolutional neural networks (CNN) into a l...
With the increasingly widespread application of Virtual Reality (VR) technology in education, VR classroom models, characterized by their uniquely immersive experience, are considered an important direction for educational innovation. To maximize the educational effects of VR classrooms, efficient processing and optimization of scene images are essential. Although many studies are devoted to rendering techniques for static scenes, research on real-time processing and personalized layout optimization of dynamic interactive teaching scenes is still insufficient. This paper proposes deep-learning-based methods for two core issues in VR classrooms: scene image enhancement and visual layout optimization. First, an image enhancement generation model based on the U-Net network is constructed, which significantly improves the clarity and detail richness of scene images. Second, this paper applies an improved Spatial Pyramid Pooling - Fast (SPPF) structure from YOLOv5 to scene layout and introduces a novel visual graph attention model (GAM), which can extract colors from input images and effectively apply them to visual interface design. These methods not only enhance the visual effects of scenes but also lay the foundation for building personalized teaching environments that meet the needs of different learners. This research provides a new perspective on the real-time processing and layout optimization of VR classroom scenes, which is important for advancing educational technology.
Deep learning-based methods are widely used in the field of image processing and have achieved remarkable results. However, these methods often fill regions incorrectly when dealing with irregularly damaged images. The main reason is that the low-level information of the feature maps is not fully utilized, and the semantic information of feature maps at different scales cannot complement each other effectively. Therefore, we propose a network structure based on a feature pyramid. In the first stage, we set the expansion factor so as to avoid the grid effect and increase the receptive field, while maximizing the use of the low-level feature map information. The second stage uses a feature fusion branch, which first samples the feature maps to construct the feature pyramid, then fuses feature maps with different resolutions and semantic strengths, and finally generates an image by deconvolving the feature maps with a decoder. Our experimental results show that this method generates recovered regions that are coherent, clear, and visually reasonable, and that it is superior to other methods in terms of image quality.
ISBN: (print) 9798350349405; 9798350349399
In modern smartphone cameras, the Image Signal Processor (ISP) is the core element that converts the RAW readings from the sensor into perceptually pleasant RGB images for the end users. The ISP is typically proprietary and handcrafted and consists of several blocks such as white balance, color correction, and tone mapping. Deep learning-based ISPs aim to transform RAW images into DSLR-like RGB images using deep neural networks. However, most learned ISPs are trained on patches (small regions) due to computational limitations. Such methods lack global context, which limits their efficacy on full-resolution images and harms their ability to capture global properties such as color constancy or illumination. First, we propose a novel module that can be integrated into any neural ISP to capture global context information from the full RAW images. Second, we propose an efficient and simple neural ISP that utilizes our proposed module. Our model achieves state-of-the-art results on different benchmarks using diverse, real smartphone images.
Convolutional neural networks with U-shaped architectures are widely used in medical image segmentation. However, their performance is often limited by imbalanced regional attention caused by interference from irrelevant features within localized receptive fields. To overcome this limitation, FDU-Net is proposed as a novel U-Net-based model that incorporates a feature decorrelation strategy. Specifically, FDU-Net introduces a feature decorrelation method that extracts multiple groups of features from the encoder and optimizes sample weights to reduce internal feature correlations, thereby minimizing the interference from irrelevant features. Comprehensive experiments on diverse medical imaging datasets show that FDU-Net achieves superior evaluation scores and finer segmentation results, outperforming state-of-the-art methods.
This article proposes an interactive-interpretable network (IIN) for accurately magnifying low-resolution scanning electron microscopy (SEM) image data, preserving the intricate details of the original image without long exposure of coal-dust specimens to intense energy radiation. By harnessing the interpretability benefits of traditional model-driven approaches, the proposed data-driven deep neural network performs an interactive super-resolution (SR) process that unfolds as a sequence of signal processing optimization procedures. Following the iterative proximal strategy, deep unfolding with proximal gradient projection is employed, in which each layer acts as one step that integrates deep networks into classic optimization, visibly improving clarity and interpretability. Leveraging a Taylor series approximation, the SR intermediates are decomposed into fundamental (low-order) components, derivative (high-order) components, and a Remainder term, which carry intrinsic prior knowledge that elucidates image details at varying frequencies. The Taylor Remainder is treated as an intermediate residual, measured as the discrepancy between the intermediate high-resolution part and the aggregation of the whole-order information, and serves as guidance for the subsequent interactive refinement. Additionally, the reconstructed outputs undergo further synergy within the dual-model framework, which can enhance the final SR outcomes. Experiments show that the proposed method is both interpretable and accurate, and that it outperforms other closely related SR methods from quantitative and qualitative perspectives.
In recent years, there has been significant advancement in the field of model watermarking techniques. However, the protection of image-processing neural networks remains a challenge, with only a limited number of methods being developed. The objective of these techniques is to embed a watermark in the output images of the target generative network, so that the watermark signal can be detected in the output of a surrogate model obtained through model extraction attacks. This promising technique, however, has certain limits. Analysis of the frequency domain reveals that the watermark signal is mainly concealed in the high-frequency components of the output. Thus, we propose an overwriting attack that involves forging another watermark in the output of the generative network. The experimental results demonstrate the efficacy of this attack in sabotaging existing watermarking schemes for image-processing networks with an almost 100% success rate. To counter this attack, we propose an adversarial framework for the watermarking network. The framework incorporates a specially designed adversarial training step, where the watermarking network is trained to defend against the overwriting network, thereby enhancing its robustness. Additionally, we observe an overfitting phenomenon in the existing watermarking method, which can render it ineffective. To address this issue, we modify the training process to eliminate the overfitting problem.
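The frequency-domain observation in the abstract above (watermark energy concentrated in high frequencies) can be checked with a simple measurement. The following is only a rough sketch: the abstract does not describe the analysis actually used, and the function name `high_freq_energy_ratio` and the `cutoff` value are hypothetical.

```python
import numpy as np

def high_freq_energy_ratio(signal_2d, cutoff=0.25):
    """Fraction of spectral energy above a normalized radial frequency.

    `cutoff` is in cycles per pixel (the Nyquist limit along an axis is 0.5).
    The threshold is an illustrative choice, not a value from the paper.
    """
    signal_2d = np.asarray(signal_2d, dtype=np.float64)
    power = np.abs(np.fft.fft2(signal_2d)) ** 2
    total = power.sum()
    if total == 0:
        return 0.0  # an all-zero signal has no energy at any frequency
    h, w = signal_2d.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return float(power[radius > cutoff].sum() / total)

# Two extreme cases: a flat image (only DC) and a pixel-level checkerboard.
flat = np.ones((8, 8))
ii, jj = np.indices((8, 8))
checker = (-1.0) ** (ii + jj)
ratio_flat = high_freq_energy_ratio(flat)
ratio_checker = high_freq_energy_ratio(checker)
```

Applied to the difference between a watermarked output and a clean output, a ratio close to 1 would be consistent with the watermark sitting mostly in high-frequency components.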
Liver cancer, as one of the leading causes of cancer-related deaths around the world, has triggered an urgent need for automatic segmentation of the liver and tumors. Nonetheless, automatic segmentation in CT images is challenging owing to the ambiguous morphology, size, and location of the liver and tumors and their relationship to the surrounding tissues. To address these issues, we propose a novel model, LGMA-Net. This model is designed to improve the ability to capture details and small targets in the image, thus improving the segmentation accuracy for the liver and tumors. Unlike existing segmentation networks, we propose a depthwise separable convolutional SNP-like neuron model derived from the nonlinear spiking mechanism in spiking neural P systems. An important component, the SNP convolutional Transformer block, is then designed based on this model. The SNP convolutional Transformer block captures not only global dependencies but also local context information. In addition, we propose a channel-attentive skip connection (CASC). The CASC autonomously concentrates on crucial characteristics by learning channel dependencies, so that the fused features in the skip connection focus on important features. Our proposed model was evaluated on two public datasets. On the LiTS dataset, the liver and tumor segmentation DSC were 97.72% and 87.48%. On the 3D-IRCADb dataset, the liver and tumor segmentation DSC were 97.2% and 83.24%.
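For reference, the DSC (Dice similarity coefficient) reported above is computed from a predicted mask and a ground-truth mask as 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks follows; the function name and the convention of returning 1.0 for two empty masks are assumptions, not details from the paper.

```python
import numpy as np

def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return float(2.0 * intersection / total)

# One of two predicted pixels overlaps one of two ground-truth pixels.
pred = [[1, 1, 0],
        [0, 0, 0]]
target = [[1, 0, 0],
          [0, 1, 0]]
score = dice_coefficient(pred, target)
```

For a multi-class task such as liver and tumor segmentation, the same function can be applied per class to the masks `prediction == label` and `ground_truth == label`.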
Image hiding is the task of hiding secret images in cover images. Its goals are to ensure that the secret images are invisible to human observers and that they can be recovered. Current state-of-the-art steganography methods run the risk of leaking secret information. A safe image hiding network (SIHNet) is presented to reduce this leakage. Based on phenomena observed in image hiding methods that use invertible neural networks, a reversible secret image processing (SIP) module is proposed to make the secret images suitable for hiding and to make the stego images leak less secret information. In addition, a reversible lost information hiding (LIH) module is used to hide the lost information in the cover images, so the method can recover the secret images better than methods that replace the lost information with random noise. Experimental results show that SIHNet outperforms other state-of-the-art methods on the PSNR and SSIM values of the recovered secret images and the stego images. Moreover, the residual images of other state-of-the-art methods all contain information about the secret images, while the residual images of SIHNet leak almost no secret information. The method can therefore prevent an eavesdropper on the transmission channel from obtaining information about the secret image through the residual image, which means SIHNet is more secure than other state-of-the-art methods.
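The PSNR metric used in the comparison above follows the standard definition 10 · log10(MAX² / MSE). A minimal sketch is given below; the function name and the default 8-bit peak value of 255 are assumptions, since the abstract does not state the evaluation settings.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# A constant error of 25.5 on a 0..255 scale gives MAX^2 / MSE = 100.
cover = np.zeros((4, 4))
stego = np.full((4, 4), 25.5)
value_db = psnr(cover, stego)
```

In an image hiding evaluation, the same function would be applied to the cover and stego pair and to the secret and recovered-secret pair.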