This article proposes a distributed stochastic algorithm with variance reduction for general smooth non-convex finite-sum optimization, which has wide applications in the signal processing and machine learning communities. In the distributed setting, a large number of samples are allocated to multiple agents in the network. Each agent computes a local stochastic gradient and communicates with its neighbors to seek the global optimum. In this article, we develop a modified variance reduction technique to deal with the variance introduced by stochastic gradients. Combining gradient tracking and variance reduction techniques, this article proposes a distributed stochastic algorithm, the gradient tracking algorithm with variance reduction (GT-VR), to solve large-scale non-convex finite-sum optimization over multiagent networks. A complete and rigorous proof shows that the GT-VR algorithm converges to first-order stationary points at an O(1/k) convergence rate. In addition, we provide a complexity analysis of the proposed algorithm. Compared with some existing first-order methods, the proposed algorithm has a lower O(PMε^{-1}) gradient complexity under some mild conditions. By comparing state-of-the-art algorithms and GT-VR in numerical simulations, we verify the efficiency of the proposed algorithm.
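The gradient tracking idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the GT-VR algorithm itself: it uses exact (not stochastic) gradients, omits variance reduction, and runs on a convex toy problem in which agent i holds f_i(x) = 0.5 (x - a_i)^2. The function name `gradient_tracking`, the parameter names, and the mixing matrix are assumptions chosen for illustration.

```python
def gradient_tracking(targets, mixing, step=0.1, iters=200):
    """Plain gradient tracking on a toy problem (illustrative assumption).

    Agent i minimizes f_i(x) = 0.5 * (x - targets[i])**2, so the minimizer
    of the average objective is the mean of `targets`.
    `mixing` is a doubly stochastic weight matrix over the agents.
    """
    n = len(targets)

    def grad(i, x):
        # Exact local gradient of f_i; GT-VR would use a variance-reduced
        # stochastic estimate here instead.
        return x - targets[i]

    x = [0.0] * n                            # local iterates
    g = [grad(i, x[i]) for i in range(n)]    # last local gradients
    y = g[:]                                 # trackers of the average gradient

    for _ in range(iters):
        # Consensus step on x followed by a descent step along the tracker.
        x_new = [sum(mixing[i][j] * x[j] for j in range(n)) - step * y[i]
                 for i in range(n)]
        g_new = [grad(i, x_new[i]) for i in range(n)]
        # Tracker update: mix neighbors' trackers, add the gradient change.
        y = [sum(mixing[i][j] * y[j] for j in range(n)) + g_new[i] - g[i]
             for i in range(n)]
        x, g = x_new, g_new
    return x


if __name__ == "__main__":
    W = [[0.50, 0.25, 0.25],
         [0.25, 0.50, 0.25],
         [0.25, 0.25, 0.50]]
    print(gradient_tracking([1.0, 2.0, 6.0], W))
```

With the three targets 1, 2 and 6, every agent's iterate should approach their mean, 3, even though no agent sees the other agents' data directly.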
Single image super-resolution (SISR) remains an important yet challenging task. Existing methods usually ignore the diversity of generated super-resolution (SR) images. The fine details of the corresponding high-resolution (HR) images cannot be confidently recovered due to the degradation of detail in low-resolution (LR) images. To address this issue, this paper presents a flow-based multi-scale learning network (FMLnet) to explore the diverse mapping spaces for SR. First, we propose a multi-scale learning block (MLB) to extract the underlying features of the LR image. Second, the introduced pixel-wise multi-head attention allows our model to map multiple representation subspaces simultaneously. Third, by employing a normalizing flow module for a given LR input, our approach generates various stochastic SR outputs with high visual quality. The trade-off between fidelity and perceptual quality can be controlled. Finally, the experimental results on five datasets demonstrate that the proposed network outperforms existing methods in terms of diversity and achieves competitive PSNR/SSIM results. Code is available at https://***/qianyuwu/FMLnet.
ISBN: 9798350350920 (print)
The image segmentation accuracy of highway pavement distress is easily affected by complex texture, noisy backgrounds, uneven illumination conditions and external environmental interference. To address this problem, this paper studies image segmentation methods for highway pavement distress based on semantic segmentation Convolutional Neural Networks (CNNs). Firstly, image segmentation methods for highway pavement distress based on FCN-DenseNet, DeepLabv3+ and MobileNet are compared and analyzed. Secondly, four variants of CNN models are investigated for the image segmentation of highway pavement distress: FCN-DenseNet121 for Pavement Distress Segmentation (FCN-D121-PDS), DeepLabv3-DRN for Pavement Distress Segmentation (DL-D-PDS), DeepLabv3-MobilenetV3 for Pavement Distress Segmentation (DL-M-PDS) and DeepLabv3-Mobilenet1 for Pavement Distress Segmentation (DL-M1-PDS). Finally, comparative experiments were conducted, and the results showed that the average performance of the DL-M1-PDS network is superior to that of the other three methods, with an image segmentation accuracy of 98.20%.
Hyperspectral image denoising is crucial for accurate extraction of spectral information. However, current convolutional neural network (CNN)-based methods have inherent limitations, while Transformer-based methods suffer from high computational complexity when processing global contextual information. To address this problem, we designed a hybrid Mamba-CNN context interaction module and constructed a U-shaped hierarchical encoder-decoder network (MUNet). The network takes pixel-scale input to maximize the preservation of image information and employs state-space model (SSM)-based Mamba blocks to efficiently capture global semantic information, while using convolution to extract local features. This enhances the modeling of global and local features for better denoising. Extensive experiments on synthetic and real hyperspectral image (HSI) datasets showed that the proposed MUNet achieves better performance than other state-of-the-art techniques.
The paper considers methods for processing an acoustic signal obtained with an optoacoustic effect in a liquid. A 12-layer convolutional neural network is proposed, trained by minimizing the loss of the mean square de...
Recent research on enhancing image resolution using convolutional neural networks (CNNs) has shown encouraging outcomes. However, due to the intrinsic locality of the convolution operator, CNN-based methods are limited in their capacity to capture contextual information and long-range dependencies. To address this problem, we propose a hybrid network for image super-resolution (SR) that integrates a CNN with a Transformer, which shows impressive performance in learning long-range contextual information. Specifically, by introducing a spatial pyramid pooling (SPP) module into the multi-head attention (MHA), the Spatial Pyramid Swin Transformer (SPST) module achieves linear computational complexity and integrates multi-scale features. This enables the model to learn a wider range of multi-scale features and enhances the capabilities of the attention matrix. Moreover, the gated convolution (GC) module employs the abundant low-frequency information from the low-resolution input to assist reconstruction and provides a learnable dynamic feature selection mechanism to further constrain the training and improve performance. Extensive experiments were carried out on a benchmark dataset to assess the efficacy of our approach. The results indicate that our method surpasses alternative approaches in terms of parameter count and computational efficiency. In particular, the proposed method increases PSNR by 0.05 dB and uses 1.6M fewer parameters than SwinIR, resulting in a shorter inference time.
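For readers unfamiliar with the PSNR figure quoted above, the standard definition is PSNR = 10 log10(MAX^2 / MSE), in decibels. The following is a minimal sketch of that formula, assuming images are given as flat sequences of pixel values with a peak value of 255; the function name `psnr` and its parameters are illustrative and are not taken from the paper.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are flat sequences of pixel intensities (an assumption made
    to keep the sketch dependency-free).
    """
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.05 dB corresponds to a reduction of roughly 1% in mean squared error.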
ISBN: 9798350344868; 9798350344851 (print)
Makeup transfer involves transferring makeup from a reference image to a target image while maintaining the target's identity. Existing methods, which use Generative Adversarial Networks, often transfer not just makeup but also the reference image's skin tone. This limits their use to similar skin tones and introduces bias. Our solution introduces a skin tone-robust makeup embedding achieved by augmenting the reference image with varied skin tones. Using Graph Neural Networks, we establish connections between target, reference, and augmented images to create this robust representation, which preserves the target's skin tone. In a user study, our approach outperformed other methods 66% of the time, showcasing its resilience to skin tone variations.
We propose a novel method for privacy-preserving deep neural networks (DNNs) with the Vision Transformer (ViT). The method allows us not only to train and test models with visually protected images but also to avoid the performance degradation caused by the use of encrypted images, whereas conventional methods cannot avoid the influence of image encryption. A domain adaptation method is used to efficiently fine-tune ViT with encrypted images. In experiments, the method is demonstrated to outperform conventional methods in an image classification task on the CIFAR-10 and ImageNet datasets in terms of classification accuracy.
Despite the great potential of artificial intelligence (AI), which enables machines to mimic human intelligence in performing tasks, it requires a deep/extensive model with a sufficient number of parameters to enhance its expressive ability. This aspect often hinders the application of AI on resource-constrained devices. Structured pruning is an effective compression technique that reduces the computation of neural networks. However, it typically achieves parameter reduction at the cost of non-negligible accuracy loss, necessitating fine-tuning. This paper introduces a novel technique called Structured Directional Pruning (SDP) and its fast solver, Alternating Structured Directional Pruning (AltSDP). SDP is a general energy-efficient coarse-grained pruning method that enables efficient model pruning without requiring fine-tuning or expert knowledge of the desired sparsity level. Theoretical analysis confirms that the fast solver, AltSDP, achieves SDP asymptotically after sufficient training. Experimental results validate that AltSDP reaches the same minimum valley as the vanilla optimizer, namely stochastic gradient descent (SGD), while maintaining a constant training loss. Additionally, AltSDP achieves state-of-the-art pruned accuracy by integrating pruning into the initial training process, without the need for fine-tuning. Consequently, the newly proposed SDP, along with its fast solver AltSDP, can significantly facilitate the development of compact deep neural networks (DNNs) and enable the deployment of AI on resource-constrained devices.
Compressive sensing (CS) is a notable technique in signal processing, especially in multimedia, as it allows for simultaneous signal acquisition and dimensionality reduction. Recent advancements in deep learning (DL) have led to the creation of deep unfolding architectures, which overcome the inefficiency and subpar quality of traditional CS reconstruction methods. In this paper, we introduce a novel CS image reconstruction algorithm that leverages the strengths of the fast iterative shrinkage-thresholding algorithm (FISTA) and modern Transformer networks. To enhance computational efficiency, we employ a block-based sampling approach in the sampling module. By mapping FISTA's iterative process onto neural networks in the reconstruction module, we address the hyperparameter challenges of traditional algorithms, thereby improving reconstruction efficiency. Moreover, the robust feature extraction capabilities of Transformer networks significantly enhance image reconstruction quality. Experimental results show that the proposed model, FusionOpt-Net, surpasses other advanced methods on various public benchmark datasets.
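As background for the abstract above, the classical FISTA iteration that deep unfolding networks build on can be sketched as follows. This is only the textbook algorithm for min_x 0.5 ||Ax - b||^2 + lam ||x||_1, not the unfolded FusionOpt-Net model. The names `fista` and `soft_threshold` and the `lipschitz` parameter (an upper bound on the largest eigenvalue of A^T A, supplied by the caller) are assumptions made for this illustration.

```python
def soft_threshold(v, t):
    """Elementwise shrinkage: the proximal operator of t * ||.||_1."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def fista(A, b, lam, lipschitz, iters=200):
    """Classical FISTA for 0.5 * ||A x - b||^2 + lam * ||x||_1.

    A is a list of rows, b a list; `lipschitz` must be at least the largest
    eigenvalue of A^T A (assumed to be known by the caller).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n   # current estimate
    z = x[:]        # extrapolated point
    t = 1.0         # momentum parameter
    for _ in range(iters):
        # Gradient of the smooth term at z: A^T (A z - b).
        residual = [sum(A[i][j] * z[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        # Gradient step followed by shrinkage.
        x_new = soft_threshold(
            [z[j] - grad[j] / lipschitz for j in range(n)], lam / lipschitz)
        # Momentum update and extrapolation.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        z = [x_new[j] + ((t - 1.0) / t_new) * (x_new[j] - x[j])
             for j in range(n)]
        x, t = x_new, t_new
    return x
```

In an unfolded network, each loop iteration typically becomes one layer, and fixed quantities such as the step size and threshold become learnable parameters, which is one way the hyperparameter tuning mentioned in the abstract can be avoided.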