The main bottleneck facing total variation methods for image fusion is the difficulty of designing a novel optimization model that can still be solved by numerical methods. This paper proposes a general framework in which total variation models are optimized by deep learning for infrared and visible image fusion, drawing on the advantages of deep convolutional neural networks. Under this framework, any convex or non-convex total variation model for image fusion can be designed, and its solution can be obtained through neural network learning. The core idea is to transform the designed variational model into the loss function of a deep convolutional neural network: the data term is expressed using an initial fused image of the source images and the output fused image, the regularization term is expressed using the output image and the source image, and the optimal fused image is obtained by training the network. Based on the proposed framework, further research on pre-fusion, network models, and regularization terms can be carried out. To verify the effectiveness of the framework, we designed a specific non-convex total variation model and performed experiments on infrared and visible image datasets. Experimental results show that the proposed method is robust and that, compared with current state-of-the-art algorithms, its fused images are competitive in both objective evaluation metrics and visual quality. Our code is publicly available at https://***/gzsds/globaloptimizationimagefusion.
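The idea of turning a variational model into a training loss can be sketched as follows. This is a minimal illustration only, not the paper's model: the function names `total_variation` and `fusion_loss`, the weight `lam`, and the particular choice of a squared-error data term plus an anisotropic TV regularizer are all assumptions, and the regularizer used here is a simple convex stand-in for the non-convex model the abstract mentions. Images are plain nested lists so the sketch runs without any framework; in practice the same expression would be written in an autodiff library and minimized over network weights.

```python
def total_variation(img):
    """Anisotropic total variation: sum of absolute differences between
    horizontally and vertically adjacent pixels."""
    rows, cols = len(img), len(img[0])
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                tv += abs(img[r][c + 1] - img[r][c])
            if r + 1 < rows:
                tv += abs(img[r + 1][c] - img[r][c])
    return tv


def fusion_loss(fused, prefused, source, lam=0.1):
    """Hypothetical variational loss: data term + lam * regularization term.

    data term: squared error between the network output (fused) and an
               initial fused image (prefused)
    reg. term: total variation of (fused - source), which penalizes
               gradient differences between the output and a source image
    """
    data = sum(
        (f - p) ** 2
        for f_row, p_row in zip(fused, prefused)
        for f, p in zip(f_row, p_row)
    )
    diff = [[f - s for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(fused, source)]
    return data + lam * total_variation(diff)
```

Swapping in a different data or regularization term only changes the body of `fusion_loss`, which is the flexibility the framework is described as offering.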
ISBN: (print) 9798350349405; 9798350349399
The last decade has seen significant advances in multi-object tracking, particularly with the emergence of deep learning-based methods. However, many prior studies in online tracking have focused primarily on enhancing track management or extracting visual features, often leading to hybrid approaches with limited effectiveness, especially in scenarios with severe occlusions. Conversely, offline tracking has placed little emphasis on robust motion cues. In response, this work presents a novel solution for offline tracking that merges tracklets using recent learning-based architectures. We leverage a jointly operating Transformer and graph neural network (GNN) encoder to integrate both the individual motions of targets and the interactions between them. By enabling bi-directional information propagation between the Transformer and the GNN, the proposed model allows motion modeling to depend on interactions and, conversely, interaction modeling to depend on the motion of each target. The proposed solution is an end-to-end trainable model that eliminates the need for any handcrafted short-term or long-term matching processes. This approach performs on par with state-of-the-art multi-object tracking algorithms, demonstrating its effectiveness and robustness.
This paper addresses the problem of object recognition given a set of images as input (e.g., multiple camera sources and video frames). Convolutional neural network (CNN)-based frameworks do not exploit these sets effectively: they process each pattern as observed and do not capture the underlying feature distribution, because they ignore the variance of the images in the set. To address this issue, we propose the Grassmannian learning mutual subspace method (G-LMSM), a neural network layer embedded on top of CNNs that can process image sets more effectively and can be trained in an end-to-end manner. The image set is first represented by a low-dimensional input subspace, and this input subspace is then matched with dictionary subspaces by a similarity based on their canonical angles, an interpretable and easy-to-compute metric. The key idea of G-LMSM is that the dictionary subspaces are learned as points on the Grassmann manifold, optimized with Riemannian stochastic gradient descent. This learning is stable, efficient, and theoretically well-grounded. We demonstrate the effectiveness of the proposed method on hand shape recognition, face identification, and facial emotion recognition. (c) 2022 Elsevier B.V. All rights reserved.
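The canonical-angle similarity mentioned above can be illustrated with a short NumPy sketch. It assumes each subspace is given by a matrix whose columns span it, and it uses the mean squared cosine of the canonical angles as the similarity, which is one common choice in mutual subspace methods; the function name `subspace_similarity` is hypothetical, and the abstract does not say which exact similarity G-LMSM uses.

```python
import numpy as np


def subspace_similarity(a, b):
    """Mean squared cosine of the canonical angles between span(a) and span(b).

    a, b: arrays of shape (dim, k) whose columns span each subspace.
    """
    # Orthonormal bases for the two subspaces.
    qa, _ = np.linalg.qr(np.asarray(a, dtype=float))
    qb, _ = np.linalg.qr(np.asarray(b, dtype=float))
    # Singular values of Qa^T Qb are the cosines of the canonical angles.
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against rounding error
    return float(np.mean(cosines ** 2))


# Two planes in R^3 that share exactly one direction (the first axis).
plane_xy = [[1, 0], [0, 1], [0, 0]]
plane_xz = [[1, 0], [0, 0], [0, 1]]
sim_same = subspace_similarity(plane_xy, plane_xy)
sim_diff = subspace_similarity(plane_xy, plane_xz)
```

Identical subspaces give a similarity of 1, and the two planes above give 0.5 because one canonical angle is 0 and the other is 90 degrees.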
Given that super-resolution (SR) aims to recover lost information, and that low-resolution (LR) images in real-world conditions may be corrupted by multiple degradations, treating basic bicubic down-sampling as the sole degradation significantly limits the performance of most existing SR models. This paper presents a generative adversarial network (GAN)-based model for simultaneous super-resolution and blind additive white Gaussian noise (AWGN) denoising, with two components (netdeg and netSR), designed to produce detailed results. netdeg, which features residual blocks and novel cost-effective ghost residual blocks with a frequency separation module for capturing long-range information, blindly restores a clean version of the LR image. netSR leverages slim ghost full-frequency residual blocks to process low-frequency (LF) and high-frequency (HF) information via static large convolutions and pixel-wise highlighted, input-adaptive dynamic convolutions, respectively. To address the susceptibility of dynamic layers to noise and to preserve feature diversity while reducing the model's cost, static and dynamic layer features are combined and highlighted. Diverse and non-redundant features are then processed using ghost-style blocks. The proposed model achieves comparable SR results in bicubic down-sampling scenarios, outperforms existing SR methods in the complex task of concurrent SR and AWGN denoising, and demonstrates robustness in handling images corrupted with varying levels of AWGN.
ISBN: (print) 9798350349405; 9798350349399
Compressed sensing (CS) is a method for accelerating MRI acquisition by acquiring less data through undersampling of Fourier space (k-space). Existing deep learning-based CSMRI methods are commonly concerned with optimizing data-driven network models that take undersampled data points as input, together with an efficient learning framework. Generative modelling is a learning framework employed in many applications to learn an abstract distribution of the observed data and thereby generate new data points similar to the true features. In this regard, the current work proposes a generative adversarial network (GAN)-based model, the Cross Domain Extrapolation Generative Adversarial Network (CdE-GAN), which incorporates an extrapolation mechanism through a decoder-type architectural design to represent fine details with a large set of pixels. Experimental results show that the extrapolation network enables robust and accurate estimation of missing frequencies, alleviating structural artifacts at higher acceleration/downsampling factors compared to state-of-the-art methods.
In this article, we develop a general theoretical framework for constructing Haar-type tight framelets on any compact set with a hierarchical partition. In particular, we construct a novel area-regular hierarchical partition on the 2-sphere and establish the corresponding spherical Haar tight framelets with directionality. We evaluate and illustrate the effectiveness of our area-regular spherical Haar tight framelets in several denoising experiments. Furthermore, we propose a convolutional neural network (CNN) model for spherical signal denoising, which employs fast framelet decomposition and reconstruction algorithms. Experimental results show that the proposed CNN model outperforms threshold methods and possesses strong generalization and robustness.
Mixed-signal separation is an important field of image processing. Traditional blind source separation (BSS) algorithms solve this task by relying on multiple signal constraints, such as independence, non-Gaussianity, low rank, sparsity, and temporal continuity. Moreover, as a case of ill-conditioned signal mixing, single-channel blind source separation (SCBSS) is even more difficult. Because neural networks have strong adaptability and self-organization capability, neural network methods based on training and learning are favored by researchers. However, most neural network-based BSS methods are limited by small sample sizes. Among the various neural networks, the generative adversarial network (GAN) has emerged as an interesting candidate because it does not depend on statistical constraints or large sample sets. We therefore present an end-to-end single-channel blind image separation algorithm based on an attention-mechanism GAN, coined AGAN, which shows promise for the blind image separation task. The network, with feature extraction and edge guidance, offers a new way to iteratively separate mixed images. Experimental results show that AGAN separates the source signals in mixed images more effectively than the neural egg separation (NES) algorithm, a neural network separation algorithm. It also achieves better separation performance than classical blind source separation algorithms.
Stereoscopic image quality assessment (SIQA) models have improved remarkably in recent years owing to the pervasive application of convolutional neural networks (CNNs). Although current CNN-based methods have achieved good results, they extract only single-scale features at each level. In addition, some CNN-based methods feed the left and right images directly into the network, ignoring the visual fusion mechanism. In this work, a hierarchical multi-scale no-reference SIQA method based on dilated convolution is proposed. A multi-scale module built from standard convolutions leads to a sharp increase in the number of model parameters. In contrast, dilated convolution enlarges the receptive field while restraining the growth in the number of model parameters. Therefore, dilated convolution is used to simulate the multi-scale characteristics of human vision. In addition, instead of the left and right images, a cyclopean image generated by a new method is used as the input of the network. Experimental results on four public databases show that the proposed model is superior to state-of-the-art SIQA methods.
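The parameter argument for dilated convolution can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names and the 64-channel layer are assumptions, not details from the paper. It uses the standard facts that a k x k kernel with dilation d covers an effective window of k + (k - 1)(d - 1) pixels per side, while its weight count depends only on k and the channel counts.

```python
def effective_kernel_size(k, dilation):
    """Side length of the window covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


def conv_params(k, c_in, c_out, bias=True):
    """Number of learnable parameters in a k x k 2-D convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)


# A 3x3 kernel with dilation 2 covers the same 5x5 window as a standard 5x5 kernel,
# but keeps the parameter count of a 3x3 layer.
window_dilated = effective_kernel_size(3, 2)
params_dilated_3x3 = conv_params(3, 64, 64)
params_standard_5x5 = conv_params(5, 64, 64)
```

With 64 input and 64 output channels, the dilated 3x3 layer has 36,928 parameters, against 102,464 for the standard 5x5 layer covering the same window.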
Machine learning has made significant progress in image recognition, natural language processing, and autonomous driving. However, the generation of adversarial examples has shown that machine learning systems are unreliable: adding imperceptible perturbations to clean images can fool well-trained machine learning systems. To solve this problem, we propose an adaptive image denoising framework, ASQ-FastBM3D, which combines the adaptive scalar quantization (ASQ) method with the FastBM3D algorithm. Adaptive scalar quantization is an improvement on scalar quantization and is used to eliminate most of the perturbations. FastBM3D is proposed to improve the quality of the quantized image. The running time of FastBM3D is 50% less than that of BM3D. Compared with some traditional filter methods and some state-of-the-art neural network methods for recovering adversarial examples, our ASQ-FastBM3D method achieves an accuracy of 99.73% and an F1 score of 98.01%, the highest among the compared methods.
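Why quantization can remove small perturbations is easiest to see with plain uniform scalar quantization, sketched below. This is not the adaptive ASQ method from the paper, whose details the abstract does not give; the function name `quantize` and the choice of four levels are assumptions made for illustration.

```python
def quantize(pixels, levels, max_value=255):
    """Uniform scalar quantization: snap each value to the nearest of
    `levels` evenly spaced values in [0, max_value]."""
    step = max_value / (levels - 1)
    return [int(round(round(p / step) * step)) for p in pixels]


# A clean pixel value and two slightly perturbed copies of it.
clean_and_perturbed = [100, 103, 98]
restored = quantize(clean_and_perturbed, levels=4)
```

With four levels the step is 85, so the clean value and both perturbed copies collapse to the same output level, which is how coarse quantization can cancel perturbations smaller than half a step. The cost is a loss of image detail, which is consistent with the abstract's use of a denoising stage afterwards to improve the quality of the quantized image.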
Matrix functions are indispensable and of primary concern in polarization optics when the vector nature of light is considered. This paper investigates matrix-based Fourier analysis of two-dimensional matrix signals and systems. With the aid of the linearity and the superposition integral of matrix functions, the theory of linear invariant matrix systems is constructed by virtue of six matrix-based integral transformations [i.e., matrix (direct) convolution, matrix (direct) correlation, and matrix element-wise convolution/correlation]. Properties of the matrix-based Fourier transforms are introduced together with some applications, including the identity impulse matrix, the matrix sampling theorem, the width, bandwidth, and their uncertainty relation for the matrix signal, and Haagerup's inequality for matrix normalization. The coherence time and the effective spectral width of a stochastic electromagnetic wave are discussed as an application example to demonstrate how the proposed mathematical tools can be applied in analyzing polarization-dependent Fourier optics.