Hyperspectral remote sensing images (HSIs) capture detailed spectral characteristics of features, while multispectral remote sensing images (MSIs) provide clear spatial distribution. Fusing these two types of images can enhance feature identification and classification accuracy. Current deep learning algorithms achieve high fusion quality but struggle to balance effective global perception with lightweight computation. Moreover, these algorithms typically handle data mapping discretely, which contrasts with the continuous nature of the world. Recently, Mamba has shown significant potential for complex long-range modeling, addressing the computational complexity of global perception. Concurrently, implicit neural representation (INR) offers high-quality solutions for continuous-domain modeling. To this end, this study introduces a novel network architecture that combines Mamba and INR, termed the Mamba cooperative INR fusion network (MCIFNet). MCIFNet effectively captures global image information and generates fused images in a continuous domain through point-to-point processing. The network comprises two main units: potential space projection, and semantic extraction and fusion. The potential space projection unit performs shallow encoding of HSIs and MSIs, mapping them to a latent feature space. The semantic extraction and fusion unit (SEFU) uses scale adaptive residual state spatial and implicit spatial-spectral fusion (ISSF) modules to extract deep features from the bimodal images, generating fused images point by point. A series of fusion experiments with 4x, 8x, and 16x scale factors demonstrates that MCIFNet surpasses popular algorithms in both spatial detail and spectral information reconstruction while being more lightweight. The code for MCIFNet will be shared on https://***/chunyuzhu/MCIFNet.
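To make the idea of point-by-point generation in a continuous domain more concrete, here is a minimal sketch of a generic coordinate-based decoder. It is not MCIFNet's ISSF module: the function names `query_latent` and `decode`, the nearest-cell lookup, the two-layer MLP, and the random weights are all assumptions chosen for illustration. Each output pixel is produced from a continuous coordinate, so the same latent grid can be rendered at any scale factor.

```python
import numpy as np

def query_latent(feat, coords):
    """Look up the nearest latent vector for each continuous coordinate.

    feat:   (H, W, C) latent feature grid (hypothetical encoder output).
    coords: (N, 2) query points as (y, x), each in [0, 1).
    Returns the (N, C) latent vectors and the (N, 2) offsets of each query
    from the centre of its latent cell, measured in cell units.
    """
    H, W, _ = feat.shape
    pos = coords * np.array([H, W])              # position in grid units
    idx = np.clip(np.floor(pos).astype(int), 0, [H - 1, W - 1])
    offset = pos - (idx + 0.5)                   # lies in [-0.5, 0.5)
    return feat[idx[:, 0], idx[:, 1]], offset

def decode(feat, coords, w1, b1, w2, b2):
    """Map (latent vector, relative offset) to spectral bands, one point at a time."""
    z, offset = query_latent(feat, coords)
    hidden = np.maximum(np.concatenate([z, offset], axis=1) @ w1 + b1, 0.0)  # ReLU
    return hidden @ w2 + b2

def pixel_centres(height, width):
    """Normalised (y, x) coordinates of the pixel centres of a height x width image."""
    ys = (np.arange(height) + 0.5) / height
    xs = (np.arange(width) + 0.5) / width
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy.ravel(), xx.ravel()], axis=1)

# Illustrative use: a 4x4 latent grid rendered at 4x scale (16x16 pixels).
rng = np.random.default_rng(0)
channels, hidden_dim, bands = 8, 16, 31          # arbitrary illustrative sizes
feat = rng.normal(size=(4, 4, channels))
w1 = rng.normal(size=(channels + 2, hidden_dim))
b1 = np.zeros(hidden_dim)
w2 = rng.normal(size=(hidden_dim, bands))
b2 = np.zeros(bands)
fused = decode(feat, pixel_centres(16, 16), w1, b1, w2, b2).reshape(16, 16, bands)
```

Changing the arguments of `pixel_centres` is all that is needed to render the same latent grid at 8x or 16x, which is the practical appeal of a continuous representation.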
fMRI (functional Magnetic Resonance Imaging) visual decoding involves decoding the original image from brain signals elicited by visual stimuli. This often relies on manually labeled ROIs (Regions of Interest) to sele...
Existing methods for integerized training speed up deep learning by using low-bitwidth integerized weights, activations, gradients, and optimizer buffers. However, they overlook the issue of full-precision latent weights, which consume excessive memory to accumulate gradient-based updates for optimizing the integerized weights. In this paper, we propose the first latent weight quantization schema for general integerized training, which minimizes the quantization perturbation to the training process via residual quantization with an optimized dual quantizer. We leverage residual quantization to eliminate the correlation between the latent weight and the integerized weight, thereby suppressing quantization noise. We further propose a dual quantizer with an optimal nonuniform codebook to avoid frozen weights and to ensure a training trajectory that is statistically unbiased with respect to the full-precision latent weight. The codebook is optimized to minimize the disturbance to weight updates under importance guidance, and it is realized with a three-segment polyline approximation for hardware-friendly implementation. Extensive experiments show that the proposed schema allows integerized training with latent weights as low as 4 bits for various architectures, including ResNets, MobileNetV2, and Transformers, and yields negligible performance loss in image classification and text generation. Furthermore, we successfully fine-tune Large Language Models with up to 13 billion parameters on a single GPU using the proposed schema.
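As a rough illustration of what residual quantization means here, the sketch below stores a weight as an integer code plus a low-bit code for the leftover residual. It is a generic two-stage scheme under stated assumptions, not the paper's method: it uses plain uniform rounding in place of the dual quantizer with a nonuniform codebook, and the names `uniform_quantize`, `residual_quantize`, and `reconstruct` are hypothetical.

```python
import numpy as np

def uniform_quantize(x, bits, scale):
    """Round x / scale to the nearest integer and clip to the signed range."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    return np.clip(np.round(x / scale), qmin, qmax).astype(np.int32)

def residual_quantize(w, weight_bits=8, residual_bits=4):
    """Two-stage quantization: integer weight first, then its residual.

    Assumption: max-abs scaling and uniform rounding at both stages.
    """
    # Stage 1: the integerized weight used in the forward pass.
    s1 = max(np.max(np.abs(w)), 1e-12) / (2 ** (weight_bits - 1) - 1)
    q1 = uniform_quantize(w, weight_bits, s1)
    # Stage 2: what stage 1 could not represent, stored in a few bits.
    residual = w - s1 * q1
    s2 = max(np.max(np.abs(residual)), 1e-12) / (2 ** (residual_bits - 1) - 1)
    q2 = uniform_quantize(residual, residual_bits, s2)
    return (q1, s1), (q2, s2)

def reconstruct(stage1, stage2):
    """Approximate latent weight: integer weight plus quantized residual."""
    (q1, s1), (q2, s2) = stage1, stage2
    return s1 * q1 + s2 * q2

# Illustrative use on random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=1000)
stage1, stage2 = residual_quantize(w)
err_integer_only = np.mean(np.abs(w - stage1[1] * stage1[0]))
err_with_residual = np.mean(np.abs(w - reconstruct(stage1, stage2)))
```

In this toy setting the residual code recovers most of the precision lost by the integer weight alone, which hints at why a low-bit residual can stand in for a full-precision latent weight. The unbiasedness and frozen-weight properties discussed in the abstract depend on the dual quantizer and are not reproduced by this sketch.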
In recent years, research and technology advancements have driven exponential growth in the adoption of artificial intelligence (AI)-based systems, even in safety-critical contexts such as autonomous driving and healt...
Computed tomography (CT) provides high-resolution medical imaging, but it can expose patients to high radiation doses. X-ray scanners involve low radiation exposure, but their resolution is low. This paper proposes a ne...
The near-space airship-borne communication network is recognized as an indispensable component of the future integrated ground-air-space network thanks to airships' advantage of long-term residency at stratospheric altitudes, but it urgently needs reliable and efficient Airship-to-X links. To improve transmission efficiency and capacity, this paper proposes integrating semantic communication with massive multiple-input multiple-output (MIMO) technology. Specifically, we propose a deep joint semantic coding and beamforming (JSCBF) scheme for an airship-based massive MIMO image transmission network in space, in which semantics from both the source and the channel are fused to jointly design the semantic coding and physical-layer beamforming. First, we design two semantic extraction networks to extract semantics from the image source and the channel state information, respectively. Then, we propose a semantic fusion network that fuses these semantics into complex-valued semantic features for subsequent physical-layer transmission. To efficiently transmit the fused semantic features at the physical layer, we then propose hybrid data- and model-driven semantic-aware beamforming networks. At the receiver, a semantic decoding network is designed to reconstruct the transmitted images. Finally, we perform end-to-end deep learning to jointly train all the modules, using the image reconstruction quality at the receivers as the metric. The proposed deep JSCBF scheme combines the efficient source compressibility and robust error correction capability of semantic communication with the high spectral efficiency of massive MIMO, achieving a significant performance improvement over existing approaches.
The discrete wavelet transform (DWT) is commonly used for feature extraction in machine learning applications. Since these applications are frequently deployed in portable systems with limited computational resources,...