ISBN (print): 9789819784929; 9789819784936
Remote sensing image change captioning (RSICC) faces significant challenges in effectively identifying and articulating changes between bi-temporal images. Traditional approaches often rely on standalone text decoders, which may neither capture the subtleties of visual changes nor fully exploit advanced language modeling capabilities. To overcome these limitations, we propose Change-Aware Adaption (Chareption), a novel framework that effectively leverages pre-trained large language models (LLMs) to enhance both the accuracy and detail of change captions. Central to Chareption is a change-aware module designed to selectively identify and utilize the tokens that most strongly represent changes, thus avoiding the redundancy that plagues methods relying solely on class tokens or on indiscriminate use of all patch tokens. Additionally, Chareption introduces a lightweight change adapter module, seamlessly integrated into both the vision backbone and the LLM, that requires minimal learnable parameters while optimally adjusting representations for the RSICC task. Our experiments on the LEVIR-CC dataset demonstrate that Chareption significantly outperforms existing methods in caption accuracy and contextual relevance while also reducing training overhead. This establishes Chareption as a pioneering solution that sets a new direction in RSICC by harnessing the rich representational power of LLMs for improved multimodal understanding.
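The change-aware token selection described above can be illustrated with a minimal sketch: score each patch token by how much its bi-temporal features differ, and keep only the top-k. The cosine-distance scoring rule, function name, and parameters here are illustrative assumptions, not the paper's actual module.

```python
import numpy as np

def select_change_tokens(tokens_t1, tokens_t2, k):
    """Keep the k patch tokens whose bi-temporal features differ most.

    tokens_t1, tokens_t2: (N, D) patch-token arrays from the two dates.
    Returns the indices of the k most-changed tokens and their feature
    differences. A toy stand-in for a change-aware selection module;
    the actual scoring rule in Chareption is not specified here.
    """
    # Cosine similarity between corresponding patch tokens.
    a = tokens_t1 / np.linalg.norm(tokens_t1, axis=1, keepdims=True)
    b = tokens_t2 / np.linalg.norm(tokens_t2, axis=1, keepdims=True)
    sim = (a * b).sum(axis=1)                 # (N,) values in [-1, 1]
    change_score = 1.0 - sim                  # high score = strong change
    idx = np.argsort(change_score)[::-1][:k]  # top-k changed patches
    return idx, tokens_t2[idx] - tokens_t1[idx]
```

Feeding only the selected tokens (rather than all patch tokens) to the language model is what avoids the redundancy the abstract mentions.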
In recent years, convolutional neural networks (CNNs) have excelled in remote sensing image super-resolution reconstruction (RSISR) tasks, becoming the predominant algorithms in this domain. However, these models prim...
High-resolution (HR) remote sensing is essential for remote sensing image interpretation, but challenges in super-resolution (SR) stem from scale and texture differences within images, neglecting high-dimensional deta...
The article considers the problem of testing the hypothesis of independence of random variables given large amounts of statistical data. Solving this problem is necessary when estimating probability densities of random variables and synthesizing information-processing algorithms. A nonparametric procedure is proposed for testing the hypothesis of independence of random variables in a sample containing a large amount of statistical data. The procedure involves compressing the initial statistical data by decomposing the range of values of the random variables. The generated data array consists of the centers of the sampling intervals and the corresponding frequencies of observations belonging to the original sample. The obtained data were used to construct a nonparametric pattern recognition algorithm corresponding to the maximum likelihood criterion. The distribution laws in the classes were evaluated under the assumptions of independence and of dependence of the compared random variables. When recovering the distribution laws of random variables in the classes, regression estimates of probability densities were used. Under these conditions, the probabilities of pattern recognition errors in the classes were estimated, and the decision about independence or dependence of the random variables was made according to their minimum value. The procedure was applied to the analysis of remote sensing data on forest areas; linear and nonlinear relationships between the spectral features under study were determined.
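The compression-then-decision idea above can be sketched in a few lines: bin the sample into interval centers with frequencies, then compare the likelihood of the data under the dependent (joint) and independent (product-of-marginals) laws. The binned plug-in estimators and the fixed decision threshold are simplifying assumptions; the article's regression density estimates and error-probability criterion are not reproduced here.

```python
import numpy as np

def compress_and_test_independence(x, y, bins=8):
    """Simplified sketch of a compression-based independence check.

    The sample is compressed into a bins x bins grid (interval centers
    plus observation frequencies); the joint frequencies are then
    compared against the product of the marginals via the average
    log-likelihood gap, which equals the mutual information of the
    compressed sample. Threshold and estimators are illustrative.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_joint = joint / joint.sum()              # compressed joint law
    p_x = p_joint.sum(axis=1, keepdims=True)   # marginal law of x
    p_y = p_joint.sum(axis=0, keepdims=True)   # marginal law of y
    p_indep = p_x * p_y                        # law under independence
    mask = p_joint > 0
    # Mean log-likelihood of the data under each hypothesis.
    ll_dep = np.sum(p_joint[mask] * np.log(p_joint[mask]))
    ll_ind = np.sum(p_joint[mask] * np.log(p_indep[mask]))
    return "dependent" if ll_dep - ll_ind > 0.05 else "independent"
```

For large samples the likelihood gap concentrates near the true mutual information, so the small positive threshold absorbs the estimation bias of the binned estimator.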
Remote sensing image change detection is an important task in the field of remote sensing image analysis, and it is widely used in urban planning, disaster detection, environmental protection and other fields. A U-Net...
ISBN (print): 9798350353013; 9798350353006
To synthesize high-fidelity samples, diffusion models typically require auxiliary data to guide the generation process. However, it is impractical to procure the painstaking patch-level annotation effort required in specialized domains like histopathology and satellite imagery; it is often performed by domain experts and involves hundreds of millions of patches. Modern-day self-supervised learning (SSL) representations encode rich semantic and visual information. In this paper, we posit that such representations are expressive enough to act as proxies for fine-grained human labels. We introduce a novel approach that trains diffusion models conditioned on embeddings from SSL. Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images. In addition, we construct larger images by assembling spatially consistent patches inferred from SSL embeddings, preserving long-range dependencies. Augmenting real data by generating variations of real images improves downstream classifier accuracy for patch-level and larger, image-scale classification tasks. Our models are effective even on datasets not encountered during training, demonstrating their robustness and generalizability. Generating images from learned embeddings is agnostic to the source of the embeddings. The SSL embeddings used to generate a large image can either be extracted from a reference image, or sampled from an auxiliary model conditioned on any related modality (e.g. class labels, text, genomic data). As proof of concept, we introduce the text-to-large image synthesis paradigm, where we successfully synthesize large pathology and satellite images out of text descriptions.
ISBN (print): 9798350320107
This work serves as a demonstrator of how low/medium-quality UAV data can be integrated for agricultural pattern classification with a convolutional neural network (CNN). The study also illustrates the potential sources of error in spectral and texture information that arise during image acquisition and processing, which can be mitigated through image processing and the correct choice of mosaicking parameters. A CNN classifies six agricultural patterns of interest (weed-infested area, dry and vital crop area, dry and vital lodged crop area, bare soil area) in corn, rapeseed, winter wheat and spring barley fields. The performance of the classification is assessed on images with different units (reflectance and DN) and on images with different sun illumination conditions, shadows and 'blur' effects (moderate/low-quality data).
ISBN (print): 9798350301298
Performing super-resolution of a depth image using the guidance from an RGB image is a problem that concerns several fields, such as robotics, medical imaging, and remote sensing. While deep learning methods have achieved good results in this problem, recent work highlighted the value of combining modern methods with more formal frameworks. In this work, we propose a novel approach which combines guided anisotropic diffusion with a deep convolutional network and advances the state of the art for guided depth super-resolution. The edge transferring/enhancing properties of the diffusion are boosted by the contextual reasoning capabilities of modern networks, and a strict adjustment step guarantees perfect adherence to the source image. We achieve unprecedented results on three commonly used benchmarks for guided depth super-resolution. The performance gain compared to other methods is largest at larger scales, such as ×32 scaling. Code for the proposed method is available to promote reproducibility of our results.
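The two ingredients named above, guide-driven anisotropic diffusion and a strict adjustment step, can be sketched without the deep network: conductances derived from guide-image gradients block smoothing across edges, and after every step each upsampled block is re-centered on its source pixel so the low-res input is reproduced exactly. All parameter names and values below are illustrative assumptions.

```python
import numpy as np

def guided_diffusion_sr(depth_lr, guide, scale, iters=200, lam=0.1, sigma=0.1):
    """Toy guided anisotropic diffusion for depth super-resolution.

    depth_lr: (h, w) low-res depth; guide: (h*scale, w*scale) grayscale
    guide in [0, 1]. Strong guide edges suppress diffusion (anisotropy);
    the adjustment step enforces that every scale x scale block averages
    back to its source pixel, mimicking strict adherence to the source.
    """
    d = np.kron(depth_lr, np.ones((scale, scale)))   # nearest-neighbour init
    # Conductances: near 1 in flat guide regions, near 0 across edges.
    gx = np.exp(-(np.diff(guide, axis=1) / sigma) ** 2)
    gy = np.exp(-(np.diff(guide, axis=0) / sigma) ** 2)
    h, w = depth_lr.shape
    for _ in range(iters):
        flux_x = gx * np.diff(d, axis=1)
        flux_y = gy * np.diff(d, axis=0)
        upd = np.zeros_like(d)
        upd[:, :-1] += flux_x; upd[:, 1:] -= flux_x
        upd[:-1, :] += flux_y; upd[1:, :] -= flux_y
        d += lam * upd                               # explicit diffusion step
        # Adjustment: force each block mean to equal the source pixel.
        blocks = d.reshape(h, scale, w, scale)
        means = blocks.mean(axis=(1, 3), keepdims=True)
        blocks += depth_lr.reshape(h, 1, w, 1) - means
        d = blocks.reshape(h * scale, w * scale)
    return d
```

In the paper's full method the diffusion coefficients come from a learned network rather than a fixed exponential of the guide gradients; the sketch keeps only the variational skeleton.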
Fusing a low-spatial-resolution hyperspectral (LR HS) image and a high-spatial-resolution multispectral (HR MS) image from different modalities aims to obtain a high-spatial-resolution hyperspectral (HR HS) image. However, most deep neural network (DNN)-based methods overlook the correlation between the spatial and spectral domains, leading to limited fusion performance. To solve this problem, we propose the spatial-spectral unfolding network with mutual guidance (SMGU-Net). Specifically, the information from the different modalities in the source images is treated as mutually complementary components to derive the reconstruction model. The model is then optimized using half-quadratic splitting and gradient descent algorithms and is unfolded into a network that leverages the powerful learning capabilities of DNNs to explore more latent information in the deep feature space. In this way, the network achieves the interaction and complementarity of cross-modality information to generate fused images. Experiments are conducted on four benchmark datasets to demonstrate the effectiveness of SMGU-Net. The code can be downloaded from https://***/yansql/SMGU-Net.
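The reconstruction model that such unfolding networks are built on is the classical HS/MS data-fidelity term: the HR HS estimate should reproduce the LR HS image when spatially downsampled and the HR MS image when spectrally degraded. One gradient-descent step of that model, the piece each unfolded stage iterates, can be sketched as follows; the block-average downsampling, spectral response R, and step size are assumptions, and SMGU-Net's learned guidance modules are omitted.

```python
import numpy as np

def fusion_gradient_step(X, Y_hs, Y_ms, R, scale, eta=0.5):
    """One gradient step on the standard HS/MS fusion data-fidelity term.

    X: (B, H, W) current HR HS estimate; Y_hs: (B, H/s, W/s) LR HS image
    (modelled as block-average downsampling); Y_ms: (b, H, W) HR MS
    image; R: (b, B) spectral response matrix.
    """
    B, H, W = X.shape
    # Spatial term: downsampling residual, sent back via the adjoint
    # of block averaging (replicate, then divide by the block size).
    Xd = X.reshape(B, H // scale, scale, W // scale, scale).mean(axis=(2, 4))
    r_sp = np.kron(Y_hs - Xd, np.ones((scale, scale))) / scale**2
    # Spectral term: response residual, sent back via R^T band-wise.
    resid = Y_ms - np.einsum('cb,bhw->chw', R, X)
    r_spec = np.einsum('cb,chw->bhw', R, resid)
    return X + eta * (r_sp + r_spec)   # gradient descent on both terms
```

In the unfolded network, a learned module replaces or follows this step at every stage (the proximal update from half-quadratic splitting), which is where the cross-modality mutual guidance enters.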
Wild fish recognition is a fundamental problem in ocean ecology research and contributes to the understanding of biodiversity. Given the huge number of wild fish species and the unrecognized category, the problem is in essence one of open-set fine-grained recognition. Moreover, the unrestricted marine environment makes the problem even more challenging. Deep learning has been demonstrated to be a powerful paradigm for image classification tasks. In this article, the wild fish recognition deep neural network (WildFishNet) is proposed. Specifically, an open-set fine-grained recognition neural network with a fused activation pattern is constructed to implement wild fish recognition. First, three different reciprocal inverted residual structural modules are combined by neural architecture search to obtain the best feature extraction performance for fine-grained recognition; next, a new fused activation pattern of softmax and openmax functions is designed to improve open-set recognition ability. The experiments are conducted on the WildFish dataset, which consists of 54,459 unconstrained images covering 685 known classes and 1 open-set unrecognized category. Finally, the experimental results are analyzed comprehensively to demonstrate the effectiveness of the proposed method. The in-depth study also shows that artificial intelligence can empower marine ecosystem research.
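The softmax/openmax fusion idea can be illustrated with a toy open-set scorer: each known-class logit is shaved by a membership weight that decays with distance to that class's mean activation vector, the shaved mass becomes a pseudo-logit for the unknown class, and a standard softmax is applied over all C+1 outcomes. The exponential-decay calibration below is a stand-in assumption for openmax's Weibull fitting; names and parameters are illustrative.

```python
import numpy as np

def fused_open_set_scores(av, class_means, tau=5.0):
    """Toy fusion of softmax with an openmax-style unknown score.

    av: (C,) activation vector for one image; class_means: (C, C) mean
    activation vector of each known class. Returns (C+1,) probabilities,
    where the last entry is the open-set 'unknown' category.
    """
    dists = np.linalg.norm(class_means - av, axis=1)   # distance to each class
    w = np.exp(-dists / tau)                           # membership in (0, 1]
    known = av * w                                     # recalibrated logits
    unknown = np.sum(av * (1.0 - w))                   # mass moved to open set
    z = np.concatenate([known, [unknown]])
    e = np.exp(z - z.max())                            # stable softmax
    return e / e.sum()
```

An input with high logits that nevertheless sits far from every class's mean activation vector ends up assigned to the unknown category, which is the behavior a pure softmax cannot express.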