3D-aware image generation necessitates extensive training data to ensure stable training and mitigate the risk of overfitting. This paper first considers a novel task known as One-shot 3D Generative Domain Adaptation ...
ISBN: (digital) 9798350390155
ISBN: (print) 9798350390162
Cloud detection is a crucial step in the preprocessing of satellite remote sensing images. Existing methods tend to misclassify pixels in specific scenarios, for example failing to distinguish thin clouds from the background or missing cloud boundaries. To solve this problem, we designed a novel Spatial-Frequency Domain Feature Enhancement Block (SFDE) embedded in a U-shaped network called SFDE-net. SFDE consists of three units: the Dual Frequency Feature Unit (DFF), the Spatial Domain Feature Unit (SDF), and the Cross-Domain Feature Fusion Unit (CDF). DFF globally learns the boundaries and overall structure of clouds in the frequency domain, SDF captures fine-grained information in the spatial domain, and CDF adaptively fuses features from both DFF and SDF. The method's effectiveness was evaluated on two public datasets, GF-1 WFV and LandSat8. Extensive experiments demonstrated that the proposed SFDE-net achieved high detection accuracy and outperformed several state-of-the-art methods.
Remote sensing data has strong correlation and continuity in space and time, so time series remote sensing images have low-rank *** this dataset, we repaired images using low-rank tensor ***, we preprocessed the MODIS land surface temperature data and employed spatio-temporal interpolation to initially fill in the missing values caused by cloud ***, we treated the land surface temperature time series data as a third-order spatio-temporal tensor and introduced a Fourier transform on the time dimension to convert it into a space-frequency *** performing singular value decomposition and Gaussian low-pass filtering on this tensor, followed by an inverse Fourier transform, we obtained a space-time ***, we further optimized the missing tensor using the alternating direction method of multipliers. The data accuracy of the method was validated through simulation experiments, where artificial masks were added and subsequently *** resulting mean absolute error (MAE) falls within the range of 2.1 ℃ to 4.9 ℃. This dataset includes the following data for the Tibetan Plateau on a daily basis for the years 2000-2020: (1) the optimized surface temperature data for the cloud-shaded regions of the MOD11A1 and MYD11A1 products (MOD11A1_QTP_PART, MYD11A1_QTP_PART); (2) optimized MOD11A1/MYD11A1 data (MOD11A1_QTP_TEMP, MYD11A1_QTP_TEMP); and (3) original MOD11A1 and MYD11A1 products (MOD11A1_QTP_ORIGIN, MOD11A1_QTP_ORIGIN). All data have a spatial resolution of 1 km and are stored in an integer data format, with each pixel value representing the thermodynamic temperature of the surface with a scale factor of 0.02 in *** dataset is archived *** format, and consists of 43833 data files with a data size of 143 GB (compressed into 21 files with 138 GB).
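The transform-domain step described in this abstract (Fourier transform along time, singular value decomposition, Gaussian low-pass filtering, inverse transform) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dataset authors' implementation: the function name `lowrank_fourier_smooth`, the `rank` and `sigma` parameters, and the choice of a truncated SVD per frequency slice are all hypothetical, and the subsequent alternating direction method of multipliers refinement is omitted.

```python
import numpy as np

def lowrank_fourier_smooth(tensor, rank, sigma):
    """Low-rank, low-pass approximation of a (height, width, time) tensor.

    Assumes missing values were already pre-filled (e.g. by interpolation).
    `rank` and `sigma` are illustrative parameters, not values from the source.
    """
    t = tensor.shape[2]
    # Fourier transform along the time axis -> space-frequency tensor.
    spec = np.fft.fft(tensor, axis=2)
    # Gaussian low-pass weights over temporal frequency (cycles per sample).
    freqs = np.fft.fftfreq(t)
    gauss = np.exp(-0.5 * (freqs / sigma) ** 2)
    out = np.empty_like(spec)
    for k in range(t):
        # Truncated SVD of each frequency slice keeps the dominant structure.
        u, s, vh = np.linalg.svd(spec[:, :, k], full_matrices=False)
        r = min(rank, s.size)
        out[:, :, k] = ((u[:, :r] * s[:r]) @ vh[:r, :]) * gauss[k]
    # Inverse transform back to a space-time tensor; input is real, so the
    # imaginary part is only numerical noise.
    return np.fft.ifft(out, axis=2).real
```

For a tensor that is already rank one in space and constant in time, the function should return the input unchanged, which makes a convenient sanity check before applying it to real land surface temperature stacks.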
ISBN: (digital) 9798350390155
ISBN: (print) 9798350390162
Continual semantic segmentation (CSS) has become a popular field; it aims to acquire new skills continually without catastrophically forgetting past knowledge. In CSS, we identify a severe imbalance between new classes and old classes, which biases the classifier weights toward new classes. In this paper, we approach the continual semantic segmentation problem from the class imbalance perspective via mask-based class rebalancing, which keeps the model from suffering catastrophic forgetting. More specifically, mask-based class rebalancing relies on a mask to combine resampling with reweighting, which mitigates the classifier bias toward new classes. In addition, we propose a frequency knowledge distillation that leverages information from multiple frequency components to maintain the feature representation space for old classes. We demonstrate the effectiveness of our approach with an extensive evaluation on the Pascal-VOC 2012 and ADE20K datasets, where it significantly outperforms the state-of-the-art method.
Over the past few years, the advancement of Multimodal Large Language Models (MLLMs) has captured the wide interest of researchers, leading to numerous innovations to enhance MLLMs’ comprehension. In this paper, we p...
ISBN: (print) 9781665492584
Light field, as a new data representation format in multimedia, can capture both the intensity and the direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods do not describe the relationship between different views effectively, which leaves residual redundancy. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancies between views. We store the information of a light field image implicitly in a neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments have demonstrated the effectiveness of our proposed method, which achieves comparable rate-distortion performance as well as superior perceptual quality over traditional methods.
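As a rough illustration of what storing a light field implicitly in a network means, the sketch below maps a normalized 4D coordinate (u, v, x, y) to an RGB value with a small multilayer perceptron. Everything here is an assumption for illustration: the helper names `init_mlp`, `query_light_field`, and `parameter_count`, the layer sizes, and the ReLU/sigmoid activations are hypothetical, the weights are random rather than trained, and the paper's actual architecture and model compression steps are not reproduced.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Random weights and zero biases for a fully connected network (untrained)."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def query_light_field(params, coords):
    """Evaluate the network at (N, 4) coordinates (u, v, x, y) in [0, 1]."""
    h = coords
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)        # ReLU hidden layers
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))  # sigmoid -> RGB in (0, 1)

def parameter_count(params):
    """Number of stored values; this is what a codec would have to transmit."""
    return sum(w.size + b.size for w, b in params)

rng = np.random.default_rng(0)
params = init_mlp(rng, [4, 64, 64, 3])
coords = rng.random((5, 4))                    # five sample rays
rgb = query_light_field(params, coords)
```

In such a scheme the coded payload is the (compressed) network weights rather than the pixels, so comparing `parameter_count(params)` with the number of raw samples in the light field gives a first feel for the achievable rate.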
ISBN: (print) 9781665475938
In Versatile Video Coding (VVC), local affine motion compensation (LAMC) is adopted to handle complex motions, such as rotation and zooming. However, using LAMC to handle global motion is inefficient for two reasons. First, LAMC may incur extra bit cost for the affine motion model parameters. Second, the precision of LAMC is restricted by the MV precision of the control points. Therefore, in this paper, we propose a global homography motion compensation (GHMC) framework to better characterize the global motion. For each coding block, an extra mode is added to perform motion compensation based on an 8-parameter global homography motion model. In addition, an extrapolation scheme is designed to derive the parameters from reference frames to save the bit cost of signaling them. The proposed framework is implemented in the VVC reference software VTM-6.0. Experimental results show that, on average, BD-rate reductions of 0.69% and 0.66% are achieved under the Low Delay P and Low Delay B configurations, respectively, for sequences with rich complex global motions.
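To make the 8-parameter homography motion model concrete, the following sketch derives a per-pixel motion vector field for one block from a homography whose last entry is fixed to 1. It is a minimal sketch, not the VTM-6.0 implementation: the parameter ordering, the function names `homography_from_params` and `block_motion_vectors`, and the use of floating-point arithmetic are assumptions made for illustration.

```python
import numpy as np

def homography_from_params(p):
    """Build a 3x3 homography from 8 parameters (row-major), with h33 = 1."""
    return np.append(np.asarray(p, dtype=float), 1.0).reshape(3, 3)

def block_motion_vectors(H, x0, y0, width, height):
    """Per-pixel motion vectors (dx, dy) for the block at (x0, y0).

    Each pixel (x, y) is mapped through H in homogeneous coordinates and the
    displacement to its mapped position is returned, shape (height, width, 2).
    """
    ys, xs = np.mgrid[y0:y0 + height, x0:x0 + width]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # 3 x N
    mapped = H @ pts
    mapped = mapped[:2] / mapped[2]   # perspective division
    mv = mapped - pts[:2]
    return mv.T.reshape(height, width, 2)

# Pure global translation by (+3, -2): every pixel gets the same vector.
H_shift = homography_from_params([1, 0, 3, 0, 1, -2, 0, 0])
mv_field = block_motion_vectors(H_shift, x0=16, y0=32, width=8, height=8)
```

Because the same eight parameters describe the whole frame, every block that selects this mode can reuse them, which is the intuition behind avoiding per-block affine parameter signaling.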
ISBN: (digital) 9781837240982
Synthetic aperture radar (SAR) tomography (TomoSAR) has garnered significant attention due to its capability for three-dimensional reconstruction. Compressed sensing (CS) methods are widely employed to address the TomoSAR inversion challenge. Nevertheless, practical applications reveal phase errors among different channels, which cause defocusing and blurring when 3D reconstruction relies solely on CS. Current state-of-the-art autofocus techniques suffer from prohibitive computational complexity, limiting their applicability to large-scale 3D imaging. In pursuit of efficient TomoSAR 3D autofocusing, we propose ASAMP-Net, an innovative deep unfolding network. Operating within a two-step framework, each layer comprises two stages: phase error estimation and iterative scattering coefficient reconstruction using the sparse adaptive matching pursuit (SAMP) algorithm. Phase error estimation is obtained through mathematical derivation, while the fixed sparsity and limited efficiency of conventional methods are mitigated through deep learning techniques. Simulation experiments and real data validation affirm the effectiveness and superiority of the proposed method.
Deep neural networks (DNNs) have shown great potential in non-reference image quality assessment (NR-IQA). However, the annotation of NR-IQA is labor-intensive and time-consuming, which severely limits their applicati...
On-orbit processing is becoming more prevalent due to its ability to efficiently exploit satellite resources. On-orbit geometric rectification improves positioning accuracy for follow-up tasks such as object detection or geometric calibration, while avoiding a heavy burden on downlink bandwidth and the associated time delay. However, existing rectification methods face some challenges: the hardware resources onboard satellites are restricted, and geographic positioning is often inaccurate. In this article, we propose a novel method designed for on-orbit rectification. The proposed method introduces a two-step registration framework to overcome large initial offsets and a feature-compressing strategy to reduce the storage space of reference patches. Quantitative and practical experiments demonstrate that the proposed method performs well in terms of storage space, time efficiency, and registration accuracy.