Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Li, Ziqiang; Wu, Yi; Wang, Chaoyue; Rui, Xue; Li, Bin. Affiliations: Nanjing University of Information Science and Technology, Nanjing, China; University of Science and Technology of China, Hefei, China; University of Sydney, Australia; CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

3D-aware image generation necessitates extensive training data to ensure stable training and mitigate the risk of overfitting. This paper first considers a novel task known as One-shot 3D Generative Domain Adaptation (GDA), aimed at transferring a pre-trained 3D generator from one domain to a new one, relying solely on a single reference image. One-shot 3D GDA is characterized by the pursuit of specific attributes, namely, high fidelity, large diversity, cross-domain consistency, and multi-view consistency. Within this paper, we introduce 3D-Adapter, the first one-shot 3D GDA method, for diverse and faithful generation. Our approach begins by judiciously selecting a restricted weight set for fine-tuning, and subsequently leverages four advanced loss functions to facilitate adaptation. An efficient progressive fine-tuning strategy is also implemented to enhance the adaptation process. The synergy of these three technological components empowers 3D-Adapter to achieve remarkable performance, substantiated both quantitatively and qualitatively, across all desired properties of 3D GDA. Furthermore, 3D-Adapter seamlessly extends its capabilities to zero-shot scenarios, and preserves the potential for crucial tasks such as interpolation, reconstruction, and editing within the latent space of the pre-trained generator. Code will be available at https://***/iceli1007/3D-Adapter. © 2024, CC BY.

Keywords: Generative adversarial networks

SFDE-net: A Spatial-Frequency Domain Feature Enhancement Network for Cloud Detection

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Baotong Su; Siyan Li; Wenguang Zheng; Yao Chen. Affiliations: The School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, China; Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, and Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China

ISBN (digital): 9798350390155

ISBN (print): 9798350390162

Cloud detection is a crucial step in the preprocessing of satellite remote sensing images. Existing methods tend to misjudge specific scenarios: they struggle to distinguish thin clouds from the background and to recover missing cloud boundaries. To solve this problem, we designed a novel Spatial-Frequency Domain Feature Enhancement Block (SFDE) embedded in a U-shaped network called SFDE-net. SFDE consists of three units: the Dual Frequency Feature Unit (DFF), the Spatial Domain Feature Unit (SDF), and the Cross-Domain Feature Fusion Unit (CDF). DFF globally learns the boundaries and overall structure of clouds in the frequency domain, SDF captures fine-grained information in the spatial domain, and CDF adaptively fuses features from both DFF and SDF. Our method's effectiveness was evaluated on two public datasets, GF-1 WFV and LandSat8. Extensive experiments demonstrated that the proposed SFDE-net achieved high detection accuracy and outperformed several state-of-the-art methods.

Keywords: Earth; Accuracy; Satellites; Artificial satellites; Fuses; Frequency-domain analysis; Merging

1-km/Daily Land Surface Temperature Optimized Dataset for the Qinghai-Tibet Plateau Based on MODIS Data (2000-2020)

Global Change Data Repository (Chinese and English), 2023, No. 10, pp. 4-7

Authors: XU Xunpeng; ZHANG Yu; JI Luyan; TANG Hairong. Affiliations: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China; the Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China; the School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China

Remote sensing data has strong correlation and continuity in space and time, so time series remote sensing images have low-rank *** this dataset, we repaired images using low-rank tensor ***, we preprocessed the MODIS land surface temperature data and employed spatio-temporal interpolation to initially fill in the missing values caused by cloud ***, we treated the land surface temperature time series data as a third-order spatio-temporal tensor and introduced Fourier transform on the time dimension to convert it into a space-frequency *** performing singular value decomposition and Gaussian low-pass filtering on this tensor, followed by inverse Fourier transform, we obtained a space-time ***, we further optimized the missing tensor using the alternating direction method of *** data accuracy using the method was validated through simulation experiments, where artificial masks were added and subsequently *** resulting mean absolute error (MAE) falls within the range of 2.1 ℃ to 4.9 ℃. This dataset includes the following data for the Tibetan Plateau on a daily basis for the years 2000-2020: (1) the optimized surface temperature data for the cloud-shaded regions of the MOD11A1, MYD11A1 products (MOD11A1_QTP_PART, MYD11A1_QTP_PART); (2) optimized MOD11A1/MYD11A1 data (MOD11A1_QTP_TEMP, MYD11A1_QTP_TEMP); and (3) original MOD11A1 and MYD11A1 products (MOD11A1_QTP_ORIGIN, MOD11A1_QTP_ORIGIN). All data have a spatial resolution of 1 km and are stored in an integer data format, with pixel value representing the thermodynamic temperature of the surface with a scale factor of 0.02 in *** dataset is archived *** format, and consists of 43833 data files with data size of 143 GB (compressed into 21 files with 138 GB).
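The validation step in this abstract (hide known pixels behind an artificial mask, reconstruct them, then report the mean absolute error) can be illustrated with a small sketch. This is not the authors' code: the function name `masked_mae`, the nested-list grid layout, and the toy values are all assumptions made for illustration.

```python
def masked_mae(reference, reconstructed, mask):
    """Mean absolute error over the pixels where mask is True.

    reference, reconstructed: 2-D lists of temperatures (same shape).
    mask: 2-D list of booleans; True marks an artificially hidden pixel.
    """
    errors = [
        abs(ref - rec)
        for ref_row, rec_row, mask_row in zip(reference, reconstructed, mask)
        for ref, rec, hidden in zip(ref_row, rec_row, mask_row)
        if hidden
    ]
    if not errors:
        raise ValueError("mask selects no pixels")
    return sum(errors) / len(errors)


# Toy 2x3 temperature grids; the values are made up for illustration.
reference = [[10.0, 12.0, 14.0], [11.0, 13.0, 15.0]]
reconstructed = [[10.0, 13.0, 14.0], [11.0, 10.0, 15.0]]
mask = [[False, True, False], [False, True, False]]

mae = masked_mae(reference, reconstructed, mask)
print(mae)
```

Only the masked pixels contribute, so the score reflects how well hidden values are recovered rather than how well untouched values are copied through.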

Continual Semantic Segmentation via Mask-Based Class Rebalancing

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Yongjie Guo; Siya Chen; Hongjian You. Affiliations: Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, China

ISBN (digital): 9798350390155

ISBN (print): 9798350390162

Continual semantic segmentation (CSS) has become a popular field that aims to acquire new skills continually without catastrophically forgetting past knowledge. In CSS, we identify a severe imbalance between new classes and old classes, which biases the classifier weights toward new classes. In this paper, we address the continual semantic segmentation problem from the class imbalance perspective via mask-based class rebalancing, which keeps the model from suffering catastrophic forgetting. More specifically, mask-based class rebalancing uses a mask to combine resampling with reweighting, which mitigates the classifier bias toward new classes. In addition, we propose a frequency knowledge distillation that leverages information from multiple frequency components to maintain the feature representation space for old classes. We demonstrate the effectiveness of our approach with an extensive evaluation on the Pascal-VOC 2012 and ADE20K datasets, significantly outperforming the state-of-the-art method.

Keywords: Semantic segmentation; Semantics

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

arXiv, 2024

Authors: Wang, Yonghui; Zhou, Wengang; Feng, Hao; Li, Houqiang. Affiliations: The CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China

Over the past few years, the advancement of Multimodal Large Language Models (MLLMs) has captured the wide interest of researchers, leading to numerous innovations to enhance MLLMs’ comprehension. In this paper, we present AdaptVision, a multimodal large language model specifically designed to dynamically process input images at varying resolutions. We hypothesize that the requisite number of visual tokens for the model is contingent upon both the resolution and content of the input image. Generally, natural images with a lower information density can be effectively interpreted by the model using fewer visual tokens at reduced resolutions. In contrast, images containing textual content, such as documents with rich text, necessitate a higher number of visual tokens for accurate text interpretation due to their higher information density. Building on this insight, we devise a dynamic image partitioning module that adjusts the number of visual tokens according to the size and aspect ratio of images. This method mitigates distortion effects that arise from resizing images to a uniform resolution and dynamically optimizes the visual tokens input to the LLM. Our model is capable of processing images with resolutions up to 1008 × 1008. Extensive experiments across various datasets demonstrate that our method achieves impressive performance in handling vision-language tasks in both natural and text-related scenes. The source code and dataset are now publicly available at https://***/harrytea/AdaptVision. Copyright © 2024, The Authors. All rights reserved.
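The idea of choosing the number of image tiles from an image's size and aspect ratio can be sketched as follows. This is a hypothetical illustration, not AdaptVision's partitioning module: the tile size of 336 pixels and the limit of 3 tiles per side are assumptions, picked only because 3 × 336 matches the stated 1008 × 1008 maximum, and `choose_grid` and `resized_size` are invented names.

```python
import math


def choose_grid(width, height, tile=336, max_per_side=3):
    """Pick a (cols, rows) tile grid that covers the image.

    tile and max_per_side are assumed values, not taken from the paper.
    Small images get a single tile; large ones are capped at the maximum.
    """
    cols = min(max_per_side, max(1, math.ceil(width / tile)))
    rows = min(max_per_side, max(1, math.ceil(height / tile)))
    return cols, rows


def resized_size(width, height, tile=336, max_per_side=3):
    """Target resolution after snapping the image to its tile grid."""
    cols, rows = choose_grid(width, height, tile, max_per_side)
    return cols * tile, rows * tile


for size in [(300, 200), (700, 300), (1008, 1008), (2000, 300)]:
    print(size, choose_grid(*size), resized_size(*size))
```

In this sketch a wide, short image receives a wide, short grid, so it is not squeezed into a square before encoding, and the number of tiles (hence visual tokens) grows with image size.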

Keywords: Modeling languages

Light Field Compression Based on Implicit Neural Representation

Picture Coding Symposium, PCS

Authors: Henan Wang; Hanxin Zhu; Zhibo Chen. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN (print): 9781665492584

Light field, as a new data representation format in multimedia, can capture both the intensity and the direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods do not effectively describe the relationship between different views, which leaves residual redundancy. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancies between views. We store the information of a light field image implicitly in a neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments have demonstrated the effectiveness of our proposed method, which achieves comparable rate-distortion performance as well as superior perceptual quality over traditional methods.

Keywords: Image coding; Redundancy; Pipelines; Neural networks; Rate-distortion; Light fields; Encoding

Global Homography Motion Compensation for Versatile Video Coding

IEEE Visual Communications and Image Processing (VCIP)

Authors: Yao Li; Zhuoyuan Li; Li Li; Dong Liu; Houqiang Li. Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN (print): 9781665475938

In Versatile Video Coding (VVC), local affine motion compensation (LAMC) is adopted to handle complex motions, such as rotation and zooming. However, LAMC handles global motion inefficiently for two reasons. First, the use of LAMC may incur extra bit cost for the affine motion model parameters. Second, the precision of LAMC is restricted by the MV precision of the control points. Therefore, in this paper, we propose a global homography motion compensation (GHMC) framework to better characterize the global motion. For each coding block, an extra mode is added to perform motion compensation based on an 8-parameter global homography motion model. In addition, an extrapolation scheme is designed to derive the parameters from reference frames to save the bit cost for signaling them. The proposed framework is implemented in the VVC reference software VTM-6.0. Experimental results show that, on average, 0.69% and 0.66% BD-rate reduction is achieved under Low Delay P and Low Delay B configurations, respectively, for sequences with rich complex global motions.
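To make the "8-parameter global homography motion model" concrete, the sketch below maps a pixel position through a homography whose ninth matrix entry is fixed to 1, which is the common convention that leaves eight free parameters. The parameter ordering, the function name `apply_homography`, and the example values are assumptions for illustration; the abstract does not give the authors' exact parameterization.

```python
def apply_homography(params, x, y):
    """Map (x, y) through an 8-parameter homography.

    params = (h0, ..., h7) fills the 3x3 matrix row by row with the last
    entry fixed to 1 (an assumed, conventional normalization):
        [h0 h1 h2]
        [h3 h4 h5]
        [h6 h7  1]
    """
    h0, h1, h2, h3, h4, h5, h6, h7 = params
    w = h6 * x + h7 * y + 1.0  # projective denominator
    if w == 0.0:
        raise ZeroDivisionError("point maps to infinity")
    return (h0 * x + h1 * y + h2) / w, (h3 * x + h4 * y + h5) / w


identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0)
shift = (1.0, 0.0, 2.0, 0.0, 1.0, -3.0, 0.0, 0.0)        # pure translation
perspective = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.5, 0.0)   # h6 adds perspective

print(apply_homography(identity, 5.0, 7.0))
print(apply_homography(shift, 5.0, 7.0))
print(apply_homography(perspective, 2.0, 4.0))
```

With `h6 = h7 = 0` the mapping reduces to an affine model; the two extra parameters are what let a single global model describe perspective change across the whole frame.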

Keywords: Video coding; Extrapolation; Adaptation models; Costs; Image coding; Visual communication; Motion compensation

An autofocus network for multi-channel phase errors with application to tomoSAR imaging

IET International Radar Conference (IRC 2023)

Authors: Muhan Wang; Silin Gao; Zhe Zhang; Xiaolan Qiu. Affiliations: Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, People's Republic of China; Key Laboratory of Intelligent Aerospace Big Data Application Technology, Suzhou 215123, People's Republic of China

ISBN (digital): 9781837240982

Synthetic aperture radar (SAR) tomography (TomoSAR) has garnered significant attention due to its capability for three-dimensional reconstruction. Compressed sensing (CS) methods are widely employed to address the TomoSAR inversion challenge. Nevertheless, practical applications reveal phase errors among different channels, resulting in defocusing and blurring when relying solely on CS for 3D reconstruction. Current state-of-the-art autofocus techniques suffer from prohibitive computational complexity, limiting their applicability to large-scale 3D imaging. In pursuit of efficient TomoSAR 3D autofocusing, we propose ASAMP-Net, an innovative deep unfolding network. Operating within a two-step framework, each layer comprises two stages: phase error estimation and iterative scattering coefficient reconstruction using the sparse adaptive matching pursuit (SAMP) algorithm. Additionally, phase error estimation is obtained through mathematical derivation, while challenges associated with fixed sparsity and limited efficiency in conventional methods are mitigated through deep learning techniques. Simulation experiments and real data validation affirm the effectiveness and superiority of the proposed method.

StyleAM: Perception-Oriented Unsupervised Domain Adaption for Non-reference Image Quality Assessment

arXiv, 2022

Authors: Lu, Yiting; Li, Xin; Liu, Jianzhao; Chen, Zhibo. Affiliation: The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Deep neural networks (DNNs) have shown great potential in non-reference image quality assessment (NR-IQA). However, the annotation of NR-IQA is labor-intensive and time-consuming, which severely limits their application, especially for authentic images. To relieve the dependence on quality annotation, some works have applied unsupervised domain adaptation (UDA) to NR-IQA. However, these methods ignore that the alignment space used in classification is suboptimal, since the space is not elaborately designed for perception. To solve this challenge, we propose an effective perception-oriented unsupervised domain adaptation method, StyleAM, for NR-IQA, which transfers sufficient knowledge from label-rich source domain data to label-free target domain images via Style Alignment and Mixup. Specifically, we find a more compact and reliable space, i.e., the feature style space, for perception-oriented UDA based on an interesting observation: the feature style (i.e., the mean and variance) of the deep layer in DNNs is exactly associated with the quality score in NR-IQA. Therefore, we propose to align the source and target domains in a more perception-oriented space, i.e., the feature style space, to reduce the intervention from other quality-irrelevant feature factors. Furthermore, to increase the consistency between the quality score and its feature style, we also propose a novel feature augmentation strategy, Style Mixup, which mixes the feature styles (i.e., the mean and variance) before the last layer of DNNs together with mixing their labels. Extensive experimental results on two typical cross-domain settings (i.e., synthetic to authentic, and multiple distortions to one distortion) have demonstrated the effectiveness of our proposed StyleAM on NR-IQA. Copyright © 2022, The Authors. All rights reserved.
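The Style Mixup idea described here (interpolate the mean and variance of two feature vectors and interpolate their quality labels with the same weight) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the StyleAM implementation: features are plain 1-D lists, the mixed style is re-applied to the first sample's normalized features, and `style_mixup`, `mean_var`, and `eps` are hypothetical names.

```python
import math


def mean_var(values):
    """Mean and (population) variance of a 1-D feature vector."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var


def style_mixup(feat_a, feat_b, score_a, score_b, lam, eps=1e-5):
    """Mix the feature styles and labels of two samples with weight lam.

    Assumption: the mixed mean/variance is applied to sample A's
    normalized features; the abstract does not specify this detail.
    """
    mu_a, var_a = mean_var(feat_a)
    mu_b, var_b = mean_var(feat_b)
    mu_mix = lam * mu_a + (1.0 - lam) * mu_b
    var_mix = lam * var_a + (1.0 - lam) * var_b
    sd_a = math.sqrt(var_a + eps)
    sd_mix = math.sqrt(var_mix + eps)
    mixed = [(v - mu_a) / sd_a * sd_mix + mu_mix for v in feat_a]
    score = lam * score_a + (1.0 - lam) * score_b
    return mixed, score


feat_a, feat_b = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
mixed, score = style_mixup(feat_a, feat_b, 0.2, 0.8, lam=0.5)
print(mixed, score)
```

Because the label is interpolated with the same weight as the style statistics, each augmented feature stays paired with a quality score that matches its mixed style.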

Keywords: Alignment

On-orbit geometric rectification for micro-satellite based on Lightweight feature database

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Linhui Wang; Yuming Xiang; Feng Wang; Yuxin Hu; Hongjian You. Affiliations: Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences; School of Electronics, Electrical and Communication Engineering, University of Chinese Academy of Sciences

On-orbit processing is becoming more prevalent due to its ability to efficiently exploit satellite resources. On-orbit geometric rectification improves positioning accuracy for follow-up tasks such as object detection or geometric calibration, while avoiding a heavy burden on downlink bandwidth and the associated time delay. However, existing rectification methods face two challenges: the hardware resources onboard satellites are restricted, and geographic positioning is often inaccurate. In this article, we propose a novel method designed for on-orbit rectification. The proposed method introduces a two-step registration framework to overcome large initial offsets and a feature-compressing strategy to reduce the storage space of reference patches. Quantitative and practical experiments demonstrate that the proposed method performs well in terms of storage space, time efficiency, and registration accuracy.
