Search Results - Inner Mongolia University Library

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Yongjie Guo, Siya Chen, Hongjian You. Affiliations: Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, China

ISBN (digital): 9798350390155

ISBN (print): 9798350390162

Continual semantic segmentation (CSS) has become a popular field; it aims to acquire new skills continually without catastrophically forgetting past knowledge. In CSS, we identify a severe imbalance between new classes and old classes, which biases the classifier weights toward new classes. In this paper, we address the continual semantic segmentation problem from the class-imbalance perspective via mask-based class rebalancing, which keeps the model from suffering catastrophic forgetting. More specifically, mask-based class rebalancing relies on a mask to combine resampling with reweighting ingeniously, which mitigates the classifier bias toward new classes. In addition, we propose a frequency knowledge distillation that leverages information from multiple frequency components to maintain the feature representation space for old classes. We demonstrate the effectiveness of our approach with an extensive evaluation on the Pascal-VOC 2012 and ADE20K datasets, where it significantly outperforms the state-of-the-art method.

Keywords: Semantic segmentation; Semantics

Light Field Compression Based on Implicit Neural Representation

Picture Coding Symposium, PCS

Authors: Henan Wang, Hanxin Zhu, Zhibo Chen. Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China

ISBN (print): 9781665492584

The light field, a new data representation format in multimedia, captures both the intensity and the direction of light rays. However, the additional angular information also brings a large volume of data. Classical coding methods do not effectively describe the relationship between different views, so redundancy remains. To address this problem, we propose a novel light field compression scheme based on implicit neural representation to reduce redundancy between views. We store the information of a light field image implicitly in a neural network and adopt model compression methods to further compress the implicit representation. Extensive experiments demonstrate the effectiveness of the proposed method, which achieves rate-distortion performance comparable to that of traditional methods and superior perceptual quality.

Keywords: Image coding; Redundancy; Pipelines; Neural networks; Rate-distortion; Light fields; Encoding

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

arXiv, 2024

Authors: Wang, Yonghui; Zhou, Wengang; Feng, Hao; Li, Houqiang. Affiliations: The CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, China

Over the past few years, the advancement of Multimodal Large Language Models (MLLMs) has attracted wide interest from researchers, leading to numerous innovations that enhance MLLMs' comprehension. In this paper, we present AdaptVision, a multimodal large language model specifically designed to dynamically process input images at varying resolutions. We hypothesize that the number of visual tokens the model requires depends on both the resolution and the content of the input image. Generally, natural images with a lower information density can be effectively interpreted by the model using fewer visual tokens at reduced resolutions. In contrast, images containing textual content, such as documents with rich text, require a higher number of visual tokens for accurate text interpretation because of their higher information density. Building on this insight, we devise a dynamic image partitioning module that adjusts the number of visual tokens according to the size and aspect ratio of images. This method mitigates the distortion that arises from resizing images to a uniform resolution and dynamically optimizes the visual tokens input to the LLMs. Our model is capable of processing images with resolutions up to 1008 × 1008. Extensive experiments across various datasets demonstrate that our method achieves impressive performance in handling vision-language tasks in both natural and text-related scenes. The source code and dataset are now publicly available at https://***/harrytea/AdaptVision. Copyright © 2024, The Authors. All rights reserved.

Keywords: Modeling languages
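
The dynamic image partitioning described in the abstract above can be illustrated with a minimal sketch. It assumes a hypothetical 336-pixel tile and at most 3 tiles per side (so the largest input is 1008 × 1008, as the abstract states). The selection rule, the function names and the tokens-per-tile figure are illustrative assumptions, not the authors' implementation.

```python
import math

TILE = 336        # assumed tile side in pixels (hypothetical; 3 * 336 = 1008)
MAX_TILES = 3     # assumed maximum number of tiles per side


def choose_grid(width, height, tile=TILE, max_tiles=MAX_TILES):
    """Pick a (cols, rows) tile grid for an image of the given size.

    The grid is the smallest one that covers the image at its native size,
    capped at max_tiles per side, so small images get few tiles and large
    or elongated images get more.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    cols = min(max_tiles, max(1, math.ceil(width / tile)))
    rows = min(max_tiles, max(1, math.ceil(height / tile)))
    return cols, rows


def visual_token_count(width, height, tokens_per_tile=144):
    """Token count if every tile yields tokens_per_tile tokens.

    tokens_per_tile is an illustrative assumption.
    """
    cols, rows = choose_grid(width, height)
    return cols * rows * tokens_per_tile
```

Under these assumptions a small natural image maps to a single tile, while a wide document scan maps to a 3 × 1 grid, so the token budget follows the size and aspect ratio instead of a fixed resize.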

Interpreting the Latent Space of GANs via Measuring Decoupling

IEEE Transactions on Artificial Intelligence, 2021, vol. 2, no. 1, pp. 58-70

Authors: Li, Ziqiang; Tao, Rentuo; Wang, Jie; Li, Fu; Niu, Hongjing; Yue, Mingdao; Li, Bin. Affiliations: The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application Systems, University of Science and Technology of China, Hefei 230052, China; The Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230052, China; The School of Information Science and Technology, University of Science and Technology of China, Hefei 230052, China; The College of Mechanical and Electrical Engineering, Suzhou University, Suzhou 234000, China

With the success of generative adversarial networks (GANs) in various real-world applications, the controllability and security of GANs have raised growing concern in the community. Specifically, understanding the latent space of GANs, i.e., obtaining a completely decoupled latent space, is essential for applications in some secure scenarios. At present, there is no quantitative method to measure the decoupling of latent space, which hinders the development of the community. In this article, we propose two methods to measure the sensitivity of latent dimensions: one is a sequential intervention method, and the other is an optimization-based method that measures the sensitivity in both the value and the direction. With these two methods, the decoupling of latent space can be measured by the sparsity of the sensitivity vector obtained. The effectiveness of the proposed methods has been verified by experiments on representative GANs. Code will be available at https://***/iceli1007/latent-analysis-of. Impact Statement: Generative adversarial networks (GANs) are a popular technology in image generation. Benefiting from the development of neural networks, GANs have been widely used in many real-world tasks, such as super-resolution, image translation, and image inpainting. However, the understanding of the generator in GANs as a map from latent space (represented as a multidimensional vector) to image space is incomplete. To make better use of GANs, more insight into the generation process is required. The motivation of this article is to develop methods to analyze the influence of each dimension of latent space on the generated results and to measure the decoupling of latent space; the knowledge obtained will be useful for the development of more precise controllable image generation technology. © 2021 IEEE.

Keywords: Generative adversarial networks
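
The idea of scoring decoupling by the sparsity of a sensitivity vector, as described in the abstract above, can be sketched as follows. This is one plausible reading only: sensitivity is estimated by perturbing one latent dimension at a time with finite differences, and sparsity is scored with the Hoyer measure. Both choices, and the toy attribute functions standing in for a generator, are assumptions made for illustration and are not taken from the article.

```python
import math


def sensitivity_vector(attribute, z, eps=1e-3):
    """Finite-difference sensitivity of a scalar attribute to each latent dim.

    Each dimension of z is perturbed in turn while the others stay fixed.
    """
    base = attribute(z)
    sens = []
    for i in range(len(z)):
        z_shift = list(z)
        z_shift[i] += eps
        sens.append(abs(attribute(z_shift) - base) / eps)
    return sens


def hoyer_sparsity(v):
    """Hoyer sparsity in [0, 1]: 1 for a one-hot vector, 0 for a uniform one."""
    n = len(v)
    l1 = sum(abs(x) for x in v)
    l2 = math.sqrt(sum(x * x for x in v))
    if n < 2 or l2 == 0.0:
        raise ValueError("need at least two dimensions and a non-zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


# Toy stand-ins for "an attribute of the generated image as a function of z".
def decoupled(z):
    return 2.0 * z[0]          # driven by a single latent dimension


def entangled(z):
    return sum(z)              # driven equally by every latent dimension


z0 = [0.1, -0.3, 0.7, 0.2]
s_dec = hoyer_sparsity(sensitivity_vector(decoupled, z0))
s_ent = hoyer_sparsity(sensitivity_vector(entangled, z0))
```

In this toy setting the attribute controlled by one dimension scores close to 1 and the attribute spread over all dimensions scores close to 0, which is the sense in which a sparser sensitivity vector indicates a more decoupled latent space.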

An autofocus network for multi-channel phase errors with application to tomoSAR imaging

IET International Radar Conference (IRC 2023)

Authors: Muhan Wang, Silin Gao, Zhe Zhang, Xiaolan Qiu. Affiliations: Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, People's Republic of China; Key Laboratory of Intelligent Aerospace Big Data Application Technology, Suzhou 215123, People's Republic of China

ISBN (digital): 9781837240982

Synthetic aperture radar (SAR) tomography (TomoSAR) has garnered significant attention due to its capability for three-dimensional (3D) reconstruction. Compressed sensing (CS) methods are widely employed to address the TomoSAR inversion challenge. Nevertheless, practical applications reveal phase errors among different channels, resulting in defocusing and blurring when relying solely on CS for 3D reconstruction. Current state-of-the-art autofocus techniques suffer from prohibitive computational complexity, limiting their applicability to large-scale 3D imaging. In pursuit of efficient TomoSAR 3D autofocusing, we propose ASAMP-Net, a deep unfolding network. Operating within a two-step framework, each layer comprises two stages: phase error estimation and iterative scattering coefficient reconstruction using the sparse adaptive matching pursuit (SAMP) algorithm. The phase error estimate is obtained through mathematical derivation, while the fixed sparsity and limited efficiency of conventional methods are mitigated through deep learning techniques. Simulation experiments and real data validation affirm the effectiveness and superiority of the proposed method.

A SAR Deceptive Jamming Suppression Method Based on PRI Variation Design and Multi-Channel Principle

IEEE Transactions on Geoscience and Remote Sensing, 2025

Authors: Shi, Haixu; Xu, Zhongqiu; Li, Guangzuo; Lin, Kuan; Liu, Tianqu; Hong, Wen. Affiliations: Chinese Academy of Sciences, Aerospace Information Research Institute, Beijing 100094, China; Key Laboratory of Spatial Information Processing and Application System Technology, Beijing 100094, China; Key Laboratory of Target Cognition and Application Technology, Beijing 100094, China; University of Chinese Academy of Sciences, School of Electronic, Electrical and Communication Engineering, Beijing 101499, China

Synthetic aperture radar (SAR) can be affected by various types of jamming during operation. Among them, the deceptive jamming generated by digital radio frequency memory (DRFM) jammers poses a serious threat to SAR imaging by creating highly realistic false targets. Moreover, with advancements in deceptive jamming technology, the generation speed of deceptive jamming has increased, rendering existing methods less effective. To address this issue, an anti-deceptive jamming method based on pulse repetition interval (PRI) variation design and the multi-channel principle is proposed to mitigate the effects of deceptive jamming. First, a PRI variation strategy that does not cause the loss of echo signals in the imaging area is designed. When this strategy is used for imaging, deceptive jamming signals are dispersed across different ranges, resulting in preliminary suppression of the jamming. Subsequently, after azimuth non-uniform sampling reconstruction and range processing, most of the jamming signals are suppressed because of the azimuth timing differences between SAR and jamming signals. However, when the jammer uses specific retransmission intervals, such as the average PRI of the PRI sequence, the jamming signals may be concentrated at certain ranges, retaining some coherence and posing a threat to SAR imaging. To overcome this challenge, a residual jamming detection and suppression algorithm based on the multi-channel principle is proposed, which can detect and filter out the channels affected by jamming. Finally, an azimuth sparse reconstruction is introduced for azimuth processing. Since the anti-jamming principle of this method relies on the differences in azimuth timing between SAR and jamming, it can suppress deceptive jamming even when the deceptive jamming is generated rapidly, which some other anti-deceptive jamming methods cannot achieve. Simulations of SAR imaging under deceptive jamming conditions are conducted for point target scene and complex target

Keywords: Synthetic aperture radar

On-orbit geometric rectification for micro-satellite based on Lightweight feature database

IEEE International Symposium on Geoscience and Remote Sensing (IGARSS)

Authors: Linhui Wang, Yuming Xiang, Feng Wang, Yuxin Hu, Hongjian You. Affiliations: Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences; School of Electronics, Electrical and Communication Engineering, University of Chinese Academy of Sciences

On-orbit processing is becoming more prevalent due to its ability to exploit satellite resources efficiently. On-orbit geometric rectification improves positioning accuracy for follow-up tasks such as object detection or geometric calibration, while avoiding a heavy burden on downlink bandwidth and the associated time delay. However, existing rectification methods face some challenges: the hardware resources onboard satellites are restricted, and geographic positioning is often inaccurate. In this article, we propose a novel method designed for on-orbit rectification. The proposed method introduces a two-step registration framework to overcome large initial offsets and a feature-compressing strategy to reduce the storage space of reference patches. Quantitative and practical experiments demonstrate that the proposed method performs well in terms of storage space, time efficiency, and registration accuracy.

StyleAM: Perception-Oriented Unsupervised Domain Adaption for Non-reference Image Quality Assessment

arXiv, 2022

Authors: Lu, Yiting; Li, Xin; Liu, Jianzhao; Chen, Zhibo. Affiliation: The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Deep neural networks (DNNs) have shown great potential in non-reference image quality assessment (NR-IQA). However, annotation for NR-IQA is labor-intensive and time-consuming, which severely limits the application of DNNs, especially to authentic images. To relieve the dependence on quality annotation, some works have applied unsupervised domain adaptation (UDA) to NR-IQA. However, these methods ignore that the alignment space used in classification is suboptimal, since the space is not specifically designed for perception. To solve this challenge, we propose StyleAM, an effective perception-oriented unsupervised domain adaptation method for NR-IQA, which transfers sufficient knowledge from label-rich source domain data to label-free target domain images via Style Alignment and Mixup. Specifically, we find a more compact and reliable space for perception-oriented UDA, i.e., the feature style space, based on the observation that the feature style (i.e., the mean and variance) of the deep layer in DNNs is closely associated with the quality score in NR-IQA. Therefore, we propose to align the source and target domains in this more perception-oriented space, the feature style space, to reduce the interference from other quality-irrelevant feature factors. Furthermore, to increase the consistency between the quality score and its feature style, we also propose a novel feature augmentation strategy, Style Mixup, which mixes the feature styles (i.e., the mean and variance) before the last layer of DNNs together with mixing their labels. Extensive experimental results on two typical cross-domain settings (i.e., synthetic to authentic, and multiple distortions to one distortion) demonstrate the effectiveness of our proposed StyleAM on NR-IQA. Copyright © 2022, The Authors. All rights reserved.

Keywords: Alignment
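
The Style Mixup step described in the abstract above mixes feature statistics (mean and variance) together with the labels. A minimal sketch follows, assuming a 1-D feature vector per image and an AdaIN-style re-scaling to apply the mixed statistics. The re-scaling step, the function names and the toy numbers are assumptions for illustration, not the authors' implementation.

```python
import math


def style(feat):
    """Return (mean, variance) of a 1-D feature vector (population variance)."""
    n = len(feat)
    mean = sum(feat) / n
    var = sum((x - mean) ** 2 for x in feat) / n
    return mean, var


def style_mixup(feat_a, score_a, feat_b, score_b, lam, eps=1e-8):
    """Mix the styles of two features and their quality scores with weight lam.

    feat_a is normalised and then re-scaled with the mixed statistics
    (an AdaIN-style step, assumed here for illustration).
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    mean_a, var_a = style(feat_a)
    mean_b, var_b = style(feat_b)
    mean_mix = lam * mean_a + (1.0 - lam) * mean_b
    var_mix = lam * var_a + (1.0 - lam) * var_b
    std_a = math.sqrt(var_a + eps)
    std_mix = math.sqrt(var_mix + eps)
    mixed = [(x - mean_a) / std_a * std_mix + mean_mix for x in feat_a]
    score_mix = lam * score_a + (1.0 - lam) * score_b
    return mixed, score_mix


# Toy features and quality scores (made-up values).
feat_a, feat_b = [1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 14.0, 14.0]
mixed, score = style_mixup(feat_a, 0.8, feat_b, 0.2, lam=0.5)
```

With `lam=0.5` the mixed feature keeps the shape of `feat_a` but carries the averaged mean and variance of the two inputs, and the label is the average of the two scores, which keeps the score consistent with the style it is paired with.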

Source-free Unsupervised Domain Adaptation for Blind Image Quality Assessment

arXiv, 2022

Authors: Liu, Jianzhao; Li, Xin; An, Shukun; Chen, Zhibo. Affiliation: The CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

Existing learning-based methods for blind image quality assessment (BIQA) depend heavily on large amounts of annotated training data and usually suffer severe performance degradation under domain/distribution shift. Thanks to the development of unsupervised domain adaptation (UDA), some works attempt to transfer knowledge from a label-sufficient source domain to a label-free target domain under domain shift. However, UDA requires the coexistence of source and target data, and access to the source data may be impractical because of privacy or storage issues. In this paper, we take the first step towards source-free unsupervised domain adaptation (SFUDA) for BIQA, in a simple yet efficient manner, to tackle the domain shift without access to the source data. Specifically, we cast the quality assessment task as a rating distribution prediction problem. Based on the intrinsic properties of BIQA, we present a group of well-designed self-supervised objectives to guide the adaptation of the BN affine parameters towards the target domain. Among them, minimizing the prediction entropy and maximizing the batch prediction diversity aim to encourage more confident results while avoiding the trivial solution. In addition, based on the observation that the IQA rating distribution of a single image follows a Gaussian distribution, we apply Gaussian regularization to the predicted rating distribution to make it more consistent with the nature of human scoring. Extensive experimental results under cross-domain scenarios demonstrate the effectiveness of our proposed method in mitigating the domain shift. Copyright © 2022, The Authors. All rights reserved.

Keywords: Forecasting
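
The three self-supervised objectives named in the abstract above (prediction entropy, batch prediction diversity, Gaussian regularization) can be sketched on discrete rating distributions. The exact loss forms are not given in the abstract, so the formulas below are assumptions: Shannon entropy, entropy of the batch-averaged prediction, and a KL divergence to a discretised Gaussian with matched mean and standard deviation. The weights are illustrative.

```python
import math


def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(q * math.log(q + eps) for q in p)


def mean_prediction_entropy(batch):
    """Average per-image entropy; minimising it favours confident ratings."""
    return sum(entropy(p) for p in batch) / len(batch)


def batch_diversity(batch):
    """Entropy of the batch-averaged prediction; maximising it discourages
    the trivial solution where every image gets the same rating bin."""
    bins = len(batch[0])
    avg = [sum(p[k] for p in batch) / len(batch) for k in range(bins)]
    return entropy(avg)


def gaussian_kl(p, eps=1e-12):
    """KL(p || g), where g is a discretised Gaussian with p's mean and std."""
    ks = range(len(p))
    mu = sum(k * q for k, q in zip(ks, p))
    var = sum((k - mu) ** 2 * q for k, q in zip(ks, p))
    sigma = math.sqrt(var) + 1e-6
    g = [math.exp(-0.5 * ((k - mu) / sigma) ** 2) for k in ks]
    z = sum(g)
    g = [x / z for x in g]
    return sum(q * math.log((q + eps) / (x + eps)) for q, x in zip(p, g))


def adaptation_loss(batch, w_div=1.0, w_gauss=1.0):
    """Combined objective to minimise (the weights are illustrative)."""
    reg = sum(gaussian_kl(p) for p in batch) / len(batch)
    return (mean_prediction_entropy(batch)
            - w_div * batch_diversity(batch)
            + w_gauss * reg)
```

A batch in which every image collapses onto the same bin has zero diversity, so the diversity term penalises it relative to a batch of equally confident but different predictions. A bimodal rating distribution incurs a larger Gaussian penalty than a unimodal, bell-shaped one.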

Remote Sensing Image Matching and Localization Method Based on the MySQL Multi-Feature Control Point Database

International Conference on Image, Vision and Computing (ICIVC)

Authors: Zhao Zilu, Wang Feng, You Hongjian, Li Peifeng, Zhang Tingtao. Affiliations: Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China; Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Beijing, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, China

ISBN (digital): 9798350385991

ISBN (print): 9798350386004

With the increase in the number of remote sensing satellites and imaging modes, the volume of acquired remote sensing image data has greatly increased. Effective and stable geometric positioning of remote sensing images is the foundation of remote sensing applications. This paper proposes a remote sensing image matching and positioning method based on a multi-feature control point database in MySQL. First, a feature control point database is constructed in MySQL using multiple feature methods. Subsequently, the target image is matched from coarse to fine using region features and point features in the feature control point database. Experimental results on three target remote sensing images show that the coarse-to-fine matching method based on the MySQL multi-feature database achieves good geometric positioning, with a positioning accuracy of around 0.5 pixels.

Keywords: Location awareness; Accuracy; Satellites; Databases; Image matching; Imaging; Sensors; Remote sensing
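
The coarse-to-fine lookup described in the abstract above (region features first, then point features) can be sketched against a small database. The paper uses MySQL; to keep the example self-contained, this sketch substitutes Python's built-in `sqlite3` module. The table names, columns, descriptors and coordinates are all hypothetical and only illustrate the two-stage query pattern.

```python
import json
import sqlite3


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_db():
    """Create an in-memory database with hypothetical tables and toy rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE regions (region_id INTEGER PRIMARY KEY, feature TEXT)")
    db.execute(
        "CREATE TABLE control_points ("
        "point_id INTEGER PRIMARY KEY, region_id INTEGER, "
        "lon REAL, lat REAL, descriptor TEXT)"
    )
    db.executemany(
        "INSERT INTO regions VALUES (?, ?)",
        [(1, json.dumps([0.9, 0.1])), (2, json.dumps([0.1, 0.9]))],
    )
    db.executemany(
        "INSERT INTO control_points VALUES (?, ?, ?, ?, ?)",
        [
            (10, 1, 116.30, 39.90, json.dumps([1.0, 0.0, 0.0])),
            (11, 1, 116.31, 39.91, json.dumps([0.0, 1.0, 0.0])),
            (20, 2, 111.70, 40.80, json.dumps([1.0, 0.0, 0.0])),
        ],
    )
    return db


def coarse_to_fine(db, region_feature, point_descriptor):
    """Return (point_id, lon, lat) of the best-matching control point."""
    # Coarse step: pick the region whose stored feature is closest.
    regions = db.execute("SELECT region_id, feature FROM regions").fetchall()
    region_id = min(
        regions, key=lambda r: sq_dist(json.loads(r[1]), region_feature)
    )[0]
    # Fine step: compare point descriptors only inside that region.
    rows = db.execute(
        "SELECT point_id, lon, lat, descriptor FROM control_points "
        "WHERE region_id = ?",
        (region_id,),
    ).fetchall()
    best = min(rows, key=lambda r: sq_dist(json.loads(r[3]), point_descriptor))
    return best[0], best[1], best[2]


db = build_db()
match = coarse_to_fine(db, [0.8, 0.2], [0.1, 0.9, 0.0])
```

Restricting the fine comparison to one region keeps the number of descriptor comparisons small, which is the practical point of a coarse-to-fine search over a large control point table.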
