Search Results - Inner Mongolia University Library

SSRN, 2024

Authors: Lao, Chengxue; Luo, Shengda; Leung, Alex Po; Tsoi, Ah Chung; Bugiolacchi, Roberto; Zhang, Jianguo. Affiliations: School of Computer Science and Engineering Faculty of Innovation Engineering Macau University of Science and Technology Avenida Wai Long Taipa 999078 China Department of Computer Science and Engineering Southern University of Science and Technology Shenzhen 518055 China Department of Physics Faculty of Science The University of Hong Kong Pok Fu Lam Hong Kong School of Computing and Information Technology University of Wollongong Northfields Ave. Wollongong NSW 2522 Australia Space Science Institute State Key Laboratory Macau University of Science and Technology Avenida Wai Long Taipa 999078 China

In conventional few-shot learning approaches, masked image modeling paradigms such as masked autoencoders are typically used as feature extractors, followed by classifiers. Traditional masked autoencoders depend on standard transformers for encoding, neglecting the inherent inductive biases of CNNs, which are pivotal for few-shot learning efficiency. In response, this paper introduces the Modernized Convolutional Network with Efficient Channel Attention (ConvNeXt-ECA), an effective encoder for masked autoencoders that modernizes the vanilla ResNet through the integration of transformer-inspired elements. This modernization process emphasizes merging the architectural advantages of transformers with the inductive biases inherent in CNNs, upgrading crucial aspects of ResNet, including computation stages, the design of residual blocks, activation functions, and channel interaction ***, this study proposes an innovative few-shot learning classifier that combines masked autoencoders with ConvNeXt-ECA and NeXtVLAD. Our classifier utilizes strategic masking operations and soft focal loss to address supervision bias and biases towards simple features, respectively. In our experiments, the proposed methods are benchmarked against top existing methods across five standard few-shot learning datasets, consistently outperforming rival models and demonstrating unmatched performance on each *** source code will be made publicly available. © 2024, The Authors. All rights reserved.

Keywords: Convolutional neural networks
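The abstract above names a "soft focal loss" for countering bias towards simple features but does not define it. As a hedged point of reference only, the sketch below shows the standard focal loss, which down-weights samples the model already classifies confidently. The function name, the default `gamma`, and the sample probabilities are illustrative assumptions, not the paper's formulation.

```python
import math

def focal_loss(probs, target, gamma=2.0):
    """Standard focal loss for one sample (not the paper's soft variant).

    probs  : predicted class probabilities
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 recovers cross-entropy
    """
    p_t = probs[target]
    # (1 - p_t)^gamma shrinks the loss of samples that are already easy.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

easy = focal_loss([0.9, 0.1], target=0)  # confidently correct sample
hard = focal_loss([0.3, 0.7], target=0)  # misclassified sample
print(f"easy={easy:.4f} hard={hard:.4f}")
```

With `gamma=0` the expression reduces to ordinary cross-entropy, which is a convenient sanity check.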


Channel and space-based joint rate allocation algorithm


International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Dayong Wang, Chao Yuan, Yu Sun, Xin Lu, Hui Guo, Frederic Dufaux, Ce Zhu. Affiliations: Key Laboratory of Big Data Intelligent Computing Chongqing University of Posts and Telecommunications Guangxi Key Laboratory of Machine Vision and Intelligent Control Wuzhou University Chongqing Key Laboratory of Image Cognition Chongqing University of Posts and Telecommunications China Department of Computer Science University of Central Arkansas Faculty of Computing Engineering and Media (CEM) De Montfort University UK Université Paris-Saclay CNRS CentraleSupélec Laboratoire Des Signaux et Systèmes France School of Information and Communication Engineering University of Electronic Science and Technology of China

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

Rate control is a critical component of image and video compression. Particularly under limited network bandwidth, bitrate control is essential to ensure efficient image transmission by effectively allocating channel resources. In this research, since both the channel and spatial dimensions affect rate allocation, we first propose a joint channel-wise and spatial-wise quantization scheme to determine optimal quantization parameters. Subsequently, we develop a quantization step estimation network that produces the parameters needed to allocate rate efficiently according to a target rate. Experiments demonstrate that our algorithm significantly improves compressed image quality with minimal bitrate distortion and achieves accurate rate control with nearly 3% average bitrate error.

Keywords: Image quality; Quantization (signal); Image coding; Bit rate; Signal processing algorithms; Estimation; Rate-distortion; Video compression; Resource management; Speech processing
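To make the link between quantization step and rate concrete, here is a minimal sketch that quantizes two hypothetical channels with different step sizes and uses empirical entropy as a rough proxy for coded rate. The channel names, values, and step sizes are invented for illustration, and the sketch does not reproduce the paper's estimation network.

```python
import math
from collections import Counter

def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer index."""
    return [round(v / step) for v in values]

def entropy_bits(symbols):
    """Empirical entropy in bits per symbol, a rough proxy for coded rate."""
    counts = Counter(symbols)
    n = len(symbols)
    return sum(c / n * math.log2(n / c) for c in counts.values())

# Hypothetical latent values for two channels of one image.
channels = {
    "smooth": [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.3],
    "detail": [1.0, -2.0, 3.0, -4.0, 2.0, -1.0, 4.0, -3.0],
}

for name, values in channels.items():
    for step in (0.1, 1.0, 8.0):
        rate = entropy_bits(quantize(values, step))
        print(f"{name:6s} step={step:<4} rate={rate:.2f} bits/symbol")
```

A coarser step collapses more values onto the same index, so the estimated rate falls. Choosing a different step per channel (and, in a joint scheme, per spatial region) is one way to spend a bit budget where it matters most.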


Deep Robust Reversible Watermarking


arXiv, 2025

Authors: Chen, Jiale; Wang, Wei; Shi, Chongyang; Dong, Li; Li, Yuanman; Hu, Xiping. Affiliations: School of Computer Science Beijing Institute of Technology Beijing 100081 China Guangdong-Hong Kong-Macao Joint Laboratory for Emotional Intelligence and Pervasive Computing Artificial Intelligence Research Institute Shenzhen MSU-BIT University Shenzhen 518172 China School of Medical Technology Beijing Institute of Technology Beijing 100081 China College of Electronics and Information Engineering Shenzhen University Shenzhen 518060 China Department of Computer Science Faculty of Electrical Engineering and Computer Science Ningbo University Ningbo 315211 China

Robust Reversible Watermarking (RRW) enables perfect recovery of cover images and watermarks in lossless channels while ensuring robust watermark extraction under lossy channels. However, existing RRW methods, most of which are not based on deep learning, suffer from complex designs, high computational costs, and poor robustness, limiting their practical applications. To address these issues, this paper proposes Deep Robust Reversible Watermarking (DRRW), a deep learning-based RRW scheme. DRRW introduces an Integer Invertible Watermark Network (iIWN) to achieve an invertible mapping between integer data distributions, fundamentally addressing the limitations of conventional RRW approaches. Unlike traditional RRW methods requiring task-specific designs for different distortions, DRRW adopts an encoder-noise layer-decoder framework, enabling adaptive robustness against various distortions through end-to-end training. During inference, the cover image and watermark are mapped into an overflowed stego image and latent variables. Arithmetic coding efficiently compresses these into a compact bitstream, which is embedded via reversible data hiding to ensure lossless recovery of both the image and watermark. To reduce pixel overflow, we introduce an overflow penalty loss, significantly shortening the auxiliary bitstream while improving both robustness and stego image quality. Additionally, we propose an adaptive weight adjustment strategy that eliminates the need to manually preset the watermark loss weight, ensuring improved training stability and performance. Experiments on multiple datasets demonstrate that DRRW achieves notable performance advantages. Compared to state-of-the-art RRW methods, DRRW improves robustness and reduces embedding, extraction, and recovery complexities by 55.14×, 5.95×, and 3.57×, respectively. The auxiliary bitstream is shortened by 43.86×, and reversible embedding succeeds on 16,762 images in the PASCAL VOC 2012 dataset, marking a significant step toward pra

Keywords: Image enhancement
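The abstract mentions an overflow penalty loss but does not give its form. One plausible reading, offered here purely as an assumption, is a term that measures how far stego pixel values stray outside the valid 8-bit range. The function name, bounds, and sample values below are illustrative.

```python
def overflow_penalty(pixels, low=0.0, high=255.0):
    """Mean distance by which pixel values fall outside [low, high].

    An assumed, simplified form; the paper's actual loss may differ.
    """
    excess = [max(low - p, 0.0) + max(p - high, 0.0) for p in pixels]
    return sum(excess) / len(excess)

stego = [12.0, 260.0, -3.0, 255.0]  # hypothetical stego pixel values
print(overflow_penalty(stego))       # (5 + 3) / 4 = 2.0
```

In-range pixels contribute nothing, so minimizing this term pushes the network towards stego images that need less auxiliary information to record overflowed positions.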


Causal-IQA: towards the generalization of image quality assessment based on causal inference


Proceedings of the 41st International Conference on Machine Learning

Authors: Yan Zhong, Xingyu Wu, Li Zhang, Chenxi Yang, Tingting Jiang. Affiliations: School of Mathematical Sciences and National Engineering Research Center of Visual Technology National Key Laboratory for Multimedia Information Processing School of Computer Science Peking University Beijing China Department of Computing The Hong Kong Polytechnic University Hong Kong SAR China Hefei Institute of Physical Science Chinese Academy of Sciences University of Science and Technology of China Hefei China National Engineering Research Center of Visual Technology National Key Laboratory for Multimedia Information Processing School of Computer Science and National Biomedical Imaging Center Peking University Beijing China

Due to the high cost of Image Quality Assessment (IQA) datasets, achieving robust generalization remains challenging for prevalent deep learning-based IQA methods. To address this, this paper proposes a novel end-to-end blind IQA method: Causal-IQA. Specifically, we first analyze the causal mechanisms in IQA tasks and construct a causal graph to understand the interplay and confounding effects between distortion types, image contents, and subjective human ratings. Then, through shifting the focus from correlations to causality, Causal-IQA aims to improve the estimation accuracy of image quality scores by mitigating the confounding effects using a causality-based optimization strategy. This optimization strategy is implemented on the sample subsets constructed by a Counterfactual Division process based on the Backdoor Criterion. Extensive experiments illustrate the superiority of Causal-IQA.
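For readers unfamiliar with the Backdoor Criterion mentioned above, the standard backdoor adjustment formula is shown here as general background. The symbols are generic (treatment $X$, outcome $Y$, and an adjustment set $Z$ that satisfies the criterion); how the paper maps distortion types, image contents, and ratings onto them is not stated in the abstract.

```latex
P(Y \mid \mathrm{do}(X = x)) = \sum_{z} P(Y \mid X = x, Z = z)\, P(Z = z)
```

Conditioning on $Z$ within each stratum and then averaging over its marginal distribution removes the confounding path through $Z$.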


Downstream-agnostic Adversarial Examples


International Conference on Computer Vision (ICCV)

Authors: Ziqi Zhou, Shengshan Hu, Ruizhi Zhao, Qian Wang, Leo Yu Zhang, Junhui Hou, Hai Jin. Affiliations: School of Cyber Science and Engineering Huazhong University of Science and Technology National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Hubei Key Laboratory of Distributed System Security Hubei Engineering Research Center on Big Data Security School of Cyber Science and Engineering Wuhan University School of Information and Communication Technology Griffith University Department of Computer Science City University of Hong Kong School of Computer Science and Technology Huazhong University of Science and Technology Cluster and Grid Computing Lab

Self-supervised learning usually uses a large amount of unlabeled data to pre-train an encoder that can serve as a general-purpose feature extractor, so that downstream users only need to fine-tune it to enjoy the benefits of a "large model". Despite this promising prospect, the security of pre-trained encoders has not yet been thoroughly investigated, especially when the pre-trained encoder is publicly available for commercial *** this paper, we propose AdvEncoder, the first framework for generating downstream-agnostic universal adversarial examples based on the pre-trained encoder. AdvEncoder aims to construct a universal adversarial perturbation or patch for a set of natural images that can fool all the downstream tasks inheriting the victim pre-trained encoder. Unlike in traditional adversarial example work, the pre-trained encoder outputs only feature vectors rather than classification labels. Therefore, we first exploit the high-frequency component information of the image to guide the generation of adversarial examples. Then we design a generative attack framework to construct adversarial perturbations/patches by learning the distribution of the attack surrogate dataset to improve their attack success rates and transferability. Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset. We also tailor four defenses for pre-trained encoders, the results of which further prove the attack ability of AdvEncoder. Our codes are available at: https://***/CGCL-codes/AdvEncoder.
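What makes a perturbation "universal" is that one shared perturbation is added to every image. The sketch below illustrates only that application step under an L-infinity budget; it does not cover how AdvEncoder generates the perturbation. The function name, the flat-list image representation, and the default budget are assumptions.

```python
def apply_universal_perturbation(images, delta, eps=8 / 255):
    """Add one shared perturbation to every image.

    images : list of images, each a flat list of pixels in [0, 1]
    delta  : perturbation with the same length as each image
    eps    : L-infinity budget (8/255 is a common choice, assumed here)
    """
    # Project the perturbation onto the L-infinity ball of radius eps.
    clipped = [max(-eps, min(eps, d)) for d in delta]
    # Add it to each image and keep pixels in the valid range.
    return [
        [max(0.0, min(1.0, x + d)) for x, d in zip(img, clipped)]
        for img in images
    ]

adv = apply_universal_perturbation([[0.5, 1.0], [0.0, 0.2]], [1.0, 1.0], eps=0.1)
print(adv)
```

Because the same `delta` is reused for all inputs, the attacker needs no per-image optimization at attack time.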


MIMO Beamforming and Signal Modulation Design for Federated Learning Optimization


IEEE Conference on Global Communications (GLOBECOM)

Authors: Nuocheng Yang, Sihua Wang, Mingzhe Chen, Cong Shen, Changchuan Yin, Christopher G. Brinton. Affiliations: Beijing Laboratory of Advanced Information Network Beijing University of Posts and Telecommunications Beijing China Department of Electrical and Computer Engineering Institute for Data Science and Computing University of Miami Coral Gables FL USA Charles L. Brown Department of Electrical and Computer Engineering University of Virginia Charlottesville VA USA School of Electrical and Computer Engineering Purdue University West Lafayette IN USA

In this paper, we consider the optimization of federated learning (FL) over a realistic wireless multiple-input multiple-output (MIMO) communication system with digital modulation and over-the-air computation (AirComp). In such a system, MIMO devices transmit their locally trained FL models to a parameter server (PS) using beamforming to maximize the number of devices scheduled for transmission. AirComp enables efficient wireless model aggregation by the PS in bandwidth-limited settings. However, wireless channel fading can produce distortions in AirComp-based FL. To tackle this challenge, we develop a novel aggregation scheme that combines digital modulation with AirComp to mitigate wireless fading while ensuring communication efficiency. We formulate this as a joint transmit-receive beamforming design optimization problem which dynamically adjusts the beamforming matrices to minimize the FL training loss with transmission errors. To solve this problem based on limited information at the PS, we employ an artificial neural network (ANN) to estimate the local FL models of all devices. Then, we derive a closed-form optimal design of the transmit and receive beamforming matrices based on predicted FL models. Numerical evaluations validate the advantages of the proposed methodology in terms of model training performance compared with baselines.
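The core idea of over-the-air computation is that simultaneous transmissions add up in the channel, so the server receives a sum rather than individual models. The sketch below is a deliberately simplified, noise-free, single-antenna illustration using channel-inversion pre-compensation; it is an assumption for exposition and does not reproduce the paper's MIMO beamforming or digital modulation scheme.

```python
def aircomp_average(updates, channels):
    """Average device updates through a simulated shared channel.

    updates  : one list of floats per device (local model updates)
    channels : one complex channel gain per device, assumed known
    """
    n_params = len(updates[0])
    received = [0j] * n_params
    for update, h in zip(updates, channels):
        for i, value in enumerate(update):
            tx = value / h          # device pre-compensates its own fading
            received[i] += h * tx   # the channel sums all transmissions
    # The server only ever sees the sum; dividing gives the average.
    return [(r / len(updates)).real for r in received]

avg = aircomp_average([[1.0, 2.0], [3.0, 4.0]], [0.5 + 0.5j, 2 - 1j])
print(avg)
```

In practice the channel gains are imperfectly known and noise is present, which is the source of the aggregation distortion the abstract sets out to mitigate.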


AFF-Dehazing: Attention-based feature fusion network for low-light image Dehazing


Authors: Zhou, Yu; Chen, Zhihua; Sheng, Bin; Li, Ping; Kim, Jinman; Wu, Enhua. Affiliations: Department of Computer Science and Engineering East China University of Science and Technology Shanghai China Department of Computer Science and Engineering Shanghai Jiao Tong University Shanghai China Department of Computing The Hong Kong Polytechnic University Kowloon Hong Kong School of Information Technologies The University of Sydney Sydney Australia State Key Laboratory of Computer Science Institute of Software Chinese Academy of Sciences Beijing China Faculty of Science and Technology University of Macau China

Images captured in hazy conditions, especially at nighttime with low light, often suffer from degraded visibility, contrast, and vividness, which makes subsequent vision tasks difficult. In this article, we propose an attention-based feature fusion network (AFF-Dehazing) for low-light image dehazing. Our method decomposes low-light image dehazing into two task-independent streams containing four modules: an image dehazing module, a low-light feature extractor module, a feature fusion module, and an image restoration module. The basic block of these modules is the proposed attention-based residual dense block. Because two branches are used, AFF-Dehazing can avoid learning the mixed degradation all-in-one and enhance the details of low-light haze images. Extensive experiments show that our method surpasses previous state-of-the-art image dehazing methods and low-light enhancement methods by a very large margin both quantitatively and qualitatively. © 2021 John Wiley & Sons, Ltd.

Keywords: Image reconstruction


DPNet: Dynamic Pooling Network for Accurate and Efficient Size-Aware Tiny Object Detection


IEEE Internet of Things Journal, 2025

Authors: Gong, Luqi; Chen, Haotian; Chen, Yikun; Yao, Tianliang; Li, Chao; Zhao, Shuai; Han, Guangjie. Affiliations: Beijing University of Posts and Telecommunications School of Computer Science Beijing China Zhejiang Lab Research Center for Space Computing System Hangzhou China Southwest Jiaotong University SWJTU-LEEDS Joint School Chengdu China LTD Guangdong Zhiyun City construction Technology Co Zhuhai China Tongji University College of Electronic and Information Engineering Department of Control Science and Engineering Shanghai China Hohai University Key Laboratory of Maritime Intelligent Network Information Technology Ministry of Education China

In unmanned aerial systems, especially in complex environments, accurately detecting tiny objects is crucial. Resizing images is a common strategy to improve detection accuracy, particularly for small objects. However, simply enlarging images significantly increases computational costs and the number of negative samples, severely degrading detection performance and limiting its applicability. This paper proposes a Dynamic Pooling Network (DPNet) for tiny object detection to mitigate these issues. DPNet employs a flexible down-sampling strategy by introducing a factor (df) that relaxes the fixed down-sampling process of the feature map into an adjustable one. Furthermore, we design a lightweight predictor to predict df for each input image, which is then used to decrease the resolution of the feature map in the backbone. Thus, we achieve input-aware down-sampling. We design an Adaptive Normalization Module (ANM) to make a unified detector compatible with different df values. We also design a guidance loss to supervise the predictor's training. In this way, DPNet dynamically allocates computing resources to trade off detection accuracy against efficiency. Experiments on the TinyCOCO and TinyPerson datasets show that our DPNet can save over 35% and 25% GFLOPs, respectively, while maintaining comparable detection *** code will be made publicly available. © 2014 IEEE.

Keywords: Object detection
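The adjustable down-sampling factor can be pictured with a plain average-pooling routine whose stride is chosen per input. This is a minimal sketch that assumes an integer factor and a feature map stored as nested lists; the abstract does not say whether df is restricted to integers, and the predictor that selects it is not modeled here.

```python
def average_pool(feature_map, df):
    """Down-sample a 2-D feature map by an integer factor df.

    Rows and columns that do not fill a complete window are dropped.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, h - h % df, df):
        row = []
        for c in range(0, w - w % df, df):
            window = [feature_map[r + i][c + j]
                      for i in range(df) for j in range(df)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled

fm = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(average_pool(fm, 2))  # [[3.5, 5.5], [11.5, 13.5]]
```

A larger df leaves fewer spatial positions for later layers to process, which is where the compute savings come from. A smaller df preserves resolution for images that contain tiny objects.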


Logical Relation Modeling and Mining in Hyperbolic Space for Recommendation


International Conference on Data Engineering

Authors: Yanchao Tan, Hang Lv, Zihao Zhou, Wenzhong Guo, Bo Xiong, Weiming Liu, Chaochao Chen, Shiping Wang, Carl Yang. Affiliations: College of Computer and Data Science Fuzhou University Fuzhou China Engineering Research Center of Big Data Intelligence Ministry of Education Fuzhou China Fujian Key Laboratory of Network Computing and Intelligent Information Processing Fuzhou University Fuzhou China Institute for Artificial Intelligence University of Stuttgart Stuttgart Germany College of Computer Science Zhejiang University Hangzhou China Department of Computer Science Emory University Atlanta United States

ISBN (digital): 9798350317152

ISBN (print): 9798350317169

The sparse interactions between users and items make it harder to learn their representations in recommender systems. Existing methods leverage tags to alleviate the sparsity problem but ignore prevalent logical relations among items and tags (e.g., membership, hierarchy, and exclusion), which can be leveraged to enhance the accuracy of modeling user preferences and conducting recommendations. To this end, we propose to extract logical relations among item tags from existing tag taxonomies and exploit the individual strengths of the Poincaré and the Lorentz models in hyperbolic space for logical relation modeling towards enhanced recommendations. Moreover, we find that the logical relations directly extracted from existing tag taxonomies can be inaccurate and coarse. Therefore, we further devise innovative consistency-based and granularity-based weighting mechanisms based on user behavior patterns for data-driven logical relation mining that can be jointly optimized along with recommendations in an end-to-end fashion. Extensive experiments on four real-world benchmark datasets show drastic performance gains brought by our proposed framework, which consistently achieves an average of 8.25% improvement over state-of-the-art competitors regarding both Recall and NDCG metrics. Insightful case studies further demonstrate that our automatically refined logical relations are highly accurate and interpretable.

Keywords: Measurement; Accuracy; Taxonomy; Semantics; Predictive models; Performance gain; Linguistics
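The Poincaré and Lorentz models named in the abstract are two coordinate systems for the same hyperbolic space. As background, the sketch below computes the standard geodesic distance in each (curvature -1) and maps a Poincaré-ball point onto the hyperboloid; function names and sample points are illustrative, and nothing here reflects the paper's specific relation-modeling design.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

def to_lorentz(u):
    """Map a Poincaré-ball point to the hyperboloid (Lorentz) model."""
    sq_u = sum(a * a for a in u)
    return [(1.0 + sq_u) / (1.0 - sq_u)] + [2.0 * a / (1.0 - sq_u) for a in u]

def lorentz_distance(x, y):
    """Distance on the hyperboloid via the Lorentzian inner product."""
    inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
    # Clamp to 1 to guard against rounding just below acosh's domain.
    return math.acosh(max(1.0, -inner))

u, v = [0.1, 0.2], [-0.3, 0.4]
print(poincare_distance(u, v))
print(lorentz_distance(to_lorentz(u), to_lorentz(v)))
```

Both calls print the same value up to floating-point error, since the mapping is an isometry. Distances grow rapidly near the ball's boundary, which is why hyperbolic embeddings suit tree-like structures such as tag hierarchies.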


Efficient Graph Neural Network Driven Recurrent Reinforcement Learning for GNSS Position Correction


36th International Technical Meeting of the Satellite Division of the Institute of Navigation, ION GNSS+ 2023

Authors: Zhao, Haoli; Tang, Jianhao; Li, Zhenni; Wu, Zhuoyu; Xie, Shengli; Wu, Zhaofeng; Liu, Ming; Kumara, Banage T.G.S. Affiliations: School of Automation Guangdong University of Technology Guangzhou 510006 China Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing Guangzhou 510006 China 111 Center for Intelligent Batch Manufacturing Based on IoT Technology Guangzhou 510006 China Key Laboratory of Intelligent Detection and The Internet of Things in Manufacturing Guangzhou 510006 China Guangdong Key Laboratory of IoT Information Technology Guangzhou 510006 China Techtotop Microelectronics Technology Co. Ltd. Guangzhou 510000 China Department of Electronic and Computer Engineering Hong Kong University of Science and Technology Hong Kong Department of Computing and Information Systems Sabaragamuwa University of Sri Lanka Belihuloya Sri Lanka

ISBN (print): 9780936406350

With the wide application of the Global Navigation Satellite System (GNSS) in autonomous driving scenarios, the demand for high-precision positioning of navigation systems has increased dramatically in complex multipath environments. Conventional model-based methods are constrained by strict assumptions about noise models and can hardly model complex environment errors. In contrast, learning-based approaches have become an important direction for solving the problem of high-precision positioning because they require only simple assumptions. However, current learning-based approaches face the following issues. The existing Graph Neural Network-based (GNN) method can hardly adapt to dynamically changing driving environment scenarios since it considers positioning discretely. On the other hand, existing Reinforcement Learning-based (RL) approaches ignore the relationship between multi-constellation satellites, resulting in an inadequate description of the driving correction environment observations. In this paper, we construct a GNN-driven recurrent reinforcement learning method to consider the GNSS measurement of multi-constellation satellites and to learn real-time correction strategy in the dynamic driving environment. To establish a comprehensive positioning correction environment, we construct a multi-constellation graph observation, based on the feature vector concerning GNSS measurement of multi-constellation satellites and edges for satellites in and between constellations. To make more effective use of GNSS measurements, we employ the graph embedding module to deal with the multi-constellation graph inputs, to extract hidden topological features to form the brief states about relationships between multi-constellation satellites for the RL environment. Finally, we construct a recurrent actor-critic structured RL model with cumulative reward and continuous action space to exploit historical information and a

