Search results - Inner Mongolia University Library

Self-knowledge distillation with dimensional history knowledge

Science China (Information Sciences), 2025

Authors: Wenke HUANG, Mang YE, Zekun SHI, He LI, Bo DU (National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University)

Existing self-knowledge distillation (Self-KD) solutions usually focus on transferring historical predictions of individual instances to the current network. However, this approach tends to create overconfidence for easy instances and underconfidence for hard instances. The widely used temperature-based strategies to smooth or sharpen the predicted distributions can lead to inconsistencies across instances, causing sensitivity issues. To address this, our approach views a queue of instances as an ensemble rather than treating each instance independently. We propose a novel method that distills historical knowledge from a dimensional perspective, utilizing intra-class characteristics and inter-class relationships within each ensemble. First, we align each dimension distribution from the current network to the historical output. Second, we ensure each dimension is closer to similar dimensions than dissimilar ones, maintaining consistent attitudes from present and historical perspectives. Our insights reveal that distilling historical knowledge from a dimensional perspective is more effective than the traditional instance-based approach, with potential applications in related tasks. Empirical results on three well-known datasets and various network architectures demonstrate the superiority of our proposed method. Our code is available at https://***/WenkeHuang/DimSelfKD.
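The contrast the abstract draws between instance-based and dimensional distillation can be illustrated with a minimal NumPy sketch. This is not the authors' DimSelfKD implementation: the function names, the use of KL divergence, and the choice to normalize each class dimension with a softmax over the queued instances are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl_divergence(p, q, axis, eps=1e-12):
    # KL(p || q) summed along `axis`, then averaged over the remaining axis.
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=axis)))

def instance_kd_loss(current_logits, history_logits):
    # Conventional view: one distribution over classes per instance (row).
    p = softmax(history_logits, axis=1)
    q = softmax(current_logits, axis=1)
    return kl_divergence(p, q, axis=1)

def dimensional_kd_loss(current_logits, history_logits):
    # Dimensional view (assumed form): one distribution over the queued
    # instances per class dimension (column).
    p = softmax(history_logits, axis=0)
    q = softmax(current_logits, axis=0)
    return kl_divergence(p, q, axis=0)

rng = np.random.default_rng(0)
history = rng.normal(size=(8, 5))                   # queue of 8 instances, 5 classes
current = history + 0.1 * rng.normal(size=(8, 5))   # slightly drifted current outputs
print(instance_kd_loss(current, history), dimensional_kd_loss(current, history))
```

In this sketch the only difference between the two losses is the axis along which the logits are normalized, so the dimensional loss compares how each class dimension ranks the whole queue rather than how each instance ranks the classes.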


Learning a generalizable re-identification model from unlabelled data with domain-agnostic expert


Visual Intelligence, 2024, Vol. 2, No. 1, pp. 337-349

Authors: Fangyi Liu, Mang Ye, Bo Du (National Engineering Research Center for Multimedia Software, Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Hubei Luojia Laboratory, Wuhan University, Wuhan 430072, China)

In response to real-world scenarios, the domain generalization (DG) problem has spurred considerable research in person re-identification (ReID). This challenge arises when the target domain, which is significantly different from the source domains, remains ***, the performance of current DG ReID relies heavily on labor-intensive source domain *** the potential of unlabeled data, we investigate unsupervised domain generalization (UDG) in *** goal is to create a model that can generalize from unlabeled source domains to semantically retrieve images in an unseen target *** address this, we propose a new approach that trains a domain-agnostic expert (DaE) for unsupervised domain-generalizable person *** involves independently training multiple experts to account for label space inconsistencies between source *** the same time, the DaE captures domain-generalizable information for *** experiments demonstrate the effectiveness of this method for learning generalizable features under the UDG *** results demonstrate the superiority of our method over state-of-the-art *** will make our code and models available for public use.

Keywords: Domain generalization (DG); Unlabeled source domains; Label space inconsistencies; Domain-agnostic expert (DaE)


NNVISR: Bring Neural Network Video Interpolation and Super Resolution into Video Processing Framework


32nd ACM International Conference on Multimedia, MM 2024

Authors: Tong, Yuan; Hu, Mengshun; Wang, Zheng (National Engineering Research Center for Multimedia Software, Hubei Key Laboratory of Multimedia and Network Communication Engineering, School of Computer Science, Wuhan University, Wuhan, China)

ISBN: (print) 9798400706868

We present NNVISR, an open-source filter plugin for the VapourSynth video processing framework, which facilitates the application of neural networks to various kinds of video enhancement tasks, including denoising, super resolution, interpolation, and spatio-temporal super-resolution. NNVISR fills the gap between video enhancement neural networks and video processing pipelines by accepting any network that enhances a group of frames and handling all other network-agnostic details during video processing. NNVISR is publicly released at https://***/tongyuantongyu/vs-NNVISR. © 2024 ACM.

Keywords: Open systems


Resisting Over-Smoothing in Graph Neural Networks via Dual-Dimensional Decoupling


32nd ACM International Conference on Multimedia, MM 2024

Authors: Shen, Wei; Ye, Mang; Huang, Wenke (National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China)

ISBN: (print) 9798400706868

Graph Neural Networks (GNNs) are widely employed to derive meaningful node representations from graphs. Despite their success, deep GNNs frequently grapple with the oversmoothing issue, where node representations become highly indistinguishable due to repeated aggregations. In this work, we consider the oversmoothing issue from two aspects of the node embedding space: dimension and instance. Specifically, while existing methods primarily concentrate on instance-level node relations to mitigate oversmoothing, we propose to mitigate oversmoothing at dimension level. We reveal the heightened information redundancy between dimensions which diminishes information diversity and impairs node differentiation in GNNs. Motivated by this insight, we propose the Dimension-Level Decoupling (DLD) to reduce dimension redundancy, enhancing dimensional-level node differentiation. Besides, at the instance level, the neglect of class differences leads to vague classification boundaries. Hence, we introduce the Instance-Level Class-Difference Decoupling (ICDD) that repels inter-class nodes and attracts intra-class nodes, improving the instance-level node discrimination with clear classification boundaries. Additionally, we introduce a novel evaluation metric that considers the impact of class differences on node distances, facilitating precise oversmoothing measurement. Extensive experiments demonstrate the effectiveness of our method Dual-Dimensional Class-Difference Decoupling (DDCD) across diverse scenarios. © 2024 ACM.
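The notion of information redundancy between embedding dimensions can be made concrete with a small NumPy sketch. The abstract does not give the DLD objective, so the measure below (the mean squared off-diagonal entry of the dimension-by-dimension correlation matrix) is only one common way to quantify such redundancy, and the function name `dimension_redundancy` is hypothetical.

```python
import numpy as np

def dimension_redundancy(embeddings, eps=1e-8):
    # embeddings: (num_nodes, num_dims) node representations.
    # Standardize every dimension over the nodes.
    z = (embeddings - embeddings.mean(axis=0)) / (embeddings.std(axis=0) + eps)
    # (num_dims, num_dims) correlation matrix between dimensions.
    corr = z.T @ z / embeddings.shape[0]
    # Keep only the off-diagonal entries, i.e. correlations between different dimensions.
    off_diag = corr - np.diag(np.diag(corr))
    d = corr.shape[0]
    # Mean squared off-diagonal correlation: near 0 = decoupled, near 1 = redundant.
    return float(np.sum(off_diag ** 2) / (d * (d - 1)))

rng = np.random.default_rng(0)
diverse = rng.normal(size=(1000, 4))                           # independent dimensions
redundant = np.repeat(rng.normal(size=(1000, 1)), 4, axis=1)   # four copies of one dimension
print(dimension_redundancy(diverse), dimension_redundancy(redundant))
```

A decoupling term of this kind could be added to a training loss as a penalty, although whether the paper does so in this form is not stated in the abstract.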

Keywords: Graph neural networks


Cloth-aware Augmentation for Cloth-generalized Person Re-identification


32nd ACM International Conference on Multimedia, MM 2024

Authors: Liu, Fangyi; Ye, Mang; Du, Bo (National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, School of Computer Science, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China)

ISBN: (print) 9798400706868

Person re-identification (ReID) is crucial in video surveillance, aiming to match individuals across different camera views while cloth-changing person re-identification (CC-ReID) focuses on pedestrians changing attire. Many existing CC-ReID methods overlook generalization, crucial for universality across cloth-consistent and cloth-changing scenarios. This paper pioneers exploring the cloth-generalized person re-identification (CG-ReID) task and introduces the Cloth-aware Augmentation (CaAug) strategy. Comprising domain augmentation and feature augmentation, CaAug aims to learn identity-relevant features adaptable to both scenarios. Domain augmentation involves creating diverse fictitious domains and simulating various clothing scenarios. Supervising features from different cloth domains enhances robustness and generalization against clothing changes. Additionally, for feature augmentation, element exchange introduces diversity concerning clothing changes. Regularizing the model with these augmented features strengthens resilience against clothing change uncertainty. Extensive experiments on cloth-changing datasets demonstrate the efficacy of our approach, consistently outperforming state-of-the-art methods. © 2024 ACM.

Keywords: Feature extraction


A Theoretical Analysis of Backdoor Poisoning Attacks in Convolutional Neural Networks


41st International Conference on Machine Learning, ICML 2024

Authors: Li, Boqi; Liu, Weiwei (School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China)

The rising threat of backdoor poisoning attacks (BPAs) on Deep Neural Networks (DNNs) has become a significant concern in recent years. In such attacks, the adversaries strategically target a specific class and generate a poisoned training set. The neural network (NN), well-trained on the poisoned training set, is able to predict any input with the trigger pattern as the targeted label, while maintaining accurate outputs for clean inputs. However, why the BPAs work remains less explored. To fill this gap, we employ a dirty-label attack and conduct a detailed analysis of BPAs in a two-layer convolutional neural network. We provide theoretical insights and results on the effectiveness of BPAs. Our experimental results on two real-world datasets validate our theoretical findings. Copyright 2024 by the author(s)

Keywords: Convolutional neural networks


Devil is in Details: Locality-Aware 3D Abdominal CT Volume Generation for Self-Supervised Organ Segmentation


32nd ACM International Conference on Multimedia, MM 2024

Authors: Wang, Yuran; Wan, Zhijing; Qiu, Yansheng; Wang, Zheng (National Engineering Research Center for Multimedia Software, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, China)

ISBN: (print) 9798400706868

In the realm of medical image analysis, self-supervised learning (SSL) techniques have emerged to alleviate labeling demands, while still facing the challenge of training data scarcity owing to escalating resource requirements and privacy constraints. Numerous efforts employ generative models to generate high-fidelity, unlabeled 3D volumes across diverse modalities and anatomical regions. However, the intricate and indistinguishable anatomical structures within the abdomen pose a unique challenge to abdominal CT volume generation compared to other anatomical regions. To address the overlooked challenge, we introduce the Locality-Aware Diffusion (Lad), a novel method tailored for exquisite 3D abdominal CT volume generation. We design a locality loss to refine crucial anatomical regions and devise a condition extractor to integrate abdominal priori into generation, thereby enabling the generation of large quantities of high-quality abdominal CT volumes essential for SSL tasks without the need for additional data such as labels or radiology reports. Volumes generated through our method demonstrate remarkable fidelity in reproducing abdominal structures, achieving a decrease in FID score from 0.0034 to 0.0002 on AbdomenCT-1K dataset, closely mirroring authentic data and surpassing current methods. Extensive experiments demonstrate the effectiveness of our method in self-supervised organ segmentation tasks, resulting in an improvement in mean Dice scores on two abdominal datasets effectively. These results underscore the potential of synthetic data to advance self-supervised learning in medical image analysis. © 2024 ACM.

Keywords: Self-supervised learning


Empowering Visible-Infrared Person Re-Identification with Large Foundation Models


38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Hu, Zhangyi; Yang, Bin; Ye, Mang (National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China)

Visible-Infrared Person Re-identification (VI-ReID) is a challenging cross-modal retrieval task due to significant modality differences, primarily resulting from the absence of color information in the infrared modality. The development of large foundation models like Large Language Models (LLMs) and Vision Language Models (VLMs) motivates us to explore a feasible solution to empower VI-ReID with off-the-shelf large foundation models. To this end, we propose a novel Text-enhanced VI-ReID framework driven by Large Foundation Models (TVI-LFM). The core idea is to enrich the representation of the infrared modality with textual descriptions automatically generated by VLMs. Specifically, we incorporate a pre-trained VLM to extract textual features from texts generated by VLM and augmented by LLM, and incrementally fine-tune the text encoder to minimize the domain gap between generated texts and original visual modalities. Meanwhile, to enhance the infrared modality with extracted textual representations, we leverage modality alignment capabilities of VLMs and VLM-generated feature-level filters. This enables the text model to learn complementary features from the infrared modality, ensuring the semantic structural consistency between the fusion modality and the visible modality. Furthermore, we introduce modality joint learning to align features across all modalities, ensuring that textual features maintain stable semantic representation of overall pedestrian appearance during complementary information learning. Additionally, a modality ensemble retrieval strategy is proposed to leverage complementary strengths of each query modality to improve retrieval effectiveness and robustness. Extensive experiments on three expanded VI-ReID datasets demonstrate that our method significantly improves the retrieval performance, paving the way for the utilization of large foundation models in downstream multi-modal retrieval tasks. © 2024 Neural information processing systems foundat


A Provable Decision Rule for Out-of-Distribution Detection


41st International Conference on Machine Learning, ICML 2024

Authors: Ma, Xinsong; Zou, Xin; Liu, Weiwei (School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China)

Out-of-distribution (OOD) detection plays a key role in reliable and safety-critical applications. Existing research mainly focuses on designing or training powerful score functions but overlooks the decision rule built on the proposed score function. Different from previous work, this paper aims to design a decision rule with rigorous theoretical guarantees and good empirical performance. Specifically, we provide a new insight into the OOD detection task from a hypothesis testing perspective and propose a novel generalized Benjamini-Hochberg (g-BH) procedure with empirical p-values to solve the testing problem. Theoretically, the g-BH procedure controls the false discovery rate (FDR) at a pre-specified level. Furthermore, we derive an upper bound on the expectation of the false positive rate (FPR) for the g-BH procedure based on the tailed generalized Gaussian distribution family, indicating that the FPR of the g-BH procedure converges to zero in probability. Finally, extensive experimental results verify the superiority of the g-BH procedure over the traditional threshold-based decision rule on several OOD detection benchmarks. Copyright 2024 by the author(s)
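To make the hypothesis-testing view concrete, the sketch below pairs empirical p-values with the standard Benjamini-Hochberg step-up procedure. It is the classical BH rule, not the generalized g-BH procedure of the paper, whose details the abstract does not give. The function names and the convention that a higher score means "more in-distribution" are assumptions.

```python
import numpy as np

def empirical_p_values(calibration_scores, test_scores):
    # Null hypothesis: the test sample is in-distribution.
    # Assumption: higher score = more in-distribution, so a low score yields a small p-value.
    cal = np.sort(np.asarray(calibration_scores, dtype=float))
    # Number of in-distribution calibration scores at or below each test score.
    counts = np.searchsorted(cal, np.asarray(test_scores, dtype=float), side="right")
    return (1 + counts) / (len(cal) + 1)

def benjamini_hochberg(p_values, alpha=0.05):
    # Standard BH step-up rule: reject the k smallest p-values, where k is the
    # largest rank with p_(k) <= k * alpha / m.
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()
        reject[order[: k + 1]] = True
    return reject  # True = flagged as OOD

rng = np.random.default_rng(0)
calibration = rng.normal(loc=5.0, size=500)                  # scores of known in-distribution data
test = np.concatenate([rng.normal(loc=5.0, size=20),         # in-distribution test samples
                       rng.normal(loc=-5.0, size=20)])       # clearly shifted samples
flags = benjamini_hochberg(empirical_p_values(calibration, test), alpha=0.05)
print(flags.sum(), "of", len(flags), "test samples flagged as OOD")
```

Unlike a fixed threshold on the score, the rejection cutoff here adapts to the whole batch of test p-values, which is what allows the procedure to control the false discovery rate.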


Sequential Kernel Goodness-of-fit Testing


41st International Conference on Machine Learning, ICML 2024

Authors: Zhou, Zhengyu; Liu, Weiwei (School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China)

Goodness-of-fit testing, a classical statistical tool, has been extensively explored in the batch setting, where the sample size is ***, practitioners often prefer methods that adapt to the complexity of a problem rather than fixing the sample size *** batch tests are generally unsuitable for streaming data, as valid inference after data peeking requires multiple testing corrections, resulting in reduced statistical *** address this issue, we delve into the design of consistent sequential goodness-of-fit *** the principle of testing by betting, we reframe this task as selecting a sequence of payoff functions that maximize the wealth of a fictitious bettor, betting against the null in a repeated *** conduct experiments to demonstrate the adaptability of our sequential test across varying difficulty levels of problems while maintaining control over type-I errors. Copyright 2024 by the author(s)
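The testing-by-betting principle mentioned in the abstract can be sketched with a deliberately simple example. The paper selects kernel-based payoff functions, which the abstract does not specify; the code below instead tests the much simpler null hypothesis that observations in [0, 1] have mean `null_mean`, using a fixed bet. The function name, the fixed bet size, and the null hypothesis itself are illustrative assumptions.

```python
def sequential_betting_test(stream, null_mean=0.5, bet=1.0, alpha=0.05):
    # Assumption: every observation lies in [0, 1]. Keeping
    # |bet| <= 1 / max(null_mean, 1 - null_mean) is sufficient for the
    # bettor's wealth to stay non-negative.
    wealth = 1.0
    t = 0
    for t, x in enumerate(stream, start=1):
        # Each payoff factor has expectation 1 when the null is true, so the
        # wealth is a non-negative martingale under the null.
        wealth *= 1.0 + bet * (x - null_mean)
        # By Ville's inequality, wealth reaches 1 / alpha with probability at
        # most alpha under the null, so stopping here controls type-I error.
        if wealth >= 1.0 / alpha:
            return True, t, wealth
    return False, t, wealth

# Data far from the null make the wealth grow quickly and the test stops early.
print(sequential_betting_test([1.0] * 20))
# Data consistent with the null leave the wealth flat and the test never rejects.
print(sequential_betting_test([0.5] * 100))
```

The number of observations consumed before stopping depends on how strongly the data contradict the null, which is the adaptivity to problem difficulty that the abstract describes.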
