Search Results - Inner Mongolia University Library

arXiv, 2023

Authors: Luo, Anwei; Cai, Rizhao; Kong, Chenqi; Ju, Yakun; Kang, Xiangui; Huang, Jiwu; Kot, Alex C.

Affiliations: The School of Information Technology Jiangxi University of Finance and Economics Nanchang 330013 China The School of Computer Science and Engineering Sun Yat-Sen University Guangzhou 510006 China Lab. School of Electrical and Electronic Engineering Nanyang Technological University Singapore The Guangdong Key Laboratory of Intelligent Information Processing National Engineering Laboratory for Big Data System Computing Technology Shenzhen University Shenzhen 518060 China The China-Singapore International Joint Research Institute Singapore

With the rapid progress of generative models, the current challenge in face forgery detection is how to effectively detect realistic manipulated faces from different unseen domains. Though previous studies show that pre-trained Vision Transformer (ViT) based models can achieve some promising results after full fine-tuning on the Deepfake dataset, their generalization performance is still unsatisfactory. One possible reason is that fully fine-tuned ViT-based models may disrupt the pre-trained features [1], [2] and overfit to some data-specific patterns [3]. To alleviate this issue, we present a Forgery-aware Adaptive Vision Transformer (FA-ViT) under the adaptive learning paradigm, where the parameters in the pre-trained ViT are kept fixed while the designed adaptive modules are optimized to capture forgery features. Specifically, a global adaptive module is designed to model long-range interactions among input tokens, which takes advantage of the self-attention mechanism to mine global forgery clues. To further explore essential local forgery clues, a local adaptive module is proposed to expose local inconsistencies by enhancing the local contextual association. In addition, we introduce a fine-grained adaptive learning module that emphasizes the common compact representation of genuine faces through relationship learning in fine-grained pairs, driving these proposed adaptive modules to be aware of fine-grained forgery-aware information. Extensive experiments demonstrate that our FA-ViT achieves state-of-the-art results in the cross-dataset evaluation and enhances robustness against unseen perturbations. In particular, FA-ViT achieves 93.83% and 78.32% AUC scores on the Celeb-DF and DFDC datasets in the cross-dataset evaluation. The code and trained model have been released at: https://***/LoveSiameseCat/FAViT. Copyright © 2023, The Authors. All rights reserved.

Keywords: Contrastive Learning

Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features

arXiv, 2024

Authors: Meng, Benyuan; Xu, Qianqian; Wang, Zitai; Cao, Xiaochun; Huang, Qingming

Affiliations: Institute of Information Engineering CAS China School of Cyber Security University of Chinese Academy of Sciences China Key Lab. of Intelligent Information Processing Institute of Computing Technology CAS China Peng Cheng Laboratory China School of Cyber Science and Tech. Shenzhen Campus of Sun Yat-sen University China School of Computer Science and Tech. University of Chinese Academy of Sciences China Key Laboratory of Big Data Mining and Knowledge Management CAS China

Diffusion models are initially designed for image generation. Recent research shows that the internal signals within their backbones, named activations, can also serve as dense features for various discriminative tasks such as semantic segmentation. Given numerous activations, selecting a small yet effective subset poses a fundamental problem. To this end, the early study of this field performs a large-scale quantitative comparison of the discriminative ability of the activations. However, we find that many potential activations have not been evaluated, such as the queries and keys used to compute attention scores. Moreover, recent advancements in diffusion architectures bring many new activations, such as those within embedded ViT modules. Both combined, activation selection remains unresolved but overlooked. To tackle this issue, this paper takes a further step with a much broader range of activations evaluated. Considering the significant increase in activations, a full-scale quantitative comparison is no longer operational. Instead, we seek to understand the properties of these activations, such that the activations that are clearly inferior can be filtered out in advance via simple qualitative evaluation. After careful analysis, we discover three properties universal among diffusion models, enabling this study to go beyond specific models. On top of this, we present effective feature selection solutions for several popular diffusion models. Finally, the experiments across multiple discriminative tasks validate the superiority of our method over the SOTA competitors. Our code is available at this url. © 2024, CC BY.

Keywords: Semantics

Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques

arXiv, 2024

Authors: Meng, Benyuan; Xu, Qianqian; Wang, Zitai; Yang, Zhiyong; Cao, Xiaochun; Huang, Qingming

Affiliations: Institute of Information Engineering CAS China School of Cyber Security University of Chinese Academy of Sciences China Key Lab. of Intelligent Information Processing Institute of Computing Technology CAS China Peng Cheng Laboratory China School of Computer Science and Tech. University of Chinese Academy of Sciences China Key Laboratory of Big Data Mining and Knowledge Management CAS China School of Cyber Science and Tech. Sun Yat-sen University Shenzhen Campus China

Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely, diffusion feature. We discover that diffusion feature has been hindered by a hidden yet universal phenomenon that we call content shift. To be specific, there are content differences between features and the input image, such as the exact shape of a certain object. We locate the cause of content shift as one inherent characteristic of diffusion models, which suggests the broad existence of this phenomenon in diffusion feature. Further empirical study also indicates that its negative impact is not negligible even when content shift is not visually perceivable. Hence, we propose to suppress content shift to enhance the overall quality of diffusion features. Specifically, content shift is related to the information drift during the process of recovering an image from the noisy input, pointing out the possibility of turning off-the-shelf generation techniques into tools for content shift suppression. We further propose a practical guideline named GATE to efficiently evaluate the potential benefit of a technique and provide an implementation of our methodology. Despite the simplicity, the proposed approach has achieved superior results on various tasks and datasets, validating its potential as a generic booster for diffusion features. Our code is available at this url. © 2024, CC BY.

Improved Visual Fine-tuning with Natural Language Supervision

International Conference on Computer Vision (ICCV)

Authors: Junyang Wang; Yuanhong Xu; Juhua Hu; Ming Yan; Jitao Sang; Qi Qian

Affiliations: School of Computer and Information Technology & Beijing Key Lab of Traffic Data Analysis and Mining Beijing Jiaotong University Beijing China DAMO Academy Alibaba Group Hangzhou China School of Engineering and Technology University of Washington Tacoma WA USA Peng Cheng Lab Shenzhen China DAMO Academy Alibaba Group Bellevue WA USA

Fine-tuning a visual pre-trained model can leverage the semantic information from large-scale pre-training data and mitigate the over-fitting problem on downstream vision tasks with limited training examples. While the problem of catastrophic forgetting in the pre-trained backbone has been extensively studied for fine-tuning, its potential bias from the corresponding pre-training task and data attracts less attention. In this work, we investigate this problem by demonstrating that the obtained classifier after fine-tuning will be close to that induced by the pre-trained model. To reduce the bias in the classifier effectively, we introduce a reference distribution obtained from a fixed text classifier, which can help regularize the learned vision classifier. The proposed method, Text Supervised fine-tuning (TeS), is evaluated with diverse pre-trained vision models including ResNet and ViT, and text encoders including BERT and CLIP, on 11 downstream tasks. The consistent improvement with a clear margin over distinct scenarios confirms the effectiveness of our proposal. Code is available at https://***/idstcv/TeS.

Attention-effective multiple instance learning on weakly stem cell colony segmentation

arXiv, 2022

Authors: Yudistira, Novanto; Kavitha, Muthu Subash; Rajan, Jeny; Kurita, Takio

Affiliations: Intelligent System Lab Faculty of Computer Science Brawijaya University Indonesia School of Information and Data Sciences Nagasaki University Nagasaki Japan Department of Computer Science and Engineering National Institute of Technology Karnataka Surathkal India Graduate School of Advanced Science and Engineering Hiroshima University Hiroshima Japan

The detection of induced pluripotent stem cell (iPSC) colonies often needs the precise extraction of the colony features. However, existing computerized systems that rely on contour segmentation by preprocessing to classify colony conditions are task-extensive. To maximize the efficiency in categorizing colony conditions, we propose a multiple instance learning (MIL) approach in weakly supervised settings. It is designed as a single model to produce weak segmentation and classification of colonies without using finely labeled samples. As a single model, we employ a U-net-like convolutional neural network (CNN) trained on binary image-level labels for MIL colony classification. Furthermore, to specify the object of interest, we use a simple post-processing method. The proposed approach is compared with conventional methods using five-fold cross-validation and the receiver operating characteristic (ROC) curve. The maximum accuracy of the MIL-net is 95%, which is 15% higher than the conventional methods. Furthermore, the ability to interpret the location of the iPSC colonies based on the image-level label without using a pixel-wise ground truth image is more appealing and cost-effective in colony condition recognition. © 2022, CC BY.

Keywords: Learning systems

GrabDAE: An Innovative Framework for Unsupervised Domain Adaptation Utilizing Grab-Mask and Denoise Auto-Encoder

arXiv, 2024

Authors: Chen, Junzhou; Wen, Xuan; Zhang, Ronghui; Ren, Bingtao; Wu, Di; Xu, Zhigang; Wang, Danwei

Affiliations: The Guangdong Provincial Key Laboratory of Intelligent Transport System School of Intelligent Systems Engineering Sun Yat-sen University Guangzhou 510275 China The School of Transportation Science and Engineering Beihang University State Key Lab of Intelligent Transportation System Beijing 100191 China The School of Computer Science and Engineering Sun Yat-sen University Guangzhou 510006 China The Guangdong Key Laboratory of Big Data Analysis and Processing Guangdong 510006 China The School of Information Engineering Chang’an University Shaanxi Xi’an 710064 China The School of Electrical and Electronic Engineering Nanyang Technological University Singapore 639798 Singapore

Unsupervised Domain Adaptation (UDA) aims to adapt a model trained on a labeled source domain to an unlabeled target domain by addressing the domain shift. Existing UDA methods often fall short in fully leveraging contextual information from the target domain, leading to suboptimal decision boundary separation during source and target domain alignment. To address this, we introduce GrabDAE, an innovative UDA framework designed to tackle domain shift in visual classification tasks. GrabDAE incorporates two key innovations: the Grab-Mask module, which blurs background information in target domain images, enabling the model to focus on essential, domain-relevant features through contrastive learning; and the Denoising Auto-Encoder (DAE), which enhances feature alignment by reconstructing features and filtering noise, ensuring a more robust adaptation to the target domain. These components empower GrabDAE to effectively handle unlabeled target domain data, significantly improving both classification accuracy and robustness. Extensive experiments on benchmark datasets, including VisDA-2017, OfficeHome, and Office31, demonstrate that GrabDAE consistently surpasses state-of-the-art UDA methods, setting new performance benchmarks. By tackling UDA’s critical challenges with its novel feature masking and denoising approach, GrabDAE offers both significant theoretical and practical advancements in domain adaptation. © 2024, CC BY-NC-SA.

Shielding Federated Learning: Robust Aggregation with Adaptive Client Selection

arXiv, 2022

Authors: Wan, Wei; Hu, Shengshan; Lu, Jianrong; Yu Zhang, Leo; Jin, Hai; He, Yuanyuan

Affiliations: School of Cyber Science and Engineering Huazhong University of Science and Technology China School of Computer Science and Technology Huazhong University of Science and Technology China National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Hubei Engineering Research Center on Big Data Security China Cluster and Grid Computing Lab School of Information Technology Deakin University Australia

Federated learning (FL) enables multiple clients to collaboratively train an accurate global model while protecting clients’ data privacy. However, FL is susceptible to Byzantine attacks from malicious participants. Although the problem has gained significant attention, existing defenses have several flaws: the server irrationally chooses malicious clients for aggregation even after they have been detected in previous rounds; the defenses perform ineffectively against sybil attacks or in the heterogeneous data setting. To overcome these issues, we propose MAB-RFL, a new method for robust aggregation in FL. By modelling the client selection as an extended multi-armed bandit (MAB) problem, we propose an adaptive client selection strategy to choose honest clients that are more likely to contribute high-quality updates. We then propose two approaches to identify malicious updates from sybil and non-sybil attacks, based on which rewards for each client selection decision can be accurately evaluated to discourage malicious behaviors. MAB-RFL achieves a satisfying balance between exploration and exploitation on the potential benign clients. Extensive experimental results show that MAB-RFL outperforms existing defenses in three attack scenarios under different percentages of attackers. Copyright © 2022, The Authors. All rights reserved.

Keywords: Data privacy

BiKT: Unleashing the potential of GNNs via Bi-directional Knowledge Transfer

arXiv, 2023

Authors: Zheng, Shuai; Liu, Zhizhe; Zhu, Zhenfeng; Zhang, Xingxing; Li, Jianxin; Zhao, Yao

Affiliations: The Institute of Information Science Beijing Jiaotong University Beijing 100044 China The Beijing Key Laboratory of Advanced Information Science and Network Technology Beijing 100044 China Qiyuan Lab Beijing China The Beijing Advanced Innovation Center for Big Data and Brain Computing School of Computer Science and Engineering Beihang University Beijing 100083 China

Based on the message-passing paradigm, a considerable amount of research has proposed diverse and impressive feature propagation mechanisms to improve the performance of GNNs. However, less focus has been put on feature transformation, another major operation of the message-passing framework. In this paper, we first empirically investigate the performance of the feature transformation operation in several typical GNNs. Unexpectedly, we notice that GNNs do not completely free up the power of the inherent feature transformation operation. Based on this observation, we propose the Bi-directional Knowledge Transfer (BiKT), a plug-and-play approach to unleash the potential of the feature transformation operations without modifying the original architecture. Taking the feature transformation operation as a derived representation learning model that shares parameters with the original GNN, the direct prediction by this model provides a topology-agnostic knowledge feedback that can further instruct the learning of the GNN and the feature transformations therein. On this basis, BiKT not only allows us to acquire knowledge from both the GNN and its derived model but also lets each promote the other by injecting its knowledge into the other. In addition, a theoretical analysis is further provided to demonstrate that BiKT improves the generalization bound of the GNNs from the perspective of domain adaptation. An extensive group of experiments on up to 7 datasets with 5 typical GNNs demonstrates that BiKT brings a 0.5% - 4% performance gain over the original GNN, which means a boosted GNN is obtained. Meanwhile, the derived model also shows a powerful performance to compete with or even surpass the original GNN, enabling us to flexibly apply it independently to some other specific downstream tasks. Copyright © 2023, The Authors. All rights reserved.

Keywords: Message passing

Multi-modal Multi-kernel Graph Learning for Autism Prediction and Biomarker Discovery

arXiv, 2023

Authors: Liu, Jin; Mao, Junbin; Lin, Hanhe; Kuang, Hulin; Pan, Shirui; Wu, Xusheng; Xie, Shan; Liu, Fei; Pan, Yi

Affiliations: Hunan Provincial Key Lab on Bioinformatics School of Computer Science and Engineering Central South University Changsha 410083 China Xinjiang Engineering Research Center of Big Data and Intelligent Software School of Software Xinjiang University Wulumuqi 830000 China Hunan Province Key Lab on Bioinformatics School of Computer Science and Engineering Central South University Changsha 410083 China Faculty of Computer Science and Control Engineering Shenzhen University of Advanced Technology Shenzhen Institute of Advanced Technology Chinese Academy of Sciences Shenzhen 518055 China School of Science and Engineering University of Dundee Dundee DD1 4HN United Kingdom School of Information and Communication Technology Griffith University Gold Coast QLD 4215 Australia Shenzhen Health Development Research and Data Management Center Shenzhen 518109 China

Graph learning-based multi-modal integration and classification is one of the most challenging tasks for disease prediction. To effectively offset the negative impact among modalities in the process of multi-modal integration and heterogeneous information extraction from graphs, we propose a novel method called Multi-modal Multi-Kernel Graph Learning (MMKGL). To solve the problem of negative impact among modalities, we propose a multi-modal graph embedding module to construct a multi-modal graph. Different from conventional methods that manually construct static graphs for all modalities, each modality generates a separate graph by adaptive learning, where a function graph and a supervision graph are introduced for optimization during the multi-graph fusion embedding process. We then propose a multi-kernel graph learning module to extract heterogeneous information from the multi-modal graph. The information in the multi-modal graph at different levels is aggregated by convolutional kernels with different receptive field sizes, followed by generating a cross-kernel discovery tensor for disease prediction. Our method is evaluated on the benchmark Autism Brain Imaging Data Exchange (ABIDE) dataset and outperforms the state-of-the-art methods. In addition, discriminative brain regions associated with autism are identified by our model, providing guidance for the study of autism pathology. The source code will be available at https://***/yutian0315/MMKGL. © 2023, CC BY-NC-ND.

Keywords: Graph embeddings

A deep learning system for predicting time to progression of diabetic retinopathy

Nature Medicine, 2024, vol. 30, no. 2, pp. 358-359

Authors: [Anonymous]

Affiliations: Shanghai Belt and Road International Joint Laboratory for Intelligent Prevention and Treatment of Metabolic Disorders Department of Computer Science and Engineering School of Electronic Information and Electrical Engineering Shanghai Jiao Tong University Department of Endocrinology and Metabolism Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine Shanghai Diabetes Institute Shanghai Clinical Center for Diabetes Shanghai China MOE Key Laboratory of AI School of Electronic Information and Electrical Engineering Shanghai Jiao Tong University Shanghai China Department of Ophthalmology Huadong Sanatorium Wuxi China Department of Ophthalmology Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine Shanghai China Department of Ophthalmology and Visual Sciences The Chinese University of Hong Kong Hong Kong China Singapore Eye Research Institute Singapore National Eye Centre Singapore Singapore Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong China Department of Chemical and Biological Engineering The Hong Kong University of Science and Technology Hong Kong China State Key Laboratory of Ophthalmology Zhongshan Ophthalmic Center Sun Yat-sen University Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science Guangzhou China Department of Ophthalmology Peking Union Medical College Hospital Peking Union Medical College Chinese Academy of Medical Sciences Beijing China Medical Records and Statistics Office Shanghai Sixth People’s Hospital Affiliated to Shanghai Jiao Tong University School of Medicine Shanghai China Department of Geriatrics Tongji Hospital Tongji Medical College Huazhong University of Science and Technology Wuhan China National Engineering Research Center for Big Data Technology and System Services Computing Technology and System Lab Cluster and Grid Computing Lab School of Computer Science and Tech

We developed and validated a deep learning system (termed DeepDR Plus) in a diverse, multiethnic, multi-country dataset to predict personalized risk and time to progression of diabetic retinopathy. We show that DeepDR Plus can be integrated into the clinical workflow to promote individualized intervention strategies for the management of diabetic retinopathy.

Keywords: Diabetes complications; Machine learning; Predictive markers
