Search Results - Inner Mongolia University Library

arXiv, 2025

Authors: Li, Yixuan; Tian, Yu; Huang, Yipo; Lu, Wei; Wang, Shiqi; Lin, Weisi; Rocha, Anderson. Affiliations: College of Computing, City University of Hong Kong, Hong Kong; School of Computer Science and Engineering, Ministry of Education Key Laboratory of Information Technology, Guangdong Province Key Laboratory of Information Security Technology, Sun Yat-Sen University, Guangzhou 510006, China; College of Computing and Data Science, Nanyang Technological University, Singapore; Artificial Intelligence Lab, Recod.ai, University of Campinas, Campinas 13084-851, Brazil

The rapid and unrestrained advancement of generative artificial intelligence (AI) presents a double-edged sword: while enabling unprecedented creativity, it also facilitates the generation of highly convincing deceptive content, undermining societal trust. As image generation techniques become increasingly sophisticated, detecting synthetic images is no longer just a binary task; it necessitates interpretable, context-aware methodologies that enhance trustworthiness and transparency. However, existing detection models primarily focus on classification, offering limited explanatory insights into image authenticity. In this work, we propose FakeScope, an expert multimodal model (LMM) tailored for AI-generated image forensics, which not only identifies AI-synthetic images with high accuracy but also provides rich, interpretable, and query-driven forensic insights. To this end, we first construct the FakeChain dataset, which contains linguistic authenticity reasoning based on visual trace evidence, developed through a novel human-machine collaborative framework. Building upon it, we further present FakeInstruct, the largest multimodal instruction tuning dataset containing 2 million visual instructions tailored to enhance forensic awareness in LMMs. Leveraging the knowledge of FakeInstruct, FakeScope achieves state-of-the-art performance in both closed-ended and open-ended forensic scenarios. It can distinguish synthetic images with high accuracy while offering coherent and insightful explanations, free-form discussions on fine-grained forgery attributes, and actionable enhancement strategies. Notably, despite being trained exclusively on qualitative hard labels, FakeScope demonstrates remarkable zero-shot quantitative capability on detection, enabled by our proposed token-based probability estimation strategy. Furthermore, FakeScope exhibits strong generalization and in-the-wild ability, ensuring its applicability in real-world scenarios. The data, model, and demo will be publ

Keywords: computer forensics


Identifying Flight Trajectory Patterns via a Density-Aided Hierarchical Clustering Algorithm


5th China Aviation Science and Technology Conference, 2021

Authors: Zhang, Zhuxi; Chen, Yichong; Fang, Jing; Zhou, Xueyang; An, Yuhang; Zhu, Xi. Affiliations: National Engineering Laboratory for Comprehensive Transportation Big Data Application Technology, Beihang University, Beijing 100191, China; Unit 32751, Beijing, China; School of Electronics and Information Engineering, Beihang University, Beijing 100191, China; Aviation Data Communication Corporation, Beijing 100191, China; School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Research Institute for Frontier Science, Beihang University, Beijing 100191, China

ISBN: (print) 9789811674228

Identifying flight trajectory patterns is a vital task that helps controllers better understand the flight operation mechanism, so as to effectively recognize flight anomalies and manage traffic flow, etc. However, flight operation is sensitively affected by the weather and instant airspace regulation, making the flight trajectory pattern too intertwined to be easily distinguished. In this work, we propose a trajectory pattern identification method based on a density-aided hierarchical clustering algorithm. This method employs a weighted trajectory clustering mechanism to keep the minor trajectory patterns from being improperly "swallowed" by other large trajectory patterns. Experimental results show that the proposed method can explicitly distinguish different trajectory patterns and achieve more accurate results than existing approaches. © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Keywords: Clustering algorithms


Group RandAugment: Video Augmentation for Action Recognition


5th International Conference on Data Science and Information Technology, DSIT 2022

Authors: An, Fengmin; Zhang, Bingbing; Wang, Zhenwei; Dong, Wei; Zhang, Jianxin. Affiliations: School of Computer Science and Engineering, Dalian Minzu University, Dalian, China; Institute of Machine Intelligence and Bio-computing, Dalian Minzu University, Dalian, China; SEAC Key Lab of Big Data Applied Technology, Dalian Minzu University, Dalian, China; School of Information and Communication Engineering, Dalian University of Technology, Dalian, China

ISBN: (digital) 9781665498685

ISBN: (print) 9781665498685

Data augmentation, a critical strategy in deep learning, improves sample diversity for network training, leading to a clear improvement in model generalization ability. However, automatic data augmentation, while thriving in image tasks, has attracted too little attention in video recognition tasks. Therefore, this work explores a novel group random augmentation (GRA) that automatically augments video data for recognition. GRA first collects 21 augmentation transformations to enrich the sample diversity, and then divides these transformations into four groups (i.e., pixel-transform, rigid-transform, erasing-transform, environment-transform) to reduce the excessive regularization caused by similar augmentation. To reduce the computational cost of finding the optimal setting, GRA adopts the combination form to select the optimal situation. For frames in the same video clip, GRA uses synchronous GRA. Additionally, the proposed GRA can be integrated into any existing video framework. To prove the effectiveness of GRA, we conduct experiments on two commonly used video action recognition benchmarks (Something-Something V1 & V2) and three typical frameworks (TSM, GSM, and TEA), whose results demonstrate that GRA can improve performance in all cases without adding computational cost. © 2022 IEEE.

Keywords: Deep learning


Confidence-Regulated Generative Diffusion Models for Reliable AI Agent Migration in Vehicular Metaverses


arXiv, 2025

Authors: Kang, Yingkai; Kang, Jiawen; Wen, Jinbo; Zhang, Tao; Yang, Zhaohui; Niyato, Dusit; Zhang, Yan. Affiliations: School of Automation, Guangdong University of Technology, Guangzhou 510006, China; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; School of Cyberspace Science and Technology, Beijing Jiaotong University, Beijing 100044, China; College of Information Science and Electronic Engineering, Zhejiang Provincial Key Lab of Information Processing, Communication and Networking, Zhejiang University, Hangzhou 310007, China; College of Computing and Data Science, Nanyang Technological University, Singapore; Department of Informatics, University of Oslo, the Simula Research Laboratory, Norway

Vehicular metaverses are an emerging paradigm that merges intelligent transportation systems with virtual spaces, leveraging advanced digital twin and Artificial Intelligence (AI) technologies to seamlessly integrate vehicles, users, and digital environments. In this paradigm, vehicular AI agents are endowed with environment perception, decision-making, and action execution capabilities, enabling real-time processing and analysis of multi-modal data to provide users with customized interactive services. Since vehicular AI agents require substantial resources for real-time decision-making, given vehicle mobility and network dynamics conditions, the AI agents are deployed in roadside units (RSUs) with sufficient resources and dynamically migrated among them. However, AI agent migration requires frequent data exchanges, which may expose vehicular metaverses to potential cyber attacks. To this end, we propose a reliable vehicular AI agent migration framework, achieving reliable dynamic migration and efficient resource scheduling through cooperation between vehicles and RSUs. Additionally, we design a trust evaluation model based on the theory of planned behavior to dynamically quantify the reputation of RSUs, thereby better accommodating the personalized trust preferences of users. We then model the vehicular AI agent migration process as a partially observable Markov decision process and develop a Confidence-regulated Generative Diffusion Model (CGDM) to efficiently generate AI agent migration decisions. Numerical results demonstrate that the CGDM algorithm significantly outperforms baseline methods in reducing system latency and enhancing robustness against cyber attacks. Copyright © 2025, The Authors. All rights reserved.

Keywords: Markov processes


Shape-aware contrastive deep supervision for esophageal tumor segmentation from CT scans


2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023

Authors: Jin, Qiangguo; Cui, Hui; Sun, Changming; Huang, Jiapeng; Xuan, Ping; Xu, Yiyue; Wang, Linlin; Cao, Leilei; Wei, Leyi; Su, Ran. Affiliations: Northwestern Polytechnical University, School of Software, Shaanxi, China; Yangtze River Delta Research Institute of Northwestern Polytechnical University, Taicang, China; La Trobe University, Department of Computer Science and Information Technology, Melbourne, Australia; CSIRO Data61, Sydney, Australia; Shantou University, School of Engineering, Department of Computer Science, Guangdong, China; Shandong First Medical University, Shandong Academy of Medical Sciences, Shandong Cancer Hospital and Institute, Department of Radiation Oncology, Shandong, China; Zhejiang University, Innovation Center of Yangtze River Delta, Zhejiang, China; Shandong University, School of Software, Shandong, China; Tianjin University, School of Computer Software, College of Intelligence and Computing, Tianjin, China

ISBN: (print) 9798350337488

Accurate tumor segmentation is crucial for esophageal cancer radiotherapy treatment planning. The low contrast among the esophagus, tumors, and surrounding tissues, and irregular tumor shapes limit the performance of automatic segmentation methods. In this paper, we aim to exploit the irregular shapes of tumors to facilitate accurate segmentation. We propose a simple and pluggable shape-aware contrastive deep supervision network (SCDSNet) with shape-aware regularization and voxel-to-voxel contrastive deep supervision. Specifically, the shape-aware regularization with an uncertainty minimization strategy encourages the precise predictions of an additional shape-aware head. The voxel-to-voxel contrastive deep supervision enhances the multi-scale shape-tumor contrast for better voxel-to-voxel prediction of shapes. The proposed method is simple and highly pluggable, which can easily be extended to other frameworks. Further, we establish a large in-house dataset on esophageal cancer to validate the effectiveness of our proposed method. The quantitative and qualitative experimental results demonstrate the effectiveness of SCDSNet on the esophageal cancer dataset. © 2023 IEEE.

Keywords: Deep supervision; Esophagus tumor segmentation; Shape-aware regularization; Voxel-to-voxel contrastive learning


Improving Fast Adversarial Training Paradigm: An Example Taxonomy Perspective


arXiv, 2024

Authors: Gui, Jie; Jiang, Chengze; Dong, Minjing; Tong, Kun; Shi, Xinli; Tang, Yuan Yan; Tao, Dacheng. Affiliations: The School of Cyber Science and Engineering, Southeast University, Nanjing 210000, China; Purple Mountain Laboratories, Nanjing 210000, China; The Department of Computer Science, City University of Hong Kong, Hong Kong; The Department of Computer and Information Science, University of Macau, 999078, China; The College of Computing & Data Science, Nanyang Technological University, #32 Block N4 #02a-014, 50 Nanyang Avenue, Singapore 639798, Singapore

While adversarial training is an effective defense method against adversarial attacks, it notably increases the training cost. To this end, fast adversarial training (FAT) is presented for efficient training and has become a hot research topic. However, FAT suffers from catastrophic overfitting, which leads to a performance drop compared with multi-step adversarial training. Moreover, the cause of catastrophic overfitting remains unclear and underexplored. In this paper, we present an example taxonomy in FAT, which identifies that catastrophic overfitting is caused by the imbalance between the inner and outer optimization in FAT. Furthermore, we investigate the impact of varying degrees of training loss, revealing a correlation between training loss and catastrophic overfitting. Based on these observations, we redesign the loss function in FAT with the proposed dynamic label relaxation to concentrate the loss range and reduce the impact of misclassified examples. Meanwhile, we introduce batch momentum initialization to enhance the diversity to prevent catastrophic overfitting in an efficient manner. Furthermore, we also propose Catastrophic Overfitting aware Loss Adaptation (COLA), which employs a separate training strategy for examples based on their loss degree. Our proposed method, named example taxonomy aware FAT (ETA), establishes an improved paradigm for FAT. Experimental results demonstrate that our ETA achieves state-of-the-art performance. Comprehensive experiments on four standard datasets demonstrate the competitiveness of our proposed method. Copyright © 2024, The Authors. All rights reserved.

Keywords: Generative adversarial networks


Chain-of-Thought in Neural Code Generation: From and For Lightweight Language Models


arXiv, 2023

Authors: Yang, Guang; Zhou, Yu; Chen, Xiang; Zhang, Xiangyu; Zhuo, Terry Yue; Chen, Taolue. Affiliations: The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; The School of Information Science and Technology, Nantong University, China; Monash University, CSIRO’s Data61, Australia; School of Computing and Mathematical Sciences, Birkbeck, University of London, United Kingdom

Large Language Models (LLMs) have demonstrated remarkable potential in code generation. The integration of Chain of Thought (CoT) reasoning can further boost their performance. However, current CoT methods often require manual writing or LLMs with over 100 billion parameters to generate, impeding their applicability in resource-constrained scenarios. In this study, we investigate lightweight Language Models (LMs), which are defined to have fewer than 10 billion parameters. Empirically, we find that most LMs cannot generate high-quality CoTs when prompted by the few-shot method, but can take advantage of high-quality CoTs generated elsewhere to improve their performance in code generation. Based on these findings, we design a novel approach COTTON which can leverage LMs to automatically generate CoTs for code generation. We synthesize new datasets and conduct extensive experiments on various benchmarks. The results show that the CoTs generated by COTTON outperform the baselines in terms of automated and human evaluation metrics. In particular, the CoTs generated by COTTON boost various LMs to achieve higher performance gains than those generated by LLMs such as ChatGLM (130B), and are competitive with those generated by Gemini and gpt-3.5-turbo. The results also reveal that COTTON not only improves the performance of LMs, but also enhances the performance of LLMs. Our study showcases the potential of LMs in software engineering applications. © 2023, CC BY.

Keywords: Chains


A Fusion Neural Network Incorporating Attention for Sensor-Based Human Activity Recognition


3rd International Conference on Computer Vision, Image and Deep Learning and International Conference on Computer Engineering and Applications, CVIDL and ICCEA 2022

Authors: Lu, Limeng; Zhang, Chuanlin; Cao, Kai; Deng, Dao. Affiliations: Institute of Chinese Ethnic Information Technology, Northwest Minzu University, Lanzhou, China; Key Laboratory of China's Ethnic Languages and Information Technology, Ministry of Education, Lanzhou, China; Northwest Minzu University, School of Mathematics and Computer Science, Lanzhou, China; Northwest Minzu University, Key Laboratory of Streaming Data Computing Technologies and Application, Lanzhou, China

ISBN: (print) 9781665459112

Even though RNN, LSTM, and other networks are used to extract dependencies in time series, sensor-based human activity recognition (HAR) still faces some difficulties, and the ability of deep learning (DL) networks to extract features still needs to be improved. We propose a fusion neural network in which an optimized small Convolutional Block Attention Module (MP-CBAM), suited to HAR tasks, is used to extract sample features. The MP-CBAM is added to a two-branch Convolutional Neural Network (CNN) with different convolution kernel sizes, and the fused features are passed to a GRU to capture temporal dependence. Then the softmax function is used for classification. We validate on the benchmark datasets UCI-HAR and WISDM, obtaining accuracies of 97.15% and 9S.9S%, respectively, demonstrating the adaptation of our framework to both single-sensor and multi-sensor devices. © 2022 IEEE.

Keywords: Convolution


BiKT: Unleashing the potential of GNNs via Bi-directional Knowledge Transfer


arXiv, 2023

Authors: Zheng, Shuai; Liu, Zhizhe; Zhu, Zhenfeng; Zhang, Xingxing; Li, Jianxin; Zhao, Yao. Affiliations: The Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China; The Beijing Key Laboratory of Advanced Information Science and Network Technology, Beijing 100044, China; Qiyuan Lab, Beijing, China; The Beijing Advanced Innovation Center for Big Data and Brain Computing, School of Computer Science and Engineering, Beihang University, Beijing 100083, China

Based on the message-passing paradigm, a considerable amount of research has proposed diverse and impressive feature propagation mechanisms to improve the performance of GNNs. However, less focus has been put on feature transformation, another major operation of the message-passing framework. In this paper, we first empirically investigate the performance of the feature transformation operation in several typical GNNs. Unexpectedly, we notice that GNNs do not completely free up the power of the inherent feature transformation operation. Based on this observation, we propose the Bi-directional Knowledge Transfer (BiKT), a plug-and-play approach to unleash the potential of the feature transformation operations without modifying the original architecture. Taking the feature transformation operation as a derived representation learning model that shares parameters with the original GNN, the direct prediction by this model provides topology-agnostic knowledge feedback that can further instruct the learning of the GNN and the feature transformations therein. On this basis, BiKT not only allows us to acquire knowledge from both the GNN and its derived model but also lets each promote the other by injecting its knowledge into the other. In addition, a theoretical analysis is further provided to demonstrate that BiKT improves the generalization bound of the GNNs from the perspective of domain adaptation. An extensive group of experiments on up to 7 datasets with 5 typical GNNs demonstrates that BiKT brings a 0.5% - 4% performance gain over the original GNN, which means a boosted GNN is obtained. Meanwhile, the derived model also shows a powerful performance to compete with or even surpass the original GNN, enabling us to flexibly apply it independently to some other specific downstream tasks. Copyright © 2023, The Authors. All rights reserved.

Keywords: Message passing


RAGRAPH: a general retrieval-augmented graph learning framework


Proceedings of the 38th International Conference on Neural Information Processing Systems

Authors: Xinke Jiang; Rihong Qiu; Yongxin Xu; Wentao Zhang; Yichen Zhu; Ruizhe Zhang; Yuchen Fang; Xu Chu; Junfeng Zhao; Yasha Wang. Affiliations: Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China; University of Electronic Science and Technology of China; Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China and Center on Frontiers of Computing Studies, Peking University, Beijing, China and Peking University Information Technology Institute, Tianjin Binhai, China; Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China and Big Data Technology Research Center, Nanhu Laboratory, Jiaxing, China; Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China and Peking University Information Technology Institute, Tianjin Binhai, China

ISBN: (print) 9798331314385

Graph Neural Networks (GNNs) have become essential in interpreting relational data across various domains, yet they often struggle to generalize to unseen graph data that differs markedly from training instances. In this paper, we introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGRAPH), which brings external graph data into the general graph foundation model to improve model generalization on unseen scenarios. On top of our framework is a toy graph vector library that we established, which captures key attributes, such as features and task-specific label information. During inference, RAGRAPH retrieves similar toy graphs based on key similarities in downstream tasks, integrating the retrieved data to enrich the learning context via the message-passing prompting mechanism. Our extensive experimental evaluations demonstrate that RAGRAPH significantly outperforms state-of-the-art graph learning methods in multiple tasks such as node classification, link prediction, and graph classification across both dynamic and static datasets. Furthermore, extensive testing confirms that RAGRAPH consistently maintains high performance without the need for task-specific fine-tuning, highlighting its adaptability, robustness, and broad applicability.
