Author Affiliations: State Key Laboratory of Media Convergence and Communication, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; Institute of Automation, Chinese Academy of Sciences, Beijing, China; Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China; Department of Automation, Tsinghua University, Beijing, China; School of Data Science and Intelligent Media, Communication University of China, Beijing, China; School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing, China
Publication: IEEE Transactions on Audio, Speech and Language Processing
Year/Volume: 2025, Vol. 33
Pages: 386-400
Funding: National Natural Science Foundation of China
Keywords: Deepfakes; Codecs; Vocoders; Acoustics; Training; Codes; Feature extraction; Vector quantization; Speech enhancement; Speech coding
Abstract: With the proliferation of deepfake audio generated by Audio Language Models (ALMs), there is an urgent need for generalized detection methods. ALM-based deepfake audio is currently widespread, highly deceptive, and versatile in type, posing a significant challenge to audio deepfake detection (ADD) models trained solely on vocoded data. To detect ALM-based deepfake audio effectively, we focus on the mechanism underlying ALM-based audio generation: the conversion from a neural codec to a waveform. We first constructed the Codecfake dataset, an open-source, large-scale collection comprising over 1 million audio samples in both English and Chinese, focused on ALM-based audio detection. As a countermeasure, to achieve universal detection of deepfake audio and to tackle the domain ascent bias issue of the original sharpness-aware minimization (SAM), we propose the CSAM strategy to learn domain-balanced and generalized minima. In our experiments, we first demonstrate that ADD models trained with the Codecfake dataset can effectively detect ALM-based audio. Furthermore, our proposed generalization countermeasure yields the lowest average equal error rate (EER) of 0.616% across all test conditions compared with baseline models. The dataset and associated code are available online.
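The countermeasure described in the abstract builds on sharpness-aware minimization. For orientation, the sketch below shows one generic SAM update step in PyTorch: ascend to the locally worst-case weights w + epsilon, take the gradient there, then descend from the original weights. This is the standard SAM procedure only; the names (sam_step, model, loss_fn, rho) are illustrative and not taken from the paper's released code, and CSAM's domain-balancing modification of the ascent step is described in the paper itself.

```python
# Minimal sketch of one sharpness-aware minimization (SAM) step,
# the optimizer the proposed CSAM strategy builds on (assumption:
# standard SAM; variable names are illustrative).
import torch

def sam_step(model, loss_fn, batch, base_optimizer, rho=0.05):
    """One SAM update: perturb weights toward the worst-case direction,
    then apply the base optimizer using the gradient at that point."""
    inputs, labels = batch

    # First pass: gradient g at the current weights w.
    loss = loss_fn(model(inputs), labels)
    loss.backward()

    # Ascent: epsilon = rho * g / ||g||, move to w + epsilon.
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
        eps = []
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)            # w <- w + epsilon
            eps.append(e)

    # Second pass: gradient at the perturbed weights w + epsilon.
    model.zero_grad()
    loss_fn(model(inputs), labels).backward()

    # Descend: restore w, then step with the SAM gradient.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)        # undo the perturbation
    base_optimizer.step()
    model.zero_grad()
    return loss.item()
```

The reported metric, equal error rate (EER), is the operating point at which the false acceptance rate equals the false rejection rate, so the 0.616% average EER means both error types are balanced at roughly 0.6% across all test conditions.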