Improving multi-talker binaural DOA estimation by combining periodicity and spatial features in convolutional neural networks

Authors: Varzandeh, Reza; Doclo, Simon; Hohmann, Volker

Affiliations: Carl von Ossietzky Univ Oldenburg, Dept Med Phys & Acoust, D-26111 Oldenburg, Germany; Carl von Ossietzky Univ Oldenburg, Cluster Excellence Hearing4all, D-26111 Oldenburg, Germany

Publication: EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING (Eurasip J. Audio Speech Music Process.)

Year, volume, issue: 2025, Vol. 2025, No. 1

Pages: 1-18

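Subject classification: 0808 [Engineering - Electrical Engineering]; 0809 [Engineering - Electronic Science and Technology (engineering or science degree may be conferred)]; 08 [Engineering]; 0702 [Science - Physics]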

Funding: Deutsche Forschungsgemeinschaft

Keywords: Convolutional neural networks; Spatial feature; Periodicity feature; Binaural DOA estimation; Multiple talkers; Feature reduction; Hearing devices

Abstract: Deep neural network-based direction of arrival (DOA) estimation systems often rely on spatial features as input to learn a mapping for estimating the DOA of multiple talkers. Aiming to improve the accuracy of multi-talker DOA estimation for binaural hearing aids with a known number of active talkers, we investigate the usage of periodicity features as a footprint of speech signals in combination with spatial features as input to a convolutional neural network (CNN). In particular, we propose a multi-talker DOA estimation system employing a two-stage CNN architecture that utilizes cross-power spectrum (CPS) phase as spatial features and an auditory-inspired periodicity feature called periodicity degree (PD) as spectral features. The two-stage CNN incorporates a PD feature reduction stage prior to the joint processing of PD and CPS phase features. We investigate different design choices for the CNN architecture, including varying temporal reduction strategies and spectro-temporal filtering approaches. The performance of the proposed system is evaluated in static source scenarios with 2-3 talkers in two reverberant environments under varying signal-to-noise ratios using recorded background noises. To evaluate the benefit of combining PD features with CPS phase features, we consider baseline systems that utilize either only CPS phase features or combine CPS phase and magnitude spectrogram features. Results show that combining PD and CPS phase features in the proposed system consistently improves DOA estimation accuracy across all conditions, outperforming the two baseline systems. Additionally, the PD feature reduction stage in the proposed system improves DOA estimation accuracy while significantly reducing computational complexity compared to a baseline system without this stage, demonstrating its effectiveness for multi-talker DOA estimation.
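
Illustrative note: the abstract names the cross-power spectrum (CPS) phase as the spatial input feature. The sketch below shows one common way such a feature can be computed from a left/right signal pair, as the phase of the product of one channel's short-time spectrum with the complex conjugate of the other's. The function name `cps_phase`, the frame length, the hop size, and the Hann window are assumptions chosen for illustration. They are not taken from the paper, and the authors' actual feature extraction may differ.

```python
import numpy as np

def cps_phase(left, right, frame_len=512, hop=256):
    """Per-frame phase of the cross-power spectrum of two channels.

    Hypothetical helper: frame_len, hop and the Hann window are
    illustrative assumptions, not the paper's settings.
    Returns an array of shape (num_frames, frame_len // 2 + 1)
    with values in [-pi, pi].
    """
    window = np.hanning(frame_len)
    num_frames = 1 + (len(left) - frame_len) // hop
    phases = np.empty((num_frames, frame_len // 2 + 1))
    for i in range(num_frames):
        seg = slice(i * hop, i * hop + frame_len)
        spec_left = np.fft.rfft(window * left[seg])
        spec_right = np.fft.rfft(window * right[seg])
        # Cross-power spectrum: left spectrum times conjugate of right.
        phases[i] = np.angle(spec_left * np.conj(spec_right))
    return phases

# Usage: a tone at FFT bin 8, with the right channel delayed by 4 samples.
n = np.arange(4096)
left = np.sin(2 * np.pi * 8 * n / 512)
right = np.sin(2 * np.pi * 8 * (n - 4) / 512)
features = cps_phase(left, right)
```

For a pure inter-channel delay of d samples, the phase at bin k is approximately 2πkd/N for frame length N, so the phase encodes the interaural time difference that a DOA estimator can exploit.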
