Neighborhood Adaptive Attention Based Cross-domain Fusion Network for Speech Enhancement
    Abstract:

    Deep learning based speech enhancement methods fall into two categories, time-domain methods and frequency-domain methods, each with its own strengths. To combine the strengths of both, a cross-domain fusion speech enhancement model based on neighborhood adaptive attention is proposed. The model enhances the input waveform and the input spectrum simultaneously, then fuses the time-domain and frequency-domain results across domains to produce the final enhanced output. To exploit the complementary information in the two domains' results, an information communication module is proposed through which the time-domain and frequency-domain results improve each other. To strengthen the feature extraction of the time-domain and frequency-domain enhancement models and make full use of the signal characteristics of each domain, a neighborhood adaptive attention module is further proposed. Conditioned on the input, this module adaptively selects and aggregates local self-attention modules with different neighborhood window sizes, and thereby efficiently exploits stationary features at different scales. Experimental results show that the proposed neighborhood adaptive attention module, together with the cross-domain information communication and fusion modules, effectively exploits the complementarity of waveform and spectrum and further improves enhancement performance.
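
    The idea of adaptively aggregating local self-attention over several neighborhood sizes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function names, the use of unprojected features as queries, keys and values, the utterance-level softmax gate, and the parameter `w_gate` are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; masked entries (-inf) become exactly 0.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_self_attention(x, radius):
    """Self-attention where frame i only attends to frames j with |i - j| <= radius.

    x: (T, D) feature sequence. Queries, keys and values are x itself
    (an assumed simplification; a real model would use learned projections).
    """
    T, D = x.shape
    scores = x @ x.T / np.sqrt(D)
    idx = np.arange(T)
    mask = np.abs(idx[:, None] - idx[None, :]) <= radius
    scores = np.where(mask, scores, -np.inf)
    return softmax(scores, axis=-1) @ x

def neighborhood_adaptive_attention(x, radii, w_gate):
    """Blend several local-attention branches with input-dependent weights.

    w_gate: (D, len(radii)) hypothetical gating projection. The gate is
    computed from the time-averaged input, so the mix of neighborhood
    sizes depends on the input signal.
    """
    branches = np.stack([local_self_attention(x, r) for r in radii])  # (K, T, D)
    gate = softmax(x.mean(axis=0) @ w_gate)                           # (K,)
    return np.tensordot(gate, branches, axes=1), gate                 # (T, D), (K,)

# Toy usage with random features standing in for waveform or spectrum frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))
w_gate = rng.standard_normal((8, 3))
y, gate = neighborhood_adaptive_attention(x, radii=(1, 2, 4), w_gate=w_gate)
```

    In this sketch a small radius captures short-term structure and a large radius captures longer stationary segments, while the gate decides how much each scale contributes. A per-frame gate would be an equally plausible reading of the abstract.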

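Cite this article

YUE Huanjing, DUO Wenxin, YANG Jingyu. Neighborhood Adaptive Attention Based Cross-domain Fusion Network for Speech Enhancement[J]. Journal of Hunan University (Natural Sciences), 2023, (12): 59-68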

History
  • Published online: 2024-01-02