Neighborhood Adaptive Attention Based Cross-domain Fusion Network for Speech Enhancement

    Abstract:

    Deep learning (DL) based speech enhancement methods fall into two groups, time-domain methods and frequency-domain methods, each with its own advantages. To exploit the advantages of both, a cross-domain speech enhancement model based on a neighborhood adaptive attention mechanism is proposed. The model enhances the input waveform and spectrum simultaneously, and the final result is obtained by cross-domain fusion of the time-domain and frequency-domain enhancement results. To exploit the complementary information in the two enhanced results, an information communication module is proposed to exchange information between them. To improve the feature extraction ability of the time-domain and frequency-domain enhancement models, and to make full use of the signal characteristics of the two domains, a neighborhood adaptive attention module is proposed. This module adaptively aggregates local self-attention computed with different neighborhood sizes according to the input information, and thereby models stationary features at different scales. Experimental results show that adding the neighborhood adaptive attention module and the cross-domain information exchange and fusion module allows the complementary characteristics of waveform and spectrum to be used effectively and further improves enhancement performance.
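
    The idea of aggregating local self-attention over several neighborhood sizes, with input-dependent weights, can be illustrated with a small sketch. This is not the model from the paper: the abstract gives no architectural details, so everything here is an assumption made for illustration. The function names, the identity query/key/value projections, the mean-pooled gate, and the `radii` and `gate` parameters are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(x, radius):
    """Self-attention in which frame t attends only to frames within `radius` of t.

    x is a list of T feature vectors of dimension d. Queries, keys and values
    are the inputs themselves (identity projections, an assumption made to
    keep the sketch short).
    """
    T, d = len(x), len(x[0])
    out = []
    for t in range(T):
        lo, hi = max(0, t - radius), min(T, t + radius + 1)
        # Scaled dot-product scores against the neighbors only.
        scores = [
            sum(a * b for a, b in zip(x[t], x[s])) / math.sqrt(d)
            for s in range(lo, hi)
        ]
        w = softmax(scores)
        out.append(
            [sum(w[i] * x[lo + i][j] for i in range(hi - lo)) for j in range(d)]
        )
    return out


def neighborhood_adaptive_attention(x, radii, gate):
    """Mix local attention branches with weights computed from the input.

    gate holds one weight vector of dimension d per radius (hypothetical
    parameters). The input is mean-pooled over time, scored against each gate
    vector, and a softmax turns the scores into branch weights.
    """
    T, d = len(x), len(x[0])
    pooled = [sum(frame[j] for frame in x) / T for j in range(d)]
    logits = [sum(g * p for g, p in zip(row, pooled)) for row in gate]
    weights = softmax(logits)
    branches = [local_self_attention(x, r) for r in radii]
    out = [
        [
            sum(weights[k] * branches[k][t][j] for k in range(len(radii)))
            for j in range(d)
        ]
        for t in range(T)
    ]
    return out, weights
```

    A call such as `neighborhood_adaptive_attention(frames, radii=[1, 2], gate=[[0.0, 0.0], [0.0, 0.0]])` returns the mixed features together with the branch weights. In a trained model the gate and the projections would be learned, and the time-domain and frequency-domain branches would each use a module of this kind.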

History
  • Online: January 02, 2024