
Text Deduplication Algorithm Based on Event Heterogeneous Graph Representation

    Abstract:

    Graph-based text representations perform well in news text deduplication. However, existing methods of this kind do not capture the complete information of a text and ignore the semantic information of the graph, which degrades deduplication performance. To address this, this study proposes a text deduplication algorithm based on event heterogeneous graph representation. The algorithm first represents the global semantic and structural information of a news text as an event heterogeneous graph, and then applies a proposed dual-label graph kernel algorithm to that graph to obtain a deep representation of both its structure and its semantics. Experimental results show that the proposed algorithm improves the F1-score by 10% over existing graph-based text representation deduplication methods, and thus improves the deduplication of news text.
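
    The abstract does not give the details of the dual-label graph kernel, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each graph node carries two labels (a node type and a semantic label) and swaps in a Weisfeiler-Lehman style relabeling with a cosine-normalized feature count. The graph encoding, the function names `wl_features` and `kernel`, and the toy graphs are all hypothetical.

```python
from collections import Counter
from math import sqrt

# ASSUMPTION: a graph is a dict mapping node id -> (type_label, semantic_label, neighbors).
# The two-label scheme and the WL-style relabeling are illustrative, not taken from the paper.

def wl_features(graph, iterations=2):
    """Count dual-label neighborhood patterns over a few relabeling rounds."""
    # The initial label joins the node type (structure) with its semantic label.
    labels = {n: f"{t}:{s}" for n, (t, s, _) in graph.items()}
    features = Counter(labels.values())
    for _ in range(iterations):
        # New label = own label plus the sorted labels of all neighbors.
        labels = {
            n: labels[n] + "|" + ",".join(sorted(labels[m] for m in nbrs))
            for n, (_, _, nbrs) in graph.items()
        }
        features.update(labels.values())
    return features

def kernel(g1, g2, iterations=2):
    """Cosine-normalized dot product of the two pattern-count vectors."""
    f1, f2 = wl_features(g1, iterations), wl_features(g2, iterations)
    dot = sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())
    norm = sqrt(sum(v * v for v in f1.values()) * sum(v * v for v in f2.values()))
    return dot / norm if norm else 0.0

# Toy event graphs: one event node linked to its place and time nodes.
news_a = {
    "e": ("event", "earthquake", ["p", "t"]),
    "p": ("place", "tokyo", ["e"]),
    "t": ("time", "monday", ["e"]),
}
news_b = {
    "e": ("event", "earthquake", ["p", "t"]),
    "p": ("place", "tokyo", ["e"]),
    "t": ("time", "tuesday", ["e"]),
}
news_c = {
    "e": ("event", "election", ["p"]),
    "p": ("place", "paris", ["e"]),
}

if __name__ == "__main__":
    print(kernel(news_a, news_b))  # partial overlap: same event and place, different time
    print(kernel(news_a, news_c))  # no shared labels
```

    Under these assumptions, two news texts would be flagged as duplicates when their kernel value exceeds a chosen threshold; the threshold itself would have to be tuned on labeled data.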

History
  • Online: March 06, 2023