A Partial Comparative Mixture Model for Multi-collections Documents

    Abstract:

    State-of-the-art cross-collection topic models (CTMs) suffer from a major flaw: they can only analyze the topics that are common to all document collections. We introduce PCCMix (Partial Comparative Cross-Collections Mixture), a mixture model for multi-collection CTM that detects both common topics and collection-specific topics. PCCMix separates the two types of topics by estimating a probability distribution from the whole dataset in advance, and then trains the model with the Expectation-Maximization (EM) algorithm. Experimental results show that PCCMix can analyze both the topics common to the collections and the collection-specific topics, and that it models the document collections more precisely than the two main CTM models.
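
    The abstract does not give the PCCMix equations, so the following is only a minimal sketch of the general idea of fitting a mixture with EM, not the authors' model. It assumes a deliberately simplified setting: each word in a collection is drawn either from one shared "common" word distribution (with a fixed probability `lam`) or from that collection's own "specific" distribution. The function name `em_common_specific`, the parameter `lam`, and the toy counts are all hypothetical and chosen for illustration.

```python
import math
import random


def em_common_specific(counts, lam=0.5, iters=50, seed=0):
    """Fit a two-component word mixture with EM (illustrative only).

    counts: list of per-collection word-count lists over a shared vocabulary.
    lam:    assumed fixed probability that a word comes from the common
            distribution rather than the collection-specific one.
    Returns (common, specific, history), where history holds the
    log-likelihood evaluated at the start of each iteration.
    """
    rng = random.Random(seed)
    n_coll, n_vocab = len(counts), len(counts[0])
    eps = 1e-12  # tiny pseudo-count that keeps every probability positive

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Random positive initialization of all word distributions.
    common = normalize([rng.random() + 0.1 for _ in range(n_vocab)])
    specific = [
        normalize([rng.random() + 0.1 for _ in range(n_vocab)])
        for _ in range(n_coll)
    ]

    history = []
    for _ in range(iters):
        new_common = [eps] * n_vocab
        new_specific = [[eps] * n_vocab for _ in range(n_coll)]
        log_lik = 0.0
        for c in range(n_coll):
            for w in range(n_vocab):
                p_common = lam * common[w]
                p_specific = (1.0 - lam) * specific[c][w]
                mix = p_common + p_specific
                log_lik += counts[c][w] * math.log(mix)
                # E-step: posterior that this word came from the common part.
                resp = p_common / mix
                # Accumulate expected counts for the M-step.
                new_common[w] += counts[c][w] * resp
                new_specific[c][w] += counts[c][w] * (1.0 - resp)
        history.append(log_lik)
        # M-step: re-estimate distributions from the expected counts.
        common = normalize(new_common)
        specific = [normalize(row) for row in new_specific]
    return common, specific, history


# Toy data: two collections over a four-word vocabulary. Words 0 and 1 are
# frequent in both; word 2 occurs only in the first, word 3 only in the second.
toy_counts = [[10, 10, 5, 0], [10, 10, 0, 5]]
common, specific, history = em_common_specific(toy_counts, lam=0.5, iters=30)
print("common:", [round(p, 3) for p in common])
for index, row in enumerate(specific):
    print("collection", index, [round(p, 3) for p in row])
```

    In this simplified setting, the common distribution and each collection-specific distribution stay normalized after every M-step, and the log-likelihood values in `history` should not decrease from one iteration to the next, which is the usual sanity check for an EM implementation.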
