+Advanced Search

Parallel Outlier Detection Algorithm in Heterogeneous Distributed Environment
Author:
Affiliation:

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
    Abstract:

    Outlier detection is one of the hotspots in the field of data mining. The main purpose is to identify the abnormal but valuable data points in the data set. With the expansion of data scale,the efficiency of processing massive data is reduced,and then a distributed algorithm is introduced. At present,most of the existing distributed algorithms are used to solve the homogeneous distributed processing environment. However,in practical applications,due to the differences in processor configuration involved in distributed computing,the existing distributed outlier detection algorithms cannot be well applied to heterogeneous distributed environments. In view of the above problems,this paper proposes an outlier detection algorithm for heterogeneous distributed environments. Firstly,a grid-based dynamic data partitioning (GDDP) method is proposed,which makes full use of the computing resources of each processor and divides the data according to the spatial location information of the data points,which can effectively reduce the network communication. Secondly,based on the GDDP algorithm,a parallel GDDP-based Outlier Detection Algorithm (GODA) is proposed. The algorithm consists of two stages:in each processor,filtering according to the order of the data points in the index and obtaining the outlier candidate set by two scans;determining the candidate outliers requiring network communication and using low network overhead leads to global outliers. Finally,the effectiveness of the proposed GDDP and GODA algorithms is verified by a large number of experiments.

    Reference
    Related
    Cited by
Article Metrics
  • PDF:
  • HTML:
  • Abstract:
  • Cited by:
Get Citation
History
  • Received:
  • Revised:
  • Adopted:
  • Online: October 26,2020
  • Published: