Conformal Anomaly Detection Method for Unstable Logs

    Abstract:

    System logs are a primary data source for system anomaly detection. Existing log anomaly detection methods mainly build detection models from log event data extracted from historical logs; that is, they assume that the distribution of log data remains stable over time. In practice, however, log data often contains events or sequences that have not appeared before. This instability has two sources: 1) concept drift in the logs; 2) noise introduced during log processing. To mitigate instability in logs, this paper designs EBCAD (Ensemble-Based Conformal Anomaly Detection), an anomaly detection model that uses confidence to coordinate multiple algorithms. First, the p-value statistic is used to measure the non-conformity between logs, and several suitable ensemble algorithms are selected as non-conformity measure functions to compute non-conformity scores for collaborative detection. Then, a confidence-based update mechanism is designed to mitigate log instability: the non-conformity scores of new logs are added to the existing score set, updating the experience used for log anomaly detection. Finally, whether an unstable log is anomalous is decided by comparing the confidence obtained from collaborative detection with a preset confidence level. Experimental results show that on the HDFS log data set, when the unstable-data injection rate increases from 5% to 20%, the F1-score of EBCAD decreases only from 0.996 to 0.985; on the BGL_100K log data set, over the same increase in injection rate, the F1-score of EBCAD decreases only from 0.71 to 0.613. These results indicate that EBCAD can effectively detect anomalies in unstable logs.
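
    The three steps described in the abstract (p-values from non-conformity scores, collaboration between several measures, and an update of the score set) can be illustrated with a minimal Python sketch. This is not the authors' EBCAD implementation, which the abstract does not specify. Everything named here is an assumption made for illustration: the class `EnsembleConformalDetector`, the two centroid-distance measures from `make_measures`, the averaging of p-values across measures, the rule that only logs judged normal are added to the score sets, and the significance level of 0.1. A log is flagged when its combined p-value falls below the significance level, which is equivalent to its confidence `1 - p` exceeding the confidence level `1 - significance`.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Measure = Callable[[Vector], float]


def conformal_p_value(calibration_scores: Sequence[float], score: float) -> float:
    """Share of scores (calibration set plus the new one) that are >= the new score."""
    at_least = sum(1 for s in calibration_scores if s >= score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def make_measures(rows: Sequence[Vector]) -> List[Measure]:
    """Two hypothetical non-conformity measures: L1 and L-infinity distance
    to the centroid of the calibration vectors (an assumption, not from the paper)."""
    center = [sum(col) / len(rows) for col in zip(*rows)]

    def l1(x: Vector) -> float:
        return sum(abs(a - b) for a, b in zip(x, center))

    def linf(x: Vector) -> float:
        return max(abs(a - b) for a, b in zip(x, center))

    return [l1, linf]


class EnsembleConformalDetector:
    """Hypothetical ensemble conformal detector with a growing score set."""

    def __init__(self, measures: Sequence[Measure],
                 calibration: Sequence[Vector], significance: float = 0.1) -> None:
        self.measures = list(measures)
        self.significance = significance
        # One set of calibration scores per non-conformity measure.
        self.score_sets = [[m(x) for x in calibration] for m in self.measures]

    def detect(self, x: Vector, update: bool = True) -> Tuple[bool, float]:
        scores = [m(x) for m in self.measures]
        p_values = [conformal_p_value(past, s)
                    for past, s in zip(self.score_sets, scores)]
        # Assumption: the measures collaborate by averaging their p-values.
        p = sum(p_values) / len(p_values)
        is_anomaly = p < self.significance
        # Assumption: only logs judged normal extend the score sets,
        # so that benign drift gradually becomes part of the reference.
        if update and not is_anomaly:
            for past, s in zip(self.score_sets, scores):
                past.append(s)
        return is_anomaly, p


# Toy event-count vectors standing in for parsed log sequences.
calibration = [(5 + i % 3, 5 - i % 2, 0) for i in range(19)]
detector = EnsembleConformalDetector(make_measures(calibration), calibration)
print(detector.detect((6, 5, 0)))  # a sequence resembling the calibration data
print(detector.detect((5, 5, 9)))  # a sequence with an unseen burst of one event
```

    In this sketch the first sequence matches the calibration data and is absorbed into the score sets, while the second scores higher than every stored score under both measures and is flagged. A real system would replace the distance measures with trained models and choose the update rule and threshold empirically.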

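Cite this article

Liu Chunbo, Liang Mengmeng, Hou Jingwen, Gu Zhaojun, Wang Zhi. Conformal Anomaly Detection Method for Unstable Logs[J]. Journal of Hunan University (Natural Sciences), 2022, 49(4): 89-99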

History
  • Online publication date: 2022-05-13