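Intrusion detection systems protect computer systems by raising alerts that may have been caused by malicious attacks. Machine learning methods have been introduced into intrusion detection so that detection performance can be improved automatically from historical data. However, obtaining high-quality training data often requires heavy manual labor or an expensive monitoring process. At the same time, different types of misclassification incur different costs, and intrusion detection needs to minimize the misclassification cost. To address these two problems jointly, this paper proposes ACS, a method for constructing intrusion detection classifiers based on cost-sensitive active learning. The method combines cost-sensitive learning with active learning; its goal is to reduce the number of labeling requests needed to learn a cost-sensitive classifier while minimizing that classifier's misclassification cost. In the learning engine of active learning, it replaces the traditional error-minimizing learning algorithm with a cost-sensitive learning algorithm, and in the sampling engine it uses a sampling criterion of maximum misclassification cost. ACS adapts both the construction and the updating of the version space in active learning to the cost-sensitive setting, so that the algorithm converges quickly to the target function with minimum misclassification cost. Experiments on the KDDCUP99 intrusion detection dataset show that ACS effectively reduces the number of labeling requests needed to learn a cost-sensitive classifier.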
Intrusion detection systems (IDS) protect computer systems by raising alerts that might be caused by malicious attacks. Machine learning methods have been introduced into intrusion detection to improve performance automatically by using historical data. Yet high-quality training data requires heavy labor by experts or an expensive monitoring process. Meanwhile, different types of misclassification result in different costs, so an IDS should minimize a nonuniform misclassification cost. In this paper, we aim to reduce the burden of labeling data when constructing an intrusion detection classifier with the lowest misclassification cost. We propose a novel active cost-sensitive learning method, ACS (active cost-sensitive sampling), for intrusion detection, which combines active learning with cost-sensitive learning. The proposed method uses MetaCost, a popular cost-sensitive learning method, as the base classifier, together with a sampling criterion of the largest misclassification cost. ACS modifies the construction and updating of the version space to suit the cost-sensitive setting, so it converges quickly to the target function with the lowest misclassification cost. Experimental results on the KDDCUP 99 intrusion detection dataset show that the proposed method effectively reduces the number of labeled examples needed to learn a cost-sensitive classifier.
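The abstract does not state the exact form of the sampling criterion, so the following is only a minimal sketch of one plausible reading. It assumes that the current classifier supplies class probability estimates for each unlabeled instance, that predictions follow the minimum-expected-cost rule (the rule MetaCost uses when relabeling training examples), and that the instance queried next is the one whose best available prediction still carries the largest expected misclassification cost. The names `expected_costs`, `min_risk_prediction`, `select_query`, `COST`, and `POOL`, as well as the cost values, are hypothetical and chosen for illustration.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def expected_costs(probs: Sequence[float], cost: Matrix) -> List[float]:
    """Expected cost of predicting each class j: sum_i P(i|x) * cost[i][j].

    cost[i][j] is the cost of predicting class j when the true class is i.
    """
    n = len(cost)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def min_risk_prediction(probs: Sequence[float], cost: Matrix) -> int:
    """Class with the lowest expected cost (not necessarily the most probable)."""
    risks = expected_costs(probs, cost)
    return min(range(len(risks)), key=risks.__getitem__)


def select_query(pool_probs: Sequence[Sequence[float]], cost: Matrix) -> int:
    """Index of the unlabeled instance to send to the expert for labeling.

    Assumed criterion: pick the instance whose minimum-risk prediction
    still has the largest expected misclassification cost.
    """
    def residual_risk(probs: Sequence[float]) -> float:
        return min(expected_costs(probs, cost))

    return max(range(len(pool_probs)), key=lambda k: residual_risk(pool_probs[k]))


# Hypothetical two-class setting: class 0 = normal, class 1 = attack.
# A missed attack (true 1, predicted 0) is assumed 10x as costly as a false alarm.
COST = [[0.0, 1.0],
        [10.0, 0.0]]

# Hypothetical probability estimates [P(normal), P(attack)] for three unlabeled records.
POOL = [[0.95, 0.05],
        [0.50, 0.50],
        [0.90, 0.10]]

if __name__ == "__main__":
    print("query index:", select_query(POOL, COST))
    print("prediction for third record:", min_risk_prediction(POOL[2], COST))
```

With these illustrative costs, the third record is selected even though the second has the most uncertain class probabilities: under nonuniform costs, the record with the highest remaining expected cost need not be the one closest to a 50/50 split. This is the kind of behavior that distinguishes a cost-based criterion from plain uncertainty sampling.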