The main challenge in masquerade detection is how to use relatively scarce training data to describe a user's normal behavior profile as accurately as possible, and how to use that profile for detection. This paper proposes a masquerade detection method based on k-nearest neighbor (KNN) text categorization. It reduces the weight that the TFIDF weighting scheme assigns to high-frequency commands and proposes a new weighting scheme, STFIDF, which increases the weight of discriminative commands and so represents a user's behavioral characteristics more accurately. It also adopts Jaccard Weighted Cosine (JWC) similarity in place of the usual cosine similarity, which improves the overall ability to recognize masquerade behavior. Compared with other methods, the proposed method achieves a higher detection rate, fewer false positives, and better real-time performance. It requires no complex training process, and its detection procedure is simple, fast, efficient, and easy to implement.
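To make the pipeline concrete, the following is a minimal sketch of KNN-style scoring over blocks of shell commands. The abstract does not give the STFIDF or JWC formulas, so both are stand-ins chosen for illustration: the weighting dampens term frequency with a square root and multiplies it by a smoothed IDF, and the similarity is the cosine scaled by the Jaccard coefficient of the two command sets. All function names, the value of `k`, and the sample commands are hypothetical.

```python
import math
from collections import Counter


def fit_idf(train_blocks):
    """Smoothed inverse document frequency per command, from training blocks."""
    n = len(train_blocks)
    df = Counter(cmd for block in train_blocks for cmd in set(block))
    idf = {cmd: math.log((1 + n) / (1 + d)) + 1.0 for cmd, d in df.items()}
    unseen_idf = math.log(1 + n) + 1.0  # weight for commands absent from training
    return idf, unseen_idf


def weigh(block, idf, unseen_idf):
    """ASSUMED stand-in for STFIDF: sqrt(tf) * idf, which damps frequent commands."""
    tf = Counter(block)
    return {cmd: math.sqrt(f) * idf.get(cmd, unseen_idf) for cmd, f in tf.items()}


def jaccard_weighted_cosine(u, v):
    """ASSUMED stand-in for JWC: Jaccard(command sets) * cosine(weight vectors)."""
    shared = u.keys() & v.keys()
    if not shared:
        return 0.0
    union = u.keys() | v.keys()
    dot = sum(u[c] * v[c] for c in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return (len(shared) / len(union)) * dot / (norm_u * norm_v)


def knn_score(vec, train_vecs, k=2):
    """Mean similarity to the k most similar training blocks (higher = more normal)."""
    sims = sorted((jaccard_weighted_cosine(vec, t) for t in train_vecs), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


# Hypothetical training blocks from one legitimate user.
train_blocks = [
    ["ls", "cd", "vi", "make"],
    ["cd", "ls", "make", "gcc"],
    ["vi", "make", "gcc", "ls"],
]
idf, unseen_idf = fit_idf(train_blocks)
train_vecs = [weigh(b, idf, unseen_idf) for b in train_blocks]

normal = weigh(["ls", "make", "vi", "cd"], idf, unseen_idf)
suspicious = weigh(["nmap", "nc", "wget", "chmod"], idf, unseen_idf)

print("normal block score:    ", knn_score(normal, train_vecs))
print("suspicious block score:", knn_score(suspicious, train_vecs))
```

A block would be flagged as a masquerade when its score falls below a threshold tuned on held-out data. No model is fitted beyond computing the IDF table, which is consistent with the abstract's point that the method needs no complex training process.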