Incremental outlier mining algorithms based on the local outlier factor require multiple scans of the data set. Because the reverse k nearest neighbors of a point are well suited to measuring its degree of outlierness, a Stream data Outlier Mining algorithm based on Reverse k Nearest Neighbors (SOMRNN) is proposed. A sliding window model updates the current window in a single scan, which improves the efficiency of the algorithm. A query procedure allows the whole current window to be queried at any specified time, so that concept drift in the data stream is captured promptly. Experimental results show that SOMRNN is applicable and effective.
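The idea of scoring points by their reverse k nearest neighbors inside a sliding window can be illustrated with a minimal sketch. This is not the SOMRNN implementation: the class name `SlidingWindowRkNN`, the parameters `window_size`, `k`, and `min_rknn`, the Euclidean distance, and the rule "fewer than `min_rknn` reverse neighbors means outlier" are all assumptions made for illustration. The sketch also recomputes neighbor counts by brute force at query time, whereas the abstract describes an update that needs only one scan.

```python
from collections import deque
import math


class SlidingWindowRkNN:
    """Hypothetical brute-force sketch: flag window points with few reverse kNN."""

    def __init__(self, window_size, k, min_rknn):
        # deque(maxlen=...) discards the oldest point when a new one arrives
        self.window = deque(maxlen=window_size)
        self.k = k
        self.min_rknn = min_rknn  # assumed outlier threshold, not from the paper

    def insert(self, point):
        """Slide the window forward by one stream element."""
        self.window.append(tuple(point))

    def rknn_counts(self):
        """Return the window points and, for each, its number of reverse kNN."""
        pts = list(self.window)
        counts = [0] * len(pts)
        for i, p in enumerate(pts):
            # Indices of the k nearest neighbors of p, excluding p itself
            others = [j for j in range(len(pts)) if j != i]
            others.sort(key=lambda j: math.dist(p, pts[j]))
            for j in others[: self.k]:
                # p has j among its kNN, so p is a reverse neighbor of j
                counts[j] += 1
        return pts, counts

    def query(self):
        """Query the whole current window: points with too few reverse kNN."""
        pts, counts = self.rknn_counts()
        return [p for p, c in zip(pts, counts) if c < self.min_rknn]
```

In this sketch `query()` can be called after any `insert()`, which mirrors querying the current window at an arbitrary time. When the stream moves to a new region, points that were typical in the old window lose their reverse neighbors and start to be reported, which is one simple way concept drift becomes visible.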