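As competition in the telecommunications industry intensifies, operators should focus more on service quality. Illegal behavior, typified by illegal dial-back calls, disrupts users' daily lives. Detecting abnormal users engaged in illegal dial-back quickly and accurately is a problem that urgently needs solving. Moreover, the massive data sets accumulated by fast-developing information technology will bring greater challenges. However, traditional methods for detecting illegal behavior are not very accurate, and they become inefficient or even ineffective when faced with massive data. This paper introduces a MapReduce-based parallel outlier detection method to locate the behavior of illegal users who exhibit outlier characteristics. In addition, to obtain higher accuracy, outlier detection is combined with the clustering coefficient. Extensive experiments show that the improved method is very effective in improving both efficiency and accuracy.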
With intensifying competition in the telecommunications industry, operators are paying increasing attention to quality of service. Illegal activities, especially dial-back fraud calls, cause annoyance and inconvenience that degrade the user experience. Detecting dial-back fraud quickly and accurately is therefore an urgent problem. The massive data sets accumulated through the rapid development of information technology make it more challenging still. Traditional methods for detecting illegal activities do not achieve acceptable accuracy, and they become inefficient or even unusable when processing massive data. In this paper, an outlier detection approach is introduced to identify the behavior of illegal users, who exhibit the characteristics of outliers. To achieve a higher hit rate, the method combines outlier detection with the clustering coefficient. In addition, the method exploits parallel computation based on MapReduce to reduce running time substantially and to improve the algorithm's capacity to process large data sets. Extensive experimental results demonstrate the efficiency of the proposed algorithm according to the evaluation criteria of speedup and scaleup.
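As a rough illustration of how a clustering coefficient might feed an outlier test in a map/reduce style, consider the sketch below. The abstract does not specify the algorithm, so everything here is an assumption made for illustration: the call records as (caller, callee) pairs, the function names, the thresholds `min_degree` and `max_coefficient`, and the working hypothesis that a fraudulent user contacts many numbers that are not connected to one another. It runs in a single process and only mimics the map and reduce phases; it is not the paper's implementation.

```python
from collections import defaultdict
from itertools import combinations


def map_call(record):
    """Map step (hypothetical): emit both directions of a call so that
    each user collects every number it has been in contact with."""
    caller, callee = record
    yield caller, callee
    yield callee, caller


def reduce_neighbors(pairs):
    """Reduce step (hypothetical): group emitted pairs by user into
    neighbor sets, ignoring self-calls."""
    neighbors = defaultdict(set)
    for user, other in pairs:
        if user != other:
            neighbors[user].add(other)
    return dict(neighbors)


def clustering_coefficient(user, neighbors):
    """Local clustering coefficient: the fraction of pairs of a user's
    contacts that are themselves in contact with each other."""
    contacts = neighbors[user]
    k = len(contacts)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(contacts, 2) if b in neighbors[a])
    return links / (k * (k - 1) / 2)


def flag_suspects(records, min_degree=3, max_coefficient=0.1):
    """Flag users with many contacts but (almost) no links among them.
    Both thresholds are illustrative assumptions, not values from the paper."""
    pairs = (pair for record in records for pair in map_call(record))
    neighbors = reduce_neighbors(pairs)
    return sorted(
        user
        for user in neighbors
        if len(neighbors[user]) >= min_degree
        and clustering_coefficient(user, neighbors) <= max_coefficient
    )
```

In a real MapReduce job, `map_call` and the grouping in `reduce_neighbors` would run across many workers, which is where the speedup and scaleup mentioned above would be measured.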