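In recent years, data mining has been applied ever more widely in the telecommunications field, and egocentric networks offer a new approach to data analysis from the perspective of the interaction between individuals and their environment. However, because of data volume, data dimensionality, and computational complexity, traditional computing methods cannot meet the demands of generating and analyzing egocentric networks over massive data. This paper first presents an implementation of the traditional egocentric network generation algorithm based on the MapReduce model. It then proposes a new egocentric network generation algorithm based on triangle extraction and gives its implementation under the MapReduce programming model; the algorithm is optimized for the MapReduce model and for real social networks, and achieves a performance improvement. Finally, the two algorithms are compared in terms of running time and I/O.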
By studying the relationship between an individual and the surrounding culture, and the interactions between that individual and other people, the analysis of an egocentric network can reveal many characteristics of the individual. Recently, data mining has become more and more widely used in the telecommunications area, and the egocentric network is a new idea that treats the person as a part of the whole network. However, because of the size and dimensionality of the data and the complexity of the computation, traditional methods are not suitable for this kind of application. In this article, we give an implementation of the traditional egocentric network generation algorithm based on the MapReduce model. We then propose a new egocentric network generation algorithm based on the discovery of triangles and give its implementation on the MapReduce model. In the new algorithm, we make several optimizations targeting the MapReduce model and the characteristics of real social networks to improve efficiency. Finally, we compare the two algorithms by running time and I/O.