In recent years, the eMule network, a peer-to-peer (P2P) file-sharing network, has become increasingly popular. Two factors create a need to determine peers in eMule accurately: locating the sources of a file is a critical step in any P2P file-sharing network, and the unchecked spread of objectionable content makes network supervision necessary. This need gives rise to the problem of the optimal peer identifier in the eMule network. However, the Kad ID, the identifier widely used in eMule, can be changed freely by eMule users. The result is Kad ID aliasing, in which a single peer corresponds to multiple Kad IDs, and Kad ID repetition, in which multiple peers correspond to the same Kad ID. It is therefore difficult to determine a peer accurately from its Kad ID alone. To address this problem, this paper first defines the stability factor (SF) of a candidate peer identifier, which is used to evaluate candidate identifiers. It then designs and implements Rainbow, an eMule peer information crawler that is proved to converge and has low time and space complexity, to collect the correspondence among the multiple candidate identifiers of peers in the real eMule network. Experimental results show that {user ID} has the largest SF and is therefore the optimal peer identifier in the candidate identifier set 2^{Kad ID, user ID, IP} − {∅}, that is, the set of all non-empty subsets of {Kad ID, user ID, IP}. Next, to quantify the extent of Kad ID aliasing, the relationship between {user ID} and {Kad ID} is examined. Finally, the effectiveness of applying the optimal peer identifier is analyzed, and the results show that peers are determined more accurately when {user ID} is used as the peer identifier. In summary, the optimal peer identifier established here provides a basis for further research on the eMule network, and Rainbow serves as a useful tool for measuring the real eMule network.
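
To make the two Kad ID problems concrete, the following is a minimal Python sketch, not the paper's Rainbow crawler. It enumerates the non-empty subsets of the three candidate fields and flags aliasing and repetition in a handful of crawl records. The record layout, the function names, and the sample values are all hypothetical, and the sketch assumes the user ID can stand in for the peer, in line with the abstract's conclusion. The stability factor is not computed, because the abstract does not give its definition.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical field names for one crawled record; not taken from the paper.
FIELDS = ("kad_id", "user_id", "ip")


def candidate_identifiers(fields=FIELDS):
    """Return every non-empty subset of the fields (power set minus the empty set)."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]


def find_aliasing_and_repetition(observations):
    """Group records by user ID and by Kad ID.

    Assumption: the user ID is treated as a stand-in for the peer.
    Aliasing: one user ID seen with several Kad IDs.
    Repetition: one Kad ID seen with several user IDs.
    """
    kads_per_user = defaultdict(set)
    users_per_kad = defaultdict(set)
    for obs in observations:
        kads_per_user[obs["user_id"]].add(obs["kad_id"])
        users_per_kad[obs["kad_id"]].add(obs["user_id"])
    aliasing = {u: kads for u, kads in kads_per_user.items() if len(kads) > 1}
    repetition = {k: users for k, users in users_per_kad.items() if len(users) > 1}
    return aliasing, repetition


# Made-up sample records, for illustration only.
observations = [
    {"kad_id": "K1", "user_id": "U1", "ip": "10.0.0.1"},
    {"kad_id": "K2", "user_id": "U1", "ip": "10.0.0.1"},  # U1 changed its Kad ID
    {"kad_id": "K3", "user_id": "U2", "ip": "10.0.0.2"},
    {"kad_id": "K3", "user_id": "U3", "ip": "10.0.0.3"},  # K3 shared by two users
]

aliasing, repetition = find_aliasing_and_repetition(observations)
candidates = candidate_identifiers()
```

With three fields there are seven candidate identifiers, from the single-field sets such as {user ID} up to the full set {Kad ID, user ID, IP}. In the sample records, U1 shows aliasing and K3 shows repetition.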