To address the low efficiency and limited yield of current domain name acquisition methods, this study statistically analyzes a large set of previously collected domain names to identify the composition rules and distribution characteristics of their characters. Based on these characteristics, a Markov chain domain name model is designed, and a domain name generation algorithm based on an improved Markov chain is proposed. The generated domain names are verified through WHOIS queries to confirm whether they exist. Extensive experiments show that the algorithm generates domain names with high accuracy and that, compared with other domain name acquisition methods, it generates domain names faster, in larger numbers, and with broader coverage of top-level domains.
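To make the general idea concrete, the following is a minimal sketch of a plain first-order, character-level Markov chain that learns transition counts from known domain labels and samples new ones. It is not the improved algorithm proposed here, whose details are not given in this abstract. The boundary markers `START` and `END`, the function names `train` and `generate`, the `max_len` cap, and the toy training labels are all illustrative assumptions. WHOIS verification and top-level domain selection are left out.

```python
import random
from collections import defaultdict

# Hypothetical boundary markers (assumed for this sketch, not from the paper).
START, END = "^", "$"


def train(labels):
    """Count character-to-character transitions over the training labels."""
    counts = defaultdict(lambda: defaultdict(int))
    for label in labels:
        chars = [START] + list(label) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def generate(counts, rng, max_len=20):
    """Sample one label by walking the chain from START until END or max_len.

    Assumes the model was trained on at least one label.
    """
    out, state = [], START
    while len(out) < max_len:
        successors = counts[state]
        chars = list(successors)
        weights = [successors[c] for c in chars]
        # Pick the next character in proportion to its observed frequency.
        state = rng.choices(chars, weights=weights, k=1)[0]
        if state == END:
            break
        out.append(state)
    return "".join(out)


if __name__ == "__main__":
    # Toy training set, purely illustrative.
    model = train(["google", "goggle", "github", "gitlab"])
    rng = random.Random()
    for _ in range(5):
        print(generate(model, rng))
```

In a pipeline like the one described, each sampled label would then be combined with a top-level domain and checked with a WHOIS query. A higher-order chain or position-dependent transition probabilities would be natural refinements, though whether the proposed algorithm uses either is not stated here.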