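This paper proposes a clustering-based spatio-temporal association rule mining algorithm for crimes committed on buses. It analyzes the large volume of case records held in the 110 emergency-call database of one district of a city. First, text mining is used to extract the time, location, and other details from the case descriptions, and the geocoding service and POI search function of the Amap API are used to match the extracted addresses, yielding the victim's boarding and alighting stops and the bus route taken. Second, the extracted spatio-temporal data are merged. Finally, the cases are clustered by time of day, season, and whether the day is a holiday, and spatio-temporal association rules are then mined within each cluster. The method has the following characteristics: (1) Mining association rules on top of clustering reduces the number of database scans and greatly narrows the range of data scanned, which improves efficiency and makes the algorithm better suited to mining massive crime data. (2) After clustering, the data within a cluster are similar and their features are more pronounced; association rule analysis on this basis produces smaller frequent itemsets and extracts rules with higher confidence. (3) The spatio-temporal nature of criminal behavior is taken into account: the mining process also considers factors such as the season and whether the day is a holiday.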
This paper introduces clustering-based spatio-temporal association rule mining to discover the spatio-temporal crime patterns of bus pickpocketing. The method proceeds in three steps. First, the time, place, and other information are extracted from the case records by text extraction. Second, the victims' boarding and alighting stations are determined using the geocoding service and POI search capability of the Amap API; the bus routes are divided into sections according to the bus stops, and crime times are merged into time intervals. Third, association rule analysis based on clustering is carried out to discover the patterns of bus pickpocketing. The results show that the proposed mining model has the following characteristics: (1) It reduces the number of database scans and the number of candidate itemsets, and improves the time efficiency of the search. (2) After clustering, the data in a cluster are similar and their characteristics are more obvious; on this basis, association rules with high confidence are extracted. (3) The analysis also takes into account the temporal and spatial characteristics of bus pickpocketing.
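
The third step, mining association rules inside each cluster rather than over the whole database, can be illustrated with a minimal Python sketch. It is not the paper's implementation. The record fields (`period`, `season`, `holiday`, `route`, `section`), the sample records, the function name `mine_rules`, and the thresholds are all hypothetical. As a simplifying assumption, records are grouped by identical values of time period, season, and holiday flag, which stands in for whatever clustering algorithm the paper actually uses.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical, hand-made records; field names and values are illustrative only.
records = [
    {"period": "morning_peak", "season": "summer", "holiday": False, "route": "R1", "section": "A-B"},
    {"period": "morning_peak", "season": "summer", "holiday": False, "route": "R1", "section": "A-B"},
    {"period": "morning_peak", "season": "summer", "holiday": False, "route": "R1", "section": "B-C"},
    {"period": "evening_peak", "season": "winter", "holiday": True, "route": "R2", "section": "C-D"},
]


def cluster_key(rec):
    # Assumption: grouping on exact categorical values replaces real clustering.
    return (rec["period"], rec["season"], rec["holiday"])


def mine_rules(records, min_support=0.5, min_confidence=0.7):
    """Return (cluster, antecedent, consequent, support, confidence) tuples."""
    clusters = defaultdict(list)
    for rec in records:
        items = frozenset({("route", rec["route"]), ("section", rec["section"])})
        clusters[cluster_key(rec)].append(items)

    rules = []
    for key, transactions in clusters.items():
        n = len(transactions)
        # Count 1-itemsets and 2-itemsets within this cluster only.
        counts = Counter()
        for t in transactions:
            for size in (1, 2):
                for combo in combinations(sorted(t), size):
                    counts[frozenset(combo)] += 1
        for itemset, count in counts.items():
            support = count / n
            if len(itemset) != 2 or support < min_support:
                continue
            for antecedent in itemset:
                confidence = count / counts[frozenset([antecedent])]
                if confidence >= min_confidence:
                    (consequent,) = itemset - {antecedent}
                    rules.append((key, antecedent, consequent, support, confidence))
    return rules


if __name__ == "__main__":
    for rule in mine_rules(records):
        print(rule)
```

Because support and confidence are computed per cluster, each pass touches only that cluster's records, which is the source of the reduced scanning range described above.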