Outlier detection is a classical problem in statistics, and identifying outliers and reducing their influence on the analysis of tax assessment data is a worthwhile research topic. Conventional outlier diagnostics, however, generally rely on global detection methods suited to unimodal distributions. Drawing on the theory of the local correlation integral, this paper proposes a detection method based on nonparametric density estimation. The method is applicable to multimodal distributions, can identify outliers that are local in nature, and retains strong detection ability even when outliers make up a relatively high proportion of the sample. Using a sample of 10,920 enterprises from one city, an empirical analysis compares the tax assessment method currently used by the tax bureau with the proposed method. The results indicate that the tax bureau's current method carries a considerable risk of misjudgment in tax assessment.
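To make the contrast between global and local detection concrete, the following is a minimal sketch and not the paper's local correlation integral procedure. It assumes a simplified stand-in: a Gaussian kernel density estimate, with a point flagged when its estimated density falls below a fixed fraction of the median density. The function names, the bandwidth, the `fraction` threshold, and the bimodal toy sample are all hypothetical choices made for illustration.

```python
import math


def gaussian_kde(points, bandwidth):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(points)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points
        )

    return density


def low_density_outliers(points, bandwidth, fraction):
    """Flag points whose estimated density is below `fraction` of the median density.

    This threshold rule is an illustrative assumption, not the paper's criterion.
    """
    density = gaussian_kde(points, bandwidth)
    dens = [density(p) for p in points]
    median = sorted(dens)[len(dens) // 2]
    return [p for p, d in zip(points, dens) if d < fraction * median]


def zscore_outliers(points, threshold=3.0):
    """Global rule: flag points more than `threshold` standard deviations from the mean."""
    mean = sum(points) / len(points)
    std = math.sqrt(sum((p - mean) ** 2 for p in points) / len(points))
    return [p for p in points if abs(p - mean) / std > threshold]


# Hypothetical bimodal sample: two tight clusters and one point stranded between them.
low_mode = [0.0 + 0.1 * i for i in range(-10, 11)]    # 21 points in [-1, 1]
high_mode = [10.0 + 0.1 * i for i in range(-10, 11)]  # 21 points in [9, 11]
sample = low_mode + high_mode + [5.0]

global_flags = zscore_outliers(sample)
local_flags = low_density_outliers(sample, bandwidth=0.5, fraction=0.2)

print(global_flags)  # [] : the stranded point sits exactly at the global mean
print(local_flags)   # [5.0] : its local density is far below that of either cluster
```

In this toy sample the point at 5.0 coincides with the overall mean, so a global z-score rule cannot see it, while the density-based rule isolates it because no neighbouring observations support it. The bandwidth and threshold would need to be tuned for real data.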