Probably approximately correct (PAC) learning is a theoretical framework for studying learnability. In recent years, researchers have combined Bayesian methods with distribution-free PAC performance guarantees to propose what is known as PAC-Bayesian learning theory. Because the theory yields generalization error bounds for an arbitrary prior measure on an arbitrary concept space, it has been widely used to analyze algorithms in various areas of artificial intelligence. This paper surveys the origins of PAC-Bayesian learning theory and its core ideas and then, in light of the characteristics of big data, argues that PAC-Bayesian theory is well suited to the theoretical analysis of algorithms for big data.