The Dirichlet process is a representative nonparametric Bayesian model. Its main advantage is that the number and nature of its parameters are flexible and can be determined automatically from the model and the data. In recent years it has become an active research topic in machine learning and natural language processing. This paper gives a fairly systematic introduction to the origin and development of the Dirichlet process, with a focus on methods for model computation, and analyzes in detail its application to specific problems in natural language processing. Finally, it discusses future research directions and development trends for the Dirichlet process.