Speaker: Måns Magnusson, Linköping University, Sweden
Abstract: Topic models are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel, sparse, partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models. The experiments, performed on well-known small and large corpora, show that the increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed. This minor inefficiency can be more than compensated for by the speed-up from parallelization on larger corpora. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler. Work on further speeding up the computations using the Pólya urn distribution will also be presented.
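To make "partially collapsed" concrete, here is a minimal serial sketch for an LDA-style topic model. It is not the speaker's implementation. The function name `partially_collapsed_gibbs`, the symmetric hyperparameters `alpha` and `beta`, and the dense NumPy representation are all assumptions made for this example. The sketch draws the topic-word matrix `phi` explicitly and integrates out only the document-topic proportions. It omits the sparsity and the parallelization that the abstract describes.

```python
import numpy as np


def partially_collapsed_gibbs(docs, n_topics, vocab_size,
                              alpha=0.1, beta=0.1, n_iter=50, seed=0):
    """Illustrative partially collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of integer word ids.
    Returns topic assignments, count matrices, and the last draw of phi.
    """
    rng = np.random.default_rng(seed)

    # Random initial topic assignment for every token.
    z = [rng.integers(n_topics, size=len(doc)) for doc in docs]
    doc_topic = np.zeros((len(docs), n_topics), dtype=np.int64)
    topic_word = np.zeros((n_topics, vocab_size), dtype=np.int64)
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            doc_topic[d, k] += 1
            topic_word[k, w] += 1

    phi = None
    for _ in range(n_iter):
        # Step 1: draw phi | z. Each topic's word distribution is
        # Dirichlet(beta + topic-word counts); topics are independent.
        phi = np.vstack([rng.dirichlet(beta + topic_word[k])
                         for k in range(n_topics)])

        # Step 2: draw z | phi with the document-topic proportions
        # integrated out. Given phi, documents do not interact, so this
        # outer loop is the part that could be distributed over workers.
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k_old = z[d][i]
                doc_topic[d, k_old] -= 1
                topic_word[k_old, w] -= 1

                # Unnormalized conditional: p(w | k) * (n_dk + alpha).
                p = phi[:, w] * (doc_topic[d] + alpha)
                k_new = rng.choice(n_topics, p=p / p.sum())

                z[d][i] = k_new
                doc_topic[d, k_new] += 1
                topic_word[k_new, w] += 1

    return z, doc_topic, topic_word, phi
```

In a fully collapsed sampler the conditional for each token depends on the global topic-word counts, which change with every update and couple all documents. Conditioning on a sampled `phi` removes that coupling, at the cost of some extra autocorrelation in the chain.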