Time: Wednesday, April 19, 2017, 13:30
Venue: Room 213, Main Teaching Building, Shahe Campus
Speaker: Mans Magnusson, Linkoping University, Sweden
Abstract: Topic models are widely used for probabilistic modeling of text. MCMC sampling from the posterior distribution is typically performed using a collapsed Gibbs sampler. We propose a parallel sparse partially collapsed Gibbs sampler and compare its speed and efficiency to state-of-the-art samplers for topic models. The experiments, which are performed on well-known small and large corpora, show that the expected increase in statistical inefficiency from only partial collapsing is smaller than commonly assumed. This minor inefficiency can be more than compensated by the speed-up from parallelization of larger corpora. The proposed algorithm is fast, efficient, exact, and can be used in more modeling situations than the ordinary collapsed sampler. Work to speed up the computations further using the Polya-Urn distribution will also be presented.
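Speaker bio: Mans Magnusson currently works in the Department of Statistics and Machine Learning at Linkoping University, Sweden. His main research interests include text data modeling and topic models.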