Academic Seminar: Probabilistic exponential family inverse regression and its applications
Time: Wednesday, July 17, 10:00-11:30 a.m.
Venue: Room 113, Teaching Building 2, Shahe Campus
Speaker: Tao Wang (王涛), Professor, Shanghai Jiao Tong University
Abstract: Rapid advances in high-throughput sequencing technologies have led to the fast accumulation of high-dimensional data, which are harnessed to understand how various factors affect human disease and health. While dimension reduction plays an essential role in high-dimensional regression and classification, existing methods often require the predictors to be continuous, making them unsuitable for discrete data, such as presence-absence records of species in community ecology and sequencing reads in single-cell studies. To identify and estimate sufficient reductions in regressions with discrete predictors, we introduce probabilistic exponential family inverse regression (PrEFIR), assuming that, given the response and a set of latent factors, the predictors follow one-parameter exponential families. We show that the low-dimensional reductions result not only from the response variable but also from the latent factors. We further extend the latent factor modeling framework to the double exponential family by including an additional parameter to account for the dispersion. This versatile framework encompasses regressions in which the predictors are all categorical or a mixture of categorical and continuous. We propose the method of maximum hierarchical likelihood for estimation, and develop a highly parallelizable algorithm for its computation. The effectiveness of PrEFIR is demonstrated through simulation studies and real data examples.
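About the speaker: Dr. Tao Wang is a professor and doctoral supervisor at Shanghai Jiao Tong University and a researcher at the SJTU-Yale Joint Center for Biostatistics and Data Science. He conducted postdoctoral research in the Department of Biostatistics at Yale University. His main research area is general-purpose statistical algorithms and theory for biomedical big data. His work has been published in statistics journals including JASA, JRSSB, and Biometrika, and in biology journals including Genome Biology, Briefings in Bioinformatics, and Bioinformatics. Dr. Wang has received the Excellent Young Scientists Fund of the National Natural Science Foundation of China, and he serves as vice chair of both the Interdisciplinary Statistical Science Branch and the Survival Analysis Branch of the Chinese Association for Applied Statistics. He has also taken an active part in developing the core bioinformatics course and writing the core textbook under the Ministry of Education's "101 Plan" for biological sciences.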
Written by: Yujie Gai (盖玉洁)
Reviewed by: Lu Deng (邓露)