Title: Optimal leveraging for big data regression
Time: Wednesday, June 17, 2015, 14:00-15:00
Venue: Room 603, Academic Hall, Xueyuan South Road Campus
Speaker: Professor Ping Ma, Department of Statistics, University of Georgia
Abstract:
Rapid advances in science and technology over the past decade have brought an extraordinary amount of data, offering researchers an unprecedented opportunity to tackle complex research challenges. This opportunity, however, has not yet been fully utilized, because effective and efficient statistical tools for analyzing super-large datasets are still lacking. One major challenge is that the advance of computing resources still lags far behind the exponential growth of databases.
In this talk, I will introduce a family of statistical leveraging methods to facilitate scientific discoveries using current computing resources. Leveraging methods are designed under a subsampling framework, in which one samples a small proportion of the data (a subsample) from the full sample and then performs the intended computations for the full sample using the small subsample as a surrogate. The key to the success of the leveraging methods is to construct nonuniform sampling probabilities so that influential data points are sampled with high probability. The optimality criteria will be discussed. These methods are a distinctive development of their kind in big data analytics and allow pervasive access to massive amounts of information without resorting to high-performance computing or cloud computing.
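The subsampling idea described in the abstract can be illustrated with a minimal Python sketch for least-squares regression. The abstract does not say how the sampling probabilities are built, so this sketch assumes one common choice: probabilities proportional to the statistical leverage scores (the diagonal of the hat matrix), with the sampled rows reweighted by inverse probability. The function names `leverage_scores` and `leveraging_ols`, the simulated data, and the subsample size are all hypothetical, and the optimality criteria mentioned in the talk are not implemented here.

```python
import numpy as np

def leverage_scores(X):
    # Leverage scores are the diagonal of H = X (X^T X)^{-1} X^T.
    # With a thin QR factorization X = QR, they equal the squared row norms of Q.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)

def leveraging_ols(X, y, r, rng):
    # Assumption: sampling probabilities proportional to leverage scores.
    n = X.shape[0]
    h = leverage_scores(X)
    pi = h / h.sum()
    # Draw r row indices with replacement using the nonuniform probabilities.
    idx = rng.choice(n, size=r, replace=True, p=pi)
    # Square roots of the inverse-probability weights, so that ordinary least
    # squares on the rescaled rows solves the weighted subsample problem.
    w = 1.0 / np.sqrt(r * pi[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta

# Simulated data (hypothetical): a heavy-tailed design gives nonuniform leverage.
rng = np.random.default_rng(0)
n, p = 20000, 5
X = rng.standard_t(df=3, size=(n, p))
beta_true = np.arange(1, p + 1, dtype=float)
y = X @ beta_true + rng.normal(size=n)

beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)   # full-sample fit
beta_sub = leveraging_ols(X, y, r=1000, rng=rng)    # fit on a 5% subsample
print("full sample:", np.round(beta_full, 3))
print("subsample:  ", np.round(beta_sub, 3))
```

In this sketch the subsample estimate is computed from 1,000 of the 20,000 rows and then compared with the full-sample fit. Computing exact leverage scores itself requires a pass over the full data, so in practice approximate scores may be preferred.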
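About the speaker:
Professor Ping Ma is a Distinguished Professor under the "Hand in Hand" program at the Central University of Finance and Economics and a Professor in the Department of Statistics at the University of Georgia (UGA). He received his PhD in statistics from Purdue University and was a postdoctoral fellow in the Department of Statistics at Harvard University. Professor Ma has deep theoretical expertise in nonparametric statistics, data modeling, and statistics for super-large samples. He has published more than 20 papers in high-level academic journals and has undertaken 9 research projects funded by the U.S. National Science Foundation (NSF). He has received the Canadian Journal of Statistics outstanding paper award and the NSF CAREER Award, and was recognized as an outstanding teacher at the University of Illinois. He also serves as an associate editor of several leading international statistics journals, including the Journal of the American Statistical Association.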