It’s the Data Science and Public Policy colloquium, and they asked me to give my talk, “Little Data: How Traditional Statistical Ideas Remain Relevant in a Big-Data World.” Here’s the abstract:
“Big Data” is more than a slogan; it is our modern world in which we learn by combining information from diverse sources of varying quality. But traditional statistical questions—how to generalize from sample to population, how to compare groups that differ, and whether a given data pattern can be explained by noise—continue to arise. Often a big-data study will be summarized by a little p-value. Recent developments in psychology and elsewhere make it clear that our usual statistical prescriptions, adapted as they were to a simpler world of agricultural experiments and random-sample surveys, fail badly and repeatedly in the modern world in which millions of research papers are published each year. Can Bayesian inference help us out of this mess? Maybe, but much research will be needed to get to that point.
It’s for the Data Science for Social Good program, so I suppose I’ll alter my talk a bit to discuss how data science can be used for social bad. The talk should be fun, but I do want to touch on some open research questions. Remember, theoretical statistics is the theory of applied statistics, and we have a lot of applied statistics to do, so we have a lot of theoretical statistics to do too.