This is kinda weird because I don’t know anything about machine learning in finance. I guess the assumption is that statistical ideas are not domain specific. Anyway, here it is:
What can we learn from data?
Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University
The standard framework for statistical inference leads to estimates that are horribly biased and noisy for many important examples. And these problems all get even worse as we study subtle and interesting new questions. Methods such as significance testing are intended to protect us from hasty conclusions, but they have backfired: over and over again, people think they have learned from data but they have not. How can we have any confidence in what we think we’ve learned from data? One appealing strategy is replication and external validation, but this can be difficult in the real world of social science. We discuss statistical methods for actually learning from data without getting fooled.