I just happened to come across this quote from Dan Simpson:
When the signal-to-noise ratio is high, modern machine learning methods trounce classical statistical methods when it comes to prediction. The role of statistics in this case is really to boost the signal-to-noise ratio through the understanding of things like experimental design.
Is there a nice war story or example of experimental design boosting signal-to-noise ratio so that machine learning methods then dominate?
Not sure what you mean by dominate here, but I really like the work of Vikas Raykar and crew on adding noisy measurement models to training. It can work with any kind of ML model. Most ML treats a corpus as a gold standard, whereas statisticians know there's often important measurement error and that accounting for it can in some cases be a huge help.
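This isn't Raykar's actual model (which jointly learns the classifier and per-annotator error rates), but a stripped-down sketch of the same idea: estimate annotator reliability instead of treating every label as gold. All the numbers below are invented for illustration.

```python
import math
import random

random.seed(0)

# Toy setup: three annotators label n items; their true accuracies
# differ, but we never observe them directly.
n = 3000
true_acc = [0.9, 0.8, 0.6]
truth = [random.random() < 0.5 for _ in range(n)]
labels = [[t if random.random() < a else (not t) for t in truth] for a in true_acc]

# Step 1: majority vote as a provisional "gold standard".
majority = [sum(labels[a][i] for a in range(3)) >= 2 for i in range(n)]

# Step 2: estimate each annotator's reliability by agreement with the majority.
est_acc = [sum(labels[a][i] == majority[i] for i in range(n)) / n for a in range(3)]

# Step 3: re-label with a log-odds-weighted vote, so that a reliable
# annotator's label counts for more than an unreliable one's.
def weight(p):
    p = min(max(p, 1e-6), 1 - 1e-6)
    return math.log(p / (1 - p))

weighted = [
    sum(weight(est_acc[a]) * (1 if labels[a][i] else -1) for a in range(3)) > 0
    for i in range(n)
]
print(est_acc)  # the noisy annotator should get a visibly lower estimate
```

A full treatment would iterate steps 2 and 3 (EM-style, as in Dawid and Skene) and fold the classifier itself into the loop, which is roughly what the Raykar et al. papers do.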
Also, you really want to be using statistics to evaluate systems even if they’re not built based on probabilistic principles.
The use of the Penn Treebank in natural language processing has been a joke to anyone who knows anything about stats for 20+ years. Everyone's required to train and test on the exact same sections, and papers have been reporting increases in precision and recall on that test section that are dwarfed by cross-validation error across the other sections of the treebank. Classic overfitting and noise chasing, which persists only because practitioners don't understand statistics.
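To make the point concrete, here's a sketch with hypothetical per-section scores (the numbers are invented, not real Treebank results): compare the size of a typical reported gain on the fixed test section with the spread you'd see evaluating across sections.

```python
from statistics import mean, stdev

# Hypothetical F1 scores for one parser evaluated on each Treebank
# section separately (invented for illustration).
section_f1 = [87.2, 88.1, 86.5, 87.9, 88.4, 86.9, 87.5, 88.0]

# Typical size of an "improvement" reported on the single fixed test section.
reported_gain = 0.2

spread = stdev(section_f1)
print(f"mean F1 = {mean(section_f1):.2f}, across-section sd = {spread:.2f}")
print(f"a reported gain of {reported_gain} is "
      f"{'smaller' if reported_gain < spread else 'larger'} "
      "than the section-to-section spread")
```

When the gain is well inside that spread, there's no way to tell a real improvement from having chased noise in one particular section.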
What I’d recommend instead is Chris Manning’s work (from an old CICLING paper) on fixing the errors in annotation in the treebank. This is also related to the noisy measurement models mentioned above, but it actually fixes the errors. It’s not considered kosher to do this, so the results don’t “count” in any sense. It’s not like anyone took his advice and started using his improved Treebank. Nope, back to testing and training on the exact same noisy data.
That’s what I would say too (w.r.t. overfitting the dataset, in the widest sense), but sometimes it might be more complicated than that, if this analysis holds up:
https://arxiv.org/abs/1806.00451
Thanks for this comment, @Bob Carpenter. Do you have a reference to a Raykar paper that would be good to start with?
It’s not quite the same thing, but one thing that’s done in ML on images these days is data augmentation, where instead of just doing a training step on an image, you’ll pre-process the image to do things like tweak the brightness or crop part of it out or otherwise mess with it.
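A minimal sketch of the idea, using a tiny grayscale "image" as a nested list (real pipelines use libraries like torchvision, but the principle is just: each training image yields several transformed copies):

```python
# A tiny 4x4 grayscale "image" (pixel values 0-255).
img = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 100, 110, 120],
       [130, 140, 150, 160]]

def brighten(image, delta):
    """Shift every pixel by delta, clamped to [0, 255]."""
    return [[min(255, max(0, p + delta)) for p in row] for row in image]

def crop(image, top, left, h, w):
    """Cut out an h x w window starting at (top, left)."""
    return [row[left:left + w] for row in image[top:top + h]]

def hflip(image):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in image]

# Each original example contributes several perturbed copies to training.
augmented = [img, brighten(img, 15), crop(img, 1, 1, 2, 2), hflip(img)]
```

The label stays the same under each transformation, so the model is pushed to learn features that are invariant to brightness, position, and orientation rather than memorizing pixels.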
The big question, though, is how do we know when the signal-to-noise ratio is truly high? It may appear so based on our assessment, but even with a hold-out test sample, we still don’t know whether either the signal or the noise is representative of the broader reality we are trying to model. The superior machine learning methods may have done a great job of overfitting the noise or bias of our sample. This is particularly a problem when the underlying phenomena may be changing. How much can we really expect to know about the future?
“How much can we really expect to know about the future?”
Ah, the age-old question — but how can we know if we can’t know the future? ;~)
Terms like ‘signal’ and ‘noise’ appear to be holdovers from the intelligence world, I am speculating. I wonder whether they are appropriate to use in AI and machine learning. They seem anachronistic.
In all honesty, this seems like a bizarre claim. SNR is a fundamental concept in information theory and I’m having a hard time imagining how one would think about those kinds of problems without coming up with something like it. Also https://dsp.stackexchange.com/questions/tagged/snr?sort=frequent
Signal-to-noise ratio has been and still is used in communication engineering to aid in designing receivers. I learned the theory a long time ago, but basically the statistics used were built around probability models, averages, and deviations.
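For anyone who hasn't seen the definition from communications: SNR in decibels is 10 log10 of the ratio of signal power to noise power, where power is the mean squared amplitude. A quick sketch:

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(signal_power / noise_power)

def power(samples):
    """Mean squared amplitude of a sampled waveform."""
    return sum(s * s for s in samples) / len(samples)

# A signal 100x more powerful than the noise sits at 20 dB.
print(snr_db(100.0, 1.0))
```

The logarithmic scale is what makes it convenient for receiver design: gains and losses along a signal chain add in dB instead of multiplying.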
The M4 competition paper came out just a few months ago; they compared machine learning forecasting methods with statistical ones and came to the conclusion that the former need the latter to work:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0194889
What qualifies as a modern machine learning method?