While Andrew is trying to get someone to make a “Gone fishing” t-shirt design, someone else thinks fishing is one of the “big data trends in 2015”. This advertisement by some company keeps reappearing in my Twitter feed.
Does Exploratory Fishing smell fishy?
If JSM isn’t particularly engaging, it might be fun to drop by this “mysterious” company’s booth and call them out on their BS ads.
Sure, the fishing analogy has negative connotations. But I’m not sure if I know enough about this company’s tools or philosophy to understand why they need to be “called out”.
Does the practice of storing *all* data for potential future analyses sound like a good idea? Yes. Given the volume of that data, will it require some very modern database and sophisticated computational algorithms? Certainly.
And as a staunch Bayesian, I am frustrated by a common practice in academia to seemingly ignore all previous studies and use “default” or “weakly informative” priors instead of moving science forward by building priors on previous study results.
Maybe that’s what they mean by fishing in the big data lake? Not ignoring what has already been measured.
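A minimal sketch of that contrast, assuming a normal likelihood with a known standard error and entirely hypothetical numbers (the estimates, standard errors, and the `normal_posterior` helper are illustrative and do not come from any real study): the same new estimate is combined once with a wide "default" prior and once with a prior centered on an earlier study's result.

```python
def normal_posterior(prior_mean, prior_sd, est, se):
    """Conjugate normal-normal update, treating the standard error as known.

    Returns the posterior mean and posterior standard deviation.
    """
    prior_prec = 1.0 / prior_sd**2   # precision = 1 / variance
    data_prec = 1.0 / se**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return post_mean, post_prec ** -0.5


# Hypothetical new study: effect estimate and its standard error.
new_est, new_se = 0.50, 0.20

# "Default" prior: centered at zero and very wide, so it barely matters.
weak = normal_posterior(0.0, 10.0, new_est, new_se)

# Prior built from a hypothetical earlier study (estimate 0.20, SE 0.10).
informed = normal_posterior(0.20, 0.10, new_est, new_se)

print("weak prior:     mean=%.3f sd=%.3f" % weak)
print("informed prior: mean=%.3f sd=%.3f" % informed)
```

Under these assumptions the wide prior leaves the new estimate essentially unchanged, while the prior built from the earlier study pulls the posterior toward the earlier result and narrows it. In practice the earlier result would usually be discounted to some degree rather than taken at face value.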
I think of fishing as ugh only when it is disguised in the garb of a confirmatory study.
As a means of hypothesis generation alone, what is wrong with fishing? Quite a lot of science starts out as “fishing”.
“I am frustrated by a common practice in academia to seemingly ignore all previous studies and use “default” or “weakly informative” priors instead of moving science forward by building priors on previous study results.”
+1
I think people arguing for Bayes over frequentist methods should spend more time on this kind of demonstration than on demolishing frequentist methods. Maybe it’s already been done but I don’t know it. I’ve tried it in one paper; I did it in a second one but I was afraid that reviewers would think that I am trying to strengthen my (already strong) case by making my effects look even bigger. So I removed it.
Bayesian work on clinical trials are a good place to look for priors that incorporate info from previous studies.
I should really learn to grammar…
p.s. If I have googled onto the right folks, they do in fact refer to this Gartner release in their marketing materials:
Gartner Says Beware of the Data Lake Fallacy
http://www.gartner.com/newsroom/id/2809117
And their software. Urgh. Trying to get it to work makes you want to flip a table over.
I know what you mean; I had the hardest time getting it to do what I wanted. But once you realize that you are actually manipulating a simplified OLAP cube, the concepts immediately become clear.
Their software was built around relational datastores (NoSQL wasn’t popular when they started in 2003), so the concepts map very well to relational theory.
I don’t mind their software that much — it’s a great EDA and viz tool. But the “fishing” thing is definitely a blunder on the part of their marketing team.
I was fishing in a data lake years ago, and out popped a major art project :)
http://www.translatingnature.org/thelake
The URL lost its hyphen!
http://www.translatingnature.org/the-lake/
I like this cartoon, http://pages.stat.wisc.edu/~wahba/spiegelhalter.science2014.pdf , since it suggests not only can you go fishing but you can even help guide the hook into the fish’s mouth (as discussed often on this blog and, for example, in Simmons et al. (2011), http://www.haas.berkeley.edu/groups/online_marketing/facultyCV/papers/nelson_false-positive.pdf ).