13 thoughts on “Are you ready to go fishing in the data lake?”

    • Sure, the fishing analogy has negative connotations. But I’m not sure if I know enough about this company’s tools or philosophy to understand why they need to be “called out”.

      Does the practice of storing *all* data for potential future analyses sound like a good idea? Yes. Given the volume of that data, will it require some very modern database and sophisticated computational algorithms? Certainly.

      And as a staunch Bayesian, I am frustrated by a common practice in academia to seemingly ignore all previous studies and use “default” or “weakly informative” priors instead of moving science forward by building priors on previous study results.

      Maybe that’s what they mean by fishing in the big data lake? Not ignoring what has already been measured.

      • I think fishing is ugh only when disguised in the garb of a confirmatory study.

        As a means of hypothesis generation alone, what is wrong with fishing? Quite a lot of science starts out as “fishing”.

      • “I am frustrated by a common practice in academia to seemingly ignore all previous studies and use “default” or “weakly informative” priors instead of moving science forward by building priors on previous study results.”

        +1

        I think people arguing for Bayes over frequentist methods should spend more time on this kind of demonstration than on demolishing frequentist methods. Maybe it’s already been done, but I’m not aware of it. I’ve tried it in one paper; I did it in a second one too, but I was afraid that reviewers would think I was trying to strengthen my (already strong) case by making my effects look even bigger. So I removed it.
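
A minimal sketch of the kind of demonstration meant here, assuming a normal likelihood with known standard error and a normal prior (the conjugate case). All numbers are hypothetical and the function name `normal_update` is only illustrative. It compares a near-flat "default" prior with a prior built from an earlier study's estimate.

```python
import math

def normal_update(prior_mean, prior_sd, est, se):
    """Conjugate normal-normal update: combine a normal prior with a
    normally distributed estimate whose standard error is treated as known."""
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_prec = 1.0 / se ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical numbers: an earlier study and a new, noisier one.
prev_est, prev_se = 0.30, 0.10
new_est, new_se = 0.50, 0.20

# "Default" prior: centered at zero and very wide, so it barely matters.
weak = normal_update(0.0, 10.0, new_est, new_se)

# Informed prior: the earlier study's estimate and standard error.
informed = normal_update(prev_est, prev_se, new_est, new_se)

print("weak prior:     mean=%.3f sd=%.3f" % weak)
print("informed prior: mean=%.3f sd=%.3f" % informed)
```

With these made-up inputs, the informed posterior is pulled toward the earlier estimate and has a smaller standard deviation than the one under the near-flat prior. Whether a previous study's estimate should be used this directly depends on how comparable the two studies are, and that is a judgment call the sketch does not address.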

    • I know what you mean; I had the hardest time getting it to do what I wanted. But once you realize that you are actually manipulating a simplified OLAP cube, the concepts immediately become clear.

      Their software was built around relational datastores (NoSQL wasn’t popular when they started in 2003), so the concepts map very well to relational theory.

      I don’t mind their software that much — it’s a great EDA and viz tool. But the “fishing” thing is definitely a blunder on the part of their marketing team.
