Can’t keep up with the flood of gobbledygook

Jonathan Falk points me to a paper published in one of the tabloids; he was skeptical of its broad claims. I took a look at the paper, noticed a few goofy things about it (for example, “Our data also indicate a shift toward more complex societies over time in a manner that lends support to the idea of a driving force behind the evolution of increasing complexity”), and wrote back to him: “I’m too exhausted to even bother mocking this on the blog.”

Falk replied:

Well, you’d need about 30 co-bloggers to even up with the number of authors.

Actually 53, but who’s counting?

The funny thing is, it’s not like this paper is so horrible. It’s a million times better than the ovulation-and-voting paper, the beauty-and-sex-ratio paper, the ages-ending-in-9 paper, etc. It’s just an innocent little data analysis attached to some over-the-top bravado (“The approach that we have taken in this paper can be used to resolve other long-standing controversies in the study of human societies” etc etc etc). It would be ok as a final project in a social science data analysis class: you’d say the students got a lot by grappling with the data and also they learned a couple of new statistical techniques along the way. OK, they overclaimed, but ya gotta start somewhere, and there’s no harm in theorizing, as this can inform further work. To me, the most interesting of their data-based claims was this: “many trajectories exhibit long periods of stasis or gradual, slow change interspersed with sudden large increases in the measure of social complexity over a relatively short time span. This pattern is consistent with a punctuational model of social evolution, in which the evolution of larger polities requires a relatively rapid change in sociopolitical organization, including the development of new governing institutions and social roles, to be stable.” I don’t really know how seriously to take this, as I could imagine the apparent jumps being the result of data problems, but it’s an interesting thought.

There’s a problem when this sort of half-baked work gets placed in the tabloids and goes through the hype machine. On the plus side, it seems that the hype machine isn’t what it used to be: Tyler Cowen posted on this article today (I guess that’s how Falk heard about it), and the few comments were uniformly skeptical.

Again: I’m not saying this sort of data analysis and speculation shouldn’t be done—who knows what can be learned from this sort of thing, it’s worth a try! I think one problem with the tabloid model of science publication is that it incentivizes big claims.

29 thoughts on “Can’t keep up with the flood of gobbledygook”

  1. From http://www.pnas.org/content/early/2017/12/20/1708800115:

    “We were able to capture information on 51 variables reflecting nine characteristics of human societies, such as social scale, economy, features of governance, and information systems. Our analyses revealed that these different characteristics show strong relationships with each other and that a single principal component captures around three-quarters of the observed variation. Furthermore, we found that different characteristics of social complexity are highly predictable across different world regions. These results suggest that key aspects of social organization are functionally related and do indeed coevolve in predictable ways. Our findings highlight the power of the sciences and humanities working together to rigorously test hypotheses about general rules that may have shaped human history.”

    Comments:

    1. Number of variables almost as large as number of authors.

    2. “Our analyses revealed that these different characteristics show strong relationships with each other” Sounds hyped. Just show a table of correlations to show what variables are correlated — but watch out for words such as “related” that suggest interpretations of causality. (See, e.g., Tyler Vigen’s Spurious Correlation page, http://www.tylervigen.com/spurious-correlations)

    3. The last three sentences sound like little more than grandiose-sounding restatements.

  2. More often than not when I referee a paper one of my main comments involves asking the authors to tamp down on statements in the intro and conclusion that are in no way supported by the actual data analysis in the paper. I just wish they’d say “I analyze relationship X in the data, and since I can interpret that result within over-riding theoretical paradigm A, I will now speculate about all the implications of all of A as though my study of X had proved that.” I’d probably let people say whatever they wanted after that statement… at least it would show the kind of self-awareness and honesty-of-thought you’d expect from someone who (presumably) wants to be taken seriously as a thinker.

    Maybe we need to go spend some more time with Quine. Because it all feels like a kind of weird under-determinism turned on its head: instead of being afraid that the web-of-belief is just a convenient interpretation that can encompass any finding, we consider the particular finding to be strong evidence that the web-of-belief is grounded in reality. Despite the fact that the web-of-belief can predict anything (you know, because #moderators and #suppressors and #parameters).

    A related point sometimes comes up with my students in questions about what is the “right” way to analyze some data, often in the contexts of discussions about replications or re-analyses. I’ve been trying out an argument to my students that any particular analysis is just a “perspective”. There are infinite potential perspectives to take on any dataset. Many of these perspectives will be “false” ones (bad math, faulty reasoning, etc.), but there will still be an infinite number of “true” perspectives. Maybe I should make this a homework project for my graduate students – take a paper with a clear experiment that was interpreted in terms of one theory, and interpret all the results in terms of some other theory. In this formulation, maybe we don’t need Quine as much as we need Nietzsche. Probably both… you know, for some perspective.

    • Jrc:

      Selection bias: Had the authors written the paper with more moderate claims, I don’t think it would’ve been published in the tabloids.

      And check out this line from the webpage of the first author of the above-mentioned article:

      [The author] has published two hundred articles, including a dozen in such top journals as Nature, Science, and PNAS

      I don’t begrudge the author his success (I myself would love to publish more papers in the tabloids). My point here is just that he does what it takes to publish in those journals. They’re looking for importance, that is, big claims.

      • I’ve seen hypotheses like that often enough (I’m a sociologist). Tho I’ve not seen the paper, it’s a difficult hypothesis to test. Looks like it’s maybe in PNAS — I’ll look at it & get back to you. For my information, what journals are not tabloids? Social science journals?

        • OK. Any social science journals that are not tabloids? Or that publish mostly good research? I think most journals in my field (Sociology) make too much of too little, and publish too much data mining. But I’m interested in your take.

          Thanks,
          –Larry

  3. I think you are too charitable because it is precisely the grandiose claims that warrant a more critical view. I don’t worry too much about cute little psychology results because, hey, they are so small in the grand scheme. But when a paper claims to have found some sort of universal law of social science and civilization…. I think we need to be pitiless.

  4. This sentence stretches my credulity to the popping point: “To test between competing hypotheses, we constructed a massive repository of historical and archaeological information known as ‘Seshat: Global History Databank.’ We systematically coded data on 414 societies from 30 regions around the world spanning the last 10,000 years.”

    I don’t know how you can construct and populate such a database without risk of bias. Much historical data comes to us with selection and interpretation; how would you separate the interpretation from the data? How would you guarantee that your conclusions about the evolution of societies–over ten thousand years–did not derive from existing historical emphases (which then affected the historical data)? An alternative to the conclusion “key aspects of social organization are functionally related and do indeed coevolve in predictable ways” might be “key aspects of social organization have been subject to unifying interpretations, which have affected the data available to us today.”

    • I love that “systematically.” If you have a “system” by which research assistants can “code data” from all of recorded history, by all means publish your “system,” it must be revolutionary.

        • Fair enough, I hadn’t noticed the link to the code manual there. I now think they are using the term “data” rather liberally, but OK, on inspection I tend to agree with RJB’s comment below, it’s an interesting effort.

        • I took a brief look too and agree that it’s an interesting (and massive) effort that could lead to some good findings. Two things give me pause at this point, though:

          Many variables are binary; irrigation systems are “present” or “absent”; so, too, with poetry. This could obscure many of the complexities and simplicities.

          Some categories are anachronistic (or so it seems to me). For instance, religious, scientific, philosophical, and fictional writing are treated as distinct, whereas in ancient literature they often overlap.

  5. The project http://seshatdatabank.info/ seems like a bigger, non-static version of the Human Relations Area Files. And it still seems to be focused on the same kinds of questions about concepts like “complexity.” And unlike HRAF, this seems to be run more by historians than by anthropologists, though I’m not positive.

  6. “Falk replied:

    Well, you’d need about 30 co-bloggers to even up with the number of authors.

    Actually 53, but who’s counting?”

    Such a shame, they only needed a few more authors to reach a “critical mass”, which apparently might be 72 authors: see https://www.nature.com/articles/s41562-017-0189-z

    It’s amazing how science works these days, it seems like you need 3 ingredients for something to possibly take off and/or have a response ready when some crucial criticism appears from out of nowhere:

    1) publish in a tabloid magazine,

    2) have lots (72 could be the “critical mass” you want to aim for) of “big names” on your paper (who might only be “big names” because they published in tabloid magazines before), and

    3) if sh#t hits the fan, tell them things like “don’t throw out the baby with the bathwater”, “see how much discussion this paper resulted in”, or “don’t let perfect be the enemy of good”.

    They got 1.5 out of 3 already, with the possibility of 2.5 out of 3 still open if some criticism gathers enough attention that they have to respond to it!

  7. Maybe I’m just a softy, but they devoted a lot of time to gathering data that seems pretty interesting. It’s not like the stats are airtight, but they argue that many historians would expect different facets of complexity to have stronger relations to one another than to the remaining ones, but they don’t find any support for more than one principal component. As long as you take this as a first step, not the last word, it seems like a worthwhile contribution. At least they have provided some interesting and improvable measures, and lots of opportunities for others to reanalyze the data. How much more should we really be demanding of a single paper, other than maybe toning down their summary claims?

    • # “but they devoted a lot of time to gathering data ”

      To me, and most importantly, i reason, to science, this is irrelevant.

      # “but they argue that many historians would expect that different facets of complexity to have stronger relations to one another than to the remaining ones, ”

      I am not sure if that is what they found. This is probably all too complicated for me, but if i just read what is stated below figure 2 for instance, it is stated that:

      “Nine CCs (ovals) aggregating 51 variables (SI Appendix has details on all CCs). (…) All CCs are significantly correlated with one another (correlation coefficients range between 0.49 and 0.88). Some variables show stronger linkages with each other”

      # “but they don’t find any support for more than one principal component”

      This is what i don’t understand, but my comprehension of PCA/statistics in general is pretty bad. But also see the comment above by Elin. I wonder what this all tells us. It seems to me that:

      1) they chose a specific type of society (“polities”)
      2) they do not make a comparison between these societies concerning level of development/complexity

      If this is correct, i wonder what the point of it all is.

      What is exactly the difference between doing this and taking different kinds of coconut-cookies, finding lots of variables these different kinds of coconut-cookies consist of, then grouping these variables together according to some higher-order variable, and then doing a PCA on coconut-cookies and finding that they mostly load on 1 factor, and stating that:

      “Our analyses revealed that these different characteristics show strong relationships with each other and that a single principal component captures around three-quarters of the observed variation. Furthermore, we found that different characteristics of coconut-cookies are highly predictable across different cookies. These results suggest that key aspects of coconut-cookies are functionally related and do indeed co-occur in predictable ways”

      I also wonder if it is normal that all used factors load very similarly on the one dimension. I am simply trying to understand what can be learned from this study, and why.
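One way to see why a single dominant component is not surprising when everything is positively correlated: if nine variables all shared one common pairwise correlation rho, the largest eigenvalue of their correlation matrix would be 1 + 8*rho, so the first principal component would carry (1 + 8*rho)/9 of the variance. The sketch below checks this numerically. The equal-correlation matrix and the value rho = 0.7 (chosen from inside the reported 0.49 to 0.88 range) are illustrative assumptions, not the paper's actual correlation matrix.

```python
import numpy as np

def first_pc_share(corr):
    """Fraction of total variance carried by the largest eigenvalue."""
    eigvals = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigvals[-1] / eigvals.sum()

def equicorrelation(p, rho):
    """Hypothetical p x p correlation matrix: 1 on the diagonal, rho elsewhere."""
    return (1 - rho) * np.eye(p) + rho * np.ones((p, p))

p, rho = 9, 0.7  # nine variables; rho is an assumed common correlation
share = first_pc_share(equicorrelation(p, rho))
closed_form = (1 + (p - 1) * rho) / p

print(round(share, 3), round(closed_form, 3))
```

Under these assumptions the share comes out a little under three-quarters, so a first component of that size follows from the pairwise correlations alone and would look the same for cookies as for polities.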

        • I actually think it’s worse than that, and let me start by saying that I admire the effort to collect the data as well. But they end up with 20 percent missing data items, and they argue that they will do their analysis with regression imputation of those items. In other words, they create a (noisy) linear combination of the variables they have to stand in for the observations they don’t have. Then they run the PCA on this. Well, it seems obvious to me (unless they’ve done something I don’t see) that because 20 percent of the data items are linear combinations of something-or-other, the first principal component is going to get a lot of those loadings really strongly reflected in the variance; thus what you’re really seeing is not just “cookie containing coconut,” but in 20 percent of the observations where the ingredient list is missing, you imputed the presence of coconut before you ran the PCA.
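The concern in this comment can be tried out on simulated data. The sketch below generates nine modestly correlated variables, deletes 20 percent of the cells at random, fills them by deterministic regression on the other columns, and compares the first principal component's share before and after. Everything here is an assumption for illustration: the sample size, the correlation of 0.3, the missing-at-random pattern, and above all the imputation routine, which is a simple stand-in and may well differ from what the authors did. In particular, adding residual noise to the imputed values (as the word "noisy" suggests) should shrink the effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, rho = 2000, 9, 0.3  # hypothetical sizes and pairwise correlation

# One-factor data: every pair of columns has correlation of about rho.
factor = rng.standard_normal((n, 1))
noise = rng.standard_normal((n, p))
complete = np.sqrt(rho) * factor + np.sqrt(1 - rho) * noise

def first_pc_share(data):
    """Share of variance on the first principal component of the correlations."""
    eigvals = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))
    return eigvals[-1] / eigvals.sum()

# Knock out 20 percent of the cells completely at random.
missing = rng.random((n, p)) < 0.2

def regression_impute(data, missing, n_iter=10):
    """Deterministic regression imputation (illustrative, no noise added)."""
    filled = np.where(missing, np.nan, data)
    col_means = np.nanmean(filled, axis=0)
    filled = np.where(missing, col_means, filled)  # start from column means
    for _ in range(n_iter):
        for j in range(data.shape[1]):
            obs = ~missing[:, j]
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(others)), others])
            # Fit on rows where column j was observed, predict the rest.
            coef, *_ = np.linalg.lstsq(design[obs], filled[obs, j], rcond=None)
            filled[~obs, j] = design[~obs] @ coef
    return filled

filled = regression_impute(complete, missing)
share_complete = first_pc_share(complete)
share_imputed = first_pc_share(filled)
print(f"complete data: {share_complete:.3f}  after imputation: {share_imputed:.3f}")
```

In this setup the imputed cells sit exactly on the regression surface, which strengthens the correlations and pushes more variance onto the first component. How much of the paper's three-quarters figure, if any, comes from this depends on the details of their procedure.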

    • Rjb:

      Yes, and the sad thing is that because of the (implicit) rules of scientific publication, the data-gathering part of the project, which is the most interesting thing, gets downplayed in the published paper, while the speculations, which have little content, get all the attention.

  8. Since many commenters here seem to value the data (-gathering), perhaps it is useful to provide a link to a few open data journals here:

    https://www.fosteropenscience.eu/foster-taxonomy/open-data-journals?page=0

    If i understood things correctly, the authors of the paper this blog-post is about could publish a paper about the data-set only, and publish it in an “official” journal, and could hereby (for instance) gather possible citations and get rewarded for their work and/or making the data available.

  9. I agree with RJB:
    ” As long as you take this as a first step, not the last word, it seems like a worthwhile contribution. At least they have provided some interesting and improvable measures, and lots of opportunities for others to reanalyze the data. How much more should we really be demanding of a single paper, other than maybe toning down their summary claims”.
    There’s lots to criticize, but that’s always the case. Importantly, there’s lots to improve upon & lots of opportunities to do so.

  10. Note that this article was contributed to PNAS by a member of NAS, who essentially gets to serve as his own editor by selecting the reviewers, who are then named:
    “Contributed by Charles Spencer, November 16, 2017 (sent for review May 26, 2017; reviewed by Simon A. Levin and Charles Stanish)”

    • Anon:

      That bit actually doesn’t bother me. It seems reasonable enough for members of that particular club to have their own journal where they can publish their research ideas. My problem is when journalists in the news media take PNAS too seriously. It’s annoying to me when, for example, goofy political science articles in PNAS get more media attention than serious political science articles in APSR. Not that APSR is perfect, but when it comes to political science I think it gets more serious reviewing than PNAS.
