In the future, everyone will publish everything.

Bob told me the other day (the other week, actually, as I’m stacking up posts here with a roughly one-month delay) that I shouldn’t try to compete with the electrical engineers when it comes to length of C.V.: according to Bob, these dudes can have over two thousand publications!

How do they do it? First, an EE prof will have tons of graduate students and postdocs, they’re all writing papers and presenting at conferences, and they all stick his name on the author list. Second, these students and postdocs write up and publish every experiment they do. Including (especially!) computer experiments.

And . . . all these people writing papers cite each other, so they quickly rack up thousands of citations.

Upon hearing this, my first reaction was fear, plain and simple. One of the distinguishing characteristics of my own research record is that I have so many publications and citations. Those electrical engineers . . . how dare they go around devaluing my currency!

But then I started thinking some more, and I realized that the EE profs’ system is the logical endpoint of some things I’ve actually been trying to do. I advise lots of students and postdocs and would be happy to have more in my orbit. I encourage them to publish early and often and to take the initiative, to themselves write up what they’ve done. If this were all to happen, then, yes, there’d be zillions of publications running around.

But maybe that wouldn’t be a bad thing. I’ve done lots of small experiments and analyses that, had I written them up and published them, might have been useful to others. We’re doing a few of these right now. We’ll typically do lots of experiments in order to understand our methods, then we only publish a small part of this (all based on our guess of what the journal referees might want to see). Now that we can publish on the web, it’s probably a good idea to publish more. So maybe the EE profs have the right idea. We just have to ditch the idea of a linear “C.V.” that lists all of one’s publications. (Even now, such lists, if unstructured, can be difficult to navigate.)

28 thoughts on “In the future, everyone will publish everything.”

  1. With the advent and popularity of blogging, everyone can essentially self-publish their “small experiments and analyses.” It’s fast, it’s easy, and it gets the word out. And I like the fact that I get feedback from readers and the occasional blogger who picks up my work and expands upon it. Of course, it doesn’t lengthen my C.V., but that’s not my chief motivation.

  2. There are at least two strategies a researcher can follow to make a major and lasting impact. By “major and lasting” I mean doing something that someone would care about beyond a generation or two. You could either:

    (1) Play all the publication/grant proposal/Ivy League status games needed to become a well-respected researcher. Then use whatever time is left over to do the real research.

    (2) Spend your early years becoming independently wealthy (join a startup, for example) and then use the wealth to self-fund the research in whatever time you have left.

    Traditionally, (1) has been the preferred way to go as it was far more likely to yield sufficient free time to actually do something. Moreover, following the second path (2) exposes a researcher to the same pitfalls that autodidacts and cranks face (see Wolfram’s book or that guy from Microsoft that Andrew keeps talking about).

    Nevertheless, we’ve reached a point where the chances of doing anything real following (1) are almost nil. So my sincere advice to any budding research genius today is to follow path (2). The odds are against you, but at least you have a fighting chance.

    And you’ll probably have more fun. Publishing thousands of papers no one reads, writing grant proposals to scam taxpayers, paying off student loans with an adjunct professor’s pay, and generally kissing every academic butt in sight for decades before you can speak your mind, seems almost perfectly designed to be as mind-numbing as possible.

    • Dr. G.:

      Your scheme misses the possibility that a researcher can make an enduring contribution while doing (1) and (2). I made my most enduring contribution during a six-week stay at Bell Labs between getting my Ph.D. and starting my first job. Stephen Wolfram made his most enduring contribution with the software he used to make those big bucks. So I think your mistake is to assume that the serious work doesn’t get started until after the academic or financial success.

      • Andrew,

        Sure, but how old are you? My advice was to budding geniuses today. Most fields seem to be up against some kind of wall. Researchers report breakthrough after breakthrough, but over decades there seems to be remarkably little change in many fields. Those thousands of papers getting published seem to be just wheel spinning. For example,

        (1) Life expectancy which used to increase by years/decades now increases by days/months if at all
        (2) We still can’t predict the weather accurately a couple of weeks from now.
        (3) Or predict earthquakes
        (4) Or predict next year’s inflation rate
        (5) Statistics, with all of its groundbreaking and revolutionary advances, has given us this: 47 out of 53 “landmark” papers in cancer research couldn’t be replicated:

        http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328

        The reason why I mention this is because the longer fields stay stagnant the more intense the status games become. It’s not just that researchers are spinning their wheels to no effect, but the rate of spinning is increasing. At some point the cost of playing the status games is so great there is nothing left over for real research. It looks like we’ve already reached that point.

        From here on out, it’s very unlikely “academic superstars” will be able to break any of these fields free from their long term stagnation. The real breakthroughs (ones that will be recognized as such hundreds of years from now) will come from outsiders who bypassed the status games and the waste they entail.

        On the other hand, outsiders face a number of inherent difficulties, both internal and external, which make it hard to really do anything. Researchers publishing thousands of papers is a symptom of a greater disaster.

        • These assertions might “feel” like they’re true, but my guess is that it’s always “felt” like things are progressing very slowly at the present moment. There certainly are circumstances unique to the present time (and institutional structures + funding situation), and the crappy academic job market reflects that, but you’re making some pretty broad generalizations here when these examples mostly relate to a narrow set of problems – predictions on highly non-linear phenomena which are inherently difficult to predict.

          Regarding cancer – there is definitely a lot of research that doesn’t pan out, but that is expected in my view. My understanding is that the life expectancy of certain cancers is improving as a direct consequence of development of therapies. Regarding life expectancy overall – averaging over the entire population is a pretty coarse measure of scientific progress and personally I’d rather see improvements to quality of life than life expectancy anyway.

          In the end, there’s always been a ridiculous amount of failure for every success in research. The job market at the present moment is particularly bad (something I and many of my friends are painfully aware of), but I don’t think it adds much to characterize the current situation in such black-and-white apocalyptic terms.

        • revo11,

          On the contrary, it is very important and useful to look at the big picture in the way I did. The reason is that it is very easy to get fooled by signs of micro-progress. Every grant, every paper, and every thesis makes some plausible claim to a marginal advance. It’s almost impossible to look at these and know if they’ll really add up to anything.

          To avoid this pitfall it’s important to look over longer periods and see if there actually has been any major progress. Sometimes there is. For example, a physicist in 1950 saw a very different picture of physics than one in 1900. Not only were there many major advances in our predictive ability, but they had major effects on society as a whole.

          In other cases there doesn’t appear to be much change. For example, someone working in finance in 2012 is pretty much in the same theoretical boat as someone working in 1980, with about the same predictive ability.

          So where do most fields stand today? Are they similar to physics from 1900-1950 or are they more like finance from 1980-today? Well, there is a growing sense that most fields are in the latter category and this has been true for a while (20-60 years depending on the field).

          This isn’t just me saying that, I’m seeing increasing recognition of this from specialists in everything from Artificial Intelligence, to Microeconomics, to Organic Chemistry, to Physics. See for example Tyler Cowen’s book “The Great Stagnation” which implies this slowdown has been going on long enough to have a significant impact on society.

          Statistics doesn’t seem to fall into this category though. It’s currently considered a vibrant field with much present progress. I wouldn’t disagree necessarily, but even here caution is warranted given results like the above, where 47 of 53 (about 89%) major, foundational, landmark papers in cancer research couldn’t be reproduced.

        • revo11:

          Did you read the link? It’s that it’s slower than it needs to be.

          “Some authors required the Amgen scientists sign a confidentiality agreement barring them from disclosing data at odds with the original findings. “The world will never know” which 47 studies — many of them highly cited — are apparently wrong, Begley said.”

        • “landmark papers in cancer research couldn’t be reproduced”

          I do not believe this should be blamed on a lack of statistical technology.

          For instance, see O’Rourke K, ‘An historical perspective on meta-analysis: dealing quantitatively with varying study results’, Journal of the Royal Society of Medicine, 100 (2007), 579-82.

          Though maybe the blame lies with a style of statistical practice that was once called the “cult of the single study,” in which many statisticians ignored the need to think about other prior studies when analyzing the current study in hand.

  3. I’ve been reading so many Wikipedia articles lately I want to put a little fact-check notice next to my statement. I do recall seeing a CV with over 1000 listed publications, but I can’t recall whose it was — almost certainly someone in speech recognition. So I did the structured Google query,

    [“over * publications” site:.edu -library]

    (Note: putting a ‘*’ inside a quoted field finds matches of some word in that context, the site restriction is to .edu sites, and the negation is because I didn’t want library in the result after seeing the hits with that restriction).

    The most I could find in a couple of pages of results was 780+ from Herschel A. Rabitz (physical chemist); he got his Ph.D. in 1970. I’m rooting for James Speck (materials scientist) to get there, he’s around our age, having gotten an Sc.D. in 1989, and already has 550+. Does anyone know of people with more than 1000 refereed publications?

    I don’t think I’ve even come close to fully understanding 780 papers!

  4. Along these lines, I think that sites like StackOverflow, MathOverflow, Math.StackExchange, Cross-Validated, etc., will begin to subsume research quality open documents. Answers to some of the mathematics questions at Math.StackExchange already approach publication quality. I think it would be wise for hopeful graduate students in about 5-15 years to think very hard about what this might do to the economy of their intended career path.

  5. So the Bayesian route is no longer possible? – publish a single posthumous paper and thereby name an entire branch of statistics.

  6. A lot of people I know flip it the other way. Because there is such a deluge, they try very hard to only hit home runs. I guess it depends on how much people hiring you will look for those 5 bad papers.

  7. Yep. Peer review won’t be abolished. Instead, like an old lady getting progressively weaker, it will slowly fade away. And hiring committees will have to come up with a substitute for citation impact to justify not making an effort to figure out what it’s all about. Oh, the pain!

    an EE prof will have tons of graduate students and postdocs, they’re all writing papers and presenting at conferences, and they all stick his name on the author list

    The same is true for all of life science research. For contrast, the most famous paper, Watson & Crick’s from 1953, was by a postdoc and a grad student, with their bosses not hijacking any credit.

  8. While everybody understands that this is good for the authors’ CVs, the real progress is not getting any faster — if anything, it goes slower, because you have to read so much more to figure out who does what, and what is worth reading, anyway. If everybody publishes 100+ papers, what are the five important papers for a new aspiring student to read to get acquainted with an area? Do I have to read all the 500 papers written by the top 5 people in the field? Or all 2000 papers written by everybody in the field? I highly doubt that one could find much incremental knowledge beyond, say, 20 randomly chosen papers from such a pool. If you know generalized linear mixed models, and if you know Antoniak’s and Ferguson’s papers from the 1970s, you don’t have to read beyond the abstract of a 2012 paper that says, “We employ a Dirichlet process to model the distribution of the random effects in a logistic regression model of…” — this is a paper that just assembled the Lego blocks differently.

    As an AE, I see the referees just miss the literature to go on and say, “Aye, this is a good paper”. I can’t blame them: they don’t have the time to read the previous 20 papers by the current author(s) to identify the overlap with the current paper. On most of these papers, though, the citation patterns I see are 50-60-70% self-citations by the co-authors of these papers. If the promotion criteria required 10 external references per year instead of 3 new papers, and a greater number of external citations than self-citations to prove that your visibility in your discipline goes beyond your office, I bet that most researchers would structure their publication activity very differently.

    • And a neat way to get citations is to publish a blatant error that bothers the discipline enough that multiple papers are written to correct it – all of which must cite your paper.
      (That actually happened to someone I know, though the error was not made intentionally.)

  9. Published papers in peer-reviewed journals are and should be mountains in the academic landscape; but most days, I need the road laid down on arXiv to get where I’m going.
