A gathering of the literary critics: Louis Menand and Thomas Mallon, meet Jeet Heer

Marshall McLuhan: The environment is not visible. It’s information. It’s electronic.

Norman Mailer: Well, nonetheless, nature still exhibits manifestations which defy all methods of collecting information and data. For example, an earthquake may occur, or a tidal wave may come in, or a hurricane may strike. And the information will lag critically behind our ability to control it.

Regular readers will know that I’m a big fan of literary criticism.  See, for example,

“End of novel. Beginning of job.”: That point at which you make the decision to stop thinking and start finishing

Contingency and alternative history (followup here)

Kazin to Birstein to a more general question of how we evaluate people’s character based on traits that might, at least at first glance, appear to be independent of character (followup here)

“Readability” as freedom from the actual sensation of reading

Things that I like that almost nobody else is interested in

Anthony West’s literary essays

I recently came across a book called “Sweet Lechery: Reviews, Essays and Profiles,” by literary journalist Jeet Heer. The “Lechery” in the title is a bit misleading, but, yes, Heer is open about sexual politics. In any case, like the best literary critics, he engages with the literary works and the authors in the context of politics and society. He has some of the overconfidence of youth—the book came out ten years ago, and some of its essays are from ten or more years before that—and there’s a bunch of obscure Canadian stuff that doesn’t interest me, but overall I found the writing fun and the topics interesting.

One good thing about the book was its breadth of cultural concerns, including genre and non-genre literature, political writing, and comic books, with the latter taken as of interest in themselves, not merely as some sort of cultural symbol.

I also appreciated that he didn’t talk about movies or pop music. I love movies and pop music, but they’re also such quintessential topics for Boomer critics who want to show their common touch. There are enough other places where I can read about how Stevie Wonder and Brian Wilson are geniuses, that Alex Chilton is over- or under-rated, appreciation of obscure records and gritty films from the 1970s, etc.

My comparison point here is Louis Menand’s book on U.S. cold war culture from 1945 to 1965, which made me wonder how he decided what to leave in and what to leave out. I’m a big fan of Menand—as far as I’m concerned, he can write about whatever he wants to write about—it was just interesting to consider all the major cultural figures he left out, even while considering the range of characters he included in that book. Heer writes about Philip Roth but also about John Maynard Keynes; he’s not ashamed to write about, and take seriously, high-middlebrow authors such as John Updike and Alice Munro, while also finding time to write thoughtfully about Robert Heinlein and Philip K. Dick. I was less thrilled with his writing about comics, not because of anything he said that struck me as wrong, exactly, but rather because he edged into a boosterish tone, promotion as much as criticism.

Another comparison from the New Yorker stable of writers is Thomas Mallon, who notoriously wrote this:

[Screenshot of the Thomas Mallon passage in question]

Thus displaying his [Mallon’s] ignorance of Barry Malzberg, who has similarities with Mailer both in style and subject matter. I guess that Malzberg was influenced by Mailer.

And, speaking of Mailer, who’s written some good things but I think was way way overrated by literary critics during his lifetime—I’m not talking about sexism here, I just think there were lots of other writers of his time who had just as much to say and could say it better, with more lively characters, better stories, more memorable turns of phrase, etc.—anyway, even though I’m not the world’s biggest Mailer fan, I did appreciate the following anecdote which appeared, appropriately enough, in an essay by Heer about Canadian icon Marshall McLuhan:

Connoisseurs of Canadian television should track down a 1968 episode of a CBC program called The Summer Way, a highbrow cultural and political show that once featured a half-hour debate about technology between McLuhan and the novelist Norman Mailer. . . .

McLuhan: We live in a time when we have put a man-made satellite environment around the planet. The planet is no longer nature. It’s no longer the external world. It’s now the content of an artwork. Nature has ceased to exist.

Mailer: Well, I think you’re anticipating a century, perhaps.

McLuhan: But when you put a man-made environment around the planet, you have in a sense abolished nature. Nature from now on has to be programmed.

Mailer: Marshall, I think you’re begging a few tremendously serious questions. One of them is that we have not yet put a man-made environment around this planet, totally. We have not abolished nature yet. We may be in the process of abolishing nature forever.

McLuhan: The environment is not visible. It’s information. It’s electronic.

Mailer: Well, nonetheless, nature still exhibits manifestations which defy all methods of collecting information and data. For example, an earthquake may occur, or a tidal wave may come in, or a hurricane may strike. And the information will lag critically behind our ability to control it.

McLuhan: The experience of that event, that disaster, is felt everywhere at once, under a single dateline.

Mailer: But that’s not the same thing as controlling nature, dominating nature, or superseding nature. It’s far from that. Nature still does exist as a protagonist on this planet.

McLuhan: Oh, yes, but it’s like our Victorian mechanical environment. It’s a rear-view mirror image. Every age creates as a utopian image a nostalgic rear-view mirror image of itself, which puts it thoroughly out of touch with the present. The present is the enemy.

That’s great! I love how McLuhan keeps saying these extreme but reasonable-sounding things and then, each time, Mailer brings him down to Earth. Norman Mailer, who built much of a career on bloviating philosophizing, is the voice of reason here. The snippet that I put at the top of this post is my favorite: McLuhan as glib Bitcoin bro, Mailer as the grizzled dad who has to pay the bills and fix the roof after the next climate-induced hurricane.

Heer gets it too, writing:

It’s a measure of McLuhan’s ability to recalibrate the intellectual universe that in this debate, Mailer—a Charlie Sheen–style roughneck with a history of substance abuse, domestic violence, and public mental breakdowns—comes across as the voice of sobriety and sweet reason.

Also, Heer’s a fan of Uncle Woody!

Lefty Driesell and Bobby Knight

This obit of the legendary Maryland basketball coach reminded me of a discussion we had a few years ago. It started with a remark in a published article by political scientist Diana Mutz identifying herself as “a Hoosier by birth and upbringing, the daughter of a former Republican officeholder, and someone who still owns a home in Mike Pence’s hometown.”

That’s interesting: I don’t know so many children of political officeholders! Actually, I can’t think of anyone I know, other than Mutz, who is a child of a political officeholder, but perhaps there are some such people in my social network. I don’t know the occupations of most of my friends’ parents.

Anyway, following up on that bit from Mutz, sociologist Steve Morgan added some background of his own:

I was also born in Indiana, and in fact my best friend in the 1st grade, before I left the state, was Pat Knight. To me, his father, Bobby Knight was a pleasant and generally kind man (who used to give us candy bars, etc.). He turned out to be a Trump supporter, and probably his son too. So, in addition to not appreciating his full basketball personality when I was 6 years old, I also did not see his potential to find a demagogue inspiring. We moved to Ohio, where I received a lot of education in swing-state politics and Midwestern resentment of coastal elites.

And then I threw in my two cents:

I was not born in Indiana, but I grew up in suburban Maryland (about 10 miles from Brett Kavanaugh, but I went to a public school in a different part of the county and so had zero social overlap with his group). One of the kids in my school was Chuck Driesell, son of Lefty Driesell, former basketball coach at the University of Maryland. Lefty is unfortunately now most famous for his association with Len Bias, but Chuck and I were in high school before that all happened, when Lefty was famous for being a good coach who couldn’t ever quite beat North Carolina. Once I remember the Terps decided to beat Dean Smith at his own game by doing the four corners offense themselves. But it didn’t work; I think Maryland ended up losing 21-18 or some other ping-pong-like score. Chuck was in my economics class. I have no idea if he’s now a Trump supporter. I guess it’s possible. One of the other kids in that econ class was an outspoken conservative, one of the few Reagan supporters of our group of friends back in 1980. Chuck grew up and became a basketball coach; the other kid grew up and became an economist.

I never went to a Maryland basketball game all the time I lived there, even when I was a student at the university. I wish I’d gone; I bet it would’ve been a lot of fun. My friends and I played some pickup soccer and basketball, and I watched lots of sports on TV, but for whatever reason we never even considered the idea of going to a game. We didn’t attend any of our high school football games either, even though our school’s team won the state championship. This was not a matter of principle; we just never thought of going. Our loss.

Here’s some academic advice for you: Never put your name on a paper you haven’t read.

Success has many fathers, but failure is an orphan.

Jonathan Falk points to this news article by Tom Bartlett which has this hilarious bit:

What at first had appeared to be a landmark study . . . seemed more like an embarrassment . . .

[The second-to-last author of the paper,] Armando Solar-Lezama, a professor in the electrical-engineering and computer-science department at MIT and associate director of the university’s computer-science and artificial-intelligence laboratory, says he didn’t realize that the paper was going to be posted as a preprint. . . .

The driving force behind the paper, according to Solar-Lezama and other co-authors, was Iddo Drori, [the last author of the paper and] an associate professor of the practice of computer science at Boston University. . . . The two usually met once a week or so. . . .

Solar-Lezama says he was unaware of the sentence in the abstract that claimed ChatGPT could master MIT’s courses. “There was sloppy methodology that went into making a wild research claim,” he says. While he says he never signed off on the paper being posted, Drori insisted when they later spoke about the situation that Solar-Lezama had, in fact, signed off. . . .

Solar-Lezama and two other MIT professors who were co-authors on the paper put out a statement insisting that they hadn’t approved the paper’s posting . . . Drori didn’t agree to an interview for this story, but he did email a 500-word statement providing a timeline of how and when he says the paper was prepared and posted online. In that statement, Drori writes that “we all took active part in preparing and editing the paper” . . . The revised version doesn’t appear to be available online and the original version has been withdrawn. . . .

This reminds me of a piece of advice that someone once gave me: Never put your name on a paper you haven’t read.

The Lakatos soccer training

Alex Lax writes:

While searching the Internet for references to Lakatos, I noticed your comment about Lakatos being a Stalinist. I met Imre Lakatos shortly after his arrival in the UK. My parents spoke Hungarian and helped to settle the refugees of 1956. Imre Lakatos was one of those refugees. I remember him playing football with me at a time when Hungarian football was seen as far superior to English football, and I also remember once when we met him at Cambridge railway station with his latest girlfriend, who was very tall. She had managed to lose some contact lenses and I was grovelling around on the road trying to find them. During his visits he would often complain about his treatment in prison, which destroyed his stomach, and he would rant against the Communists. However, after his death, I was told that a book by a well-known French Communist was dedicated to Imre. I have not found this dedication, but if true it would suggest that he was a Communist of some flavour while pretending otherwise.

I hope this might be of interest to you.

He adds:

By the way, the Lakatos soccer training consisted of two players on a small pitch with two smallish opposing goals, with each player protecting their own goal. Each player was only allowed to touch the ball once.

I’m interested in Lakatos because his writing has been very influential to my work; see for example here and here. He was said to be a very difficult person, but perhaps that was connected in some way to his uncompromising intellectual nature, which served him well as an innovator in the philosophy of science.

Uncertainty in games: How to get that balance so that there’s a motivation to play well, but you can still have a chance to come back from behind?

I just read the short book, “Uncertainty in games,” by Greg Costikyan. It was interesting. His main point, which makes sense to me, is that uncertainty is a key part of the appeal of any game. He gives interesting examples of different sources of uncertainty. For example, if you’re playing a video game such as Pong, the uncertainty is in your own reflexes and reactions. With Diplomacy, there’s uncertainty in what the other players will do. With poker, there’s uncertainty about all the hole cards. With chess, there’s uncertainty in what the other player will do and also uncertainty in the logical implications of any position, in the same way that I am uncertain about what is the 200th digit of the decimal expansion of pi, even though that number exists. I agree with Costikyan that uncertainty is a helpful concept for thinking about games.

There’s one thing he didn’t discuss in his book, though, that I wanted to hear more about, and that’s the way that time and uncertainty interact in games, and how this factors into game design. I’ve been thinking a lot about time lately, and this is another example, especially relevant to me as we’re in the process of finishing up the design of a board game, and we want to improve its playability.

To fix ideas, consider a multi-player tabletop game with a single winner, and suppose the game takes somewhere between a half hour and two hours to play. As a player, I want to have a real chance of winning, until close to the end, and when the game reaches the point at which I pretty much know I can’t lose, I still want it to be fun, I want some intermediate goal such as the possibility of being a spoiler, or of being able to capitalize on my opponents’ mistakes. At the same time, I don’t want the outcome to be entirely random.

Consider two extremes:
1. One player gets ahead early and then can relentlessly exploit the advantage to get a certain win.
2. Nobody is ever ahead by much; there’s a very equal balance, and the winner is decided only at the very end by some random event.

Option #1 actually isn’t so bad—as long as the player in the lead can compound the advantage and force the win quickly. For example, in chess, if you have a decisive lead you can use your pieces together to increase your advantage. This is to be distinguished from how we played as kids, which was that once you’re in the lead you’d just try to trade pieces until the opposing player had nothing left: that got pretty boring. If you can use your pieces together, the game is more interesting even during the period where the winning player is clinching it.

Option #2 would not be so much fun. Sure, sometimes you will have a close game that’s decided at the very end, and that’s fine, but I’d like for victory to be some reflection of cumulative game play, as otherwise it’s meaningless.

Sometimes this isn’t so important. In Scrabble, for example, the play itself is enjoyable. The competition can also be good—it’s fun to be in a tight game where you’re counting the letters, blocking out the other player, and strategizing to get that final word on the board—but even if you’re way behind, you can still try to get the most out of your rack.

In some other games, though, once you’re behind and you don’t have a chance to win, it’s just a chore to keep playing. Monopoly and Risk handle this by creating a positive incentive for players to wipe out weak opponents, so that once you’re down, you’ll soon be out.

And yet another approach is to have cumulative scoring. In poker it’s all about the money. Whether you’re ahead or behind for the night, you’re still motivated to improve your bankroll.

One thing I don’t have a good grip on regarding game design is how to get that balance between all these possibilities, so that how you play matters throughout the game, while at the same time keeping the possibility of winning for as long as is feasibly possible.

I remember my dad saying that he preferred tennis scoring (each game is played to 4 points, each set is 6 games, you need to win 2 or 3 sets) as compared to old-style ping-pong scoring (whoever reaches 21 points first, wins), because in tennis, even if you’re way behind, you always have a chance to come back. Which makes sense, and is related to Costikyan’s point about uncertainty, but is hard for me to formalize.

A key idea here, I think, is that the relative skill of the players during the course of a match is a nonstationary process. For example, if player A is winning, perhaps up 2 sets to 0 and up 5 games to 2 in the third set, but then player B comes from behind to catch up and then maybe win in the fifth set, yes, this is an instance of uncertainty in action, but it won’t be happening at random. What will happen is that A gets tired, or B figures out a new plan of action, or some other factor that affects the relative balance of skill. And that itself is part of the game.
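To make that concrete, here is a toy simulation (my own sketch, nothing from Costikyan's book; the skill numbers and the scoring formats are simplified stand-ins) comparing an old-style race to 21 with a best-of-five-sets format when player B's per-point skill shifts partway through the match:

# Toy simulation (my illustration, not from Costikyan): player B starts weaker
# but improves mid-match, standing in for "B figures out a new plan" or
# "A gets tired." Set-based scoring gives B more chances to come back.
import random

def point_prob_b(points_played):
    # Nonstationary skill: B wins 40% of points early on, 55% after point 40.
    return 0.40 if points_played < 40 else 0.55

def race_to_21():
    a = b = played = 0
    while a < 21 and b < 21:
        if random.random() < point_prob_b(played):
            b += 1
        else:
            a += 1
        played += 1
    return b > a

def best_of_five_sets():
    # Crude stand-in for tennis scoring: a "set" is first to 13 points,
    # and the match is first to 3 sets.
    a_sets = b_sets = played = 0
    while a_sets < 3 and b_sets < 3:
        a = b = 0
        while a < 13 and b < 13:
            if random.random() < point_prob_b(played):
                b += 1
            else:
                a += 1
            played += 1
        if b > a:
            b_sets += 1
        else:
            a_sets += 1
    return b_sets > a_sets

random.seed(1)
sims = 20000
print("B wins, race to 21:   ", sum(race_to_21() for _ in range(sims)) / sims)
print("B wins, best of five: ", sum(best_of_five_sets() for _ in range(sims)) / sims)

Under this toy setup the set-based format gives the improving player a noticeably better chance of winning, which is one way to see how the choice of scoring structure interacts with nonstationary skill.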

In summary, we’d like the game to balance three aspects:

1. Some positive feedback mechanism so that when you’re ahead you can use this advantage to increase your lead.

2. Some responsiveness to changes in effort and skill during the game, so that by pushing really hard or coming up with a clever new strategy you can come back from behind.

3. Uncertainty, as emphasized by Costikyan.

I’m sure that game designers have thought systematically about such things; I just don’t know where to look.

Those annoying people-are-stupid narratives in journalism

Palko writes:

Journalists love people-are-stupid narratives, but, while I believe cognitive dissonance is real, I think the lesson here is not “To an enthusiastically trusting public, his failure only made his gifts seem more real” and is instead that we should all be more skeptical of simplistic and overused pop psychology.

It’s easier for me to just give the link above than to explain all the background. The story is interesting on its own, but here I just wanted to highlight this point that Palko makes. Yes, people can be stupid, but it’s frustrating to see journalists take a story of a lawsuit-slinging celebrity and try to twist it into a conventional pop-psychology narrative.

I love this paper but it’s barely been noticed.

Econ Journal Watch asked me and some others to contribute to an article, “What are your most underappreciated works?,” where each of us wrote 200 words or less about an article of ours that had received few citations.

Here’s what I wrote:

What happens when you drop a rock into a pond and it produces no ripples?

My 2004 article, Treatment Effects in Before-After Data, has only 23 citations and this goes down to 16 after removing duplicates and citations from me. But it’s one of my favorite papers. What happened?

It is standard practice to fit regressions using an indicator variable for treatment or control; the coefficient represents the causal effect, which can be elaborated using interactions. My article from 2004 argues that this default class of models is fundamentally flawed in considering treatment and control conditions symmetrically. To the extent that a treatment “does something” and the control “leaves you alone,” we should expect before-after correlation to be higher in the control group than in the treatment group. But this is not implied by the usual models.

My article presents three empirical examples from political science and policy analysis demonstrating the point. The article also proposes some statistical models. Unfortunately, these models are complicated and can be noisy to fit with small datasets. It would help to have robust tools for fitting them, along with evidence from theory or simulation of improved statistical properties. I still hope to do such work in the future, in which case perhaps this work will have the influence I hope it deserves.
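To make the asymmetry concrete, here is a toy simulation (my own sketch, not the model from the paper; all the numbers are made up): a treatment with heterogeneous effects lowers the before-after correlation in the treated group, exactly the pattern that the usual symmetric regression ignores.

# Minimal simulation sketch (illustrative numbers only): a treatment with
# heterogeneous effects makes the before-after correlation lower in the
# treated group than in the control group.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

before = rng.normal(50, 10, n)
treated = rng.random(n) < 0.5

# Control "leaves you alone": after is before plus a little noise.
# Treatment "does something": adds an effect that varies across people.
effect = rng.normal(5, 8, n)
after = before + rng.normal(0, 2, n) + treated * effect

r_control = np.corrcoef(before[~treated], after[~treated])[0, 1]
r_treated = np.corrcoef(before[treated], after[treated])[0, 1]
print(f"before-after correlation, control:   {r_control:.2f}")
print(f"before-after correlation, treatment: {r_treated:.2f}")
# Typically prints something like 0.98 vs. 0.77: the treated group is noisier
# in exactly the way the default symmetric model does not allow for.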

Here’s the whole collection. The other contributors were Robert Kaestner, Robert A. Lawson, George Selgin, Ilya Somin, and Alexander Tabarrok.

My contribution got edited! I prefer my original version shown above; if you’re curious about the edited version, just follow the link and you can compare for yourself.

Others of my barely-noticed articles

Most of my published articles have very few citations; it’s your usual Zipf or long-tailed thing. Some of those have narrow appeal and so, even if I personally like the work, it is understandable that they haven’t been cited much. For example, “Bayesian hierarchical classes analysis” (16 citations) took a lot of effort on our part and appeared in a good journal, but ultimately it’s on a topic that not many researchers are interested in. For another example, I enjoyed writing “Teaching Bayes to Graduate Students in Political Science, Sociology, Public Health, Education, Economics, . . .” (17 citations) and I think if it reached the right audience of educators it could have a real influence, but it’s not the kind of paper that gets built upon or cited very often. A couple of my ethics and statistics papers from my Chance column only have 14 citations each; no surprise given that nobody reads Chance. At one point I was thinking of collecting them into a book, as this could get more notice.

Some papers are great but only take you part of the way there. I really like my morphing paper with Cavan and Phil, “Using image and curve registration for measuring the goodness of fit of spatial and temporal predictions” (12 citations) and, again, it appeared in a solid journal, but it was more of a start than a finish to a research project. We didn’t follow it up, and it seems that nobody else did either.

Sometimes we go to the trouble of writing a paper and going through the review process, but then it gets so little notice that I ask myself in retrospect, why did we bother? For example, “Objective Randomised Blinded Investigation With Optimal Medical Therapy of Angioplasty in Stable Angina (ORBITA) and coronary stents: A case study in the analysis and reporting of clinical trials” has been cited only 5 times since its publication in 2019—and three of those citations were from me. It seems safe to say that this particular dropped rock produced few ripples.

What happened? That paper had a good statistical message and a good applied story, but we didn’t frame it in a general-enough way. Or . . . it wasn’t quite that, exactly. It’s not a problem of framing so much as of context.

Here’s what would’ve made the ORBITA paper work, in the sense of being impactful (i.e., useful): either a substantive recommendation regarding heart stents or a general recommendation (a “method”) regarding summarizing and reporting clinical studies. We didn’t have either of these. Rather than just getting the paper published, we should’ve done the hard work to move forward in one of those two directions. Or, maybe our strategy was ok if we can use this example in some future article. The article presented a great self-contained story that could be part of larger recommendations. But the story on its own didn’t have impact.

This is a good reminder that what typically makes a paper useful is if it can get used by people. A starting point is the title. We should figure out who might find the contents of the article useful and design the title from there.

Or, for another example, consider “Extension of the Isobolographic Approach to Interactions Studies Between More than Two Drugs: Illustration with the Convulsant Interaction between Pefloxacin, Norfloxacin, and Theophylline in Rats” (5 citations). I don’t remember this one at all, and maybe it doesn’t deserve to be read—but if it does, maybe it should’ve been more focused on the general approach so it could’ve been more directly useful to people working in that field.

“Information, incentives, and goals in election forecasts” (21 citations). I don’t know what to say about this one. I like the article, it’s on a topic that lots of people care about, the title seems fine, but not much impact. Maybe more people will look at it in 2024? “Accounting for uncertainty during a pandemic” is another one with only 21 citations. For that one, maybe people are just sick of reading about the goddam pandemic. I dunno; I think uncertainty is an important topic.

The other issue with citations is that people have to find your paper before they would consider citing it. I guess that many people in the target audiences for our articles never even knew they existed. From that perspective, it’s impressive that anything new ever gets cited at all.

Here’s an example of a good title: “A simple explanation for declining temperature sensitivity with warming.” Only 25 citations so far, but I have some hopes for this one: the title really nails the message, so once enough people happen to come across this article one way or another, I think they’ll read it and get the point, and this will eventually show up in citations.

“Tables as graphs: The Ramanujan principle” (4 citations). OK, I love this paper too, but realistically it’s not useful to anyone! So, fair enough. Similarly with “‘How many zombies do you know?’ Using indirect survey methods to measure alien attacks and outbreaks of the undead” (6 citations). An inspired, hilarious effort in my opinion, truly a modern classic, but there’s no real reason for anyone to actually cite it.

“Should we take measurements at an intermediate design point?” (3 citations). This is the one that really bugs me. Crisp title, clean example, innovative ideas . . . it’s got it all. But it’s sunk nearly without a trace. I think the only thing to do here is to pursue the research further, get new results, and publish those. Maybe also set up the procedure more explicitly as a method, rather than just the solution to a particular applied problem.

Torment executioners in Reno, Nevada, keep tormenting us with their publications.

The above figures come from this article which is listed on this Orcid page (with further background here):

Horrifying as all this is, at least from the standpoint of students and faculty at the University of Nevada, not to mention the taxpayers of that state, I actually want to look into a different bizarre corner of the story.

Let me point you to a quote from a recent article in Retraction Watch:

The current editor-in-chief [of the journal that featured the above two images, along with lots more] . . . published a statement about the criticism on the journal’s website, where he took full responsibility for the journal’s shortcomings. “While you can argue on the merits, quality, or impact of the work it is all original and we vehemently disagree with anyone who says otherwise,” he wrote.

I don’t think that claim is true. In particular, I don’t think it’s correct to state, vehemently or otherwise, that the work published in that journal is “all original.” I say this on the evidence of this paragraph from the article that appeared there, an article we associate with the phrase “torment executioners”:

It appears that the original source of this material was an article that had appeared the year before in an obscure and perhaps iffy outlet called The Professional Medical Journal. From the abstract of the paper in that journal:

The scary thing is that if you google the current editor of the journal where the apparent bit of incompetent plagiarism was published, you’ll see that this is his first listed publication:

Just in case you were wondering: no, “Cambridge Scholars Publishing” is not the same as Cambridge University Press.

Kinda creepy that someone who “vehemently” makes a false statement about plagiarism published in his own journal has published a book on “Guidelines for academic researchers.”

We seem to have entered a funhouse-mirror version of academia, with entire journals and subfields of fake articles, advisers training new students to enter fake academic careers, and, in a Gresham’s law sort of way, the crowding out of legitimate teaching and research.

Not written by a chatbot

The published article from the above-discussed journal that got this whole “torment executioners” business started was called “Using Science to Minimize Sleep Deprivation that May Reduce Train Accidents.” It’s two paragraphs long, includes a mislabeled figure that was a stock image of a fly, and has no content.

I showed that article to a colleague, who asked whether it was written by ChatGPT. I said, no, I didn’t think so because it was too badly written to be by a chatbot. I was not joking! Chatbot text is coherent at some level, often following something like the format of the standard five-paragraph high school essay, while this article did not make any sense at all. I think it’s more likely that it was a really bad student paper, maybe something written in desperation in the dwindling hours before the assignment was due, and then they published it in this fake journal. On the other hand, it was published in 2022, and chatbots were not so good back in 2022, so maybe it really is the product of an incompetent chatbot. Or maybe it was put together from plagiarized material, as in the “torment executioners” paper, and we just don’t have the original source to demonstrate it. My guess remains that it was a human-constructed bit of nonsense, but I’m guessing that anyone who would do this sort of thing today would use a chatbot. So in that sense these articles are a precious artifact of the past.

Back to the torment executioners

That apparently plagiarized article was still bugging me. One weird part of the story is that even the originally-published study seems a bit off, with statements such as “42% dentist preferred both standing and sitting position.” Maybe the authors of the “torment executioners” paper purposely picked something from a very obscure source, under the belief that then nobody would catch the copying?

What the authors of the “torment executioners” paper seem to have done is to take material from the paper that had been published earlier in a different journal and run it through a computer program that changed some of the words, perhaps to make it less easily caught by plagiarism detectors. Here’s the map of transformations:

"acquired" -> "procured"
"vision" -> "perception"
"incidence" -> "effect"
"involvement" -> "association"
"followed" -> "taken after"
"Majority of them" -> "The larger part of the dental practitioner"
"intensity of pain" -> "concentration of torment"

Ha! Now we’re getting somewhere. “Concentration of torment,” indeed.

OK, let’s continue:

"discomfort" -> "inconvenience"
"aching" -> "hurting"
"paracetamol" -> "drugs"
"pain killer" -> "torment executioners"

Bingo! We found it. It’s interesting that this last word was made plural in translation. This suggests that the computer program that did these word swaps also had some sort of grammar and usage checker, so as a side benefit it fixed a few errors in the writing of the original article. The result is to take an already difficult-to-read passage and make it nearly incomprehensible.
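For what it's worth, a crude word-swap script of the sort described above takes only a few lines. This is a purely hypothetical sketch (we have no idea what tool, if any, was actually used), but it reproduces the flavor of the output, pluralization accident and all:

# Hypothetical sketch of a naive synonym-swap script; the phrase table below
# is taken from the mappings listed in this post.
import re

SWAPS = {
    "pain killer": "torment executioner",
    "intensity of pain": "concentration of torment",
    "agreed to the fact": "concurred to the truth",
    "repetitive movements": "tedious developments",
    "high prevalence": "tall predominance",
}

def crude_rewrite(text):
    # Replace longer phrases first so multi-word swaps win over shorter ones.
    for phrase in sorted(SWAPS, key=len, reverse=True):
        text = re.sub(re.escape(phrase), SWAPS[phrase], text, flags=re.IGNORECASE)
    return text

print(crude_rewrite("The dentists agreed to the fact that repetitive movements "
                    "cause a high prevalence of pain, treated with pain killers."))
# -> "The dentists concurred to the truth that tedious developments cause a
#     tall predominance of pain, treated with torment executioners."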

But we’re not yet done with this paragraph. We also see:

"agreed to the fact" -> "concurred to the truth"

This is a funny one, because “concurred” is a reasonable synonym for “agreed,” and “truth” is not a bad replacement for “fact,” but when you put it together you get “concurred to the truth,” which doesn’t work here at all.

And more:

"pain" -> "torment level"
"aggravates" -> "bothers"
"repetitive movements" -> "tedious developments"

Whoa! That makes no sense at all. A modern chatbot would do it much better, I guess.

Here are a few more fun ones, still from this same paragraph of Ferguson et al. (2019):

"Conclusions:" -> "To conclude"
"The present study" -> "the display consideration"

“Display consideration”? Huh?

"high prevalence" -> "tall predominance"

This reminded me of Lucius Shepard’s classic story, “Barnacle Bill the Spacer,” which featured a gang called the Strange Magnificence. Maybe the computer program was having some fun here!

"disorders" -> "disarrangement"
"dentist" -> "dental specialists"
"so there should be" -> "in this manner"
"preventing" -> "avoiding"
"delivered" -> "conveyed"
"during" -> "amid"
"undergraduate curriculum" -> "undergrad educational programs"
"should be programmed" -> "ought to be put up"
"explain" -> "clarify"
"prolonged" -> "drawn out"

Finally, “bed posture density” becomes “bed pose density.” I don’t know about this whole “bed posture” thing . . . maybe someone could call up the Dean of Engineering at the University of Nevada and find out what’s up with that.

The whole article is hilarious, not just that paragraph. It’s a fun game to try to figure out the original source of phrases such as “indigent body movements” (indigent = poor) and “There are some signs when it comes to musculoskeletal as well” (I confess to being baffled by this one), and, my personal favorite, “Several studies have shown that overweight children are an actual thing.”

Whaddya say, president and provost of the University of Nevada, Reno? Are you happy that your dean of engineering is running a journal that publishes a paper like that? “Overweight children are an actual thing.”

Oh, it’s ok, that paper was never read from beginning to end by anybody—authors included.

Actually, this sentence might be my absolute favorite:

Having consolation in their shoes, having vigor in their shoes, and having quality in their shoes come to play within the behavioral design of youthful and talented kids with respect to the footwear they select to wear.

“Having vigor in their shoes” . . . that’s what it’s all about!

There’s “confidential dental clinics”: I guess “confidential” is being used as a “synonym” for private. And this:

Dental practitioners and other wellbeing callings in fact cannot dodge inactive stances for an awfully long time.

Exactly what you’d expect to see in a legitimate journal such as the International Supply Chain Technology Journal.

I think the authors of this article are well qualified to teach in the USC medical school. They just need to work in some crazy giraffe facts and they’ll be just fine.

With the existence of chatbots, there will never be a need for this sort of ham-fisted plagiarism. End of an era. Kinda makes me sad.

P.S. As always, we laugh only to avoid crying. I remain furious on behalf of the hardworking students and faculty at UNR, not to mention the taxpayers of the state of Nevada, who are paying for this sort of thing. The phrase “torment executioners” has entered the lexicon.

P.P.S. Regarding the figures at the top of the post: I’ve coauthored papers with students. That’s fine; it’s a way that students can learn. I’m not at all trying to mock the students who made those pictures, if indeed that’s who drew them. I am criticizing whoever thought it was a good idea to publish this, not to mention to include it on professional C.V.’s. As a teacher, when you work with students, you try to help them do their best; you don’t stick your name on their crude drawings, plagiarized work, etc., which can’t be anyone’s best. I feel bad for any students who got sucked into this endeavor and were told that this sort of thing is acceptable work.

P.P.P.S. It looks like there may be yet more plagiarism going on; see here.

P.P.P.P.S. Retraction Watch found more plagiarism, this time on a report for the National Science Foundation.

Clinical trials that are designed to fail

Mark Palko points us to a recent update by Robert Yeh et al. of the famous randomized parachute-jumping trial:

Palko writes:

I also love the way they dot all the i’s and cross all the t’s. The whole thing is played absolutely straight.

I recently came across another (not meant as satire) study where the raw data was complete crap but the authors had this ridiculously detailed methods section, as if throwing in a graduate level stats course worth of terminology would somehow spin this shitty straw into gold.

Yeh et al. conclude:

This reminded me of my zombies paper. I forwarded the discussion to Kaiser Fung, who wrote:

Another recent example from Covid is this Scottish study. They did so much to the data that it is impossible for any reader to judge whether they did the right things or not. The data are all locked down for “privacy.”

Getting back to the original topic, Joseph Delaney had some thoughts:

I think the parachute study makes a good and widely misunderstood point. Our randomized controlled trial infrastructure is designed for the drug development world, where there is a huge (literally life altering) benefit to proving the efficacy of a new agent. Conservative errors are being cautious and nobody seriously considers a trial designed to fail as a plausible scenario.

But you see new issues with trials designed to find side effects (e.g., RECORD has a lot more LTFU [loss to follow-up] than I saw in a drug study; when I did trials we studied how to improve adherence to improve the results—but a trial looking for side effects that cost the company money would do the reverse). We teach in pharmacy that conservative design is actually a problem in safety trials.

Even worse are trials which are aliased with a political agenda. It’s easy-peasy to design a trial to fail (the parachute trial was jumping from a height of 2 feet). That makes me a lot more critical when you see trials where the failure of the trial would be seen as an upside, because it is just so easy to botch a trial. Designing good trials is very hard (smarter people than I spend entire careers doing a handful of them). It’s a tough issue.

Lots to chew on here.

If school funding doesn’t really matter, why do people want their kid’s school to be well funded?

A question came up about the effects of school funding and student performance, and we were referred to this review article from a few years ago by Larry Hedges, Terri Pigott, Joshua Polanin, Ann Marie Ryan, Charles Tocci, and Ryan Williams:

One question posed continually over the past century of education research is to what extent school resources affect student outcomes. From the turn of the century to the present, a diverse set of actors, including politicians, physicians, and researchers from a number of disciplines, have studied whether and how money that is provided for schools translates into increased student achievement. The authors discuss the historical origins of the question of whether school resources relate to student achievement, and report the results of a meta-analysis of studies examining that relationship. They find that policymakers, researchers, and other stakeholders have addressed this question using diverse strategies. The way the question is asked, and the methods used to answer it, is shaped by history, as well by the scholarly, social, and political concerns of any given time. The diversity of methods has resulted in a body of literature too diverse and too inconsistent to yield reliable inferences through meta-analysis. The authors suggest that a collaborative approach addressing the question from a variety of disciplinary and practice perspectives may lead to more effective interventions to meet the needs of all students.

I haven’t followed this literature carefully. It was my vague impression that studies have found effects of schools on students’ test scores to be small. So, not clear that improving schools will do very much. On the other hand, everyone wants their kid to go to a good school. Just for example, all the people who go around saying that school funding doesn’t matter, they don’t ask to reduce the funding of their own kids’ schools. And I teach at an expensive school myself. So lots of pieces here, hard for me to put together.

I asked education statistics expert Beth Tipton what she thought, and she wrote:

I think the effect of money depends upon the educational context. For example, in higher education at selective universities, the selection process itself is what ensures success of students – the school matters far less. But in K-12, and particularly in under resourced areas, schools and finances can matter a lot – thus the focus on charter schools in urban locales.

I guess the problem here is that I’m acting like the typical uninformed consumer of research. The world is complicated, and any literature will be a mess, full of claims and counter-claims, but here I am expecting there to be a simple coherent story that I can summarize in a short sentence (“Schools matter” or “Schools don’t matter” or, maybe, “Schools matter but only a little”).

Given how frustrated I get when others come into a topic with this attitude, I guess it’s good for me to recognize when I do it.

Hey, here’s some free money for you! Just lend your name to this university and they’ll pay you $1000 for every article you publish!

Remember that absolutely ridiculous claim that scientific citations are worth $100,000 each?

It appears that someone is taking this literally. Or, nearly so. Nick Wise has the story:

A couple of months ago a professor received the following email, which they forwarded to me.

Dear esteemed colleagues,

We are delighted to extend an invitation to apply for our prestigious remote research fellowships at the University of Religions and Denominations (URD) . . . These fellowships offer substantial financial support to researchers with papers currently in press, accepted or under review by Scopus-indexed journals. . . .

Fellowship Type: Remote Short-term Research Fellowship. . . .

Affiliation: Encouragement for researchers to acknowledge URD as their additional affiliation in published articles.

Remuneration: Project-based compensation for each research article.

Payment Range: Up to $1000 USD per article (based on SJR journal ranking). . . .

Why would the institution pay researchers to say that they are affiliated with them? It could be that funding for the university is related to the number of papers published in indexed journals. More articles associated with the university can also improve their placing in national or international university rankings, which could lead directly to more funding, or to more students wanting to attend and bringing in more money.

The University of Religions and Denominations is a private Iranian university . . . Until recently the institution had very few published papers associated with it . . . and their subject matter was all related to religion. . . . However, last year there was a substantial increase to 103 published papers, and so far this year there are already 35. This suggests that some academics have taken them up on the offer in the advert to include URD as an affiliation.

Surbhi Bhatia Khan has been a lecturer in data science at the University of Salford in the UK since March 2023 and is a top 2% scientist in the world according to Stanford University’s rankings. She published 29 research articles last year according to Dimensions, an impressive output, in which she was primarily affiliated to the University of Salford. In addition, though, 5 of those submitted in the 2nd half of last year had an additional affiliation at the Department of Engineering and Environment at URD, which is not listed as one of the departments on the university website. Additionally, 19 of the 29 state that she’s affiliated to the Lebanese American University in Beirut, which she was not affiliated with before 2023. She is yet to mention her role at either of these additional affiliations on her LinkedIn profile.

Looking at the Lebanese American University, another private university, its publication numbers have shot up from 201 in 2015 to 503 in 2021 and 2,842 in 2023, according to Dimensions. So far in 2024 they have published 525, on track for over 6,000 publications for the year. By contrast, according to the university website, the faculty consisted of 547 full-time staff members in 2021 but had shrunk to 423 in 2023. It is hard to imagine how such growth in publication numbers could occur without a similar growth in the faculty, let alone with a reduction.

Wise writes:

How many other institutions are seeing incredible increases in publication numbers? Last year we saw gaming of the system on a grand scale by various Saudi Arabian universities, but how many offers like the one above are going around, whether by email or sent through Whatsapp groups or similar?

It’s bad news when universities in England, Iran, Saudi Arabia, and Lebanon start imitating the corrupt citation practices that we have previously associated with nearby Cornell University.

But I can see where Dr. Khan is coming from: if someone’s gonna send you free money, why not take it? Even if the “someone” is a University of Religions and Denominations, and none of your published research relates to religion, and you list an affiliation with an apparently nonexistent department.

The only thing that’s bugging me is that, according to an esteemed professor at Northeastern University, citations are worth $100,000 each—indeed, we are told that it is possible to calculate “exactly how much a single citation is worth.” In that case, Dr. Khan is getting ripped off by University of Religions and Denominations, who are offering a paltry “up to $1000”—and that’s per article, not per citation! I know about transaction costs etc. but maybe she could at least negotiate them up to $2000 per.

I can’t imagine this scam going on for long, but while it lasts you might as well get in on it. Why should professors at Salford University have all the fun?

Parting advice

Just one piece of advice for anyone who’s read this far down into the post: if you apply for the “Remote Short-term Research Fellowship” and you get it, and you send them the publication notice for your article that includes your affiliation with the university, and then they tell you that they’ll be happy to send you a check for $1000, you just have to wire them a $10 processing fee . . . don’t do it!!!

Listen to those residuals

This is Jessica. Speaking of data sonification (or sensification), Hyeok, Yea Seul Kim, and I write:

Data sonification (mapping data variables to auditory variables, such as pitch or volume) is used for data accessibility, scientific exploration, and data-driven art (e.g., museum exhibitions), among others. While a substantial amount of research has been done on effective and intuitive sonification design, software support is not commensurate, limiting researchers from fully exploring its capabilities. We contribute Erie, a declarative grammar for data sonification that enables abstractly expressing auditory mappings. Erie supports specifying extensible tone designs (e.g., periodic wave, sampling, frequency/amplitude modulation synthesizers), various encoding channels, auditory legends, and composition options like sequencing and overlaying. Using standard Web Audio and Web Speech APIs, we provide an Erie compiler for web environments. We demonstrate the expressiveness and feasibility of Erie by replicating research prototypes presented by prior work and provide a sonification design gallery. We discuss future steps to extend Erie toward other audio computing environments and support interactive data sonification.

Have you ever wanted to listen to your model fit? I haven’t, but I think it’s worth exploring how one would do so effectively, either for purposes of making data representations accessible to blind and visually impaired users, or for other purposes like data journalism or creating “immersive” experiences of data like you might find in museums.

But it turns out it’s really hard to create data sonifications with existing tools! You have to learn low-level audio programming and use multiple tools to do things like combine several sonifications into a single design. Other tools only offer the ability to make sonifications corresponding to a narrow range of chart types, perhaps as a result of a bias toward thinking about sonifications only from the perspective of how they map to existing visualizations.

Hyeok noticed some of these issues and decided to do something about it. Erie provides a flexible specification format where you can define a sonification design in terms of tone (the overall quality of a sound) and encodings (mappings from data variables to auditory features). You can compose more complex sonifications by repeating, sequencing, and overlaying sonifications, and it interfaces with standard web audio APIs. 

Documentation on how to install and use Erie is here. There’s also an online editor you can use to try out the grammar. But first I recommend playing some of the examples, which include some simple charts and recreations of data journalism examples. My favorites are the residuals from a poorly fit model and a better fitting one. Especially if you play just the data series of these back to back, the better fit should sound more consistent and slightly more harmonious.
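If you just want the flavor of the residuals example without installing anything, here is a bare-bones sketch. It is not Erie and does not use its grammar; it just maps residual magnitudes to the pitch of short sine tones using NumPy and the Python standard library, and writes WAV files you can play back to back:

# Minimal sonification sketch (not Erie): map |residual| to pitch and write a WAV.
import wave
import numpy as np

def sonify_residuals(residuals, filename, rate=44100, tone_sec=0.15,
                     base_hz=220.0, span_hz=440.0):
    # Bigger miss -> higher pitch.
    r = np.abs(np.asarray(residuals, dtype=float))
    scaled = (r - r.min()) / (r.max() - r.min() + 1e-9)
    freqs = base_hz + span_hz * scaled
    t = np.arange(int(rate * tone_sec)) / rate
    samples = np.concatenate([np.sin(2 * np.pi * f * t) for f in freqs])
    pcm = (samples * 0.3 * 32767).astype(np.int16)  # 16-bit PCM with headroom
    with wave.open(filename, "w") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(pcm.tobytes())

# Residuals from a deliberately poor fit vs. a better one; played back to back,
# the second file should sound lower and more even.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 60)
y = 2 + 0.5 * x**2 + rng.normal(0, 1, x.size)
bad = np.polyval(np.polyfit(x, y, 1), x)   # straight-line fit to curved data
good = np.polyval(np.polyfit(x, y, 2), x)
sonify_residuals(y - bad, "residuals_bad.wav")
sonify_residuals(y - good, "residuals_good.wav")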

This was really Hyeok’s vision; I can’t claim to have contributed very much to this work. But it was interesting to watch it come together. During our meetings about the project, it was initially very unfamiliar to me, trying to interpret audio variables like pitch as carrying information about data values, and I can’t really say it’s gotten easier. I guess this gets at how hard it is to make data easily consumable in a serial format like audio, at least for users who are accustomed to all the benefits of parallel visual processing. 

Social penumbras predict political attitudes (my talk at Harvard on Monday Feb 12 at noon)

Monday, February 12, 2024, 12:00pm to 1:15pm

Social penumbras predict political attitudes

The political influence of a group is typically explained in terms of its size, geographic concentration, or the wealth and power of the group’s members. This article introduces another dimension, the penumbra, defined as the set of individuals in the population who are personally familiar with someone in that group. Distinct from the concept of an individual’s social network, penumbra refers to the circle of close contacts and acquaintances of a given social group. Using original panel data, the article provides a systematic study of various groups’ penumbras, focusing on politically relevant characteristics of the penumbras (e.g., size, geographic concentration, sociodemographics). Furthermore, we show the connection between changes in penumbra membership and public attitudes on policies related to the group.

This is based on a paper with Yotam Margalit from 2021.
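As a toy illustration of the definition (my own sketch, not code from the paper; the names and data are made up): given each person's set of acquaintances and a target group, the penumbra is simply everyone outside the group who personally knows at least one member.

# Toy illustration of the penumbra definition (hypothetical data, not from the paper).
acquaintances = {
    "ann": {"bob", "cam"},
    "bob": {"ann", "dee"},
    "cam": {"ann"},
    "dee": {"bob", "eve"},
    "eve": {"dee"},
}
group = {"eve"}  # members of some social group of interest

penumbra = {person for person, knows in acquaintances.items()
            if person not in group and knows & group}
print(penumbra)  # {'dee'}: the only person who personally knows a group member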

Bayesian Analysis with Python

Osvaldo Martin writes:

The third edition of Bayesian Analysis with Python serves as an introduction to the basic concepts of applied Bayesian modeling. It adopts a hands-on approach, guiding you through the process of building, exploring and expanding models using PyMC and ArviZ. The field of probabilistic programming is in a different place today than it was when the first edition was devised in the middle of the last decade. The journey from its first publication to this current edition mirrors the evolution of Bayesian modeling itself – a path marked by significant advancements, growing community involvement, and an increasing presence in both academia and industry. Consequently, this updated edition also includes coverage of additional topics and libraries such as Bambi, for flexible and easy hierarchical linear modeling, PyMC-BART, for flexible non-parametric regression; PreliZ, for prior elicitation; and Kulprit, for variable selection.

Whether you’re a student, data scientist, researcher, or developer aiming to initiate Bayesian data analysis and delve into probabilistic programming, this book provides an excellent starting point. The content is introductory, requiring little to no prior statistical knowledge, although familiarity with Python and scientific libraries like NumPy is advisable.

By the end of this book, you will possess a functional understanding of probabilistic modeling, enabling you to design and implement Bayesian models for your data science challenges. You’ll be well-prepared to delve into more advanced material or specialized statistical modeling if the need arises.

See more at the book website

Osvaldo spent one year at Aalto in Finland (unfortunately, during the pandemic) so I know he knows what he’s writing about. Bambi is an rstanarm / brms style interface for building models with PyMC in the Python ecosystem, and Kulprit is the Python version of projpred (in R) for projective predictive model selection (which is one of my favorite research topics).
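For readers who haven't seen the Python side, here is a minimal sketch (mine, not an example from the book, and it assumes recent versions of PyMC, Bambi, and ArviZ) of the same simple regression written directly in PyMC and then with Bambi's one-line formula interface:

# Minimal sketch: the same regression in raw PyMC and via Bambi's formula API.
import numpy as np
import pandas as pd
import pymc as pm
import arviz as az
import bambi as bmb

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, size=100)
df = pd.DataFrame({"x": x, "y": y})

# Raw PyMC: priors and likelihood written out explicitly.
with pm.Model():
    alpha = pm.Normal("alpha", 0, 5)
    beta = pm.Normal("beta", 0, 5)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y_obs", mu=alpha + beta * x, sigma=sigma, observed=y)
    idata_pymc = pm.sample(1000, tune=1000, chains=2)

# Bambi: one formula line, rstanarm/brms style, with default priors chosen for you.
idata_bambi = bmb.Model("y ~ x", df).fit(draws=1000, chains=2)

print(az.summary(idata_pymc))
print(az.summary(idata_bambi))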

When all else fails, add a code comment

Another way of saying this is that you should treat inline code comments as a last resort when there is no other way to make your intentions clear.

I used to teach a session of Andrew’s statistical communication class once a year and I’d focus on communicating a computational API. Most of the students hated it because they signed up for the class to hear Andrew talk about stats, not me talk about API design. At least one student just up and walked out every year! So if you’re that student, now’s your chance to bail.

Comments considered harmful

Most academics, before they will share code with me, tell me they have to “clean it up.” I invariably tell them not to bother, and at best, they will dilly dally and shilly shally and apologize for lack of comments. What they don’t realize is that they were on the right track in the first place. The best number of inline code comments is zero. Nada. Zilch. Nil. Naught.

Why are comments so harmful? They lie! Even with the best of intent, they might not match the actual implementation. They often go stale over time. You can write whatever you want in a comment and there’s no consistency checking with the code.

You know what doesn’t lie? Code. Code doesn’t lie. So what do professional programmers do? They don’t trust comments and read the code instead. At this point, comments just get in the way.

What’s a bajillion times better than comments?

Readable code. Why? It’s self-documenting. To be self-documenting, code needs to be relatively simple and modular. The biggest mistake beginners make in writing code is lack of modularity. Without modularity, it’s impossible to build code bottom up, testing as you go.

It’s really hard to debug a huge program. It’s really easy to debug modules built up piece by piece on top of already-tested modules. So design top down, but build code bottom up. This is why we again and again stress in our writing on Bayesian workflow and in our replies to user questions on forums, that it helps immensely to scaffold up a complicated model one piece at a time. This lets you know when you add something and it causes a failure.
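Here is a tiny, generic illustration of the bottom-up pattern (the names are hypothetical, not from any particular project): write a small single-purpose function, test it, and only then build on top of it.

# Bottom-up scaffolding in miniature (hypothetical example).
import numpy as np

def standardize(x):
    """Center and scale a vector; small, single-purpose, easy to test."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def test_standardize():
    z = standardize([1.0, 2.0, 3.0, 4.0])
    assert abs(z.mean()) < 1e-12
    assert abs(z.std() - 1.0) < 1e-12

test_standardize()  # run the check before building anything on top of it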

Knowing where to draw lines between modules is, unfortunately, a matter of experience. The best way to get that experience? Read code. In the production coding world, code is read much more often than it’s written. That means much more effort typically goes into production code to make it readable. This is very unlike research code, which might be written once and never read again.

There is a tradeoff here. Code is more readable with short variable names and short function names. It’s easier to apprehend the structure of the expression a * b + c**2 than meeting_time * number_of_meetings + participants**2. We need to strike a balance with not too long, but still informative variable names.

And why are beginners so afraid of wasting horizontal space while being spendthrifts on the much more valuable vertical space? I have no explanation. But I see a lot of code from math-oriented researchers that looks like this: ((a*b)/c)+3*9**2+cos(x-y). Please use spaces around operators and no more parentheses than are necessary to disambiguate, given operator binding.
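
For instance, here’s the same expression before and after reformatting, as a quick Python sketch (using math.cos); the two lines compute exactly the same value.

from math import cos

x, y, a, b, c = 0.5, 0.25, 1.0, 2.0, 4.0

v1 = ((a*b)/c)+3*9**2+cos(x-y)          # cramped and over-parenthesized
v2 = a * b / c + 3 * 9**2 + cos(x - y)  # spaces around operators, only necessary parens
assert v1 == v2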

When should I comment?

Sometimes you’re left with no choice and have to drop in a comment as a last resort. This should be done if you’re doing something non-idiomatic with the language or coding an unusual algorithm or something very involved. In this case, a little note inline about intent and/or algebra can be helpful. That’s why commenting is sometimes called a method of last resort.

But whatever you do, comment for people who know the language better than you. Don’t write a comment that explains what a NumPy function does—that’s what the NumPy doc is for. Nobody wants to see this:

int num_observations = 513;  // declare num_observations as an integer and set equal to 513

But people who feel compelled to comment will write just this kind of thing, thinking it makes their code more professional. If you think this is a caricature, you don’t read enough code.

The other thing you don’t want to do is this:

#####################################################
################## INFERENCE CODE ###################
#####################################################
...
...
...

This is what functions are for. Write a function called inference() and call it. It will also help prevent accidental reuse of global variables, which is always a problem in scripting languages like R and Python. Don’t try to fix hundreds or thousands of lines of unstructured code with structured comments.
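
Here’s a minimal sketch of that refactor in Python; the function names and the trivial bodies are placeholders for illustration, not anyone’s real analysis code.

# Instead of banner-commented blocks sharing global variables, give each stage
# its own function and keep the top level to a few readable calls.
import random

def simulate_data(n, mu=1.0, sigma=2.0, seed=42):
    """Stand-in for loading and cleaning the real data."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

def inference(data):
    """Stand-in for the real model fit; returns a summary of the data."""
    n = len(data)
    mean = sum(data) / n
    sd = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5
    return {"mean": mean, "sd": sd}

def report(results):
    """Print a one-line summary of the fit."""
    print(f"mean = {results['mean']:.2f}, sd = {results['sd']:.2f}")

data = simulate_data(100)
results = inference(data)
report(results)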

Another thing to keep in mind is that vertical space is very precious in coding, because we want to be able to see as much of the code as we can at a time without scrolling. Don’t waste vertical space with useless or even harmful comments.

Do not, and I repeat, do not use /* ... */ style comments inline with code. It’s too easy to lose track when the comment spans a lot of lines, and it’s doubly confusing when comments are nested. Instead, use line comments (// in C++ and Stan, # in Python and R). Use the comment-region command in emacs or whatever does the same in your IDE. With line comments, the commented-out code will be very visible, as in the following example.

for (int n = 0; n < N; ++n) {
  // int x = 5;
  // int y = x * x * 3;
  // int z = normal_rng(y, 1);
  z = n * 3;
}

Compare that to what I often see, which is some version of the following.

for (int n = 0; n < N; ++n) {
  /* int x = 5;
  int y = x * x * 3;
  int z = normal_rng(y, 1); */
  z = n * 3;
}

In the first case, it's easy to just scan down the margin and see what's commented out.

After commenting out and fixing everything, please be a good and respectful citizen and just delete all the commented-out code before merging or releasing. Dead code makes the live code hard to find and one always wonders why it's still there---was it a mistake or some future plan or what? When I first showed up at Bell Labs in the mid 1990s, I was handed a 100+ page Tcl/Tk script for running a speech recognizer and told only a few lines were active, but I'd have to figure out which ones. Don't do that!

The golden triangle

What I stressed in Andrew's class is the tight interconnection between three aspects of production code:


$latex \textrm{API Documentation} \leftrightarrow \textrm{Unit tests} \leftrightarrow \textrm{Code}$

 

The API documentation should be functionally oriented and say what the code does. It might include a note as to how it does it if that is relevant to its use. An example might be different algorithms to compute the same thing that are widely known by name and useful in different situations. The API doc should ideally be specific enough to be the basis of both unit testing and coding. So I'm not saying don't document. I'm saying don't document how inline code works, document your API's intent.

The reason I call this the "golden" triangle is the virtuous cycle it imposes. If the API doc is hard to write, you know there's a problem with the way the function has been specified or modularized. With R and Python programmers, that's often because a single function is trying to do too many things, and the input and output types become a mess of dependencies. This leads to what programmers identify as a "bad smell" in the code. If the code or the unit tests are hard to write, you know there's a problem with the API specification.
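
To make the triangle concrete, here's a minimal Python sketch of an invented function, where the docstring plays the role of the API doc and the asserts play the role of the unit tests. Both are written against what the function is supposed to do, not how it happens to do it.

from math import exp, log, isclose

def log_sum_exp(xs):
    """Return log(sum(exp(x) for x in xs)), computed stably by subtracting the max.

    Raises ValueError if xs is empty.
    """
    if not xs:
        raise ValueError("log_sum_exp requires at least one value")
    m = max(xs)
    return m + log(sum(exp(x - m) for x in xs))

# Unit tests written against the documented behavior, not the implementation.
assert isclose(log_sum_exp([0.0, 0.0]), log(2.0))
assert isclose(log_sum_exp([1000.0, 1000.0]), 1000.0 + log(2.0))  # no overflow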

Clients (human and computational) are going to see and "feel" the API. That's where the "touch" is that designers like to talk about in physical object design. The API needs to feel natural for the application, or in the words of UI/UX designers, it needs to offer affordances (in the past, we might have said it should be intuitive). Design the API first from the client perspective. Sometimes you have to suffer on the testing side to maintain a clean and documentable API, but that clean API is your objective.

What about research code?

Research code is different. It doesn't have to be robust. It doesn't have to be written to be read by multiple people in the future. You're usually writing end-to-end tests rather than unit tests, though that can be dangerous. It still helps to develop bottom-up with testing.

What research code should be is reproducible. There should be a single script to run that generates all the output for a paper. That way, even if the code's ugly, at least the output's reproducible and someone with enough interest can work through it.
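
As a minimal sketch of what that single script might look like in Python: the steps below are tiny placeholders standing in for the real cleaning, fitting, and plotting code, and the file names are made up for illustration.

# run_all.py -- one script that regenerates every output for the paper.
import json
import random
from pathlib import Path

OUT = Path("output")
OUT.mkdir(exist_ok=True)

def clean_data(seed=1):
    """Stand-in for loading and cleaning the raw data."""
    rng = random.Random(seed)
    return [rng.gauss(0.3, 1.0) for _ in range(200)]

def fit_model(data):
    """Stand-in for the real model fit."""
    n = len(data)
    mean = sum(data) / n
    se = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5 / n ** 0.5
    return {"estimate": mean, "se": se}

def write_outputs(fit):
    """Write everything the paper reports to the output directory."""
    (OUT / "fit.json").write_text(json.dumps(fit, indent=2))

if __name__ == "__main__":
    write_outputs(fit_model(clean_data()))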

And of course, research code needs to be tested to make sure it's doing what it's supposed to be doing. And it needs to be audited to make sure it's not "cheating" (for example, cross-validating a time series in a way that lets future observations leak into the training folds).

Notebooks, Quarto, and other things that get in the way of coding and documenting

With all due respect to Donald Knuth (never a good start), literate programming is a terrible way to develop code. (On a related topic, I would totally recommend at least the beginning part of Knuth's notes on how to write math.)

I don't love them, but I use Quarto and Jupyter (née IPython) notebooks for writing reproducible tutorial material. But only after I've sorted out the code. These tools mix text and code and make too many compromises along the way, so they end up not being very good at either task. Arguably the worst sin is that they obfuscate the code with a bunch of text. Jupyter also makes it possible to get into inconsistent states because it doesn't automatically re-run everything. Quarto is just a terrible typesetting platform, inheriting all the flaws of pandoc and citeproc, with the added joy of HTML/LaTeX interoperability and R/Python interoperability. We use it for Stan docs so that we can easily generate both HTML and LaTeX, but it always feels like there should be a better way to do this, as it's a lot of trial and error due to the lack of a spec for markdown.

“Replicability & Generalisability”: Applying a discount factor to cost-effectiveness estimates.

This one’s important.

Matt Lerner points us to this report by Rosie Bettle, Replicability & Generalisability: A Guide to CEA discounts.

“CEA” is cost-effectiveness analysis, and by “discounts” they mean what we’ve called the Edlin factor—“discount” is a better name than factor, because it’s a number that should be between 0 and 1, it’s what you should multiply a point estimate by to adjust for inevitable upward biases in reported effect-size estimates, issues discussed here and here, for example.

It’s pleasant to see some of my ideas being used for a practical purpose. I would just add that type M and type S errors should be lower for Bayesian inferences than for raw inferences that have not been partially pooled toward a reasonable prior model.
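
To make the connection concrete, here’s a minimal sketch of how partial pooling implies a discount in the simplest normal-normal case: a prior centered at zero with scale tau, and a reported estimate with standard error se. The numbers below are made up for illustration, not recommendations.

# Shrinkage factor from a normal(0, tau) prior on the effect and a reported
# estimate b_hat with standard error se: posterior mean = discount * b_hat.
def shrinkage_discount(se, tau):
    """Multiplier between 0 and 1 applied to the raw estimate by the posterior mean."""
    return tau**2 / (tau**2 + se**2)

b_hat, se, tau = 0.80, 0.40, 0.25   # noisy estimate, skeptical prior
d = shrinkage_discount(se, tau)
print(f"discount = {d:.2f}, adjusted estimate = {d * b_hat:.2f}")
# With these made-up numbers the discount is about 0.28, so 0.80 shrinks to about 0.22.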

Also, regarding empirical estimation of adjustment factors, I recommend looking at the work of Erik van Zwet et al; here are some links:
What’s a good default prior for regression coefficients? A default Edlin factor of 1/2?
How large is the underlying coefficient? An application of the Edlin factor to that claim that “Cash Aid to Poor Mothers Increases Brain Activity in Babies”
The Shrinkage Trilogy: How to be Bayesian when analyzing simple experiments
Erik van Zwet explains the Shrinkage Trilogy
The significance filter, the winner’s curse and the need to shrink
Bayesians moving from defense to offense: “I really think it’s kind of irresponsible now not to use the information from all those thousands of medical trials that came before. Is that very radical?”
Explaining that line, “Bayesians moving from defense to offense”

I’m excited about the application of these ideas to policy analysis.

I’ve been mistaken for a chatbot

… Or not, according to what language is allowed.

At the start of the year I mentioned that I am on a bad roll with AI just now, and that roll began in late November when I received reviews back on a paper. One reviewer sent in a 150-word review saying it was written by chatGPT. The editor echoed, “One reviewer asserted that the work was created with ChatGPT. I don’t know if this is the case, but I did find the writing style unusual ….” What exactly was unusual was not explained.

That was November 20th. By November 22nd my computer shows a newly created file named ‘tryingtoproveIamnotchatbot,’ which is just a .txt where I pasted in the GitHub commits showing progress on the paper. I figured maybe this would prove to the editors that I did not submit any work by chatGPT.

I didn’t. There are many reasons for this. One is I don’t think that I should. Further, I suspect chatGPT is not so good at this (rather specific) subject and between me and my author team, I actually thought we were pretty good at this subject. And I had met with each of the authors to build the paper, its treatise, data and figures. We had a cool new meta-analysis of rootstock x scion experiments and a number of interesting points. Some of the points I might even call exciting, though I am biased. But, no matter, the paper was the product of lots of work and I was initially embarrassed, then gutted, about the reviews.

Once I was less embarrassed I started talking timidly about it. I called Andrew. I told folks in my lab. I got some fun replies. Undergrads in my lab (and others later) thought the review itself may have been written by chatGPT. Someone suggested I rewrite the paper with chatGPT and resubmit. Another that I just write back one line: I’m Bing.

What I took away from this was myriad, but I came up with a couple of next steps. I decided this was not a great peer review process and that I should reach out to the editor (and, as one co-author suggested, cc the editorial board). And another was to not be so mortified as to not talk about this.

What I took away from these steps was two things:

1) chatGPT could now control my language.

I connected with a senior editor on the journal. No one is in a good position here, and the editor and reviewers are volunteering their time in a rapidly changing situation. I feel for them and for me and my co-authors. The editor and I tried to bridge our perspectives. It seems he could not have imagined that I or my co-authors would be so offended. And I could not have imagined that the journal already had a policy of allowing manuscripts to use chatGPT, as long as it was clearly stated.

I was also given some language changes to consider, so that I might sound less like chatGPT to reviewers. These included some phrases I wrote in the manuscript (e.g., ‘the tyranny of terroir’). Huh. So where does that end? Say I start writing so that I sound less ‘like chatGPT’ to the editor and others (and I never figured out what that means), then chatGPT digests that, and then what? I adapt again? Do I eventually come back around to those phrases once they have rinsed out of the large language model?

2) Editors are shaping the language around chatGPT.

Motivated by a co-author’s suggestion, I wrote a short reflection which recently came out in a careers column. I much appreciate the journal recognizing this as an important topic and that they have editorial guidelines to follow for clear and consistent writing. But I was surprised by the concerns from the subeditors on my language. (I had no idea my language was such a problem!)

The problem was that I wrote: I’ve been mistaken for a chatbot (and similar language). The argument was that I had not been mistaken — my writing had been. The debate that ensued was fascinating. If I had been in a chatroom and this happened, then I could write ‘I’ve been mistaken for a chatbot,’ but since my co-authors and I wrote this up and submitted it to a journal, it was not part of our identities. So I was over-reaching in my complaint. I started to wonder: if I could not say ‘I was mistaken for an AI bot’ — why does the chatbot get ‘to write’? I went down an existential hole, from which I have not fully recovered.

And since then I am still mostly existing there. On the upbeat side, writing the reflection was cathartic, and the back and forth with the editors — who I know are just trying to do their jobs too — gave me more perspectives and thoughts, however muddled. And my partner recently said to me, “perhaps one day it will be seen as a compliment to be mistaken for a chatbot, just not today!”

Also, since I don’t know of an archive that takes such things, I will paste the original unedited version below.

I have just been accused of scientific fraud. It’s not data fraud (which, I guess, is a relief because my lab works hard at data transparency, data sharing and reproducibility). What I have just been accused of is writing fraud. This hurts, because—like many people—I find writing a paper a somewhat painful process.

Like some people, I comfort myself by reading books on how to write—both to be comforted by how much the authors of such books stress that writing is generally slow and difficult, and to find ways to improve my writing. My current writing strategy involves willing myself to write, multiple outlines, then a first draft, followed by much revising. I try to force this approach on my students, even though I know it is not easy, because I think it’s important we try to communicate well.

Imagine my surprise then when I received reviews back that declared a recently submitted paper of mine a chatGPT creation. One reviewer wrote that it was ‘obviously Chat GPT’ and the handling editor vaguely agreed, saying that they found ‘the writing style unusual.’ Surprise was just one emotion I had; so were shock, dismay, and a flood of confusion and alarm. Given how much work goes into writing a paper, it was quite a hit to be accused of being a chatbot—especially in short order without any evidence, and given the efforts that accompany the writing of almost all my manuscripts.

I hadn’t written a word of the manuscript with chatGPT and I rapidly tried to think through how to prove my case. I could show my commits on GitHub (with commit messages including ‘finally writing!’ and ‘Another 25 mins of writing progress!’ that I never thought I would share), I could try to figure out how to compare the writing style of my pre-chatGPT papers on this topic to the current submission, maybe I could ask chatGPT if it thought it wrote the paper…. But then I realized I would be spending my time trying to prove I am not a chatbot, which seemed a bad outcome to the whole situation. Eventually, like all mature adults, I decided what I most wanted to do was pick up my ball (manuscript) and march off the playground in a small fury. How dare they?

Before I did this, I decided to get some perspectives from others (researchers who work on data fraud, co-authors on the paper, and colleagues), and I found most agreed with my alarm. One put it most succinctly to me: ‘All scientific criticism is admissible, but this is a different matter.’

I realized these reviews captured both something inherently broken about the peer review process and—more importantly to me—about how AI could corrupt science without even trying. We’re paranoid about AI taking over us weak humans and we’re trying to put in structures so it doesn’t. But we’re also trying to develop AI so it helps where it should, and maybe that will be writing parts of papers. Here, chatGPT was not part of my work and yet it had prejudiced the whole process simply by its existential presence in the world. I was at once annoyed at being mistaken for a chatbot and horrified that reviewers and editors were not more outraged at the idea that someone had submitted AI generated text.

So much of science is built on trust and faith in the scientific ethics and integrity of our colleagues. We mostly trust others did not fabricate their data, and I trust people do not (yet) write their papers or grants using large language models without telling me. I wouldn’t accuse someone of data fraud or p-hacking without some evidence, but a reviewer felt it was easy enough to accuse me of writing fraud. Indeed, the reviewer wrote, ‘It is obviously [a] Chat GPT creation, there is nothing wrong using help ….’ So it seems, perhaps, that they did not see this as a harsh accusation, and the editor thought nothing of passing it along and echoing it, but they had effectively accused me of lying and fraud in deliberately presenting AI-generated text as my own. They also felt confident that they could discern my writing from AI—but they couldn’t.

We need to be able to call out fraud and misconduct in science. Currently, the costs to the people who call out data fraud seem too high to me, and the consequences for being caught too low (people should lose tenure for egregious data fraud, in my book). But I am worried about a world in which a reviewer can casually declare my work AI-generated, and the editors and the journal simply shuffle the review along and invite a resubmission if I so choose. It suggests not only a world in which the reviewers and editors have no faith in the scientific integrity of submitting authors—me—but also an acceptance of a world where ethics are negotiable. Such a world seems easy for chatGPT to corrupt without even trying—unless we raise our standards.

Side note: Don’t forget to submit your entry to the International Cherry Blossom Prediction Competition!

It’s bezzle time: The Dean of Engineering at the University of Nevada gets paid $372,127 a year and wrote a paper that’s so bad, you can’t believe it.

“As we look to sleep and neuroscience for answers we can study flies specifically the Drosophila melanogaster we highlight in our research.”

1. The story

Someone writes:

I recently read a paper of yours in the Chronicle about how academic fraudsters get away with it. I came across a strange case that I thought you would at least have some interest in: a faculty member owns an open-access journal that charges to publish, and then publishes a large number of papers in that journal. The most recent issue is all from the same authors (a family affair).

It is from an administrator at the University of Nevada, Reno. This concern is related to publications within a journal that may not be reputable. The Dean of Engineering has a number of publications in the International Supply Chain Technology Journal that are in question (see Google Scholar). Normally, I would contact the editor or publisher, but in this case, there are complexities.

This may not be an issue, but many of the articles are short, being 1 or 2 pages. In addition, some have a peer review process of 3 days or less. Another concern is that many of the papers do not even discuss what is in the title. Take the following paper: it presents nothing related to its title. Many of the papers read as if AI was used.

While the quality of these papers may not be of concern, the representation of these as publications could be. The person publishing them should have ethical standards that exceed those of the people under his leadership. He is also the highest-ranking official of the college of engineering and is expected to lead by example and be a good model for those under him.

If that is not enough, looking into the journal in more detail raises more ethical questions. The journal is published by PWD Group out of Texas. A lookup of PWD Group of Texas shows that Erick Jones is the Director and President. Erick Jones was also the Editor of the journal. In addition to the journal articles, even books authored by Erick Jones are published by PWD.

Looking further into the journal’s publications, you will see that there are a large number with Erick Jones Sr. and Erick Jones Jr. There are also a large number with Felicia Jefferson. Felicia is also a faculty member at UNR and the spouse of Dean Jones. A few of the papers raise concerns related to deer supply chains. The following has a very fast peer review process of a few days, and the photo captioned as a white-tailed deer is a reindeer. Another paper is even shorter, with a very fast peer review, and captions yet a different deer, which is still not a white-tailed deer. It is unlikely these papers went through a robust peer review.

While these papers’ affiliations are from before the move to UNR, the incoherence, conflict of interest, and incorrect data do not look good for UNR, and they were published either when Dr. Jefferson was applying to UNR or early upon her arrival. Similar issues arise with the timing of this article. Also, in the print version of the journal, Dr. Jefferson handles submissions (p. 3).

Maybe this information is nothing to be concerned about.  At the very least, it sheds a poor light on the scientific process, especially when a Dean is the potential abuser.  It is not clear how he can encourage high quality manuscripts from other faculty when he has been able to climb the ladder using his own publishing house. I’ll leave you with a paper with a relevant title on minimizing train accidents through minimizing sleep deprivation. It seems like a really important study.  The short read should convince you otherwise and make you question the understanding of the scientific process by these authors.

Of specific concern is whether these publications led to him, or his spouse, being hired at UNR. If these are considered legitimate papers, the entire hiring and tenure process at UNR is compromised. Similar arguments exist if these papers are used in the annual evaluation process. It also raises a conflict of interest if he pays to publish and then receives proceeds on the back end.

I have no comment on the hiring, tenure, and evaluation process at UNR, or on any conflicts of interest. I know nothing about what is going on at UNR. It’s a horrifying story, though.

2. The published paper

OK, here it is, in its entirety (except for references). You absolutely have to see it to believe it:

Compared to this, the Why We Sleep guy is a goddamn titan of science.

3. The Dean of Engineering

From the webpage of the Dean of Engineering at the University of Nevada, Reno:

Dr. Erick C. Jones is a former senior science advisor in the Office of the Chief Economist at the U.S. State Department. He is a former professor and Associate Dean for Graduate Studies at the College of Engineering at The University of Texas at Arlington.

From the press release announcing his appointment, dated July 01, 2022:

Jones is an internationally recognized researcher in industrial manufacturing and systems engineering. . . . “In Erick Jones, our University has a dynamic leader who understands how to seize moments of opportunity in order to further an agenda of excellence,” University President Brian Sandoval said. . . . Jones was on a three-year rotating detail at National Science Foundation where he was a Program Director in the Engineering Directorate for Engineering Research Centers Program. . . .

Jones is internationally recognized for his pioneering work with Radio Frequency Identification (RFID) technologies, Lean Six Sigma Quality Management (the understanding of whether a process is well controlled), and autonomous inventory control. He has published more than 243 manuscripts . . .

According to this source, his salary in 2022 was $372,127.

According to Wikipedia, UNR is the state’s flagship public university.

I was curious to see what else Jones had published so I searched him on Google scholar and took a look at his three most-cited publications. The second of these appeared to be a textbook, and the third was basically 8 straight pages of empty jargon—ironic that a journal called Total Quality Management would publish something that has no positive qualities! The most-cited paper on the list was pretty bad too, an empty bit of make-work, the scientific equivalent of the reports that white-collar workers need to fill out and give to their bosses who can then pass these along to their bosses to demonstrate how productive they are. In short, this guy seems to be a well-connected time server in the Ed Wegman mode, minus the plagiarism.

He was a Program Director at the National Science Foundation! Your tax dollars at work.

Can you imagine what it would feel like to be a student in the engineering school at the flagship university of the state of Nevada, and it turns out the school is being run by the author of this:

Our recent study has the premise that both humans and flies sleep during the night and are awake during the day, and both species require a significant amount of sleep each day when their neural systems are developing in specific activities. This trait is shared by both species. An investigation was segmented into three subfields, which were titled “Life span,” “Time-to-death,” and “Chronological age.” In D. melanogaster, there was a positive correlation between life span, the intensity of young male medflies, and the persistence of movement. Time-to-death analysis revealed that the male flies passed away two weeks after exhibiting the supine behavior. Chronological age, activity in D. melanogaster was adversely correlated with age; however, there was no correlation between chronological age and time-to-death. It is probable that the incorporation the findings of age-related health factors and increased sleep may lead toless train accidents. of these age factors when considering these options supply chain procedure for maintaining will be beneficial.

I can’t even.

P.S. The thing I still can’t figure out is, why did Jones publish this paper at all? He’d already landed the juicy Dean of Engineering job, months before submitting it to his own journal. To then put his name on something so ludicrously bad . . . it can’t help his career at all, could only hurt. And obviously it’s not going to do anything to reduce train accidents. What was he possibly thinking?

P.P.S. I guess this happens all the time; it’s what Galbraith called the “bezzle.” We’re just more likely to hear about when it happens at some big-name place like Stanford, Harvard, Ohio State, or Cornell. It still makes me mad, though. I’m sure there are lots of engineers who are doing good work and could be wonderful teachers, and instead UNR spends $372,127 on this guy.

I’ll leave the last word to another UNR employee, from the above-linked press release:

“What is exciting about having Jones as our new dean for the College of Engineering is how he clearly understands the current landscape for what it means to be a Carnegie R1 ‘Very High Research’ institution,” Provost Jeff Thompson said. “He very clearly understands how we can amplify every aspect of our College of Engineering, so that we can continue to build transcendent programs for engineering education and research.”

They’re transcending something, that’s for sure.

My challenge for Jeff Thompson: Show up at an engineering class at your institution, read aloud the entire contents (i.e., the two paragraphs) of “Using Science to Minimize Sleep Deprivation that may reduce Train Accidents,” then engage the students in a discussion of what this says about “the current landscape for what it means to be a Carnegie R1 ‘Very High Research’ institution.”

Should be fun, no? Just remember, the best way to keep the students’ attention is to remind them that, yes, this will be covered on the final exam.

P.P.P.S. More here from Retraction Watch.

P.P.P.P.S. Still more here.

P.P.P.P.P.S. Retraction Watch found more plagiarism, this time on a report for the National Science Foundation.

Cherry blossoms—not just another prediction competition

It’s back! As regular readers know, the Cherry Blossom Prediction Competition will run throughout February 2024. We challenge you to predict the bloom date of cherry trees at five locations throughout the world and win prizes.

We’ve been promoting the competition for three years now—but we haven’t really emphasized the history of the problem. You might be surprised to know that bloom date prediction interested several famous nineteenth century statisticians. Co-organizer Jonathan Auerbach explains:

The “law of the flowering plants” states that a plant blooms after being exposed to a predetermined quantity of heat. The law was discovered in the mid-eighteenth century by René Réaumur, an early adopter of the thermometer—but it was popularized by Adolphe Quetelet, who devoted a chapter to the law in his Letters addressed to HRH the Grand Duke of Saxe Coburg and Gotha (Letter Number 33). See this tutorial for details.

Kotz and Johnson list Letters as one of eleven major breakthroughs in statistics prior to 1900, and the law of the flowering plants appears to have been well known throughout the nineteenth century, influencing statisticians such as Francis Galton and Florence Nightingale. But its popularity waned among statisticians as statistics transitioned from a collection of “fundamental” constants to a collection of principles for quantifying uncertainty. In fact, Ian Hacking mocks the law as a particularly egregious example of stamp collecting statistics.

But the law is widely used today! Charles Morren, Quetelet’s collaborator, later coined the term phenology, the name of the field that currently studies life-cycle events, such as bloom dates. Phenologists keep track of accumulated heat or growing degree days to predict bloom dates, crop yields, and the emergence of insects. Predictions are made using a methodology that is largely unchanged since Quetelet’s time—despite the large amounts of data now available and amenable to machine learning.