OpenData Latinoamerica

Miguel Paz writes:

Poderomedia Foundation and PinLatam are launching OpenDataLatinoamerica.org, a regional data repository to free data and use it on Hackathons and other activities by HacksHackers chapters and other organizations.

We are doing this because the road to the future of news has been littered with lost datasets. A day or so after every hackathon and meeting where a group has come together to analyze, compare and understand a particular set of data, someone tries to remember where the successful files were stored. Too often, no one is certain. Therefore with Mariano Blejman we realized that we need a central repository where you can share the data that you have proved to be reliable: OpenData Latinoamerica, which we are leading as ICFJ Knight International Journalism Fellows.

If you work in Latin America or Central America your organization can take part in OpenDataLatinoamerica.org. To apply, go to the website and answer a simple form agreeing to meet the standard criteria for open data. Once the application is approved, you will receive an account to start running and managing open data, becoming part of the community.

Crime novels for economists

Following up on this post by Noah Smith on economics in science fiction, Mark Palko writes on economics in crime fiction.

Just as almost all science fiction is ultimately about politics, one could say that just about all crime fiction is about economics.

But if I had to pick one crime novelist with an economics focus, I’d pick George V. Higgins. In one of his novels, his character Jerry Kennedy had a riff on the difference between guys who get a salary and guys who have to work for every dollar. But, really, almost all his novels are full of economics.

Actually, I have no problem with this graph

Tom Salvesen asks, is this the worst info-graphic of the year?

I say, no. Nobody really cares about these numbers. It’s an amusing feature. The alternative would not be a better display of these data; the alternative would be some photo or cartoon. They’re just having fun. I wouldn’t give it any design awards, but it’s fine. It is what it is.

The recursion of pop-econ

Dave Berri posted the following at the Freakonomics blog:

The “best” picture of 2012 was Argo. At least that’s the film that won the Oscar for best picture. According to the Oscars, the decision to give this award to Argo was made by the nearly 6,000 voting members of the Academy of Motion Picture Arts and Sciences. . . . In other words, this choice is made by the “experts.” There is, though, another group that we could have listened to on Sunday night. That group would be the people who actually spend money to go to the movies. . . . According to that group, Marvel’s the Avengers was the “best” picture in 2012. With domestic revenues in excess of $600 million, this filmed earned nearly $200 million more than any other picture. And when we look at world-wide revenues, this film brought in more than $1.5 billion. . . . Despite what seems like a clear endorsement by the customers of this industry, the Avengers was ignored by the Oscars. Perhaps this is just because I am an economist, but this strikes me as odd. Movies are not a product made just for the members the academy. These ventures are primarily made for the general public. And yet, when it comes time to decide which picture is “best,” the opinion of the general public seems to be ignored. Essentially the Oscars are an industry statement to their customers that says: “We don’t think our customers are smart enough to tell us which of our products are good. So we created a ceremony to correct our customers.”

He keeps going along those lines for a while and concludes:

One would hope the Academy would at least pay a bit more attention to the people paying the bills. Not only does it seem wrong (at least to this economist) to argue that movies many people like are simply not that good, focusing on the box office would seem to make good financial sense for the Oscars as well. A recent Slate article argued that the Oscars’ telecast tends to have higher ratings when more commercially successful films are nominated for best picture. So in the future, maybe voters for the Oscars will pay a bit more attention to their customers. These customers may not be thought of as “movie experts.” But these are the people who pay the bills, and therefore, ultimately it is their opinion that should matter to this industry.

What strikes me about this discussion is the mix of descriptive and normative that seems so characteristic of pop-microeconomics. (I should emphasize here that I’m not using “pop” in any sort of derogatory way. I’m speaking of serious economic writing that is intended for a popular audience.)

1. On one hand, you have the purely descriptive perspective: economist as person-from-Mars, looking at human society objectively, the way a scientist studies cell cultures in a test tube. Consumer sovereignty is what it’s all about, with a slightly offended tone that anyone could think otherwise. Who are you, smartypants, to think you know better than the average ticket-buyer, etc. I’m reminded of the perhaps-apocryphal story of the “some academics” who “conclude that bookmakers simply aren’t very smart.”

2. At the same time, we’re given a moral lesson. The Avengers is the best movie because it made more money. It is “the people who pay the bills” whose opinion “should matter to this industry.”

The difficulty, of course, is that lesson 2 gets blurred if it is folded into lesson 1.

Berri’s argument is that moviemakers should not be paternalistically ignoring the attitudes of their customers in giving awards. But this argument dissolves if you take one step back and consider moviemakers as independent business operators. In that case, their business decisions (to do the Oscars however they want) deserve as much respect as moviegoers’ decisions about which movies to watch.

As far as I’m concerned, the Academy can do whatever they want. What’s interesting to me here is to see how the economist’s explicitly non-normative ideology (his implication that the “best” picture must be the one with most revenue, and that any other criteria would be disrespectful of moviegoers) so quickly becomes normative (that it’s “wrong . . . to argue that movies many people like are simply not that good”). To me, it’s a strange mixture of idealism and cynicism. The man from Mars has become a scold.

Same old same old

In an email I sent to a colleague who’s writing about lasso and Bayesian regression for R users:

The one thing you might want to add, to fit with your pragmatic perspective, is to point out that these different methods are optimal under different assumptions about the data. However, these assumptions are never true (even in the rare cases where you have a believable prior, it won’t really follow the functional form assumed by bayesglm; even in the rare cases where you have a real loss function, it won’t really follow the mathematical form assumed by lasso etc), but these methods can still be useful and be given the interpretation of regularized estimates.

Another thing that someone might naively think is that regularization is fine but “unbiased” is somehow the most honest. In practice, if you stick to “unbiased” methods such as least squares, you’ll restrict the number of variables you can include in your model. So in reality you suffer from omitted-variable bias. So there is no safe home base. It’s not like the user can simply do unregularized regression and then think of regularization as a frill. The practitioner who uses unregularized regression has already essentially made a compromise with the devil by restricting the number of predictors in the model to a “manageable” level (whatever that means).
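To make the “no safe home base” point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not something from the email: 50 predictors with an orthonormal design (so each least-squares coefficient is just the truth plus unit noise, and the lasso estimate reduces to soft-thresholding), five nonzero true coefficients, and a hypothetical analyst who keeps only two predictors to stay “manageable.”

```python
import random

def soft_threshold(z, lam):
    """Lasso estimate of one coefficient when the design is orthonormal."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def total_sq_error(estimates, truth):
    """Sum of squared errors across all coefficients."""
    return sum((e - t) ** 2 for e, t in zip(estimates, truth))

rng = random.Random(1)
# Hypothetical truth: 5 real effects of size 3, then 45 pure-noise predictors.
truth = [3.0] * 5 + [0.0] * 45
keep = 2      # number of predictors the "manageable" unregularized model keeps
reps = 200
err_full = err_restricted = err_lasso = 0.0

for _ in range(reps):
    # Unbiased least-squares estimates: truth plus unit-variance noise.
    ols = [t + rng.gauss(0.0, 1.0) for t in truth]
    # Unregularized but restricted: keep the first `keep`, drop (zero) the rest.
    restricted = ols[:keep] + [0.0] * (len(truth) - keep)
    # Regularized fit on all 50 predictors (penalty 1.0 is an arbitrary choice).
    lasso = [soft_threshold(z, 1.0) for z in ols]
    err_full += total_sq_error(ols, truth) / reps
    err_restricted += total_sq_error(restricted, truth) / reps
    err_lasso += total_sq_error(lasso, truth) / reps

print("unbiased, all 50 predictors:", round(err_full, 1))
print("unbiased, only 2 predictors:", round(err_restricted, 1))
print("soft-thresholded, all 50:   ", round(err_lasso, 1))
```

In this toy setup the fully unbiased fit pays for noise in 50 coefficients, the restricted fit pays for the real effects it left out, and the regularized fit does better than both. The numbers depend entirely on the assumed truth and penalty, so treat it as a picture of the tradeoff and not as evidence about any particular dataset.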

A tale of two discussion papers

Over the years I’ve written a dozen or so journal articles that have appeared with discussions, and I’ve participated in many published discussions of others’ articles as well. I get a lot out of these article-discussion-rejoinder packages, in all three of my roles as reader, writer, and discussant.

Part 1: The story of an unsuccessful discussion

The first time I had a discussion article was the result of an unfortunate circumstance. I had a research idea that resulted in an article with Don Rubin on monitoring the mixing of Markov chain simulations. I knew the idea was great, but back then we worked pretty slowly so it was a while before we had a final version to submit to a journal. (In retrospect I wish I’d just submitted the draft version as it was.) In the meantime I presented the paper at a conference. Our idea was very well received (I had a sheet of paper so people could write their names and addresses to get preprints, and we got either 50 or 150 requests; I can’t remember which, but I guess it must have been 50), but there was one person who came up later and tried to shoot down our idea. The shooter-down, Charlie Geyer, has done some great work but in this case he was confused, I think in retrospect because we did not have a clear discussion of the different inferential goals that arose in the sorts of calculations he was doing (inference for normalizing constants of distributions) and which I was doing (inference for parameters in fitted models). In any case, the result was that our new and exciting method was surrounded by an air of controversy. In some ways that was a good thing: I became well known in the field right away, perhaps more than I deserved at the time (in the sense that most of my papers up to then and for the next few years were on applied topics; it was a while before I published other major papers on statistical theory, methods, and computation). But overall I’d rather have been moderately known for an excellent piece of research than very well known for being part of a controversy. I didn’t seek out controversy; it arose because someone else criticized our work without seeing the big picture, and at the time neither he nor I nor my collaborator had the correct synthesis of my work and his criticism.

(Again, the synthesis was that he was trying to get precise answers for hard problems and was in a position where he needed to have a good understanding of the complex distributions he was simulating from, whereas I was working on a method to apply routinely in relatively easy (but nontrivial!) settings. For Charlie’s problems, my method would not suffice because he wouldn’t be satisfied until he was directly convinced that the Markov chain was exploring all the space. For my problems, Charlie’s approach (to run a million simulations and work really hard to understand the computation for a particular model) wasn’t a practical solution. His approach to applied statistics was to handcraft big battleships to solve large problems, one at a time. I wanted to fit lots of small and medium-sized models (along with the occasional big one), fast.)

Anyway, this “different methods for different goals” conversation never occurred, hence I left that meeting with an unpleasant feeling that our method was controversial, not fully accepted, and not fully understood. So I got it into my head that our article should be published as a discussion, so that Geyer and others could comment and we could respond.

But we never had that discussion, not in those words. Neither Charlie nor I nor Don Rubin was aware enough of the sociological context, as it were, so we ended up talking past each other.

In retrospect, that particular discussion did not work so well.

Here’s another example from about the same time, the Ising model. Here’s one chain from the Gibbs sampler. After 2000 iterations, it looks like it’s settled down to convergence (here we’re plotting the log probability density, which is a commonly used summary for this sort of distribution).

But then look at the second plot: the first 500 iterations. If we’d only seen these, we might have been tempted to declare victory too early!

At this point, the naive take-home point might be that 500 iterations was not enough but we’re safe with 2000. But no! Even though the last bit of those 2000 looks as stationary and clean as can be, if we start from a different point and run for 2000, we get something different:

This one looks stationary too! But a careful comparison with the graphs above (even clearer when I displayed these on transparency sheets and overlaid them on the projector) reveals that the two “stationary” distributions are different. The chains haven’t mixed, the process hasn’t converged. R-hat reveals this right away (without even having to look at the graphs, but you can look at the graphs if you want).
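For readers who want to see what that check computes, here is a minimal sketch of the basic (non-split) R-hat, the potential scale reduction factor, comparing between-chain and within-chain variance. The chains below are not Ising simulations; `noisy_chain` is a hypothetical stand-in that produces a flat, stationary-looking trace around a fixed level, which is enough to mimic two chains that each look fine alone but sit in different places.

```python
import random
import statistics

def r_hat(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    chain_vars = [statistics.variance(c) for c in chains]
    w = statistics.fmean(chain_vars)            # within-chain variance W
    b = n * statistics.variance(chain_means)    # between-chain variance B
    var_plus = (n - 1) / n * w + b / n          # pooled variance estimate
    return (var_plus / w) ** 0.5

def noisy_chain(level, n, rng):
    # Hypothetical stand-in for a trace of log density: noise around one level.
    return [level + rng.gauss(0.0, 1.0) for _ in range(n)]

rng = random.Random(0)
# Two chains that each look stationary but have settled at different levels.
stuck = [noisy_chain(0.0, 2000, rng), noisy_chain(5.0, 2000, rng)]
# Two chains exploring the same distribution.
mixed = [noisy_chain(0.0, 2000, rng), noisy_chain(0.0, 2000, rng)]

print("R-hat, chains at different levels:", round(r_hat(stuck), 2))
print("R-hat, chains that agree:         ", round(r_hat(mixed), 2))
```

A value near 1 says the chains are indistinguishable on this summary; a value well above 1 says they disagree, which is exactly what a single chain can never reveal. Current software typically uses refinements such as splitting each chain in half, which this sketch leaves out.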

As I wrote in our article in Bayesian Statistics 4,

This example shows that the Gibbs sampler can stay in a small subset of its space for a long time, without any evidence of this problematic behavior being provided by one simulated series of finite length. The simplest way to run into trouble is with a two-chambered space, in which the probability of switching chambers is very low, but the above graphs are especially disturbing because the probability density in the Ising model has a unimodal (in the sense that this means anything in a discrete distribution) and approximately Gaussian marginal distribution on the gross scale of interest. That is, the example is not pathological; the Gibbs sampler is just very slow. Rather than being a worst-case example, the Ising model is typical of the probability distributions for which iterative simulation methods were designed, and may be typical of many posterior distributions to which the Gibbs sampler is being applied.

So that was my perspective: start from one point and the chain looks fine; start from two points and you see the problem. But Charlie had a different attitude toward the Ising example. His take on it was: the Ising model is known to be difficult, no one but a fool would try to simulate it with 2000 iterations of a Gibbs sampler. There’s a huge literature on the Ising model already!

Charlie was interested in methods for solving large, well-understood problems one at a time. I was interested in methods that would be used for all sorts of problems by statisticians such as myself who, for applied reasons, bite off more in model than we can chew in computation and understanding. For Charlie with the Ising model, multiple sequences missed the point entirely, as he knew already that 2000 iterations of Gibbs wouldn’t do it. For me, though . . . as an applied guy I was just the kind of knucklehead who might apply Gibbs to this sort of problem (in my defense, Geman and Geman made a similar mistake in 1984, I’ve been told), so it was good to have a practical convergence check.

Again, I think that in our discussion and rejoinder, Don and I presented our method well, in the context of our applied purposes. But I think it would’ve worked better as a straight statistics article. Nothing much useful came out of the discussion because none of us cut through to the key difference in the sorts of problems we were working on.

Part 2: A successful discussion

In the years since then, I’ve realized that communication is more than being right (or, should I say, thinking that one is right). Statistical ideas (and, for that matter, mathematical and scientific ideas in general) are sometimes best understood through their limitations. It’s Lakatos’s old “proofs and refutations” story all over again.

Recently I was involved in a discussion that worked out well. It started a few years ago with a post of mine on the differences between the sorts of data visualizations that go viral on the web (using some examples that were celebrated by statistician/designer Nathan Yau), as compared to statistical graphics of the sort that we are trained to make. It seemed to me that many visualizations that are successful with general audiences feature unique or striking designs and puzzle-like configurations, whereas the most successful statistical graphics have more transparent formats that foreground data comparisons. Somewhere in between are the visualizations created by lab scientists, who generally try to follow statistical principles but usually (in my view) try too hard to display too much information on a single plot.

My posts, and various follow-ups, were disliked by many in the visualization community. They didn’t ever quite disagree with my claim that many successful visualizations involve puzzles, but they didn’t like what they perceived as my negative tone.

In an attempt to engage the fields of statistics and visualization more directly, I wrote an article (with Antony Unwin) on the different goals and different looks of these two sorts of graphics. Like many of my favorite papers, this one took a lot of effort to get into a journal. But finally it was accepted in the Journal of Computational and Graphical Statistics, with discussion.

The discussants (Stephen Few, Robert Kosara, Paul Murrell, and Hadley Wickham; links to all four discussions are here on Kosara’s blog) politely agreed with us on some points and disagreed with us on others. And then it was time for us to write our rejoinder.

In composing the rejoinder I finally came upon a good framing of the problem. Before, we’d spoken of statistical graphs and information visualization as having different goals and looking different. But that didn’t work. No matter how often I said that it could be a good thing that an infovis is puzzle-like, or no matter how often I said that as a statistician I would prefer graphing the data like This but I can understand how graphing it like That could attract more viewers . . . no matter how much I said this sort of thing, it was interpreted as a value judgment (and it didn’t help when I said that something “sucked,” even if I later modified that statement).

Anyway, my new framing, that I really like, is in terms of tradeoffs. Not “two cultures,” not “different goals, different looks,” but tradeoffs. So it’s not stat versus infographics; instead it’s any of us trying to construct a graph (or, better still, a grid of graphs) and recognizing that it’s not generally possible to satisfy all goals at once, so we have to think about what goals are most important in any given situation:

In the internet age, we should not have to choose between attractive graphs and informational graphs: it should be possible to display both, via interactive displays. But to follow this suggestion, one must first accept that not every beautiful graph is informative, and not every informative graph is beautiful.

Yes, it can sometimes be possible for a graph to be both beautiful and informative, as in Minard’s famous Napoleon-in-Russia map, or more recently the Baby Name Wizard, which we featured in our article. But such synergy is not always possible, and we believe that an approach to data graphics that focuses on celebrating such wonderful examples can mislead people by obscuring the tradeoffs between the goals of visual appeal to outsiders and statistical communication to experts.

So it’s not Us versus Them, it’s each of us choosing a different point along the efficient frontier for each problem we care about.

And I think the framing worked well. At least, it helped us communicate with Robert Kosara, one of our discussants. Here’s what Kosara wrote, after seeing our article, the discussions (including his), and our rejoinder:

There are many, many statements in that article [by Gelman and Unwin] that just ask to be debunked . . . I [Kosara] ended up writing a short response that mostly points to the big picture of what InfoVis really is, and that gives some examples of the many things they missed.

While the original article is rather infuriating, the rejoinder is a great example of why this kind of conversation is so valuable. Gelman and Unwin respond very thoughtfully to the comments, seem to have a much more accurate view of information visualization than they used to, and make some good points in response.

Great! A discussion that worked! This is how it’s supposed to go: not a point-scoring debate, not people talking past each other, but an honest and open discussion.

Reflections

Perhaps my extremely, extremely frustrating experience early in my career (detailed in Part 1 above) motivated me to think seriously about the Lakatosian attitude toward understanding and explaining ideas. If you compare Bayesian Data Analysis to other statistics books of that era, for example, I think we did a pretty good job (although maybe not good enough) of understanding the methods through their limitations. But even with all my experience and all my efforts, this can be difficult, as revealed by the years it took for us to finally process our ideas on graphics and visualization to the extent that we could communicate with experts in these fields.

Of parsing and chess

Gary Marcus writes,

An algorithm that is good at chess won’t help parsing sentences, and one that parses sentences likely won’t be much help playing chess.

That is soooo true. I’m excellent at parsing sentences but I’m not so great at chess. And, worse than that, my chess ability seems to be declining from year to year.

Which reminds me: I recently read Frank Brady’s much lauded Endgame, a biography of Bobby Fischer. The first few chapters were great, not just the Cinderella story of his steps to the world championship, but also the background on his childhood and the stories of the games and tournaments that he lost along the way.

But after Fischer beats Spassky in 1972, the book just dies. Brady has chapter after chapter on Fischer’s life, his paranoia, his girlfriends, his travels. But, really, after the chess is over, it’s just sad and kind of boring. I’d much rather have had twice as much detail on the first part of the life and then had the post-1972 era compressed into a single chapter. I mean, sure, I respect that Brady wanted to tell the full life story, and I’m not telling him how he should’ve written his book, I’m just giving my reactions.

Also, I would’ve liked more information on the games: what was the amazing set of moves that Fischer did in the so-called Game of the Century, what happened in some of the games he lost, and so on. In an afterword, Brady writes that he decided not to include any games so as to make the book more accessible. What I wonder is, how many readers are there like me, who enjoy chess, could understand a diagram and some discussion of what these amazing plays were, even if we couldn’t follow an entire game written on the page or have the patience to play one out on the board. I wouldn’t have gotten much out of transcripts of chess games, but a few diagrams and discussions of key moments, that would’ve made the book a lot more interesting to me.

P.S. After Kasparov beat Karpov in the final game of their tournament—the game where both players knew that Kasparov had to win, that a draw wouldn’t be enough—I clipped the game out of the newspaper and later played it out with my dad. That was a game. To my ignorant eyes, there was no single point where I could spot a mistake by Karpov. Kasparov just gradually and imperceptibly got to a winning position. Amazing.

Like Casper the ghost, Niall Ferguson is not only white. He is also very, very adorable.

Is Felix Salmon wrong on free TV?

Mark Palko writes:

Salmon is dismissive of the claim that there are fifty million over-the-air television viewers:

The 50 million number, by the way, should not be considered particularly reliable: it’s Aereo’s guess as to the number of people who ever watch free-to-air TV, even if they mainly watch cable or satellite. (Maybe they have a hut somewhere with an old rabbit-ear TV in it.)

And he strongly suggests the number is not only smaller but shrinking. By comparison, here’s a story from the broadcasting news site TV News Check from June of last year (if anyone has more recent numbers please let me know):

According to new research by GfK Media, the number of Americans now relying solely on over-the-air (OTA) television reception increased to almost 54 million, up from 46 million just a year ago. The recently completed survey also found that the demographics of broadcast-only households skew towards younger adults, minorities and lower-income families.

As Palko says, Salmon is usually a pretty careful reporter. And this one should be right up his alley. Here’s Palko again:

We’ve talked about how well over-the-air television compares to cable (for some people), how new and apparently successful businesses are springing up around OTA, and how the number of viewers getting their television through antennas appears to have been growing substantially since the introduction of digital. What we haven’t covered so far is the potential social impact of killing broadcast television.

It is almost axiomatic that, if you have a resource that is used in one way by people at the top of the economic ladder and in another way by people on the bottom and you “let the market decide” what to do with the resource, it will go with the people who have the money. . . .

This becomes particularly troubling when we’re talking about a publicly held resource. . . . What groups rely heavily on broadcast television? What groups would have the most difficulty finding alternatives?

People in the bottom one or two deciles are going to be in trouble. Even the lowest tier of cable would represent a significant monthly expense. People with limited residential security will be even worse off. People with limited income security will face a difficult choice: sign up for exorbitant no-contract plans or commit to a financial obligation they may not be able to fulfill. People with poor credit histories will have to come up with large deposits every time they move. . . .

Palko summarizes:

OTA [over-the-air television] is a promising technology supporting an innovative and growing industry, serving important economic and social roles.

The technology is doing fine in the marketplace. It’s lobbyists who are likely to kill it.

I wonder what Salmon’s take is on this. Is Palko missing something, or does he just happen to be sharing a perspective that is different from that of NYC-based financial journalists?

P.S. Let me emphasize that this post is not some sort of trolling of Felix Salmon. I’m a big fan of his quantitatively sophisticated reporting, which is why it’s interesting if he’s getting something wrong.

P.P.S. There’s some dispute about that 54 million number. Salmon points to this news article by Michael Grotticelli:

Free, over-the-air television viewing of broadcast TV signals are now watched by only 9 percent of the U.S. population — down from 16 percent in 2003, according to Nielsen, the major TV and radio rating service. . . .

The Nielsen numbers are certain to cause a dispute with the NAB, which has insisted the amount of over-the-air viewing is increasing in an era of cord-cutting. Last summer, the NAB produced a survey by Knowledge Networks citing about 18 percent as “broadcast exclusive” households. That total was 54 million Americans — up from 46 million in 2011.

So, one claim is that 9% watch any over-the-air TV, the other is that 18% only watch over-the-air TV. That’s a big gap.

Against optimism about social science

Social science research has been getting pretty bad press recently, what with the Excel buccaneers who didn’t know how to handle data with different numbers of observations per country, and the psychologist who published dozens of papers based on fabricated data, and the Evilicious guy who wouldn’t let people review his data tapes, etc etc. And that’s not even considering Dr. Anil Potti.

On the other hand, the revelation of all these problems can be taken as evidence that things are getting better. Psychology researcher Gary Marcus writes:

There is something positive that has come out of the crisis of replicability—something vitally important for all experimental sciences. For years, it was extremely difficult to publish a direct replication, or a failure to replicate an experiment, in a good journal. . . . Now, happily, the scientific culture has changed. . . . The Reproducibility Project, from the Center for Open Science is now underway . . .

And sociologist Fabio Rojas writes:

People may sneer at the social sciences, but they hold up as well. Recently, a well known study in economics was found to be in error. People may laugh because it was an Excel error, but there’s a deeper point. There was data, it could be obtained, and it could be replicated. Fixing errors and looking for mistakes is the hallmark of science. . . .

I agree with Marcus and Rojas that attention to problems of replication is a good thing. It’s bad that people are running incompetent analyses or faking data all over the place, but it’s good that they’re getting caught. And, to the extent that scientific practices are improving to help detect error and fraud, and to reduce the incentives for publishing erroneous and fraudulent results in the first place, that’s good too.

But I worry about a sense of complacency. I think we should be careful not to overstate the importance of our first steps. We may be going in the right direction but we have a lot further to go. Here are some examples:

1. Marcus writes of the new culture of publishing replications. I assume he’d support the ready publications of corrections, too. But we’re not there yet, as this story indicates:

Recently I sent a letter to the editor to a major social science journal pointing out a problem in an article they’d published, they refused to publish my letter, not because of any argument that I was incorrect, but because they judged my letter to not be in the top 10% of submissions to the journal. I’m sure my letter was indeed not in the top 10% of submissions, but the journal’s attitude presents a serious problem, if the bar to publication of a correction is so high. That’s a disincentive for the journal to publish corrections, a disincentive for outsiders such as myself to write corrections, and a disincentive for researchers to be careful in the first place. Just to be clear: I’m not complaining how I was treated here; rather, I’m griping about the system in which a known error can stand uncorrected in a top journal, just because nobody managed to send in a correction that’s in the top 10% of journal submissions.

2. Rojas writes of the notorious Reinhart and Rogoff study that, “There was data, it could be obtained, and it could be replicated.” Not so fast:

It was over two years before those economists shared the data that allowed people to find the problems in their study. If the system really worked, people wouldn’t have had to struggle for years to try to replicate an unreplicable analysis.

And, remember, the problem with that paper was not just a silly computer error. Reinhart and Rogoff also made serious mistakes handling their time-series cross-sectional data.

3. Marcus writes in a confident tone about progress in methodology: “just last week, Uri Simonsohn [and Leif Nelson and Joseph Simmons] released a paper on coping with the famous file-drawer problem, in which failed studies have historically been underreported.” I think Uri Simonsohn is great, but I agree with the recent paper by Christopher Ferguson and Moritz Heene that the so-called file-drawer problem is not a little technical issue that can be easily cleaned up; rather, it’s fundamental to our current practice of statistically-based science.
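A minimal simulation sketch of what the file-drawer problem does to a literature, under assumptions that are mine and purely illustrative: every study estimates the same small true effect with the same standard error, and only results that reach two-sided significance at the 5% level get “published.”

```python
import random
import statistics

rng = random.Random(2)
true_effect = 0.1    # hypothetical true effect size
std_error = 0.1      # hypothetical standard error of each study's estimate
n_studies = 5000

# Each study's estimate is the true effect plus sampling noise.
all_estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
# The file drawer: keep only estimates more than 1.96 standard errors from zero.
published = [est for est in all_estimates if abs(est / std_error) > 1.96]

print("mean of all studies:      ", round(statistics.fmean(all_estimates), 3))
print("mean of published studies:", round(statistics.fmean(published), 3))
print("share published:          ", round(len(published) / n_studies, 3))
```

Under these assumptions the full set of studies averages close to the true effect, while the published subset overstates it by a wide margin, because passing the significance filter requires an estimate of roughly twice the true effect or more. No single study needs to be fraudulent or even sloppy for this to happen, which is one way to see why the problem is not a small technical wrinkle.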

And there’s pushback. Biostatisticians Leah Jager and Jeffrey Leek wrote a paper, which I strongly disagree with, called “Empirical estimates suggest most published medical research is true.” I won’t go into the details here—my take on their work is that they’re applying a method that can make sense in the context of a single large study but which won’t generally work with meta-analysis—my point is that there remains a constituency for arguments that science is basically OK already.

I respect the view of Marcus, Rojas, Jager, Leek, and others that the current environment of criticism has in some ways gone too far. All those people do serious, respected research, and those of us who do serious research know how difficult it can be to publish in good journals, how hard we work—out of necessity—to consider all possible alternative explanations for any results we find, how carefully we document the steps of our data collection and analysis, and so forth. But many problems still remain.

Thomas Basbøll analogizes the difficulties of publishing scientific criticism to problems with the subprime mortgage market before the crash. He quotes Michael Lewis:

To sell a stock or bond short you need to borrow it, and [the bonds they were interested in] were tiny and impossible to find. You could buy them or not buy them but you couldn’t bet explicitly against them; the market for subprime mortgages simply had no place for people in it who took a dim view of them. You might know with certainty that the entire mortgage bond market was doomed, but you could do nothing about it.

And now here’s Basbøll:

I had a shock of recognition when I read that. I’ve been trying to “bet against” a number of stories that have been told in the organization studies literature for years now, and the thing I’m learning is that there’s no place in the literature for people who take a dim view of them. There isn’t really a genre (in the area of management studies) of papers that only points out errors in other people’s work. You have to make a “contribution” too. In a sense, you can buy the stories people are telling you or not buy them but you can’t criticize them.

This got me thinking about the difference between faith and knowledge. Knowledge, it seems to me, is a belief held in a critical environment. Faith, we might say, is a belief held in an “evangelical” environment. The mortgage bond market was an evangelical environment in which to hold beliefs about housing prices, default rates, and credit ratings on CDOs. There was no simple way to critique the “good news” . . .

Eventually, as Lewis reports, people were able to bet against the subprime mortgage market, but it wasn’t easy. And the fact that some investors, with great difficulty, were able to do it, doesn’t mean the financial system is A-OK.

Basbøll’s analogy may be going too far, but I agree with his general point that the existence of a few cases of exposure should not make us complacent. Marcus’s suggestions on cleaning up science are good ones, and we have a ways to go before they are generally implemented.
Continue reading ‘Against optimism about social science’ »

Cleaning up science

David Hogg pointed me to this post by Gary Marcus, reviewing this skeptics’ all-star issue of Perspectives on Psychological Science that features replication culture heroes Jelte Wicherts, Hal Pashler, Arina Bones, E. J. Wagenmakers, Gregory Francis, John Ioannidis, and Uri Simonsohn. I agree with pretty much everything Marcus has to say. In addition to Marcus’s suggestions, which might be called cultural or psychological, I also have various statistical ideas that might help move the field forward. Most notably I think we need to go beyond uniform priors and null-hypothesis testing to a more realistic set of models for effects and variation. I’ll discuss more at some other time, but in the meantime I thought I’d share these links.

P.S. Marcus updates with a glass-is-half-full take.

The New York Times Book of Mathematics

This was a good idea: take a bunch of old (and some recent) news articles on developments in mathematics and related areas from the past hundred years. Fun for the math content and historical/nostalgia value. Relive the four-color theorem, Fermat, fractals, and early computing.

I have too much of a technical bent to be the ideal reader for this sort of book, but it seems like an excellent gift for a non-technical reader who nonetheless enjoys math. (I assume that such people are out there, just as there are people like me who can’t read music but still enjoy reading about the subject.)

The book is organized by topic. My own preference would have been chronological and with more old stuff. I particularly enjoyed the material from many decades ago, such as the news report on one of the early computers. This must have been a fun book to compile.

One more thought on Hoover historian Niall Ferguson’s thing about Keynes being gay and marrying a ballerina and talking about poetry

We had some interesting comments on our recent reflections on Niall Ferguson’s ill-chosen remarks in which he attributed Keynes’s economic views (I don’t actually know exactly what Keynesianism is, but I think a key part is for the government to run surpluses during economic booms and deficits during recessions) to Keynes being gay and marrying a ballerina and talking about poetry. The general idea, I think, is that people without kids don’t care so much about the future, and this motivated Keynes’s party-all-the-time attitude, which might have worked just fine for Eddie Murphy’s girlfriend in the 1980s and in San Francisco bathhouses of the 1970s but, according to Ferguson, is not the ticket for preserving today’s American empire.

Some of the more robust defenders of Ferguson may have been disappointed by his followup remarks: “I should not have suggested . . . that Keynes was indifferent to the long run because he had no children, nor that he had no children because he was gay. This was doubly stupid. . . . My disagreements with Keynes’s economic philosophy have never had anything to do with his sexual orientation. It is simply false to suggest, as I did, that his approach to economic policy was inspired by any aspect of his personal life.” It’s tough to try to defend a statement that was disowned by the person saying it.

But the question then arises: What’s so horrible about what Ferguson said? After all, it’s not unreasonable to think that someone’s personal circumstances will affect their political attitudes and their views on economic policy. And certainly no one doubts that Keynes’s upper-class British background was relevant for understanding his views.

So what was up?
Continue reading ‘One more thought on Hoover historian Niall Ferguson’s thing about Keynes being gay and marrying a ballerina and talking about poetry’ »

The Folk Theorem of Statistical Computing

From an email I received the other day:

Things are going much better now — it’s interesting, it feels like with both of my models, parameters are slow to converge or get “stuck” and have trouble mixing when the model is somehow misspecified.

See here for a statement of the folk theorem.

Jesus historian Niall Ferguson and the improving standards of public discourse

History professor (or, as the news reports call him, “Harvard historian”) Niall Ferguson got in trouble when speaking at a conference of financial advisors. Tom Kostigen reports:
Continue reading ‘Jesus historian Niall Ferguson and the improving standards of public discourse’ »

NYC Data Skeptics Meetup

Rachel Schutt writes:

The hype surrounding Big Data and Data Science is at a fever pitch with promises to solve the world’s business and social problems, large and small. How accurate or misleading is this message? How is it helping or damaging people, and which people? What opportunities exist for data nerds and entrepreneurs that examine the larger issues with a skeptical view?

This Meetup focuses on mathematical, ethical, and business aspects of data from a skeptical perspective. Guest speakers will discuss the misuse of and best practices with data, common mistakes people make with data and ways to avoid them, how to deal with intentional gaming and politics surrounding mathematical modeling, and taking into account the feedback loops and wider consequences of modeling. We will take deep dives into models in the fields of Data Science, statistics, financial engineering, and economics.

This is an independent forum and open to anyone sharing an interest in the larger use of data. Technical aspects will be discussed, but attendees do not need to have a technical background.

Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism

Pointing to this news article by Megan McArdle discussing a recent study of Medicaid recipients, Jonathan Falk writes:

Forget the interpretation for a moment, and the political spin, but haven’t we reached an interesting point when a journalist says things like:

When you do an RCT with more than 12,000 people in it, and your defense of your hypothesis is that maybe the study just didn’t have enough power, what you’re actually saying is “the beneficial effects are probably pretty small”.

and

A good Bayesian—and aren’t most of us are supposed to be good Bayesians these days?—should be updating in light of this new information. Given this result, what is the likelihood that Obamacare will have a positive impact on the average health of Americans? Every one of us, for or against, should be revising that probability downwards. I’m not saying that you have to revise it to zero; I certainly haven’t. But however high it was yesterday, it should be somewhat lower today.

This is indeed an excellent news article. Also this sensible understanding of statistical significance and effect sizes:

But that doesn’t mean Medicaid has no effect on health. It means that Medicaid had no statistically significant effect on three major health markers during a two-year study. Those are related, but not the same. And in fact, all three markers moved in the right direction. They just weren’t big enough to rule out the possibility that this was just random noise in the underlying data. I’d say this suggests that it’s more likely than not that there is some effect–but also, more likely than not that this effect is small.
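To put a rough number on the power argument quoted above, here is a minimal sketch in Python. It assumes a simple two-arm trial with equal-sized groups, full compliance, and an outcome scaled to have standard deviation 1. The Medicaid study's actual design is more complicated than that, and the function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(n_total, alpha=0.05, power=0.80):
    """Smallest standardized effect a two-arm trial can reliably detect.

    Assumes (for illustration only) equal-sized arms, full compliance,
    and an outcome scaled to have standard deviation 1 in each arm.
    """
    n_per_arm = n_total / 2
    # Standard error of the difference between two group means.
    se = sqrt(1 / n_per_arm + 1 / n_per_arm)
    z = NormalDist()
    # Two-sided test at level alpha, with the requested power.
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * se


for n in (200, 2_000, 12_000):
    print(n, round(minimum_detectable_effect(n), 3))
```

Under these simplified assumptions, a trial of 12,000 people could detect an effect of about 0.05 standard deviations. That is the sense in which "not enough power" in a study this size amounts to saying the effect is probably small.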

Continue reading ‘Setting aside the politics, the debate over the new health-care study reveals that we’re moving to a new high standard of statistical journalism’ »

Culture clash

[Screenshot]

I had no idea this sort of thing even existed:

[Screenshot]

I’m reminded of our discussion of Charles Murray’s recent book on social divisions among Americans. Murray talked about differences between upper and lower class, but I thought he was really talking more about differences between liberals and conservatives among the elite. (More discussion here.)

In this particular case, Murray’s story about irresponsible elites seems to fit pretty well. At the elite level, you have well-connected D.C. gun lobbyists opposing any restrictions on personal weapons. As Murray might put it, the elites (Phil Spector aside) may be able to handle their guns, but some lower-class Americans cannot—they do things like give real rifles to 5-year-olds (!). As Murray writes, it’s a combination of cultural ignorance and a permissive ideology: I assume the senators who voted against the recent gun control bill wouldn’t give live weapons to their kids (or live in neighborhoods in which kids have access to guns at home), but they don’t feel right about restricting the rights of others to do so.

P.S. After reading some comments, I thought it might help to clarify two points.
Continue reading ‘Culture clash’ »

7 ways to separate errors from statistics

Betsey Stevenson and Justin Wolfers have been inspired by the recent Reinhart and Rogoff debacle to list “six ways to separate lies from statistics” in economics research:

1. “Focus on how robust a finding is, meaning that different ways of looking at the evidence point to the same conclusion.”

2. Don’t confuse statistical with practical significance.

3. “Be wary of scholars using high-powered statistical techniques as a bludgeon to silence critics who are not specialists.”

4. “Don’t fall into the trap of thinking about an empirical finding as ‘right’ or ‘wrong.’ At best, data provide an imperfect guide.”

5. “Don’t mistake correlation for causation.”

6. “Always ask ‘so what?’”

I like all these points, especially #4, which I think doesn’t get said enough. As I wrote a few months ago, high-profile social science research aims for proof, not for understanding—and that’s a problem.

My addition to the list

If you compare my title above to that of Stevenson and Wolfers, you’ll find two differences. First, I changed “lies” to “errors.” I have no idea who’s lying, and I’m much more comfortable talking about errors. Second, I think they missed an even better, more general way to find mistakes:

7. Make your data and analysis public.

This is the best approach, because now you can have lots of strangers checking your work for free! This advice is also particularly appropriate for Reinhart and Rogoff because, according to various reports (see here and here), it was years before they made their data available to outsiders. Nearly three years ago (!), Dean Baker wrote a column entitled, “It Would Be Helpful if Rogoff and Reinhart Made Their Data Available.”

Perhaps “the risk of forced disclosure” (as Keith O’Rourke puts it) will motivate researchers to be more careful in the future.

Your additions?

I told Wolfers I was going to link to his list and add my own #7. He replied that we’re probably missing #8, 9, and 10. In the comments, feel free to add your favorite ways to separate errors from statistics. Phil already gave some here.

A graph at war with its caption. Also, how to visualize the same numbers without giving the display a misleading causal feel?

Kaiser Fung discusses the following graph that is captioned, “A study of 54 nations–ranked below–found that those with more progressive tax rates had happier citizens, on average.”

[Graph: 54 nations plotted by tax progressivity and average happiness]

As Kaiser writes, “from a purely graphical perspective, the chart is well executed . . . they have 54 points, and the chart still doesn’t look too crammed . . .” But he also points out that the graph’s implicit claims (that tax rates can explain happiness or cause more happiness) are not supported.

Kaiser and I are not being picky-picky-picky here. Taken literally, the graph title says nothing about causation, but I think the phrasing implies it. Also, from a purely descriptive perspective, the graph is somewhat at war with its caption. The caption announces a relationship, but in the graph, the x and y variables have only a very weak correlation. The caption says that happiness and progressive tax rates go together, but the graph uses the U.S. as a baseline, and when you move from the U.S. point on the graph to the right-hand side (more progressive taxes), you see a lot more points below the line than above the line. Thus the visual impression of the graph is that more progressive taxes will lead to lower happiness—the opposite of the message from the caption.

What can be done here?

I don’t exactly think the graph is “bad data,” and, although the graph says little directly about causation, the data have some relevance to our understanding of policy debates over taxes. If nothing else, we learn that tax progressivity and average happiness show some variation among countries. I think a start would be to reframe and put happiness on the x-axis and the tax system on the y-axis, which would allow us to see that, at any happiness level, there is a range of tax systems, with none of the very happiest countries having flat taxes.

Better still might be to make a line plot with three columns: First, a list of country names, in decreasing order from richest to poorest (using, for example, per-capita GDP (yes, I know, such data aren’t perfect!)), then a column showing tax progressivity (if that’s the measure they want to use), then a column showing average happiness.

The advantage of this pair of dotplots is that you get to see the spread in each of these variables with respect to a natural measure (how rich the country is), and there’s no implicit causal story getting in the way.
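Here is a rough sketch of that layout in Python, drawn as plain text so it runs without any plotting library. The function names are made up for illustration, and the three rows of numbers are invented placeholders, not real data for any country.

```python
def dot_column(value, lo, hi, width=21):
    """Render one value as a dot placed along a fixed-width text axis."""
    if hi == lo:
        pos = 0
    else:
        pos = round((value - lo) / (hi - lo) * (width - 1))
    return "." * pos + "o" + "." * (width - 1 - pos)


def paired_dotplot(rows, width=21):
    """Return text lines: country, progressivity dot, happiness dot.

    rows: list of (name, gdp_per_capita, progressivity, happiness).
    Rows are listed from richest to poorest.
    """
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    prog = [r[2] for r in rows]
    happy = [r[3] for r in rows]
    lines = []
    for name, _gdp, p, h in ordered:
        lines.append(
            f"{name:<12} {dot_column(p, min(prog), max(prog), width)}  "
            f"{dot_column(h, min(happy), max(happy), width)}"
        )
    return lines


# Made-up placeholder values, NOT real data for any country.
example = [
    ("Country A", 50_000, 0.30, 7.1),
    ("Country B", 20_000, 0.10, 6.0),
    ("Country C", 35_000, 0.45, 6.8),
]
print("\n".join(paired_dotplot(example)))
```

Each row is one country, ordered by wealth, with one dot per variable. You read down a column to see the spread in that variable, and nothing in the display pairs one variable against the other as cause and effect.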

“Tragedy of the science-communication commons”

I’ve earlier written that science is science communication—that is, the act of communicating scientific ideas and findings to ourselves and others is itself a central part of science. My point was to push against a conventional separation between the act of science and the act of communication, the idea that science is done by scientists and communication is done by communicators. It’s a rare bit of science that does not include communication as part of it. As a scientist and science communicator myself, I’m particularly sensitive to devaluing of communication. (For example, Bayesian Data Analysis is full of original research that was done in order to communicate; or, to put it another way, we often think we understand a scientific idea, but once we try to communicate it, we recognize gaps in our understanding that motivate further research.)

I once saw the following on one of those inspirational-sayings-for-every-day desk calendars: “To have ideas is to gather flowers. To think is to weave them into garlands.” Similarly, writing—more generally, communication to oneself or others—forces logic and structure, which are central to science.

Dan Kahan saw what I wrote and responded by flipping it around: He pointed out that there is a science of science communication. As scientists, we should move beyond the naive view of communication as the direct imparting of facts and ideas. We should think more systematically about how communications are produced and how they are understood by their immediate and secondary recipients.

The science of science communication is still in its early stages, and I’m glad that people such as Kahan are working on it. Here’s something he wrote recently explicating his theory of cultural cognition:

The motivation behind this research has been to understand the science communication problem. The “science communication problem” (as I use this phrase) refers to the failure of valid, compelling, widely available science to quiet public controversy over risk and other policy relevant facts to which it directly speaks. The climate change debate is a conspicuous example, but there are many others, including (historically) the conflict over nuclear power safety, the continuing debate over the risks of HPV vaccine, and the never-ending dispute over the efficacy of gun control. . . . The research I will describe reflects the premise that making sense of these peculiar packages of types of people and sets of factual beliefs is the key to understanding—and solving—the science communication problem. The cultural cognition thesis posits that people’s group commitments are integral to the mental processes through which they apprehend risk. . . .

I think of Kahan as part of a loose network of constructive skeptics, along with various people including Thomas Basbøll, John Ioannidis, the guys at Retraction Watch, pissed-off scholars such as Stan Liebowitz, bloggers such as Felix Salmon, and a whole bunch of psychology researchers such as Wicherts, Wagenmakers, Simonsohn, Nosek, etc. This is not meant to be a complete list but rather to give a sense of the different aspects of this movement-without-a-name. 10 or 20 or 30 years ago, I don’t think such a movement existed. There were concerns about individual studies or research programs but not such a sense of a statistics-centered crisis in science as a whole.

Giving credit where due

Gregg Easterbrook may not always be on the ball, but I 100% endorse the last section of his recent column (scroll down to “Absurd Specificity Watch”).

Earlier in the column, Easterbrook has a plug for Tim Tebow. I’d forgotten about Tim Tebow.

The Great Race

This post is by Phil.

Last summer my wife and I took a 3.5-month vacation that included a wide range of activities. When I got back, people would ask “what were the highlights of your trip?”, and I was somewhat at a loss: we had done so many things that were so different, many of which seemed really great…how could I pick? Someone said, wisely, that in six months or a year I’d be able to answer the question because some memories would be more vivid than others. They were right, and I was recently thinking back on our vacation and putting together a list of highlights — enjoyable in itself, but also worth doing to help plan future vacations.

One of the things we did was go to four evenings of track and field events at the London Olympics. After we got back, people would ask what we had seen at the Olympics. I would say “We saw Usain Bolt run the 200m, we saw the women’s 4×100m relay and the men’s 4×400m, we saw the last events of the decathlon…lots of great stuff. But my favorite was the men’s 800m.”

Trying to figure out why that was one of my favorite events to watch, I looked up some facts and statistics about the race. Perhaps unexpectedly, I think that some of the things that made it great, as both an athletic contest and a spectacle, are reflected in the stats.

Continue reading ‘The Great Race’ »

The blogroll

I encourage you to check out our linked blogs. Here’s what they’re all about:

Cognitive and Behavioral Science

BPS Research Digest: I haven’t been following this one recently, but it has lots of good links, I should probably check it more often. There are a couple things that bother me, though. The blog is sponsored by the British Psychological Society, so this sounds pretty serious. But then they run things like advertising promotions sponsored by a textbook company and highlight iffy experimental claims. For example, in 2010 they ran a wholly uncritical post on the notorious Daryl Bem study that purported to find ESP. After being called on it in the comments, the blogger (Christian Jarrett) responded with, “The stats appear sound. . . . it’s a great study. Rigorously conducted” and even defended “the discussion of quantum physics in the paper.” To be fair, though, and as he points out in comments, Jarrett wrote of Bem’s study: “this isn’t proof of psi, far from it. Needs to be replicated. I like how Bem has used standard psychological tasks as a way to explore psi. Makes it easier for other labs to try to replicate.” Jarrett writes that he tries to “strike a balance between promotion and skepticism of new findings.” Fair enough.

Decision Science News: A mix of conference announcements and reports of new research. Here’s a typical example. I love this stuff; others might find it a bit technical. Also, this blog runs ads. I wonder how much the advertisers pay? I can’t imagine anyone would pay enough to a niche blogger to make the ads worth it. I mean, sure, if an advertiser offered me enough money for me to hire a postdoc, I’d do it, but I can only imagine we’re talking really small amounts of money. A topic of discussion for Decision Science News, perhaps?

Language Log: Not much needs to be said here. This one’s a classic blog with lots of statistical content, remains strong after all these years.

Seth Roberts: I disagree with him on climate change denial, Holocaust denial, etc. Still, he’s a pioneer of self-experimentation. I hope that the next generation of psychology or medical research involves an integration of informal experimentation with statistical controls.

The Hardest Science: Mostly revolves around reproducible research. It’s where I heard the story of the lamest, grudgingest, non-retraction retraction ever.

Cultural

Light Reading: She’s like me, she likes to write and has a lot of energy. I’m still wondering what she will think of Debutante Hill (I’ll lend her my copy).

Lists and Letters of Note: Great stuff but not much new material lately; he says he’s busy working on a book.

Love the Liberry: Amazingly enough, they keep coming up with good material.

Paperpools: Not much material lately. As it should be. We want Helen DeWitt to be writing novels, not blogging!

Research as a Second Language: Anti-charismatic self-help advice. The alternative to those omnipresent shouting, obnoxious internet gurus.

Streetsblog: Good stuff. Ideally this would all be in your daily newspaper. I don’t read it too often; if I did, I’d be too angry to think about anything else all day.

Sister Blogs

The Monkey Cage: Sometimes I simul-post, other times I’ll rant there and then link from here. (for example)

The Statistics Forum: I recently formulated the plan to fill it up with 365 stories. So far, though, I’ve only received a few. So maybe just a story a week? I’m not sure what to do with this blog. An official American Statistical Association blog seems like a good thing but I don’t really know what to do with it.

Social and Political Science

Chris Blattman: International development, politics, economics, and policy.

Fivethirtyeight: Nate does a good job. I like how he can focus on whatever question he’s answering without getting overwhelmed. Here’s a good recent example.

Lane Kenworthy is a completely serious and reasonable person, just as his name would suggest.

Marginal Revolution: You’ve heard of these guys.

Monthly Labor Review Precis: Direct links to research on things that matter. Good stuff.

Overcoming Bias: He recently wrote, “most people we know talk as if they hate, revile, and despise ads. They say ads are an evil destructive manipulative force that exists only because big bad firms run the world, and use ads to control us all.” I was surprised to hear that most people Robin Hanson knows talk that way, and this gives me a new perspective on why he writes the way he does. It’s gotta be frustrating, hanging around people who talk about big bad firms and evil destructive manipulative forces.

Rajiv Sethi: He only blogs a couple times a month, but he always has something interesting to say. (The opposite of this blog, I suppose.)

The Baby Name Wizard: The one and only, by the people who, among other things, debunked the myth that there’s something special about the word “orange.” But you can just skip directly to the Name Voyager.

U.S. Census Blog: Not the funnest thing out there to read, but it’s good that the people at the Census are doing this for us. When you need good data, the Census is there for you.

Statistics and Machine Learning

Bob Carpenter: He wrote Stan.

Chance News: The original statistics blog.

Christian Robert: People who used to do theoretical statistics, now do computational statistics. This is a good thing.

Cosma Shalizi: He has an odd retro style and enough combination of common sense and knowledge of philosophy that I asked him to collaborate on my paper that became this. His set of interests and frustrations seems to overlap a lot with mine, except that he doesn’t really ride a bike and I’m sure there are some big parts of his life that don’t match to anything in mine.

Deborah Mayo: I learned about her through Shalizi. Mayo believes in learning through model checking, just like Jaynes (and me). Her blog features long comment threads and contributions from the likes of Stephen Senn.

John Cook: Like Tyler Cowen, a guy who does a lot of things but is best known for his blogging. He throws in some applied math and numerical analysis along with the statistics.

Kaiser Fung: Fun to read and utterly sensible. Among many other things, he offered a good probabilistic summary of the Lance Armstrong story, well before it finally broke.

Larry Wasserman: His perspective on statistics is different from mine (for example, he defines p(a|b) = p(a,b)/p(b), whereas I define p(a,b)=p(a|b)p(b)), but it’s good that he can get his views out there. Research proceeds in many different ways, and if everyone agreed with me (or with any single perspective), the field of statistics would make a lot less progress.

Messy Matters: This one reads a bit more like a draft of a pop-science book than like a blog. The trouble is, there are already so many pop-science books about economics and data. They’ll have to come up with their own unique twist.

Nuit Blanche: Compressive sensing: that’s cool stuff! I’m impressed by these CS guys who can effortlessly throw around terabytes of data.

Observational Epidemiology: These guys are thoughtful and I admire the effort they put into their blogging. If they’d started blogging in 2003, they would’ve been on everyone’s blogroll.

Stats Blogs: A convenient compendium, with links back to the originals.

The Numbers Guy: Carl Bialik is one of the original data journalists. He, Felix Salmon, and Nate Silver have very similar profiles (as Bill James might say).

Visualization

Chartsnthings: This is the ultimate graphics blog. The New York Times graphics team presents some great data visualizations along with the stories behind them. I love this sort of insider’s perspective.

Eager Eyes: Graphics research.

Information Aesthetics: Seriously pretty.

Junk Charts: The nitty gritty. What to read if you want to make your own graphs better.

Plain old everyday Bayesianism!

Sam Behseta writes:

There is a report by Martin Tingley and Peter Huybers in Nature on the unprecedented high temperatures at northern latitudes (Russia, Greenland, etc). What is more interesting is the authors have used a straightforward hierarchical Bayes model, and for the first time (as far as I can remember) the results are reported with a probability attached to them (P>0.99), as opposed to the usual p-value<0.01 business. This might be a sign that editors of big time science journals are welcoming Bayesian approaches.

I agree. This is a good sign for statistical communication. Here are the key sentences from the abstract:

Here, using a hierarchical Bayesian analysis of instrumental, tree-ring, ice-core and lake-sediment records, we show that the magnitude and frequency of recent warm temperature extremes at high northern latitudes are unprecedented in the past 600 years. The summers of 2005, 2007, 2010 and 2011 were warmer than those of all prior years back to 1400 (probability P > 0.95), in terms of the spatial average. The summer of 2010 was the warmest in the previous 600 years in western Russia (P > 0.99) and probably the warmest in western Greenland and the Canadian Arctic as well (P > 0.90). These and other recent extremes greatly exceed those expected from a stationary climate, but can be understood as resulting from constant space–time variability about an increased mean temperature.

As with classical p-values, these probability statements depend on an assumed model, but I agree with Sam that the expression of direct probabilities is a huge step forward from traditional practice.
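For readers who have not seen how such a direct probability is computed, here is a minimal sketch in Python. It is not the model from the Nature paper. It assumes you already have posterior simulations of each year's temperature, and the draws below are a stand-in generated from invented means purely to exercise the function.

```python
import random


def prob_warmest(draws, target):
    """Fraction of posterior draws in which `target` is the warmest year.

    draws: list of dicts mapping year -> simulated temperature, one dict
    per posterior draw. The fraction estimates Pr(target is warmest | data)
    under whatever model produced the draws.
    """
    hits = sum(1 for d in draws if max(d, key=d.get) == target)
    return hits / len(draws)


# Stand-in for real posterior simulations: hypothetical means and a common
# uncertainty, invented purely to exercise the function.
random.seed(1)
means = {2008: 0.2, 2009: 0.1, 2010: 1.0}
draws = [
    {year: random.gauss(mu, 0.3) for year, mu in means.items()}
    for _ in range(4000)
]
print(prob_warmest(draws, 2010))
```

The result is a statement about the quantity of interest given the data and the model, which is why it can be reported as "P > 0.99" rather than as a p-value.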