I’m officially no longer a “rogue”

In our Freakonomics: What Went Wrong article, Kaiser and I wrote:

Levitt’s publishers characterize him as a “rogue economist,” yet he received his Ph.D. from MIT, holds the title of Alvin H. Baum Professor at the University of Chicago, and has served as editor of the completely mainstream Journal of Political Economy. Further “rogue” credentials revealed by Levitt’s online C.V. include an undergraduate degree from Harvard, a research fellowship with the American Bar Foundation, membership in the Harvard Society of Fellows, a fellowship at the National Bureau of Economic Research, and a stint as a consultant for “Corporate Decisions, Inc.”

That’s all well and good, but, on the other hand, I too have degrees from Harvard and MIT and I also taught at the University of Chicago. But what really clinches it is that this month I gave a talk for an organization called the Corporate Executive Board. No kidding.

In my defense, I’ve never actually called myself a “rogue.” But still . . .

“Readability” as freedom from the actual sensation of reading

In her essay on Margaret Mitchell and Gone With the Wind, Claudia Roth Pierpont writes:

The much remarked “readability” of the book must have played a part in this smooth passage from the page to the screen, since “readability” has to do not only with freedom from obscurity but, paradoxically, with freedom from the actual sensation of reading [emphasis added]—of the tug and traction of words as they move thoughts into place in the mind. Requiring, in fact, the least reading, the most “readable” book allows its characters to slip easily through nets of words and into other forms. Popular art has been well defined by just this effortless movement from medium to medium, which is carried out, as Leslie Fiedler observed in relation to Uncle Tom’s Cabin, “without loss of intensity or alteration of meaning.” Isabel Archer rises from the page only in the hanging garments of Henry James’s prose, but Scarlett O’Hara is a free woman.

Well put. I wish Pierpont would come out with another book. But I think this sort of book is out of fashion nowadays. There are zillions of uncollected book reviews and literary essays that I’d love to see in book form (the hypothetical collected reviews of Anthony West, Alfred Kazin, and many others) but it seems like it won’t ever happen.

How many data points do you really have?

Joshua Clover update

Surfing the blogroll, I found myself on Helen DeWitt’s page and noticed the link to Joshua Clover, alias Jane Dark. I hadn’t checked out Clover for a while (see my reactions here and here), so I decided to head on over.

Here’s what it looked like:

“The case against the Federal minimum wage,” huh? That surprised me, as I had the vague impression that Clover was on the far left of the American political spectrum. But I guess he could have some sort of wonky thing going on, or maybe there’s some unexpected twist? It seemed a bit off of Clover’s usual cultural-criticism beat, so I clicked through to take a look . . . and it was just a boring set of paragraphs on the minimum wage.

Hmmmm. I went back to the homepage, looked around more carefully, and realized that the blog is fake, the online equivalent of those fake book spines that are used to simulate rows of books on a bookshelf.

I don’t know what happened. My guess is that Clover got tired of blogging and let the domain name lapse, and then some loser entrepreneur noticed it was still getting some hits (from DeWitt’s blog?) so they put up a fake blog.

I can only assume it was all done automatically? Somebody has a webcrawler that looks for dead sites with links, then buys them up for something close to $0 and fills ‘em with crap? Yuck.

Factual – a new place to find data

Factual collects data on a variety of topics, organizes them, and allows easy access. If you ever wanted to do a histogram of calorie content in Starbucks coffees or plot warnings with a live feed of earthquake data – your life should be a bit simpler now.

Also see DataMarket, InfoChimps, and a few older links in The Future of Data Analysis.

If you access the data through the API, you can build live visualizations like this:

Of course, you could just go to the source. Roy Mendelssohn writes (with minor edits):

Since you are both interested in data access, please look at our service ERDDAP:

http://coastwatch.pfel.noaa.gov/erddap/index.html

http://upwell.pfeg.noaa.gov/erddap/index.html

Please do not be fooled by the web pages. Everything is a service (including search and graphics) and the URL completely defines the request, and response formats are easily changed just by changing the “file extension”. The web pages are just html and javascript that use the services. For example, put this URL in your browser:

http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.png?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]

Now if you use R:


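library(ncdf4)
# Download the NetCDF file; mode="wb" keeps the binary file intact on Windows
download.file(url="http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.nc?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]", destfile="AGssta.nc", mode="wb")
AGsstaFile<-nc_open('AGssta.nc')
# count=-1 reads every value along that dimension
sst<-ncvar_get(AGsstaFile,'sst',start=c(1,1,1,1),count=c(-1,-1,-1,-1))
lonval<-ncvar_get(AGsstaFile,'longitude',1,-1)
latval<-ncvar_get(AGsstaFile,'latitude',1,-1)
nc_close(AGsstaFile)
# ncvar_get drops the length-1 time and altitude dimensions, leaving a longitude x latitude matrix
image(lonval,latval,sst,col=rainbow(30))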

Or if you use Matlab:

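% Request the same subset as a .mat file and save it locally
link='http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdBAsstamday.mat?sst[(2010-01-16T12:00:00Z):1:(2010-01-16T12:00:00Z)][(0.0):1:(0.0)][(30):1:(50.0)][(220):1:(240.0)]';
F=urlwrite(link,'cwatch.mat');
load('-MAT',F);
% The file loads as a structure named after the dataset; this request covers a 201 x 201 grid
ssta=reshape(erdBAsstamday.sst,201,201);
pcolor(double(ssta));shading flat;colorbar;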

The two services above allow access to literally petabytes of data, some observed, some from model output. I realize you guys don’t usually work in these fields, but this is part of a significant NOAA effort to make as much of its data available as possible. One more thing: if you use “last” as the time, you will always get the latest data. This allows people to set up web pages that track the latest (algal bloom) conditions, such as done by one of my colleagues.

BTW – for people who want a GUI to help with the extract from within the app, there is a product called the Environmental Data Connector that runs in ArcGIS, Matlab, R and Excel.
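The point that the URL completely defines the request can be sketched in a few lines of Python. This is only an illustration of the URL pattern quoted above, not part of ERDDAP itself: the helper names `bracket` and `griddap_url` are made up here, and the sketch only builds strings without contacting the server.

```python
# Hypothetical helpers that assemble griddap request URLs of the form quoted above.
BASE = "http://coastwatch.pfeg.noaa.gov/erddap/griddap"

def bracket(start, stop):
    """One dimension constraint, written as [(start):1:(stop)] with a stride of 1."""
    return f"[({start}):1:({stop})]"

def griddap_url(dataset, variable, ext, time, altitude, lat, lon):
    """Build a request URL; lat and lon are (start, stop) pairs."""
    query = (variable + bracket(time, time) + bracket(altitude, altitude)
             + bracket(*lat) + bracket(*lon))
    return f"{BASE}/{dataset}.{ext}?{query}"

# The same subset used in the R and Matlab examples.
request = dict(dataset="erdBAsstamday", variable="sst",
               altitude=0.0, lat=(30, 50.0), lon=(220, 240.0))

# Changing only the "file extension" changes the response format.
png_url = griddap_url(ext="png", time="2010-01-16T12:00:00Z", **request)
nc_url = griddap_url(ext="nc", time="2010-01-16T12:00:00Z", **request)

# Using "last" as the time, as described above, asks for the latest data.
latest_url = griddap_url(ext="png", time="last", **request)

print(png_url)
```

Printing `png_url` reproduces the browser URL given earlier, and `nc_url` is the one passed to `download.file` in the R example.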

Roy’s links inspired me to write another blog post, which is forthcoming.

This post is by Aleks Jakulin, follow him at @aleksj.

Standardized writing styles and standardized graphing styles

Back in the 1700s—JennyD can correct me if I’m wrong here—there was no standard style for writing. You could be discursive, you could be descriptive, flowery, or terse. Direct or indirect, serious or funny. You could construct a novel out of letters or write a philosophical treatise in the form of a novel.

Nowadays there are rules. You can break the rules, but then you’re Breaking. The. Rules. Which is a distinctive choice all its own.

Consider academic writing. Serious works of economics or statistics tend to be written in a serious style in some version of plain academic English. The few exceptions (for example, by Tukey, Tufte, Mandelbrot, and Jaynes) are clearly exceptions, written in styles that are much celebrated but not so commonly followed.

A serious work of statistics, or economics, or political science could be written in a highly unconventional form (consider, for example, Wallace Shawn’s plays), but academic writers in these fields tend to stick with the standard forms. The consensus seems to be that straight prose is the clearest way to convey interesting and important ideas. Serious popular writers such as Oliver Sacks and Malcolm Gladwell follow a slightly different formula, going with the magazine-writing tradition of placing ideas inside human stories. But they still, by and large, are trying to write clear prose.

When it comes to data graphics, though, we’re back in the freewheeling 1700s. Maybe that’s a good thing, I don’t know. But what I do know is there’s no standard way of displaying quantitative information, nor is there any acceptance of the unique virtues of the graphical equivalent of clear prose.

Serious works of social science nowadays use all sorts of data display, from showing no data at all, to tables, to un-designed Excel-style bar charts, to Cleveland-style dot and line plots, to creative new data displays, to ornamental information visualizations. The analogy in writing style would be if some journal articles were written in the pattern of Ezra Pound, others like Ernest Hemingway, and others in the style of James Joyce or William Faulkner.

I won’t try to make the case that everybody should do graphs the way I do. I accept that some people communicate with tables, others prefer infovis, and others prefer no quantitative information at all. I just think it’s interesting that prose style is so standardized—I’ve had submissions to journals criticized on the grounds that my writing is too lively!—but when it comes to display of data and models, it’s the Wild West.

For example . . .

Kaiser points to this graph from the book Poor Economics by Abhijit Banerjee and Esther Duflo:

In case you’re curious what’s actually going on here, Kaiser helpfully replots the data in a readable form:

I’d be interested in what my infovis friends would say about this. The best argument I can think of in favor of the Banerjee and Duflo graph, besides its novelty and (perhaps) attractiveness, is that its very difficulty forces the reader to work, to put in so much effort to figure out what’s going on that he or she is then committed to learning more. In contrast, one might argue that Kaiser’s direct plot is so clear that the reader can feel free to stop right there. I don’t really believe this argument—I’d rather have the clear graph and convey more information—but that’s the best I can do.

That said, if a book has dozens of informative Kaiser-style graphs, I can see the benefit of having a few goofy ones just to mix things up a bit.

Not as ugly as you look

Kaiser asks the interesting question: How do you measure what restaurants are “overrated”? You can’t just ask people, right? There’s some sort of social element here, that “overrated” implies that someone’s out there doing the rating.

Rare name analysis and wealth convergence

Steve Hsu summarizes the research of economic historian Greg Clark and Neil Cummins:

Using rare surnames we track the socio-economic status of descendants of a sample of English rich and poor in 1800, until 2011. We measure social status through wealth, education, occupation, and age at death. Our method allows unbiased estimates of mobility rates. Paradoxically, we find two things. Mobility rates are lower than conventionally estimated. There is considerable persistence of status, even after 200 years. But there is convergence with each generation. The 1800 underclass has already attained mediocrity. And the 1800 upper class will eventually dissolve into the mass of society, though perhaps not for another 300 years, or longer.

Read more at Steven’s blog. The idea of using rare names to perform this analysis is interesting – and has recently been applied to the study of nepotism in Italy.

I haven’t looked into the details of the methodology, but rare events have their own distributional characteristics, and could benefit from Bayesian modeling in sparse data conditions. Moreover, there seems to be an underlying assumption that rare names are somehow uniformly represented in the population. They might not be. A hypothetical situation: in feudal days, rare names were good at predicting who’s rich and who’s not – wealth was passed through family by name. But then industrialization perturbed the old feudal order stratified by name into one that’s stratified by skill and no longer identifiable by name.

Let’s scrutinize this new methodology! With power comes responsibility.

This post is by Aleks Jakulin.

Sports examples in class

Karl Broman writes:

I [Karl] personally would avoid sports entirely, as I view the subject to be insufficiently serious. . . . Certainly lots of statisticians are interested in sports. . . . And I’m not completely uninterested in sports: I like to watch football, particularly Nebraska, Green Bay, and Baltimore, and to see Notre Dame or any team from Florida or Texas lose.

But statistics about sports? Yawn.

As a person who loves sports, statistics, and sports statistics, I have a few thoughts:

1. Not everyone likes sports, and even fewer are interested in any particular sport. It’s ok to use sports examples, but don’t delude yourself into thinking that everyone in the class cares about it.

2. Don’t forget foreign students. A lot of them don’t even know the rules of kickball, fer chrissake!

3. Of the students who care about a sport, there will be a minority who really care. We had some serious basketball fans in our class last year.

4. I think the best solution is to cover examples in all sorts of topics, including but not limited to sports. I’ve been trying to work in more examples from areas such as cooking, sewing, and shopping.

5. In my experience, students looove education examples, stories about grades, studying, and so forth. But maybe that’s just at the sorts of colleges where I’ve taught: Columbia, Harvard, Berkeley, Chicago. Perhaps students at less elite institutions are less interested in grades.

6. Getting back to Karl’s point about sports being unimportant: Yeah, I pretty much agree with him on that one. Psychologists and economists who study sports will make the claim that the research has larger value, for example in studying decision making or in isolating some cognitive process (as in the justly-celebrated “hot-hand” study), but ultimately I think sports are valuable for their own sake. Sports are a form of art, not a topic such as medicine or education that has much interest beyond itself. That’s ok, though, as long as we’re honest about it, and as long as we also include examples that interest other students in the class.

7. Whenever you teach an applied example well, you induce some subject-matter learning. When I teach sex ratios of births, I give the probability as 0.485, not 0.5, and students learn a little bit of biology. When I teach a sports example, students learn a bit about sports and psychology (for example, the hot hand). The one thing I never never like to do is use complicated gambling examples. I have no interest in teaching students the rules of craps or the probability of getting three of a kind in a poker hand. There are lots of probability examples out there that have the same level of complexity but apply to real-world situations.

Believe the statistics, not your lying eyes

Here.

A previous discussion with Charles Murray about liberals, conservatives, and social class

From 2.5 years ago. Read all the comments; the discussion is helpful.

“False-positive psychology”

Everybody’s talkin bout this paper by Joseph Simmons, Leif Nelson and Uri Simonsohn, who write:

Despite empirical psychologists’ nominal endorsement of a low rate of false-positive findings (≤ .05), flexibility in data collection, analysis, and reporting dramatically increases actual false-positive rates. In many cases, a researcher is more likely to falsely find evidence that an effect exists than to correctly find evidence that it does not. We [Simmons, Nelson, and Simonsohn] present computer simulations and a pair of actual experiments that demonstrate how unacceptably easy it is to accumulate (and report) statistically significant evidence for a false hypothesis. Second, we suggest a simple, low-cost, and straightforwardly effective disclosure-based solution to this problem. The solution involves six concrete requirements for authors and four guidelines for reviewers, all of which impose a minimal burden on the publication process.

Whatever you think about these recommendations, I strongly recommend you read the article. I love its central example:

To help illustrate the problem, we [Simmons et al.] conducted two experiments designed to demonstrate something false: that certain songs can change listeners’ age. Everything reported here actually happened.

They go on to present some impressive-looking statistical results, then they go behind the curtain to show the fairly innocuous manipulations they performed to attain statistical significance.

A key part of the story is that, although such manipulations could be performed by a cheater, they could also seem like reasonable steps to a sincere researcher who thinks there’s an effect and wants to analyze the data a bit to understand it further.

We’ve all known for a long time that a p-value of 0.05 doesn’t really mean 0.05. Maybe it really means 0.1 or 0.2. But, as this paper demonstrates, that p=.05 can often mean nothing at all. This can be a big problem for studies in psychology and other fields where various data stories are vaguely consistent with theory. We’ve all known about these problems but it’s only recently that we’ve been aware of how serious they are and how little we should trust a bunch of statistically significant results.
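To see how a nominal 0.05 can drift upward, here is a minimal simulation sketch. It is not the simulation from the Simmons et al. paper. It assumes the simplest possible form of flexibility: a researcher measures two independent outcomes under a true null and reports whichever one reaches significance. The group size, the z-test with known unit variance, and all function names are assumptions made for illustration.

```python
import math
import random

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def null_z_stat(n, rng):
    """z statistic comparing two groups of size n when there is no true difference.

    Outcomes are standard normal, so the difference in means has variance 2/n.
    """
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    return (sum(a) / n - sum(b) / n) / math.sqrt(2 / n)

def false_positive_rates(n_sims=4000, n=20, alpha=0.05, seed=1):
    """Return (rate with one planned outcome, rate when the better of two is reported)."""
    rng = random.Random(seed)
    single = flexible = 0
    for _ in range(n_sims):
        p1 = two_sided_p(null_z_stat(n, rng))  # the planned outcome measure
        p2 = two_sided_p(null_z_stat(n, rng))  # a second, independent outcome measure
        single += p1 < alpha
        flexible += min(p1, p2) < alpha        # report whichever measure "worked"
    return single / n_sims, flexible / n_sims

single_rate, flexible_rate = false_positive_rates()
print(single_rate, flexible_rate)
```

With two independent looks, the chance of at least one p-value below 0.05 is 1 − 0.95² ≈ 0.0975 in expectation, nearly double the nominal rate, and that is before adding optional stopping, covariates, or dropped conditions.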

Sanjay Srivastava has some comments here. My main comment on Simmons et al. is that I’m not so happy with the framing in terms of “false positives”; to me, the problem is not so much with null effects but with uncertainty and variation.

Charles Murray on the new upper class

The other day I posted some comments on the voting patterns of rich and poor in the context of Charles Murray’s recent book, “Coming Apart.” My graphs on income and voting are just fine, but I mischaracterized Murray’s statements. So I want to fix that right away. After that I have some thoughts on the book itself.

In brief:

1. I was unfair to call him a Tucker Carlson.

2. Murray talks a lot about upper-class liberals. That’s fine but I think his discussion would be improved by also considering upper-class conservatives, given that I see the big culture war occurring within the upper class.

3. Using the case of Joe Paterno as an example, I discuss why Murray’s “preach what you practice” advice could be difficult to carry out in practice.
Continue reading ‘Charles Murray on the new upper class’ »

The tabloids strike again

See comments #2,3,4 here. I guess that’s why Science and Nature are known as “the tabloids.” As the commenter writes, “you can’t have people look at too many images of maggot-infested wounds.”

Extra babies on Valentine’s Day, fewer on Halloween?

Just in time for the holiday, X pointed me to an article by Becca Levy, Pil Chung, and Martin Slade reporting that, during a recent eleven-year period, more babies were born on Valentine’s Day and fewer on Halloween compared to neighboring days:

What I’d really like to see is a graph with all 366 days of the year. It would be easy enough to make. That way we could put the Valentine’s and Halloween data in the context of other possible patterns. While they’re at it, they could also graph births by day of the week and show Thanksgiving, Easter, and other holidays that don’t have fixed dates. It’s so frustrating when people only show part of the story.

The data are publicly available, so maybe someone could make those graphs? If the Valentine’s/Halloween data are worth publishing, I think more comprehensive graphs should be publishable as well. I’d post them here, that’s for sure.
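As a starting point for anyone who wants to try, here is a rough sketch of the tallying step, assuming the birth records can be read into a list of Python `date` objects. The function names are hypothetical and the dates below are made up purely to exercise the code; they are not real birth data.

```python
from collections import Counter
from datetime import date, timedelta

def births_by_calendar_day(birth_dates):
    """Count births for each (month, day), covering all 366 possible days."""
    counts = Counter((d.month, d.day) for d in birth_dates)
    # 2000 was a leap year, so stepping through it visits Feb 29 as well.
    day, all_days = date(2000, 1, 1), []
    while day.year == 2000:
        all_days.append((day.month, day.day))
        day += timedelta(days=1)
    return {md: counts.get(md, 0) for md in all_days}

def births_by_weekday(birth_dates):
    """Count births for each day of the week, Monday first."""
    names = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    counts = Counter(d.weekday() for d in birth_dates)
    return {name: counts.get(i, 0) for i, name in enumerate(names)}

# Made-up dates, only to show the shape of the output.
toy_dates = [date(2005, 2, 14), date(2006, 2, 14), date(2005, 10, 31), date(2004, 2, 29)]
by_day = births_by_calendar_day(toy_dates)
by_weekday = births_by_weekday(toy_dates)
print(by_day[(2, 14)], by_day[(10, 31)], by_weekday)
```

With real data, the Feb 29 count would need rescaling, since that date appears only in leap years, and the day-of-week pattern would need to be separated from the calendar-day pattern before comparing holidays to their neighbors.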

Recently in the sister blog

Help with this problem, win valuable prizes

Corrected equation

This post is by Phil.

In the comments to an earlier post, I mentioned a problem I am struggling with right now. Several people mentioned having (and solving!) similar problems in the past, so this seems like a great way for me and a bunch of other blog readers to learn something. I will describe the problem, one or more of you will tell me how to solve it, and you will win…wait for it….my thanks, and the approval and admiration of your fellow blog readers, and a big thank-you in any publication that includes results from fitting the model.  You can’t ask fairer than that!

Here’s the problem.  The goal is to estimate six parameters that characterize the leakiness (or air-tightness) of a house with an attached garage.  We are specifically interested in the parameters that describe the connection between the house and the garage; this is of interest because of the effect on the air quality in the house  if there are toxic chemicals (gasoline, car exhaust, etc.) in the garage, but I won’t go into the motivation of the experiments, I’ll just describe them. (See below the fold for the rest)

Continue reading ‘Help with this problem, win valuable prizes’ »

Philosophy of Bayesian statistics: my reactions to Wasserman

Continuing with my discussion of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

Larry Wasserman, “Low Assumptions, High Dimensions”:

This article was refreshing to me because it was so different from anything I’ve seen before. Larry works in a statistics department and I work in a statistics department but there’s so little overlap in what we do. Larry and I both work in high dimensions (maybe his dimensions are higher than mine, but a few thousand dimensions seems like a lot to me!), but there the similarity ends. His article is all about using few to no assumptions, while I use assumptions all the time. Here’s an example. Larry writes:

P. Laurie Davies (and his co-workers) have written several interesting papers where probability models, at least in the sense that we usually use them, are eliminated. Data are treated as deterministic. One then looks for adequate models rather than true models. His basic idea is that a distribution P is an adequate approximation for x1,…,xn, if typical data sets of size n, generated under P look like x1,…,xn. In other words, he asks whether we can approximate the deterministic data with a stochastic model.

This sounds cool. And it’s so different from my world! I do a lot of work with survey data, where the sample is intended to mimic the population, and a key step comes in the design, which is all about probability sampling. I agree that Wasserman’s (or Davies’s) approach could be applied to surveys—the key step would be to replace random sampling with quota sampling, and maybe this would be a good idea—but in the world of surveys we would typically think of quota sampling or other nonprobabilistic approaches as an unfortunate compromise with reality rather than as a desirable goal. In short, typical statisticians such as myself see probability modeling as a valuable tool that is central to applied statistics, while Wasserman appears to see probability as an example of an assumption to be avoided.

Just to be clear: I’m not at all saying Wasserman is wrong in any way here; rather, I’m just marveling at how different his perspective is from mine. I can’t immediately see how his assumption-free approach could possibly be used to estimate public opinion or votes cross-classified by demographics, income, and state. But, then again, maybe my models wouldn’t work so well on the applications on which Wasserman works. Bridges from both directions would probably be good.

With different methods and different problems come different philosophies. My use of generative modeling motivates, and allows, me to check fit to data using predictive simulation. Wasserman’s quite different approach motivates him to understand his methods using other tools.
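For readers who have not seen a predictive check, here is a bare-bones sketch of the idea of checking fit by simulation. It is a simplification under stated assumptions, not a full Bayesian analysis and not Wasserman’s or Davies’s procedure: the model is a plain normal distribution, the parameters are plugged in from the data rather than drawn from a posterior, and the function name and data are made up for illustration.

```python
import random
import statistics

def predictive_check(data, stat, n_rep=1000, seed=0):
    """Share of replicated datasets whose statistic is at least the observed one.

    Replications come from a normal model with the data's mean and standard
    deviation plugged in, which ignores uncertainty in those parameters.
    """
    rng = random.Random(seed)
    mu, sd = statistics.mean(data), statistics.stdev(data)
    observed = stat(data)
    hits = 0
    for _ in range(n_rep):
        rep = [rng.gauss(mu, sd) for _ in data]  # same size as the real data
        hits += stat(rep) >= observed
    return hits / n_rep

# Made-up data with one extreme value, purely for illustration.
data = [0.1, -0.4, 0.3, 0.8, -0.9, 0.2, -0.1, 0.5, -0.6, 0.0,
        0.4, -0.2, 0.7, -0.5, 0.1, -0.3, 0.6, -0.7, 0.2, 10.0]

p_max = predictive_check(data, max)
print(p_max)
```

A very small value of `p_max` says that datasets generated from the fitted model rarely have a maximum as large as the observed one, which flags this aspect of the data as poorly captured by the model.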

Meta-analysis, game theory, and incentives to do replicable research

One of the key insights of game theory is to solve problems in reverse time order. You first figure out what you would do in the endgame, then decide a middle-game strategy to get you where you want to be at the end, then you choose an opening that will take you on your desired path. All conditional on what the other players do in their turn.
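The reverse-time-order idea is usually called backward induction. Here is a small sketch on a made-up two-player game; the tree representation, the function name, and the payoffs are all assumptions chosen for illustration and are not taken from the O’Rourke and Detsky article.

```python
def backward_induction(node):
    """Solve a finite game tree from the end backward; return (payoffs, path).

    A leaf is a tuple of payoffs, one per player. An internal node is a dict
    with "player" (index of who moves) and "moves" (label -> child node).
    """
    if isinstance(node, tuple):
        return node, []
    mover = node["player"]
    best = None
    for label, child in node["moves"].items():
        # First work out what happens later if this move is chosen...
        payoffs, path = backward_induction(child)
        # ...then keep the move that is best for whoever moves here.
        if best is None or payoffs[mover] > best[0][mover]:
            best = (payoffs, [label] + path)
    return best

# Hypothetical entry game; payoffs are (entrant, incumbent).
game = {
    "player": 0,
    "moves": {
        "stay out": (0, 4),
        "enter": {
            "player": 1,
            "moves": {"fight": (-1, 0), "accommodate": (2, 2)},
        },
    },
}

payoffs, path = backward_induction(game)
print(path, payoffs)  # ['enter', 'accommodate'] (2, 2)
```

The entrant’s opening move depends on first working out the incumbent’s endgame: because the incumbent would rather accommodate than fight, entering beats staying out.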

In an article from 1989, “Meta-analysis in medical research: Strong encouragement for higher quality in individual research efforts,” Keith O’Rourke and Allan Detsky apply this principle to the process of publication of scientific research:
Continue reading ‘Meta-analysis, game theory, and incentives to do replicable research’ »

Adding an error model to a deterministic model

Daniel Lakeland asks, “Where do likelihoods come from?” He describes a class of problems where you have a deterministic dynamic model that you want to fit to data. The data won’t fit perfectly so, if you want to do Bayesian inference, you need to introduce an error model. This looks a little bit different from the usual way that models are presented in statistics textbooks, where the focus is typically on the random error process, not on the deterministic part of the model. A focus on the error process makes sense in some applications that have inherent randomness or variation (for example, genetics, psychology, and survey sampling) but not so much in the physical sciences, where the deterministic model can be complicated and is typically the essence of the study. Often in these sorts of studies, the starting point (and sometimes the ending point) is what the physicists call “nonlinear least squares” or what we would call normally-distributed errors. That’s what we did for our toxicology and dilution-assay models. Sometimes it makes sense to have the error variance scale as a power of the magnitude of the measurement. The error terms in these models typically include model error as well as measurement variation. In other settings you might put errors in different places in the model, corresponding to different sources of variation and model error. For discrete data, Iven Van Mechelen and I suggested a generic approach for adding error to a deterministic model, but I don’t think this really would work with Lakeland’s examples.
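To make the link between least squares and normal errors concrete, here is a minimal sketch. It assumes additive normal errors around a deterministic prediction, with a standard deviation that can scale as a power of the predicted magnitude. The exponential-decay model, the parameter values, and the measurements are hypothetical, and this is not the toxicology or dilution-assay model mentioned above.

```python
import math

def neg_log_lik(y, pred, sigma, power=0.0):
    """Negative log-likelihood for y[i] ~ Normal(pred[i], sd[i]**2),
    where sd[i] = sigma * |pred[i]|**power (power=0 gives constant error sd)."""
    total = 0.0
    for yi, fi in zip(y, pred):
        sd = sigma * abs(fi) ** power
        total += 0.5 * math.log(2 * math.pi * sd ** 2) + (yi - fi) ** 2 / (2 * sd ** 2)
    return total

def decay(t, a, k):
    """Hypothetical deterministic model: exponential decay a * exp(-k * t)."""
    return a * math.exp(-k * t)

def sum_sq(y, pred):
    """Sum of squared residuals, the quantity minimized by least squares."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, pred))

# Made-up measurements, only to exercise the functions.
times = [0.0, 1.0, 2.0, 4.0]
y = [10.2, 6.0, 3.5, 1.4]

pred_a = [decay(t, 10.0, 0.5) for t in times]
pred_b = [decay(t, 8.0, 0.3) for t in times]

sigma = 0.7
# With constant sigma, comparing likelihoods is the same as comparing sums of squares.
nll_gap = neg_log_lik(y, pred_a, sigma) - neg_log_lik(y, pred_b, sigma)
sse_gap = (sum_sq(y, pred_a) - sum_sq(y, pred_b)) / (2 * sigma ** 2)

# Error sd proportional to the predicted value (power=1).
nll_scaled = neg_log_lik(y, pred_a, 0.1, power=1.0)
print(nll_gap, sse_gap, nll_scaled)
```

With `power=0` the log terms are identical across parameter values, so the likelihood ranks parameter settings exactly as least squares does. With `power` above zero the log term also varies with the prediction, so the fit is no longer plain least squares, and the predictions must be nonzero for the standard deviation to be defined.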

If an entire article in Computational Statistics and Data Analysis were put together from other, unacknowledged, sources, would that be a work of art?

Spy novelist Jeremy Duns tells the amazing story of Quentin Rowan, a young writer who based an entire career on patching together stories based on uncredited material from published authors, culminating in a patchwork job that Duns had blurbed as an “instant classic.”

Rowan did not merely plagiarize to fill in some gaps or cover some technical material that he was too lazy to rewrite; rather, he put together an entire novel out of others’ material. Rowan writes (as part of a longer passage that itself appears to be dishonest; see the November 15, 2011 5:36 AM comment later on in the thread):

I [Rowan] sat there with the books [by others] on my kitchen table and typed the passages up word for word. I had a plot in mind, initially, and looked for passages that would work within that context. People told me the initial plot was dull (spies being killed all over Europe – no one knows why), so I changed it to be more like the premise of McCarry’s “Second Sight” which was a whole lot more interesting. I had certain things I wanted to see happen in the initial plot: a double cross, a drive through the South of France, a raid on a snowy satellite base. Eventually I found passages that adhered to these kinds of scenes that only meant changing the plot a little bit here and there. It felt very much like putting an elaborate puzzle together. Every new passage added has its own peculiar set of edges that had to find a way in.

The problem is not that he cut and pasted but that he didn’t acknowledge the sources. Although if he’d done that, he might’ve been up against some copyright infringement problems.

A commenter writes:

The whole thing about this that is so sad is that, yes, writing is hard work, and sending your words into the world to be read and judged is hard. But writing also brings me great joy and satisfaction. And that joy is what Quentin has cheated himself out of because he was scared.

I don’t know about that. Putting together an entire novel out of existing scraps and pieces—that’s pretty impressive to me. Quilting may be less technically impressive than weaving but it’s a skill all its own. Similarly, rappers have stolen lots of 70s riffs but they’ve added something of their own.

Literary vs. academic theft

The commenters also discuss other literary plagiarists such as Jacob Epstein, Patricia Waddell, Richard Condon, and Jerzy Kosinski.

Based on all these examples, literary plagiarism seems a bit different than academic plagiarism. (And both are different from journalists who take from blogs without giving credit.)

Goodwin, Fischer, Wegman, Tribe, Ayres, Dershowitz, etc etc etc are doing just fine in their careers. They don’t need to plagiarize; they seem to do it out of a sense of obligation, or because they’re too lazy to figure things out themselves. (Or maybe for one of these reasons.) It’s less effort to copy than to fully read external material, incorporate it into one’s worldview, and rewrite it in a way that is coherent with one’s larger argument.

In contrast, for literary plagiarists there is skill involved in patching together complementary material from others’ published work, seeking passages that are obscure enough or bland enough to escape notice. I don’t think that even Ed Wegman’s strongest defenders would argue that he applied any wit or creativity in his ripoffs of Wikipedia and other published sources. But I think you can admire the skill (if not the dishonesty) of a literary copyist who can put together a whole novel out of others’ material.

Getting energy from the reader

I also liked this comment, later on in the thread:

Books contain energy, and when you purposefully use words found in other books, you pull that energy into your own work.

I [the commenter] think the energy comes from the author, and all her experiences and deliberations, but also it comes from the people working on the book: editors, artists, marketing, etc. I’d go so far as to say that people reading and responding to books can contribute to their energy as well.

Interesting point. I believe the reader can supply a lot, and some books stimulate this by having lots of hooks, as it were, to connect to the readers’ thoughts and experiences. Think of all those memoirs that work by reminding readers of their own childhoods. Or consider Nassim Taleb’s books. Many of my correspondents were surprised that I responded so positively to Fooled by Randomness and The Black Swan, but I really enjoyed the experience of reading them with pen in hand; they were just the right books to bring out lots of thoughts that I had within myself.

Familial Linkage between Neuropsychiatric Disorders and Intellectual Interests

When I spoke at Princeton last year, I talked with neuroscientist Sam Wang, who told me about a project he did surveying incoming Princeton freshmen about mental illness in their families. He and his coauthor Benjamin Campbell found some interesting results, which they just published:

A link between intellect and temperament has long been the subject of speculation. . . . Studies of the artistically inclined report linkage with familial depression, while among eminent and creative scientists, a lower incidence of affective disorders is found. In the case of developmental disorders, a heightened prevalence of autism spectrum disorders (ASDs) has been found in the families of mathematicians, physicists, and engineers. . . .

We surveyed the incoming class of 2014 at Princeton University about their intended academic major, familial incidence of neuropsychiatric disorders, and demographic variables. . . . Consistent with prior findings, we noticed a relation between intended academic majors and ASDs. Looking for relations between other neuropsychiatric disorders and academic interest we also noted a heightened prevalence of bipolar disorder, major depressive disorder and substance abuse in the families of those pursuing the humanities. A composite score based on these four heritable disorders was strongly correlated with a student’s intended academic major. Thus, familial risk toward a spectrum of psychopathologies can predict propensity toward technical versus humanist interests.

When I spoke with Sam last year we discussed various ways to analyze the data as well as various interpretations of the results, but I don’t actually remember any of our conversation except for the bit where he described to me how they conducted their study.

Charles Murray [perhaps] does a Tucker Carlson, provoking me to unleash the usual torrent of graphs

Charles Murray wrote a much-discussed new book, “Coming Apart: The State of White America, 1960-2010.”

David Frum quotes Murray as writing, in an echo of now-forgotten TV personality Tucker Carlson, that the top 5% of incomes “tends to be liberal—right? There’s no getting around it. Every way of answering this question produces a yes.”

[I’ve interjected a “perhaps” into the title of this blog post to indicate that I don’t have the exact Murray quote here so I’m relying on David Frum’s interpretation.]

Frum does me the favor of citing Red State Blue State as evidence, and I’d like to back this up with some graphs.

Frum writes:

Say “top 5%” to Murray, and his imagination conjures up everything he dislikes: coastal liberals listening to NPR in their Lexus hybrid SUVs. He sees that image so intensely that no mere number can force him to remember that the top 5% also includes the evangelical Christian assistant coach of a state university football team. . . .

To put it in graphical terms:



The more likely it is to be X, the more likely it is to be Not X?

This post is by Phil Price.

A paper by Wood, Douglas, and Sutton looks at “Beliefs in Contradictory Conspiracy Theories.” Unfortunately, the subjects were 140 undergraduate psychology students, so one wonders how general the results are. I found this sort of arresting:

In Study 1 (n=137), the more participants believed that Princess Diana faked her own death, the more they believed she was murdered. In Study 2 (n=102), the more participants believed that Osama Bin Laden was already dead when U.S. Special Forces raided his compound in Pakistan, the more they believed he is still alive.

As the article says, “conspiracy advocates’ distrust of official narratives may be so strong that many alternative theories are simultaneously endorsed in spite of any contradictions between them.” But I think the authors overstate things when they say “One would think that there ought to be a negative correlation between beliefs in contradictory accounts of events — the more one believes in a particular theory, the less likely rival theories will seem.” Well, one might think that, but actually a positive correlation makes sense to me. I can see how, if you really think that a lot of what the government says is a lie, you would think “well, I don’t know exactly which part of the Bin Laden account is a lie but they are probably lying about something; maybe he was already dead, or maybe he’s still alive now, but I don’t know which.” The authors realize this is what is going on; they just make too much of how surprising it should be.
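
As a toy illustration of that intuition (a sketch under made-up assumptions, not the paper's analysis or data), suppose each person has a latent level of distrust in official accounts, and that belief in each of two contradictory theories is just that distrust plus independent noise. The function names, the uniform distrust distribution, and the noise level of 0.3 are all hypothetical choices made for this example.

```python
import random


def simulate_beliefs(n=2000, seed=0, noise_sd=0.3):
    """Simulate belief scores for two contradictory theories.

    Assumption (hypothetical): both beliefs are driven by one shared
    latent 'distrust' variable, plus independent noise for each theory.
    """
    rng = random.Random(seed)
    already_dead, still_alive = [], []
    for _ in range(n):
        distrust = rng.random()  # latent distrust, uniform on [0, 1)
        already_dead.append(distrust + rng.gauss(0, noise_sd))
        still_alive.append(distrust + rng.gauss(0, noise_sd))
    return already_dead, still_alive


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    dead, alive = simulate_beliefs()
    print(f"correlation between contradictory beliefs: {pearson(dead, alive):.2f}")
```

Under these assumptions the population correlation is (1/12) / (1/12 + 0.09), roughly 0.48, so the two "contradictory" beliefs come out positively correlated even though nobody in the model reasons inconsistently. The shared distrust does all the work.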

Philosophy of Bayesian statistics: my reactions to Hendry

Continuing with my discussion here and here of the articles in the special issue of the journal Rationality, Markets and Morals on the philosophy of Bayesian statistics:

David Hendry, “Empirical Economic Model Discovery and Theory Evaluation”:

Hendry presents a wide-ranging overview of scientific learning, with an interesting comparison of physical with social sciences. (For some reason, he discusses many physical sciences but restricts his social-science examples to economics and psychology.)

The only part of Hendry’s long and interesting article that I will discuss, however, is the part where he decides to take a gratuitous swing at Bayes. I don’t know why he did this, but maybe it’s part of some fraternity initiation thing, like TP-ing the dean’s house on Halloween.

Here’s the story. Hendry writes:

‘Prior distributions’ widely used in Bayesian analyses, whether subjective or ‘objective’, cannot be formed in such a setting either, absent a falsely assumed crystal ball. Rather, imposing a prior distribution that is consistent with an assumed model when breaks are not included is a recipe for a bad analysis in macroeconomics. Fortunately, priors are neither necessary nor sufficient in the context of discovery.

I could just laugh this off—but as someone who has published two books and hundreds of articles on applied Bayesian statistics, I think I’ll take Hendry seriously.

Let me start with the tone. I generally don’t like it when people take words or phrases that they disagree with and put them in quotes. If you’re going to put “prior distributions” and “objective” in quotes, then please show the same disrespect to your other terms: “falsely” . . . “crystal ball” . . . “breaks” . . . “recipe” . . . “macroeconomics” . . . “discovery.”

But let me get to the substance. First, Hendry’s right. No statistical method is necessary. With sufficient effort, I think you can solve all statistical problems with Bayesian methods, or with robust methods, or with bootstrapping, or with any number of alternative approaches. Fuzzy sets would probably work too. Different approaches have different advantages, but I’m sure that if Hendry adopts a self-denying ordinance and decides to never use priors, he can solve all sorts of data analysis problems. He’ll just have to work really hard sometimes. But, to be fair, there are some problems that I have to work really hard on too. In short: econometric methods tend to require more effort in complicated settings, but they often have appealing robustness properties. It’s fair enough that Hendry and I place different values on robustness vs. modeling flexibility.

My most serious criticism of Hendry’s above paragraph is the old, old story: he’s singling out Bayesian methods and priors as being particularly bad. Meanwhile, all those likelihood functions and assumptions of additivity, symmetry, etc. just sneak in. Hendry’s standing at the back window with a shotgun, scanning for priors coming over the hill, while a million assumptions just walk right into his house through the front door.

Here’s Hendry’s summary:

The pre-existing framework of ideas is bound to structure any analysis for better or worse, but being neither necessary nor sufficient, often blocking, and unhelpful in a changing world, prior distributions should play a minimal role in data analyses that seek to discover useful knowledge.

I’m going to have to disagree. I could give a million examples of useful knowledge that can be discovered with the aid of prior distributions. For example, where are the houses in the U.S. that have high radon levels? What are the effects of redistricting? How much perchloroethylene does the body metabolize? What is public opinion on gay rights by state? Or, for a classic from Mosteller and Wallace in 1960, classify the authorship of the Federalist Papers using 1960s technology.

I’m not saying that Hendry and his colleagues need to be using Bayesian methods in their applied research. I’m not even saying that Bayesian methods are needed to solve the problems listed in the above paragraph. In practice these problems were indeed solved using Bayesian inference, but I think other approaches could get there too. What I am saying is, why is Hendry so sure that “prior distributions should play a minimal role” etc.? I’m really bothered when people go beyond the simple and direct, “I have no personal experience with Bayesian inference solving a useful problem” to prescriptive (and wrong) statements such as “prior distributions should play a minimal role.” And it’s just silly to say that priors are “unhelpful in a changing world.” I’d think an econometrician would know about time series models!

Hendry also pulls the no-true-Scotsman trick:

Fortunately, priors are neither necessary nor sufficient in the context of discovery. For example, children learn whatever native tongue is prevalent around them, be it Chinese, Arabic or English, for none of which could they have a ‘prior’. Rather, trial-and-error learning seems a child’s main approach to language acquisition: see Clark and Clark (1977). Certainly, a general language system seems to be hard wired in the human brain (see Pinker 1994; 2002) but that hardly constitutes a prior. Thus, in one of the most complicated tasks imaginable, which computers still struggle to emulate, priors are not needed.

This is a no-true-Scotsman argument because, when confronted with an example in which our brains figure things out using a pre-existing structure (not for Chinese, Arabic, or English, but for human language in general), Hendry simply says that this system that is “hard wired in the human brain . . . hardly constitutes a prior.” Huh? It’s definitely a prior. That’s the whole point: our brains are tuned to decode human language.

Why do a few throwaway paragraphs in an otherwise pretty good article bug me so much? Hendry’s anti-Bayesian sentiments are no more clueless than those earlier expressed by, say, John DiNardo. The difference is that DiNardo was just venting his opinions and was pretty open about this, whereas Hendry’s presenting his prejudices with an air of expertise. If Hendry wants to work on “replacing unrestricted non-linear functions by an encompassing theory-derived form, such as an ogive,” then fine. His theoretical models of model selection seem interesting and could perhaps be useful. I just wish he’d cut out the part where he implicitly disparages the work of Mosteller and Wallace, Lax and Phillips, and a few zillion other researchers who’ve used Bayesian methods to solve problems.

It’s not too late for Hendry to reform (I hope). All he needs to do is retreat to presenting the positive virtues of his preferred inferential approach, along with his explanations as to why Bayesian methods have not seemed useful to him. He’s an econometrician; he doesn’t work in toxicology, and that’s fine. I think both his positive and his negative statements would be stronger if he were more aware of the limits of his own experience. Just as, in mathematics, a theorem is clearer if you understand the range of its applicability and the areas where there are counterexamples.