## Why the square-root rule for vote allocation is a bad idea

Commentators and experts have taken two positions on the allocation of votes in a two-stage voting system, such as block voting in the European Union or the Electoral College in the United States. From one side (for example, this article by Richard Baldwin and Mika Widgren), there is the claim that mathematical considerations of fairness demand that countries (or, more generally, blocks) get votes in proportion to the square root of their populations. From the other side (for example, this article by Gideon Rachman), there is the claim that such mathematical rules are irrelevant to the real world of politics. This debate has real-world importance, in particular because of Poland’s recent lobbying for square-root allocation in the European Union, in opposition to Germany’s support of something closer to proportionality.

I make a different claim, which is that mathematical rules are relevant to the real world, but that when the mathematics and statistics are done correctly, we find that proportional allocation is much more fair than square-root allocation, in the sense of giving more equal voting power–probability of decisiveness–to individual voters. This sense of voting power is the criterion used by the square-root-rule proponents. Thus, I am taking them at their own word and saying that, under their own rules, the square-root rule is not fair.

The basic idea

I’ll work with the EU as an example, since this is where the controversy is right now. The position of Germany is that each country should get a vote proportional to its population. Poland wants a square-root rule.

Let’s start from scratch. Your voting power–the probability that your vote is decisive–is equal to the probability that your vote is decisive within your country (that is, the probability that your country would be exactly tied without your vote), multiplied by the probability that your country’s block votes are decisive in the block voting system (so that, if your country flips, the overall outcome changes), given that your country is tied.

If your country has N voters and a block vote of B, the probability that your country is tied on any particular issue is approximately proportional to 1/N, and the probability that your country’s block votes are necessary is approximately proportional to B. So the probability that your vote is decisive–your “voting power”–is roughly proportional to B/N, that is, the number of block votes per voter in your country. The allocation is roughly fair if a country’s vote is proportional to its population, because B/N is then the same for every country.
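As a rough illustration of the B/N argument, here is a minimal Python sketch. The three countries and their populations are made up for the example, and the function names are hypothetical; the only thing taken from the text is the approximation that voting power is proportional to B/N.

```python
import math

# Hypothetical populations in millions (illustrative values, not real data).
populations = {"Small": 1.0, "Medium": 9.0, "Large": 81.0}


def relative_power(block_votes, n_voters):
    """Voting power up to a constant: P(tie) ~ 1/N times P(block decisive) ~ B."""
    return block_votes / n_voters


def power_table(allocation):
    """Per-voter power in each country, scaled so that the smallest country is 1."""
    raw = {name: relative_power(allocation(n), n) for name, n in populations.items()}
    base = raw["Small"]
    return {name: value / base for name, value in raw.items()}


# B proportional to N: every voter ends up with the same power.
proportional = power_table(lambda n: n)

# B proportional to sqrt(N): voters in larger countries end up with less power.
square_root = power_table(lambda n: math.sqrt(n))

for name in populations:
    print(f"{name:7s} proportional={proportional[name]:.3f} square-root={square_root[name]:.3f}")
```

Under these assumptions the proportional column is flat, while under the square-root allocation a voter in the country 81 times larger has one ninth the power of a voter in the smallest one.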

The key issue: closeness of elections in large and small states

The point has sometimes been obscured, unfortunately, by “voting power” calculations that purportedly show that, counterintuitively, voters in large states have more voting power (“One man, 3.312 votes,” in the oft-cited paper of Banzhaf, 1968) and recommend the square-root rule, discussed by Penrose in a paper from 1946. This claim of Penrose, Banzhaf, and others is counterintuitive and, in fact, false.

Why is the Penrose/Banzhaf claim false? The claim is based on the same idea as we noted above: voting power equals the probability that your country is tied, times the probability that your country’s block votes are necessary for a national coalition. The hitch is that Penrose, Banzhaf, and others computed the probability of your country being tied as being proportional to 1/sqrt(N). This calculation is based (explicitly or implicitly) on a binomial distribution model, and it implies that elections in large states will be much closer (in proportion of the vote) than elections in small states.
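To make the binomial calculation concrete, the sketch below computes the exact probability of a tie among N coin-flipping voters and compares it with the standard Stirling approximation sqrt(2/(pi N)). The function names are hypothetical, and N is assumed to be even so that an exact tie is possible.

```python
import math


def tie_probability_coin_flip(n_voters):
    """Exact P(tie) when n_voters (even) each vote yes or no with probability 1/2."""
    half = n_voters // 2
    # log of C(n, n/2) / 2**n, computed in logs so that large n does not overflow.
    log_p = (
        math.lgamma(n_voters + 1)
        - 2 * math.lgamma(half + 1)
        - n_voters * math.log(2)
    )
    return math.exp(log_p)


def stirling_approximation(n_voters):
    """Large-n approximation to the coin-flip tie probability."""
    return math.sqrt(2 / (math.pi * n_voters))


for n in (100, 10_000, 1_000_000):
    print(n, tie_probability_coin_flip(n), stirling_approximation(n))
```

Multiplying the electorate by 100 cuts the tie probability by a factor of only about 10 under this model, which is the 1/sqrt(N) behavior that the square-root rule relies on.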

Actually, though, elections in large states are not much closer than elections in small states. Here are some data (from this paper in the British Journal of Political Science):

Large elections are very slightly closer than small elections, but much much less than would be implied by the square-root rule. In the graphs above, the “alpha = -0.5” line shows what would happen under the square root rule; actual best-fit alphas are much closer to zero.

If you wanted to set a power-law based on the consensus of the data, it would make more sense to set a 0.9 power rule. But this is much closer to proportionality than to the square-root (or 0.5 power).
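One way to see why a spread that does not shrink with N leads back to a tie probability of about 1/N: suppose, purely as an illustrative assumption (this is not the model fitted in the paper), that the national yes-share is drawn uniformly between 0 and 1 and that voters then vote independently given that share. The tie probability can then be computed exactly. The function name is hypothetical.

```python
import math


def tie_probability_uniform_share(n_voters):
    """P(tie) among n_voters (even) when the yes-share p is uniform on [0, 1]
    and each voter then votes yes independently with probability p."""
    half = n_voters // 2
    # Integrating C(n, n/2) * p**(n/2) * (1 - p)**(n/2) over p gives
    # C(n, n/2) * Beta(n/2 + 1, n/2 + 1); both factors are computed in logs.
    log_binom = math.lgamma(n_voters + 1) - 2 * math.lgamma(half + 1)
    log_beta = 2 * math.lgamma(half + 1) - math.lgamma(n_voters + 2)
    return math.exp(log_binom + log_beta)


for n in (100, 10_000, 1_000_000):
    print(n, tie_probability_uniform_share(n), 1 / (n + 1))
```

Under this assumption the tie probability is exactly 1/(N + 1), so it falls in proportion to the number of voters rather than to its square root.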

Engaging the argument

My claim (and that of Jonathan Katz and Joe Bafumi, my coauthors), thus, is that even if one accepts the voting power criterion, the square-root rule is inappropriate. Could we be right? Is it possible that the consensus of experts in voting power in Europe is wrong, and three political science professors from the United States got it right? Let me try to engage the voting-power experts on their own turf. I hope this can be the beginning of a deeper dialogue.

In this article on Vox (“Research-based policy analysis and commentary from Europe’s leading economists”) from 15 June 2007, Richard Baldwin and Mika Widgrén write:

People versed in the game theory of voting know that the square-root is almost sacred. . . . This column reviews the logic of the square root for readers with some numeric skills. . . . Strange as it may seem, ensuring that the EU’s keystone decision-making body – the Council of Ministers – is such that each EU citizen has equal power – requires just this. Each Council member should have power in Council that is proportional to the square root of her nation’s population. Why? . . . In her national election, a typical German citizen has less power than a typical Luxembourger. Each group of voters chooses one government but German voters are 160 times more numerous. . . .

A first guess is that in her national election, a German voter is only 1/160th as influential as a Luxembourg voter is in hers. In this case, making EU citizens equipotent in the Council would require that the German Minister is 160 times more power in the Council than the Luxembourg Minister. It seems right – 1/160th as powerful in the national election and 160 times more powerful in the Council. But this is wrong since it misses a subtly that requires some mental gymnastics to comprehend.

In national elections, two things change as the number of voters rises. First, the likelihood of being critical in a particular winning coalition decreases and – as intuition dictates – it declines linearly with the number of voters. Second, the number of winning coalitions increases. Thus, the German has 1/160th the chance that a Luxembourger does of making or breaking a given winning coalition, but for the German this is applied to many more coalitions. Taking this into account one can see that the German voter’s power is less than that of a Luxembourger in their respective national elections, but the figure is not 1/160th as powerful, it is higher. . . . The precise answer is that for all EU citizens to be equally powerful in the Council, their Ministers should have power in the Council that is proportional to the square root of their national populations. . . .

For people trained in game theory and mathematical statistics, the square root is a snap. . . . power per citizen in national elections declines with the square root of the population, so national power in the Council should increase with square root in order to have a fair system, i.e. a system where each EU citizen is equally powerful in the Council of Ministers.

Where, you may ask, does the square root come from? The answer requires a bit of maths. Consider a randomly selected yes-no issue and suppose that member nations decide their stance on this issue by a referendum; define P_N as the probability that a typical citizen’s vote is critical in the referendum outcome. Then the member states vote in the Council. Define P_ms as the probability that the member state is critical in the Council vote. A citizen’s probability of being critical is thus P_N times P_ms and our fairness metric requires this to be equal for all member states.

P_ms has nothing to do with the number of voters (proxied by population), but P_N falls at the square root of population. . . . The mathematics of combinatorics gives us an exact formula assuming a voter’s stance is randomly determined on a randomly selected issue. Taking M as the minimum number of votes in a winning coalition and n as the number of voters, one can use the binomial distribution to work out the answer. The precise, the formula is complex, but it can be well approximated as the square root of 2/n(Pi), where n is the number of voters and Pi is 3.14 etc. (This approximation is called Stirling’s formula). Hence the square root.

I gave a long quote because I want to be fair to their argument. Anyway, the last paragraph above shows their key mistake: they count all possible coalitions (arrangements of N voters) equally, which corresponds to the implicit assumption that all coalitions are equally likely, which in turn is equivalent to voters flipping coins to decide how to vote. I don’t object to the use of a simplified model–that happens all the time–but I do want to focus on the key implication of the equally-likely coalition rule, which is that elections in large countries will be much much closer, in proportional terms, than elections in small countries.

Again, our paper in the BJPS has the details; further theoretical discussion appears in our Statistical Science paper.

A quick summary of our argument: The square-root-rule is derived from a game-theoretic argument that also implies that elections in large countries will be much much closer (on average) than elections in small countries. This implication is in fact crucial to the reasoning justifying the square-root rule. But it’s not empirically correct. For example, if a country is 9 times larger, its elections should be approximately 3 times closer to 50/50. This doesn’t happen. Larger elections are slightly closer than small elections, but by very little, enough that perhaps a 0.9 power rule would be appropriate, not a square-root (0.5 power) rule.
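The arithmetic in this summary can be sketched in a few lines of Python. Here alpha is assumed to be the exponent in "typical vote margin, as a proportion, scales like N^alpha", which is one reading of the alpha in the graphs. The value alpha = -0.1 is an assumption chosen only because it reproduces the 0.9 power mentioned in the text. The function names are hypothetical.

```python
def closeness_ratio(size_ratio, alpha):
    """How many times closer to 50/50 the larger electorate is expected to be,
    if the typical proportional margin scales like N**alpha."""
    return size_ratio ** (-alpha)


def fair_allocation_exponent(alpha):
    """Exponent p in B ~ N**p that keeps B * P(tie) constant across countries.

    P(tie) is roughly (density of the vote share at 50%) / N. That density
    scales like N**(-alpha), so P(tie) ~ N**-(1 + alpha) and p = 1 + alpha.
    """
    return 1 + alpha


for alpha in (-0.5, -0.1, 0.0):
    print(alpha, closeness_ratio(9, alpha), fair_allocation_exponent(alpha))
```

With alpha = -0.5 a country 9 times larger has elections 3 times closer and the fair exponent is 0.5. With alpha = -0.1 the elections are only about 1.25 times closer and the fair exponent is 0.9. With alpha = 0 the allocation is exactly proportional.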

How could they all get it wrong?

Mathematics can be seductive, as can be seen from some lines in the Baldwin and Widgrén article quoted above: “People versed in the game theory of voting know that the square-root is almost sacred. . . . a subtly that requires some mental gymnastics to comprehend. . . . it is not the easiest concept to grasp, but it is correct and has a cherished position in the mathematics of voting systems. . . . The answer requires a bit of maths. . . .” I think this stuff has been out there for so long that people just assume it’s correct. Also, the opposing arguments tend to dismiss the mathematical reasoning entirely (for example, Gideon Rachman refers to “the baffling square root system”). This sort of debate would lead the square-root-proponents to assume that the mathematics is on their side.

The other issue is that the use of mathematical arguments is a slippery slope. The square-root proponents might feel that, if a model must be used, it should be the simplest possible model of coin-flip voting. But I don’t buy it. Their model has specific implications and they’re using it to give extra votes to small countries. If the model is inappropriate–as, indeed, it is–then I don’t think it should be used to set voting rules.

“Counterintuitive” is a tough sell by itself, but “counterintuitive and wrong” is really bad.

Don’t blame Penrose . . .

Well, actually I do blame Penrose a bit. Even in 1946 without tons of data on the computer, he could’ve realized the error of the implication that large elections will be extremely close. But we have a lot more information now, so I think it’s really time for the voting-power subfield of political science, economics, and mathematics to move beyond this silly model.

P.S. This is a topic I’ve written about before, but I found the recent discussion from a reference in a paper by Auriol and Gary-Bobo linked to by Mark Thoma.

1. LemmusLemmus says:

You should send the German government a letter. There might be money for you in this.

2. Alex F says:

"I make a different claim, which is that mathematical rules are relevant to the real world, but that when the mathematics and statistics are done correctly, we find that proportional allocation is much more fair than square-root allocation, in the sense of giving more equal voting power–probability of decisiveness–to individual voters. This sense of voting power is the criterion used by the square-root-rule proponents. Thus, I am taking them at their own word and saying that, under their own rules, the square-root rule is not fair."

Of course, their criterion is silly. The primitive of the model — and the real-world relevant issue — is that people have preferences over outcomes, not over being decisive. A fair system will be one that chooses an outcome which gives equal weight to preferences of people everywhere, not one which makes each person equally likely to be decisive. An optimal system will be one that chooses "the best" outcome according to some criterion like maximizing sums of normalized utilities. But there's no reason in the world — not in mathematics, not in politics, not in social intuition — why we should design a system to give everyone an equal chance of being decisive.

We can argue about whether a system which weights country sizes by the 1/2, 9/10, or 1st power is the one which best equalizes probability of being decisive. But I just don't see that there can be any argument that it's an answer to the wrong question.

3. Marian says:

In my opinion it might be true that this abstract model hasn't got much to do with the real world, and might not be appropriate to the requirements of daily politics.

But there are some serious errors in reasoning in the 2nd comment ("Posted by: LemmusLemmus at October 11, 2007 8:55 AM."):

"The primitive of the model — and the real-world relevant issue — is that people have preferences over outcomes, not over being decisive."
I think that for nearly everybody, knowing that one is decisive (or at least as decisive as every other person) is very important. Voter participation is decreasing in most countries, not just in the EU, but all over the world. One reason could be that voters don't feel powerful, and as a consequence feel useless, in the face of globalisation and the cession of more and more rights of the national governments to the institutions of the EU. A decisive voter is a characteristic of an efficient democracy!!

"A fair system will be one that chooses an outcome which gives equal weight to preferences of people everywhere, not one which makes each person equally likely to be decisive."
A system which makes each person equally likely to be decisive is the only established way of giving equal weight to preferences of people "everywhere". –> If everyone's vote is equally likely to be crucial, everybody's preferences are represented equally strongly!

And what are "normalized utilities"?! I would be glad if you could define this, LemmusLemmus…

4. Dietger says:

Does democracy mean that the majority rules, or that everybody has an equal chance of deciding on any issue?

If an equal chance of being decisive is the definition of 'fair', then having all citizens vote and randomly choosing one vote to dictate the outcome on the issue at hand would probably be the only really fair voting system, independent of any particular statistical voting distribution.

I think that, square root laws aside, no intuitive reason exists for using Penrose's power index to measure power let alone fairness.

5. Oystein says:

One premise of Penrose's argument is that the decision of a single voter is uncorrelated with the decision of their fellow voters – r=0. If this were so – that is, if each election on a yes/no issue had an Expected Value of 50% (regardless of standard deviation), then the square-root-rule would indeed be fair:
B ~ P^0.5
But this assumes that nationals don't tend to show common trends on international issues – hardly a realistic assumption. If the European Council were to decide, for example, on redistribution of funds in favour of the richer and to the detriment of the poorer countries, then surely the German vote will correlate, and the Romanian will correlate too.
If, on the other hand, cultural cohesion in a country were so great that all elections end with a 100% yes or a 100% no, that is perfect correlation, r=1, then fair representation in an international council would be proportional to population:
B ~ P^1
The "fair" exponentiation would be somewhere between 0.5 and 1. Conceivably one could do statistics to get a "good" value; Gelman proposes 0.9 based on some such statistic.
There is a problem in that sort of statistics: Quite probably, correlation within a country will be greater on issues that pit one set of countries against another (such as redistribution issues) than on issues that promise a similar benefits/costs ratio for all member countries (such as travel regulations). The former type of issue is crucial in the debate – the small countries don't want fairness, they want leverage. By imposing a square-root-rule, they are given maximum (and unrealistic) leverage.
If we could figure out the average correlation between a single voter's vote and the votes of his/her countryhumans, we could run Penrose's calculations again, this time assuming the empirical correlation coefficient, and calculate a "fair" leverage.