Random matrices in the news

Mark Buchanan wrote a cover article for the New Scientist on random matrices, a heretofore obscure area of probability theory that his headline writer characterizes as “the deep law that shapes our reality.”

It’s interesting stuff, and he gets into some statistical applications at the end, so I’ll give you my take on it.

But first, some background.

About two hundred years ago, the mathematician/physicist Laplace discovered what is now called the central limit theorem, which states that, under certain conditions, the average of a large number of small random variables has an approximate normal (bell-shaped) distribution. A bit over 100 years ago, social scientists such as Galton applied this theorem to all sorts of biological and social phenomena. The central limit theorem, in its generality, is also important for the information it indirectly conveys when it fails.
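You can see the theorem in action with a quick simulation. This is a minimal sketch; the uniform inputs and the particular sample sizes are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Average n small, decidedly non-normal variables (uniform on [0, 1]),
# and repeat many times: the averages pile up in a bell shape.
n, reps = 500, 5_000
means = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)

# Theory: the averages have mean 1/2 and sd sqrt(1/12)/sqrt(n)
print(means.mean())  # close to 0.5
print(means.std())   # close to (1/12) ** 0.5 / n ** 0.5, about 0.0129
```

Plot a histogram of `means` and you get the familiar bell curve, even though each individual input is flat, not bell-shaped at all.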

For example, the distribution of the heights of adult men or women is nicely bell-shaped, but the distribution of the heights of all adults has a different, more spread-out distribution. This is because your height is the sum of many small factors and one large factor—your sex. The conditions of the theorem are that no single factor (or small number of factors) should be important on its own. For another example, it has long been observed that incomes do not follow a bell-shaped curve, even on the logarithmic scale. Nor do sizes of cities and many other social phenomena. These “power-law curves,” which don’t fit the central limit theorem, have motivated social scientists such as Herbert Simon to come up with processes more complicated than simple averaging (for example, models in which the rich get richer).
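A rich-get-richer process of the sort Simon studied can be sketched in a few lines. This is a toy version, not Simon's actual model; the founding probability `alpha` and the number of steps are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy Simon process: each arrival founds a new city with probability alpha,
# otherwise joins an existing city chosen proportional to its current size.
alpha, steps = 0.1, 10_000
tokens = [0]    # one entry per person, labeled with a city index
n_cities = 1
for _ in range(steps):
    if rng.random() < alpha:
        tokens.append(n_cities)  # found a brand-new city of size 1
        n_cities += 1
    else:
        # drawing a uniform token picks a city proportional to its size
        tokens.append(tokens[rng.integers(len(tokens))])

sizes = np.bincount(tokens)
# The result is heavy-tailed: the largest city dwarfs the typical one,
# which is the non-bell-shaped behavior described above.
print(sizes.max(), sizes.mean())
```

Simple averaging would never produce this: the proportional-growth step is exactly the "single important factor" that breaks the central limit theorem's conditions.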

The central limit theorem is an example of an attractor—a mathematical model that appears as a limit as sample size gets large. The key feature of an attractor is that it destroys information. Think of it as being like a funnel: all sorts of things can come in, but a single thing—the bell-shaped curve—comes out. (Or, for other models, such as that used to describe the distribution of incomes, the attractor might be a power-law distribution.) The beauty of an attractor is that, if you believe the model, it can be used to explain an observed pattern without needing to know the details of its components. Thus, for example, we can see that the heights of men or of women have bell-shaped distributions, without knowing the details of the many small genetic and environmental influences on height.

Now to random matrices.

A random matrix is an array of numbers, where each number is drawn from some specified probability distribution. You can compute the eigenvalues of a square matrix—that’s a set of numbers summarizing the structure of the matrix—and they will have a probability distribution that is induced by the probability distribution of the individual elements of the matrix. Over the past few decades, mathematicians such as Alan Edelman have performed computer simulations and proved theorems deriving the distribution of the eigenvalues of a random matrix, as the dimension of the matrix becomes large.

It appears that the eigenvalue distribution is an attractor. That is, for a broad range of different input models (distributions of the random matrices), you get the same output—the same eigenvalue distribution—as the dimension of the matrix becomes large. This is interesting, and it’s hard to prove. (At least, it seemed hard to prove the last time I looked at it, about 20 years ago, and I’m sure that it’s even harder to make advances in the field today!)
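This attractor behavior is easy to glimpse numerically. Here is a sketch (the matrix size and the two input distributions are arbitrary choices): build symmetric random matrices from two very different entry distributions and compare their eigenvalue spectra.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500

def scaled_eigs(draw):
    # Symmetrize an n x n matrix of i.i.d. entries and rescale by sqrt(n),
    # so the eigenvalues settle onto a fixed interval as n grows.
    a = draw(size=(n, n))
    sym = (a + a.T) / np.sqrt(2)
    return np.linalg.eigvalsh(sym) / np.sqrt(n)

# Two very different inputs, both with mean 0 and variance 1
gauss = scaled_eigs(rng.standard_normal)
coins = scaled_eigs(lambda size: rng.choice([-1.0, 1.0], size=size))

# Same output either way: the eigenvalues fill roughly [-2, 2] with the
# same (semicircular) density, regardless of the input distribution.
print(gauss.min(), gauss.max())
print(coins.min(), coins.max())
```

Histogram the two sets of eigenvalues and they are essentially indistinguishable: different inputs, same output, which is the funnel behavior described above.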

Now, to return to the news article. If the eigenvalue distribution is an attractor, this means that a lot of physical and social phenomena which can be modeled by eigenvalues (including, apparently, quantum energy levels and some properties of statistical tests) might have a common structure. Just as, at a similar level, we see the normal distribution and related functions in all sorts of unusual places.

Consider this quote from Buchanan’s article:

Recently, for example, physicist Ferdinand Kuemmeth and colleagues at Harvard University used it to predict the energy levels of electrons in the gold nanoparticles they had constructed. Traditional theories suggest that such energy levels should be influenced by a bewildering range of factors, including the precise shape and size of the nanoparticle and the relative position of the atoms, which is considered to be more or less random. Nevertheless, Kuemmeth’s team found that random matrix theory described the measured levels very accurately.

That’s what an attractor is all about: different inputs, same output.

Thus, I don’t quite understand this quote:

Random matrix theory has got mathematicians like Percy Deift of New York University imagining that there might be more general patterns there too. “This kind of thinking isn’t common in mathematics,” he notes. “Mathematicians tend to think that each of their problems has its own special, distinguishing features. But in recent years we have begun to see that problems from diverse areas, often with no discernible connections, all behave in a very similar way.”

This doesn’t seem like such a surprise to me—it seems very much in the tradition of mathematical modeling. But maybe there’s something I’m missing here.

Finally, Buchanan turns to social science:

An economist may sift through hundreds of data sets looking for something to explain changes in inflation – perhaps oil futures, interest rates or industrial inventories. Businesses such as Amazon.com rely on similar techniques to spot patterns in buyer behaviour and help direct their advertising.

While random matrix theory suggests that this is a promising approach, it also points to hidden dangers. As more and more complex data is collected, the number of variables being studied grows, and the number of apparent correlations between them grows even faster. With enough variables to test, it becomes almost certain that you will detect correlations that look significant, even if they aren’t. . . . even if these variables are all fluctuating randomly, the largest observed correlation will be large enough to seem significant.

This is well known. The new idea is that mathematical theory might enable the distribution of these correlations to be understood for a general range of cases. That’s interesting but doesn’t alter the basic statistical ideas.
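The well-known point is easy to demonstrate with pure noise. Here is a toy example; the 50 observations and 100 variables are invented numbers:

```python
import numpy as np

rng = np.random.default_rng(3)

# 100 "indicators" observed over 50 periods, all of it pure noise
n_obs, n_vars = 50, 100
x = rng.standard_normal((n_obs, n_vars))

# Largest absolute correlation among the ~5,000 variable pairs
corr = np.corrcoef(x, rowvar=False)
pairs = corr[np.triu_indices(n_vars, k=1)]
largest = np.abs(pairs).max()

# For a single pre-chosen pair at n = 50, |r| above roughly 0.28 would
# look "significant" at the 5% level; the maximum over thousands of
# pairs comes out far larger than that, even though nothing is real.
print(largest)
```

Run this and you will reliably find a "strong" correlation between two streams of random numbers, which is exactly the hidden danger the article describes.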

Beyond this, I think there’s a flaw in the idea that statistics (or econometrics) proceeds by blindly looking at the correlations among all variables. In my experience, it makes more sense to fit a hierarchical model, using structure in the economic indexes rather than just throwing them all in as predictors. We are in fact studying the properties of hierarchical models when the number of cases and variables becomes large, and it’s a hard problem. Maybe the ideas from random matrix theory will be relevant here too.

Buchanan writes:

In recent years, some economists have begun to express doubts over predictions made from huge volumes of data, but they are in the minority. Most embrace the idea that more measurements mean better predictive abilities. That might be an illusion, and random matrix theory could be the tool to separate what is real and what is not.

I’m with most economists here: I think that, on average, more measurements do mean better predictive abilities! Maybe not if you are only allowed to look at correlations and least-squares regressions, but if you can model with more structure, then, yes, more information should be better.

6 thoughts on “Random matrices in the news”

  1. Is it possible that Buchanan is using "huge volumes of data" to mean "huge numbers of variables"? It hardly seems possible, but…

    Also, he defines the "curse of dimensionality" quite differently than I've seen it defined in machine learning.

  2. Lovely post. I'd not previously been exposed to that definition of an attractor, i.e. as an attribute of a model achieved by destroying information. That's so common in the social sciences it would be nice if it had a standard name. I'd love to be able to utter: "I have a very attractive model" and have it well understood as somewhat sardonic. "Great Attractor" [sic]. Makes me desire some citations :).

  3. Thanks. Most interesting. I'd never heard the term "attractor" before, but it sums up a lot of things.

    Just as the bell curve and the power law serve as major attractors for many processes, perhaps the concept of attractor could be extended, for the human world, to the idea of an "attention attractor."

    Consider questions like "Who will win the NCAA basketball tournament?" or "Will the stock market go up or down tomorrow?" or "Will red or black come up on the roulette wheel?" versus questions like "When will the sun come up tomorrow?" or "Will school test scores be higher in Beverly Hills or Compton next year?"

    The first type of questions engage more human attention and controversy and emotion than the second type, although it's hardly clear which are more important in an objective sense.

    The attractor for human attention appears to me to be unpredictability. We devote more of our attention to things that are hard to predict. The more attention devoted to something in the daily buzz, the more unpredictable it tends to be. Unpredictability is the attractor.

    A corollary would be that you can make a lot of money inventing artificially unpredictable things like roulette wheels and the NCAA March Madness.

    Another corollary would be that since this concept of an "attention attractor" is so non-unpredictable, it will attract almost no attention!

  4. This New Scientist article makes me queasy with its undertones of divine mystery and conspiracy. I think Deepak Chopra will soon add "random matrix theory" to his bag of buzz words.

  5. Hi there Andrew, I regularly post on probability ideas pertaining to resource depletion and other topics. A few people have pointed me to the Random Matrix theory as something that might be relevant.
    I believe that what may be missing in much of this discussion is the vast amount of disorder that exists in a specific phenomenon. However, whatever the amount of disorder, the mean value and other constraints are still applicable, so the Principle of Maximum Entropy holds. Once you start applying MaxEnt, all sorts of power-law behaviors fall out. So I assert that maximum entropy considerations may act as the filter you talk about, with the constraints acting as the attractor.

    A good example of this approach is this post I have written recently "Business as Entropic Warfare"
    http://mobjectivist.blogspot.com/2010/04/business
    This models labor productivity in MaxEnt terms together with constraints based on a simple non-linear compounding growth model. The big assumption is that the state space of possible outcomes follows an ergodic hypothesis.

    I perform the quantitative analysis against the real data to see how well these ideas work and I always find the agreement parsimonious.

    The coarse-graining idea is a good one as well. All the details of the micro-states get wiped out as a larger macro behavior emerges.

Comments are closed.