From the physics arXiv blog at Technology Review:
In 1938, the physicist Frank Benford made an extraordinary discovery about numbers. He found that in many lists of numbers drawn from real data, the leading digit is far more likely to be a 1 than a 9. In fact, the distribution of first digits follows a logarithmic law. So the first digit is a 1 about 30 per cent of the time, while a 9 appears only about five per cent of the time.
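The logarithmic law is usually written as P(d) = log10(1 + 1/d) for a leading digit d from 1 to 9. The formula is not spelled out in the text above, so take the following as a minimal sketch of that standard statement; the names `benford_probability` and `BENFORD` are made up for illustration.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading digit equals d under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Table of probabilities for each possible leading digit.
BENFORD = {d: benford_probability(d) for d in range(1, 10)}

for d, p in BENFORD.items():
    print(f"{d}: {p:.1%}")
```

The nine probabilities sum to 1, and the values for 1 and 9 come out close to the 30 per cent and five per cent figures quoted above.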
That's an unsettling and counterintuitive discovery. Why aren't first digits evenly distributed in such lists? One answer is that if a universal first-digit distribution exists, it must be scale invariant: switching a data set measured in inches to one measured in centimetres should not change the distribution. If that's the case, then the only form such a distribution can take is logarithmic.
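The scale-invariance argument can be checked numerically. The sketch below uses synthetic data, not any data set from the article: one list of "lengths in inches" spread evenly on a logarithmic scale over six decades, and one spread evenly on a linear scale for contrast. It converts both to centimetres and measures how much the first-digit frequencies move. All names here (`leading_digit`, `digit_frequencies`, `shifts`) are illustrative assumptions.

```python
import random
from collections import Counter


def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])


def digit_frequencies(values):
    """Fraction of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


rng = random.Random(0)  # fixed seed so runs are repeatable
INCH_TO_CM = 2.54

# Synthetic lengths spread evenly on a log scale over six decades (assumption).
log_uniform = [10 ** rng.uniform(0, 6) for _ in range(100_000)]
# For contrast: synthetic lengths spread evenly on a linear scale (assumption).
uniform = [rng.uniform(1, 1000) for _ in range(100_000)]

shifts = {}
for name, inches in [("log-uniform", log_uniform), ("uniform", uniform)]:
    before = digit_frequencies(inches)
    after = digit_frequencies([v * INCH_TO_CM for v in inches])
    # Largest change in any single digit's frequency after the unit change.
    shifts[name] = max(abs(before[d] - after[d]) for d in range(1, 10))
    print(f"{name}: largest change in any digit's frequency = {shifts[name]:.3f}")
```

The log-uniform list keeps essentially the same first-digit frequencies after the change of units, while the evenly spread list shifts markedly. This illustrates the argument under these assumptions; it is not a proof.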
But while this is a powerful argument, it does nothing to explain the existence of the distribution in the first place.
Then there is the fact that Benford's law seems to apply only to certain types of data. Physicists have found that it crops up in an amazing variety of data sets. Here are just a few: the areas of lakes, the lengths of rivers, the physical constants, stock market indices, file sizes in a personal computer and so on.
However, there are many data sets that do not follow Benford's law, such as lottery and telephone numbers.
What's the difference between these data sets that makes Benford's law apply or not? It's hard to escape the feeling that something deeper must be going on.
Today, Lijing Shao and Bo-Qiang Ma at Peking University in China provide a new insight into the nature of Benford's law. They examine how Benford's law applies to three kinds of statistical distributions widely used in physics.