Post by Alex on Oct 3, 2009 14:16:17 GMT -8
Benford's Law
This is a pretty neat statistical effect we heard about recently. The gist is that numbers that start with 1 are much more common than those that start with 2, which in turn are more common than numbers that start with 3, and so on. The least likely leading digits are 7, 8, and finally 9 as the least probable. This also applies to a sequence of leading digits, such that a number starting with 123 is far more likely to occur than one starting with 987.
What is interesting is that this rule is not tied to any particular kind of dataset (it covers everything from income statements to astronomy), and is independent of the units of measurement (inches or meters, dollars or euros). So it drifts into the realm of counter-intuitive and spooky laws.
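To make the units claim concrete, here is a minimal sketch. The data is made up for illustration: lengths spread evenly on a log scale over six orders of magnitude, which is an assumption and not a real dataset. The same lengths are then converted from inches to centimeters and meters, and the share of values starting with 1 is compared against log10(2), about 30.1%.

```python
import math
import random

def first_digit_share(values, digit=1):
    """Fraction of positive values whose first significant digit equals `digit`."""
    hits = sum(1 for v in values if int(f"{v:.15e}"[0]) == digit)
    return hits / len(values)

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical lengths in inches, spread over 10^-2 .. 10^4 on a log scale.
lengths_in = [10 ** rng.uniform(-2, 4) for _ in range(50_000)]
lengths_cm = [v * 2.54 for v in lengths_in]    # inches -> centimeters
lengths_m = [v * 0.0254 for v in lengths_in]   # inches -> meters

shares = {
    "inches": first_digit_share(lengths_in),
    "centimeters": first_digit_share(lengths_cm),
    "meters": first_digit_share(lengths_m),
}

print(f"expected share of leading 1s: {math.log10(2):.3f}")
for unit, share in shares.items():
    print(f"{unit:12s} {share:.3f}")
```

Changing units multiplies every value by a constant, which only slides the data along the log scale, so the leading-digit shares should stay put apart from sampling noise.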
To be fair there are a couple of caveats. The probability effect only holds true for datasets that span several orders of magnitude (a range wider than 1 to 10,000 [also stated 1 x 10^0 thru 1 x 10^4]). For this reason a narrow normal distribution doesn't follow the rule (for instance the heights of adult men). Also the pool of numbers cannot be restricted in a way that excludes certain leading digits (such as a criterion that the values must fall between 300 and 900). An interesting twist is that if the numbers are pooled from samples of many different distributions (like men's heights, cats' lengths, monthly rates of hair growth, etc.), Benford's Law reappears.
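A quick way to see the range caveat is to compare two made-up datasets. This is only a sketch: the "wide" set is spread across six orders of magnitude, and the "narrow" set imitates heights in centimeters using an assumed mean of 175 and standard deviation of 7 (illustrative numbers, not measured ones).

```python
import random
from collections import Counter

def leading_digit(x: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(x):.15e}"[0])

def digit_shares(values):
    """Map each leading digit 1..9 to the fraction of values that start with it."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

rng = random.Random(2009)  # fixed seed so the run is repeatable

# Wide: spread evenly on a log scale from 1 to 1,000,000.
wide = [10 ** rng.uniform(0, 6) for _ in range(50_000)]
# Narrow: hypothetical heights in cm, nearly all between 150 and 200.
narrow = [rng.gauss(175, 7) for _ in range(50_000)]

wide_shares = digit_shares(wide)
narrow_shares = digit_shares(narrow)

print("digit  wide    narrow")
for d in range(1, 10):
    print(f"{d}      {wide_shares[d]:.3f}   {narrow_shares[d]:.3f}")
```

The wide set should track the Benford percentages closely, while nearly every height starts with 1, simply because the values never leave the 100s.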
Why is this interesting, you may ask:
1) As with tests on random number sequences, deviations from the expected pattern are directly useful in detecting fraudulent numbers, and it is difficult to cook the books without violating the pattern. That makes tools like this very interesting to people who want to know if your numbers really add up.
2) More to the point, spooky and hard-to-explain effects that challenge intuition are always interesting, particularly ones that are universal and counter-intuitive. We can be talking about star sizes or baseball cards sold per city and see the same effect.
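For the fraud angle in point 1, here is a rough sketch of one way to score a set of figures against Benford's Law, using a plain chi-square statistic on the leading digits. This is a swapped-in textbook test, not a description of what any auditor actually runs, and both datasets are invented: "organic" values are spread over several orders of magnitude, while "invented" values imitate someone picking three-digit amounts evenly. The cutoff 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math
import random

# Expected share of each leading digit under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
# Chi-square critical value for 8 degrees of freedom at the 5% level.
CHI2_CRITICAL_8DF_5PCT = 15.507

def leading_digit(x: float) -> int:
    """First significant digit of a nonzero number."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values) -> float:
    """Chi-square distance between observed leading digits and Benford's Law."""
    values = [v for v in values if v != 0]
    n = len(values)
    observed = {d: 0 for d in range(1, 10)}
    for v in values:
        observed[leading_digit(v)] += 1
    return sum((observed[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical "organic" amounts spanning 10 to 100,000 on a log scale.
organic = [10 ** rng.uniform(1, 5) for _ in range(2_000)]
# Hypothetical "invented" amounts picked evenly between 100 and 999.
invented = [rng.uniform(100, 999) for _ in range(2_000)]

organic_stat = benford_chi_square(organic)
invented_stat = benford_chi_square(invented)

for name, stat in (("organic", organic_stat), ("invented", invented_stat)):
    verdict = "suspicious" if stat > CHI2_CRITICAL_8DF_5PCT else "looks plausible"
    print(f"{name:9s} chi-square = {stat:8.1f}  ({verdict})")
```

A high score only says the leading digits do not look like Benford's Law. Per the caveats above, data confined to a narrow range will fail the test for perfectly innocent reasons, so this is a flag for a closer look and not proof of anything.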
Leading digit / probability of that digit leading a number
1 30.1%
2 17.6%
3 12.5%
4 9.7%
5 7.9%
6 6.7%
7 5.8%
8 5.1%
9 4.6%
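The percentages above match the usual statement of Benford's Law, where the chance of a number starting with the digits d is log10(1 + 1/d). A small sketch that reproduces the table and also covers multi-digit starts like 123 and 987 (the function name is just for illustration):

```python
import math

def benford_probability(prefix: int) -> float:
    """Chance that a number's leading digits are `prefix` under Benford's Law."""
    if prefix < 1:
        raise ValueError("prefix must be a positive integer")
    return math.log10(1 + 1 / prefix)

# Single leading digits: reproduces the table.
for d in range(1, 10):
    print(f"{d}  {benford_probability(d) * 100:.1f}%")

# Multi-digit starts use the same formula with the whole prefix.
for prefix in (123, 987):
    print(f"{prefix}  {benford_probability(prefix) * 100:.3f}%")
```

By this formula a start of 123 comes out roughly eight times as likely as a start of 987.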