The astronomer Simon Newcomb discovered in 1881 that in astronomical data the leading (leftmost) digit is much more often small (1, 2, …) than large (…, 8, 9). The probability that a number begins with the digit d is roughly the decimal logarithm of (1 + 1/d): for d = 1 this is 30.1%, while for d = 9 it is only 4.6%.
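For readers who want to see all nine values, here is a minimal Python sketch of the formula just quoted. The function name `benford_probability` is chosen here purely for illustration; it does not come from Newcomb or Benford.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading digit is d, using log10(1 + 1/d)."""
    if not 1 <= d <= 9:
        raise ValueError("a leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Print the expected share of each leading digit as a percentage.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine probabilities add up to 1, because the sum of the logarithms collapses to the decimal logarithm of 10.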
This surprising fact was rediscovered in 1937 by the physicist Frank Benford, who showed that it also holds for the most diverse types of data: city populations, the numbers in this edition of Folha, Covid case counts in different countries and states, sizes of volcanoes, time intervals between heartbeats, points scored in basketball games, and so on.
There are exceptions when the data are artificial (cell phone numbers in Rio de Janeiro always start with 9) or vary within a limited range (adult heights, measured in feet, almost always start with 5 or 6). However, it is widely confirmed that the vast majority of natural data follow Benford’s Law, as I outlined in the last issue of this column. This makes it possible to tell genuine data from fabricated or fraudulent data.
One application is the examination of tax returns: if a return is genuine, its amounts should follow Benford’s Law, so any discrepancy is a sign that the return should be flagged for closer scrutiny and analyzed carefully. The IRS and its counterparts in other countries do not disclose their methods, so we do not know how the law is actually applied. But it is a free tool, and a very easy one to use.
I know, dear reader, that this must seem very naive to you: surely a good cheater is smart enough to “cook” his values so that they follow Benford’s Law, right? Well, it’s not that easy, because the law has a property called scale invariance. In practice, this means that the law must hold regardless of the unit used!
Even if the figures in the return are carefully “cooked” in reais, the IRS can convert them into, say, Japanese yen, Swiss francs or Indian rupees. The numbers become completely different, but they still have to comply with Benford’s Law: a discrepancy in any of these currencies is grounds for suspicion. That makes things harder, doesn’t it?
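Scale invariance can be checked numerically. The sketch below is only an illustration under stated assumptions: the amounts are synthetic (spread evenly on a logarithmic scale over five orders of magnitude, which produces Benford-like leading digits), the exchange rates are made up, and the helper names are hypothetical. It is not a description of how any tax authority works.

```python
import math
import random
from collections import Counter


def leading_digit(x: float) -> int:
    """First significant digit of a positive number, read off scientific notation."""
    return int(f"{x:.15e}"[0])


def max_deviation(values) -> float:
    """Largest gap between observed leading-digit shares and log10(1 + 1/d)."""
    counts = Counter(leading_digit(v) for v in values)
    return max(
        abs(counts[d] / len(values) - math.log10(1 + 1 / d))
        for d in range(1, 10)
    )


random.seed(0)  # fixed seed so the run is reproducible

# Assumption: synthetic amounts in reais, log-uniform between 10 and 1,000,000.
amounts_reais = [10 ** random.uniform(1, 6) for _ in range(100_000)]

# Assumption: illustrative conversion factors, not real exchange rates.
rates = {"reais": 1.0, "yen": 27.5, "francs": 0.16, "rupees": 15.0}

deviations = {
    currency: max_deviation([amount * rate for amount in amounts_reais])
    for currency, rate in rates.items()
}

for currency, deviation in deviations.items():
    print(f"{currency}: largest deviation from Benford = {deviation:.4f}")
```

In every currency the observed shares should stay close to the Benford percentages, even though the individual numbers are completely different after conversion.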
And Benford’s Law has other notable properties that make it even harder to cheat. Stay tuned for next week.