What is Benford’s law for? – 06.02.2021 – Marcelo Viana

A reader kindly sent me a 2011 study in which Benford’s law, applied to economic data from EU countries, raised suspicions that Greece’s data had been manipulated, which was later officially confirmed.

The law, discovered by S. Newcomb in 1881 and rediscovered by F. Benford in 1938, states that, for a wide variety of data sets, the probability that the first (leftmost) digit is d equals the base-10 logarithm of (1 + 1/d). This probability decreases as d increases: for d = 1 it is 30.1%, while for d = 9 it is only 4.6%.
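For readers who want to check these percentages, here is a minimal sketch in Python. The function name `benford_first_digit` is my own choice for illustration; the formula is the one stated above.

```python
import math

def benford_first_digit(d: int) -> float:
    """Probability that the leading decimal digit equals d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("d must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Table of the nine probabilities; together they add up to 1.
probabilities = {d: benford_first_digit(d) for d in range(1, 10)}
for d, p in probabilities.items():
    print(f"d = {d}: {p:.1%}")
```

The nine values sum to 1 because the logarithms telescope: the sum equals the base-10 logarithm of 10/1.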

We’re not sure why this law works, but it’s a useful tool for exposing all kinds of fraud, not just accounting fraud. It is used, for example, to check the data presented in scientific articles: in genuine articles the data tend to follow Benford’s law, but in “fabricated” ones they do not.
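One common way to put a number on “follows the law or not” is a Pearson chi-square statistic on the first digits. The sketch below is only an illustration under my own assumptions: the powers of 2 stand in for genuine data (they are known to follow Benford’s law), the integers from 100 to 999 stand in for crudely made-up data, and the function names are hypothetical.

```python
import math
from collections import Counter

def leading_digit(n: int) -> int:
    """First decimal digit of a nonzero integer."""
    return int(str(abs(n))[0])

def chi_square_vs_benford(values) -> float:
    """Pearson chi-square statistic of observed first digits against Benford's law."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    stat = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        stat += (counts[d] - expected) ** 2 / expected
    return stat

# Assumed stand-in for genuine data: powers of 2 follow Benford's law.
natural = [2 ** k for k in range(1, 1001)]
# Assumed stand-in for invented data: every first digit equally frequent.
uniform = list(range(100, 1000))

print("powers of 2:", chi_square_vs_benford(natural))
print("100..999:   ", chi_square_vs_benford(uniform))
```

With nine digit classes there are 8 degrees of freedom, for which the usual 5% critical value is about 15.51. A statistic far above that, as for the second data set, is a reason to look more closely, not proof of fraud.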

The law must hold regardless of the unit used (e.g. inches, meters, or light years), even though the numbers themselves vary widely with the unit. This means that even if the “fabrication” is perfect in the unit used in the article, it can still be detected by changing the unit.
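The unit-change idea can be tried out with a small simulation. Everything here is an assumption made for illustration: the “genuine” lengths are drawn log-uniformly over six orders of magnitude, the “fabricated” ones are round numbers whose first digits were drawn with exactly Benford’s frequencies, and the conversion factor from meters to feet is approximate.

```python
import math
import random

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit(x: float) -> int:
    """First significant decimal digit of a positive number."""
    return int(f"{x:.15e}"[0])

def max_deviation(values) -> float:
    """Largest gap between observed first-digit frequencies and Benford's law."""
    counts = [0] * 9
    for v in values:
        counts[leading_digit(v) - 1] += 1
    n = len(values)
    return max(abs(c / n - p) for c, p in zip(counts, BENFORD))

rng = random.Random(0)       # fixed seed so the run is reproducible
N = 100_000
FEET_PER_METER = 3.28084     # approximate conversion factor

# Assumed "genuine" lengths in meters: log-uniform over six orders of magnitude.
genuine_m = [10 ** rng.uniform(0, 6) for _ in range(N)]

# Assumed "fabricated" lengths: first digits follow Benford's frequencies,
# but every value is a round number such as 3000 or 70.
digits = rng.choices(range(1, 10), weights=BENFORD, k=N)
fabricated_m = [d * 10 ** rng.randrange(0, 6) for d in digits]

for name, data in [("genuine", genuine_m), ("fabricated", fabricated_m)]:
    in_feet = [v * FEET_PER_METER for v in data]
    print(name, "meters:", round(max_deviation(data), 3),
          "feet:", round(max_deviation(in_feet), 3))
```

In meters both data sets look fine. In feet the fabricated one gives itself away: every value that started with 1 now starts with 3, so the digit 3 appears about 30% of the time instead of the 12.5% the law predicts.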

Another property that makes Benford’s law difficult to fool is that it holds in any number base: if we replace 10 with another base b, the probability that the leading “digit” is d is given by the base-b logarithm of (1 + 1/d).
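The base-b version is just as easy to compute. In the sketch below the function names and the test data (the first thousand powers of 3, written in base 16) are my own choices for illustration, not something taken from the studies mentioned here.

```python
import math

def benford_probability(d: int, base: int = 10) -> float:
    """Chance that the leading digit in the given base equals d (1 <= d < base)."""
    if base < 2 or not 1 <= d < base:
        raise ValueError("need base >= 2 and 1 <= d < base")
    return math.log(1 + 1 / d, base)

def leading_digit(n: int, base: int) -> int:
    """Leading digit of a positive integer written in the given base."""
    while n >= base:
        n //= base
    return n

BASE = 16
# Illustrative data set: powers of 3. (Powers of 2 would not do here,
# since in base 16 they only ever start with 1, 2, 4 or 8.)
powers = [3 ** k for k in range(1, 1001)]

counts = [0] * BASE
for p in powers:
    counts[leading_digit(p, BASE)] += 1

for d in range(1, BASE):
    observed = counts[d] / len(powers)
    predicted = benford_probability(d, BASE)
    print(f"{d:2d}  observed {observed:.3f}  predicted {predicted:.3f}")
```

In base 2 the formula gives probability 1 for d = 1, as it must, because every binary number starts with 1.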

There are also generalizations that give the probabilities of the first two digits, the first three digits, and so on. They have been applied to election results in Brazil and other countries, with interesting conclusions. But deviations from the law can also result from tactical voting and other legitimate voter behavior, not fraud.
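The standard generalization uses the same formula, with the block of leading digits read as a single integer: the probability that a number starts with the digits 31 is the base-10 logarithm of (1 + 1/31). A minimal sketch, with function names that are my own:

```python
import math

def leading_block_probability(n: int) -> float:
    """Probability that the leading digits spell the integer n.

    n = 7 means "starts with 7", n = 31 means "starts with 31",
    n = 314 means "starts with 314".
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return math.log10(1 + 1 / n)

def second_digit_probability(d: int) -> float:
    """Probability that the second digit equals d (0 to 9), summed over all first digits."""
    if not 0 <= d <= 9:
        raise ValueError("d must be between 0 and 9")
    return sum(leading_block_probability(10 * first + d) for first in range(1, 10))

print(f"starts with 31: {leading_block_probability(31):.2%}")
for d in range(10):
    print(f"second digit {d}: {second_digit_probability(d):.2%}")
```

The second digit is much closer to uniform than the first: 0 is still the most likely value and 9 the least likely, but the gap between them is only a few percentage points.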

Benford’s law has even been used to identify fake users on social media. The idea is to analyze how many followers a user’s followers have: thousands of “bots” have already been identified because the follower counts of their followers did not follow Benford’s law.

Another interesting application is the detection of “fake” images on the Internet. Photos are stored digitally as numbers (usually in base b = 16), and these follow Benford’s law. But if the photo is edited and saved again, the law is broken. So a JPG or GIF file that deviates from Benford’s law suggests that the original photo may have been modified, although the discrepancy alone cannot tell us what was changed.
