A test for spotting fraudulent statistics

COVID-19 cases and deaths are almost certainly undercounted. However, some countries appear to be reporting fraudulent statistics. Shutterstock image.

As countries report their death tolls due to COVID-19, some appear to be providing fraudulent statistics. Dmitry Kobak has examined the reported figures using the Poisson distribution. It’s a useful way to distinguish between lies and statistics.

Data with a Poisson distribution follow a certain pattern: the mean value is equal to the variance. The Poisson distribution is useful for modeling counts of independent events, such as the number of meteorites above a given size that strike the earth in a given year. It applies under certain conditions:

  • The event can occur any number of times within the time interval, or not at all.
  • The occurrence of one event does not affect whether another such event will occur during the time interval. That is, the events are independent.
  • The average rate at which events occur does not depend on the events that have already occurred. The rate is usually assumed to be constant, although it need not be.
  • Two events cannot occur at exactly the same time. That is, within a very small time interval, either one event or no event occurs, but not two or more.
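To see the mean-equals-variance pattern in action, here is a minimal Python sketch. It assumes a constant rate of 5 events per interval (an arbitrary illustrative value, not taken from any real data) and generates events with independent, exponentially distributed waiting times, which is one standard way to satisfy the conditions above. The function name `simulate_counts` is hypothetical.

```python
import random
import statistics

def simulate_counts(rate, intervals, seed=0):
    """Count independent events per unit interval, given a constant average rate."""
    rng = random.Random(seed)
    counts = []
    for _ in range(intervals):
        elapsed = rng.expovariate(rate)  # waiting time until the first event
        events = 0
        while elapsed < 1.0:
            events += 1
            elapsed += rng.expovariate(rate)  # independent wait until the next event
        counts.append(events)
    return counts

# Assumption: 5 events per interval on average, chosen only for illustration.
counts = simulate_counts(rate=5.0, intervals=20_000)
mean = statistics.mean(counts)
variance = statistics.pvariance(counts)
print(f"mean={mean:.2f} variance={variance:.2f} ratio={variance / mean:.2f}")
```

If the conditions hold, the variance-to-mean ratio should come out close to 1. That ratio is the yardstick for everything that follows.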

In the case of deaths due to COVID-19, a modification is necessary because they’re not entirely independent. For example, a super-spreader event could result in a large number of cases in a short time, followed by a large number of deaths. In that situation the variance is greater than the mean. Statisticians call this kind of distribution “overdispersed.”
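Overdispersion is easy to reproduce in a simulation. The sketch below lets the underlying daily rate itself vary from day to day by drawing it from a gamma distribution. This is one common way to model overdispersed counts and is an assumption here, not necessarily the modification Kobak used. The function names and the parameter values (`mean_rate=5.0`, `shape=2.0`) are hypothetical and chosen only for illustration.

```python
import random
import statistics

def poisson_draw(rng, rate):
    """Draw one Poisson count by counting exponential waiting times in a unit interval."""
    if rate <= 0:
        return 0
    events = 0
    elapsed = rng.expovariate(rate)
    while elapsed < 1.0:
        events += 1
        elapsed += rng.expovariate(rate)
    return events

def overdispersed_counts(mean_rate, shape, days, seed=0):
    """Daily counts whose underlying rate changes from day to day (gamma-distributed)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(days):
        # gammavariate(alpha, beta) has mean alpha * beta, so rates average mean_rate.
        daily_rate = rng.gammavariate(shape, mean_rate / shape)
        counts.append(poisson_draw(rng, daily_rate))
    return counts

# Assumption: illustrative parameters only, not fitted to any country's data.
counts = overdispersed_counts(mean_rate=5.0, shape=2.0, days=20_000)
mean = statistics.mean(counts)
variance = statistics.pvariance(counts)
print(f"mean={mean:.2f} variance={variance:.2f} ratio={variance / mean:.2f}")
```

For this kind of mixture the theoretical variance is the mean plus the mean squared divided by `shape`, so the ratio should land well above 1. Honest but messy data tend to look like this: noisier than Poisson, not smoother.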

Identifying fraudulent statistics

All countries’ data are subject to undercounting to some extent. If there aren’t enough COVID tests to go around, or if a person dies of more than one cause, officials might not attribute that death to COVID. Or there may simply be a backlog in issuing death certificates. None of these situations constitutes fraud; they are simply errors.

The Russian numbers offer an example of abnormal neatness. In August 2021 daily death tallies went no lower than 746 and no higher than 799. Russia’s invariant numbers continued into the first week of September, ranging from 792 to 799. A back-of-the-envelope calculation shows that such a low-variation week would occur by chance once every 2,747 years. (“More equal than others,” The Economist, 25 February 2022)

However, the data from some countries suggest deliberate tampering. That is, the daily counts vary far less than a Poisson process would allow: the distributions are too smooth to be credible. Even the normal decline in reported deaths over weekends followed by spikes on Mondays is absent. As The Economist points out, these countries are almost exclusively those without democratic governments or a free press.
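The “too smooth” idea can be made concrete with the week quoted from The Economist, in which every daily tally fell between 792 and 799. The sketch below asks how likely that is if each day were an independent Poisson count. It is not The Economist’s calculation and need not reproduce its figure. The true daily mean is unknown, so the midpoint of the reported range (795.5) is used as an assumption, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the Poisson mean is lam."""
    # Computed in log space so that large k and lam do not overflow.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def prob_week_in_range(low, high, lam, days=7):
    """Chance that `days` independent Poisson counts all fall in [low, high]."""
    p_day = sum(poisson_pmf(k, lam) for k in range(low, high + 1))
    return p_day ** days

# Assumption: a true mean of 795.5 deaths per day, the midpoint of the reported range.
p_week = prob_week_in_range(792, 799, lam=795.5)
print(f"probability for one week: {p_week:.2e}")
```

Real death counts are overdispersed, so the plain Poisson assumption is, if anything, generous to the reported figures. Extra day-to-day variation would make such a narrow week even less likely.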