Statisticians React to the News

Pandemic and Statistics Marriage: Happy or not? Can we make it work?

10 November 2020
Luis R. Pericchi Guerra

The world is immersed in the worst cataclysm of the last 100 years. The quality of the data and the transparency of the methods worldwide are key to overcoming it. The International Statistical Institute (ISI) may have a leading role to play in improving both.

The worldwide cataclysm

The current pandemic is arguably the most encompassing and crippling worldwide phenomenon of the last 100 years. It poses an existential risk to a portion of the population well beyond that of any recent war or natural disaster. Our automatic and robust immune system is insufficient in the face of a novel, hitherto unknown threat. Thus, we need to reinforce it with the most powerful system invented by Homo sapiens: recognition of ignorance, data, information, modeling, hypotheses, trials, confirmation. That is Science, Mathematics, Computation, all based on Data: Data Science and Statistics. All interconnected in an all-powerful, almost immediate, communication system.

The pandemic’s implications are not only health-related but embrace the whole of the globe’s economic, social, and political reality. Arguably, each country’s pandemic data are the property not of that country alone but of the whole world. Science knows no boundaries.

The intangible estimation of quality?

However, what counts as clean, informative, unbiased data in this rush for a cure and protection? There is no clear-cut answer. As argued, for example, in an article by Nate Silver (2020), “Coronavirus case counts are meaningless, unless you know something about testing. And even then, it gets complicated”. Moreover, the pandemic data are bound to have heavy political consequences, and thus there is a direct incentive for political interference in systems that are weak on checks and balances.
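To make Silver's point concrete, here is a minimal sketch in Python. The regions, the counts, and the helper name positivity_rate are all hypothetical, chosen only for illustration and not taken from any country's data. It shows how two regions can report the same number of cases while testing very differently.

```python
def positivity_rate(confirmed_cases: int, tests_performed: int) -> float:
    """Share of performed tests that came back positive."""
    if tests_performed <= 0:
        raise ValueError("tests_performed must be positive")
    return confirmed_cases / tests_performed


# Hypothetical figures, for illustration only.
regions = {
    "Region A": {"cases": 1_000, "tests": 50_000},
    "Region B": {"cases": 1_000, "tests": 5_000},
}

for name, r in regions.items():
    rate = positivity_rate(r["cases"], r["tests"])
    print(f"{name}: {r['cases']} cases, positivity {rate:.1%}")
```

Both hypothetical regions report 1,000 cases, but Region B finds them with a tenth of the tests, which suggests that many infections there may go undetected. Even this reading is tentative, since who gets tested matters as much as how many tests are performed.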

In summary, the data and the monitoring of the pandemic are complex and murky on the one hand, and indispensable for humanity on the other.

My question at this point is: in this conundrum, who can help?

Epidemiological data and official statistics?

The transparency-quality data problem, even if it is more pressing now, is not new. To some extent, epidemiological data are one part of Official Statistics and thus share characteristics with them.

My own work has not been closely related to official statistics for a long time, although everyone nowadays is touched by them. Obviously, there is no homogeneous level of quality across the different countries of the world. However, the data are not the property of one country; they arguably belong to all of us; they are needed by all of us.

Moreover, the handling of the data is of paramount importance, since it heavily affects the final output. For example, it is very hard even to assess the direction of the performance: if a country reports a relatively low count of deaths from the pandemic, is that a sign of good policy and implementation? Or, on the contrary, is it a sign of a very weak and insufficient health system, unable to handle many cases, which are later attributed to other causes of death? So, I decided to try to find out how the international institutions, including the International Statistical Institute (ISI), are setting quality standards for data that genuinely reflect the real situation of the pandemic.
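One approach analysts commonly use to probe this ambiguity is to compare deaths from all causes with a historical baseline, often called excess mortality. The sketch below is an illustration under stated assumptions, not a method proposed in this article: the numbers are hypothetical, the baseline is deliberately crude (the mean of earlier years), and the function name excess_deaths is invented for the example.

```python
from statistics import mean
from typing import Sequence


def excess_deaths(observed: int, baseline_years: Sequence[int]) -> float:
    """Observed all-cause deaths minus the mean of the same period in earlier years."""
    return observed - mean(baseline_years)


# Hypothetical figures for one country and one period, for illustration only.
baseline = [10_000, 10_200, 9_800]  # all-cause deaths, same period, earlier years
observed_all_cause = 12_500         # all-cause deaths during the pandemic period
reported_covid = 600                # deaths officially attributed to COVID-19

excess = excess_deaths(observed_all_cause, baseline)
unattributed = excess - reported_covid
print(f"excess deaths: {excess:.0f}, not attributed to COVID-19: {unattributed:.0f}")
```

In this made-up case the official count looks low, yet most of the excess deaths are recorded under other causes. A large gap of this kind does not settle the question by itself, but it signals that the reported count deserves an independent check.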

Puerto Rico and Venezuela

Two cases close to my own experience are worth describing: Puerto Rico (PR) and Venezuela. In PR, at the onset of the epidemic, the data on the different tests, the possible positives, the suspected cases, and even the number of hospital beds occupied as a result of COVID-19 were in question. The Secretariat of Health was overwhelmed. However, we in PR have two advantages: one is US federal aid and the other is the existence of the Institute of Statistics of Puerto Rico. The latter contributed a great deal to organizing the data, double-checking them, and communicating them effectively. The simple fact that an independent external body is double-checking the data, better still if it has Data Science skills, is of great importance and provides an almost immediate improvement in both quality and confidence.

At the other extreme, when the Venezuelan Academy of Sciences alerted the country to the likely undercount and to the highly probable peak of the epidemic after a delay of some months, the Academy was publicly threatened by the second most powerful figure of the regime. Cross-checks and transparency lead to more reliable and trustworthy data. Their absence misleads science, which ought to be beyond borders and beyond political boundaries.

Guiding principles?

Elchert (2020) usefully describes and promotes a robust and reliable Federal Data infrastructure, mainly, but not only, in the USA, advocating for the integrity and efficiency of statistical agencies so that they operate according to best practices. Fortunately, there are useful documents for best practices, many of them cited in Elchert’s article, providing definite guidelines. Two of them are listed here:

 

USA National Academy of Sciences Principles and Practices for a Federal Statistical Agency (6th Edition, 2017):

  • Principle 1: Relevance to Policy Issues
  • Principle 2: Credibility among Data Users
  • Principle 3: Trust among Data Providers
  • Principle 4: Independence from Political and Other Undue External Influence

These are four clear-cut principles that should guide good and reliable practice.

United Nations Fundamental Principles of Official Statistics (2014)

An even more complete set of 10 principles has been laid down by the United Nations. It is worth re-reading carefully. Here I highlight some of them:

  • Principle 3: To facilitate a correct interpretation of the data, the statistical agencies are to present information according to scientific standards on the sources, methods and procedures of statistics.
  • Principle 4: The statistical agencies are entitled to comment on erroneous interpretation and misuse of statistics.
  • Principle 5: Data for statistical purposes may be drawn from all types of sources, be they statistical surveys or administrative records. Statistical agencies are to choose the source with regard to quality, timeliness, costs and the burden on respondents.
  • Principle 7: The laws, regulations and measures under which the statistical systems operate are to be made public.
  • Principle 8: Coordination among statistical agencies within countries is essential to achieve consistency and efficiency in the statistical system.
  • Principle 9: The use by statistical agencies in each country of international concepts, classifications and methods promotes the consistency and efficiency of statistical systems at all official levels.
  • Principle 10: Bilateral and multilateral cooperation in statistics contributes to the improvement of systems of official statistics in all countries.

 

Finally: in need of standards, a little help from the ISI?

Admittedly, some of these principles are hard to measure and implement. My suggestion is this: could the International Statistical Institute (ISI) help in this respect, by giving guidance on how to measure quality and implement practices conducive to higher reliability and transparency?

Perhaps this could even be done in a constructive way, by encouraging countries to make a direct effort to qualify for a high standard. An important document is the ISI Declaration on Professional Ethics. The ISI website introduces the subject with an incisive statement:

"The ISI is concerned to raise and maintain professional ethical standards in statistics across the world. As a Non-Government Organization (NGO) it can take actions which may be politically difficult for other organisations.  We have adopted the ISI Declaration on Professional Ethics and set up an Advisory Board on Ethics to advise on compliance with the Declaration.  The ISI considers submissions on ethical issues, issues statements and works with other organisations to raise and maintain ethical standards within the statistics profession."

Perhaps this is the time to actively implement ethical and quality standards for the precious pandemic data of each country. Judging by the dedication of David Spiegelhalter’s 2019 book The Art of Statistics: Learning from Data, statisticians are the profession best suited to uncovering the elusive quality of the empirical data on this and future pandemics, along with other official data:

“To statisticians everywhere, with their endearing traits of pedantry, generosity, integrity, and desire to use data in the best way possible”.

Luis R. Pericchi Guerra
Puerto Rico & Venezuela