Stochastic statistics education: Randomly making the world a better place

This is a post about a bad idea leading to a good one.

The bad idea: stochastic terrorism. I stumbled on a new term recently in an article shared by a friend, “stochastic terrorism.” Like any good statistician, I was fascinated (and a bit terrified). What could it mean?

The word stochastic is often used as a synonym for “random” and is usually used to describe a process. According to Merriam-Webster.com, it comes from the Greek words stochastikos, stochazesthai, and stochos, which refer to aiming, guessing, and targeting. A simple definition of the statistical concept would be a random process – some sort of activity with a random component (e.g., sampling, observing data, investing, or even fishing) – the outcome of which cannot be predicted exactly even though the distribution of possible outcomes may be very well understood from first principles or from past experience.

It turns out that, as far as stochastic terrorism is concerned, I am not alone in my fascination and confusion, but I am very late to the topic. According again to Merriam-Webster.com, there was a spike in online searches for the term “stochastic” after a journalist used the term in an August 2016 article in Rolling Stone magazine to describe Trump’s suggestion that “Second Amendment people” could “do” something about Hillary Clinton. The journalist described stochastic terrorism as implications or suggestions that lead to violent outcomes which are “statistically predictable but individually unpredictable.” The idea is that a person with a high profile makes a “suggestion” for action to a very large number of people, most of whom have a very small probability of acting on it. With high probability, the suggestion will trigger some action which then appears to be that of a “lone wolf.” This type of terrorism has become possible due to the enormous reach of social media platforms. Never before in history could a suggestion of carrying out violence reach 88M people in an instant.

The big idea behind the term seems to be more about teeny probabilities applied to enormous numbers than about stochasticity, but maybe, as a statistician, I am just used to thinking in probabilities and distributions. The scary idea is that violence can be purposefully triggered via suggestion in a way that lets the perpetrator appear blameless. The perpetrator does not know who, specifically, will follow up, but reaching a very large number of people makes it highly likely that at least one individual will do something. The feasibility of this form of terrorism depends, to a large degree, on two things. First, our impressive inability to understand large numbers. Second, our impressive inability to understand small probabilities.
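To make the “at least one individual” reasoning concrete, here is a minimal Python sketch. The per-person probability of one in ten million is a purely hypothetical value chosen for illustration, not an estimate of anything; the audience of 88M is the figure quoted above. The sketch also assumes that people act independently of one another, which is a simplification.

```python
import math


def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independent people acts,
    when each acts with probability p: 1 - (1 - p)**n."""
    # log1p/expm1 keep the calculation accurate when p is tiny.
    return -math.expm1(n * math.log1p(-p))


audience = 88_000_000   # reach quoted in the text
p_hypothetical = 1e-7   # ASSUMPTION: illustrative value only

print(f"Expected number who act: {audience * p_hypothetical:.1f}")
print(f"P(at least one acts):    {prob_at_least_one(p_hypothetical, audience):.4f}")
```

Under these assumed numbers, any one person is almost certain not to act, yet the chance that somebody acts is well above 99%.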

Is either of these human failings really a big problem? Yes. Both are big problems and, in combination, can lead to terrorism and violence (as above) and also to hundreds of thousands of unnecessary COVID-19 deaths across the globe. And, accidentally, we are back to COVID-19 – sorry – and the infuriating suggestion that a small probability of an individual dying from COVID-19 (conditional on testing positive) implies that COVID-19 is not a big problem. Right? A 99% survival rate (hypothetically) means a 1% mortality rate which, applied to the population of, say, the United States (328M in 2019), would be over 3M people! We don’t know who – and it might not be you – but it is still one predictable and tragic outcome of a teeny probability applied to a large number.

We, as humans, also get confused with good outcomes that result from small probabilities and large numbers. If a million people buy a lottery ticket with a 0.00001 chance of a win, then about 10 people are likely to win. Woohoo! If you notice 10 winners, it does not therefore mean that buying a ticket is a great investment. Nope, you, individually, are still not at all likely to win.
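The arithmetic behind both examples is the same: multiply a small probability by a large number. A short Python sketch, using only the figures quoted above (the helper name expected_count is mine, introduced for illustration):

```python
def expected_count(p: float, n: int) -> float:
    """Expected number of events when each of n people
    experiences the event with probability p."""
    return p * n


# Figures taken from the text above.
examples = {
    "COVID-19 deaths at a hypothetical 1% mortality, US population": (0.01, 328_000_000),
    "Lottery winners among a million ticket buyers": (0.00001, 1_000_000),
}

for label, (p, n) in examples.items():
    print(f"{label}: about {expected_count(p, n):,.0f} expected; "
          f"individual chance is still {p:.5%}")
```

The expected totals are large or at least noticeable, while the probability for any single person does not change at all.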

So, could better statistics education help us make better personal decisions with respect to health and finance as well as change the way we use social media to detect and reduce the spread of lies or attempts at stochastic terrorism?

The good idea: Stochastic Statistics Education. For years, I have felt that better statistics education could, in fact, make the world a better place. In answer to that particularly annoying question, “What would you do if you had unlimited time and money?”, I have always answered, “teach statistical thinking globally.” And that is usually the end of my having to put up with small talk. (Ever asked yourself why statisticians don’t get invited to more dinner parties?)

I rarely get the chance to elaborate but, when I do, it has been difficult to say exactly how I think the world would be better, exactly who I hope would do what differently. I remain, however, 100% confident that if “we” were to teach just one million additional young people how to think statistically, and if only 0.00001 of them (again about 10) made it to a position of power, the decisions those individuals would make would be more robust, more clearly based on data, and would incorporate concepts such as uncertainty, odds of false negatives, conditionality, and even the powerful combination of small probabilities applied to large numbers.

A recent article in MIT’s “Thinking Forward” newsletter put a number on the potential benefit of a different type of education. The researchers, Van Reenen and colleagues, looked at the prevalence of inventors by gender, ethnicity, and parent wealth. They found that kids born into the richest 1 percent of society are 10 times more likely to be inventors than those born into the bottom 50 percent and that innovation in the U.S. could quadruple if women, minorities, and children from low-income families became inventors at the same rate as men from high-income families. They identified clusters of innovation in which kids whose parents were innovators in a particular industry were more likely to grow up to innovate in the same industry; these clusters allowed the researchers to hypothesize a mechanism – “dinner table capital” – the informal education, inspiration, terminology, connections, and ideas that you pick up from listening to your parents. The researchers conclude that the U.S. has missed out on millions of innovations from all the kids who did not benefit from dinner table capital. No one knows which specific kids might have been influenced nor what they would have invented, but Van Reenen and colleagues demonstrated, probabilistically, how the world could be a better place if more kids had this type of education.

Statistics education, formal or informal, can have similar benefits. If research were conducted on the links between statistics education and well-made decisions in the professional or personal sphere, that research would surely document that the world has missed out on millions of well-made decisions for lack of strong and equitable statistics education – at the dinner table, in early education, or in preparation for advanced careers such as medicine and law.

Enter stochastic statistics education. It may be more difficult to reach 88M individuals in an instant with the beauty of the Central Limit Theorem than with a quick tweet to incite violence, but it is not impossible for high-quality statistics education to reach a much higher proportion of students than it currently does. With enhanced teacher training programs, professional incentives for teaching, and an injection of statistics into atypical subjects, many more students could be exposed to foundational statistical skills such as how to make and communicate decisions based on distributions of potential outcomes. What better way to combat the ongoing infodemic?

Statistics education – applied to a large number of people each with a small probability of being charged with responsibilities involving communicating information clearly, interpreting observations, or making decisions from uncertain information – would, with impressively high levels of certainty, lead to better decisions globally. We don’t know exactly who will leverage their statistics education or exactly how, but we don’t have to know to randomly make the world a better place.

----

++ I note that this article was not written in reference to the storming of the US Capitol Building on 6 January 2021. It was drafted, coincidentally, in the early hours of 6 January 2021, before events that resulted from more explicit calls for action. ++