Mark J. Panaggio

All models are wrong, but does that mean we should ignore them?

Updated: May 11, 2020

Over the course of this pandemic, a narrative has emerged that the models were wrong. I have heard this from a variety of sources, ranging from friends and family to the media. I have even heard it at a symposium hosted by my own institution. When I hear a claim like that, my first inclination is to ask: what do you mean by that? Unfortunately, the explanations I have received thus far have been quite disappointing. More often than not, I find that the person making the claim does not understand the models or what they actually predict.

So, in this post, I would like to examine that claim and discuss the following questions: Were the models wrong? If so, how were they wrong and why? I would like to focus on one model in particular, the Imperial College model (you can find the original paper here), which was reported to have influenced both the US and UK governments to alter their responses to COVID-19.

First off, was this model wrong? Yes. You don't have to know anything about the model to answer that question. All models are wrong. Models are not a crystal ball for peering into the future; they simply provide the best estimate based on the information available at the time. So "was it wrong?" is not a particularly useful question to ask. The real questions are: how was the model wrong, and is it still useful?

How was it wrong? Here is a direct quote from the president of Hillsdale College (my employer) during a recent symposium on the coronavirus:


“The initial prediction about this virus was that it was going to kill two million Americans. And, it doesn’t look like that now; it doesn’t look like it will be anything near that.”


So, he is claiming that the models were wrong because they overestimated the number of deaths. Now, I have been reading as much as I can about these models, so I happen to know exactly what he is talking about. Page 7 of the Imperial College report states, "we would predict approximately 510,000 deaths in GB and 2.2 million in the US". So if you look at that one phrase, take it out of context, and ignore the rest of the paper, then he is absolutely right.

That number, 2.2 million deaths, has been cited by the press, various think tanks, countless op-eds, and even the president himself (seriously, go ahead and Google it). It is usually cited by people who are (a) trying to argue that society is overreacting to the whole coronavirus thing, or (b) trying to congratulate themselves on how successful their response has been. But is that what the model actually says? Not by a long shot.

If you actually read the report you will see that this is NOT a prediction. The report states:


“In the (unlikely) absence of any control measures or spontaneous changes in individual behaviour, we would expect a peak in mortality (daily deaths) to occur after approximately 3 months (Figure 1A). In such scenarios, given an estimated R0 of 2.4, we predict 81% of the GB and US populations would be infected over the course of the epidemic…In total, in an unmitigated epidemic, we would predict approximately 510,000 deaths in GB and 2.2 million in the US, not accounting for the potential negative effects of health systems being overwhelmed on mortality.”

There are a few important things to note here:

1. This refers to an unmitigated scenario where people literally do nothing differently. This scenario includes no isolation of infected people, no social distancing, no school closures, no mask wearing, and no lockdowns. In fact, this scenario doesn't even account for people washing their hands more often. Obviously, this scenario is not at all realistic, nor was it intended to be, so it would be unreasonable to expect the outcome to be consistent with the status quo.

2. The authors of the study did NOT consider this to be a realistic prediction. They specifically call it an “unlikely” scenario because it was self-evident that people would change their behavior even without government guidelines.

3. This baseline "do nothing" scenario accounted for only 4 of the more than 64 scenarios presented in the paper. None of the other scenarios is cited nearly as often, and every single one had a drastically lower predicted death total, ranging from 5,600 to 120,000 deaths in GB, which would correspond to roughly 20,000 to 500,000 deaths in the US (see the quick sanity check sketched below).
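For what it's worth, here is a quick back-of-the-envelope sanity check on those figures. This is my own arithmetic, not something from the report; the GB population figure is approximate, and the GB-to-US scaling assumption is mine.

```python
# Rough sanity check of the reported figures (my own arithmetic, not from the report).

GB_POP = 66e6            # approximate population of Great Britain (my assumption)
ATTACK_RATE = 0.81       # fraction infected in the unmitigated scenario (from the report)
GB_DEATHS = 510_000      # unmitigated GB deaths (from the report)
US_DEATHS = 2_200_000    # unmitigated US deaths (from the report)

# The implied infection fatality ratio works out to roughly 1%, in the same
# ballpark as the ~0.9% IFR the report assumed.
implied_ifr = GB_DEATHS / (GB_POP * ATTACK_RATE)
print(f"Implied IFR in the unmitigated scenario: {implied_ifr:.2%}")

# Scaling the mitigated GB range (5,600 to 120,000 deaths) to the US using the same
# GB-to-US ratio as the unmitigated figures lands roughly in the 20,000-500,000 range.
scale = US_DEATHS / GB_DEATHS
print(f"Mitigated range scaled to the US: {5_600 * scale:,.0f} to {120_000 * scale:,.0f} deaths")
```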

Upon closer examination, anyone who claims that the model "predicted" 2.2 million deaths has either never read the report or, worse, has read it and is grossly misrepresenting its contents. This was not a prediction. The authors did not believe that outcome was likely, nor did they claim that it was. It was simply a baseline provided to illustrate the severity of the situation. It was intended to persuade readers, and more specifically government officials, that doing nothing was not a viable option and that significant action was required. When you look at the other scenarios described in the paper, and in particular the ones that most closely resemble the policies that have actually been implemented (which include school closures, case isolation, and social distancing), their predictions seem far more reasonable. In fact, they predict somewhere between 6,400 and 71,000 total deaths in GB, depending on the infectiousness of the virus and the adherence to social distancing measures. For reference, GB currently has over 31,000 deaths, and that total is increasing by about 500 per day.

A second and related criticism was that the creators of the model immediately changed their dire predictions. Take, for example, this excerpt from an article in The Federalist:

“British scientist Neil Ferguson ignited the world’s drastic response to the novel Wuhan coronavirus when he published the bombshell report predicting 2.2 million Americans and more than half a million Brits would be killed. After both the U.S. and U.K. governments effectively shut down their citizens and economies, Ferguson is walking back his doomsday scenarios…But after tens of thousands of restaurants, bars, and businesses closed, Ferguson is now retracting his modeling, saying he feels ‘reasonably confident’ our health care system can cope when the predicted peak of the epidemic arrives in a few weeks. Testifying before the U.K.’s parliamentary select committee on science and technology on Wednesday, Ferguson said he now predicts U.K. deaths from the disease will not exceed 20,000, and could be much lower.” (https://thefederalist.com/2020/03/26/the-scientist-whose-doomsday-pandemic-model-predicted-armageddon-just-walked-back-the-apocalyptic-predictions/)

The author of the piece then cites a series of Twitter posts by former NY Times journalist Alex Berenson as justification for this claim. This same journalist was invited to speak as part of Hillsdale College's coronavirus symposium, where he gave a 50-minute lecture on "The failure of expert predictions and models." Both Berenson and the staff at The Federalist have repeatedly argued that the US response was a significant overreaction because the vast numbers of deaths never materialized (I discussed the logical fallacy behind this argument in a previous post).

But is that what really happened? Again, not even close. As we just discussed, contrary to what Berenson claims, the original report never predicted 2.2 million deaths, and the lead author never retracted his model. In fact, let's hear it straight from the horse's mouth. Here is the lead author of the Imperial College study via Twitter:


“I think it would be helpful if I cleared up some confusion that has emerged in recent days. Some have interpreted my evidence to a UK parliamentary committee as indicating we have substantially revised our assessments of the potential mortality impact of COVID-19. This is not the case. Indeed, if anything, our latest estimates suggest that the virus is slightly more transmissible than we previously thought. Our lethality estimates remain unchanged. My evidence to Parliament referred to the deaths we assess might occur in the UK in the presence of the very intensive social distancing and other public health interventions now in place. Without those controls, our assessment remains that the UK would see the scale of deaths reported in our study (namely, up to approximately 500 thousand).” (https://twitter.com/neil_ferguson/status/1243294815200124928)


In other words, the author stands by his original model and points out that people are comparing apples and oranges. The 2.2 million deaths occurred only in a hypothetical scenario with no mitigation. The 20,000 deaths figure came from the author's best estimate given the intense mitigation and suppression policies that the government had recently adopted. So the author was not claiming to have overestimated the number of deaths at all.

So, how was the model wrong? Many of the scenarios it described, including the one with 2.2 million deaths, were inconsistent with the actual policy response. And the lead author's best estimate based on that model, the one presented to Parliament, UNDERESTIMATED the number of deaths by a significant margin, since the UK is already at 31,000 deaths and counting.

On to the last question: why was the model wrong? There are many answers to this. The first is uncertainty about the disease-specific model parameters. We don't know the basic reproduction number (R0) for the novel coronavirus; estimates range from 2 to 3.5. That can make a big difference in how fast and how far the virus spreads, and R0 itself depends on a variety of factors including the specific strain of the virus, population density, social norms, individual behavior, etc. The authors of the paper addressed this uncertainty by presenting results for a range of different values of that parameter.
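To get a feel for how much that one parameter matters, here is a minimal sketch of my own using the textbook final-size relation for a homogeneously mixed SIR epidemic, z = 1 - exp(-R0*z), where z is the fraction ultimately infected. This is a far cruder model than the Imperial College one (which accounts for household, school, and workplace structure), so the numbers are only illustrative; note that it gives roughly 88% for R0 = 2.4, somewhat higher than the 81% in the report.

```python
import math

def final_size(r0, tol=1e-10):
    """Fraction of the population ultimately infected in a simple, homogeneously
    mixed SIR epidemic with no interventions, found by solving z = 1 - exp(-r0*z)
    via fixed-point iteration."""
    z = 0.5
    for _ in range(1000):
        z_new = 1.0 - math.exp(-r0 * z)
        if abs(z_new - z) < tol:
            break
        z = z_new
    return z

# The plausible range of R0 translates into very different epidemic sizes.
for r0 in (2.0, 2.4, 3.0, 3.5):
    print(f"R0 = {r0:.1f} -> roughly {final_size(r0):.0%} of the population infected")
```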

We also don't know the hospitalization and mortality rates precisely. They vary from country to country and age group to age group, and they are highly uncertain because, in the absence of widespread testing, we don't know how many people have contracted the disease in the first place. As a result, any prediction about the number of deaths or the load on hospitals will necessarily involve a great deal of uncertainty. That is why scientists often talk about a range of outcomes or a confidence interval rather than a specific prediction. Unfortunately, that type of nuanced language is often lost when the press gets hold of a paper and it gets condensed down to a headline or soundbite. The result is that the most sensational aspects of a paper are the ones that people hear about.
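As a toy illustration of why modelers report ranges rather than single numbers, here is a sketch, entirely my own and with made-up illustrative parameter ranges, that propagates uncertainty in the attack rate and the infection fatality ratio into an interval of death estimates:

```python
import random

random.seed(0)
US_POP = 330e6  # approximate US population (my assumption)

def simulate_once():
    # Illustrative uncertainty ranges, made up for this sketch (not from any report):
    attack_rate = random.uniform(0.4, 0.8)  # fraction of the population infected
    ifr = random.uniform(0.002, 0.012)      # infection fatality ratio (0.2% to 1.2%)
    return US_POP * attack_rate * ifr

deaths = sorted(simulate_once() for _ in range(10_000))

# Report a central estimate and a 90% interval instead of a single number.
median = deaths[len(deaths) // 2]
lo, hi = deaths[int(0.05 * len(deaths))], deaths[int(0.95 * len(deaths))]
print(f"Median: {median:,.0f} deaths; 90% interval: {lo:,.0f} to {hi:,.0f}")
```

Even this crude calculation produces an enormously wide interval, which is why a single headline number is almost always misleading.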

The second reason that the models are wrong is that there is uncertainty about human behavior. It is simply impossible to predict how people will respond when presented with information about a disease like COVID-19. Will governments take the threat seriously? What policies will they put in place? Will people abide by those policies? For how long? The answers to these questions have a significant impact on the course of an epidemic, and it is unreasonable to expect any model to be able to predict all the specifics of the disease AND our responses to the disease.

So why are models wrong? In epidemiology, one of the primary reasons is that people are unpredictable. Sometimes the best thing that a modeler can do is to describe a range of different scenarios, so that they can say: if we do (A) then (B) will happen, but if we do (C) then (D) will happen. That is precisely what the Imperial College study did.
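To make the "if we do (A) then (B)" idea concrete, here is a deliberately oversimplified sketch comparing a basic SIR epidemic with and without a reduction in contact rates. None of these parameter values come from the Imperial College model; they are placeholders chosen for illustration.

```python
def sir_outcomes(r0=2.4, contact_reduction=0.0, days=365, infectious_period=7.0):
    """Step a simple SIR model forward one day at a time and return the peak
    fraction infected and the total fraction ever infected. `contact_reduction`
    is the fractional cut in contact rates from distancing (0 = do nothing)."""
    gamma = 1.0 / infectious_period
    beta = r0 * gamma * (1.0 - contact_reduction)
    s, i, r = 1.0 - 1e-4, 1e-4, 0.0
    peak = i
    for _ in range(days):
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        peak = max(peak, i)
    return peak, i + r

# Scenario (A): do nothing.  Scenario (C): cut contacts by 40%.
for label, cut in (("do nothing", 0.0), ("40% contact reduction", 0.4)):
    peak, total = sir_outcomes(contact_reduction=cut)
    print(f"{label}: peak {peak:.1%} infected at once, {total:.1%} infected in total")
```

The point is not the specific numbers but the structure: the outcome depends enormously on behavior, so a responsible modeler reports one outcome per scenario rather than a single number, full stop.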

Clearly there is a great deal of confusion surrounding this model and its implications. Who is to blame? Well, there is plenty of blame to go around.

1. We should blame the scientists. Science is hard and scientific communication is even harder, but we (scientists) need to do a better job of communicating our findings clearly to help provide the public and the government with accurate, comprehensible, and actionable information. I do not know the specifics of what Neil Ferguson and the other members of his team told the press about their model, but clearly something was lost in translation. And if they overhyped their model and overstated their claims, then shame on them.

2. We should blame the media. The media has a difficult job. Asking a reporter with minimal scientific background to present scientific results to a general audience is a tall order. However, the media has a responsibility to avoid over-sensationalizing the headlines. Otherwise, they can cause unnecessary fear and confusion. When they do this, it only serves to make the public distrust the media and the scientific establishment. And if they do this intentionally just to generate a few extra clicks, then shame on them.

3. We should blame the public. Our appetite for entertainment is what drives the media. In many ways, the media gives us exactly what we ask for. How often are we willing to read beyond the headlines to appreciate the nuanced details of an article? How often are we willing to go to the source (the preprint or peer-reviewed article) instead of expecting a journalist to tell us what we need to know? The answer is: not nearly enough. So why should we be surprised when the media feeds us the click-bait headlines that we ask for? If we are unwilling to put forth any effort to investigate suspect claims before blindly accepting them, then shame on us.

4. We should blame the individuals who spread lies. One of the reasons that nonsense like the 2.2 million deaths "prediction" spreads is that dishonest people like Alex Berenson and the writers at The Federalist (intentionally?) misrepresent the facts in order to fit their agenda. They started from the premise that COVID-19 is overblown and that the response is an overreaction, and then they manufactured arguments to support that claim. I listened to the entirety of Berenson's lecture on "The failure of expert predictions and models," and it was abundantly clear that he has no understanding of mathematical modeling or epidemiology, that he has no regard for truth, and that he uses his contrarian diatribes as a tool for self-promotion. So shame on him and others like him for spreading lies and misinformation.

5. We should blame the institutions who enable these individuals. Sadly, my employer (although not for long), Hillsdale College, is one of those institutions. Hillsdale’s logo has the caption “Pursuing truth; Defending liberty since 1844”. If pursuing truth is one of the aims of the institution, there was little evidence of that in their coronavirus symposium. They managed to invite one congressman, two journalists, one economist and a panel of politics faculty, but did not bother to include a single scientist with expertise in epidemiology or modeling. This resulted in an embarrassing display of complaining, conspiracy peddling, pandering and science bashing. So, Hillsdale College, shame on you for giving people like Alex Berenson a platform without including a single credible expert on the scientific issues at play. Shame on you for treating a global pandemic like a purely political issue and presenting it from a remarkably one-sided perspective. Shame on you for using a “symposium” as an opportunity to spread propaganda instead of a forum for genuine inquiry in the pursuit of truth.

Before I conclude let me offer a few thoughts about how we should relate to models like the Imperial College model:


1. We should not blindly trust any model without carefully examining its assumptions. In this case, I wish the authors of the Imperial College study had provided more details about the specifics of their model and made their code available for other researchers to validate their findings from the get-go. It is hard for me to tell whether I should believe their model because they failed to provide sufficient documentation in their original report. From what I have read about the lead author, I have reasons to question his judgment, but that does not necessarily mean that we should discount the work of the dozens of other contributors.


2. We should consider a variety of models when making policy decisions. A complicated model often involves many choices and assumptions that may not be clear-cut, especially in the face of imperfect and incomplete data. For this reason, it is important that policy makers avoid becoming over-reliant on any one model and instead consider a variety of models in aggregate. In the case of COVID-19, most of the prominent models are making similar predictions about what the next few months will look like (for a nice summary of their predictions, check out this helpful article: https://projects.fivethirtyeight.com/covid-forecasts/?ex_cid=rrpromo), and sadly most of them have dramatically increased their estimates for the number of deaths as states abandon social distancing measures and people revert to normal behavior in response.


3. We should expect that models will be revised and refined as new data comes in. In sports, there is a notion that changing your prediction in response to new information is evidence of cowardice. I have no idea where that notion came from, but it is utter nonsense. Modelers should absolutely refine their models in response to new information. Every day, we get more data on the prevalence of COVID-19, how quickly it spreads, how susceptible people are, etc. Instead of complaining when predictions change, we should applaud, because that means we are that much closer to understanding what is really going on. Now, there may be legitimate concerns about the robustness of some models (the predictions of a good model will not change dramatically in response to minute changes in the inputs; a toy robustness check is sketched after this list), but generally speaking predictions will change, and we should view that as a good thing.


4. We should refrain from bashing models or modelers until we have actually done our homework. In the case of the Imperial College model, many public figures wasted time kicking down a straw man (the fictional prediction of 2.2 million deaths) instead of discussing far more relevant critiques of the model and engaging in more productive debates about how the real predictions should inform public policy. If people had actually read the paper or taken the time to talk to someone with expertise in epidemiology, they could have easily ascertained that they were focusing on the wrong things.
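Finally, on the robustness point in item 3: here is a tiny sketch of my own (reusing the simple final-size relation from earlier, redefined so the snippet stands alone) showing what it means to check that a prediction does not swing wildly when an input is nudged slightly.

```python
import math

def final_size(r0):
    """Fraction ultimately infected in a simple, homogeneously mixed SIR epidemic."""
    z = 0.5
    for _ in range(1000):
        z = 1.0 - math.exp(-r0 * z)
    return z

# Robustness check: nudge the input by 1% and see how much the prediction moves.
base_r0 = 2.4
change = (final_size(base_r0 * 1.01) - final_size(base_r0)) / final_size(base_r0)
print(f"A 1% change in R0 shifts the predicted attack rate by about {change:.1%}")
```

In this toy model, a 1% nudge to R0 moves the predicted attack rate by well under 1%, which is the kind of behavior one hopes to see; a model whose headline numbers flip dramatically under such small perturbations deserves scrutiny.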


PS The symposium was not all bad. I particularly appreciated the comments by Dr. Adam Carrington on the constitutional issues surrounding lockdowns. I found his comments to be both informative and balanced.


PPS The cover image is George Box, the statistician whose famous quote inspired the title of this post.


PPPS I had some strong words for my employer in this post. I should point out that there are many reasonable people at Hillsdale. I have gotten to know many wonderful colleagues on the faculty and staff, and many wonderful students, during my four years here. This criticism is directed not at them, but at whoever put together the so-called symposium that was truly an embarrassment.


P^(4)S Here is a link to the videos for the symposium I referenced. I post it reluctantly because on the one hand, I don't want to encourage people to watch them, but on the other hand, I do think it is important for people to be able to verify my description for themselves. So, here is the link: https://symposium.hillsdale.edu/
