Dangers of the coronavirus (COVID-19) are dramatically over-hyped, and it’s easy to prove. It’s also a great example of how poor interpretation of data can mislead a lot of people – in this case, virtually the entire world.

When the scare first started, I more or less ignored it. There have been so many major news stories that promote irrational fear of one thing or another that I reflexively just wait these things out a little. But lately, I’ve noticed more and more “precautions” as I travel – more questions about where I’ve been, whether I have any symptoms, more masked faces in airports and more empty seats on airplanes – so I reluctantly decided to investigate it a little.

The first thing I did was to just ask some friends and colleagues what all the fuss was about. How is this different from other outbreaks? The answer was usually the same: It’s the mortality rate. A large percentage of people who are infected with the coronavirus die. That’s what makes it so scary.

But is it true?

Mortality rate involves two factors: the number of people who are infected and the number of people who die as a result of the infection. An error in either of these numbers would of course lead to an error in the mortality rate. For example, if you understate the number of people who are infected, then you will overstate the mortality rate. And that’s what’s happening now, in a big way.

Consider this recent article from the New York Times. Notice how the author includes a deeply flawed and overstated mortality rate along with an invalid comparison, and then a little later, admits there could be a problem with this number. Which part do you think most people will remember? (Emphasis mine.)

“‘Globally, about 3.4 percent of reported Covid-19 cases have died,’ Dr. Tedros said. ‘By comparison, seasonal flu generally kills far fewer than 1 percent of those infected.’”

… then a few sentences later …

“The figure does not include mild cases that do not require medical attention and is skewed by Wuhan, where the death rate is several times higher than elsewhere in China. It is also quite possible that there are many undetected cases that would push the mortality rate lower.”

That last sentence is key. First, many people who get sick with common cold and flu symptoms don’t go to the doctor. So those cases aren’t reported. Second, in the case of COVID-19, the testing is severely limited. Both issues mean that the number of reported cases is much lower than the number of people who are infected. That might sound alarming, but in a way, it’s good news. It means the mortality rate is much lower than what is being widely reported. It also means that comparing the reported COVID-19 mortality rate to rates for seasonal flu outbreaks – which are based on an estimate of actual symptomatic illnesses, whether or not the infected person was tested or even went to a healthcare provider – is not a valid comparison. Not even close.

And here’s one from the Wall Street Journal. In this article, they assert an outright falsehood. The author states that “Globally, the new coronavirus has infected 93,123 people since it was first identified … according to data compiled by Johns Hopkins University. A total of 3,198 coronavirus patients have died.” Does this mean the mortality rate is 3.4%? Well, that’s what Business Insider decided to claim in the headline of this article.

But if you look at the referenced Johns Hopkins information here, you’ll see that they are counting “confirmed” cases. Again, there are many more people infected than have been confirmed. That means the mortality rate is much lower than 3.4%. Probably dramatically lower.
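To make the arithmetic concrete, here is a minimal Python sketch using the two Johns Hopkins figures quoted above. The undercount multipliers (2, 5, 10) are purely hypothetical values chosen for illustration, not estimates from any source, and the function name is an assumption. The sketch also assumes every death is counted, which may not hold.

```python
def fatality_rate_pct(deaths: int, cases: float) -> float:
    """Return deaths as a percentage of cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


# Figures quoted from the Wall Street Journal article above.
CONFIRMED_CASES = 93_123
DEATHS = 3_198

# Rate among confirmed cases only.
confirmed_rate = fatality_rate_pct(DEATHS, CONFIRMED_CASES)
print(f"Confirmed cases only: {confirmed_rate:.2f}%")

# Hypothetical multipliers: actual infections per confirmed case.
for multiplier in (2, 5, 10):
    rate = fatality_rate_pct(DEATHS, CONFIRMED_CASES * multiplier)
    print(f"{multiplier}x as many infections: {rate:.2f}%")
```

The rate falls in direct proportion to the multiplier, so the headline percentage is only as reliable as the case count in the denominator.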

There may be other issues with the data as well that push the error in either direction. For example, it could be that some people have been infected but haven’t yet succumbed to the illness. Or it could be that the data has been fudged or mistaken one way or another, for one reason or another. We in the field of data management know that bad data quality is extremely common – especially when analysis is rushed out the door.

But assuming the data being reported is valid, it’s clear that the mortality rate is significantly overstated. Even in cases where reporters can claim that they are technically reporting the facts, the implication is clear. And the reader is left to dig and think carefully to find the issue. Most of us don’t have time for that, and the passing impression formed is that of a looming global catastrophe.

So, misinformation and fear continue to spread even faster and more dangerously than the virus itself, usually with the best of intentions I’m sure. And when testing for COVID-19 is done more widely and the number of confirmed cases skyrockets, there will still be no cause for panic. It will only be further evidence that the mortality rate is much lower than what has been widely claimed or implied.

I’ll remain open-minded as more data is collected and (hopefully) a few journalists begin to analyze that data more carefully and responsibly. But for now, it’s business as usual for me. If you extend your hand, I’ll shake it. If you hug me, I’ll hug you back. It’s going to take more evidence than what’s been presented so far for me to give in to hysteria and give up the most basic human contact.

Update: The latest “official” estimate from the CDC of the mortality rate of Covid-19 is approximately 0.26%, with a very wide range of values for different age groups (another important topic regarding interpretation and use of data). Clearly this rate is much, much lower than the 3.4% that was being widely communicated at the time this post was written.

As the post predicted, the reason that the mortality rate “changed” to a much lower number has little to do with the lethality of the disease and instead has to do with how the number is calculated. The 3.4% mortality rate is called the Case Fatality Rate (CFR), which refers to deaths among reported cases. However, the number was widely interpreted as if it were the Infection Fatality Rate (IFR), which refers to the percentage of deaths among people infected with the virus, whether or not the infection was reported. The IFR obviously involves estimating how many people are infected in real life.
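Stated as formulas, the two rates differ only in the denominator. In the sketch below, $D$, $C$, and $I$ are symbols introduced here for illustration (deaths, reported cases, and estimated total infections), and it assumes the same death count is used in both rates, which is a simplification.

```latex
\mathrm{CFR} = \frac{D}{C}, \qquad
\mathrm{IFR} = \frac{D}{I}, \qquad
\mathrm{IFR} = \mathrm{CFR} \cdot \frac{C}{I}
```

Whenever reported cases are only a fraction of actual infections ($C < I$), the IFR is lower than the CFR by that same fraction.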

Is this just a trivial semantic issue? No. When people hear “mortality rate” – whatever term is used and however it’s calculated – they will, and did, reflexively use that number to guess their chances of dying if they get infected with the virus. That is especially true when the CFR for Covid-19 is compared with the IFR for other illnesses, such as influenza, a comparison that was extremely common. Whether this was an intentional effort to “get people to take this virus more seriously” or just an innocent mistake, it was dishonest.

Having said all that, if I had known how politicized this subject would become, I would have worded the post much differently. I guarantee that there is no possible way to infer any political leanings I might have from the original post.


  1. I have been struggling to try to understand the actual mortality rate. Understanding this rate is fairly key to understanding the trade-offs between various degrees of social distancing.
    I don’t see how our current shutting down of “everything” and the economy can be sustainable for very long. There will be very real consequences for those at the bottom of the socioeconomic ladder. Millions of hourly workers are losing their jobs and I have to assume some non-trivial number of people at the very bottom will end up homeless. People are also putting off non-critical medical appointments and I have to imagine that some folks will be putting off diagnosing something important that they don’t know about yet where early detection would have been key.
    Also without significant herd immunity via inoculation from contracting the virus or a vaccine, it would seem that unless we get new cases to virtually zero and contain/quarantine all other cases (which right now is hard without sufficient testing), the virus spread will just immediately rebound and we will have to shut everything down again in short order.
    Everyone wants to try to compare the mortality rate to the flu to try to gauge the impact on our health care system.
    Here is the CDC website for the flu:
    From October 1 to March 7th (taking high side of estimate range) there have been 51 million flu illnesses with 670,000 hospitalizations and up to 55,000 deaths in the United States alone.
    This is a mortality rate of 0.107%.
    As Kevin alluded to, the CDC uses a statistical model to estimate the number of flu cases at 51 million. This is almost never mentioned in the media when comparing rates, as it is currently impossible to model the actual number of SARS-CoV-2 cases since so many are asymptomatic or mild.
    Here is the CDC website for CovID-19:
    Currently, as of March 18, there are 7,038 cases with 97 deaths reported. Taken at face value, that is a death rate of 1.37%, which is more than 10x higher than the flu. But there are several problems with that. The first is that there are or have been far more than 7,038 cases. If we assume that there are 10 actual cases for every confirmed one (not tested, not confirmed, or with mild unrecognized symptoms), then the mortality rate drops to .137%, which is “only” about 30% higher than the flu. The second problem is that not all of those 7,038 cases have resolved yet, so the death rate may rise. The third problem is that 46 of those deaths are in King County, Washington, where the nursing home was – is that skewing the rate? (Anecdotally, I am sure that nursing home has had an outbreak of the flu before which did not result in so many deaths.)
    Perhaps a better comparison is the number of deaths per case that required hospitalization. For the flu this is 8.2%. For CovID, the only data I found suggested that there have been 508 hospitalizations, and given the total number of deaths, we can assume a pretty startling 19% mortality rate.
    So I think, overall, back of the napkin, that it’s safe to say that CoVID-19 is at least 2x more deadly than the flu and that the number skews higher for those over the age of 80.
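The rates quoted in the preceding paragraphs can be reproduced with a short Python sketch. The inputs are the commenter's own figures; the variable names and the helper function are assumptions made for illustration. No expected values are hard-coded, since the comment rounds some of its results.

```python
def pct(numerator: float, denominator: float) -> float:
    """Return numerator as a percentage of denominator."""
    return 100.0 * numerator / denominator


# Flu figures quoted in the comment (high end of the CDC estimate range).
flu_illnesses = 51_000_000
flu_hospitalizations = 670_000
flu_deaths = 55_000

# COVID-19 figures quoted in the comment (as of March 18).
covid_cases = 7_038
covid_hospitalizations = 508
covid_deaths = 97

rates = {
    "flu deaths per estimated illness": pct(flu_deaths, flu_illnesses),
    "COVID deaths per confirmed case": pct(covid_deaths, covid_cases),
    "flu deaths per hospitalization": pct(flu_deaths, flu_hospitalizations),
    "COVID deaths per hospitalization": pct(covid_deaths, covid_hospitalizations),
}

for label, value in rates.items():
    print(f"{label}: {value:.3f}%")
```

Note that the first two rates use different kinds of denominators (modeled illnesses versus confirmed cases), which is exactly why they are hard to compare directly.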
    The other problem is, of course, the spread. Many Americans are vaccinated against the flu or have had the flu recently and may still have antibodies in their system. Even with these precautions in place, which serve to reduce transmission vectors, there have been 55 million cases of the flu. We can assume that without vaccinations or herd immunity, CovID might easily double the number of flu cases in a few months.
    If we assume 110 million for the number of cases and just 2x the flu mortality rate, it yields a fatality number of 235,400. If I take the estimate from some epidemiologists that the infection rate could be up to 70% of all Americans (224 million people), then the number of deaths jumps to 479,360 (almost half a million). BUT if all those cases hit in a short period of time, it would overwhelm the number of available hospital beds and ventilators, so that could send the mortality rate higher. If it is just 2x higher due to medical rationing (or 4x higher in total than the flu), then we are at about 1 million US deaths. (Note: some of these deaths would overlap with flu deaths, and some additional people will die of the flu or other illnesses because hospitals are full of CovID patients.)
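The projections in the paragraph above follow from simple multiplication, sketched here in Python. All inputs are the commenter's assumptions (a flu mortality rate of 0.107%, 110 million or 224 million cases, and 2x or 4x multipliers); the function name is hypothetical. Integer arithmetic is used so the results are exact.

```python
FLU_RATE_PER_100K = 107  # 0.107% expressed as deaths per 100,000 cases


def projected_deaths(cases: int, flu_multiplier: int) -> int:
    """Deaths if mortality is flu_multiplier times the flu rate."""
    return cases * FLU_RATE_PER_100K * flu_multiplier // 100_000


print(projected_deaths(110_000_000, 2))  # 110 million cases, 2x flu rate
print(projected_deaths(224_000_000, 2))  # 70% of Americans, 2x flu rate
print(projected_deaths(224_000_000, 4))  # 70% of Americans, 4x flu rate
```

The 4x scenario comes to 958,720, which the comment rounds to 1 million. Every output scales linearly with both the assumed case count and the assumed multiplier, so the projections are only as good as those two guesses.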
    Most of those deaths would be the elderly with underlying medical complications. In the harshest, most dispassionate discussion of Utilitarian tradeoffs, some people may question the value of changing life as we know it for all the children with their whole lives ahead of them for the very short-term compromised lives of the sick elderly that are most in this virus’s cross-hairs. Of course it’s too soon for that thinking, but it is coming.
    The NY Times did a comparison of deaths by age group and cause of death of CovID-19 assuming a mortality rate of .1% (10x higher than the flu) and an infection rate of 30% (96 million cases)
    How Coronavirus Deaths Could Compare With Other Major Killers
    1 Heart disease 655,381
    2 Cancer 599,274
    3 Alzheimer’s, dementia and brain degeneration 267,311
    4 Emphysema and COPD 154,603
    5 Stroke 147,810
    6 Coronavirus (estimate) 145,000
    7 Diabetes 84,946
    8 Drug overdoses 67,367
    9 Pneumonia/flu 59,690
    10 Liver disease and cirrhosis 55,918
    11 Renal failure 50,404
    12 Car crashes 42,114
    13 Septicemia 40,718
    14 Guns 39,201
    15 Falls 37,558
    16 Hypertension 35,835
    17 Parkinson’s and other movement disorders 35,598
    18 Obesity and other metabolic disorders 35,178
    19 Digestive diseases 31,015
    20 Atherosclerosis and other arterial diseases 24,808
    Age 10 to 19
    1 Guns 3,148
    2 Car crashes 2,870
    3 Suicide 1,532
    4 Cancer 1,071
    10 Coronavirus (estimate) 200
    Age 20 to 29
    1 Drug overdoses 11,477
    2 Guns 8,929
    3 Car crashes 8,284
    4 Suicide 3,542
    8 Coronavirus (estimate) 700
    Age 30 to 39
    1 Drug overdoses 17,303
    2 Guns 6,806
    3 Cancer 6,540
    4 Car crashes 6,365
    14 Coronavirus (estimate) 800
    Age 40 to 49
    1 Cancer 19,395
    2 Heart disease 18,276
    3 Drug overdoses 14,303
    4 Liver disease and cirrhosis 5,916
    11 Coronavirus (estimate) 2,200
    Age 50 to 59
    1 Cancer 70,996
    2 Heart disease 54,498
    3 Drug overdoses 14,688
    4 Liver disease and cirrhosis 14,642
    6 Coronavirus (estimate) 9,500
    Age 60 to 69
    1 Cancer 147,935
    2 Heart disease 102,449
    3 Coronavirus (estimate) 29,000
    4 Emphysema and COPD 27,768
    5 Diabetes 19,122
    Age 70 to 79
    1 Cancer 172,805
    2 Heart disease 135,243
    3 Emphysema and COPD 48,288
    4 Coronavirus (estimate) 44,000
    5 Alzheimer’s, dementia and brain degeneration 37,218
    Age 80+
    1 Heart disease 335,740
    2 Alzheimer’s, dementia and brain degeneration 220,976
    3 Cancer 177,706
    4 Stroke 86,390
    6 Coronavirus (estimate) 58,000
    You can see that in most age categories CoVID would rank well outside the top few causes of death; only in age 60-69 would it be as high as the 3rd leading cause of death (and 4th in age 70-79).
    Is this just something we are going to have to live with? At least until a vaccine is created. Between living permanently quarantined with no human contact or accepting those death rates, I am going with the death rates. Then again I am only 53 so I may be biased.
    Let’s be more optimistic. Here is China’s infection rate in the very densely populated city of Wuhan after quarantine measures were put in place. They were able to flatten the arrival of new cases to a relative trickle in just 14 days. Can the United States do the same? I hope so – we can do this for 14 days. What I fear is what happens to our collective psychological and economic health after months and months of the current situation.
    ( Chart would not copy)
    This chart was on the CDC website today – is the curve flattening already? I hope so, but I am no expert.
    ( Chart would not copy)
    Some parting comments:
    -About inoculation: For some viruses, we don’t maintain immunity for very long. This virus is related to those that cause the common cold, and you can certainly catch a cold multiple times a year. If we can get this and not be immune for more than a few weeks, then this is going to be a big problem. In China there were reports of infection, recovery and re-infection.
    -About vaccines: a vaccine for SARS was never found. There is no vaccine for many viruses despite decades of trying (HIV for one). Therefore, I extrapolate that there is no guarantee that we find a working vaccine. But I did read that one is already being tested, so let’s hope we do.
    -There have been reports that even among recovered patients, lung capacity damage of up to 20% may be permanent. And another report of damage to the testes in men leading to not being able to have children. I don’t know the incidence rate or even whether it is true, but that adds additional scary dimensions beyond just the death rate.
    -Everyone is commenting on the Scrubs video as if infectious disease transmission is a new thing. One TV newscaster said, “Wow, 14 years ago.” Haven’t we all known to wash our hands and not to touch things when there is an infection outbreak? I think this has been known for at least 100 years. But that Scrubs episode arc (3 or 4 episodes) was fantastic.
    -The Canadians were the first to isolate and grow copies of the virus. This happened as soon as hockey was cancelled. No better way to motivate Canadians.
    That’s enough,


  2. Well, I’m no data expert, but I’ll throw in my 2¢ anyway.
    How about we calculate the number of COVID deaths per population? I don’t care what my chance of dying is if I get the virus. I want to know my chance of dying from the virus if I live in the United States. At least we’ll be using fairly accurate numbers. That’s 565,000 / 328.2 million or 0.172%. At that rate, I probably don’t even know anyone who died of COVID…and I don’t. That doesn’t mean I’m going to ignore all the warnings. The odds of me dying in a car accident are probably similar (I’m tired of looking stuff up, just humor me). Regardless, I’m still going to drive the speed limit and stay on the right side of the road.
    So what should I do to make sure I keep the odds in my favor? Go to parties? Eat in restaurants? Drive on the left?
    I’ll do what lowers my chances of getting infected:
    – Don’t live in a nursing home
    – Stay home when possible
    – Stay away from non-household people
    – Wear a mask
    – Get vaccinated
    Maybe I’m overreacting, but I wouldn’t call it panicking.
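The per-capita figure in this comment is a population-level rate rather than a rate among infected people. A minimal Python sketch, using only the two numbers the commenter supplies (the variable names are assumptions):

```python
# Figures quoted in the comment.
us_covid_deaths = 565_000
us_population = 328_200_000

# Deaths as a percentage of the whole population, infected or not.
deaths_per_capita_pct = 100 * us_covid_deaths / us_population
print(f"{deaths_per_capita_pct:.3f}%")

# The same rate expressed as "1 in N" people.
one_in_n = us_population / us_covid_deaths
print(f"about 1 in {one_in_n:.0f}")
```

Because the denominator is the entire population, this number depends on how far the virus has spread as well as on how lethal it is, so it is not directly comparable to either the CFR or the IFR.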


    1. I agree with you. That doesn’t sound like panic at all. That sounds like a thoughtful evaluation of available information and a personal decision to take certain precautions. Different people make different decisions and there’s nothing wrong with that.

      It’s also good to understand risk factors that apparently make a big difference, such as age, health, and other factors. For example, apparently about 94% of people who were identified as having died of Covid had comorbidities. That’s useful information.


      My blog post was about the widespread panic that was induced by false or at least profoundly misleading information. People understandably interpreted the percentages (Case Fatality Rate) that were widely publicized as an indication of their chances of dying if infected. But estimating that would be a very different calculation (Infection Fatality Rate) and a much lower number, especially when testing was severely limited. That doesn’t mean there was no risk or that all this was a hoax or that anyone who takes any precautions is living in fear. It also doesn’t mean that it’s ok that people have died from this.

      I’m also not saying this was some kind of massive conspiracy to frighten people. Here I’m only pointing to an example of extreme bias, I’m sure usually for honorable reasons – to get people to “take it more seriously”.
      And it didn’t happen because “we didn’t know enough at the time”. We absolutely did. (Notice the date of the article.) It was just a terrible misuse of data.

      Regarding vaccination, I have a separate post on the extreme bias there. Here again, it makes sense to make an informed decision based on the best information available on risk, benefit, and unknowns, which unfortunately takes a lot of hard work to uncover.

      Ethics in Analytics for Covid-19 and Beyond: The Importance of Setting Morality Aside

