COVID19_42

Posted 13 May 2020

A shorter piece today because (even in a lockdown world) there are deadlines!!

You may have realised over the last 35,000 words of my COVID19 posts – yes, it really has been that many words – that I am something of a born-again statistician.  My need to be data-centric in our flimsy and fickle world driven by Social Media has given me a chance to knock the dust off my numbers toolkit.  Medians and modes, regression lines and probability models have become my intellectual companions in this locked-down household of one.
Today there was a déjà vu moment: I have been here before.  Not in lockdown of course, but in terms of what is going on around me.  When I was a statistician, one of my crusades was to make people understand the difference between data and information.  Too often the former is confused with the latter, and those who make that mistake can have a delusional sense that they are in control, that they understand, or that they can predict.  Conversely, and this is something I saw all too often ‘back in the day’, do not prematurely dismiss data just because they have not yet been converted into information.  Rather, mark them ‘information pending’.
Data is like a raw uncut diamond; knowledge is the precious stone you can create from it.  And you wouldn’t throw away an uncut diamond just because it hasn’t yet been crafted into a sparkling gemstone.
Let me give you an example: “The speedometer on your car tells you you are travelling at 36mph”.  Whilst you might think that is information, in isolation it is no more than one datum.  I need other data to make it actionable: “What is the speed limit of the road you are on?”  “How far ahead is the car in front?”  “Is the car in front still moving?”  “Are you at a red traffic light?”  All those extra bits of data would allow me to transform the speedometer reading into information.  But without them, “36mph” gives me no actionable information.  Equally, do not ignore the speedometer – the other data needed to make it actionable may come along later.
There is also a darker side to confusing data with information – the propagation of a narrative.   Someone who (to continue my car analogy) thinks all cars should travel at 30mph would parade the speedometer reading of 36mph as triumphant information, as evidence that the world is going to Hell and back.  Unless you have the wits to spot the sleight of hand that disguised data as information, you could be drawn into that same narrative.
I see all of the above in the data around COVID19.
The daily death figures are data, but the media want to deliver them to us as information.  They are flawed data – heavily flawed.  They cover only hospital and (now) Care Home deaths.  They rely on COVID19 appearing on the death certificate.  Cases could be missed or over-stated.  The figures only become information in their trend.  Are deaths on the rise, staying the same or reducing?  Comparing today's figure to yesterday, last week or last month begins to make it information.  But the media do not do that for us.  So instead our (mostly) logical brains try to derive information from it, and in the absence of anything else it is most likely to stoke our fear narrative.
I saw a good example of that.  A discussion on Social Media revealed concern from a teacher as to why there were plans to reopen schools: “the last time there were 300 deaths per day, we were going into lockdown, so why are we considering opening schools when we are once more at 300 deaths a day?”.  The datum has been mistaken for information; the difference between the upward trend of eight weeks ago and the downward trend today has been lost.
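To make the teacher's mistake concrete, here is a minimal sketch in Python.  The two series are entirely made up for illustration (they are not real COVID19 figures) and the function name `trend` is my own invention; the only point is that the same datum, 300, can sit on two opposite trends.

```python
def trend(daily_deaths, day, window=7):
    """Change in the daily figure over the preceding `window` days."""
    if day < window:
        raise ValueError("not enough history before this day")
    return daily_deaths[day] - daily_deaths[day - window]


# Hypothetical, invented series for illustration only; NOT real COVID19 data.
rising = [100, 130, 160, 190, 220, 250, 275, 300]   # on the way up
falling = [500, 470, 440, 410, 380, 350, 325, 300]  # on the way down

for label, series in (("rising", rising), ("falling", falling)):
    today = len(series) - 1
    # Same datum today (300), opposite sign on the week-on-week change.
    print(label, "today:", series[today], "change over 7 days:", trend(series, today))
```

The datum is identical in both cases; only the comparison with a week earlier turns it into something you could act on.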
The excess deaths data are another good example.  They are a more reliable measure of death than the daily figures because they are derived from all-cause deaths.  Comparing the total deaths expected in a normal year with deaths this year, the excess tells us something about how the COVID world has changed things.  These are not necessarily all COVID deaths, they are just COVID-related.  They could be people not entering the medical system for fear of COVID and dying of conditions or complications that might ordinarily have been eminently treatable.  They could be extra suicides because of lockdown and recession (an article from Australia has started estimating this particular ‘cost’ to society).  All of those and COVID deaths are wrapped up in the ‘excess deaths’, which stood at over 50,000 by May 2.  It is still essentially data rather than information though.
Let me make the data more akin to information.  In 1968 there were 80,000 excess deaths in that ‘flu season’.  And they were deaths in all age groups (as flu often is) whereas the median age of COVID19 deaths is 80.  I have now just made the excess deaths figure more like information than data.  Yes, it is bad in 2020 and will likely be as bad as, if not slightly worse than, 1968.  But we did not shut down the UK in 1968, nor do my parents (I would have been two at the time) regale me with stories of “that flu season”.  And of course in 1968, 80,000 would have been a higher percentage of the population than it is now.  I am not claiming any analysis is right or wrong, just giving you better information rather than just data and hopefully steering you away from falling into a default narrative.
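The “higher percentage of the population” point can be sketched in a few lines of Python.  The two death counts are the ones quoted above; the population figures are my own rough assumptions (roughly 55 million for the UK in 1968 and roughly 67 million in 2020) and should be checked against official estimates before anyone leans on the result.  Remember too that the 2020 count only runs to May 2, so this is not a like-for-like comparison of whole seasons.

```python
def share_of_population(deaths, population):
    """Deaths expressed as a percentage of the population."""
    return 100.0 * deaths / population


# Excess death counts as quoted in the text.
EXCESS_1968 = 80_000
EXCESS_2020_TO_MAY_2 = 50_000  # partial year, still rising at time of writing

# ASSUMPTION: rough UK population figures, not taken from the post.
UK_POP_1968 = 55_000_000
UK_POP_2020 = 67_000_000

share_1968 = share_of_population(EXCESS_1968, UK_POP_1968)
share_2020 = share_of_population(EXCESS_2020_TO_MAY_2, UK_POP_2020)

print(f"1968: {share_1968:.3f}% of the population")
print(f"2020 (to May 2): {share_2020:.3f}% of the population")
```

Dividing by population is one more piece of context, in the same spirit as asking for the speed limit before judging the speedometer.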
At the weekend I saw an interview with Sir David Spiegelhalter, ex-president of the Royal Statistical Society.  He echoed all of this beautifully.  He was critical of the way the COVID numbers are communicated saying the reporting “is not the trustworthy communication of statisticians”.  He accused ministers of using the daily figures as a “number theatre” and failing to give the public “genuine information”.
Hear, hear.
He was also scornful of the way his earlier remarks concerning “international comparisons”  had been misrepresented by the Government.  Spiegelhalter had previously written an article advising caution on how comparisons of death rates in different countries are made.  Because different countries have different reporting methods, ranking countries on death rates could be meaningless.  However, he made clear in his interview at the weekend that he absolutely was not saying that country-to-country comparisons have no value. Far from it.  Understand the limitations of your data, but then derive information from those data as appropriate.  He was dismayed that ministers, including Boris, had publicly cited his earlier article as justification to ignore data from other countries.