Michael P.H. Stumpf, University of Melbourne
How do you know that something you are looking for is not there? Looking for a needle in a haystack is fundamentally easy – however laborious and tedious – if you know it’s definitely there. Looking for something, not finding it, and therefore concluding it does not exist is a different problem.
In Victoria, at the time of writing, we have had 35 consecutive days of zero newly detected COVID-19 infections. But, obviously, not everyone in the state has been tested.
So what does the lack of new cases tell us about the true frequency of infections in the Victorian population? Or, to put it another way, what is the maximum number of infections that could still lurk out there undetected?
This is what statisticians call a sampling problem. We do not test everyone, but instead rely on people with symptoms to come forward for testing. If everyone with symptoms gets themselves tested, this should give us a good idea of how many cases there are.
There are caveats: some people do not come forward for testing while others get tested several times; cases tend to cluster in families. But we can account for such uncertainties in the analysis framework that we use below.
Plenty of people are still getting tested. People check the Department of Health and Human Services’ social media feeds to see the daily “0” (the celebrated “doughnut”); some are concerned about the number of tests performed each day; and many people seriously worry about the chance of a return of the virus.
However, we can estimate the probability the virus is still out there in Victoria. There are different ways to do it, but ultimately they all give very similar results.
One good way is to adopt a “Bayesian” approach, which also lets us work out how accurate the estimate is likely to be, given the uncertainties in our assumptions and inputs. We could do the calculations exactly (using a paper and pencil, or computer algebra software), but for making predictions we usually use simulations.
For our estimate we need to know a few numbers:
With these we can estimate p, the frequency of cases, after taking into account that we found 0 positives among n tests. A value of p = 1 would mean everybody in Victoria has COVID-19, and p = 0 would mean nobody does.
In the Bayesian framework we calculate p as a compromise between our prior knowledge (or beliefs) and the new information gleaned from the data.
The prior forces us to state explicitly what we expect or believe reality to look like. And because it is a probability it also accounts for our level of certainty or ignorance. When possible we can, for example, use information from previous studies to generate the prior.
To be cautious, we will start with the very pessimistic assumption that an average of 1% of people in Victoria are actually infected. (We can be confident the real number is much smaller, but we are interested in a worst-case scenario.)
We put this 1% figure into our model as a probability distribution (called a “beta distribution”) that produces variable results with an average of 0.01 (which is another way of writing 1%).
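To make the idea of a prior with an average of 0.01 concrete, here is a minimal sketch. The article does not state which beta distribution was used, so the shape parameters PRIOR_A = 1 and PRIOR_B = 99 are an assumption chosen only because they give a mean of 0.01; the function names are likewise illustrative.

```python
import random


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution, which is a / (a + b)."""
    return a / (a + b)


# Assumed shape parameters (not from the article): any pair with
# a / (a + b) = 0.01 has the required average of 1%.
PRIOR_A, PRIOR_B = 1.0, 99.0


def sample_prior(num_draws: int, seed: int = 0) -> list[float]:
    """Draw candidate infection frequencies p from the assumed prior."""
    rng = random.Random(seed)
    return [rng.betavariate(PRIOR_A, PRIOR_B) for _ in range(num_draws)]


if __name__ == "__main__":
    draws = sample_prior(10_000)
    print("analytic prior mean: ", beta_mean(PRIOR_A, PRIOR_B))
    print("simulated prior mean:", sum(draws) / len(draws))
    print("largest draw:        ", max(draws))
```

Each draw is one "possible Victoria" with its own infection frequency. The draws vary, but they average out to roughly 1%.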
If the true frequency of infection is p, the probability of seeing 0 positive results among n tests is (1 – p)ⁿ. The bigger p is, the more people have the virus, and the smaller the chances we would see 0 positive results.
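The effect of p on the chance of a run of negative tests can be checked numerically. The sketch below assumes, as the formula itself does, that each test is an independent draw that comes back positive with probability p; the function name is illustrative, and the frequencies tried are arbitrary examples. The number of tests is the October 31 figure quoted in this article.

```python
import math


def prob_zero_positives(p: float, n: int) -> float:
    """Probability that all n tests are negative: (1 - p) ** n.

    Computed via log1p so that very small p values keep their precision.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p == 1.0:
        return 0.0 if n > 0 else 1.0
    return math.exp(n * math.log1p(-p))


if __name__ == "__main__":
    n_tests = 19_850  # tests performed on October 31, 2020
    for p in (1e-6, 1e-5, 1e-4, 1e-3):  # arbitrary example frequencies
        print(f"p = {p:.0e}: chance of zero positives = "
              f"{prob_zero_positives(p, n_tests):.6f}")
```

Frequencies that make a day of all-negative results very unlikely are exactly the ones the data rule out.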
With these two ingredients, the prior knowledge and the information from the data, we can now estimate the true frequency of infection in the Victorian population.
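For a beta prior and a run of all-negative tests, the combination has a well-known closed form: a Beta(a, b) prior multiplied by (1 – p)ⁿ gives a Beta(a, b + n) posterior, with mean a / (a + b + n). The sketch below applies that textbook result. It is not necessarily the calculation behind the figures in this article: the shape parameters a = 1 and b = 99 are an assumption (chosen to give a prior mean of 1%), so the printed values will differ from the numbers quoted here.

```python
def posterior_params(a: float, b: float, n_negative: int) -> tuple[float, float]:
    """Beta(a, b) prior plus n_negative negative tests -> Beta(a, b + n_negative)."""
    return a, b + n_negative


def posterior_mean(a: float, b: float, n_negative: int) -> float:
    """Expected infection frequency after observing only negative tests."""
    post_a, post_b = posterior_params(a, b, n_negative)
    return post_a / (post_a + post_b)


if __name__ == "__main__":
    a, b = 1.0, 99.0  # assumed prior shape, mean 0.01 (not from the article)
    for label, n in (("October 31 only", 19_850),
                     ("October 31 to December 2", 438_950)):
        print(f"{label}: posterior mean of p = {posterior_mean(a, b, n):.3e}")
```

Whatever prior is chosen, every additional negative test adds to the second parameter and pulls the expected frequency further towards zero.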
On the first day of the ongoing sequence of zero cases, October 31, 2020, there were 19,850 tests performed (thus n = 19,850). The expected value of p, the true frequency of infection in Victoria, on that day was therefore a tiny 0.0000000041 (4.1 × 10⁻⁹). We ran a million simulations of this scenario, and in only 260 of them were there any cases at all left in the population, with a maximum of 986 possible hidden cases.
Now, after more than a month of zero cases and a total of 438,950 tests between October 31 and December 2, the expected value of p has fallen even further, to 0.00000000011 (1.1 × 10⁻¹⁰). The highest number of lurking infections in one million simulations is now 39 cases (and only 132 of our million simulations contained any cases at all).
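The simulation step can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the model behind the figures above. The prior shape (PRIOR_A = 1, PRIOR_B = 99) and the population of roughly 6.7 million are assumptions. Each simulation also reports the expected number of hidden infections implied by one draw of p, rather than a random whole-number count. The outputs will therefore not match the numbers quoted in this article.

```python
import random

POPULATION = 6_700_000         # assumed rough population of Victoria
PRIOR_A, PRIOR_B = 1.0, 99.0   # assumed prior shape, mean 0.01


def simulate_hidden_infections(n_negative: int, num_sims: int = 10_000,
                               seed: int = 0) -> list[float]:
    """Draw p from the Beta(a, b + n) posterior and scale it to the population."""
    rng = random.Random(seed)
    post_b = PRIOR_B + n_negative  # all-negative tests add to the second parameter
    return [POPULATION * rng.betavariate(PRIOR_A, post_b)
            for _ in range(num_sims)]


def summarise(hidden: list[float]) -> dict[str, float]:
    """Mean, maximum, and share of simulations implying at least one infection."""
    return {
        "mean": sum(hidden) / len(hidden),
        "max": max(hidden),
        "share_at_least_one": sum(h >= 1.0 for h in hidden) / len(hidden),
    }


if __name__ == "__main__":
    for label, n in (("October 31 only", 19_850),
                     ("October 31 to December 2", 438_950)):
        print(label, summarise(simulate_hidden_infections(n)))
```

Running it for the first day and for the whole period shows the same qualitative pattern described above: the more negative tests accumulate, the fewer hidden infections any simulation can plausibly contain.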
Three points are worth considering, especially when applying this approach in the context of other states and territories, or Australia as a whole.
We were arguably lucky to get to zero cases, but we can be very confident that we have now eliminated COVID-19 in the community. The absence of evidence for coronavirus infections has slowly become evidence for the absence of the virus from Victoria.
Michael P.H. Stumpf, Professor for Theoretical Systems Biology, University of Melbourne
This article is republished from The Conversation under a Creative Commons license.