Our chief data scientist explains what the data is saying about the spread of COVID-19.

Understanding the trajectory of the coronavirus is a core aspect of seeking to control the outbreak, and is also highly relevant to the task of investors as they look to position themselves effectively in current market volatility. Recently, news organizations have aggressively reported the opinions of experts, but have not always captured important nuances. Below, I note some new developments and explore what our data science is finding in New York City, which is currently the global epicenter of COVID-19.

Forecasting Clarification

British epidemiologist Neil Ferguson recently published a study that forecast an 18-month epidemic with the potential to kill half a million people in the U.K., assuming that nothing would be done and that governments would instead wait for “herd immunity” (see below) to taper the virus’ impact. However, on Thursday he answered questions from a Parliamentary committee, and his remarks were widely interpreted (including by the White House) as reducing this estimate to under 20,000 deaths. The difference is that he was now explaining a scenario, also covered in the study, in which social distancing is enforced. He noted that, in Italy, social distancing is actually having a much greater impact on limiting the spread of the virus than he anticipated; in particular, the density of cases has not exceeded 1 in 1,000 people. In short, it appears that enforced social distancing works to contain the spread of the virus, which we believe is great news, with some caveats.

It is important not to confuse this with the point of view espoused by Stanford’s John Ioannidis and now Oxford’s Sunetra Gupta. Professor Ioannidis did not publish a model of the epidemic, and his view is simply that, since we have not done enough testing, we do not know if we are close to herd immunity. Professor Gupta does have a model-driven analysis, which we have evaluated, and it suggests that over 50% of the U.K. population has already experienced the virus, such that the country is near herd immunity. As a result, she argues, there is no need to persist with the economic pain.

What Exactly Is Herd Immunity?

When a large enough percentage of the population has been infected with the virus, they will be individually immune to future infection (unless the virus mutates significantly, as occurs with the flu). At that point, infected people will come into contact almost exclusively with immune people, and the virus will not be able to spread. If a population has in fact reached herd immunity, as Gupta suggests has happened in the U.K., it may make economic sense to begin a phased process of restoring business activity.

How Will We Know When/If We Have Achieved It?

The most important number here is the percentage of the population that has not experienced the virus. If that number is 99%, and the virus runs rampant, then the problem (reflected in hospitalization, morbidity and mortality) will become 100 times worse, versus two times worse if the number is 50%. Our best estimate is that less than 0.5% of the susceptible population has been exposed to the virus. As a result, the response needs to take some form of containment. This statement is consistent with the views of Professor Ferguson (and many other epidemiologists); his latest conclusion is simply that social distancing is an effective means of containment.
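The arithmetic behind the 100-times versus two-times comparison can be sketched in a few lines of Python. This is a simplified upper bound, not an epidemiological model: it assumes the burden of the disease scales in proportion to the number of people infected, and the function name is ours, introduced only for illustration.

```python
def worst_case_multiplier(unexposed_fraction: float) -> float:
    """Upper bound on how much worse the outbreak could get if everyone
    not yet exposed were eventually infected.

    Assumption: hospitalization, morbidity and mortality scale in
    proportion to the number of people infected.
    """
    if not 0.0 <= unexposed_fraction < 1.0:
        raise ValueError("unexposed_fraction must be in [0, 1)")
    exposed_fraction = 1.0 - unexposed_fraction
    return 1.0 / exposed_fraction


# The two cases discussed in the text: 99% and 50% not yet exposed.
for unexposed in (0.99, 0.50):
    multiplier = worst_case_multiplier(unexposed)
    print(f"{unexposed:.0%} unexposed: up to {multiplier:.0f}x worse")
```

The smaller the share of the population already exposed, the larger the potential multiplier, which is why the size of the unexposed group drives the case for containment.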

Beyond that, it appears that the virus is more widespread than the numbers indicate, because there has not been sufficient testing, something that the U.S. is working to rectify. At this time, 82,179 Americans are known to have been infected (0.2 in 1,000). In Italy, 1.3 in 1,000 people have been infected, but 5.6 in 1,000 Italian doctors and nurses have been infected. The latter group has all been tested, but they have also had more contact with the virus, and more protection, than the broader populace. The situation is obviously worse in Italy than in the United States, so we believe it is unlikely that 1.3 in 1,000 Americans (442,000) are infected, and less likely still that 5.6 in 1,000 (1,856,000) are infected. The latter number corresponds to about 0.6% of the U.S. population.
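The per-1,000 figures above can be cross-checked with a short Python sketch. The text does not state the U.S. population base used, so the sketch backs it out from the 5.6-in-1,000 scenario; that derived base, and the helper name, are assumptions made for illustration only.

```python
# Figures quoted in the text.
KNOWN_US_CASES = 82_179
HEALTH_WORKER_RATE_PER_1000 = 5.6      # Italian doctors and nurses
HEALTH_WORKER_SCENARIO_CASES = 1_856_000
GENERAL_SCENARIO_CASES = 442_000


def per_1000(cases: float, population: float) -> float:
    """Convert a case count into cases per 1,000 people."""
    return 1000.0 * cases / population


# Assumption: the population base is implied by the 5.6-in-1,000 scenario.
implied_population = HEALTH_WORKER_SCENARIO_CASES / (HEALTH_WORKER_RATE_PER_1000 / 1000.0)

known_rate = per_1000(KNOWN_US_CASES, implied_population)
general_rate = per_1000(GENERAL_SCENARIO_CASES, implied_population)
upper_share = HEALTH_WORKER_SCENARIO_CASES / implied_population

print(f"Implied U.S. population: {implied_population / 1e6:.0f} million")
print(f"Known cases: {known_rate:.2f} per 1,000")
print(f"442,000 cases: {general_rate:.2f} per 1,000")
print(f"1,856,000 cases: {upper_share:.2%} of the population")
```

On this base, 442,000 cases comes out at roughly 1.3 per 1,000, in line with the Italian general-population rate, and the upper scenario is a little under 0.6% of the population.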

Testing data from Asia is also not consistent with the widespread infection predicted by Gupta. China, Hong Kong, Taiwan and Singapore have been strictly monitoring incoming airline passengers. Specifically, they test every passenger for COVID-19, whether symptomatic or not, and publish the source of the infection for each positive case. This makes it possible to estimate the true current infection rate in the population, and how it changes over time.

Using various data sources¹ to estimate the number of passengers traveling from the U.K. to the above four destinations on March 20, when travel was not banned, and counting the number of those passengers who later tested positive, we estimate that the portion of passengers from the U.K. who were infected when entering these four territories was about 0.7%. If Gupta’s model were correct, more than 10% of the U.K. population would have been infected between March 15 and March 20. If the virus were already widespread (in over 50% of the population), then the cases would not occur in clusters, and they would occur at the same time in multiple locations. In addition, if infection were already widespread, then social distancing and other measures would have very little measurable impact.

Exploring this further, one nursing home in the Seattle area was infected, and now one in New Jersey has been infected, but there have been no known infections in the country’s other 15,600 nursing homes, which house 1.4 million patients. In other words, the evidence does not suggest that the virus is widespread in the U.S. On the Diamond Princess cruise ship, the crew members who tested positive lived on the same deck. One individual caused the spread in New Rochelle, one at a party in Westport, one at a Biogen meeting in Boston, and one in the first outbreak in Tokyo.

In short, we believe the virus is still localized to a small subset of the U.S. population (less than 1%), which makes some form of containment very important in managing the epidemic and its economic impact.

New York City: Social Contacts Correlate With More Infections

Previously, we found that while subway traffic in Manhattan had decreased year-over-year, it had increased by 30% in some stations in Brooklyn. The good news is that activity in the boroughs is now down on a year-over-year basis, indicating that more people are staying home. The trend indicates a decreasing level of subway traffic in the week ended March 20, as shown below for Brooklyn and in the overall subway system.

Subway Traffic in Brooklyn (Year-Over-Year)

Source: ny.gov, Neuberger Berman.

Subway Traffic in New York City Subway (Year-Over-Year)

Source: ny.gov, Neuberger Berman.

The decline in ridership comes in the context of a sharp rise in new COVID-19 cases in Brooklyn over the past three weeks, as shown.

Rate of New COVID-19 Cases in Brooklyn

Source: ny.gov, Neuberger Berman.

The rate of new COVID-19 cases in Brooklyn is highly correlated with subway traffic if the COVID-19 cases are time-shifted by five days, which corresponds to the incubation period of the disease.

Rate of New COVID-19 Cases vs. Subway Traffic in Brooklyn (Cases Shifted by Five Days)

Source: ny.gov, Neuberger Berman.
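For readers who want to reproduce this kind of comparison, a time-shifted correlation can be sketched as follows. This is a minimal illustration, not the analysis behind the chart: the two daily series are hypothetical numbers constructed so that cases follow traffic with a five-day delay, and the function names are ours.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def lagged_correlation(traffic, cases, lag_days):
    """Correlate traffic on day t with cases on day t + lag_days."""
    if len(traffic) != len(cases):
        raise ValueError("series must have the same length")
    if not 0 <= lag_days < len(traffic) - 1:
        raise ValueError("lag_days leaves too few overlapping days")
    return pearson(traffic[: len(traffic) - lag_days], cases[lag_days:])


# Hypothetical daily series, NOT the ny.gov data used in the chart.
traffic = [100, 98, 95, 90, 70, 60, 80, 75, 50, 45, 40, 35, 30, 28, 25]
# The first five values are arbitrary; from day 5 onward, cases are a
# linear function of traffic five days earlier.
cases = [215, 214, 213, 212, 211] + [2 * t + 10 for t in traffic[:-5]]

for lag in (0, 5):
    print(f"lag {lag} days: r = {lagged_correlation(traffic, cases, lag):.3f}")
```

In this constructed example the correlation is strongest when the cases are shifted by five days, because that is the delay built into the hypothetical data. As with the chart, a high lagged correlation shows association, not causation.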

If the infection rate is scaled by subway traffic (shifted by five days), its changes become more uniform, which is evidence that increased social distancing can help reduce the spread of COVID-19.

Infection Rate Scaled by Brooklyn Subway Traffic

Source: ny.gov, Neuberger Berman.
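One way to read the scaling step is as dividing each day's infection rate by the subway traffic observed five days earlier. The Python sketch below follows that reading, which is our assumption rather than a documented method, and uses hypothetical series in place of the ny.gov data.

```python
from statistics import mean, pstdev


def scale_by_lagged_traffic(case_rate, traffic_index, lag_days):
    """Divide the case rate on day t by the traffic index on day t - lag_days."""
    return [
        case_rate[t] / traffic_index[t - lag_days]
        for t in range(lag_days, len(case_rate))
    ]


def coefficient_of_variation(values):
    """Relative spread of a series: standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


LAG_DAYS = 5  # incubation-period shift used in the text

# Hypothetical daily series (12 days), for illustration only.
traffic_index = [1.00, 0.95, 0.90, 0.80, 0.70, 0.60,
                 0.50, 0.45, 0.40, 0.35, 0.30, 0.30]
case_rate = [0.30, 0.31, 0.30, 0.32, 0.31, 0.31,
             0.28, 0.27, 0.25, 0.21, 0.18, 0.15]

scaled = scale_by_lagged_traffic(case_rate, traffic_index, LAG_DAYS)
raw = case_rate[LAG_DAYS:]

print(f"raw spread:    {coefficient_of_variation(raw):.3f}")
print(f"scaled spread: {coefficient_of_variation(scaled):.3f}")
```

If most of the day-to-day movement in the infection rate tracks earlier changes in traffic, the scaled series varies much less than the raw one, which is the pattern described above.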

Conclusion: Social Distancing Works

This analysis does not show that riding the subway causes the spread of the virus; it only shows a correlation. However, if a large percentage of the population were now immune to the virus, social distancing and other measures would have less impact on the spread of the disease than has been observed. If enforced, we believe social distancing can work, and will likely further localize the virus to a smaller number of infected people. Once these smaller subsets are identified, other measures will most likely be needed to completely contain the virus.

We will continue to generate insights on the spread of coronavirus drawn from our data analysis.