Data Analytics Holds the Key to Re-opening the U.S.
As America reopens from the COVID-19 pandemic, federal and state governments need to make decisions about where to loosen restrictions first. Should banks take priority over gyms? Sporting goods over electronics stores? The decision to keep one type of location closed and allow another to open should be based on sound analysis of the relative danger and benefit of the locations, not political expediency. Now, that data analysis is emerging.
New research — based on millions of smartphone geolocation data points and government statistics — offers policymakers a methodology to follow as they make decisions about where and when to reopen the country and loosen restrictions resulting from COVID-19.
The working paper, Rationing Social Contact During the COVID-19 Pandemic: Transmission Risk and Social Benefits of U.S. Locations, by MIT Initiative on the Digital Economy (IDE) researchers Seth Benzell, Avinash Collis, and Christos Nicolaides, offers a timely antidote to the subjective decision-making now under way among U.S. governors.
“We know that crowded places accelerate the spread of COVID-19. By using large-scale, human mobility data we can approximate the cumulative transmission danger due to physical proximity at different locations,” said Nicolaides, who also works at the University of Cyprus.
Analyzing geolocation data from a large sample (47 million) of smartphones, nationally representative consumer preference surveys (of 1,099 U.S. residents), and government statistics, the researchers weighed the cumulative health risks against the benefits and social costs of closing 26 different location categories in the U.S. Additional data, collected by SafeGraph Inc., recorded visits to six million locations around the country from February to March 2020, Nicolaides said.
Tapping Three Data Sources
“We used three different data sources to get at the complete picture,” Collis explained. “Mobile location data let us study transmission risks of locations; government statistics let us study the economic aspect of locations; and consumer surveys that we conducted helped us classify the importance of these locations to consumers.”
The researchers then ranked the categories across 13 dimensions of risk and value. The cumulative risk of a location was determined by factors such as the number of total visits (with an emphasis on older visitors), distance traveled to the location, and the density of attendance. The result is a ranking of locations by danger based on proximity and by economic importance. High-importance locations, such as banks, grocery stores, and dentists, were identified using an index of consumer welfare, employment, payroll, and spending.
Notably, both the risk and importance measures are cumulative; rather than measuring the implications of a single visit to one of these locations, the researchers gauge the total danger and importance of all visits. While the typical visit to a movie theater may be more dangerous than the visit to a liquor or tobacco store, for example, there are many more total visits to the latter.
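The difference between per-visit and cumulative danger can be illustrated with a minimal sketch. It assumes, purely for illustration, that cumulative danger is simply a per-visit danger score multiplied by total visits; the paper's actual measure combines several factors, and every number and name below is hypothetical rather than taken from the study.

```python
# Hypothetical per-visit danger scores and monthly visit counts.
# These values are invented for illustration; they are NOT from the paper.
categories = {
    "movie_theater": {"per_visit_danger": 5, "total_visits": 1_000},
    "liquor_tobacco": {"per_visit_danger": 2, "total_visits": 10_000},
}

def cumulative_danger(per_visit_danger, total_visits):
    """Simplifying assumption: total danger scales linearly with visits."""
    return per_visit_danger * total_visits

cumulative = {
    name: cumulative_danger(c["per_visit_danger"], c["total_visits"])
    for name, c in categories.items()
}

# Rank categories from most to least cumulative danger.
ranking = sorted(cumulative, key=cumulative.get, reverse=True)
print(cumulative)
print(ranking)
```

With these made-up inputs, the category that is riskier per visit ends up lower in the cumulative ranking because it draws far fewer visits.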
Low-importance, high-risk rankings were assigned to locations such as gyms, sporting goods stores, and cafes. Among types of retail stores, electronics and furniture stores could be opened before liquor and tobacco stores or sporting goods stores, based on the analysis. Gyms, cafes, juice bars, and dessert parlors should be opened only after banks, dentists, colleges, places of worship, auto dealers, and repair shops (see Figure 1, panel A).
Fig. 1. A. Category-cumulative importance index and cumulative danger index. The color scale reflects the residuals by category of a linear regression of the importance index on the danger index. GOLD categories have disproportionately high importance for their risk; BLUE categories have disproportionately low importance. Categories in GOLD should be opened earlier because they have a better importance-to-risk tradeoff; categories in BLUE should be opened later because they have a worse tradeoff. B. Change in location category visits versus the category importance-risk residual. Marker sizes are proportional to total visits in February 2020.
Panel B of the figure shows that a location category’s risk/reward tradeoff is closely related to which categories have actually seen declines in visits. With some exceptions, governments, business owners, and individuals seem to have intuitively internalized some of the analysis’s insights.
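The Figure 1 caption describes the color scale as the residuals of a linear regression of the importance index on the danger index. The sketch below shows one way such residuals could be computed and turned into a reopening order. It uses ordinary least squares with a single predictor, which is an assumption about the regression setup, and the category index values are invented for illustration; none of them come from the paper.

```python
from statistics import mean

def importance_residuals(danger, importance):
    """Fit importance = a + b * danger by ordinary least squares and
    return each category's residual (actual minus fitted importance)."""
    cats = sorted(danger)
    x = [danger[c] for c in cats]
    y = [importance[c] for c in cats]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return {c: importance[c] - (intercept + slope * danger[c]) for c in cats}

# Hypothetical index values, NOT the paper's data.
danger = {"bank": 20.0, "gym": 60.0, "cafe": 70.0, "grocery": 90.0}
importance = {"bank": 70.0, "gym": 30.0, "cafe": 35.0, "grocery": 95.0}

residuals = importance_residuals(danger, importance)

# Positive residual: more important than its danger predicts (GOLD, open earlier).
# Negative residual: less important than its danger predicts (BLUE, open later).
reopen_order = sorted(residuals, key=residuals.get, reverse=True)
print(reopen_order)
```

Under these made-up inputs, categories with positive residuals sort to the front of the list and those with negative residuals to the back, mirroring the GOLD and BLUE reading of the figure.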
A Confusing State of the States
Governments and civic organizations across the world have made different decisions about how to implement and relax social distancing measures as the spread of the virus slows. In fact, some U.S. states are enacting policies that directly contradict the research findings. While all 50 states closed public schools, only 45 closed gyms, and just 14 states restricted religious gatherings without social distancing requirements. Fully 49 states have kept liquor stores open throughout the pandemic (See data here).
The importance or danger of a category varies greatly state-to-state and over time. Heat waves in Southern California and South Florida sent masses of people to public beaches in mid-April, even though the governor of California said on April 28 that parks and trails wouldn’t open for a few more weeks. Elsewhere, pressure from certain business groups, such as gun enthusiasts, may have led to opening sporting goods stores before colleges, for instance. Church leaders may flout regulations and hold services despite mandates, and more remote and unpopulated states — with fewer reported cases of the coronavirus — relaxed shelter-in-place mandates much sooner than others.
In the absence of empirical evidence, states are making decisions in the dark, according to the researchers. “If so, we hope that policymakers will be able to combine the results of our study with their own intuition and expertise to make better informed decisions,” said Benzell.
The analysis is preliminary and its limitations are discussed in the paper. “We are continuing to make refinements,” Benzell said. “Our data-driven ranking should be combined with other data and the practical judgment of local policymakers to reach final decisions.”