The limits of smartphone data are on display as the country seeks to reopen
Federal and state officials are using smartphone location data to inform what amounts to a high-stakes public health experiment in reopening the economy while a lethal pandemic remains rampant.
But the value of that data is unproven in predicting when and how to relax restrictions, and epidemiologists are uncertain that the data will be as useful in forecasting the likely result of allowing nail salons, restaurants and movie theaters to reopen as it was in tracking the public's adherence to stay-at-home orders issued in the early weeks of the pandemic.
More than four months after a novel coronavirus emerged in China, this is just one of many key questions about the pandemic that remain subject to scientific inquiry and debate: How much do masks slow transmission? Does warmer weather impede the coronavirus? How vulnerable are children? What is the infection risk on increasingly crowded beaches and parks? Are the people who have survived covid-19, the disease caused by the coronavirus, now immune?
Where people travel and how long they stay away from home can be measured with smartphone location data. But the increasingly popular movement maps derived from this data don't reveal how well people maintained social distancing once they reached their destinations -- something that is key to understanding the transmission of the coronavirus, epidemiologists say.
"In general, more movement does increase the risk of transmission, but you also have to take that with a grain of salt," said Saskia Popescu, a George Mason University epidemiologist. "It's not a crystal ball. It's a forecasting process."
The influential pandemic modelers at the University of Washington recently plugged location data -- gleaned from millions of smartphones -- into their equations to reflect the growing restlessness of Americans as a sweeping national shutdown approached the two-month mark.
The results were alarming: Projected mortality from covid-19 nearly doubled. Long-term hospital demand soared. A predicted period of summertime calm abruptly vanished, replaced by a grim stretch in which hundreds of people were expected to die each day. Put simply, the modelers had come to believe that as Americans resumed traveling more widely, a new wave of infection and death surely would follow.
But that conclusion, while widely shared by public health authorities eyeing the same smartphone data, is at least one step beyond what the science can reliably show at this point, epidemiologists say. Although more traveling almost certainly means more transmission, the amount depends on the subtleties of human interaction -- How close did people stand? Did they sneeze? Were there physical barriers such as a closed car window? -- that smartphone location data does not reveal.
This distinction may seem academic, but it bears on the urgent and unresolved national question of whether Americans can resume some semblance of their previous lives while still limiting the spread of the deadly virus.
These unknowns -- combined with the ongoing national failure to deploy adequate testing and contact tracing -- have left policymakers grappling in the dark, fearing that the wrong advice could unleash a new wave of infection or, conversely, worsen economic devastation that has drawn comparisons to the Great Depression.
"Anyone who speaks with a lot of confidence about how this plays out hasn't thought very deeply about it," said Eli Fenichel, a Yale University professor of natural resource economics who has studied the relationship between location data and the spread of disease.
Among the most closely scrutinized scientific products during the pandemic have been models, such as the University of Washington's, that attempt to show the future trajectory of covid-19 infection and death. Though initially built to anticipate demand for resources, especially at hospitals, pandemic models have been embraced more broadly by policymakers forced to make decisions with little meaningful precedent for guidance.
Marketers of smartphone location data -- derived mainly from mobile apps and sold primarily to advertisers or analysts looking for insights into consumer behavior -- began building free dashboards of aggregated, anonymized data in March. Facebook and Google also released their own maps of community movement, based on data from smartphones running their software.
Never before had so much information on the collective locations of Americans been so readily available and easily visualized -- causing privacy advocates, long critical of the smartphone location industry, to grit their teeth even though the data did not reveal the locations of individuals. Critics also noted inherent biases in smartphone data given that nearly 1 in 5 Americans don't have one, with rates lowest among seniors, who are most likely to die of covid-19.
But the data soon proved effective in showing the rapid decrease in overall travel as the pandemic surged in the United States in March, prompting wide-ranging government orders that restricted millions of Americans largely to their homes and essential businesses, such as grocery stores. The Centers for Disease Control and Prevention cited location data from SafeGraph, a San Francisco-based marketer of smartphone location data, in its April 17 Morbidity and Mortality Weekly Report concluding that residents of four hard-hit metropolitan areas had sharply limited their movements at a time of tightening government orders.
Analysis of the data soon became common, shaping major policy decisions and much news coverage about the failure of Americans to heed government warnings. A Slack channel among those using SafeGraph data grew to about 2,500 members from more than 1,000 institutions, government agencies and other groups, the company said.
Epidemiologists were able to establish tight connections, especially noticeable during the period of rising concern about covid-19 in March, between less travel, more time spent at home and fewer infections.
"You watch mobility drop, and it almost perfectly predicted when transmission" slowed, said Christopher Murray, director of the University of Washington's Institute for Health Metrics and Evaluation, which produces the model.
But when it comes to calculating the effects of rising mobility on transmission, Murray said, "We don't have enough data to test that."
Modelers at the University of Texas at Austin, meanwhile, had reported their own successes in linking smartphone location data to coronavirus transmission in China. Using location data in 369 Chinese cities, the Texas modelers had correctly determined that infection had spread extensively beyond its origin in Wuhan during a period of widespread travel for celebrations related to the Lunar New Year in January.
As the Texas team began developing models of the outbreak in the United States -- a task made more difficult by the sheer novelty of a pathogen that spreads rapidly, with no cure or vaccine, through communities with no prior immunity -- the smartphone location data was a welcome addition, said Lauren Ancel Meyers, a mathematical biologist who leads the Texas effort.
But the exact connection between the slowing movement in late March and the decline in transmission was less clear when attempting to project into a future in which the world had changed. Officials could lift restrictions but not fear of covid-19.
During the weeks when Americans were traveling much less, amid the closure of schools, businesses and large gathering spots, other changes also were underway that did not appear in the smartphone data. People stood farther back when chatting with friends. Many wore masks. Hand-washing became more frequent and thorough. Sneezing near others came to be seen as a reckless act. Modest fevers became cause to seek medical advice and testing.
Such subtleties are hard to capture in available data sets and, as a result, hard to convey in disease models. They rely on equations that amount to the best, most educated guesses of scientists weighing factors likely to affect transmission and death.
The University of Texas model now shows deaths falling gradually later this month, despite growing travel by Americans. But it will take at least several more weeks of infection data to determine more precisely the relationship between the broad metrics of movement and covid-19 transmission.
All of this is made even more difficult by the time lag between the moments of infection, positive test results, hospitalizations and, in the worst outcomes, deaths. That means policy mistakes today could easily take weeks to show up in such traditional epidemiological data. But smartphone data showing renewed trips to coffee shops or nail salons typically depict events just a few days earlier -- giving potential early warnings that the pandemic is growing more dangerous.
But how much remains unclear.
"This data is informative, and it has allowed us to make reasonable predictions," Meyers said. "It's not perfect."
Murray, the University of Washington modeler, said after the dire update last week: "It's very likely transmission will go up. But how much? ... We hope it will be less strong."
This uncertainty underscores the treacherous nature of modeling. The data change, but so does the behavior driving the data. Sometimes, as when the White House embraced the University of Washington model and the relatively optimistic forecasts it was offering before recent updates, models can shape policy in ways that affect behavior and, arguably, the course of the pandemic.
"That's why these models keep changing -- because the behavior keeps changing," said bioethicist Kelly Hills, co-founder of the consulting firm Rogue Bioethics in Massachusetts.
Numerous epidemiologists in recent weeks have criticized Murray's model -- especially in the time before the smartphone data was added -- as too rosy and overly confident in predictions that kept shifting with each update. Many took note when the nation's tally of deaths surged past the model's onetime prediction of 60,400 and then, on May 4, when the prediction rose to 134,000 deaths by the end of August. (It was updated again on Tuesday to 147,000 deaths.) The actual death toll, which many analysts say is an undercount, now exceeds 83,000.
Murray has defended his team's work as relying on the best available data at a time when underlying conditions kept shifting and said he believed their predictions contributed to President Donald Trump extending national restrictions through the end of April.
But the once-rarefied world of pandemic modeling has grown unquestionably politicized. Arizona authorities, at a time when Republican Gov. Doug Ducey was pushing to lift restrictions, shut down the respected modeling efforts at Arizona State University last week -- before relenting amid public outcry.
As scientists and officials keep searching for answers about the future of the epidemic, some relatively simple facts continue to push their way to the surface.
Slowing down movement overall slows down new infections. Social distancing works. And in a world without a vaccine or a cure, easing these protective measures probably means more transmission and death -- even if the extent of the resulting surge remains unclear.
Yet, even more broadly, these are remarkably blunt tools against a lethal, contagious disease. They are made necessary because authorities lack precise disease surveillance tools -- a result of the national failure to widely deploy testing and contact tracing to identify new infections before they have a chance to spread. These are among the most essential tools in the public health tool kit against the threat the nation faces, and they still are sorely lacking at the necessary scale in a nation with more than double the recorded covid-19 cases and deaths of any other, according to tracking by Johns Hopkins University.
In the absence of direct evidence, scientists are relying on the best indirect evidence they can find, such as smartphone location data. Murray called it a "proxy" for what modelers really want to know, which is how much people are increasing their risk as they leave their homes more and more.
"It's tons of data, and it looks really nice," said Fenichel, the Yale professor. "But it's like any other data set. It's messy."
Using smartphone location data, he recently modeled the effect of social distancing on transmission in every county in the nation, finding that the data is more reliable in densely populated urban and suburban areas than in rural ones. Travel in less densely populated areas may look extensive even if actual human contact is minimal.
The more important finding from the modeling he and his colleagues have done is this: Perhaps not everyone needs to stay isolated, but sick people definitely should.
"It actually looks like keeping infectious people at home is the most important thing we can do," Fenichel said.
But, he added, in a world without enough testing, nobody knows who those people are. And smartphone location data won't tell you.