Summer 2020


A watercolor painting of the novel coronavirus SARS-CoV-2, depicted in vivid pinks and purples against a bright green and yellow background.
This watercolor painting, which depicts a coronavirus as it enters the lungs, surrounded by mucus secreted by respiratory cells, secreted antibodies, and several small immune system proteins, makes SARS-CoV-2 appear quite beautiful. But the virus caused an ugly mess across the globe this year by causing widespread cases of COVID-19, a sometimes deadly infectious disease. A COVID-19 research group, led by UW experts, is using data science to help pandemic managers navigate the chaos. Illustration by David S. Goodsell, RCSB Protein Data Bank; doi: 10.2210/rcsb_pdb/goodsell-gallery-019


It began with a question from a yoga instructor, sent to the inbox of horticulture professor Brian Yandell. Now it’s a major collaborative effort, housed at UW–Madison, that will help the world better understand and respond to COVID-19.

“My yoga instructor emailed me to ask about an article she had read,” says Yandell, who has been a UW professor of horticulture and statistics for 38 years and is now interim director of the American Family Insurance Data Science Institute (AFI DSI). The article detailed the sneaky, exponential growth of COVID-19 in other parts of the world. And it highlighted how mathematical modeling can help inform policymakers and the general public about the pandemic.

Inspired, Yandell began to look for data. He started developing an app to assess COVID-19 case projections in the Midwest. But after an impromptu conversation with his neighbor, Mary Bottari, chief of staff to Madison’s mayor, he realized that, to gather enough data to create meaningful models of the disease, he was going to need more help.

An evening aerial photo of the Discovery Building, awash in golden light from street lights and windows, as well as the City of Madison skyline, including the white shining state Capitol Building standing a dome above the rest of the buildings.
Nighttime falls over the Discovery Building (center), which houses UW’s Data Science Hub. Members of the Hub are participating in a nationwide effort — led by the COVID-19 Research Group at the American Family Insurance Data Science Institute — to bring new insights to pandemic management. Photo: Jeff Miller

So Yandell reached out to researchers at UW Health, the UW School of Medicine and Public Health, and others on campus, such as Michael Ferris, director of the Data Science Hub at the Discovery Building. He contacted Ajay Sethi, associate professor of population health sciences, and Malia Jones, assistant scientist at the Applied Population Laboratory in CALS (see “COVID Crush Shows Disease Spread Is No Game” in this issue for more about Jones’s pandemic-related work), along with several colleagues at the College of Engineering. His list of connections grew longer every day.

Within three weeks, Yandell had heard from more than 100 people throughout the U.S. who are now participating in various ways. The AFI DSI COVID-19 Research Group works under a charter that focuses its efforts in three areas: interpreting data, using data to create models, and sharing information and findings. These areas are some of the hallmarks of data science, a burgeoning interdisciplinary field that involves developing new methods for data collection, storage, and analysis. The goal is to find more effective and efficient ways to draw insights and useful information from massive data sets.

Early modeling results from the research coalition show that the speed of viral transmission has slowed since Wisconsin Gov. Tony Evers issued the first “safer at home” executive order on March 25. The results demonstrate that, in the absence of other options, such as a vaccine or approved therapies, physical distancing (also referred to as social distancing) is necessary to stop the spread of COVID-19.

“While the human costs of COVID-19 are clear, so are the steps we must take to protect our families, neighbors, and community,” team member Jonathan Patz, director of the Global Health Institute, wrote in a recent op-ed. “Physical distancing must be our top priority to stop new cases of COVID-19 from overwhelming our health care system.”

Information, Not Assumptions

Even after public orders to remain at home expire, Patz, Sethi, and Yandell stress that physical distancing will continue to be necessary until broad-scale testing and contact tracing become available.

A close shot of green, red, and yellow lights and cables on the back of a computer terminal.
Cables and computer equipment in the Wisconsin Institute for Discovery data center at UW–Madison, which is utilized by the COVID-19 Research Group at UW’s American Family Insurance Data Science Institute. Photo: Michael P. King

“We need an overflow of information from testing,” Yandell says. “The data we get from testing can then be used to refine our models rather than making assumptions.”

The data science research group was recently asked by the Wisconsin State Emergency Operations Center and the state’s Department of Health Services (DHS) for advice on developing and implementing a contact tracing data system, which is being used to help identify potentially infected people and contain the pandemic. Yandell, Patz, and Todd Shechter, UW’s chief technology officer, are also advising the UW System and UW–Madison administrations on their plans to safely open up research and instruction activities over the coming months.

“We are considering multipronged approaches to testing and contact tracing, including traditional, digital, genetic, batch (within businesses or institutions), and possibly wastewater approaches,” Yandell says. “We are also discussing models of testing and tracing in production plants and university campuses.”

Ferris says data on COVID-19 are evolving quickly, with increasing detail available at the county, city, and regional levels. Epidemiological and statistical models extrapolate from these data to arrive at future predictions in a process called “calibration.”

No predictive models are perfect, but with adequate data and fine-tuned mathematical parameters, they can be useful tools for helping anticipate the future. Or, as Yandell likes to point out (using the words of George Box, founder of the UW Department of Statistics), “All models are wrong, but some models are useful.”
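To make the idea of calibration concrete, here is a minimal sketch in Python. It is not the research group's model: it assumes the simplest possible case, a pure exponential growth curve fitted to daily case counts by least squares on the log scale. The function names and the case counts are hypothetical and chosen only for illustration.

```python
import math

def fit_exponential(daily_cases):
    """Fit cases = a * exp(r * day) by least squares on log(cases).

    Assumes every count is positive. Returns (a, r).
    """
    days = list(range(len(daily_cases)))
    logs = [math.log(c) for c in daily_cases]
    n = len(logs)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    r = sxy / sxx                      # estimated daily growth rate
    a = math.exp(mean_y - r * mean_x)  # estimated count on day 0
    return a, r

def project(a, r, day):
    """Extrapolate the fitted curve to a given day."""
    return a * math.exp(r * day)

def doubling_time(r):
    """Days for cases to double at growth rate r (r must be positive)."""
    return math.log(2) / r

# Made-up counts for illustration only; these are not real COVID-19 data.
observed = [12, 15, 17, 22, 25, 31, 36, 45, 52, 64]
a, r = fit_exponential(observed)
print(f"growth rate per day: {r:.3f}")
print(f"doubling time (days): {doubling_time(r):.1f}")
print(f"projected cases on day 14: {project(a, r, 14):.0f}")
```

Refitting as each day's count arrives is the "refine" step in miniature. A real epidemiological model would add further structure, such as the effects of physical distancing, and would report its uncertainty, which this sketch omits.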

A former student of Box’s, research group member Kevin Little, now runs a private consulting firm in Madison. He’s using control charts to help DHS diagnose “gating criteria,” coronavirus-related benchmarks established by public health officials, to ensure the safe opening of counties. Another team member, UW data scientist Steve Goldstein, is leading a group that just won a Wisconsin Alumni Research Foundation Accelerator Grant to develop interactive data visualization tools to quickly detect potential spikes across regions of the state.
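For readers unfamiliar with control charts, the following Python sketch shows one textbook variety, the Shewhart c-chart for counts. It is an illustrative assumption, not necessarily the chart type Little uses with DHS: the center line is the mean of a baseline period, and the control limits sit three standard deviations away, with the standard deviation taken as the square root of the mean. The function names and the counts are hypothetical.

```python
import math

def c_chart_limits(baseline_counts):
    """Return (lower, center, upper) control limits for a c-chart.

    Treats counts as roughly Poisson, so sigma = sqrt(mean).
    """
    center = sum(baseline_counts) / len(baseline_counts)
    sigma = math.sqrt(center)
    lower = max(0.0, center - 3 * sigma)  # counts cannot go below zero
    upper = center + 3 * sigma
    return lower, center, upper

def flag_signals(counts, upper):
    """Return the indexes of counts that exceed the upper control limit."""
    return [i for i, c in enumerate(counts) if c > upper]

# Made-up daily counts for illustration only; not real surveillance data.
baseline = [14, 18, 16, 15, 17, 19, 13, 16, 18, 14]
lower, center, upper = c_chart_limits(baseline)
new_days = [17, 21, 35, 16]
print(f"center line: {center:.1f}, upper limit: {upper:.1f}")
print("days above the upper limit:", flag_signals(new_days, upper))
```

The appeal of the approach is that it separates ordinary day-to-day variation from changes large enough to warrant a closer look, which is the kind of judgment a benchmark for reopening calls for.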

The Right Resources

With calibration, researchers can add new drivers, such as an updated number of cases or changes to the availability of ventilators at regional hospitals, to enhance the fit of the data and improve predictions. Two main drivers of the spread of COVID-19 are a lack of physical distancing and transmission of the disease by asymptomatic individuals, Ferris says.

“We want these models to be effective and help decision-makers and the general public understand the evolution of this system and how we can use interventions to affect that evolution,” Ferris says.

Among the team’s goals is to help ensure the right resources are available in the right places at the right times, he adds. And researchers are using data visualization tools to track infectious disease trends.

Incomplete or missing data, stemming from a lack of adequate testing and unanticipated changes in resource availability, limit modeling efforts for response. “We have to continually refine our models to reflect changes in the supply chain, such as when new nurses and doctors may become available,” Ferris says.

The types of health data that are available also are limited due to privacy considerations and laws that guard some kinds of health information.

“This isn’t a perfectly defined procedure, but modeling is iterative,” Ferris says. “We collect data, build a model informed by that data, and run that model to make some inferences about how we would change that system. Then, we rerun them until we are confident in our model and can suggest action based upon it.”

Each time, he adds, “We grow more confident in our conclusions.”

Opportunities to Brainstorm

The research group has added several data dashboards, continues to work on a website, and shares information as appropriate and necessary.

Several members of the COVID-19 data science group also participate in ongoing discussion over the messaging platform Slack. The discussion was started by Mikhail Kats, associate professor of electrical and computer engineering, and provides UW–Madison faculty, staff, and students with opportunities to brainstorm and to apply their skills and resources to understanding and mitigating the effects of the pandemic.

Discussions have included data mining, data visualization, and building data repositories to evaluate the spread of COVID-19 in Wisconsin. They’ve also looked at new methods to disinfect vehicles, design personal protective equipment (PPE), and identify existing drugs that could be used to treat COVID-19.

One member of the Slack group, Lennon Rodgers, director of the Grainger Engineering Design Innovation Lab, is working with Madison-area manufacturers, a design consulting firm, and campus colleagues to help meet the urgent demand for producing medical face shields — key PPE for health care workers treating COVID-19 patients.

Yandell says he now sees his role as “the traffic cop,” making connections and providing space for others to do their work. For instance, he connected Song Gao, assistant professor of geography, to a researcher at the University of Chicago, and together they are examining aggregated cell phone data to understand people’s movement across the United States. Yandell learned of the Chicago research team, led by health geographer Marynia Kolak, through another one of his connections: his brother-in-law, Carmi Neiger, a geography professor at Elmhurst College.

“I’d rather be here in calm times,” Yandell says. “I think that is true for all of us. But this is what I — what we — need to be doing now, together, to stop this pandemic and minimize its impacts.”

This article was posted in Basic Science, Features, Health and Wellness, Summer 2020.