Three different units for COVID-19 tests

There are three ways that states and other jurisdictions can report counts of COVID-19 tests. What are the differences, and why is it important to understand these distinctions?

Sep 06, 2020

Welcome back to the COVID-19 Data Dispatch, where the intricacies of data definitions are a topic worth 1,000 words.

Last week, this little newsletter reached 100 subscribers. Thank you all for your support and trust in looking to me for explanations of COVID-19 data news.

In honor of that milestone, this week, I’m covering a topic near and dear to my heart: testing units. Plus: an update on the state of school data, reporting delays in Florida, and a recent interview I did for the Association of Health Care Journalists.

Please continue to share my work; I appreciate anything you do to help get the word out.

Let’s talk about testing units

Colorado is one of six jurisdictions currently reporting its testing in “test encounters,” a new metric that has appeared in recent weeks. Screenshot of Colorado’s dashboard taken on September 5.

A few weeks ago, one of my coworkers at Stacker asked me: how many people in the U.S. have been tested for COVID-19?

This should be a simple question. We should have a national dataset, run by a national public health department, which tracks testing in a standardized manner and makes regular reports to the public. The Department of Health and Human Services (HHS) does run a national testing dataset, but this dataset only includes diagnostic, polymerase chain reaction (PCR) test results, is not deduplicated—a concept I’ll go into more later—and is not widely publicized or cited.

Meanwhile, 50 state public health departments report their local testing results in 50 different ways. Different departments have different practices for collecting and cleaning their test results, and beyond that, they report these results using different units, meaning the standard quantities in which values are expressed.

You might remember how, in a high school science class, you’d get a point off your quiz for putting “feet” instead of “meters” next to an answer. Trying to keep track of units for COVID-19 data in the U.S. is like that, except every student in the class of 50 is putting down a slightly different unit, no teacher is grading the answers, and there’s a mob of angry observers right outside the classroom shouting about conspiracy theories.

Naturally, the COVID Tracking Project is keeping track anyway. In this issue, I’ll cite the Project’s work to explain the three major units that states are using to report their test results, including the benefits and drawbacks of each.

Much of this information is drawn from a COVID Tracking Project blog post by Data Quality Lead Kara Schechtman, published on August 13. I highly recommend reading the full post and checking out this testing info page if you want more technical details on testing units.

(Disclaimer: Although I volunteer for the COVID Tracking Project and have contributed to data quality work, this newsletter reflects only my own reporting and explanations based on public Project blog posts and documentation. I am not communicating on behalf of the Project in any way.)

Specimens versus people

Last spring, when the COVID Tracking Project’s data quality work started, state testing units fell into two main categories: specimens and people.

When a state reports its tests in specimens, its count describes the number of vials of human material, taken from a nose swab or saliva test, which are sent off to a lab and tested for the novel coronavirus. Counts in this unit reflect pure testing capacity: knowing the number of specimens tested can tell researchers and public health officials how many testing supplies and personnel are available. “Specimens tested” counts may thus be more precise on a day-to-day basis, which makes them, in my view, more useful for calculating a jurisdiction’s test positivity rate: the “positive tests divided by total tests” value which has become a crucial factor in determining where interstate travelers can go and which schools can reopen.

But “specimens tested” counts are difficult to translate into numbers of people. A person who got tested five times would be included in their state’s “specimens tested” count each time—and may even be included six, seven, or more times, as multiple specimens may be collected from the same person during one round of testing. For example, the nurse at CityMD might swab both sides of your nose. Including these double specimens as unique counts may artificially inflate a state’s testing numbers.

When a state reports its tests in people, on the other hand, its count describes the number of unique human beings who have been tested in that state. This type of count is useful for measuring demographic metrics, such as what share of the state’s population has been tested. In most cases, when states report population breakdowns of their testing counts, they do so in units of people; this is true for at least four of the six states which report testing by race and ethnicity, for example.

Reporting tests in units of people requires public health departments to perform a process called deduplication: taking duplicate results out of the dataset. If a teacher in Wisconsin (one of the “people tested” states) got tested once back in April, once in June, and once this past week, the official compiling test results would remove the second and third testing instances, and the state’s dataset would count that teacher only once.

The problem with such a reporting method is that, as tests become more widely available and many states ramp up their surveillance testing to prepare for school reopening, we want to know how many people are being tested now. As recent COVID Tracking Project weekly updates have noted, testing seems to be plateauing across the country. But in the states which report “people tested” rather than “specimens tested,” it is difficult to say whether fewer tests are actually taking place or the same people are getting tested multiple times, leading them to not be counted in recent weeks’ testing numbers.
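To make that ambiguity concrete, here is a minimal sketch in Python. It assumes a made-up list of person IDs tested on each day (the IDs and the `daily_counts` helper are hypothetical, not any state's actual pipeline) and compares how many people were tested each day with how many would newly enter a cumulative “people tested” count.

```python
def daily_counts(tests_by_day):
    """For each day, count the people tested and the people counted for the first time.

    tests_by_day is a list of lists of person IDs (hypothetical data).
    """
    seen = set()  # everyone already included in the cumulative "people tested" count
    rows = []
    for day, person_ids in enumerate(tests_by_day, start=1):
        tested_today = set(person_ids)
        first_timers = tested_today - seen  # only these people raise the cumulative count
        seen |= tested_today
        rows.append({
            "day": day,
            "tested": len(tested_today),
            "newly_counted": len(first_timers),
        })
    return rows


# Hypothetical example: three people are tested every day, mostly the same ones.
tests_by_day = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"]]

for row in daily_counts(tests_by_day):
    print(row)
```

In this toy example the amount of testing stays flat at three people per day, while the number of newly counted people drops from three to one to zero. A dashboard showing only the second number would look like testing had collapsed.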

Test encounters

So, COVID-19 testing counts need to reflect the numbers of people tested, to provide an accurate picture of who has access to testing and avoid double-counting when two specimens are taken from one person. But these counts also need to reflect test capacity over time, by allowing for accurate test positivity calculations to be made on a daily or weekly basis.

To solve this problem, the COVID Tracking Project is suggesting that states use a new unit: test encounters. The Project defines this unit as the number of people tested per day. As Kara Schechtman’s blog post explains, though this term may be new, it’s actually rather intuitive:

Although the phrase “testing encounters” is unfamiliar, its definition just describes the way we talk about how many times people have been “tested for COVID-19” in everyday life. If an individual had been tested once a week for a month, she would likely say she had been tested four times, even if she had been swabbed seven times (counted as seven tests if we count in specimens), and even though she is just one person (counted as one test if we count in unique people). In this case, that commonsense understanding is also best for the data.

To arrive at a “testing encounters” count, state public health departments would need to deduplicate multiple specimens from the same person, but only if those multiple specimens were taken on the same day. “Testing encounters” counts over time would accurately reflect a state’s testing capacity, without any artificial inflation of numbers. And, as a bonus, such counts would align with public understanding of what it’s like to get tested for COVID-19—making them easier for journalists like myself to explain to our readers.
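The three units can be compared side by side with a short Python sketch. It uses the example from the blog post quoted above: one person tested once a week for a month and swabbed seven times in total. The record layout, the dates, and the `count_units` helper are my own assumptions for illustration, not the Project's or any state's actual method.

```python
from datetime import date

# Hypothetical specimen-level records: (person_id, collection_date).
# One person, four weekly visits, seven swabs in total.
records = [
    ("person_a", date(2020, 8, 3)),
    ("person_a", date(2020, 8, 3)),
    ("person_a", date(2020, 8, 10)),
    ("person_a", date(2020, 8, 10)),
    ("person_a", date(2020, 8, 17)),
    ("person_a", date(2020, 8, 17)),
    ("person_a", date(2020, 8, 24)),
]


def count_units(records):
    """Count the same records in specimens, unique people, and test encounters."""
    specimens = len(records)                        # every vial counts
    people = len({person for person, _ in records})  # each person counts once, ever
    encounters = len(set(records))                  # each person counts once per day
    return {"specimens": specimens, "people": people, "encounters": encounters}


print(count_units(records))
```

Under these assumptions the same month of testing comes out as seven specimens, one person, and four test encounters, which matches the numbers in the quoted passage.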

What is your state doing?

The COVID Tracking Project currently reports total test encounters for five states—Colorado, Rhode Island, Virginia, New York, and Washington—along with the District of Columbia. Other states may report similar metrics, but have not yet been verified to match the Project’s definition.

You can find up-to-date information about which units are reported for each state on a new website page conveniently titled, “How We Report Total Tests.” The page notes that the Project prioritizes testing capacity in choosing which state counts to foreground in its public dataset:

Where we must choose a unit for total tests reporting, we are prioritizing units of test encounters and specimens above people—a change which we believe will provide the most useful measure of each jurisdiction’s testing capacity.

Also, if you’ve visited the COVID Tracking Project’s website recently, you might have noticed that the state data pages have seen a bit of a redesign, in order to make it clear exactly which units each state is using. Each state’s data presentation now includes all three units, with easy-to-click definition popups for each one.

I recommend checking out your state’s page to see which units your public health department is using for COVID-19 tests, as well as any notes on major reporting changes (outlined below the state’s data boxes). You can read more about the site redesign here.

When my coworker asked me how many people in the U.S. have been tested for COVID-19, I wasn’t able to give him a precise answer. The lack of standards around testing units and deduplication methods, as well as the federal government’s failure to be a leader in this work, have made it difficult to comprehensively report on testing in America. But if people—and I mean readers like you, not just data nerds like me—make testing units part of their regular COVID-19 conversations, we can help raise awareness on this issue. We can push our local public health departments to standardize with each other, or at least get better about telling us exactly what they’re doing to give us the numbers they put up on dashboards every day.

School data update

Since last week’s issue, four more forms of official state reporting on COVID-19 in schools have come to my attention:

New Hampshire is publishing school-associated case data, including active cases, recovered cases, and outbreak status (not clearly defined) on a page of the state’s dashboard, updated daily.
Mississippi is publishing a weekly report on cases, quarantines, and outbreaks among students, teachers, and staff, aggregated by county. So far, the state has released reports on the weeks ending August 21 and August 28.
Hawaii’s state Department of Education is publishing a page on COVID-19 in the school district, updated weekly. (Did you know that the entire state of Hawaii comprises a single school district?)
New York is launching a public dashboard on COVID-19 in schools; this dashboard will be available starting on September 9. So far, the page states that, “New York school districts will be required to provide the Department of Health with daily data on the number of people who have tested positive for COVID-19 beginning Tuesday, September 8th.” Last week, Mayor Bill de Blasio announced that classes in New York City would be delayed by two weeks to allow for more extensive safety precautions.

In addition, the nonprofit civic data initiative USAFacts has compiled a dataset of reopening plans in America’s 225 largest public school districts. The dataset classifies reopening plans as online, hybrid, in-person, or other, with information as of August 17.

Meanwhile, on the higher education front:

Education reporter (and friend of this newsletter!) Benjy Renton has launched a dashboard keeping track of COVID-19 outbreaks on college and university campuses. The dashboard organizes outbreaks according to their alert level, based on new cases in the past week.
I am continuing to monitor the COVID-19 metrics reported by college and university dashboards in my comparison spreadsheet. I haven’t had the chance to expand this analysis much in the past week, but it continues to be an ongoing project.

Florida is no longer sending tests to Quest Diagnostics

This past Tuesday, the Florida Department of Health (DOH) announced that the department would stop working with Quest Diagnostics. Quest is one of the biggest COVID-19 test providers in the nation, with test centers and labs set up in many states. The company claimed in a statement to the Tampa Bay Times that it has “provided more COVID-19 testing on behalf of the citizens of Florida than any other laboratory.”

So, why is Florida’s DOH cutting ties? Quest Diagnostics failed to report the results of 75,000 tests to the state department in a timely manner. Most of these results were at least two weeks old, and some were as old as April. As all the old results were logged at once on Monday night, Florida’s test and case counts both shot up: nearly 4,000 of those tests were positive.

Such a reporting delay skews analysis of Florida’s testing capacity over time, especially as many of the backlogged tests were reportedly conducted during the peak of the state’s outbreak in June and July. This delay also likely means that, while the people tested with this batch of tests still received their results in a timely manner (according to Quest), contact tracers and other public health workers were unable to track or trace the nearly 4,000 Floridians who were diagnosed. Such an error may have led to many more cases.
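Here is a rough Python sketch of why a backlog dump skews a time series. The five records below are invented for illustration (they are not Quest's or Florida's data), and the `tally` helper is hypothetical; the point is only the difference between counting tests by the date they were reported and the date they were collected.

```python
from collections import Counter
from datetime import date

# Hypothetical test results: (collection_date, report_date).
results = [
    (date(2020, 6, 15), date(2020, 8, 31)),  # backlogged
    (date(2020, 7, 1), date(2020, 8, 31)),   # backlogged
    (date(2020, 7, 20), date(2020, 8, 31)),  # backlogged
    (date(2020, 8, 30), date(2020, 8, 31)),  # reported promptly
    (date(2020, 8, 31), date(2020, 9, 1)),   # reported promptly
]


def tally(results, by="report"):
    """Count results per day, keyed by report date or by collection date."""
    index = 1 if by == "report" else 0
    return Counter(result[index] for result in results)


print("By report date:    ", dict(tally(results, by="report")))
print("By collection date:", dict(tally(results, by="collection")))
```

Counted by report date, four of the five results land on a single day and look like a one-day surge. Counted by collection date, the same results spread back across June, July, and August, which is where they belong in any analysis of testing capacity over time.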

According to Florida Governor Ron DeSantis, such an error is tantamount to violating state law:

“To drop this much unusable and stale data is irresponsible,” DeSantis said in a statement Tuesday. “I believe that Quest has abdicated their ability to perform a testing function in Florida that the people can be confident in. As such I am directing all executive agencies to sever their COVID-19 testing relationships with Quest effective immediately.”

But is cutting all ties with Quest the correct response? Florida’s testing capacity already is below recommended levels. According to the Harvard Global Health Institute, the state has conducted 124 tests per 100,000 people over the past week (August 30 to September 5), with a positivity rate of 13.2%. This per capita rate is far below the state’s suggested mitigation target of 662 tests per 100,000 people, and this test positivity rate is far above the recommended World Health Organization rate of 5%.
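For a sense of scale, the arithmetic behind those comparisons can be sketched in a few lines of Python. The four rates are the ones cited above; the `per_100k` helper and the raw counts passed to it are hypothetical and only illustrate how a per capita rate is formed.

```python
def per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Figures cited in the text above.
reported_tests_per_100k = 124.0
target_tests_per_100k = 662.0
reported_positivity_pct = 13.2
who_threshold_pct = 5.0

testing_gap = target_tests_per_100k / reported_tests_per_100k
positivity_ratio = reported_positivity_pct / who_threshold_pct

print(f"Testing would need to grow about {testing_gap:.1f}x to reach the target.")
print(f"Positivity is about {positivity_ratio:.1f}x the recommended threshold.")

# Hypothetical raw numbers, just to show the per capita conversion:
print(per_100k(1_240, 1_000_000))
```

By this rough math, the state would need to test more than five times as many people per capita to hit its target, while its positivity rate sits at more than two and a half times the recommended level.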

Florida will be able to send many of its tests to state-supported, public testing sites, the Tampa Bay Times reports. Still, this switch will take time and cause additional logistical hurdles at a time when Florida should not be putting the brakes on testing.

My insights on COVID-19 data reporting

I recently had the honor of speaking to Bara Vaida, from the Association of Health Care Journalists (AHCJ), about my work at Stacker, the COVID Tracking Project, and this newsletter. The full interview is up on AHCJ’s site, but I wanted to highlight my answer to Bara’s question, “What would you say are the common mistakes that you see in how COVID-19 data is reported?”:

I think it is not contextualizing data appropriately. You have to explain what the data mean. For example, you can say a state’s positivity rate fell from one week to the next, but it is important to explain the numerator and the denominator ― the number of tests that were completed and how many of those tests were positive. And you have to explain that positivity rate in the context of what is happening in the state. Is the state actually doing more testing, or did it have to shut down testing centers because of a hurricane, causing both the number of tests and the number of positives to go down — this happened in Florida a few weeks ago. And also, don’t forget there are real people behind these numbers. It’s always important to remember that.
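The numerator-and-denominator point can be shown with a small Python sketch. The weekly figures are invented for illustration (they are not Florida's actual numbers), and `positivity_rate` is a hypothetical helper.

```python
def positivity_rate(positive, total):
    """Share of completed tests that came back positive."""
    if total <= 0:
        raise ValueError("total tests must be positive")
    return positive / total


# Hypothetical weeks: testing sites close in week 2, so both counts fall.
weeks = {
    "week 1": {"total": 10_000, "positive": 1_000},
    "week 2": {"total": 4_000, "positive": 320},
}

for name, counts in weeks.items():
    rate = positivity_rate(counts["positive"], counts["total"])
    print(f"{name}: {counts['positive']} positive / {counts['total']} tests = {rate:.1%}")
```

In this made-up case the rate falls from 10% to 8%, but only because far fewer tests were completed. Reporting the rate without the two counts behind it would hide that.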

I also spoke to education reporter Alexander Russo for his recent column in Phi Delta Kappan. The article provides advice geared towards journalists covering COVID outbreaks in schools, but it’s also a useful primer for teachers, parents, and anyone else closely following school data.

Featured data sources

Colors of COVID: This Canadian project relies on public surveys to collect data on how COVID-19 is impacting marginalized communities in the country. The project plans to release quarterly reports with these survey results.
America’s Health Rankings’ Senior Report: America’s Health Rankings has conducted annual reviews of health metrics in every U.S. state since 1990. The organization’s most recent Senior Report is over 100 pages of data on older Americans, including rankings of the healthiest states for seniors.
The Long-Term Care COVID Tracker: I wrote in my August 16 newsletter that the COVID Tracking Project had released a snapshot of a new dataset compiling cases and deaths in nursing homes, assisted living facilities, and other long-term care facilities. As of this week, the full dataset is out, including historical data going back to May 21 and extensive notes on how each state is reporting this crucial information. Read more about it here.
NIOSH-Approved N-95 Respirators: This CDC list includes thousands of surgical N-95 respirators approved by the National Institute for Occupational Safety and Health (NIOSH); it’s intended as a source for healthcare workers. As a recent article in Salon points out, the list includes over 600 valved models, despite recent guidance instructing the public to avoid valved masks.

COVID source callout

It is not uncommon, as we increasingly realize that COVID-19 is not going away any time soon, for state public health departments to give their websites makeovers. Hastily compiled pages and PDF reports have given way to complex dashboards, complete with interactive charts and color-coding.

These revamps can be helpful for users who would rather click through a menu than scroll through a report. But from a data collection perspective, it’s often challenging to go from a document or single page (where I could easily hit Ctrl+F to find a value) to a dashboard which requires clicking and searching through numerous popups.

The most recent state to go through such a revision is South Carolina. In late August, the state released a new dashboard, called the County-Level Dashboard, and reorganized much of its information on COVID-19 demographics and other metrics.

In fact, when I first looked at South Carolina’s revised pages, I could not find any demographic data at all. This information used to be reported on a page marked “Demographic Data by Case”; now, that page goes to a dashboard on cases in South Carolina’s long-term care facilities. It wasn’t until I read through the public health department’s new Navigation Manual that I realized demographic data are now integrated on the county dashboard. If I click, for example, “Go to cases,” I’m brought to a page reporting case rates by county, age, race, ethnicity, and gender.

Demographic data ahoy! Via the South Carolina County Dashboard, September 6.

To South Carolina’s credit, these new pages report demographic data in whole numbers, a more precise format than the percentages of total cases and deaths released by many other states (and by SC itself before this reorganization). I also appreciate the addition of a Navigation Manual; such detailed instructions can help make a dashboard more accessible.

But I would advise any designers of state dashboard revamps to consider how to label figures more clearly from the get-go, so that journalists and state residents alike aren’t confused.
