“Is Dr. Anthony Fauci on Cameo?”

Dr. Anthony Fauci, Dr. Robert Redfield, and Admiral Brett Giroir testified before Congress on America's national COVID-19 response. Will their testimonies help the public access better COVID-19 data?

Welcome back to the COVID-19 Data Dispatch, the only newsletter that can reference an iconic comedy podcast and a congressional hearing on the COVID-19 pandemic in one breath.

This week, I’ll discuss Friday’s congressional subcommittee hearing on “The Urgent Need for a National Plan to Contain the Coronavirus” from a COVID-19 data perspective. I’ll also further unpack issues with the Department of Health & Human Services (HHS)’s hospitalization data and alert you to three exciting datasets (one newly released!).

This is a brand-new newsletter, so I would appreciate anything you can do to help get the word out. If you were forwarded this email, you can subscribe here:

COVID-19 data in the political arena

NIAID Director Dr. Anthony Fauci testifies before the House Select Subcommittee on the Coronavirus Crisis on July 31. Screenshot retrieved from the hearing’s livestream.

In the most recent episode of comedy podcast My Brother, My Brother and Me (approx. timestamp 23:50), youngest brother Griffin McElroy solemnly asks, “Is Dr. Anthony Fauci on Cameo?”

McElroy’s question, asked in the context of a rather silly and unscientific discussion on contaminated basketballs, refers to a video-sharing service in which fans can pay celebrities to send personalized messages. Dr. Fauci is, of course, not on Cameo. But he did make a public appearance this past Friday: he testified before the House Subcommittee on the Coronavirus Crisis. This was Dr. Fauci’s first Congressional appearance in several weeks; Democrats have claimed that the White House blocked him from testifying earlier in the summer.

Dr. Fauci was joined on the witness stand by Centers for Disease Control and Prevention (CDC) Director Dr. Robert Redfield and Assistant Secretary for Health Admiral Brett Giroir, who leads policy development at the Department of Health and Human Services (HHS). All three witnesses answered questions about their respective departments, covering COVID-19-related topics from test wait times to the public health implications of Black Lives Matter protests.

For comprehensive coverage of the hearing, you can read my Tweet thread for Stacker:

But here, I will focus on five major takeaways for the COVID-19 data world.

First: the results of scientific studies on the pandemic are publicly shared. In his opening statement, Dr. Fauci cited four top priorities for the National Institute of Allergy and Infectious Diseases (NIAID): improving scientific knowledge of how the novel coronavirus works, developing tests that can diagnose the disease, characterizing and testing methods of treating patients, and developing and testing vaccines. The Congressmembers on the House subcommittee were particularly interested in this last priority; Dr. Fauci reassured several legislators that taking vaccine development at “warp speed” will not come at the cost of safety.

Rep. Jackie Walorski, a Republican from Indiana, was especially concerned about Chinese interference in vaccine development. She repeatedly asked Dr. Fauci if he believed China was “hacking” American vaccine research, and if he believed this was a threat to the progress of such work. Dr. Fauci replied that all clinical results from NIAID work are shared publicly through the usual scientific process, to invite feedback from the greater medical community.

Clinical studies in particular are listed in a National Institutes of Health (NIH) database called ClinicalTrials.gov. On this site, any user can easily search for studies relating to COVID-19; there are 2,844 listed at the time I send this newsletter. 256 of these studies are marked as “completed,” and two of those have results posted. I see no reason to doubt that, if Rep. Walorski were to visit this database in the coming months, she would find the results of vaccine trials here as well.
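To put those ClinicalTrials.gov counts in proportion, here is a quick back-of-the-envelope sketch. It uses only the three figures quoted above; the variable names are my own.

```python
# Counts quoted above from ClinicalTrials.gov at the time of sending.
listed = 2844        # COVID-19 studies listed
completed = 256      # studies marked "completed"
with_results = 2     # completed studies with results posted

completed_share = completed / listed
results_share_of_completed = with_results / completed

print(f"Completed: {completed_share:.1%} of listed studies")
print(f"Results posted: {results_share_of_completed:.1%} of completed studies")
```

In other words, roughly one in eleven listed studies is complete, and fewer than one in a hundred of those completed studies has posted results so far.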

Dr. Fauci also publicized the COVID-19 Prevention Network, a website on which Americans can volunteer for vaccine trials. According to Dr. Fauci, 250,000 individuals had registered by the time of the hearing.

Second: nursing homes are getting COVID-19 antigen tests, big time. Dr. Redfield, Admiral Giroir, and several of the House representatives at the hearing highlighted a recent initiative by HHS to distribute rapid diagnostic COVID-19 tests to nursing homes in hotspot areas. In his opening remarks, Dr. Redfield stated that, by the end of this week, federal health agencies will have delivered “nearly one million point-of-care test kits to 1,019 of the highest risk nursing homes, with 664 nursing homes scheduled for next week.”

The tests being distributed identify antigens, protein fragments on the surface of the novel coronavirus. Like polymerase chain reaction (PCR) tests, antigen tests determine if a patient is infected at the time they are tested; unlike PCR tests, they may be produced and distributed cheaply, and return results in minutes. Antigen tests have lower sensitivity, however, meaning that they may miss identifying patients who are in fact infected.

The antigen test distribution initiative is great news for the nursing homes across the country that will be able to test and treat their residents more quickly. But from a data perspective, it poses one major question: how will the results of these tests be reported? While antigen tests may be diagnostic, their results should not be lumped in with PCR test results because they have a different accuracy level and serve a different purpose in the pandemic.

The Nursing Home COVID-19 Public File, a national dataset run by the Centers for Medicare & Medicaid Services, reports “confirmed” and “suspected” COVID-19 cases in the nation’s nursing homes. The dataset does not specify what types of tests were used to identify these cases, or the total tests conducted in each home. Similarly, state-reported datasets on COVID-19 in nursing homes typically report only cases and deaths, not testing numbers. And, as of the most recent COVID Tracking Project analysis, the only state currently reporting antigen tests in an official capacity is Kentucky. But more states may be including antigen test numbers in their counts of “confirmed cases” or “molecular tests,” just as several states lumped together PCR and serology tests this past spring. As hundreds of nursing homes across the country begin to use the antigen tests so graciously distributed by the federal government, we must carefully watch to identify where those numbers show up.

Third: Admiral Giroir doesn’t know what data his agency publishes.

If you watch just five minutes from Friday’s hearing, I highly recommend the five minutes in which Rep. Nydia Velázquez (a Democrat from New York) interrogates Admiral Giroir about COVID-19 test wait times. Here’s my transcript of a key moment in the conversation:

Rep. Velázquez: Dr. Redfield, I’d like to turn to you. Does the CDC have comprehensive information about the wait times for test results in all 50 states?

Dr. Redfield: I would refer that question back to the Admiral.

Rep. Velázquez: Sir?

Admiral Giroir: Yes, we have comprehensive information on wait times in all 50 states, from the large, commercial labs.

Rep. Velázquez: And do you publish this data? These data?

Admiral Giroir: Uh… we talk about it. Always. I mean, I was on… I was with 69 journalists yesterday, and we talk about that frequently.

He went on to claim that decisionmakers at the state and city level have data on test wait times from commercial labs. But where are these data? HHS has collected testing data since the beginning of the pandemic; these data were first published on a CDC dashboard in early May and are now available on HealthData.gov.

The HealthData.gov dataset includes test results from CDC labs, commercial labs, state public health labs, and in-house hospital labs. For each test, the dataset includes geographic information, a date, and the test’s outcome. It does not include the time between the test being administered and its results being reported to the patient. In fact, that “date” can be any one of five things: (a) the date the test was completed, (b) the date the result was reported, (c) the date the specimen was collected, (d) the date the test arrived at a testing facility, or (e) the date the test was ordered. So, if there’s another, secret dataset which includes more precise dating, I personally would love to see it made public.

Also, who are those 69 journalists, Admiral Giroir? How do I join those ranks? I have some questions about HHS hospitalization data.

Fourth: everyone wants to reopen schools. Dr. Redfield said that opening schools is “in the best public health interest of K-12 students.” Dr. Fauci said that schools should reopen so that students can access health services and teachers can identify instances of child abuse, and to avoid “downstream unintended consequences for families.” Rep. Steve Scalise, the subcommittee’s Ranking Member (and a Republican from Louisiana, home to one of the country’s most annoying COVID-19 dashboards), said, “Don’t deny these children the right to seek the American dream that everybody else has deserved over the history of our country.” Rep. James Clyburn, the subcommittee’s Chair (a Democrat from South Carolina), said that school reopening must not be a “one size fits all approach,” but it should be done for the good of students and their families.

Clearly, reopening schools is a popular political opinion. But does the country have the data we need to determine if schools can reopen safely? Reopening, as Dr. Fauci explained in response to an early question from Rep. Clyburn, is most safely done when COVID-19 is no longer circulating widely in a community. School districts can determine whether the disease is circulating widely through looking at case counts over time, but for those case counts to be accurate, the region must be doing enough testing and contact tracing to catch all cases.

And testing data, while they are certainly collected at the county and zip code levels by local public health departments, are not standardized at all. HHS doesn’t publish county-level testing data. Nor does the COVID Tracking Project. This lack of standardization for any geographic region smaller than a state is troubling, as public health leaders and journalists alike cannot currently assess the scope of local outbreaks with any kind of broad comparison. To put it simply: I would love to do a story on how many school districts can safely reopen right now, based on their case counts and test metrics. But the data I would need to do this story do not exist.

Fifth: all data are political; COVID-19 data are especially political. I know, I know. Data have been political since humans started collecting them. One of America’s most comprehensive data sources, the U.S. Census, started as a way to enforce the Three-Fifths Compromise.

But watching this Friday’s hearing hammered home for me how the mountains of data produced by this pandemic, coupled with the complete lack of standards across the institutions producing them, have made it particularly easy for politicians to quote random numbers out of context in order to advance their agendas. Rep. Clyburn said, “At least 11 states… are currently performing less than 30% of the tests they need to control the virus.” (Which states? How many tests do they need to perform? Where did that benchmark come from? What other metrics should the states be following?) And, on the other side of the aisle, Rep. Scalise held up a massive stack of paper and waved it right at the camera, claiming that the high number of tests that have been conducted in this country is evidence of President Trump’s national plan. (But how many tests have we conducted per capita? What are the positivity rates? What statistics can we actually correlate to President Trump’s plan?)

In fact, after the hearing, the White House put out a press release claiming that America has “the best COVID-19 testing system in the world.” The briefing includes such claims as, “the U.S. has already conducted more than 59 million tests,” and, “the Federal Government has distributed more than 44 million swabs and 36 million tubes of media to all 50 States.” None of the statistics in the briefing are put into terms reflecting how many people have actually been tested, compared to the country’s total population. And none of the statistics are contextualized with public health information on what targets we should be meeting to control the pandemic.

The experts who might have been consulted on that brief—Dr. Fauci, Dr. Redfield, and Admiral Giroir—all sat before Congressional Representatives on Friday morning, quietly nodding when Representatives asked if their respective departments were doing everything possible to protect America. If they had answered otherwise, they may not have returned for future hearings. The whole thing felt very performative to me: the Democrats threw veiled jibes at President Trump, the Republicans bemoaned China and Black Lives Matter protests, and Dr. Fauci fact-checked such basic statements as, “Children are not immune to COVID-19.”

And almost everyone in the room—including all three witnesses—removed their mask when they spoke.

If Dr. Fauci were available to commission on the video service Cameo, I would pay him good money to send a personal message to every Congressmember on that subcommittee telling them, confidentially, exactly what he thinks of their questions. And then I would ask him for Admiral Giroir’s personal cell phone number.

No, we’re not done talking about HHS hospitalization data

The HHS is still collecting and publishing COVID-19 hospitalization data, and I, personally, feel as though I know both more and less than I did when I wrote last week’s newsletter. This week’s issue is already rather long, so here, I will focus on outlining the main questions I have right now.

Why are HHS’s COVID-19 hospitalization numbers higher than states’? While HHS’s most public-facing dataset is the HHS Protect hospital utilization dataset, last updated on July 23, the department also reports daily counts of the hospital beds occupied in every state. This dataset includes counts of all currently hospitalized patients with confirmed and suspected COVID-19. Local public health departments in all 50 states and D.C. also report the same datapoint; the COVID Tracking Project collects, standardizes, and reports these local counts daily.

According to analysis by the COVID Tracking Project, over the week of July 20 to July 26, HHS reported an average of 24% more hospitalized COVID-19 patients across the U.S. than the states did. Figures for some states show even more variation. In Florida, for example, HHS’s count nearly doubled from July 26 to July 27 (from about 11,000 patients to about 21,500 patients). The state reported about 9,000 hospitalized COVID-19 patients both days.

In Arkansas, meanwhile, the state has reported about 500 hospitalizations each day for the past week, while HHS has reported about 1,600. Overall, for 28 out of 53 states and territories, there is at least one day in the past week when HHS’s count of currently hospitalized COVID-19 patients is at least 50% higher than the state public health department’s count.
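To make the size of these gaps concrete, here is a small sketch that computes how much higher HHS's count is than the state's count. It uses the approximate figures quoted above for Florida (on July 27) and Arkansas. The 50% threshold mirrors the one used in the comparison above; the names and data layout are my own assumptions, not the COVID Tracking Project's actual method.

```python
# Approximate counts of currently hospitalized COVID-19 patients, as quoted above.
counts = {
    "Florida (July 27)": {"hhs": 21_500, "state": 9_000},
    "Arkansas (past week)": {"hhs": 1_600, "state": 500},
}

THRESHOLD_PCT = 50  # flag gaps of at least 50%, as in the comparison above


def pct_higher(hhs: float, state: float) -> float:
    """Return how much higher the HHS count is than the state count, in percent."""
    return (hhs - state) / state * 100


for place, c in counts.items():
    gap = pct_higher(c["hhs"], c["state"])
    flag = "FLAG" if gap >= THRESHOLD_PCT else "ok"
    print(f"{place}: HHS count is {gap:.0f}% higher than the state's ({flag})")
```

Both examples land well above the 50% threshold: the HHS figure is more than double the state figure for Florida and more than triple for Arkansas.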

The COVID Tracking Project suggests several potential reasons for this discrepancy. Some hospitals may report to HHS, but not to their state public health departments, either because they are federally run hospitals (such as hospitals run by the Veterans Health Administration) or because HHS’s tie to federal supplies such as remdesivir provides a greater incentive for complete reporting. State definitions for who counts as a COVID-19 patient differ from place to place, and may be narrower than the federal categorization, which includes all confirmed and suspected cases. And some hospitals might also be making data entry errors or double-counting their patient numbers as they adjust to the new reporting system. As I noted in last week’s issue, we do not know how HHS is screening for and removing data entry errors in their dataset.

How did the CDC-to-HHS switch impact local public health departments? The COVID Tracking Project’s blog post on hospitalization data also explains that several states had delays or errors in reporting current hospitalization numbers because the states previously relied on the CDC’s database for these values. Public health departments in Idaho, Missouri, South Carolina, Wyoming, Texas, and California have all documented issues with compiling hospitalization data at the state level thanks to the CDC-to-HHS system change. Similar issues may be going unreported in other states.

As I described last week, changing database systems in the middle of a pandemic can be particularly challenging for already-overburdened hospitals. It can take multiple hours a day to enter data into both HHS and state reporting systems, and that’s on top of the technological and bureaucratic hurdles that hospitals must clear. Public health departments are scrambling to help their hospitals, as hospitals are scrambling to report the correct data—to say nothing of actually taking care of their patients.

Why should I trust a database built by a tech company that got the job through suspicious means? According to an investigation by NPR, TeleTracking Technologies received its federal contract to build HHS’s data system for collecting hospital data under some unusual circumstances. For one thing, HHS claimed that TeleTracking’s contract was won through competitive bidding, but none of the 20 competitors contacted by NPR knew about this opportunity. For another, the process HHS used to award that contract is typically used for scientific research and new technology, not database building. And finally, Michael Zamagias, TeleTracking’s CEO, is a real estate investor and long-time Republican donor with ties to the Trump Organization.

Rep. Clyburn—you know, that chair of the congressional coronavirus subcommittee—has launched an investigation into TeleTracking and its CEO. Other Congressmembers are asking questions, too. I, for one, am excited to see what they find.

States are auditing their COVID-19 data

This past Tuesday, several top state auditors announced a joint initiative: they’re going to review how state COVID-19 data are collected and reported. Auditors from five states—Delaware, Florida, Mississippi, Ohio, and Pennsylvania—worked with the National State Auditors Association to put together a framework that every state can use. Thirteen other states, as well as D.C. and Puerto Rico, have already expressed interest in using the framework.

This data audit was the brainchild of Delaware State Auditor Kathleen McGuiness, who describes her motivations in a Delaware press release:

I saw variation in the reporting and monitoring of COVID-19 cases by states nationally and felt it was important to have a consistent tool for states to easily review and share information about how their state’s approach to data use informs COVID-19 mitigation efforts. It’s an issue every state is grappling with during this pandemic, and I’m proud to lead this effort toward a universal goal.

The results of this audit won’t be shared for several months, but it’s good news that the initiative is at least taking place. Where the federal government has failed to institute data standards, the states are taking matters into their own hands. This is the most interested I’ve been in auditing since I watched Parks and Recreation season three.

Featured data sources

  • Public health departments, underfunded and under threat: This week, Kaiser Health News (KHN) data reporter Hannah Recht released the dataset behind KHN and The Associated Press’s recent feature on how local public health departments in the U.S. have been left unprepared to face COVID-19. The dataset includes six files examining spending and staffing at public health departments across the country.

  • COVID-19 testing sites: The healthcare company Castlight has built a comprehensive database of COVID-19 testing sites in the U.S., down to the ZIP Code level. Castlight’s Tableau dashboard allows users to explore this database by county and compare the number of available test sites with current case counts. This dataset was cited in a recent 538 article on testing disparities.

  • The CoronaVirusFacts Alliance Database: Since the start of the pandemic, Poynter’s International Fact-Checking Network has connected fact-checkers in over 70 countries working to correct COVID-19 misinformation. The results of these fact-checkers’ work are compiled in a database, which you can search by country, fact rating, and topic.

COVID source callout

Missouri’s COVID-19 dashboard includes the same number in three places.

The number: people in Missouri with cases of COVID-19 confirmed by PCR testing. The places: a value called “Lab Confirmed Cases in Missouri as of 2 pm Today” on the “Overview” tab of MO’s dashboard, a value called “Individuals with Positive PCR Results” on the “Testing - PCR” tab of the dashboard, and the cumulative total of a chart called “Cases by Reported Date” on the “Cases - Demographics” tab of the dashboard.

None of these values are equivalent, and yet all three values are accurate, because of complexities involving where lab-confirmed positive cases are logged in MO’s internal database and how they are assigned to a date. I’ll spare you the details of how this works because they took me several hours (and several Slack thread rereads) to wrap my own head around; suffice it to say that MO’s dashboard is bad and I hate it.

And my own confusion caused the COVID Tracking Project to change which value we use for reporting PCR-confirmed cases in MO, then change it back two days later. (Michal and Quang, if you’re reading this, I’m sorry.)

More recommended reading

My recent bylines

News from the COVID Tracking Project


That’s all for today! I promise most issues won’t be this long; I just had a lot to unpack from this week’s congressional subcommittee hearing.

If you’d like to share this baby newsletter further, you can do so here:


And if you have any feedback for me, you can send me an email (betsyladyzhets@gmail.com) or comment on the post directly.