How to lie with statistics

Dissecting a deviously misleading public health message

Last week the Salt Lake County Department of Public Health posted a lighthearted graphic on Twitter, purporting to show Covid vaccination rates by Zodiac sign in the county. The chart, intended to be “a fun conversation starter to promote vaccination” according to the Salt Lake Tribune, went viral and garnered national press coverage.

But even accepting the data on their own terms — astrology is fake, after all — these numbers are almost certainly inaccurate enough to shift the entire effort from the “fun and quirky” side of the ledger into “unnecessary and misleading.”

The first thing that stands out in the chart is the massive variation in purported vaccination rates, ranging from 46 percent of the county’s Scorpios to 70 percent of the county’s Leos. That’s a huge spread, much bigger than you’d expect to arise from just random noise assuming vaccination rates are more or less uniform across the birth date distribution. What’s going on there?
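To put a rough number on "random noise," here's a back-of-the-envelope sketch. It assumes every sign shares the same true vaccination rate and that each person's status is independent. The 58 percent rate and the 80,000 people per sign are hypothetical round numbers I picked for illustration, not county figures.

```python
import math


def binomial_std_error(rate: float, group_size: int) -> float:
    """Standard error of an observed share when each of `group_size` people
    is independently vaccinated with probability `rate`."""
    return math.sqrt(rate * (1.0 - rate) / group_size)


def noise_band_points(rate: float, group_size: int) -> float:
    """Width, in percentage points, of the central 95% range of observed rates."""
    return 2 * 1.96 * binomial_std_error(rate, group_size) * 100


ASSUMED_TRUE_RATE = 0.58          # hypothetical, not from the county
ASSUMED_PEOPLE_PER_SIGN = 80_000  # hypothetical, not from the county

band = noise_band_points(ASSUMED_TRUE_RATE, ASSUMED_PEOPLE_PER_SIGN)
observed_spread = 70 - 46  # Leo minus Scorpio, in points, from the chart

print(f"Noise band: {band:.2f} points; chart spread: {observed_spread} points")
```

Under those assumptions the noise band comes out to well under one percentage point. Even with only 1,000 people per sign it would be about 6 points, still nowhere near the 24 on the chart.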

Like a lot of people, when I first encountered the chart I assumed the underlying data were correct. It comes directly from the county health department, which presumably knows both the birth date and vaccination status of people who’ve received their shots. If the numbers are correct and astrology is bunk, then there must be some hidden variable driving everything, right?

For instance, researchers have found that the share of kids born to teen moms tends to be a bit higher in winter months. Kids with teen moms tend to be worse off in mid-life, so maybe they’re less likely to be vaccinated?

But that can’t be it: the difference in teen mom birth share between summer and winter months boils down to one or two percentage points, not nearly large enough to plausibly drive a 24-point spread in vaccination rates.

Things didn’t start to click until I read the footnotes of the health department’s chart (data pro-tip: always, ALWAYS read the footnotes!). There you’ll find the following: “‘US Population by Zodiac Sign’ from University of Texas-Austin, applied to Salt Lake County 2020 population estimates.” This was a puzzle: why would they need zodiac sign data for the U.S. when they’ve got their own county data right in front of them?

Then it hit me: they know the birth date of everyone in the county who received a shot, but they don’t know the birthdays of all the people who haven’t been vaccinated. If a person hasn’t received their shot, they don’t have a record in the county vaccination database. You need to know, or at least estimate, the zodiac signs of the unvaccinated in order to calculate the percent vaccinated.

The officials who put the chart together apparently did so under the assumption that Salt Lake County’s Zodiac distribution is the same as the national Zodiac distribution — if Scorpios make up 8 percent of the national population, the officials assumed they make up 8 percent of the county population too. There’s nothing inherently wrong about this, but it can introduce a boatload of error if your local population differs from the national distribution in a meaningful way. And when it comes to childbearing, Utah is dramatically different from the rest of the United States.
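Here's a minimal sketch of the calculation the county apparently ran, and of how it goes wrong when the local share differs from the national one. The function names and every number except the 8 percent national share are hypothetical and chosen for illustration.

```python
def estimated_rate(vaccinated_in_sign: int, national_share: float,
                   county_population: int) -> float:
    """Rate as the chart apparently computes it: known vaccinated count
    divided by an ESTIMATED sign population (national share x county total)."""
    estimated_sign_population = national_share * county_population
    return vaccinated_in_sign / estimated_sign_population


def actual_rate(vaccinated_in_sign: int, local_share: float,
                county_population: int) -> float:
    """Rate you would get if the county's real share for the sign were known."""
    return vaccinated_in_sign / (local_share * county_population)


# Hypothetical inputs for one sign
COUNTY_POPULATION = 1_000_000  # made-up round number
VACCINATED_IN_SIGN = 44_000    # made-up count from the vaccination database
NATIONAL_SHARE = 0.08          # the 8 percent example from the text
LOCAL_SHARE = 0.09             # made-up: the sign is more common locally

reported = estimated_rate(VACCINATED_IN_SIGN, NATIONAL_SHARE, COUNTY_POPULATION)
real = actual_rate(VACCINATED_IN_SIGN, LOCAL_SHARE, COUNTY_POPULATION)

print(f"Reported: {reported:.1%}, actual: {real:.1%}")
```

The numerator is solid because it comes from real records. All of the error lives in the denominator, and a one-point miss on an 8 percent share is enough to move the reported rate by about six points in this example.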

Utah has long had one of the highest fertility rates in the nation, owing in large part to the predominance of Mormonism there. The state’s moms are younger than the national average. It stands to reason that the distribution of birth dates in the state differs from the national distribution as well.

If that’s the case, then the county health department’s estimates are going to be off by quite a bit. But the problems don’t end there. I tried searching for the University of Texas data that the county used for the national Zodiac distribution. I couldn’t find it anywhere — it doesn’t appear to exist on the internet. I did, however, find an incredibly sketchy page hosted by Texas A&M, which includes a table of the most and least common Zodiac signs. I can’t be sure this was the county’s actual source — they didn’t respond to an inquiry — but it’s the source cited in the Salt Lake Tribune’s story on the data, which includes quotes from the county health officials.

The Texas A&M page says its source for the data is the “Statistic Brain Research Institute,” which appears to be some sort of data-driven content mill. Their data appear to be garbage: the numbers the Texas A&M page reports are quite a bit different from what you’d get from an actual distribution of birth dates. 538 actually pulled daily national birth date data some years back, bless them, using Social Security records from 2000 through 2014. Those figures suggest, for instance, that Scorpios make up about 8.2 percent of births since 2000, compared to 9.4 percent reported in the Texas A&M data, and that Leos account for 8.8 percent of the population rather than 7.1. And so on: all the numbers are similarly off.
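To see how much damage those share errors alone could do, here's a rough sketch using the four percentages above. It assumes, purely for illustration, that the county's real sign distribution matches the 538 birth figures and that every sign has the same true vaccination rate. The 55 percent rate is a hypothetical value I picked, and neither assumption is something I can verify.

```python
# Shares quoted in the text: the Texas A&M page vs. the 538 birth data
SHARES = {
    "Scorpio": {"cited": 0.094, "births": 0.082},
    "Leo": {"cited": 0.071, "births": 0.088},
}

ASSUMED_TRUE_RATE = 0.55  # hypothetical: same true rate for every sign


def distortion_factor(cited_share: float, actual_share: float) -> float:
    """Reported rate divided by true rate when the denominator is built
    from `cited_share` but the population really follows `actual_share`."""
    return actual_share / cited_share


reported = {
    sign: ASSUMED_TRUE_RATE * distortion_factor(s["cited"], s["births"])
    for sign, s in SHARES.items()
}
spread_points = (reported["Leo"] - reported["Scorpio"]) * 100

for sign, rate in reported.items():
    print(f"{sign}: reported {rate:.0%} vs. true {ASSUMED_TRUE_RATE:.0%}")
print(f"Spread from denominator error alone: {spread_points:.0f} points")
```

Under those assumptions, identical true rates would show up as about 48 percent for Scorpios and 68 percent for Leos, a spread of roughly 20 points produced by bad denominators alone.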

As best I can tell, what’s happening here is a compounding series of errors: using crappy, unreliable national data to create a crappy, unreliable estimate for a very different local population. Magnify those errors all the way up the chain and I can start to see how you’d get such wide variation in vaccination rate by Zodiac sign. Again, I can’t be 100% certain of this because it’s ultimately unclear where the Salt Lake Health Dept. got their national data from. The whole point of citing your sources is to let other people check your work! If that isn’t possible you’re doing citations wrong.

There’s one more unfortunate layer to all this, and that’s the eagerness with which national and local press covered these figures. The Salt Lake Tribune, for instance, does mention there could be a mismatch between national and local birth date distributions. But most of the article proceeds as if the findings are trustworthy, with statements like “Salt Lake County residents born under the exuberant, high-achieving, let-no-opportunity-pass sign of Leo have been vaccinated for the coronavirus at higher rates than those of any other zodiac sign.”

USA Today opens its story by briefly noting that Zodiac signs “probably” don’t actually influence vaccination rates, and then spends the rest of the article interviewing an astrologer about what the findings mean. (“Fire signs want to lead the charge,” she tells the credulous reporter. The article is filed under “news.”)

The Guardian gets points for giving one of their data journalists, the talented Alvin Chang, plenty of space to debunk the supposed relationship. But then it undermines him by going with a click-baity question mark headline: “Leos are most likely to get vaccinated, say Utah officials. Is it written in the stars?” No, it’s not.

The press, and newspapers in particular, are due for a reckoning for their role in legitimizing and promoting astrology, a superstition that’s basically a gateway to all manner of pseudoscience and misinformation. Horoscopes have been a staple of print dailies for decades, giving the impression that astrology is as vital a piece of civic knowledge as box scores, weather reports and Wall Street returns.

Some folks argue that astrology is just harmless entertainment and that we should let people enjoy things. I might have bought this two decades ago, before a misinformation crisis was tearing the country apart. But today I find it impossible to defend. A 2018 poll by the National Science Foundation found that nearly 40 percent of Americans believe astrology is “very” or “sort of” scientific, while Pew has found that close to one third “believe” in it.

These false beliefs do real-world harm, as astronomer Phil Plait persuasively writes. The astrology industry drains billions of dollars from the economy annually. It promotes sloppy thinking: “The more we teach people to simply accept anecdotal stories, hearsay, cherry-picked data (picking out what supports your claims but ignoring what doesn’t), and, frankly, out-and-out lies, the harder it gets for people to think clearly,” Plait writes.

While newspapers rightly denounce misinformation and conspiracy theories, they continue to peddle horoscopes and astrological nonsense alongside the good work of their reporters. They’re promoting pseudoscience and superstition. Plait’s right: it’s sloppy. And that sloppiness is on full display in much of the coverage of the Salt Lake County health department’s stunt.