Tag Archives: data science

why we’re numb to mass death

I’ve always found close-up pictures of Hiroshima victims to be some of the most affecting images I’ve ever seen, and yet knowing that 100,000* people were vaporized in a fraction of a second has less emotional effect. We also get numb to hearing about steady numbers of deaths that add up to a lot over time, like car accidents. I’m not a monster – this is a bug in human psychology. This article in Axios gives other examples of the phenomenon, from the Holocaust to the Rwanda genocide to the U.S. coronavirus meltdown. The article links to an academic paper by Paul Slovic at the University of Oregon, who studies this “psychic numbing” effect.

A defining element of catastrophes is the magnitude of their harmful consequences. To help society prevent or mitigate damage from catastrophes, immense effort and technological sophistication are often employed to assess and communicate the size and scope of potential or actual losses. This effort assumes that people can understand the resulting numbers and act on them appropriately. However, recent behavioral research casts doubt on this fundamental assumption. Many people do not understand large numbers. Indeed, large numbers have been found to lack meaning and to be underweighted in decisions unless they convey affect (feeling). As a result, there is a paradox that rational models of decision making fail to represent. On the one hand, we respond strongly to aid a single individual in need. On the other hand, we often fail to prevent mass tragedies – such as genocide – or take appropriate measures to reduce potential losses from natural disasters. We believe this occurs, in part, because as numbers get larger and larger, we become insensitive; numbers fail to trigger the emotion or feeling necessary to motivate action. We shall address this problem of insensitivity to mass tragedy by identifying certain circumstances in which it compromises the rationality of our actions and by pointing briefly to strategies that might lessen or overcome this problem.

The More Who Die, the Less We Care: Psychic Numbing and Genocide

I’ve often thought about a class that would teach the history of a war or tragedy by the numbers, by focusing on the number of deaths, who the people were, where they occurred and when they occurred. I think that would be educational (if depressing). But to put it in perspective you might need some visuals. One idea would be a stadium with people vanishing from seats. (This would work for, say, up to 100,000 deaths.) For even larger numbers, maybe you could start with a point in the center of the town where the class is being held or where students live, and then expand the dot outward as though all the people who live inside it were to vanish. You could even make this an app based on census data, and let the user pick the center of the bubble. Then finally, you probably should tie some of the deaths to individual stories, or interviews with survivors, friends and family. For me personally though, the numbers are important to put the emotional stories in context, and I am wary of news stories that don’t have numbers. Morbid stuff!
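The expanding-bubble idea can be sketched in a few lines of Python. This is only a rough sketch under my own assumptions: the function name `vanish_radius_km` and the input format (a list of latitude, longitude, population tuples, such as census tract centroids) are hypothetical, and a real app would need actual census data and a map to draw on.

```python
import math

def vanish_radius_km(center, places, deaths):
    """Smallest radius (km) around `center` whose enclosed population
    reaches `deaths`, or None if the data run out first.

    `places` is a hypothetical list of (lat, lon, population) tuples,
    for example census tract centroids.
    """
    def haversine_km(a, b):
        # Great-circle distance between two (lat, lon) points in degrees.
        lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * math.asin(math.sqrt(h))

    # Visit areas nearest-first, accumulating population as the bubble grows.
    by_distance = sorted((haversine_km(center, (lat, lon)), pop)
                         for lat, lon, pop in places)
    total = 0
    for distance, pop in by_distance:
        total += pop
        if total >= deaths:
            return distance
    return None
```

The user picks the center, the app looks up the death toll for the event, and the returned radius is the size of the bubble to draw.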

* Okay, I admit the “100,000 people in a fraction of a second” is just a number I picked somewhat for shock value. According to Wikipedia, 70,000-80,000 people were either vaporized instantly or burned to death shortly after the blast. Then a bunch more died of radiation poisoning of course. Does this make it any better? No, when it’s my turn please just vaporize me.

my official election prediction

There’s plenty of election coverage out there, so who needs this post? Well, I’ve been looking for one source of information on when the swing state polls close, what the vote counting situation is, and what the current poll/forecast situation is. I don’t see all of that in one place so here, just for myself, is some info.

I’m a little partial to FiveThirtyEight, just because I’ve been following them for a few elections now. There are other polling and modeling sources out there. I got poll closing times from 270 to win.

Florida

  • Poll closing: 7:00 p.m. ET (for most of the state including all the sizable cities, except that little bit of the panhandle including Pensacola at 8:00 p.m. ET)
  • The counting situation, according to 538: Despite their bad reputation from that election year that shall not be named around the turn of the century, they expect to have most or all results within two hours of closing. They count absentee and mail-in votes in advance, so they just need to combine them with live results and it should result in a more or less complete count. Unless things are really really close, like, you know, that one year…
  • 538 poll average on Friday 10/30 around 4:30 p.m.: Biden +2.2%
  • 538 odds on Friday 10/30 around 4:30 p.m.: Biden 66/34

Georgia

  • Poll closing: 7:00 p.m. ET
  • The counting situation: quick. They’ve counted mail-in ballots in advance. They expect overseas ballots to trickle in, but things would have to be really close for those to matter.

Ohio

  • Poll closing: 7:30 p.m. ET
  • The counting situation: They count absentee ballots in advance, then in-person votes, but they will still count absentee ballots received up to November 13. So if it is close enough that outstanding mail-in ballots could make a difference, news organizations won’t call it on election night.

North Carolina

  • Poll closing: 7:30 p.m. ET
  • The counting situation: About 80% should be counted right away, and more over the next few hours. But then they will still count ballots arriving by November 12, so same story: news organizations won’t call it if it is close.

Texas

  • Poll closing: 8:00 p.m. ET (locations in the Central Time Zone, which is almost all of Texas), 9:00 p.m. ET (locations in the Mountain Time Zone, which is basically El Paso)
  • The counting situation: Almost everything early on election night. They will still count ballots received by 5 p.m. the day after election day.

Pennsylvania

  • Poll closing: 8:00 p.m. ET
  • The counting situation: Oh, my beloved home state. Pennsyltucky as some call it, but that is completely unfair to the great state of Kentucky which plans to count 90% of ballots on election night. Under state law, we will not start counting mail-in ballots until polls open on election day. The process is supposed to conclude around Friday. Enormous numbers of people have voted by mail, including yours truly. Republicans will tend to vote in person, Democrats by mail. The state is about equally split (basically the Philadelphia metro region and downtown Pittsburgh vs. pretty much everyone else). So it could look like things are trending Republican on election night, but there will be enormous numbers of outstanding ballots expected to skew Democratic. Pennsylvania will also count ballots received up to three days after election day, as allowed in not one but two Supreme Court cases over the past few weeks. Bottom line, it seems unlikely this one will be called on election night.

Michigan

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “a few days”. They will start counting mail-in ballots one day early, but are not expecting to finish until around Friday.

Arizona

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “most” on election night

Iowa

  • Poll closing: 10:00 p.m. ET
  • The counting situation: “most” on election night, and they are counting mail-in ballots early

Wisconsin

  • Poll closing: 9:00 p.m. ET
  • The counting situation: “all results by Wednesday morning”

Nevada

  • Poll closing: 10:00 p.m. ET
  • The counting situation: expecting to get most votes from the Vegas area on election night, but counting all votes could take until November 10

Okay, so how might election night unfold? First, I went to 270 to Win’s interactive map. You can pre-populate it with a variety of forecasts from a variety of sources, which is cool. I stuck with 538. Then I turned all the states above into “tossups”. I gave Trump one bonus electoral vote from Maine’s second district, which I don’t know anything about or what to do with.

This starting point is: Biden 227, Trump 126 (remember, you need 270 to win)

Let’s do a scenario where things go unexpectedly well for Trump.

  • Florida closes and is counted quickly. Biden 227, Trump 155
  • Counting also goes well in Georgia. Biden 227, Trump 171
  • Let’s say things go well for Trump in Ohio (where he is a slight favorite), and news organizations are willing to call it: Biden 227, Trump 189
  • North Carolina is counted quickly and goes to Trump: Biden 227, Trump 204
  • Texas goes to Trump quickly and decisively: Biden 227, Trump 242
  • Pennsylvania: no call on election night
  • Michigan: no call on election night
  • Arizona goes to Biden: Biden 238, Trump 242
  • Iowa is counted quickly and goes to Trump: Biden 238, Trump 248
  • Wisconsin is not really close. Even with some outstanding ballots, let’s say news organizations call it for Biden on election night. Biden 248, Trump 248
  • Nevada is not really close, but let’s say there is no call on election night.

We are tied. We go to bed, and every politician in America from President on down starts running their mouth on Wednesday. Lawsuits ensue. But those votes from Pennsylvania, Michigan, and Nevada trickle in during the week, and Biden has substantial leads in all three. It would take an extraordinary amount of luck just for Trump to get close to 50/50 odds.

Here’s a more likely scenario, so let’s consider this my prediction:

  • Florida is called for Biden around 8 p.m. The call is made by the same news organizations that called Florida for Al Gore precisely 20 years ago, but they are much more conservative (in the statistical sense, meaning looking for a higher degree of certainty) these days. Biden 256, Trump 126
  • Georgia goes narrowly for Trump. Biden 256, Trump 142
  • Ohio goes to Trump. Biden 256, Trump 160
  • North Carolina is called for Biden around 9 p.m. Biden 271, Trump 160. IT’S OVER!!!
  • I’m going to stop doing math now. Texas and Iowa go narrowly to Trump, but Arizona, Wisconsin, and Nevada pile on for Biden late Tuesday night or sometime on Wednesday, and the rout is on.
  • It doesn’t matter if Pennsylvania and Michigan take a long time to count their votes, but eventually they do, and the rout becomes a landslide. I’ll call 300+ a landslide, although it certainly falls short of the near-sweep (525/538) Ronald Reagan pulled off in 1984. Like the guy or not, that was a clear victory.
  • My final prediction: Biden 334, Trump 204
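For anyone who wants to check the running totals in the two scenarios above, here is a small Python sketch. The starting point (Biden 227, Trump 126) is from the post. The per-state electoral vote counts are the 2020 allocations, which the post implies through its running totals but does not list. The `tally` function name is just something I made up for illustration.

```python
# 2020 electoral votes for the states treated as tossups above.
TOSSUPS = {
    "Florida": 29, "Georgia": 16, "Ohio": 18, "North Carolina": 15,
    "Texas": 38, "Pennsylvania": 20, "Michigan": 16, "Arizona": 11,
    "Iowa": 6, "Wisconsin": 10, "Nevada": 6,
}
START = {"Biden": 227, "Trump": 126}  # starting point from the post

def tally(calls):
    """Add each called tossup state's electoral votes to its winner."""
    totals = dict(START)
    for state, winner in calls.items():
        totals[winner] += TOSSUPS[state]
    return totals

# Scenario 1: election-night calls only (PA, MI, NV still outstanding).
good_night_for_trump = tally({
    "Florida": "Trump", "Georgia": "Trump", "Ohio": "Trump",
    "North Carolina": "Trump", "Texas": "Trump", "Iowa": "Trump",
    "Arizona": "Biden", "Wisconsin": "Biden",
})

# Scenario 2: the prediction, with every tossup eventually called.
prediction = tally({
    "Florida": "Biden", "North Carolina": "Biden", "Arizona": "Biden",
    "Wisconsin": "Biden", "Nevada": "Biden", "Pennsylvania": "Biden",
    "Michigan": "Biden", "Georgia": "Trump", "Ohio": "Trump",
    "Texas": "Trump", "Iowa": "Trump",
})

print(good_night_for_trump)  # {'Biden': 248, 'Trump': 248}
print(prediction)            # {'Biden': 334, 'Trump': 204}
```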

coronavirus trackers and simulations revisited

Update: December 13, 2020 (and from time to time since then, I update links if I notice they are broken)

This post is getting a surprising amount of attention. I don’t normally update posts, but I am updating this one since it is getting attention and the commentary in the original post is significantly outdated. Rest assured, if you are a historian in the far future studying what I was thinking back in June 2020, I have kept the original post at the bottom. I am keeping all the links, just grouping them somewhat and removing (from this section) the outdated commentary. (Thank you, WordPress, for making a simple copy-and-paste operation like this beyond excruciating.)

Data Trackers

  • Johns Hopkins – map, stats, access to data sets
  • New York Times – a national (U.S.) map by county and plots by state (now, with a paywall! as of 7/30/21. Which I will never pay because WEAPONS OF MASS DESTRUCTION!)
  • Financial Times – similar to others, but they look at excess deaths a little differently and have some interesting graphics
  • BBC – similar to NYT, but international
  • CDC – changed this link to their “COVID-19 by County” page on 2/26/22; the updated recommendation is to mask indoors if new cases in your county are 200 or more per 100,000 population per week, AND if the number of people entering the hospital and/or in the hospital is above certain thresholds. It’s a little hard to find the data and figure out yourself, so if you trust the CDC (and who wouldn’t?) you can just type in your county and they will tell you if it is high/medium/low.
  • https://coronavirus.thebaselab.com/ – a variety of maps and plots
  • City Observatory – intermittent data-based articles and maps
  • Our World in Data – excellent interactive country-level data, maps, and plots. A tip – you can also type in “world” or the name of a continent in the country box.
  • https://aatishb.com/covidtrends/ – a very clever animated time series of growth in cases over time, by country
  • Reuters – just more numbers and maps, similar to NYT
  • Covid Act Now – state-level data and communication in a simple, easy to understand index format
  • Harvard Global Health Institute COVID Risk Levels Dashboard – similar to Covid Act Now, but less simple and less easy to understand. Seems to have more ability to drill down into county-level data, although when you do that much of it is blank.
  • Wastewater surveillance from “Biobot Analytics” – added 4/30/22.

Simulations

  • University of Washington IHME – the best place I have found for understandable future projections. At the state level.
  • FiveThirtyEight – compares different models (no longer updating as of 7/30/21)
  • https://covid19risk.biosci.gatech.edu/ – This site calculates the probability that someone in a group of a given size is infected, based on the estimated rate of active cases in a U.S. state.
  • MicroCOVID – a risk calculator based on local data and allowing you to adjust your risk tolerance and try out various scenarios (added 8/8/21), such as “one night stand with a random person” (on the latter, please remember there are other diseases besides just Covid-19, for example antibiotic-resistant syphilis…)
  • Covid-19 Forecast hub – another visualization of various models and ensembles of models
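The group-risk idea behind the Georgia Tech site in the list above comes down to one line of arithmetic. Here is a minimal sketch, assuming each person in the group is independently infected with the same probability (the active-case rate). That is my simplifying assumption, not necessarily exactly what the site does, and the function name is my own.

```python
def group_risk(prevalence, group_size):
    """Probability that at least one person in the group is infected,
    assuming each person is independently infected with probability
    `prevalence` (a fraction of the population, e.g. 0.01 for 1%)."""
    return 1.0 - (1.0 - prevalence) ** group_size

# Hypothetical example: 1% of people actively infected, gathering of 50.
print(round(group_risk(0.01, 50), 3))  # 0.395
```

The point is that even a low rate of active cases adds up quickly as the group gets bigger.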

Vaccine Trackers

Local Pennsylvania/Philadelphia Interest

  • The state of Pennsylvania has a useful dashboard which they have now made public (or it was public before and I didn’t notice.) It compares cases, positive tests, and hospital data for the current and last 7-day period, at the county level.
  • Speaking of Philadelphia, a shout out to the Philadelphia Health Department which provides some open downloadable data.

Miscellaneous Stuff

Original Post (June 27, 2020)

I decided to list out and summarize the variety of trackers and simulations I’ve mentioned in previous posts. Like many people (in the U.S. Northeast at least), I was glued to coronavirus info on various screens from roughly mid-March to mid-May, then my attention started to gradually drift to other things as the situation got better. Now, it seems that it has either stabilized at a not-quite-out-of-the-woods level, or is slowly reversing itself as we see other parts of the country start to be affected more seriously (sorry if you are reading this and are being affected, we in the Northeast take no pleasure in your suffering, I promise, although we suggest you turn out any bigoted anti-science politicians in your area who are letting this happen.) Anyway, I find that I am interested in starting to look at trackers and simulations again on a daily basis. These are in the order I discovered them.

  • Johns Hopkins – a neat map early on, although now the entire world has become a blob. Still a good place to stare at data.
  • New York Times – a national (U.S.) map by county and plots by state. Seems to load even though I have used all my free articles for the month.
  • BBC – they update continuously but I’m not sure if this link will be to the latest
  • CDC – this is what I would have predicted would be the go-to source of information and expertise if you asked me before all this started…but it’s mediocre at best. Yes, that just about sums it up.
  • https://coronavirus.thebaselab.com/ – a variety of maps and plots to stare at, not my first stop but a little different if I am tired of others
  • University of Washington IHME – still the most informative state-level simulations I have found, accounting for hospital capacity among other things
  • City Observatory – they did an awesome analysis by U.S. metro area, which I have not seen anyone else do (human beings interact with each other socially and economically in cities and their suburbs, which often cut across states, and states often contain metro areas that are not connected much socially or economically. Economists, social scientists and urban planners know this of course, but nobody else studying the epidemic seems to have figured this out. Seriously, other data visualization and simulation sites, you can do this, it’s just a matter of grouping data by counties.) Unfortunately, they quit updating it and have not automated it. I still check every now and then to see if they have picked it up.
  • Our World in Data – pretty much every conceivable way of looking at data by country. I like to look at confirmed deaths per million across countries. By this measure, the starkest contrast is east vs. west. The eastern countries were hit first, hard, and without warning, and their death rates are very, very low. They have a variety of government types, responses, ethnicities and cultures. I just don’t think anybody has come close to explaining it. The U.S. is in the middle of the pack of western countries, which somewhat contradicts conventional wisdom and suggests news organizations are making the obvious error of not normalizing by population.
  • https://aatishb.com/covidtrends/ – an animated time series of new confirmed cases in the past week vs. total confirmed cases, both on a log scale, by country. As I write this, shows the beginning of a concerning uptick for the United States, and Brazil out of control.
  • Reuters – I actually never wrote about this one, but it has a map and some numbers.
  • FiveThirtyEight – they have an aggregation of various simulation models out there. New York and New Jersey look like a stream sprayed horizontally out of a garden hose, while Texas and Florida (today) look more like a fire hose.
  • https://covid19risk.biosci.gatech.edu/ – This site calculates the probability that someone in a group of a given size is infected, based on the estimated rate of active cases in a U.S. state. I assume it’s estimated active cases, anyway, or it wouldn’t make sense. It would be better by metro area (seriously guys, someone just get this done), but still a nice idea. I’m in Philadelphia, but I figure the New Jersey numbers are probably the most applicable.
  • Covid Act Now – provides a composite risk index at the state level, and county when county level data is available in the right format (which is not that often)
  • Harvard Global Health Institute COVID Risk Levels Dashboard – keeps it simple with just data on new cases, but gives you a variety of nice mapping, charting, and tabular formats to slice and dice the data at country, (U.S.) state or county level.
  • The state of Pennsylvania has a useful dashboard which they have now made public (or it was public before and I didn’t notice.) It compares cases, positive tests, and hospital data for the current and last 7-day period, at the county level.
  • Speaking of Philadelphia, a shout out to the Philadelphia Health Department which provides some open downloadable data.
  • I look at the FAO food price index on occasion. It’s falling lately. Sometimes I look at oil and gold prices, and how many Special Drawing Rights can be bought with one U.S. dollar. Oh and, the Rapture Index is at an all time high!

April 2020 in Review

Most frightening and/or depressing story:

  • The coronavirus thing just continued to grind on and on, and I say that with all due respect to anyone reading this who has suffered serious health or financial consequences, or even lost someone they care about. After saying I was done posting coronavirus tracking and simulation tools, I continued to post them throughout the month – for example here, here, here, here, and here. After reflecting on all this, what I find most frightening and depressing is that if the U.S. government wasn’t ready for this crisis, and isn’t able to competently manage this crisis, it is not ready for the next crisis or series of crises, which could be worse. It could be any number of things, including another plague, but what I find myself fixating on is a serious food crisis. I find myself thinking back to past crises – We got through two world wars, then managed to avoid getting into a nuclear war to end all wars, then worked hard to secure the loose nuclear weapons floating around. We got past acid rain and closed the ozone hole (at least for a while). Then I find myself thinking back to Hurricane Katrina – a major regional crisis we knew was coming for decades, and it turned out no government at any level was prepared or able to competently manage the crisis. The unthinkable became thinkable. Then the titans of American finance broke the global financial system. Now we have a much bigger crisis in terms of geography and number of people affected all over the world. The crises may keep escalating, and our competence has clearly suffered a decline. Are we going to learn anything?

Most hopeful story:

  • Well, my posts were 100% doom and gloom this month, possibly for the first time ever! Just to find something positive to be thankful for, it’s been kind of nice being home and watching my garden grow this spring.

Most interesting story, that was not particularly frightening or hopeful, or perhaps was a mixture of both:

  • There’s a comet that might be bright enough to see with the naked eye from North America this month.

one more covid tracker

I thought I was over covid trackers, but I just can’t help it. I know this isn’t my first “one more”, and it might not be my last. This one plots new cases over the past week on the vertical axis vs. total confirmed cases on the horizontal, then animates over time. You can add any country or U.S. state. The simulation starts whenever 10 cases were reported in that location, and you can see them grow at first exponentially and then deviate from the line when they start to get it under control. You can pick a log or arithmetic axis – log is good for the math, but it kind of lets you forget that there is a difference between 10 people dying and 10,000 people dying. Anyway, it’s nice and thanks to this person for posting it for free.
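Here is a rough Python sketch of how the points on a plot like this could be computed from a daily series of cumulative confirmed cases. It is my own reconstruction of the idea, not the tracker’s actual code. The function name, the 7-day window and the 10-case starting threshold as parameters, and the made-up doubling series are all assumptions for illustration.

```python
def trajectory(cumulative, window=7, start_at=10):
    """Return (total cases, new cases in the past `window` days) pairs,
    starting from the first day the total reaches `start_at`."""
    points = []
    for day, total in enumerate(cumulative):
        if total < start_at:
            continue
        # Early on there is no full window yet, so count from zero.
        earlier = cumulative[day - window] if day >= window else 0
        points.append((total, total - earlier))
    return points

# Made-up series that doubles every day (pure exponential growth).
doubling = [2 ** day for day in range(15)]
for total, new in trajectory(doubling)[-3:]:
    print(total, new, round(new / total, 3))
```

With pure exponential growth, new cases stay a constant fraction of the total, which is why the points fall on a straight line on log-log axes and peel away from it once growth slows.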

more coronavirus tracking

This massive data analysis entry from Our World in Data is a pretty good example of how to take a data set and beat the crap out of it from every angle.

I like what they did. Since it’s by country, it allows interesting comparisons across countries but is not meant to provide local or regional-specific information. Countries are pretty big. My favorite trackers that are most relevant to my situation are still the City Observatory analyses of U.S. metro areas and the University of Washington simulations of available hospital capacity. The latter are by state.

March 2020 in Review

To state the obvious, March 2020 was all about the coronavirus. At the beginning of the month, we here in the U.S. watched with horror as it spread through Europe. We were hearing about a few cases in Seattle and California, and stories about people flying back from Italy and entering the greater New York area and other U.S. cities without medical screening. It was horrible, but still something happening mostly to other people far away on TV. In the middle of the month, schools and offices started to close. By the end of the month, it was a full blown crisis overwhelming hospitals in New York and New Jersey and starting to ramp up in other U.S. cities. It’s a little hard to follow my usual format this month but I’ll try.

Most frightening and/or depressing story:

  • Hmm…could it be…THE CORONAVIRUS??? The way the CDC dropped the ball on testing and tracking, after preparing for this for years, might be the single most maddening thing of all. There are big mistakes, there are enormously unfathomable mistakes, and then there are mistakes that kill hundreds of thousands of people (at least) and cost tens of trillions of dollars. I got over-excited about coronavirus dashboards and simulations towards the beginning of the month, and kind of tired of looking at them by the end of the month.

Most hopeful story:

  • Some diabetics are hacking their own insulin pumps. Okay, I don’t know if this is a good thing. But if medical device companies are not meeting their patient/customers’ needs, and some of those customers are savvy enough to write software that meets their needs, maybe the medical device companies could learn something.

Most interesting story, that was not particularly frightening or hopeful, or perhaps was a mixture of both:

  • I studied up a little on the emergency powers available to local, state, and the U.S. federal government in a health crisis. Local jurisdictions are generally subordinate to the state, and that is more or less the way it has played out in Pennsylvania. For the most part, the state governor made the policy decisions and Philadelphia added a few details and implemented them. The article I read said that states could choose to put their personnel under CDC direction, but that hasn’t happened. In fact, the CDC seems somewhat absent in all this other than as a provider of public service announcements. The federal government officials we see on TV are from the National Institute of Allergy and Infectious Diseases, which most people have never heard of, and to a certain extent the surgeon general. I suppose my expectations on this were created mostly by Hollywood, and if this were a movie the CDC would be swooping in with white suits and saving us, or possibly incinerating the few to save the many. If this were a movie, the coronavirus would also be mutating into a fog that would seep into my living room and turn me inside out, so at least there’s that.
https://www.youtube.com/watch?v=4chSOb3bY6Y

yet another coronavirus tracker

Here’s another tracker someone has put together, allowing comparison of countries based on days since their first case of the virus. For the US, it has state and county-level features although it appears data is not available for all of these. Metro-area data would be even more awesome, but now I’m asking too much in an app someone has put together and posted to the world for free!

another coronavirus tracking tool

I like the Johns Hopkins tool, but either it doesn’t let you break down the data by both geography and time, or it is not obvious how you would do that. At first glance, this tool from weather.us appears to do that, and produces the data in a table that you could play with yourself.

Why does this matter? It might be nice to get a sense of when you think your city or region is starting to turn the corner from an exponential growth curve to an S-shaped curve that will eventually level off. The news media might or might not provide that information in the form you would like to see it on a given day.
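If you do get the data in a table, one crude way to look for that corner yourself is to compare each week’s new cases with the previous week’s. Here is a minimal sketch, assuming you have a daily series of cumulative cases. The function name and the made-up logistic (S-shaped) series are mine, purely for illustration. A ratio above 1 means new cases are still growing, and a ratio that drops below 1 suggests the curve has passed its inflection point.

```python
import math

def weekly_ratios(cumulative):
    """Ratio of each week's new cases to the previous week's new cases."""
    weekly_new = [cumulative[i] - cumulative[i - 7]
                  for i in range(7, len(cumulative), 7)]
    return [later / earlier
            for earlier, later in zip(weekly_new, weekly_new[1:])
            if earlier > 0]

# Made-up S-shaped (logistic) series: 10,000 eventual cases, midpoint day 35.
cumulative = [10_000 / (1 + math.exp(-0.2 * (day - 35))) for day in range(71)]
ratios = weekly_ratios(cumulative)
print([round(r, 2) for r in ratios])
```

Real data are noisy (weekend reporting lags, changes in testing), so a single week below 1 is a hint rather than proof.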