Category Archives: Online Tools / Apps / Data Sources

vehicle speed and pedestrian injuries/deaths

Here is the hard data on a person’s probability of survival when hit by a car traveling at a range of speeds. You should go to the link and look at the graphs, but here are a few highlights I picked out:

  • For the average person hit by the average vehicle, you need to get speed down to the 30-35 mph range to have a 75% survival probability, and the 20-25 mph range if you want a 90% survival probability. 15 mph would get you up to about 95%.
  • Not all people are average. A 70-year-old struck at 30 mph has something like a 60% chance of living, while a 30-year-old has more like an 85% chance (I’m eyeballing a tiny graph, so these numbers are not exact).
  • Not all vehicles are equal. Getting struck by a pickup truck or SUV is more likely to be deadly than getting struck by a car. Again just eyeballing, if you’re hit by a light truck rather than a car at 30 mph, the average person’s odds of survival would drop from something like 80% to 75%.
  • Those numbers are for death. Obviously, the risk of severe injury short of death is higher. Again using the 30 mph example, the risk of severe injury for the average person hit by the average vehicle looks to be around 50%.

I think our first instinct is to look for someone to blame – and it’s obviously true that better driver behavior, pedestrian behavior, or both could prevent accidents. But police enforcement is obviously part of the answer. It upsets me when I hear the Philadelphia Police openly say they don’t enforce traffic laws because they have “real crimes” to attend to. Sure, their job is to keep the population safe from violence on our city’s streets – well, this is violence on our city’s streets! And it disproportionately puts children and the elderly at risk compared to other forms of crime.

Finally, better design of streets, intersections, and signals is a big part of the answer. Nearly perfect designs exist in places like Denmark and the Netherlands, but well-trained and well-intentioned U.S. engineers are either ignorant of them or cynically assume they can’t or won’t work here, or that they are not affordable.

I assume these same police and engineers would not go out on the streets and shoot old people and children in the head, because that would be unethical, so why is knowingly allowing the preventable deaths of old people and children through ignorance and negligence any different? And why does the public largely accept this and assume it can’t change?

how freight moves

Here are some statistics on how freight moves in the U.S. Trucking is even more dominant over rail than I had assumed. Even pipelines move more than twice the weight of rail. Air is vanishingly small in terms of weight, but is used to move higher-value items. It’s not too surprising that the monetary value of everything shipped is projected to grow along with the economy, but it is a little surprising to me that the weight of everything shipped is projected to grow by 40% over the next 30 years. It argues against the idea that we are “dematerializing”, or achieving economic growth without physical growth. Sure, people like Alan Greenspan can make an argument that the weight per dollar is not increasing, but what does that mean exactly when a dollar is a fairly arbitrary human measure of value? Ultimately the tonnage of everything we move, from raw materials and fossil fuels to manufactured goods to waste, is one proxy for ecological footprint, and it doesn’t look like we are going to turn the corner soon. The only way that would change is if we had a closed loop, “circular economy” where the waste becomes raw materials again. Then we could theoretically keep shipping it around the loop faster and faster without increasing our footprint. That is, given enough clean, cheap energy.

Ted Cruz

Thinking of Ted Cruz as an alternative to Donald Trump? Looking at Ted Cruz on ontheissues.org, here’s my assessment. He’s a traditional “god, gays, and guns” Christian fundamentalist. The government should have the right to tell us what to believe in (his particular brand of Christian fundamentalism, of course) and what it is okay for us to do in our own bedrooms and families. He would continue the failed “tough on crime” policies that have put so much of our poor and minority population behind bars at enormous taxpayer expense. He would “stand up” to nuclear-armed foreign governments like Russia, China and Iran through aggressive military means. On the other hand, in most matters not involving personal religious beliefs, sexual practices or armed violence against the already-born, he’s a “starve the beast” zealot who is ideologically opposed to the very idea of government. He would try to end government involvement in retirement, health care, education, environmental protection, financial stability and the ability to counteract recessions through fiscal and monetary policy.

Personally, I consider it completely non-partisan to look at the risks involved and just say no. This irrational, inconsistent set of ideas is not based on any sort of factual analysis or attempt to understand how the world works. It is likely to destabilize the economy and/or get us into wars. It’s just dangerous. Thinking people of any political stripe should just say no and back candidates who are interested in real solutions to real problems.

dots moving around on a map

This is just dots moving around on a map, but I find these dots very engaging in helping me understand urban planning concepts and results of a simulation.

I found this on R bloggers, which talks about how the simulation and map were created.

Data Scientist Todd Schneider has followed-up on his tour-de-force analysis of Taxi Rides in NYC with a similar analysis of the Citi Bike data. Check out the wonderful animation of bike rides on September 16 below. While the Citi Bike data doesn’t include actual trajectories (just the pick-up and drop-off locations), Todd has “interpolated” these points using Google Maps biking directions. Though these may not match actual routes (and gives extra weight to roads with bike lanes), it’s nonetheless an elegant visualization of bike commuter patterns in the city.

swing the election

Here’s an interesting interactive tool on FiveThirtyEight.com where you can play around with U.S. voter turnout and preferences among various demographic groups.

I ran a few scenarios:

  • The default scenario is that each demographic group (educated white, uneducated white, black, hispanic/latino, and Asian) votes for the same party in the same proportions as 2012, and turns out at the same rate, but the absolute size of each group is adjusted for changes between 2012 and 2016.
    • electoral votes 332-206 in favor of DEMOCRATS
  • Let’s go back to the default, and all the Asian people stay home.
    • 332-206 in favor of DEMOCRATS (just not enough people, and maybe already concentrated in Democratic states)
  • Back to the default, and all the hispanic/latino people stay home.
    • 283-255 in favor of DEMOCRATS (perhaps hispanics/latinos are also concentrated in already Democratic states?)
  • Back to the default, and black turnout falls from 66% to 29%
    • 286-252 in favor of REPUBLICANS (perhaps this flips some key midwest swing states like Pennsylvania, Ohio, Michigan, Wisconsin, etc.)
  • Back to the default, and uneducated whites swing strongly to the right, from 62% last time to 69% Republican (maybe a terrorist attack? a major incident with China or Russia? I don’t want to say false flag, this is not one of those conspiracy websites…)
    • 282-256 in favor of REPUBLICANS (probably those swing states again)
  • Stay with the previous scenario, but educated whites swing ever so slightly to the left, from 56% Republican last time to 54% Republican (what would cause this? I don’t know, some crazy right-wing candidate spouting racist nonsense maybe, I’m not naming names…)
    • 275-263 in favor of DEMOCRATS
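To make the mechanics of these scenarios concrete, here is a minimal sketch of how a tool like this might turn group-level turnout and vote shares into an electoral vote count. This is my own toy version, not FiveThirtyEight's code: the two states, their electoral votes, the group sizes, and most of the turnout and vote-share numbers are made-up placeholders (a few, like 66% and 29% black turnout, are borrowed from the scenarios above), and names like `Group`, `STATES`, and `tally` are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Group:
    eligible: int      # eligible voters in this group (made-up number)
    turnout: float     # fraction of eligible voters who actually vote
    dem_share: float   # fraction of the two-party vote going Democratic


# Hypothetical states: (electoral votes, {group name: Group}).
STATES = {
    "State A": (10, {
        "white_noncollege": Group(500_000, 0.57, 0.38),
        "white_college": Group(300_000, 0.77, 0.44),
        "black": Group(200_000, 0.66, 0.93),
    }),
    "State B": (5, {
        "white_noncollege": Group(600_000, 0.57, 0.38),
        "white_college": Group(350_000, 0.77, 0.44),
        "black": Group(50_000, 0.66, 0.93),
    }),
}


def state_winner(groups):
    """Sum votes across groups; winner takes all (an exact tie goes to R here)."""
    dem = sum(g.eligible * g.turnout * g.dem_share for g in groups.values())
    rep = sum(g.eligible * g.turnout * (1 - g.dem_share) for g in groups.values())
    return "D" if dem > rep else "R"


def tally(states, adjust=None):
    """Return electoral votes per party, optionally adjusting every group first."""
    totals = {"D": 0, "R": 0}
    for electoral_votes, groups in states.values():
        if adjust is not None:
            groups = {name: adjust(name, g) for name, g in groups.items()}
        totals[state_winner(groups)] += electoral_votes
    return totals


def low_black_turnout(name, group):
    """Scenario: black turnout falls to 29%, everything else unchanged."""
    return replace(group, turnout=0.29) if name == "black" else group


baseline = tally(STATES)
scenario = tally(STATES, low_black_turnout)
print("baseline:", baseline)
print("low black turnout:", scenario)
```

In this toy setup the hypothetical State A is close enough that the turnout change alone flips it, which is the same kind of swing-state effect the real tool shows on a much larger scale.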

So the bottom line is that the minority groups tend to vote Democrat. The uneducated whites tend to vote Republican. The educated whites are the swing voters who end up being the deciding factor. So it is hard to see how a Republican candidate who appeals strongly to uneducated whites but alienates educated whites could ever stand much of a chance.

on the issues

Ontheissues.org is a little bit junky but it has a lot of information on where the candidates stand, well, on the issues. It then graphs them on an interesting spectrum based on where they stand on government intervention in the social and economic spheres.

Social Questions:  Liberals and libertarians agree in choosing the less-government answers, while conservatives and populists agree in choosing the more-restrictive answers.

Economic Questions:  Conservatives and libertarians agree in choosing the less-government answers, while liberals and populists agree in choosing the more-restrictive answers.

Nate Silver’s Iowa Caucus Predictions

Political season is data science season! Here is some more on Nate Silver’s forecasting methods. If you are reading this in real time (Sunday January 31), by tomorrow night we will find out what actually happens. I will reproduce some graphics here – these are all from the FiveThirtyEight site, so please thank me for the free advertising and don’t send me to copyright jail.

For Clinton vs. Sanders, here is Nate’s average of polls as of today. He gives more recent polls greater weighting, and also adjusts somehow for bias shown in the same polls in the past.

Average of polls: Clinton 48.0% vs. Sanders 42.7%

Now, this is within the 4-6% “margin of error” reported by most polls. (The margin of error is easier to find on the RealClearPolitics site, although curiously it lists margins of error for Democratic polls but not Republican ones. RealClearPolitics does a straight-up poll average without all the corrections, which today comes out to Clinton 47.3% vs. Sanders 44%. So all the corrections don’t make an enormous difference.) I can’t easily and quickly find information on whether the “margin of error” is a standard error or a confidence interval or what, but generally when the polls are within the margin of error the media tends to report it as a “statistical tie” or dead heat. And that is exactly what they are saying in this case.
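For what it's worth, the usual textbook convention is that a poll's margin of error is the half-width of a 95% confidence interval for a single candidate's share. I am assuming the polls here follow that convention, since I have not confirmed it for any of them. Here is a small sketch under that assumption. The sample size of 400 is a placeholder I picked, not a figure from any actual poll, and the function names are my own.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of a ~95% confidence interval for one candidate's share p
    in a simple random sample of n respondents (z=1.96 for 95%)."""
    return z * math.sqrt(p * (1 - p) / n)


def lead_margin_of_error(p1, p2, n, z=1.96):
    """Half-width of a ~95% interval for the lead p1 - p2 when both shares
    come from the same poll, so the two shares are negatively correlated."""
    diff = p1 - p2
    return z * math.sqrt((p1 + p2 - diff ** 2) / n)


n = 400                  # hypothetical sample size
clinton, sanders = 0.48, 0.427

single = margin_of_error(clinton, n)
lead = lead_margin_of_error(clinton, sanders, n)
print(f"margin on one candidate's share: {single:.3f}")
print(f"margin on the lead:              {lead:.3f}")
```

With these placeholder inputs the margin on a single share comes out a little under 5 points, and the margin on the lead itself is nearly twice that, which is one reason a 5-point lead in a single poll gets called a statistical tie.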

Nate Silver does a set of simulations – it sounds very complicated, but in essence I assume he takes his adjusted poll average for each candidate, some measure of spread like standard error, then runs a whole bunch of simulations. Which leads to results like this:

Clinton-Sanders Simulation

http://projects.fivethirtyeight.com/election-2016/primary-forecast/iowa-democratic/

Based on this, Nate Silver gives Clinton an 80% chance of winning Iowa and Sanders only a 20% chance.

So what’s interesting is that you have the average of polls (48-43 or 47-44 depending on source), which everyone says is a statistical tie. You have Silver’s predicted result (50-43) based on a large number of simulations, and then you have the resulting odds considering both the predicted result and the spread in the predictions (80-20). In other words, the computer is generating random numbers and 80% of simulations end up favoring Clinton. Of course in real life the dice get rolled only once, but these odds seem pretty good for Clinton.
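Here is a minimal sketch of the kind of simulation I am imagining. It is a guess at the general idea, not Silver's actual model: I assume each candidate's result is an independent draw from a normal distribution centered on the predicted share, and I picked the standard deviation of 6 points myself because it roughly reproduces the 80-20 odds. The function name and parameters are hypothetical.

```python
import random


def win_probability(mean_a, mean_b, sd, n_sims=20_000, seed=0):
    """Fraction of simulated elections in which candidate A beats candidate B.

    Assumes independent normal errors with the same standard deviation for
    both candidates, which is a simplification (real errors are correlated).
    """
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    wins = 0
    for _ in range(n_sims):
        a = rng.gauss(mean_a, sd)
        b = rng.gauss(mean_b, sd)
        if a > b:
            wins += 1
    return wins / n_sims


# Predicted result of 50-43 with an assumed spread of 6 points per candidate.
p_wide_lead = win_probability(50, 43, sd=6)
# Same assumed spread, but a 1-point predicted lead.
p_narrow_lead = win_probability(26, 25, sd=6)

print(f"7-point predicted lead: {p_wide_lead:.2f}")
print(f"1-point predicted lead: {p_narrow_lead:.2f}")
```

Under these assumptions a 7-point predicted lead lands near 80% of simulations, while a 1-point lead with the same spread is only slightly better than a coin flip. The win probability depends on both the size of the lead and the spread of the simulated results.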

Meanwhile, the Trump-Cruz contest is similarly close in the polls (30-25 in favor of Trump), but the predicted result (26-25 in favor of Trump) and odds (48-41 in favor of Trump) are much closer. From a quick glance, this appears to be because the spreads are much wider. I don’t know why that would be the case – presence of more viable candidates on the Republican side? Or maybe there is just more variability in the polls and nobody actually knows why.

Republican Iowa Caucus simulation

http://projects.fivethirtyeight.com/election-2016/primary-forecast/iowa-republican/

where are the refugees from?

Here’s a pretty awesome data analysis on where (legal) refugees who enter the U.S. come from, and where they go. It’s great both for the information, and for the presentation of the information, which is simple yet highly effective. Click on the link, but here are a few facts to whet your appetite:

  • The country of origin for the most refugees to the U.S. in 2014 was Iraq, at 19,651.
  • Surprisingly (to me at least), next is Burma at 14,577.
  • Rounding out the top five are Somalia (9,011), Bhutan (8,316), and D.R. Congo (4,502).
  • After Cuba (4,063), the next highest country from Central or South America is Colombia at 243.

I might have guessed Iraq, but I don’t think I would have guessed anything else on this list. In a number of cases, there are groups of essentially stateless people living in various places (Bhutan and Burma, for example) that the U.S. has agreed to resettle in fairly large groups. In other cases, there are just a handful of people from a given country granted refugee status in a given year. It is a little hard to make sense of why one group is allowed and the next is not.