MIT Technology Review has a roundup of online Coronavirus dashboards. The most popular one apparently is from the Johns Hopkins Center for Systems Science and Engineering.
Tag Archives: data science
election 2020!
I’m going to try not to get too carried away with election 2020 posts. For one thing, a lot of people know a lot more than me about election 2020. Nate Silver for example. In 2012 I made a little spreadsheet electoral college model that helped me understand the election that year. By 2016, that sort of thing was so easy to find on the internet and so much more sophisticated than anything I could hope to come up with that it wasn’t really worth the trouble. For another thing, it can be fun to forecast the outcomes of certain events, sports for instance, and come back later to see how you did. Sports are fun because you pretend to care about them, but you know deep down that they don’t matter. Politics is not like that – it matters and I care, so it is just not that fun to be wrong.
Okay, with that rambling preamble, and before the first voting starts in the Iowa caucus (I’m writing on Sunday, January 26), I’m going to give my predictions. But before I give my predictions, let me be open and honest about what I want to happen. I want Bernie Sanders to be elected President, and I want him to serve alongside a Democratic House and a Democratic Senate. This would give the United States a chance to tackle the systemic corruption problem that is dragging our nation down, and put us on a path to future success. Elizabeth Warren would have a chance of doing this too, and I actually prefer her policy positions overall, but I think Bernie Sanders is the stronger leader and the leader we need right now.
I don’t think that is what is going to happen. Of the three (President Bernie Sanders, Democratic House, Democratic Senate), the Democratic Senate is particularly unlikely. Let’s look at PredictIt – gamblers there are giving about a 70/30 chance of the Senate remaining in Republican hands. Those are not awful odds for Democrats, but in a straight-up betting situation you would not take those odds. And keep in mind, a super-majority of 60 in the Senate is required to pass major legislation, not just a majority of 51. So even if Sanders or Warren gets in as President, and assuming the House stays Democratic as seems likely, it will be close to impossible to get major progressive legislation through on issues like campaign finance, health care, childcare or education. A Republican Senate will also block any efforts to reengage with the United Nations or ratify treaties on things such as climate change or human rights. A Democratic President will be stuck trying to fine-tune rules and regulations across the executive branch, rebuild the State Department and shape foreign policy to the extent possible through the executive branch.
Let’s start with general election polls out as of right now. People say these don’t mean anything. But I recall looking at Clinton vs. Trump in these polls, before we knew that either of them would be the nominee in 2016, and being surprised that people thought Trump would beat her. The same polls showed Bernie Sanders beating Trump. So let’s look at these wildly inaccurate, not very useful polls on RealClearPolitics as of Sunday, January 26.
- Biden vs. Trump: Biden leads by 4.3% nationally, an average of 6 polls taken between December 4 and January 23. Of these 6, 1 shows Trump leading and 1 shows a tie, while the others show Biden leading by 2-9%.
- Sanders vs. Trump: Sanders leads by 3.2% nationally, an average of 6 polls taken between December 4 and January 23. Of these 6, 1 shows Trump leading, while the others show Sanders leading by 1-8%.
- Warren vs. Trump: Warren leads by 1.4% nationally, an average of 6 polls taken between December 4 and January 23. Of these 6, 2 show Trump leading and 1 shows a tie, while the others show Warren leading by 5-7%.
- Incidentally, today PredictIt gives the eventual Republican nominee a 52% chance of beating the eventual Democratic nominee, which doesn’t exactly gel with the numbers above.
The first thing that occurs to me is that these polls (not counting PredictIt) show any of the three most likely Democratic nominees winning the popular vote, whereas they showed Hillary losing it at a similar point in 2016 (based on my memory, I don’t know how to get the historical poll data). Democrats have reasons to be confident, but they are under-confident for obvious reasons. They are probably about as under-confident right now as they were over-confident as of Hillary Clinton’s victory party-like last rally in Philadelphia on election-eve 2016.
The second thing that occurs to me is that the Warren thing is just too close for comfort. I like Warren, but she seems like a risky nominee when Bernie Sanders is so similar in his policy views, and is the stronger potential leader in my view. Similar to Obama, people have this weird reaction to her as an elitist egghead. I personally am comforted when I feel like the people leading the country have a better grasp of subjects like economics and history than I do, but it does not seem as if most of my fellow humans share these feelings.
Which leaves us with Sanders and Biden. Let’s go back to Nate Silver and his Monte Carlo models which are so much better than anything I could come up with. His model suggests a 58% chance that no Democratic candidate wins a majority of delegates. Biden has a 42% chance, Sanders a 22% chance, and there is a 15% chance that nobody gets a majority. Nate points out that in the event nobody gets a majority, but somebody gets a clear plurality, one thing that could happen is that the delegates cast votes for their pledged candidate in the first round of voting, but the candidates and delegates arrange in advance for the plurality candidate to get the majority of votes in the second round. I think you have to say that the two most likely outcomes as of today are that Biden gets a majority of delegates on the first vote, or Biden gets a clear plurality of delegates and gets a majority vote on a second ballot as a pre-determined outcome. Put those two together and this is the likely outcome – the Biden vs. Sanders showdown goes to Biden, the Biden vs. Trump showdown goes to Biden, and we have President Biden.
Now let me tell you why my purely subjective, purely anecdotal experience suggests that a President Sanders is a real possibility. It could be that I am rationalizing what I want to happen, of course, which would make me a human being, but nonetheless here it is. I am originally from Martinsville, Virginia, a former Appalachian manufacturing powerhouse that has fallen on very hard times, and this is an understatement. My grandparents’ generation moved from rural subsistence lifestyles to urban factory worker lifestyles. My parents’ generation worked in those factories when they were young, then got laid off when the factories moved to Mexico and eventually China. I remember friends and relatives railing against Bill Clinton and NAFTA because they thought this took their jobs and the quality of life of their families away. Now, in my personal view, NAFTA was just the final nail in the coffin created by decades of policies meant initially to prop up Cold War allies, which then proved a convenient narrative for multinational corporations, and turned out to be straightforward to represent in abstract mathematical models by academic economists.
Barack Obama made Martinsville, Virginia one of his early campaign stops, and I know for a fact that some of my hillbilly friends and relatives you would never expect to vote for him bought into his “hopey changey” vision and voted for him in 2008 and 2012. When 8 years of Obama didn’t noticeably improve their lives, and the Democrat running in 2016 had the last name “Clinton”, these same friends and relatives voted for Trump in 2016. I think people who self-identify as America’s lost industrial base in Pennsylvania (where I now live), Ohio, Michigan, and Wisconsin did the same. To state the obvious, 2020 is not 2016, there is no candidate named Clinton, and Bernie Sanders won the 2016 primaries in some of these states. At least some of these “working class” Trump voters are going to love Bernie Sanders. Combine this with the coin toss in Florida which went Trump’s way in 2016, an outside chance of Texas flipping Democratic in 2020, and uncertainty about the economy, and Bernie has a good chance. Like I said, Democrats are under-confident.
So let’s be clear: I think the odds favor Biden, a Democratic House, and a Republican Senate. I think Sanders, a Democratic House, and a Republican Senate is the second most likely outcome. Trump, a Democratic House, and a Republican Senate is probably the third most likely outcome. Nobody knows what is going to happen with the economy or geopolitical events, but in the next 11 months something is probably going to happen. Sanders, a Democratic House, and a Democratic Senate is not a high probability, but as Nate Silver might point out, a sports metaphor might help us realize that the odds are not that different from perhaps an underdog like the Philadelphia Eagles winning the 2018 Super Bowl (which they did). It’s likely enough to be worth fighting for.
The Trifecta Checkup
The author of Junk Charts recommends answering three questions to determine if a data visualization is a good one: What is the question, what does the data say, and what does the visual say? If the answers to the three questions are the same, it is a good graphic.
rain measurement using cameras
This article is about estimating rainfall using ordinary surveillance camera footage and computer algorithms to process the videos. Measuring rainfall with physical rain gauges is subject to a lot of error, and so far the only real way to reduce the uncertainty is to add more gauges, which of course costs money. Radar can be used to improve our knowledge of what is going on in the spaces between rain gauges, but ultimately the radar-based estimates still end up being calibrated to the gauges. New methods to improve accuracy for a given gauge coverage, and/or reduce cost and gauge coverage while maintaining accuracy, would be welcome.
“Opportunistic sensing” represents an appealing idea for collecting unconventional data with broad spatial coverage and high resolution, but few studies have explored its feasibility in hydrology. This study develops a novel approach to measuring rainfall intensity in real‐world conditions based on videos acquired by ordinary surveillance cameras. The proposed approach employs a convex optimization algorithm to effectively decompose a rainy image into two layers: a pure rain‐streak layer and a rain‐free background layer, where the rain streaks represent the motion blur of falling raindrops. Then, it estimates the instantaneous rainfall intensity via geometrical optics and photographic analyses. We investigated the effectiveness and robustness of our approach through synthetic numerical experiments and field tests. The major findings are as follows. First, the decomposition‐based identification algorithm can effectively recognize rain streaks from complex backgrounds with many disturbances. Compared to existing algorithms that consider only the temporal changes in grayscale between frames, the new algorithm successfully prevents false identifications by considering the intrinsic visual properties of rain streaks. Second, the proposed approach demonstrates satisfactory estimation accuracy and is robust across a wide range of rainfall intensities. The proposed approach has a mean absolute percentage error of 21.8%, which is significantly lower than those of existing approaches reported in the literature even though our approach was applied to a more complicated scene acquired using a lower‐quality device. Overall, the proposed low‐cost, high‐accuracy approach to vision‐based rain gauging significantly enhances the possibility of using existing surveillance camera networks to perform opportunistic hydrology sensing.
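The headline number in that abstract is a mean absolute percentage error (MAPE). The abstract doesn't spell out exactly how the authors computed it, so here is a minimal Python sketch of the usual textbook definition, assuming you have paired gauge observations and camera-based estimates. The function name and the sample numbers are made up for illustration and are not from the paper.

```python
def mape(observed, estimated):
    """Mean absolute percentage error, in percent.

    Assumes the usual definition: the average of |estimate - observation|
    divided by |observation|. Observations of zero are rejected because
    the ratio is undefined there.
    """
    if len(observed) != len(estimated):
        raise ValueError("observed and estimated must have the same length")
    if not observed:
        raise ValueError("need at least one pair of values")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs(e - o) / abs(o) for o, e in zip(observed, estimated))
    return 100.0 * total / len(observed)


# Hypothetical rainfall intensities in mm/h (illustrative only).
gauge = [10.0, 20.0, 40.0]
camera = [12.0, 18.0, 40.0]
print(round(mape(gauge, camera), 1))
```

One thing this makes obvious is that the metric blows up for very light rain, since the observed intensity is in the denominator.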
predictive policing
Here’s an interesting article on predictive policing from Motherboard. People are concerned that if a particular area has been overpoliced in the past, that is where the algorithms are going to predict crime in the future, and it will continue to be overpoliced. Others just don’t like the idea of proprietary algorithms. I think any of these concerns could be valid depending on how the tool is implemented, but I don’t see why the tool itself could not be implemented in a fair way. In fact, I don’t see why measures to prevent discrimination couldn’t be built into the algorithms themselves. If the algorithms say people in a particular area or in a particular demographic group are being arrested at higher rates, it could help the search for root causes and preventive measures to help a particular group revert to the mean. Transparency seems good in principle, maybe publishing some generalized statistics and maps, but of course if it is too predictable exactly where the police are going to be and when, people could take advantage of that. You could try to get around this by balancing random and targeted patterns within the algorithm.
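To make the idea of balancing random and targeted patterns a little more concrete, here is a rough Python sketch. It is my own illustration, not anything described in the Motherboard article or used by any real product: with some probability it picks an area uniformly at random, and otherwise it picks an area in proportion to a predicted risk score. The names `choose_patrol_area`, `risk_scores`, and `explore_prob` are all hypothetical.

```python
import random


def choose_patrol_area(risk_scores, explore_prob=0.3, rng=random):
    """Pick one area, mixing a random pattern with a targeted one.

    risk_scores  -- dict mapping area name to a non-negative predicted risk
                    (hypothetical input; where it comes from is out of scope)
    explore_prob -- probability of ignoring the scores and picking uniformly
    rng          -- random-number source, e.g. random.Random(seed) for tests
    """
    areas = list(risk_scores)
    if not areas:
        raise ValueError("risk_scores must contain at least one area")
    weights = [risk_scores[a] for a in areas]
    # Random pattern: every area is equally likely, whatever its score.
    # Also used as a fallback when all scores are zero.
    if rng.random() < explore_prob or sum(weights) == 0:
        return rng.choice(areas)
    # Targeted pattern: areas are drawn in proportion to their scores.
    return rng.choices(areas, weights=weights, k=1)[0]


# Illustrative scores only.
scores = {"north": 0.6, "south": 0.3, "east": 0.1}
print(choose_patrol_area(scores, explore_prob=0.3, rng=random.Random(42)))
```

Raising `explore_prob` makes the schedule harder to anticipate and dilutes the feedback loop from past policing, at the cost of following the predictions less closely.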
August 2018 in Review
Most frightening stories:
- In certain provinces with insurgent activity, the Chinese government is reportedly combining surveillance and social media technologies to score people and send those with low scores to re-education camps, from which it is unclear if anyone returns.
- Noam Chomsky doesn’t love Trump, but points out that climate change and/or nuclear weapons are still existential threats and that more mainstream leaders and media outlets have failed just as miserably to address them as Trump has. In related news, the climate may be headed for a catastrophic tipping point and while attention is mostly elsewhere, a fundamentalist takeover of Pakistan’s nuclear arsenal is still one of the more serious risks out there.
- The U.S. government is apparently very worried about a severe cyber attack. Also, a talented 11-year-old can hack a voting machine.
Most hopeful stories:
- There are some new ideas for adjusting GDP to account for natural capital and ecosystem services. There are also ideas to better account for “intangible products” like software in GDP. And R&D is a good investment that the U.S. could do more of.
- While the U.S. uses a few fewer straws and pats itself on the back, there are serious ideas in other countries for tackling the root problem of packaging.
- Vancouver has successfully combined green street and complete street concepts. The American Society of Landscape Architects has also compiled some helpful resources on this topic.
Most interesting stories, that were not particularly frightening or hopeful, or perhaps were a mixture of both:
- Google Lens can identify a plant or animal from its picture, and the subway body scanners from Total Recall are now real.
- There are some neat high-tech camp stoves out there that can burn almost anything with very little smoke and even charge electronic devices.
- I found a good article about making box plots in R.
box plots!
One of my nerdy interests is box plots. And no, you can’t make great ones in Excel. Here is a blog about making fantastic ones in the R package ggplot2.
March 2018 in Review
Most frightening stories:
- One reason the U.S. blunders into war repeatedly is that it does not do a good job of analyzing the motives of its adversaries.
- International investors may be losing confidence in the U.S. dollar. And a serious financial crisis in China is a possibility, although China is also trying to become a “cyber superpower”.
- One reason propaganda works is that even knowledgeable people are more likely to believe a statement the more often it is repeated.
Most hopeful stories:
- One large sprawling city could be roughly the economic equivalent of several small high-density cities. This could potentially be good news for the planet if you choose in favor of the latter, and preserve the spaces in between as some combination of natural land and farm land.
- The problems with free parking, and solutions to the problems, are well known. This could potentially be good news if anything were to be actually done about it. Self-parking cars could be really fantastic for cities.
- The coal industry continues to collapse, and even the other fossil fuels are saying they are a bunch of whining losers. And yes, I consider this positive. I hope there aren’t too many old ladies whose pensions depend on coal at this point.
Most interesting stories, that were not particularly frightening or hopeful, or perhaps were a mixture of both:
- Some people really do win the lottery more than they should.
- You can buy a computerized chicken coop or cider maker.
- You can do network analysis or call Matlab in R.
calling Matlab from R
There are at least three ways to call Matlab from R. Which probably doesn’t interest the vast majority of people, but could be useful in engineering where different disciplines and people from different backgrounds are trained in a variety of tools but still need to work together.
Calculating the RPI
If you wanted to calculate the RPI in R here is how you would do it.
https://www.r-bloggers.com/calculating-college-basketball-rankings-using-functional-programming-in-r/
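The linked post does this in R. For anyone who just wants to see the arithmetic, here is a small Python sketch of the basic RPI formula as it is commonly stated: 25% the team's own winning percentage, 50% its opponents' winning percentage (leaving out games against the team itself), and 25% its opponents' opponents' winning percentage. This is my own illustration and not a translation of the linked code. It ignores the home/away weighting that some versions of the RPI apply, and the three-team schedule is made up.

```python
def winning_pct(games, team, exclude=None):
    """Share of games `team` won, optionally skipping games against `exclude`."""
    wins = played = 0
    for winner, loser in games:
        if team not in (winner, loser):
            continue
        if exclude is not None and exclude in (winner, loser):
            continue
        played += 1
        wins += winner == team
    return wins / played if played else 0.0


def opponents(games, team):
    """Opponents of `team`, one entry per game (repeat opponents are kept)."""
    return [l if w == team else w for w, l in games if team in (w, l)]


def owp(games, team):
    """Opponents' winning percentage, excluding their games against `team`."""
    opps = opponents(games, team)
    if not opps:
        return 0.0
    return sum(winning_pct(games, o, exclude=team) for o in opps) / len(opps)


def oowp(games, team):
    """Opponents' opponents' winning percentage: the average OWP of opponents."""
    opps = opponents(games, team)
    if not opps:
        return 0.0
    return sum(owp(games, o) for o in opps) / len(opps)


def rpi(games, team):
    """Basic RPI = 0.25 * WP + 0.50 * OWP + 0.25 * OOWP (no home/away weights)."""
    return (0.25 * winning_pct(games, team)
            + 0.50 * owp(games, team)
            + 0.25 * oowp(games, team))


# Made-up schedule of (winner, loser) pairs.
games = [("A", "B"), ("B", "C"), ("A", "C")]
for t in ("A", "B", "C"):
    print(t, rpi(games, t))  # A 0.625, B 0.5, C 0.375
```

The nested averaging gets slow on a full season, which is presumably part of why the linked post leans on functional programming to keep it tidy.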