Update – new data for 2022-03-24 I have no words!
Update – new data for 2022-03-23 More of the same – not looking good.
Update – new data for 2020-03-22 Lots more cases and older people getting hit harder than ever.
Update – new data as of 2020-03-21 It shows that it continues to big a big problem for the older population.
Wow – I did not think that I was ever going to be writing another Alteryx blog post… I have been out of day to day operations for a few years and starting this year I have no association with Alteryx at all. However, it is still the best data analysis tool out there and Alteryx has been kind enough to let me keep a license. If you just want the conclusion, here is my Colorado COVID-19 report.
Like everyone, I have been obsessively following the COVID-19 data. We are all scared. Colorado has upped its game on data reporting for this crisis with this portal: https://covid19.colorado.gov/data In particular though, I have a few issues with it. While Tableau produces very nice looking charts and maps, Tableau is not the best way to publish high volume data. The site is crashing periodically and having various issues. If you look at the # of network requests it takes to serve up this one page, it is insane. I have argued for years that static reports are generally more appropriate than interactive for this reason among many. But if the report was just a static PDF, it would make it much easier to put on a content distribution network.
My second issue with the Colorado was one particular chart. The reported positive tests by age just end up looking like an age histogram of the state. I was afraid that this diminished the threat of this virus to older people. See the chart below (copied from https://covid19.colorado.gov/data on 2020-03-21):
It kind of makes it look like there is no problem with the older crowd. I found this hard to believe. I am not an expert on data visualization (I am a software architect/programmer). But I do know a thing or 2. Most importantly is that reporting absolute numbers can often be misleading. It is always better to normalize values. In this case a chart with infection rate instead of raw #’s paints a very different picture:
In this case it is very obvious that older individuals are not getting this disease at a lower rate. And the hospitalization rate for older individuals is very high.
So long story short, I decided to make my own static report on COVID-19 for the state of Colorado. I used the data from the Colorado open data portal: here. This data seems to be 1 day out of date for Colorado. I also wanted to add some national data charts which I got here. Because of the nature of the update schedule, this data seems to be 2 days out of date. And finally the age breakdown data for Colorado does not seem to be anywhere for download, but it was a small enough amount of data that I just typed it in. This would all be very easy to adapt for your own state. For obvious reasons, Colorado is where I am interested right now.
Footnote: thoughts about Alteryx now that I am a few years removed from it (note: many of these issues are probably originally my fault, I am not placing blame):
- The download tool temporary file mode is not really documented. Being a few years out it took me a few minutes to figure out how to follow that with a dynamic input tool. That should be easier and better documented. It makes it really easy to read a CSV from the web.
- The interactive chart needs an option to have a logarithmic scale. Especially for this data.
- Why is the chart tool in pixels when the other report tools are in inches/cm? And why does it default to 72dpi when the rest of the report tools (and windows in general) default to 96dpi? And what’s up with LaTeX? And finally – it doesn’t work sometimes if you haven’t run your workflow recently. It would be awesome if it told you that.
- Getting all the tools lining up and connections not intersecting is as hard as ever.
- I have a fairly big laptop screen, but with almost requiring the config and output windows up all the time, you have a very small amount of work area left. And it still doesn’t support high resolution screens very well.
- Tabs within tabs within tabs. All with different styles to try to make you think you aren’t in tabs.