Inspiring Ingenuity

Alteryx, Bicycles and Teaching Kids Programming.

Alteryx: Big Data and Current Events

Or a National Summary of Food Deserts

Food deserts, areas that are a longer than normal distance from grocery stores, have been an ongoing topic in politics and demographics for a while now. I first heard about the concept when Rahm Emanuel started talking about it in his run for mayor of Chicago. I have continued to see articles and blog posts about it, and every time I think that Alteryx would make it much easier to create a more nuanced analysis.

They say a picture is worth a thousand words, so I wanted to start with a map. As I explored in Dot Density Maps, mapping a phenomenon without exaggerating rural areas can be very hard. Look at some of the other maps online: they all show the problem as seemingly a rural one. To be fair, a lot of blogs are looking at it from a socioeconomic or health point of view, and rural areas do play a large part. In my map, each dot represents 500 people living in a food desert. It becomes clear that the issue is primarily a suburban one: there is a ring of food desert around almost every city. From an environmental point of view, this is a disaster. It becomes impossible to walk or ride a bike to get food, so that much more gas is burned and that much more time is wasted sitting in traffic.

So how big an issue is it? If you only count grocery stores with 10 or more employees, almost 10% of people live in a food desert. Specifically: 31.9 million total people (9.4%), 24.3 million (10.0%) in urban areas and 7.5 million (7.9%) in rural areas. If you count all grocery stores, even those with just a few employees, it doesn’t look as bad at just over 2% of the population, but the health and cost issues certainly come into play if you are forced by your location to get your food at such small stores.

What are the top and bottom metro areas for food deserts? In particular, I am looking at the top 50 CBSAs by population, because including all the smaller cities really skews the results. San Juan, Puerto Rico came in with 67% of the population living in a food desert, but frankly I don’t know enough about the data quality or demographics to know if that is correct. It may be the norm there to shop at smaller stores, so maybe the 10+ employee filter isn’t appropriate there. I think I will need to go there to do some research, maybe this winter when it’s cold and snowing in Boulder?

Birmingham and Austin are basically tied for the largest share of people in a food desert, at over 19% of the population! San Antonio, Oklahoma City and Atlanta round out the top 5. Birmingham is also the worst of the list for distance to grocery stores: its 80th percentile distance is 4.16 miles! This means 20% of the population has to go farther than this! Second place for distance is Oklahoma City at 2.9 miles, so it is a big difference.
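To make the "80th percentile distance" concrete, here is a minimal sketch of a population-weighted percentile in Python. It is not the Alteryx tool itself. The function name, the sample numbers and the choice to return an observed value rather than interpolate are all assumptions made for illustration.

```python
def weighted_percentile(values, weights, pct):
    """Return the smallest value v such that at least pct percent of the
    total weight belongs to items with value <= v (no interpolation)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    threshold = total * pct / 100.0
    running = 0.0
    # Walk the items from nearest to farthest, accumulating population.
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= threshold:
            return value
    return max(values)


# Hypothetical census blocks: distance to nearest store (miles) and population.
distances = [0.3, 0.9, 1.5, 2.8, 4.5]
populations = [400, 300, 150, 100, 50]

p80 = weighted_percentile(distances, populations, 80)
print(p80)  # 80% of these people live within this distance of a store
```

Weighting by population matters here because blocks vary widely in size; an unweighted percentile over blocks would let sparsely populated blocks count as much as dense ones.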

Much to my surprise, the best city in terms of fewest people in a food desert was Los Angeles with only 1.1%. With the California car culture, I would have guessed it would be worse, but the density worked in its favor. The rest of the top 5 is Miami, Portland OR, San Francisco and San Jose, with NYC close behind. The results for 80th percentile distance are very similar, with Los Angeles being the best at 0.8 miles… In distance, NYC moves solidly into the top 5, with Portland OR falling down a bit to 8th. See the table on the right for the full list (sorry it’s an image; I couldn’t figure out how to make a nice table on this blog).

So how did Chicago do? Was Rahm Emanuel right to bring up this issue? The other cities of a similar size mostly did a lot better, so yes, I think Chicago has an issue and it is worth focusing on. It is far from the worst though, so maybe the city has made progress in the last few years it has been focusing on it.

Defining Food Deserts

There is no consistent definition of what exactly constitutes a food desert. The USDA defines it as 1 mile from the nearest grocery store, but they also take into account wealth. For my purposes, wealth is less important, because I am equally interested in the transportation costs and environmental concerns associated with living far away from a store. In my case I live within a block of a small, but good grocery store. Living in a house that close to a grocery store is the single best thing I have done for my quality of life. Very rarely do I go on a big shopping trip or use a car for that purpose, or any other really.

After surveying a lot of sources I am defining a food desert as:  Urban populations living 2 or more miles from the nearest grocery store or rural populations living 10 or more miles.  See below for my definitions of Urban vs Rural and grocery store locations.  Unless otherwise noted, all numbers & maps were created filtering the grocery store list to locations with 10 or more employees.  I know there are many good small stores, but far and away most of them seem small and specialty and not the kind of places you might do bulk grocery shopping.
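The definition above is simple enough to write down as code. The sketch below is only an illustration of the rule (2+ miles urban, 10+ miles rural); the function names and the sample blocks are hypothetical and not taken from the actual workflow.

```python
URBAN_THRESHOLD_MILES = 2.0   # urban: 2 or more miles from the nearest store
RURAL_THRESHOLD_MILES = 10.0  # rural: 10 or more miles from the nearest store


def in_food_desert(distance_miles, is_urban):
    """Apply the post's definition to one census block."""
    threshold = URBAN_THRESHOLD_MILES if is_urban else RURAL_THRESHOLD_MILES
    return distance_miles >= threshold


def food_desert_share(blocks):
    """blocks: iterable of (population, distance_miles, is_urban) tuples.
    Returns the fraction of total population living in a food desert."""
    total = 0
    desert = 0
    for population, distance_miles, is_urban in blocks:
        total += population
        if in_food_desert(distance_miles, is_urban):
            desert += population
    return desert / total if total else 0.0


# Hypothetical blocks, for illustration only.
sample_blocks = [
    (1200, 0.4, True),    # urban, close to a store
    (800, 2.5, True),     # urban, in a food desert
    (300, 6.0, False),    # rural, under the 10 mile cutoff
    (200, 12.0, False),   # rural, in a food desert
]
share = food_desert_share(sample_blocks)
print(f"{share:.1%} of the sample population is in a food desert")
```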

Sources

All populations are from the 2010 US Census SF1 file at the block level. As far as distances are concerned, I am assuming that all the people are at the “interior point” as defined by the US Census. I could have used a household file to get down to the individual level, and in fact I did in some of my testing, but I feel the commercially available household files underrepresent the poorer people in the country. This makes sense because they collect data from things like credit card purchases and they are selling their data to marketers. People without credit cards or permanent addresses just aren’t as important when your goal is selling. The US Census has the goal of being an accurate count of everyone.

The grocery stores and employee figures come from the D&B Q2 2013 Analytic file that ships with the Alteryx Designer Desktop Professional Edition. In particular, the grocery stores are identified by “NAICS 445110: Supermarkets and Other Grocery (except Convenience) Stores”. There are about 150 thousand grocery stores in the country, of which 34,365 have 10 or more employees.

Urban/Rural

It turns out the Urban/Rural information is either not available for blocks in the SF1 dataset, or the version of the dataset I have (from the census) has a bug. I hear there might have been a revised SF1 that fixed some issues, and I know I received my disk on the day it came out, so maybe it is fixed. Anyway, I needed to append the Urban/Rural information, so I headed back to my Downloading From TIGER App and downloaded the UAC (Urban Area/Urban Cluster) layer. A simple Point in Polygon and we can now identify which blocks are urban vs. rural. In particular, I am only using the “Urban Area” records, NOT the “Urban Cluster” areas. By this definition there are 91,260,129 rural people and 226,547,162 urban.
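For readers who have not run into a point-in-polygon test before, here is a minimal sketch using the classic ray-casting approach. It is a swapped-in illustration, not what Alteryx does internally; it treats each polygon as a plain list of (x, y) vertices with no holes, and the function names and sample square are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a horizontal ray from
    (x, y) toward +x crosses. An odd count means the point is inside.
    polygon is a list of (x, y) vertices; holes are not handled."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_block(lon, lat, urban_area_polygons):
    """Label a block's interior point urban if it falls in any Urban Area polygon."""
    for polygon in urban_area_polygons:
        if point_in_polygon(lon, lat, polygon):
            return "urban"
    return "rural"


# Hypothetical square "urban area", for illustration only.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(classify_block(2.0, 2.0, [square]))  # inside the square
print(classify_block(5.0, 2.0, [square]))  # outside the square
```

With 11 million points, a real run would also want a spatial index so that each point is only tested against nearby polygons rather than every polygon.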

Alteryx & Big Data

There are over 11 million census blocks, each with a polygon and centroid. I downloaded the polygons using my Downloading From TIGER app. Interesting to note that I tried to download more data today and it failed because of the US government shutdown… :( Then came a find nearest from those 11 million points to 150K grocery stores and a Weighted Percentile on the results. I ended up with more than 15GB of data to analyze to produce just a few results. This really shows the power of Alteryx for mowing through large quantities of data and coming up with actionable intelligence.
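As a rough picture of what a find nearest step computes, the sketch below measures straight-line (great-circle) distance from one block point to each store and keeps the smallest. It assumes straight-line distance rather than drive distance, which the post does not specify, and the function names and coordinates are made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_store_distance(block, stores):
    """Brute force: distance from one (lat, lon) block point to its closest store."""
    lat, lon = block
    return min(haversine_miles(lat, lon, s_lat, s_lon) for s_lat, s_lon in stores)


# Hypothetical coordinates, for illustration only.
block_point = (40.0, -105.0)
store_points = [(40.0, -105.1), (40.5, -105.0)]
print(round(nearest_store_distance(block_point, store_points), 2))
```

Checking every block against every store this way would mean over a trillion distance calculations for 11 million blocks and 150K stores, so a tool working at this scale needs some form of spatial indexing rather than the brute-force loop shown here.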

Thanks for reading,

ned.

[Edit:  There is a follow up to this post with a few more maps and a detailed download for all CBSAs – see https://alteryxned.wordpress.com/2013/10/03/food-desert-follow-up/]

5 thoughts on “Alteryx: Big Data and Current Events”

  1. Wow, I didn’t know the shutdown meant they had to shut down their ftp servers!

  2. Yeah – all my links to gov sites are broken. USADA, even the NAICS definition.

  3. Your instinct re. San Juan is right — very few American-style big grocery stores. Mostly independent stands & little stores, and significantly, lots of access to good produce — for example, I’d say San Juan is distinctly less of a produce desert than most of outer-borough NYC.

    As I don’t need to tell you, their sample areas are almost meaninglessly large for major metro areas. For example, “NY/NJ/PA metro”: just within NYC’s five boroughs, a very small fraction of that area, access levels range from Manhattan and the gentrified Brooklyn/Queens neighborhoods (where groceries and good produce are everywhere) to the non-gentrified parts of the boroughs, i.e. most of NYC’s area (where you often can’t get to good produce without either spending $5 on roundtrip subway or driving).

  4. Pingback: Food Desert – Follow Up | Ned Harding

  5. Pingback: New Ideas Needed | Inspiring Ingenuity