In this post I’ll explain how I tabulated mean household income by zip code from the 2010 American Community Survey (ACS). I’ll also provide three data sets that should help you join other census tract level ACS data to zip codes.
One piece of data I needed recently for a project was income by zip code. The US Census provides tons of data at various geographical scales, but you often have to dig through its web site and merge a bunch of tables together to get what you want. Unfortunately income by zip code appears to be one of these data sets.
The good news is you can get a proxy for household income, household median income, and per capita income by zip code. The bad news is that unless you want to pay for a data source from ESRI or some other vendor, you’ll have to do some legwork to get this data.
Why Doesn’t The Census Provide Income by Zip Code Data Tables?
Zip codes are created by the US Postal Service to help mail delivery, which is the main reason why the Census doesn’t use them (see this link to learn more). In contrast, the census views the country according to its own geographies: census tracts, census blocks, etc.
The census uses these geographies because it says census geographies are relatively stable over time while zip codes can cross state, place, county, and census geography boundaries as well as change over time. The census web site says that “Because of the ill-defined nature of ZIP Code boundaries, the Census Bureau does not have a file (crosswalk) showing the relationship between U.S. Census Bureau geography and U.S. Postal Service ZIP Codes.”
But since so many people want to view census data by zip code, the census created a new statistical area called the ZIP Code Tabulation Area (ZCTA), which is built from census blocks. It is this data source that we can use to link zip codes to income data.
Software and Data Sources I Used
Ideally you’ll want to use spatial analysis software like ArcGIS to spatially join census tracts to zip code tabulation areas. ArcGIS is very expensive, but it’s typically available free to students through their universities, and there is also a home use version for $100! You can probably also use a scientific software package like NumPy to do this spatial join, but this is probably a painful task so I wouldn’t recommend it.
In any case I’ll provide this join to you so you can just download my file at the end of this post.
You’ll also want to have some type of database program on hand to easily join different tables together. I like MySQL but Microsoft Access or Open Office Database work as well. And of course Excel works but you might have to manually create the joins.
The most granular level that the census provides income data at is the census tract level.
The census’s American FactFinder site contains a variety of tables you can download at different geographical levels. To get census tract level data for a state, first choose to search by Geographies from the left hand menu. Then choose geography by census tract, select “All Census Tracts within [state]”, and click the “Add to your selections” button.
Next, close the select geographies window. You are now presented with a search screen showing all data at the census tract level within your selected state. Type “income” in the Narrow your search box and you’ll get a list of all income related data from the American Community Survey. From this list I prefer data from ACS 5 year estimates rather than 1 year estimates (to learn the difference between the two, read the ACS documentation).
Choose a data file. In my case I chose S1902: Mean Income in the past 12 months (in 2010 inflation-adjusted dollars). At the next screen you’ll be presented with a table. It’s easier to deal with the table with the rows and columns transposed, so click the modify table button and choose “Transpose rows/columns.” Then download the data in csv format.
Census Tract and Zip Code Area Shapefiles
To download shapefiles for 2010 census tracts and zip code tabulation areas I went to the Census Tiger/Line Shapefiles site.
Linking Census Tracts to Zip Codes
How do census tracts compare in area to zip code areas? I’ve found that in dense urban areas census tracts tend to be the same size as or smaller than zip code tabulation areas (see the following images of Boston), while in less populated areas census tracts can be larger than zip code areas.
Also, not all zip codes are included in the census’s zip code tabulation areas data set, as this image of the census tract and zip code shapefiles for Idaho shows:
To link tracts to zip codes I used the spatial join tool in ArcGIS 10.1. How to do spatial joins is a complex topic and one I could spend a post (or two or three) talking about, but for the purposes of this post I’ll just tell you that I did a one-to-one join using the zip code layer as the target feature and the census tract as the join feature. For each zip code, I associated it with one census tract—the one that contained its centroid (center of area).
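If you don’t have ArcGIS, the centroid-based one-to-one join described above can be sketched in plain Python. This is only an illustration of the idea, not what ArcGIS does internally: it assumes each shape is a single simple ring of (x, y) points with no holes, and the function names and toy IDs (`join_zcta_to_tract`, `T1`, `A`, and so on) are made up for the example. Real tract and zip code shapes are often multi-part polygons, so treat this as a starting point.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # assumption: one simple ring, vertices in order, no holes


def polygon_centroid(ring: Polygon) -> Point:
    """Centroid (center of area) of a simple polygon via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def point_in_polygon(pt: Point, ring: Polygon) -> bool:
    """Ray casting: count how many edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):  # edge straddles the horizontal line through pt
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def join_zcta_to_tract(
    zctas: Dict[str, Polygon], tracts: Dict[str, Polygon]
) -> Dict[str, Optional[str]]:
    """Map each zip code area to the one tract containing its centroid."""
    result: Dict[str, Optional[str]] = {}
    for zcta_id, zcta_ring in zctas.items():
        center = polygon_centroid(zcta_ring)
        result[zcta_id] = next(
            (tid for tid, ring in tracts.items() if point_in_polygon(center, ring)),
            None,  # no tract contains the centroid
        )
    return result


# Toy data: two square tracts side by side and three zip code areas.
tracts = {
    "T1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "T2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
zctas = {
    "A": [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)],
    "B": [(1.5, 0.5), (4, 0.5), (4, 1.5), (1.5, 1.5)],
    "C": [(10, 10), (11, 10), (11, 11), (10, 11)],
}
mapping = join_zcta_to_tract(zctas, tracts)
```

Zip code area B overlaps both tracts, but only the tract holding its centroid is kept, which is exactly the simplification the one-to-one join makes.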
An alternate way is to do a one-to-many join so that any census tracts that intersect with that zip code are joined to it. You can then take an average income of all the census tracts to estimate the income of that zip code, but then you’d also have to factor in what percentage of each census tract was in that zip code and figure out some weighting system to take this into account. I thought it was simpler to just join to one census tract because I assumed that adjacent census tracts probably don’t vary too greatly in income. This is probably valid for less dense areas, but it might not be the strongest assumption for dense urban cities, where you might have a low income area right next to a higher income area.
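For anyone who wants to try the one-to-many route, here is a minimal sketch of one possible weighting system: weight each tract’s income by the area of its overlap with the zip code. This is not what I did for the files in this post, the overlap areas would have to come from your GIS tool, and the function name and numbers are invented for illustration. Weighting by households instead of area would arguably be better if you have those counts.

```python
from typing import Dict, List, Tuple


def weighted_zip_income(
    overlaps: List[Tuple[str, float]], tract_income: Dict[str, float]
) -> float:
    """Area-weighted mean income for one zip code.

    overlaps: (tract_id, overlap_area) pairs, one per tract that
    intersects the zip code. Areas can be in any unit as long as
    they are all in the same unit.
    """
    total_area = sum(area for _, area in overlaps)
    if total_area <= 0:
        raise ValueError("zip code has no overlapping tract area")
    weighted_sum = sum(tract_income[tid] * area for tid, area in overlaps)
    return weighted_sum / total_area


# Hypothetical example: three quarters of the zip code lies in tract T1.
tract_income = {"T1": 40000.0, "T2": 80000.0}
estimate = weighted_zip_income([("T1", 3.0), ("T2", 1.0)], tract_income)
```

Compared with the centroid join, this smooths out sharp income differences between neighboring tracts instead of picking one side.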
Linking Census Tract Data to Zip Codes
Each census tract income table from FactFinder will have a census tract ID associated with it. You can use this ID to join it to the appropriate zip code from the file I’ve provided. If you’re downloading your own data you should know that the IDs in the FactFinder files seem to be integers, while the IDs from the shapefiles are 11-character text data. This means you’ll need to convert the FactFinder IDs to 11-character text, padded with zeros on the left, to be able to join the two data sets (going the other way and converting to integers creates errors for some reason…).
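The padding step is easy to get wrong in a spreadsheet, so here is a small Python sketch of it. The field names (`tract_id`, `mean_income`) and the sample rows are placeholders, not the real column headers from the downloads, so adjust them to match your files.

```python
def to_geoid(raw_id) -> str:
    """Turn an integer-like tract ID into 11-character, zero-padded text."""
    return str(raw_id).strip().zfill(11)


# Illustrative values only, not taken from the real files.
# Keys are 11-character tract IDs as they appear in the shapefiles.
tract_to_zcta = {"01001020100": "36067", "16001000100": "83702"}

# Income rows where the tract ID was read in as an integer,
# which silently drops any leading zero.
income_rows = [
    {"tract_id": 1001020100, "mean_income": 61000},
    {"tract_id": 16001000100, "mean_income": 58000},
]

income_by_zcta = {}
for row in income_rows:
    geoid = to_geoid(row["tract_id"])
    zcta = tract_to_zcta.get(geoid)
    if zcta is not None:  # skip tracts that are not linked to any zip code
        income_by_zcta[zcta] = row["mean_income"]
```

Without the padding, the first row would never match because its ID is only 10 digits long as an integer.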
Other Data Processing You Might Want to Do
Because cost of living varies by state, you might want to use a cost of living adjustment by state on the income data. The Missouri Economic Research and Information Center has some cost of living index values by state for 2011/2012, which I’ve also included as a csv data table for you to download.
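A minimal sketch of one way to apply such an adjustment, assuming the index is scaled so that 100 means the national average (check the documentation that comes with the index before relying on that). The function name and the numbers below are made up for illustration.

```python
def adjust_for_cost_of_living(income: float, col_index: float) -> float:
    """Scale income to national-average dollars.

    Assumes col_index is 100 for the national average, so a state
    with index 120 is treated as 20% more expensive than average.
    """
    if col_index <= 0:
        raise ValueError("cost of living index must be positive")
    return income / (col_index / 100.0)


# Hypothetical example: the same nominal income in two states.
expensive_state = adjust_for_cost_of_living(60000, 120.0)
cheap_state = adjust_for_cost_of_living(60000, 90.0)
```

With this scaling, an income in an expensive state shrinks and an income in a cheap state grows, so zip codes in different states become easier to compare.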
- Census tracts linked to Zip Code Tabulation Areas (csv file, mysql file)
- Mean household income by zip code for selected states (csv file)
- Cost of living adjustments by state (tab delimited file)
If you use any of my data sets all I ask is that you credit me. Hope this helps!