You would think that looking up a ZIP code and its associated demographic data should be fairly straightforward, but there are some apparent discrepancies in the ZIP code data included in CDXZipStream that we’d like to discuss here. CDXZipStream is our add-in for performing ZIP code and location analysis in Microsoft Excel.
There are about 42,000 ZIP codes in use in the United States, and every month we update our data feeds to reflect any changes made by the U.S. Postal Service. All of these active ZIP codes are included in CDXZipStream functions CDXRadius, CDXDistance, CDXClosestZip, CDXZipList, and CDXFindZip. To easily create a list of all ZIP codes in Excel, you can use the CDXRadius function for any valid ZIP code, and select a radius size (such as 50,000 miles) that would encompass the entire U.S. As of this writing, this returns 41,813 ZIPs.
However, when using CDXZipStream data feeds from census data, such as CDXCensus2010 and CDXACSZCTA, there are only about 33,000 ZIP codes that have associated data. There are a few reasons for this. For instance, some ZIP codes are unique to a single high-volume address (such as 20505 for the CIA in Washington, DC) or apply to PO Boxes only (such as 22313 for the PO Boxes of Alexandria, VA). There’s even a mail boat, the J.W. Westcott II, which has its own assigned ZIP code (48222). In these cases there is no census population associated with the ZIP code, so it is excluded from census data altogether.
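If you export both lists from Excel, the gap between them can be checked with a simple set difference. The following is a minimal Python sketch, not part of CDXZipStream: the function name `zips_without_census_data` is hypothetical, and the two short lists are illustrative stand-ins for a full ZIP list and the ZIPs returned by a census data feed (the presence of 22314 and 02139 in the census list is assumed for the example).

```python
def zips_without_census_data(all_zips, census_zips):
    """Return the ZIP codes in all_zips that are absent from census_zips, sorted."""
    def normalize(z):
        # Keep ZIPs as 5-character text so leading zeros (e.g. 02139) survive.
        return str(z).strip().zfill(5)

    census = {normalize(z) for z in census_zips}
    return sorted({normalize(z) for z in all_zips} - census)


# Illustrative sample only; real lists would come from your own exports.
all_zips = ["20505", "22313", "48222", "22314", "02139"]
census_zips = ["22314", 2139]  # 2139 mimics a ZIP whose leading zero was dropped

print(zips_without_census_data(all_zips, census_zips))
# prints ['20505', '22313', '48222']
```

Treating ZIP codes as text rather than numbers is the main design choice here, since a spreadsheet cell formatted as a number will silently turn 02139 into 2139.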
In other cases, there may be a population for the ZIP code (so it is listed in a census data feed), but the population count is so low that for certain measurements, such as income or housing value, there is not enough data to provide a statistically significant result. This is especially true for data from the American Community Survey, which is based on a much smaller sample of the population than the 100% count performed for the ten-year census. Demographics are also a factor; a college campus that has only young single residents may have no statistically significant data associated with households or families. The same applies to ZIP codes that cover military installations, nursing homes, prisons, and other special populations.
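One common way to judge whether a small-area survey estimate is usable is its coefficient of variation (CV), the standard error divided by the estimate. The Census Bureau publishes American Community Survey margins of error at the 90% confidence level, so the standard error can be recovered by dividing the margin of error by 1.645. The sketch below assumes you have an estimate and its margin of error in hand; the function names and the 30% cutoff are assumptions chosen for illustration, not an official threshold or a CDXZipStream feature.

```python
Z_90 = 1.645  # z-score for the 90% confidence level used by published ACS margins of error


def coefficient_of_variation(estimate, moe):
    """Return standard error / estimate, or None when it cannot be computed."""
    if estimate is None or moe is None or estimate <= 0:
        return None
    standard_error = moe / Z_90
    return standard_error / estimate


def reliability(estimate, moe, cutoff=0.30):
    """Label an estimate using a hypothetical CV cutoff (30% by default)."""
    cv = coefficient_of_variation(estimate, moe)
    if cv is None:
        return "no data"
    return "usable" if cv <= cutoff else "unreliable"


# Made-up figures: a well-sampled ZIP, a sparsely sampled ZIP, and a ZIP with no value.
print(reliability(52000, 4100))   # CV is roughly 5%
print(reliability(48000, 31000))  # CV is roughly 39%
print(reliability(None, None))
```

A stricter or looser cutoff may suit your analysis; the point is only that a value with a very wide margin of error deserves the same caution as a missing one.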
If you are getting unexpected results from one of our CDXZipStream data feeds, consider doing a quick Google search on a few of the ZIP codes in question and check the results on sites such as zipskinny.com or zip-codes.com. Just about all the data from these sites are sourced from the U.S. Census Bureau, as are our data feeds, so the demographic results should be consistent across the board.
Finally, keep in mind that ZIP codes are first and foremost designed to aid in the delivery of mail. ZIP code areas are not designed around geographic or man-made boundaries; a single ZIP can cross county and even state boundaries, and ZIP codes are not tied in any way to congressional districts, census areas, or any other political boundary. Some rural areas may not have a ZIP code assigned at all. So although ZIP codes are very useful for a variety of applications ranging from market research to area analysis, applying them beyond their original purpose can have some unintended consequences.
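If your analysis rolls ZIP-level figures up to counties or states, it can help to flag the ZIP codes that straddle a boundary first. The Python sketch below assumes a hypothetical crosswalk of (ZIP code, five-digit county FIPS code) pairs, where the first two digits of the FIPS code identify the state. The function name `boundary_crossers` and the sample rows are invented for illustration and do not describe real ZIP assignments.

```python
from collections import defaultdict


def boundary_crossers(rows):
    """Map each ZIP that appears in more than one county to 'multi-county' or 'multi-state'."""
    counties_by_zip = defaultdict(set)
    for zip_code, county_fips in rows:
        counties_by_zip[zip_code].add(county_fips)

    flagged = {}
    for zip_code, fips_codes in counties_by_zip.items():
        if len(fips_codes) > 1:
            # The first two digits of a county FIPS code are the state FIPS code.
            states = {fips[:2] for fips in fips_codes}
            flagged[zip_code] = "multi-state" if len(states) > 1 else "multi-county"
    return flagged


# Invented rows: (ZIP code, county FIPS), both kept as text.
crosswalk = [
    ("11111", "01001"),
    ("11111", "01003"),
    ("22222", "01001"),
    ("33333", "01005"),
    ("33333", "13001"),
]

print(boundary_crossers(crosswalk))
# prints {'11111': 'multi-county', '33333': 'multi-state'}
```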