One of the more persistent issues in geospatial analysis is keeping the geography consistent across a time series. We can’t help but want to compare the growth of one city against another over time. Time is the easy part – a year today is the same as a year in millennia past. But the spatial definitions? An entirely different matter.

A few weeks ago, we looked at the patterns of manufacturing employment, and a natural follow-up is to ask: in those areas where manufacturing jobs have disappeared, has there been a decline in the income of residents? Or have those traditionally lower-middle-income jobs been replaced by jobs in higher-income sectors?

This requires a definition of “place” that is appropriate to our topic of study, and choosing one means weighing several factors:

How many geographic areas are there? Too few, and the study becomes uninteresting. Too many, and the forest may be lost for the trees – the local detail swamps the broader geographic patterns we want to explore.

How much interaction is there between the areas? Even at the state level, we face the annoying problem that people insist on working in a different state from where they live. While some level of interaction between study areas is always present, if the areas chosen are too small, interaction effects will dominate the analysis.

How stable are the definitions of areas over time? The areal extent of a spatial object may change even if it continues to have the same name. ZIP codes and city boundaries are notorious for changing over time. While instability is not an insurmountable problem, the most interesting areas, those where change is occurring, will inevitably be the most unstable.

The analyst must be aware of territorial changes, especially for areas with consistent naming.

Is the data collected or disseminated at this scale? Any re-estimation of data to different spatial boundaries increases error, so it should be avoided where possible and minimized where it cannot be. As a rule of thumb, data should be available for more detailed spatial areas than those being analyzed, which limits the effects of changing definitions over time.

Do these areas cover the entire country? The most recognized areas are cities and metropolitan areas, but these either do not extend beyond urbanized areas or are defined using administrative boundaries, such as counties, that bear little resemblance to the areas after which they are named.

How recognizable are the areas? We could easily define arbitrary areas such as grid or hex shapes, but readers of the analysis will need to attach meaning to those cells, which means cluttering up the maps with labels. Because such areas ignore natural boundaries (coastlines, lakes), they also create interpretation issues when clipped.
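
The interaction question above can be made concrete with a simple self-containment measure: the share of each area's resident workers who also work in that area. What follows is a minimal sketch rather than a method we are prescribing; the function name `self_containment`, the area labels, and the commuting counts are all invented for illustration.

```python
from collections import defaultdict


def self_containment(flows):
    """Share of each area's resident workers who work in their home area.

    flows: {(home_area, work_area): number_of_workers}
    Returns {home_area: share between 0 and 1}.
    """
    resident_workers = defaultdict(int)  # all workers living in the area
    stay_at_home = defaultdict(int)      # those who also work there
    for (home, work), count in flows.items():
        resident_workers[home] += count
        if home == work:
            stay_at_home[home] += count
    return {
        area: stay_at_home[area] / total
        for area, total in resident_workers.items()
        if total > 0
    }


# Hypothetical commuting flows between two made-up areas, "A" and "B".
flows = {
    ("A", "A"): 800,
    ("A", "B"): 200,
    ("B", "B"): 950,
    ("B", "A"): 50,
}
containment = self_containment(flows)
```

In this invented example, area "A" keeps 80% of its resident workers and area "B" keeps 95%. As areas get smaller these shares tend to fall, which is one way to see interaction effects starting to dominate.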

Life is full of trade-offs, as the spatial analyst knows all too well. The obvious question is: do we have data over a long time series with consistently defined areas?

Most change maps in the US are at the state level because state boundaries are stable, but if we are honest, this results in a boring map built on a lopsided set of comparisons. Treating Texas and Rhode Island as somehow equivalent spatial units doesn’t really work.

For any other geographic level, the stability of areas becomes a substantial factor. To undertake work below the state level, the values from previous censuses must be re-estimated on the current boundaries. Even when done well, this introduces error into the values for small areas, so it is vital that these estimates be made at the most detailed geography possible. Even then, the results must be interpreted with caution: areas must be re-estimated with each census, and the accumulated error in older census data can become significant.
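
To illustrate what re-estimation on current boundaries involves, here is a minimal sketch of one common approach, a weighted crosswalk that apportions each old area's value among the current areas it overlaps. It is not a description of our production method. The function name `reestimate`, the area identifiers, the population counts, and the weights are all hypothetical; in practice the weights would be derived from the most detailed geography available.

```python
from collections import defaultdict


def reestimate(old_values, crosswalk, tol=1e-9):
    """Apportion values reported on old areas to current areas.

    old_values: {old_id: value}
    crosswalk:  {old_id: {new_id: weight}}, where the weights for each
                old_id are the share of that area assigned to each new
                area and must sum to 1 so totals are preserved.
    """
    new_values = defaultdict(float)
    for old_id, value in old_values.items():
        weights = crosswalk[old_id]
        total = sum(weights.values())
        if abs(total - 1.0) > tol:
            raise ValueError(
                f"weights for {old_id!r} sum to {total}, expected 1"
            )
        for new_id, weight in weights.items():
            new_values[new_id] += value * weight
    return dict(new_values)


# Hypothetical 1970 populations for two old areas (invented numbers).
pop_1970 = {"old_A": 1000, "old_B": 2000}

# Hypothetical weights: old_A falls entirely inside current area X,
# while old_B is split between current areas X and Y.
crosswalk = {
    "old_A": {"X": 1.0},
    "old_B": {"X": 0.25, "Y": 0.75},
}

pop_1970_on_current = reestimate(pop_1970, crosswalk)
```

The national total is preserved by construction, but the split of `old_B` is only as good as its weights. Every such split is an estimate, and repeating the exercise at each census compounds the error in the oldest data.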

Despite the issues and inherent error, we believe strongly that the effort to maintain historical data is worth it. Only by understanding how areas evolve over time can we begin to predict the effects of technological and policy change on the landscape. This is why we maintain and distribute historical data back as far as 1970 that has been painstakingly re-estimated to current boundaries.