Spatial Time Series Analysis: A Cautionary Note

Understanding the patterns of change at a local level is always an important component of location research. Is the area growing or declining over time? Have there been significant shifts in the demographic makeup? How is the business landscape changing? Is overall employment growing or shrinking, and how is the mix of businesses changing? A geographic time series analysis could be a good way to identify these changes, but it also comes with some challenges that need to be addressed.

Time series analysis can be difficult for small area geographies for several reasons:

Spatial Definition: the boundaries used to collect and distribute data change with each census, and also between censuses whenever administrative geography (such as counties) changes. Other geographic areas, such as ZIP codes and places, change with alarming regularity.
Attribute Definition: the concepts being measured can change over time. A good example is the ever-changing set of race and ethnicity definitions used by the Census Bureau over the past few decades (which are about to change yet again).
Attribute Collection and Classification: even if the concept has not changed, the way the data are collected, such as the specific questions asked on survey forms, can change over time. Classifications also change regularly, such as the top category in the distribution of income: in 1980 the highest income category was $100,000+, while current tables use $250,000+.
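
As a rough illustration of the classification problem, two income tables with different top categories can only be compared after collapsing both to the lower of the two top categories. The sketch below is a minimal example under stated assumptions: the bracket boundaries other than the two top categories, the household counts, and the helper name collapse_top are all hypothetical, and the dollar values are nominal (no inflation adjustment is attempted).

```python
from collections import defaultdict

# Hypothetical tables: lower bound of each income bracket (USD) -> household count.
# All counts are invented for illustration only.
income_1980 = {0: 400, 25_000: 300, 50_000: 150, 100_000: 20}
income_now = {0: 250, 25_000: 280, 50_000: 260, 100_000: 180, 250_000: 30}


def collapse_top(brackets, top_floor):
    """Merge every bracket whose lower bound is >= top_floor into one open-ended bracket.

    Assumes top_floor is itself a bracket boundary in the table.
    """
    merged = defaultdict(int)
    for floor, count in brackets.items():
        merged[min(floor, top_floor)] += count
    return dict(sorted(merged.items()))


# Use the older, lower top category as the common ceiling.
common_top = 100_000
harmonized_1980 = collapse_top(income_1980, common_top)
harmonized_now = collapse_top(income_now, common_top)
```

After collapsing, both tables share the same brackets, at the cost of losing all detail above the older top category.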

With residential demographics, the preferred method is to standardize historical data on current boundaries. AGS carries census data from 1970 onwards, all harmonized on current boundaries. Each adjustment to the spatial definitions of block groups adds error to the affected data values, so the more an area has changed over time, the more error there is in its historical numbers. Once the data are standardized on current boundaries, care must still be taken to understand and adjust for definitional and categorization changes.
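
One common way to standardize historical counts on current boundaries is to reallocate them through a weighted crosswalk. The sketch below is a generic illustration and not a description of how AGS builds its data: the unit IDs, the counts, the weights, and the function name to_current_boundaries are hypothetical, and each weight is assumed to be the share of a historical unit's population that falls inside a current unit.

```python
from collections import defaultdict

# Hypothetical crosswalk rows: (historical_id, current_id, weight).
# Assumption: weights for each historical unit sum to 1.
crosswalk = [
    ("H1", "C1", 1.0),   # H1 lies entirely inside C1
    ("H2", "C1", 0.25),  # H2 was split between C1 and C2
    ("H2", "C2", 0.75),
]

# Hypothetical historical population counts on the old boundaries.
historical_pop = {"H1": 1000, "H2": 800}


def to_current_boundaries(values, crosswalk_rows):
    """Reallocate historical counts to current units using crosswalk weights."""
    reallocated = defaultdict(float)
    for hist_id, cur_id, weight in crosswalk_rows:
        reallocated[cur_id] += values[hist_id] * weight
    return dict(reallocated)


current_pop = to_current_boundaries(historical_pop, crosswalk)
```

The total is preserved, but any split unit (H2 here) inherits whatever error is in its weights, which is why heavily redrawn areas carry less reliable historical values.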

With historical census data, reasonably reliable comparisons can be made if one understands the limitations. With business establishment data, two further complications merit discussion:

Business classifications are complex and highly detailed, and many companies engage in a broad range of activities. Business statistics are always based on the “primary” business activity – as a broad rule of thumb, the more digits in the NAICS code, the less confidence you should have in it. Classification error is substantial.
Most business data is based on a combination of the County Business Patterns (CBP) file published annually by the Census Bureau and one or more business marketing lists. The CBP is spatially coarse, has considerable data suppression, is weak on the locations of multi-location firms, and does not include many small businesses and government offices. Business lists are typically ‘sticky’: because list revenue is mostly based on record quantity, closed businesses tend to hang around in the file. The classification can be spotty, and employment levels at local branches are often estimated from national values. Some sections of the NAICS coding are more complex than others – retail and government services are straightforward, information services much less so.
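
Because confidence falls as NAICS codes get longer, one practical response is to roll detailed codes up to fewer digits before comparing years. The following is a minimal sketch under stated assumptions: the establishment records and employee counts are invented, and rollup is a hypothetical helper that simply truncates each code to its leading digits.

```python
from collections import Counter

# Hypothetical establishment records: (six-digit NAICS code, employee count).
establishments = [
    ("722511", 40),
    ("722513", 15),
    ("541511", 12),
    ("541512", 8),
]


def rollup(records, digits):
    """Sum employment by the first `digits` characters of each NAICS code."""
    totals = Counter()
    for code, employees in records:
        totals[code[:digits]] += employees
    return dict(totals)


# Three-digit totals are less sensitive to misclassification within a group.
by_three_digits = rollup(establishments, 3)
```

A business miscoded as 722513 rather than 722511 changes the six-digit totals but leaves the three-digit total for 722 untouched.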

The result is that time series analysis for a small area is subject to considerable error. The more detailed the sector being analyzed, the larger the geographic areas should be, and the results should always be viewed in terms of overall pattern rather than focusing on individual areas.
