We admit we had never heard of the Office of Privacy and Open Government, found somewhere inside the Department of Commerce. But we are extremely grateful for their existence, given their publication on June 4 of a policy directive governing the use of disclosure avoidance (DA) techniques within statistical products. It affects “all statistical products disseminated by the Census Bureau and the Bureau of Economic Analysis”. The directive can be read in full at Disclosure Avoidance for Statistical Products | U.S. Department of Commerce.
For a decade, the Census Bureau has spent countless taxpayer dollars studying, developing, and implementing disclosure avoidance techniques. The basic concept was this – if I have a list of people in a census block from another source, I might be able to identify an individual or family within the data. But this is not actually disclosure, since I already know them. All I could ever do is confirm or refine what I already thought I knew.
The methods created to prevent this were very complex and resulted in significant delays in the release of the 2020 census. Worse, they mangled the data to the point where only two things could be known to be true — the dwelling count and whether there were or weren’t persons in group quarters in the area. Anything else you could not rely on, and the result was terminology like ghost blocks, mermaid blocks, and Lord of the Flies blocks. Or blocks where the average number of persons in households with 7+ persons was 6.8.
As if that weren’t bad enough, researchers discovered that the methods didn’t achieve their primary goal. A classic case of closing the barn door after the horse is half a mile down the road.
After the 2020 census, all indicators were pointing towards the Bureau doubling down on these efforts, with discussions even extending to adding DA to sampled products like the ACS. We were pleasantly surprised to see this policy directive, as it gives clear evidence that the fine folks at the Census Bureau have come to their senses.
Obviously, there are laws in place which require the Bureau to maintain the anonymity of Census respondents. The key policy directive reads: “Any use of noise infusion is inconsistent with the Department’s policies”, and the DA used for the 2020 census was all about noise. So, what will be permitted going forward?
- “Coarsening”, which includes methods that “involve reducing the level of detail or specificity of published statistics, such as through rounding, aggregating (grouping), and/or the use of ranges”.
- “Suppression”, to be used as a last resort, which means the redaction of certain values.
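To make the two permitted techniques concrete, here is a minimal sketch of what rounding and suppression might look like for a single block-level count. The directive does not specify a rounding base or a suppression threshold; the function names, the base of 5, and the threshold of 3 are our own assumptions for illustration only.

```python
def round_to_base(count, base=5):
    """Coarsen a count by rounding it to the nearest multiple of `base`.

    The base of 5 is a hypothetical choice, not taken from the directive.
    """
    return base * ((count + base // 2) // base)


def suppress_small(count, threshold=3):
    """Redact (return None for) nonzero counts below `threshold`.

    The threshold of 3 is a hypothetical choice, not taken from the directive.
    Zero is left alone in this sketch.
    """
    return None if 0 < count < threshold else count


# Hypothetical block-level counts, invented for the example.
raw_counts = {"owner_occupied": 13, "renter_occupied": 7, "vacant": 2}

rounded = {k: round_to_base(v) for k, v in raw_counts.items()}
suppressed = {k: suppress_small(v) for k, v in raw_counts.items()}
```

The point is that neither operation adds random noise: a rounded value tells you the range the true value falls in, and a suppressed value tells you nothing rather than something false.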
How this will look in practice is anybody’s guess. If it means that the age distribution at the block level is as simple as <18, 18-64, and 65+ but we know that these are the correct numbers, we will consider this a substantial improvement. We will most certainly not have five-year cohorts by sex and race/ethnicity, which is fine with us since those numbers were unusable.
In many respects, we must thank the Census Bureau for creating a 2020 census product that was unusable as delivered. It forced us to consider and heavily invest in data sources and processes that have taken us well beyond the decennial census and the ACS. Among other things, we are now tracking parcel-level changes quarterly and new building permits monthly, monitoring the projects and progress of the largest single-family and multifamily builders, and using high-resolution imagery for verification. In short, the precision of our demographics has been elevated by an order of magnitude. A noise-free census product, added to our parcel base, will result in even more accurate demographics for the next decade.
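As a rough illustration of the aggregation scenario, the sketch below collapses single-year age counts into the three bands mentioned above. The function name and the sample block are hypothetical, and the actual bands the Bureau will publish are not known.

```python
def coarsen_ages(counts_by_age):
    """Aggregate {age: persons} into three broad bands: <18, 18-64, 65+."""
    bands = {"<18": 0, "18-64": 0, "65+": 0}
    for age, persons in counts_by_age.items():
        if age < 18:
            bands["<18"] += persons
        elif age < 65:
            bands["18-64"] += persons
        else:
            bands["65+"] += persons
    return bands


# Hypothetical block: single year of age -> number of persons.
block = {3: 2, 17: 1, 18: 4, 64: 1, 65: 3, 90: 1}
banded = coarsen_ages(block)
```

Because the bands are simple sums of the true counts, they still add up to the true block total, which is exactly the property the noise-infused 2020 data lacked.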