Methods for Ensuring that Statistical Information does not Reveal Underlying Individual Data
Paul B. Massell (U.S. Census Bureau), firstname.lastname@example.org
Many federal agencies collect information from individual persons, households, and companies with the goal of producing data products that statistically summarize the individuals' records, often in the form of tables but sometimes in another form, such as statistical models. Another type of data product occasionally released is microdata: a subset of the collected records in which, typically, key identifying fields are removed and other fields are smoothed so that they cannot be used to identify an individual. Even when only statistical summaries are released, the agency must take care to protect the confidentiality of the underlying individual data. This is needed, for example, when tables have cells that represent totals (e.g., income) based on a very small number of individuals, since such a total can closely approximate a single individual's value. We give a brief overview of methods used for protecting the individuals' data underlying statistical tables. We conjecture that such methods may be useful in the development of statistical (profiling) models for identifying "bad guys," where there may be a strong legal and/or national security interest in not revealing any of the underlying individuals' data.
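
To make the small-cell problem concrete, the following is a minimal sketch of one commonly described protection, a minimum-contributor threshold rule, in which a cell total is withheld when too few individuals contribute to it. The abstract does not say which methods the talk covers, so this is an illustration only; the function name build_table, the threshold of 3, and the sample records are all hypothetical.

```python
from collections import defaultdict

# Hypothetical minimum number of contributors a cell needs before its
# total is published (an assumption for illustration, not an agency rule).
MIN_CONTRIBUTORS = 3


def build_table(records, threshold=MIN_CONTRIBUTORS):
    """Sum income by cell and withhold totals for cells with too few contributors.

    records: iterable of (cell_label, income) pairs.
    Returns a dict mapping cell_label to the total, or None if suppressed.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, income in records:
        totals[cell] += income
        counts[cell] += 1
    # None marks a suppressed cell.
    return {
        cell: (totals[cell] if counts[cell] >= threshold else None)
        for cell in totals
    }


# Made-up records: cell "A" has three contributors, cell "B" only two.
records = [
    ("A", 40000.0),
    ("A", 52000.0),
    ("A", 61000.0),
    ("B", 250000.0),
    ("B", 30000.0),
]
table = build_table(records)
```

Here cell "A" is published and cell "B" is withheld. Withholding a cell on its own may not be enough: if row or column totals are also published, a suppressed value can sometimes be recovered by subtraction, so additional cells may need to be withheld as well.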