Household survey programs often publish georeferenced microdata, but anonymization methods such as location obfuscation hinder augmenting the data with local auxiliary information. This paper proposes an alternative strategy that releases two datasets: (1) the original microdata without geographic identifiers for non-representative results and (2) synthetic microdata with the original cluster locations. Experiments using 2011 Costa Rican census data and satellite information show that this strategy reduces re-identification risk by 60-80%, even with multiple disclosed attributes, while maintaining data utility.
Publisher
Humanities & Social Sciences Communications
Published On
May 09, 2023
Authors
Till Koebe, Alejandra Arias-Salazar, Timo Schmid
Tags
data anonymization
georeferenced microdata
data utility
re-identification risk
synthetic microdata