Robust research depends on high-quality medical, behavioral, and socio-demographic data. Large volumes of personal data are collected on study participants in order to draw reliable conclusions. However, collecting data at this scale creates privacy risks for the participants involved.
This study, carried out by an international team of researchers, describes a new method for assessing the risk that participants in a given study can be re-identified. Applying the model to 15 demographic attributes, the authors found that 99.98% of Americans could be re-identified from a dataset in which they appear.
Research datasets are supposed to be anonymized before they are shared, at which point they are no longer regarded as personal data. However, this study demonstrates that de-identification methods are not robust enough to guarantee participants’ privacy. The authors note that the consequences for participants who are identified are potentially serious: re-identification could affect their insurance status, employment, and relationships.