Week 9 - Sensitive data

Learning goals

Understand the importance of protecting sensitive data and ensuring privacy and confidentiality.
Identify and evaluate established approaches and techniques for de-identifying and anonymizing data to mitigate the risk of re-identification.
Apply these techniques with an R package and quantify the resulting information loss and data utility.
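As a rough illustration of the second and third goals, the sketch below measures k-anonymity (Samarati & Sweeney, 1998) on a toy table before and after generalizing one quasi-identifier. It is plain Python and does not use the course's R package; the records, the choice of age and postal code as quasi-identifiers, the 10-year age bands, and the helper names are all hypothetical and chosen only for the example.

```python
from collections import Counter

# Hypothetical toy records: (age, postal_code, diagnosis). Not real data.
records = [
    (34, "H2X", "flu"),
    (36, "H2X", "asthma"),
    (31, "H2X", "flu"),
    (52, "G1V", "diabetes"),
    (58, "G1V", "flu"),
    (55, "G1V", "asthma"),
]


def k_anonymity(rows, key):
    """Size of the smallest group of rows that share the same quasi-identifier values."""
    counts = Counter(key(row) for row in rows)
    return min(counts.values())


def raw_key(row):
    """Quasi-identifiers as recorded: exact age and postal code."""
    age, postal, _ = row
    return (age, postal)


def generalized_key(row):
    """Quasi-identifiers after generalization: age replaced by a 10-year band."""
    age, postal, _ = row
    decade = (age // 10) * 10
    return (f"{decade}-{decade + 9}", postal)


k_raw = k_anonymity(records, raw_key)
k_generalized = k_anonymity(records, generalized_key)

# A crude proxy for information loss: how many distinct quasi-identifier
# combinations survive generalization (fewer combinations = less detail).
distinct_raw = len({raw_key(row) for row in records})
distinct_generalized = len({generalized_key(row) for row in records})

print(f"k before generalization: {k_raw}")
print(f"k after generalization:  {k_generalized}")
print(f"distinct combinations:   {distinct_raw} -> {distinct_generalized}")
```

Every exact age in the toy table is unique, so each record is identifiable on its own (k = 1). Banding the ages raises k to 3, at the cost of collapsing six distinct combinations into two; that trade-off between disclosure risk and utility is what the exercise asks you to quantify.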

If you plan to work on your own machine rather than the servers, install the sdcMicro package before class.

Demo: South Park

Exercise: Whale Entanglement

Instructions are provided in the .Rmd files.

Other useful links can be found on the slides.

Bledsoe, E. K., Burant, J. B., Higino, G. T., Roche, D. G., Binning, S. A., Finlay, K., … & Srivastava, D. S. (2022). Data rescue: saving environmental data from extinction. Proceedings of the Royal Society B, 289(1979). https://doi.org/10.1098/rspb.2022.0938
Bourgault, B., Tremblay, H., Schloss, I. R., Plante, S., & Archambault, P. (2017). “Commercially Sensitive” Environmental Data: A Case Study of Oil Seep Claims for the Old Harry Prospect in the Gulf of St. Lawrence, Canada. Case Studies in the Environment. https://doi.org/10.1525/cse.2017.sc.454841
Gehrke, J., Kifer, D., & Machanavajjhala, A. (2011). ℓ-Diversity. In: van Tilborg, H.C.A., Jajodia, S. (eds) Encyclopedia of Cryptography and Security. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5906-5_899
Samarati, P., & Sweeney, L. (1998). Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. https://dataprivacylab.org/dataprivacy/projects/kanonymity/paper3.pdf