Week 9 - Sensitive data

Learning goals

  • Understand the importance of protecting sensitive data and ensuring privacy and confidentiality.
  • Identify and evaluate established approaches and techniques for de-identifying and anonymizing data to mitigate the risk of re-identification.
  • Apply these techniques using an R package (sdcMicro) and quantify the resulting information loss and data utility.

Student notes

Before class, install the sdcMicro package if you choose not to use the servers.

If you are testing with sensitive data, launch the sdcMicro app locally from RStudio rather than using the version hosted on the website, so that the data stays on your own machine.
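The setup described in the notes above could look roughly like the following sketch. It assumes that the tool being launched is the sdcMicro Shiny app (listed under Resources) and that the package is installed from CRAN. The mirror URL is only one possible choice.

```r
# Install sdcMicro only if it is not already available on this machine.
# The CRAN mirror below is an assumption; any mirror should work.
if (!requireNamespace("sdcMicro", quietly = TRUE)) {
  install.packages("sdcMicro", repos = "https://cloud.r-project.org")
}

library(sdcMicro)

# Launch the Shiny interface locally. The interactive() guard means the
# app only starts in an interactive session such as the RStudio console.
if (interactive()) {
  sdcApp()
}
```

Run this way, the app is served from your own R session and opens in your browser or the RStudio viewer.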

Slides and other materials

Resources

  1. sdcMicro Documentation: https://sdcpractice.readthedocs.io/en/latest/intro.html

  2. sdcMicro Shiny app: https://sdcappdocs.readthedocs.io/en/latest/introsdcApp.html

Other useful links can be found on the slides.

Suggested readings

  1. Bledsoe, E. K., Burant, J. B., Higino, G. T., Roche, D. G., Binning, S. A., Finlay, K., … & Srivastava, D. S. (2022). Data rescue: saving environmental data from extinction. Proceedings of the Royal Society B, 289(1979). https://doi.org/10.1098/rspb.2022.0938

  2. Bourgault, B., Tremblay, H., Schloss, I. R., Plante, S., & Archambault, P. (2017). “Commercially Sensitive” Environmental Data: A Case Study of Oil Seep Claims for the Old Harry Prospect in the Gulf of St. Lawrence, Canada. Case Studies in the Environment. https://doi.org/10.1525/cse.2017.sc.454841

  3. Gehrke, J., Kifer, D., Machanavajjhala, A. (2011). ℓ-Diversity. In: van Tilborg, H.C.A., Jajodia, S. (eds) Encyclopedia of Cryptography and Security. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5906-5_899

  4. Samarati, P., & Sweeney, L. (1998). Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. https://dataprivacylab.org/dataprivacy/projects/kanonymity/paper3.pdf

In-class exercise (Day 2)

Instructions for the Whale Entanglement Exercise


This work is licensed under CC BY 4.0
