Anonymizing FHIR Data

Not everyone is allowed to see a patient’s data.

Even in a hospital or clinical setting, access to protected health information (PHI) is restricted, as it should be.

But if that same data is “anonymized” or “de-identified”, with all connections back to the patient removed, then it can be used far more widely and with fewer restrictions.

FHIR resources are full of PHI.

  • Patient
  • Observation
  • CarePlan
  • Procedure

Almost every resource type can be filled with rich data that points back to an individual patient.

How do you anonymize these resources so that the patient cannot be identified but the data still retains its value?

Microsoft’s “FHIR and DICOM data anonymization” project is a good place to start.

It’s an open-source project that combines a set of FHIRPath expressions with a customizable rules engine, all with the aim of removing or obfuscating individual FHIR elements and attributes.
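
To make the idea of "path plus rule" concrete, here is a minimal sketch in Python. It is not the Microsoft project's API or configuration format. The rule names (`redact`, `keep_year`) and the `anonymize` function are hypothetical, and plain dotted paths stand in for real FHIRPath expressions, which are far richer.

```python
import copy

# Hypothetical rule set: (dotted path, method name).
# These simple paths are a stand-in for FHIRPath and only
# address top-level elements of a resource.
RULES = [
    ("Patient.name", "redact"),
    ("Patient.identifier", "redact"),
    ("Patient.address", "redact"),
    ("Patient.birthDate", "keep_year"),
]


def keep_year(value):
    """Generalize an ISO date string such as '1968-04-12' to its year."""
    return str(value)[:4]


METHODS = {
    "redact": lambda value: None,  # None means "remove the element"
    "keep_year": keep_year,
}


def anonymize(resource, rules=RULES):
    """Return an anonymized copy of a FHIR resource held as a dict."""
    result = copy.deepcopy(resource)  # never mutate the caller's resource
    for path, method in rules:
        resource_type, element = path.split(".", 1)
        if result.get("resourceType") != resource_type or element not in result:
            continue  # rule does not apply to this resource
        new_value = METHODS[method](result[element])
        if new_value is None:
            del result[element]
        else:
            result[element] = new_value
    return result


patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "birthDate": "1968-04-12",
    "gender": "female",
}
anonymized = anonymize(patient)
print(anonymized)
```

The point of keeping the rules as data is that you can tighten or loosen them per consumer without touching the engine itself.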

I’ve worked on real-world projects that use this library very effectively, building it into existing applications.

Positioned correctly in your data flow, it can anonymize resources “on the fly” in response to a GET request from a consumer who has limited access.

Or it can be triggered by an event notification that sends a newly updated resource to an external data warehouse.
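
Those two integration points might look something like the sketch below. Everything here is an assumption for illustration: the `phi:read` scope name, the handler functions, and the `strip_identifiers` placeholder, which stands in for whatever anonymizing engine you actually use. A plain list plays the role of the external data warehouse.

```python
def strip_identifiers(resource):
    """Placeholder anonymizer: drops a few directly identifying elements."""
    hidden = {"name", "identifier", "telecom", "address", "photo", "contact"}
    return {key: value for key, value in resource.items() if key not in hidden}


def handle_get(resource, consumer_scopes):
    """Read path: anonymize on the fly unless the consumer has full access."""
    if "phi:read" in consumer_scopes:  # hypothetical scope name
        return resource
    return strip_identifiers(resource)


def handle_update_event(resource, warehouse):
    """Event path: anonymize a newly updated resource before exporting it."""
    warehouse.append(strip_identifiers(resource))


patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
    "gender": "female",
}

warehouse = []  # stand-in for an external data warehouse client

limited_view = handle_get(patient, consumer_scopes={"research:read"})
full_view = handle_get(patient, consumer_scopes={"phi:read"})
handle_update_event(patient, warehouse)

print(limited_view)
print(warehouse)
```

Both paths call the same anonymizer, so a consumer with limited access and the warehouse see the same reduced view of the resource.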

Run the resource through your anonymizing engine and it leaves behind a shell of a resource that still provides value to consuming applications but with no way to connect that resource back to a real patient.

  • A blood pressure reading
  • A time of day
  • A patient between 50 and 60 years of age
  • A hypertension condition
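
As a rough illustration of how a list like this could be produced, the sketch below turns an exact date of birth into a ten-year age band and keeps only the time of day from a timestamp. The function names are my own, and the observation is a flattened, hypothetical dict rather than a full FHIR Observation.

```python
from datetime import date


def age_band(birth_date, on_date, width=10):
    """Return an age range such as '50-59' instead of an exact age."""
    age = on_date.year - birth_date.year
    # Subtract a year if the birthday has not yet occurred on on_date.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        age -= 1
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def summarize(observation, birth_date, on_date):
    """Build a reduced record: reading, time of day, and age band only."""
    return {
        "code": observation["code"],
        "value": observation["value"],
        # Characters 11 to 15 of an ISO timestamp hold HH:MM.
        "timeOfDay": observation["effectiveDateTime"][11:16],
        "ageBand": age_band(birth_date, on_date),
    }


# Simplified, made-up observation for illustration only.
observation = {
    "code": "blood-pressure",
    "value": "150/95 mmHg",
    "effectiveDateTime": "2024-03-05T08:30:00Z",
}

shell = summarize(
    observation,
    birth_date=date(1968, 4, 12),
    on_date=date(2024, 3, 5),
)
print(shell)
```

Dropping the calendar date and the exact age removes two of the easiest ways to link a reading back to a person, while the clinical signal stays intact.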

Anonymous but still valuable.

Here’s the project.

It’s free. It’s open source. And it works.

---

Discover more from Darren Devitt
