Not everyone is allowed to see a patient’s data.
Even in a hospital or clinical setting, access to protected health information (PHI) is restricted – as it should be.
But if that same data is “anonymized” or “de-identified” – with all connections back to the patient removed – then it can be used much more widely and with fewer restrictions.
FHIR resources are full of PHI.
- Patient
- Observation
- CarePlan
- Procedure
Almost every resource type can be filled with rich data that points back to an individual patient.
How do you anonymize these resources in such a way that the patient cannot be identified while still retaining value?
Microsoft’s “FHIR and DICOM data anonymization” project is a good place to start.
It’s an open source project that combines a set of FHIRPath expressions with a customizable rules engine, all with the aim of removing or obfuscating individual FHIR elements and attributes.
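To make the idea of "paths plus rules" concrete, here is a minimal Python sketch. It is not the Microsoft library's API: the `RULES` table, the `anonymize` function, and the method names are hypothetical, and plain `ResourceType.element` strings stand in for real FHIRPath expressions. The sketch also redacts any element that no rule mentions, which is a choice made here for safety rather than a statement about how the real tool behaves.

```python
import copy
import hashlib
import hmac

# Hypothetical rule table: (path, method). Paths are simple "Type.element"
# strings, a stand-in for real FHIRPath expressions.
RULES = [
    ("Patient.name", "redact"),
    ("Patient.address", "redact"),
    ("Patient.id", "cryptoHash"),
    ("Patient.gender", "keep"),
]

# Assumption: key management is out of scope for this sketch.
HASH_KEY = b"replace-with-a-secret-key"


def crypto_hash(value: str, key: bytes = HASH_KEY) -> str:
    """Keyed SHA-256 digest: the same input always maps to the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize(resource: dict, rules=RULES) -> dict:
    """Apply top-level rules to a copy; elements with no rule are redacted."""
    result = copy.deepcopy(resource)
    resource_type = result.get("resourceType")

    # Collect the rules that apply to this resource type.
    methods = {}
    for path, method in rules:
        rtype, _, element = path.partition(".")
        if rtype == resource_type:
            methods.setdefault(element, method)  # first rule for an element wins

    for element in list(result):
        if element == "resourceType":
            continue
        method = methods.get(element, "redact")
        if method == "redact":
            del result[element]
        elif method == "cryptoHash":
            result[element] = crypto_hash(str(result[element]))
        # "keep" leaves the element untouched
    return result


patient = {
    "resourceType": "Patient",
    "id": "12345",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "telecom": [{"system": "phone", "value": "555-0100"}],
}
print(anonymize(patient))
```

The name and phone number are gone, the gender survives, and the id becomes a stable token, so two resources for the same patient can still be linked without revealing who that patient is.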
I’ve worked on real-world projects that use this library very effectively, building it into existing applications.
Positioned correctly in your data flow, it can anonymize resources “on the fly” in response to a GET request from a consumer who has limited access.
Or it can be triggered by an event notification that sends a newly updated resource to an external data warehouse.
Run the resource through your anonymizing engine and it leaves behind a shell of a resource that still provides value to consuming applications but with no way to connect that resource back to a real patient.
- A blood pressure reading
- A time of day
- A patient between 50 and 60 years of age
- A hypertension condition
Anonymous but still valuable.
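The "shell" described above can be sketched in a few lines of Python. This is an illustration only: `age_band`, `to_shell`, and the `timeOfDay` and `patientAgeBand` keys are hypothetical names, not FHIR elements or part of any library, and the sample observation is made up. The sketch keeps the reading and the time of day, swaps the birth date for a ten-year band, and drops the subject reference and the calendar date.

```python
from datetime import date


def age_band(birth_date: date, today: date, width: int = 10) -> str:
    """Generalize an exact birth date to a band such as '50-59'."""
    # Subtract one if this year's birthday has not happened yet.
    age = today.year - birth_date.year - (
        (today.month, today.day) < (birth_date.month, birth_date.day)
    )
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def to_shell(observation: dict, birth_date: date, today: date) -> dict:
    """Keep the clinical values and time of day; drop subject and calendar date."""
    # Characters 11..15 of an ISO 8601 timestamp are "HH:MM".
    time_of_day = observation["effectiveDateTime"][11:16]
    return {
        "resourceType": "Observation",
        "code": observation["code"],
        "component": observation["component"],
        "timeOfDay": time_of_day,                       # hypothetical key
        "patientAgeBand": age_band(birth_date, today),  # hypothetical key
    }


observation = {
    "resourceType": "Observation",
    "subject": {"reference": "Patient/12345"},
    "effectiveDateTime": "2024-03-14T08:30:00Z",
    "code": {"text": "Blood pressure"},
    "component": [
        {"code": {"text": "Systolic"}, "valueQuantity": {"value": 142, "unit": "mmHg"}},
        {"code": {"text": "Diastolic"}, "valueQuantity": {"value": 91, "unit": "mmHg"}},
    ],
}

shell = to_shell(observation, birth_date=date(1968, 7, 2), today=date(2024, 3, 14))
print(shell)
```

Whether a ten-year band is coarse enough depends on the size of the population and on what else stays in the record, so treat the width as a parameter to tune rather than a safe default.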
It’s free. It’s open source. And it works.
---