Clinical Text De-Identification Research
The Privacy Rule of the Health Insurance Portability and Accountability Act (HIPAA) requires that clinical documents be stripped of personally identifying information before they can be released to researchers and others. We have been developing a software tool to de-identify clinical records, which we have named NLM Scrubber. Version 1.0 of the system recognizes and redacts patient names, alphanumeric identifiers, addresses, and dates. NLM Scrubber de-identifies about 99% of these identifiers and preserves 99% of the health information text that contains no personal identifiers, when de-identified provider names are not counted as false positives. We plan to release the system as an open source tool in early 2014.
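To make the two reported rates concrete, here is a minimal Python sketch of how such figures could be computed at the token level. It is an illustration only, not NLM Scrubber's evaluation code: the `Token` fields, the `deid_rates` function, and the sample sentence are all hypothetical, and treating redacted provider names as excluded from the false-positive count is one assumed reading of the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    is_identifier: bool  # gold label: token is a personal identifier
    is_provider: bool    # gold label: token is part of a provider name
    redacted: bool       # system output: token was redacted


def deid_rates(tokens: List[Token]) -> Tuple[float, float]:
    """Return (de-identification rate, text-conservation rate)."""
    identifiers = [t for t in tokens if t.is_identifier]
    # Assumption: redacted provider names are dropped from the
    # non-identifier pool so they are not counted as false positives.
    non_identifiers = [
        t for t in tokens
        if not t.is_identifier and not (t.is_provider and t.redacted)
    ]
    if not identifiers or not non_identifiers:
        raise ValueError("need at least one token of each kind")
    deid_rate = sum(t.redacted for t in identifiers) / len(identifiers)
    kept = sum(not t.redacted for t in non_identifiers)
    conservation_rate = kept / len(non_identifiers)
    return deid_rate, conservation_rate


# Hypothetical sample: "John Smith seen by Dr. Jones for cough"
sample = [
    Token("John", True, False, True),     # identifier, redacted
    Token("Smith", True, False, False),   # identifier, missed
    Token("seen", False, False, False),
    Token("by", False, False, False),
    Token("Dr.", False, False, False),
    Token("Jones", False, True, True),    # provider name, redacted
    Token("for", False, False, False),
    Token("cough", False, False, True),   # health term wrongly redacted
]

print(deid_rates(sample))
```

In this toy sample one of the two patient-name tokens is redacted and four of the five remaining non-identifier tokens are kept, so the two rates are 0.5 and 0.8. The redacted provider name "Jones" affects neither figure.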