De-Identification of PHI
I. Protected Health Information (PHI)
The Privacy Rule protects all "individually identifiable health information" held or transmitted by a covered entity or its business associate, in any form or media, whether electronic, paper, or oral. The Privacy Rule calls this information "protected health information (PHI)." PHI that is linked to any of the 18 identifiers listed under the US Health Insurance Portability and Accountability Act (HIPAA) must be treated with special care.
II. De-Identification of PHI
Clinical records are commonly used for medical research. Under the Privacy Rule, these data must be de-identified before they are used. An automated system that removes all 18 identifier elements from PHI is therefore essential for medical research. The general approach to de-identification is:
To evaluate a de-identification system, a gold standard corpus is needed. The gold standard corpus is created by experts who tag the medical records by hand. The following indexes are used for the evaluation:
| | Gold standard: Positive | Gold standard: Negative | |
Test | Positive | TP | FP (Type I Error) | Positive predictive value (precision) = TP / (TP + FP) |
Test | Negative | FN (Type II Error) | TN | Negative predictive value = TN / (FN + TN) |
| | Sensitivity (recall) = TP / (TP + FN) | Specificity = TN / (TN + FP) | |
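The four indexes in the table can be computed directly from a comparison between the gold standard tags and the output of a de-identification system. The following is a minimal sketch, not part of VTT: it assumes each record has already been split into tokens and that both the gold standard and the system supply one flag per token (True meaning "this token is PHI"). The names `Counts`, `count_outcomes`, and `evaluation_indexes` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # PHI in gold standard, flagged by the system
    fp: int  # not PHI in gold standard, flagged by the system (Type I error)
    fn: int  # PHI in gold standard, missed by the system (Type II error)
    tn: int  # not PHI in gold standard, not flagged by the system


def count_outcomes(gold, predicted):
    """Count TP, FP, FN, TN from two equal-length lists of per-token PHI flags."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    return Counts(
        tp=sum(1 for g, p in pairs if g and p),
        fp=sum(1 for g, p in pairs if not g and p),
        fn=sum(1 for g, p in pairs if g and not p),
        tn=sum(1 for g, p in pairs if not g and not p),
    )


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the index is undefined."""
    return numerator / denominator if denominator else None


def evaluation_indexes(c):
    """Compute the four indexes from the table above."""
    return {
        "precision": ratio(c.tp, c.tp + c.fp),    # positive predictive value
        "npv": ratio(c.tn, c.fn + c.tn),          # negative predictive value
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # recall
        "specificity": ratio(c.tn, c.tn + c.fp),
    }


if __name__ == "__main__":
    # Illustrative flags only; not real evaluation data.
    gold = [True, True, False, False, True, False]
    system = [True, False, True, False, True, False]
    print(evaluation_indexes(count_outcomes(gold, system)))
```

For de-identification, sensitivity is usually the index to watch most closely, because every false negative is an identifier that remains in the released text.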
III. How is VTT used?
First, VTT is used as a tool for hand tagging medical records to build the gold standard data set. VTT provides a GUI that eases the human tagging process by showing tagged text in different visual styles (colors, fonts, sizes, etc.).
Second, VTT reads and writes the tags and markup for a specified text from and to a file in the VTT file format. This file format is also used in developing automatic de-identification programs.
The schematic diagram below shows a typical de-identification development project using VTT.