
Automatic Classification of Structured Product Labels for Pregnancy Risk Drug Categories, a Machine Learning Approach.

Rodriguez LM, Fushman DD
AMIA Annu Symp Proc. 2015 Nov 5;2015:1093-102. eCollection 2015.
Abstract: 

Using regular expressions and manual review, we processed 18,342 FDA-approved drug product labels to determine whether any of the five standard pregnancy drug risk categories were mentioned in the label. After excluding 81 drugs with multiple risk categories, 83% of the labels had a risk category within the text and 17% of the labels did not. We trained a Sequential Minimal Optimization algorithm on the labels containing pregnancy risk information, segmented into standard document sections. To evaluate the classifier on the testing set, we used the Micromedex drug risk categories. The precautions section had the best performance for assigning drug risk categories, achieving Accuracy 0.79, Precision 0.66, Recall 0.64 and F1 measure 0.65. Missing pregnancy risk categories could be suggested using machine learning algorithms trained on the existing publicly available pregnancy risk information.
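The regular-expression step described in the abstract might look something like the minimal sketch below. The abstract does not give the expressions that were actually used, so the pattern, the name `CATEGORY_PATTERN`, and the helper `find_risk_categories` are all hypothetical and shown only for illustration. The sketch assumes the categories appear in phrases such as "Pregnancy Category C", with the five standard letters A, B, C, D and X.

```python
import re

# Hypothetical pattern (not taken from the paper): matches phrases such as
# "Pregnancy Category C" or "pregnancy category: X", case-insensitively.
CATEGORY_PATTERN = re.compile(
    r"pregnancy\s+category\s*:?\s*([ABCDX])\b",
    re.IGNORECASE,
)


def find_risk_categories(label_text):
    """Return the set of pregnancy risk category letters found in the text."""
    return {m.group(1).upper() for m in CATEGORY_PATTERN.finditer(label_text)}


if __name__ == "__main__":
    sample = "Pregnancy Category C. Use during pregnancy only if clearly needed."
    print(find_risk_categories(sample))
```

Under this sketch, a label that returns an empty set would fall into the group with no risk category in the text, and a label that returns more than one letter would correspond to the multiple-category cases that the abstract says were excluded.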
