Towards automating the initial screening phase of a systematic review.
Systematic review authors synthesize research to guide clinicians in their practice of evidence-based medicine. During an initial screening phase, teammates independently identify provisionally eligible studies by reading the same set of hundreds, and sometimes thousands, of citations. We investigated whether supervised machine learning methods can reduce their workload. We also extended earlier research by including observational studies of a rare condition. To build training and test sets, we used annotated citations from a search conducted for an in-progress Cochrane systematic review. We extracted features from titles, abstracts, and metadata, then trained, optimized, and tested several classifiers with respect to mean performance based on 10-fold cross-validation. In the training condition, the evolutionary support vector machine (EvoSVM) with an Epanechnikov or radial kernel was the best classifier: mean recall was 100%, and mean precision was 48% and 41%, respectively. In the test condition, EvoSVM performance degraded: mean recall was 77%, and mean precision ranged from 26% to 37%. Because near-perfect recall is essential in this context, we conclude that supervised machine learning methods may be useful for reducing workload under certain conditions.
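
As a rough illustration of how mean recall and mean precision over 10 folds could be computed, the following Python sketch uses only the standard library. The function names (`precision_recall`, `kfold_indices`, `mean_cv_scores`) and the toy labels are hypothetical and are not taken from the study. Contiguous, unshuffled folds are a simplifying assumption.

```python
from statistics import mean


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = provisionally eligible."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Define a score as 0.0 when its denominator is zero (an assumption of this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def kfold_indices(n_items, k=10):
    """Split range(n_items) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_items, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mean_cv_scores(y_true, y_pred, k=10):
    """Average per-fold precision and recall, given held-out predictions for every item."""
    scores = [
        precision_recall([y_true[i] for i in fold], [y_pred[i] for i in fold])
        for fold in kfold_indices(len(y_true), k)
    ]
    return mean(p for p, _ in scores), mean(r for _, r in scores)


# Toy data (hypothetical): 20 citations, half eligible; the classifier flags all of them.
toy_true = [1, 0] * 10
toy_pred = [1, 1] * 10
mean_precision, mean_recall = mean_cv_scores(toy_true, toy_pred, k=10)
```

In this toy case every eligible citation is retrieved, so mean recall is perfect while mean precision is only moderate. This is the kind of trade-off the screening context favors, since a missed eligible study is costlier than an extra citation to read.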