Named Entity Recognition in Affiliations of Biomedical Articles Using Statistics and HMM Classifiers.


Kim J, Thoma GR

The 2016 International Conference on Data Mining (DMIN2016), Las Vegas, USA, pp. 236-241, July, 2016.

Abstract:

This paper proposes an automated algorithm that extracts authors’ information from the affiliations of biomedical journal articles in MEDLINE® citations. The algorithm collects the words of an affiliation, estimates features for each word, and uses a supervised machine-learning method, the Hidden Markov Model (HMM), together with heuristic rules to assign each word one of seven labels such as city, state, and country. Eleven word lists are collected from a 1,767-item training data set to train and test the algorithm. Each list contains between 100 and 44,000 words. Experimental results for the proposed algorithms on a testing set of 1,022 affiliations show 94.23% and 93.44% accuracy.
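
As a rough illustration of the HMM labeling step described in the abstract, the sketch below tags the words of an affiliation using Viterbi decoding. It is not the authors' implementation. The four-label subset, the miniature word lists, and every probability are hypothetical values chosen by hand for the example. In a supervised setting these parameters would be estimated from labeled training data, and the paper's heuristic rules are not modeled here.

```python
import math

# Hypothetical label subset; the abstract mentions seven labels but names only a few.
LABELS = ["ORG", "CITY", "STATE", "COUNTRY"]

# Hypothetical miniature word lists standing in for the paper's eleven lists.
WORD_LISTS = {
    "city": {"bethesda", "boston"},
    "state": {"maryland", "md", "massachusetts"},
    "country": {"usa", "canada"},
}

def feature(word):
    """Map a word to a coarse feature: the first word list containing it, else 'other'."""
    w = word.lower().strip(".,")
    for name, words in WORD_LISTS.items():
        if w in words:
            return name
    return "other"

# Hand-set (assumed) HMM parameters: start, transition, and emission probabilities.
START = {"ORG": 0.85, "CITY": 0.05, "STATE": 0.05, "COUNTRY": 0.05}
TRANS = {
    "ORG": {"ORG": 0.6, "CITY": 0.3, "STATE": 0.05, "COUNTRY": 0.05},
    "CITY": {"ORG": 0.05, "CITY": 0.2, "STATE": 0.6, "COUNTRY": 0.15},
    "STATE": {"ORG": 0.05, "CITY": 0.05, "STATE": 0.2, "COUNTRY": 0.7},
    "COUNTRY": {"ORG": 0.1, "CITY": 0.1, "STATE": 0.1, "COUNTRY": 0.7},
}
EMIT = {
    "ORG": {"other": 0.85, "city": 0.05, "state": 0.05, "country": 0.05},
    "CITY": {"other": 0.15, "city": 0.75, "state": 0.05, "country": 0.05},
    "STATE": {"other": 0.1, "city": 0.05, "state": 0.8, "country": 0.05},
    "COUNTRY": {"other": 0.1, "city": 0.05, "state": 0.05, "country": 0.8},
}

def viterbi(words):
    """Return the most probable label sequence for the given words."""
    obs = [feature(w) for w in words]
    if not obs:
        return []
    # Log-probabilities avoid numeric underflow on long sequences.
    score = [{s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in LABELS}]
    back = []
    for o in obs[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in LABELS:
            # Best predecessor label for arriving in label s at this position.
            best = max(LABELS, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + math.log(EMIT[s][o])
            ptr[s] = best
        score.append(cur)
        back.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

if __name__ == "__main__":
    print(viterbi(["National", "Library", "of", "Medicine", "Bethesda", "MD", "USA"]))
```

With these hand-set numbers, the words "National Library of Medicine Bethesda MD USA" are labeled ORG, ORG, ORG, ORG, CITY, STATE, COUNTRY. The word-list lookup plays the role of the per-word features mentioned above, and the transition table encodes the usual left-to-right order of an affiliation.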


