Automated Classification of Author’s Sentiments in Citation Using Machine Learning Techniques: A Preliminary Study.
Scientific papers generally include citations to external sources such as journal articles, books, or Web links to refer to works that are related in an important way to the research. The reason for a citation usually appears in the sentences surrounding the citation tag in the body text, and characterizes the relationship between the citing and cited works as supportive, contrastive, corrective, etc. This can be an important clue for researchers seeking relevant previous work or approaches for a certain research purpose. We propose to develop an automated method that uses machine-learning techniques and linguistic cues to identify the citing author’s sentiments toward the cited external sources, as expressed in citation sentences. As a preliminary study, this paper presents a support vector machine (SVM)-based text categorization technique to classify the author’s sentiments specifically toward Comment-on (CON) articles. CON, a MEDLINE citation field, indicates previously published articles that the authors of a given article comment on, possibly expressing complimentary or contradictory opinions. An SVM with a radial basis function (RBF) kernel is implemented, and input feature vectors for the SVM are created from n-gram word statistics representing the distribution of words in CON sentences. Experiments conducted on a set of CON sentences collected from 414 different online biomedical journal titles show that the SVM with an RBF kernel yields the best result for an input feature vector combining unigram and bigram word statistics.
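To make the feature construction concrete, the following is a minimal sketch of how a sentence could be turned into a combined unigram and bigram count vector, and how an RBF kernel compares two such vectors. It is an illustration under stated assumptions, not the paper's implementation: the whitespace tokenization, the raw-count weighting, the `gamma` value, the two example sentences, and all function names are hypothetical. SVM training itself is omitted.

```python
import math
from collections import Counter


def ngram_counts(sentence, n_values=(1, 2)):
    """Count word n-grams (unigrams and bigrams by default) in one sentence.

    Assumption: lowercasing plus whitespace splitting stands in for
    whatever tokenization the original study used.
    """
    tokens = sentence.lower().split()
    counts = Counter()
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def build_vocabulary(sentences, n_values=(1, 2)):
    """Map every n-gram seen in the sentences to a vector index."""
    grams = sorted({g for s in sentences for g in ngram_counts(s, n_values)})
    return {gram: idx for idx, gram in enumerate(grams)}


def vectorize(sentence, vocab, n_values=(1, 2)):
    """Turn a sentence into a fixed-length vector of raw n-gram counts."""
    vec = [0.0] * len(vocab)
    for gram, count in ngram_counts(sentence, n_values).items():
        if gram in vocab:
            vec[vocab[gram]] = float(count)
    return vec


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel: exp(-gamma * ||x - y||^2). The gamma value is illustrative."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical citation sentences, invented for illustration only.
sentences = [
    "these findings support the earlier report",
    "these findings contradict the earlier report",
]
vocab = build_vocabulary(sentences)
x, y = (vectorize(s, vocab) for s in sentences)
similarity = rbf_kernel(x, y)
```

The two sentences differ in a single word, yet that word changes one unigram and two bigrams in each vector, so the kernel value falls below 1. This suggests why adding bigrams can help separate sentences whose sentiment hinges on a short phrase.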