You are here
Towards a characterization of apparent contradictions in the biomedical literature using context analysis.
With the substantial growth in the biomedical research literature, a larger number of claims are published daily, some of which seemingly disagree with or contradict prior claims on the same topics. Resolving such contradictions is critical to advancing our understanding of human disease and developing effective treatments. Automated text analysis techniques can facilitate such analysis by extracting claims from the literature, flagging those that are potentially contradictory, and identifying any study characteristics that may explain such contradictions.
Using SemMedDB, our own PubMed-scale repository of semantic predications (subject-relation-object triples), we identified apparent contradictions in the biomedical research literature and developed a categorization of contextual characteristics that explain such contradictions. Clinically relevant semantic predications relating to 20 diseases and involving opposing predicate pairs (e.g., an intervention treats or causes a disease) were retrieved from SemMedDB. After addressing inference, uncertainty, generic concepts, and NLP errors through automatic and manual filtering steps, a set of apparent contradictions were identified and characterized.
We retrieved 117,676 predication instances from 62,360 PubMed abstracts (Jan 1980-Dec 2016). From these instances, automatic filtering steps generated 2,236 candidate contradictory pairs. Through manual analysis, we determined that 58 of these pairs (2.6%) were apparent contradictions. We identified five main categories of contextual characteristics that explain these contradictions: a) internal to the patient, b) external to the patient, c) endogenous/exogenous, d) known controversy, and (e) contradictions in literature. Categories (a) and (b) were subcategorized further (e.g., species, dosage) and accounted for the bulk of the contradictory information.