CC Source Model - Co-occurrence in Corpus (MEDLINE)
I. Introduction
The co-occurrence hypothesis is one of the most popular approaches for antonym identification [Charles & Miller, 1989; Fellbaum, 1995; Tesfaye, 2015]. In this Co-occurrence in Corpus (CC) model, we first enhanced the co-occurrence patterns from previous research [Justeson and Katz, 1991] to identify 10 co-occurrence patterns. These patterns are derived from a collection of 1000 antonyms from the internet domain [Lu, 2021]. The MEDLINE n-gram set [Lu, 2015] is used as the corpus. The patterns have the format [X keyword Y], where the keyword is one of: -and-, -or-, -to-, -versus-, -than-, -vs-, -from-, -nor-, -and/or-, and -as well as-. High-frequency co-occurring terms from the corpus (MEDLINE n-gram set) that match these patterns, are not Lexicon synonyms [Lu, 2017], have CUIs, and meet the STI rules are retrieved as aPair candidates, such as [above|below|prep], [accept|reject|verb], [sick|well|adj], and [birth|death|noun]. Both the frequency in MEDLINE (word count) and the frequency in the keyword patterns (pattern count) are taken into consideration during this process.
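The [X keyword Y] matching described above can be illustrated with a minimal Java sketch. It is not the project's actual source code. The class name `CoOccurrenceSketch`, its methods, and the `PairStats` holder are hypothetical. The sketch also makes three assumptions: X and Y are single words, word count is the summed n-gram frequency, and pattern count is the number of distinct keywords a pair occurs with. The synonym, CUI, and STI filters are left out.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; names and counting rules are assumptions. */
final class CoOccurrenceSketch {
    // The ten keywords listed in the text.
    static final Set<String> KEYWORDS = Set.of(
        "and", "or", "to", "versus", "than", "vs", "from", "nor", "and/or", "as well as");

    /** Hypothetical per-pair tallies. */
    static final class PairStats {
        long wordCount = 0;                           // summed n-gram frequency
        final Set<String> keywords = new TreeSet<>(); // distinct keywords seen
        int patternCount() { return keywords.size(); }
    }

    /** Returns {x, keyword, y} if the n-gram reads "X keyword Y", else null. */
    static String[] match(String ngram) {
        String[] t = ngram.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (t.length < 3) return null;
        // Everything between the first and last token must be one keyword.
        String keyword = String.join(" ", Arrays.copyOfRange(t, 1, t.length - 1));
        String x = t[0];
        String y = t[t.length - 1];
        if (!KEYWORDS.contains(keyword) || x.equals(y)) return null;
        return new String[] {x, keyword, y};
    }

    /** Adds one n-gram and its corpus frequency to the tallies if it matches. */
    static void accumulate(Map<String, PairStats> stats, String ngram, long frequency) {
        String[] m = match(ngram);
        if (m == null) return;
        // Order-independent key, so "X and Y" and "Y or X" share one tally.
        String key = m[0].compareTo(m[2]) <= 0 ? m[0] + "|" + m[2] : m[2] + "|" + m[0];
        PairStats s = stats.computeIfAbsent(key, k -> new PairStats());
        s.wordCount += frequency;
        s.keywords.add(m[1]);
    }
}
```

Calling `accumulate` once per n-gram line yields, for each unordered pair, a word count and a pattern count. Both can then be thresholded before the remaining filters are applied.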
II. Design
Two MEDLINE n-gram files are used for this model:
Derived pattern details (please see the design documents for details):
Ant-1 | Ant-2 | Co-occurrence Examples |
---|---|---|
normal | abnormal | |
We observed from above table,
III. Implementation
The Java source code is implemented in the Medline directory:
Algorithm:
STI-1 | STI-2 | Frequency |
---|---|---|
T033|Finding | T080|Qualitative Concept | 38 |
T033|Finding | T121|Pharmacologic Substance | 10 |
T033|Finding | T169|Functional Concept | 19 |
T033|Finding | T170|Intellectual Product | 11 |
T033|Finding | T184|Sign or Symptom | 15 |
T078|Idea or Concept | T080|Qualitative Concept | 10 |
T080|Qualitative Concept | T081|Quantitative Concept | 13 |
T080|Qualitative Concept | T082|Spatial Concept | 10 |
T080|Qualitative Concept | T121|Pharmacologic Substance | 10 |
T080|Qualitative Concept | T169|Functional Concept | 37 |
T121|Pharmacologic Substance | T169|Functional Concept | 10 |
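The source does not spell out how this table is applied. One possible reading, offered only as an assumption, is that a candidate pair whose two terms have different semantic types passes the STI check when that pair of types appears in the table. The Java sketch below encodes that reading. The class name `StiRuleSketch` and the method `listed` are hypothetical. Pairs whose terms share a semantic type are not covered by the table, so the sketch does not cover them either.

```java
import java.util.Set;

/** Illustrative sketch only; the lookup rule itself is an assumption. */
final class StiRuleSketch {
    // Semantic-type ID pairs copied from the table above, stored as "lowerId|higherId".
    static final Set<String> LISTED = Set.of(
        "T033|T080", "T033|T121", "T033|T169", "T033|T170", "T033|T184",
        "T078|T080", "T080|T081", "T080|T082", "T080|T121", "T080|T169",
        "T121|T169");

    /** True if the unordered pair of semantic-type IDs appears in the table. */
    static boolean listed(String sti1, String sti2) {
        String key = sti1.compareTo(sti2) <= 0 ? sti1 + "|" + sti2 : sti2 + "|" + sti1;
        return LISTED.contains(key);
    }
}
```

Storing each pair under an order-independent key means `listed("T080", "T033")` and `listed("T033", "T080")` give the same answer, which matches the symmetric nature of antonym pairs.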
IV. References