
The SPECIALIST Lexicon

Antonym - Processes for Annual Release and Stats Reports

I. Set Up

  • base directory: ${ANTONYM_DIR}
  • binary scripts: ./bin
  • data: ./data
    • 0.Antonym
  • Pre-requirements:
    Must complete updates on aPairs from LEX, SD, PD, (TT), CC, SN
shell>cd ${ANTONYM_DIR}/bin
shell>GetAntonyms ${YEAR}

II. Processes

  • Generate aPairs, negation cue words, and antonym files
    Option | Description | Input | Output | Notes | Option
    1
    • generate aPairs from tagged candidates
    • Antonym.GenAPairsFromTagCand.java
    • ${ANT_DIR}/input/antCand.data.tag.${YEAR}
    • ${ANT_DIR}/input/domain.data
    • ${LEX_DIR}/input/LRSPL
    • ./output/aPairs.data
    • This program generates aPairs with all spVars
    • This program removes duplicated aPairs by spVars from different sources
    • The result includes some duplicated aPairs, caused by the same aPair appearing in a different order in different sources. These are handled in step 3.
    • This is the antonym file that contains unique aPairs.
    • manually copy aPairs.data to aPairs.data.${YEAR}
    1
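    The manual copy that ends this step (and the equivalent copies in steps 2 and 3) could be scripted. The following is a minimal sketch, not part of the documented process: the function name snapshot_output and the OUTPUT_DIR variable are hypothetical, and ./output is assumed to be the output directory referred to above.

```shell
# Hypothetical helper: copy <dir>/<name> to <dir>/<name>.<year>.
# OUTPUT_DIR is an assumed variable; it defaults to ./output.
snapshot_output() (
    name=$1
    year=$2
    dir=${OUTPUT_DIR:-./output}
    src="$dir/$name"
    # Refuse to snapshot a missing or empty file.
    if [ ! -s "$src" ]; then
        echo "snapshot_output: $src is missing or empty" >&2
        exit 1
    fi
    # -p preserves the timestamp of the generated file.
    cp -p "$src" "$src.$year"
)
```

    For example, snapshot_output aPairs.data "${YEAR}" would produce ./output/aPairs.data.${YEAR}, and it fails without copying if aPairs.data was not generated.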
    2
    • generate negation cue words from tagged candidates
    • Antonym.GenNegCueWordsFromTagCand.java
    • ${ANT_DIR}/input/antCand.data.tag.${YEAR}
    • ${LEX_DIR}/input/LRSPL
    • ./output/negCueWords.data
    • This is the negation cue word file (unique).
    • manually copy negCueWords.data to negCueWords.data.${YEAR}
    2
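    Because negCueWords.data is expected to be unique, a quick duplicate check before the copy may be useful. The sketch below uses only the standard sort and uniq utilities. The function name list_duplicates is hypothetical, and it assumes one entry per line, which this page does not confirm.

```shell
# Hypothetical helper: print each line that occurs more than once.
# Prints nothing when every line in the file is unique.
list_duplicates() {
    LC_ALL=C sort "$1" | uniq -d
}
```

    Running list_duplicates ./output/negCueWords.data should print nothing if the file is unique under that assumption.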
    3
    • Generate the antonym release file from the results of step 1 (DB table for Lexical Tools)
    • Antonym.GenAntFromAPairs.java
    • ./output/aPairs.data.${YEAR}
    • ./output/antonyms.data

    • ./output/antonyms.data.tagConflict
      => Must be 0, if not:
      • send ./output/antonyms.data.tagConflict to linguist to tag same aPairs.
      • manually fix them in ./input/antCand.data.tag
      • re-run steps 1, 2, and 3 (updating aPairs.data.${YEAR}) until the tag conflict count is 0
      • then fix the tag duplicates.
    • ./output/antonyms.data.tagDuplicate
      => Must be 0, if not:
      • The duplicated tag is caused by spVars.
      • manually review and fix ./input/antCand.data.tag
      • delete the duplicates (keeping the smaller EUI as EUI-1); search only for the EUIs listed in ./input/antonyms.data.tagDuplicate
      • re-run steps 1, 2, and 3 (updating aPairs.data.${YEAR}) until the tag duplicate count is 0
      • then fix the src duplicates.
    • ./output/antonyms.data.srcConflict
      • The program auto-fixes the src in the following order (LEX > SD > PD > CC > SN) when the same aPair is tagged from multiple sources.
      • The fixes are applied to the input (./output/aPairs.data.${YEAR}) and are reflected in the output (./output/antonyms.data).
      • All src conflicts are recorded in the log file ./output/antonyms.data.srcConflict
        => In general, no action is needed, because the program resolves conflicts by reassigning the src and removing the entry that is not needed. However, we can spot-check the following:
        • review conflicts in the log file ./output/antonyms.data.srcConflict
        • check src conflicts in the source input file ./output/aPairs.data.${YEAR} for aPairs with multiple sources
        • ensure only 1 src exists in the output file ./output/antonyms.data
        • all fixed conflicts are kept in the known exceptions for reference (./output/antonyms.data.srcConflict.${YEAR}.${NO}.known).
        • known source conflicts history:
          Year | Exception No.
          2023 | 3
          2024 | 8
          2025 | 68
          2026 | 91
    • This is the antonym release (also used as the DB table for Lexical tools).
    • manually copy antonyms.data to antonyms.data.${YEAR}
    3
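    Two of the step 3 rules can be sketched in shell: the "must be 0" gate on the tag logs and the src order LEX > SD > PD > CC > SN. This is only an illustration, not the logic of Antonym.GenAntFromAPairs.java. The function names log_is_clean and resolve_src are hypothetical. The gate assumes that "0" means the log has no non-blank lines, and the record layout word1|word2|src is an assumed simplification, since this page does not describe the real aPairs format.

```shell
# Hypothetical helper: succeed only if the log file exists and has no
# non-blank lines (assumption: "must be 0" refers to the entry count).
log_is_clean() (
    if [ ! -f "$1" ]; then
        echo "log_is_clean: missing log $1" >&2
        exit 2
    fi
    n=$(grep -c '[^[:space:]]' "$1" || true)
    [ "$n" -eq 0 ]
)

# Hypothetical helper: keep one src per pair, using the order stated above
# (LEX > SD > PD > CC > SN). Assumed record layout: word1|word2|src.
resolve_src() {
    awk -F'|' '
        BEGIN {
            n = split("LEX SD PD CC SN", order, " ")
            for (i = 1; i <= n; i++) rank[order[i]] = i
        }
        !($3 in rank) { next }   # skip records with an unrecognized src
        {
            key = $1 FS $2
            if (!(key in best)) { keys[++count] = key; best[key] = $3 }
            else if (rank[$3] < rank[best[key]]) best[key] = $3
        }
        END {
            # Emit pairs in first-seen order with the winning src.
            for (i = 1; i <= count; i++) print keys[i] FS best[keys[i]]
        }
    ' "$@"
}
```

    As a usage idea, log_is_clean ./output/antonyms.data.tagConflict && log_is_clean ./output/antonyms.data.tagDuplicate could gate the copy to antonyms.data.${YEAR}. In resolve_src, a pair tagged from both SN and LEX keeps LEX.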
    5
    • Get stats on tagged antonym candidate file
    • ${ANT_DIR}/input/antCand.data.tag.${YEAR}
    • ./output/analysis/antCand.data.tag.stats
    • ./output/analysis/domain.out.cand
    • If running for the first time: shell> mkdir ${OUTPUT}/analysis
    • Generate stats and domains from antonym candidate tagged file
    5
    6
    • Get stats on canonical antonym from tagged candidate file
    • ${ANT_DIR}/input/antCand.data.tag.${YEAR}
    • ./output/analysis/antCand.data.tag.canon.stats
    • ./output/analysis/domain.out.cand.canon
    • Generate stats and domains from canonical antonym in tagged file
    6
    7
    • Get stats on antonym file
    • ./output/antonyms.data
    • ./output/antonyms.data.2-10

    • ./output/analysis/antonym.data.stats
    • ./output/analysis/domain.out.antonym
    • Generate stats and domains from antonym file
    • This file is used to update antonym growth.
    7