Option | Description | Input | Output | Notes | Option
|
---|
80 |
- Unify and sort aPairs from Antonyms in WordNet
- WordNet.WnAPairFile.java
|
- ${WN_DIR}inData/WnAPairs.data.${WN_YEAR}
|
- ./output/PreCand/WnAPairs.unique.data.${WN_YEAR}
|
- If running for the first time:
- shell> mkdir ./output/PreCand
- Sort and unify aPairs from WordNet
- The format is [antonym-1|antonym-2|POS]
- antonym-1 and antonym-2 are sorted in alphabetical order
- The output file should be identical as long as the same WordNet version (3.) is used, so data from previous years can be reused
| 80
|
81 |
- Not needed for 2023+
- Generate word candidates from aPairs in WordNet
- GenWordCandFromAPairs.java
|
- antCandWordNet.data.b4 (must run step 82 first)
- ${LMW_DIR}/inData/invalidLeadTerms.data.abs
- ${LMW_DIR}/inData/invalidEndTerms.data.abs
- ${LMW_DIR}/inData/invalidLeadEndTermCandidates.data
- ${LMW_DIR}/inData/validLeadTerms.data.pat
- ${LMW_DIR}/inData/validEndTerms.data.pat
|
|
- Word candidates must be completed before completing the aPairs from WordNet, because all antonyms must be in the Lexicon.
- ${LMW_DIR} uses ${PREV_YEAR}
- This step is completed once and is not needed for 2023+.
- The output should be the same because WordNet does not change and we use the same release of MEDLINE and Metathesaurus. The differences are too small, and the effort too great, to justify updating the two filters above.
| 81
|
82 |
- Generate antonym candidates from WordNet
- WordNet.GenAPairCand.java
|
- ${ANT_DIR}/input/antCand.data.tag
- ./output/PreCand/WnAPairs.unique.data
- ${LEXICON_DIR}/input/inflVars.data
- ${LEXICON_DIR}/input/synonym.data
- ${META_DIR}/input/normTermCui.data
- ${META_DIR}/input/MRSTY.RRF
- ${PROJECTS}/LVG/lvg${LVG_YEAR}/data/config/lvg.properties
- ${MW_DIR}/data/${PRE_YEAR}/inData/invalidLeadTerms.data.abs
- ${MW_DIR}/data/${PRE_YEAR}/inData/invalidEndTerms.data.abs
- ${MW_DIR}/data/${PRE_YEAR}/inData/invalidLeadEndTermCandidates.data
|
- ./output/Cand/antCandWordNet.data.all
- ./output/Cand/antCandWordNet.data.b4tag
- ./output/Cand/antCandWordNet.data.notLex
- ./output/Cand/antCandWordNet.data.yes
- ./output/Cand/antCandWordNet.data.no
- ./output/Cand/antCandWordNet.data.tbd
- ./output/Cand/antCandWordNet.data.trap.1.spVar
- ./output/Cand/antCandWordNet.data.trap.2.cf
- ./output/Cand/antCandWordNet.data.trap.3.mw
- ./output/Cand/antCandWordNet.data.trap.4.sp
|
- Retrieve unique aPair candidates from aPairs in WordNet
- Filter out illegal aPairs:
- spelling variants
- combined filter to exclude illegal words
- multiwords
- synonyms
- Check antonym criteria: STI, CUI, etc. (not used in the current 2023 model)
- Convert aPairs from WordNet format (3 fields) to aPair candidate format (10 fields)
- Get EUI by inflVars|pos
- Check if known to the Lexicon
- Auto-tag from previously tagged candidates if known to the Lexicon
- outNo = notLexNo + noNo + yesNo + tbdNo
- TBD should be 0 (antCandWordNet.data.tbd)
- If not, send antCandWordNet.data.tbd to a linguist to tag
| 82
|
83 |
- Validate and fix tags of candidates (SN)
- Antonym.ValidateTaggedCand.java
|
- ./output/candTagged/antCandWordNet.data.tag.tagged
=> link to the latest tagged candidate file
- ${ANT_DIR}/input/domain.data
|
- ./output/candTagged/antCandWordNet.data.tag.fixed
|
- Link (only) tagged candidates to ./candTagged/antCandWordNet.data.tag.tagged
- Run this step until the tagged and fixed files are the same
- The fixed file contains the auto-fixes of [TYPE_TBD] and [DOMAIN_TBD] to [NA] and [DOMAIN_NONE].
- The fixed file is sorted in alphabetical order.
- Manually fix other errors, such as domain.
- Manually copy the fixed file to the tagged file
- Manually copy antCandWordNet.data.tag.tagged to antCandWordNet.data.tag.tagged.${YEAR}
| 83
|
84 |
- Update tagged candidates (SN) to release tagged file
- Antonym.UpdateAllTaggedFile.java
|
- ./output/candTagged/antCandWordNet.data.tag.tagged.${YEAR}
- ${ANT_DIR}/input/antCand.data.tag.${YEAR}
- ${ANT_DIR}/input/domain.data
|
- ${ANT_DIR}/input/antCand.data.tag.updated
- ${ANT_DIR}/input/antCand.data.tag.updated.srcConflict
=> aPairs with the same tag but a different source model
=> count may be > 0
- ${ANT_DIR}/input/antCand.data.tag.updated.tagConflict
=> same aPairs with different tags; send to a linguist to re-tag
=> count must be 0
|
- This step automatically updates the file of all tagged antonym candidates
- Manually copy antCand.data.tag.updated to antCand.data.tag.updated.6.SN
- Manually copy antCand.data.tag.updated to antCand.data.tag.${YEAR}
- The output file is used to generate antonym and negation files for the release.
- Re-run steps 82-84 until all steps pass (tbd = 0 in step 82)
- This model is not completed in 2025.
| 84
|
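
The notes for steps 80 and 82 describe two small mechanical rules: each aPair line [antonym-1|antonym-2|POS] has its two antonyms put in alphabetical order before the list is made unique and sorted, and the step 82 counts must satisfy outNo = notLexNo + noNo + yesNo + tbdNo. The Java sketch below illustrates both. It is not the actual WnAPairFile.java or GenAPairCand.java code: the class name `APairSketch`, its method names, the sample pairs, and the use of plain case-sensitive `String.compareTo` ordering are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; not part of the actual antonym pipeline.
public class APairSketch {

    // Reorder one [antonym-1|antonym-2|POS] line so that the two antonyms
    // are in alphabetical order (assumed: case-sensitive String ordering).
    public static String normalize(String line) {
        String[] fields = line.trim().split("\\|", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }
        String first = fields[0];
        String second = fields[1];
        if (first.compareTo(second) > 0) {
            String tmp = first;
            first = second;
            second = tmp;
        }
        return first + "|" + second + "|" + fields[2];
    }

    // Normalize every non-blank line, drop duplicates, and sort the result.
    public static List<String> unifyAndSort(List<String> lines) {
        TreeSet<String> unique = new TreeSet<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                unique.add(normalize(line));
            }
        }
        return new ArrayList<>(unique);
    }

    // Check the step 82 invariant: outNo = notLexNo + noNo + yesNo + tbdNo.
    public static boolean countsBalance(int outNo, int notLexNo, int noNo,
                                        int yesNo, int tbdNo) {
        return outNo == notLexNo + noNo + yesNo + tbdNo;
    }

    public static void main(String[] args) {
        // Made-up sample pairs; the first two collapse into one entry.
        List<String> input = Arrays.asList(
            "hot|cold|adj", "cold|hot|adj", "up|down|adv");
        for (String pair : unifyAndSort(input)) {
            System.out.println(pair);
        }
    }
}
```

In this sketch, a pair listed in both directions collapses to a single line because normalization happens before the set insertion. That is consistent with the note that the output stays the same whenever the same WordNet version is used.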