
Lexical Tools

Annual Release - Data from Lexicon

The following files are derived from the Lexicon and must be installed in the Lvg database first. All of these operations are done under the "$LVG_Components/PreDataBase/" directory.
The whole set of these data files is stored in the "$Lvg_Components/PreDataBase/data/{YEAR}/data/" directory.

shell> 1.LoadLexiconFiles ${YEAR}

=> Loads Lexicon files from the Lexicon to PreDatabase/${DATA_ORG}. These files are used to generate the DB table files for Lexical Tools.

shell>  2.GenerateLexiconFiles ${YEAR}

=> Generates the seven files below, as follows:

  • infl.data
    • $LexBuild/Tools/Lexicon/GenerateInflVars generates $Lexicon/{YEAR}/tables/inflVars.data
    • Copy the above file to infl.data
    • The format of fields of infl.data is:
      Inflected form | Category (in number) | Inflection (in number) | EUI | Base form | Citation form

  • acronym.data
    • $LexBuild/Tools/GenerateTables/GenerateTables generates $LexBuild/Lexicon/{YEAR}/tables/LRABR and puts it in ./data/{YEAR}/dataOrg/.
    • Run ModifyAcronym to change the format of the above file and write the output to ./data/{YEAR}/data/acronym.data
    • The new format of fields of acronym.data is:
      expNpLc | exp | type | acrNpLc | acr

  • properNoun.data
    • $LexBuild/Tools/GenerateTables/GenerateTables generates $LexBuild/Lexicon/{YEAR}/tables/LRPRP and puts it in ./data/{YEAR}/dataOrg/.
    • grep "|noun|proper|" ${LRPRP} | flds 2 | sort -u > ${TAR_DIR}proper
    • Copy proper to ./data/{YEAR}/data/properNoun.data
    • The new format of fields of properNoun.data is:
      proper noun

  • nominalization.data
    • $LexBuild/Tools/GenerateTables/GenerateTables generates $LexBuild/Lexicon/{YEAR}/tables/LRNOM and puts it in ./data/{YEAR}/dataOrg/.
    • Run ModifyNominalization to change the format of the above file and write the output to ./data/{YEAR}/data/nominalization.data
    • The new format of fields of nominalization.data is:
      EUI 1 | term 1 | Category 1 | EUI 2 | term 2 | Category 2

  • derivation.data
    • Copy and update derivation.data from the LEXICON - DM.data
    • The new format of fields of derivation.data is:
      Base 1 | POS 1 | EUI 1 | Base 2 | POS 2 | EUI 2 | Negation | Type | Prefix

  • synonyms.data
    • Copy and update synonyms.data from the LEXICON - SM.data
    • The new format of fields of synonyms.data is:
      index key of Base 1 | Base 1 | POS 1 | Base 2 | POS 2 | CUI

  • antonyms.data
    • Copy and update antonyms.data from the LEXICON - AM.data
    • The new format of fields of antonyms.data is:
      index key of Base 1 | Base 1 | EUI 1 | Base 2 | EUI 2 | POS | Type | Negation | Domain | Source
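
The properNoun.data step above filters rows on "|noun|proper|", keeps field 2, and removes duplicates. A minimal sketch of that pipeline follows. It assumes the table is pipe-delimited and that `cut -d'|' -f2` behaves like the local `flds 2` helper, which is not a standard utility. The function name extract_proper_nouns and the sample rows are hypothetical and are not real Lexicon records.

```shell
#!/bin/sh
# Sketch: extract unique proper nouns from an LRPRP-style table.
# Assumption: fields are pipe-delimited and field 2 holds the term;
# `cut -d'|' -f2` stands in for the site-local `flds 2` helper.
extract_proper_nouns() {
    # $1 = input table; prints the unique proper nouns on stdout
    grep '|noun|proper|' "$1" | cut -d'|' -f2 | LC_ALL=C sort -u
}

# Hypothetical sample rows, for illustration only
sample=$(mktemp)
printf '%s\n' \
    'E0000001|Boston|noun|proper|' \
    'E0000002|aspirin|noun|uncount|' \
    'E0000003|Alzheimer|noun|proper|' \
    'E0000001|Boston|noun|proper|' > "$sample"

# Only the rows tagged noun/proper survive, one term per line, deduplicated
extract_proper_nouns "$sample"
rm -f "$sample"
```

Sorting under LC_ALL=C keeps the output order independent of the local collation settings, which makes year-to-year diffs of properNoun.data easier to read.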

    After the above 7 files are properly generated, follow the steps described below:

    • Copy the above files to "$LVG_DIR/data/tables/" (./bin/MoveLexiconFiles)
    • Run Analyze* to check the maximum sizes of all fields
      • java AnalyzeInflection
      • java AnalyzeAcronym
      • java AnalyzeProperNoun
      • java AnalyzeNominalization
      • java AnalyzeDerivation
      • java AnalyzeSynonym
      • java AnalyzeAntonym
    • Load these data into the Idb and MySql databases
      cd $LVG_DIR/loadDb/bin

      • LoadLexiconToMyIdb

      • LoadLexiconToMySql
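
The Analyze* step above checks the maximum size of every field before the data are loaded. The sketch below illustrates the same kind of check with awk. It is not the Analyze* Java code, and it assumes the data files are pipe-delimited. The function name max_field_sizes and the sample rows are hypothetical.

```shell
#!/bin/sh
# Sketch: report the maximum length of each pipe-delimited field in a file.
# Assumption: this only illustrates a field-size check; it is not the
# Analyze* Java programs named in the steps above.
max_field_sizes() {
    # $1 = pipe-delimited file; prints "field_number<TAB>max_length" per column
    awk -F'|' '
        {
            for (i = 1; i <= NF; i++)
                if (length($i) > max[i]) max[i] = length($i)
            if (NF > n) n = NF      # remember the widest row
        }
        END {
            for (i = 1; i <= n; i++) printf "%d\t%d\n", i, max[i]
        }
    ' "$1"
}

# Hypothetical sample rows in a six-field layout, for illustration only
sample=$(mktemp)
printf '%s\n' \
    'running|1|16|E0000001|run|run' \
    'ran|1|4|E0000001|run|run' > "$sample"

max_field_sizes "$sample"
rm -f "$sample"
```

Comparing these maxima with the column widths declared for the database tables shows whether any field would be truncated on load.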