
Sub-Term Mapping Tools

About Sub-Term Mapping Tools

Sub-Term Mapping Tools (STMT) is a generic tool set that provides sub-term related features for NLP applications through Java APIs and command-line tools. It is used in query expansion to find sub-terms, prefix sub-terms, the longest prefix sub-term, and synonymous sub-term substitutions. There are six major components used in STMT:

  • Corpus: the collection of terms of specific interest to the project. STMT stores it in a tree structure, with each term as a branch of the tree and each word of the term as a node on that branch.
  • Sub-term: a term that is a subset of another term in the corpus
  • Prefix sub-term: a sub-term that starts at the beginning of the input term
  • The longest prefix sub-term: the longest sub-term that starts at the beginning of the input term
  • Sub-term patterns: the ways of selecting a specified number of sub-terms in the input term
  • Synonymous sub-term substitutions: all new terms generated by replacing sub-terms with their synonymous sub-terms
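
The tree storage and the sub-term lookups can be sketched in a few lines. This is an illustrative Python sketch, not the STMT Java API: the names `TrieNode`, `build_corpus`, and `matches_from` are hypothetical, the tokenization (lowercasing and dropping punctuation) is an assumption, and the terms are taken from the example on this page. It also assumes that a sub-term is a corpus term found as a run of consecutive words in the input term.

```python
import re


class TrieNode:
    """One word of a corpus term; a branch from the root spells a term."""

    def __init__(self):
        self.children = {}        # next word -> TrieNode
        self.is_term_end = False  # True if the path from the root is a full term


def build_corpus(terms):
    """Store each term as a branch, one node per word."""
    root = TrieNode()
    for term in terms:
        node = root
        for word in term.lower().split():
            node = node.children.setdefault(word, TrieNode())
        node.is_term_end = True
    return root


def matches_from(root, words, start):
    """Return every corpus term that begins at words[start], shortest first."""
    found, node = [], root
    for end in range(start, len(words)):
        node = node.children.get(words[end])
        if node is None:
            break
        if node.is_term_end:
            found.append(" ".join(words[start:end + 1]))
    return found


corpus = build_corpus(["chronic", "infectious", "otitis externa"])
# Assumed tokenization: lowercase, drop punctuation such as the comma.
words = re.findall(r"[a-z0-9-]+", "otitis externa, chronic infectious".lower())

sub_terms = [t for i in range(len(words)) for t in matches_from(corpus, words, i)]
prefix_sub_terms = matches_from(corpus, words, 0)
longest_prefix = prefix_sub_terms[-1] if prefix_sub_terms else None

print(sub_terms)         # ['otitis externa', 'chronic', 'infectious']
print(prefix_sub_terms)  # ['otitis externa']
print(longest_prefix)    # otitis externa
```

Walking the tree one word at a time means a lookup stops as soon as no corpus term continues with the next word, which is presumably why a tree suits prefix and longest-prefix queries.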

They are illustrated by the following example:

  • Corpus:
    Terms            Synonymous Terms
    chronic          chron|long-term|persistent|recurrent periodic|relapsing
    infectious       communicable|contagious|infection
    otitis externa   auditory|auditory canal|aural|ear
    ...              ...

  • Input term:
    otitis externa, chronic infectious

  • Sub-terms (3):
    • otitis externa
    • chronic
    • infectious

  • Prefix sub-term (1):
    • otitis externa

  • The longest prefix sub-term (1):
    • otitis externa

  • Sub-term patterns (8), with the selected sub-terms shown in brackets:
    • Zero sub-term patterns (1):
      • otitis externa chronic infectious
    • One sub-term patterns (3):
      • [otitis externa] chronic infectious
      • otitis externa [chronic] infectious
      • otitis externa chronic [infectious]
    • Two sub-term patterns (3):
      • [otitis externa] [chronic] infectious
      • [otitis externa] chronic [infectious]
      • otitis externa [chronic] [infectious]
    • Three sub-term patterns (1):
      • [otitis externa] [chronic] [infectious]

  • Synonymous sub-term substitutions (119 = 12 + 47 + 60):
    • Synonymous sub-term substitutions with one sub-term (12 = 4 + 5 + 3):
      • [otitis externa] chronic infectious (4)
        • auditory chronic infectious
        • auditory canal chronic infectious
        • aural chronic infectious
        • ear chronic infectious
      • otitis externa [chronic] infectious (5)
        • otitis externa chron infectious
        • otitis externa long-term infectious
        • otitis externa persistent infectious
        • otitis externa relapsing infectious
        • otitis externa recurrent periodic infectious
      • otitis externa chronic [infectious] (3)
        • otitis externa chronic communicable
        • otitis externa chronic contagious
        • otitis externa chronic infection
    • Synonymous sub-term substitutions with two sub-terms (47 = 20 + 12 + 15):
      • [otitis externa] [chronic] infectious (20 = 4 x 5)
      • [otitis externa] chronic [infectious] (12 = 4 x 3)
      • otitis externa [chronic] [infectious] (15 = 5 x 3)
    • Synonymous sub-term substitutions with three sub-terms (60):
      • [otitis externa] [chronic] [infectious] (60 = 4 x 5 x 3)
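
The pattern and substitution counts in this example can be reproduced with a short sketch. It is written in Python for illustration and is not the STMT implementation: the names `patterns`, `mark`, and `substitutions` are hypothetical, and it assumes the sub-terms do not overlap and together cover the whole input term, as they do here. Brackets in the output mark the sub-terms selected for substitution.

```python
from itertools import combinations, product

SUB_TERMS = ["otitis externa", "chronic", "infectious"]
SYNONYMS = {
    "otitis externa": ["auditory", "auditory canal", "aural", "ear"],
    "chronic": ["chron", "long-term", "persistent",
                "recurrent periodic", "relapsing"],
    "infectious": ["communicable", "contagious", "infection"],
}


def patterns(sub_terms, size):
    """Every way to select `size` sub-terms, as tuples of list positions."""
    return list(combinations(range(len(sub_terms)), size))


def mark(sub_terms, chosen):
    """Render a pattern with the selected sub-terms in brackets."""
    return " ".join(f"[{t}]" if i in chosen else t
                    for i, t in enumerate(sub_terms))


def substitutions(sub_terms, synonyms, chosen):
    """New terms made by replacing every selected sub-term with a synonym."""
    options = [synonyms[t] if i in chosen else [t]
               for i, t in enumerate(sub_terms)]
    return [" ".join(combo) for combo in product(*options)]


pattern_count = sum(len(patterns(SUB_TERMS, n))
                    for n in range(len(SUB_TERMS) + 1))
# The zero sub-term pattern produces no new term, so counting starts at 1.
counts = {
    n: sum(len(substitutions(SUB_TERMS, SYNONYMS, chosen))
           for chosen in patterns(SUB_TERMS, n))
    for n in range(1, len(SUB_TERMS) + 1)
}

print(pattern_count)  # 8
print(counts)         # {1: 12, 2: 47, 3: 60}
for chosen in patterns(SUB_TERMS, 2):
    print(mark(SUB_TERMS, chosen),
          len(substitutions(SUB_TERMS, SYNONYMS, chosen)))
```

Each pattern's count is the product of the synonym counts of its selected sub-terms, which is where figures such as 20 = 4 x 5 come from.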