Lexical Tools

Get Unicode Base Synonym

  • Short Description: Get the Unicode base synonym of the input.

  • Full Description:

    This flow returns the Unicode base synonym of the input by table lookup. The mapping table is defined in the file $LVG/data/Unicode/unicodeSynonymMap.data. Users may add to or modify the default entries in this file to suit their applications. Please refer to the design documents on getting Unicode synonyms for details.

    When the -m flag is specified, the detailed mutate operations for each character of the input string are appended after the standard set of lvg output fields. This flow has two basic mutate operations, as shown in the following table:

    Operation   Description            Example
    NO          No operation           µ -> µ
    MP          Table lookup mapping   μ -> µ


  • Difference:

    Since 2008, this flow has been simplified relative to previous versions: it now returns only the base Unicode synonym from the defined mapping table. Please note that the output of this flow is Unicode (not ASCII).

  • Features:
    1. Get the Unicode base synonym of the input term.


  • Symbol: q4

  • Examples:
    
    shell> lvg -f:q4 -m
    µ
    µ|µ|2047|16777215|q4|1|NO|
    μ
    μ|µ|2047|16777215|q4|1|MP|
    
    

  • Implementation Logic:
    1. Check if the character is in the Unicode synonym mapping table:
      • if yes, return the mapped Unicode base synonym
      • if no, return the original input character
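
    The lookup logic above can be sketched in Java as follows. This is a minimal illustration, not the code in ToGetUnicodeSynonyms.java: the class name, method names, and the in-memory map are hypothetical, and the single table entry simply mirrors the μ -> µ example, assuming the input is GREEK SMALL LETTER MU (U+03BC) and the base synonym is MICRO SIGN (U+00B5). Loading the table from unicodeSynonymMap.data is omitted because the file format is not described here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnicodeBaseSynonymSketch {
    // Hypothetical in-memory stand-in for the mapping table. The real table
    // lives in unicodeSynonymMap.data; this map is an assumption for illustration.
    private static final Map<Integer, Integer> SYNONYM_MAP = new HashMap<>();
    static {
        // Assumed entry: GREEK SMALL LETTER MU (U+03BC) -> MICRO SIGN (U+00B5)
        SYNONYM_MAP.put(0x03BC, 0x00B5);
    }

    /** Replaces each mapped code point with its base synonym; others pass through unchanged. */
    public static String baseSynonym(String input) {
        StringBuilder out = new StringBuilder();
        input.codePoints()
             .forEach(cp -> out.appendCodePoint(SYNONYM_MAP.getOrDefault(cp, cp)));
        return out.toString();
    }

    /** Returns one operation name per code point: "MP" if found in the table, "NO" otherwise. */
    public static List<String> operations(String input) {
        List<String> ops = new ArrayList<>();
        input.codePoints()
             .forEach(cp -> ops.add(SYNONYM_MAP.containsKey(cp) ? "MP" : "NO"));
        return ops;
    }

    public static void main(String[] args) {
        String in = "5 \u03BCm";
        System.out.println(baseSynonym(in) + " " + operations(in));
    }
}
```

    The sketch works on code points rather than char values so that characters outside the Basic Multilingual Plane are handled as single units. How lvg formats the per-character operations for multi-character input is not shown here, so the list returned by operations() is only an illustrative representation.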

  • Source Code: ToGetUnicodeSynonyms.java

  • Hierarchy: Object -> Transformation -> ToGetUnicodeSynonyms