Lexical Tools

Norm Unicode to ASCII with Synonym Option

  • Introduction:
    This normalization converts a Unicode string to pure ASCII. It is equivalent to Unicode synonym conversion followed by Unicode Norm. In other words:
    • First, it converts Unicode characters to their defined synonym base (-f:q4)
    • Then, it uses core Norm to normalize the converted Unicode string (-f:q7)
    • Last, it converts the remaining non-ASCII Unicode characters to ![Unicode Name]! format (-f:q3)

    The main advantages of using this Norm are:

    • Pure ASCII results
    • Additional normalization through Unicode synonyms
    • Preserves Unicode information
      Unicode characters can be recovered from their ![Unicode Name]! representation
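
To illustrate the last point, the following is a minimal Python sketch of how a ![Unicode Name]! token could be turned back into its character. It is not part of Lexical Tools: the function name restore_unicode and the regular expression are assumptions made for this example, and it relies on Python's standard unicodedata module, whose name table may differ in detail from the one the tool uses.

```python
import re
import unicodedata

# Matches tokens of the form ![SOME UNICODE NAME]!
# (assumed token shape, based on the format described above).
NAME_TOKEN = re.compile(r"!\[([^\]]+)\]!")

def restore_unicode(text):
    """Replace each ![Unicode Name]! token with the character of that name."""
    def replace(match):
        try:
            return unicodedata.lookup(match.group(1))
        except KeyError:
            # Unknown name: leave the token untouched rather than guess.
            return match.group(0)
    return NAME_TOKEN.sub(replace, text)

print(restore_unicode("![GREEK SMALL LETTER ALPHA]!-helix"))
```

Tokens whose names are not found in Python's Unicode database are left as they are, so the function is safe to run on text that contains unrelated bracketed strings.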

  • Algorithm:
    • Convert Unicode characters to Unicode synonym base
    • Perform core Norm
    • Convert non-ASCII characters in the result of the previous step to their Unicode names, written as ![Unicode Name]!
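
The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Lexical Tools implementation: the SYNONYM_BASE table is a tiny hypothetical stand-in for the tool's synonym data, core_norm_stand_in approximates core Norm with NFKD decomposition plus removal of combining marks (the real core Norm uses its own mapping tables), and the ![U+XXXX]! fallback for unnamed characters is invented for this example.

```python
import unicodedata

# Hypothetical synonym table (assumption): the real tool ships its own data.
SYNONYM_BASE = {
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2013": "-",  # EN DASH -> HYPHEN-MINUS
}

def to_synonym_base(text):
    """Step 1: map each character to its synonym base, if one is defined."""
    return "".join(SYNONYM_BASE.get(ch, ch) for ch in text)

def core_norm_stand_in(text):
    """Step 2 (stand-in): NFKD-decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def non_ascii_to_name(text):
    """Step 3: write every remaining non-ASCII character as ![Unicode Name]!."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            name = unicodedata.name(ch, None)
            # Fallback for unnamed characters is an assumption of this sketch.
            out.append(f"![{name}]!" if name else f"![U+{ord(ch):04X}]!")
    return "".join(out)

def norm_to_ascii(text):
    """Run the three steps in the order given in the algorithm."""
    return non_ascii_to_name(core_norm_stand_in(to_synonym_base(text)))

print(norm_to_ascii("caf\u00e9 \u03b1\u2013helix"))
```

Running the steps in this order means that only characters with no synonym and no ASCII decomposition reach the final step, so the name tokens are kept for the cases where information would otherwise be lost.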

  • References: