Converting Unicode Lexicon and Lexical Tools for ASCII NLP Applications

Lu C, Browne AC
AMIA Annu Symp Proc 2011:1870.
Abstract: 

The NLP SPECIALIST Lexicon and Lexical Tools, distributed by the National Library of Medicine (NLM), have been released in Unicode (UTF-8) format since 2006. The Lexicon is used as a corpus, while the Lexical Tools are used as software packages in NLP (Natural Language Processing) projects. Some NLP projects still handle only 7-bit ASCII characters. This paper describes how to convert the UTF-8 Lexicon and integrate the Lexical Tools into a pure ASCII NLP project, MetaMap.
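
As a rough illustration of what reducing UTF-8 text to 7-bit ASCII can involve, the sketch below uses Python's standard unicodedata module. It is not the procedure described in the paper and does not use the Lexical Tools: the function name to_ascii and the "?" placeholder are hypothetical choices made for this example. It assumes that compatibility decomposition (NFKD) followed by dropping combining marks is an acceptable approximation, and that any remaining non-ASCII character may be replaced by a placeholder.

```python
import unicodedata


def to_ascii(text: str, placeholder: str = "?") -> str:
    """Best-effort reduction of Unicode text to 7-bit ASCII (illustrative only)."""
    out = []
    # NFKD splits accented letters into base letter + combining mark
    # and expands compatibility characters such as ligatures.
    for ch in unicodedata.normalize("NFKD", text):
        if ord(ch) < 128:
            out.append(ch)            # already ASCII, keep as-is
        elif unicodedata.combining(ch):
            continue                  # drop combining diacritics
        else:
            out.append(placeholder)   # no ASCII equivalent found
    return "".join(out)
```

Under these assumptions a term such as "Sjögren" reduces to "Sjogren", while a character without a decomposition, such as a Greek letter, becomes the placeholder. A real pipeline would likely need a curated mapping for such characters rather than a single placeholder.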

Lu C, Browne AC. Converting Unicode Lexicon and Lexical Tools for ASCII NLP Applications. AMIA Annu Symp Proc 2011:1870.