The SPECIALIST Lexicon

SN Source Model - Semantic Network (WordNet)

I. Introduction

Some existing computational lexicons and thesauri encode antonymy along with other semantic relations for NLP applications. WordNet has one of the most comprehensive collections of antonyms and is used as the source in this model.

II. Design

This model uses three processes to generate aPair candidates:

  • Retrieve aPairs from WordNet
    The Java WordNet Interface (JWI) [Finlayson 2014] is used to retrieve 12,248 antonym pairs from WordNet 3.0 [Miller 1995], [Fellbaum 2005]. Both words of a pair share the same POS, and each pair is in the format [ant-1|ant-2|POS].
  • Add words (single words and multiwords) to the Lexicon
    We add words from the retrieved aPairs that are not yet in the Lexicon to increase recall.
  • Apply aPair criteria
    • unify and remove duplicated aPairs
    • unify spelling variants
    • remove invalid words (combined exclusive filters)
    • filter out Lexicon synonyms (sPairs)
    • remove existing aPairs
    • remove multiwords
    • reformat from the 3-field format to the 10-field format, then sort

The CUI and STI criteria are not applied here: WordNet is itself a semantic network, so its antonym pairs do not need a separate semantic check.
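The filtering steps above can be illustrated with a minimal Java sketch. It is not the code in GenAPairCand.java. The class name APairFilterSketch, its method names, and the convention that a space or underscore marks a multiword are all assumptions made for illustration. The sketch covers only unifying and de-duplicating pairs, filtering sPairs, removing existing aPairs, removing multiwords, and sorting. Spelling-variant unification, invalid-word filters, and the 10-field conversion are left out because their rules are not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of part of the aPair criteria; not the actual implementation. */
final class APairFilterSketch {

    /**
     * Unify a 3-field record (ant-1|ant-2|POS): lowercase every field and put
     * the two antonyms in alphabetical order, so a|b and b|a get the same key.
     */
    static String unify(String record) {
        String[] f = record.trim().split("\\|");
        if (f.length != 3) {
            throw new IllegalArgumentException("expected ant-1|ant-2|POS: " + record);
        }
        String a = f[0].trim().toLowerCase(Locale.ROOT);
        String b = f[1].trim().toLowerCase(Locale.ROOT);
        String pos = f[2].trim().toLowerCase(Locale.ROOT);
        return a.compareTo(b) <= 0
                ? a + "|" + b + "|" + pos
                : b + "|" + a + "|" + pos;
    }

    /** Assumption: a word is a multiword if it contains a space or an underscore. */
    private static boolean isMultiword(String word) {
        return word.indexOf(' ') >= 0 || word.indexOf('_') >= 0;
    }

    /** True if either antonym of a unified record is a multiword. */
    static boolean hasMultiword(String unified) {
        String[] f = unified.split("\\|");
        return isMultiword(f[0]) || isMultiword(f[1]);
    }

    /**
     * Apply the sketched criteria. sPairs and knownAPairs must already hold
     * records in the unified form produced by unify().
     */
    static List<String> filter(List<String> records,
                               Set<String> sPairs,
                               Set<String> knownAPairs) {
        Set<String> kept = new TreeSet<>(); // removes duplicates and sorts
        for (String record : records) {
            String key = unify(record);
            if (sPairs.contains(key)          // Lexicon synonyms
                    || knownAPairs.contains(key) // aPairs already collected
                    || hasMultiword(key)) {      // multiword antonyms
                continue;
            }
            kept.add(key);
        }
        return new ArrayList<>(kept);
    }
}
```

A sorted set is used here so that duplicate removal and sorting happen in one structure. The real pipeline sorts after the 10-field conversion, so the sort order in this sketch is only indicative.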

Some examples of retrieved aPair candidates are shown in the table below. Please see the design documents for details.

Ant-1     | Ant-2       | POS
admit     | deny        | verb
agitation | calmness    | noun
acutely   | chronically | adv
wrong     | right       | adj
...       | ...         | ...

III. Implementation

The Java source code is in the WordNetAPairs directory:

  • GenAPairCand.java

These candidates are converted to the standard 10-field format and sent to linguists for tagging and further processing.