Lexical Tools

SD-Rule Transaction Details: 2016 to 2017

The SD-Rule transactions are detailed below:

  • The following table shows the transactions for the 11 newly proposed SD-Rules in 2017.

    ID      Proposed New Rule      Source   Results  Rank & Rule 2016          Rank & Rule 2017          Type            Count Change  Accu. Count

    Computer Generated SD-Rules
    01-CG1  sation$|noun|zed$|adj  nomD     Good     None                      5: sation$|noun|zed$|adj  New in 2017     +1            83
    02-CG2  ed$|adj|zation$|noun   nomD     Good     None                      6: ed$|adj|zation$|noun   New in 2017     +1            84
    03-CG3  $|noun|tous$|adj       orgD     Good     None                      28: $|noun|tous$|adj      New in 2017     +1            85
    04-CG4  $|noun|y$|noun         orgD     Good     39: graph$|noun|graphy$   48: h$|noun|hy$|noun      Parent-1-Child  +0            85
    05-CG5  sity$|noun|us$|adj     nomD     Good     53: osity$|noun|ous$|adj  58: osity$|noun|ous$|adj  Parent-1-Child  +0            85
    06-CG6  e$|verb|tion$|noun     nomD     Good     None                      59: e$|verb|tion$|noun    New in 2017     +1            86
    07-CG7  ous$|adj|y$|noun       nomD     Good     68: ous$|adj|y$|noun      72: ous$|adj|y$|noun      Duplicate       +0            86
    08-CG8  $|noun|ish$|adj        orgD     Bad      None                      88: $|noun|ish$|adj       New in 2017     +0            86
    09-CG9  $|noun|fully$|adv      orgD     Bad      None                      118: $|noun|fully$|adv    New in 2017     +0            86

    Expert-Suggested SD-Rules
    10-ES1  er$|noun|ress$|noun    Experts  Bad      None                      95: er$|noun|ress$|noun   New in 2017     +0            86
    11-ES2  $|noun|ty$|adj         Experts  Bad      None                      97: $|noun|ty$|adj        New in 2017     +0            86

    All 82 good SD-Rules from 2016 are still evaluated as good rules in 2017. Each one is either kept unchanged or replaced by its parent rule or child rule.

  • Good SD-Rules count in Optimal Set:
    • The optimal set has 82 good rules in 2016 and 86 good rules in 2017.
    • From the evaluation, 6 of the 11 new rules are good (4 are bad; 1 is a duplicate). Why does the total number of good SD-Rules increase by only 4 (from 82 to 86), rather than by 6 (to 88 = 82 + 6)? The reason is:
      • 2 of the good new rules are each the parent rule of 1 existing rule, so each replaces that rule rather than adding to the set (+0).
      • 4 of the good new rules have no parent-child relationship with any existing rule (+4).

      • So the total change is 6 - 2 = 4.
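The count arithmetic above can be checked with a short tally. This is only a sketch: the rows are transcribed from the transaction table, and the helper names (`count_change`, `accumulate`) and the counting rule are assumptions inferred from the "Count Change" column, not part of the Lexical Tools code.

```python
# (ID, result, type) transcribed from the 2017 transaction table.
ROWS = [
    ("01-CG1", "Good", "New in 2017"),
    ("02-CG2", "Good", "New in 2017"),
    ("03-CG3", "Good", "New in 2017"),
    ("04-CG4", "Good", "Parent-1-Child"),
    ("05-CG5", "Good", "Parent-1-Child"),
    ("06-CG6", "Good", "New in 2017"),
    ("07-CG7", "Good", "Duplicate"),
    ("08-CG8", "Bad", "New in 2017"),
    ("09-CG9", "Bad", "New in 2017"),
    ("10-ES1", "Bad", "New in 2017"),
    ("11-ES2", "Bad", "New in 2017"),
]

GOOD_RULES_2016 = 82  # good rules in the 2016 optimal set


def count_change(result, rule_type):
    """Assumed rule: only a good rule that is genuinely new adds to the set.

    A parent rule replaces its existing child (+0); a duplicate and a bad
    rule add nothing (+0).
    """
    return 1 if result == "Good" and rule_type == "New in 2017" else 0


def accumulate(rows, baseline):
    """Return the running good-rule count after each proposed rule."""
    total = baseline
    running = []
    for rule_id, result, rule_type in rows:
        total += count_change(result, rule_type)
        running.append((rule_id, total))
    return running


if __name__ == "__main__":
    for rule_id, total in accumulate(ROWS, GOOD_RULES_2016):
        print(rule_id, total)
```

The running totals reproduce the "Accu. Count" column, ending at 86.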

  • Good Rules comparison (2016-2017):
    Type                  2016  2017  Details
    No Change             80    80    ...
    Good Rule turn bad    0     0     N/A
    Parent-1-Child        2     2     2016                      2017
                                      39: graph$|noun|graphy$   48: h$|noun|hy$|noun
                                      53: osity$|noun|ous$|adj  58: sity$|noun|us$|adj
    New in 2017           0     4     • 5: sation$|noun|zed$|adj
                                      • 6: ed$|adj|zation$|noun
                                      • 28: $|noun|tous$|adj
                                      • 59: e$|verb|tion$|noun
    Total                 82    86

  • In our process, we analyze the parent-child hierarchy only for SD-Rules whose parent and child rules co-exist in the collected set, because evaluating all parent-child rules is very expensive (time consuming). Should we modify the process as follows?
    • Normalize all SD-Rules to their root parent rule.
    • Analyze the parent-child hierarchy for all SD-Rules.

    In 2017, we spent 2 weeks evaluating 17 parent rules. If we switched to this process, there would be 101 parent rules to evaluate, which is very expensive.
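To illustrate what normalizing to a root parent rule could look like, here is a minimal sketch. It assumes that a parent rule is obtained by dropping the leading characters shared by the two suffixes, as the table suggests (for example, `osity$|noun|ous$|adj` has parent `sity$|noun|us$|adj`). The functions `parse_rule`, `root_parent`, and `group_by_root` are hypothetical names for this example, not part of the Lexical Tools API.

```python
import os
from collections import defaultdict


def parse_rule(rule):
    """Split 'suffixA$|catA|suffixB$|catB' into its four fields (assumed format)."""
    parts = rule.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields: {rule!r}")
    suffix_a, cat_a, suffix_b, cat_b = parts
    if not (suffix_a.endswith("$") and suffix_b.endswith("$")):
        raise ValueError(f"suffix fields must end with '$': {rule!r}")
    return suffix_a[:-1], cat_a, suffix_b[:-1], cat_b


def root_parent(rule):
    """Strip the longest common prefix of both suffixes (assumed definition)."""
    suffix_a, cat_a, suffix_b, cat_b = parse_rule(rule)
    # os.path.commonprefix compares strings character by character.
    shared = len(os.path.commonprefix([suffix_a, suffix_b]))
    return f"{suffix_a[shared:]}$|{cat_a}|{suffix_b[shared:]}$|{cat_b}"


def group_by_root(rules):
    """Group rules under their root parent, so each family is analyzed once."""
    families = defaultdict(list)
    for rule in rules:
        families[root_parent(rule)].append(rule)
    return dict(families)


if __name__ == "__main__":
    sample = [
        "osity$|noun|ous$|adj",
        "sity$|noun|us$|adj",
        "h$|noun|hy$|noun",
        "sation$|noun|zed$|adj",
    ]
    for root, members in group_by_root(sample).items():
        print(root, "<-", members)
```

Under this assumption, the number of distinct keys returned by `group_by_root` is the number of parent rules that would need evaluation, which is the quantity that grows from 17 to 101 in the estimate above.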

In conclusion, the optimized set of SD-Rules is very stable, as we expected. We believe this is one indication that the Lexicon is a good representative subset of general English.