
Lexical Tools

Results of Proposed Rules - 2021

I. Results

11 non-duplicated SD-Rules are proposed for addition to the SD-Rule set for evaluation. The results from the optimized set are as follows:

SD-Rule                    Rank  Precision  Instances  Source    Decompose        Results
Good Rules
able$|adj|eability$|noun   13    100.00%    42         NOM_D     Root-Parent      Good SD-Rule
ster$|verb|stration$|noun  14    100.00%    29         NOM_D     Decompose-Child  Good SD-Rule
lter$|verb|ltration$|noun  16    100.00%    15         NOM_D     Decompose-Child  Good SD-Rule
ability$|noun|eable$|adj   34    97.92%     48         NOM_D     Root-Parent      Good SD-Rule
d$|verb|sion$|noun         56    93.18%     44         NOM_D     Root-Parent      Good SD-Rule
narity$|noun|nary$|adj     68    90.48%     21         NOM_D     Decompose-Child  Good SD-Rule
e$|verb|ition$|noun        71    89.58%     48         NOM_D     Root-Parent      Good SD-Rule
ge$|verb|gence$|noun       74    88.89%     18         NOM_D     Decompose-Child  Good SD-Rule
$|noun|cide$|noun          78    86.67%     15         EXP_SUG   Root-Parent      Good SD-Rule
t$|verb|tted$|adj          91    77.78%     9          EXP_SUG   Root-Parent      Good SD-Rule
$|verb|ed$|adj             101   70.10%     311        EXP_SUG   Root-Parent      Good SD-Rule
ctic$|adj|xis$|noun        104   65.85%     27         ORG_FACT  Root-Parent      Good SD-Rule
Bad Rules
e$|noun|ous$|adj           110   57.22%     187        ORG_FACT  Root-Parent      Bad SD-Rule
ed$|adj|ment$|noun         115   51.76%     85         NOM_D     Root-Parent      Bad SD-Rule
$|adj|y$|noun              127   42.07%     145        NOM_D     Root-Parent      Bad SD-Rule
er$|noun|y$|noun           128   39.26%     163        ORG_FACT  Root-Parent      Bad SD-Rule
c$|adj|sm$|noun            134   20.63%     504        NOM_D     Root-Parent      Bad SD-Rule
er$|noun|ing$|noun         145   0.00%      534        ORG_FACT  Root-Parent      Bad SD-Rule
  • Good SD-Rules: 12 rules are evaluated as good in the optimized set.
  • Bad SD-Rules: 6 rules are evaluated as bad.

  • Experts' suggestion: 100% (3/3) are good.
  • Computation rules: 60% (9/15) are good.
    • NOM_D: 73% (8/11) are good
    • ORG_FACT: 25% (1/4) are good
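
The per-source shares above follow directly from the Good/Bad verdicts in the results table. A minimal Python sketch of that tally is shown here; the counts are transcribed from the table, and the helper names `tally` and `percent` are chosen for illustration only.

```python
from collections import Counter

# (source, verdict) for each of the 18 rules in the results table.
RESULTS = (
    [("NOM_D", "Good")] * 8 + [("NOM_D", "Bad")] * 3
    + [("EXP_SUG", "Good")] * 3
    + [("ORG_FACT", "Good")] * 1 + [("ORG_FACT", "Bad")] * 3
)

def tally(results):
    """Return {source: (good, total)} for a list of (source, verdict) pairs."""
    good = Counter(src for src, verdict in results if verdict == "Good")
    total = Counter(src for src, _ in results)
    return {src: (good[src], total[src]) for src in total}

def percent(good, total):
    """Share of good rules as a whole-number percentage."""
    return round(100 * good / total)

by_source = tally(RESULTS)
for src, (good, total) in sorted(by_source.items()):
    print(f"{src}: {percent(good, total)}% ({good}/{total})")

# Computation rules are everything except the experts' suggestions.
comp_good = sum(g for s, (g, _) in by_source.items() if s != "EXP_SUG")
comp_total = sum(t for s, (_, t) in by_source.items() if s != "EXP_SUG")
print(f"Computation rules: {percent(comp_good, comp_total)}% ({comp_good}/{comp_total})")
```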

  • In the optimized set, 4 child rules are used to replace 3 proposed root-parent rules
    Proposed parent rule     Child rule used
    er$|verb|ration$|noun    ster$|verb|stration$|noun
                             lter$|verb|ltration$|noun
    ity$|noun|y$|adj         narity$|noun|nary$|adj
    $|verb|nce$|noun         ge$|verb|gence$|noun
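
To illustrate why a narrower child rule can stand in for its parent, the sketch below applies an SD-Rule to a word and checks whether one rule is a restricted form of another. It assumes that the notation `suffixA$|categoryA|suffixB$|categoryB` means "strip suffixA from a word of categoryA and append suffixB to obtain a word of categoryB". The function names and the sample words are illustrative assumptions and are not taken from the Lexical Tools code.

```python
def parse_rule(rule):
    """Split 'sfxA$|catA|sfxB$|catB' into (sfxA, catA, sfxB, catB)."""
    sfx_a, cat_a, sfx_b, cat_b = rule.split("|")
    return sfx_a.rstrip("$"), cat_a, sfx_b.rstrip("$"), cat_b

def apply_rule(word, category, rule):
    """Derive a (word, category) pair, or None if the rule does not match."""
    sfx_a, cat_a, sfx_b, cat_b = parse_rule(rule)
    if category != cat_a or not word.endswith(sfx_a):
        return None
    stem = word[: len(word) - len(sfx_a)]
    return stem + sfx_b, cat_b

def is_child_of(child, parent):
    """True if child equals parent with the same extra letters added in front of both suffixes."""
    c_a, c_cat_a, c_b, c_cat_b = parse_rule(child)
    p_a, p_cat_a, p_b, p_cat_b = parse_rule(parent)
    if (c_cat_a, c_cat_b) != (p_cat_a, p_cat_b):
        return False
    if not (c_a.endswith(p_a) and c_b.endswith(p_b)):
        return False
    extra_a = c_a[: len(c_a) - len(p_a)]
    extra_b = c_b[: len(c_b) - len(p_b)]
    return extra_a == extra_b and extra_a != ""

PARENT = "er$|verb|ration$|noun"
CHILD = "lter$|verb|ltration$|noun"

print(is_child_of(CHILD, PARENT))            # True
print(apply_rule("filter", "verb", CHILD))   # ('filtration', 'noun')
print(apply_rule("offer", "verb", PARENT))   # ('offration', 'noun')
print(apply_rule("offer", "verb", CHILD))    # None
```

In this example the parent rule turns "offer" into the non-word "offration", while the child rule does not match "offer" at all. That is one plausible reading of why the narrower child rules are used in the optimized set.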

II. Further Observation on NOM_D

The top SD-Rules generated from NOM_D are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/nomD/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule                    Rank   Notes
Added in 2015: Freq. > 200, Coverage > 1.00%, Accum. Coverage > 80.0%
1   $|adj|ness$|noun           1      Good
2   bility$|noun|ble$|adj      2      Good (ility$|noun|le$|adj)
3   se$|verb|zation$|noun      3      Good
4   sation$|noun|ze$|verb      4      Good
5   iness$|noun|y$|adj         16     Good
6   ation$|noun|e$|verb        21     Good
7   nce$|noun|nt$|adj          25     Good (ce$|noun|t$|adj)
8   e$|verb|ion$|noun          26     Good
9   cy$|noun|t$|adj            27     Good
10  $|verb|ment$|noun          28     Good
11  ication$|noun|y$|verb      29     Good
12  ed$|adj|ion$|noun          30     Good
13  $|adj|ity$|noun            32     Good
14  e$|adj|ity$|noun           35     Good
15  $|verb|ion$|noun           49     Good
16  $|verb|ing$|noun           53     Good
17  $|verb|ation$|noun         61     Good
Added in 2016: Freq. > 100, Coverage > 0.40%, Accum. Coverage > 83.36%
18  e$|verb|is$|noun           43     Good
19  ation$|noun|ed$|adj        50     Good
20  e$|verb|ing$|noun          60     Good
21  $|adj|ism$|noun            62     Good
22  e$|adj|ion$|noun           100    Bad
Added in 2017: Freq. > 70, Coverage > 0.30%, Accum. Coverage > 85.00%
23  sation$|noun|zed$|adj      7      Good
24  sed$|adj|zation$|noun      8      Good
25  sity$|noun|us$|adj         65     Good (osity$|noun|ous$|adj)
26  e$|verb|tion$|noun         63     Good
27  ous$|adj|y$|noun           116    Bad (exit in 2013)
Added in 2020: Freq. > 50, Coverage > 0.20%, Accum. Coverage > 87.41%
28  ability$|noun|ible$|adj    10     Good
29  sable$|adj|zability$|noun  12     Good
30  sability$|noun|zable$|adj  13     Good
31  sis$|noun|ze$|verb         41     Good
32  al$|noun|e$|verb           92     Good
Added in 2021: Freq. > 40, Coverage > 0.17%, Accum. Coverage > 89.27%
33  ability$|noun|eable$|adj   34     Good
34  c$|adj|sm$|noun            134    Bad
35  er$|verb|ration$|noun      15,29  Good
36  $|verb|nce$|noun           74     Good
37  ed$|adj|ment$|noun         85     Bad
38  ity$|noun|y$|adj           68     Good
39  $|adj|y$|noun              145    Bad
40  able$|adj|eability$|noun   13     Good
41  e$|verb|ition$|noun        71     Good
42  d$|verb|sion$|noun         56     Good

The results show that 88.10% (37/42) are good SD-Rules. More SD-Rules from NOM_D should be added and evaluated in future releases.
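
The "Added in ..." headings in the table above list frequency and coverage cutoffs for each year's batch. As a rough illustration, the sketch below assumes that a rule's coverage is its share of all rule instances and that accumulated coverage is the running sum down the frequency-sorted list; the document does not define these terms, so both are assumptions. The rule names, frequencies, and cutoffs are made up for the example, and `select_rules` is a hypothetical helper.

```python
def select_rules(freqs, min_freq, min_coverage):
    """Keep rules above both cutoffs and report coverage figures.

    freqs maps a rule name to its frequency. Coverage is taken to be
    100 * frequency / sum of all frequencies (an assumption).
    Returns a list of (rule, frequency, coverage, accumulated_coverage).
    """
    total = sum(freqs.values())
    selected, accumulated = [], 0.0
    # Walk the rules from most to least frequent.
    for rule, freq in sorted(freqs.items(), key=lambda kv: kv[1], reverse=True):
        coverage = 100 * freq / total
        if freq <= min_freq or coverage <= min_coverage:
            break  # every later rule is at most this frequent
        accumulated += coverage
        selected.append((rule, freq, coverage, accumulated))
    return selected

# Hypothetical frequencies, not taken from the Lexical Tools data.
example = {"rule_a": 500, "rule_b": 300, "rule_c": 150, "rule_d": 50}

chosen = select_rules(example, min_freq=100, min_coverage=10.0)
for rule, freq, coverage, accumulated in chosen:
    print(f"{rule}: freq={freq}, coverage={coverage:.2f}%, accumulated={accumulated:.2f}%")
```

Lowering the cutoffs from one year to the next, as the headings do, would admit rules further down the list and raise the accumulated coverage.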

III. Further Observation on ORG_FACT

The top SD-Rules generated from ORG_FACT are added and evaluated (${SUFFIX_D}/data/${YEAR}/dataR/SdRulesFromSdPairs/orgFacts/sdRulesFromSdPairs.rpt.${YEAR}).

ID  SD-Rule               Rank  Notes
Added in 2015: Freq. > 40, Coverage > 1.00%, Accum. Coverage > 11.50%
1   $|noun|less$|adj      17    Good
2   $|adj|ally$|adv       23    Good
3   ist$|noun|y$|noun     45    Good
4   $|verb|ion$|noun      49    Good, also in NOM_D
5   c$|adj|s$|noun        57    Good (ic$|adj|is$|noun)
6   $|noun|ful$|adj       64    Good
Added in 2016: Freq. >= 35, Accum. Coverage > 16.00%, Ind. Coverage > 0.80%
7   sia$|noun|tic$|adj    47    Good
8   e$|noun|ic$|adj       56    Good
9   on$|noun|ve$|adj      58    Good
10  $|noun|ship$|noun     79    Good
11  $|noun|age$|noun      114   Bad
Added in 2017: Freq. >= 30, Accum. Coverage > 19.00%, Ind. Coverage > 0.70%
12  $|noun|tous$|adj      33    Good
13  $|noun|ish$|adj       94    Bad
14  $|noun|y$|noun        101   Bad
15  $|noun|fully$|adv     128   Bad
Added in 2020: Freq. >= 25, Accum. Coverage > 23.00%, Ind. Coverage > 0.60%
16  ion$|noun|ory$|adj    5     Good
17  $|adj|s$|noun         49    Good
18  $|verb|age$|noun      100   Bad
19  $|noun|ial$|adj       106   Bad
Added in 2021: Freq. >= 20, Accum. Coverage > 26.00%, Ind. Coverage > 0.49%
20  er$|noun|y$|noun      128   Bad
21  ctic$|adj|xis$|noun   104   Good
22  er$|noun|ing$|noun    145   Bad
23  e$|noun|ous$|adj      110   Bad

The results show that 60.87% (14/23) are good SD-Rules. More SD-Rules from ORG_FACT should be added in future releases.

IV. Future Work

Evaluate more SD-Rules further down the NOM_D and ORG_FACT lists.
ORG_FACT is close to its limit; it may be worth reviewing for one more year, until no more good rules can be found.