Test Set
II. Description
The test set used in CSpell was generated by selecting the consumer health questions with the highest counts of out-of-vocabulary (OOV) terms from the NER (Named Entity Recognition) collections. The SPECIALIST Lexicon 2017 release served as the dictionary for identifying OOV terms. Two annotators (Dr. Alan R. Aronson and Sonya E. Shooshan) independently annotated the errors by hand. The annotators reconciled their disagreements, with arbitration by Dr. Dina Demner-Fushman as needed. This test set is summarized as follows:
Item | Value |
---|---|
Consumer health questions | 224 |
Tokens | 16,707 |
Annotation tags | 1,946 |
Instances of non-word corrections | 974 |
Instances of real-word corrections | 1,178 |
Word count per question | 3 - 337 |
Average word count per question | 72.36 |
Errors per question | 0 - 22 |
Average errors per question | 4.90 |
Error rate (errors per token) | 0.07 (= 1,178/16,707) |
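
The selection step described above, ranking questions by their number of OOV terms against a dictionary, can be sketched as follows. This is a minimal illustration and not the CSpell implementation: the tokenizer, the function names `oov_count` and `top_oov_questions`, and the in-memory lexicon set are all assumptions made for the example.

```python
import re
from typing import Iterable

# Hypothetical tokenizer: runs of letters, allowing internal apostrophes.
# The tokenization actually used for the test set is not described here.
_TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def oov_count(text: str, lexicon: set[str]) -> int:
    """Count tokens in `text` whose lowercase form is absent from `lexicon`."""
    return sum(1 for tok in _TOKEN_RE.findall(text) if tok.lower() not in lexicon)


def top_oov_questions(questions: Iterable[str], lexicon: set[str], k: int) -> list[str]:
    """Return the k questions with the most OOV tokens, highest count first."""
    ranked = sorted(questions, key=lambda q: oov_count(q, lexicon), reverse=True)
    return ranked[:k]
```

In practice the `lexicon` set would be loaded from the dictionary in use (here, the SPECIALIST Lexicon 2017 release); how its entries are normalized before lookup is left open in this sketch.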
III. Distribution of Errors in the Test Set
Count | Minimum | Maximum | Average |
---|---|---|---|
Character | 23 | 1,985 | 504.71 |
Word | 3 | 337 | 72.36 |
Error Tag | 0 | 22 | 4.90 |
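
The minimum, maximum, and average figures in the table above are per-question statistics. A minimal sketch of how such a table could be computed is given below. It assumes whitespace-delimited words and a precomputed list of error-tag counts per question; the names `summarize` and `distribution` are hypothetical, and the counting rules actually used for the test set may differ.

```python
from statistics import mean


def summarize(values):
    """Return (minimum, maximum, average rounded to two decimals)."""
    return min(values), max(values), round(mean(values), 2)


def distribution(questions, error_tags_per_question):
    """Per-question character, word, and error-tag statistics."""
    chars = [len(q) for q in questions]          # characters per question
    words = [len(q.split()) for q in questions]  # assumption: whitespace-split words
    return {
        "Character": summarize(chars),
        "Word": summarize(words),
        "Error Tag": summarize(error_tags_per_question),
    }
```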
IV. Generation Processes
V. Performance Tests