Dictionary Analysis
I. Introduction
Four dictionaries (Jazzy, Ensemble, Medline, and Lexicon) are compared. The summary is as follows:
Feature | Jazzy | Ensemble | Medline | Lexicon |
---|---|---|---|---|
Size | 159,345 | 459,038 | 496,387 | 558,353 |
Files | 1 + 10 | 1 + 10 | 1 | 1 |
Preserved Case | No (LC) | No (LC) | No (LC) | Yes |
Verified | No | No | No | Yes |
General English | Yes | Yes | Yes | Yes |
Biomedical | No | Yes | Yes | Yes |
Coded | No | No | No | Yes (extra information is available) |
II. Analysis and Tests
Analysis and performance tests were conducted on the various dictionaries to improve dictionary generation. Please see the following URL for details:
From the above results, we observe:
III. Overlap and Contain Check
Target (Tar): Lexicon (lexicon.ewLc.dic, 534,330)
Source (Src) | Src+Tar | Src-Tar | Tar-Src |
---|---|---|---|
Ensemble (medical.dic, 299,670) | 71,212 | 228,458 | 463,118 |
Medline (medline.dic, 496,387) | 212,961 | 283,426 | 321,369 |
Jazzy (eng_com.dic, 150,843) | 104,853 | 45,990 | 429,477 |
Jazzy (spVar10File.dic, 8,502) | 6,198 | 2,304 | 528,132 |
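
The three counts in the table are consistent with plain set operations: Src+Tar is the number of words found in both the source and the target, Src-Tar the words found only in the source, and Tar-Src the words found only in the target. The following is a minimal sketch in Python, not the method actually used for the analysis. It assumes each dictionary is available as a collection of words and that entries are compared after trimming whitespace and lowercasing. The function name `overlap_counts` and the sample words are hypothetical.

```python
def overlap_counts(src_words, tar_words):
    """Return (Src+Tar, Src-Tar, Tar-Src) counts for two word collections.

    Assumption: entries are compared case-insensitively after trimming
    whitespace, and empty entries are ignored.
    """
    src = {w.strip().lower() for w in src_words if w.strip()}
    tar = {w.strip().lower() for w in tar_words if w.strip()}
    both = len(src & tar)       # Src+Tar: words in both dictionaries
    src_only = len(src - tar)   # Src-Tar: words only in the source
    tar_only = len(tar - src)   # Tar-Src: words only in the target
    return both, src_only, tar_only


# Hypothetical sample data, for illustration only.
source = ["heart", "Liver", "kidney", ""]
target = ["liver", "heart", "spleen", "lung"]
both, src_only, tar_only = overlap_counts(source, target)
print(both, src_only, tar_only)
```

The same identities can be used to sanity-check each row of the table: for Ensemble, 71,212 + 228,458 = 299,670 (the source size) and 534,330 - 71,212 = 463,118 (Tar-Src).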