LMW Candidate Post-Processes
The post-processes analyze and aggregate LMW candidate lists, which contain LMW candidates from various models. The numbers are based on real-time data, so this program needs to be re-run to get the latest numbers when:
1
2
3
4
II. Functionality:
III. Input Files:
IV. Output Files:
V. Detailed Process
Step | Description | Input | Output | Notes |
---|---|---|---|---|
1 | Aggregate and analyze all previous LMW candidate files. This program analyzes the precision of the candidate list (that is, how many candidates are valid LMWs). | | | Must update: |
2 | Get not-BaseForm/LMW entries from the LexCheck files. This program analyzes the precision of invalid LMWs from the LexCheck files notBaseForm.data and notLmw.data. | | | Must update: |
3 | Combine the output files from steps 1 and 2 to get the total data set. | | | Must run steps 1 and 2 |
4 | Copy the result files from steps 1-3 to ./DataLog | | ./DataLog/${YEAR}/${YEAR}_${MM}_${DD}/ | Must run steps 1-3 |
10 | Filter and tag valid/invalid LMWs for a raw candidate file | Specify: | | |
20 | Generate DL TtSet from valid/invalid LMW candidate files | | | |
21 | Generate DL TtSet from inflVars (valid) and invalid LMWs in n-grams .. | | | |
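
As an illustration of step 4 only, the sketch below builds the dated target directory ./DataLog/${YEAR}/${YEAR}_${MM}_${DD}/ and copies result files into it. It is not the project's actual program: the function names `datalog_dir` and `archive_results` are hypothetical, and zero-padding of the month and day to two digits is an assumption about the intended directory naming.

```python
import shutil
from datetime import date
from pathlib import Path


def datalog_dir(root: Path, day: date) -> Path:
    """Return <root>/<YEAR>/<YEAR>_<MM>_<DD> for the given day.

    Assumption: month and day are zero-padded to two digits.
    """
    return root / f"{day:%Y}" / f"{day:%Y_%m_%d}"


def archive_results(result_files, root: Path, day: date) -> Path:
    """Copy each result file into the dated DataLog directory.

    `result_files` stands in for the outputs of steps 1-3; the real
    file names are not given in the table, so callers supply them.
    """
    target = datalog_dir(root, day)
    target.mkdir(parents=True, exist_ok=True)  # create year and date levels
    for src in result_files:
        shutil.copy2(src, target / Path(src).name)  # keep file timestamps
    return target
```

A call such as `archive_results(files, Path("./DataLog"), date.today())` would be run only after steps 1-3 have produced their output, matching the note on step 4.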