neleval evaluate
Evaluate system output
Usage summary
$ neleval evaluate --help
usage: neleval evaluate [-h] -g GOLD [-f {json,none,tab}] [-m NAME] [-b FIELD]
                        [--by-doc] [--by-type] [--overall]
                        [--type-weights FILE]
                        FILE

Evaluate system output

positional arguments:
  FILE

optional arguments:
  -h, --help            show this help message and exit
  -g GOLD, --gold GOLD
  -f {json,none,tab}, --fmt {json,none,tab}
  -m NAME, --measure NAME
                        Which measures to use: specify a name (or group name)
                        from the list-measures command. This flag may be
                        repeated.
  -b FIELD, --group-by FIELD
                        Report results per field-value, and micro/macro-
                        averaged over these. Multiple --group-by may be used.
                        E.g. -b docid -b type. NB: micro-average may not equal
                        overall score.
  --by-doc              Alias for -b docid
  --by-type             Alias for -b type
  --overall             With --group-by, report only overall, not per-group
                        results
  --type-weights FILE   File mapping gold and sys types to a weight, such as
                        produced by weights-for-hierarchy
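The --group-by help text mentions micro- and macro-averages over groups. The sketch below illustrates the usual definitions of the two averages on made-up per-document counts. It is not neleval's implementation: the `Counts` class, the function names and the numbers are all assumptions introduced for illustration, and it does not attempt to reproduce the overall score that the help text says may differ from the micro-average.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-group match counts (not a neleval data structure)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives


def prf(c):
    """Return (precision, recall, F1) for one set of counts."""
    p = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    r = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


def macro_average(groups):
    """Average the per-group scores, weighting every group equally."""
    scores = [prf(c) for c in groups.values()]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


def micro_average(groups):
    """Sum the counts over all groups first, then score the totals once."""
    total = Counts(
        tp=sum(c.tp for c in groups.values()),
        fp=sum(c.fp for c in groups.values()),
        fn=sum(c.fn for c in groups.values()),
    )
    return prf(total)


# Made-up counts for two documents of very different size.
per_doc = {
    "doc1": Counts(tp=9, fp=1, fn=0),
    "doc2": Counts(tp=1, fp=1, fn=3),
}

macro = macro_average(per_doc)
micro = micro_average(per_doc)
print("macro P/R/F: %.3f %.3f %.3f" % macro)
print("micro P/R/F: %.3f %.3f %.3f" % micro)
```

Because the macro-average weights each group equally and the micro-average weights each annotation equally, the two diverge whenever groups differ in size, as they do here.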
Evaluating each document separately
TODO
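Until this section is written, the following is a minimal sketch of how a per-document run might be assembled from the flags in the usage summary. It only builds the command line and prints it. The helper `build_evaluate_command` and the file names `gold.tsv` and `system.tsv` are hypothetical, and nothing here is taken from neleval's own code.

```python
import shlex

# Output formats listed in the usage summary for -f/--fmt.
FORMATS = {"json", "none", "tab"}


def build_evaluate_command(gold, system, measures=(), group_by=(),
                           fmt=None, overall=False):
    """Assemble an argument list for `neleval evaluate` (hypothetical helper)."""
    cmd = ["neleval", "evaluate", "-g", gold]
    if fmt is not None:
        if fmt not in FORMATS:
            raise ValueError("unknown format: %r" % fmt)
        cmd += ["-f", fmt]
    for name in measures:   # -m may be repeated
        cmd += ["-m", name]
    for field in group_by:  # -b may be repeated; --by-doc is an alias for -b docid
        cmd += ["-b", field]
    if overall:
        cmd.append("--overall")
    cmd.append(system)      # positional FILE: the system output
    return cmd


# File names are placeholders for a gold standard and a system output.
cmd = build_evaluate_command("gold.tsv", "system.tsv",
                             group_by=["docid"], fmt="tab")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where neleval is installed.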