neleval significance

Test whether the score differences between each pair of systems are statistically significant.

Usage summary

$ neleval significance --help
usage: neleval significance [-h] -g GOLD [-n TRIALS] [--permute] [--bootstrap]
                            [-j N_JOBS] [-f {json,none,tab}] [-m NAME]
                            [--type-weights FILE] [--metrics METRICS]
                            FILE [FILE ...]

Test for pairwise significance between systems

positional arguments:
  FILE

optional arguments:
  -h, --help            show this help message and exit
  -g GOLD, --gold GOLD
  -n TRIALS, --trials TRIALS
  --permute             Use the approximate randomization method
  --bootstrap           Use bootstrap resampling
  -j N_JOBS, --n_jobs N_JOBS
                        Number of parallel processes, use -1 for all CPUs
  -f {json,none,tab}, --fmt {json,none,tab}
  -m NAME, --measure NAME
                        Which measures to use: specify a name (or group name)
                        from the list-measures command. This flag may be
                        repeated.
  --type-weights FILE   File mapping gold and sys types to a weight, such as
                        produced by weights-for-hierarchy
  --metrics METRICS     Test significance for which metrics (default:
                        precision,recall,fscore)
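
Illustration of the two methods

The help text names two resampling methods, approximate randomization (--permute) and bootstrap resampling (--bootstrap), without describing them. The following is a minimal sketch of the general techniques, not neleval's own implementation. It assumes each system is summarised as a list of per-document (tp, fp, fn) counts in the same document order, and it tests only a micro-averaged F-score. The function names, the `trials` and `seed` parameters, and the p-value formulas are all assumptions made for illustration.

```python
import random


def fscore(counts):
    """Micro-averaged F1 over a list of per-document (tp, fp, fn) tuples."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def permutation_p_value(sys_a, sys_b, trials=1000, seed=0):
    """Approximate randomization (hypothetical helper, not neleval's API).

    For each trial, swap the two systems' counts on each document with
    probability 0.5 and check whether the absolute F1 difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(fscore(sys_a) - fscore(sys_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(sys_a, sys_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(fscore(shuffled_a) - fscore(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)


def bootstrap_p_value(sys_a, sys_b, trials=1000, seed=0):
    """Paired bootstrap (hypothetical helper, one of several formulations).

    Resample documents with replacement and report the fraction of
    resamples in which the system that is ahead on the full data does
    not stay ahead.
    """
    rng = random.Random(seed)
    n = len(sys_a)
    sign = 1 if fscore(sys_a) >= fscore(sys_b) else -1
    not_ahead = 0
    for _ in range(trials):
        idx = [rng.randrange(n) for _ in range(n)]
        diff = fscore([sys_a[i] for i in idx]) - fscore([sys_b[i] for i in idx])
        if sign * diff <= 0:
            not_ahead += 1
    return not_ahead / trials
```

In this sketch, `trials` plays a role comparable to the -n TRIALS option: more trials give a finer-grained p-value estimate at proportionally higher cost. Both functions pair the systems by document, so the two lists must cover the same documents in the same order.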