CLI Reference
Use this page when you need exact command syntax and practical options. For a short walkthrough, start with the CLI Quick Start.
run
run executes a target node against your input data. Use it when you want real metric results, report files, cache writes, and eval summaries.
nexagauge run <target_node> --input <source> [shared options] [run options]
Targets
| Target type | Nodes | Use when |
|---|---|---|
| Utility | chunk, claims, dedup, geval_steps | You want intermediate artifacts for debugging or inspection. |
| Metric | relevance, grounding, redteam, geval, reference | You want one evaluation dimension. |
| Full eval | eval | You want all eligible metrics plus aggregate summaries and reports. |
scan and report are not direct CLI targets. scan runs automatically, and report is produced by run eval when report output is available.
Run-only options
| Option | Purpose |
|---|---|
| --output-dir, -o | Write run outputs. Creates case_report/ and metrics/ inside the directory (see the example below this table). |
| --llm-concurrency | Global cap on concurrent LLM calls across all workers. Lower this first if you hit provider rate limits. |
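For example, a utility-node run that writes its intermediate artifacts for inspection might look like this (the input file and output directory names are illustrative):
nexagauge run chunk --input sample.json --output-dir ./chunk-artifacts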
estimate
estimate uses the same target and branch planning as run, but previews uncached cost without making billable provider calls.
nexagauge estimate <target_node> --input <source> [shared options] [estimate options]
Use estimate before run when you want to understand likely spend for a branch, dataset slice, or model-routing change.
Estimate-only options
| Option | Purpose |
|---|---|
| --cache-dir | Use a specific cache directory for this estimate. Useful when comparing against an isolated cache. |
The estimate output breaks the target branch down by node, showing cached work, uncached work, eligible uncached cases, and the total estimated cost.
Shared run / estimate arguments
These options apply to both run and estimate.
| Option | Purpose |
|---|---|
| --input, -i | Required. Local input file or dataset source. |
| --start | Start row index, inclusive. |
| --end | End row index, exclusive. |
| --limit, -n | Maximum number of rows to process. |
| --llm-model MODEL | Set the global primary judge model. |
| --llm-model NODE=MODEL | Override the primary model for one node in the target branch. Repeat as needed. |
| --llm-fallback MODEL | Set the global fallback model. |
| --llm-fallback NODE=MODEL | Override the fallback model for one node in the target branch (see the example after this table). |
| --continue-on-error | Continue processing remaining cases after a case failure. This is the default. |
| --fail-fast | Stop on the first failed case. |
| --max-workers | Number of cases processed concurrently. Default is 1. |
| --max-in-flight | Backpressure limit for submitted-but-not-yet-emitted cases. Useful with --max-workers > 1. |
| --force | Skip cache reads but still write new cache entries. |
| --no-cache | Disable both cache reads and writes. |
| --debug | Print per-node debug logs. For estimate, the progress bar is hidden while debug logs are enabled. |
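The fallback options follow the same global-plus-override pattern as --llm-model. As a sketch (the model names are illustrative, not recommendations):
nexagauge run eval \
--input sample.json \
--llm-fallback openai/gpt-4o-mini \
--llm-fallback grounding=openai/gpt-4o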
cache
Cache commands inspect and remove node-level cache artifacts.
nexagauge cache dir
nexagauge cache delete [options]
Use cache dir to see the active cache location. Use cache delete --dry-run before deleting so you can confirm the size and file count.
Cache options
| Option | Purpose |
|---|---|
| --dry-run | Print what would be deleted without removing files. |
| --yes, -y | Delete without the confirmation prompt. |
| --cache-dir | Delete a specific cache directory instead of the default. |
Cache location is resolved in this order: --cache-dir, then NEXAGAUGE_CACHE_DIR, then the per-user default cache path.
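To repoint the default for a whole shell session, set the environment variable before running other commands (the directory name is illustrative):
export NEXAGAUGE_CACHE_DIR=./shared-cache
nexagauge cache dir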
Examples
run
# Run the full eval branch and write reports
nexagauge run eval --input sample.json --output-dir ./report
# Run one metric branch on the first 50 rows
nexagauge run grounding --input sample.json --limit 50
# Inspect claim extraction only
nexagauge run claims --input sample.json --limit 5 --debug
# Use one global judge model
nexagauge run eval --input sample.json --llm-model openai/gpt-4o-mini
# Use a stronger model only for grounding and relevance
nexagauge run eval \
--input sample.json \
--llm-model openai/gpt-4o-mini \
--llm-model grounding=openai/gpt-4o \
--llm-model relevance=openai/gpt-4o
# Run with more case-level parallelism while limiting provider pressure
nexagauge run eval \
--input sample.json \
--max-workers 4 \
--max-in-flight 8 \
--llm-concurrency 16
# Force a fresh run while still updating cache
nexagauge run relevance --input sample.json --force
# Run without reading or writing cache
nexagauge run redteam --input sample.json --no-cache
# Stop immediately if a case fails
nexagauge run eval --input sample.json --fail-fast
estimate
# Estimate full eval cost
nexagauge estimate eval --input sample.json
# Estimate one metric branch
nexagauge estimate grounding --input sample.json --limit 100
# Estimate a specific row slice
nexagauge estimate eval --input sample.json --start 100 --end 200
# Estimate with a different global model
nexagauge estimate eval --input sample.json --llm-model openai/gpt-4o
# Estimate one node with a stronger model
nexagauge estimate eval \
--input sample.json \
--llm-model openai/gpt-4o-mini \
--llm-model redteam=openai/gpt-4o
# Estimate as if nothing were cached
nexagauge estimate eval --input sample.json --no-cache
# Estimate using an isolated cache directory
nexagauge estimate eval --input sample.json --cache-dir ./.tmp-cache
cache
# Print the active cache directory
nexagauge cache dir
# Preview what would be deleted
nexagauge cache delete --dry-run
# Delete the default cache after confirmation
nexagauge cache delete
# Delete without prompt
nexagauge cache delete --yes
# Delete a custom cache directory
nexagauge cache delete --cache-dir ./.tmp-cache --yes