# CLI Reference

Use this page when you need exact command syntax and practical options. For a short walkthrough, start with the CLI Quick Start.

## run

run executes a target node against your input data. Use it when you want real metric results, report files, cache writes, and eval summaries.

```bash
nexagauge run <target_node> --input <source> [shared options] [run options]
```

### Targets

| Target type | Nodes | Use when |
| --- | --- | --- |
| Utility | chunk, claims, dedup, geval_steps | You want intermediate artifacts for debugging or inspection. |
| Metric | relevance, grounding, redteam, geval, reference | You want one evaluation dimension. |
| Full eval | eval | You want all eligible metrics plus aggregate summaries and reports. |

scan and report are not direct CLI targets. scan runs automatically, and report is produced by run eval when report output is available.

### Run-only options

| Option | Purpose |
| --- | --- |
| --output-dir, -o | Write run outputs. Creates case_report/ and metrics/ inside the directory. |
| --llm-concurrency | Global cap on concurrent LLM calls across all workers. Lower this first if you hit provider rate limits. |
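As a sketch of how the two run-only options combine (sample.json and ./report are placeholder paths, not part of the CLI contract):

```bash
# Write case_report/ and metrics/ under ./report, while capping
# concurrent LLM calls at 8 to stay under a provider rate limit.
nexagauge run eval --input sample.json --output-dir ./report --llm-concurrency 8
```

Note that --llm-concurrency limits LLM calls globally, independent of how many cases run in parallel via --max-workers.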

## estimate

estimate uses the same target and branch planning as run, but previews uncached cost without making billable provider calls.

```bash
nexagauge estimate <target_node> --input <source> [shared options] [estimate options]
```

Use estimate before run when you want to understand likely spend for a branch, dataset slice, or model-routing change.
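One common pattern, sketched here with placeholder inputs, is to preview a model-routing change with estimate and only commit to the billable run once the projected cost looks acceptable:

```bash
# Preview the uncached cost of routing the grounding node to a stronger model.
nexagauge estimate eval --input sample.json --llm-model grounding=openai/gpt-4o

# If the estimate is acceptable, run the same configuration for real.
nexagauge run eval \
  --input sample.json \
  --llm-model grounding=openai/gpt-4o \
  --output-dir ./report
```

Because estimate uses the same target and branch planning as run, the two commands see the same cases and the same cache state.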

### Estimate-only options

| Option | Purpose |
| --- | --- |
| --cache-dir | Use a specific cache directory for this estimate. Useful when comparing against an isolated cache. |

The estimate table reports the target branch by node, including cached work, uncached work, eligible uncached cases, and total estimated cost.

## Shared run / estimate options

These options apply to both run and estimate.

| Option | Purpose |
| --- | --- |
| --input, -i | Required. Local input file or dataset source. |
| --start | Start row index, inclusive. |
| --end | End row index, exclusive. |
| --limit, -n | Maximum number of rows to process. |
| --llm-model MODEL | Set the global primary judge model. |
| --llm-model NODE=MODEL | Override the primary model for one node in the target branch. Repeat as needed. |
| --llm-fallback MODEL | Set the global fallback model. |
| --llm-fallback NODE=MODEL | Override the fallback model for one node in the target branch. |
| --continue-on-error | Continue processing remaining cases after a case failure. This is the default. |
| --fail-fast | Stop on the first failed case. |
| --max-workers | Number of cases processed concurrently. Default is 1. |
| --max-in-flight | Backpressure limit for submitted-but-not-yet-emitted cases. Useful with --max-workers > 1. |
| --force | Ignore cache reads but still write new cache entries. |
| --no-cache | Disable both cache reads and writes. |
| --debug | Print per-node debug logs. For estimate, the progress bar is hidden while debug logs are enabled. |

## cache

Cache commands inspect and remove node-level cache artifacts.

```bash
nexagauge cache dir
nexagauge cache delete [options]
```

Use cache dir to see the active cache location. Use cache delete --dry-run before deleting so you can confirm the size and file count.

### Cache options

| Option | Purpose |
| --- | --- |
| --dry-run | Print what would be deleted without removing files. |
| --yes, -y | Delete without the confirmation prompt. |
| --cache-dir | Delete a specific cache directory instead of the default. |

Cache location is resolved in this order: --cache-dir, then NEXAGAUGE_CACHE_DIR, then the per-user default cache path.
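A short sketch of that resolution order in practice (the directory names below are placeholders, and the "resolves to" comment is an inference from the order stated above, not captured tool output):

```bash
# Environment variable beats the per-user default; here it points an
# entire CI job at an isolated cache.
export NEXAGAUGE_CACHE_DIR=./.ci-cache
nexagauge cache dir   # should resolve to ./.ci-cache

# An explicit --cache-dir takes precedence over the environment variable,
# e.g. when cleaning up a throwaway cache.
nexagauge cache delete --cache-dir ./.tmp-cache --dry-run
```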

## Examples

### run

```bash
# Run the full eval branch and write reports
nexagauge run eval --input sample.json --output-dir ./report

# Run one metric branch on the first 50 rows
nexagauge run grounding --input sample.json --limit 50

# Inspect claim extraction only
nexagauge run claims --input sample.json --limit 5 --debug

# Use one global judge model
nexagauge run eval --input sample.json --llm-model openai/gpt-4o-mini

# Use a stronger model only for grounding and relevance
nexagauge run eval \
  --input sample.json \
  --llm-model openai/gpt-4o-mini \
  --llm-model grounding=openai/gpt-4o \
  --llm-model relevance=openai/gpt-4o

# Run with more case-level parallelism while limiting provider pressure
nexagauge run eval \
  --input sample.json \
  --max-workers 4 \
  --max-in-flight 8 \
  --llm-concurrency 16

# Force a fresh run while still updating cache
nexagauge run relevance --input sample.json --force

# Run without reading or writing cache
nexagauge run redteam --input sample.json --no-cache

# Stop immediately if a case fails
nexagauge run eval --input sample.json --fail-fast
```

### estimate

```bash
# Estimate full eval cost
nexagauge estimate eval --input sample.json

# Estimate one metric branch
nexagauge estimate grounding --input sample.json --limit 100

# Estimate a specific row slice
nexagauge estimate eval --input sample.json --start 100 --end 200

# Estimate with a different global model
nexagauge estimate eval --input sample.json --llm-model openai/gpt-4o

# Estimate one node with a stronger model
nexagauge estimate eval \
  --input sample.json \
  --llm-model openai/gpt-4o-mini \
  --llm-model redteam=openai/gpt-4o

# Estimate as if nothing were cached
nexagauge estimate eval --input sample.json --no-cache

# Estimate using an isolated cache directory
nexagauge estimate eval --input sample.json --cache-dir ./.tmp-cache
```

### cache

```bash
# Print the active cache directory
nexagauge cache dir

# Preview what would be deleted
nexagauge cache delete --dry-run

# Delete the default cache after confirmation
nexagauge cache delete

# Delete without prompt
nexagauge cache delete --yes

# Delete a custom cache directory
nexagauge cache delete --cache-dir ./.tmp-cache --yes
```