# CLI Reference

Use this page when you need exact command syntax and practical options. For a short walkthrough, start with the CLI Quick Start.

## run

run executes a target node against your input data. Use it when you want real metric results, report files, cache writes, and eval summaries.

```bash
nexagauge run <target_node> --input <source> [shared options] [run options]
```

### Targets

| Target type | Nodes | Use when |
| --- | --- | --- |
| Utility | chunk, claims, dedup, geval_steps | You want intermediate artifacts for debugging or inspection. |
| Metric | relevance, grounding, redteam, geval, reference | You want one evaluation dimension. |
| Full eval | eval | You want all eligible metrics plus aggregate summaries and reports. |

scan and report are not direct CLI targets. scan runs automatically, and report is produced by run eval when report output is available.

### Run-only options

| Option | Purpose |
| --- | --- |
| --output-dir, -o | Write run outputs. Creates case_report/ and metrics/ inside the directory. |
| --llm-concurrency | Global cap on concurrent LLM calls across all workers. Lower this first if you hit provider rate limits. |
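As a sketch of how the two run-only options combine (sample.json and ./report are placeholder paths, not part of the CLI contract):

```bash
# Write case_report/ and metrics/ under ./report, while capping
# concurrent LLM calls at 8 to stay under a provider rate limit.
nexagauge run eval --input sample.json --output-dir ./report --llm-concurrency 8
```

Note that --llm-concurrency limits LLM calls globally, independent of how many cases run in parallel via --max-workers.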

## estimate

estimate uses the same target and branch planning as run, but previews uncached cost without making billable provider calls.

```bash
nexagauge estimate <target_node> --input <source> [shared options] [estimate options]
```

Use estimate before run when you want to understand likely spend for a branch, dataset slice, or model-routing change.
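One common pattern, sketched here with placeholder inputs, is to preview a model-routing change with estimate and only commit to the billable run once the projected cost looks acceptable:

```bash
# Preview the uncached cost of routing the grounding node to a stronger model.
nexagauge estimate eval --input sample.json --llm-model grounding=openai/gpt-4o

# If the estimate is acceptable, run the same configuration for real.
nexagauge run eval \
  --input sample.json \
  --llm-model grounding=openai/gpt-4o \
  --output-dir ./report
```

Because estimate uses the same target and branch planning as run, the two commands see the same cases and the same cache state.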

### Estimate-only options

| Option | Purpose |
| --- | --- |
| --cache-dir | Use a specific cache directory for this estimate. Useful when comparing against an isolated cache. |

The estimate table reports the target branch by node, including cached work, uncached work, eligible uncached cases, and total estimated cost.

## Shared run / estimate options

These options apply to both run and estimate.

| Option | Purpose |
| --- | --- |
| --input, -i | Required. Local input file or dataset source. |
| --start | Start row index, inclusive. |
| --end | End row index, exclusive. |
| --limit, -n | Maximum number of rows to process. |
| --llm-model MODEL | Set the global primary judge model. |
| --llm-model NODE=MODEL | Override the primary model for one node in the target branch. Repeat as needed. |
| --llm-fallback MODEL | Set the global fallback model. |
| --llm-fallback NODE=MODEL | Override the fallback model for one node in the target branch. |
| --continue-on-error | Continue processing remaining cases after a case failure. This is the default. |
| --fail-fast | Stop on the first failed case. |
| --max-workers | Number of cases processed concurrently. Default is 1. |
| --max-in-flight | Backpressure limit for submitted-but-not-yet-emitted cases. Useful with --max-workers > 1. |
| --force | Ignore cache reads but still write new cache entries. |
| --no-cache | Disable both cache reads and writes. |
| --debug | Print per-node debug logs. For estimate, the progress bar is hidden while debug logs are enabled. |

## cache

Cache commands inspect and remove node-level cache artifacts.

```bash
nexagauge cache dir
nexagauge cache delete [options]
```

Use cache dir to see the active cache location. Use cache delete --dry-run before deleting so you can confirm the size and file count.

### Cache options

| Option | Purpose |
| --- | --- |
| --dry-run | Print what would be deleted without removing files. |
| --yes, -y | Delete without the confirmation prompt. |
| --cache-dir | Delete a specific cache directory instead of the default. |

Cache location is resolved in this order: --cache-dir, then NEXAGAUGE_CACHE_DIR, then the per-user default cache path.
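A short sketch of that resolution order in practice (the directory names below are placeholders, and the "resolves to" comment is an inference from the order stated above, not captured tool output):

```bash
# Environment variable beats the per-user default; here it points an
# entire CI job at an isolated cache.
export NEXAGAUGE_CACHE_DIR=./.ci-cache
nexagauge cache dir   # should resolve to ./.ci-cache

# An explicit --cache-dir takes precedence over the environment variable,
# e.g. when cleaning up a throwaway cache.
nexagauge cache delete --cache-dir ./.tmp-cache --dry-run
```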

## Examples

### run

```bash
# Run the full eval branch and write reports
nexagauge run eval --input sample.json --output-dir ./report

# Run one metric branch on the first 50 rows
nexagauge run grounding --input sample.json --limit 50

# Inspect claim extraction only
nexagauge run claims --input sample.json --limit 5 --debug

# Use one global judge model
nexagauge run eval --input sample.json --llm-model openai/gpt-4o-mini

# Use a stronger model only for grounding and relevance
nexagauge run eval \
  --input sample.json \
  --llm-model openai/gpt-4o-mini \
  --llm-model grounding=openai/gpt-4o \
  --llm-model relevance=openai/gpt-4o

# Run with more case-level parallelism while limiting provider pressure
nexagauge run eval \
  --input sample.json \
  --max-workers 4 \
  --max-in-flight 8 \
  --llm-concurrency 16

# Force a fresh run while still updating cache
nexagauge run relevance --input sample.json --force

# Run without reading or writing cache
nexagauge run redteam --input sample.json --no-cache

# Stop immediately if a case fails
nexagauge run eval --input sample.json --fail-fast
```

### estimate

```bash
# Estimate full eval cost
nexagauge estimate eval --input sample.json

# Estimate one metric branch
nexagauge estimate grounding --input sample.json --limit 100

# Estimate a specific row slice
nexagauge estimate eval --input sample.json --start 100 --end 200

# Estimate with a different global model
nexagauge estimate eval --input sample.json --llm-model openai/gpt-4o

# Estimate one node with a stronger model
nexagauge estimate eval \
  --input sample.json \
  --llm-model openai/gpt-4o-mini \
  --llm-model redteam=openai/gpt-4o

# Estimate as if nothing were cached
nexagauge estimate eval --input sample.json --no-cache

# Estimate using an isolated cache directory
nexagauge estimate eval --input sample.json --cache-dir ./.tmp-cache
```

### cache

```bash
# Print the active cache directory
nexagauge cache dir

# Preview what would be deleted
nexagauge cache delete --dry-run

# Delete the default cache after confirmation
nexagauge cache delete

# Delete without prompt
nexagauge cache delete --yes

# Delete a custom cache directory
nexagauge cache delete --cache-dir ./.tmp-cache --yes
```