Cache

Overview

nexa-gauge includes cache support in every normal run. The cache is part of the execution model, not an afterthought: it lets repeated evaluations reuse trusted work while recomputing only the records and routes that actually changed.

The practical effect is simple:

  • unchanged records reuse prior node outputs
  • changed records invalidate only the affected work
  • changed model or strategy routing invalidates the affected route
  • reports remain reproducible for the same data and execution route

This makes nexa-gauge suitable for iterative prompt work, regression testing, benchmark runs, and production-style evaluation jobs where cost and repeatability matter.

Execution Flow

[Diagram: execution flow graph]

What Is Cached

nexa-gauge caches node outputs for each record and route. A route is the exact path through the graph for the selected target, including the prerequisite nodes needed to produce that target.

Examples:

  • Running grounding can reuse stable chunk, refiner, and claims outputs.
  • Running eval can reuse unchanged utility outputs and only execute metrics that are still eligible and uncached.
  • Re-running the same dataset after a small edit only recomputes the affected record paths.

Aggregation and presentation steps are fast and are intentionally recomputed from cached upstream artifacts.
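
As a rough illustration of what "route" means here, the sketch below resolves a target into its prerequisite nodes in execution order. The node names come from the examples above, but the dependency edges and the `resolve_route` helper are assumptions made for this sketch, not nexa-gauge's actual graph or API.

```python
# Hypothetical dependency map. Node names appear in the text above;
# the edges between them are assumed for illustration only.
PREREQUISITES = {
    "chunk": [],
    "refiner": ["chunk"],
    "claims": ["refiner"],
    "grounding": ["claims"],
}


def resolve_route(target, prerequisites=PREREQUISITES):
    """Return the target's prerequisites followed by the target, in execution order."""
    route = []
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in prerequisites[node]:
            visit(dep)
        route.append(node)  # appended only after all of its prerequisites

    visit(target)
    return route


grounding_route = resolve_route("grounding")
claims_route = resolve_route("claims")
```

Under these assumed edges, the route for claims is a prefix of the route for grounding, which is the kind of overlap that lets a later grounding run reuse earlier chunk, refiner, and claims outputs.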

Cache Identity

Cache reuse depends on both data and routing. If either changes in a way that affects cache identity, nexa-gauge recomputes the affected node output.

Data attributes that participate in cache identity include:

Attribute          Why it matters
generation         Source text for chunking, claims, safety, and most scoring.
question           Required by relevance and optional judge fields.
context            Required by grounding and optional judge fields.
reference          Required by reference metrics and optional judge fields.
geval              GEval criteria, steps, and selected item fields.
redteam            Custom safety rubrics and selected item fields.
reference_files    External reference material attached to the case.

Route attributes that participate in cache identity include:

Attribute            Why it matters
target node route    Different targets use different graph paths.
node name            Each node has an isolated cache identity.
model routing        Primary model, fallback model, and temperature affect judge output.
utility strategy     chunker, refiner, and refiner_top_k affect downstream artifacts.
upstream path        A node is reused only when the prerequisite path that produced its inputs is also compatible.

You do not need to manage this manually. If the record or route changes, nexa-gauge treats the old output as stale for that path.
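
One way to picture cache identity is as a hash over the record fields and route attributes listed above. The following is a minimal sketch under that assumption: the `cache_key` function, the shape of the `route` dictionary, and the choice of SHA-256 over canonical JSON are all hypothetical, and the real tool may well hash only the fields each node actually consumes.

```python
import hashlib
import json

# Record fields named in the data-attribute table above.
DATA_FIELDS = (
    "generation", "question", "context", "reference",
    "geval", "redteam", "reference_files",
)


def cache_key(record, route, node):
    """Hypothetical key: hash the node name, the identity fields, and the route."""
    payload = {
        "node": node,
        # Fields outside DATA_FIELDS are ignored, so they cannot change the key.
        "record": {name: record.get(name) for name in DATA_FIELDS},
        "route": route,
    }
    # sort_keys gives a canonical encoding regardless of dict insertion order.
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


# Placeholder values, chosen only for the example.
record = {"generation": "Paris is the capital of France.",
          "question": "What is the capital of France?"}
route = {"target": "grounding", "model": "model-a", "fallback": "model-b",
         "temperature": 0.0, "chunker": "default", "refiner": "default",
         "refiner_top_k": 3, "upstream": ["chunk", "refiner"]}

original_key = cache_key(record, route, "claims")
edited_key = cache_key({**record, "generation": "Lyon is in France."}, route, "claims")
rerouted_key = cache_key(record, {**route, "model": "model-c"}, "claims")
```

In this sketch, editing the generation text or switching the primary model yields a different key, so the old entry is simply never looked up again for that path.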

Estimate And Run

run writes reusable cache entries for executed work.

```bash
nexagauge run eval --input sample.json --output-dir ./report
```

estimate can use compatible cached run artifacts to produce a more realistic cost preview without replaying the full branch.

```bash
nexagauge estimate eval --input sample.json
```

This is useful after a partial run: estimates can distinguish work that is already cached from work that would still need LLM calls.
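
The distinction between cached and pending work can be sketched as a simple partition over planned steps. The `estimate_remaining` function, the step dictionaries, and the `llm_calls` field below are illustrative assumptions, not the format that `estimate` actually uses.

```python
def estimate_remaining(planned, cached_keys):
    """Count planned steps with cached output and sum LLM calls for the rest."""
    reused = [step for step in planned if step["key"] in cached_keys]
    pending = [step for step in planned if step["key"] not in cached_keys]
    return {
        "reused_steps": len(reused),
        "pending_steps": len(pending),
        "pending_llm_calls": sum(step["llm_calls"] for step in pending),
    }


# Hypothetical plan for two records; keys and call counts are made up.
planned = [
    {"key": "rec1:claims", "llm_calls": 1},
    {"key": "rec1:grounding", "llm_calls": 2},
    {"key": "rec2:claims", "llm_calls": 1},
    {"key": "rec2:grounding", "llm_calls": 2},
]
# Suppose a partial run already produced outputs for the first record.
cached_keys = {"rec1:claims", "rec1:grounding"}

summary = estimate_remaining(planned, cached_keys)
```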

Controls

Caching is enabled by default.

Option                 Behavior
default                Read existing cache entries and write new entries for executed work.
--force                Ignore existing entries for this run, but write fresh results.
--no-cache             Disable cache reads and writes.
--cache-dir            Use a specific cache directory for cache commands.
NEXAGAUGE_CACHE_DIR    Set the default cache location for runs.

Use --force when you want fresh results after changing external assumptions. Use --no-cache only when you need a fully uncached diagnostic run.
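
The three modes differ only in whether the cache is read and whether it is written. The sketch below models that with a plain dictionary. The `cache_policy` and `get_or_compute` helpers are hypothetical, and letting `no_cache` take precedence when both flags are set is an assumption of this sketch, not documented behavior.

```python
def cache_policy(force=False, no_cache=False):
    """Map the documented modes to (read, write) permissions."""
    if no_cache:
        return (False, False)  # assumption: --no-cache wins if both are given
    if force:
        return (False, True)   # skip reads, still write fresh results
    return (True, True)        # default: read and write


def get_or_compute(cache, key, compute, force=False, no_cache=False):
    """Return a cached value when reads are allowed, otherwise compute it."""
    read, write = cache_policy(force=force, no_cache=no_cache)
    if read and key in cache:
        return cache[key]
    value = compute()
    if write:
        cache[key] = value
    return value


cache = {"rec1:claims": "old output"}
default_result = get_or_compute(cache, "rec1:claims", lambda: "new output")
uncached_result = get_or_compute(cache, "rec1:claims", lambda: "diagnostic", no_cache=True)
entry_after_no_cache = cache["rec1:claims"]
forced_result = get_or_compute(cache, "rec1:claims", lambda: "new output", force=True)
entry_after_force = cache["rec1:claims"]
```

In this model a forced run replaces the stored entry, while an uncached run leaves the cache exactly as it found it.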

Operational Guidance

For stable comparisons:

  • keep input records stable
  • pin model routing with --llm-model and --llm-fallback
  • keep utility strategy flags stable
  • use the same target node when comparing runs

For dataset iteration:

  • edit only the records you need to change
  • rerun the same target
  • let nexa-gauge reuse unaffected records and routes

The cache is designed to make repeated evaluation practical without hiding invalidation. Stable work is reused; changed work is recomputed.