Reports & Metrics

Gump tracks structured metrics for every run. Use gump report to see them.

Basic report

gump report
Run: run_2026_03_24_1042
Workflow: tdd · Status: pass · Duration: 2m13s · Cost: $2.41

Step            Agent           Turns  Cost    Gate
───────────────────────────────────────────────────────
decompose       claude-opus       3   $0.31   schema ✓
build/1/tests   claude-haiku      8   $0.12   compile+test ✓
build/1/impl    claude-haiku     14   $0.18   compile+test ✓
build/2/tests   claude-haiku      6   $0.09   compile+test ✓
build/2/impl    claude-sonnet    12   $0.44   compile+test ✓ (escalated)
quality         —                 —   —       compile+lint+test ✓

Detailed step view

Drill into a specific step:
gump report --detail build/2/impl
This shows the step’s attempts (including failed ones), state-bag entries, files changed, a token breakdown, and guard activity.

Comparing runs

gump report --last 5
Shows a summary table of your last 5 runs — workflow, status, duration, cost, retries. Useful for spotting trends: is a particular workflow getting more expensive? Are retries increasing?
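Trend-spotting over those summaries is simple once the numbers are in hand. A minimal sketch, using hypothetical run dicts rather than gump's real output format:

```python
# Sketch: spotting a cost trend across recent runs.
# The run summaries below are hypothetical, not gump's real output format.
runs = [
    {"run": "run_2026_03_20_0910", "cost": 1.80},
    {"run": "run_2026_03_21_1133", "cost": 2.05},
    {"run": "run_2026_03_22_0847", "cost": 2.10},
    {"run": "run_2026_03_23_1502", "cost": 2.30},
    {"run": "run_2026_03_24_1042", "cost": 2.41},
]

costs = [r["cost"] for r in runs]
deltas = [b - a for a, b in zip(costs, costs[1:])]  # run-over-run change
if all(d > 0 for d in deltas):
    print(f"cost rising: ${costs[0]:.2f} -> ${costs[-1]:.2f}")
```

The same pattern applies to duration or retry counts.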

What Gump measures

High confidence (100%)

These are deterministic and exact:
  • Git diff (files changed, lines added/removed)
  • Gate exit codes (pass/fail)
  • Wall-clock duration per step and per run
  • Number of retries and escalations
  • Guard triggers

Medium confidence (~80%)

Tracked live from the agent stream, but subject to provider accuracy:
  • Tokens (input, output, cache read, cache write)
  • Cost (estimated from token counts and provider pricing)
  • Timestamps from the agent’s native events
  • Context window usage per step

Reconstructed (~50%)

Inferred from heuristics, not directly reported by all providers:
  • Turn count (cognitive cycles)
  • Turn classification (coding, execution, exploration, planning)
  • Shell command classification for Codex

Key metrics

Cost

Tracked per step, per item, per run. Aggregated from agent token reports. The estimate is based on published provider pricing — actual bills may differ slightly.
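The estimation itself is straightforward: tokens times rate. A minimal sketch, where the pricing table and token counts are illustrative values, not real provider rates:

```python
# Sketch: estimating cost from token counts and per-million-token pricing.
# These rates are illustrative, not actual provider pricing.
PRICING = {  # USD per 1M tokens: (input, output)
    "claude-haiku": (0.80, 4.00),
    "claude-sonnet": (3.00, 15.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    in_rate, out_rate = PRICING[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

cost = estimate_cost("claude-sonnet", 120_000, 6_000)
print(f"${cost:.2f}")  # $0.45
```

This is why cost sits at medium confidence: the token counts come from the provider stream, and the rates can lag published pricing.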

TTFD (Time To First Diff)

The time between agent launch and the first file modification. A long TTFD suggests the agent spent time exploring or planning before writing code. A useful signal for prompt optimization.
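Conceptually, TTFD is the gap between two timestamps in the event stream. A sketch with a hypothetical event schema (not gump's real one):

```python
# Sketch: computing TTFD from an event stream.
# The event shapes here are hypothetical, not gump's real event schema.
events = [
    {"ts": 0.0, "type": "agent_launch"},
    {"ts": 4.2, "type": "tool_call", "name": "read_file"},
    {"ts": 9.8, "type": "tool_call", "name": "read_file"},
    {"ts": 31.5, "type": "file_modified", "path": "src/lib.rs"},
]

launch = next(e["ts"] for e in events if e["type"] == "agent_launch")
first_diff = next(e["ts"] for e in events if e["type"] == "file_modified")
ttfd = first_diff - launch
print(f"TTFD: {ttfd:.1f}s")  # 31.5s of reading before the first edit
```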

Stall detection

Gump detects agents that spin in circles:
  • tool_error_count — repeated tool failures
  • correction_loops — edit → test → fail → edit cycles
  • fatal_loops — the agent hits the same error repeatedly
  • repeated_action_loops — identical tool calls in sequence
These counters appear in gump report --detail and help diagnose why a step needed many turns or failed.
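To make one of these counters concrete, here is a sketch of a repeated-action heuristic: counting runs of identical consecutive tool calls. The call-log format and threshold are assumptions for illustration, not gump's actual implementation:

```python
# Sketch: counting runs of identical consecutive (tool, args) calls,
# in the spirit of repeated_action_loops. Format/threshold are assumptions.
def repeated_action_loops(calls: list[tuple[str, str]], threshold: int = 3) -> int:
    """Count runs of >= threshold identical tool calls in sequence."""
    loops, run_len = 0, 1
    for prev, cur in zip(calls, calls[1:]):
        run_len = run_len + 1 if cur == prev else 1
        if run_len == threshold:  # count each run once, when it hits threshold
            loops += 1
    return loops

calls = [("grep", "foo"), ("grep", "foo"), ("grep", "foo"), ("edit", "a.py")]
print(repeated_action_loops(calls))  # 1
```

The other counters follow the same shape: a pass over the event stream looking for a repeating pattern.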

Context window usage

How much of the agent’s context window was consumed per step. High usage suggests the prompt or accumulated context is too large, which can degrade agent performance.
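The metric itself is just a ratio. A trivial sketch with illustrative numbers (the window size and token count are made up):

```python
# Sketch: context-window usage as a fraction of the model's limit.
# Both numbers below are illustrative values.
def context_usage(tokens_in_context: int, window_size: int) -> float:
    return tokens_in_context / window_size

usage = context_usage(tokens_in_context=164_000, window_size=200_000)
print(f"{usage:.0%}")  # 82%
```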

Analytics model

Gump organizes data at four levels: Event → Turn → Step → Run. Each event carries a timestamp and semantic labels. Turns are classified by their dominant activity (coding, exploration, execution). Steps aggregate turns. Runs aggregate steps. You can analyze at any level.
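The hierarchy can be sketched as plain dataclasses; the field names here are assumptions for illustration, not gump's real schema:

```python
# Sketch of the four-level hierarchy (Event -> Turn -> Step -> Run).
# Field names are assumptions, not gump's real schema.
from dataclasses import dataclass, field

@dataclass
class Event:
    ts: float
    label: str          # semantic label, e.g. "tool_call"

@dataclass
class Turn:
    events: list[Event]
    kind: str           # dominant activity: coding / exploration / execution

@dataclass
class Step:
    name: str
    turns: list[Turn] = field(default_factory=list)

@dataclass
class Run:
    steps: list[Step] = field(default_factory=list)

    def turn_count(self) -> int:  # aggregate upward through the levels
        return sum(len(s.turns) for s in self.steps)

run = Run(steps=[Step("build/1/tests", [Turn([], "coding"), Turn([], "execution")])])
print(run.turn_count())  # 2
```

Each level aggregates the one below it, which is what lets a single event stream answer questions at the turn, step, or run granularity.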