Reports & Metrics
Gump tracks structured metrics for every run. Use gump report to see them.
Basic report
Detailed step view
Drill into a specific step:
Comparing runs
What Gump measures
High confidence (100%)
These are deterministic and exact:
- Git diff (files changed, lines added/removed)
- Gate exit codes (pass/fail)
- Wall-clock duration per step and per run
- Number of retries and escalations
- Guard triggers
Medium confidence (~80%)
Tracked live from the agent stream, but subject to provider accuracy:
- Tokens (input, output, cache read, cache write)
- Cost (estimated from token counts and provider pricing)
- Timestamps from the agent’s native events
- Context window usage per step
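The cost estimate listed above can be illustrated with a minimal sketch. It is not Gump's implementation: the TokenUsage type, the estimate_cost and run_cost helpers, and the prices are all hypothetical, chosen only to show how token counts and per-million-token pricing combine into a per-step and per-run figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Token counts for one step (hypothetical structure, not Gump's schema)."""
    input: int = 0
    output: int = 0
    cache_read: int = 0
    cache_write: int = 0


# Made-up prices in USD per million tokens; real provider pricing differs.
EXAMPLE_PRICES = {
    "input": 3.00,
    "output": 15.00,
    "cache_read": 0.30,
    "cache_write": 3.75,
}


def estimate_cost(usage: TokenUsage, prices: dict) -> float:
    """Estimate the cost of one step from its token counts."""
    total = 0.0
    for kind in ("input", "output", "cache_read", "cache_write"):
        total += getattr(usage, kind) * prices[kind] / 1_000_000
    return total


def run_cost(steps: list, prices: dict) -> float:
    """Aggregate per-step estimates into a per-run estimate."""
    return sum(estimate_cost(step, prices) for step in steps)
```

Because the inputs are provider-reported token counts and published prices, the result is only as accurate as those two sources, which is why this metric sits in the medium-confidence tier.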
Reconstructed (~50%)
Inferred from heuristics, not directly reported by all providers:
- Turn count (cognitive cycles)
- Turn classification (coding, execution, exploration, planning)
- Shell command classification for Codex
Key metrics
Cost
Cost is tracked per step, per item, and per run, and is aggregated from agent token reports. The estimate is based on published provider pricing, so actual bills may differ slightly.
TTFD (Time To First Diff)
TTFD is the time between agent launch and the first file modification. A long TTFD suggests the agent spent time exploring or planning before writing code, which makes it a useful signal for prompt optimization.
Stall detection
Gump detects agents that spin in circles:
- tool_error_count: repeated tool failures
- correction_loops: edit → test → fail → edit cycles
- fatal_loops: the agent hits the same error repeatedly
- repeated_action_loops: identical tool calls in sequence
These counters appear in gump report --detail and help diagnose why a step needed many turns or failed.
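As a rough illustration of the repeated_action_loops idea (identical tool calls in sequence), the sketch below counts runs of consecutive identical calls. It is an assumption-laden example, not Gump's detector: the (tool, arguments) tuple shape, the count_repeated_action_loops name, and the min_repeats threshold are all hypothetical.

```python
from itertools import groupby


def count_repeated_action_loops(calls, min_repeats=3):
    """Count runs of at least `min_repeats` identical consecutive tool calls.

    `calls` is a sequence of (tool_name, arguments) tuples in the order the
    agent issued them. Both the tuple shape and the threshold are assumptions.
    """
    loops = 0
    # groupby groups adjacent equal items, so each group is one run.
    for _, run in groupby(calls):
        if sum(1 for _ in run) >= min_repeats:
            loops += 1
    return loops


calls = [
    ("read", "main.go"),
    ("bash", "go test ./..."),
    ("bash", "go test ./..."),
    ("bash", "go test ./..."),
    ("edit", "main.go"),
]
print(count_repeated_action_loops(calls))  # prints 1
```

Only adjacent duplicates count here; an agent that alternates between two calls would need a different check, closer to the correction_loops pattern.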