Gump Eats Its Own Cooking

How we build Gump, with Gump.

What this is

This is the workflow the Gump team uses to develop Gump itself. We publish it for transparency and as a reference for building advanced workflows that combine TDD, adversarial review via validators, and custom gates. Don't copy it as-is; use it as a starting point for your own.

gump run gump-dogfood --spec spec.md

The workflow

name: gump-dogfood
max_budget: 15.00

steps:
  - name: decompose
    type: split
    get:
      prompt: |
        Analyze this spec and the full codebase.
        Decompose into independent tasks.
        Each task must be implementable and testable in isolation.
        Order tasks by dependency — if task B depends on task A, A comes first.
    run:
      agent: claude-opus
    gate: [schema]
    hitl: before_gate
    each:
      - name: tests
        type: code
        get:
          prompt: |
            Write comprehensive tests for: {task.description}
            Cover edge cases and error conditions.
            Only create/modify test files in: {task.files}
        run:
          agent: claude-sonnet
          guard:
            max_turns: 40
        gate:
          - compile
          - "touched: *_test.*"
          - tests_found
        retry:
          - attempt: 2
            agent: claude-opus
          - exit: 3

      - name: impl
        type: code
        get:
          session:
            from: tests
          prompt: |
            Implement code to pass all tests.
            Do not modify the tests.
            Only modify implementation files in: {task.files}
            {prev.gate.review.comments}
        run:
          agent: claude-sonnet
          guard:
            max_turns: 60
            max_budget: 3.00
        gate:
          - compile
          - test
          - "untouched: *_test.*"
          - validate: validators/correctness-review
            diff: "{diff}"
            spec: "{task.description}"
            agent: claude-opus
        retry:
          - attempt: 2
            prompt: |
              Deviations: {gate.review.comments}
              Fix only these.
          - attempt: 4
            agent: claude-opus
            session: new
            worktree: reset
          - exit: 6

  - name: quality
    gate: [compile, lint, test]

  - name: smoke
    gate:
      - bash: "make smoke-test"

How it works

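The decomposition plan is validated manually by a human (hitl: before_gate). Each task then follows: tests (TDD red) → impl (TDD green) → validator review → retries if a gate fails. On impl, retries from attempt 2 narrow the prompt to the deviations the validator reported, and from attempt 4 the step escalates to claude-opus with a new session and a reset worktree; exit: 6 caps the ladder. The max_budget: 3.00 guard on impl keeps a single task from consuming the run-level max_budget of 15.00. The quality step then runs the compile, lint, and test gates, and the final smoke gate runs make smoke-test, Gump's end-to-end smoke suite.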

Real metrics

These figures come from actual Gump development runs and are updated as significant data accumulates.

Metric	Value
Average cost per run	$3–$8
Tasks per run	2–5
First-try success rate (impl)	~60%
Average retries per task	1.4
Escalations per run	0–2
Average duration	3–7 min

Adapt to your project

Replace the smoke gate

bash: "make smoke-test" is specific to Gump. Replace with your own integration test.
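
As a rough sketch, the smoke step keeps its shape and only the command changes. The script path below is a hypothetical placeholder, not something Gump ships; substitute whatever command runs your integration suite.

```yaml
  - name: smoke
    gate:
      # Hypothetical command; point this at your own integration suite.
      - bash: "./scripts/integration-test.sh"
```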

Remove HITL

Remove hitl: before_gate on decompose for CI/CD pipelines.
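
For illustration, the decompose step without the manual checkpoint might look like the fragment below. It only drops the hitl line; the get and each blocks are elided and stay as in the workflow above. With the checkpoint gone, the schema gate is the only check on the plan, so this assumes you trust the decomposition enough to run unattended.

```yaml
  - name: decompose
    type: split
    # get: and each: blocks unchanged from the workflow above
    run:
      agent: claude-opus
    gate: [schema]
    # hitl: before_gate was here; it is removed for unattended CI/CD runs
```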

Adjust per-step budget

The max_budget: 3.00 on impl is calibrated for Gump. Adjust for your task sizes.
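
A sketch of a tighter guard for smaller tasks follows. The value 1.00 is an illustrative assumption, not a recommendation; pick a figure from the per-task cost you observe in your own runs.

```yaml
      - name: impl
        type: code
        # get:, gate:, and retry: blocks unchanged from the workflow above
        run:
          agent: claude-sonnet
          guard:
            max_turns: 60
            max_budget: 1.00   # illustrative; the Gump workflow uses 3.00
```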

Swap the validator

Replace validators/correctness-review with your own domain-specific reviewer.
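
A minimal sketch of the swap, assuming a reviewer of your own lives at validators/api-contract-review (a hypothetical path). Only the validate target changes; the diff, spec, and agent inputs are carried over from the workflow above, written here as sibling keys of validate.

```yaml
        gate:
          - compile
          - test
          - "untouched: *_test.*"
          # Hypothetical validator path; replace with your own reviewer.
          - validate: validators/api-contract-review
            diff: "{diff}"
            spec: "{task.description}"
            agent: claude-opus
```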
