done when
Syntax
done when <predicate>
What it does
The spine of "you can't fake done" โ the predicate the loop actually runs each cycle to decide whether it's finished. It's the difference between a loop and an open-ended prompt: the agent doesn't get to declare success, it has to pass a check. The command runs in your shell with your privileges, like an npm script, so it must be a real, runnable command โ a paraphrased or aspirational check is a loop that can never go green.
There are four forms. A test passes (a named test or suite). A command passes (exit code 0). A scan finds nothing โ which means both exit 0 and empty output, so a lingering match keeps the loop red. Or a human confirms a plain-language condition, for the things only an eye can judge. Two modifiers sharpen a check: passes N times re-runs it to smoke out a lucky green (a flake guard), and a skill predicate (the skill "โฆ" approves) hands judgment to a rubric or LM judge for the non-deterministic parts. You can list as many done when lines as you need โ all of them must pass, a conjunction โ which is how you combine a deterministic test with an eval that judges how the work was done. Write this line first, before any behavior: if you can't state the check, you don't yet understand the goal.
Example
done when the test "billing.spec.ts::apostrophe" passes
done when "pnpm test" passes
done when "semgrep --severity=high" finds nothing
done when a human confirms "the UI looks right"the four forms
# combine a deterministic test with a trajectory eval โ both must pass
done when "pnpm test cart" passes 3 times
done when the skill "code-review" approves on the trajectory
the bar: didn't weaken a test to go green; no writes outside src/cart/test + eval, and a flake guard
Common mistakes
- A paraphrased or fictional command. The predicate runs verbatim in your shell. If the command doesn't exist or the test name is wrong, the loop can never go green. Run the check by hand once before you ship the loop.
- Confusing
passeswithfinds nothing.passesonly wants exit 0; a scan that exits 0 but prints matches would falsely pass. Usefinds nothingfor scanners โ it requires exit 0 and empty output. - Only a human check on an unattended loop.
a human confirms โฆblocks until someone answers โ fine in-session, a dead stop for a headless run. Pair it with a machine check, or make the deterministic form the gate. - A slow check with no flake guard on flaky suites. A costly predicate re-run every cycle is expensive; a flaky one can pass by luck. Add
passes N timeswhere a green can happen by chance.