LoopFlow

done when

How a loop verifies itself. Core syntax

Syntax

done when <predicate>

What it does

The spine of "you can't fake done": the predicate the loop actually runs each cycle to decide whether it's finished. It's the difference between a loop and an open-ended prompt: the agent doesn't get to declare success; it has to pass a check. The command runs in your shell with your privileges, like an npm script, so it must be a real, runnable command. A paraphrased or aspirational check gives you a loop that can never go green.

There are four forms. A test passes (a named test or suite). A command passes (exit code 0). A scan finds nothing, which means both exit code 0 and empty output, so a lingering match keeps the loop red. Or a human confirms a plain-language condition, for the things only an eye can judge. Two modifiers sharpen a check: passes N times re-runs it to smoke out a lucky green (a flake guard), and a skill predicate (the skill "…" approves) hands judgment to a rubric or LM judge for the non-deterministic parts. You can list as many done when lines as you need. All of them must pass (a conjunction), which is how you combine a deterministic test with an eval that judges how the work was done. Write this line first, before any behavior: if you can't state the check, you don't yet understand the goal.
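
To make the difference between "passes" and "finds nothing" concrete, here is a minimal Python sketch of the two command-based forms. It is not LoopFlow's implementation: the function names command_passes and scan_finds_nothing are hypothetical, and treating "empty output" as empty stdout (ignoring stderr and surrounding whitespace) is an assumption.

```python
import subprocess


def command_passes(cmd: str) -> bool:
    """'<cmd>' passes: the shell command exits with code 0."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode == 0


def scan_finds_nothing(cmd: str) -> bool:
    """'<cmd>' finds nothing: exit code 0 AND no output.

    Assumption: "output" means stdout, with whitespace ignored.
    """
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.strip() == ""
```

Under these assumptions, a scanner that exits 0 but still prints a match would satisfy "passes" and fail "finds nothing", which is why a lingering match keeps the loop red.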

Example

done when the test "billing.spec.ts::apostrophe" passes
done when "pnpm test" passes
done when "semgrep --severity=high" finds nothing
done when a human confirms "the UI looks right"
the four forms
# combine a deterministic test with a trajectory eval; both must pass
done when "pnpm test cart" passes 3 times
done when the skill "code-review" approves on the trajectory
  the bar: didn't weaken a test to go green; no writes outside src/cart/
test + eval, and a flake guard
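
The flake guard and the all-must-pass rule can be sketched the same way. This is a hedged illustration, not LoopFlow's own code: passes_n_times and loop_is_done are hypothetical names, and it assumes that "passes N times" means every one of the N runs must exit 0.

```python
import subprocess
from typing import Callable, Iterable


def passes_n_times(cmd: str, n: int) -> bool:
    """Flake guard: run cmd n times; a single non-zero exit fails the check.

    Assumption: all n runs must pass (no "best of n").
    """
    for _ in range(n):
        if subprocess.run(cmd, shell=True, capture_output=True).returncode != 0:
            return False
    return True


def loop_is_done(predicates: Iterable[Callable[[], bool]]) -> bool:
    """Conjunction: the loop is done only when every predicate passes.

    all() stops at the first failing predicate, so later checks are skipped.
    """
    return all(check() for check in predicates)
```

A loop with two done when lines would then be modeled as loop_is_done([...]) over two callables: one wrapping the repeated test run, one wrapping the judge's verdict.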

Common mistakes

Related