LoopFlow

How to make Claude Code self-correct

Claude Code does one pass and stops; a .loop gives it a verifiable finish line and a reflect back-edge so it retries until the tests actually pass.

Claude Code does one pass and stops. To make it self-correct (retry until the tests actually pass), you need a verifiable finish line and a back-edge that feeds each failure into the next attempt. That is exactly what a .loop file is.

The problem: agents do one pass and stop

Ask Claude Code to "fix the failing checkout test" and it edits some code, runs the suite once (maybe), and replies "Done: fixed the rounding." Sometimes it's right. Often the test is still red, or a different test broke, or it never ran the suite at all. "Done" is whatever the model said, not a command that passed.

The fix isn't a better prompt. A prompt fires once and trusts the model's word. What you want is a control loop: the agent acts, a real check grades the result, and on red the agent tries again, with the failure in hand, until the check is green or a ceiling stops it. That control loop is the artifact you edit, not the prose.
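The control loop described above can be sketched in a few lines of Python. This is only an illustration, not LoopFlow's runtime: `run_until_green`, `act`, `check`, and `max_tries` are hypothetical names, with `act` standing in for the agent and `check` for whatever grades the result.

```python
from typing import Callable, Optional


def run_until_green(
    act: Callable[[Optional[str]], None],
    check: Callable[[], Optional[str]],
    max_tries: int,
) -> bool:
    """Retry act() until check() reports no failure or the ceiling is hit.

    act   -- hypothetical agent step; receives the previous failure (or None)
    check -- hypothetical grader; returns None on green, a failure message on red
    """
    last_failure: Optional[str] = None
    for _ in range(max_tries):
        act(last_failure)        # the agent works with the last failure in hand
        last_failure = check()   # the verdict comes from the check, not the agent
        if last_failure is None:
            return True          # green: stop immediately
    return False                 # ceiling reached while still red
```

The return value only says whether the check went green before the ceiling; what to do on `False` (warn, stop, escalate) is left to the caller.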

What "self-correcting" actually means

A self-correcting run is four repeating steps (plan, act, observe, reflect) wrapped around a verdict the model can't author.

The load-bearing part is observe. At that step the runtime (plain deterministic code, not the LLM) spawns your check as a real OS process and reads the exit code from the operating system. Exit 0 = pass; anything else = fail. The model does the work; the OS grades it. That's why "done" can't be faked: the verdict never comes out of the model's mouth.
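As a rough sketch of what an observe step like this could look like, the snippet below uses Python's standard `subprocess` module. The function name `observe` and its return shape are assumptions made for illustration, not LoopFlow's actual implementation.

```python
import subprocess
import sys


def observe(command: list[str]) -> tuple[bool, str]:
    """Run the check as a real OS process and grade it by exit code.

    Returns (passed, output). Exit code 0 means pass; anything else means fail.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    passed = result.returncode == 0
    # Keep the combined output so a later step can see what broke.
    return passed, result.stdout + result.stderr


if __name__ == "__main__":
    # Stand-in check: a child Python process that exits with status 1.
    passed, output = observe([sys.executable, "-c", "raise SystemExit(1)"])
    print("pass" if passed else "fail")
```

Nothing in this function asks a model for its opinion: the only input to the verdict is the integer the operating system hands back.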

The other load-bearing part is reflect, the back-edge. Without it, a retry is just the same blind attempt again. With it, cycle two reads what broke in cycle one. That is the difference between a loop that thrashes and one that converges.
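One way to picture the back-edge: each cycle's diagnosis is appended to the context the next plan step reads. The sketch below is a guess at the shape of that hand-off; `next_plan_context` and the text layout it produces are hypothetical, not part of LoopFlow.

```python
def next_plan_context(goal: str, diagnoses: list[str]) -> str:
    """Build the text the next plan step sees: the goal plus earlier diagnoses."""
    lines = [f"Goal: {goal}"]
    if diagnoses:
        lines.append("What broke in earlier cycles:")
        # Number the diagnoses so the planner can tell old failures from new ones.
        lines.extend(f"- cycle {i}: {note}" for i, note in enumerate(diagnoses, start=1))
    return "\n".join(lines)
```

With an empty `diagnoses` list, every retry sees exactly the same context, which is the blind re-attempt described above.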

A concrete walkthrough

Say the checkout tax test is red. Here is the whole self-correcting loop, with the finish line at the top and the safety net at the bottom:

loop "fix the checkout tax test":
  goal: the checkout tax test passes with no regressions
  done when the test "checkout.spec.ts::tax" passes

  look at: src/checkout, and the last failure
  allow edits automatically, but ask me before pushes

  each cycle: plan, then act, then observe
  when it fails: reflect on which layer broke, then plan again
  after 6 tries: stop and warn "tax fix thrashing"

Every cycle runs the named test. Red → reflect writes a one-line diagnosis ("rounding happens before the discount, not after") → that becomes context for the next plan. The loop stops only when the process your machine ran returns 0. It works on a branch and, with no git: block, never pushes to main.

Two lines earn their keep. "look at:" ends with "and the last failure": reflect's diagnosis only reaches the next plan if you pass it forward. And "after 6 tries" is the ceiling: a back-edge with no ceiling is a token bonfire. Never write one without the other.

The check doesn't have to be a test. Any command that exits non-zero on failure works, and you can list several; all must pass:

loop "harden the auth middleware":
  goal: rate limiting works and no high-severity findings remain
  done when "pnpm test src/auth" passes
  done when "semgrep --severity=high" finds nothing

  look at: src/auth, and the last failure
  each cycle: plan, then act, then observe
  when it fails: reflect, then plan again
  after 8 tries: stop and warn "auth loop stuck"

finds nothing means "this scanner must report zero": exit 0 and empty output. Stacking a test with a scan is how you make one loop prove both "it works" and "it's clean."
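The two kinds of check, and the "all must pass" rule, can be sketched as follows. The helper names (`passes`, `finds_nothing`, `done`) are hypothetical, and treating "empty output" as empty standard output is an assumption; the real runtime may define it differently.

```python
import subprocess
import sys
from typing import Callable

Check = tuple[Callable[[list[str]], bool], list[str]]


def passes(command: list[str]) -> bool:
    """A 'passes' check: green when the process exits with code 0."""
    return subprocess.run(command, capture_output=True, text=True).returncode == 0


def finds_nothing(command: list[str]) -> bool:
    """A 'finds nothing' check: exit code 0 AND no output (stdout assumed)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0 and result.stdout.strip() == ""


def done(checks: list[Check]) -> bool:
    """The loop is done only when every listed check is green.

    all() stops at the first red check, so later commands are not run.
    """
    return all(predicate(command) for predicate, command in checks)


if __name__ == "__main__":
    # Stand-in commands: child Python processes instead of a real test runner or scanner.
    quiet = [sys.executable, "-c", "pass"]
    noisy = [sys.executable, "-c", "print('1 finding')"]
    print(done([(passes, quiet), (finds_nothing, noisy)]))
```

A scanner that exits 0 but still prints a finding fails the second check here, which is the distinction between "passes" and "finds nothing".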

How to run it

Install once, then two ways to drive it:

npx @loop-lang/loop init      # writes the /loopflow skill + AGENTS.md into your repo

A .loop is plain text, so the same file works in Cursor or Copilot: point the agent at AGENTS.md and it reads the goal, the check, and the stopping rules the same way Claude Code does.

Common pitfalls

Wrap-up

Making Claude Code self-correct isn't a trick; it's structure. A goal, a check the OS grades, a reflect back-edge, and a try ceiling turn one-shot prompting into a loop that retries until the tests pass and stops when they do. Write it once, re-run it forever, review it in a PR.

Try it now in the playground, or learn the whole language line by line in the tutorial.