refactor(capitalize): trim PASS A to the restraint rule + flag untested gate

PASS A done-detection was ~60% machinery a capable agent already does for
free (the baseline checked done tasks and left the umbrella task alone
unaided). Cut the git-command how-to and worked example; keep only the
load-bearing restraint rule (flip only on a clean task<->commit map;
partial/umbrella/vague stay unchecked; never guess).
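
For illustration only, the restraint rule can be sketched in Python. The skill itself is prose instructions to an agent, not code, so `Match`, `Task`, and `should_flip` are hypothetical names introduced here, and the task titles are the examples quoted in the skill text.

```python
from dataclasses import dataclass
from enum import Enum


class Match(Enum):
    """How well the git evidence covers a task (hypothetical classification)."""
    EXACT = "exact"      # a commit/diff did exactly what the task says
    PARTIAL = "partial"  # only part of an umbrella task shipped
    NONE = "none"        # no precise git evidence (vague task, or a guess)


@dataclass
class Task:
    title: str
    match: Match


def should_flip(task: Task) -> bool:
    """Flip `- [ ]` to `- [x]` only on a clean task<->commit map."""
    return task.match is Match.EXACT


tasks = [
    Task("add retry-with-backoff", Match.EXACT),
    Task("harden X", Match.PARTIAL),  # 1 of 3 parts shipped
    Task("Deploy", Match.NONE),       # vague, no precise evidence
]
flipped = [t.title for t in tasks if should_flip(t)]
```

Everything that is not an exact match stays unchecked by default, so a wrong classification can only leave a finished task unchecked; it cannot mark an unfinished one as done.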

Add a Red flag: the STEP 3 gate STOP was never exercised (non-interactive
build harness printed the gate then proceeded as approved) — confirm it
halts before any write on first real use. TDD note records the same caveat.
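
A minimal sketch of the property to confirm on first real use, assuming the gate can be modelled as a prompt followed by writes. `apply_with_gate`, `no_input`, and the injected `ask`/`write` callables are hypothetical and are not part of the skill; the point is only that missing input must count as "not approved".

```python
from typing import Callable


def apply_with_gate(
    changes: list[str],
    ask: Callable[[str], str],
    write: Callable[[str], None],
) -> bool:
    """Show the gate, wait for an explicit 'yes', and only then write."""
    try:
        answer = ask("Apply these changes? [yes/no] ")
    except EOFError:
        # Non-interactive input: treat as NOT approved instead of proceeding.
        return False
    if answer.strip().lower() != "yes":
        return False
    for change in changes:
        write(change)
    return True


def no_input(_prompt: str) -> str:
    """Stand-in for input() when stdin is closed (it raises EOFError)."""
    raise EOFError


written: list[str] = []
approved = apply_with_gate(["check task 1"], no_input, written.append)
```

Under these assumptions a non-interactive run ends with `approved` false and `written` empty, which is the opposite of the "printed the gate then proceeded as approved" behavior described above.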

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01X3e8LaH2vymmxyh36h3jFU
Bastien Chanot 2026-06-19 18:17:45 +02:00
parent be0f047f04
commit 765e9d78b1


@@ -160,15 +160,12 @@ concatenated class names" entry.)
 Runs only if `.claude/tasks/TODO.md` exists (STEP 0). Two passes.
-**PASS A — done-detection (TODO → reality).** For each unchecked `- [ ]`, decide
-whether the session actually finished it, grounding on git (`git log`,
-`git diff HEAD`, `git show`) AND the conversation. Propose `[x]` ONLY when the
-task ↔ commit/code map is unambiguous (task "add retry-with-backoff" ↔ a commit
-that adds exactly that).
-- Partial / umbrella tasks ("harden X" covering 3 things when only 1 shipped) →
-leave unchecked.
-- Vague tasks ("Commit", "Deploy", "test it") with no precise git evidence →
-leave unchecked. Never check on assumption or on a guess.
+**PASS A — done-detection (TODO → reality).** Detection is free — a capable
+agent already spots the finished tasks from the STEP 1 git scan. The only rule
+worth stating is **restraint**: flip `- [ ]` → `- [x]` ONLY when the task maps
+cleanly to a commit/diff that did exactly it. Partial / umbrella tasks
+("harden X" when 1 of 3 parts shipped) and vague tasks ("Commit", "Deploy",
+"test it") with no precise git evidence stay unchecked. Never check on a guess.
 **PASS B — capture (conversation → TODO).** Spot explicit to-dos voiced in the
 session ("il faut X", "TODO Y", "à corriger Z", new directives) that are absent
@@ -330,6 +327,11 @@ checkpointed (ritual).
 ## Red flags — STOP
+- **The STEP 3 gate STOP has never been exercised in test** (the build harness
+was non-interactive — it printed the gate then proceeded as "all approved").
+On the FIRST real run, confirm STEP 3 halts and waits for approval BEFORE any
+file write. If anything is written before you answer, the gate is broken —
+stop and fix it. This is the skill's #1 guarantee; it is currently unverified.
 - About to append a registry entry without having grepped + read the registry
 for it → dedup skipped.
 - About to treat a reworded item as new without checking meaning → semantic
@@ -354,7 +356,12 @@ two registries on an earlier run (non-deterministic). This skill's mandatory
 gate (STEP 3), anti-noise filter (STEP 2B), explicit-only capture, and
 one-incident-one-registry rule make those deterministic.
-GREEN re-run on the same fixture must: stop at the gate, drop both dups (footer
-shows existing IDs), log jigsaw as ONE learning, check only the cleanly-done
-task, leave the umbrella "harden" task unchecked, add only the explicit README
-to-do, ignore the push/tag parasite, and route the GraphQL directive to BDR.
+GREEN re-run on the same fixture confirmed: drops both dups (footer shows
+existing IDs), logs jigsaw as ONE learning, checks only the cleanly-done task,
+leaves the umbrella "harden" task unchecked, adds only the explicit README
+to-do, ignores the push/tag parasite, routes the GraphQL directive to BDR.
+**Caveat — gate not exercised:** the STEP 3 STOP itself was NOT tested. The
+harness is non-interactive — GREEN printed the gate then proceeded as "all
+approved", so the halt-before-write behavior is unverified. See the first
+Red flag; confirm it on first real use.