Merge feature/reconcile-skill into develop

This commit is contained in:
Bastien Chanot 2026-06-30 13:42:56 +02:00
commit aede7af793
12 changed files with 1416 additions and 9 deletions

View File

@ -62,6 +62,7 @@ rules:
| BDR-038 | 2026-06-27 | deploy skill: per-project learning runbook, two-moment cold-resume | accepted | | BDR-038 | 2026-06-27 | deploy skill: per-project learning runbook, two-moment cold-resume | accepted |
| BDR-039 | 2026-06-29 | Gitea branch protection = Option-1 owner-pushable, not require-PR | accepted | | BDR-039 | 2026-06-29 | Gitea branch protection = Option-1 owner-pushable, not require-PR | accepted |
| BDR-040 | 2026-06-29 | doc-syncer MINOR-shape oracle: deterministic floor under LLM's MINOR call | accepted | | BDR-040 | 2026-06-29 | doc-syncer MINOR-shape oracle: deterministic floor under LLM's MINOR call | accepted |
| BDR-041 | 2026-06-30 | /reconcile = deterministic declared-vs-real engine + thin gated skill (reconciler, not lister) | accepted |
--- ---
@ -635,3 +636,12 @@ rules:
- **ENGRAVED LIMIT — do not over-read the guarantee**: oracle catches STRUCTURAL/size significance, NOT semantic. A 3-line edit that CHANGES MEANING, no heading, small → still reads MINOR (rc 0) and auto-commits. Deterministic FLOOR under LLM judgment = REDUCTION of RISK-1's gross cases, NOT elimination. LLM owns the semantic call above the floor; the visible surface ([[BDR-036]]) stays the content backstop. - **ENGRAVED LIMIT — do not over-read the guarantee**: oracle catches STRUCTURAL/size significance, NOT semantic. A 3-line edit that CHANGES MEANING, no heading, small → still reads MINOR (rc 0) and auto-commits. Deterministic FLOOR under LLM judgment = REDUCTION of RISK-1's gross cases, NOT elimination. LLM owns the semantic call above the floor; the visible surface ([[BDR-036]]) stays the content backstop.
- **Scope tranché**: ① oracle + ② [[LRN-071]] masked-commit fix built. ③ branch-guard (doc-commit refusing main/develop) DEFERRED — duplicates the protected-base predicate a 3rd time (lib + gitflow hook + here); migrated repos have the hook → ③ guards a state that shouldn't exist. Reconsider only for repos outside `gitflow init`. - **Scope tranché**: ① oracle + ② [[LRN-071]] masked-commit fix built. ③ branch-guard (doc-commit refusing main/develop) DEFERRED — duplicates the protected-base predicate a 3rd time (lib + gitflow hook + here); migrated repos have the hook → ③ guards a state that shouldn't exist. Reconsider only for repos outside `gitflow init`.
- **Build**: TDD RED→GREEN. run-doc-shape.sh 19/19 (incl. threshold boundary + env-override) + behavioral Scenario D. Wired doc-syncer STEP A4 + doc-commit.md ACKNOWLEDGMENTS coherence. shellcheck clean. - **Build**: TDD RED→GREEN. run-doc-shape.sh 19/19 (incl. threshold boundary + env-override) + behavioral Scenario D. Wired doc-syncer STEP A4 + doc-commit.md ACKNOWLEDGMENTS coherence. shellcheck clean.
## BDR-041 — /reconcile = deterministic declared-vs-real engine + thin gated skill (reconciler, not lister)
- **Date**: 2026-06-30
- **Status**: accepted
- **Decision**: `/reconcile` answers "what work is REALLY open?" by confronting DECLARATIVE sources (TODO `[x]`/`[ ]`/`[~]`, registry statuses, `## Index`) against REAL state (git/fs, registry BODY). Split: `lib/reconcile.sh` (deterministic engine — body enumeration, `reconcile_oracle_*`, BLK last-block-wins, lexical deferral sweep, contradiction candidates, pure `reconcile_verdict` kernel) + `skills/reconcile/SKILL.md` (thin orchestration + A/B/C write-back gate). Founding principle = RECURSIVE COHERENCE: never use a declarative source as oracle (Index/checkbox/status/path-name) — enumerate from BODY `## ID` headings, decide done/stale from git/fs. TESTED: run-reconcile.sh T1 reds if engine reads `## Index` (shim → 51≠72, canary LRN-020 dropped). Registries READ-ONLY (curation = /prune-memory).
- **Why**: RED proved a capable agent reconciles when GUIDED ("use git + justify") but MIRRORS the TODO when not (false positives + missed contradiction), and even guided hits the compound-status trap (BLK-008). Value = determinism + cheapness + gate, NOT teaching ([[LRN-031]], [[LRN-075]]). Mechanical 80% → script; judgment 20% → thin skill (writing-skills: automate mechanical, document judgment).
- **Alternatives rejected**: monolithic teaching/discipline skill (agents reconcile when guided → no teaching value); `grep '[ ]'` lister (reproduces the lie); trust Index (drift, [[LRN-055]]); blocking write-back gate (friction — A/B/C surface chosen).
- **Honest limits (graven)**: deferral detection LEXICAL (marked-only; unmarked "à reprendre quand X" missed); contradictions = CANDIDATES (token overlap), surfaced not asserted; cross-repo "not verifiable here"; cross-ref verdicts ("[~] done because chantier X below complete") surfaced, not auto-resolved.
- **Reference**: `lib/reconcile.sh`, `lib/tests/run-reconcile.sh` (20/20) + `lib/tests/fixtures/` (neutral-named, [[LRN-077]]), `skills/reconcile/SKILL.md`, CLAUDE.md "Skill routing". Born of the 2026-06-29 manual inventory (its known-good oracle). Built via superpowers:writing-skills. See [[LRN-075]], [[LRN-076]], [[EVAL-011]].

View File

@ -31,6 +31,7 @@ rules:
| EVAL-008 | 2026-06-27 | Doc-sync coupled machinery — 28/28 real-exec, swap-sweep caught prior debt | keep | | EVAL-008 | 2026-06-27 | Doc-sync coupled machinery — 28/28 real-exec, swap-sweep caught prior debt | keep |
| EVAL-009 | 2026-06-27 | deploy skill subagent-driven build: multi-stage review + pressure-test net-positive | keep | | EVAL-009 | 2026-06-27 | deploy skill subagent-driven build: multi-stage review + pressure-test net-positive | keep |
| EVAL-010 | 2026-06-29 | prune-memory hardening: RED-7 deterministic fix + RED-8 accept + 34-row index backfill | keep | | EVAL-010 | 2026-06-29 | prune-memory hardening: RED-7 deterministic fix + RED-8 accept + 34-row index backfill | keep |
| EVAL-011 | 2026-06-30 | /reconcile build: RED contaminated→corrected (unguided control), GREEN behavioral confirmed, dogfooded on itself | keep |
--- ---
@ -120,3 +121,10 @@ rules:
- **method**: read-first cartography (sub-agent, confirmed) → RED-7 closed by a DETERMINISTIC test ([[LRN-046]]) + STEP-2 example fictionalized → RED-8 re-reviewed, consciously accepted ([[LRN-047]]) → 34 missing Index rows composed + inserted in ID-sorted slots → STEP-4 verify zero MISSING/ORPHAN; deterministic suite all-green, shellcheck clean. - **method**: read-first cartography (sub-agent, confirmed) → RED-7 closed by a DETERMINISTIC test ([[LRN-046]]) + STEP-2 example fictionalized → RED-8 re-reviewed, consciously accepted ([[LRN-047]]) → 34 missing Index rows composed + inserted in ID-sorted slots → STEP-4 verify zero MISSING/ORPHAN; deterministic suite all-green, shellcheck clean.
- **anomalies**: (1) RED-7 test FALSE-GREEN caught in real time — ugrep parsed `-9..` as an option → empty → green; fixed via /usr/bin/grep ([[LRN-074]]). The RED was WATCHED, not assumed. (2) RED-7 premise verified: LRN-014/016 ARE complementary → the old example modeled a WRONG merge, not just primed it. (3) backfill: 4/5 title-derived Applies-to (the awk-missed entries) missed a real future-app nuance on re-read → corrected before insert (without the 5-check, 4 Index rows would have diverged from source, engraved forever). (4) almost wrote a colliding EVAL-009 (deploy) — read the file first → EVAL-010. (5) pre-existing LRN-021 Index row out of ID-order → moved. - **anomalies**: (1) RED-7 test FALSE-GREEN caught in real time — ugrep parsed `-9..` as an option → empty → green; fixed via /usr/bin/grep ([[LRN-074]]). The RED was WATCHED, not assumed. (2) RED-7 premise verified: LRN-014/016 ARE complementary → the old example modeled a WRONG merge, not just primed it. (3) backfill: 4/5 title-derived Applies-to (the awk-missed entries) missed a real future-app nuance on re-read → corrected before insert (without the 5-check, 4 Index rows would have diverged from source, engraved forever). (4) almost wrote a colliding EVAL-009 (deploy) — read the file first → EVAL-010. (5) pre-existing LRN-021 Index row out of ID-order → moved.
- **action**: keep. RED-7 GREEN (deterministic), RED-8 documented-accept, drift 34→0. [[LRN-073]] + [[LRN-074]] engraved — 2 pattern-families this session (fail-silent [[LRN-066]]/[[LRN-071]] + command-assumption [[LRN-074]]). - **action**: keep. RED-7 GREEN (deterministic), RED-8 documented-accept, drift 34→0. [[LRN-073]] + [[LRN-074]] engraved — 2 pattern-families this session (fail-silent [[LRN-066]]/[[LRN-071]] + command-assumption [[LRN-074]]).
## EVAL-011 — /reconcile build (TDD): contaminated RED corrected, behavioral GREEN, dogfood on itself
- **Date**: 2026-06-30
- **output**: `lib/reconcile.sh` (engine) + `lib/tests/run-reconcile.sh` (20/20) + `lib/tests/fixtures/` (neutral) + `skills/reconcile/SKILL.md` (thin) + CLAUDE.md routing.
- **method**: 2-arm RED — GUIDED baselines (A/B, "use git + justify") SUCCEEDED = contaminated; UNGUIDED tempting baselines (a4872/a0f68) MIRRORED the TODO = real failure ([[LRN-075]]). RED-B = deterministic Index-ignore with TEETH (shim engine to read Index → reds). GREEN behavioral (a8404): same terrain + skill → verified via engine, stale reported done w/ SHAs, applied A/B/C gate, surfaced cross-ref as candidate. Dogfood: ran on the live repo, found its OWN chantier, marked S3 PARTIAL (routage absent per path oracle), not done.
- **anomalies**: (1) first baseline LEADING → corrected with an unguided control ([[LRN-075]]). (2) fixture-name false-signal — a0f68 read "pre-reconcile" from the dir name → re-froze fixtures neutral ([[LRN-077]]). (3) harness caught a real bug mid-build: BLK-004 status bleed from BLK-005's header ([[LRN-076]]). (4) META: marked `[x] routage DONE` BEFORE the CLAUDE.md edit succeeded (errored — Read-first) → created the exact declared-vs-real gap `/reconcile` traps, caught by the next verify. The tool's build produced the gap it detects.
- **action**: keep. RED watched red before green ([[LRN-074]] discipline), bug caught + fixed, behavioral loop closed.

View File

@ -248,3 +248,9 @@ rules:
- Contradiction caught: chantier `--help` (STEP 0.5 per SKILL.md) contradicts [[BDR-001]] accepted (helper via session-start hook; per-SKILL.md copy REJECTED) → `--help` BLOCKED pending BDR-001 resolution (supersede or re-route). - Contradiction caught: chantier `--help` (STEP 0.5 per SKILL.md) contradicts [[BDR-001]] accepted (helper via session-start hook; per-SKILL.md copy REJECTED) → `--help` BLOCKED pending BDR-001 resolution (supersede or re-route).
- Our OWN manual inventory had an error: line 26 cleanup-machine declared "auto-cleaned next make plugin" but fs shows darwin-skill still present → demoted "done"→"still deferred" after fs check. Proof-by-example the queue needs a RECONCILER (declared-vs-real), not a `[ ]`-grepper. - Our OWN manual inventory had an error: line 26 cleanup-machine declared "auto-cleaned next make plugin" but fs shows darwin-skill still present → demoted "done"→"still deferred" after fs check. Proof-by-example the queue needs a RECONCILER (declared-vs-real), not a `[ ]`-grepper.
- Reconciled TODO (5 ticked + 1 requalify + 1 split, annotated `reconcile 2026-06-29` w/ evidence) + queued `/reconcile` skill chantier (4-cat output, inter-registry contradiction detection, GATED TODO edit). Sequencing: /reconcile FIRST (oracle = today's inventory, perishable) → resolve BDR-001 → --help. - Reconciled TODO (5 ticked + 1 requalify + 1 split, annotated `reconcile 2026-06-29` w/ evidence) + queued `/reconcile` skill chantier (4-cat output, inter-registry contradiction detection, GATED TODO edit). Sequencing: /reconcile FIRST (oracle = today's inventory, perishable) → resolve BDR-001 → --help.
## 2026-06-30 — /reconcile skill shipped (declared-vs-real reconciler)
- Built `/reconcile` via superpowers:writing-skills (TDD): engine `lib/reconcile.sh` + harness 20/20 + thin gated skill. Recursive coherence (never trust a declarative source, incl. Index) made a TESTED guarantee — T1 reds on an Index-reader shim. [[BDR-041]].
- RED 2-arm: guided baselines succeed (contaminated) / unguided mirror the TODO (real failure) → value = determinism+gate, not teaching ([[LRN-075]]). GREEN behavioral confirmed; dogfooded on its own chantier (S3 marked partial honestly). [[EVAL-011]].
- Learnings: unguided-control RED ([[LRN-075]]); last-block-wins status + BLK-004 bleed bug ([[LRN-076]]); neutral fixture names = same symptom/distinct cause as [[LRN-074]] ([[LRN-077]]).
- Ship: feature/reconcile-skill → develop (gitflow finish). Push to origin gated (ASK).

View File

@ -94,6 +94,9 @@ rules:
| LRN-072 | 2026-06-29 | a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit — if the producing step is speculative/unused, delete it | a stranded/duplicated/uncommitted-artifact bug — before building machinery, ask if the PRODUCING step is wanted; speculative-at-creation → remove, deliberate-on-demand → keep | | LRN-072 | 2026-06-29 | a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit — if the producing step is speculative/unused, delete it | a stranded/duplicated/uncommitted-artifact bug — before building machinery, ask if the PRODUCING step is wanted; speculative-at-creation → remove, deliberate-on-demand → keep |
| LRN-073 | 2026-06-29 | a skill's worked-example must use FICTIONAL ids, never live registry ids (they prime real-data behavior) | any skill/agent with a worked example over the SAME data it operates on — use reserved/fictional ids; test deterministically that no live id appears | | LRN-073 | 2026-06-29 | a skill's worked-example must use FICTIONAL ids, never live registry ids (they prime real-data behavior) | any skill/agent with a worked example over the SAME data it operates on — use reserved/fictional ids; test deterministically that no live id appears |
| LRN-074 | 2026-06-29 | system `grep`/`awk` may be ugrep/mawk: don't assume flag-parsing, use `/usr/bin/grep`, watch the RED go red (4th command-assumption miss this session) | any shell test/guard riding on grep/awk/sed semantics — pin `/usr/bin/<tool>`, run the assertion, confirm it reds on the defect before trusting green | | LRN-074 | 2026-06-29 | system `grep`/`awk` may be ugrep/mawk: don't assume flag-parsing, use `/usr/bin/grep`, watch the RED go red (4th command-assumption miss this session) | any shell test/guard riding on grep/awk/sed semantics — pin `/usr/bin/<tool>`, run the assertion, confirm it reds on the defect before trusting green |
| LRN-075 | 2026-06-30 | skill-vs-no-skill RED must test the UNGUIDED control: a "use git + justify" baseline makes a capable agent succeed (contaminated RED); real failure shows only on the tempting prompt | building any skill whose value is determinism/gate over a capable agent — strip guidance AND tempt the failure; control succeeds → don't author (or rescope) |
| LRN-076 | 2026-06-30 | append-only registry status mutates in place (UPDATE/FINAL blocks): CURRENT status = LAST status line, not Index, not first; range scan inclusive of next header bleeds a sibling's status word | parsing any in-place-mutated status; take last line, bound entry exclusive of next header |
| LRN-077 | 2026-06-30 | test fixtures must carry NEUTRAL names — a name that telegraphs the answer lets the subject pass by reading the name, not doing the work | designing any test fixture/path; same symptom as [[LRN-074]] (passes for WRONG reason), distinct cause (leaky fixture vs assumed command) |
--- ---
@ -840,3 +843,20 @@ rules:
- **pattern**: a RED-7 test used `grep -vE '-9[0-9][0-9]$'`; the system grep is UGREP → parsed the leading `-9..` as an OPTION → errored → empty → FALSE GREEN (a RED that never goes red). Caught only because the output was READ, not assumed. 4th time this session an assumed command behavior was false on execution (after `set -o pipefail` + `grep -q` SIGPIPE, …). The skill's own verify already hard-codes `/usr/bin/grep` (line 189) for this exact reason — re-learned. - **pattern**: a RED-7 test used `grep -vE '-9[0-9][0-9]$'`; the system grep is UGREP → parsed the leading `-9..` as an OPTION → errored → empty → FALSE GREEN (a RED that never goes red). Caught only because the output was READ, not assumed. 4th time this session an assumed command behavior was false on execution (after `set -o pipefail` + `grep -q` SIGPIPE, …). The skill's own verify already hard-codes `/usr/bin/grep` (line 189) for this exact reason — re-learned.
- **fix**: `/usr/bin/grep` (GNU) where GNU semantics matter; avoid leading-dash regex args (or use `-e`/`--`); never trust the system tool is GNU/POSIX (mawk≠gawk, ugrep≠grep). - **fix**: `/usr/bin/grep` (GNU) where GNU semantics matter; avoid leading-dash regex args (or use `-e`/`--`); never trust the system tool is GNU/POSIX (mawk≠gawk, ugrep≠grep).
- **future application**: any shell test/guard whose correctness rides on grep/awk/sed semantics → pin `/usr/bin/<tool>` AND run the assertion, confirming it goes red on the defect before trusting green. Execute, don't assume command behavior. RECURRENT motif — audit any "assumed tool behavior" the way the fail-silent family ([[LRN-066]]/[[LRN-071]]) is audited. - **future application**: any shell test/guard whose correctness rides on grep/awk/sed semantics → pin `/usr/bin/<tool>` AND run the assertion, confirming it goes red on the defect before trusting green. Execute, don't assume command behavior. RECURRENT motif — audit any "assumed tool behavior" the way the fail-silent family ([[LRN-066]]/[[LRN-071]]) is audited.
## LRN-075 — skill-vs-no-skill RED must test the UNGUIDED control, not a leading one
- **Date**: 2026-06-30
- **pattern**: building `/reconcile`, the first baseline ("repo git — use git + justify each item") made a capable agent reconcile correctly → CONTAMINATED RED (writing-skills: control doesn't exhibit the failure → nothing to fix). Real failure surfaced only on the UNGUIDED tempting prompt ("is the queue empty?", todo-pointed, no git hint): agent MIRRORED the TODO — stale `[ ]` reported open (false positives), decisions.md never opened, contradiction missed. Variance: one rep FLAIRED staleness but wrote a disclaimer ("à vérifier") instead of verifying — a disclaimer is not a verification.
- **why**: skill value was never "teach an agent to reconcile" (it can, guided) — it is determinism + cheapness + gate ([[LRN-031]]), so the answer never depends on phrasing or whether the agent felt like checking git. Confirmed engine-heavy / skill-thin design.
- **future application**: any skill-vs-no-skill / TDD RED — the control must REMOVE guidance AND tempt the failure ([[LRN-028]] sibling). Unguided control still succeeds → don't author (or rescope to determinism/gate). The skill also helps the agent who SUSPECTS but doesn't verify, not only the ignorant one.
## LRN-076 — append-only registry status mutates in place: LAST block wins
- **Date**: 2026-06-30
- **pattern**: a BLK entry evolves via `UPDATE`/`FINAL` blocks (BLK-008: `resolved` → middle `REVERTED, UPSTREAM/open``FINAL — RESOLVED`). A guided baseline read the MIDDLE block → reported BLK-008 upstream/open, wrong. Current status = the LAST status-bearing line in the body, never the Index, never the first. Bonus bug the harness caught mid-build: a `sed` range inclusive of the NEXT entry's header bled a sibling's status word (BLK-005 header "...upstream rename" polluted BLK-004) → drop `^## BLK-` header lines before extracting.
- **future application**: parsing any in-place-mutated status (blockers, revision blocks) — take the LAST status line, bound the entry EXCLUSIVE of the next header. Don't trust Index/first-line.
## LRN-077 — test fixtures must carry NEUTRAL names (pass for the right reason)
- **Date**: 2026-06-30
- **pattern**: a baseline agent on a worktree named `wt-pre-reconcile` read "pre-reconcile" FROM THE DIR NAME and inferred staleness — reasoning for the WRONG reason (the name), not the right one (verify git). Fixtures + the GREEN test were re-frozen under NEUTRAL names so the engine reaches truth by querying git, never by reading a path hint.
- **meta — same symptom, distinct cause as [[LRN-074]]**: 074 = a COMMAND-ASSUMPTION (ugrep parsed `-9..` → false green); 077 = a LEAKY FIXTURE (name telegraphs the answer). Different mechanisms, SAME symptom: the test passes/fails for the wrong reason. Cross-cutting lesson = verify a test passes for the RIGHT reason, not merely that it passes — whether the false signal comes from an assumed command (074) or a leaky fixture (077).
- **future application**: name fixtures/paths neutrally; for any green, ask "did it pass because the subject did the work, or because something leaked the answer?"

View File

@ -310,7 +310,7 @@ LAST of 3 chantiers. Read-first cartography confirmed RED-7/8 + measured 34-row
- [x] FINISH — merged bugfix/prune-memory-hardening → develop — DONE (reconcile 2026-06-29 : merge `73e12be`) - [x] FINISH — merged bugfix/prune-memory-hardening → develop — DONE (reconcile 2026-06-29 : merge `73e12be`)
- [x] PUSH — develop → origin — DONE (reconcile 2026-06-29 : develop == origin/develop, 0 commit en avance) - [x] PUSH — develop → origin — DONE (reconcile 2026-06-29 : develop == origin/develop, 0 commit en avance)
## 2026-06-29 — skill /reconcile (RÉCONCILIATEUR file-ouverte ↔ réel) [QUEUED, non bloqué] ## 2026-06-29 — skill /reconcile (RÉCONCILIATEUR file-ouverte ↔ réel) [BUILT 2026-06-29 — finish pending]
Genèse : l'inventaire manuel du 2026-06-29 a prouvé que le TODO mentait (5 cases fait-mais-non-coché Genèse : l'inventaire manuel du 2026-06-29 a prouvé que le TODO mentait (5 cases fait-mais-non-coché
+ 1 "auto-nettoyé" qui ne l'était pas + 1 rejeté marqué "deferred"). Cet inventaire EST la spec du + 1 "auto-nettoyé" qui ne l'était pas + 1 rejeté marqué "deferred"). Cet inventaire EST la spec du
skill ET son cas de test de référence (résultat manuel connu-bon à reproduire). skill ET son cas de test de référence (résultat manuel connu-bon à reproduire).
@ -335,11 +335,13 @@ SORTIE = les 4 catégories de l'inventaire 2026-06-29 :
(modifie un fichier tracké → jamais silencieux). (modifie un fichier tracké → jamais silencieux).
Subtasks (à détailler au lancement) : Subtasks (à détailler au lancement) :
- [ ] Spec : table oracle-par-source (commit existe / branche absente / tree clean / skill linké / - [x] Spec : table oracle-par-source (commit existe / branche absente / tree clean / skill linké /
statut registre) — chaque "déclaré" a son test réel statut registre) — chaque "déclaré" a son test réel — DONE (lib/reconcile.sh : reconcile_oracle_*)
- [ ] Décider build : superpowers:writing-skills (TDD, RED = fixture TODO menteur reproduisant les 7 écarts) - [x] Décider build : superpowers:writing-skills (TDD, RED = fixture TODO menteur reproduisant les 7 écarts) — DONE (RED a4872 miroir + RED-B Index-ignore à dents + GREEN comportemental a8404)
- [ ] `skills/reconcile/SKILL.md` + routage CLAUDE.md (triggers : "reconcile", "file vraiment vide ?", - [x] `skills/reconcile/SKILL.md` — DONE (skill mince : orchestration + gate A/B/C + limites honnêtes)
"qu'est-ce qui reste ouvert", "inventaire chantiers") - [x] routage CLAUDE.md (triggers : "reconcile", "file vraiment vide ?",
- [ ] Détecteur de contradictions inter-registres (BDR accepted vs chantier qui le contredit) "qu'est-ce qui reste ouvert", "inventaire chantiers") — DONE (CLAUDE.md "Skill routing" + link.sh)
- [ ] Gate de réconciliation (diff TODO proposé, A/B/C confirm avant edit) - [x] Détecteur de contradictions inter-registres (BDR accepted vs chantier qui le contredit) — DONE (reconcile_contradiction_candidates ; surface, n'asserte pas)
- [ ] Test final = reproduire l'inventaire 2026-06-29 (cat. 1-4 + contradiction BDR-001) comme oracle - [x] Gate de réconciliation (diff TODO proposé, A/B/C confirm avant edit) — DONE (SKILL.md "The gate" ; registres read-only)
- [x] Test final = reproduire l'inventaire 2026-06-29 (cat. 1-4 + contradiction BDR-001) comme oracle — DONE (run-reconcile.sh 20/20, fixtures neutres, RED prouvé rouge avant le vert)
- NOTE : bâti, non committé (develop +1 `bdfa9bc`, livrables untracked) → finish pending (feature/reconcile-skill)

View File

@ -264,6 +264,7 @@ only the non-obvious cases: gstack fallbacks, disambiguation, cryptic names.
- Ship / deploy / PR → ship (ship-feature if gstack off) - Ship / deploy / PR → ship (ship-feature if gstack off)
- Docs post-ship → document-release (doc if gstack off); stale-doc audit → doc - Docs post-ship → document-release (doc if gstack off); stale-doc audit → doc
- Audit of changes since last run → audit-delta - Audit of changes since last run → audit-delta
- Open-work inventory / "queue empty?" / stale TODO vs real git → reconcile
- Design / UI (build, system, audit, polish) → see "Design work" below - Design / UI (build, system, audit, polish) → see "Design work" below
- Architecture review → plan-eng-review - Architecture review → plan-eng-review
- Before /clear or /compact → capitalize; end-of-session ritual → close - Before /clear or /compact → capitalize; end-of-session ritual → close

108
lib/reconcile.sh Normal file
View File

@ -0,0 +1,108 @@
#!/usr/bin/env bash
# reconcile.sh — deterministic engine for /reconcile.
#
# Confronts DECLARATIVE sources (TODO checkboxes, registry statuses) against REAL
# state (git, fs, registry BODY). It is the engine behind the /reconcile skill;
# the skill orchestrates + gates, this file holds the mechanical truth-probes.
#
# FOUNDING PRINCIPLE — recursive coherence (LRN-055): a reconciler must NEVER trust
# a declarative source as an oracle. So this engine NEVER reads the `## Index` table
# nor believes a `[x]`/`[ ]` checkbox; it enumerates registry entries from the BODY
# `## ID —` headings and decides "done/stale" from git/fs only. It practices what it
# preaches — the recursive-coherence test (run-reconcile.sh T1) reds if this is broken.
#
# HONEST LIMITS (graven, do not over-read):
# - Deferral detection is LEXICAL (reconcile_deferrals): it catches deferrals MARKED
# by a keyword, misses ones phrased without one ("à reprendre quand X"). Deterministic
# on the detectable; the skill SURFACES the ambiguous for human review, never asserts.
# - Contradiction detection surfaces CANDIDATES (token overlap), never asserts a verdict.
#
# Layers:
# reconcile_enumerate_ids — body-only ID enumeration (recursive-coherence core)
# reconcile_oracle_* — live git/fs truth probes
# reconcile_blk_current_status — registry status, LAST block wins (compound/UPDATE/FINAL)
# reconcile_blk_open — blockers whose CURRENT status is not resolved
# reconcile_verdict — pure kernel: declared checkbox × real fact → verdict
# reconcile_deferrals — lexical deferral sweep (honest limit)
# reconcile_contradiction_candidates — accepted-BDR ⇄ open-chantier token overlap (surface)
set -uo pipefail
RECONCILE_GREP="${RECONCILE_GREP:-/usr/bin/grep}" # LRN-074: pin grep, never assume GNU flags
# markers that LEXICALLY signal a deferral/follow-up (honest limit: marked-only)
RECONCILE_DEFER_RE='[Dd]efer|[Ff]ollow-?up|[Oo]ut.of.scope|OUT-OF-SCOPE|[Rr]econsider|[Rr]evisit|2e passage|won.t.do|optional\)|one-line ticket'
# --- recursive coherence: enumerate IDs from the BODY, never the ## Index ---
# $1 registry file, $2 prefix (BDR|LRN|BLK|EVAL). Emits one id per line, unique.
reconcile_enumerate_ids() {
"$RECONCILE_GREP" -oE "^## ${2}-[0-9]+" "$1" | "$RECONCILE_GREP" -oE "${2}-[0-9]+" | sort -u
}
# --- live truth probes (query the REAL repo/fs; this is the "verify, don't believe" core) ---
reconcile_oracle_tree_clean() { # rc 0 = working tree clean
[ -z "$(git -C "${1:-.}" status --porcelain 2>/dev/null)" ]
}
reconcile_oracle_merge_done() { # $1 repo, $2 branch fragment → rc 0 if a merge commit exists
[ -n "$(git -C "${1:-.}" log --oneline --grep "Merge .*$2" -1 2>/dev/null)" ]
}
reconcile_oracle_pushed() { # $1 repo, $2 branch → rc 0 if nothing unpushed vs origin
[ -z "$(git -C "${1:-.}" rev-list "origin/$2..$2" 2>/dev/null)" ]
}
reconcile_oracle_sha_exists() { # $1 repo, $2 sha → rc 0 if the commit object exists
git -C "${1:-.}" cat-file -e "${2}^{commit}" 2>/dev/null
}
reconcile_oracle_msg_committed() { # $1 repo, $2 grep → rc 0 if a commit message matches
[ -n "$(git -C "${1:-.}" log --oneline --grep "$2" -1 2>/dev/null)" ]
}
reconcile_oracle_path_present() { [ -e "${1:?}" ]; } # $1 path → rc 0 if it still exists on disk
# --- registry status: LAST status-bearing line wins (the BLK-008 trap A fell into) ---
# $1 blockers file, $2 id. Echoes the current status line (compound/UPDATE/FINAL aware).
reconcile_blk_current_status() {
# drop ALL `## BLK-` header lines first: the range is inclusive of the NEXT entry's
# header, and a sibling header may carry a status word (e.g. BLK-005 "...upstream rename")
# → cross-entry bleed. The entry's own header carries no status, so dropping it is safe.
sed -n "/^## ${2} /,/^## BLK-/p" "$1" \
| "$RECONCILE_GREP" -v '^## BLK-' \
| "$RECONCILE_GREP" -iE 'status|RESOLVED|REVERTED|upstream|resolved|[^a-z]open' \
| tail -1
}
# blockers whose CURRENT status is not resolved → emits "id<TAB>status"
reconcile_blk_open() {
local id st
for id in $(reconcile_enumerate_ids "$1" BLK); do
st=$(reconcile_blk_current_status "$1" "$id")
case "$st" in
*RESOLVED*|*resolved*) : ;;
*) printf '%s\t%s\n' "$id" "$st" ;;
esac
done
}
# --- pure reconciliation kernel: declared checkbox × real fact → verdict (no git, fully testable) ---
# $1 checkbox char (x| |~), $2 real_done (true|false).
reconcile_verdict() {
case "$1:$2" in
" :true") echo "STALE:open-but-done" ;;
"x:false") echo "STALE:done-but-open" ;;
"~:true") echo "STALE:partial-but-done" ;;
*) echo "CONSISTENT" ;;
esac
}
# --- lexical deferral sweep (HONEST LIMIT: marked-only) → "src<TAB>line<TAB>text" ---
reconcile_deferrals() {
[ -f "$1" ] && "$RECONCILE_GREP" -nE "$RECONCILE_DEFER_RE" "$1" 2>/dev/null | sed 's/^/TODO\t/'
[ -f "$2" ] && "$RECONCILE_GREP" -nE "$RECONCILE_DEFER_RE" "$2" 2>/dev/null | sed 's/^/BDR\t/'
return 0
}
# --- contradiction CANDIDATES (surface, never assert): CLI-flag token shared by a BDR + the TODO ---
# $1 decisions file, $2 todo file. CLI-flag-like tokens are distinctive enough to flag for review.
reconcile_contradiction_candidates() {
local tok
for tok in $("$RECONCILE_GREP" -oE '\-\-[a-z][a-z-]+' "$1" 2>/dev/null | sort -u); do
"$RECONCILE_GREP" -qF -- "$tok" "$2" 2>/dev/null \
&& printf 'CANDIDATE\t%s\tflag "%s" in a BDR title and an open TODO chantier — review for contradiction\n' "$tok" "$tok"
done
return 0
}

12
lib/tests/fixtures/real-state.snapshot vendored Normal file
View File

@ -0,0 +1,12 @@
# Frozen oracle answers as of the reconcile point (bdfa9bc). Each value is an
# independently-checkable git/fs truth, hand-recorded — NOT generated by reconcile.sh.
# Consumed by the deterministic kernel test (T4). Live oracles are proven separately (T6).
merge_done:bugfix/prune-memory-hardening=true
pushed:develop=true
tree_clean=true
commit_msg:gitmodules=true
path:.claude/skills/darwin-skill=present
blk_current:BLK-008=resolved
blk_current:BLK-009=open
blk_current:BLK-001=open
blk_current:BLK-003=open

View File

@ -0,0 +1,809 @@
---
type: learnings_registry
entry_prefix: LRN
schema:
id: LRN-XXX
date: YYYY-MM-DD
pattern: string (what was observed, abstracted)
context: string (where/when it happened - concrete)
future_application: string (when to recall this)
rules:
- Capture learnings that apply beyond current task.
- Abstract from incident — pattern reusable, not one-shot fact.
- Link to source (commit, file, PR) when possible.
- Replaces previous LESSONS.md format. Old file empty — no content to migrate.
---
# Learnings registry (LRN)
## Index
| ID | Date | Pattern | Applies to |
|----|------|---------|------------|
| LRN-001 | 2026-04-22 | `rtk` shape-compression breaks pipes | any pipeline chaining `rtk curl/cat/read` into `jq`, `python -c`, `awk` |
| LRN-002 | 2026-04-23 | Moving report-file paths requires grepping bash READS, not just WRITES | any refactor that moves a generated file used by a dispatcher |
| LRN-003 | 2026-04-27 | Claude Code `disable*` settings use sentinel string `"disable"`, not boolean | any change to `permissions.defaultMode` or related blocker keys |
| LRN-004 | 2026-04-27 | `framer-motion` rebranded `motion` Nov 2024 — different packages per framework | any new project recommending animation lib; auditing legacy imports |
| LRN-005 | 2026-05-03 | `claude plugin install` does NOT enable — separate `claude plugin enable` required | every plugin installer targeting ALWAYS-ON status |
| LRN-006 | 2026-05-03 | `caveman-shrink` (and any MCP middleware proxy) non-functional without upstream wrapper | any MCP middleware/proxy package — never `claude mcp add` it bare |
| LRN-007 | 2026-05-06 | `toggle-external.sh enable` missed source-only state (3rd lifecycle case) | toggle scripts for tools with separate install + symlink steps |
| LRN-008 | 2026-05-06 | Biggest skill-quality wins from edge-case tables, not workflow rewrites | any skill <85 first check for FAILURE PATHS / EDGE CASES / ERROR HANDLING section |
| LRN-021 | 2026-05-20 | Refactor commands→skills must sweep `~/.claude/commands/` for orphan wrappers | any refactor moving `agents/foo.md``skills/foo/SKILL.md`; onboard/init-project audits |
| LRN-009 | 2026-05-06 | Dry-run scoring noise wrongly triggers reverts on already-strong skills | darwin-skill ratchet on skills >91 — relax or use real subagent eval |
| LRN-010 | 2026-05-06 | `~/.claude/skills,agents` symlink to Documents/claude — git from `~/.claude` fails | any optimization or batch edit on personal skills/agents |
| LRN-011 | 2026-05-07 | Single subagent emits N independently-gated scores → labeled extraction + axis-aware loop + per-axis escalation | any audit pipeline shipping multiple gated metrics from one subagent |
| LRN-012 | 2026-05-07 | Bash heredoc + stdin pipe collision = silent empty output | any shell pipeline piping data into `python3 - <<'PY' ... PY` (or any heredoc'd interpreter) |
| LRN-013 | 2026-05-07 | marked CLI 16.x ignore stdin, dump own cli.js source | any shell MD→HTML via npx marked — use `-i FILE` not stdin |
| LRN-014 | 2026-05-11 | Pandoc base gfm strips header id attrs — need `gfm+gfm_auto_identifiers` | any MD→HTML/PDF with cross-references (`[§4](#nap)`) via pandoc |
| LRN-015 | 2026-05-11 | BrightLocal Free Tools retired 2026 — Moz Local Citation Checker is free replacement | client SEO/NAP docs — re-validate tool URLs + free-tier status annually |
| LRN-016 | 2026-05-11 | Pandoc GFM checkbox markup breaks adjacent-sibling CSS — target `li > input` directly | styling task-list checkboxes in pandoc-rendered HTML/PDF |
| LRN-017 | 2026-05-12 | Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30) | any `/darwin-skill` round-1 on a dispatcher SKILL.md |
| LRN-018 | 2026-05-12 | Darwin eval subagents drift on total math — recompute in main thread | any subagent-driven SKILL.md rescore |
| LRN-019 | 2026-05-15 | Deployable-project doc split: README dev-quickstart + DEPLOY 14-section prod-VPS topology | any onboard/doc-syncer/scaffold producing docs for a deployable project |
| LRN-024 | 2026-06-02 | New sibling command sharing logic → extract helper + refactor existing caller, never copy-paste; assert pre/post state equality | adding a subcommand/branch reusing logic inline in a peer command |
| LRN-025 | 2026-06-02 | `.gitignore` gstack allowlist must cover ALL toggleable skills (incl. parked) — else enabling one = untracked git noise | any toggle that moves local-symlink skills into a tracked dir; post-submodule-bump reconcile |
| LRN-026 | 2026-06-09 | `disable-model-invocation: false` = ENABLED not blocking; only `true` blocks (model + orchestrator); binary, no per-caller | Claude Code skill frontmatter; deciding self-route/chain vs human-only entry point |
| LRN-027 | 2026-06-11 | Agents improvise audit boundaries from file dates when no machine state — periodic skills need machine-readable state file, never inference | any recurring/periodic skill needing "since last run" semantics |
| LRN-030 | 2026-06-18 | Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter via explicit CLAUDE.md fan-out rule | any Opus 4.8 session; tuning delegation; inline-vs-subagent decision |
| LRN-031 | 2026-06-19 | Skill value = gate + anti-noise + determinism, not re-coding what a capable agent does free | building/reviewing any skill; writing-skills TDD fixture design |
| LRN-046 | 2026-06-25 | Destructive skill: deterministic oracle (byte-identical / count census) > semantic judge | any destructive/irreversible skill; behavioral-oracle TDD |
| LRN-047 | 2026-06-25 | A noisy safety guard (13/13 FP) = a guard you learn to ignore = risk → refine, don't tolerate | any guard/alert/lint that can false-positive |
| LRN-048 | 2026-06-25 | A "0/OK/pass" must prove it LOOKED (counted both sides), else verify hard-wired to pass | any verify/test/lint reporting success |
| LRN-051 | 2026-06-26 | `git commit -- pathspec` strict on no-match → filter scoped commits to changed paths | any scoped-commit automation |
| LRN-052 | 2026-06-26 | Hash-anchoring: 2 cases it does NOT apply (pre-code founding, squash-merge) | capitalizing founding/arch decisions; squash repos |
| LRN-053 | 2026-06-26 | Read-before teeth = verifiable disposition in the artifact, not the act of reading | any read-before / check-before wiring |
| LRN-054 | 2026-06-26 | No deterministic oracle for "already in context" → never add a presence-skip branch | skip-if-seen optimizations over conversation state |
| LRN-055 | 2026-06-26 | Body `## ID —` headings = drift-immune index; the `## Index` table is not | choosing a substrate to index/select over |
| LRN-056 | 2026-06-26 | `grep PAT dir/*.md` on absent dir ERRORS (exit 2), not no-op → guard `[ -d ]` | any glob-fed scan that must no-op on nothing |
| LRN-057 | 2026-06-26 | Match consumption mechanism to consumer (mechanical / external-cognitive / inline) | wiring any produce→consume invariant |
| LRN-058 | 2026-06-27 | Same bug-class ≠ same fix — verify the twin shares the fix's PRECONDITION before replicating | porting a fix to a "same bug" twin |
| LRN-059 | 2026-06-27 | Step-number SWAP flips meanings (sweep refs) ≠ letter-suffix insertion (shifts nothing) | any pipeline renumber |
| LRN-060 | 2026-06-27 | Fail-closed guard proven by what it REFUSES (loudly); pass dynamic lists as argv not separator-string | automated scoped-commit / destructive guards |
| LRN-061 | 2026-06-27 | Runtime net for an unwired skill → check the wiring first (deterministic gap = fix structurally; non-det aléa = net OK, cf BDR-033) | "build a hook/watcher to catch when X isn't done" |
| LRN-062 | 2026-06-27 | deploy first-run detection = file-existence, never `git describe` | any "first run vs incremental" tool — detect by explicit on-disk marker |
| LRN-063 | 2026-06-27 | delta-since-marker = `git diff --name-only X HEAD` (two endpoints), never rev-list/three-dot | any delta-since-checkpoint over git — explicit two endpoints for tree diff |
| LRN-064 | 2026-06-27 | surgical-commit helper family partitions `.claude/`; new subtree needs own allowlist sibling | adding a committable `.claude/X` subtree |
| LRN-065 | 2026-06-27 | cross-session cold-resume skill = disk-bridge read-first (audit-delta convention) | any "do work → user acts out-of-band → resume later" skill |
| LRN-066 | 2026-06-27 | surgical-commit must fail LOUD on git-ignored target paths (else silent no-op) | any helper relying on `git status --porcelain` to detect changes |
| LRN-067 | 2026-06-28 | pipeline that LOOKS 2-level can terminate at SAME level; human-mediated step (interactive menu) masks the double-action until automated | replacing an interactive/human step with a deterministic one over a delegated sub-skill |
| LRN-068 | 2026-06-29 | enforcement-bootstrap must be transactional: activate the guard LAST + gate it on the bootstrap commit succeeding; precheck identity | any init that installs a hook/protection AND commits |
| LRN-069 | 2026-06-29 | token-authed remote writes under CC perms: inline-env (never `export`), token in header not argv, keep `git push` on ASK as the gate | scripting git/curl writes to a private remote from tool calls |
| LRN-070 | 2026-06-29 | clean-tree-gated migration blocked by a dirty submodule → diagnose pointer-vs-content; for a local edit use `submodule.<name>.ignore=dirty`, never blind reset | migrating/releasing a superproject whose submodule carries intentional local edits |
| LRN-071 | 2026-06-29 | fail-loud must cover the helper's OWN commit, not just its inputs — 3rd occurrence of the swallowed-commit pattern (a failed op masked by a later returning-0 statement) | any helper whose return value gates a downstream "success" — audit every fallible internal op propagates, esp. the commit |
| LRN-072 | 2026-06-29 | a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit — if the producing step is speculative/unused, delete it | a stranded/duplicated/uncommitted-artifact bug — before building machinery, ask if the PRODUCING step is wanted; speculative-at-creation → remove, deliberate-on-demand → keep |
---
## LRN-001 — `rtk` shape-compression silently breaks downstream parsers
- **Date**: 2026-04-22
- **Pattern**: when tracking tool (`rtk`) intercepts stdout and returns schematized/compressed representation instead of raw payload, every downstream parser breaks silently — user (or LLM) never sees `rtk`'s output, only parser error.
- **Context**: `rtk curl` replaces raw JSON output with tokenized version, regardless of TTY vs pipe. Claude Code hooks auto-rewrite `curl``rtk curl`, so behavior impossible to anticipate without knowing hook.
- **Future application**: for any tool auto-rewriting standard commands, explicitly verify pipe behavior. Documented workaround: `exclude_commands=["curl"]` in `~/.config/rtk/config.toml`, or `rtk proxy`. See `BLK-001`.
## LRN-002 — Moving report-file paths requires grepping bash READS, not just WRITES
- **Date**: 2026-04-23
- **Pattern**: when moving write path of generated file (report, artifact, cache), must also grep places that READ that file — not only those that write it. Dispatchers (orchestrator skills dispatching to agent then parsing result) typically contain bash commands like `test -s X.md`, `grep ... X.md`, `wc -l X.md` — refs invisible if only grep for "write" or "output path".
- **Context**: `.claude/audits/` refactor (commit `5c5e82c`). First pass: updated write paths across 5 skills (seo/geo/harden/validate/code-clean) and 3 agents. User asked for verify-gate. They re-grepped, found 10+ bare bash refs (e.g. `test -s HARDEN.md`, `grep -oE ... VALIDATE.md`) missed — dispatchers broken (looking at project root while agent writing to `.claude/audits/`). Fixed in commit `5c5e82c` (bundled with same commit).
- **Future application**:
- Before declaring file-path migration "complete", grep **basename** (`grep -rn "HARDEN\.md"`) plus full path — catch bare bash usages.
- If file used in pipelines (`test`, `grep`, `wc`, `cat`, `head`), search for those verbs explicitly.
- **Verify-gates save work**: one extra round forced exhaustive re-grepping. Without it, two dispatchers shipped broken.
## LRN-003 — Claude Code `disable*` settings use sentinel string `"disable"`, not boolean
- **Date**: 2026-04-27
- **Pattern**: Claude Code blocker-style settings (`disableAutoMode`, `disableBypassPermissionsMode`) use literal string `"disable"` as sentinel. Key absent = feature available; value `"disable"` turns blocker on. Any other value (including `false`, `true`, `null`) has no effect — doc explicitly states this.
- **Context**: switching `permissions.defaultMode` to `"auto"` while `disableAutoMode: "disable"` still present would have failed at startup ("auto mode unavailable"). Naming `disable<Foo>: "disable"` reads ambiguously — easy to assume boolean toggle and leave key in place.
- **Future application**:
- Before changing `defaultMode`, audit matching `disable*` key in same `permissions` block. If present with value `"disable"`, remove it.
- Same logic for `bypassPermissions` mode and `disableBypassPermissionsMode`.
- Don't trust doc's naming — read value semantics. Sentinel strings beat booleans here because harness can distinguish "unset" from "explicitly off" (admin policy).
- **Reference**: commit `1421578`, doc `https://code.claude.com/docs/en/settings`.
## LRN-004 — `framer-motion` rebranded `motion` (Nov 2024) — different packages per framework
- **Date**: 2026-04-27
- **Pattern**: `framer-motion` renamed `motion` November 2024. Rename not cosmetic: bundles React (`motion/react`), Svelte, vanilla-JS support under single npm package, while Vue gets own parallel package `motion-v`. Legacy package `framer-motion` still installs and works but in maintenance mode — recommending it in new framework default locks projects into legacy import paths day one. Detection of "is animation already covered" must include both names plus broader anim ecosystem (`gsap`, `lottie-react`, `react-spring`, `popmotion`, `@formkit/auto-animate`) to avoid double-installs.
- **Context**: building animation-lib auto-install in `/init-project` and `/onboard`. Initial user phrasing "framer-motion" (old name remembered). Picking package name without verifying rename would have shipped legacy imports in every new scaffold.
- **Future application**:
- For React / Next.js / Remix / Astro+React / Svelte: `motion` (`import { motion } from 'motion/react'`).
- For Vue 3 / Nuxt: `motion-v` (separate package, separate API).
- For React Native: do NOT recommend `motion` — use `react-native-reanimated` (motion targets DOM).
- When auditing existing projects, check both `framer-motion` and `motion` keys in `package.json` deps; treat either as "animation already covered".
- Before adopting any "industry default" lib in framework, verify canonical package name current — naming churn (rebrand, scope change `@org/lib`, fork) common in JS land.
- **Reference**: helper `lib/animation-lib-check.sh`, BDR-005.
## LRN-005 — `claude plugin install` does NOT enable — `claude plugin enable` separate step
- **Date**: 2026-05-03
- **Pattern**: Claude Code CLI splits "available" from "active" for marketplace plugins. `claude plugin install --scope user name@source` only copies plugin into `~/.claude/plugins/cache/<marketplace>/<plugin>/<version>/`. Does NOT write `name@source: true` into user's `settings.json:enabledPlugins` map. Without explicit `claude plugin enable name@source`, plugin sits dormant — installed but unloaded. Symmetric with `claude plugin disable`, which keeps cache and only removes enabledPlugins entry.
- **Context**: discovered auditing why `security-guidance` and `superpowers` were ✘ disabled in `claude plugin list` despite project's `install-plugins.sh` summary banner declaring them "ALWAYS ON". Root cause: `install_plugin()` only ran `claude plugin install`, never `enable`. Bug stayed invisible because hardcoded `printf "│ ✅ ON : security-guidance rtk superpowers │"` in `session-start.sh` printed same names regardless of actual state — lying banner agreed with lying install.
- **Future application**:
- For any plugin meant ALWAYS ON, follow `claude plugin install` with `claude plugin enable name@source` (idempotent — no-op if already enabled).
- Detect "actually enabled" via `enabledPlugins[name@source] === true` in `settings.json`, NOT presence of cache dir. Pattern implemented in `lib/detect-plugins.sh:plugin_enabled()` (filesystem grep, no subprocess).
- Any banner / status display claiming plugin on must read state, never hardcode names. Hardcoded labels turn single bug into two co-conspiring bugs masking each other.
- **Reference**: commit `2ec7935`, `lib/detect-plugins.sh:plugin_enabled`, `install-plugins.sh:enable_plugin()`.
## LRN-006 — `caveman-shrink` (and any MCP middleware proxy) needs upstream wrapper to function
- **Date**: 2026-05-03
- **Pattern**: some MCP packages are middleware proxies, not standalone servers. They wrap upstream MCP server and transform its responses (e.g. `caveman-shrink` compresses prose fields). Running them bare via `claude mcp add proxy-name -- npx -y proxy-pkg` registers server that errors immediately with "missing upstream command" — every health check fails, and Claude Code reports MCP broken until human intervenes. CLI `claude mcp add` doesn't validate that configured command launches working stdio MCP, so bad registration silently lands.
- **Context**: when adding caveman, upstream installer auto-registers `claude mcp add caveman-shrink -- npx -y caveman-shrink` and prints "registered. wrap an upstream by editing the mcpServers entry". Following that flow leaves user with permanently failing MCP entry until they realize they must edit `~/.claude.json` manually.
- **Future application**:
- For any MCP that is proxy/middleware (read package docs for "upstream", "wraps", "proxy"), register under DERIVED name `<proxy>-<upstream>` with upstream baked into args. Example for caveman-shrink wrapping filesystem server:
```
claude mcp add caveman-shrink-fs --scope user -- \
npx -y caveman-shrink npx -y @modelcontextprotocol/server-filesystem /path
```
- Detection of "is this MCP correctly set up?" must look for the derived name (`caveman-shrink-*`), not the bare proxy name. Bare-name registration is treated as broken.
- Default install scripts should NOT auto-register middleware MCPs — print the snippet for the user to choose an upstream. See `install-plugins.sh` STEP 5.5.
- **Reference**: commit `9b20b84`, `lib/detect-plugins.sh:detect_caveman_shrink`, `install-plugins.sh` STEP 5.5 MCP block.
## LRN-007 — `toggle-external.sh enable` missed source-only state
- **Date**: 2026-05-06
- **Pattern**: `lib/toggle-external.sh enable <tool>` for npx/external skills (`darwin-skill`, `find-skills`, `emil-design-eng`) handled 2 states only: symlink in `skills-disabled/` → move to `skills/`, or symlink in `skills/` → already enabled. Missed 3rd: source dir at `~/.agents/skills/<tool>` but no symlink. First-run after `make plugin` lands here until `bash link.sh` runs. `enable` errored `not installed — run: make plugin` — misleading, plugin already installed.
- **Context**: user ran `./lib/toggle-external.sh enable darwin-skill` after fresh install. `~/.agents/skills/darwin-skill/` populated by `install-plugins.sh` STEP 8.5 npx call, but `link.sh` (separate step) not run, so `skills/darwin-skill` symlink never created. Fix `lib/toggle-external.sh:161-179` — add `elif [ -d "$src" ]` branch creating symlink direct when source dir present. Error message now show resolved source path.
- **Future application**:
- Any toggle script for tools with separate install + symlink steps must check 3 states: disabled-dir, enabled-dir, source-only. Source-only branch create symlink in place, not fail.
- Error messages name path checked, not abstract tool name — caller verify install vs symlink state without rereading script.
- Symmetric pairs (`enable`/`disable`) both handle same lifecycle states; missing state in one half = silent dead end.
- **Reference**: `lib/toggle-external.sh:161-179`, `link.sh:69-83`, `install-plugins.sh:598-633` STEP 8.5.
## LRN-008 — biggest skill-quality wins come from edge-case tables, not workflow rewrites
- **Date**: 2026-05-06
- **Pattern**: darwin-skill round 1 across 18 personal skills. Top 4 gains (analyze +18.5, skills-perso +11.9, refactor +11.0, hotfix +9.0) all from same shape: add 1-page failure-mode table (file-not-found, malformed input, partial state, denied user input) with concrete action per row. Skills already had clean happy-path workflows; D3 (edge cases) was systemic gap.
- **Context**: most personal skills delegate to single agent file. Workflow steps already explicit. Missing: explicit "what when X unexpected" rows. Adding 5-12 row table with `| situation | action |` shape moved D3 from 3-7 → 9-10 and total +5 to +18.
- **Future application**:
- Skill scoring <85: first inspect agent file for EDGE CASES / FAILURE PATHS / ERROR HANDLING section. Absence = strong predictor of D3 weakness.
- Template: rows for `target not found`, `input malformed`, `tool/API timeout`, `user denies action`, `partial output`, `permission denied`. Map each → fallback / retry / ask-user / fail-fast.
- Costs ~15-50 lines, unlocks +5 to +15 score.
- **Reference**: `.claude/audits/DARWIN-SKILL-OPTIMIZATION.md`, commits `649351b`, `eb34627`, `1768d04`, `ef87074`, `a3f28d5`.
## LRN-009 — dry-run scoring noise wrongly triggers reverts on already-strong skills
- **Date**: 2026-05-06
- **Pattern**: darwin-skill ratchet rule = revert if new < old. Dry_run scoring (subagent reads SKILL.md, mentally simulates, scores 8 dims) has ±1pt noise per dim per re-eval. Skill at 91-94 has small headroom, so single noisy -1 on D2 flips total from +1 to -1 (false revert). code-clean + doc both reverted with objectively useful content (empty-approval branch, README/DEPLOY templates) revert was dry_run noise artifact, not real regression.
- **Context**: ratchet preserves only commits with strict total > old. For dry_run near ceiling, too strict. Real subagent eval would have lower noise floor since output quality differences observable.
- **Future application**:
- Skills baseline >91: skip optimization (diminishing returns), OR use real subagent eval not dry_run, OR relax ratchet to "new ≥ old - 1" with manual diff review.
- Edits to high-scoring skills must be minimal (1-3 lines, surgical) so D2 (workflow clarity) not perturbed by added bulk.
- When reverting content-rich change, log content elsewhere (`~/.claude/notes/`) so work not lost — second smaller patch can reintroduce idea.
- **Reference**: `.claude/audits/DARWIN-SKILL-OPTIMIZATION.md`, commits `63e08f9`→`822d437` revert (code-clean), `c7b8522`→`765d1c1` revert (doc).
## LRN-010 — ~/.claude/skills + ~/.claude/agents symlink to /home/bchanot-ubuntu/Documents/claude
- **Date**: 2026-05-06
- **Pattern**: editing `~/.claude/skills/<x>/SKILL.md` or `~/.claude/agents/<x>.md` modifies file at `/home/bchanot-ubuntu/Documents/claude/{skills,agents}/`. `~/.claude` is empty config dir with symlinks; actual git repo + working tree is in Documents/claude. `git add` from `~/.claude` fails with `pathspec is beyond a symbolic link`. Must operate git from Documents/claude.
- **Context**: darwin-skill run created branch in `~/.claude` first (separate git repo, mostly empty). Real branch with skill changes had to be created in Documents/claude. Two repos, two branches.
- **Future application**:
- Any optimization or batch edit on personal skills/agents operates from `/home/bchanot-ubuntu/Documents/claude` for git to track changes.
- `readlink ~/.claude/skills` + `readlink ~/.claude/agents` first if unsure. Both point to Documents/claude/{skills,agents}.
- Don't waste branch in `~/.claude` — nothing to track for skill content.
- **Reference**: `.claude/audits/DARWIN-SKILL-OPTIMIZATION.md`, branch `auto-optimize/skills-20260506-1730` in Documents/claude.
## LRN-011 — Single subagent emits N independently-gated scores: pattern
- **Date**: 2026-05-07
- **Pattern**: when one subagent produces 2+ scores that each must clear independent thresholds (e.g. `/seo` subagent → SEO classique + GEO scores in same `SEO.md`), orchestrator must:
1. Extract each score via labeled grep (`extract_score_labeled f "Score SEO" + "Score GEO"`) — never fall back to "first /20 found" (collapses scores or fakes duplicate).
2. Loop continuation: `while (any axis < threshold) AND iter ≤ MAX`. Single-axis condition exits early while other axis still below.
3. Re-dispatch prompt labels each axis with current score + PASS/FAIL state, plus axis-specific fix list. Generic "improve the audit" wastes iterations on already-passing axis.
4. Escalation prompt names affected axes explicitly. User chooses per-axis (continue / stop / override per axis).
5. Override transparency file lists axes separately (e.g. `SEO classique: NOT overridden, GEO (IA): overridden`).
6. Backward compat: `allow_fallback` flag — fall back to generic single-score parse for primary axis (legacy compat) but NOT for secondary axis (UNKNOWN forces re-dispatch with explicit format demand).
- **Context**: client-handover pipeline gates SEO + GEO independently (BDR-010). Both scores live in same `.claude/audits/SEO.md`, written by one /seo subagent in one dispatch. Naive "extract first /20" collapsed both into SEO classique value — gate fired on SEO only. Pattern above generalizes to any future audit shipping multiple gated metrics from one subagent (e.g. /harden could split TLS + headers + redirects).
- **Future application**:
- Any audit subagent emitting multiple scores → use labeled extractor pattern + axis-aware loop + per-axis escalation. Never collapse to single score for gate.
- When designing new audits with multiple metrics, mandate labeled score format in skill SKILL.md (e.g. `Score <axis> : X.X / 20`). Avoids retrofit later.
- When 2+ scores share one subagent, prompt template lists both PASS/FAIL state + axis-specific fix categories. Otherwise subagent wastes iterations on passing axis.
- **Reference**: `agents/client-handover-writer.md` (`extract_score_labeled` STEP 3, axis-aware loop STEP 4, escalation STEP 4, threshold strictness STEP 8 SEO.md branch). BDR-010.
## LRN-012 — Bash heredoc + stdin pipe collision = silent empty output
- **Date**: 2026-05-07
- **Pattern**: when running an inline-heredoc'd interpreter — `python3 - <<'PY' ... PY`, `bash <<'SH' ... SH`, `node -e <<'JS' ... JS` etc. — the heredoc IS the interpreter's stdin. Any data piped from upstream is **silently discarded**. Symptom: `sys.stdin.read()` (or equivalent) returns the heredoc body itself (often empty after the script consumes it via the read), and the produced output is empty. Exit code is `0`, no error message — silent failure. Diagnose via `bash -x` trace: you see the python ran, but no upstream data ever reached it.
- Anti-pattern (broken): `printf '%s' "$DATA" | python3 - <<'PY' \n template = sys.stdin.read() \n ... \n PY`
- Fix 1 (env var): `DATA="$DATA" python3 - <<'PY' \n import os; template = os.environ['DATA'] \n PY`
- Fix 2 (file path arg): `python3 - "$FILE_PATH" <<'PY' \n import sys; template = open(sys.argv[1]).read() \n PY` — note `"$FILE_PATH"` AFTER `-` becomes `sys.argv[1]`.
- Fix 3 (write tempfile, read inside): `echo "$DATA" > /tmp/x; FILE=/tmp/x python3 - <<'PY' \n template = open(os.environ['FILE']).read() \n PY`.
- **Context**: `skills/client-handover/scripts/handover-to-pdf.sh` v1 piped HTML template through a `substitute()` function that ran `python3 - <<'PY'` and read `sys.stdin`. Pipe dropped silently, `.html` output 0 bytes. Caught by post-write `wc -l`; root cause found via `bash -x`. Fixed by passing template path through `HQ_TEMPLATE_PATH` env var, python opens the file directly (`render_template()` in current script).
- **Future application**:
- Never combine an inline heredoc with an upstream pipe targeting the same interpreter. Pick one input channel: heredoc OR pipe, not both.
- When in doubt: pass data via env vars (small payloads), file paths (large payloads), or argv. Reserve stdin for cases where the interpreter has NO heredoc.
- Add post-write size check (`test -s "$FILE"` or `wc -l`) for any generated artifact in a shell pipeline — surfaces silent-failure modes immediately.
- When debugging "script ran but file empty", run `bash -x script.sh` and look for the `+ python3 -` line — if you see no upstream data being consumed, you have the heredoc-pipe collision.
- **Reference**: `skills/client-handover/scripts/handover-to-pdf.sh` `render_template()` (env-var-based, current); BDR-011 caveat list; commit `e06b52a` (final fix shipped with the renderer).
---
## LRN-013 — marked CLI 16.x ignore stdin, dump own cli.js source
- **Date**: 2026-05-07
- **Context**: `/client-handover` PDF rendering. `handover-to-pdf.sh` fallback chain pandoc → python-markdown → npx marked. On host with only npx, pipeline ran `npx --yes marked < "$src"` and produced 2-page PDF where body = marked package's `cli.js` source (`#!/usr/bin/env node`, `Marked CLI`, copyright, `import { main } from './main.js'`). Real MD content (30 KB) entirely lost.
- **Pattern**: marked 16.x CLI regression — stdin path broken, ignores piped input, prints its own binary source. Only `-i FILE` flag works. Verified: `echo "test" | npx marked` → marked source. `npx marked -i FILE` → correct HTML.
- **Why**: do not assume marked CLI accepts stdin like awk/jq/sed. Check actual conversion output before shipping any MD→HTML renderer.
- **How to apply**: any shell md→html using marked CLI must call `npx --yes marked --gfm -i "$src"`. Keep pandoc + python-markdown ahead in fallback chain — more stable. Smoke-test: render small MD, grep output for known content; fail loudly if mismatch.
- **Reference**: `skills/client-handover/scripts/handover-to-pdf.sh` line ~140 (npx fallback fixed). Commit fixing bug.
---
## LRN-014 — Pandoc base gfm strips header id attrs — need gfm+gfm_auto_identifiers
- **Date**: 2026-05-11
- **Pattern**: `pandoc --from=gfm --to=html5` does NOT auto-generate `id` attributes on header elements. Internal anchor links like `[§4 NAP](#nap)` become dead refs in rendered HTML/PDF. Symptom: rendered doc has `<h2>NAP</h2>` (no `id`), browser/PDF anchor resolves nowhere, user clicks link and goes nowhere. Enable id auto-gen by switching to `--from=gfm+gfm_auto_identifiers` — pandoc then emits `<h2 id="nap">NAP</h2>` (kebab-case slug from header text).
- **Context**: `skills/client-handover/scripts/handover-to-pdf.sh` MD→HTML cascade. 6-chapter handover doc added internal cross-references between chapters (§5 todo references back to §4 NAP table for values). Default `--from=gfm` produced HTML with no header ids — internal links dead. Discovered after rendering test handover, clicking link in PDF, going to top of doc instead of NAP section.
- **Future application**:
- Any pandoc MD→HTML pipeline with `[text](#anchor)` cross-references → enable `gfm_auto_identifiers` extension explicitly.
- Smoke-test internal anchors before shipping any renderer: render → `grep -E 'id="[^"]+"' out.html` → confirm headers have ids.
- Slug rules: pandoc lowercases + replaces non-alpha with `-`, e.g. `## §4 NAP table``id="ss-4-nap-table"`. If you control header text, keep slugs predictable.
- **Reference**: `skills/client-handover/scripts/handover-to-pdf.sh` line 121 (`--from=gfm+gfm_auto_identifiers`). Commit `b15b275`.
---
## LRN-015 — BrightLocal Free Tools retired 2026, Moz Local Citation Checker is free replacement
- **Date**: 2026-05-11
- **Pattern**: SEO/NAP tool landscape churns yearly. BrightLocal Free Tools page (`brightlocal.com/free-local-tools/`) retired in 2026 — service now paid-only. Moz Local Citation Checker (`moz.com/local`, "Check My Listing" / "Get Free Audit") is current free replacement: 60s NAP-consistency audit across 50+ directories (Google Business, Apple Maps, Yelp, Pages Jaunes, Bing Places), no credit card required.
- **Context**: client-handover NAP checklist (FR + EN versions) recommended brightlocal.com free tools — link dead, page redirects to paid tier. Caught during handover-doc render. Swapped both language versions to Moz Local with explicit "no credit card" note + path through homepage (button labels can change, URL `moz.com/local` is stable).
- **Future application**:
- Any client-facing doc recommending "free SEO/NAP tools" → verify URLs alive + tool still free annually. SEO vendors churn free tiers regularly.
- Prefer linking to vendor homepage + naming the button ("click Check My Listing") over deep links to specific tool URLs. Vendor URLs deprecate; homepages persist.
- Maintain a short list of "verified-recent" free tools in the handover skill rather than rediscovering on each render.
- **Reference**: `skills/client-handover/checklists/seo-geo-manual.md` (FR section line ~218, EN section line ~429). Commit `abd2612`.
---
## LRN-016 — Pandoc GFM checkbox markup breaks adjacent-sibling CSS — target `li > input` directly
- **Date**: 2026-05-11
- **Pattern**: pandoc GFM emits task-list checkboxes as `<li><input disabled type="checkbox"> text…</li>` with **no wrapper class** and **no list-item class**. Adjacent-sibling CSS rule `li input[type="checkbox"] + *` absolutely-positions the first element sibling AFTER the input — typically `<a>`, `<code>`, `<strong>`, or `<em>` inside the bullet text. Effect: that inline element gets yanked out of flow, overlaps adjacent content in rendered PDF. Symptom: PDF has links/code-spans visibly overlapping subsequent text.
- **Context**: `skills/client-handover/resources/branding/zenquality.css` task-list styling. Initial rule tried to render custom checkbox box via `+ *` selector targeting the first sibling after `<input>`. Worked when bullet was plain text (no inline elements), broke when bullet contained `<a href="...">` or `<code>…</code>` — those got absolutely-positioned. Caught in rendered LIVRAISON.pdf — checkbox icons OK but link/code text overlapped neighbors.
- **Future application**:
- For pandoc GFM checkbox styling, target `li > input[type="checkbox"]` directly. Style native `<input>` via `appearance: none` + custom box rendering (background, border, size) on the input itself.
- Avoid `+ *` and other sibling-selector tricks on bare-input markup — pandoc gives no wrapper to anchor to, siblings vary per bullet content.
- Render checklist with realistic content (`<a>`, `<code>`, `<strong>`) before signing off — bare text bullets won't surface the bug.
- Symptom signature: rendered PDF has overlapping inline elements ONLY in task lists — points to a sibling-selector rule firing on inline content.
- **Reference**: `skills/client-handover/resources/branding/zenquality.css` `li > input[type="checkbox"]` rule + `li.task-list-item::before` (lines 372410). Commit `465fe9e`.
---
## LRN-017 — Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30)
- **Date**: 2026-05-12
- **Pattern**: thin-dispatcher SKILL.md (delegates to `agents/<x>.md`, body 15-30 lines, no inline workflow) scores low on darwin rubric (45-70) because dims D2/D3/D4/D5 punish empty body. Round-1 universal fix:
1. Add fallback clause — `If $HOME/.claude/agents/<x>.md unreachable, emit "<X> agent missing." and STOP. Never improvise — silent behavior change is unsafe.`
2. Add triggers to frontmatter `description` — explicit `Triggers: "<keyword>", "<synonym>", "<i18n variant>".`
3. For destructive skills (refactor, commit-change): add safety rationale + pre-flight check stub.
Δ +13 to +31 observed: status 45.3→76.2 (+30.9), refactor 48.4→74.3 (+25.9), plugin-check 59.2→76.8 (+17.6), commit-change 69.6→83.5 (+13.9). 150% byte cap tight — trim aggressively.
- **Context**: `/darwin-skill` run 2026-05-12, branch `auto-optimize/20260512-1319` merged to master, 5 commits. skills-perso (66.4→80.1, +13.7) NOT a dispatcher — different patch (Known-limits subsection on the heuristic).
- **Future application**:
- Any darwin round-1 on a dispatcher SKILL.md → skip diagnosis, apply this template directly. Saves one eval cycle.
- After round 1, gains flatten near 75-80 → pivot to next-lowest skill, do not grind rounds 2-3 on same target.
- For thin originals (<500B), 150% cap is the binding constraint pre-trim drafts before committing.
- **Reference**: `.claude/audits/DARWIN-SKILL-2026-05-12.md`. Commits `512df48`..`134561d`. results.tsv at `~/.agents/skills/darwin-skill/results.tsv`.
---
## LRN-018 — Darwin eval subagents drift on total math — recompute in main thread
- **Date**: 2026-05-12
- **Pattern**: analyzer subagents asked to score SKILL.md and compute weighted total drift on the formula. Two recurring errors: (a) divide `Σ(dim×weight)` by `100` instead of `10` (off by factor 10 — produces 6.17 instead of 61.7, then sometimes the subagent silently re-multiplies); (b) use D8 weight 7 instead of the spec value 25 (status: spec says D8 weight = 25, easy to confuse with D4 weight = 7). Per-dim judgments themselves stable across runs; computed totals unreliable.
- **Context**: 5 round-1 evals during darwin 2026-05-12. Refactor subagent computed 743÷10 correctly in scratch but wrote `617/100 = 61.7` — actual correct total 74.3. Subsequent prompts explicitly stating "D8 weight is 25" cleared the second error.
- **Future application**:
- Prompt subagent for dim scores only, not weighted total. Main thread computes `Σ(dim_i × weight_i) / 10` deterministically.
- If subagent must compute, include weight table in prompt AND show example computation for one row.
- When comparing baseline vs round-N, use main-thread recomputed totals on BOTH sides, not the two subagents' self-reported numbers.
- Score recalibration between baseline subagent and round-1 subagent is real (independent re-anchoring) — first-round Δ tends to overstate improvement. Direction reliable, magnitude noisy.
- **Reference**: see "Methodology notes" section of `.claude/audits/DARWIN-SKILL-2026-05-12.md`.
---
## LRN-019 — Deployable-project doc split: README dev, DEPLOY prod-VPS 14 sections
- **Date**: 2026-05-15
- **Pattern**: deployable project → split docs by audience, not by topic. README = dev + features audience (one-line pitch, Features, Stack, Quick start (dev), Verifying a change, Build & deploy summary, Documentation cross-links, License). DEPLOY.md = ops/SRE audience, prod-only, 14 sections mirroring real VPS-deploy shape (topology table, env vars, VPS provisioning, two-layer firewall = cloud security group + UFW, Docker tuning = log caps + `live-restore`, first-time setup, routine deploys, persistence/volumes, backups + cron + retention, TLS = Caddy/nginx + ACME, observability = logs + healthchecks, hardening = SSH keys-only + fail2ban + unattended-upgrades, rollback, runbook). Dev quick-start NEVER in DEPLOY.md — mixed dev/prod = drift source. Trivial deploy (no Docker, no compose, no fly.toml, no k8s, no scripts/deploy.*) → fold into README, skip DEPLOY.md.
- **Context**: applied 2026-05-15 in `agents/doc-syncer.md` STEP 5/6 rewrite. Generalizes README-vs-DEPLOY ownership drift seen across multi-maintainer repos (devs read one doc, ops read another, both edit independently, conflicts pile up). 14-section template comes from real Scaleway DEV1-S walkthrough — shape works on any provider (Scaleway, Hetzner, OVH, DO, Vultr, plain bare-metal).
- **Future application**:
- Any `/onboard` / `/doc` / `/init-project` producing docs for a deployable project → apply the split directly. Don't ask user "where should dev setup go" — README, always.
- Existing repo has DEPLOY.md with "Local development" / "Dev setup" section → flag as drift, propose moving content to README, removing section from DEPLOY in same patch round.
- Existing repo has README.md mixing prod topology details (firewall, TLS, backups) → flag as drift, propose moving to DEPLOY.md.
- 14-section template = ceiling not floor. Drop sections that don't apply (no DB → drop "Managed DB" section, no domain → drop TLS section). Don't pad to hit 14.
- Audience test before merging a doc section: "would a junior dev clone-and-run with this?" → README. "Would an on-call SRE provisioning a new VPS use this?" → DEPLOY. If both → split it.
- **Reference**: commit `7ee9b42`, `agents/doc-syncer.md` STEP 5 (README template lines 223335), STEP 6 (DEPLOY.md 14-section template lines 338541). Linked to [[doc-syncer-readme-auto-deploy-prod]] (BDR-016).
---
## LRN-021 — Refactor migrating commands→skills must sweep `~/.claude/commands/` for orphan wrappers
- **Date**: 2026-05-20
- **Pattern**: when refactor moves orchestrator from `.claude/agents/foo.md` into `~/.claude/skills/foo/SKILL.md`, any pre-existing wrapper at `~/.claude/commands/foo.md` that references the old agent path becomes orphan. Wrapper still resolves `/foo` (slash commands take precedence over skills in dispatch), executes broken `Load and follow: .claude/agents/foo.md` instructions, fails silently or hits "file not found" mid-orchestration. Untracked files in `~/.claude/commands/` survive every refactor commit invisibly — git status in project repo never shows them.
- **Context**: 2026-05-20, `/ship-feature` hit BLK-004. Wrapper from before refactor `21960e0` ("changed orchestrators into skills") referenced 6 agent files; 5 deleted by refactor. Wrapper untracked → never flagged for cleanup. Detected only when user invoked `/ship-feature` and read the broken `Load and follow strictly:` list.
- **Future application**:
- Any commit moving orchestrator from `agents/foo.md``skills/foo/SKILL.md``grep -rln "agents/foo.md" ~/.claude/commands/` and delete stale wrappers in same commit.
- `/onboard` + `/init-project` must check `~/.claude/commands/` for wrappers referencing paths that no longer exist; print warning.
- When auditing skills (darwin-skill, /skills-perso, /profile), also list `~/.claude/commands/*.md` and cross-check each `Load and follow:` line resolves.
- Skills with `disable-model-invocation: true` rely on slash-dispatch — when wrapper exists, wrapper wins. Removing wrapper exposes skill directly; replacing skill behavior requires updating BOTH wrapper and SKILL.md.
- **How to detect early**: post-refactor script — `for f in ~/.claude/commands/*.md; do grep -Eo '\.claude/agents/[a-z-]+\.md' "$f" | while read p; do test -f "$HOME/$p" || echo "ORPHAN $f → missing $p"; done; done`.
- **Reference**: BLK-004, commits `0241e1d` + `21960e0`.
---
## LRN-020 — profile-sentinel-collision: literal labels in cmd output must not match profile filenames
- **Date**: 2026-05-18
- **Context**: Adding `lib/profiles/full.profile` exposed an aliasing bug in `lib/profile.sh:421`. `cmd_current` returned literal "full (all gstack skills enabled — no profile set)" when no profile was applied — a sentinel meaning "no profile active, full gstack on". With a real profile now named `full`, output became ambiguous: same word, opposite meanings (sentinel = no profile vs. profile name = canonical full set). Renamed sentinel to "none".
- **Pattern**: when a CLI returns named identifiers from a known namespace (profiles, channels, modes), any sentinel/placeholder value MUST be outside that namespace. Reserve sentinel strings like `none`, `unset`, `default`, `<none>` — never reuse a real identifier as "absence of identifier".
- **Where applicable**:
- Any `cmd_current` / `cmd_status` / `cmd_active` that reports either a real entity OR a "nothing applied" state.
- Profile/preset systems with named profiles.
- Selector outputs in shell scripts where downstream code does `[ "$x" = "<name>" ]`.
- **How to detect early**:
- Before adding a new entity name to a namespace, grep the codebase for hardcoded literals matching the candidate name (`grep -rn '"full"\|"none"\|"default"' lib/`).
- Audit `case` statements + `echo` lines in CLI commands for namespace-reserved labels.
- **Cost when missed**: shell-script consumers parsing the output break silently — `[ "$prof" = "full" ]` matches both meanings. User reads ambiguous status. No type system to catch it.
- **Reference**: `lib/profile.sh:421` sentinel rename in same commit as new `full.profile`. Linked to [[profile-full-superset]] (BDR-017).
---
## LRN-022 — Audit `lib/profiles/*.profile` against gstack skill list after every submodule bump
- **Date**: 2026-05-21
- **Context**: 2026-05-21, `/hotfix` on BLK-005. Gstack upstream renamed `checkpoint` skill to `context-save` (shadow conflict with Claude Code native `/checkpoint` rewind alias). Five local `lib/profiles/*.profile` files referenced the dead name. Warning `⚠ missing: checkpoint — try: bash link.sh` looked actionable but link.sh cannot resurrect an upstream-deleted skill — suggested next step dead end. Misdiagnosis cost user confused round-trip before `/hotfix` traced the rename.
- **Pattern**: profiles couple to external naming registry (`skills-external/gstack/*/`). When upstream renames or removes a skill, profiles silently break: `bash lib/profile.sh set <profile>` warns but does not fail; user has no signal at submodule-bump time. Same shape as any pinned-name reference into a vendored dep (config referring to npm subpath, k8s manifest referring to image tag, etc.).
- **Where applicable**:
- Any `git submodule update` or `git pull` inside `skills-external/gstack/` — diff skill list before/after.
- `make plugin`, `bash install-plugins.sh` — any time external skill source moves.
- When `bash lib/profile.sh apply|set <name>` warns `missing: <skill>`, treat warning as ground truth: skill is genuinely absent from `skills-external/gstack/` AND `skills-disabled/`. `link.sh` cannot fix it.
- **How to detect early**:
```bash
# After any gstack submodule bump:
diff <(ls skills-external/gstack/ | grep -v '^\.' | sort) \
<(awk '$2 != "personal" && $2 != "external" && $2 !~ /^(plugin|mcp|cli)/ && /^[a-z]/ {print $1}' lib/profiles/*.profile | sort -u) \
| grep '^>' # entries in profiles but not in gstack = stale references
```
Run as part of post-submodule-bump audit. Pair with `bash lib/profile.sh set <each-profile>` smoke test — any `⚠ missing:` line = stale entry.
- **Cost when missed**: every profile listing dead name emits misleading warning on `set`. User chases `link.sh` (suggested by `enable_skill` at `lib/profile.sh:191`) which silently no-ops. "try: bash link.sh" message hardcodes a fix that only applies to a different failure mode (skill exists upstream but not symlinked yet) — should differentiate. Follow-up: make missing-skill warning say "missing upstream: not in skills-external/gstack/" when applicable.
- **Reference**: BLK-005, commit `69c5ded`. Linked to [[ship-feature-orphan-wrapper]] (LRN-021) — same shape: post-refactor stale references survive because no automated sweep catches them.
---
## LRN-023 — Scripts invoked via symlink must resolve `$REPO` with `cd -P` (physical path), not default `cd` (logical)
- **Date**: 2026-05-21
- **Context**: 2026-05-21, BLK-006. `lib/profile.sh:43` used `REPO="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"`. Default `cd` preserves the logical (symlink-following) pathname, so when invoked via `bash "$HOME/.claude/lib/profile.sh"` — a symlinked entry point wired by `link.sh``$REPO` resolved to `/home/bchanot-ubuntu/.claude` instead of the real repo `/home/bchanot-ubuntu/Documents/claude`. `$SKILLS_DIR` happened to keep working because `~/.claude/skills` was itself a symlink to the repo, but `$DISABLED_DIR` was a real sibling directory at `~/.claude/skills-disabled` — separate from the repo's actual `skills-disabled/`. `cmd_current` scanned the wrong dir and reported `none` even when 14 gstack skills were genuinely disabled in the repo.
- **Pattern**: any script that
1. computes paths relative to `$BASH_SOURCE[0]` AND
2. is meant to be invoked via a symlink at the install location (e.g. `~/.claude/lib/foo.sh -> <repo>/lib/foo.sh`) AND
3. references sibling directories that are NOT also symlinked into the install location
MUST resolve the script's home via `cd -P` (or `realpath` / `readlink -f`), never default `cd`. Default `cd` returns the logical path the user typed (or the symlinked entry point) — anything you build off that path will follow symlinks for some siblings and fall back to real directories for others, depending on whether each sibling has a symlink in the install location.
- **Where applicable**:
- Any `lib/`, `bin/`, `scripts/` directory in a repo that gets symlinked into `~/.claude/`, `~/.config/`, `/usr/local/`, etc. via an install script.
- Specifically in this repo: `lib/profile.sh`, plus any other script that derives `$REPO`/`$ROOT` from `$BASH_SOURCE`. Audit `grep -rn 'cd "$(dirname "${BASH_SOURCE' lib/ hooks/ agents/`.
- Same pattern in Python (`Path(__file__).resolve().parent.parent` is the safe equivalent — `.resolve()` is the analog of `cd -P`; bare `Path(__file__).parent.parent` is the bug).
- **How to detect early**:
- When writing or reviewing a `REPO=` / `ROOT=` line in a shell script: check whether the script is reachable via a symlink. If yes, `-P` is mandatory.
- Smoke test: from a directory OUTSIDE the repo, invoke the script via both `bash /<real-path>/script.sh` and `bash /<symlinked-path>/script.sh`. Any path the script computes should be identical between the two runs.
- Lint via: `grep -n 'cd "$(dirname "${BASH_SOURCE' <script>` — every match should also contain `cd -P` (or be followed by an explicit `realpath` call).
- **Cost when missed**: state lands in two parallel directories. Reads from one, writes from the other. False-negative status reports. Worst case: silent data loss when one dir is cleaned by a tool that thinks the other is canonical.
- **Reference**: BLK-006, commit `a4558ee`. Linked to [[gstack-rename-profile-audit]] (LRN-022) — both bugs surfaced from the same `/profile set full` invocation, but root causes are independent.
---
## LRN-024 — New sibling command sharing logic → extract helper + refactor caller, never copy-paste
- **Date**: 2026-06-02
- **Pattern**: New `gstack on|off` needed same skill-toggle loops already inline in `cmd_reset` (enable-all-parked) + `cmd_set` (disable-not-in-profile). Copy-paste = divergence risk (gstack__ prefix logic, mktemp keep-file). Instead extracted `enable_all_gstack()` + `disable_gstack_not_in()` + `parked_gstack_count()`; refactored `cmd_reset`/`cmd_set` to call them, then added `cmd_gstack` as 3rd caller. Behavior preserved exact (code MOVED not changed).
- **Why matters**: CLAUDE.md "more elegant solution exists?" — slight scope expansion (touch existing fns) beats duplication. Risk contained by test: snapshot original symlink state → run on/off cycle → re-park exact original → assert final == original. PASS, live env untouched.
- **Key trick**: when mutating shared resource (symlinks, files, db), verify refactor by asserting `final_state == original_state` after a round-trip, not just "command exited 0".
- **Applies to**: any new subcommand/branch reusing logic inline in a peer command — extract first, refactor existing caller, then add new caller. shellcheck after.
- **Reference**: BDR-018, `lib/profile.sh` enable_all_gstack/disable_gstack_not_in/parked_gstack_count. Linked to [[gstack-on-off-verb]] (BDR-018).
---
## LRN-025 — gstack `.gitignore` allowlist must cover ALL toggleable skills, not just currently-enabled ones
- **Date**: 2026-06-02
- **Pattern**: gstack per-skill symlinks are local (regenerated by gstack `./setup`), kept out of git by an explicit `.gitignore` allowlist (`skills/<name>` per skill). Parked skills hide in `skills-disabled/` (blanket-ignored), so a skill missing from the allowlist looks harmless — UNTIL `profile reset` / `gstack on` (BDR-018) moves it into `skills/`, where it surfaces as an untracked symlink (git noise, risk of accidental commit). Found 6 parked skills (`document-generate`, `landing-report`, `scrape`, `setup-gbrain`, `skillify`, `sync-gbrain`) + 6 new unlinked (`spec`, 5 `ios-*`) all absent from the allowlist.
- **Why matters**: allowlist completeness is invisible until a toggle exercises it. The `skills-disabled/` blanket-ignore masks the gap for parked skills.
- **Applies to**: any system where a local-only (gitignored) artifact gets MOVED into a tracked dir by a toggle. Allowlist/ignore rules must enumerate the artifact's BOTH states (parked + active). After a gstack submodule bump, reconcile THREE surfaces, not two: `lib/profiles/*.profile` (LRN-022) **AND** `.gitignore` skills allowlist **AND** decide link/no-link per skill (platform relevance — iOS skills are Mac-only).
- **Detect**: `comm -23 <(gstack source skill names) <(grep '^skills/' .gitignore | sed 's#skills/##')` should be empty after any bump.
- **Reference**: BLK-007, `.gitignore` gstack section. Linked to [[gstack-rename-profile-audit]] (LRN-022), [[gstack-on-off-verb]] (BDR-018).
---
## LRN-026 — `disable-model-invocation: false` means ENABLED, not blocked
- **Date**: 2026-06-09
- **Pattern**: frontmatter key reads as "disable?" → `false` = NOT disabled = model invocation ENABLED. Easy to misread `false` as "off/blocked"; it is the opposite. Only `true` blocks. Absent key = default = enabled. `true` blocks BOTH surfaces: model auto-routing (description-match) AND orchestrator/sub-skill chaining via the Skill tool. Binary — no per-caller granularity, so you cannot allow orchestrator-chaining while forbidding model auto-fire.
- **Why matters**: two traps. (1) Adding `disable-model-invocation: false` thinking you block invocation — you don't, it's a no-op noise line. (2) Keeping `true` "for safety" on a skill you actually want orchestrators to chain (e.g. `ship-feature`, `refactor`) — silently breaks your own CLAUDE.md routing; the model sees the intent but can't fire. Real destructive-action safety = careful/guard hooks (block `rm -rf`/force-push live), INDEPENDENT of this flag — so `true` on an orchestrator buys ~0 data-safety, only suppresses auto-fire (token/time cost).
- **Applies to**: any Claude Code skill frontmatter. Want skill model-routable + orchestrator-chainable → omit key (or `false`). Want human-only `/command` entry point → `true`, accepting it also blocks orchestrators. Guard genuinely dangerous ops at the hook layer, not via this flag.
- **Reference**: BDR-019, 19 `skills/*/SKILL.md`. Linked to [[remove-disable-model-invocation-repowide]] (BDR-019).
---
## LRN-027 — Periodic "since last run" skill needs machine-readable state file — agents improvise boundaries from file dates otherwise
- **Date**: 2026-06-11
- **Context**: TDD baseline for `/audit-delta` (superpowers:writing-skills RED phase, isolated worktree, no skill). Agent asked to "audit everything changed since last audit run". No recorded state → agent guessed boundary from most recent file mtime/date in `.claude/audits/` (grabbed `DARWIN-SKILL-2026-05-12.md` — darwin report, not audit checkpoint), used `git log --after=<date>` (date-based, drifts on rebase/timezone/amend), then wrote ITS checkpoint as prose inside dated report — next run must guess again, same failure loop. Also: zero approval gate under "fix what you find + I'm in meeting" pressure, shellcheck-pass called "verified", all axes one mixed pass.
- **Pattern**: any recurring skill with "since last run" semantics MUST persist machine-readable state (JSON, SHA-based, per-dimension if partial runs possible) + skill must FORBID inference fallbacks explicitly ("do NOT scan report dates", "no `--after`"). Baseline agents fill state vacuum with plausible-wrong heuristics, confidently.
- **Why matters**: improvised boundary = wrong scope silently. Date boundaries break on rebase. Prose checkpoints unparseable. Single marker desyncs partial runs.
- **Applies to**: future periodic skills (audit, sync, drift-check, recurring reports). Design state file FIRST, write anti-inference rules in skill body.
- **Reference**: `skills/audit-delta/SKILL.md` STEP 0 + Common mistakes table. Linked to [[audit-delta-design]] (BDR-020).
---
## LRN-028 — "No-skill" subagent baselines invalid when skill installed globally — subagents see + invoke installed skills
- **Date**: 2026-06-11
- **Context**: darwin run on `audit-delta`. 3 baseline subagents (prompt without skill) meant as no-skill control. All 3 followed skill protocol anyway — one report said "Invoked the /audit-delta skill". Skill symlinked in `~/.claude/skills/` → auto-listed in every subagent's available-skills → "baseline" = contaminated, differential comparison dead.
- **Pattern**: control condition must REMOVE capability, not omit mention. Globally installed skills leak into all subagents. True baseline: fixture env with skill uninstalled/renamed, or isolated worktree pre-install (how audit-delta's own TDD RED phase did it — only valid baseline evidence that run).
- **Detect**: baseline report cites skill name / follows its exact protocol → contaminated.
- **Applies to**: darwin dim8 with/without tests, any A/B skill eval, TDD RED baselines.
- **Reference**: darwin results.tsv 2026-06-11 baseline row. Linked to [[audit-delta-design]] (BDR-020), LRN-027.
---
## LRN-029 — Edit adding exception to blanket rule WILL contradict it — counterbalanced blind judges catch what self-review misses
- **Date**: 2026-06-11
- **Context**: darwin Round 1 added STEP 0 exception (dangling marker → marker frozen) to `audit-delta`. Pre-existing 3c blanket rule ("unreachable user → marker still updates") now contradicted it. Self-review missed; 4/4 independent blind judges (2 per round, doc order swapped to kill position bias) flagged the live contradiction. Round 2 fixed via explicit cross-ref exception clause in 3c.
- **Pattern**: (1) any edit adding exception → grep doc for blanket rules covering same variable (here: marker updates), cross-ref or contradict. (2) Judge protocol that works: 2+ judges, A/B order counterbalanced, blind to version age, score named dims, require consensus. SkillLens 46.4% solo-judge accuracy is real — consensus + counterbalance compensates.
- **Why matters**: improvement edits create inconsistency debt invisible to author in same context (darwin blacklist #1).
- **Applies to**: skill/doc/spec edits adding branches; any self-modified artifact scoring.
- **Reference**: commits 0d2ece7 (introduced), 9fc93fa (fixed). Linked to LRN-027.
---
## LRN-030 — Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter with explicit fan-out rule in CLAUDE.md
- **Date**: 2026-06-18
- **Context**: User noticed Claude rarely spawns subagents. Real cause = Opus 4.8 documented behavioral trait (Anthropic migration notes, surfaced via claude-api skill): conservative reaching for capabilities needing explicit "decide-to-use" step — subagent delegation, file-based memory, custom tools — won't reach unless reasonably sure needed. Less than 4.6/4.7. Session was partly correct task-sizing (1-2 file reads → inline right), partly real under-reach.
- **Pattern**: model-level under-delegation steerable via explicit prompt/config, NOT hard hook. Counter = CLAUDE.md `## Workflow` rule: task fans out across independent items (many files, parallel searches, multi-point checks) → delegate to subagents, don't iterate serially; default to delegation for multi-file exploration.
- **Why matters**: long sessions grind serially + fill main context when 3 parallel agents (cavecrew-investigator / Explore) would map at once. Default tendency wastes the agents the config already defines.
- **Applies to**: any Opus 4.8 session; tuning delegation behavior; deciding inline vs subagent. Same trait drives memory + custom-tool under-use — same counter.
- **Reference**: commit 02a0ba0 (CLAUDE.md `## Workflow` edit).
---
## LRN-031 — Skill value = gate + anti-noise + determinism, NOT re-coding what a capable agent does free
- **Date**: 2026-06-19
- **Pattern**: capable agent + strong CLAUDE.md already nails the easy-path (dedup, semantic-dedup, routing, done-detection) unaided. A skill earns its complexity ONLY on guarantees the agent drops under pressure: mandatory approval gate, anti-noise filters, explicit-only capture, determinism (baseline non-deterministic across runs). Re-documenting free behavior = bloat. Corollary (TDD): if no-skill RED baseline PASSES, fixture under-probes — strengthen on the value dimensions (subtle/pressured cases), never ship a skill justified by a test its absence passes. Trim each procedure to its load-bearing rule (PASS A done-detection → keep restraint rule, drop git-command how-to the agent runs anyway).
- **Context**: built merged `/capitalize` (BDR-023) via writing-skills TDD. RED v1 baseline passed (deduped, checked done task, ignored parasite) — too easy. RED v2 (semantic dup + ambiguous umbrella task + parasite-phrased-as-task + orientation directive + rushed prompt) failed on anti-noise (folded push/tag into TODO) + invented subtask + no approval stop. Those 4 = the skill's real marginal value; rest the baseline did free.
- **Future application**:
- Building/reviewing a skill → ask "does the baseline agent already do this for free?" Keep only gate + filters + determinism + non-obvious restraint rules; cut machinery re-describing capable-agent behavior.
- RED baseline passes without the skill → harden the fixture before writing, don't ship.
- Trim each procedure section to its load-bearing rule; delete how-to the agent performs anyway.
- **Reference**: BDR-023, `skills/capitalize/SKILL.md` STEP 2B + Red flags. Linked to [[LRN-008]] (skill wins from edge-cases not workflow rewrites), [[LRN-028]] ("no-skill" baseline contamination when skill installed globally).
---
## LRN-032 — Rule has a domain; applying it outside that domain = category error — check artifact type before invoking
- **Date**: 2026-06-19
- **Context**: enriching `profile.sh list` display. Cited CLAUDE.md `80 chars/line` to justify compact counters + reject ellipsis truncation. Measured: 7/10 `list` rows still >80 (max 97) — descriptions 58-73 chars, fixed prefix 24. Truncating to hit 80 would break `list` function (at-a-glance profile compare).
- **Pattern (general)**: every rule carries a DOMAIN. Applying it outside that domain = category error. Before invoking ANY rule, identify artifact class it governs + confirm THIS artifact is that class. Mismatch → don't apply. Never apply rules mechanically.
- **Specific instance**: `80 chars/line` = SOURCE-CODE domain (edit readability, diffs, split terminals). CLI runtime output = displayed, not diffed/edited → out of domain. So `list` overflow OK; keep aligned left block (name+counters), descriptions run full.
- **Future application**: invoking a limit/convention/style rule → first ask "what artifact class does this govern, is THIS that class?". Catches misapplied norms (line-length on output, lint on generated files, prose rules on data).
- **Reference**: `lib/profile.sh` `cmd_list`, commit 5776195. Linked to [[LRN-031]] — both meta-lessons on NOT applying mechanically (LRN-031 = value of a skill; LRN-032 = domain of a rule).
---
## LRN-033 — Multibyte separator breaks `printf %-Ns` (byte-width) padding — pad via `${#}` char-count
- **Date**: 2026-06-19
- **Context**: `profile.sh list` ITEMS column = compact counts "12s·1p·1m·1c" using `·` (U+00B7, 2 bytes UTF-8).
- **Pattern**: `printf '%-Ns'` pads to N BYTES, not display columns. Multibyte char → field over-counts → columns misalign (off by bytes-minus-chars). Fix: display width via `${#str}` (char-count, UTF-8-aware under multibyte locale) + pad with `printf '%*s' <gap> ''`. Alt: keep multibyte content in LAST column (no pad) — existing `cmd_list` already did this for descriptions.
- **Future application**: aligning any column with non-ASCII (`·` `—` box-drawing, accents) → never trust `%-Ns`; use `${#}` + manual space pad, or put multibyte field last. Verify with `wc -L` (display width), not `wc -c`.
- **Reference**: `rpad()` in `lib/profile.sh`, commit 5776195.
---
## LRN-034 — Narrated state ≠ ground truth; the missed alarm was internal contradiction — verify against git
- **Date**: 2026-06-21
- **Context**: CLAUDE.md audit reprise. Assistant first said correctly "P3 non écrit" (profile.sh pivot). User then asserted "P3 DÉJÀ appliqué" (diff-approval confused with diff-writing — user acknowledged). Assistant ACCEPTED it ("P3 clos, je n'y touche pas") without reopening git; it carried into the resume prompt as "P3 APPLIQUÉ et committé". On reprise, git log + file content (design routing still split 3×) proved P3 never applied. Eventually applied → commit 493b6b9.
- **Cause (shared)**: origin = ambiguous user assertion (approval ≠ application, acknowledged); assistant failure = swallowing it without verification. Not one party's fault — both unverified.
- **Lead lesson — the missed alarm was internal contradiction**: assistant had said "P3 non écrit", then accepted "P3 fait" two turns later. A claim contradicting what you said just before = loudest possible signal to re-check — and it was reconciled by quietly accepting the newer claim. THAT is the real failure.
- **Pattern**: narrated/remembered state from ANY source (user OR assistant) is not ground truth. Approval of a diff ≠ its application.
- **Future application**: anyone asserts "X is done" → verify (git log, file content, grep) before building on it; ESPECIALLY when it contradicts your own earlier statement, or after a context/window break. Internal contradiction → stop, re-check git, never reconcile by accepting the newer claim silently.
- **Reference**: P3 reprise, commit 493b6b9. Linked to [[LRN-032]] (verify before applying a rule), [[LRN-035]] (check the artifact, not the claim/count).
---
## LRN-035 — Honest dedup: name-mention ≠ definition-instance; a dosage rule can make a "dedup" task a no-op
- **Date**: 2026-06-21
- **Context**: P4 of CLAUDE.md audit = factor "≤2 files, obvious fix" "repeated ~8×". Inspection: 4/8 = skill NAME `hotfix` in lists (not scope defs); 3/8 = context-specialized scope phrasings (routing trigger "typo, CSS, config, ≤2 files" / design "single cosmetic value" / general exemption "obvious fix" — NOT identical), 2 in protected sections (routing table, P3-consolidated design); canonical single source already created by P5 in `## Planning & TODO`. Net: factorize nothing.
- **Pattern**: before factoring "duplication", separate name/reference mentions from actual definition instances; check whether copies are identical or context-specialized. Apply dosage (keep inline where read-in-isolation needs it; in doubt keep inline). A dedup proposal can correctly collapse to no-op — kill it by applying the rule, don't force factorization to honor the proposal.
- **Future application**: any "X repeated N times → factor it" → audit what each occurrence IS; count real dup-of-definition, not keyword hits. Manufacturing factorization degrades local readability for zero gain.
- **Reference**: P4 no-op, CLAUDE.md audit (commit 663b16c). Linked to [[LRN-031]] (skill value = don't re-code free behavior, don't force a procedure), [[LRN-032]] (rule has a domain).
---
## LRN-036 — `command -v <cli>` in a shelled-out script depends on PATH carrying the cli's bin, NOT on the alias
- **Date**: 2026-06-21
- **Context**: design-tool-gate.sh shelled out (`bash script.sh`) by skill/hook checks `command -v claude` to verify magic + ui-ux-pro-max. Live run reported "claude absent" → unverified, though `claude mcp list` worked elsewhere same shell.
- **Refuted hypothesis**: "claude = alias (claude→dtach_claude function), alias dies in non-interactive subshell → cause". Alias DOES die in `bash script.sh`, but HARMLESS: real binary on inherited PATH (`~/.nvm/versions/node/vX/bin/claude`), so `command -v claude` resolves it. Proven: normal `bash script.sh` → FOUND; `PATH=/usr/bin:/bin bash script.sh` → NOT FOUND. Lever = PATH, not alias.
- **Real cause**: `command -v claude` succeeds only when PATH carries the node bin dir. Skill/hook can shell script out with sanitized PATH lacking it; nvm path version-pinned → node upgrade moves it. Either → check = unknown.
- **Fix**: don't trust inherited PATH. `ensure_claude_on_path()` probes known dirs (`~/.claude/local`, `~/.local/bin`, `/usr/local/bin`, nvm glob `sort -V | tail -1` = newest) + prepends bin dir (carries claude AND its node runtime, same dir; claude shebang needs node). Fail-visible exit 11 = the MITIGATION/net, NOT the cause.
- **Future application**: any script shelling out a CLI that may run from hook/subshell → resolve the binary's bin dir explicitly, don't assume interactive PATH. Test under `PATH=/usr/bin:/bin` to simulate sanitized context. Distinguish alias/function (interactive-only, never in subshell) vs real binary on PATH (what `command -v` finds in scripts).
- **Reference**: `ensure_claude_on_path()` in `lib/design-tool-gate.sh`, commit f963318. Linked to [[LRN-034]] (narrated/plausible state ≠ ground truth — here the plausible alias theory was wrong; test the real subshell, don't accept it).
---
## LRN-037 — Verify the load-bearing scenario on the REAL subject in REAL context, not a stub or a logic argument
- **Date**: 2026-06-21
- **Context**: design-gate chantier. 4 successive plausible claims each REFUTED only by running the real thing: (1) .env read path was `$REPO/.env`, not `~/.claude/.env` (read the actual script); (2) fail-open — unknown folded into silent READY (saw it in live output); (3) "alias dies in subshell = cause" (refuted: real binary on inherited PATH → `command -v` succeeds); (4) real cause = PATH carrying nvm bin (proven by `PATH=/usr/bin:/bin` run). Logic/stub never caught any. The DISCRIMINATING magic-OFF-under-stripped-PATH → exit 10 is what proved the gate truly runs `claude mcp list` vs. defaulting to READY.
- **Pattern**: for the load-bearing scenario, run it on the REAL subject in the REAL invocation context (prod path `$HOME/.claude/lib/...`, prod-like PATH), not a stub or a "the code path is correct" argument. A stub proves branch coverage; only the real subject proves the integration. Always add a DISCRIMINATING case — force the failure state; the check must REPORT it, not pass by default (a check that only ever passes proves nothing).
- **Future application**: any "fixed/works" claim on a critical path → produce the real run output (command + lines + exit code) before capitalizing or shipping; don't summarize ("condition met") in place of the output. Stub/logic = necessary for branch coverage, never sufficient for the integration claim. Most rentable discipline of the whole segment: every refutation came from execution, none from reasoning.
- **Reference**: design-gate chantier, the `PATH=/usr/bin:/bin` matrix (magic-on → READY/0, magic-off → INCOMPLETE/10), commits 4d19135 / f963318. Linked to [[LRN-036]] (the concrete instance: the PATH cause surfaced only by the real run), [[LRN-034]] (its twin — 034 = don't trust a narrated *claim*; 037 = don't trust a *stub/logic argument* as proof; both demand execution against ground truth).
---
## LRN-038 — Playwright host-platform override for distros newer than its hardcoded support list
- **Date**: 2026-06-23
- **Context**: fresh Ubuntu 26.04. gstack `./setup` aborted: "Playwright does not support chromium on ubuntu26.04-x64". Playwright 1.58.2's registry hardcodes `ubuntu20.04/22.04/24.04` only; a newer release → no matching build → hard error. gstack is a pinned submodule (must not edit).
- **Pattern**: `PLAYWRIGHT_HOST_PLATFORM_OVERRIDE=ubuntuXX.04-<arch>` forces a fallback build. MUST include arch (`x64`/`arm64`) — bare `ubuntu24.04` fails ("does not support … ubuntu24.04"). Set it from the WRAPPER: `export` before the submodule's setup (install-time download) AND persist to the shell profile (runtime launch) — both paths call `getHostPlatform`. No submodule edit. Gate on real OS version (`sort -V` compare) so supported distros are untouched. Test with the LOCAL `./node_modules/.bin/playwright``bunx playwright` pulls the LATEST playwright (different browser revision than the local import), which masks the result.
- **Future application**: any pinned tool that hardcodes an OS allowlist breaks on a fresh OS upgrade. Look for a host-platform override env before bumping/forking the dep. Prove the fallback binary actually runs (`ldd` = no missing libs + a real headless render), not just that the download resolves.
- **Reference**: `install-plugins.sh` `playwright_platform_override()`, commit 211c7d4. Linked to [[BLK-008]].
- **2026-06-23 CORRECTION (override REVERTED, commit b9c3937)**: the override is NOT a usable fix on Ubuntu 26.04. It makes `playwright install` switch to the ubuntu24.04 fallback build, which downloads to 100% then HANGS at extraction (chrome binary never materializes; real machine + sandbox). Turned a 0.5s fast-fail into an install-blocking hang. The isolated proof (`ldd` + headless render) PASSED but used an already-extracted sibling build (rev 1228) — it masked the install-path hang in the real flow (rev 1208). **Sharpened lesson**: proving the binary launches in isolation is NOT proving the install path works — run the ACTUAL install command end-to-end (it must COMPLETE, not just "download resolves" nor "a binary launches"). The override technique stays valid in general, but the EXTRACTION/COMPLETE step is part of "does it work".
---
## LRN-039 — Installers drift hand-curated config → snapshot+trap-restore guard; anchor gitignore for pollution
- **Date**: 2026-06-23
- **Context**: fresh Ubuntu `make install`. 3rd-party installers mutated repo files: graphify rewrote `CLAUDE.md`+hooks (every `graphify install`, Step 7), `claude plugin install` flipped `enabledPlugins`, the example-skills `cp` churned `frontend-design`, `npx skills add` wrote project-scope `.agents/` + `skills-lock.json`.
- **Pattern**: file an installer rewrites but YOU curate → snapshot to a `mktemp -d` at start + `trap restore EXIT` (`cmp -s` before `cp`, revert only real diffs). Preserves pre-existing edits, no git dependency, idempotent, survives early-exit. Pure generated pollution → gitignore. ANCHOR the ignore (`/.agents/`, NOT `.agents/` and NOT `agents`) so it can't catch a legit sibling — our agents live in `agents/` (no dot). Verify with `git check-ignore -v <legit-dir>` that the pattern doesn't over-match.
- **Future application**: audit a fresh install = `git status` right after `make install`; classify every drift as (a) curated → guard, or (b) pollution → anchored gitignore. Never `git checkout` to clean drift (destroys uncommitted work). Prove the guard with an isolated drift→restore test before trusting it.
- **Reference**: `install-plugins.sh` `restore_curated_configs` + EXIT trap, `.gitignore` `/.agents/`, commits 51afe9b / 7de8761. Linked to [[BDR-028]].
---
## LRN-040 — OS newer than a pinned tool supports = TWO distinct layers (version build + security policy)
- **Date**: 2026-06-23
- **Context**: gstack browser on fresh Ubuntu 26.04. Layer 1 = Playwright 1.58.2 ships no browser build for 26.04 → install errors (the host-platform override "fixes" the error but its fallback build HANGS at extraction — dead end, [[BLK-008]]). Layer 2 = even with Playwright 1.61 (native 26.04 build that launches fine in isolation), the real browse path aborts "No usable sandbox" because Ubuntu 24.04+ restricts unprivileged user namespaces via AppArmor.
- **Pattern**: (a) bump the tool PAST the OS-support threshold — don't force the OS to look older (overrides/fallbacks are fragile; prove the install COMPLETES, not just that a binary launches). For a pinned submodule dep: `bun add X@latest` in the submodule, automatable in the installer, idempotent by grepping the dep's support list for the running OS tag before bumping. (b) SEPARATELY handle OS security hardening: Chromium needs `--no-sandbox` where `sysctl kernel.apparmor_restrict_unprivileged_userns=1`; gstack exposes `GSTACK_CHROMIUM_NO_SANDBOX=1` (#1562). Gate persistence on the sysctl, not an OS-version guess.
- **Future application**: "tool X broke after an OS upgrade" → check BOTH (1) does X ship a build / support entry for the new OS (bump if not), and (2) does the new OS's hardening (userns/AppArmor/SELinux) block X at runtime (needs an opt-out flag). Fix one without the other and it still fails. Verify the FULL runtime path (drive a real page) — here the isolated `chromium.launch()` PASSED while the real `browse` path failed on the sandbox.
- **Reference**: `install-plugins.sh`, `.bashrc` `GSTACK_CHROMIUM_NO_SANDBOX=1`, gstack `browse/src/browser-manager.ts` `shouldEnableChromiumSandbox()`, commit 3b8ffb1. Linked to [[BDR-029]], [[BLK-008]], [[LRN-038]].
---
## LRN-041 — A check reading a symlink an EARLIER install step makes → false negative if that step's precondition wasn't met
- **Date**: 2026-06-23
- **Context**: install warned "MAGIC_API_KEY not found in ~/.claude/.env" though the key WAS set there. Root: the check grep'd `$REPO/.env` — a symlink → `~/.claude/.env` ([[BDR-026]]) created by `link.sh`'s `link_env`. On a fresh machine `~/.claude/.env` is created AFTER `link.sh` runs (install first warns "create it"), so the symlink was never made and the key was unreachable via `$REPO/.env`. `make plugin` also never runs `link.sh`. The warning misleadingly blamed `~/.claude/.env`.
- **Pattern**: a check that reads a path PRODUCED by an earlier setup step silently fails when that step's precondition wasn't met yet (target absent → symlink skipped). Fix: read the CANONICAL source and/or self-heal (create the missing symlink when the canonical exists). Env-key greps must tolerate `export `/leading whitespace and require a non-empty value: `^[[:space:]]*(export[[:space:]]+)?KEY=.` — and the message must name the real gap (symlink missing vs key absent), with an actionable hint (`run make link`).
- **Future application**: any "X not found in FILE" where FILE is a symlink/derived path → verify the producing step ran with its precondition, prefer the canonical source, self-heal or give an actionable message. Sandbox note: `.env*` reads were blocked — diagnosed via directory listing + regex tests on SYNTHETIC lines, never reading the secret.
- **Reference**: `install-plugins.sh` magic check (self-heal symlink + tolerant regex), `link.sh` `link_env`, commit 1b028cb. Linked to [[BDR-026]].
---
## LRN-042 — `npx skills add` / gstack `./setup` resolve install target RELATIVE TO CWD — run from repo = wrong dir, breaks `$HOME` symlink assumptions
- **Date**: 2026-06-23
- **Context**: darwin-skill `npx -y skills add` (Step 8.5) + gstack `./setup` (Step 2) both ran with CWD=repo. The `skills` CLI writes to `<cwd>/.agents/skills`; gstack `./setup` likewise wrote per-skill dirs into repo-local `.agents/skills`/`.claude/skills`. So darwin landed in `$REPO/.agents/skills/darwin-skill` + `$REPO/.claude/skills/darwin-skill`, NOT `$HOME/.agents/skills/darwin-skill` where `link.sh` (NPX_EXTERNAL_SKILLS) + `install-plugins.sh` (`_dst`) look → symlink never created, "darwin-skill not installed — run make plugin" though it WAS installed. SELF-REINFORCING: once `$REPO/.agents` exists, every later `skills add` targets it. `find-skills` only worked because an earlier run (before `$REPO/.agents` existed) wrote it to `$HOME`. BDR-028/LRN-039 had already gitignored repo `.agents/`+`skills-lock.json` as "drift noise" — masked the symptom, never saw the install was landing in the WRONG PLACE.
- **Pattern**: a per-user installer that resolves its target relative to CWD (walks up for / creates `.<tool>/` in CWD) silently installs into the project tree when run from a repo that already carries such a dir. Gitignoring the junk hides it but the artifact is unreachable from `$HOME`-based consumers. Fix: run the installer from `$HOME` (`(cd "$HOME" && npx -y skills add …)`) so it targets `$HOME/.agents/skills`; clean up the repo-local copies (gitignored → safe `rm -rf`). Also fix the ordering twin: `link.sh` must re-run AFTER the install steps that produce what it symlinks (install.sh ran link FIRST; install-plugins never re-linked) — added a final `link.sh` step so `make plugin`/`make install` finish self-sufficient.
- **Future application**: before running any `npx <x> add` / `<tool> init` / `setup` that materializes a dotfile dir, set CWD to where the artifact MUST live (usually `$HOME`), don't trust the script's default resolution. When a "X not installed" warning contradicts a "successfully installed" log line → diff the EXPECTED path vs where the log says it wrote (here log line showed `~/Documents/claude/.agents/skills/darwin-skill`). When an installer A produces inputs for symlinker B, B must run after A in the same invocation.
- **Reference**: `install-plugins.sh` Step 8.5 (`cd "$HOME"` + parasite cleanup) + Step 10 (final `link.sh`), `update-all.sh` Step 7.5, log `install-20260623-181416.log:1399`. Extends [[LRN-039]] (BDR-028 — gitignored the symptom) + [[LRN-007]] (toggle-external source-only state) + [[LRN-041]] (install-ordering false-negative). gstack on-demand consumer = [[BDR-030]].
---
## LRN-043 — CLAUDE.md skill-routing: cut name-obvious lines (already in skill descriptions), keep only non-derivable signal + dense catch-all
- **Date**: 2026-06-25
- **Context**: compressing the Skill-routing block of the global CLAUDE.md. Claude already sees every skill's `description` in session context (the available-skills list). A routing line that merely restates "task X → skill named X" duplicates that description → pure token waste loaded every session.
- **Pattern**: in a routing list, KEEP only lines carrying signal NOT derivable from the skill name — (a) conditional fallbacks (gstack ON/OFF), (b) misleading/cryptic names where name ≠ function (`validate` → W3C/WCAG, not form/data/build validation; `cso` → security audit; `plan-eng-review` → architecture review), (c) disambiguation between near-twins (feat/hotfix/bugfix by file-count). CUT the name-obvious rest, replace with ONE dense catch-all ("most skills route by name — match the request to the skill whose description fits"). GUARD: a misleading name is NOT transparent → it needs its own explicit line or it mis-routes; never cut those to save a line (user restored `validate` + `plan-eng-review` for exactly this).
- **Future application**: compressing any routing/dispatch table whose entries the model already sees elsewhere → delete the redundant majority, keep the non-obvious minority + a generic fallback. Test each candidate cut: "is this mapping derivable from the skill name + its own description?" Yes → cut. No → keep explicit.
- **Reference**: `~/.claude/CLAUDE.md` §Skill routing, commit ba743cf (routing block 40 → 23 lines). Linked to [[BDR-031]].
---
## LRN-044 — Edit/Write tools refuse to write THROUGH a symlink — pass the resolved real path
- **Date**: 2026-06-25
- **Context**: editing `~/.claude/CLAUDE.md`, a symlink → `~/Documents/claude/CLAUDE.md` (the tracked repo file). Read worked through the symlink; Edit errored: "Refusing to write through symlink … Resolve the symlink and pass the real target path explicitly."
- **Pattern**: many of this user's `~/.claude/*` config files are symlinks INTO the claude-config repo (`~/Documents/claude/`). Edit/Write block writes through a symlink (safety against clobbering link targets); Read does not — so Read-through-link succeeds then Edit-through-link fails on the same path.
- **Future application**: before editing any `~/.claude/...` config file, resolve it first (`readlink -f <path>`, or `ls -la` to spot the arrow). Then Read AND Edit the RESOLVED real path so the harness's read-tracking matches what you write — and `git` status/diff/commit land naturally in the repo that owns the file.
- **Reference**: hit while editing `~/.claude/CLAUDE.md``~/Documents/claude/CLAUDE.md`. Linked to [[BDR-031]].
---
## LRN-045 — Renaming a command: audit exact-name leak-guard / forbidden-token regexes
- **Date**: 2026-06-25
- **Context**: rename `/validate``/web-validate`. A client-deliverable leak-guard in `agents/client-handover-writer.md:1462` greps generated docs for internal tool names via `grep -niE '/(seo|harden|validate|cso|...)\b'`. The `web-` prefix means `/web-validate` no longer matches the `/validate` branch (the `/` must sit immediately before `validate`; post-rename a `-` sits there) → renamed command leaks SILENTLY into client-facing output. No error — the gate just stops catching it.
- **Pattern**: any rename of a command/skill/identifier must sweep regexes/allowlists/denylists that match the OLD name by exact token — leak guards, forbidden-token gates, routing dispatchers, CI greps. A prefix/suffix rename breaks anchored matches (`/oldname\b`) with zero error. Fix = alternation covering BOTH names (`web-validate|validate`), NOT replacement — old artifacts (already-shipped client docs, logs) still carry the legacy name and must stay caught.
- **Future application**: when renaming, grep the BARE old token inside regex/test/gate files, not just `/oldname` command refs. A blind `replace_all '/old' '/new'` MISSES these because the guard stores the name inside an alternation (`|old|`), not as `/old`. For each guard found, extend to `new|old`; verify the gate line shows both names.
- **Reference**: `agents/client-handover-writer.md:1462`, rename commit `e5e673a`. Linked to [[BDR-032]].
## LRN-046 — Destructive skill: deterministic oracle > semantic judge
- **Date**: 2026-06-25
- **Pattern**: On a DESTRUCTIVE skill the binding oracle must be DETERMINISTIC (byte-identical, or count-based census per-entry × per-category), not a semantic judge. A judge false-greens twice: (a) PRESERVED-but-MUTATED content — RED-4, a "meaning preserved" collapse still rewrote a permanent safety rule; byte-identical caught it, the judge would not; (b) a 0-result that happened by CHANCE — "no negation inverted" ≠ protected, it was the dice not a guard. If the oracle must be behavioral/LLM, pair it with a deterministic check that is the gate.
- **Context**: prune-memory v1.1 TDD (EVAL-006, skill `0a3e766`). RED-4 collapse + RED-3 compression.
- **Future application**: any destructive/irreversible skill or safety check; any TDD whose natural oracle is an LLM judge — make the binding check deterministic, keep the judge as a secondary net.
- **Reference**: skill `0a3e766`, `tests/run-behavioral.md`. Linked to [[EVAL-006]].
## LRN-047 — A noisy safety guard is a risk, not discomfort
- **Date**: 2026-06-25
- **Pattern**: A safety guard that cries wolf (13/13 false positives on real data) is a guard you learn to IGNORE → the day of the true positive you skip it by habit. On a destructive op a noisy guard = security RISK, not annoyance → REFINE it (here line-grep → count-based census), don't tolerate. Measure the false-positive rate on REAL data, not fixtures — all-green fixtures hid the 13.
- **Context**: prune-memory v1.1 (EVAL-006). The RED-5 line-grep fidelity guard fired 13/13 false positives on the live learnings.md (line-sharing) → replaced by a per-entry census (0 FP, proven).
- **Future application**: any guard/alert/lint/test that can false-positive — measure FP on real data before shipping; a guard habitually ignored is worse than none.
- **Reference**: skill `0a3e766`, `tests/run-deterministic.sh` (RED-5). Linked to [[EVAL-006]].
## LRN-048 — A "0 / OK / pass" must prove it LOOKED
- **Date**: 2026-06-25
- **Pattern**: A passing result ("0 errors", "OK", "clean") must PROVE it inspected — show the work counted something on both sides (census non-zero on HEAD and WORK). Else it is a verify hard-wired to pass = the original prune-memory v1 lie (`basename | cut -c1-3` never matched any heading → verify always printed blank-OK). A 0 by inaction is indistinguishable from a 0 by correctness; force the success path to surface its coverage.
- **Context**: prune-memory v1.1 (EVAL-006). v1 STEP-4 verify always reported OK (wrong prefix → 0 markers → blank). The fix's 0-false-positive is only trustworthy because the census was shown counting both sides.
- **Future application**: any verify/test/lint reporting success — design the pass to surface what it examined (counts / files / lines) so a vacuous pass is visible, not silent.
- **Reference**: skill `0a3e766`, EVAL-006 (verify-proof anomaly). Linked to [[EVAL-006]].
## LRN-049 — Non-destructive repeated nudge: stateless-minimal surface > state marker (conditional on stakes)
- **Date**: 2026-06-25
- **Pattern**: To dedup a REPEATED but NON-DESTRUCTIVE suggestion (hint/nudge/advisory in a stateless flow — gate, hook, lint note), minimize the surface (always 1 line) instead of a persistence marker. A marker buys "exactly once" but costs state (file + gitignore + location), wrong scope ("session" via a plain file = forever-per-project), and staleness with no cleanup. Goal is not "prevent re-fire" but "make re-fire cheap enough to never be noise" — strip the per-occurrence richness and there is nothing left to dedup. **Conditional on stakes**: [[LRN-046]]/[[LRN-047]] ("deterministic > behavioral", "noisy guard = risk") were forged on a DESTRUCTIVE skill where a false-green = data loss → there a deterministic marker earns its cost. Here it is a 1-line cosmetic note → re-fire is annoyance, not risk → do NOT import marker-grade infra. Same determinism requirement, opposite cost/benefit.
- **Context**: design-gate §4 anim-lib suggestion ([[BDR-033]]). User reserved the marker-vs-refire call; winning third option was "always 1 line, stateless".
- **Future application**: any repeated advisory in a stateless surface — first bound the noise by minimizing the surface; reach for a marker/flag-file ONLY when a missed dedup is costly (destructive, irreversible, money, security), not merely repetitive. Match the guard's cost to the stake it protects.
- **Reference**: `lib/design-gate.md` §4, [[BDR-033]]. Conditions [[LRN-046]], [[LRN-047]].
## LRN-050 — On a symlinked/live file, show-before-write is the ONLY control gate
- **Date**: 2026-06-25
- **Pattern**: When the edit target is symlinked into the live path (`~/.claude/lib/`→repo, `~/.claude/CLAUDE.md`→repo …), saving the file IS deploying it — write and go-live collapse into one act. No later deploy step catches a bad change, so the pre-write review (show the drafted diff, get explicit go) is the ONLY checkpoint before the change is in service — unlike a normal file where build/commit/deploy offers a second net. On live/symlinked targets, show→validate→write is mandatory, not courtesy; "edit silently then show" forfeits the only gate.
- **Context**: this session twice wrote-then-showed on `lib/design-gate.md` (live via symlink). Both harmless (non-destructive), but the pattern would bite on a destructive live edit. User flagged it → inverted to show→validate→write.
- **Future application**: before editing any file, check if it is live (`readlink -f`, compare to `~/.claude/`); if live, treat the pre-write diff as a mandatory approval gate, not an optional preview. Generalizes to any "edit = deploy" target (dotfiles, served config, hot-reloaded sources).
- **Reference**: `lib/design-gate.md` (symlink → `~/.claude/lib/`). Sibling to [[LRN-044]] (write-through-symlink → resolve real path). Linked to [[BDR-033]].
## LRN-051 — `git commit -- <pathspec>` strict on no-match → filter scoped commits to changed paths
- **Date**: 2026-06-26
- **Pattern**: Automating a scoped commit (commit only subtree X), pass to `git add`/`git commit` ONLY paths with real pending changes. `git add -- <pathspec>` TOLERATES a no-match pathspec (rc 0, stages the matching ones); `git commit -- <pathspec>` is STRICT — one no-match pathspec ABORTS the whole commit (`error: pathspec '<x>' did not match any file(s) known to git`). So a clean scoped path (e.g. empty `.claude/tasks`) silently aborts the commit on most runs. Filter via `git status --porcelain -- <path>` to changed paths only. Bonus: `git commit -- pathspec` = PARTIAL commit (working-tree of those paths, ignores rest of index) → surgical-scope safety: dangling code (untracked OR pre-staged) never embarked.
- **Context**: building `lib/memory-commit.sh`. Naive `git commit -- .claude/memory .claude/tasks` aborted whenever `.claude/tasks` was clean. Caught by real-exec test (T1/T2/T2-bis), NOT by assuming git's behavior — `add` and `commit` are NOT symmetric on pathspecs.
- **Future application**: any "commit only subtree X" automation — filter to changed paths; rely on partial-commit for surgical scope; never assume tool behavior symmetric across sibling subcommands — exec-test it.
- **Reference**: commit `58cb91d` (`_changed_paths` filter + T1/T2/T2-bis), `bbef41c` (stdout hash contract). See [[BDR-034]].
## LRN-052 — Hash-anchoring applicability — 2 cases where `Reference: commit <hash>` does NOT apply
- **Date**: 2026-06-26
- **Pattern**: The anchoring convention (`Reference: commit <hash>`) means "the commit that IMPLEMENTS this decision" (BDR-033 → 11792cc). It does NOT apply in 2 cases: (1) a FOUNDING decision made pre-code (at design time) — attested by no implementing commit; anchoring it to the unrelated scaffold commit is a FALSE anchor. (2) a SQUASH-MERGED PR — the anchored commit ceases to exist post-squash. Forcing a hash in either case dilutes what "anchored" means everywhere else. Rule: pre-code founding decisions carry NO hash (path+date suffice); squash-merge workflows can't anchor.
- **Context**: building init-project STEP 10b (capitalize founding architecture decisions). A founding "Astro not Next" has no implementing commit. Surfaced the BOUNDARY of the anchoring convention — completes it, not contradicts it.
- **Future application**: capitalizing founding/architecture decisions, or working in squash-merge repos — do NOT fabricate a hash; the anchor only means something when a real implementing commit exists.
- **Reference**: commit `df60df6` (init-project STEP 10b hash rule), `lib/capitalize-commit.md` (2-hash non-confusion). See [[BDR-034]], [[BDR-033]].
## LRN-053 — Read-before teeth = verifiable disposition in the artifact, not the act of reading
- **Date**: 2026-06-26
- **Pattern**: An "always read X before planning" invariant guarantees NOTHING by the read alone — "ran before the plan" proves the digest was PRODUCED, not CONSUMED. The teeth are a verifiable DISPOSITION: the plan/diagnosis must NAME each surfaced item it honors, or state none binds. [[LRN-048]] ("a 0/OK must prove it LOOKED") one step further — the guarantee is "did it STATE a verdict on each?" (checkable), not "did it look?" (not). Without the trace, even natural consumption (inline reader=planner) degrades to read-then-ignore.
- **Context**: analyze-before-plan ([[BDR-035]]). feat's first draft ("feed the MINI-PLAN") had no forced trace → user flagged it as the link where wiring goes cosmetic; strengthened to "MINI-PLAN names in-force or states none". bugfix DIAGNOSIS names `PRIOR: BLK-xxx`.
- **Future application**: any read-before / check-before / advisory wiring — force the consuming artifact to emit a per-item verdict; never trust "data was available" = "data was used".
- **Reference**: `lib/analyze-before-plan.md` (OUTPUT), `agents/feater.md` STEP 0.6, `agents/bugfixer.md` STEP 2.5. Extends [[LRN-048]]. See [[BDR-035]].
## LRN-054 — No deterministic oracle for "already in context" → never add a presence-skip branch
- **Date**: 2026-06-26
- **Pattern**: "Skip the work if the info is already in my context" has no clean implementation: (1) self-judgment = the behavioral guard [[LRN-046]] rejects, unreliable on long convos ([[LRN-034]]); (2) a session marker records "was read", NOT "still present" → after a compaction the body is gone but the marker says skip → FALSE-SKIP (the marker cost [[BDR-033]] priced); (3) the agent cannot grep its own context window. No presence oracle exists. Do the work unconditionally when cheap; bite on the verifiable disposition.
- **Context**: analyze-before-plan ([[BDR-035]]). Tried to skip PASS-2 full-read for "already in context" entries; predicate had no oracle. Resolved: PASS-2 reads selected set unconditionally (~tens of transient lines, digest-only persists). A decision WRITTEN earlier same-conversation must still re-surface as in-force (content in context ≠ flow treated it as a constraint).
- **Future application**: any "skip if already seen/in-context" optimization over conversation state — reject; no oracle. Make the work cheap+unconditional, or use a deterministic EXTERNAL ledger (not context introspection).
- **Reference**: `lib/analyze-before-plan.md` (THE INVARIANT). Conditions [[LRN-046]], [[LRN-034]], [[BDR-033]]. See [[BDR-035]].
## LRN-055 — Body `## ID —` headings are a drift-immune index; the maintained `## Index` table is not
- **Date**: 2026-06-26
- **Pattern**: When a registry keeps both per-entry `## ID — title` headings AND a hand-maintained `## Index` table, the Index DRIFTS (entries land in the body, the manual update lapses) while headings cannot (an entry IS its heading — 100% coverage by construction). Measured: decisions 11/34 (32%), learnings 21/52 (40%), blockers 2/9 (22%) missing from the Index — scattered in large blocks (e.g. decisions BDR-024033 unindexed while the newer BDR-034 is), not an old/new split. The manual Index-update step is simply unreliable. Key any selector/scan off `grep '^## <PREFIX>-'`, never the convenience Index. Backfill (prune-memory passe D) = human-TOC hygiene, NOT a selector dependency.
- **Context**: analyze-before-plan ([[BDR-035]]) two-pass. First instinct "reuse the Index capitalize maintains"; measuring the drift killed it — the convenient artifact was the unreliable one, the guaranteed one (headings) sat free.
- **Future application**: choosing a substrate to index/select over — prefer what the STRUCTURE guarantees over what a step PROMISES to maintain. Verify maintained-artifact completeness before depending on it.
- **Reference**: `lib/analyze-before-plan.md` (PASS 1). `skills/prune-memory` passe D. See [[BDR-035]].
## LRN-056 — `grep PAT dir/*.md` on an absent dir ERRORS (exit 2), it does not no-op → guard with `[ -d ]`
- **Date**: 2026-06-26
- **Pattern**: A bare `grep -E PAT dir/*.md` over a glob matching nothing (dir absent, or present with no `.md`) does NOT return clean-empty — the unmatched glob is passed LITERALLY to grep, which fails: `No such file or directory`, **exit 2** (grep error). Distinct from a real no-match: grep over an existing file with no hit = **exit 1**. Verified: bare grep on absent dir → 2; `[ -d dir ] && ls dir/*.md >/dev/null 2>&1 && grep …` on absent dir → 1 (`[ -d ]` false, short-circuits, grep never runs); grep on present-but-empty registry → 1. exit 2 = grep error; exit 1 = guard-skip OR clean no-match.
- **Context**: analyze-before-plan include ([[BDR-035]]). DO step said "absent → no-op" but the bare grep would ERROR at init-project STEP 2 (registries created STEP 5, absent at analyze). Caught by exec-test, not assumption.
- **Future application**: any glob-fed scan that must no-op on "nothing there" — guard `[ -d dir ]` (+ file-exists) BEFORE the glob; never assume grep degrades. Exec-test the absent/empty case.
- **Reference**: `lib/analyze-before-plan.md` (PASS 1 guard). Sibling to [[LRN-051]] (exec-test tool behavior, never assume). See [[BDR-035]].
## LRN-057 — Match the consumption mechanism to the consumer (mechanical / external-cognitive / inline-cognitive)
- **Date**: 2026-06-26
- **Pattern**: When a produced artifact must be CONSUMED downstream, the mechanism depends on the consumer: (a) MECHANICAL (git merge integrating a branch) — production on the shared substrate = consumption, automatic ([[BDR-034]]'s "commit before FINISH"); (b) EXTERNAL-COGNITIVE (an unmodifiable skill like `superpowers:brainstorming`) — "produced before" ≠ "consumed"; INJECT the artifact into the consumer's INPUT at the invocation boundary (orchestrator = adapter) + a RECONCILIATION gate that EXPOSES the disposition for review (not auto-detect); (c) INLINE-COGNITIVE (same agent reads then plans) — reader=planner, same context → natural consumption, just force the trace ([[LRN-053]]). Don't import (b)'s machinery where (c) suffices, nor assume (a)'s automatism when the consumer is cognitive.
- **Context**: analyze-before-plan ([[BDR-035]]). ship-feature brainstorm = external-cognitive → STEP 0d injection + STEP 3 expose-for-review gate; feat/bugfix = inline-cognitive → natural + trace, no injection. The asymmetry vs [[BDR-034]] (mechanical merge) was the chantier's hardest point.
- **Future application**: wiring ANY produce→consume invariant — classify the consumer first (mechanical / external-cognitive / inline-cognitive), pick the lightest sufficient mechanism. Stops reflexively importing orchestrator-grade injection+gate where an inline trace would do.
- **Reference**: `skills/ship-feature/SKILL.md` STEP 0d/1/2/3, `agents/bugfixer.md`+`feater.md`. Contrast [[BDR-034]] (mechanical). See [[BDR-035]], [[LRN-053]].
## LRN-058 — Same bug-class ≠ same fix: verify the twin shares the fix's PRECONDITION before replicating
- **Date**: 2026-06-27
- **Pattern**: A deferred "twin" fix ("doc-sync = same PR bug → reorder before FINISH like memory") REFUTED on inspection: memory's reorder worked because memory ALREADY committed (helper existed, only timing wrong); doc-syncer committed NOTHING → reordering uncommitted docs still misses the merge. The fix relied on a PRECONDITION (artifact already committed) the twin did NOT share. "Same symptom" ≠ "same mechanism". A read-phase grep (zero git commit in doc-syncer) caught it before any code — saved shipping an illusion-of-fix.
- **Context**: doc-sync coupled ([[BDR-036]]). The chantier's central lesson; the user named the trap upfront ("même bug ≠ même fix").
- **Future application**: any "fix X like we fixed Y" — NAME Y's load-bearing precondition, CONFIRM X has it, before replicating. Cheap read-phase check beats a shipped non-fix.
- **Reference**: [[BDR-036]], [[BDR-034]].
## LRN-059 — A step-number SWAP flips meanings → sweep external refs; a letter-suffix insertion shifts nothing
- **Date**: 2026-06-27
- **Pattern**: Renumbering a pipeline has two shapes, opposite ref-risk. (1) SWAP (STEP 8↔9 = FINISH↔DOC SYNC) flips what each number MEANS → every external ref can go silently false OR accidentally true; grep the WHOLE repo, read each hit individually. PROVEN: ship-feature's swap silently broke README:153 — which a PRIOR chantier's swap had ALSO broken (e8eff7e moved DOC SYNC 8→9, missed the ref) → debt COMPOUNDS across chantiers. (2) LETTER-SUFFIX insertion (10b, 0d) shifts NO existing number → breaks nothing (init-project's 10b left zero stale refs). Discipline: prefer letter-suffix insertions; on a swap do a full external sweep + per-ref verify; COMPLETE an accidentally-true ref (don't lean on the coincidence — it re-breaks at the next swap).
- **Context**: doc-sync coupled ([[BDR-036]]). The Task-6 sweep caught README:153 (prior debt) + verified 5 USAGE refs post-swap.
- **Future application**: any pipeline renumber — classify swap vs insertion; swap → grep+read every ref. The external sweep catches PAST chantiers' debt, not only the current one.
- **Reference**: [[BDR-036]]. Sibling [[LRN-002]], [[LRN-045]] (grep reads not just writes).
## LRN-060 — A fail-closed guard is proven by what it REFUSES (loudly); pass dynamic lists as argv, not a separator-string
- **Date**: 2026-06-27
- **Pattern**: Two robustness lessons from doc-commit. (a) The inverse-`.claude/` exclusion is a SECURITY guard (BDR-022) → test it by what it must REFUSE (forbidden path ALONE, and MIXED with legit), not only what it accepts; and refuse LOUDLY (dedicated exit 4, names the offender, refuse-ALL on mixed) — silent-filtering would MASK an upstream violation (doc-syncer surfaced a `.claude/` it must never patch). The refusal IS the alarm. (b) Pass a dynamic file list as ARGV, never a separator-joined string: argv has no in-band delimiter → a path with spaces survives as one element (proven, T7); newline is only the producer's text format the agent maps to argv. Space-join-then-resplit would mis-split + the `[ -e ]` filter then silently drops it.
- **Context**: doc-commit.sh ([[BDR-036]]), T1a/b/c (refuse paths) + T7 (argv space-safe), all real-exec.
- **Future application**: any automated scoped-commit / destructive guard — test the REFUSAL path + refuse loud; pass lists as argv. Same family as [[LRN-046]] (deterministic oracle for a destructive guard).
- **Reference**: [[BDR-036]], [[LRN-051]] (changed-paths filter), [[LRN-046]].
## LRN-061 — Runtime net proposed for an unwired skill → check the wiring first
- **Date**: 2026-06-27
- **Pattern**: Tempted to build a runtime guard/hook/monitor that watches for a bad OUTCOME (memory written but uncommitted)? First ask if the outcome is a MISSING WIRING, not a behavioral lapse. A per-turn Stop-hook was proposed to catch "dirty memory" — but the cause was `/capitalize`+`/close` not calling the commit include (they predate it). Fix for an unwired skill = WIRE it (deterministic, zero-noise, at source); a monitor over a wiring hole pays RECURRING cost to detect a ONE-TIME omission, and a frequent ignored nag is itself a risk ([[LRN-047]]). **NOT "runtime nets are bad"** — the split is by DETERMINISM: a MISSING WIRING is deterministic → repair structurally; a genuinely NON-DETERMINISTIC aléa → a runtime net IS the right tool. Good counter-example: [[BDR-033]] anim-lib nudge — "will the user want motion?" is unknowable statically → a stateless 1-line suggestion is correct. Same determinism test as [[LRN-046]]/[[LRN-049]], applied to the build-or-not question.
- **Context**: deferred "v2 capitalize hook" ([[BDR-037]]). Read-phase killed it before code: git proved skills predate the include (oubli), memory already committed by hand 35×, orphans self-heal via `commit_memory`. The hook would've been disabled within an hour (frequent ignored nag).
- **Future application**: any "build a hook/watcher/lint to catch when X isn't done" — first grep whether X is even WIRED at its source. Deterministic/structural gap (missing include/call) → fix structurally; reserve runtime nets for non-deterministic lapses, never to complete a rollout. Classify by determinism BEFORE building.
- **Reference**: [[BDR-037]], [[BDR-034]] (rollout this completes), [[BDR-033]] (the GOOD net — contrast). Conditions [[LRN-047]], [[LRN-049]], [[LRN-054]].
## LRN-062 — deploy first-run detection = file-existence, never `git describe`
- **pattern**: detect "first deploy / no prior marker" by `[ -f .claude/deploy/STATE.json ]` (deterministic). NEVER `git describe --tags --match 'deploy/*'` — it errors `fatal: No names found, cannot describe anything`, exit 128, when no matching tag exists (verified git 2.53). Oracle = committed `STATE.json` holding `deployed_sha` (external ledger; never infer from context — [[LRN-054]]).
- **context**: deploy skill design. The describe-128 result is only the REASON NOT to use describe — never the detection path.
- **future application**: any "first run vs incremental" tool — detect by an explicit on-disk marker's existence, not by a git query that errors on the empty case.
## LRN-063 — delta-since-marker = `git diff --name-only <base> HEAD` (two endpoints), never rev-list/three-dot
- **pattern**: "files changed since marker X" = `git diff --name-only <X_sha> HEAD` — two explicit endpoints = literal tree diff. NEVER `git rev-list X..HEAD` (ancestry → phantom deltas after rebase: an orphaned marker yields the whole history). NEVER three-dot `X...HEAD` (merge-base → UNDERCOUNTS on divergence). Verified git 2.53 (linear: all forms agree; diverged: two-dot = both sides, three-dot = one side only).
- **context**: deploy delta mechanism. Footgun: `git diff A..B``git diff A B` (two endpoints), but `rev-list A..B` = ancestry — same `..` token, different meaning per command.
- **future application**: any delta-since-checkpoint over git — explicit two endpoints for the tree diff (artifact list); reserve `rev-list` for commit-counting only.
## LRN-064 — surgical-commit helper family partitions `.claude/`; a new subtree needs its own allowlist sibling
- **pattern**: the surgical-commit helpers each own a `.claude/` partition by OPPOSITE rules — `memory-commit.sh` ALLOWLISTS `.claude/memory`+`.claude/tasks`; `doc-commit.sh` EXCLUDES all `.claude/**` (loud rc 4, BDR-022 — [[LRN-060]], [[BDR-036]]). So committing a NEW `.claude/` subtree (e.g. `.claude/deploy/`) can reuse NEITHER: doc-commit refuses it, memory-commit ignores it. Verified live: real `doc-commit.sh` → rc 4 on `.claude/deploy/PROCEDURE.md`. Solution: mint a sibling (`deploy-commit.sh`) with a TARGET allowlist for the new subtree — guard order = traversal `*..*` reject FIRST, then `.claude/deploy/*` allow, else refuse. Inherit rc 3 unsafe-git, short-hash stdout, changed-paths filter.
- **future application**: adding a committable `.claude/X` subtree → new allowlist sibling, don't bend an existing helper; order the path guard traversal-first.
## LRN-065 — cross-session cold-resume skill = disk-bridge read-first (audit-delta convention)
- **pattern**: a skill that hands BACK control mid-flow (user acts out-of-band) and RESUMES — possibly in a NEW session, context gone — must carry ALL resume state on disk. A bridge file's PRESENCE = the wait-marker ("in flight, awaiting report"); STEP 0 reads it FIRST and resumes from its captured `{base, target, delta, step_reached}` WITHOUT recomputing (HEAD may have moved during the gap → "current HEAD" is wrong). Convention = audit-delta "the state file is the only memory between runs", extended from run-to-run to a MID-FLOW pause. `client-handover` only pauses in-context (synchronous), NOT cold — deploy is the first cold-resume form. A `runbook_rev` (FULL sha) does double duty: in-flow regenerate trigger + cold-resume staleness check; regenerate the instantiated artifact if ABSENT or stale. Pressure-test confirmed (fresh agent resumed from the bridge, excluded the moved-HEAD temptation).
- **future application**: any "do work → user acts out-of-band → resume later" skill — persist a disk bridge, read it first, never recompute on resume; mark the wait by the bridge's existence, not by conversation context.
## LRN-066 — surgical-commit must fail LOUD on git-ignored target paths (else silent no-op)
- **pattern**: `git status --porcelain -- <path>` HIDES git-ignored paths → a surgical-commit helper that filters changes via porcelain SILENTLY no-ops (rc 1) when the target project ignores the path (e.g. `.claude/` wholesale) → the artifact never persists, the skill silently forgets. Fix: guard with `git check-ignore -q <path>` BEFORE the changed-filter; any passed path ignored → LOUD refusal + dedicated rc (5), never a silent no-op. Fail-closed/loud over silent. (Same porcelain mechanism as the changed-filter — [[LRN-051]].)
- **context**: `deploy-commit.sh`; the FINAL whole-branch review caught it (per-task reviews could not — it is a skill↔target-repo seam). Applies to the whole memory/doc/deploy-commit family.
- **future application**: any helper relying on `git status --porcelain` to detect changes — add a `git check-ignore` guard; a path that must persist but is ignored has to fail loud, not no-op.
## LRN-067 — a pipeline that looks 2-level can finish at the SAME level; a human-mediated step masks the collision until automated
- **pattern**: an orchestrator delegating to a sub-skill can LOOK two-level (sub assembles parts, orchestrator integrates) yet the sub's TERMINAL node operates at the SAME level as the orchestrator's own finish → double-integration. `subagent-driven-development` assembles tasks on ONE branch (no per-task sub-branches — true) BUT its last flowchart node IS `finishing-a-development-branch` = feature→base merge, the SAME act as the orchestrator's FINISH. init-project (STEP 8 SDD + STEP 11 finish) AND ship-feature (STEP 4 SDD + STEP 9 finish) BOTH invoked finish TWICE. Latent, not visibly broken: SDD's terminal finish is INTERACTIVE (menu → human picks "keep as-is"), so the human SILENTLY de-duplicated. Collision SURFACES the moment the orchestrator's finish becomes DETERMINISTIC (gitflow finish) → real double-merge. Fix = scope the sub-skill by instruction to stop before its terminal step (NO fork — the finish is a flowchart node the controller follows, not a script; verified by reading SDD's scripts). Pressure-test: RED agent chained the finish ("literal next node in the flowchart"); GREEN with the scope instruction stopped + returned.
- **context**: gitflow chantier, wiring orchestrators onto `gitflow finish`. Mapping (premise #6) caught it by READING the real (SDD `SKILL.md` + `scripts/`) BEFORE coding — the seam-bug class `deploy` hit, caught earlier this time. Two human-gate backstops survive a missed instruction: SDD's interactive menu + the `gitflow finish` human gate ([[LRN-054]] — no oracle; deterministic layer carries the dangerous case).
- **future application**: before replacing an interactive/human-mediated step with a deterministic one, check whether a delegated sub-skill's TERMINAL step operates at the same level — the human gate may have been silently de-duplicating a double-action. Read the sub-skill's real flow (nodes + scripts), don't assume "distinct levels".
## LRN-068 — enforcement-bootstrap must be transactional: activate the guard LAST and gate it on the bootstrap commit succeeding
- **pattern**: a routine that BOTH installs an enforcement guard (pre-commit hook, branch protection, lock) AND makes a bootstrap commit must be transactional, else a partial run strands it. Two teeth: (a) precheck preconditions (git identity, clean tree) and fail LOUD before ANY mutation; (b) the guard-activation step must NOT run if the guarded bootstrap commit failed — order activation LAST and gate it on commit success. A `cmd_a || cmd_b` form SWALLOWS cmd_b's failure when a later stmt returns 0 → the failure never propagates; use explicit `if ! …; then … || return 1; fi`.
- **context**: `gitflow_init` ([[BLK-012]]). Existing-repo path swallowed the socle-commit failure (`git diff --cached --quiet || git commit`, then `git branch develop` returned 0 masking it) → init CONTINUED and ran `gitflow_activate_hook` though the socle was never committed → every re-run self-blocked (commit on main blocked by the hook just installed). Fresh-repo path already propagated → the asymmetry was the bug. Fix: fatal socle commit + identity precheck; verified on an identity-less repo → aborts rc1 with ZERO mutation, 57/57 tests green.
- **future application**: any init/bootstrap installing enforcement (hooks, protection, immutability) + committing — activate LAST, gate on the commit, precheck identity/clean-tree up front, make every link propagate (no `||` swallow). TEST the partial-failure path (identity-less / commit-blocked repo) → must abort with zero mutation and stay re-runnable.
## LRN-069 — token-authed remote writes under CC perms: inline-env (never `export`), token in the header, keep `git push` on ASK as the real gate
- **pattern**: a secrets-guard `Bash(export *)` in `permissions.deny` auto-denies ANY command whose FIRST token is `export …` — a false positive (`export GIT_CONFIG_VALUE_0="Authorization: token $TOK" …` reads as blocked when only the `export` prefix tripped it, not the git/curl op). Correct model for token-authed remote writes from tool calls: (a) INLINE env assignment `GIT_CONFIG_COUNT=1 GIT_CONFIG_KEY_0=http.extraHeader GIT_CONFIG_VALUE_0="Authorization: token $TOK" git push …` (no `export` keyword → passes; token rides the http header via git env-config, NEVER in argv nor written to the clone's `.git/config`); (b) keep `Bash(git push *)` on ASK (not deny) — that prompt IS the per-write human gate; don't suppress it, don't allow-list pushes in settings.
- **context**: gitflow migration on Gitea. 3 consecutive tool-call denials traced to `Bash(export *)` (false positive); an earlier INLINE-env `ls-remote` passed; the user's own `!` shell ran the same `git push` fine (not under CC perms). Confirmed `git push` is ASK by design = the right gate locus, NOT `export *`.
- **future application**: scripting token-authed git/curl writes under CC perms → inline env (never `export`), token in `Authorization` header (curl `-H`, git `GIT_CONFIG_*` extraHeader), keep `git push` on ASK as the approval. Tool-call denied unexpectedly → read `permissions.deny` for an over-broad prefix rule (`export *`, `env`, `printenv`) catching a false positive BEFORE concluding the op itself is blocked.
## LRN-070 — clean-tree-gated migration + a dirty submodule: diagnose pointer-vs-content, ignore=dirty not blind reset
- **pattern**: an op gated on a clean tree (`git status --porcelain`) is blocked by a submodule showing ` M`. FIRST distinguish: (a) **pointer move** — gitlink (HEAD) ≠ submodule HEAD → resettable via `git submodule update`/`checkout`; (b) **dirty content** — gitlink UNCHANGED, files modified INSIDE the submodule → a local edit. For an intentional local edit, `checkout --`/`submodule update` correctly REFUSE to discard it, and a blind "reset" would DESTROY it. Exclude it non-destructively: `git config submodule.<name>.ignore dirty` (local `.git/config`) → status stops reporting the submodule's dirty content, gate passes, edit preserved. Commit it to `.gitmodules` to share the ignore across clones.
- **context**: claude gitflow self-migration. `skills-external/gstack` showed ` M`; gitlink `070722a` == submodule HEAD `070722a` (NOT a pointer move), 2 tracked-modified files (`bun.lock`+`package.json`) = the [[BLK-008]] Playwright 1.61 bump (Ubuntu 26.04 browser). The planned "reset" (D2) would have discarded the browser fix; `submodule.skills-external/gstack.ignore=dirty` cleared the tree for `migrate_local`, bump intact.
- **future application**: any clean-tree-gated op (migrate/release/bisect) on a superproject with a submodule carrying intentional local edits → diagnose pointer-vs-content FIRST (compare gitlink to submodule HEAD); for content, `submodule.<name>.ignore=dirty`, never a blind reset. Cross-ref [[BLK-008]] (gstack -dirty by design).
## LRN-071 — fail-loud must cover the helper's OWN commit, not just its inputs — 3rd occurrence of the swallowed-commit pattern
- **pattern**: a surgical-commit helper guarded LOUD on its INPUTS (scope) but SILENT on its OWN `git commit`. `doc-commit.sh`: `set -uo pipefail` (no `-e`) + unguarded `git commit` → on rejection (pre-commit hook on a protected branch / signing / etc.) execution CONTINUES: `printf "committed"` lies, `git rev-parse --short HEAD` emits the PREVIOUS HEAD hash, function exits 0. Orchestrator reads rc 0 + non-empty hash → believes success; docs silently uncommitted, tree dirty (RISK-2).
- **RECURRENT (3×) — audit systematically, not an isolated bug**: same fail-silent-where-it-must-fail-loud class in the surgical-commit family — [[LRN-066]] (`deploy-commit.sh`: porcelain hides a git-ignored path → silent no-op; fix = loud rc 5) + [[LRN-068]]/[[BLK-012]] (`gitflow_init`: socle-commit failure swallowed by `||` then `git branch` returned 0 → init continued past the dead commit) + this. The common mechanism: a fallible op (esp. a commit) whose failure isn't propagated, MASKED by a later returning-0 statement. The motif RETURNS; treat it as a known smell.
- **fix**: guard the commit — `if ! git commit …; then LOUD + return 5; fi`. rc 5 = "tried, git refused" (distinct from rc 3 = "could not start"). Empty stdout (no stale hash), loud stderr. Proven by T8: RED showed the masking (rc 0 + stale hash + false "committed"), GREEN rc 5 + empty + REJECTED, 32/32.
- **future application**: any helper whose RETURN VALUE gates a downstream "success" — audit that EVERY fallible internal op propagates its failure, ESPECIALLY the load-bearing commit. `set -uo pipefail` without `-e` does NOT abort mid-function; an unchecked failing command followed by a returning-0 line exits 0 and lies. Check `cmd || other` forms, no-`-e` blocks, every "report success after the op" line. Test the partial-failure path (commit-blocked repo) → must fail loud, empty, non-zero.
## LRN-072 — a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit
- **pattern**: 3rd member of the post-FINISH-artifact class (memory, docs, GSD ROADMAP) — but UNLIKE the first two (real artifacts ALWAYS produced → couple a commit), the GSD artifact came from a SPECULATIVE, opt-in, rarely-used producer (init-project auto-bootstrapping a multi-session engine at project creation). The reflex fix (reorder + build `gsd-commit.sh` + tests) would have added machinery to faithfully commit an artifact nobody uses. The right fix was a NEGATIVE diff: delete the producer → orphan never created → bug dissolves, zero new code (BLK-011).
- **the refutation that got there**: the framing "ROADMAP redundant with TODO" was WRONG (gsd ≫ roadmap = state machine/crash-recovery/cost/parallel/worktree; TODO ≠ gsd ROADMAP = different altitude + consumer). Reading REFUTED both premises, yet the CONCLUSION (remove the step) held for a STRONGER reason: speculatively scaffolding a heavy engine the sole user doesn't use, at creation, is bad per se. Right answer, reason corrected before engraving — change the QUESTION before changing the code.
- **future application**: a stranded / duplicated / uncommitted-artifact bug → BEFORE building machinery to handle the artifact, ask whether the step that PRODUCES it is actually used / wanted / non-speculative. Speculative or unused (esp. a personal/single-user repo) → DELETE the producer; the cleanest fix is the absent one. Distinguish speculative-at-creation (REMOVE) from deliberate-on-demand (KEEP). Family: [[BLK-010]], [[BLK-011]], [[BDR-036]].

308
lib/tests/fixtures/todo-snapshot.md vendored Normal file
View File

@ -0,0 +1,308 @@
# TODO
## 2026-06-23 — install self-sufficient + gstack on-demand par profil
Goal: `make install`/`make plugin`/`make update` installent TOUT sans étape
manuelle. Plus le profil-driven gstack on-demand (option 1 user : gstack OFF
par défaut, mais `set <profil>` qui a besoin de gstack l'active pour ce profil).
Root causes trouvées (logs install-20260623-181416.log) :
- Bug A : install.sh lance link.sh (étape 5) AVANT install-plugins.sh (étape 6),
qui n'a jamais re-lancé link.sh → symlinks npx/externes jamais créés au 1er run
(LRN-022 documentait déjà le trou). update-all.sh re-link déjà (L364).
- Bug B : `npx skills add` + gstack ./setup résolvent leur cible relativement au
CWD (repo) → darwin-skill atterrit dans $REPO/.agents/skills + $REPO/.claude/skills
au lieu de $HOME/.agents/skills. Auto-entretenu une fois $REPO/.agents créé.
- Bug C : profile.sh "missing — try: bash link.sh" trompeur (link.sh ne crée pas
les skills gstack) ; full.profile liste 35 skills gstack jamais posés dans skills/.
- [x] Edit 1 — install-plugins.sh Step 8.5 : `npx skills add` depuis $HOME (subshell cd)
- [x] Edit 2 — install-plugins.sh : cleanup parasites $REPO/.agents/skills + $REPO/.claude/skills (gitignorés)
- [x] Edit 3 — install-plugins.sh : Step 10 final re-lance `bash "$REPO/link.sh"` (idempotent)
- [x] Edit 4 — update-all.sh Step 7.5 : `npx skills add` depuis $HOME (même Bug B)
- [x] Edit 5 — lib/profile.sh : GSTACK_SRC var + enable_skill gstack branche on-demand
(symlink skills/<name> → skills-external/gstack/<name>) + message honnête
- [x] Verif — shellcheck/bash -n propres ; migré darwin → $HOME/.agents/skills + `bash link.sh`
(skills/darwin-skill OK) ; `profile.sh set full` → 0 "missing", 35 gstack on-demand ;
cycle minimal↔full OK ; git propre (symlinks gstack gitignorés) ; profil full restauré
- [~] Cleanup machine courante : $REPO/.claude/skills/darwin-skill + .agents/skills VIDE
restent (rm bloqué par garde permission .claude/) → auto-nettoyés au prochain `make plugin`
- [x] Capitalize — LRN-042 (Bug B CWD-relatif) + BDR-030 (gstack on-demand par profil) + journal 2026-06-23
- [ ] Commit (via /commit-change)
## profile.sh — verbe `gstack on|off`
- [x] Extraire helper `enable_all_gstack()` (boucle de cmd_reset) — anti-duplication
- [x] Extraire helper `disable_gstack_not_in(prof)` (boucle gstack de cmd_set) — anti-duplication
- [x] Extraire helper `parked_gstack_count()` (réutilise pattern cmd_current)
- [x] Refactor cmd_reset + cmd_set pour utiliser les helpers (comportement préservé)
- [x] `cmd_gstack()` : `on` = enable tout gstack (garde label active-profile), `off` = disable gstack hors profil actif
- [x] Wire main() dispatch `gstack)` + usage() + bloc header
- [x] Doc : SKILL.md argument-hint + exemples + output-policy (Makefile générique suffit)
- [x] shellcheck propre + tests (help/bad-action/none-error/on/off cycle) — état live restauré exact
- [x] Investigué "fix" full.profile : PAS un bug — curation par design (BDR-017 caveat). Aucun fix code.
- [x] FOLLOW-UP (BLK-007 résolu) : linké `spec` (symlink chirurgical) + ajouté à full/web-full ; iOS NON linké (Linux, besoin Mac+Tailscale) ; `.gitignore` allowlist gstack complété (12 ajouts + checkpoint stale retiré) → `gstack on` git-clean ; LRN-025 capitalisé
- [x] Capitalize : BDR-018, LRN-024, BLK-007, EVAL-002, journal 2026-06-02 + backfill index (BDR-017, BLK-005/006)
## README.md overhaul
- [x] Plan
- [x] Corriger section install ctx7 (retirer MCP, clarifier CLI + API key)
- [x] Marquer ruflo comme désactivé
- [x] Supprimer section Troubleshooting/bugs courants
- [x] Simplifier stacks tierces (gstack, ruflo, ctx7, GSD) — juste description + lien
- [x] Ajouter section skills personnels (skills-perso)
- [x] Ajouter section système d'autogestion (plugin-advisor, tokens, synergies)
- [x] Nettoyer section Updating (retirer instructions manuelles par outil)
- [x] Nettoyer section Maintenance (retirer doublon updating)
- [x] Mettre à jour table Plugins reference (ctx7 row, ruflo OFF)
- [x] Corriger lien USAGE.md dans l'intro (retirer mention cas/erreurs)
## USAGE.md cleanup
- [x] Supprimer tous les "Cas de figure — corrections vX.X.X validées"
- [x] Supprimer table "Erreurs fréquentes"
- [x] Corriger `/readme``/doc` dans bonnes pratiques
- [x] Supprimer séparateurs orphelins
## Skill /doc
- [x] Mettre à jour doc-syncer.md pour gérer ajouts/suppressions de features
- [x] Mettre à jour SKILL.md description pour mentionner feature delta
## Auto-activation ui-ux-pro-max sur détection design
- [x] Créer `lib/design-gate.md` — snippet réutilisable (detect design signals + ask to activate ui-ux-pro-max)
- [x] Intégrer dans feater.md — STEP 0.5 entre scope check et mini-plan
- [x] Intégrer dans hotfixer.md — STEP 1.5 (si CSS/style/animation)
- [x] Intégrer dans bugfixer.md — STEP 1.5 (si bug UI/style)
- [x] Mettre à jour plugin-advisor.md — PHASE 4 : cohérence avec le design gate
- [x] Mettre à jour CLAUDE.md skill routing — documenter le comportement auto
## Refonte agents/seo-analyzer.md
- [x] Lire agent actuel + plugin-advisor + interviewer + feater + hotfixer + analyzer
- [x] Réécrire l'agent complet v1 (11 étapes)
- [x] Ajouter orchestration sub-agents (hotfixer/feater) + triage par batches
- [x] Déplacer plugin-advisor après détection stack (STEP 3 au lieu de STEP 0)
- [x] Ajouter 2 niveaux d'audit (LOCAL code-only / FULL live+externe)
- [x] Adapter scoring, legal, GEO aux deux niveaux
- [x] Renumeroter proprement (0-14) + corriger toutes les refs internes
- [ ] Commit
## /onboard — cso archetype-aware
Problème : prompt cso fallback est non-adaptatif — cherche XSS/SQLi/CORS même sur firmware.
Objectif : charger `## Typical pain points` + `Surface sécurité` de l'archétype et les injecter dans le prompt cso.
- [x] STEP 4.5 → ajouter extraction de archetype-context.md (pain points + Surface sécurité + category) — validé sur firmware-embedded / nextjs-app-router / library
- [x] STEP 6 dispatch cso fallback → re-écrire prompt : universal checks + sections conditionnelles par category (web / embedded / library / cli / infra / data / desktop)
- [x] STEP 6 dispatch cso gstack ON → passer `--archetype <name> --context-file .onboard-audit/archetype-context.md` dans args
- [ ] OUT-OF-SCOPE ce fix : étendre le pattern à analyze/code-clean/doc (déjà reçoivent `ARCHETYPE: <name>`, juste pas le context-file). À faire dans un 2e passage si besoin.
## /validate — nouveau skill W3C + WCAG (option A)
Scope : W3C HTML validity (validator.nu API) + W3C CSS validity (jigsaw API) + WCAG a11y (axe-core CLI / pa11y / WAVE API / fallback statique). Même pattern que /harden (audit par défaut, --fix avec confirmation A/B/C/D). Rapport = VALIDATE.md racine. Complémentaire à /onboard (qui audite a11y au setup initial — /validate est l'outil on-demand réutilisable).
Design décisions :
- **Agent dédié** : `agents/validator-analyzer.md` (nouveau). Pas de réutilisation de seo-analyzer — scope différent (validité syntaxique vs indexabilité).
- **Depth** : LOCAL (fichiers HTML/CSS statiques, tools npm locaux si dispo) | FULL (URL live + APIs distantes W3C/WAVE).
- **External validators** : validator.nu/?out=json (HTML), jigsaw.w3.org/css-validator (CSS), WAVE API optionnelle (quota gratuit ~100/mois), axe-cli local, pa11y-cli local.
- **Tools fallback order** : npm tools locaux → APIs distantes → agent général statique (cas onboard). Aucun install forcé.
- **--fix conservateur** : `alt=""` sur images décoratives évidentes, `lang` sur `<html>`, fermetures de tags manquantes, sauts de niveau heading renumérotés. PAS : labels forms, contraste couleurs, landmarks (demandent décision humaine).
- **Out of scope** : meta tags/SEO → /seo ; JSON-LD → /geo ; security headers → /harden ; code linting générique (ESLint/Prettier) → hors scope web standards.
Subtasks :
- [x] Créer `agents/validator-analyzer.md` — spec 6 étapes (478 lignes)
- [x] Créer `skills/validate/SKILL.md` — dispatcher (378 lignes)
- [x] Ajouter routage `/validate` dans `~/.claude/CLAUDE.md` section "Skill routing"
- [x] Mettre à jour `skills/harden/SKILL.md` — W3C/a11y redirigé vers /validate
- [x] Mettre à jour `skills/seo/SKILL.md` — cross-ref /validate pour W3C/WCAG
- [x] Grep cohérence : refs /validate correctes, skill détecté par la harness
## Animation lib (`motion`) — install + détection
Problème : `motion` (ex-`framer-motion`, rebrandé nov 2024) n'est ni installé par les scripts ni détecté par plugin-advisor / design-gate. Ajouter détection + install conditionnel.
Décisions :
- **Package** : `motion` (npm `motion`, import `motion/react`). `motion-v` pour Vue 3 (package séparé). Svelte/vanilla → `motion`.
- **Éligibilité** : tout projet qui peut consommer l'API. ✅ React/Next/Remix/Astro+React, Vue3/Nuxt, Svelte. ❌ Backend, CLI, embedded, Flutter, WordPress/Drupal/Strapi, RN (réservé `react-native-reanimated`).
- **init-project** STEP 5 : auto-install si éligible + absent (l'utilisateur a déjà validé scaffold).
- **onboard** STEP 2.5 : propose + attendre OK (projet existant, opt-in).
- **plugin-advisor** : read-only — détecte + reporte ("✅ motion installed" ou " eligible but absent — run /onboard").
- **design-gate** : ajouter motion/motion-v/framer-motion (legacy) dans filesystem signals.
Subtasks :
- [x] Créer `lib/animation-lib-check.sh` — fonctions `detect_anim_eligibility()` + `is_anim_lib_installed()` + `recommend_anim_install_cmd()`
- [x] Patcher `agents/scaffolder.md` PHASE 4 — note (le scaffolder n'installe PAS, l'orchestrateur init-project STEP 5e gère)
- [x] Patcher `skills/init-project/SKILL.md` — STEP 5e ANIMATION LIB (auto-install si éligible)
- [x] Patcher `skills/onboard/SKILL.md` — STEP 2.5 ANIMATION LIB (propose + attendre yes/skip)
- [x] Patcher `agents/plugin-advisor.md` PHASE 1 (sourcing du helper) + PHASE 2 (signaux `anim-lib-eligible`/`anim-lib-installed`) + PHASE 3 (section ANIMATION LIB read-only)
- [x] Patcher `lib/design-gate.md` — ajouter motion/motion-v/framer-motion + autres anim-libs dans filesystem signals
- [x] Tester : shellcheck OK ; matrix React/Vue/RN/backend/with-motion/no-package/pnpm tous corrects
## Helper `--help` / `help` sur tous les skills (option C)
Problème : aucun skill ne gère `--help` aujourd'hui. `argument-hint` affiche juste la syntaxe en autocomplétion, pas de description/exemples. L'utilisateur doit lire le SKILL.md ou deviner.
Objectif : `/<skill> --help` (ou `/<skill> help`) affiche un bloc standardisé (description, args, exemples, cross-refs) et exit SANS dispatcher l'agent ni modifier quoi que ce soit.
Design :
- **Lib partagée** : créer `skills/lib/help-handler.md` — snippet réutilisable "if $ARGUMENTS contains --help|help|-h, extract frontmatter fields (description, argument-hint, cross-refs) + afficher bloc d'aide standardisé + STOP".
- **Format d'aide** standardisé :
```
/<skill><titre court>
DESCRIPTION
<extrait de la frontmatter description, dépouillé des Triggers>
USAGE
/<skill> <argument-hint>
ARGUMENTS
<liste détaillée de chaque flag avec son effet nouvelle section
dans les SKILL.md, ou parsée depuis STEP 0 arg parsing>
EXAMPLES
<3-4 exemples concrets>
SEE ALSO
<extrait des "For X use /Y" de la frontmatter>
```
- **Intégration** : ajouter STEP 0.5 ("Handle --help") dans chaque SKILL.md juste après STEP 0 parsing args. Ordre : parse args → check --help → si oui afficher + exit → sinon continuer.
- **Skills à patcher** : `~/Documents/claude/skills/` = ~20 skills persos + skills-perso list pour référence. Ne PAS toucher skills-external/gstack (ownership externe) ni example-skills.
Subtasks :
- [ ] Créer `skills/lib/help-handler.md` — snippet réutilisable (détection + extraction + affichage)
- [ ] Définir format d'aide standard + section "ARGUMENTS" vs reuse de argument-hint
- [ ] Décider : sections ARGUMENTS/EXAMPLES doivent-elles être dans la frontmatter (nouveau champ YAML) ou dans le corps du SKILL.md (nouvelle section `## Help`) ?
- [ ] Patcher un skill pilote (`/validate`) — valider UX _(désormais `/web-validate` — renommé e5e673a)_
- [ ] Patcher les skills perso restants : analyze, bugfix, code-clean, commit-change, doc, feat, geo, graphify, harden, hotfix, init-project, make-pdf, onboard, plan-tune, plugin-check, refactor, seo, ship-feature, skills-perso, status, benchmark-models, context-save, context-restore
- [ ] Mettre à jour `~/.claude/CLAUDE.md` — mentionner convention --help disponible sur tous les skills perso
- [ ] Note : skills-external/gstack ont leur propre convention, ne pas toucher
## Skill profiles (partition gstack par usage)
- [x] Plan
- [x] `lib/profile.sh` — list/show/current/apply/set/reset/diff via symlink toggle
- [x] `lib/profiles/{design,dev,qa,audit,minimal}.profile` — 5 profils
- [x] `skills/profile/SKILL.md` — slash command `/profile`
- [x] Wire `agents/plugin-advisor.md` — DETECT call profile.sh current + OUTPUT line PROFILE + nouvelle section "Skill profiles" dans TOGGLING EXTERNAL TOOLS
- [x] Wire `lib/toggle-external.sh` — header pointer vers profile.sh
- [x] `Makefile` — targets profile/profile-list/profile-current/profile-reset
- [x] Tests : list/show/current/diff/set/reset/apply tous OK, shellcheck propre, symlinks bien restaurés après reset
## Profile system v2 — extension plugins/MCPs/CLIs
- [x] Inventaire complet : 7 plugins (4 ON / 3 OFF), 0 MCP local, 4 CLIs installés
- [x] Définir `MANAGED_PLUGINS` (ui-ux-pro-max, plugin-dev, pr-review-toolkit) + `PROTECTED_PLUGINS` (caveman, security-guidance, superpowers)
- [x] `profile.sh` étendu : nouveau type `plugin@<marketplace>` (auto-toggle via `claude plugin enable/disable`), `mcp` (delegate à toggle-external.sh pour magic), `cli` (advisory only)
- [x] `cmd_set` désactive aussi les MANAGED_PLUGINS hors profil
- [x] `cmd_reset` ne touche PAS aux plugins (info line explicite — re-enable manuel ou via apply)
- [x] `cmd_current` : compte `enabled` + `installed`, tiebreaker = total le plus grand
- [x] `cmd_show` : colonne TYPE élargie à 30 chars pour `plugin@ui-ux-pro-max-skill`
- [x] 4 nouveaux profils : `web`, `seo`, `web-full`, `backend`
- [x] Profils existants raffinés (design, dev, qa, audit) avec `plugin@<marketplace>` + `cli`
- [x] `skills/profile/SKILL.md` : table profils mise à jour + table mécanisme par type
- [x] `agents/plugin-advisor.md` : table de recommandations étendue avec web/seo/web-full/backend
- [x] Tests : `set web` enable ui-ux-pro-max+magic, `set seo` disable ui-ux-pro-max, `set minimal` épargne always-on, `reset` restaure 64 skills
- [x] Memoire : BDR-008 (v2 décision) + journal entry 2026-05-04
- [x] Shellcheck propre
## /audit-delta — skill audit incrémental multi-axes (2026-06-11)
But : 1 skill, 4 axes cochables (conformité CLAUDE.md, erreurs/améliorations,
code mort, sécurité), scope = diff depuis dernier run (marqueur SHA persistant,
par axe), boucle par axe : audit → gate approbation → fix → re-vérification
obligatoire avant axe suivant. Construit via superpowers:writing-skills (TDD).
- [x] RED : baseline subagent sans skill (worktree isolé) — 7 gaps documentés
(boundary par date de fichier, checkpoint en prose, pas de marqueur par
axe, zéro gate, lint=verify, passe unique mélangée, registres auto-écrits)
- [x] GREEN : skills/audit-delta/SKILL.md — pass sous pression (state file
utilisé, gate tenu malgré "fix tout + meeting", marqueurs par axe OK)
- [x] REFACTOR : trou trouvé (premier run + user injoignable, aucune règle) →
patch : défaut full codebase report-only, jamais "from HEAD" ; re-test pass
- [x] Vérif finale : skill découvrable (~/.claude/skills/audit-delta via symlink
skills/), frontmatter valide, worktrees de test nettoyés
- [x] Capitalize : BDR-020 + LRN-027 + journal 2026-06-11
- [ ] Commit (via /commit-change quand prêt)
## 2026-06-11 — darwin eval: 4 confirmed bugs fix (branch auto-optimize/*-bugfixes)
- [x] geo-analyzer.md: unreachable user → ALL file fixes report-only (STEP 12/13 triage gate)
- [x] init-project SKILL.md: repoint readme-updater.md (absent) → doc-syncer.md x2
- [x] analyzer.md: resolve "Update project memory" vs "Do not modify files" contradiction
- [x] onboard SKILL.md: allowed-tools += Agent, Skill (workflow STEPs 5-7 need them)
- [x] re-test geo fixture (unreachable) → expect zero source edits; 2 blind judges on geo-analyzer diff
- [x] commit per fix, results.tsv rows, merge if green
## 2026-06-19 — cleanup/caveman-always-on (full plugin purge)
Goal: disable caveman plugin + delete every repo dep on it. Plugin.json
self-declares always-on hooks → "enabled w/o always-on" impossible → full
purge. Keep memory-registry terse-format rule (separate subsystem); only
replace dead `/caveman:compress` cmd refs w/ "Legacy entries
(pre-format-rule): compress manually or via claude.ai on demand."
Version 3.4.0 → 3.5.0.
- [x] RUNTIME (user, no TTY): plugin disable + uninstall caveman@caveman; mcp list check
- [x] PHASE 2: settings.json (2 hook blocks + enabledPlugins + marketplace); hooks/ files delete; .gitignore block; session-start.sh L134
- [x] PHASE 3: install-plugins.sh STEP 5.5; update-all.sh block; plugins.lock.json; doctor.sh; lib/detect-plugins.sh; lib/profile.sh; plugin-advisor.md; skills/profile/SKILL.md
- [x] PHASE 4: README row; USAGE always-on line; CHANGELOG; CLAUDE.md cmd ref; skills/capitalize+prune-memory cmd refs; version.txt
- [x] PHASE 5: shellcheck clean (SC1091 info only); full diff reviewed → committed + merged to master
## 2026-06-26 — coupled-capitalize invariant v1 (Frame 2)
Plan: [.claude/tasks/2026-06-26-coupled-capitalize-invariant.md](2026-06-26-coupled-capitalize-invariant.md)
Goal: every dev flow commits its memory automatically (1 commit/flow) via shared
include; ship-feature reordered (capitalize before FINISH = PR-bug fix). Hook v2,
doc-sync twin chantier deferred. Safety in the pathspec, never `git add -A`.
- [x] Task 1 — `lib/memory-commit.sh` + tests T1/T2/T2-bis/T3/T4/T5/T6/T7 (real exec, outputs reported) — 58cb91d + bbef41c
- [x] Task 2 — `lib/capitalize-commit.md` include — b44791b
- [x] Task 3 — wire feater/hotfixer/bugfixer/commit-changer — 2763678
- [x] Task 4 — ship-feature reorder (capitalize before FINISH) — e8eff7e
- [x] Task 5 — init-project founding-decisions capitalize (F5) — df60df6
- [x] Task 6 — behavioral verify + shellcheck + CHANGELOG + BDR/LRN — this commit
- [ ] v2 (deferred) — Stop hook (non-blocking, BDR-033 style) reusing the detector
- [~] twin chantier — doc-sync → own plan (2026-06-27). NOTE: "reorder before FINISH" REFUTED — doc-syncer commits nothing, needs reorder + NEW doc-commit mechanism.
## 2026-06-27 — doc-sync coupled (twin of coupled-capitalize)
Plan: [.claude/tasks/2026-06-27-doc-sync-coupled.md](2026-06-27-doc-sync-coupled.md)
Goal: orchestrators commit the docs doc-sync patched, on the branch, BEFORE FINISH.
Same PR-bug class as memory, NOT same fix: doc-syncer commits nothing (proven) →
reorder + CREATE doc-commit.sh/.md (mirror memory-commit, 4 deltas). Surface-don't-block.
- [x] Task 1 — `lib/doc-commit.sh` + `lib/tests/run-doc-commit.sh` — 24/24 real-exec pass, shellcheck clean. T1a/b/c (guard catches .claude/+CLAUDE.md, mixed→refuse-all-loud) + T2 dynamic pathspec + T3/T4/T5/T6. Exit taxonomy 0/2/3/4 (4=scope violation).
- [x] Task 2 — `lib/doc-commit.md` include — 4a54a65. 4-exit report table (rc 4 = loud upstream anomaly), visible surface w/ agent-composed summary (attribution locked 3×), 2 conscious acks.
- [x] Task 3 — `agents/doc-syncer.md` `PATCHED_FILES:` OUTPUT — fb1f359. Newline (one path/line), both STEP 9 + AUTO A4; NONE silent. Separator contract aligned producer↔consumer, argv space-safe, T7 proves it (28/28). Additive, callers unaffected.
- [x] Task 4 — ship-feature reorder — 636b491. DOC SYNC 9→8 (+doc-commit), FINISH 8→9, HTML comment deleted. Ref-coherence: 159/189 STEP 8→9 FINISH + README:152-153 illustration completed (stale since e8eff7e). Historical records left (append-only).
- [x] Task 5 — init-project reorder — e81f629. SYNC README 12→10c (+doc-commit), GSD 13→12, /13→/12. Order 10b→10c→11→12. Ref-coherence: USAGE ×5 (table, illustration, 3 GSD refs) each verified post-swap. Latent-bug check: none (10b was non-shifting). BLK-011 record left (append-only), TODO locator→12.
- [x] Task 6 — ref-sweep — clean (no old headers; live refs fixed in Task 4/5; historicals left; USAGE:256 non-ordering). Caught inline-flow gap → Task 6b.
- [x] Task 6b — wire doc-commit into feat/bugfix/hotfix DOC SYNC — 1b01b95. commit-change exempt (no DOC SYNC); hotfix wired (include no-ops on empty).
- [x] Task 7 — close: `run-doc-behavioral.md` + shellcheck clean + 28/28 + CHANGELOG + BDR-036 / LRN-058-060 / EVAL-008. surface-replaces-gate + partial-init + scope-expansion engraved honestly.
- [x] RESOLVED 2026-06-29 — [[BLK-010]] closed by `gitflow_init` root commit (init-project STEP 5f): scaffold/README get a deterministic commit owner + HEAD born before the worktree step. Verified (mechanism + STEP 5f wiring + T2 test); blockers.md index+body updated.
- [x] RESOLVED 2026-06-29 — [[BLK-011]] closed by REMOVAL: init-project STEP 12 (speculative gsd auto-bootstrap) deleted → orphan never created. Negative diff, not commit-plumbing ([[LRN-072]]). See chantier below.
- [x] DONE 2026-06-29 — doc-sync MINOR gate strengthened: ① shape-oracle [[BDR-040]] + ② masked-commit fix [[LRN-071]] (③ branch-guard deferred). See chantier below.
## 2026-06-29 — gitflow universal model + 6-repo migration (DONE)
Goal: universal gitflow across all `bchanot/*` Gitea repos. Lib built across prior sessions; migrated + hardened + dogfooded this session.
- [x] Lib hardened at ROOT — `gitflow_init` socle-commit made FATAL + identity precheck + `migrate_local` identity guard (BLK-012 → LRN-068); 57/57 green, abort-zero-mutation proven on identity-less repo
- [x] `lib/gitflow-migrate.sh` — probe (rights, not just identity) / local / remote, reversible→irreversible ordering, delete-master LAST
- [x] Migrated 6 repos (faunosteo, config, bchanot-cv, zenquality, game, claude): master→main, develop, Option-1 protection, master deleted — each delete behind eyeball+GO, ZERO loss, no force/`--no-verify`, settings intact
- [x] claude SELF-APPLIED — own committed lib migrated it; chantier landed C1 feat `167ea96` + C2 memory `1254643` + socle `620071b`; hook now governs claude
- [x] gstack submodule dirty (BLK-008 Playwright bump) excluded via `submodule.ignore=dirty` (LRN-070), NOT reset
- [x] Deleted merged branches: `feat/deploy-skill` (local+remote) + `cleanup/caveman-always-on` (remote)
- [x] Dogfood PROVEN: hook whitelists `.claude/**` on main + Option-1 lets owner push (commit `1620e5b`)
- [x] Capitalize: BDR-039 (Option-1 protection), LRN-068/069/070, BLK-010 closed + BLK-012, journal 2026-06-29 — committed + pushed on main
- [ ] follow-up (optional) — `submodule.gstack.ignore=dirty` into committed `.gitmodules` (share across clones); zenquality `cleanup/post-smtp-fix` rename `<type>/<name>` or finish+delete
## 2026-06-29 — MINOR-gate strengthening (doc-syncer) [branch feature/minor-gate-strengthening]
Read-first cartography refuted the literal premise: "strengthen MINOR gate" = 3 problems;
the literal one (blocking gate on MINOR) contradicts engraved [[BDR-036]]. Scope: ①+②, not B,
③ deferred. Built test-first (Iron Law).
- [x] ② fix masked commit failure — `doc-commit.sh` exit 5 fail-loud ([[LRN-071]], 3rd occurrence of the swallowed-commit pattern). RED T8 proved masking, GREEN 32/32 + taxonomy (sh header/funcdoc + `doc-commit.md` rc-5 row)
- [x] ① MINOR-shape oracle — `lib/doc-shape.sh` ([[BDR-040]]) + `run-doc-shape.sh` 19/19 (boundary + env-override). Wired doc-syncer STEP A4 (escalate whole set → existing SIGNIFICANT gate; no=revert all, select=keep subset) + `doc-commit.md` ACKNOWLEDGMENTS coherence + behavioral Scenario C/D
- [x] shellcheck clean (doc-commit.sh, doc-shape.sh, both test harnesses); coherence ref-sweep clean
- [x] Capitalize — BDR-040 + LRN-071 + CHANGELOG (Added/Fixed) + journal 2026-06-29 (cont.)
- [x] FINISH — merged feature/minor-gate-strengthening → develop (`0f0bd7f`) on explicit signal
- [~] ③ branch-guard in doc-commit DEFERRED — duplicates protected-base predicate 3rd time (lib + hook + here); all migrated repos have the hook. Reconsider only for repos outside `gitflow init`
## 2026-06-29 — BLK-011 GSD ROADMAP post-FINISH [branch bugfix/blk-011-gsd-roadmap]
User reframed: don't plumb a commit for the stranded ROADMAP — ask if gsd belongs at init at all.
Read refuted both option-premises (gsd ≫ roadmap; TODO ≠ gsd ROADMAP) but conclusion A held for a
stronger reason: speculative auto-bootstrap of an unused engine at creation is bad per se ([[LRN-072]]).
- [x] Resolve by REMOVAL — deleted init-project STEP 12 (negative diff 26/+13), not a commit helper
- [x] Ref-coherence sweep ("test" for a removal) — header 12→11-step, 10c note, 4 USAGE refs; zero dangling STEP-12 refs repo-wide
- [x] Scope guardrails — deliberate gsd use KEPT (onboarder PHASE 6, plugin-advisor, status-reporter)
- [x] Capitalize — [[BLK-011]] resolved (true reason + premise trace) + [[LRN-072]] + CHANGELOG Removed + journal 2026-06-29 (cont. 2)
- [x] FINISH — merged bugfix/blk-011-gsd-roadmap → develop (`ce4391a`); develop pushed to origin (6 commits, SSH)
## 2026-06-29 — prune-memory hardening (RED-7/8 + index backfill) [branch bugfix/prune-memory-hardening]
LAST of 3 chantiers. Read-first cartography confirmed RED-7/8 + measured 34-row index drift.
- [x] RED-7 (example-priming) — fictionalized STEP-2 example to 9xx ids (live ids primed a wrong merge of complementary LRN-014/016); DETERMINISTIC test (run-deterministic.sh) per [[LRN-046]]. Caught its own ugrep false-green → /usr/bin/grep ([[LRN-074]]). [[LRN-073]]
- [x] RED-8 (added-negation inversion) — consciously ACCEPTED as documented limit in BACKLOG ([[LRN-047]]); no fragile guard built
- [x] Index backfill — 34 missing rows (decisions 11, learnings 21, blockers 2) composed + ID-sorted insert; drift 34→0, STEP-4 verify OK; moved pre-existing out-of-order LRN-021
- [x] Capitalize — [[LRN-073]] + [[LRN-074]] + [[EVAL-010]] + journal 2026-06-29 (cont. 3)
- [ ] FINISH — merge bugfix/prune-memory-hardening → develop (awaiting explicit human signal)
- [ ] PUSH — develop → origin after the 3 chantiers land (awaiting explicit human signal)

72
lib/tests/run-reconcile.sh Executable file
View File

@ -0,0 +1,72 @@
#!/usr/bin/env bash
# run-reconcile.sh — TDD harness for lib/reconcile.sh.
#
# Iron Law: these tests were RED before the engine existed (scratchpad RED-B). They prove
# the engine VERIFIES (git/fs/body) rather than BELIEVES (checkbox/Index/name). Fixtures
# carry NEUTRAL names on purpose — the engine must reach the truth by querying git, never by
# reading a path hint (the a0f68 baseline failure that read "pre-reconcile" from the dir name).
set -uo pipefail
GREP=/usr/bin/grep # LRN-074: pin grep
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO="$(cd "$HERE/.." && pwd)"
FIX="$HERE/fixtures"
MEM="$REPO/../.claude/memory"; [ -d "$MEM" ] || MEM="$REPO/.claude/memory"
# shellcheck source=/dev/null
source "$REPO/reconcile.sh"
pass=0; fail=0
ok() { echo "GREEN ✓ $*"; pass=$((pass+1)); }
no() { echo "RED ✗ $*"; fail=$((fail+1)); }
has() { printf '%s\n' "$1" | $GREP -qF -- "$2"; } # substring present in multiline string
echo "=== T1 recursive coherence — enumerate from BODY, never the ## Index ==="
DRIFT="$FIX/registry-index-drift.md"
mapfile -t IDS < <(reconcile_enumerate_ids "$DRIFT" LRN)
if [ "${#IDS[@]}" -eq 72 ]; then ok "T1a enumerated 72 body ids"; else no "T1a got ${#IDS[@]}, expected 72 (an Index-reader gives 51)"; fi
if printf '%s\n' "${IDS[@]}" | $GREP -qx "LRN-020"; then ok "T1b includes body-only canary LRN-020"; else no "T1b dropped LRN-020 — read the Index, not the body"; fi
# teeth: an Index-based enumerator would RED here (the fixture discriminates)
idx_only=$($GREP -oE '^\| LRN-[0-9]+' "$DRIFT" | $GREP -oE 'LRN-[0-9]+' | sort -u)
if printf '%s\n' "$idx_only" | $GREP -qx "LRN-020"; then no "T1c teeth LOST — Index path also yields LRN-020"; else ok "T1c teeth intact — Index path OMITS LRN-020 (engine reading the Index would fail T1b)"; fi
echo; echo "=== T2 BLK status — LAST block wins (the BLK-008 trap) ==="
b="$MEM/blockers.md"
case "$(reconcile_blk_current_status "$b" BLK-008)" in
*RESOLVED*|*resolved*) ok "T2a BLK-008 current = resolved (read FINAL, not the middle REVERTED)";;
*) no "T2a BLK-008 misread as non-resolved — fell into the compound-status trap";;
esac
case "$(reconcile_blk_current_status "$b" BLK-009)" in
*open*|*upstream*) ok "T2b BLK-009 current = upstream/open";;
*) no "T2b BLK-009 misread";;
esac
open_ids=$(reconcile_blk_open "$b" | cut -f1 | sort | tr '\n' ' ')
if [ "$open_ids" = "BLK-001 BLK-003 BLK-009 " ]; then ok "T2c open blockers = {001,003,009}"; else no "T2c open = [$open_ids], expected {001,003,009}"; fi
echo; echo "=== T3 deferral lexical sweep (HONEST LIMIT: marked-only) ==="
defer=$(reconcile_deferrals "$FIX/todo-snapshot.md" "$MEM/decisions.md")
for mark in "OUT-OF-SCOPE" "DEFERRED" "follow-up" "one-line ticket"; do
if has "$defer" "$mark"; then ok "T3 found marked deferral: $mark"; else no "T3 missed marker: $mark"; fi
done
if $GREP -qE '^\s*- \[~\] Cleanup machine' "$FIX/todo-snapshot.md"; then ok "T3e [~] cleanup present for checkbox-state detection"; else no "T3e [~] cleanup not found"; fi
echo; echo "=== T4 reconciliation kernel (pure) + snapshot composition ==="
if [ "$(reconcile_verdict ' ' true)" = "STALE:open-but-done" ]; then ok "T4a ' '+done → STALE"; else no "T4a wrong"; fi
if [ "$(reconcile_verdict 'x' false)" = "STALE:done-but-open" ]; then ok "T4b 'x'+!done → STALE"; else no "T4b wrong"; fi
if [ "$(reconcile_verdict '~' true)" = "STALE:partial-but-done" ]; then ok "T4c '~'+done → STALE"; else no "T4c wrong"; fi
if [ "$(reconcile_verdict 'x' true)" = "CONSISTENT" ]; then ok "T4d 'x'+done → CONSISTENT"; else no "T4d wrong"; fi
truths=$($GREP -cE '=(true|resolved|present)$' "$FIX/real-state.snapshot")
if [ "$truths" -ge 6 ]; then ok "T4e snapshot supplies $truths real-true facts → kernel yields STALE for the 6 git-verifiable items"; else no "T4e snapshot facts=$truths (<6)"; fi
echo " (7th cat-4 item — twin doc-sync [~] cross-ref — is SURFACED for review, not auto-verified: honest limit)"
echo; echo "=== T5 contradiction candidates (surface, never assert) ==="
cand=$(reconcile_contradiction_candidates "$MEM/decisions.md" "$FIX/todo-snapshot.md")
if has "$cand" "--help"; then ok "T5 surfaced --help candidate (BDR-001 ⇄ --help chantier)"; else no "T5 missed --help candidate"; fi
echo; echo "=== T6 live oracle smoke — oracles QUERY real git/fs (not a name) ==="
if reconcile_oracle_merge_done "$REPO" "prune-memory"; then ok "T6a merge_done(prune-memory) via git log"; else no "T6a merge not found in git"; fi
if reconcile_oracle_sha_exists "$REPO" "be1dcef"; then ok "T6b sha_exists(be1dcef) via cat-file"; else no "T6b sha missing"; fi
dk="$MEM/../skills/darwin-skill"
if reconcile_oracle_path_present "$dk"; then ok "T6c path_present(darwin-skill) via fs"; else no "T6c path absent"; fi
echo; echo "================ $pass GREEN / $fail RED ================"
[ "$fail" -eq 0 ] && exit 0 || exit 1

51
skills/reconcile/SKILL.md Normal file
View File

@ -0,0 +1,51 @@
---
name: reconcile
description: Use when you need the REAL open-work state of a project and the TODO or memory registries may be stale — "is the queue empty?", "what's left open?", "qu'est-ce qui reste", before /close, after a break, or when a checkbox/status looks doubtful. Confronts declared status (TODO checkboxes, registry statuses) against real git/fs state and surfaces the gaps. NOT memory curation (that is /prune-memory).
---
# /reconcile — declared vs real
## Overview
The open-work queue lies whenever a `[x]`/`[ ]` checkbox, a registry status, or an `## Index` row has drifted from what git/fs actually show. `/reconcile` answers "what is really left?" by **verifying**, never by **believing** a declarative source.
**Founding principle — recursive coherence:** never use a declarative source as an oracle (not the Index, not a checkbox, not a status claim, not a path name). Truth = registry BODY (`## ID —` headings), git, fs. The skill practices what it preaches — `lib/tests/run-reconcile.sh` T1 reds if the engine ever reads the Index.
## When to use
- "Is the queue empty / what's left to do?" — the naive answer (`grep '[ ]'`) is a mirror of the TODO and inherits all its lies.
- A checkbox or status looks doubtful; work suspected stale after merges/pushes.
- Before `/close`, or re-orienting after a break.
Not for: curating/compressing registries → `/prune-memory`. The skill never edits registry content (read-only here).
## How it runs
**REQUIRED ENGINE:** `lib/reconcile.sh` does the mechanical truth-probing. A capable agent CAN reconcile by hand — but burns ~50k tokens, depends on the question being phrased well, and still hits traps (compound status, a disclaimer instead of a check). The engine is deterministic, cheap, repeatable. Source it, then:
1. Enumerate registry entries from BODY headings (`reconcile_enumerate_ids`, never the Index).
2. Per declared item, run the matching oracle (`reconcile_oracle_*`, `reconcile_blk_current_status` = last block wins) and apply `reconcile_verdict`.
3. Classify into the four categories.
4. **Gate the write-back** (below).
## Output — four categories
1. **Actionable now** — open AND real state confirms not done.
2. **Blocked (external)**`reconcile_blk_open`: current status upstream/open, not our code.
3. **Deferred** — lexically-marked follow-ups + `[~]` items (conditional triggers).
4. **TODO↔real gap** — declared ≠ verdict (open-but-done, done-but-open, partial-but-done).
Plus **contradiction candidates**`reconcile_contradiction_candidates`: accepted-BDR ⇄ open-chantier overlap, surfaced for human review.
## The gate (mandatory)
Reconciling the TODO edits a tracked file → never silent. Show the proposed diff, then ask: **A** apply all · **B** select a subset · **C** touch nothing. Registries stay READ-ONLY (append-only; curation is `/prune-memory`).
## Honest limits (do not over-read the guarantee)
- Deferral detection is **lexical** — catches MARKED deferrals, misses unmarked ("à reprendre quand X"). Deterministic on the detectable; surface the ambiguous for human review, never assert.
- Contradictions are **candidates** (token overlap), not verdicts — a human confirms.
- Cross-repo items (oracle in another repo) → "not verifiable here", never flagged stale.
- Cross-reference verdicts ("[~] done because chantier X below is complete") are SURFACED, not auto-resolved.
## Common mistakes
- Grepping `[ ]` and reporting it as open → reproduces the lie. Run the oracle.
- Reading the `## Index` for status → inherits its drift (T1 forbids it).
- Writing a disclaimer ("à vérifier si déjà fait") instead of verifying → the engine verifies, it never hedges-and-advances.
## Validation
`bash lib/tests/run-reconcile.sh` → 20/20, shellcheck clean. Oracle of record = the 2026-06-29 inventory (7 gaps + 3 blocked + 5 deferred + 1 contradiction), fixtures frozen under neutral names in `lib/tests/fixtures/`.