Compare commits
7 Commits
26658c4962
...
ea992cbc62
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
ea992cbc62 | ||
|
|
8413afb785 | ||
|
|
0a3e76611d | ||
|
|
9a58734286 | ||
|
|
a1cc753746 | ||
|
|
dbab542409 | ||
|
|
e5e673ac1f |
@ -510,3 +510,19 @@ rules:
|
|||||||
- **Caveats**: future GLOBAL memory must stay tiny — with conditional loading broken, anything global loads in EVERY project unconditionally; fold that constraint into the global-memory design once a backup exists. Caveman pass to ~250 was explicitly DECLINED: marginal ~25-line gain vs real risk (changes the nature of instructions-to-follow; no evidence caveman is followed better than prose; CLAUDE.md is the most-edited file → caveman = painful to re-read/amend). 275 readable > 250 caveman.
|
- **Caveats**: future GLOBAL memory must stay tiny — with conditional loading broken, anything global loads in EVERY project unconditionally; fold that constraint into the global-memory design once a backup exists. Caveman pass to ~250 was explicitly DECLINED: marginal ~25-line gain vs real risk (changes the nature of instructions-to-follow; no evidence caveman is followed better than prose; CLAUDE.md is the most-edited file → caveman = painful to re-read/amend). 275 readable > 250 caveman.
|
||||||
- **Alternatives rejected**: path-scoped `~/.claude/rules/` (broken, [[BLK-009]]); externalize sections to on-demand-loaded files (same conditional-load dependency); caveman to ~250 (readability + instruction-fidelity risk).
|
- **Alternatives rejected**: path-scoped `~/.claude/rules/` (broken, [[BLK-009]]); externalize sections to on-demand-loaded files (same conditional-load dependency); caveman to ~250 (readability + instruction-fidelity risk).
|
||||||
- **Reference**: `~/.claude/CLAUDE.md` (symlink → `~/Documents/claude/CLAUDE.md`), commits ba743cf (compress routing+design+graphify) + 990318c (trim separators/blanks). Linked to [[BLK-009]], [[LRN-043]], [[LRN-044]].
|
- **Reference**: `~/.claude/CLAUDE.md` (symlink → `~/Documents/claude/CLAUDE.md`), commits ba743cf (compress routing+design+graphify) + 990318c (trim separators/blanks). Linked to [[BLK-009]], [[LRN-043]], [[LRN-044]].
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## BDR-032 — skill `/validate` → `/web-validate` (rename user surface, keep internals)
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Status**: accepted (shipped `e5e673a`)
|
||||||
|
- **Decision**: rename W3C+WCAG skill `/validate` → `/web-validate` (clearer scoped name, less generic). Renamed the USER-FACING surface ONLY: folder (`git mv`), frontmatter `name`, H1, command refs, CLAUDE.md routing line, 6 `lib/profiles/*.profile` entries (FUNCTIONAL — profiles activate skills by folder name, a miss = broken activation), cross-refs (harden/seo/depth-matrix/client-handover), agent dispatch refs, README + USAGE tables. Leak-guard regex extended to `web-validate|validate` ([[LRN-045]]).
|
||||||
|
- **Why — 4 deliberate KEEPs**:
|
||||||
|
- agent `validator-analyzer` name KEPT — internal, lockstep with `subagent_type=` + harness registry; rename = wider blast radius, zero discoverability gain.
|
||||||
|
- `.validate-cache/` + `VALIDATE.md` KEPT — names derive from the AUDIT TYPE, family `{SEO,GEO,HARDEN,CSO,VALIDATE}.md`; renaming makes VALIDATE the odd one out + orphans already-generated reports (`MIGRATION.md` cleanup loop hardcodes the name). Same logic kept the dispatch label `description="validate — ..."`.
|
||||||
|
- `.claude/` history KEPT (memory + completed TODO block) — append-only, true at the time. The forward-pointing OPEN TODO item was ANNOTATED additively (`désormais /web-validate`), not rewritten — append-only protects history, not pointers to future actions.
|
||||||
|
- CHANGELOG old entry KEPT, new "renamed" entry ADDED (Keep-a-Changelog: don't rewrite the past).
|
||||||
|
- NL trigger keywords ("validate"/"validation") KEPT in the description so "validate my site" still routes here.
|
||||||
|
- **Alternatives rejected**: rename agent + artifacts too (cosmetic symmetry, ~45 extra edits, breaks audit-file family + report back-compat); blind `sed s/validate/web-validate/` (breaks third-party `html-validate`, `validator.nu`, English-verb prose — discrimination must be at the `/validate` token, proven by `.validate-cache/html-validate.json` staying intact).
|
||||||
|
- **Reference**: commit `e5e673a` (18 files). Verified complete: `/validate` = 0 in active code (only `.claude/` history + CHANGELOG), `html-validate` = 15 intact, regex `client-handover-writer.md:1462` shows both names. Linked to [[LRN-045]], [[BDR-031]] (CLAUDE.md routing), [[LRN-043]] (validate routing line).
|
||||||
|
|||||||
@ -25,6 +25,8 @@ rules:
|
|||||||
| EVAL-002 | 2026-06-02 | `profile gstack on/off` verb implementation | keep |
|
| EVAL-002 | 2026-06-02 | `profile gstack on/off` verb implementation | keep |
|
||||||
| EVAL-003 | 2026-06-11 | darwin optimization run on `audit-delta` skill | keep |
|
| EVAL-003 | 2026-06-11 | darwin optimization run on `audit-delta` skill | keep |
|
||||||
| EVAL-004 | 2026-06-11 | darwin eval 26 perso skills + 4-bug fix round | keep |
|
| EVAL-004 | 2026-06-11 | darwin eval 26 perso skills + 4-bug fix round | keep |
|
||||||
|
| EVAL-005 | 2026-06-23 | Obsolete `claude --effort max` alias missed across Step 9 edits | correct |
|
||||||
|
| EVAL-006 | 2026-06-25 | prune-memory v1.1 TDD — 6 guards (0a3e766), validated on real data | keep |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -74,3 +76,13 @@ rules:
|
|||||||
- **Method / why missed**: I treated the pre-existing `CLAUDE_LINES` as established and only touched the lines I was adding/removing. Spotting the redundancy needs cross-referencing TWO config layers (shell alias vs settings.json) — a semantic check I never ran. Masked further: the user's `.bashrc` is hand-managed and the alias line wasn't even present, so it looked inert.
|
- **Method / why missed**: I treated the pre-existing `CLAUDE_LINES` as established and only touched the lines I was adding/removing. Spotting the redundancy needs cross-referencing TWO config layers (shell alias vs settings.json) — a semantic check I never ran. Masked further: the user's `.bashrc` is hand-managed and the alias line wasn't even present, so it looked inert.
|
||||||
- **Anomaly**: not just dead config — a CLI flag (`--effort max`) silently OVERRIDES the settings.json value (`xhigh`). Real correctness bug.
|
- **Anomaly**: not just dead config — a CLI flag (`--effort max`) silently OVERRIDES the settings.json value (`xhigh`). Real correctness bug.
|
||||||
- **Action**: when editing installer shell-config, audit EACH existing line against the current settings.json / CLAUDE.md source of truth, not only the lines being changed. Removed the alias + added cleanup. General rule: reconcile config to ONE source of truth across env/alias/settings layers.
|
- **Action**: when editing installer shell-config, audit EACH existing line against the current settings.json / CLAUDE.md source of truth, not only the lines being changed. Removed the alias + added cleanup. General rule: reconcile config to ONE source of truth across env/alias/settings layers.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## EVAL-006 — prune-memory v1.1: 6 dangers TDD-closed, validated on real data
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Output**: prune-memory (only DESTRUCTIVE skill, never tested before) TDD'd end-to-end. 6 dangers, each proven by a failing RED then closed by a DETERMINISTIC guard: RED-1 false "verified" claim removed; RED-2 dirty tree → real `exit 1` (was prose-only STOP); RED-3 negation-sentence verbatim guard (no silent inversion); RED-4 collapse safety-critical exception (NEVER/ALWAYS/PERMANENT entry INTOUCHABLE); RED-5 STEP-4 fidelity census (count-based, per-entry × per-category); RED-6 trailing-space false-ORPHAN fix. Guards committed in **0a3e766**; tests: `tests/run-deterministic.sh` + `run-behavioral.md` + fixtures + BACKLOG.
|
||||||
|
- **Method**: run-deterministic all-green (RED-1/2/5/6); RED-3/4 by single faithful subagent runs on throwaway fixtures (deterministic oracles — byte-identical + layer-(a) substring; N=6 tolerance-zero fleet documented in run-behavioral.md, NOT exhausted — 1 run sufficed to prove presence then closure); REAL-data re-test on live learnings.md (602 l): fidelity 0 false-positive vs old line-grep 13, census PROVEN counting both sides (HEAD/WORK not=107/114, no=64/71, never=34/34 — no category dropped). Scope held (only learnings.md), reverted after (measurement, not a kept prune). Clean tree verified first; git + cp backup.
|
||||||
|
- **Anomalies**: (1) SAFE ≠ USEFUL — compression (pass C) marginal on already-caveman dense content (~3.6% trim, registry GREW); real value = index-drift (D found 19 missing Index rows on the real learnings.md, measured then reverted) + merge (B), not C on dense. (2) RED-8 OPEN — fidelity proves "no negation DELETED", not "none ADDED wrongly"; visible empirically (not/no +7 in the measurement, passed under the drop-radar); remote (compression subtracts) but real. (3) RED-7 OPEN — merge LRN-014+016 maybe PRIMED by the SKILL.md example, not real overlap; verify. (4) two guard bugs caught by VALIDATION not logic: awk `\<` unsupported (mawk) → not/no counted 0; `NR==FNR` blind when working census empty = the deletion case → both fixed + re-validated. (5) REAL ANCHOR FOR PASS D — this very evals.md had EVAL-005's Index row MISSING (pre-existing drift), exactly what the skill's D pass auto-corrects; hand-backfilled in a separate `fix(memory)` commit. learnings.md likewise carries 19 missing Index rows (deferred to an intentional prune). D is NOT theoretical.
|
||||||
|
- **Action**: keep — safe, useful for B/D, compression marginal on dense (documented limit). `Fixed in v1.1 (TDD found it)` WAS the RED-1 defect (claim of a verify never run); TDD note now TRUE (real suite passes). Patterns → LRN-046/047/048; open items → tests/BACKLOG.md (RED-7/RED-8).
|
||||||
|
|||||||
@ -197,3 +197,5 @@ rules:
|
|||||||
- Compressed global CLAUDE.md 317→275 (−42, loaded every session): routing (cut name-obvious lines, keep non-derivable signal + dense catch-all; restored `validate`/`plan-eng-review`, feat/hotfix pointer), design (+ explicit FILE signal), graphify, then decorative `---` + blank rognage. Caveman→250 declined (readability + instruction-fidelity). Commits ba743cf, 990318c. BDR-031, LRN-043.
|
- Compressed global CLAUDE.md 317→275 (−42, loaded every session): routing (cut name-obvious lines, keep non-derivable signal + dense catch-all; restored `validate`/`plan-eng-review`, feat/hotfix pointer), design (+ explicit FILE signal), graphify, then decorative `---` + blank rognage. Caveman→250 declined (readability + instruction-fidelity). Commits ba743cf, 990318c. BDR-031, LRN-043.
|
||||||
- Edit/Write refuse write-through-symlink → resolve real path (`readlink -f`); `~/.claude/CLAUDE.md` → repo. LRN-044.
|
- Edit/Write refuse write-through-symlink → resolve real path (`readlink -f`); `~/.claude/CLAUDE.md` → repo. LRN-044.
|
||||||
- Inspected dirty gstack submodule (parent showed `m`): `package.json`+`bun.lock` = the Playwright 1.58.2→1.61 bump (BDR-029/BLK-008), NOT restore noise → left intact, NOT cleaned, NOT committed (submodule ref stays at clean 070722ace; local patch re-applied by installer by design).
|
- Inspected dirty gstack submodule (parent showed `m`): `package.json`+`bun.lock` = the Playwright 1.58.2→1.61 bump (BDR-029/BLK-008), NOT restore noise → left intact, NOT cleaned, NOT committed (submodule ref stays at clean 070722ace; local patch re-applied by installer by design).
|
||||||
|
- Renamed skill `/validate` → `/web-validate` (user-surface only): git mv + name + H1 + CLAUDE.md routing + 6 profiles (functional) + cross-refs + agent dispatch + README/USAGE. KEPT: `validator-analyzer` name (lockstep), `.validate-cache`/`VALIDATE.md` (audit-file family), `.claude/` history (append-only), NL triggers. Critical catch: client-deliverable leak-guard regex (`client-handover-writer.md:1462`) matched `/validate` by exact token — `web-` prefix broke the anchored match → extended to `web-validate|validate` (covers legacy docs). Verified complete: `/validate` 0 in active code, `html-validate` 15 intact, regex shows both. Commits e5e673a + dbab542 (BDR-032/LRN-045) + a1cc753 (TODO L167 annotated additively). gstack submodule untouched.
|
||||||
|
- TDD'd `/prune-memory` (only destructive skill, untested + carried a false `Fixed in v1.1 (TDD found it)` claim): 6 dangers (RED-1..6) closed by deterministic guards, skill `0a3e766`. Real-data run on learnings.md exposed SAFE≠USEFUL (compression marginal on dense; value = index/merge, not C) + a 13/13-false-positive line-grep fidelity guard → replaced by a per-entry count census (0 FP, proven counting both sides). RED-7 (example-priming) + RED-8 (added-negation) filed in BACKLOG. EVAL-006, LRN-046/047/048.
|
||||||
|
|||||||
@ -46,6 +46,9 @@ rules:
|
|||||||
| LRN-027 | 2026-06-11 | Agents improvise audit boundaries from file dates when no machine state — periodic skills need machine-readable state file, never inference | any recurring/periodic skill needing "since last run" semantics |
|
| LRN-027 | 2026-06-11 | Agents improvise audit boundaries from file dates when no machine state — periodic skills need machine-readable state file, never inference | any recurring/periodic skill needing "since last run" semantics |
|
||||||
| LRN-030 | 2026-06-18 | Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter via explicit CLAUDE.md fan-out rule | any Opus 4.8 session; tuning delegation; inline-vs-subagent decision |
|
| LRN-030 | 2026-06-18 | Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter via explicit CLAUDE.md fan-out rule | any Opus 4.8 session; tuning delegation; inline-vs-subagent decision |
|
||||||
| LRN-031 | 2026-06-19 | Skill value = gate + anti-noise + determinism, not re-coding what a capable agent does free | building/reviewing any skill; writing-skills TDD fixture design |
|
| LRN-031 | 2026-06-19 | Skill value = gate + anti-noise + determinism, not re-coding what a capable agent does free | building/reviewing any skill; writing-skills TDD fixture design |
|
||||||
|
| LRN-046 | 2026-06-25 | Destructive skill: deterministic oracle (byte-identical / count census) > semantic judge | any destructive/irreversible skill; behavioral-oracle TDD |
|
||||||
|
| LRN-047 | 2026-06-25 | A noisy safety guard (13/13 FP) = a guard you learn to ignore = risk → refine, don't tolerate | any guard/alert/lint that can false-positive |
|
||||||
|
| LRN-048 | 2026-06-25 | A "0/OK/pass" must prove it LOOKED (counted both sides), else verify hard-wired to pass | any verify/test/lint reporting success |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -590,3 +593,37 @@ rules:
|
|||||||
- **Pattern**: many of this user's `~/.claude/*` config files are symlinks INTO the claude-config repo (`~/Documents/claude/`). Edit/Write block writes through a symlink (safety against clobbering link targets); Read does not — so Read-through-link succeeds then Edit-through-link fails on the same path.
|
- **Pattern**: many of this user's `~/.claude/*` config files are symlinks INTO the claude-config repo (`~/Documents/claude/`). Edit/Write block writes through a symlink (safety against clobbering link targets); Read does not — so Read-through-link succeeds then Edit-through-link fails on the same path.
|
||||||
- **Future application**: before editing any `~/.claude/...` config file, resolve it first (`readlink -f <path>`, or `ls -la` to spot the arrow). Then Read AND Edit the RESOLVED real path so the harness's read-tracking matches what you write — and `git` status/diff/commit land naturally in the repo that owns the file.
|
- **Future application**: before editing any `~/.claude/...` config file, resolve it first (`readlink -f <path>`, or `ls -la` to spot the arrow). Then Read AND Edit the RESOLVED real path so the harness's read-tracking matches what you write — and `git` status/diff/commit land naturally in the repo that owns the file.
|
||||||
- **Reference**: hit while editing `~/.claude/CLAUDE.md` → `~/Documents/claude/CLAUDE.md`. Linked to [[BDR-031]].
|
- **Reference**: hit while editing `~/.claude/CLAUDE.md` → `~/Documents/claude/CLAUDE.md`. Linked to [[BDR-031]].
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## LRN-045 — Renaming a command: audit exact-name leak-guard / forbidden-token regexes
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Context**: rename `/validate` → `/web-validate`. A client-deliverable leak-guard in `agents/client-handover-writer.md:1462` greps generated docs for internal tool names via `grep -niE '/(seo|harden|validate|cso|...)\b'`. The `web-` prefix means `/web-validate` no longer matches the `/validate` branch (the `/` must sit immediately before `validate`; post-rename a `-` sits there) → renamed command leaks SILENTLY into client-facing output. No error — the gate just stops catching it.
|
||||||
|
- **Pattern**: any rename of a command/skill/identifier must sweep regexes/allowlists/denylists that match the OLD name by exact token — leak guards, forbidden-token gates, routing dispatchers, CI greps. A prefix/suffix rename breaks anchored matches (`/oldname\b`) with zero error. Fix = alternation covering BOTH names (`web-validate|validate`), NOT replacement — old artifacts (already-shipped client docs, logs) still carry the legacy name and must stay caught.
|
||||||
|
- **Future application**: when renaming, grep the BARE old token inside regex/test/gate files, not just `/oldname` command refs. A blind `replace_all '/old' '/new'` MISSES these because the guard stores the name inside an alternation (`|old|`), not as `/old`. For each guard found, extend to `new|old`; verify the gate line shows both names.
|
||||||
|
- **Reference**: `agents/client-handover-writer.md:1462`, rename commit `e5e673a`. Linked to [[BDR-032]].
|
||||||
|
|
||||||
|
## LRN-046 — Destructive skill: deterministic oracle > semantic judge
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Pattern**: On a DESTRUCTIVE skill the binding oracle must be DETERMINISTIC (byte-identical, or count-based census per-entry × per-category), not a semantic judge. A judge false-greens twice: (a) PRESERVED-but-MUTATED content — RED-4, a "meaning preserved" collapse still rewrote a permanent safety rule; byte-identical caught it, the judge would not; (b) a 0-result that happened by CHANCE — "no negation inverted" ≠ protected, it was the dice not a guard. If the oracle must be behavioral/LLM, pair it with a deterministic check that is the gate.
|
||||||
|
- **Context**: prune-memory v1.1 TDD (EVAL-006, skill `0a3e766`). RED-4 collapse + RED-3 compression.
|
||||||
|
- **Future application**: any destructive/irreversible skill or safety check; any TDD whose natural oracle is an LLM judge — make the binding check deterministic, keep the judge as a secondary net.
|
||||||
|
- **Reference**: skill `0a3e766`, `tests/run-behavioral.md`. Linked to [[EVAL-006]].
|
||||||
|
|
||||||
|
## LRN-047 — A noisy safety guard is a risk, not discomfort
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Pattern**: A safety guard that cries wolf (13/13 false positives on real data) is a guard you learn to IGNORE → the day of the true positive you skip it by habit. On a destructive op a noisy guard = security RISK, not annoyance → REFINE it (here line-grep → count-based census), don't tolerate. Measure the false-positive rate on REAL data, not fixtures — all-green fixtures hid the 13.
|
||||||
|
- **Context**: prune-memory v1.1 (EVAL-006). The RED-5 line-grep fidelity guard fired 13/13 false positives on the live learnings.md (line-sharing) → replaced by a per-entry census (0 FP, proven).
|
||||||
|
- **Future application**: any guard/alert/lint/test that can false-positive — measure FP on real data before shipping; a guard habitually ignored is worse than none.
|
||||||
|
- **Reference**: skill `0a3e766`, `tests/run-deterministic.sh` (RED-5). Linked to [[EVAL-006]].
|
||||||
|
|
||||||
|
## LRN-048 — A "0 / OK / pass" must prove it LOOKED
|
||||||
|
|
||||||
|
- **Date**: 2026-06-25
|
||||||
|
- **Pattern**: A passing result ("0 errors", "OK", "clean") must PROVE it inspected — show the work counted something on both sides (census non-zero on HEAD and WORK). Else it is a verify hard-wired to pass = the original prune-memory v1 lie (`basename | cut -c1-3` never matched any heading → verify always printed blank-OK). A 0 by inaction is indistinguishable from a 0 by correctness; force the success path to surface its coverage.
|
||||||
|
- **Context**: prune-memory v1.1 (EVAL-006). v1 STEP-4 verify always reported OK (wrong prefix → 0 markers → blank). The fix's 0-false-positive is only trustworthy because the census was shown counting both sides.
|
||||||
|
- **Future application**: any verify/test/lint reporting success — design the pass to surface what it examined (counts / files / lines) so a vacuous pass is visible, not silent.
|
||||||
|
- **Reference**: skill `0a3e766`, EVAL-006 (verify-proof anomaly). Linked to [[EVAL-006]].
|
||||||
|
|||||||
@ -164,7 +164,7 @@ Subtasks :
|
|||||||
- [ ] Créer `skills/lib/help-handler.md` — snippet réutilisable (détection + extraction + affichage)
|
- [ ] Créer `skills/lib/help-handler.md` — snippet réutilisable (détection + extraction + affichage)
|
||||||
- [ ] Définir format d'aide standard + section "ARGUMENTS" vs reuse de argument-hint
|
- [ ] Définir format d'aide standard + section "ARGUMENTS" vs reuse de argument-hint
|
||||||
- [ ] Décider : sections ARGUMENTS/EXAMPLES doivent-elles être dans la frontmatter (nouveau champ YAML) ou dans le corps du SKILL.md (nouvelle section `## Help`) ?
|
- [ ] Décider : sections ARGUMENTS/EXAMPLES doivent-elles être dans la frontmatter (nouveau champ YAML) ou dans le corps du SKILL.md (nouvelle section `## Help`) ?
|
||||||
- [ ] Patcher un skill pilote (`/validate`) — valider UX
|
- [ ] Patcher un skill pilote (`/validate`) — valider UX _(désormais `/web-validate` — renommé e5e673a)_
|
||||||
- [ ] Patcher les skills perso restants : analyze, bugfix, code-clean, commit-change, doc, feat, geo, graphify, harden, hotfix, init-project, make-pdf, onboard, plan-tune, plugin-check, refactor, seo, ship-feature, skills-perso, status, benchmark-models, context-save, context-restore
|
- [ ] Patcher les skills perso restants : analyze, bugfix, code-clean, commit-change, doc, feat, geo, graphify, harden, hotfix, init-project, make-pdf, onboard, plan-tune, plugin-check, refactor, seo, ship-feature, skills-perso, status, benchmark-models, context-save, context-restore
|
||||||
- [ ] Mettre à jour `~/.claude/CLAUDE.md` — mentionner convention --help disponible sur tous les skills perso
|
- [ ] Mettre à jour `~/.claude/CLAUDE.md` — mentionner convention --help disponible sur tous les skills perso
|
||||||
- [ ] Note : skills-external/gstack ont leur propre convention, ne pas toucher
|
- [ ] Note : skills-external/gstack ont leur propre convention, ne pas toucher
|
||||||
|
|||||||
@ -23,6 +23,7 @@ Format follows [Keep a Changelog](https://keepachangelog.com/).
|
|||||||
- `.claude/{tasks,memory,audits}/` governance layout + 5 memory registries (decisions, learnings, blockers, journal, evals)
|
- `.claude/{tasks,memory,audits}/` governance layout + 5 memory registries (decisions, learnings, blockers, journal, evals)
|
||||||
|
|
||||||
### Changed
|
### Changed
|
||||||
|
- `/validate` renamed to `/web-validate` — clearer scoped name (W3C + WCAG); routing, skill profiles, cross-references, and the client-deliverable leak-guard updated (the guard still matches legacy `/validate` so older client docs stay covered)
|
||||||
- `/seo` split into parallel `seo` + `geo` agents with shared resources
|
- `/seo` split into parallel `seo` + `geo` agents with shared resources
|
||||||
- `/onboard` rewritten: archetype-aware pipeline (orchestrator + config-only agent), security audit archetype-aware
|
- `/onboard` rewritten: archetype-aware pipeline (orchestrator + config-only agent), security audit archetype-aware
|
||||||
- `doc-syncer`: stack-aware audit + deploy-doc gating; later scoped to public docs only, `.claude/` read-only; sync-only ROADMAP handling — planned→shipped reconciliation from code/git, never from `.claude/`; numeric incoherence → HUMAN question
|
- `doc-syncer`: stack-aware audit + deploy-doc gating; later scoped to public docs only, `.claude/` read-only; sync-only ROADMAP handling — planned→shipped reconciliation from code/git, never from `.claude/`; numeric incoherence → HUMAN question
|
||||||
|
|||||||
@ -235,7 +235,7 @@ only the non-obvious cases: gstack fallbacks, disambiguation, cryptic names.
|
|||||||
- Architecture review → plan-eng-review
|
- Architecture review → plan-eng-review
|
||||||
- Before /clear or /compact → capitalize; end-of-session ritual → close
|
- Before /clear or /compact → capitalize; end-of-session ritual → close
|
||||||
- SEO+GEO → seo (GEO only → geo)
|
- SEO+GEO → seo (GEO only → geo)
|
||||||
- W3C + WCAG a11y (HTML/CSS validity, axe, pa11y) → validate
|
- W3C + WCAG a11y (HTML/CSS validity, axe, pa11y) → web-validate
|
||||||
- Security audit (secrets, CVE, OWASP) → cso
|
- Security audit (secrets, CVE, OWASP) → cso
|
||||||
- New project → init-project; onboard existing repo → onboard
|
- New project → init-project; onboard existing repo → onboard
|
||||||
|
|
||||||
|
|||||||
@ -112,7 +112,7 @@ Versions are pinned in `plugins.lock.json`. To update: edit the file, then re-ru
|
|||||||
| `/pdf-translate` | Translate a PDF to another language, output as HTML (via Vision) |
|
| `/pdf-translate` | Translate a PDF to another language, output as HTML (via Vision) |
|
||||||
| `/close` | End-of-session ritual — alias for `/capitalize --ritual` (dedup + TODO reconcile + 3-question reflection) |
|
| `/close` | End-of-session ritual — alias for `/capitalize --ritual` (dedup + TODO reconcile + 3-question reflection) |
|
||||||
| `/harden` | Web hardening audit — HTTPS/TLS, HSTS, CSP, security headers |
|
| `/harden` | Web hardening audit — HTTPS/TLS, HSTS, CSP, security headers |
|
||||||
| `/validate` | W3C HTML/CSS validity + WCAG 2.1 accessibility audit |
|
| `/web-validate` | W3C HTML/CSS validity + WCAG 2.1 accessibility audit |
|
||||||
| `/geo` | GEO-only audit — AI-search visibility (ChatGPT, Perplexity, Claude, Gemini…) |
|
| `/geo` | GEO-only audit — AI-search visibility (ChatGPT, Perplexity, Claude, Gemini…) |
|
||||||
| `/client-handover` | Final project delivery — audits + branded deliverable (Markdown / HTML / PDF) |
|
| `/client-handover` | Final project delivery — audits + branded deliverable (Markdown / HTML / PDF) |
|
||||||
| `/profile` | Activate a skill profile (design / dev / qa / audit / minimal) |
|
| `/profile` | Activate a skill profile (design / dev / qa / audit / minimal) |
|
||||||
|
|||||||
4
USAGE.md
4
USAGE.md
@ -111,7 +111,7 @@ Tu veux...
|
|||||||
| Curer la mémoire | `/prune-memory` |
|
| Curer la mémoire | `/prune-memory` |
|
||||||
| Fin de session (= /capitalize --ritual) | `/close` |
|
| Fin de session (= /capitalize --ritual) | `/close` |
|
||||||
| Audit web (TLS, CSP, headers) | `/harden` |
|
| Audit web (TLS, CSP, headers) | `/harden` |
|
||||||
| Validité HTML/CSS + a11y | `/validate` |
|
| Validité HTML/CSS + a11y | `/web-validate` |
|
||||||
| Visibilité IA (GEO seul) | `/geo` |
|
| Visibilité IA (GEO seul) | `/geo` |
|
||||||
| Livraison client finale | `/client-handover` |
|
| Livraison client finale | `/client-handover` |
|
||||||
| Changer profil skills | `/profile` |
|
| Changer profil skills | `/profile` |
|
||||||
@ -147,7 +147,7 @@ Tu veux...
|
|||||||
| `/prune-memory` | Registres trop longs / bruyants | Curation : merge, superseded, compression |
|
| `/prune-memory` | Registres trop longs / bruyants | Curation : merge, superseded, compression |
|
||||||
| `/close` | Fin de session | Alias de /capitalize --ritual — dedup + TODO + réflexion 3 questions |
|
| `/close` | Fin de session | Alias de /capitalize --ritual — dedup + TODO + réflexion 3 questions |
|
||||||
| `/harden` | Audit sécurité web (SSL, CSP, HSTS) | Projet web avec config HTTP |
|
| `/harden` | Audit sécurité web (SSL, CSP, HSTS) | Projet web avec config HTTP |
|
||||||
| `/validate` | Audit W3C + WCAG a11y | Avant livraison projet web |
|
| `/web-validate` | Audit W3C + WCAG a11y | Avant livraison projet web |
|
||||||
| `/client-handover` | Livraison client | Audits finaux + livrable brandé |
|
| `/client-handover` | Livraison client | Audits finaux + livrable brandé |
|
||||||
| `/profile` | Changer le profil de skills | design / dev / qa / audit / minimal |
|
| `/profile` | Changer le profil de skills | design / dev / qa / audit / minimal |
|
||||||
|
|
||||||
|
|||||||
@ -211,7 +211,7 @@ test -f Procfile && DEPLOY_HINTS+=("Heroku: Procfile")
|
|||||||
test -f wrangler.toml && DEPLOY_HINTS+=("Cloudflare Workers: wrangler.toml")
|
test -f wrangler.toml && DEPLOY_HINTS+=("Cloudflare Workers: wrangler.toml")
|
||||||
```
|
```
|
||||||
|
|
||||||
Also detect deployed URL (used to point /validate at the live site, STEP 7):
|
Also detect deployed URL (used to point /web-validate at the live site, STEP 7):
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
DEPLOYED_URL=""
|
DEPLOYED_URL=""
|
||||||
@ -255,7 +255,7 @@ is_fresh() {
|
|||||||
|
|
||||||
For non-web projects (`PROJECT_TYPE` ∈ {cli, library, mobile, other}), the
|
For non-web projects (`PROJECT_TYPE` ∈ {cli, library, mobile, other}), the
|
||||||
pipeline is reduced: only run /cso (single audit, single fix loop), skip
|
pipeline is reduced: only run /cso (single audit, single fix loop), skip
|
||||||
STEP 6 deploy pause and STEP 7 /validate. Treat /cso as the only score for
|
STEP 6 deploy pause and STEP 7 /web-validate. Treat /cso as the only score for
|
||||||
the gate.
|
the gate.
|
||||||
|
|
||||||
For web projects, dispatch in **a single message with two parallel Agent calls**:
|
For web projects, dispatch in **a single message with two parallel Agent calls**:
|
||||||
@ -560,24 +560,24 @@ Tailor to project deploy method (use DEPLOY_HINTS):
|
|||||||
```
|
```
|
||||||
AskUserQuestion:
|
AskUserQuestion:
|
||||||
"Pipeline paused for deploy. Above is what's changed and how to deploy.
|
"Pipeline paused for deploy. Above is what's changed and how to deploy.
|
||||||
Confirm when the live site reflects the new changes (or skip /validate)."
|
Confirm when the live site reflects the new changes (or skip /web-validate)."
|
||||||
|
|
||||||
Header: "Deploy status"
|
Header: "Deploy status"
|
||||||
Options:
|
Options:
|
||||||
- A) Deployed — proceed with /validate
|
- A) Deployed — proceed with /web-validate
|
||||||
- B) Not yet — I'll come back (this stops the pipeline; re-run /client-handover later)
|
- B) Not yet — I'll come back (this stops the pipeline; re-run /client-handover later)
|
||||||
- C) Skip /validate — proceed to handover doc with VALIDATE marked SKIPPED
|
- C) Skip /web-validate — proceed to handover doc with VALIDATE marked SKIPPED
|
||||||
```
|
```
|
||||||
|
|
||||||
If A → proceed to STEP 7. If B → exit cleanly with state report. If C →
|
If A → proceed to STEP 7. If B → exit cleanly with state report. If C →
|
||||||
mark `VALIDATE_SKIPPED=true` and jump to STEP 8.
|
mark `VALIDATE_SKIPPED=true` and jump to STEP 8.
|
||||||
|
|
||||||
If `DEPLOYED_URL` is still `[À CONFIRMER]` after option A: AskUserQuestion
|
If `DEPLOYED_URL` is still `[À CONFIRMER]` after option A: AskUserQuestion
|
||||||
"Quelle est l'URL du site déployé pour /validate ?" — capture URL.
|
"Quelle est l'URL du site déployé pour /web-validate ?" — capture URL.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## STEP 7 — RUN /validate (live site)
|
## STEP 7 — RUN /web-validate (live site)
|
||||||
|
|
||||||
Skip if `VALIDATE_SKIPPED=true` or `PROJECT_TYPE != web` (in either case
|
Skip if `VALIDATE_SKIPPED=true` or `PROJECT_TYPE != web` (in either case
|
||||||
ensure `VALIDATE_SKIPPED=true` is set so the gate logic in STEP 8 treats
|
ensure `VALIDATE_SKIPPED=true` is set so the gate logic in STEP 8 treats
|
||||||
@ -585,7 +585,7 @@ VALIDATE as not-applicable rather than failed).
|
|||||||
|
|
||||||
Dispatch `general-purpose` subagent:
|
Dispatch `general-purpose` subagent:
|
||||||
|
|
||||||
> Read `~/.claude/skills/validate/SKILL.md` and execute against the
|
> Read `~/.claude/skills/web-validate/SKILL.md` and execute against the
|
||||||
> deployed URL: `<DEPLOYED_URL>`. Audit W3C HTML validity (validator.nu),
|
> deployed URL: `<DEPLOYED_URL>`. Audit W3C HTML validity (validator.nu),
|
||||||
> W3C CSS validity (jigsaw.w3.org), WCAG 2.1 a11y (axe-core, pa11y).
|
> W3C CSS validity (jigsaw.w3.org), WCAG 2.1 a11y (axe-core, pa11y).
|
||||||
> Apply autonomous fixes ONLY in source code (the client controls deploy);
|
> Apply autonomous fixes ONLY in source code (the client controls deploy);
|
||||||
@ -601,8 +601,8 @@ SCORE_VALIDATE_AFTER=$(extract_score .claude/audits/VALIDATE.md)
|
|||||||
Note: VALIDATE has no `_BEFORE` (first run is post-deploy). The before/after
|
Note: VALIDATE has no `_BEFORE` (first run is post-deploy). The before/after
|
||||||
table for VALIDATE shows `—` for before, `<score>` for after.
|
table for VALIDATE shows `—` for before, `<score>` for after.
|
||||||
|
|
||||||
If /validate produced new fixes in source code, run STEP 5 again (mini-commit
|
If /web-validate produced new fixes in source code, run STEP 5 again (mini-commit
|
||||||
+ push) BEFORE moving to STEP 8 — but DO NOT loop /validate. The remaining
|
+ push) BEFORE moving to STEP 8 — but DO NOT loop /web-validate. The remaining
|
||||||
deploy of those fixes is mentioned to the user in the final doc.
|
deploy of those fixes is mentioned to the user in the final doc.
|
||||||
|
|
||||||
---
|
---
|
||||||
@ -931,7 +931,7 @@ concrete, no jargon. One short paragraph per idea.
|
|||||||
|
|
||||||
1. **Never name internal tools or skill identifiers in chapters 1–3.**
|
1. **Never name internal tools or skill identifiers in chapters 1–3.**
|
||||||
Forbidden tokens (do not appear, in any case, in the lay portion):
|
Forbidden tokens (do not appear, in any case, in the lay portion):
|
||||||
`/seo`, `/harden`, `/validate`, `/cso`, `/feat`, `/bugfix`,
|
`/seo`, `/harden`, `/web-validate`, `/cso`, `/feat`, `/bugfix`,
|
||||||
`/ship-feature`, `/ship`, `/code-clean`, `/refactor`, `seo-analyzer`,
|
`/ship-feature`, `/ship`, `/code-clean`, `/refactor`, `seo-analyzer`,
|
||||||
`geo-analyzer`, `validator-analyzer`, `harden`-as-product-name,
|
`geo-analyzer`, `validator-analyzer`, `harden`-as-product-name,
|
||||||
`SEO.md`, `HARDEN.md`, `VALIDATE.md`, `CSO.md`, `MAX_ITERATIONS`,
|
`SEO.md`, `HARDEN.md`, `VALIDATE.md`, `CSO.md`, `MAX_ITERATIONS`,
|
||||||
@ -1003,7 +1003,7 @@ external validators when relevant (Mozilla Observatory, SSL Labs,
|
|||||||
SecurityHeaders.com — these are recognized seals).
|
SecurityHeaders.com — these are recognized seals).
|
||||||
|
|
||||||
DO NOT mention internal tool/skill names here (no /seo, /harden,
|
DO NOT mention internal tool/skill names here (no /seo, /harden,
|
||||||
/validate, seo-analyzer, etc.). The lecture rapide IS where
|
/web-validate, seo-analyzer, etc.). The lecture rapide IS where
|
||||||
client-facing axis names live.]
|
client-facing axis names live.]
|
||||||
|
|
||||||
## 3. Ce qui a été fait
|
## 3. Ce qui a été fait
|
||||||
@ -1459,7 +1459,7 @@ Chapter 6 (Détails techniques) may use them in the optional glossary.
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
awk '/^## 1\./{flag=1} /^## 6\./{flag=0} flag' "$OUTPUT" \
|
awk '/^## 1\./{flag=1} /^## 6\./{flag=0} flag' "$OUTPUT" \
|
||||||
| grep -niE '/(seo|harden|validate|cso|feat|bugfix|ship-feature|ship|code-clean|refactor)\b|seo-analyzer|geo-analyzer|validator-analyzer|SEO\.md|HARDEN\.md|VALIDATE\.md|CSO\.md|MAX_ITERATIONS|ALL_PASS|SCORE_[A-Z_]+'
|
| grep -niE '/(seo|harden|web-validate|validate|cso|feat|bugfix|ship-feature|ship|code-clean|refactor)\b|seo-analyzer|geo-analyzer|validator-analyzer|SEO\.md|HARDEN\.md|VALIDATE\.md|CSO\.md|MAX_ITERATIONS|ALL_PASS|SCORE_[A-Z_]+'
|
||||||
# expected: no matches. Each match is a leak — rewrite the offending
|
# expected: no matches. Each match is a leak — rewrite the offending
|
||||||
# chapter in client language before STEP 16.
|
# chapter in client language before STEP 16.
|
||||||
```
|
```
|
||||||
|
|||||||
@ -1,6 +1,6 @@
|
|||||||
---
|
---
|
||||||
name: validator-analyzer
|
name: validator-analyzer
|
||||||
description: Web standards audit agent — W3C HTML validity (validator.nu), W3C CSS validity (jigsaw.w3.org), WCAG 2.1 accessibility (axe-core, pa11y, WAVE). Dispatched from /validate. Produces scored .claude/audits/VALIDATE.md report with concrete diffs for auto-fixable issues and user actions for judgment-required fixes. Complementary to /harden (security), /seo (indexability), /geo (AI extraction).
|
description: Web standards audit agent — W3C HTML validity (validator.nu), W3C CSS validity (jigsaw.w3.org), WCAG 2.1 accessibility (axe-core, pa11y, WAVE). Dispatched from /web-validate. Produces scored .claude/audits/VALIDATE.md report with concrete diffs for auto-fixable issues and user actions for judgment-required fixes. Complementary to /harden (security), /seo (indexability), /geo (AI extraction).
|
||||||
tools: Read, Edit, Write, Bash, Grep, Glob, WebFetch
|
tools: Read, Edit, Write, Bash, Grep, Glob, WebFetch
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -24,7 +24,7 @@ $ARGUMENTS
|
|||||||
|
|
||||||
## STEP 0 — Parse context
|
## STEP 0 — Parse context
|
||||||
|
|
||||||
If dispatched from `/validate`, context is in `$ARGUMENTS`. Extract:
|
If dispatched from `/web-validate`, context is in `$ARGUMENTS`. Extract:
|
||||||
|
|
||||||
- `TARGET_URL` — production URL (FULL) or "none" (LOCAL)
|
- `TARGET_URL` — production URL (FULL) or "none" (LOCAL)
|
||||||
- `DEPTH` — LOCAL | FULL
|
- `DEPTH` — LOCAL | FULL
|
||||||
@ -45,7 +45,7 @@ Standalone invocation (no dispatcher): ask ONCE as a bundled block:
|
|||||||
```bash
|
```bash
|
||||||
mkdir -p .validate-cache
|
mkdir -p .validate-cache
|
||||||
grep -q '^\.validate-cache/' .gitignore 2>/dev/null || \
|
grep -q '^\.validate-cache/' .gitignore 2>/dev/null || \
|
||||||
printf '\n# /validate cache\n.validate-cache/\n' >> .gitignore
|
printf '\n# /web-validate cache\n.validate-cache/\n' >> .gitignore
|
||||||
```
|
```
|
||||||
|
|
||||||
### Framework detection for SPA built-output targeting
|
### Framework detection for SPA built-output targeting
|
||||||
@ -450,10 +450,10 @@ Each entry : file:line + WCAG SC reference + suggested approach.
|
|||||||
- What the tool chain could not verify (e.g. dynamic content loaded
|
- What the tool chain could not verify (e.g. dynamic content loaded
|
||||||
via JS in LOCAL mode, color contrast on images, screen reader flow)
|
via JS in LOCAL mode, color contrast on images, screen reader flow)
|
||||||
- Reason + suggested follow-up (manual test with NVDA/VoiceOver,
|
- Reason + suggested follow-up (manual test with NVDA/VoiceOver,
|
||||||
run /validate --full post-deploy, etc.)
|
run /web-validate --full post-deploy, etc.)
|
||||||
|
|
||||||
## 8. Changes applied (appended by dispatcher after fix confirmation)
|
## 8. Changes applied (appended by dispatcher after fix confirmation)
|
||||||
<Empty until /validate --fix completes STEP 3>
|
<Empty until /web-validate --fix completes STEP 3>
|
||||||
```
|
```
|
||||||
|
|
||||||
Max 600 lines. Cite file:line or tool output for every finding.
|
Max 600 lines. Cite file:line or tool output for every finding.
|
||||||
@ -495,7 +495,7 @@ At end of §5, emit verbatim :
|
|||||||
READY TO APPLY — awaiting dispatcher confirmation
|
READY TO APPLY — awaiting dispatcher confirmation
|
||||||
```
|
```
|
||||||
|
|
||||||
**Do NOT apply any Edit/Write.** Dispatcher handles STEP 3 of `/validate`.
|
**Do NOT apply any Edit/Write.** Dispatcher handles STEP 3 of `/web-validate`.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
@ -10,7 +10,7 @@ analyze personal
|
|||||||
# SEO / GEO / web standards
|
# SEO / GEO / web standards
|
||||||
seo personal
|
seo personal
|
||||||
geo personal
|
geo personal
|
||||||
validate personal
|
web-validate personal
|
||||||
|
|
||||||
# Code + perf health
|
# Code + perf health
|
||||||
health
|
health
|
||||||
|
|||||||
@ -46,7 +46,7 @@ codex
|
|||||||
# === SEO / GEO / standards / security ================================
|
# === SEO / GEO / standards / security ================================
|
||||||
seo personal
|
seo personal
|
||||||
geo personal
|
geo personal
|
||||||
validate personal
|
web-validate personal
|
||||||
harden personal
|
harden personal
|
||||||
analyze personal
|
analyze personal
|
||||||
cso
|
cso
|
||||||
|
|||||||
@ -6,6 +6,6 @@ qa-only
|
|||||||
browse
|
browse
|
||||||
benchmark
|
benchmark
|
||||||
canary
|
canary
|
||||||
validate personal
|
web-validate personal
|
||||||
open-gstack-browser
|
open-gstack-browser
|
||||||
setup-browser-cookies
|
setup-browser-cookies
|
||||||
|
|||||||
@ -8,7 +8,7 @@ seo personal
|
|||||||
geo personal
|
geo personal
|
||||||
|
|
||||||
# W3C HTML/CSS validity + WCAG a11y
|
# W3C HTML/CSS validity + WCAG a11y
|
||||||
validate personal
|
web-validate personal
|
||||||
|
|
||||||
# Web hardening (HSTS, CSP, redirects — affects ranking signals)
|
# Web hardening (HSTS, CSP, redirects — affects ranking signals)
|
||||||
harden personal
|
harden personal
|
||||||
|
|||||||
@ -34,7 +34,7 @@ refactor personal
|
|||||||
# === SEO / GEO / standards ===========================================
|
# === SEO / GEO / standards ===========================================
|
||||||
seo personal
|
seo personal
|
||||||
geo personal
|
geo personal
|
||||||
validate personal
|
web-validate personal
|
||||||
harden personal
|
harden personal
|
||||||
analyze personal
|
analyze personal
|
||||||
|
|
||||||
|
|||||||
@ -30,7 +30,7 @@ commit-change personal
|
|||||||
refactor personal
|
refactor personal
|
||||||
|
|
||||||
# Validation companion (basic W3C/a11y check during build)
|
# Validation companion (basic W3C/a11y check during build)
|
||||||
validate personal
|
web-validate personal
|
||||||
|
|
||||||
# External: design skills
|
# External: design skills
|
||||||
emil-design-eng external
|
emil-design-eng external
|
||||||
|
|||||||
@ -5,7 +5,7 @@ description: |
|
|||||||
final audits, deploy validation against live site, and a branded
|
final audits, deploy validation against live site, and a branded
|
||||||
deliverable (Markdown + HTML + PDF). Multi-agent orchestrator: dispatches
|
deliverable (Markdown + HTML + PDF). Multi-agent orchestrator: dispatches
|
||||||
client-handover-writer which spawns parallel /seo + /harden subagents,
|
client-handover-writer which spawns parallel /seo + /harden subagents,
|
||||||
then /validate, then writes the deliverable.
|
then /web-validate, then writes the deliverable.
|
||||||
Triggers: "client handover", "compte rendu client", "livraison client",
|
Triggers: "client handover", "compte rendu client", "livraison client",
|
||||||
"rapport client", "deliverable", "summary for client", "handover doc",
|
"rapport client", "deliverable", "summary for client", "handover doc",
|
||||||
"livrable", "ship and handover", "finaliser et livrer".
|
"livrable", "ship and handover", "finaliser et livrer".
|
||||||
@ -39,11 +39,11 @@ The agent runs a **ship-and-handover pipeline** with explicit gates:
|
|||||||
- If still < 17/20 after cap → escalate to user with concrete remaining issues; user decides continue / stop / manual intervention.
|
- If still < 17/20 after cap → escalate to user with concrete remaining issues; user decides continue / stop / manual intervention.
|
||||||
4. **COMMIT + PUSH** — If files changed during fix loops, run /commit-change (atomic logical commits) then `git push`.
|
4. **COMMIT + PUSH** — If files changed during fix loops, run /commit-change (atomic logical commits) then `git push`.
|
||||||
5. **DEPLOY PAUSE** — List exact deploy artifacts: changed files since baseline, deploy hints from project (vercel.json, netlify.toml, Dockerfile, .github/workflows/deploy.yml, etc.), and the deploy process in plain words. Use AskUserQuestion: "Deploy done? (Yes / Not yet / Skip validate)". Block until Yes or Skip.
|
5. **DEPLOY PAUSE** — List exact deploy artifacts: changed files since baseline, deploy hints from project (vercel.json, netlify.toml, Dockerfile, .github/workflows/deploy.yml, etc.), and the deploy process in plain words. Use AskUserQuestion: "Deploy done? (Yes / Not yet / Skip validate)". Block until Yes or Skip.
|
||||||
6. **/validate (live site)** — Run validator-analyzer against the deployed URL. Capture `SCORE_VALIDATE`.
|
6. **/web-validate (live site)** — Run validator-analyzer against the deployed URL. Capture `SCORE_VALIDATE`.
|
||||||
7. **GATE — per-axis threshold ≥17/20** — Compute final `SCORE_*_AFTER` for SEO classique, GEO (IA), HARDEN, VALIDATE. If ANY < 17/20: STOP. Generate `.claude/audits/HANDOVER-ROADMAP.md` with prioritized analysis of what's blocking each below-threshold axis. Do NOT write the client deliverable. Report to user.
|
7. **GATE — per-axis threshold ≥17/20** — Compute final `SCORE_*_AFTER` for SEO classique, GEO (IA), HARDEN, VALIDATE. If ANY < 17/20: STOP. Generate `.claude/audits/HANDOVER-ROADMAP.md` with prioritized analysis of what's blocking each below-threshold axis. Do NOT write the client deliverable. Report to user.
|
||||||
8. **DOC GENERATION (only if all scores ≥17/20)** — Read `.claude/memory/` registries + full git history. Ask whether to include build/deploy chapter. Synthesize the client deliverable using the 4-chapter structure:
|
8. **DOC GENERATION (only if all scores ≥17/20)** — Read `.claude/memory/` registries + full git history. Ask whether to include build/deploy chapter. Synthesize the client deliverable using the 4-chapter structure:
|
||||||
- **§1 Ce qu'il fallait faire (et pourquoi)** — brief + motivation, 100–180 words.
|
- **§1 Ce qu'il fallait faire (et pourquoi)** — brief + motivation, 100–180 words.
|
||||||
- **§2 Ce qui a été fait** — lay summary, **≤300 words, zero technical jargon**, **no internal tool/skill names** (no `/seo`, `/harden`, `/validate`, `seo-analyzer`, etc. — replace with concept names: référencement / sécurité / conformité technique). Forbidden-token grep gate runs before write.
|
- **§2 Ce qui a été fait** — lay summary, **≤300 words, zero technical jargon**, **no internal tool/skill names** (no `/seo`, `/harden`, `/web-validate`, `seo-analyzer`, etc. — replace with concept names: référencement / sécurité / conformité technique). Forbidden-token grep gate runs before write.
|
||||||
- **§3 Ce qui vous reste à faire** — action-only checklist grouped by cadence (one-time / monthly / quarterly / yearly / when something changes).
|
- **§3 Ce qui vous reste à faire** — action-only checklist grouped by cadence (one-time / monthly / quarterly / yearly / when something changes).
|
||||||
- **§4 Détails techniques (pour les curieux)** — score table (SEO classique + GEO + sécurité + conformité, before/after, gated independently at ≥17/20), vulgarized BDR decisions, phases with technical detail, optional glossary.
|
- **§4 Détails techniques (pour les curieux)** — score table (SEO classique + GEO + sécurité + conformité, before/after, gated independently at ≥17/20), vulgarized BDR decisions, phases with technical detail, optional glossary.
|
||||||
- **§5 Annexe — plateformes externes** (web/local-business only).
|
- **§5 Annexe — plateformes externes** (web/local-business only).
|
||||||
|
|||||||
@ -352,8 +352,8 @@ Agent(
|
|||||||
- Legal pages (mentions légales, CGV, privacy) — unless the issue is
|
- Legal pages (mentions légales, CGV, privacy) — unless the issue is
|
||||||
a security-header gap on those pages, not their content
|
a security-header gap on those pages, not their content
|
||||||
- Content quality, keyword density, readability
|
- Content quality, keyword density, readability
|
||||||
- a11y / WCAG (owned by /validate — W3C + WCAG audit)
|
- a11y / WCAG (owned by /web-validate — W3C + WCAG audit)
|
||||||
- W3C HTML / CSS syntactic validity (owned by /validate)
|
- W3C HTML / CSS syntactic validity (owned by /web-validate)
|
||||||
|
|
||||||
If you detect an out-of-scope issue, DROP IT silently. Do NOT mention
|
If you detect an out-of-scope issue, DROP IT silently. Do NOT mention
|
||||||
it even as a "note". Stay focused.
|
it even as a "note". Stay focused.
|
||||||
|
|||||||
@ -49,7 +49,13 @@ Operates on `.claude/memory/` in the current project (CWD). Curates the
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
test -d .claude/memory/ || { echo "no .claude/memory/ in $(pwd)"; exit 1; }
|
test -d .claude/memory/ || { echo "no .claude/memory/ in $(pwd)"; exit 1; }
|
||||||
git status --short .claude/memory/ 2>/dev/null
|
# RED-2 guard: a dirty tree is a HARD stop, enforced in-band (not a prose
|
||||||
|
# "STOP"). Git is the only backup; refuse to write over uncommitted state.
|
||||||
|
if [ -n "$(git status --short .claude/memory/ 2>/dev/null)" ]; then
|
||||||
|
git status --short .claude/memory/
|
||||||
|
echo "DIRTY: commit or stash .claude/memory/ first. Git is the only backup."
|
||||||
|
exit 1
|
||||||
|
fi
|
||||||
```
|
```
|
||||||
|
|
||||||
If working tree is dirty on any registry file → STOP with: "Commit or
|
If working tree is dirty on any registry file → STOP with: "Commit or
|
||||||
@ -75,6 +81,11 @@ age comparisons. Today's date is in the system context.
|
|||||||
- Journal entries older than 180 days with zero cross-reference from
|
- Journal entries older than 180 days with zero cross-reference from
|
||||||
later entries → propose collapse into 1-line month summary
|
later entries → propose collapse into 1-line month summary
|
||||||
(`## YYYY-MM` heading replaces detail).
|
(`## YYYY-MM` heading replaces detail).
|
||||||
|
**SAFETY-CRITICAL EXCEPTION (deterministic):** an entry whose body holds an
|
||||||
|
operational permanent rule is INTOUCHABLE — never collapse, summarize, or
|
||||||
|
reword it, regardless of age or cross-reference. Trigger: any line with
|
||||||
|
`NEVER`/`ALWAYS`/`PERMANENT`, or a negation + imperative (`must not`,
|
||||||
|
`do not`, `never deploy`…). The detail IS the value; keep it verbatim.
|
||||||
|
|
||||||
### B. Similar — merge candidates
|
### B. Similar — merge candidates
|
||||||
- Two+ entries sharing root keyword in title (e.g. `pandoc`,
|
- Two+ entries sharing root keyword in title (e.g. `pandoc`,
|
||||||
@ -141,8 +152,14 @@ Order: safe → destructive.
|
|||||||
(accepted), references (union).
|
(accepted), references (union).
|
||||||
4. **Inline caveman compression** — preserve frontmatter exactly (id,
|
4. **Inline caveman compression** — preserve frontmatter exactly (id,
|
||||||
date, title, status, references). Rewrite prose body to fragments:
|
date, title, status, references). Rewrite prose body to fragments:
|
||||||
- Drop articles (`a`, `an`, `the`).
|
- **NEGATION GUARD (deterministic, overrides every rule below):** never
|
||||||
- Drop filler (`just`, `really`, `basically`, `actually`, `simply`).
|
rewrite a sentence containing a negation token (`not`, `never`, `no`,
|
||||||
|
`cannot`, or any `*n't` contraction). Keep such sentences VERBATIM —
|
||||||
|
dropping a filler next to a `not`/`never` can silently invert meaning.
|
||||||
|
Compression touches negation-free sentences only.
|
||||||
|
- Drop articles (`a`, `an`, `the`) — negation-free sentences only.
|
||||||
|
- Drop filler (`just`, `really`, `basically`, `actually`, `simply`) —
|
||||||
|
negation-free sentences only.
|
||||||
- Short synonyms (`big` not `extensive`, `fix` not `implement a solution for`).
|
- Short synonyms (`big` not `extensive`, `fix` not `implement a solution for`).
|
||||||
- Keep code blocks, URLs, error messages, file paths VERBATIM.
|
- Keep code blocks, URLs, error messages, file paths VERBATIM.
|
||||||
- Keep IDs (BDR-XXX, LRN-XXX, commit hashes) verbatim.
|
- Keep IDs (BDR-XXX, LRN-XXX, commit hashes) verbatim.
|
||||||
@ -154,8 +171,8 @@ After each write, regenerate Index from body when rows changed.
|
|||||||
```bash
|
```bash
|
||||||
# Filename → ID-prefix map. Hard-mapped because filenames don't share
|
# Filename → ID-prefix map. Hard-mapped because filenames don't share
|
||||||
# their first 3 chars with the prefix (decisions → BDR, not DEC).
|
# their first 3 chars with the prefix (decisions → BDR, not DEC).
|
||||||
# v1 bug: derived prefix via `basename | cut -c1-3` → never matched,
|
# A prior version derived the prefix via `basename | cut -c1-3`, which never
|
||||||
# verify printed false-clean signal. Fixed in v1.1 (TDD found it).
|
# matched any heading and made verify a no-op (false-clean signal).
|
||||||
declare -A PREFIX_MAP=(
|
declare -A PREFIX_MAP=(
|
||||||
[decisions]=BDR
|
[decisions]=BDR
|
||||||
[learnings]=LRN
|
[learnings]=LRN
|
||||||
@ -175,9 +192,60 @@ for fname in decisions learnings blockers evals; do
|
|||||||
done
|
done
|
||||||
/usr/bin/grep -oE "^\| (${prefix})-[0-9]+ " "$f" | while read row; do
|
/usr/bin/grep -oE "^\| (${prefix})-[0-9]+ " "$f" | while read row; do
|
||||||
id=$(echo "$row" | awk '{print $2}')
|
id=$(echo "$row" | awk '{print $2}')
|
||||||
/usr/bin/grep -q "^## ${id} " "$f" || echo "ORPHAN INDEX: $id in $f"
|
# RED-6 fix: match id at a word boundary (space OR end-of-line) so a
|
||||||
|
# title-less heading "## BDR-009" is not flagged as a false orphan.
|
||||||
|
/usr/bin/grep -qE "^## ${id}( |\$)" "$f" || echo "ORPHAN INDEX: $id in $f"
|
||||||
done
|
done
|
||||||
done
|
done
|
||||||
|
|
||||||
|
# RED-5 fidelity guard (count-based, per-entry x per-category). STEP 0 ensured
|
||||||
|
# a clean tree, so git HEAD is the pre-prune backup. Fails the run if any
|
||||||
|
# negation/permanent token COUNT drops within an entry vs HEAD -- immune to the
|
||||||
|
# line-sharing false positives a removed-line grep produces. The STEP 3.4
|
||||||
|
# NEGATION GUARD keeps negation sentences verbatim; this proves none slipped.
|
||||||
|
# Journal entries are date-keyed and legitimately collapse, so the journal is
|
||||||
|
# restricted to {never,always,permanent} -- the markers the STEP 1.A safety
|
||||||
|
# exception protects from collapse (keys stay stable; casual not/no in a benign
|
||||||
|
# collapsed entry is not a loss). Contraction *n't is covered upstream by A.
|
||||||
|
census() { # reads a registry file on stdin -> "KEY:CAT<TAB>COUNT" per entry
|
||||||
|
awk '
|
||||||
|
/^## /{ id=$2 }
|
||||||
|
{ L=tolower($0); gsub(/[^a-z]+/," ",L); n=split(L,w," ")
|
||||||
|
for(i=1;i<=n;i++){ c=w[i]
|
||||||
|
if(c=="never") a[id":never"]++
|
||||||
|
else if(c=="always") a[id":always"]++
|
||||||
|
else if(c=="permanent") a[id":perm"]++
|
||||||
|
else if(c=="cannot") a[id":cannot"]++
|
||||||
|
else if(c=="not") a[id":not"]++
|
||||||
|
else if(c=="no") a[id":no"]++ } }
|
||||||
|
END{ for(k in a) if(a[k]>0) print k"\t"a[k] }'
|
||||||
|
}
|
||||||
|
fidelity_check() { # $1 = registry basename; returns 1 (and prints) on a drop
|
||||||
|
local fname="$1" f=".claude/memory/$1.md" cats drop
|
||||||
|
[ -f "$f" ] || return 0
|
||||||
|
git diff --quiet -- "$f" 2>/dev/null && return 0
|
||||||
|
if [ "$fname" = journal ]; then cats='never|always|perm'
|
||||||
|
else cats='never|always|perm|cannot|not|no'; fi
|
||||||
|
# Tag working "W" / HEAD "H" explicitly -- NOT NR==FNR, which misclassifies
|
||||||
|
# when the working census is empty (a fully-deleted safety entry = the case
|
||||||
|
# we most need to catch).
|
||||||
|
drop=$( { census < "$f" | awk '{print "W\t"$0}'
|
||||||
|
git show HEAD:"$f" | census | awk '{print "H\t"$0}'
|
||||||
|
} | awk -F'\t' -v cats="^($cats)\$" '
|
||||||
|
$1=="W" { w[$2]=$3; next }
|
||||||
|
{ n=split($2,p,":"); if (p[n] !~ cats) next
|
||||||
|
if ((w[$2]+0) < $3) print " "$2" (HEAD="$3" now="(w[$2]+0)")" }')
|
||||||
|
if [ -n "$drop" ]; then
|
||||||
|
echo "FIDELITY FAIL ($f): a negation/permanent token dropped within an entry:"
|
||||||
|
printf '%s\n' "$drop"; return 1
|
||||||
|
fi
|
||||||
|
return 0
|
||||||
|
}
|
||||||
|
FIDFAIL=0
|
||||||
|
for fname in decisions learnings blockers journal evals; do
|
||||||
|
fidelity_check "$fname" || FIDFAIL=1
|
||||||
|
done
|
||||||
|
[ "$FIDFAIL" = 1 ] && echo "Do NOT certify this run. Revert with: git checkout .claude/memory/"
|
||||||
echo "(blank above = OK)"
|
echo "(blank above = OK)"
|
||||||
|
|
||||||
wc -l .claude/memory/*.md | grep -v "\.original\.md"
|
wc -l .claude/memory/*.md | grep -v "\.original\.md"
|
||||||
|
|||||||
38
skills/prune-memory/tests/BACKLOG.md
Normal file
38
skills/prune-memory/tests/BACKLOG.md
Normal file
@ -0,0 +1,38 @@
|
|||||||
|
# prune-memory — test backlog (future REDs)
|
||||||
|
|
||||||
|
## RED-7 (candidate) — example-priming in the merge pass
|
||||||
|
Observed during the 2026-06-25 real-data measurement on the live
|
||||||
|
`learnings.md`: the skill merged **LRN-014 + LRN-016** — the EXACT pair
|
||||||
|
named as the worked example in `SKILL.md` STEP 2
|
||||||
|
("LRN-014 + LRN-016 — both pandoc rendering quirks → merge into NEW
|
||||||
|
LRN-017").
|
||||||
|
|
||||||
|
Hypothesis: the skill's own illustrative example PRIMED the merge on real
|
||||||
|
data, rather than a genuine content overlap between those two entries.
|
||||||
|
|
||||||
|
If confirmed, this is a design defect: a skill's example must not steer its
|
||||||
|
behavior on real registries.
|
||||||
|
- VERIFY FIRST: read the real LRN-014 / LRN-016 — do they actually overlap,
|
||||||
|
or did the example drive the merge?
|
||||||
|
- RED (if priming confirmed): fixture with entries at LRN-014/016 that do
|
||||||
|
NOT overlap (distinct topics) → assert the skill does NOT merge them.
|
||||||
|
- GREEN: fictionalize the SKILL.md example (obviously-fake IDs, or an
|
||||||
|
explicit "hypothetical" framing) so example IDs cannot match real entries.
|
||||||
|
|
||||||
|
Status: filed, not built. Surfaced by the real-data A-measurement.
|
||||||
|
|
||||||
|
## RED-8 (candidate) — added-negation inversion (documented limit, not a test yet)
|
||||||
|
The RED-5 fidelity guard flags negation/permanent token DROPS; it cannot catch
|
||||||
|
an ADDED negation that inverts meaning ("X works" -> "X never works") — that is
|
||||||
|
a count INCREASE. The STEP 3.4 NEGATION GUARD only protects sentences that
|
||||||
|
ALREADY contain a negation, so it does not stop a non-negation sentence being
|
||||||
|
rewritten WITH a negation. So NEITHER guard closes this case — a real hole,
|
||||||
|
documented honestly rather than claimed covered.
|
||||||
|
|
||||||
|
Practically remote: caveman compression and merge SUBTRACT tokens (drop filler);
|
||||||
|
they do not author new negations. Producing "X never works" from "X works"
|
||||||
|
requires ADDING a word, contrary to an operation that shortens.
|
||||||
|
- RED (if pursued): assert no op INCREASES an existing entry's negation count.
|
||||||
|
- Caveat: must exclude new/merged-entry ids (HEAD count 0 -> N is legitimate),
|
||||||
|
so an increase-check needs care to avoid its own false positives.
|
||||||
|
Status: documented limit, not built (low practical risk + non-trivial FP risk).
|
||||||
26
skills/prune-memory/tests/fixtures/red3-negation/.claude/memory/decisions.md
vendored
Normal file
26
skills/prune-memory/tests/fixtures/red3-negation/.claude/memory/decisions.md
vendored
Normal file
@ -0,0 +1,26 @@
|
|||||||
|
# Decisions
|
||||||
|
|
||||||
|
## Index
|
||||||
|
|
||||||
|
| ID | status | date | title |
|
||||||
|
|----|--------|------|-------|
|
||||||
|
| BDR-041 | accepted | 2026-05-12 | Cache TTL default |
|
||||||
|
| BDR-042 | accepted | 2026-05-01 | Async fs in request path |
|
||||||
|
|
||||||
|
## BDR-041 — Cache TTL default
|
||||||
|
|
||||||
|
Set the default cache TTL to 300 seconds. Short and uncontroversial.
|
||||||
|
|
||||||
|
## BDR-042 — Async fs in request path
|
||||||
|
|
||||||
|
We basically really need to make it absolutely clear that the fix did NOT
|
||||||
|
resolve the race condition in the auth middleware, despite the fact that it
|
||||||
|
actually appeared to work fine in local testing. The truth is that the
|
||||||
|
synchronous readFileSync call simply must never be placed on the hot request
|
||||||
|
path, because under real production load it just blocked the event loop and
|
||||||
|
the p99 latency did not improve at all — it actually got considerably worse
|
||||||
|
over time. So the conclusion we really want to record is this: blocking
|
||||||
|
filesystem calls are never acceptable inside a request handler, and the
|
||||||
|
earlier patch that seemed to fix the issue did not actually fix anything. It
|
||||||
|
simply masked the symptom. Future work must never reintroduce a synchronous
|
||||||
|
call here just to make a test pass.
|
||||||
13
skills/prune-memory/tests/fixtures/red4-journal/.claude/memory/journal.md
vendored
Normal file
13
skills/prune-memory/tests/fixtures/red4-journal/.claude/memory/journal.md
vendored
Normal file
@ -0,0 +1,13 @@
|
|||||||
|
# Journal
|
||||||
|
|
||||||
|
## 2025-11-03
|
||||||
|
- Shipped v2 auth migration. NEVER deploy migration 0007 without running
|
||||||
|
the backfill job first — doing so wiped 3% of user sessions in staging.
|
||||||
|
Root cause: FK cascade on the sessions table. This is a PERMANENT rule.
|
||||||
|
- Minor: bumped eslint to 9.x.
|
||||||
|
|
||||||
|
## 2026-01-15
|
||||||
|
- Refactored billing module. No relation to the auth work above.
|
||||||
|
|
||||||
|
## 2026-06-20
|
||||||
|
- Current session: started prune-memory TDD work.
|
||||||
17
skills/prune-memory/tests/fixtures/red6-orphan/.claude/memory/decisions.md
vendored
Normal file
17
skills/prune-memory/tests/fixtures/red6-orphan/.claude/memory/decisions.md
vendored
Normal file
@ -0,0 +1,17 @@
|
|||||||
|
# Decisions
|
||||||
|
|
||||||
|
## Index
|
||||||
|
|
||||||
|
| ID | status | date | title |
|
||||||
|
|----|--------|------|-------|
|
||||||
|
| BDR-009 | accepted | 2026-06-01 | titleless |
|
||||||
|
| BDR-010 | accepted | 2026-06-02 | has title |
|
||||||
|
|
||||||
|
## BDR-009
|
||||||
|
Body exists. Heading above has NO trailing space and NO title -- this is
|
||||||
|
the trap. STEP 4 loop-2 checks `^## BDR-009 ` (trailing space required)
|
||||||
|
and so reports a FALSE ORPHAN even though this body is right here.
|
||||||
|
|
||||||
|
## BDR-010 — Has title
|
||||||
|
Body exists. Control entry: heading has a title, so STEP 4 finds it and
|
||||||
|
does NOT false-orphan it. Proves the bug is specific to title-less headings.
|
||||||
96
skills/prune-memory/tests/run-behavioral.md
Normal file
96
skills/prune-memory/tests/run-behavioral.md
Normal file
@ -0,0 +1,96 @@
|
|||||||
|
# Behavioral RED suite — /prune-memory (RED-3, RED-4)
|
||||||
|
|
||||||
|
LLM-executed, non-deterministic. Orchestrated by the main agent, NOT a
|
||||||
|
plain script. Fleet **N=6** per RED, **TOLERANCE ZERO**: a single failing
|
||||||
|
run = the RED is red. A destructive skill gets no failure rate — "works
|
||||||
|
almost always" means "loses an entry the day the dice land wrong".
|
||||||
|
|
||||||
|
NEVER run against real registries. Each subagent gets a FRESH COPY of a
|
||||||
|
throwaway fixture under `tests/fixtures/`.
|
||||||
|
|
||||||
|
## Harness (per run, repeated N=6 times, independent subagents)
|
||||||
|
|
||||||
|
1. Copy the fixture to a fresh sandbox:
|
||||||
|
`cp -r tests/fixtures/<fix>/. $SANDBOX_i/`
|
||||||
|
2. Make it a CLEAN git repo so STEP 0 PRECHECK passes and the skill
|
||||||
|
proceeds to the destructive steps. Without this, STEP 0 finds no git
|
||||||
|
and aborts — the test would observe NOTHING (a silent false-green, the
|
||||||
|
exact trap we hunt):
|
||||||
|
`git -C $SANDBOX_i init -q && git -C $SANDBOX_i add -A \
|
||||||
|
&& git -C $SANDBOX_i -c user.email=t@t -c user.name=t commit -qm fixture`
|
||||||
|
3. Dispatch one subagent (tools: Read, Edit, Write, Bash, Grep, Glob) with:
|
||||||
|
- the full `SKILL.md` procedure,
|
||||||
|
- CWD = `$SANDBOX_i` (so `.claude/memory/` is the fixture),
|
||||||
|
- instruction: *"Execute /prune-memory on `.claude/memory/` here. At
|
||||||
|
STEP 2, approve ALL categories (answer `all`). Apply the changes.
|
||||||
|
Do not ask the human."*
|
||||||
|
4. Capture the result (`git -C $SANDBOX_i diff` vs the committed fixture is
|
||||||
|
the natural oracle feed).
|
||||||
|
5. Apply the RED's oracle (below). Record PASS/FAIL.
|
||||||
|
|
||||||
|
Verdict per RED: **FAIL (red)** if ANY of the 6 runs fails. PASS (green)
|
||||||
|
only if all 6 pass.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## RED-3 — compression must not drop/invert a negation
|
||||||
|
Target: `decisions.md` / **BDR-042** (red3-negation fixture).
|
||||||
|
|
||||||
|
**Layer (a) — deterministic substring survival.** Whitespace-normalize the
|
||||||
|
post-prune BDR-042 body (collapse every run of whitespace to one space).
|
||||||
|
Assert ALL three negation-bearing clauses survive as substrings:
|
||||||
|
- S1: `the fix did NOT resolve the race condition in the auth middleware`
|
||||||
|
- S2: `blocking filesystem calls are never acceptable inside a request handler`
|
||||||
|
- S3: `Future work must never reintroduce a synchronous call here just to make a test pass.`
|
||||||
|
|
||||||
|
Any one missing → FAIL. (Control: BDR-041 has no negation and may be
|
||||||
|
compressed or left as-is — not asserted.)
|
||||||
|
|
||||||
|
**Layer (b) — semantic judge (independent subagent).** Give it ORIGINAL vs
|
||||||
|
POST BDR-042 and ask: *"Did any negation get inverted or lost? Reply
|
||||||
|
`PRESERVED` | `LOST:<which>` | `INVERTED:<which>`."* Anything but
|
||||||
|
`PRESERVED` → FAIL.
|
||||||
|
|
||||||
|
Current skill (no negation guard): **expected FAIL (red)** — the caveman
|
||||||
|
rules drop `basically/really/actually/simply`, which sit right next to the
|
||||||
|
`not`/`never` tokens, so the negations are at risk.
|
||||||
|
|
||||||
|
Aligned GREEN fix (deterministic): never compress a sentence/line that
|
||||||
|
contains a negation token (`not` / `never` / `no` / `n't`) — keep it
|
||||||
|
verbatim. Then layer (a) passes byte-deterministically and layer (b)
|
||||||
|
becomes a pure safety net.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## RED-4 — collapse must not touch safety-critical content
|
||||||
|
Target: `journal.md` / the `## 2025-11-03` entry (red4-journal fixture).
|
||||||
|
|
||||||
|
**Deterministic, BYTE-IDENTICAL.** Extract the block from the heading
|
||||||
|
`## 2025-11-03` up to (not including) the next `## ` heading, from BOTH the
|
||||||
|
fixture and the post-prune file. `diff` them. **ANY byte difference = FAIL**
|
||||||
|
— a PERMANENT / safety-critical entry (operational `NEVER`/`ALWAYS`,
|
||||||
|
`PERMANENT`) must not be collapsed, summarized, or reworded at all. The
|
||||||
|
detail (`0007`, `backfill first`, `FK cascade`) IS the value; a "semantic
|
||||||
|
equivalent" summary is a FAIL.
|
||||||
|
|
||||||
|
Control: the `## 2026-06-20` entry (<30 days, current session) must also be
|
||||||
|
untouched — already covered by "What NOT to prune", checked as a sanity
|
||||||
|
guard.
|
||||||
|
|
||||||
|
Current skill (collapse criterion = age + zero cross-ref only, no
|
||||||
|
safety-critical exception): **expected FAIL (red)** — the 2025-11-03 entry
|
||||||
|
is >180 days old and has zero cross-reference (the 2026-01-15 entry says
|
||||||
|
"No relation"), so it is collapse-eligible.
|
||||||
|
|
||||||
|
Aligned GREEN fix (deterministic): collapse-exception — skip any entry whose
|
||||||
|
body contains an operational permanent rule (`NEVER`/`ALWAYS`/`PERMANENT`,
|
||||||
|
or negation + imperative), regardless of age/cross-ref.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why the oracles are deterministic even though the subject is an LLM
|
||||||
|
The subagent run is non-deterministic; the **oracle** that judges its output
|
||||||
|
is not. RED-4 is a byte `diff`; RED-3 layer (a) is a substring check. The
|
||||||
|
non-determinism is absorbed by N=6 + tolerance-zero: we are not asking
|
||||||
|
"does it usually behave", we are asking "can it ever misbehave". One bad run
|
||||||
|
out of six condemns the skill.
|
||||||
92
skills/prune-memory/tests/run-deterministic.sh
Normal file
92
skills/prune-memory/tests/run-deterministic.sh
Normal file
@ -0,0 +1,92 @@
|
|||||||
|
#!/usr/bin/env bash
|
||||||
|
# Deterministic RED suite for /prune-memory — RED-1, RED-2, RED-5, RED-6.
|
||||||
|
# Each MUST be red on the current (v1) skill. Pure mechanical oracles,
|
||||||
|
# no LLM. Faithful: RED-2/RED-6 execute the REAL bash blocks extracted
|
||||||
|
# from SKILL.md (no copy that could drift).
|
||||||
|
#
|
||||||
|
# Sandbox only (mktemp). NEVER touches real registries or the repo.
|
||||||
|
# Usage: bash run-deterministic.sh (exit 0 = all green, 1 = >=1 red)
|
||||||
|
set -uo pipefail
|
||||||
|
|
||||||
|
SKILL="${SKILL:-$HOME/.claude/skills/prune-memory/SKILL.md}"
|
||||||
|
HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||||
|
SANDBOX="$(mktemp -d "${TMPDIR:-/tmp}/prune-red.XXXXXX")"
|
||||||
|
trap 'rm -rf "$SANDBOX"' EXIT
|
||||||
|
|
||||||
|
fail=0
|
||||||
|
red() { printf 'RED-%s: RED (skill defective, expected pre-GREEN) -- %s\n' "$1" "$2"; fail=1; }
|
||||||
|
green() { printf 'RED-%s: GREEN (skill fixed) -- %s\n' "$1" "$2"; }
|
||||||
|
|
||||||
|
# Pull the real fenced ```bash block under a "## <heading>" from SKILL.md.
|
||||||
|
# Verified by the extract-check before the suite was written.
|
||||||
|
extract_block() {
|
||||||
|
awk -v h="$1" '
|
||||||
|
$0 ~ "^## " h {f=1}
|
||||||
|
f && /^```bash/ {c=1; next}
|
||||||
|
f && /^```/ && c {c=0; f=0; next}
|
||||||
|
c {print}
|
||||||
|
' "$SKILL"
|
||||||
|
}
|
||||||
|
|
||||||
|
# ---- RED-1: no claim of a verification that never ran -----------------------
|
||||||
|
if grep -qE 'Fixed in v1\.1|TDD found it' "$SKILL"; then
|
||||||
|
red 1 "false 'Fixed in v1.1 (TDD found it)' claim present in SKILL.md"
|
||||||
|
else
|
||||||
|
green 1 "no unproven verification claim in SKILL.md"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# ---- RED-2: STEP 0 PRECHECK must refuse a dirty registry tree ---------------
|
||||||
|
S2="$SANDBOX/red2"; mkdir -p "$S2/.claude/memory"
|
||||||
|
git -C "$S2" init -q
|
||||||
|
printf '## BDR-001 -- seed\n' > "$S2/.claude/memory/decisions.md"
|
||||||
|
git -C "$S2" add -A
|
||||||
|
git -C "$S2" -c user.email=t@t -c user.name=t commit -qm seed >/dev/null 2>&1
|
||||||
|
printf 'uncommitted dirty line\n' >> "$S2/.claude/memory/decisions.md"
|
||||||
|
extract_block "STEP 0" > "$S2/step0.sh"
|
||||||
|
( cd "$S2" && bash step0.sh >/dev/null 2>&1 ); code=$?
|
||||||
|
if [ "$code" -ne 0 ]; then
|
||||||
|
green 2 "STEP 0 exits $code on dirty tree (blocks the run)"
|
||||||
|
else
|
||||||
|
red 2 "STEP 0 exits 0 on dirty tree -- prose-only STOP, no machine block"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# ---- RED-5: STEP 4 verify must catch a safety-critical content mutation -----
|
||||||
|
# Leans on the clean-tree precondition (RED-2): git HEAD is the pre-prune
|
||||||
|
# backup, so STEP 4 can diff against it. A GREEN verify must FLAG any deleted
|
||||||
|
# permanent/negation line; v1 has no such check and falsely certifies OK.
|
||||||
|
S5="$SANDBOX/red5"; mkdir -p "$S5/.claude/memory"
|
||||||
|
git -C "$S5" init -q
|
||||||
|
printf '# Journal\n\n## 2025-11-03\n- PERMANENT rule: NEVER deploy migration 0007 without the backfill job first.\n' \
|
||||||
|
> "$S5/.claude/memory/journal.md"
|
||||||
|
git -C "$S5" add -A
|
||||||
|
git -C "$S5" -c user.email=t@t -c user.name=t commit -qm seed >/dev/null 2>&1
|
||||||
|
# Simulate a BAD prune that collapses away the safety-critical NEVER line:
|
||||||
|
printf '# Journal\n\n## 2025-11\n- Shipped auth migration; minor cleanup.\n' \
|
||||||
|
> "$S5/.claude/memory/journal.md"
|
||||||
|
extract_block "STEP 4" > "$S5/step4.sh"
|
||||||
|
out5="$( cd "$S5" && bash step4.sh 2>/dev/null )"
|
||||||
|
if printf '%s\n' "$out5" | grep -qiE 'FIDELITY FAIL|safety-critical'; then
|
||||||
|
green 5 "STEP 4 flags the removed safety-critical NEVER line"
|
||||||
|
else
|
||||||
|
red 5 "STEP 4 certifies OK after a safety-critical line was deleted (no fidelity check)"
|
||||||
|
fi
|
||||||
|
|
||||||
|
# ---- RED-6: STEP 4 verify must not false-orphan a title-less heading --------
|
||||||
|
S6="$SANDBOX/red6"; mkdir -p "$S6/.claude/memory"
|
||||||
|
cp "$HERE/fixtures/red6-orphan/.claude/memory/decisions.md" \
|
||||||
|
"$S6/.claude/memory/decisions.md"
|
||||||
|
extract_block "STEP 4" > "$S6/step4.sh"
|
||||||
|
out="$( cd "$S6" && bash step4.sh 2>/dev/null )"
|
||||||
|
if printf '%s\n' "$out" | grep -qE '^ORPHAN INDEX: BDR-009'; then
|
||||||
|
red 6 "verify emits FALSE 'ORPHAN INDEX: BDR-009' (body exists; trailing-space bug)"
|
||||||
|
else
|
||||||
|
green 6 "verify does not false-orphan the title-less heading"
|
||||||
|
fi
|
||||||
|
|
||||||
|
echo "----"
|
||||||
|
if [ "$fail" -eq 0 ]; then
|
||||||
|
echo "SUITE: all GREEN"
|
||||||
|
else
|
||||||
|
echo "SUITE: >=1 RED red (skill defective as expected pre-GREEN)"
|
||||||
|
fi
|
||||||
|
exit "$fail"
|
||||||
@ -10,7 +10,7 @@ description: |
|
|||||||
"structured data", "JSON-LD", "sitemap", "robots.txt", "Google ranking",
|
"structured data", "JSON-LD", "sitemap", "robots.txt", "Google ranking",
|
||||||
"local SEO", "AI search", "GEO", "llms.txt", "ChatGPT visibility",
|
"local SEO", "AI search", "GEO", "llms.txt", "ChatGPT visibility",
|
||||||
"Perplexity", "Google AI Overview".
|
"Perplexity", "Google AI Overview".
|
||||||
For GEO only → /geo. For W3C/a11y → /validate. For bugs → /bugfix.
|
For GEO only → /geo. For W3C/a11y → /web-validate. For bugs → /bugfix.
|
||||||
argument-hint: optional keywords/scope, e.g. "local SEO plombier 91 94 77" or "SaaS B2B content strategy"
|
argument-hint: optional keywords/scope, e.g. "local SEO plombier 91 94 77" or "SaaS B2B content strategy"
|
||||||
allowed-tools:
|
allowed-tools:
|
||||||
- Read
|
- Read
|
||||||
@ -33,7 +33,7 @@ entry point for any SEO/GEO work on a web project.
|
|||||||
## Resources
|
## Resources
|
||||||
|
|
||||||
- `resources/depth-matrix.md` — depth-decision rules (LOCAL vs FULL),
|
- `resources/depth-matrix.md` — depth-decision rules (LOCAL vs FULL),
|
||||||
score-weight table per axis, dedup rules with sibling skills (/validate,
|
score-weight table per axis, dedup rules with sibling skills (/web-validate,
|
||||||
/harden), and the envelope schema for `.claude/audits/SEO.md`.
|
/harden), and the envelope schema for `.claude/audits/SEO.md`.
|
||||||
|
|
||||||
Read `resources/depth-matrix.md` at the start of STEP 0 — it pre-answers
|
Read `resources/depth-matrix.md` at the start of STEP 0 — it pre-answers
|
||||||
|
|||||||
@ -30,8 +30,8 @@ LOCAL caps at 20. FULL caps at 20. Never report above 20.
|
|||||||
|
|
||||||
| Finding type | Owner skill | If reported by /seo, what to do |
|
| Finding type | Owner skill | If reported by /seo, what to do |
|
||||||
|---|---|---|
|
|---|---|---|
|
||||||
| HTML validity errors (W3C nu validator) | /validate | Drop from /seo report; note `"see /validate report for HTML validity"`. |
|
| HTML validity errors (W3C nu validator) | /web-validate | Drop from /seo report; note `"see /web-validate report for HTML validity"`. |
|
||||||
| WCAG accessibility | /validate | Drop. |
|
| WCAG accessibility | /web-validate | Drop. |
|
||||||
| Missing CSP / HSTS / 404 page / HTTP→HTTPS | /harden | Drop unless it directly affects indexability (then mention with cross-link). |
|
| Missing CSP / HSTS / 404 page / HTTP→HTTPS | /harden | Drop unless it directly affects indexability (then mention with cross-link). |
|
||||||
| Wikidata / sameAs / Knowledge Panel | /seo (GEO) | Owned here. |
|
| Wikidata / sameAs / Knowledge Panel | /seo (GEO) | Owned here. |
|
||||||
| llms.txt | /seo (GEO) | Owned here. |
|
| llms.txt | /seo (GEO) | Owned here. |
|
||||||
|
|||||||
@ -1,5 +1,5 @@
|
|||||||
---
|
---
|
||||||
name: validate
|
name: web-validate
|
||||||
description: |
|
description: |
|
||||||
Use when a web project needs W3C HTML/CSS validity check or WCAG 2.1
|
Use when a web project needs W3C HTML/CSS validity check or WCAG 2.1
|
||||||
accessibility audit. Dispatches the validator-analyzer agent with a
|
accessibility audit. Dispatches the validator-analyzer agent with a
|
||||||
@ -21,7 +21,7 @@ allowed-tools:
|
|||||||
- WebFetch
|
- WebFetch
|
||||||
---
|
---
|
||||||
|
|
||||||
# /validate — web standards audit (W3C + WCAG)
|
# /web-validate — web standards audit (W3C + WCAG)
|
||||||
|
|
||||||
This skill orchestrates a narrow-scope standards audit :
|
This skill orchestrates a narrow-scope standards audit :
|
||||||
|
|
||||||
@ -46,20 +46,20 @@ Scope boundary :
|
|||||||
errors.
|
errors.
|
||||||
|
|
||||||
If a finding appears in an out-of-scope area (e.g. missing meta
|
If a finding appears in an out-of-scope area (e.g. missing meta
|
||||||
description), the agent drops it silently — `/validate` stays focused.
|
description), the agent drops it silently — `/web-validate` stays focused.
|
||||||
|
|
||||||
### Relation to other skills
|
### Relation to other skills
|
||||||
|
|
||||||
- `/onboard` runs an initial a11y audit at project setup (axe or
|
- `/onboard` runs an initial a11y audit at project setup (axe or
|
||||||
static checklist → `.onboard-audit/a11y.md`). `/validate` is the
|
static checklist → `.onboard-audit/a11y.md`). `/web-validate` is the
|
||||||
**on-demand** equivalent, re-runnable anytime against the current
|
**on-demand** equivalent, re-runnable anytime against the current
|
||||||
codebase, and also covers HTML/CSS validity (which `/onboard` does
|
codebase, and also covers HTML/CSS validity (which `/onboard` does
|
||||||
not).
|
not).
|
||||||
- `/harden` audits security posture (headers, TLS, redirects).
|
- `/harden` audits security posture (headers, TLS, redirects).
|
||||||
`/validate` audits conformance. They share no findings.
|
`/web-validate` audits conformance. They share no findings.
|
||||||
- `/seo` and `/geo` audit indexability. They may flag the same HTML
|
- `/seo` and `/geo` audit indexability. They may flag the same HTML
|
||||||
features (alt attrs, heading structure) but from a ranking
|
features (alt attrs, heading structure) but from a ranking
|
||||||
perspective. `/validate` flags from a **standards** perspective
|
perspective. `/web-validate` flags from a **standards** perspective
|
||||||
(WCAG SC number, W3C rule id). Findings may overlap — both reports
|
(WCAG SC number, W3C rule id). Findings may overlap — both reports
|
||||||
are still valid.
|
are still valid.
|
||||||
|
|
||||||
@ -95,7 +95,7 @@ CSS_COUNT=$(find . -name "*.css" \
|
|||||||
|
|
||||||
If both counts are 0 and no URL provided → abort with :
|
If both counts are 0 and no URL provided → abort with :
|
||||||
```
|
```
|
||||||
⚠️ No HTML or CSS files found and no URL provided. /validate needs
|
⚠️ No HTML or CSS files found and no URL provided. /web-validate needs
|
||||||
either local files or a live URL. Re-run with --full <url>.
|
either local files or a live URL. Re-run with --full <url>.
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -126,7 +126,7 @@ If framework is JS-based and `BUILD_DIR` is empty, warn :
|
|||||||
⚠️ Framework detected : <name>. No build output found.
|
⚠️ Framework detected : <name>. No build output found.
|
||||||
HTML validity on JSX/TSX source is not meaningful.
|
HTML validity on JSX/TSX source is not meaningful.
|
||||||
Options :
|
Options :
|
||||||
1. Run `npm run build` then re-run /validate
|
1. Run `npm run build` then re-run /web-validate
|
||||||
2. Use --full <url> to audit production
|
2. Use --full <url> to audit production
|
||||||
3. Continue with partial LOCAL audit (CSS + static WCAG only)
|
3. Continue with partial LOCAL audit (CSS + static WCAG only)
|
||||||
```
|
```
|
||||||
@ -177,7 +177,7 @@ Agent(
|
|||||||
subagent_type="validator-analyzer",
|
subagent_type="validator-analyzer",
|
||||||
description="validate — W3C HTML + CSS + WCAG audit",
|
description="validate — W3C HTML + CSS + WCAG audit",
|
||||||
prompt="""
|
prompt="""
|
||||||
Dispatched from /validate. STRICT SCOPE — W3C HTML validity + W3C
|
Dispatched from /web-validate. STRICT SCOPE — W3C HTML validity + W3C
|
||||||
CSS validity + WCAG 2.1 accessibility ONLY.
|
CSS validity + WCAG 2.1 accessibility ONLY.
|
||||||
|
|
||||||
CONTEXT:
|
CONTEXT:
|
||||||
@ -299,7 +299,7 @@ Fixes applied : <N>
|
|||||||
- [Moyenne][CSS] Removed invalid property `bakground` → `background` at line 23
|
- [Moyenne][CSS] Removed invalid property `bakground` → `background` at line 23
|
||||||
|
|
||||||
Verification :
|
Verification :
|
||||||
- Re-run /validate → expected score bump <before> → <after>
|
- Re-run /web-validate → expected score bump <before> → <after>
|
||||||
- Tests to run : a11y regression (pa11y-ci), visual snapshot
|
- Tests to run : a11y regression (pa11y-ci), visual snapshot
|
||||||
```
|
```
|
||||||
|
|
||||||
@ -329,9 +329,9 @@ TOP 3 ACTIONS (by severity × user impact) :
|
|||||||
3. [Haute] <title>
|
3. [Haute] <title>
|
||||||
|
|
||||||
NEXT STEPS :
|
NEXT STEPS :
|
||||||
• /validate <url> --fix → apply recommended fixes
|
• /web-validate <url> --fix → apply recommended fixes
|
||||||
• /validate <url> --full → re-run with live URL + remote APIs
|
• /web-validate <url> --full → re-run with live URL + remote APIs
|
||||||
• /validate --no-external → skip third-party APIs (faster, LOCAL-like)
|
• /web-validate --no-external → skip third-party APIs (faster, LOCAL-like)
|
||||||
• /harden / /seo / /geo → complementary audits (other scopes)
|
• /harden / /seo / /geo → complementary audits (other scopes)
|
||||||
|
|
||||||
Install for better LOCAL coverage :
|
Install for better LOCAL coverage :
|
||||||
Loading…
Reference in New Issue
Block a user