Merge bugfix/prune-memory-hardening into develop
This commit is contained in:
commit
73e12be5f7
@ -27,6 +27,8 @@ rules:
|
|||||||
| BLK-005 | 2026-05-21 | gstack submodule rename (checkpoint→context-save) breaks profile entries | resolved |
|
| BLK-005 | 2026-05-21 | gstack submodule rename (checkpoint→context-save) breaks profile entries | resolved |
|
||||||
| BLK-006 | 2026-05-21 | `profile.sh current` false-negative via `~/.claude` symlink (`cd` not `cd -P`) | resolved |
|
| BLK-006 | 2026-05-21 | `profile.sh current` false-negative via `~/.claude` symlink (`cd` not `cd -P`) | resolved |
|
||||||
| BLK-007 | 2026-06-02 | 6 gstack source skills (ios-*, spec) unlinked post-bump — invisible to profiles + `gstack on` | resolved |
|
| BLK-007 | 2026-06-02 | 6 gstack source skills (ios-*, spec) unlinked post-bump — invisible to profiles + `gstack on` | resolved |
|
||||||
|
| BLK-008 | 2026-06-23 | gstack ./setup on Ubuntu 26.04: Playwright chromium unsupported → gstack browser (/browse, /qa, screenshots) silently dead | resolved (211c7d4) |
|
||||||
|
| BLK-009 | 2026-06-25 | user-level path-scoped rules (`paths:` frontmatter in `~/.claude/rules/`) never inject — broken in CC 2.1.190 (#21858) | upstream, open |
|
||||||
| BLK-010 | 2026-06-27 | init-project: scaffold (STEP 5) + bootstrap README (5b) have no deterministic commit owner; worktree `add -b` on unborn HEAD | resolved (uncommitted) |
|
| BLK-010 | 2026-06-27 | init-project: scaffold (STEP 5) + bootstrap README (5b) have no deterministic commit owner; worktree `add -b` on unborn HEAD | resolved (uncommitted) |
|
||||||
| BLK-011 | 2026-06-27 | init-project STEP 13 GSD post-FINISH creates ROADMAP.md → stranded doc (3rd post-FINISH artifact) | resolved (STEP 12 removed) |
|
| BLK-011 | 2026-06-27 | init-project STEP 13 GSD post-FINISH creates ROADMAP.md → stranded doc (3rd post-FINISH artifact) | resolved (STEP 12 removed) |
|
||||||
| BLK-012 | 2026-06-29 | gitflow_init half-applied: socle-commit failure swallowed → hook activated on partial run → re-run self-blocks | resolved |
|
| BLK-012 | 2026-06-29 | gitflow_init half-applied: socle-commit failure swallowed → hook activated on partial run → re-run self-blocks | resolved |
|
||||||
|
|||||||
@ -42,8 +42,19 @@ rules:
|
|||||||
| BDR-018 | 2026-06-02 | `profile gstack on/off` verb — toggle gstack keeping active-profile label | accepted |
|
| BDR-018 | 2026-06-02 | `profile gstack on/off` verb — toggle gstack keeping active-profile label | accepted |
|
||||||
| BDR-019 | 2026-06-09 | Remove `disable-model-invocation` repo-wide — align skills with CLAUDE.md routing | accepted |
|
| BDR-019 | 2026-06-09 | Remove `disable-model-invocation` repo-wide — align skills with CLAUDE.md routing | accepted |
|
||||||
| BDR-020 | 2026-06-11 | `/audit-delta`: per-axis SHA markers + always-on fix gate + unreachable-first-run = full report-only | accepted |
|
| BDR-020 | 2026-06-11 | `/audit-delta`: per-axis SHA markers + always-on fix gate + unreachable-first-run = full report-only | accepted |
|
||||||
|
| BDR-021 | 2026-06-27 | CLAUDE.md restructure: contradiction purge, project-specific sections labeled, critical sections never compressed | accepted |
|
||||||
| BDR-022 | 2026-06-18 | doc-syncer scoped to public docs; `.claude/` + `CLAUDE.md` read-only context, never targets; conventions + clean mode | accepted |
|
| BDR-022 | 2026-06-18 | doc-syncer scoped to public docs; `.claude/` + `CLAUDE.md` read-only context, never targets; conventions + clean mode | accepted |
|
||||||
| BDR-023 | 2026-06-19 | Merge /close into /capitalize — 2 modes + TODO reconcile; /close alias | accepted |
|
| BDR-023 | 2026-06-19 | Merge /close into /capitalize — 2 modes + TODO reconcile; /close alias | accepted |
|
||||||
|
| BDR-024 | 2026-06-27 | `profile show --plain` = claude-free parse contract for the design gate | accepted |
|
||||||
|
| BDR-025 | 2026-06-27 | Design gate profile-based; remedy `/profile design`; magic required-but-manual; unknown → fail-visible; claude via PATH-repair | accepted |
|
||||||
|
| BDR-026 | 2026-06-27 | Secret source-of-truth outside the repo (`~/.claude/.env`) reached via a `repo/.env` symlink | accepted |
|
||||||
|
| BDR-027 | 2026-06-27 | Minimal npm-via-nvm bootstrap over a centralized prereq lib | accepted |
|
||||||
|
| BDR-028 | 2026-06-27 | Hand-curated config install-immutable (auto-revert guard) + de-vendor installer-managed skills | accepted |
|
||||||
|
| BDR-029 | 2026-06-27 | Installer auto-fixes gstack browser on an OS newer than its pinned Playwright supports | accepted |
|
||||||
|
| BDR-030 | 2026-06-27 | gstack skills activated ON-DEMAND per profile, not pre-installed; OFF by default stays | accepted |
|
||||||
|
| BDR-031 | 2026-06-27 | global CLAUDE.md lightening = COMPRESSION, not path-scope / externalization | accepted |
|
||||||
|
| BDR-032 | 2026-06-27 | skill `/validate` → `/web-validate` (rename user surface, keep internals) | accepted |
|
||||||
|
| BDR-033 | 2026-06-27 | design-gate §4: anim-lib suggestion — suggest-only, non-blocking, stateless 1-line | accepted |
|
||||||
| BDR-034 | 2026-06-26 | Coupled-capitalize invariant v1 — memory commit auto per dev flow (Frame 2) | accepted |
|
| BDR-034 | 2026-06-26 | Coupled-capitalize invariant v1 — memory commit auto per dev flow (Frame 2) | accepted |
|
||||||
| BDR-035 | 2026-06-26 | Analyze-before-plan invariant v1 — read-before bookend of coupled-capitalize | accepted |
|
| BDR-035 | 2026-06-26 | Analyze-before-plan invariant v1 — read-before bookend of coupled-capitalize | accepted |
|
||||||
| BDR-036 | 2026-06-27 | Doc-sync coupled invariant — commit docs doc-syncer patches (twin of BDR-034, BUILT not reordered) | accepted |
|
| BDR-036 | 2026-06-27 | Doc-sync coupled invariant — commit docs doc-syncer patches (twin of BDR-034, BUILT not reordered) | accepted |
|
||||||
|
|||||||
@ -30,6 +30,7 @@ rules:
|
|||||||
| EVAL-007 | 2026-06-26 | Coupled-capitalize machinery — TDD 13 + e2e, surgical scope proven | keep |
|
| EVAL-007 | 2026-06-26 | Coupled-capitalize machinery — TDD 13 + e2e, surgical scope proven | keep |
|
||||||
| EVAL-008 | 2026-06-27 | Doc-sync coupled machinery — 28/28 real-exec, swap-sweep caught prior debt | keep |
|
| EVAL-008 | 2026-06-27 | Doc-sync coupled machinery — 28/28 real-exec, swap-sweep caught prior debt | keep |
|
||||||
| EVAL-009 | 2026-06-27 | deploy skill subagent-driven build: multi-stage review + pressure-test net-positive | keep |
|
| EVAL-009 | 2026-06-27 | deploy skill subagent-driven build: multi-stage review + pressure-test net-positive | keep |
|
||||||
|
| EVAL-010 | 2026-06-29 | prune-memory hardening: RED-7 deterministic fix + RED-8 accept + 34-row index backfill | keep |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -113,3 +114,9 @@ rules:
|
|||||||
- **method**: per-task review (sonnet; opus on the keystone) + writing-skills pressure-test (fresh agent on a `PENDING.json`+moved-HEAD fixture) + final whole-branch review (opus).
|
- **method**: per-task review (sonnet; opus on the keystone) + writing-skills pressure-test (fresh agent on a `PENDING.json`+moved-HEAD fixture) + final whole-branch review (opus).
|
||||||
- **anomalies**: (1) the PLAN's code carried 3 latent bugs — missing `git add` for new files, SC2086 unquoted `$viol`, comment-before-shebang SC1128 — all caught by the implementer's TDD+shellcheck gate → plan-code is a DRAFT, the test gate is load-bearing. (2) the final whole-branch review caught 2 Important seam-bugs INVISIBLE to per-task reviews: target-repo `.claude/`-ignored silent no-op ([[LRN-066]]) + `NEXT.sh`-absence non-regeneration → holistic review earns its keep. (3) pressure-test confirmed the cold-resume discipline holds under temptation (the agent excluded the moved-HEAD `0034`). (4) a reviewer subagent bugged out once (user killed it) → re-dispatched clean (transient, not a finding).
|
- **anomalies**: (1) the PLAN's code carried 3 latent bugs — missing `git add` for new files, SC2086 unquoted `$viol`, comment-before-shebang SC1128 — all caught by the implementer's TDD+shellcheck gate → plan-code is a DRAFT, the test gate is load-bearing. (2) the final whole-branch review caught 2 Important seam-bugs INVISIBLE to per-task reviews: target-repo `.claude/`-ignored silent no-op ([[LRN-066]]) + `NEXT.sh`-absence non-regeneration → holistic review earns its keep. (3) pressure-test confirmed the cold-resume discipline holds under temptation (the agent excluded the moved-HEAD `0034`). (4) a reviewer subagent bugged out once (user killed it) → re-dispatched clean (transient, not a finding).
|
||||||
- **action**: keep. Multi-stage adversarial review + a behavioral pressure-test caught classes of bug single-pass review misses — worth the cost on a keystone skill.
|
- **action**: keep. Multi-stage adversarial review + a behavioral pressure-test caught classes of bug single-pass review misses — worth the cost on a keystone skill.
|
||||||
|
|
||||||
|
## EVAL-010 — prune-memory hardening (RED-7 fix + RED-8 accept + 34-row index backfill)
|
||||||
|
- **Date**: 2026-06-29
|
||||||
|
- **method**: read-first cartography (sub-agent, confirmed) → RED-7 closed by a DETERMINISTIC test ([[LRN-046]]) + STEP-2 example fictionalized → RED-8 re-reviewed, consciously accepted ([[LRN-047]]) → 34 missing Index rows composed + inserted in ID-sorted slots → STEP-4 verify zero MISSING/ORPHAN; deterministic suite all-green, shellcheck clean.
|
||||||
|
- **anomalies**: (1) RED-7 test FALSE-GREEN caught in real time — ugrep parsed `-9..` as an option → empty → green; fixed via /usr/bin/grep ([[LRN-074]]). The RED was WATCHED, not assumed. (2) RED-7 premise verified: LRN-014/016 ARE complementary → the old example modeled a WRONG merge, not just primed it. (3) backfill: 4/5 title-derived Applies-to (the awk-missed entries) missed a real future-app nuance on re-read → corrected before insert (without the 5-check, 4 Index rows would have diverged from source, engraved forever). (4) almost wrote a colliding EVAL-009 (deploy) — read the file first → EVAL-010. (5) pre-existing LRN-021 Index row out of ID-order → moved.
|
||||||
|
- **action**: keep. RED-7 GREEN (deterministic), RED-8 documented-accept, drift 34→0. [[LRN-073]] + [[LRN-074]] engraved — 2 pattern-families this session (fail-silent [[LRN-066]]/[[LRN-071]] + command-assumption [[LRN-074]]).
|
||||||
|
|||||||
@ -234,4 +234,11 @@ rules:
|
|||||||
## 2026-06-29 (cont. 2) — BLK-011 resolved by REMOVAL (init-project GSD bootstrap)
|
## 2026-06-29 (cont. 2) — BLK-011 resolved by REMOVAL (init-project GSD bootstrap)
|
||||||
- User challenge reframed the chantier: don't plumb a commit for the stranded ROADMAP — ask if gsd belongs at init AT ALL. Read REFUTED both my option-premises (gsd ≫ roadmap; TODO ≠ gsd ROADMAP) but conclusion A (remove STEP 12) held for a STRONGER reason: speculative auto-bootstrap of an unused multi-session engine at creation is bad per se. Best fix = NEGATIVE diff ([[LRN-072]]).
|
- User challenge reframed the chantier: don't plumb a commit for the stranded ROADMAP — ask if gsd belongs at init AT ALL. Read REFUTED both my option-premises (gsd ≫ roadmap; TODO ≠ gsd ROADMAP) but conclusion A (remove STEP 12) held for a STRONGER reason: speculative auto-bootstrap of an unused multi-session engine at creation is bad per se. Best fix = NEGATIVE diff ([[LRN-072]]).
|
||||||
- Removed init-project STEP 12 (+ header 12→11-step, 10c note, 4 USAGE coherence fixes). Coherence sweep = zero dangling STEP-12 refs (the "test" for a removal). Deliberate gsd use KEPT (onboarder PHASE 6, plugin-advisor, status-reporter). [[BLK-011]] → resolved.
|
- Removed init-project STEP 12 (+ header 12→11-step, 10c note, 4 USAGE coherence fixes). Coherence sweep = zero dangling STEP-12 refs (the "test" for a removal). Deliberate gsd use KEPT (onboarder PHASE 6, plugin-advisor, status-reporter). [[BLK-011]] → resolved.
|
||||||
- Branch `bugfix/blk-011-gsd-roadmap`; no finish yet (awaiting signal).
|
- Branch `bugfix/blk-011-gsd-roadmap`; FINISHED → develop (`ce4391a`) on explicit signal; pushed develop to origin (6 commits, SSH).
|
||||||
|
|
||||||
|
## 2026-06-29 (cont. 3) — prune-memory hardening (RED-7/8 + index backfill)
|
||||||
|
- Read-first cartography (confirmed my own measurements). RED-7 (example-priming): the STEP-2 example named live LRN-014+016 and modeled merging them — verified COMPLEMENTARY, a merge the skill forbids. Fix = fictionalize example to 9xx + DETERMINISTIC test ([[LRN-046]], not flaky behavioral). [[LRN-073]].
|
||||||
|
- RED-7 test caught its OWN false-green in real time: ugrep parsed `-9..` as an option → empty → green; fixed via /usr/bin/grep. 4th command-assumption miss this session → [[LRN-074]] (2nd engraved pattern-family, alongside fail-silent [[LRN-066]]/[[LRN-071]]).
|
||||||
|
- RED-8 (added-negation): consciously ACCEPTED as documented limit ([[LRN-047]] — FP-prone guard worse than honest limit on a destructive skill).
|
||||||
|
- Index backfill: 34 missing rows (decisions 11, learnings 21, blockers 2) composed + ID-sorted insert; drift 34→0, STEP-4 verify OK. Re-read the 5 awk-missed Applies-to → 4 corrected a nuance the title dropped. Moved pre-existing out-of-order LRN-021. [[EVAL-010]].
|
||||||
|
- Branch `bugfix/prune-memory-hardening`; no finish yet (awaiting signal). LAST of 3 chantiers.
|
||||||
|
|||||||
@ -28,7 +28,6 @@ rules:
|
|||||||
| LRN-006 | 2026-05-03 | `caveman-shrink` (and any MCP middleware proxy) non-functional without upstream wrapper | any MCP middleware/proxy package — never `claude mcp add` it bare |
|
| LRN-006 | 2026-05-03 | `caveman-shrink` (and any MCP middleware proxy) non-functional without upstream wrapper | any MCP middleware/proxy package — never `claude mcp add` it bare |
|
||||||
| LRN-007 | 2026-05-06 | `toggle-external.sh enable` missed source-only state (3rd lifecycle case) | toggle scripts for tools with separate install + symlink steps |
|
| LRN-007 | 2026-05-06 | `toggle-external.sh enable` missed source-only state (3rd lifecycle case) | toggle scripts for tools with separate install + symlink steps |
|
||||||
| LRN-008 | 2026-05-06 | Biggest skill-quality wins from edge-case tables, not workflow rewrites | any skill <85 — first check for FAILURE PATHS / EDGE CASES / ERROR HANDLING section |
|
| LRN-008 | 2026-05-06 | Biggest skill-quality wins from edge-case tables, not workflow rewrites | any skill <85 — first check for FAILURE PATHS / EDGE CASES / ERROR HANDLING section |
|
||||||
| LRN-021 | 2026-05-20 | Refactor commands→skills must sweep `~/.claude/commands/` for orphan wrappers | any refactor moving `agents/foo.md` → `skills/foo/SKILL.md`; onboard/init-project audits |
|
|
||||||
| LRN-009 | 2026-05-06 | Dry-run scoring noise wrongly triggers reverts on already-strong skills | darwin-skill ratchet on skills >91 — relax or use real subagent eval |
|
| LRN-009 | 2026-05-06 | Dry-run scoring noise wrongly triggers reverts on already-strong skills | darwin-skill ratchet on skills >91 — relax or use real subagent eval |
|
||||||
| LRN-010 | 2026-05-06 | `~/.claude/skills,agents` symlink to Documents/claude — git from `~/.claude` fails | any optimization or batch edit on personal skills/agents |
|
| LRN-010 | 2026-05-06 | `~/.claude/skills,agents` symlink to Documents/claude — git from `~/.claude` fails | any optimization or batch edit on personal skills/agents |
|
||||||
| LRN-011 | 2026-05-07 | Single subagent emits N independently-gated scores → labeled extraction + axis-aware loop + per-axis escalation | any audit pipeline shipping multiple gated metrics from one subagent |
|
| LRN-011 | 2026-05-07 | Single subagent emits N independently-gated scores → labeled extraction + axis-aware loop + per-axis escalation | any audit pipeline shipping multiple gated metrics from one subagent |
|
||||||
@ -40,15 +39,37 @@ rules:
|
|||||||
| LRN-017 | 2026-05-12 | Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30) | any `/darwin-skill` round-1 on a dispatcher SKILL.md |
|
| LRN-017 | 2026-05-12 | Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30) | any `/darwin-skill` round-1 on a dispatcher SKILL.md |
|
||||||
| LRN-018 | 2026-05-12 | Darwin eval subagents drift on total math — recompute in main thread | any subagent-driven SKILL.md rescore |
|
| LRN-018 | 2026-05-12 | Darwin eval subagents drift on total math — recompute in main thread | any subagent-driven SKILL.md rescore |
|
||||||
| LRN-019 | 2026-05-15 | Deployable-project doc split: README dev-quickstart + DEPLOY 14-section prod-VPS topology | any onboard/doc-syncer/scaffold producing docs for a deployable project |
|
| LRN-019 | 2026-05-15 | Deployable-project doc split: README dev-quickstart + DEPLOY 14-section prod-VPS topology | any onboard/doc-syncer/scaffold producing docs for a deployable project |
|
||||||
|
| LRN-020 | 2026-05-18 | profile-sentinel-collision: literal labels in cmd output must not match profile filenames | a CLI reporting a real named-identifier OR a "nothing applied" state — keep sentinels outside the namespace (string-eq consumers break) |
|
||||||
|
| LRN-021 | 2026-05-20 | Refactor commands→skills must sweep `~/.claude/commands/` for orphan wrappers | any refactor moving `agents/foo.md` → `skills/foo/SKILL.md`; onboard/init-project audits |
|
||||||
|
| LRN-022 | 2026-05-21 | audit `lib/profiles/*.profile` against the gstack skill list after every submodule bump | any gstack submodule bump / external-skill-source move; a "missing:" warning = upstream rename/deletion (link.sh can't fix) |
|
||||||
|
| LRN-023 | 2026-05-21 | scripts invoked via symlink must resolve `$REPO` with `cd -P` (physical), not logical `cd` | any script invoked via a symlink that derives its repo root from `$BASH_SOURCE` (cd -P/realpath; Python .resolve()) |
|
||||||
| LRN-024 | 2026-06-02 | New sibling command sharing logic → extract helper + refactor existing caller, never copy-paste; assert pre/post state equality | adding a subcommand/branch reusing logic inline in a peer command |
|
| LRN-024 | 2026-06-02 | New sibling command sharing logic → extract helper + refactor existing caller, never copy-paste; assert pre/post state equality | adding a subcommand/branch reusing logic inline in a peer command |
|
||||||
| LRN-025 | 2026-06-02 | `.gitignore` gstack allowlist must cover ALL toggleable skills (incl. parked) — else enabling one = untracked git noise | any toggle that moves local-symlink skills into a tracked dir; post-submodule-bump reconcile |
|
| LRN-025 | 2026-06-02 | `.gitignore` gstack allowlist must cover ALL toggleable skills (incl. parked) — else enabling one = untracked git noise | any toggle that moves local-symlink skills into a tracked dir; post-submodule-bump reconcile |
|
||||||
| LRN-026 | 2026-06-09 | `disable-model-invocation: false` = ENABLED not blocking; only `true` blocks (model + orchestrator); binary, no per-caller | Claude Code skill frontmatter; deciding self-route/chain vs human-only entry point |
|
| LRN-026 | 2026-06-09 | `disable-model-invocation: false` = ENABLED not blocking; only `true` blocks (model + orchestrator); binary, no per-caller | Claude Code skill frontmatter; deciding self-route/chain vs human-only entry point |
|
||||||
| LRN-027 | 2026-06-11 | Agents improvise audit boundaries from file dates when no machine state — periodic skills need machine-readable state file, never inference | any recurring/periodic skill needing "since last run" semantics |
|
| LRN-027 | 2026-06-11 | Agents improvise audit boundaries from file dates when no machine state — periodic skills need machine-readable state file, never inference | any recurring/periodic skill needing "since last run" semantics |
|
||||||
|
| LRN-028 | 2026-06-11 | "no-skill" subagent baselines invalid when the skill is installed globally | any A/B skill eval / TDD RED baseline / darwin with-vs-without — control must REMOVE the capability, not omit mention |
|
||||||
|
| LRN-029 | 2026-06-11 | an edit adding an exception to a blanket rule will contradict it — counterbalanced blind judges catch it | skill/doc/spec edits adding a branch/exception; scoring any self-modified artifact (counterbalanced blind judges) |
|
||||||
| LRN-030 | 2026-06-18 | Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter via explicit CLAUDE.md fan-out rule | any Opus 4.8 session; tuning delegation; inline-vs-subagent decision |
|
| LRN-030 | 2026-06-18 | Opus 4.8 under-delegates subagents/memory/custom-tools by default — counter via explicit CLAUDE.md fan-out rule | any Opus 4.8 session; tuning delegation; inline-vs-subagent decision |
|
||||||
| LRN-031 | 2026-06-19 | Skill value = gate + anti-noise + determinism, not re-coding what a capable agent does free | building/reviewing any skill; writing-skills TDD fixture design |
|
| LRN-031 | 2026-06-19 | Skill value = gate + anti-noise + determinism, not re-coding what a capable agent does free | building/reviewing any skill; writing-skills TDD fixture design |
|
||||||
|
| LRN-032 | 2026-06-19 | a rule has a domain; applying it outside = category error — check artifact type first | invoking a limit/convention/style rule — confirm it governs THIS artifact class |
|
||||||
|
| LRN-033 | 2026-06-19 | multibyte separator breaks `printf %-Ns` byte-width padding — pad via `${#}` char-count | aligning any column with non-ASCII (·, —, box-drawing, accents) |
|
||||||
|
| LRN-034 | 2026-06-21 | narrated state ≠ ground truth; the missed alarm was internal contradiction — verify vs git | anyone asserts "X is done" — verify (git/file/grep) before building on it |
|
||||||
|
| LRN-035 | 2026-06-21 | honest dedup: name-mention ≠ definition-instance; a dosage rule can make "dedup" a no-op | any "X repeated N times → factor it" — audit what each occurrence IS |
|
||||||
|
| LRN-036 | 2026-06-21 | `command -v <cli>` in a shelled-out script depends on PATH carrying the cli's bin, not the alias | any script shelling out a CLI from a hook/subshell |
|
||||||
|
| LRN-037 | 2026-06-21 | verify the load-bearing scenario on the REAL subject in REAL context, not a stub/logic argument | any "fixed/works" claim on a critical path — produce the real run output |
|
||||||
|
| LRN-038 | 2026-06-23 | Playwright host-platform override for distros newer than its hardcoded support list | any pinned tool with an OS allowlist breaking on a fresh OS upgrade |
|
||||||
|
| LRN-039 | 2026-06-23 | installers drift hand-curated config → snapshot+trap-restore guard; anchor gitignore for pollution | audit a fresh install with `git status` right after `make install` |
|
||||||
|
| LRN-040 | 2026-06-23 | OS newer than a pinned tool = TWO layers (version build + security policy) | "tool X broke after an OS upgrade" — check both build-support and OS hardening |
|
||||||
|
| LRN-041 | 2026-06-23 | a check reading a symlink an earlier install step makes → false negative if that step's precondition unmet | any "X not found in FILE" where FILE is a symlink/derived path |
|
||||||
|
| LRN-042 | 2026-06-23 | `npx skills add` / gstack `./setup` resolve install target RELATIVE TO CWD — repo CWD = wrong dir | before any `npx <x> add` / `<tool> init` that materializes a dotfile dir, set CWD |
|
||||||
|
| LRN-043 | 2026-06-25 | CLAUDE.md skill-routing: cut name-obvious lines, keep only non-derivable signal + dense catch-all | compressing any routing/dispatch table whose entries the model sees elsewhere |
|
||||||
|
| LRN-044 | 2026-06-25 | Edit/Write refuse to write THROUGH a symlink — pass the resolved real path | before editing any `~/.claude/...` config file — resolve it first |
|
||||||
|
| LRN-045 | 2026-06-25 | renaming a command: audit exact-name leak-guard / forbidden-token regexes | when renaming, grep the BARE old token inside regex/test/gate files |
|
||||||
| LRN-046 | 2026-06-25 | Destructive skill: deterministic oracle (byte-identical / count census) > semantic judge | any destructive/irreversible skill; behavioral-oracle TDD |
|
| LRN-046 | 2026-06-25 | Destructive skill: deterministic oracle (byte-identical / count census) > semantic judge | any destructive/irreversible skill; behavioral-oracle TDD |
|
||||||
| LRN-047 | 2026-06-25 | A noisy safety guard (13/13 FP) = a guard you learn to ignore = risk → refine, don't tolerate | any guard/alert/lint that can false-positive |
|
| LRN-047 | 2026-06-25 | A noisy safety guard (13/13 FP) = a guard you learn to ignore = risk → refine, don't tolerate | any guard/alert/lint that can false-positive |
|
||||||
| LRN-048 | 2026-06-25 | A "0/OK/pass" must prove it LOOKED (counted both sides), else verify hard-wired to pass | any verify/test/lint reporting success |
|
| LRN-048 | 2026-06-25 | A "0/OK/pass" must prove it LOOKED (counted both sides), else verify hard-wired to pass | any verify/test/lint reporting success |
|
||||||
|
| LRN-049 | 2026-06-25 | non-destructive repeated nudge: stateless-minimal surface > state marker (conditional on stakes) | any repeated advisory in a stateless surface — bound noise before reaching for a marker |
|
||||||
|
| LRN-050 | 2026-06-25 | on a symlinked/live file, show-before-write is the ONLY control gate | before editing any file — check if it is live, treat pre-write diff as an approval gate |
|
||||||
| LRN-051 | 2026-06-26 | `git commit -- pathspec` strict on no-match → filter scoped commits to changed paths | any scoped-commit automation |
|
| LRN-051 | 2026-06-26 | `git commit -- pathspec` strict on no-match → filter scoped commits to changed paths | any scoped-commit automation |
|
||||||
| LRN-052 | 2026-06-26 | Hash-anchoring: 2 cases it does NOT apply (pre-code founding, squash-merge) | capitalizing founding/arch decisions; squash repos |
|
| LRN-052 | 2026-06-26 | Hash-anchoring: 2 cases it does NOT apply (pre-code founding, squash-merge) | capitalizing founding/arch decisions; squash repos |
|
||||||
| LRN-053 | 2026-06-26 | Read-before teeth = verifiable disposition in the artifact, not the act of reading | any read-before / check-before wiring |
|
| LRN-053 | 2026-06-26 | Read-before teeth = verifiable disposition in the artifact, not the act of reading | any read-before / check-before wiring |
|
||||||
@ -71,6 +92,8 @@ rules:
|
|||||||
| LRN-070 | 2026-06-29 | clean-tree-gated migration blocked by a dirty submodule → diagnose pointer-vs-content; for a local edit use `submodule.<name>.ignore=dirty`, never blind reset | migrating/releasing a superproject whose submodule carries intentional local edits |
|
| LRN-070 | 2026-06-29 | clean-tree-gated migration blocked by a dirty submodule → diagnose pointer-vs-content; for a local edit use `submodule.<name>.ignore=dirty`, never blind reset | migrating/releasing a superproject whose submodule carries intentional local edits |
|
||||||
| LRN-071 | 2026-06-29 | fail-loud must cover the helper's OWN commit, not just its inputs — 3rd occurrence of the swallowed-commit pattern (a failed op masked by a later returning-0 statement) | any helper whose return value gates a downstream "success" — audit every fallible internal op propagates, esp. the commit |
|
| LRN-071 | 2026-06-29 | fail-loud must cover the helper's OWN commit, not just its inputs — 3rd occurrence of the swallowed-commit pattern (a failed op masked by a later returning-0 statement) | any helper whose return value gates a downstream "success" — audit every fallible internal op propagates, esp. the commit |
|
||||||
| LRN-072 | 2026-06-29 | a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit — if the producing step is speculative/unused, delete it | a stranded/duplicated/uncommitted-artifact bug — before building machinery, ask if the PRODUCING step is wanted; speculative-at-creation → remove, deliberate-on-demand → keep |
|
| LRN-072 | 2026-06-29 | a stranded-artifact bug can be fixed by NOT creating the artifact (negative diff), not by plumbing its commit — if the producing step is speculative/unused, delete it | a stranded/duplicated/uncommitted-artifact bug — before building machinery, ask if the PRODUCING step is wanted; speculative-at-creation → remove, deliberate-on-demand → keep |
|
||||||
|
| LRN-073 | 2026-06-29 | a skill's worked-example must use FICTIONAL ids, never live registry ids (they prime real-data behavior) | any skill/agent with a worked example over the SAME data it operates on — use reserved/fictional ids; test deterministically that no live id appears |
|
||||||
|
| LRN-074 | 2026-06-29 | system `grep`/`awk` may be ugrep/mawk: don't assume flag-parsing, use `/usr/bin/grep`, watch the RED go red (4th command-assumption miss this session) | any shell test/guard riding on grep/awk/sed semantics — pin `/usr/bin/<tool>`, run the assertion, confirm it reds on the defect before trusting green |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -807,3 +830,13 @@ rules:
|
|||||||
- **pattern**: 3rd member of the post-FINISH-artifact class (memory, docs, GSD ROADMAP) — but UNLIKE the first two (real artifacts ALWAYS produced → couple a commit), the GSD artifact came from a SPECULATIVE, opt-in, rarely-used producer (init-project auto-bootstrapping a multi-session engine at project creation). The reflex fix (reorder + build `gsd-commit.sh` + tests) would have added machinery to faithfully commit an artifact nobody uses. The right fix was a NEGATIVE diff: delete the producer → orphan never created → bug dissolves, zero new code (BLK-011).
|
- **pattern**: 3rd member of the post-FINISH-artifact class (memory, docs, GSD ROADMAP) — but UNLIKE the first two (real artifacts ALWAYS produced → couple a commit), the GSD artifact came from a SPECULATIVE, opt-in, rarely-used producer (init-project auto-bootstrapping a multi-session engine at project creation). The reflex fix (reorder + build `gsd-commit.sh` + tests) would have added machinery to faithfully commit an artifact nobody uses. The right fix was a NEGATIVE diff: delete the producer → orphan never created → bug dissolves, zero new code (BLK-011).
|
||||||
- **the refutation that got there**: the framing "ROADMAP redundant with TODO" was WRONG (gsd ≫ roadmap = state machine/crash-recovery/cost/parallel/worktree; TODO ≠ gsd ROADMAP = different altitude + consumer). Reading REFUTED both premises, yet the CONCLUSION (remove the step) held for a STRONGER reason: speculatively scaffolding a heavy engine the sole user doesn't use, at creation, is bad per se. Right answer, reason corrected before engraving — change the QUESTION before changing the code.
|
- **the refutation that got there**: the framing "ROADMAP redundant with TODO" was WRONG (gsd ≫ roadmap = state machine/crash-recovery/cost/parallel/worktree; TODO ≠ gsd ROADMAP = different altitude + consumer). Reading REFUTED both premises, yet the CONCLUSION (remove the step) held for a STRONGER reason: speculatively scaffolding a heavy engine the sole user doesn't use, at creation, is bad per se. Right answer, reason corrected before engraving — change the QUESTION before changing the code.
|
||||||
- **future application**: a stranded / duplicated / uncommitted-artifact bug → BEFORE building machinery to handle the artifact, ask whether the step that PRODUCES it is actually used / wanted / non-speculative. Speculative or unused (esp. a personal/single-user repo) → DELETE the producer; the cleanest fix is the absent one. Distinguish speculative-at-creation (REMOVE) from deliberate-on-demand (KEEP). Family: [[BLK-010]], [[BLK-011]], [[BDR-036]].
|
- **future application**: a stranded / duplicated / uncommitted-artifact bug → BEFORE building machinery to handle the artifact, ask whether the step that PRODUCES it is actually used / wanted / non-speculative. Speculative or unused (esp. a personal/single-user repo) → DELETE the producer; the cleanest fix is the absent one. Distinguish speculative-at-creation (REMOVE) from deliberate-on-demand (KEEP). Family: [[BLK-010]], [[BLK-011]], [[BDR-036]].
|
||||||
|
|
||||||
|
## LRN-073 — a skill's worked-example must use FICTIONAL ids, never live registry ids (they prime real-data behavior)
|
||||||
|
- **pattern**: prune-memory's STEP-2 plan example named real LRN-014 + LRN-016 ("merge these"). A real-data run merged exactly that pair — though they're COMPLEMENTARY (header-ids vs checkbox-CSS), a merge its own rule forbids. Example ids that match live entries, in context at audit time, PRIME the action: you can't tell "judged correctly" from "pattern-matched its own example".
|
||||||
|
- **fix**: fictionalize example ids (9xx — can't match a live registry) + make the example model a CORRECT action. Lock it DETERMINISTICALLY ([[LRN-046]]): assert the example carries only fictional ids — not a flaky behavioral "did priming fire" test (RED-7).
|
||||||
|
- **future application**: any skill/agent whose instructions contain a worked example over the SAME data it operates on (registries/files/records) → use reserved/fictional identifiers; test deterministically that no live id appears in the example block.
|
||||||
|
|
||||||
|
## LRN-074 — system `grep`/`awk` may be ugrep/mawk: don't assume flag-parsing, use `/usr/bin/grep`, watch the RED go red
|
||||||
|
- **pattern**: a RED-7 test used `grep -vE '-9[0-9][0-9]$'`; the system grep is UGREP → parsed the leading `-9..` as an OPTION → errored → empty → FALSE GREEN (a RED that never goes red). Caught only because the output was READ, not assumed. 4th time this session an assumed command behavior was false on execution (after `set -o pipefail` + `grep -q` SIGPIPE, …). The skill's own verify already hard-codes `/usr/bin/grep` (line 189) for this exact reason — re-learned.
|
||||||
|
- **fix**: `/usr/bin/grep` (GNU) where GNU semantics matter; avoid leading-dash regex args (or use `-e`/`--`); never trust the system tool is GNU/POSIX (mawk≠gawk, ugrep≠grep).
|
||||||
|
- **future application**: any shell test/guard whose correctness rides on grep/awk/sed semantics → pin `/usr/bin/<tool>` AND run the assertion, confirming it goes red on the defect before trusting green. Execute, don't assume command behavior. RECURRENT motif — audit any "assumed tool behavior" the way the fail-silent family ([[LRN-066]]/[[LRN-071]]) is audited.
|
||||||
|
|||||||
@ -296,4 +296,13 @@ stronger reason: speculative auto-bootstrap of an unused engine at creation is b
|
|||||||
- [x] Ref-coherence sweep ("test" for a removal) — header 12→11-step, 10c note, 4 USAGE refs; zero dangling STEP-12 refs repo-wide
|
- [x] Ref-coherence sweep ("test" for a removal) — header 12→11-step, 10c note, 4 USAGE refs; zero dangling STEP-12 refs repo-wide
|
||||||
- [x] Scope guardrails — deliberate gsd use KEPT (onboarder PHASE 6, plugin-advisor, status-reporter)
|
- [x] Scope guardrails — deliberate gsd use KEPT (onboarder PHASE 6, plugin-advisor, status-reporter)
|
||||||
- [x] Capitalize — [[BLK-011]] resolved (true reason + premise trace) + [[LRN-072]] + CHANGELOG Removed + journal 2026-06-29 (cont. 2)
|
- [x] Capitalize — [[BLK-011]] resolved (true reason + premise trace) + [[LRN-072]] + CHANGELOG Removed + journal 2026-06-29 (cont. 2)
|
||||||
- [ ] FINISH — merge bugfix/blk-011-gsd-roadmap → develop (awaiting explicit human signal)
|
- [x] FINISH — merged bugfix/blk-011-gsd-roadmap → develop (`ce4391a`); develop pushed to origin (6 commits, SSH)
|
||||||
|
|
||||||
|
## 2026-06-29 — prune-memory hardening (RED-7/8 + index backfill) [branch bugfix/prune-memory-hardening]
|
||||||
|
LAST of 3 chantiers. Read-first cartography confirmed RED-7/8 + measured 34-row index drift.
|
||||||
|
- [x] RED-7 (example-priming) — fictionalized STEP-2 example to 9xx ids (live ids primed a wrong merge of complementary LRN-014/016); DETERMINISTIC test (run-deterministic.sh) per [[LRN-046]]. Caught its own ugrep false-green → /usr/bin/grep ([[LRN-074]]). [[LRN-073]]
|
||||||
|
- [x] RED-8 (added-negation inversion) — consciously ACCEPTED as documented limit in BACKLOG ([[LRN-047]]); no fragile guard built
|
||||||
|
- [x] Index backfill — 34 missing rows (decisions 11, learnings 21, blockers 2) composed + ID-sorted insert; drift 34→0, STEP-4 verify OK; moved pre-existing out-of-order LRN-021
|
||||||
|
- [x] Capitalize — [[LRN-073]] + [[LRN-074]] + [[EVAL-010]] + journal 2026-06-29 (cont. 3)
|
||||||
|
- [ ] FINISH — merge bugfix/prune-memory-hardening → develop (awaiting explicit human signal)
|
||||||
|
- [ ] PUSH — develop → origin after the 3 chantiers land (awaiting explicit human signal)
|
||||||
|
|||||||
@ -112,21 +112,23 @@ Print one block per registry. Example:
|
|||||||
|
|
||||||
```
|
```
|
||||||
PRUNE PLAN — decisions.md (N entries → M after if approved)
|
PRUNE PLAN — decisions.md (N entries → M after if approved)
|
||||||
|
(IDs below are FICTIONAL — 9xx range, never a live registry entry — so this
|
||||||
|
worked example cannot PRIME a real prune. Apply the same shapes to real IDs.)
|
||||||
|
|
||||||
[A. Obsolete — mark superseded]
|
[A. Obsolete — mark superseded]
|
||||||
BDR-003 — Gitignore wildcard pattern — status: proposed since 2026-03-12
|
BDR-901 — example proposed decision — status: proposed since <90+ days ago>
|
||||||
→ mark: status: deprecated (no follow-up after 90 days)
|
→ mark: status: deprecated (no follow-up after 90 days)
|
||||||
BDR-011 — Client handover 4-chapter — body says superseded by BDR-013
|
BDR-902 — example decision — body says superseded by BDR-903
|
||||||
→ fix Index: status = "superseded by BDR-013"
|
→ fix Index: status = "superseded by BDR-903"
|
||||||
|
|
||||||
[B. Similar — merge]
|
[B. Similar — merge]
|
||||||
LRN-014 + LRN-016 — both pandoc rendering quirks
|
LRN-901 + LRN-902 — SAME concept (two notes on the identical bug)
|
||||||
→ propose: merge into NEW LRN-017 ("Pandoc rendering quirks")
|
→ propose: merge into NEW LRN-904 with both bodies appended +
|
||||||
with both bodies appended + caveman pass; sources marked
|
caveman pass; sources marked status: superseded by LRN-904
|
||||||
status: superseded by LRN-017
|
(merge ONLY same-concept entries — complementary/different-angle stays split)
|
||||||
|
|
||||||
[C. Bloated — inline caveman rewrite]
|
[C. Bloated — inline caveman rewrite]
|
||||||
BDR-011 — body 612 words, filler density 7.2% → ~380 expected (-38%)
|
BDR-902 — body 612 words, filler density 7.2% → ~380 expected (-38%)
|
||||||
|
|
||||||
[D. Index drift]
|
[D. Index drift]
|
||||||
(none)
|
(none)
|
||||||
|
|||||||
@ -19,7 +19,15 @@ behavior on real registries.
|
|||||||
- GREEN: fictionalize the SKILL.md example (obviously-fake IDs, or an
|
- GREEN: fictionalize the SKILL.md example (obviously-fake IDs, or an
|
||||||
explicit "hypothetical" framing) so example IDs cannot match real entries.
|
explicit "hypothetical" framing) so example IDs cannot match real entries.
|
||||||
|
|
||||||
Status: filed, not built. Surfaced by the real-data A-measurement.
|
Status: RESOLVED 2026-06-29. VERIFY-FIRST done — the real LRN-014 (pandoc header-id
|
||||||
|
stripping) and LRN-016 (pandoc checkbox CSS overlap) are COMPLEMENTARY (different
|
||||||
|
angles), NOT overlapping: the SKILL.md example modeled a *wrong* merge AND used live
|
||||||
|
IDs that primed it on real data. GREEN: the whole STEP-2 example fictionalized to 9xx
|
||||||
|
IDs (cannot match any live registry) + the merge example now models a same-concept
|
||||||
|
merge with an explicit "merge ONLY same-concept" note. Closed by a DETERMINISTIC test
|
||||||
|
(run-deterministic.sh RED-7: the STEP-2 example must carry only 9xx ids) — not the
|
||||||
|
flaky behavioral fixture originally proposed, per LRN-046 (deterministic oracle >
|
||||||
|
semantic judge on a destructive skill). Test caught its own ugrep false-green first.
|
||||||
|
|
||||||
## RED-8 (candidate) — added-negation inversion (documented limit, not a test yet)
|
## RED-8 (candidate) — added-negation inversion (documented limit, not a test yet)
|
||||||
The RED-5 fidelity guard flags negation/permanent token DROPS; it cannot catch
|
The RED-5 fidelity guard flags negation/permanent token DROPS; it cannot catch
|
||||||
@ -35,4 +43,12 @@ requires ADDING a word, contrary to an operation that shortens.
|
|||||||
- RED (if pursued): assert no op INCREASES an existing entry's negation count.
|
- RED (if pursued): assert no op INCREASES an existing entry's negation count.
|
||||||
- Caveat: must exclude new/merged-entry ids (HEAD count 0 -> N is legitimate),
|
- Caveat: must exclude new/merged-entry ids (HEAD count 0 -> N is legitimate),
|
||||||
so an increase-check needs care to avoid its own false positives.
|
so an increase-check needs care to avoid its own false positives.
|
||||||
Status: documented limit, not built (low practical risk + non-trivial FP risk).
|
Status: CONSCIOUSLY ACCEPTED as a documented limit 2026-06-29 (re-reviewed, not built).
|
||||||
|
Rationale held on re-read: (1) remote — caveman/merge SUBTRACT tokens; authoring a new
|
||||||
|
negation runs against the operation; no evidence in the real-data measurement (the
|
||||||
|
"+7 not/no" in EVAL-006 is new/merged-entry ids going 0→N, NOT an existing entry
|
||||||
|
inverted). (2) An FP-safe increase-check is non-trivial: the census only emits non-zero
|
||||||
|
counts, so a 0→1 ADD produces a working-line with NO HEAD-line to compare — catching it
|
||||||
|
needs the HEAD entry-id set to exclude legitimately-new/merged ids. A noisy increase-check
|
||||||
|
= a guard you learn to ignore (LRN-047), worse than the honest documented limit on a
|
||||||
|
destructive skill. Revisit only if a real inversion is ever observed.
|
||||||
|
|||||||
@ -1,5 +1,5 @@
|
|||||||
#!/usr/bin/env bash
|
#!/usr/bin/env bash
|
||||||
# Deterministic RED suite for /prune-memory — RED-1, RED-2, RED-5, RED-6.
|
# Deterministic RED suite for /prune-memory — RED-1, RED-2, RED-5, RED-6, RED-7.
|
||||||
# Each MUST be red on the current (v1) skill. Pure mechanical oracles,
|
# Each MUST be red on the current (v1) skill. Pure mechanical oracles,
|
||||||
# no LLM. Faithful: RED-2/RED-6 execute the REAL bash blocks extracted
|
# no LLM. Faithful: RED-2/RED-6 execute the REAL bash blocks extracted
|
||||||
# from SKILL.md (no copy that could drift).
|
# from SKILL.md (no copy that could drift).
|
||||||
@ -83,6 +83,21 @@ else
|
|||||||
green 6 "verify does not false-orphan the title-less heading"
|
green 6 "verify does not false-orphan the title-less heading"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
# ---- RED-7: STEP 2 plan example must use FICTIONAL ids, never live registry ids
|
||||||
|
# Live ids in the worked example PRIME the skill to act on those exact entries on
|
||||||
|
# real data (observed 2026-06-25: it merged the example's LRN-014 + LRN-016 on the
|
||||||
|
# live learnings.md, though they are complementary, not overlapping). Fictional ids
|
||||||
|
# (9xx) cannot match a real registry. Reads SKILL.md only — sandbox-safe.
|
||||||
|
# /usr/bin/grep (not the system grep, which may be ugrep — a leading-dash pattern
|
||||||
|
# like `-9..` is then misparsed as an option, erroring to an empty + FALSE GREEN).
|
||||||
|
ex7="$(awk '/^PRUNE PLAN/{f=1} f{print} /^Approve per category/{f=0; exit}' "$SKILL")"
|
||||||
|
bad7="$(printf '%s\n' "$ex7" | /usr/bin/grep -oE '(BDR|LRN|BLK|EVAL)-[0-9]+' | /usr/bin/grep -vE '9[0-9][0-9]$' | sort -u | tr '\n' ' ')"
|
||||||
|
if [ -n "${bad7// /}" ]; then
|
||||||
|
red 7 "STEP 2 example uses LIVE-range ids (prime real-data ops): ${bad7% }"
|
||||||
|
else
|
||||||
|
green 7 "STEP 2 example uses only fictional (9xx) ids"
|
||||||
|
fi
|
||||||
|
|
||||||
echo "----"
|
echo "----"
|
||||||
if [ "$fail" -eq 0 ]; then
|
if [ "$fail" -eq 0 ]; then
|
||||||
echo "SUITE: all GREEN"
|
echo "SUITE: all GREEN"
|
||||||
|
|||||||
Loading…
Reference in New Issue
Block a user