RED-7 (example-priming): the STEP-2 worked example named live IDs (LRN-014 +
LRN-016) and modeled merging them — but they are complementary (header-ids vs
checkbox-CSS), a merge the skill's own rule forbids. Live IDs in an example prime
the skill to act on those exact entries on real data. Fictionalized the whole
STEP-2 example to 9xx IDs (cannot match a live registry); the merge example now
models a same-concept merge. Closed by a DETERMINISTIC test (run-deterministic.sh
RED-7: the example must carry only 9xx ids) per LRN-046, not a flaky behavioral
fixture. The test caught its own ugrep false-green first (a leading-dash pattern
parsed as an option) — fixed via /usr/bin/grep, the same dodge the skill's verify
already uses at line 189.
RED-8 (added-negation inversion): re-reviewed, consciously accepted as a documented
limit in BACKLOG — remote (compression subtracts tokens), and an FP-safe increase
check is non-trivial (needs the HEAD entry-id set to exclude legit new/merged 0->N);
a noisy guard is worse than the honest limit on a destructive skill (LRN-047).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01C6bUdvHnajCNzgVQefZowj
Only destructive skill, previously untested. A RED suite (tests/) proved 6
dangers; each closed by a deterministic guard:
- RED-1 removed false "Fixed in v1.1 (TDD found it)" verify claim
- RED-2 STEP 0 dirty-tree is now a real exit 1 (was a prose-only STOP)
- RED-3 STEP 3.4 negation-sentence verbatim guard (no silent inversion)
- RED-4 STEP 1-A collapse safety-critical exception (NEVER/ALWAYS/PERMANENT)
- RED-5 STEP 4 fidelity census (count-based, per-entry x per-category)
- RED-6 STEP 4 trailing-space false-ORPHAN fix
Tests: run-deterministic.sh (all-green), run-behavioral.md, fixtures, BACKLOG
(RED-7/RED-8 open). Validated on the real learnings.md: 0 fidelity
false-positive vs 13, scope held, registry reverted.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W9sqAwZxBMZSynZoVrEJhd
Strip the disable-model-invocation frontmatter key from all 19 editable skills. Absent = default = model invocation enabled. 8 were 'true' and blocked the model AND orchestrators from self-routing (status, plugin-check, analyze, onboard, refactor, init-project, pdf-translate, ship-feature) — contradicting the CLAUDE.md skill-routing rules. The other 11 were 'false', a no-op noise line.
The setting is binary (no per-caller granularity), so enabling orchestrator chaining also enables model auto-fire — accepted. Genuinely destructive operations remain guarded by the careful/guard hooks, independent of this flag.
Capitalized: BDR-019 (decision), LRN-026 (learning), journal 2026-06-09.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First end-to-end run of /prune-memory on real .claude/memory/ surfaced
a broken verify script:
Old: `prefix=$(basename "$f" .md | tr a-z A-Z | cut -c1-3)` derived
the prefix from the filename's first 3 letters → produced DEC / LEA /
BLO. Actual prefixes are BDR / LRN / BLK. The grep then matched zero
entries, no MISSING/ORPHAN was ever reported, and the script printed
its "OK if blank" footer regardless of real state. False clean signal.
Fixed: hard-mapped filename → prefix via `declare -A PREFIX_MAP`.
Verified against current registries — 14 BDR + 16 LRN + 2 BLK + 1 EVAL
entries all index-consistent, no false negatives.
Added EVAL prefix to the map (evals.md was missing from the loop in
v1). Footer line clarified to `(blank above = OK)`. `wc -l` excludes
`.original.md` backups from the output.
Note: caveat in skill body said "v1 ships without baseline TDD test —
STEP 2 approval gate is the safety net". First real test caught a
verify bug that bypassed STEP 2 entirely. Lesson: STEP 4 is its own
safety net and needs its own test.
Co-Authored-By: Claude <noreply@anthropic.com>