Commit Graph

7 Commits

Author SHA1 Message Date
Bastien Chanot
5821ce2017 fix(prune-memory): RED-7 fictional example IDs + RED-8 accepted limit
RED-7 (example-priming): the STEP-2 worked example named live IDs (LRN-014 +
LRN-016) and modeled merging them — but they are complementary (header-ids vs
checkbox-CSS), a merge the skill's own rule forbids. Live IDs in an example prime
the skill to act on those exact entries on real data. Fictionalized the whole
STEP-2 example to 9xx IDs (cannot match a live registry); the merge example now
models a same-concept merge. Closed by a DETERMINISTIC test (run-deterministic.sh
RED-7: the example must carry only 9xx ids) per LRN-046, not a flaky behavioral
fixture. The test caught its own ugrep false-green first (a leading-dash pattern
parsed as an option) — fixed via /usr/bin/grep, the same dodge the skill's verify
already uses at line 189.
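
A minimal sketch of what such a deterministic check might look like; it is not the actual run-deterministic.sh. The helper name `only_9xx_ids` and the ID shape (3-4 uppercase letters, a dash, three digits) are assumptions. Passing the dash-leading pattern with `-e` is one standard way to stop grep from parsing it as an option; the commit itself took a different route and called /usr/bin/grep.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and ID shape are assumptions, not taken from the skill).
# Reads text on stdin and succeeds only if every registry-style ID is in the 9xx range.
only_9xx_ids() {
  local ids live
  # Collect every ID-shaped token, e.g. LRN-901 or EVAL-014.
  ids=$(grep -oE '[A-Z]{3,4}-[0-9]{3}' || true)
  # The pattern below starts with a dash, so it is passed via -e;
  # without -e some grep implementations read it as an option.
  live=$(printf '%s\n' "$ids" | grep -vE -e '-9[0-9]{2}$' || true)
  if [ -n "$live" ]; then
    # Unquoted on purpose: one output line per offending ID.
    printf 'live-looking id: %s\n' $live >&2
    return 1
  fi
  return 0
}
```

Usage: `only_9xx_ids < example-block.txt || exit 1`, where `example-block.txt` stands for the extracted STEP-2 example (hypothetical file name).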

RED-8 (added-negation inversion): re-reviewed and consciously accepted as a documented
limit in BACKLOG. The risk is remote (compression subtracts tokens), and an FP-safe
increase check is non-trivial (it needs the HEAD entry-id set to exclude legit
new/merged 0->N); a noisy guard is worse than the honest limit on a destructive
skill (LRN-047).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01C6bUdvHnajCNzgVQefZowj
2026-06-29 19:25:42 +02:00
Bastien Chanot
0a3e76611d fix(skill): prune-memory v1.1 — deterministic guards close 6 TDD'd defects
Only destructive skill, previously untested. A RED suite (tests/) proved 6
dangers; each closed by a deterministic guard:
- RED-1 removed false "Fixed in v1.1 (TDD found it)" verify claim
- RED-2 STEP 0 dirty-tree is now a real exit 1 (was a prose-only STOP)
- RED-3 STEP 3.4 negation-sentence verbatim guard (no silent inversion)
- RED-4 STEP 1-A collapse safety-critical exception (NEVER/ALWAYS/PERMANENT)
- RED-5 STEP 4 fidelity census (count-based, per-entry x per-category)
- RED-6 STEP 4 trailing-space false-ORPHAN fix
Tests: run-deterministic.sh (all green), run-behavioral.md, fixtures, BACKLOG
(RED-7/RED-8 open). Validated on the real learnings.md: 0 fidelity
false positives vs 13, scope held, registry reverted.
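
For RED-2, a hard dirty-tree stop could be sketched as follows. This is an illustration under assumptions, not the skill's actual STEP 0: the function name `require_clean_tree` and the message text are hypothetical. It relies only on `git status --porcelain`, which prints nothing when the working tree is clean.

```shell
#!/usr/bin/env bash
# Hypothetical STEP 0 guard (function name and message are assumptions).
# Returns non-zero when the repo at $1 (default: current dir) has any
# staged, unstaged, or untracked changes.
require_clean_tree() {
  local repo=${1:-.}
  if [ -n "$(git -C "$repo" status --porcelain)" ]; then
    echo "STOP: working tree is dirty; commit or stash first" >&2
    return 1
  fi
}
```

A caller would turn this into a real exit with `require_clean_tree || exit 1`, so the stop no longer depends on prose being followed.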

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01W9sqAwZxBMZSynZoVrEJhd
2026-06-25 22:56:10 +02:00
Bastien Chanot
d4a5cfec93 chore(caveman): purge plugin + always-on integration
Disable + uninstall caveman@caveman and delete every repo dependency on
it: SessionStart/UserPromptSubmit hook blocks, standalone hook files,
settings.json enabledPlugins + marketplace entries, install-plugins.sh
STEP 5.5, update-all.sh refresh step, plugins.lock.json entry, doctor.sh
checks, lib/detect-plugins.sh helpers, lib/profile.sh + plugin-advisor +
skills/profile protected-list entries, .gitignore runtime-file block,
and README/USAGE docs. Dead /caveman:compress refs replaced with
manual/claude.ai guidance. Memory-registry terse-format convention kept
(separate subsystem). Version 3.4.0 -> 3.5.0.

On a subscription plan caveman's ~75% output-token compression has no
cost benefit, and the always-on hooks added friction on validation
gates and client deliverables.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01X3e8LaH2vymmxyh36h3jFU
2026-06-19 19:08:40 +02:00
Bastien Chanot
61a98573d7 chore(skills): remove disable-model-invocation repo-wide
Strip the disable-model-invocation frontmatter key from all 19 editable skills. Absent = default = model invocation enabled. 8 were 'true' and blocked the model AND orchestrators from self-routing (status, plugin-check, analyze, onboard, refactor, init-project, pdf-translate, ship-feature) — contradicting the CLAUDE.md skill-routing rules. The other 11 were 'false', a no-op noise line.

The setting is binary (no per-caller granularity), so enabling orchestrator chaining also enables model auto-fire — accepted. Genuinely destructive operations remain guarded by the careful/guard hooks, independent of this flag.

Capitalized: BDR-019 (decision), LRN-026 (learning), journal 2026-06-09.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 16:18:10 +02:00
bastien
11b0c6bb8e darwin: add test prompts for 5 skills (baseline pass)
Skills covered: close, graphify, harden, profile, prune-memory.
Used by /darwin-skill dim 8 effect testing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 17:50:06 +02:00
bastien
0fe6153ead fix(prune-memory): STEP 4 verify — prefix mapping bug (TDD RED→GREEN)
First end-to-end run of /prune-memory on real .claude/memory/ surfaced
a broken verify script:

Old: `prefix=$(basename "$f" .md | tr a-z A-Z | cut -c1-3)` derived
the prefix from the filename's first 3 letters → produced DEC / LEA /
BLO. Actual prefixes are BDR / LRN / BLK. The grep then matched zero
entries, no MISSING/ORPHAN was ever reported, and the script printed
its "OK if blank" footer regardless of real state. False clean signal.

Fixed: hard-mapped filename → prefix via `declare -A PREFIX_MAP`.
Verified against current registries — 14 BDR + 16 LRN + 2 BLK + 1 EVAL
entries all index-consistent, no false negatives.
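
The difference between the two derivations can be sketched like this. It is a reconstruction for illustration, not the shipped script: `learnings.md` and `evals.md` are named in this log, while `decisions.md` and `blockers.md` are inferred from the DEC / BLO output and are assumptions, as are the function names `old_prefix` and `new_prefix`.

```shell
#!/usr/bin/env bash
# Associative arrays need bash 4 or newer.
# ASSUMPTION: decisions/blockers file names are inferred from the DEC/BLO output.
declare -A PREFIX_MAP=(
  [decisions]=BDR
  [learnings]=LRN
  [blockers]=BLK
  [evals]=EVAL
)

# Old behaviour: first three letters of the file name, upper-cased.
old_prefix() {
  basename "$1" .md | tr a-z A-Z | cut -c1-3
}

# New behaviour: explicit lookup; an unmapped file is an error, not a guess.
new_prefix() {
  local name
  name=$(basename "$1" .md)
  if [ -z "${PREFIX_MAP[$name]+set}" ]; then
    echo "no prefix mapped for $1" >&2
    return 1
  fi
  printf '%s\n' "${PREFIX_MAP[$name]}"
}
```

With these definitions `old_prefix learnings.md` prints `LEA`, which matches no entry, while `new_prefix learnings.md` prints `LRN`. Failing loudly on an unmapped file avoids the silent zero-match case described above.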

Added EVAL prefix to the map (evals.md was missing from the loop in
v1). Footer line clarified to `(blank above = OK)`. `wc -l` excludes
`.original.md` backups from the output.

Note: caveat in skill body said "v1 ships without baseline TDD test —
STEP 2 approval gate is the safety net". First real test caught a
verify bug that bypassed STEP 2 entirely. Lesson: STEP 4 is its own
safety net and needs its own test.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-11 16:27:31 +02:00
bastien
fd619f5f9a feat(skills): add /prune-memory — curate .claude/memory/ registries
New personal skill to maintain memory registry hygiene. Gap identified
between existing tools:
- /caveman:compress — text-compresses one file, no curation
- /close — appends new entries end-of-session, doesn't prune
- /prune-memory (new) — audits, classifies, applies user-approved cleanup

Operations:
- Mark obsolete entries `status: superseded by <ID>` or `status: deprecated`
  (no hard delete — append-only per CLAUDE.md memory rule).
- Merge similar entries (new ID, sources marked superseded).
- Caveman-compress bloated prose-heavy entries inline.
- Repair Index drift (missing rows, orphaned rows).
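
The Index-drift check in the last bullet could be sketched as below. The registry layout is an assumption made for illustration only: entries headed `## <ID>` and Index rows starting with `| <ID> |`. The function name `index_drift` is hypothetical; MISSING means an entry without an Index row, ORPHAN means an Index row without an entry.

```shell
#!/usr/bin/env bash
# Hypothetical drift report. ASSUMED layout (not confirmed by the skill):
#   entry heading:  "## LRN-014"
#   Index row:      "| LRN-014 | ..."
index_drift() {
  local file=$1 prefix=$2
  local entries index
  entries=$(grep -oE "^## ${prefix}-[0-9]+" "$file" | sed 's/^## //' | sort -u)
  index=$(grep -oE "^\| ${prefix}-[0-9]+" "$file" | sed 's/^| //' | sort -u)
  # IDs present as entries but absent from the Index.
  comm -23 <(printf '%s\n' "$entries") <(printf '%s\n' "$index") \
    | sed '/^$/d; s/^/MISSING /'
  # IDs listed in the Index that have no entry.
  comm -13 <(printf '%s\n' "$entries") <(printf '%s\n' "$index") \
    | sed '/^$/d; s/^/ORPHAN /'
}
```

Usage: `index_drift .claude/memory/learnings.md LRN`; blank output means the Index and the entries agree.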

Workflow: STEP 0 precheck (refuses dirty working tree, git = backup)
→ STEP 1 audit (A obsolete / B similar / C bloated / D drift)
→ STEP 2 plan + mandatory user approval
→ STEP 3 apply safe→destructive
→ STEP 4 verify Index sanity + line-count report.

Follows superpowers:writing-skills CSO conventions: "Use when..." trigger
description (under 1024-char spec), Quick Reference table, Common
Mistakes table, Failure Paths table. v1 ships without baseline TDD test
(noted in skill body); STEP 2 approval gate is the safety net.

Co-Authored-By: Claude <noreply@anthropic.com>
2026-05-11 16:24:16 +02:00