memory: capitalize darwin 2026-05-12 — LRN-017/018 + BLK-003 + BDR-015

LRN-017: thin-dispatcher SKILL.md round-1 invariant = fallback + triggers.
LRN-018: darwin eval subagents drift on total math (factor-10, D8 weight).
BLK-003: scripts/screenshot.mjs hardcoded macOS path blocks PNG cards on Linux.
BDR-015: exclude broken gstack symlinks from /darwin-skill scope.
journal: 2026-05-12 session entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
bastien 2026-05-12 18:07:29 +02:00
parent f4ceed20bb
commit f4a0cbe24d
4 changed files with 76 additions and 1 deletions

View File

@ -22,6 +22,7 @@ rules:
|----|------|---------|--------| |----|------|---------|--------|
| BLK-001 | 2026-04-22 | `rtk curl` breaks JSON pipelines | upstream | | BLK-001 | 2026-04-22 | `rtk curl` breaks JSON pipelines | upstream |
| BLK-002 | 2026-04-23 | `rmdir` denied in sandbox on empty directory | resolved | | BLK-002 | 2026-04-23 | `rmdir` denied in sandbox on empty directory | resolved |
| BLK-003 | 2026-05-12 | `scripts/screenshot.mjs` hardcoded macOS path blocks PNG cards on Linux | upstream |
--- ---
@ -44,4 +45,15 @@ rules:
- **Solution**: - **Solution**:
- This session: `git rm tasks/*.md` handled files individually (via `git rm`, cleared gate). Git auto-detected renames to `.claude/tasks/`, so `tasks/` directory removed implicitly at commit time. - This session: `git rm tasks/*.md` handled files individually (via `git rm`, cleared gate). Git auto-detected renames to `.claude/tasks/`, so `tasks/` directory removed implicitly at commit time.
- If dir persists empty after `git rm`: ask user to run `rmdir tasks` manually. - If dir persists empty after `git rm`: ask user to run `rmdir tasks` manually.
- **Status**: resolved (fixed via `git rm` + rename auto-detection; no `rmdir` needed in practice). - **Status**: resolved (fixed via `git rm` + rename auto-detection; no `rmdir` needed in practice).
## BLK-003 — `scripts/screenshot.mjs` hardcoded macOS path blocks PNG cards on Linux
- **Date**: 2026-05-12
- **Friction**: `/darwin-skill` Phase 3 generates result cards via `node ~/.agents/skills/darwin-skill/scripts/screenshot.mjs <html> <png>`. On Linux: script fails immediately — `require('/Users/alchain/.npm-global/lib/node_modules/playwright/node_modules/playwright-core')` resolves to a non-existent macOS user path. No PNG cards produced; Phase 3 falls back to markdown report only.
- **Real cause**: upstream `alchaincyf/darwin-skill` author dev'd on macOS, shipped absolute path to their own homedir's global npm install of playwright. Zero portability layer (no PATH lookup, no `playwright` bare require, no fallback to `npx`).
- **Solution**:
- Workaround (used 2026-05-12): skip PNG generation, deliver markdown + HTML cards (HTML viewable in browser without playwright).
- Local patch: `npm i -g playwright` then replace `require('/Users/alchain/...')` with `require('playwright')`. Two lines edit.
- Spec-documented fallback: `npx playwright screenshot "file:///path/to/card.html#<theme>" out.png --viewport-size=960,1280 --wait-for-timeout=2000` — works without modifying the file, costs ~150MB chromium download.
- PR upstream to `github.com/alchaincyf/darwin-skill` once tested.
- **Status**: upstream (third-party skill at `~/.agents/skills/darwin-skill/scripts/screenshot.mjs`, not in any of our repos).

View File

@ -36,6 +36,7 @@ rules:
| BDR-012 | 2026-05-07 | client-handover cover: white bg + green accents + PNG logo default | accepted | | BDR-012 | 2026-05-07 | client-handover cover: white bg + green accents + PNG logo default | accepted |
| BDR-013 | 2026-05-11 | client-handover: 6-chapter doc — promote scores §2 + NAP §4 | accepted | | BDR-013 | 2026-05-11 | client-handover: 6-chapter doc — promote scores §2 + NAP §4 | accepted |
| BDR-014 | 2026-05-11 | Personal SKILL.md descriptions: "Use when [triggers]…" pattern + 1024-char spec limit | accepted | | BDR-014 | 2026-05-11 | Personal SKILL.md descriptions: "Use when [triggers]…" pattern + 1024-char spec limit | accepted |
| BDR-015 | 2026-05-12 | Exclude broken gstack symlinks from /darwin-skill scope (external ownership) | accepted |
--- ---
@ -264,3 +265,20 @@ rules:
- Orchestrators still describe orchestration role explicitly (e.g. client-handover: "Multi-agent orchestrator: dispatches the client-handover-writer agent which spawns parallel /seo + /harden subagents") — that's role identification, not workflow summary. - Orchestrators still describe orchestration role explicitly (e.g. client-handover: "Multi-agent orchestrator: dispatches the client-handover-writer agent which spawns parallel /seo + /harden subagents") — that's role identification, not workflow summary.
- Other 10 personal skills (analyze, bugfix, code-clean, commit-change, feat, hotfix, plugin-check, refactor, status, skills-perso) still partially summarize workflow but stay under 1024 chars. Not retrofitted in this pass — flagged for follow-up only if shortcut symptoms observed. - Other 10 personal skills (analyze, bugfix, code-clean, commit-change, feat, hotfix, plugin-check, refactor, status, skills-perso) still partially summarize workflow but stay under 1024 chars. Not retrofitted in this pass — flagged for follow-up only if shortcut symptoms observed.
- **Reference**: commit `1da6a31`, 8 SKILL.md files (client-handover, doc, geo, seo, validate, ship-feature, init-project, onboard), superpowers:writing-skills "CSO" section, agentskills.io/specification. - **Reference**: commit `1da6a31`, 8 SKILL.md files (client-handover, doc, geo, seo, validate, ship-feature, init-project, onboard), superpowers:writing-skills "CSO" section, agentskills.io/specification.
---
## BDR-015 — Exclude broken gstack symlinks from /darwin-skill scope (external ownership)
- **Date**: 2026-05-12
- **Status**: accepted
- **Decision**: 5 dirs in `~/Documents/claude/skills/` whose `SKILL.md` symlinks point to non-existent gstack paths (`skills-external/gstack/<name>/SKILL.md` missing) — `benchmark-models`, `context-restore`, `context-save`, `make-pdf`, `plan-tune` — are excluded from `/darwin-skill` baseline + optimization. Marked `status=error` in `results.tsv` with note `broken gstack symlink — out of scope`. NOT scored, NOT optimized, NOT deleted.
- **Why**: darwin-skill constraint #1 forbids changing a skill's core function — implies external/gstack-owned skills are out of scope. Symlinks resolve to `skills-external/gstack` which is third-party submodule. Plus the targets are broken — gstack's actual layout (`benchmark/`, `health/`, `qa/`, etc.) doesn't include these 5 names, suggesting upstream rename or removal. Repairing them is a separate triage task, not darwin's concern.
- **Alternatives rejected**:
- Fix symlinks first then darwin-optimize → out of scope, blocks the optimization queue on gstack archaeology.
- Score them with `FILE_NOT_FOUND` and include in averages → biases stats, mixes signal with infrastructure issue.
- Optimize the gstack source files directly → external ownership, never modify.
- Delete the broken symlinks → would obscure that the user once expected these to exist; leave for triage.
- **Caveats**:
- If/when symlinks are repaired (real gstack target exists), re-run baseline to bring them in scope.
- Bigger picture: `benchmark-models` looks like a deliberate rename of gstack's `benchmark` to disambiguate from the gstack-skill called `/benchmark`. Could be a planned migration that stalled. Worth a one-line ticket separate from darwin.

View File

@ -87,3 +87,15 @@ rules:
- NAP checklist polish (commit `abd2612`): added "Description courte" field + replaced retired BrightLocal Free Tools with Moz Local Citation Checker (LRN-015). - NAP checklist polish (commit `abd2612`): added "Description courte" field + replaced retired BrightLocal Free Tools with Moz Local Citation Checker (LRN-015).
- CSS bugfix (commit `465fe9e`): pandoc GFM checkbox markup `<li><input ...> text…</li>` has no wrapper class, adjacent-sibling rule `li input + *` yanks `<a>`/`<code>` siblings out of flow. Fixed by targeting `li > input[type="checkbox"]` directly. Captured as LRN-016. - CSS bugfix (commit `465fe9e`): pandoc GFM checkbox markup `<li><input ...> text…</li>` has no wrapper class, adjacent-sibling rule `li input + *` yanks `<a>`/`<code>` siblings out of flow. Fixed by targeting `li > input[type="checkbox"]` directly. Captured as LRN-016.
- 4 atomic commits `b15b275..1da6a31` via `/commit-change`. Decisions BDR-013, BDR-014 + learnings LRN-014, LRN-015, LRN-016 capitalized. Pre-existing BDR-012 + LRN-013 Index rows backfilled (prior session entries existed in body but missing from Index). - 4 atomic commits `b15b275..1da6a31` via `/commit-change`. Decisions BDR-013, BDR-014 + learnings LRN-014, LRN-015, LRN-016 capitalized. Pre-existing BDR-012 + LRN-013 Index rows backfilled (prior session entries existed in body but missing from Index).
## 2026-05-12
- Ran `/darwin-skill` full pipeline on cwd repo (real skill source, not `~/.claude/skills/` runtime mirror). Baseline scored 23 personal skills + 5 broken gstack symlinks excluded. Avg baseline 75.6.
- Phase 2 round 1 on bottom 5: status 45.3→76.2 (+30.9), refactor 48.4→74.3 (+25.9), plugin-check 59.2→76.8 (+17.6), skills-perso 66.4→80.1 (+13.7), commit-change 69.6→83.5 (+13.9). All KEEP. Avg 58.0→78.2 (+20.2/skill).
- Rounds 2-3 skipped — diminishing returns past round 1 on dispatcher pattern. graphify (29.0, 62KB SKILL.md) deferred to Phase 2.5 exploratory rewrite per user.
- Pattern observed: thin-dispatcher round-1 invariant = fallback + frontmatter triggers. Replicable across the 4 dispatchers tested. Captured as LRN-017.
- Methodology gotcha: darwin eval subagents drift on total math (factor-10 errors, D8 weight 7 vs 25). Direction reliable, magnitude noisy. Captured as LRN-018. Recompute totals in main thread going forward.
- BDR-015: broken gstack symlinks (5 dirs) excluded from darwin scope — external ownership + missing targets.
- BLK-003: `scripts/screenshot.mjs` hardcoded macOS path → PNG cards skipped on Linux. Markdown report + 5 new test-prompts.json + 5 optimized SKILL.md only. Upstream issue, workaround in place.
- Branch `auto-optimize/20260512-1319` merged via `--no-ff` to master. 6 commits land. Report at `.claude/audits/DARWIN-SKILL-2026-05-12.md`. results.tsv at `~/.agents/skills/darwin-skill/results.tsv` (33 rows).
- Pre-existing uncommitted `agents/doc-syncer.md` (mtime 15:33, before session) NOT touched — left for the work session that owns it.

View File

@ -36,6 +36,8 @@ rules:
| LRN-014 | 2026-05-11 | Pandoc base gfm strips header id attrs — need `gfm+gfm_auto_identifiers` | any MD→HTML/PDF with cross-references (`[§4](#nap)`) via pandoc | | LRN-014 | 2026-05-11 | Pandoc base gfm strips header id attrs — need `gfm+gfm_auto_identifiers` | any MD→HTML/PDF with cross-references (`[§4](#nap)`) via pandoc |
| LRN-015 | 2026-05-11 | BrightLocal Free Tools retired 2026 — Moz Local Citation Checker is free replacement | client SEO/NAP docs — re-validate tool URLs + free-tier status annually | | LRN-015 | 2026-05-11 | BrightLocal Free Tools retired 2026 — Moz Local Citation Checker is free replacement | client SEO/NAP docs — re-validate tool URLs + free-tier status annually |
| LRN-016 | 2026-05-11 | Pandoc GFM checkbox markup breaks adjacent-sibling CSS — target `li > input` directly | styling task-list checkboxes in pandoc-rendered HTML/PDF | | LRN-016 | 2026-05-11 | Pandoc GFM checkbox markup breaks adjacent-sibling CSS — target `li > input` directly | styling task-list checkboxes in pandoc-rendered HTML/PDF |
| LRN-017 | 2026-05-12 | Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30) | any `/darwin-skill` round-1 on a dispatcher SKILL.md |
| LRN-018 | 2026-05-12 | Darwin eval subagents drift on total math — recompute in main thread | any subagent-driven SKILL.md rescore |
--- ---
@ -232,3 +234,34 @@ rules:
- Render checklist with realistic content (`<a>`, `<code>`, `<strong>`) before signing off — bare text bullets won't surface the bug. - Render checklist with realistic content (`<a>`, `<code>`, `<strong>`) before signing off — bare text bullets won't surface the bug.
- Symptom signature: rendered PDF has overlapping inline elements ONLY in task lists — points to a sibling-selector rule firing on inline content. - Symptom signature: rendered PDF has overlapping inline elements ONLY in task lists — points to a sibling-selector rule firing on inline content.
- **Reference**: `skills/client-handover/resources/branding/zenquality.css` `li > input[type="checkbox"]` rule + `li.task-list-item::before` (lines 372410). Commit `465fe9e`. - **Reference**: `skills/client-handover/resources/branding/zenquality.css` `li > input[type="checkbox"]` rule + `li.task-list-item::before` (lines 372410). Commit `465fe9e`.
---
## LRN-017 — Thin-dispatcher SKILL.md round-1 win = fallback + frontmatter triggers (+15 to +30)
- **Date**: 2026-05-12
- **Pattern**: thin-dispatcher SKILL.md (delegates to `agents/<x>.md`, body 15-30 lines, no inline workflow) scores low on darwin rubric (45-70) because dims D2/D3/D4/D5 punish empty body. Round-1 universal fix:
1. Add fallback clause — `If $HOME/.claude/agents/<x>.md unreachable, emit "<X> agent missing." and STOP. Never improvise — silent behavior change is unsafe.`
2. Add triggers to frontmatter `description` — explicit `Triggers: "<keyword>", "<synonym>", "<i18n variant>".`
3. For destructive skills (refactor, commit-change): add safety rationale + pre-flight check stub.
Δ +13 to +31 observed: status 45.3→76.2 (+30.9), refactor 48.4→74.3 (+25.9), plugin-check 59.2→76.8 (+17.6), commit-change 69.6→83.5 (+13.9). 150% byte cap tight — trim aggressively.
- **Context**: `/darwin-skill` run 2026-05-12, branch `auto-optimize/20260512-1319` merged to master, 5 commits. skills-perso (66.4→80.1, +13.7) NOT a dispatcher — different patch (Known-limits subsection on the heuristic).
- **Future application**:
- Any darwin round-1 on a dispatcher SKILL.md → skip diagnosis, apply this template directly. Saves one eval cycle.
- After round 1, gains flatten near 75-80 → pivot to next-lowest skill, do not grind rounds 2-3 on same target.
- For thin originals (<500B), 150% cap is the binding constraint pre-trim drafts before committing.
- **Reference**: `.claude/audits/DARWIN-SKILL-2026-05-12.md`. Commits `512df48`..`134561d`. results.tsv at `~/.agents/skills/darwin-skill/results.tsv`.
---
## LRN-018 — Darwin eval subagents drift on total math — recompute in main thread
- **Date**: 2026-05-12
- **Pattern**: analyzer subagents asked to score SKILL.md and compute weighted total drift on the formula. Two recurring errors: (a) divide `Σ(dim×weight)` by `100` instead of `10` (off by factor 10 — produces 6.17 instead of 61.7, then sometimes the subagent silently re-multiplies); (b) use D8 weight 7 instead of the spec value 25 (status: spec says D8 weight = 25, easy to confuse with D4 weight = 7). Per-dim judgments themselves stable across runs; computed totals unreliable.
- **Context**: 5 round-1 evals during darwin 2026-05-12. Refactor subagent computed 743÷10 correctly in scratch but wrote `617/100 = 61.7` — actual correct total 74.3. Subsequent prompts explicitly stating "D8 weight is 25" cleared the second error.
- **Future application**:
- Prompt subagent for dim scores only, not weighted total. Main thread computes `Σ(dim_i × weight_i) / 10` deterministically.
- If subagent must compute, include weight table in prompt AND show example computation for one row.
- When comparing baseline vs round-N, use main-thread recomputed totals on BOTH sides, not the two subagents' self-reported numbers.
- Score recalibration between baseline subagent and round-1 subagent is real (independent re-anchoring) — first-round Δ tends to overstate improvement. Direction reliable, magnitude noisy.
- **Reference**: see "Methodology notes" section of `.claude/audits/DARWIN-SKILL-2026-05-12.md`.