chore(memory): EVAL-005 — obsolete effort alias missed (cross-config audit gap)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01UyNYwD4UccVw9ZCFZyJX55
This commit is contained in:
parent
d0a3740de5
commit
6516b85f0f
@ -64,3 +64,13 @@ rules:
|
||||
- **Method**: 5 parallel structure judges (shared rubric file, calibration anchor, lower-score-when-hesitating rule) + 5 behavior tests on fixtures (hotfix, geo, commit-change, status, analyze) + geo fix validated by re-test (0 source edits, `?? .claude/` only) + 2/2 counterbalanced blind judges (safety 3→9).
|
||||
- **Anomalies**: (1) KEY: stub skills (analyze 33.5, hotfix 36.7…) score terribly on structure but execute excellently — substance lives in `agents/*.md`; rubric must judge SKILL.md+agent.md as system, else misleading. (2) geo confirmed live: 2 HTML source files edited unsupervised pre-fix. (3) Self-inflicted: overwrote 5 pre-existing test-prompts.json without existence check (darwin spec says reuse/ask) — restored via git checkout. (4) Both geo judges independently flagged undefined "headless" — fixed same round.
|
||||
- **Action**: keep — bugs real, fixes verified. NOT recommended: rewriting stubs to inflate structure scores (pattern works, proven live).
|
||||
|
||||
---
|
||||
|
||||
## EVAL-005 — Obsolete `claude --effort max` alias missed across repeated Step 9 edits
|
||||
|
||||
- **Date**: 2026-06-23
|
||||
- **Output checked**: install-plugins.sh Step 9 kept `alias claude='claude --effort max'` while `settings.json` sets `"effortLevel": "xhigh"` (the source of truth). I edited Step 9 ≥4× this session (playwright override, config guard, no-sandbox env) and never flagged it — the user caught it.
|
||||
- **Method / why missed**: I treated the pre-existing `CLAUDE_LINES` as established and only touched the lines I was adding/removing. Spotting the redundancy needs cross-referencing TWO config layers (shell alias vs settings.json) — a semantic check I never ran. Masked further: the user's `.bashrc` is hand-managed and the alias line wasn't even present, so it looked inert.
|
||||
- **Anomaly**: not just dead config — a CLI flag (`--effort max`) silently OVERRIDES the settings.json value (`xhigh`). Real correctness bug.
|
||||
- **Action**: when editing installer shell-config, audit EACH existing line against the current settings.json / CLAUDE.md source of truth, not only the lines being changed. Removed the alias + added cleanup. General rule: reconcile config to ONE source of truth across env/alias/settings layers.
|
||||
|
||||
@ -186,3 +186,5 @@ rules:
|
||||
- gstack browser FIXED on Ubuntu 26.04 (full saga). `git submodule update` would NOT help (latest gstack still pins playwright 1.58.2). Two layers: (1) bumped Playwright→1.61 in submodule (native 26.04 build), (2) GSTACK_CHROMIUM_NO_SANDBOX=1 for AppArmor userns block. Both automated in install-plugins.sh (auto-bump gated on dep support-list grep; env gated on apparmor sysctl) + env to .bashrc. Verified browse drives a real page (200). Discovered user's .bashrc is hand-managed (installer's env lines had been wiped by a restore). Commit 3b8ffb1. BDR-029, LRN-040, BLK-008 resolved.
|
||||
|
||||
- Fixed MAGIC_API_KEY false-negative: check grep'd `repo/.env` (symlink), never created because `~/.claude/.env` was made AFTER link.sh on the fresh machine (and `make plugin` skips link.sh). install-plugins.sh now self-heals the symlink + both scripts use a tolerant regex (export/whitespace/non-empty). Immediate fix: `make link`. Sandbox blocked all `.env*` reads → diagnosed via dir listing + synthetic-line regex tests. Commit 1b028cb. LRN-041.
|
||||
|
||||
- Removed obsolete `alias claude='claude --effort max'` from install Step 9 — settings.json `effortLevel: xhigh` is the source of truth and the CLI alias would override it (forcing max over xhigh). Step 9 now also strips the alias + old CLAUDE_EFFORT from the profile if present. A dtach `cc` launcher was prototyped then dropped — deferred to a later sprint (per user). Why missed earlier = EVAL-005 (never cross-audited existing Step 9 lines vs settings.json).
|
||||
|
||||
Loading…
Reference in New Issue
Block a user