claude/.claude/memory/evals.md
Bastien Chanot da4e6b9590 feat(profile): add gstack on|off verb to lib/profile.sh
Centralize gstack toggling in the `profile` command without losing the
active-profile label.

  - `gstack on`  re-enables ALL parked gstack skills (moves
    skills-disabled/gstack__* back) but does NOT touch .active-profile,
    so the user layers full gstack on top of their current profile and
    the statusline label is preserved. Unlike `reset`, which clears the
    label to "none".
  - `gstack off` disables gstack skills not listed in the active profile;
    errors cleanly when no profile is active (needs one to know what to
    keep).

Refactor (behavior-preserving): extract three shared helpers
`enable_all_gstack`, `disable_gstack_not_in`, `parked_gstack_count` and
rewire `cmd_reset` + `cmd_set` to reuse them instead of duplicating the
symlink-toggle loops. Wire `gstack` into main() dispatch, usage(), and the
header usage block.

Docs: SKILL.md argument-hint, examples, and output-policy updated. The
generic `make profile cmd="gstack on"` target already covers Make usage.

Verified: shellcheck CLEAN, `bash -n` OK, 6-case test (help, bad-action,
off-with-no-profile, on, off-trim, on-cycle) with final assertion that the
live symlink state was restored exactly to its pre-test value.

Memory: capitalize BDR-018 (decision), LRN-024 (DRY helper-extraction
pattern), BLK-007 (6 gstack source skills ios-*/spec unlinked post
submodule bump — open follow-up), EVAL-002 (self-eval, false "full.profile
bug" flag corrected pre-edit). Backfill index drift: BDR-017, BLK-005/006.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 18:31:48 +02:00

45 lines
2.5 KiB
Markdown

---
type: evals_registry
entry_prefix: EVAL
schema:
id: EVAL-XXX
date: YYYY-MM-DD
output: string (what was produced)
method: string (how it was evaluated - manual read, test, benchmark, user feedback)
anomalies: list of strings (what was wrong, missing, surprising)
action: [keep | correct | deprecate]
rules:
- Log an eval whenever you validate the quality of something Claude produced (report, audit, plan, generated code).
- Action keep - the output is fit for purpose as-is.
- Action correct - needs revision; capture what.
- Action deprecate - the approach itself is flawed; link to the decision that replaces it.
---
# Evals registry (EVAL)
## Index
| ID | Date | Output | Action |
|----|------|--------|--------|
| EVAL-001 | 2026-04-23 | `.claude/` restructure plan (ship-feature STEP 2) | keep |
| EVAL-002 | 2026-06-02 | `profile gstack on/off` verb implementation | keep |
---
## EVAL-001 — `.claude/` restructure plan
- **Date**: 2026-04-23
- **Output**: 21-task plan migrate `tasks/` to `.claude/tasks/` + create `.claude/memory/` + `.claude/audits/` + integrate CAPITALIZE across 5 skills + add `/close` skill.
- **Method**: manual review of 5 impacted skills/agents; verified `rtk` path-agnostic; confirmed `~/.claude/CLAUDE.md` symlinks to project (single file edit). Radical-honesty check on session-close ritual: confirmed aspirational without skill integration → scope expanded to Option D.
- **Anomalies**: none blocking. Note: `tasks/LESSONS.md` empty (101B, header only) — migration to `learnings.md` symbolic.
- **Action**: keep — plan validated, ready for execution.
---
## EVAL-002 — `profile gstack on|off` verb implementation
- **Date**: 2026-06-02
- **Output**: `cmd_gstack()` + 3 extracted helpers in `lib/profile.sh`; `cmd_reset`/`cmd_set` refactored to reuse; `skills/profile/SKILL.md` doc updated.
- **Method**: shellcheck 0.10.0 (CLEAN) + `bash -n`; 6-case live test (help; bad-action exit 1; `off` with active=none → exit 1 zero-mutation; `on` restores 14 + label `full` preserved NOT cleared; `off` trim; `on` cycle) with saved manifest + final assertion final-state == original (PASS, live env untouched).
- **Anomalies**: (1) Initial flag "full.profile omits ios/spec = bug" WRONG — full curated by design, confirmed by BDR-017 caveat. Self-corrected BEFORE any edit, no bad change shipped. Lesson: verify profile INTENT vs source completeness before calling omission a bug. (2) Surfaced real source-only gap → BLK-007 (open).
- **Action**: keep — verb works, tested, documented; false bug-flag caught pre-edit.