claude/docs/specs/2026-06-27-deploy-skill-design.md
Bastien Chanot b210e8d6a8 docs(deploy): design spec + implementation plan
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Claude-Session: https://claude.ai/code/session_01Ho5EQCFTSvYamuRtVZpp2d
2026-06-27 16:51:33 +02:00

14 KiB
Raw Blame History

Deploy skill — design spec

  • Date: 2026-06-27
  • Status: Design approved (5 knobs settled). No skill code written yet. Next step = implementation plan.
  • Scope: A new deploy skill = a per-project shell RUNBOOK that lives in .claude/deploy/, gets re-instantiated from the delta since the last deploy, and LEARNS from deploy errors in place.

1. Vision — deployment memory that learns

Three moments:

  1. BEFORE — produce the instantiated runbook: reference runbook + delta since last deploy, parameterized steps rewritten with the real artifacts (e.g. the migration step lists the migrations actually added since last deploy, not the runbook's examples).
  2. DURING — the user executes out-of-band (prod ssh — Claude must not run it) and reports deployed and tested OR failed at step X, here is the error → fix together until success.
  3. AFTER — on confirmed success: (a) if errors were hit + fixed, update the reference runbook so the next deploy does not repeat them; (b) lay the marker "deployed up to here" for the next diff.

Structural ancestor in the corpus: client-handover (BEFORE baseline → DURING user-deploy gate via AskUserQuestion → AFTER validate + react). No existing skill owns a learning per-project runbook — clean gap, no .claude/deploy/ precedent.

2. Locked decisions

# Knob Decision
1 Marker / oracle STATE file is the oracle (deployed SHA), annotated tag added as a human bookmark only
2 Learning storage In-place runbook edits + append-only INCIDENTS.md (distinct jobs, atomic coupling)
3 Parameterization # @delta: annotations bind dynamic steps to path-patterns; un-annotated steps are fixed
4 Bootstrap Offer both — user pastes an existing runbook OR skill scaffolds via artifact detection + interview
5 Execution model NEXT.sh is a step-by-step CHECKLIST — runnable shell, but driven by hand with manual # VERIFY: gates; never bash NEXT.sh unattended

Why #5 is design-time, not impl: the execution model is load-bearing for moments 2 and 3. Moment 2 is defined as "user reports failed at step X", and moment 3's LEARN loop must know which step failed to patch it. A single bash NEXT.sh blob collapses both into "exited non-zero somewhere" and can strand a prod deploy (migrations, restarts) in partial state with no step control. Checklist is entailed by the three-moment structure, not merely safer.

Treated as settled corollaries: user executes out-of-band; a new lib/deploy-commit.sh helper (existing helpers cannot commit the runbook — see §6, verified).

3. Architecture

.claude/deploy/
  PROCEDURE.md   reference runbook — fixed shell + `# @delta:` annotated steps   (edited IN-PLACE)
  INCIDENTS.md   DEP-NNN incident ledger: date, step, error verbatim, root cause,
                 fix, -> resolving commit hash                                    (APPEND-ONLY)
  STATE          deployed SHA + timestamp + outcome — the diff oracle             (overwritten each deploy)
  NEXT.sh        instantiated runbook — EPHEMERAL, not committed ; run STEP-BY-STEP
                 (checklist, manual # VERIFY: gates) — never `bash NEXT.sh` unattended

lib/deploy-commit.sh   surgical commit, allowlist = .claude/deploy/ , rc3 unsafe-git guard, short-hash stdout

Skill STEP spine (PRE-FLIGHT -> PROPOSE+GATE -> WRITE+COMMIT, house style):
  0 PRE-FLIGHT  runbook present?  absent -> bootstrap (paste | scaffold+interview)
  1 DELTA       STATE absent -> first deploy = full runbook ; else diff <STATE_SHA> HEAD
  2 INSTANTIATE expand @delta steps + read INCIDENTS pre-warns -> NEXT.sh -> GATE
  3 (user executes out-of-band; reports "done" | "failed at step X: <err>")
  4 LEARN       on failure: patch PROCEDURE step + append DEP-NNN -> GATE -> deploy-commit (ATOMIC)
  5 MARK        on success: write STATE@sha ; annotate + push tag ; optional doc

4. Delta mechanism — verified (git 2.53.0)

All three facts re-run live before writing this spec; observed output recorded, not assumed.

First-deploy detection = STATE-absent, deterministic. describe is off the detection path.

[ -f .claude/deploy/STATE ]            => exit 1 (absent = first deploy)  <- THE detector
git describe --tags --match 'deploy/*' => fatal: No names found ; exit 128 <- only the reason NOT to use describe
[ -f .claude/deploy/STATE ]            => exit 0 (present = delta path)

Delta = git diff --name-only <STATE_SHA> HEAD (two explicit endpoints; no dots, so it cannot be misread as three-dot).

LINEAR   git diff --name-only <sha> HEAD   => 0033_new.sql, svc.yml   (== two-dot == three-dot; merge-base == STATE)
DIVERGED two-dot   sideA sideB             => fileA.txt, fileB.txt     (both endpoints = true tree delta)
DIVERGED three-dot sideA...sideB           => fileB.txt                (merge-base — UNDERCOUNTS)

Two-dot/explicit-endpoints is the literal tree difference between the deployed tree and HEAD = what deploy needs. It is also rebase-robust: an orphaned marker still yields the correct tree diff, whereas git rev-list A..B (ancestry) reports phantom deltas after history rewrite (LRN-054's trap; verified in an earlier run). Never use rev-list ancestry for the artifact list.

delta -> steps: # @delta:<kind> annotations bind a dynamic step to the path-pattern that feeds it; the diff buckets straight into steps:

# @delta:migrations glob=supabase/migrations/*.sql
# @delta:rebuild   when=docker-compose*.yml,Dockerfile
# @delta:deps      when=package.json,*lock*

5. Learning model — runbook + INCIDENTS, non-redundant

Artifact Job Lifecycle
PROCEDURE.md The corrected procedure you run. A fix is baked into the step so the next run cannot repeat it. in-place
INCIDENTS.md The incident ledger; read at BEFORE-time to pre-warn ("0033 hit a lock timeout last deploy; runbook already carries --timeout, watch for it"). append-only

The pre-warn read is the function git log serves badly — that is why the ledger is not duplication. This mirrors the memory system's own split (append-only journal.md/blockers.md alongside in-place TODO/code).

Coupling invariant: one incident → one in-place PROCEDURE.md patch + one INCIDENTS.md append, committed atomically in a single deploy-commit.sh call. Never one without the other (mirrors BDR-034/036 "couple the commit to the integration step"). Significant patch (changes a prod path) → surface + approve before writing.

6. lib/deploy-commit.sh — new helper, inverse .claude/ rule (verified)

Neither existing helper can commit the runbook — confirmed live:

REAL doc-commit.sh  .claude/deploy/PROCEDURE.md => rc 4 "REFUSED — out-of-scope ... BDR-022 ... NOTHING committed"
REAL memory-commit.sh pending (deploy changed)  => rc 1 (ignores it; allowlist = .claude/memory|tasks only)

doc-commit.sh is built to keep .claude/** out of public-doc commits; .claude/deploy/ is under .claude/, so reuse is not just blocked, it is semantically wrong. deploy-commit.sh needs the inverse rule: a TARGET allowlist for .claude/deploy/*, modeled on memory-commit.sh (rc 3 unsafe-git guard, short-hash on stdout, chore(deploy):/docs(deploy): messages).

Allowlist guard — traversal reject ordered FIRST. Prototype matrix verified live:

_in_deploy_scope() {
  case "$1" in
    *..*) return 1 ;;                 # reject path traversal FIRST
    .claude/deploy/*) return 0 ;;     # ALLOW the deploy family only
    *) return 1 ;;                    # reject everything else
  esac
}
ALLOW  .claude/deploy/{PROCEDURE.md,INCIDENTS.md,STATE}
REJECT .claude/memory/*  .claude/tasks/*  .claude/secret  CLAUDE.md  src/*
REJECT .claude/deploy            (bare dir, no slash)
REJECT .claude/deploy-other/x    (trailing-slash requirement closes prefix confusion)
REJECT .claude/deploy/../memory/secret   (traversal closed by *..* matched first)

7. Bootstrap

STEP 0 PRE-FLIGHT: PROCEDURE.md present? Absent → bootstrap, two offered paths:

  1. Paste — user supplies an existing runbook (the game example); skill adopts + annotates it.
  2. Scaffold — skill detects deploy artifacts (migrations dir, compose/Dockerfile, package scripts, .env) + a short interview (ssh target, backup cmd, rollback note) → writes an annotated PROCEDURE.md.

First deploy has no marker → STATE-absent ⇒ full runbook fires; then lay STATE at the deployed SHA. The first deploy is the creation of the runbook + the first marker.

8. Open items (for the implementation plan)

NEXT.sh execution model resolved → decision #5 (checklist), promoted to design-time.

  • Tag push: tags don't push by default → AFTER step should git push --tag deploy/<date> or remind.
  • INCIDENTS.md ID/format detail (mirror blockers.md DEP-NNN); confirm name vs ERRORS-LEARNED.md.
  • @delta: annotation grammar (glob= vs when=) — finalize the small DSL.
  • Frontmatter allowed-tools set; STEP gate wording reuse from capitalize/client-handover.

9. Build sequencing & a structural flag

Two distinct disciplines, in order — do not conflate:

  1. writing-plans — global task ordering (helper → skill → bootstrap), dependencies, gates. The build plan.
  2. → execution →
  3. At the skill task ONLY: writing-skills — the discipline for the SKILL.md itself (structure, frontmatter, spine, config conventions). Used WHEN we reach the skill task, not before (it does not fire at plan time).

Structural flag for writing-skills to resolve — do NOT assume the linear-spine convention suffices: deploy's spine is unusual — two parts split by out-of-band execution: STEP 02 before → user deploys by hand → STEP 45 after, on the done/failed report. A skill that hands back control mid-run and resumes.

Preliminary recon (confirm at the skill task — NOT verified now):

  • The 6 completion flux (close, ship-feature, feat, bugfix, hotfix, commit-change) appear linear one-shot — synchronous gates at most, no out-of-band hand-back.
  • The relevant precedent is OUTSIDE those 6: client-handover already hands back — a synchronous "Deploy done?" AskUserQuestion pause (STEP 5) — but it holds state in conversation context, not on disk.
  • deploy's genuinely-new bit may be disk-bridged resume (NEXT.sh + STATE on disk as the bridge) — but whether NEXT.sh alone suffices to resume cross-session is an OPEN design question, not a settled answer (see §10). An earlier draft of this spec framed it as resolved; it is not. writing-skills must establish the convention (how to mark "I wait for your return here", detect + resume a pending deploy, hold state across the gap) — confirm there, do not assume the linear mould suffices.

10. Open design question (DESIGN-TIME, unresolved) — state across the two moments

deploy is a two-moment skill: moments 02 (BEFORE) → user deploys out-of-band → moment 3 (AFTER) on the done/failed report. The report may arrive in a different session. So the design must answer how state crosses the gap and what moment 3 must know to resume correctly.

skill deux-temps, état entre temps = [à concevoir : NEXT.sh seul suffit-il pour reprendre cross-session ?]

Sub-questions (to settle when we resume — NOT now, NOT assumed):

  • What must the bridge record? Moment 3 must (a) lay the correct marker = STATE ← target sha, and (b) capitalize the correct incident (which step, which delta). HEAD may have moved since NEXT.sh was generated → "current HEAD" is unsafe. The bridge must persist at least {base STATE sha, target sha, delta manifest} — inside NEXT.sh (header block) or a sidecar (.claude/deploy/PENDING)? Undecided.
  • Resume detection (re-entrancy): STEP 0 PRE-FLIGHT must detect "a deploy is pending, awaiting your report" — likely pending-bridge present + STATE not advanced to target — and branch RESUME (ask done/failed) vs FRESH. Is moment 3 a new deploy call that re-detects from disk, or a deploy --report? Undecided.
  • Ephemeral vs persistent tension — LINKED to sub-question 1 (not independent). §3 calls NEXT.sh "EPHEMERAL, not committed", yet a cross-session bridge MUST survive on disk. So: if the bridge must persist, NEXT.sh-as-bridge is impossible while NEXT.sh stays ephemeral. Likely binary resolution at plan time — either (a) NEXT.sh becomes persistent (contradicts §3), or (b) the bridge is a separate "deploy-in-progress" artifact {base/target/delta} distinct from NEXT.sh. Settle with writing-skills. (Uncommitted local state is fine; note the single-machine assumption — an uncommitted bridge won't follow a clone.)
  • Form-novelty — deploy's DEFINING characteristic: cross-session COLD resume. client-handover is a near precedent, not exact: it hands back in-context (same conversation, state held in memory). deploy must resume with the context lost — so the disk alone must carry everything to resume cold. No existing skill resumes without context; that is what sets deploy apart, and it makes sub-question 1 load-bearing (disk must suffice for a cold restart). deploy likely introduces a NEW skill form → writing-skills establishes the convention. Confirm there.

Next step: writing-plans to turn this spec into an implementation plan (helper first, then skill); at the skill task, writing-skills to shape it to convention and resolve the §10 two-moment state question — which is design-time, deferred only because we are stopped here, not because it is impl detail.