Compare commits

...

24 Commits

Author SHA1 Message Date
Lex Christopherson
7bb6b6452a fix: spike workflow defaults to interactive UI demos, not stdout
Flips the bias in step 8b: build a simple HTML page/web UI by default,
fall back to stdout only for pure fact-checking (binary yes/no, benchmarks).
Mirrors upstream spike-idea skill constraint #3 update.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 09:19:04 -06:00
Lex Christopherson
43ea92578b Merge remote-tracking branch 'origin/main' into hotfix/1.38.2
# Conflicts:
#	CHANGELOG.md
#	bin/install.js
#	sdk/src/query/init.ts
2026-04-21 09:16:24 -06:00
Lex Christopherson
a42d5db742 1.38.2 2026-04-21 09:14:52 -06:00
Lex Christopherson
c86ca1b3eb fix: sync spike/sketch workflows with upstream skill v2 improvements
Spike workflow:
- Add frontier mode (no-arg or "frontier" proposes integration + frontier spikes)
- Add depth-over-speed principle — follow surprising findings, test edge cases,
  document investigation trail not just verdict
- Add CONVENTIONS.md awareness — follow established patterns, update after session
- Add Requirements section in MANIFEST — track design decisions as they emerge
- Add re-ground step before each spike to prevent drift in long sessions
- Add Investigation Trail section to README template
- Restructured prior context loading with priority ordering
- Research step now runs per-spike with briefing and approach comparison table

Sketch workflow:
- Add frontier mode (no-arg or "frontier" proposes consistency + frontier sketches)
- Add spike context loading — ground mockups in real data shapes, requirements,
  and conventions from spike findings

Spike wrap-up workflow:
- Add CONVENTIONS.md generation step (recurring stack/structure/pattern choices)
- Reference files now use implementation blueprint format (Requirements, How to
  Build It, What to Avoid, Constraints)
- SKILL.md now includes requirements section from MANIFEST
- Next-steps route to /gsd-spike frontier mode instead of inline analysis

Sketch wrap-up workflow:
- Next-steps route to /gsd-sketch frontier mode

Commands updated with frontier mode in descriptions and argument hints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 09:14:32 -06:00
github-actions[bot]
337e052aa9 chore: bump version to 1.38.2 for hotfix 2026-04-21 15:13:56 +00:00
Lex Christopherson
969ee38ee5 fix: sync spike/sketch workflows with upstream skill v2 improvements
Spike workflow:
- Add frontier mode (no-arg or "frontier" proposes integration + frontier spikes)
- Add depth-over-speed principle — follow surprising findings, test edge cases,
  document investigation trail not just verdict
- Add CONVENTIONS.md awareness — follow established patterns, update after session
- Add Requirements section in MANIFEST — track design decisions as they emerge
- Add re-ground step before each spike to prevent drift in long sessions
- Add Investigation Trail section to README template
- Restructured prior context loading with priority ordering
- Research step now runs per-spike with briefing and approach comparison table

Sketch workflow:
- Add frontier mode (no-arg or "frontier" proposes consistency + frontier sketches)
- Add spike context loading — ground mockups in real data shapes, requirements,
  and conventions from spike findings

Spike wrap-up workflow:
- Add CONVENTIONS.md generation step (recurring stack/structure/pattern choices)
- Reference files now use implementation blueprint format (Requirements, How to
  Build It, What to Avoid, Constraints)
- SKILL.md now includes requirements section from MANIFEST
- Next-steps route to /gsd-spike frontier mode instead of inline analysis

Sketch wrap-up workflow:
- Next-steps route to /gsd-sketch frontier mode

Commands updated with frontier mode in descriptions and argument hints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 09:05:47 -06:00
Tom Boucher
2980f0ec48 fix(sdk): stripShippedMilestones handles inline SHIPPED headings; getMilestoneInfo prefers STATE.md (#2508)
* fix(sdk): stripShippedMilestones handles inline SHIPPED headings; getMilestoneInfo prefers STATE.md

Fixes two compounding bugs:

- #2496: stripShippedMilestones only stripped <details> blocks, ignoring
  '## Heading —  SHIPPED ...' inline markers. Shipped milestone sections
  were leaking into downstream parsers.

- #2495: getMilestoneInfo checked STATE.md frontmatter only as a last-resort
  fallback, so it returned the first heading match (often a leaked shipped
  milestone) rather than the current milestone. Moved STATE.md check to
  priority 1, consistent with extractCurrentMilestone.

Closes #2495
Closes #2496

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(roadmap): handle ### SHIPPED headings and STATE.md version-only case

Two follow-up fixes from CodeRabbit review of #2508:

1. stripShippedMilestones only split on ## boundaries; ### headings marked
   SHIPPED were not stripped, leaking into fallback parsers. Expanded
   the split/filter regex to #{2,3} to align with extractCurrentMilestone.

2. getMilestoneInfo's early-return on parseMilestoneFromState discarded the
   real milestone name from ROADMAP.md when STATE.md had only `milestone:`
   (no `milestone_name:`), returning the placeholder name 'milestone'.
   Now only short-circuits when STATE.md provides a real name; otherwise
   falls through to ROADMAP for the name while using stateVersion to
   override the version in every ROADMAP-derived return path.

Tests: +2 new cases (### SHIPPED heading, version-only STATE.md).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 10:41:35 -04:00
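The heading-marker stripping described in the commit above can be illustrated with a small shell sketch. This is a hedged illustration only: the real stripShippedMilestones lives in the TypeScript SDK, and the roadmap content below is invented.

```shell
# Toggle a skip flag at every ## or ### heading; drop sections whose heading
# carries a SHIPPED marker (mirrors the #{2,3} split the commit describes).
cat > /tmp/roadmap-demo.md <<'EOF'
## Milestone 1 - SHIPPED 2026-01-01
old details
### Milestone 1.1 - SHIPPED 2026-02-01
more old details
## Milestone 2 - In Progress
current work
EOF

awk '/^###? / { skip = ($0 ~ /SHIPPED/) } !skip { print }' /tmp/roadmap-demo.md
```

Only the "Milestone 2" section survives; both the ## and the ### SHIPPED sections are removed, which is exactly the gap the original ##-only split left open.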
Tom Boucher
8789211038 fix(insert-phase): update STATE.md next-phase recommendation after phase insertion (#2509)
* fix(insert-phase): update STATE.md next-phase recommendation after inserting a phase

Closes #2502

* fix(insert-phase): update all STATE.md pointers; tighten test scope

Two follow-up fixes from CodeRabbit review of #2509:

1. The update_project_state instruction only said to find "the line" for
   the next-phase recommendation. STATE.md can have multiple pointers
   (structured current_phase: field AND prose recommendation text).
   Updated wording to explicitly require updating all of them in the same
   edit.

2. The regression test for the next-phase pointer update scanned the
   entire file, so a match anywhere would pass even if update_project_state
   itself was missing the instruction. Scoped the assertion to only the
   content inside <step name="update_project_state"> to prevent false
   positives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 10:10:45 -04:00
Tom Boucher
57bbfe652b fix: exclude non-wiped dirs from custom-file scan; warn on non-Claude model profiles (#2511)
* fix(detect-custom-files): exclude skills and command dirs not wiped by installer (closes #2505)

GSD_MANAGED_DIRS included 'skills' and 'command' directories, but the
installer never wipes those paths. Users with third-party skills installed
(40+ files, none in GSD's manifest) had every skill flagged as a "custom
file" requiring backup, producing noisy false-positive reports on every
/gsd-update run.

Removes 'skills' and 'command' from both gsd-tools.cjs and the SDK's
detect-custom-files.ts. Adds two regression tests confirming neither
directory is scanned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(settings): warn that model profiles are no-ops on non-Claude runtimes (closes #2506)

settings.md presented Quality/Balanced/Budget model profiles without any
indication that these tiers map to Claude models (Opus/Sonnet/Haiku) and
have no effect on non-Claude runtimes (Codex, Gemini CLI, OpenRouter).
Users on Codex saw the profile chooser as if it would meaningfully select
models, but all agents silently used the runtime default regardless.

Adds a non-Claude runtime note before the profile question (shown in
TEXT_MODE, the path all non-Claude runtimes take) explaining the profiles
are no-ops and directing users to either choose Inherit or configure
model_overrides manually. Also updates the Inherit option description to
explicitly name the runtimes where it is the correct choice.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 10:10:10 -04:00
Tom Boucher
a4764c5611 fix(execute-phase): resurrection-detection must check git history before deleting new .planning/ files (#2510)
The guard at the worktree-merge resurrection block was inverting the
intended logic: it deleted any .planning/ file absent from PRE_MERGE_FILES,
which includes brand-new files (e.g. SUMMARY.md just created by the
executor). A genuine resurrection is a file that was previously tracked on
main, deliberately removed, and then re-introduced by the merge. Detecting
that requires a git history check — not just tree membership.

Fix: replace the PRE_MERGE_FILES grep guard with a `git log --follow
--diff-filter=D` check that only removes the file if it has a deletion
event in main's ancestry.

Closes #2501
2026-04-21 09:46:01 -04:00
Jeremy McSpadden
30433368a0 fix(install): template bare .claude hook paths for non-Claude runtimes 2026-04-19 18:42:30 -05:00
Jeremy McSpadden
04fab926b5 test: add --no-sdk to hook-deployment installer tests
Tests #1834, #1924, #2136 exercise hook/artifact deployment and don't
care about SDK install. Now that installSdkIfNeeded() failures are
fatal, these tests fail on any CI runner without gsd-sdk pre-built
because the sdk/ tsc build path runs and can fail in the CI environment.

Pass --no-sdk so each test focuses on its actual subject. SDK install
path has dedicated end-to-end coverage in install-smoke.yml.
2026-04-19 18:39:32 -05:00
Jeremy McSpadden
f98ef1e460 fix(install): fatal SDK install failures + CI smoke gate (#2439)
## Why
#2386 added `installSdkIfNeeded()` to build @gsd-build/sdk from bundled
source and `npm install -g .`, because the npm-published @gsd-build/sdk
is intentionally frozen and version-mismatched with get-shit-done-cc.

But every failure path in that function was warning-only — including
the final `which gsd-sdk` verification. When npm's global bin is off a
user's PATH (common on macOS), the installer printed a yellow warning
then exited 0. Users saw "install complete" and then every `/gsd-*`
command crashed with `command not found: gsd-sdk` (the #2439 symptom).

No CI job executed the install path, so this class of regression could
ship undetected — existing "install" tests only read bin/install.js as
a string.

## What changed

**bin/install.js — installSdkIfNeeded() is now transactional**
- All build/install failures exit non-zero (not just warn).
- Post-install `which gsd-sdk` check is fatal: if the binary landed
  globally but is off PATH, we exit 1 with a red banner showing the
  resolved npm bin dir, the user's shell, the target rc file, and the
  exact `export PATH=…` line to add.
- Escape hatch: `GSD_ALLOW_OFF_PATH=1` downgrades off-PATH to exit 2
  for users with intentionally restricted PATH who will wire up the
  binary manually.
- Resolver uses POSIX `command -v` via `sh -c` (replaces `which`) so
  behavior is consistent across sh/bash/zsh/fish.
- Factored `resolveGsdSdk()`, `detectShellRc()`, `emitSdkFatal()`.
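The `which`-to-`command -v` swap in the list above can be sketched as follows. `resolve_bin` is a hypothetical helper name, and `git` stands in for `gsd-sdk` so the sketch runs anywhere:

```shell
# Resolve a binary through POSIX `command -v` inside `sh -c`, so the result
# is the same regardless of the user's interactive shell (bash/zsh/fish/...).
resolve_bin() {
  sh -c 'command -v "$1"' resolver "$1"
}

if BIN=$(resolve_bin git); then
  echo "resolved: $BIN"
else
  echo "FATAL: binary not on PATH" >&2
fi
```

Because the lookup happens in a plain `sh`, shell-specific `which` replacements (fish's builtin, zsh aliases) can no longer change the answer.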

**.github/workflows/install-smoke.yml (new)**
- Executes the real install path: `npm pack` → `npm install -g <tgz>`
  → run installer non-interactively → `command -v gsd-sdk` → run
  `gsd-sdk --version`.
- PRs: path-filtered to installer-adjacent files, ubuntu + Node 22 only.
- main/release branches: full matrix (ubuntu+macos × Node 22+24).
- Reusable via workflow_call with `ref` input for release gating.

**.github/workflows/release.yml — pre-publish gate**
- New `install-smoke-rc` and `install-smoke-finalize` jobs invoke the
  reusable workflow against the release branch. `rc` and `finalize`
  now `needs: [validate-version, install-smoke-*]`, so a broken SDK
  install blocks `npm publish`.

## Test plan
- Local full suite: 4154/4154 pass
- install-smoke.yml will self-validate on this PR (ubuntu+Node22 only)

Addresses root cause of #2439 (the per-command pre-flight in #2440 is
the complementary defensive layer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:39:32 -05:00
Jeremy McSpadden
d0565e95c1 fix(set-profile): use hyphenated /gsd-set-profile in pre-flight message
Project convention (#1748) requires /gsd-<cmd> hyphen form everywhere
except designated test inputs. Fix the colon references in the
pre-flight error and its regression test to satisfy stale-colon-refs.
2026-04-19 18:39:32 -05:00
Jeremy McSpadden
4ef6275e86 fix(set-profile): guard gsd-sdk invocation with command -v pre-flight (#2439)
/gsd:set-profile crashed with `command not found: gsd-sdk` when gsd-sdk
was not on PATH. The command invoked `gsd-sdk query` directly in a `!`
backtick with no guard, so a missing binary produced an opaque shell
error with exit 127.

Add a `command -v gsd-sdk` pre-flight that prints the install/update
hint and exits 1 when absent, mirroring the #2334 fix on /gsd-quick.
The auto-install in #2386 still runs at install time; this guard is the
defensive layer for users whose npm global bin is off-PATH (install.js
warns but does not fail in that case).

Closes #2439
2026-04-19 18:39:32 -05:00
Jeremy McSpadden
6c50490766 fix(sdk): register init.ingest-docs handler and add registry drift guard (#2442)
The ingest-docs workflow called `gsd-sdk query init.ingest-docs` with a
fallback to `init.default` — neither was registered in createRegistry(),
so the workflow proceeded with `{}` and tried to parse project_exists,
planning_exists, has_git, and project_path from empty.

- Add initIngestDocs handler; register dotted + space aliases
- Simplify workflow call; drop broken fallback
- Repo-wide drift guard scans commands/, agents/, get-shit-done/,
  hooks/, bin/, scripts/, docs/ for `gsd-sdk query <cmd>` and fails
  on any reference with no registered handler (file:line citations)
- Unit tests for the new handler

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:39:20 -05:00
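A minimal version of the drift guard described above can be sketched in shell. The scanned tree and the registered-handler list here are invented for illustration; the real guard covers many directories and reads the registry from the SDK.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/commands"
printf 'run: gsd-sdk query init.ingest-docs\n'    > "$tmp/commands/a.md"
printf 'run: gsd-sdk query init.not-registered\n' > "$tmp/commands/b.md"

REGISTERED='init.ingest-docs'   # stand-in for the real createRegistry() contents

# Emit file:line citations for any query reference with no registered handler.
grep -rnoE 'gsd-sdk query [A-Za-z0-9._-]+' "$tmp/commands" | while IFS= read -r hit; do
  cmd=${hit##* }                # last token of the match is the command name
  printf '%s\n' "$REGISTERED" | grep -qxF "$cmd" || echo "DRIFT: $hit"
done
```

Only the unregistered reference is reported, with its file and line, which is the failure shape the commit describes.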
Jeremy McSpadden
4cbebfe78c docs(readme): add /gsd-ingest-docs to Brownfield commands
Surfaces the new ingest-docs command from the Unreleased changelog in
the README Commands section so users discover it without digging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:39:20 -05:00
Jeremy McSpadden
9e87d43831 fix(build): include gsd-read-injection-scanner in hooks/dist (#2406)
The scanner was added in #2201 but never added to the HOOKS_TO_COPY
allowlist in scripts/build-hooks.js, so it never landed in hooks/dist/.
install.js reads from hooks/dist/, so every install on 1.37.0/1.37.1
emitted "Skipped read injection scanner hook — not found at target"
and the read-time prompt-injection scanner was silently disabled.

- Add gsd-read-injection-scanner.js to HOOKS_TO_COPY
- Add it to EXPECTED_ALL_HOOKS regression test in install-hooks-copy

Fixes #2406

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:39:20 -05:00
github-actions[bot]
29ea90bc83 chore: bump version to 1.38.1 for hotfix 2026-04-19 23:37:15 +00:00
github-actions[bot]
0c6172bfad chore: finalize v1.38.0 2026-04-18 03:45:59 +00:00
Jeremy McSpadden
e3bd06c9fd fix(release): make merge-back PR step non-fatal
Repos that disable "Allow GitHub Actions to create and approve pull
requests" (org-level policy or repo-level setting) cause the "Create PR
to merge release back to main" step to fail with a GraphQL 403. That
failure cascades: Tag and push, npm publish, GitHub Release creation
are all skipped, and the entire release aborts.

The merge-back PR is a convenience — it's re-openable manually after
the release. Making it non-fatal with continue-on-error lets the rest
of the release complete. The step now emits ::warning:: annotations
pointing at the manual-recovery command when it fails.

Shell pipelines also fall through with `|| echo "::warning::..."` so
transient gh CLI failures don't mask the underlying policy issue.

Covers the failure mode seen on run 24596079637 where dry-run publish
validation passed but the release halted at the PR-creation step.
2026-04-17 22:45:22 -05:00
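The non-fatal pattern the commit describes amounts to appending a warning fallback to each pipeline. A minimal shell illustration, with the gh call simulated by `false`:

```shell
# Simulate a gh CLI failure (e.g. GraphQL 403 from a PR-creation policy);
# the || branch converts it into a ::warning:: annotation instead of
# failing the step and cascading into skipped release jobs.
run_step() {
  false   # stand-in for: gh pr create --base main --head "$BRANCH" ...
}

run_step || echo "::warning::Could not create merge-back PR. Open it manually after release."
echo "release continues"
```

Because `cmd || echo ...` succeeds whenever the echo succeeds, the job keeps going, and `continue-on-error: true` on the step covers anything the pipeline fallbacks miss.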
github-actions[bot]
c69ecd975a chore: bump to 1.38.0-rc.1 2026-04-18 03:05:35 +00:00
Jeremy McSpadden
06c4ded4ec docs(changelog): promote Unreleased to [1.38.0] + add ultraplan entry 2026-04-17 22:03:26 -05:00
github-actions[bot]
341bb941c6 chore: bump version to 1.38.0 for release 2026-04-18 03:02:41 +00:00
20 changed files with 864 additions and 251 deletions


@@ -342,23 +342,32 @@ jobs:
- name: Create PR to merge release back to main
if: ${{ !inputs.dry_run }}
continue-on-error: true
env:
GH_TOKEN: ${{ github.token }}
BRANCH: ${{ needs.validate-version.outputs.branch }}
VERSION: ${{ inputs.version }}
run: |
EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number')
# Non-fatal: repos that disable "Allow GitHub Actions to create and
# approve pull requests" cause this step to fail with GraphQL 403.
# The release itself (tag + npm publish + GitHub Release) must still
# proceed. Open the merge-back PR manually afterwards with:
# gh pr create --base main --head release/${VERSION} \
# --title "chore: merge release v${VERSION} to main"
EXISTING_PR=$(gh pr list --base main --head "$BRANCH" --state open --json number --jq '.[0].number' 2>/dev/null || echo "")
if [ -n "$EXISTING_PR" ]; then
echo "PR #$EXISTING_PR already exists; updating"
gh pr edit "$EXISTING_PR" \
--title "chore: merge release v${VERSION} to main" \
--body "Merge release branch back to main after v${VERSION} stable release."
--body "Merge release branch back to main after v${VERSION} stable release." \
|| echo "::warning::Could not update merge-back PR (likely PR-creation policy disabled). Open it manually after release."
else
gh pr create \
--base main \
--head "$BRANCH" \
--title "chore: merge release v${VERSION} to main" \
--body "Merge release branch back to main after v${VERSION} stable release."
--body "Merge release branch back to main after v${VERSION} stable release." \
|| echo "::warning::Could not create merge-back PR (likely PR-creation policy disabled). Open it manually after release."
fi
- name: Tag and push


@@ -1,7 +1,7 @@
---
name: gsd:sketch
description: Rapidly sketch UI/design ideas using throwaway HTML mockups with multi-variant exploration
argument-hint: "<design idea to explore> [--quick] [--text]"
description: Sketch UI/design ideas with throwaway HTML mockups, or propose what to sketch next (frontier mode)
argument-hint: "[design idea to explore] [--quick] [--text] or [frontier]"
allowed-tools:
- Read
- Write
@@ -18,7 +18,12 @@ allowed-tools:
<objective>
Explore design directions through throwaway HTML mockups before committing to implementation.
Each sketch produces 2-3 variants for comparison. Sketches live in `.planning/sketches/` and
integrate with GSD commit patterns, state tracking, and handoff workflows.
integrate with GSD commit patterns, state tracking, and handoff workflows. Loads spike
findings to ground mockups in real data shapes and validated interaction patterns.
Two modes:
- **Idea mode** (default) — describe a design idea to sketch
- **Frontier mode** (no argument or "frontier") — analyzes existing sketch landscape and proposes consistency and frontier sketches
Does not require `/gsd-new-project` — auto-creates `.planning/sketches/` if needed.
</objective>


@@ -1,7 +1,7 @@
---
name: gsd:spike
description: Rapidly spike an idea with throwaway experiments to validate feasibility before planning
argument-hint: "<idea to validate> [--quick] [--text]"
description: Spike an idea through experiential exploration, or propose what to spike next (frontier mode)
argument-hint: "[idea to validate] [--quick] [--text] or [frontier]"
allowed-tools:
- Read
- Write
@@ -16,9 +16,14 @@ allowed-tools:
- mcp__context7__query-docs
---
<objective>
Rapid feasibility validation through focused, throwaway experiments. Each spike answers one
specific question with observable evidence. Spikes live in `.planning/spikes/` and integrate
with GSD commit patterns, state tracking, and handoff workflows.
Spike an idea through experiential exploration — build focused experiments to feel the pieces
of a future app, validate feasibility, and produce verified knowledge for the real build.
Spikes live in `.planning/spikes/` and integrate with GSD commit patterns, state tracking,
and handoff workflows.
Two modes:
- **Idea mode** (default) — describe an idea to spike
- **Frontier mode** (no argument or "frontier") — analyzes existing spike landscape and proposes integration and frontier spikes
Does not require `/gsd-new-project` — auto-creates `.planning/spikes/` if needed.
</objective>


@@ -1204,10 +1204,6 @@ async function runCommand(command, args, cwd, raw, defaultValue) {
'agents',
path.join('commands', 'gsd'),
'hooks',
// OpenCode/Kilo flat command dir
'command',
// Codex/Copilot skills dir
'skills',
];
function walkDir(dir, baseDir) {


@@ -649,10 +649,15 @@ Execute each selected wave in sequence. Within a wave: parallel if `PARALLELIZAT
# Detect files deleted on main but re-added by worktree merge
# (e.g., archived phase directories that were intentionally removed)
# A "resurrected" file must have a deletion event in main's ancestry —
# brand-new files (e.g. SUMMARY.md just created by the executor) have no
# such history and must NOT be removed (#2501).
DELETED_FILES=$(git diff --diff-filter=A --name-only HEAD~1 -- .planning/ 2>/dev/null || true)
for RESURRECTED in $DELETED_FILES; do
# Check if this file was NOT in main's pre-merge tree
if ! echo "$PRE_MERGE_FILES" | grep -qxF "$RESURRECTED"; then
# Only delete if this file was previously tracked on main and then
# deliberately removed (has a deletion event in git history).
WAS_DELETED=$(git log --follow --diff-filter=D --name-only --format="" HEAD~1 -- "$RESURRECTED" 2>/dev/null | grep -c . || true)
if [ "${WAS_DELETED:-0}" -gt 0 ]; then
git rm -f "$RESURRECTED" 2>/dev/null || true
fi
done
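The deletion-history check in the hunk above can be exercised in a throwaway repository. Everything below is a self-contained demo, not GSD code:

```shell
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email demo@example.com && git config user.name demo

echo x > tracked.txt && git add tracked.txt && git commit -qm 'add tracked'
git rm -q tracked.txt && git commit -qm 'remove tracked'  # deletion event in history
echo y > brand-new.txt                                    # never committed at all

was_deleted() {
  # Non-zero count only when the file has a deletion event in this
  # branch's ancestry; brand-new files yield 0 and must not be removed.
  git log --follow --diff-filter=D --name-only --format="" -- "$1" 2>/dev/null | grep -c . || true
}

echo "tracked.txt:   $(was_deleted tracked.txt)"
echo "brand-new.txt: $(was_deleted brand-new.txt)"
```

The previously-deleted file reports a deletion event while the never-committed file reports none, which is the distinction the PRE_MERGE_FILES grep could not make.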


@@ -66,7 +66,11 @@ Extract from result: `phase_number`, `after_phase`, `name`, `slug`, `directory`.
Update STATE.md to reflect the inserted phase:
1. Read `.planning/STATE.md`
2. Under "## Accumulated Context" → "### Roadmap Evolution" add entry:
2. Update STATE.md's next-phase pointers to the newly inserted phase `{decimal_phase}`:
- Update structured field(s) used by tooling (e.g. `current_phase:`) to `{decimal_phase}`.
- Update human-readable recommendation text (e.g. `## Current Phase`, `Next recommended run:`) to `{decimal_phase}`.
- If multiple pointer locations exist, update all of them in the same edit.
3. Under "## Accumulated Context" → "### Roadmap Evolution" add entry:
```
- Phase {decimal_phase} inserted after Phase {after_phase}: {description} (URGENT)
```
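The "update all pointers in the same edit" instruction above can be sketched with sed. The STATE.md field names come from the text; the file content is invented for the demo:

```shell
tmp=$(mktemp -d)
cat > "$tmp/STATE.md" <<'EOF'
current_phase: 4
## Current Phase
Next recommended run: Phase 4
EOF

DECIMAL_PHASE='3.1'
# One pass over the file, so the structured field and the prose
# recommendation can never drift apart.
sed -e "s/^current_phase:.*/current_phase: ${DECIMAL_PHASE}/" \
    -e "s/^Next recommended run:.*/Next recommended run: Phase ${DECIMAL_PHASE}/" \
    "$tmp/STATE.md" > "$tmp/STATE.md.new" && mv "$tmp/STATE.md.new" "$tmp/STATE.md"

cat "$tmp/STATE.md"
```

Writing to a temp file and moving it back avoids the GNU/BSD `sed -i` portability split while keeping the update atomic from the reader's point of view.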


@@ -51,6 +51,17 @@ Parse current values (default to `true` if not present):
<step name="present_settings">
**Text mode (`workflow.text_mode: true` in config or `--text` flag):** Set `TEXT_MODE=true` if `--text` is present in `$ARGUMENTS` OR `text_mode` from init JSON is `true`. When TEXT_MODE is active, replace every `AskUserQuestion` call with a plain-text numbered list and ask the user to type their choice number. This is required for non-Claude runtimes (OpenAI Codex, Gemini CLI, etc.) where `AskUserQuestion` is not available.
**Non-Claude runtime note:** If `TEXT_MODE` is active (i.e. the runtime is non-Claude), prepend the following notice before the model profile question:
```
Note: Quality, Balanced, and Budget profiles select Claude model tiers (Opus/Sonnet/Haiku).
On non-Claude runtimes (Codex, Gemini CLI, etc.) these profiles have no effect on actual
model selection — GSD agents will use the runtime's default model.
Choose "Inherit" to use the session model for all agents, or configure model_overrides
manually in .planning/config.json to target specific models for this runtime.
```
Use AskUserQuestion with current values pre-selected:
```
@@ -60,10 +71,10 @@ AskUserQuestion([
header: "Model",
multiSelect: false,
options: [
{ label: "Quality", description: "Opus everywhere except verification (highest cost)" },
{ label: "Balanced (Recommended)", description: "Opus for planning, Sonnet for research/execution/verification" },
{ label: "Budget", description: "Sonnet for writing, Haiku for research/verification (lowest cost)" },
{ label: "Inherit", description: "Use current session model for all agents (best for OpenRouter, local models, or runtime model switching)" }
{ label: "Quality", description: "Opus everywhere except verification (highest cost) — Claude only" },
{ label: "Balanced (Recommended)", description: "Opus for planning, Sonnet for research/execution/verification — Claude only" },
{ label: "Budget", description: "Sonnet for writing, Haiku for research/verification (lowest cost) — Claude only" },
{ label: "Inherit", description: "Use current session model for all agents (required for non-Claude runtimes: Codex, Gemini CLI, OpenRouter, local models)" }
]
},
{


@@ -255,15 +255,16 @@ The sketch-findings skill will auto-load when building the UI.
## ▶ Next Up
**Start building** — implement the validated design
**Explore frontier sketches** — see what else is worth sketching based on what we've explored
`/gsd-plan-phase`
`/gsd-sketch` (run with no argument — its frontier mode analyzes the sketch landscape and proposes consistency and frontier sketches)
───────────────────────────────────────────────────────────────
**Also available:**
- `/gsd-plan-phase` — start building the real UI
- `/gsd-ui-phase` — generate a UI design contract for a frontend phase
- `/gsd-sketch` — sketch additional design areas
- `/gsd-sketch [idea]` — sketch a specific new design area
- `/gsd-explore` — continue exploring
───────────────────────────────────────────────────────────────
@@ -279,5 +280,6 @@ The sketch-findings skill will auto-load when building the UI.
- [ ] Reference files contain design decisions, CSS patterns, HTML structures, anti-patterns
- [ ] `.planning/sketches/WRAP-UP-SUMMARY.md` written for project history
- [ ] Project CLAUDE.md has auto-load routing line
- [ ] Summary presented with next-step routing
- [ ] Summary presented
- [ ] Next-step options presented (including frontier sketch exploration via `/gsd-sketch`)
</success_criteria>


@@ -2,6 +2,10 @@
Explore design directions through throwaway HTML mockups before committing to implementation.
Each sketch produces 2-3 variants for comparison. Saves artifacts to `.planning/sketches/`.
Companion to `/gsd-sketch-wrap-up`.
Supports two modes:
- **Idea mode** (default) — user describes a design idea to sketch
- **Frontier mode** — no argument or "frontier" / "what should I sketch?" — analyzes existing sketch landscape and proposes consistency and frontier sketches
</purpose>
<required_reading>
@@ -25,9 +29,60 @@ Read all files referenced by the invoking prompt's execution_context before star
Parse `$ARGUMENTS` for:
- `--quick` flag → set `QUICK_MODE=true`
- `--text` flag → set `TEXT_MODE=true`
- `frontier` or empty → set `FRONTIER_MODE=true`
- Remaining text → the design idea to sketch
**Text mode (`workflow.text_mode: true` in config or `--text` flag):** Set `TEXT_MODE=true` if `--text` is present in `$ARGUMENTS` OR `text_mode` from init JSON is `true`. When TEXT_MODE is active, replace every `AskUserQuestion` call with a plain-text numbered list and ask the user to type their choice number. This is required for non-Claude runtimes (OpenAI Codex, Gemini CLI, etc.) where `AskUserQuestion` is not available.
**Text mode:** If TEXT_MODE is enabled, replace AskUserQuestion calls with plain-text numbered lists.
</step>
<step name="route">
## Routing
- **FRONTIER_MODE is true** → Jump to `frontier_mode`
- **Otherwise** → Continue to `setup_directory`
</step>
<step name="frontier_mode">
## Frontier Mode — Propose What to Sketch Next
### Load the Sketch Landscape
If no `.planning/sketches/` directory exists, tell the user there's nothing to analyze and offer to start fresh with an idea instead.
Otherwise, load in this order:
**a. MANIFEST.md** — the design direction, reference points, and sketch table with winners.
**b. Findings skills** — glob `./.claude/skills/sketch-findings-*/SKILL.md` and read any that exist, plus their `references/*.md`. These contain curated design decisions from prior wrap-ups.
**c. All sketch READMEs** — read `.planning/sketches/*/README.md` for design questions, winners, and tags.
### Analyze for Consistency Sketches
Review winning variants across all sketches. Look for:
- **Visual consistency gaps:** Two sketches made independent design choices that haven't been tested together.
- **State combinations:** Individual states validated but not seen in sequence.
- **Responsive gaps:** Validated at one viewport but the real app needs multiple.
- **Theme coherence:** Individual components look good but haven't been composed into a full-page view.
If consistency risks exist, present them as concrete proposed sketches with names and design questions. If no meaningful gaps, say so and skip.
### Analyze for Frontier Sketches
Think laterally about the design direction from MANIFEST.md and what's been explored:
- **Unsketched screens:** UI surfaces assumed but unexplored.
- **Interaction patterns:** Static layouts validated but transitions, loading, drag-and-drop need feeling.
- **Edge case UI:** 0 items, 1000 items, errors, slow connections.
- **Alternative directions:** Fresh takes on "fine but not great" sketches.
- **Polish passes:** Typography, spacing, micro-interactions, empty states.
Present frontier sketches as concrete proposals numbered from the highest existing sketch number.
### Get Alignment and Execute
Present all consistency and frontier candidates, then ask which to run. When the user picks sketches, update `.planning/sketches/MANIFEST.md` and proceed directly to building them starting at `build_sketches`.
</step>
<step name="setup_directory">
@@ -49,25 +104,45 @@ COMMIT_DOCS=$(gsd-sdk query config-get commit_docs 2>/dev/null || echo "true")
</step>
<step name="mood_intake">
**If `QUICK_MODE` is true:** Skip mood intake. Use whatever the user provided in `$ARGUMENTS` as the design direction. Jump to `decompose`.
**If `QUICK_MODE` is true:** Skip mood intake. Use whatever the user provided in `$ARGUMENTS` as the design direction. Jump to `load_spike_context`.
**Otherwise:**
Before sketching anything, explore the design intent through conversation. Ask one question at a time using AskUserQuestion, with a paragraph of context and reasoning for each.
Before sketching anything, explore the design intent through conversation. Ask one question at a time using AskUserQuestion in normal mode, or a plain-text numbered list if TEXT_MODE is active.
**Questions to cover (adapt to what the user has already shared):**
1. **Feel:** "What should this feel like? Give me adjectives, emotions, or a vibe." (e.g., "clean and clinical", "warm and playful", "dense and powerful")
2. **References:** "What apps, sites, or products have a similar feel to what you're imagining?" (gives concrete visual anchors)
3. **Core action:** "What's the single most important thing a user does here?" (focuses the sketch on what matters)
1. **Feel:** "What should this feel like? Give me adjectives, emotions, or a vibe."
2. **References:** "What apps, sites, or products have a similar feel to what you're imagining?"
3. **Core action:** "What's the single most important thing a user does here?"
You may need more or fewer questions depending on how much the user shares upfront. After each answer, briefly reflect what you heard and how it shapes your thinking.
After each answer, briefly reflect what you heard and how it shapes your thinking.
When you have enough signal, ask: **"I think I have a good sense of the direction. Ready for me to sketch, or want to keep discussing?"**
Only proceed when the user says go.
</step>
<step name="load_spike_context">
## Load Spike Context
If spikes exist for this project, read them to ground the sketches in reality. Mockups are still pure HTML, but they should reflect what's actually been proven — real data shapes, real component names, real interaction patterns.
**a.** Glob for `./.claude/skills/spike-findings-*/SKILL.md` and read any that exist, plus their `references/*.md`. These contain validated patterns and requirements.
**b.** Read `.planning/spikes/MANIFEST.md` if it exists — check the Requirements section for non-negotiable design constraints (e.g., "must support streaming", "must render markdown"). These requirements should be visible in the mockup even though the mockup doesn't implement them for real.
**c.** Read `.planning/spikes/CONVENTIONS.md` if it exists — the established stack informs what's buildable and what interaction patterns are idiomatic.
**How spike context improves sketches:**
- Use real field names and data shapes from spike findings instead of generic placeholders
- Show realistic UI states that match what the spikes proved (e.g., if streaming was validated, show a streaming message state)
- Reference real component names and patterns from the target stack
- Include interaction states that reflect what the spikes discovered (loading, error, reconnection states)
**If no spikes exist**, skip this step.
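Checks (a)-(c) can be sketched as a small shell helper. The paths are the ones this workflow defines; the function only reports which context files exist, as a minimal sketch of the "skip if nothing found" rule:

```shell
# Sketch of checks (a)-(c); all paths are the ones defined by this workflow.
load_spike_context() {
  local found=0 f
  for f in ./.claude/skills/spike-findings-*/SKILL.md \
           .planning/spikes/MANIFEST.md \
           .planning/spikes/CONVENTIONS.md; do
    [ -e "$f" ] && { echo "read: $f"; found=1; }
  done
  # nothing matched: this step is a no-op
  [ "$found" -eq 0 ] && echo "no spikes — skip this step"
  return 0
}
```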
</step>
<step name="decompose">
Break the idea into 2-5 design questions. Present as a table:
@@ -98,18 +173,18 @@ Before sketching, ground the design in what's actually buildable. Sketches are H
**a. Identify the target stack.** Check for package.json, Cargo.toml, etc. If the user mentioned a framework (React, SwiftUI, Flutter, etc.), note it.
**b. Check component/pattern availability.** Use context7 (resolve-library-id → query-docs) or web search to answer:
- What layout primitives does the target framework provide? (grid systems, nav patterns, panel components)
- Are there existing component libraries in use? (shadcn, Material UI, etc.) What components are available?
- What interaction patterns are idiomatic? (e.g., sheet vs modal vs dialog in mobile)
- What layout primitives does the target framework provide?
- Are there existing component libraries in use? What components are available?
- What interaction patterns are idiomatic?
**c. Note constraints that affect design.** Some things that look great in HTML are painful or impossible in certain stacks:
**c. Note constraints that affect design:**
- Platform conventions (iOS nav patterns, desktop menu bars, terminal grid constraints)
- Framework limitations (what's easy vs requires custom work)
- Existing design tokens or theme systems already in the project
**d. Let research inform variants.** Use findings to make variants that are actually buildable — at least one variant should follow the path of least resistance for the target stack.
**d. Let research inform variants.** At least one variant should follow the path of least resistance for the target stack.
**Skip when unnecessary.** If it's a greenfield project with no stack chosen, or the user explicitly says "just explore visually, don't worry about implementation," skip this step entirely. The point is grounding, not gatekeeping.
**Skip when unnecessary.** Greenfield project with no stack, or user says "just explore visually." The point is grounding, not gatekeeping.
</step>
<step name="create_manifest">
@@ -144,26 +219,24 @@ Build each sketch in order.
### For Each Sketch:
**a.** Find next available number by checking existing `.planning/sketches/NNN-*/` directories.
Format: three-digit zero-padded + hyphenated descriptive name.
**a.** Find next available number. Format: three-digit zero-padded + hyphenated descriptive name.
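The numbering rule can be sketched in shell — the `.planning/sketches/NNN-name/` layout is the one this workflow uses:

```shell
# Sketch of step (a): next three-digit sketch number from existing directories.
next_sketch_number() {
  local max=0 n dir
  for dir in .planning/sketches/[0-9][0-9][0-9]-*/; do
    [ -d "$dir" ] || continue
    n=${dir#.planning/sketches/}   # "003-sidebar/"
    n=${n%%-*}                     # "003"
    n=$((10#$n))                   # force base 10 so "008" doesn't parse as octal
    [ "$n" -gt "$max" ] && max=$n
  done
  printf '%03d\n' $((max + 1))
}
```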
**b.** Create the sketch directory: `.planning/sketches/NNN-descriptive-name/`
**c.** Build `index.html` with 2-3 variants:
**First round — dramatic differences:** Build 2-3 meaningfully different approaches to the design question. Different layouts, different visual structures, different interaction models.
**Subsequent rounds — refinements:** Once the user has picked a direction or cherry-picked elements, build subtler variations within that direction.
**First round — dramatic differences:** 2-3 meaningfully different approaches.
**Subsequent rounds — refinements:** Subtler variations within the chosen direction.
Each variant is a page/tab in the same HTML file. Include:
- Tab navigation to switch between variants (see `sketch-variant-patterns.md`)
- Clear labels: "Variant A: Sidebar Layout", "Variant B: Top Nav", etc.
- The sketch toolbar (see `sketch-tooling.md`)
- All interactive elements functional (see `sketch-interactivity.md`)
- Real-ish content, not lorem ipsum
- Real-ish content, not lorem ipsum (use real field names from spike context if available)
- Link to `../themes/default.css` for shared theme variables
**All sketches are plain HTML with inline CSS and JS.** No build step, no npm, no framework. Opens instantly in a browser.
**All sketches are plain HTML with inline CSS and JS.** No build step, no npm, no framework.
**d.** Write `README.md`:
@@ -210,16 +283,16 @@ Compare: {what to look for between variants}
──────────────────────────────────────────────────────────────
**f.** Handle feedback:
- **Pick a direction:** "I like variant B" → mark winner in README, move to next sketch
- **Cherry-pick elements:** "Rounded edges from A, color treatment from C" → build a synthesis as a new variant, show again
- **Want more exploration:** "None of these feel right, try X instead" → build new variants
- **Pick a direction:** mark winner, move to next sketch
- **Cherry-pick elements:** build synthesis as new variant, show again
- **Want more exploration:** build new variants
Iterate until the user is satisfied with a direction for this sketch.
Iterate until satisfied.
**g.** Finalize:
1. Mark the winning variant in the README frontmatter (`winner: "B"`)
2. Add ★ indicator to the winning tab in the HTML
3. Update `.planning/sketches/MANIFEST.md` with the sketch row
1. Mark winning variant in README frontmatter (`winner: "B"`)
2. Add ★ indicator to winning tab in HTML
3. Update `.planning/sketches/MANIFEST.md`
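Finalize step 1 can be sketched as a one-liner helper. The `winner:` frontmatter key is from this workflow's README template; the `sed -i` form assumes GNU sed (BSD/macOS sed needs `sed -i ''`):

```shell
# Sketch: set the winning variant in README frontmatter (GNU sed assumed).
mark_winner() {  # usage: mark_winner path/to/README.md B
  if grep -q '^winner:' "$1"; then
    sed -i "s/^winner:.*/winner: \"$2\"/" "$1"
  else
    printf 'winner: "%s"\n' "$2" >> "$1"
  fi
}
```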
**h.** Commit (if `COMMIT_DOCS` is true):
```bash
@@ -235,7 +308,7 @@ gsd-sdk query commit "docs(sketch-NNN): [winning direction] — [key visual insi
</step>
<step name="report">
After all sketches complete, present the summary:
After all sketches complete:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
@@ -263,8 +336,8 @@ After all sketches complete, present the summary:
───────────────────────────────────────────────────────────────
**Also available:**
- `/gsd-sketch` — sketch more (or run with no argument for frontier mode)
- `/gsd-plan-phase` — start building the real UI
- `/gsd-explore` — continue exploring the concept
- `/gsd-spike` — spike technical feasibility of a design pattern
───────────────────────────────────────────────────────────────
@@ -275,8 +348,9 @@ After all sketches complete, present the summary:
<success_criteria>
- [ ] `.planning/sketches/` created (auto-created if needed, no project init required)
- [ ] Design direction explored conversationally before any code (unless --quick)
- [ ] Target stack researched — component availability, constraints, and idioms noted (unless greenfield/skipped)
- [ ] Each sketch has 2-3 variants for comparison (at least one follows path of least resistance for target stack)
- [ ] Spike context loaded — real data shapes, requirements, and conventions inform mockups
- [ ] Target stack researched — component availability, constraints, idioms (unless greenfield/skipped)
- [ ] Each sketch has 2-3 variants for comparison (at least one follows path of least resistance)
- [ ] User can open and interact with sketches in a browser
- [ ] Winning variant selected and marked for each sketch
- [ ] All variants preserved (winner marked, not others deleted)


@@ -1,8 +1,8 @@
<purpose>
Curate spike experiment findings and package them into a persistent project skill for future
build conversations. Reads from `.planning/spikes/`, writes skill to `./.claude/skills/spike-findings-[project]/`
(project-local) and summary to `.planning/spikes/WRAP-UP-SUMMARY.md`.
Companion to `/gsd-spike`.
Package spike experiment findings into a persistent project skill — an implementation blueprint
for future build conversations. Reads from `.planning/spikes/`, writes the skill to
`./.claude/skills/spike-findings-[project]/` (project-local) and a summary to
`.planning/spikes/WRAP-UP-SUMMARY.md`. Companion to `/gsd-spike`.
</purpose>
<required_reading>
@@ -22,7 +22,7 @@ Read all files referenced by the invoking prompt's execution_context before star
<step name="gather">
## Gather Spike Inventory
1. Read `.planning/spikes/MANIFEST.md` for the overall idea context
1. Read `.planning/spikes/MANIFEST.md` for the overall idea context and requirements
2. Glob `.planning/spikes/*/README.md` and parse YAML frontmatter from each
3. Check if `./.claude/skills/spike-findings-*/SKILL.md` exists for this project
- If yes: read its `processed_spikes` list from the metadata section and filter those out
@@ -93,21 +93,29 @@ For each included spike:
<step name="synthesize">
## Synthesize Reference Files
For each feature-area group, write a reference file at `references/[feature-area-name].md`:
For each feature-area group, write a reference file at `references/[feature-area-name].md` as an **implementation blueprint** — it should read like a recipe, not a research paper. A future build session should be able to follow this and build the feature correctly without re-spiking anything.
```markdown
# [Feature Area Name]
## Validated Patterns
[For each validated finding: describe the approach that works, include key code snippets extracted from the spike source, explain why it works]
## Requirements
## Landmines
[Things that look right but aren't. Gotchas. Anti-patterns discovered during spiking.]
[Non-negotiable design decisions from MANIFEST.md Requirements section that apply to this feature area. These MUST be honored in the real build. E.g., "Must use streaming JSON output", "Must support reconnection".]
## How to Build It
[Step-by-step: what to install, how to configure, what code pattern to use. Include key code snippets extracted from the spike source. This is the proven approach — not theory, but tested and working code.]
## What to Avoid
[Things that look right but aren't. Gotchas. Anti-patterns discovered during spiking. Dead ends that were tried and failed.]
## Constraints
[Hard facts: rate limits, library limitations, version requirements, incompatibilities]
## Origin
Synthesized from spikes: NNN, NNN, NNN
Source files available in: sources/NNN-spike-name/, sources/NNN-spike-name/
```
@@ -121,7 +129,7 @@ Create (or update) the generated skill's SKILL.md:
```markdown
---
name: spike-findings-[project-dir-name]
description: Validated patterns, constraints, and implementation knowledge from spike experiments. Auto-loaded during implementation work on [project-dir-name].
description: Implementation blueprint from spike experiments. Requirements, proven patterns, and verified knowledge for building [project-dir-name]. Auto-loaded during implementation work.
---
<context>
@@ -132,6 +140,15 @@ description: Validated patterns, constraints, and implementation knowledge from
Spike sessions wrapped: [date(s)]
</context>
<requirements>
## Requirements
[Copied directly from MANIFEST.md Requirements section. These are non-negotiable design decisions that emerged from the user's choices during spiking. Every feature area reference must honor these.]
- [requirement 1]
- [requirement 2]
</requirements>
<findings_index>
## Feature Areas
@@ -189,11 +206,47 @@ Add an auto-load routing line to the project's CLAUDE.md (create the file if it
If this routing line already exists (append mode), leave it as-is.
</step>
<step name="generate_conventions">
## Generate or Update CONVENTIONS.md
Analyze all processed spikes for recurring patterns and write `.planning/spikes/CONVENTIONS.md`. This file tells future spike sessions *how we spike* — the stack, structure, and patterns that have been established.
1. Read all spike source code and READMEs looking for:
- **Stack choices** — What language/framework/runtime appears across multiple spikes?
- **Structure patterns** — Common file layouts, port numbers, naming schemes
- **Recurring approaches** — How auth is handled, how styling is done, how data is served
- **Tools & libraries** — Packages that showed up repeatedly with versions that worked
2. Write or update `.planning/spikes/CONVENTIONS.md`:
```markdown
# Spike Conventions
Patterns and stack choices established across spike sessions. New spikes follow these unless the question requires otherwise.
## Stack
[What we use for frontend, backend, scripts, and why — derived from what repeated across spikes]
## Structure
[Common file layouts, port assignments, naming patterns]
## Patterns
[Recurring approaches: how we handle auth, how we style, how we serve, etc.]
## Tools & Libraries
[Preferred packages with versions that worked, and any to avoid]
```
3. Only include patterns that appeared in 2+ spikes or were explicitly chosen by the user.
4. If `CONVENTIONS.md` already exists (append mode), update sections with new patterns. Remove entries contradicted by newer spikes.
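Rule 3's "2+ spikes" threshold can be checked mechanically. A sketch, assuming the `.planning/spikes/NNN-name/` layout from this workflow; the keyword grep is a heuristic, not part of the spec:

```shell
# Sketch of rule 3: how many distinct spikes mention a candidate pattern?
# Include it in CONVENTIONS.md only when the count is 2 or more.
spikes_using() {  # usage: spikes_using express
  grep -rl "$1" .planning/spikes/*/ 2>/dev/null \
    | cut -d/ -f3 \
    | sort -u | wc -l
}
```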
</step>
<step name="commit">
Commit all artifacts (if `COMMIT_DOCS` is true):
```bash
gsd-sdk query commit "docs(spike-wrap-up): package [N] spike findings into project skill" .planning/spikes/WRAP-UP-SUMMARY.md
gsd-sdk query commit "docs(spike-wrap-up): package [N] spike findings into project skill" .planning/spikes/WRAP-UP-SUMMARY.md .planning/spikes/CONVENTIONS.md
```
</step>
@@ -206,6 +259,7 @@ gsd-sdk query commit "docs(spike-wrap-up): package [N] spike findings into proje
**Processed:** {N} spikes
**Feature areas:** {list}
**Skill:** `./.claude/skills/spike-findings-[project]/`
**Conventions:** `.planning/spikes/CONVENTIONS.md`
**Summary:** `.planning/spikes/WRAP-UP-SUMMARY.md`
**CLAUDE.md:** routing line added
@@ -214,56 +268,27 @@ The spike-findings skill will auto-load in future build conversations.
</step>
<step name="whats_next">
## What's Next — Intelligent Spike Routing
## What's Next
Analyze the full spike landscape (MANIFEST.md, all curated findings, feature-area groupings, validated/invalidated/partial verdicts) and present three categories of next-step options:
After the summary, present next-step options:
### Category A: Integration Spikes — "Do any validated spikes need to be tested together?"
───────────────────────────────────────────────────────────────
Review every pair and cluster of VALIDATED spikes. Look for:
## ▶ Next Up
- **Shared resources:** Two spikes that both touch the same API, database, state, or data format but were tested independently. Will they conflict, race, or step on each other?
- **Data handoffs:** Spike A produces output that Spike B consumes. The formats were assumed compatible but never proven.
- **Timing/ordering:** Spikes that work in isolation but have sequencing dependencies in the real flow (e.g., auth must complete before streaming starts).
- **Resource contention:** Spikes that individually work but may compete for connections, memory, rate limits, or tokens when combined.
**Explore frontier spikes** — see what else is worth spiking based on what we've learned
If integration risks exist, present them as concrete proposed spikes:
`/gsd-spike` (run with no argument — its frontier mode analyzes the spike landscape and proposes integration and frontier spikes)
> **Integration spike candidates:**
> - "Spikes 001 + 003 together: streaming through the authenticated connection" — these were tested separately but the real app needs both at once
> - "Spikes 002 + 005 data handoff: does the parser output match what the renderer expects?"
───────────────────────────────────────────────────────────────
If no meaningful integration risks exist, say so and skip this category.
### Category B: Frontier Spikes — "What else should we spike?"
Think laterally about the overall idea from MANIFEST.md and what's been proven so far. Consider:
- **Gaps in the vision:** What does the user's idea need that hasn't been spiked yet? Look at the MANIFEST.md idea description and identify capabilities that are assumed but unproven.
- **Discovered dependencies:** Findings from completed spikes that reveal new questions. A spike that validated "X works" may imply "but we'd also need Y" — surface those implied needs.
- **Alternative approaches:** If any spike was PARTIAL or INVALIDATED, suggest a different angle to achieve the same goal.
- **Adjacent capabilities:** Things that aren't strictly required but would meaningfully improve the idea if feasible — worth a quick spike to find out.
- **Comparison opportunities:** If a spike used one library/approach and it worked but felt heavy or awkward, suggest a comparison spike with an alternative.
Present frontier spikes as concrete proposals with names, validation questions (Given/When/Then), and risk-ordering:
> **Frontier spike candidates:**
> 1. `NNN-descriptive-name` — Given [X], when [Y], then [Z]. *Why now: [reason this is the logical next thing to explore]*
> 2. `NNN-descriptive-name` — Given [X], when [Y], then [Z]. *Why now: [reason]*
Number them continuing from the highest existing spike number.
### Category C: Standard Options
- `/gsd-plan-phase` — Start planning the real implementation
- `/gsd-add-phase` — Add a phase based on spike findings
- `/gsd-spike` — Spike additional ideas
- `/gsd-explore` — Continue exploring
**Also available:**
- `/gsd-plan-phase` — start planning the real implementation
- `/gsd-spike [idea]` — spike a specific new idea
- `/gsd-explore` — continue exploring
- Other
### Presenting the Options
Present all applicable categories, then ask the user which direction to go. If the user picks a frontier or integration spike, write the spike definitions directly into `.planning/spikes/MANIFEST.md` (appending to the existing table) and kick off `/gsd-spike` with those spikes pre-defined — the user shouldn't have to re-describe what was just proposed.
───────────────────────────────────────────────────────────────
</step>
</process>
@@ -271,11 +296,11 @@ Present all applicable categories, then ask the user which direction to go. If t
<success_criteria>
- [ ] All unprocessed spikes auto-included and processed
- [ ] Spikes grouped by feature area
- [ ] Spike-findings skill exists at `./.claude/skills/` with SKILL.md, references/, sources/
- [ ] Core source files from all spikes copied into sources/
- [ ] Reference files contain validated patterns, code snippets, landmines, constraints
- [ ] Spike-findings skill exists at `./.claude/skills/` with SKILL.md (including requirements), references/, sources/
- [ ] Reference files are implementation blueprints with Requirements, How to Build It, What to Avoid, Constraints
- [ ] `.planning/spikes/CONVENTIONS.md` created or updated with recurring stack/structure/pattern choices
- [ ] `.planning/spikes/WRAP-UP-SUMMARY.md` written for project history
- [ ] Project CLAUDE.md has auto-load routing line
- [ ] Summary presented
- [ ] Intelligent next-step analysis presented with integration spike candidates, frontier spike candidates, and standard options
- [ ] Next-step options presented (including frontier spike exploration via `/gsd-spike`)
</success_criteria>


@@ -1,7 +1,11 @@
<purpose>
Rapid feasibility validation through focused, throwaway experiments. Each spike answers one
specific question with observable evidence. Saves artifacts to `.planning/spikes/`.
Companion to `/gsd-spike-wrap-up`.
Spike an idea through experiential exploration — build focused experiments to feel the pieces
of a future app, validate feasibility, and produce verified knowledge for the real build.
Saves artifacts to `.planning/spikes/`. Companion to `/gsd-spike-wrap-up`.
Supports two modes:
- **Idea mode** (default) — user describes an idea to spike
- **Frontier mode** — no argument or "frontier" / "what should I spike?" — analyzes existing spike landscape and proposes integration and frontier spikes
</purpose>
<required_reading>
@@ -20,9 +24,62 @@ Read all files referenced by the invoking prompt's execution_context before star
Parse `$ARGUMENTS` for:
- `--quick` flag → set `QUICK_MODE=true`
- `--text` flag → set `TEXT_MODE=true`
- `frontier` or empty → set `FRONTIER_MODE=true`
- Remaining text → the idea to spike
**Text mode:** If TEXT_MODE is enabled, replace AskUserQuestion calls with plain-text numbered lists — emit the options and ask the user to type the number of their choice.
**Text mode:** If TEXT_MODE is enabled, replace AskUserQuestion calls with plain-text numbered lists.
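The `$ARGUMENTS` parsing rules above can be sketched as a shell function. Flag names and the frontier fallback are the workflow's own; the function name is illustrative:

```shell
# Sketch of the $ARGUMENTS parsing rules: flags first, remainder is the idea,
# and an empty idea (or the literal word "frontier") selects frontier mode.
parse_args() {
  QUICK_MODE=false TEXT_MODE=false FRONTIER_MODE=false IDEA=""
  for word in $1; do
    case "$word" in
      --quick) QUICK_MODE=true ;;
      --text)  TEXT_MODE=true ;;
      *)       IDEA="${IDEA:+$IDEA }$word" ;;
    esac
  done
  case "$IDEA" in ''|frontier) FRONTIER_MODE=true ;; esac
}
```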
</step>
<step name="route">
## Routing
- **FRONTIER_MODE is true** → Jump to `frontier_mode`
- **Otherwise** → Continue to `setup_directory`
</step>
<step name="frontier_mode">
## Frontier Mode — Propose What to Spike Next
### Load the Spike Landscape
If no `.planning/spikes/` directory exists, tell the user there's nothing to analyze and offer to start fresh with an idea instead.
Otherwise, load in this order:
**a. MANIFEST.md** — the overall idea, requirements, and spike table with verdicts.
**b. Findings skills** — glob `./.claude/skills/spike-findings-*/SKILL.md` and read any that exist, plus their `references/*.md`. These contain curated knowledge from prior wrap-ups.
**c. CONVENTIONS.md** — read `.planning/spikes/CONVENTIONS.md` if it exists. Established stack and patterns.
**d. All spike READMEs** — read `.planning/spikes/*/README.md` for verdicts, results, investigation trails, and tags.
### Analyze for Integration Spikes
Review every pair and cluster of VALIDATED spikes. Look for:
- **Shared resources:** Two spikes that both touch the same API, database, state, or data format but were tested independently.
- **Data handoffs:** Spike A produces output that Spike B consumes. The formats were assumed compatible but never proven.
- **Timing/ordering:** Spikes that work in isolation but have sequencing dependencies in the real flow.
- **Resource contention:** Spikes that individually work but may compete for connections, memory, rate limits, or tokens when combined.
If integration risks exist, present them as concrete proposed spikes with names and Given/When/Then validation questions. If no meaningful integration risks exist, say so and skip this category.
### Analyze for Frontier Spikes
Think laterally about the overall idea from MANIFEST.md and what's been proven so far. Consider:
- **Gaps in the vision:** Capabilities assumed but unproven.
- **Discovered dependencies:** Findings that reveal new questions.
- **Alternative approaches:** Different angles for PARTIAL or INVALIDATED spikes.
- **Adjacent capabilities:** Things that would meaningfully improve the idea if feasible.
- **Comparison opportunities:** Approaches that worked but felt heavy.
Present frontier spikes as concrete proposals, numbered continuing from the highest existing spike number, with Given/When/Then validation questions and risk ordering.
### Get Alignment and Execute
Present all integration and frontier candidates, then ask which to run. When the user picks spikes, write definitions into `.planning/spikes/MANIFEST.md` (appending to existing table) and proceed directly to building them starting at `research`.
</step>
<step name="setup_directory">
@@ -44,13 +101,16 @@ COMMIT_DOCS=$(gsd-sdk query config-get commit_docs 2>/dev/null || echo "true")
</step>
<step name="detect_stack">
Check for the project's tech stack to inform spike technology choices:
Check for the project's tech stack to inform spike technology choices.
**Check conventions first.** If `.planning/spikes/CONVENTIONS.md` exists, follow its stack and patterns — these represent validated choices the user expects to see continued.
**Then check the project stack:**
```bash
ls package.json pyproject.toml Cargo.toml go.mod 2>/dev/null
```
Use the project's language/framework by default. For greenfield projects with no existing stack, pick whatever gets to a runnable result fastest (Python, Node, Bash, single HTML file).
Use the project's language/framework by default. For greenfield projects with no conventions and no existing stack, pick whatever gets to a runnable result fastest.
Avoid unless the spike specifically requires it:
- Complex package management beyond `npm install` or `pip install`
@@ -59,22 +119,31 @@ Avoid unless the spike specifically requires it:
- Env files or config systems — hardcode everything
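The detection precedence above (conventions first, then project manifests, then the fastest-runnable fallback) can be sketched as:

```shell
# Sketch of the stack-detection order; file names are the ones listed above.
detect_stack() {
  [ -f .planning/spikes/CONVENTIONS.md ] && { echo "conventions"; return 0; }
  local f
  for f in package.json pyproject.toml Cargo.toml go.mod; do
    [ -f "$f" ] && { echo "$f"; return 0; }
  done
  echo "greenfield"   # no stack found: pick whatever runs fastest
}
```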
</step>
<step name="check_prior_spikes">
If `.planning/spikes/MANIFEST.md` exists, read it. Scan the verdicts, names, and validation questions of all prior spikes. When decomposing the new idea, cross-reference against this history:
<step name="load_prior_context">
If `.planning/spikes/` has existing content, load context in this priority order:
- **Skip already-validated questions.** If a prior spike proved "WebSocket streaming works" with a VALIDATED verdict, don't re-spike it. Note the prior spike number and move on.
- **Build on prior findings.** If a prior spike was INVALIDATED or PARTIAL, factor that into the new decomposition — don't repeat the same approach, and flag the constraint to the user.
- **Call out relevant prior art.** When presenting the decomposition, mention any prior spikes that overlap: "Spike 003 already validated X, so we can skip that and focus on Y."
**a. Conventions:** Read `.planning/spikes/CONVENTIONS.md` if it exists.
If no `.planning/spikes/MANIFEST.md` exists, skip this step.
**b. Findings skills:** Glob for `./.claude/skills/spike-findings-*/SKILL.md` and read any that exist, plus their `references/*.md` files.
**c. Manifest:** Read `.planning/spikes/MANIFEST.md` for the index of all spikes.
**d. Related READMEs:** Based on the new idea, identify which prior spikes are related by matching tags, names, technologies, or domain overlap. Read only those `.planning/spikes/*/README.md` files. Skip unrelated ones.
Cross-reference against this full body of prior work:
- **Skip already-validated questions.** Note the prior spike number and move on.
- **Build on prior findings.** Don't repeat failed approaches. Use their Research and Results sections.
- **Reuse prior research.** Carry findings forward rather than re-researching.
- **Follow established conventions.** Mention any deviation.
- **Call out relevant prior art** when presenting the decomposition.
If no `.planning/spikes/` exists, skip this step.
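Step (d)'s relatedness filter can be sketched against the MANIFEST table. The column layout is this workflow's template; matching a single keyword anywhere in the row is a simplifying assumption:

```shell
# Sketch of step (d): list prior spikes whose MANIFEST row mentions a keyword,
# so only their READMEs get read. Column 3 of the pipe table is the spike name.
related_spikes() {  # usage: related_spikes websocket
  awk -F'|' -v kw="$1" '
    /^\|/ && $0 ~ kw && $3 !~ /Name|---/ {
      gsub(/^[ \t]+|[ \t]+$/, "", $3); print $3
    }' .planning/spikes/MANIFEST.md
}
```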
</step>
<step name="decompose">
**If `QUICK_MODE` is true:** Skip decomposition and alignment. Take the user's idea as a single spike question. Assign it spike number `001` (or next available). Jump to `research`.
**If `QUICK_MODE` is true:** Skip decomposition and alignment. Take the user's idea as a single spike question. Assign it the next available number. Jump to `research`.
**Otherwise:**
Break the idea into 2-5 independent questions that each prove something specific. Frame each as an informal Given/When/Then. Present as a table:
Break the idea into 2-5 independent questions. Frame each as Given/When/Then. Present as a table:
```
| # | Spike | Type | Validates (Given/When/Then) | Risk |
@@ -86,30 +155,17 @@ Break the idea into 2-5 independent questions that each prove something specific
**Spike types:**
- **standard** — one approach answering one question
- **comparison** — same question, different approaches. Use a shared number with lettered variants: `NNN-a-name` and `NNN-b-name`. Both built back-to-back, then head-to-head comparison.
- **comparison** — same question, different approaches. Shared number with letter suffix.
Good spikes answer one specific feasibility question:
- "Can we parse X format and extract Y?" — script that does it on a sample file
- "How fast is X approach?" — benchmark with real-ish data
- "Can we get X and Y to talk to each other?" — thinnest integration
- "What does X feel like as a UI?" — minimal interactive prototype
- "Does X API actually support Y?" — script that calls it and shows the response
- "Should we use X or Y for this?" — **comparison spike**: same thin proof built with both
Good spikes: specific feasibility questions with observable output.
Bad spikes: too broad, no observable output, or just reading/planning.
Bad spikes are too broad or don't produce observable output:
- "Set up the project" — not a question, just busywork
- "Design the architecture" — planning, not spiking
- "Build the backend" — too broad, no specific question
- "Research best practices" — open-ended reading with no runnable output
Order by risk — the spike most likely to kill the idea runs first.
Order by risk — most likely to kill the idea runs first.
</step>
<step name="align">
**If `QUICK_MODE` is true:** Skip.
Present the ordered spike list and ask which to build:
╔══════════════════════════════════════════════════════════════╗
║ CHECKPOINT: Decision Required ║
╚══════════════════════════════════════════════════════════════╝
@@ -119,35 +175,33 @@ Present the ordered spike list and ask which to build:
──────────────────────────────────────────────────────────────
→ Build all in this order, or adjust the list?
──────────────────────────────────────────────────────────────
The user may reorder, merge, split, or skip spikes. Wait for alignment.
</step>
<step name="research">
## Research Before Building
## Research and Briefing Before Each Spike
Before writing any spike code, ground each spike in reality. This prevents building against outdated APIs, picking the wrong library, or discovering mid-spike that the approach is impossible.
This step runs **before each individual spike**, not once at the start.
For each spike about to be built:
**a. Present a spike briefing:**
**a. Identify unknowns.** What libraries, APIs, protocols, or techniques does this spike depend on? What assumptions are you making about how they work?
> **Spike NNN: Descriptive Name**
> [2-3 sentences: what this spike is, why it matters, key risk or unknown.]
**b. Check current docs.** Use context7 (resolve-library-id → query-docs) for any library or framework involved. Use web search for APIs, services, or techniques without a context7 entry. Read actual documentation — not training data, which may be stale.
**b. Research the current state of the art.** Use context7 (resolve-library-id → query-docs) for libraries/frameworks. Use web search for APIs/services without a context7 entry. Read actual documentation.
**c. Validate feasibility before coding.** Specifically check:
- Does the API/library actually support what the spike assumes? (Check endpoints, methods, capabilities)
- What's the current recommended approach? (The "right way" changes — what was learned in training may be deprecated)
- Are there version constraints, breaking changes, or migration gotchas?
- Are there rate limits, auth requirements, or platform restrictions that would block the spike?
**d. Pick the right tool.** If multiple libraries could solve the problem, surface the competing approaches as a table and compare them on current maintenance status, API fit for the specific spike question, and complexity. Pick the one that gets to a runnable answer fastest with the fewest surprises.
| Approach | Tool/Library | Pros | Cons | Status |
|----------|-------------|------|------|--------|
| ... | ... | ... | ... | ... |
**Chosen approach:** [which one and why]
If 2+ credible approaches exist, plan to build quick variants within the spike and compare them.
**e. Capture research findings.** Add a `## Research` section to the spike's README (before `## How to Run`) with:
- Which docs were checked and key findings
- The chosen approach and why
- Any gotchas or constraints discovered
**Skip research when unnecessary.** If the spike uses only well-known, stable tools already verified in this session, or if the entire spike is pure logic with no external dependencies, skip this step. The goal is grounding in reality, not busywork.
</step>
<step name="create_manifest">
Create or update `.planning/spikes/MANIFEST.md`:
```markdown
## Idea
[One paragraph describing the overall idea being explored]
## Requirements
[Design decisions that emerged from the user's choices during spiking. Non-negotiable for the real build. Updated as spikes progress.]
- [e.g., "Must use streaming JSON output, not single-response"]
- [e.g., "Must support reconnection on network failure"]
## Spikes
| # | Name | Type | Validates | Verdict | Tags |
|---|------|------|-----------|---------|------|
| 001 | websocket-streaming | standard | WS connections can stream LLM output | VALIDATED | websocket, real-time |
| 002a | pdf-parse-pdfjs | comparison | PDF table extraction | WINNER | pdf, parsing |
| 002b | pdf-parse-camelot | comparison | PDF table extraction | — | pdf, parsing |
```
If MANIFEST.md already exists, append new spikes to the existing table.
**Track requirements as they emerge.** When the user expresses a preference during spiking, add it to the Requirements section immediately.
</step>
<step name="reground">
## Re-Ground Before Each Spike
Before starting each spike (not just the first), re-read `.planning/spikes/MANIFEST.md` and `.planning/spikes/CONVENTIONS.md` to prevent drift within long sessions. Check the Requirements section — make sure the spike doesn't contradict any established requirements.
</step>
<step name="build_spikes">
## Build Each Spike Sequentially
Work through spikes one at a time, highest-risk first.
**Comparison spikes** use a shared number with lettered variants: `NNN-a-descriptive-name` and `NNN-b-descriptive-name`. Both answer the same question using different approaches. Build them back-to-back, then report a head-to-head comparison before moving on. Judge on criteria that matter for the real build: API ergonomics, output quality, complexity, performance, or whatever the user cares about. The comparison spike's verdict names the winner and why.
**Depth over speed.** The goal is genuine understanding, not a quick verdict. Never declare VALIDATED after a single happy-path test. Follow surprising findings. Test edge cases. Document the investigation trail, not just the conclusion.
### For Each Spike:
**a.** Find the next available number by checking existing `.planning/spikes/NNN-*/` directories, then create the spike directory: `.planning/spikes/NNN-descriptive-name/`
Format: three-digit zero-padded + hyphenated descriptive name. Comparison spikes: same number with letter suffix — `002a-pdf-parse-pdfjs`, `002b-pdf-parse-camelot`.
**b.** Default to giving the user something they can experience. The bias should be toward building a simple UI or interactive demo, not toward stdout that only Claude reads. The user wants to *feel* the spike working, not just be told it works.
**The default is: build something the user can interact with.** This could be:
- A simple HTML page that shows the result visually
- A web UI with a button that triggers the action and shows the response
- A page that displays data flowing through a pipeline
- A minimal interface where the user can try different inputs and see outputs
**Only fall back to stdout/CLI verification when the spike is genuinely about a fact, not a feeling:**
- Pure data transformation where the answer is "yes it parses correctly"
- Binary yes/no questions (does this API authenticate? does this library exist?)
- Benchmark numbers (how fast is X? how much memory does Y use?)
- Single-run scripts with deterministic stdout that Claude can run and check directly
When in doubt, build the UI. It takes a few extra minutes but produces a spike the user can actually demo and feel confident about.
Before writing code, also ask: can Claude fully verify this spike's outcome by running a command and reading stdout, or does it require human interaction with a runtime? Spikes that need runtime observability:
- **UI spikes** — anything with a browser, clicks, visual feedback
- **Streaming spikes** — WebSockets, SSE, real-time data flow
- **Multi-process spikes** — client/server, IPC, subprocess orchestration
- **Timing-sensitive spikes** — race conditions, debounce, polling, reconnection
- **External API spikes** — where the API response shape, latency, or error behavior matters for the verdict
**If the spike needs runtime observability,** build a forensic log layer into the spike:
1. **An event log array** at module level that captures every meaningful event with an ISO timestamp and a direction/category tag (e.g., `"user_input"`, `"api_response"`, `"sse_frame"`, `"error"`, `"state_change"`)
2. **A log export mechanism** appropriate to the spike's runtime:
- For server spikes: a `GET /api/export-log` endpoint returning downloadable JSON
- For CLI spikes: write `spike-log-{timestamp}.json` to the spike directory on exit or on signal
- For browser spikes: a visible "Export Log" button that triggers a JSON download
3. **A log summary** included in the export: total event counts by category, duration, errors detected, environment metadata
4. **Analysis helpers** if the event volume warrants it: a small script (bash/python) in the spike directory that extracts the signal from the log. Name it `analyze-log.sh` or similar.
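The log layer described above can be sketched in a few lines; this is a minimal illustration, and names like `logEvent` and `summarizeLog` are illustrative rather than part of the workflow:

```javascript
// Minimal forensic log layer: an array push per event, not a logging framework.
const spikeLog = [];

function logEvent(category, detail) {
  // ISO timestamp plus a category tag, e.g. "user_input", "api_response", "error"
  spikeLog.push({ ts: new Date().toISOString(), category, detail });
}

function summarizeLog() {
  // Summary suitable for the export: counts by category plus error total.
  const counts = {};
  for (const e of spikeLog) counts[e.category] = (counts[e.category] || 0) + 1;
  return {
    totalEvents: spikeLog.length,
    countsByCategory: counts,
    errors: spikeLog.filter((e) => e.category === 'error').length,
  };
}

logEvent('user_input', 'clicked start');
logEvent('api_response', '200 OK');
logEvent('error', 'timeout');
console.log(JSON.stringify(summarizeLog()));
```

The export mechanism (endpoint, file write, or download button) then serializes `spikeLog` plus this summary.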
Keep the logging lightweight — an array push per event, not a logging framework. Inline it in the spike code.
**c.** Build the minimum code that answers the spike's question (with the observability layer above if applicable). Start with the simplest version, then deepen. Every line must serve the question — nothing incidental.
**d.** Iterate when findings warrant it:
- **Surprising surface?** Write a follow-up test that isolates and explores it.
- **Answer feels shallow?** Probe edge cases — large inputs, concurrent requests, malformed data, network failures.
- **Assumption wrong?** Adjust. Note the pivot in the README.
Multiple files per spike are expected for complex questions (e.g., `test-basic.js`, `test-edge-cases.js`, `benchmark.js`).
**e.** Write `README.md` with YAML frontmatter:
```markdown
tags: [tag1, tag2]
---
# Spike NNN: Descriptive Name
## What This Validates
[The specific feasibility question, framed as Given/When/Then]
## Research
[Docs checked, key findings, approach comparison table, chosen approach and why, gotchas discovered. Omit if no external dependencies.]
## How to Run
[Single command or short sequence to run the spike]
## What to Expect
[Concrete observable outcomes: "When you click X, you should see Y within Z seconds"]
## Observability
[If this spike has a forensic log layer: describe what's captured, how to export the log, and how to analyze it. Omit for spikes without runtime observability.]
## Investigation Trail
[Updated as the spike progresses. Document each iteration: what was tried, what it revealed, what was tried next.]
## Results
[Filled in after running — verdict, evidence, surprises. If a forensic log was exported, include key findings from the log analysis here.]
```
**f.** Auto-link related spikes: read existing spike READMEs and infer relationships from tags, names, and descriptions. Write the `related` field silently.
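The relationship inference can be as simple as tag overlap; a hedged sketch, where the function name and data shapes are illustrative only:

```javascript
// Infer "related" spike IDs by shared tags from parsed README frontmatter.
function relatedSpikes(spike, others) {
  return others
    .filter((o) => o.id !== spike.id && o.tags.some((t) => spike.tags.includes(t)))
    .map((o) => o.id);
}

console.log(relatedSpikes(
  { id: '003', tags: ['pdf', 'parsing'] },
  [{ id: '002a', tags: ['pdf'] }, { id: '001', tags: ['websocket'] }]
)); // prints [ '002a' ]
```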
**g.** Run and verify:
- If self-verifiable: run it, check output, iterate if findings warrant deeper investigation, then update the README verdict and Results section
- If it needs human judgment: run it, present instructions using a checkpoint box:
╔══════════════════════════════════════════════════════════════╗
║ CHECKPOINT: Verification Required ║
╚══════════════════════════════════════════════════════════════╝
**Spike {NNN}: {name}**
**How to run:** {command}
**What to expect:** {concrete outcomes}
→ Does this match what you expected? Describe what you see.
──────────────────────────────────────────────────────────────
- If the spike has a forensic log layer: after verification, export the log and include key findings in the Results section. If something went wrong, ask the user to export the log and provide it for diagnosis.
**h.** Update the verdict to VALIDATED / INVALIDATED / PARTIAL (or WINNER for comparison spike winners). Update the Results section with evidence.
**i.** Update `.planning/spikes/MANIFEST.md` with the spike's row.
**j.** Commit (if `COMMIT_DOCS` is true):
```bash
gsd-sdk query commit "docs(spike-NNN): [VERDICT] — [key finding in one sentence]" .planning/spikes/NNN-descriptive-name/ .planning/spikes/MANIFEST.md
```
**k.** Report before moving to the next spike:
```
◆ Spike NNN: {name}
Verdict: {VALIDATED ✓ / INVALIDATED ✗ / PARTIAL ⚠}
Key findings: {not just the verdict — investigation trail, surprises, edge cases explored}
Impact: {effect on remaining spikes, if any}
```
Do not rush to a verdict. A spike that says "VALIDATED — it works" with no nuance is almost always incomplete.
**l.** If a spike invalidates a core assumption, stop and present:
╔══════════════════════════════════════════════════════════════╗
║ CHECKPOINT: Decision Required ║
╚══════════════════════════════════════════════════════════════╝
Core assumption invalidated by Spike {NNN}.
{what was invalidated and why}
──────────────────────────────────────────────────────────────
→ Continue with remaining spikes / Pivot approach / Abandon
──────────────────────────────────────────────────────────────
Only proceed if the user says to.
</step>
<step name="update_conventions">
## Update Conventions
After all spikes in this session are built, update `.planning/spikes/CONVENTIONS.md` with patterns that emerged or solidified.
```markdown
# Spike Conventions
Patterns and stack choices established across spike sessions. New spikes follow these unless the question requires otherwise.
## Stack
[What we use for frontend, backend, scripts, and why]
## Structure
[Common file layouts, port assignments, naming patterns]
## Patterns
[Recurring approaches: how we handle auth, how we style, how we serve]
## Tools & Libraries
[Preferred packages with versions that worked, and any to avoid]
```
Only include patterns that repeated across 2+ spikes or were explicitly chosen by the user. If `CONVENTIONS.md` already exists, update sections with new patterns from this session.
Commit (if `COMMIT_DOCS` is true):
```bash
gsd-sdk query commit "docs(spikes): update conventions" .planning/spikes/CONVENTIONS.md
```
</step>
<step name="report">
After all spikes complete, present the consolidated report:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► SPIKE COMPLETE ✓
| # | Name | Type | Verdict |
|---|------|------|---------|
| 001 | {name} | standard | ✓ VALIDATED |
| 002a | {name} | comparison | ✓ WINNER |
| 002b | {name} | comparison | — |
## Key Discoveries
{surprises, gotchas, investigation trail highlights}
## Feasibility Assessment
{overall: is the idea viable?}
## Signal for the Build
{what the real implementation should use, avoid, or watch out for}
```
───────────────────────────────────────────────────────────────
## ▶ Next Up
**Package findings** — wrap spike knowledge into an implementation blueprint
`/gsd-spike-wrap-up`
───────────────────────────────────────────────────────────────
**Also available:**
- `/gsd-spike` — spike more ideas (or run with no argument for frontier mode)
- `/gsd-plan-phase` — start planning the real implementation
- `/gsd-explore` — continue exploring the idea
- `/gsd-add-phase` — add a phase to the roadmap based on findings
───────────────────────────────────────────────────────────────
</step>
<success_criteria>
- [ ] `.planning/spikes/` created (auto-creates if needed, no project init required)
- [ ] Prior spikes and findings skills consulted — already-validated questions skipped, prior findings factored in
- [ ] Conventions followed (or deviation documented)
- [ ] Research grounded each spike in current docs before coding (unless pure logic/no deps)
- [ ] Depth over speed — edge cases tested, surprising findings followed, investigation trail documented
- [ ] Comparison spikes built back-to-back with head-to-head verdict
- [ ] Spikes needing human interaction have forensic log layer (event capture, export, analysis)
- [ ] Each spike answers one specific question with observable evidence
- [ ] User verified each spike (self-verified or human checkpoint)
- [ ] Requirements tracked in MANIFEST.md as they emerge from user choices
- [ ] CONVENTIONS.md created or updated with patterns that emerged
- [ ] Each spike README has complete frontmatter (including type), run instructions, Investigation Trail, and Results
- [ ] MANIFEST.md is current (with Type column and Requirements section)
- [ ] Commits use `docs(spike-NNN): [VERDICT]` format
- [ ] Consolidated report presented with next-step routing
- [ ] If core assumption invalidated, execution stopped and user consulted
</success_criteria>

package-lock.json generated

{
"name": "get-shit-done-cc",
"version": "1.38.2",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "get-shit-done-cc",
"version": "1.38.2",
"license": "MIT",
"bin": {
"get-shit-done-cc": "bin/install.js"


{
"name": "get-shit-done-cc",
"version": "1.38.2",
"description": "A meta-prompting, context engineering and spec-driven development system for Claude Code, OpenCode, Gemini and Codex by TÂCHES.",
"bin": {
"get-shit-done-cc": "bin/install.js"


const GSD_MANAGED_DIRS = [
'agents',
join('commands', 'gsd'),
'hooks',
];
function walkDir(dir: string, baseDir: string): string[] {


describe('stripShippedMilestones', () => {
it('returns content unchanged when no details blocks', () => {
expect(stripShippedMilestones('no details here')).toBe('no details here');
});
// Bug #2496: inline ✅ SHIPPED heading sections must be stripped
it('strips ## heading sections marked ✅ SHIPPED', () => {
const content = [
'## Milestone v1.0: MVP — ✅ SHIPPED 2026-01-15',
'',
'Phase 1, Phase 2',
'',
'## Milestone v2.0: Current',
'',
'Phase 3',
].join('\n');
const stripped = stripShippedMilestones(content);
expect(stripped).not.toContain('MVP');
expect(stripped).not.toContain('v1.0');
expect(stripped).toContain('v2.0');
expect(stripped).toContain('Current');
});
it('strips multiple inline SHIPPED sections and leaves non-shipped content', () => {
const content = [
'## Milestone v1.0: Alpha — ✅ SHIPPED 2026-01-01',
'',
'Old content',
'',
'## Milestone v1.5: Beta — ✅ SHIPPED 2026-02-01',
'',
'More old content',
'',
'## Milestone v2.0: Gamma',
'',
'Current content',
].join('\n');
const stripped = stripShippedMilestones(content);
expect(stripped).not.toContain('Alpha');
expect(stripped).not.toContain('Beta');
expect(stripped).toContain('Gamma');
expect(stripped).toContain('Current content');
});
// Bug #2508 follow-up: ### headings must be stripped too
it('strips ### heading sections marked ✅ SHIPPED', () => {
const content = [
'### Milestone v1.0: MVP — ✅ SHIPPED 2026-01-15',
'',
'Phase 1, Phase 2',
'',
'### Milestone v2.0: Current',
'',
'Phase 3',
].join('\n');
const stripped = stripShippedMilestones(content);
expect(stripped).not.toContain('MVP');
expect(stripped).not.toContain('v1.0');
expect(stripped).toContain('v2.0');
expect(stripped).toContain('Current');
});
});
// ─── getMilestoneInfo ─────────────────────────────────────────────────────
describe('getMilestoneInfo', () => {
expect(info.version).toBe('v1.0');
expect(info.name).toBe('milestone');
});
// Bug #2495: STATE.md must take priority over ROADMAP heading matching
it('prefers STATE.md milestone over ROADMAP heading match', async () => {
const roadmap = [
'## Milestone v1.0: Shipped — ✅ SHIPPED 2026-01-01',
'',
'Phase 1',
'',
'## Milestone v2.0: Current Active',
'',
'Phase 2',
].join('\n');
await writeFile(join(tmpDir, '.planning', 'ROADMAP.md'), roadmap);
await writeFile(
join(tmpDir, '.planning', 'STATE.md'),
'---\nmilestone: v2.0\nmilestone_name: Current Active\n---\n',
);
const info = await getMilestoneInfo(tmpDir);
expect(info.version).toBe('v2.0');
expect(info.name).toBe('Current Active');
});
// Bug #2508 follow-up: STATE.md has milestone version but no milestone_name —
// should use ROADMAP for the real name, still prefer STATE.md for version.
it('uses ROADMAP name when STATE.md has milestone version but no milestone_name', async () => {
const roadmap = [
'## Milestone v2.0: Real Name From Roadmap',
'',
'Phase 2',
].join('\n');
await writeFile(join(tmpDir, '.planning', 'ROADMAP.md'), roadmap);
await writeFile(
join(tmpDir, '.planning', 'STATE.md'),
'---\nmilestone: v2.0\n---\n', // no milestone_name
);
const info = await getMilestoneInfo(tmpDir);
expect(info.version).toBe('v2.0');
expect(info.name).toBe('Real Name From Roadmap');
});
it('returns correct milestone from STATE.md even when ROADMAP inline-SHIPPED stripping would fix it', async () => {
// ROADMAP with an unstripped shipped milestone heading (pre-fix state)
const roadmap = [
'## Milestone v1.0: Old — ✅ SHIPPED 2026-01-01',
'',
'Old phases',
'',
'## Milestone v2.0: New',
'',
'New phases',
].join('\n');
await writeFile(join(tmpDir, '.planning', 'ROADMAP.md'), roadmap);
await writeFile(
join(tmpDir, '.planning', 'STATE.md'),
'---\nmilestone: v2.0\nmilestone_name: New\n---\n',
);
const info = await getMilestoneInfo(tmpDir);
expect(info.version).toBe('v2.0');
expect(info.name).toBe('New');
});
});
// ─── extractCurrentMilestone ──────────────────────────────────────────────


/**
 * Port of stripShippedMilestones from core.cjs line 1082-1084.
 */
export function stripShippedMilestones(content: string): string {
return content.replace(/<details>[\s\S]*?<\/details>/gi, '');
// Pattern 1: <details>...</details> blocks (explicit collapse)
let result = content.replace(/<details>[\s\S]*?<\/details>/gi, '');
// Pattern 2: inline milestone headings marked as shipped.
// Keep aligned with heading levels accepted by extractCurrentMilestone() (## and ###).
const sections = result.split(/(?=^#{2,3}\s)/m);
result = sections.filter(s => !/^#{2,3}\s[^\n]*✅\s*SHIPPED\b/im.test(s)).join('');
return result;
}
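The two-pattern strip can be exercised standalone; this sketch mirrors the function above so the filtering behavior is visible on sample ROADMAP content:

```javascript
// Standalone mirror of stripShippedMilestones: drop <details> blocks, then
// drop any ##/### section whose heading line is marked ✅ SHIPPED.
function strip(content) {
  let result = content.replace(/<details>[\s\S]*?<\/details>/gi, '');
  const sections = result.split(/(?=^#{2,3}\s)/m);
  return sections.filter((s) => !/^#{2,3}\s[^\n]*✅\s*SHIPPED\b/im.test(s)).join('');
}

const roadmap = [
  '## Milestone v1.0: MVP — ✅ SHIPPED 2026-01-15',
  'Phase 1',
  '## Milestone v2.0: Current',
  'Phase 3',
].join('\n');

console.log(strip(roadmap)); // only the v2.0 section survives
```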
/**
*/
export async function getMilestoneInfo(projectDir: string): Promise<{ version: string; name: string }> {
try {
// Priority 1: STATE.md frontmatter (authoritative for version; name only when real)
const fromState = await parseMilestoneFromState(projectDir);
const stateVersion = fromState?.version ?? null;
const stateName = fromState && fromState.name !== 'milestone' ? fromState.name : null;
if (stateVersion && stateName) {
return { version: stateVersion, name: stateName };
}
// STATE.md has a version but no real name — fall through to ROADMAP for the name,
// then override the version with the authoritative STATE.md value.
const roadmap = await readFile(planningPaths(projectDir).roadmap, 'utf-8');
// List-format: construction / blocked (legacy emoji)
const barricadeMatch = roadmap.match(/🚧\s*\*\*v(\d+(?:\.\d+)+)\s+([^*]+)\*\*/);
if (barricadeMatch) {
return { version: stateVersion ?? 'v' + barricadeMatch[1], name: barricadeMatch[2].trim() };
}
// List-format: in flight / active (GSD ROADMAP template uses 🟡 for current milestone)
const inFlightMatch = roadmap.match(/🟡\s*\*\*v(\d+(?:\.\d+)+)\s+([^*]+)\*\*/);
if (inFlightMatch) {
return { version: stateVersion ?? 'v' + inFlightMatch[1], name: inFlightMatch[2].trim() };
}
// Heading-format — strip shipped <details> blocks first
const cleaned = stripShippedMilestones(roadmap);
const headingMatch = cleaned.match(/##\s+.*v(\d+(?:\.\d+)+)[:\s]+([^\n(]+)/);
if (headingMatch) {
return { version: stateVersion ?? 'v' + headingMatch[1], name: headingMatch[2].trim() };
}
// Milestone bullet list (## Milestones … ## Phases): use last **vX.Y Title** — typically the current row
const boldMatches = [...beforePhases.matchAll(/\*\*v(\d+(?:\.\d+)+)\s+([^*]+)\*\*/g)];
if (boldMatches.length > 0) {
const last = boldMatches[boldMatches.length - 1];
return { version: stateVersion ?? 'v' + last[1], name: last[2].trim() };
}
const allBare = [...cleaned.matchAll(/\bv(\d+(?:\.\d+)+)\b/g)];
if (allBare.length > 0) {
const lastBare = allBare[allBare.length - 1];
return { version: stateVersion ?? lastBare[0], name: 'milestone' };
}
return { version: stateVersion ?? 'v1.0', name: 'milestone' };
} catch {
const fromState = await parseMilestoneFromState(projectDir);
if (fromState) return fromState;
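`parseMilestoneFromState` itself is outside this hunk; a hypothetical sketch of the frontmatter parse it implies, using only the `milestone` and `milestone_name` field names seen in the tests above:

```javascript
// Hypothetical parse of STATE.md frontmatter (not the actual implementation):
// returns { version, name } or null when no milestone is recorded.
function parseFrontmatterMilestone(stateContent) {
  const fm = stateContent.match(/^---\n([\s\S]*?)\n---/);
  if (!fm) return null;
  const get = (key) => {
    const m = fm[1].match(new RegExp('^' + key + ':\\s*(.+)$', 'm'));
    return m ? m[1].trim() : null;
  };
  const version = get('milestone');
  if (!version) return null;
  // Placeholder name when milestone_name is absent, matching the tests' fallback.
  return { version, name: get('milestone_name') ?? 'milestone' };
}

console.log(parseFrontmatterMilestone('---\nmilestone: v2.0\nmilestone_name: New\n---\n'));
```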


/**
* Tests for bug #2501: resurrection-detection block in execute-phase.md must
* check git history before deleting new .planning/ files.
*
* Root cause: the original logic deleted ANY .planning/ file that was absent
* from PRE_MERGE_FILES, which includes brand-new files (e.g. SUMMARY.md)
* that the executor just created. A true "resurrection" is a file that was
* previously tracked on main, deliberately deleted, and then re-introduced by
* a worktree merge. Detecting that requires a git history check, not just a
* pre-merge tree membership check.
*/
'use strict';
const { test, describe } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');
const EXECUTE_PHASE = path.join(
__dirname, '..', 'get-shit-done', 'workflows', 'execute-phase.md'
);
describe('execute-phase.md — resurrection-detection guard (#2501)', () => {
let content;
// Load once; each test reads from the cached string.
test('file is readable', () => {
content = fs.readFileSync(EXECUTE_PHASE, 'utf-8');
assert.ok(content.length > 0, 'execute-phase.md must not be empty');
});
test('resurrection block checks git history for a prior deletion event', () => {
if (!content) content = fs.readFileSync(EXECUTE_PHASE, 'utf-8');
// Scope check to the resurrection block only (up to 1200 chars from its heading).
const resurrectionStart = content.indexOf('# Detect files deleted on main');
assert.ok(resurrectionStart !== -1, 'resurrection comment must exist');
const window = content.slice(resurrectionStart, resurrectionStart + 1200);
// The fix must add a git log --diff-filter=D check inside this block so that
// only files with a deletion event in the main branch ancestry are removed.
const hasHistoryCheck =
window.includes('--diff-filter=D') &&
window.includes('git log');
assert.ok(
hasHistoryCheck,
'execute-phase.md resurrection block must use "git log ... --diff-filter=D" to verify a file was previously deleted before removing it'
);
});
test('resurrection block does not delete files solely because they are absent from PRE_MERGE_FILES', () => {
if (!content) content = fs.readFileSync(EXECUTE_PHASE, 'utf-8');
// Extract the resurrection section (between the "Detect files deleted on main"
// comment and the next empty line / next major comment block).
const resurrectionStart = content.indexOf('# Detect files deleted on main');
assert.ok(
resurrectionStart !== -1,
'execute-phase.md must contain the resurrection-detection comment block'
);
// Grab a window of text around the resurrection block (up to 1200 chars).
const window = content.slice(resurrectionStart, resurrectionStart + 1200);
// The ONLY deletion guard should be the history check.
// The buggy pattern: `if ! echo "$PRE_MERGE_FILES" | grep -qxF "$RESURRECTED"`
// with NO accompanying history check. After the fix the sole condition
// determining deletion must involve a git-log history lookup.
const hasBuggyStandaloneGuard =
/if\s*!\s*echo\s*"\$PRE_MERGE_FILES"\s*\|\s*grep\s+-qxF\s*"\$RESURRECTED"/.test(window) &&
!/git log/.test(window);
assert.ok(
!hasBuggyStandaloneGuard,
'resurrection block must NOT delete files based solely on absence from PRE_MERGE_FILES without a git-history check'
);
});
test('resurrection block still removes files that have a deletion history on main', () => {
if (!content) content = fs.readFileSync(EXECUTE_PHASE, 'utf-8');
// The fix must still call `git rm` for genuine resurrections.
const resurrectionStart = content.indexOf('# Detect files deleted on main');
assert.ok(resurrectionStart !== -1, 'resurrection comment must exist');
const window = content.slice(resurrectionStart, resurrectionStart + 1200);
assert.ok(
window.includes('git rm'),
'resurrection block must still call git rm to remove genuinely resurrected files'
);
});
});


/**
* Regression test for #2502: insert-phase does not update STATE.md's
* next-phase recommendation after inserting a decimal phase.
*
* Root cause: insert-phase.md's update_project_state step only added a
* "Roadmap Evolution" note to STATE.md, but never updated the "Current Phase"
* / next-run recommendation to point at the newly inserted phase.
*
* Fix: insert-phase.md must include a step that updates STATE.md's next-phase
* pointer (current_phase / next recommended run) to the newly inserted phase.
*/
'use strict';
const { describe, test } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');
const INSERT_PHASE_PATH = path.join(
__dirname, '..', 'get-shit-done', 'workflows', 'insert-phase.md'
);
describe('bug-2502: insert-phase must update STATE.md next-phase recommendation', () => {
test('insert-phase.md exists', () => {
assert.ok(fs.existsSync(INSERT_PHASE_PATH), 'insert-phase.md should exist');
});
test('insert-phase.md contains a STATE.md next-phase update instruction', () => {
const content = fs.readFileSync(INSERT_PHASE_PATH, 'utf-8');
// Must reference STATE.md and the concept of updating the next/current phase pointer
const mentionsStateUpdate = (
/STATE\.md.{0,200}(next.phase|current.phase|next.run|recommendation)/is.test(content) ||
/(next.phase|current.phase|next.run|recommendation).{0,200}STATE\.md/is.test(content)
);
assert.ok(
mentionsStateUpdate,
'insert-phase.md must instruct updating STATE.md\'s next-phase recommendation to point to the newly inserted phase'
);
});
test('insert-phase.md update_project_state step covers next-phase pointer', () => {
const content = fs.readFileSync(INSERT_PHASE_PATH, 'utf-8');
const stepMatch = content.match(/<step name="update_project_state">([\s\S]*?)<\/step>/i);
assert.ok(stepMatch, 'insert-phase.md must contain update_project_state step');
const stepContent = stepMatch[1];
const hasNextPhasePointerUpdate = (
/\bcurrent[_ -]?phase\b/i.test(stepContent) ||
/\bnext[_ -]?phase\b/i.test(stepContent) ||
/\bnext recommended run\b/i.test(stepContent)
);
assert.ok(
hasNextPhasePointerUpdate,
'insert-phase.md update_project_state step must update STATE.md\'s next-phase pointer (current_phase) to the inserted decimal phase'
);
});
});


/**
* Regression test for bug #2506
*
* /gsd-settings presents Quality/Balanced/Budget model profiles without any
* warning that on non-Claude runtimes (Codex, Gemini CLI, etc.) these profiles
* select Claude model tiers and have no effect on actual agent model selection.
*
* Fix: settings.md must include a non-Claude runtime note instructing users to
* use "Inherit" or configure model_overrides manually, and the Inherit option
* description must explicitly call out non-Claude runtimes.
*
* Closes: #2506
*/
'use strict';
const { describe, test, before } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');
const SETTINGS_PATH = path.join(__dirname, '..', 'get-shit-done', 'workflows', 'settings.md');
describe('bug #2506: settings.md non-Claude runtime warning for model profiles', () => {
let content;
before(() => {
content = fs.readFileSync(SETTINGS_PATH, 'utf-8');
});
test('settings.md contains a non-Claude runtime note for model profiles', () => {
assert.ok(
content.includes('non-Claude runtime') || content.includes('non-Claude runtimes'),
'settings.md must include a note about non-Claude runtimes and model profiles'
);
});
test('non-Claude note explains profiles are no-ops without model_overrides', () => {
assert.ok(
content.includes('model_overrides') || content.includes('no effect'),
'note must explain profiles have no effect on non-Claude runtimes without model_overrides'
);
});
test('Inherit option description explicitly mentions non-Claude runtimes', () => {
// The Inherit option in AskUserQuestion must call out non-Claude runtimes
const inheritOptionMatch = content.match(/label:\s*"Inherit"[^}]*description:\s*"([^"]+)"/s);
assert.ok(inheritOptionMatch, 'Inherit option with label/description must exist in settings.md');
const desc = inheritOptionMatch[1];
assert.ok(
desc.includes('non-Claude') || desc.includes('Codex') || desc.includes('Gemini'),
`Inherit option description must mention non-Claude runtimes; got: "${desc}"`
);
});
});


describe('detect-custom-files — update workflow backup detection (#1997)', () => {
`should detect custom reference; got: ${JSON.stringify(json.custom_files)}`
);
});
// #2505 — installer does NOT wipe skills/ or command/; scanning them produces
// false-positive "custom file" reports for every skill the user has installed
// from other packages.
test('does not scan skills/ directory (installer does not wipe it)', () => {
writeManifest(tmpDir, {
'get-shit-done/workflows/execute-phase.md': '# Execute Phase\n',
});
// Simulate user having third-party skills installed — none in manifest
const skillsDir = path.join(tmpDir, 'skills');
fs.mkdirSync(skillsDir, { recursive: true });
fs.writeFileSync(path.join(skillsDir, 'my-custom-skill.md'), '# My Skill\n');
fs.writeFileSync(path.join(skillsDir, 'another-plugin-skill.md'), '# Another\n');
const result = runGsdTools(
['detect-custom-files', '--config-dir', tmpDir],
tmpDir
);
assert.ok(result.success, `Command failed: ${result.error}`);
const json = JSON.parse(result.output);
const skillFiles = json.custom_files.filter(f => f.startsWith('skills/'));
assert.strictEqual(
skillFiles.length, 0,
`skills/ should not be scanned; got false positives: ${JSON.stringify(skillFiles)}`
);
});
test('does not scan command/ directory (installer does not wipe it)', () => {
writeManifest(tmpDir, {
'get-shit-done/workflows/execute-phase.md': '# Execute Phase\n',
});
// Simulate files in command/ dir not wiped by installer
const commandDir = path.join(tmpDir, 'command');
fs.mkdirSync(commandDir, { recursive: true });
fs.writeFileSync(path.join(commandDir, 'user-command.md'), '# User Command\n');
const result = runGsdTools(
['detect-custom-files', '--config-dir', tmpDir],
tmpDir
);
assert.ok(result.success, `Command failed: ${result.error}`);
const json = JSON.parse(result.output);
const commandFiles = json.custom_files.filter(f => f.startsWith('command/'));
assert.strictEqual(
commandFiles.length, 0,
`command/ should not be scanned; got false positives: ${JSON.stringify(commandFiles)}`
);
});
});