feat(ingest-docs): add /gsd-ingest-docs workflow and command

Orchestrator for the ingest pipeline (#2387):

- commands/gsd/ingest-docs.md: /gsd-ingest-docs command with [path] [--mode] [--manifest] [--resolve] args; @-references the shared doc-conflict-engine so the BLOCKER gate semantics are inherited from the same contract /gsd-import consumes.
- get-shit-done/workflows/ingest-docs.md: end-to-end flow:
  1. parse + validate args (traversal guard on path + manifest)
  2. init query + runtime detect + auto mode-detect (.planning/ presence)
  3. discover docs via directory convention OR manifest YAML
  4. 50-doc cap; forces --manifest for larger sets in v1
  5. discovery approval gate
  6. parallel spawn of gsd-doc-classifier per doc (fallback to sequential on non-Claude runtimes)
  7. single gsd-doc-synthesizer spawn
  8. conflict gate honoring doc-conflict-engine safety rule: BLOCKER count > 0 aborts without writing PROJECT/REQUIREMENTS/ROADMAP/STATE
  9. route to gsd-roadmapper (new) or append-to-milestone (merge), audits roadmapper's required PROJECT.md fields and only prompts for gaps
  10. commit via gsd-sdk

Updates ARCHITECTURE.md counts (commands 80→81, workflows 77→78, agents tree-count 31→33).
--resolve interactive is reserved (explicit future-release reject).

Refs #2387

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:25:23 +02:00 · 2026-04-17 17:06:21 -05:00
parent 523a13f1e8
commit bfdf3c3065
3 changed files with 373 additions and 5 deletions
--- a/commands/gsd/ingest-docs.md
+++ b/commands/gsd/ingest-docs.md
@@ -0,0 +1,42 @@
+---
+name: gsd:ingest-docs
+description: Scan a repo for mixed ADRs, PRDs, SPECs, and DOCs and bootstrap or merge the full .planning/ setup from them. Classifies each doc in parallel, synthesizes a consolidated context with a conflicts report, and routes to new-project or merge-milestone depending on whether .planning/ already exists.
+argument-hint: "[path] [--mode new|merge] [--manifest <file>] [--resolve auto|interactive]"
+allowed-tools:
+  - Read
+  - Write
+  - Edit
+  - Bash
+  - Glob
+  - Grep
+  - AskUserQuestion
+  - Task
+---
+
+<objective>
+Build the full `.planning/` setup (or merge into an existing one) from multiple pre-existing planning documents — ADRs, PRDs, SPECs, DOCs — in one pass.
+
+- **Net-new bootstrap** (`--mode new`, default when `.planning/` is absent): produces PROJECT.md + REQUIREMENTS.md + ROADMAP.md + STATE.md from synthesized doc content, delegating final generation to `gsd-roadmapper`.
+- **Merge into existing** (`--mode merge`, default when `.planning/` is present): appends phases and requirements derived from the ingested docs; hard-blocks any contradiction with existing locked decisions.
+
+Auto-synthesizes most conflicts using the precedence rule `ADR > SPEC > PRD > DOC` (overridable via manifest). Surfaces unresolved cases in `.planning/INGEST-CONFLICTS.md` with three buckets: auto-resolved, competing-variants, unresolved-blockers. The BLOCKER gate from the shared conflict engine prevents any destination file from being written when unresolved contradictions exist.
+
+**Inputs:** directory-convention discovery (`docs/adr/`, `docs/prd/`, `docs/specs/`, `docs/rfc/`, root-level `{ADR,PRD,SPEC,RFC}-*.md`), or an explicit `--manifest <file>` YAML listing `{path, type, precedence?}` per doc.
+
+**v1 constraints:** hard cap of 50 docs per invocation; `--resolve interactive` is reserved for a future release.
+</objective>
+
+<execution_context>
+@~/.claude/get-shit-done/workflows/ingest-docs.md
+@~/.claude/get-shit-done/references/ui-brand.md
+@~/.claude/get-shit-done/references/gate-prompts.md
+@~/.claude/get-shit-done/references/doc-conflict-engine.md
+</execution_context>
+
+<context>
+$ARGUMENTS
+</context>
+
+<process>
+Execute the ingest-docs workflow end-to-end. Preserve all approval gates (discovery, conflict report, routing) and the BLOCKER safety rule.
+</process>
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -113,7 +113,7 @@ User-facing entry points. Each file contains YAML frontmatter (name, description
 - **Copilot:** Slash commands (`/gsd-command-name`)
 - **Antigravity:** Skills

-**Total commands:** 80
+**Total commands:** 81

 ### Workflows (`get-shit-done/workflows/*.md`)

@@ -124,7 +124,7 @@ Orchestration logic that commands reference. Contains the step-by-step process i
 - State update patterns
 - Error handling and recovery

-**Total workflows:** 77
+**Total workflows:** 78

 ### Agents (`agents/*.md`)

@@ -409,14 +409,14 @@ UI-SPEC.md (per phase) ───────────────────

 ```
 ~/.claude/                          # Claude Code (global install)
-├── commands/gsd/*.md               # 80 slash commands
+├── commands/gsd/*.md               # 81 slash commands
 ├── get-shit-done/
 │   ├── bin/gsd-tools.cjs           # CLI utility
 │   ├── bin/lib/*.cjs               # 19 domain modules
-│   ├── workflows/*.md              # 77 workflow definitions
+│   ├── workflows/*.md              # 78 workflow definitions
 │   ├── references/*.md             # 35 shared reference docs
 │   └── templates/                  # Planning artifact templates
-├── agents/*.md                     # 31 agent definitions
+├── agents/*.md                     # 33 agent definitions
 ├── hooks/
 │   ├── gsd-statusline.js           # Statusline hook
 │   ├── gsd-context-monitor.js      # Context warning hook
--- a/get-shit-done/workflows/ingest-docs.md
+++ b/get-shit-done/workflows/ingest-docs.md
@@ -0,0 +1,326 @@
+# Ingest Docs Workflow
+
+Scan a repo for mixed planning documents (ADR, PRD, SPEC, DOC), synthesize them into a consolidated context, and bootstrap or merge into `.planning/`.
+
+- `[path]` — optional target directory to scan (defaults to repo root)
+- `--mode new|merge` — override auto-detect (defaults: `new` if `.planning/` absent, `merge` if present)
+- `--manifest <file>` — YAML file listing `{path, type, precedence?}` per doc; overrides heuristic classification
+- `--resolve auto|interactive` — conflict resolution (v1: only `auto` is supported; `interactive` is reserved)
+
+---
+
+<step name="banner">
+
+Display the stage banner:
+
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ GSD ► INGEST DOCS
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+
+</step>
+
+<step name="parse_arguments">
+
+Parse `$ARGUMENTS`:
+
+- First positional token (if not a flag) → `SCAN_PATH` (default: `.`)
+- `--mode new|merge` → `MODE` (default: auto-detect)
+- `--manifest <file>` → `MANIFEST_PATH` (optional)
+- `--resolve auto|interactive` → `RESOLVE_MODE` (default: `auto`; reject `interactive` in v1 with message "interactive resolution is planned for a future release")
+
+**Validate paths:**
+
+```bash
+case "{SCAN_PATH}" in *..*) echo "SECURITY_ERROR: path contains traversal sequence"; exit 1 ;; esac
+test -d "{SCAN_PATH}" || echo "PATH_NOT_FOUND"
+if [ -n "{MANIFEST_PATH}" ]; then
+  case "{MANIFEST_PATH}" in *..*) echo "SECURITY_ERROR: manifest path contains traversal"; exit 1 ;; esac
+  test -f "{MANIFEST_PATH}" || echo "MANIFEST_NOT_FOUND"
+fi
+```
+
+If `PATH_NOT_FOUND` or `MANIFEST_NOT_FOUND`: display error and exit.
+
+</step>
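Illustrative sketch, not part of the committed workflow: the parsing rules in this step could be expressed as a small POSIX shell function. The function name `parse_ingest_args` and the `UNKNOWN_ARGUMENT`, `MISSING_VALUE`, `INVALID_MODE`, and `INVALID_RESOLVE` labels are assumptions made for illustration; only the variable names, the defaults, and the interactive-rejection message come from the step above.

```bash
# Hypothetical helper: parse ingest-docs arguments into shell variables.
parse_ingest_args() {
  SCAN_PATH="."; MODE=""; MANIFEST_PATH=""; RESOLVE_MODE="auto"

  # The first token is the scan path only when it is not a flag.
  case "${1:-}" in
    ""|--*) ;;
    *) SCAN_PATH="$1"; shift ;;
  esac

  while [ "$#" -gt 0 ]; do
    case "$1" in
      --mode|--manifest|--resolve)
        # Every supported flag takes exactly one value.
        [ "$#" -ge 2 ] || { echo "MISSING_VALUE: $1"; return 2; }
        case "$1" in
          --mode)     MODE="$2" ;;
          --manifest) MANIFEST_PATH="$2" ;;
          --resolve)  RESOLVE_MODE="$2" ;;
        esac
        shift 2 ;;
      *) echo "UNKNOWN_ARGUMENT: $1"; return 2 ;;
    esac
  done

  # An empty MODE means "auto-detect later".
  case "$MODE" in
    ""|new|merge) ;;
    *) echo "INVALID_MODE: $MODE"; return 2 ;;
  esac

  # v1 accepts only auto; interactive is reserved.
  case "$RESOLVE_MODE" in
    auto) ;;
    interactive) echo "interactive resolution is planned for a future release"; return 2 ;;
    *) echo "INVALID_RESOLVE: $RESOLVE_MODE"; return 2 ;;
  esac
}
```

Calling `parse_ingest_args docs --mode merge` would leave `SCAN_PATH=docs` and `MODE=merge`; the traversal checks from the validation block would still need to run afterwards.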
+
+<step name="init_and_mode_detect">
+
+Run the init query:
+
+```bash
+INIT=$(gsd-sdk query init.ingest-docs 2>/dev/null || gsd-sdk query init.default)
+```
+
+Parse `project_exists`, `planning_exists`, `has_git`, `project_path` from INIT.
+
+**Auto-detect MODE** if not set:
+- `planning_exists: true` → `MODE=merge`
+- `planning_exists: false` → `MODE=new`
+
+If the user passed `--mode new` but `.planning/` already exists: display a warning and require explicit confirmation via `AskUserQuestion` (approve-revise-abort from `references/gate-prompts.md`) before overwriting.
+
+If `has_git: false` and `MODE=new`: initialize git:
+```bash
+git init
+```
+
+**Detect runtime** using the same pattern as `new-project.md`:
+- execution_context path `/.codex/` → `RUNTIME=codex`
+- `/.gemini/` → `RUNTIME=gemini`
+- `/.opencode/` or `/.config/opencode/` → `RUNTIME=opencode`
+- else → `RUNTIME=claude`
+
+Fall back to env vars (`CODEX_HOME`, `GEMINI_CONFIG_DIR`, `OPENCODE_CONFIG_DIR`) if execution_context is unavailable.
+
+</step>
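Illustrative sketch, not part of the committed workflow: the mode and runtime rules above reduce to two small lookups. The helper names `detect_runtime` and `detect_mode` are hypothetical, and `detect_mode` checks for the `.planning/` directory directly as a stand-in for the `planning_exists` field that the real step reads from the init query.

```bash
# Hypothetical helper: map an execution_context path to a runtime name.
detect_runtime() {
  case "$1" in
    */.codex/*)  echo codex ;;
    */.gemini/*) echo gemini ;;
    */.opencode/*|*/.config/opencode/*) echo opencode ;;
    "")
      # No execution_context available: fall back to env vars.
      if   [ -n "${CODEX_HOME:-}" ];          then echo codex
      elif [ -n "${GEMINI_CONFIG_DIR:-}" ];   then echo gemini
      elif [ -n "${OPENCODE_CONFIG_DIR:-}" ]; then echo opencode
      else echo claude
      fi ;;
    *) echo claude ;;
  esac
}

# Hypothetical helper: an explicit --mode value wins; otherwise the
# presence of .planning/ under the project path decides.
detect_mode() {
  explicit="$1"; project_path="${2:-.}"
  if [ -n "$explicit" ]; then
    echo "$explicit"
  elif [ -d "$project_path/.planning" ]; then
    echo merge
  else
    echo new
  fi
}
```

The sketch leaves out the confirmation gate for `--mode new` over an existing `.planning/`; that decision belongs to the `AskUserQuestion` prompt described above.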
+
+<step name="discover_docs">
+
+Build the doc list from three sources, in order:
+
+**1. Manifest (if provided)** — authoritative:
+
+Read `MANIFEST_PATH`. Expected YAML shape:
+
+```yaml
+docs:
+  - path: docs/adr/0001-db.md
+    type: ADR
+    precedence: 0   # optional, lower = higher precedence
+  - path: docs/prd/auth.md
+    type: PRD
+```
+
+Each entry provides `path` (required, relative to repo root) + `type` (required, one of ADR|PRD|SPEC|DOC) + `precedence` (optional integer).
+
+**2. Directory conventions** (skipped when manifest is provided):
+
+```bash
+# ADRs
+find "{SCAN_PATH}" -type f \( -path '*/adr/*' -o -path '*/adrs/*' -o -name 'ADR-*.md' -o -regex '.*/[0-9]\{4\}-.*\.md' \) 2>/dev/null
+
+# PRDs
+find "{SCAN_PATH}" -type f \( -path '*/prd/*' -o -path '*/prds/*' -o -name 'PRD-*.md' \) 2>/dev/null
+
+# SPECs / RFCs
+find "{SCAN_PATH}" -type f \( -path '*/spec/*' -o -path '*/specs/*' -o -path '*/rfc/*' -o -path '*/rfcs/*' -o -name 'SPEC-*.md' -o -name 'RFC-*.md' \) 2>/dev/null
+
+# Generic docs (fall-through candidates)
+find "{SCAN_PATH}" -type f -path '*/docs/*' -name '*.md' 2>/dev/null
+```
+
+De-duplicate the union (a file matched by multiple patterns is one doc).
+
+**3. Content heuristics** (run during classification, not here) — the classifier handles frontmatter `type:` and H1 inspection for docs that didn't match a convention.
+
+**Cap:** hard limit of 50 docs per invocation (documented v1 constraint). If the discovered set exceeds 50:
+
+```
+GSD > Discovered {N} docs, which exceeds the v1 cap of 50.
+      Use --manifest to narrow the set to ≤ 50 files, or run
+      /gsd-ingest-docs again with a narrower <path>.
+```
+
+Exit without proceeding.
+
+**Display the discovered set** and request approval using a gate pattern from `references/gate-prompts.md` (`approve-revise-abort`, as used below; `yes-no-pick` also works):
+
+```
+Discovered {N} documents:
+  {N} ADR | {N} PRD | {N} SPEC | {N} DOC | {N} unclassified
+
+  docs/adr/0001-architecture.md       [ADR]    (from manifest|directory|heuristic)
+  docs/adr/0002-database.md           [ADR]    (directory)
+  docs/prd/auth.md                    [PRD]    (manifest)
+  ...
+```
+
+**Text mode:** apply the same `--text`/`text_mode` rule as other workflows — replace `AskUserQuestion` with a numbered list.
+
+Use `AskUserQuestion` (approve-revise-abort):
+- question: "Proceed with classification of these {N} documents?"
+- header: "Approve?"
+- options: Approve | Revise | Abort
+
+On Abort: exit cleanly with "Ingest cancelled."
+On Revise: exit with guidance to re-run with `--manifest` or a narrower path.
+
+</step>
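Illustrative sketch, not part of the committed workflow: de-duplicating the union of the `find` results and enforcing the 50-doc cap could look roughly like this. The function name `dedupe_and_cap` and the `MAX_DOCS` variable are hypothetical, and the sketch assumes one path per line with no embedded newlines.

```bash
# v1 cap from the workflow text.
MAX_DOCS=50

# Hypothetical helper: read candidate paths on stdin, drop duplicates
# and blank lines, and refuse to continue when the set exceeds the cap.
dedupe_and_cap() {
  docs=$(sort -u | sed '/^$/d')
  count=$(printf '%s\n' "$docs" | sed '/^$/d' | wc -l | tr -d ' ')
  if [ "$count" -gt "$MAX_DOCS" ]; then
    echo "GSD > Discovered $count docs, which exceeds the v1 cap of $MAX_DOCS." >&2
    return 1
  fi
  # Print the surviving paths, one per line (nothing for an empty set).
  if [ -n "$docs" ]; then
    printf '%s\n' "$docs"
  fi
}
```

Piping the output of all four `find` commands into `dedupe_and_cap` would yield the list shown at the approval gate, or a non-zero status when the cap is exceeded.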
+
+<step name="classify_parallel">
+
+Create staging directory:
+
+```bash
+mkdir -p .planning/intel/classifications/
+```
+
+For each discovered doc, spawn `gsd-doc-classifier` in parallel. In Claude Code, issue all Task calls in a single message with multiple tool uses so the harness runs them concurrently. For Copilot / sequential runtimes, fall back to sequential dispatch.
+
+Per-spawn prompt fields:
+- `FILEPATH` — absolute path to the doc
+- `OUTPUT_DIR` — `.planning/intel/classifications/`
+- `MANIFEST_TYPE` — the type from the manifest if present, else omit
+- `MANIFEST_PRECEDENCE` — the precedence integer from the manifest if present, else omit
+- `<required_reading>` — `agents/gsd-doc-classifier.md` (the agent definition itself)
+
+Collect the one-line confirmations from each classifier. If any classifier errors out, surface the error and abort without touching `.planning/` further.
+
+</step>
+
+<step name="synthesize">
+
+Spawn `gsd-doc-synthesizer` once:
+
+```
+Task({
+  subagent_type: "gsd-doc-synthesizer",
+  prompt: "
+    CLASSIFICATIONS_DIR: .planning/intel/classifications/
+    INTEL_DIR: .planning/intel/
+    CONFLICTS_PATH: .planning/INGEST-CONFLICTS.md
+    MODE: {MODE}
+    EXISTING_CONTEXT: {paths to existing .planning files if MODE=merge, else empty}
+    PRECEDENCE: {array from manifest defaults or default ['ADR','SPEC','PRD','DOC']}
+
+    <required_reading>
+    - agents/gsd-doc-synthesizer.md
+    - get-shit-done/references/doc-conflict-engine.md
+    </required_reading>
+  "
+})
+```
+
+The synthesizer writes:
+- `.planning/intel/decisions.md`, `.planning/intel/requirements.md`, `.planning/intel/constraints.md`, `.planning/intel/context.md`
+- `.planning/intel/SYNTHESIS.md`
+- `.planning/INGEST-CONFLICTS.md`
+
+</step>
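Illustrative sketch, not part of the committed workflow: the `PRECEDENCE` array passed to the synthesizer implies a ranking in which a lower number wins. The helpers `precedence_rank` and `higher_precedence` are hypothetical, and treating a manifest `precedence` integer as sharing one scale with the default ranks is an assumption; the synthesizer agent definition is the authority on the actual behaviour.

```bash
# Hypothetical helper: rank a doc type; lower rank = higher precedence.
# An explicit manifest precedence overrides the default ADR > SPEC > PRD > DOC order.
precedence_rank() {
  doc_type="$1"; override="${2:-}"
  if [ -n "$override" ]; then
    echo "$override"
    return 0
  fi
  case "$doc_type" in
    ADR)  echo 0 ;;
    SPEC) echo 1 ;;
    PRD)  echo 2 ;;
    DOC)  echo 3 ;;
    *) echo "UNKNOWN_TYPE: $doc_type" >&2; return 1 ;;
  esac
}

# Hypothetical helper: print the winning type, or TIE when ranks match.
# A tie is deliberately not resolved here.
higher_precedence() {
  rank_a=$(precedence_rank "$1") || return 1
  rank_b=$(precedence_rank "$2") || return 1
  if [ "$rank_a" -lt "$rank_b" ]; then
    echo "$1"
  elif [ "$rank_b" -lt "$rank_a" ]; then
    echo "$2"
  else
    echo TIE
  fi
}
```

For example, `higher_precedence ADR PRD` prints `ADR`, while two docs of the same type produce `TIE`, which would have to surface in the conflicts report rather than be auto-resolved.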
+
+<step name="conflict_gate">
+
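+Read `.planning/INGEST-CONFLICTS.md` and count the entries in each bucket. The synthesizer always writes the three-bucket header, so parse the `### BLOCKERS ({N})`, `### WARNINGS ({N})`, and `### INFO ({N})` lines; these correspond to the unresolved-blockers, competing-variants, and auto-resolved buckets respectively.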
+
+Apply the safety semantics from `references/doc-conflict-engine.md`. Operation noun: `ingest`.
+
+**If BLOCKERS > 0:**
+
+Render the report to the user, then display:
+
+```
+GSD > BLOCKED: {N} blockers must be resolved before ingest can proceed.
+```
+
+Exit WITHOUT writing PROJECT.md, REQUIREMENTS.md, ROADMAP.md, or STATE.md. The staging intel files remain for inspection. The safety gate holds — no destination files are written when blockers exist.
+
+**If WARNINGS > 0 and BLOCKERS = 0:**
+
+Render the report, then ask via `AskUserQuestion` (approve-revise-abort pattern, offering only Approve and Abort):
+- question: "Review the competing variants above. Resolve manually and proceed, or abort?"
+- header: "Approve?"
+- options: Approve | Abort
+
+On Abort: exit cleanly with "Ingest cancelled. Staged intel preserved at `.planning/intel/`."
+
+**If BLOCKERS = 0 and WARNINGS = 0:**
+
+Proceed to routing silently, or optionally display `GSD > No conflicts. Auto-resolved: {N}.`
+
+</step>
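Illustrative sketch, not part of the committed workflow: the gate above amounts to reading two counts from the report headers and branching on them. The helper names `bucket_count` and `conflict_gate`, the decision strings, and the choice to treat a missing header as a malformed report (failing closed) are all assumptions made for illustration.

```bash
# Hypothetical helper: extract N from the first "### <LABEL> (N)" line.
bucket_count() {
  label="$1"; report="$2"
  n=$(sed -n "s/^### $label (\([0-9][0-9]*\)).*/\1/p" "$report" | head -n 1)
  [ -n "$n" ] || return 1
  echo "$n"
}

# Hypothetical helper: print the gate decision for a conflicts report.
# Returns non-zero whenever destination files must not be written.
conflict_gate() {
  report="$1"
  blockers=$(bucket_count BLOCKERS "$report") || { echo malformed-report; return 2; }
  warnings=$(bucket_count WARNINGS "$report") || { echo malformed-report; return 2; }
  if [ "$blockers" -gt 0 ]; then
    echo blocked
    return 1
  elif [ "$warnings" -gt 0 ]; then
    echo needs-approval
  else
    echo proceed
  fi
}
```

Running `conflict_gate .planning/INGEST-CONFLICTS.md` after synthesis would print `blocked`, `needs-approval`, or `proceed`; only the last two leave open the possibility of writing PROJECT.md, REQUIREMENTS.md, ROADMAP.md, or STATE.md.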
+
+<step name="route_new_mode">
+
+**Applies only when MODE=new.**
+
+Audit PROJECT.md field requirements that `gsd-roadmapper` expects. For fields derivable from `.planning/intel/SYNTHESIS.md` (project scope, goals/non-goals, constraints, locked decisions), synthesize from the intel. For fields NOT derivable (project name, developer-facing success metric, target runtime), prompt via `AskUserQuestion` one at a time — minimal question set, no interrogation.
+
+Delegate to `gsd-roadmapper`:
+
+```
+Task({
+  subagent_type: "gsd-roadmapper",
+  prompt: "
+    Mode: new-project-from-ingest
+    Intel: .planning/intel/SYNTHESIS.md (entry point)
+    Per-type intel: .planning/intel/{decisions,requirements,constraints,context}.md
+    User-supplied fields: {collected in previous step}
+
+    Produce:
+    - .planning/PROJECT.md
+    - .planning/REQUIREMENTS.md
+    - .planning/ROADMAP.md
+    - .planning/STATE.md
+
+    Treat ADR-locked decisions as locked in PROJECT.md <decisions> blocks.
+  "
+})
+```
+
+</step>
+
+<step name="route_merge_mode">
+
+**Applies only when MODE=merge.**
+
+Load existing `.planning/ROADMAP.md`, `.planning/PROJECT.md`, `.planning/REQUIREMENTS.md`, all `CONTEXT.md` files under `.planning/phases/`.
+
+The synthesizer has already hard-blocked on any LOCKED-in-ingest vs LOCKED-in-existing contradiction; if we reach this step, no such blockers remain.
+
+Plan the merge:
+- **New requirements** from synthesized `.planning/intel/requirements.md` that do not overlap existing REQUIREMENTS.md entries → append to REQUIREMENTS.md
+- **New decisions** from synthesized `.planning/intel/decisions.md` that do not overlap existing CONTEXT.md `<decisions>` blocks → write to a new phase's CONTEXT.md or append to the next milestone's requirements
+- **New scope** → derive phase additions following the `new-milestone.md` pattern; append phases to `.planning/ROADMAP.md`
+
+Preview the merge diff to the user and gate via approve-revise-abort before writing.
+
+</step>
+
+<step name="finalize">
+
+Commit the ingest results:
+
+```bash
+gsd-sdk query commit "docs: ingest {N} docs from {SCAN_PATH} (#2387)" \
+  .planning/PROJECT.md \
+  .planning/REQUIREMENTS.md \
+  .planning/ROADMAP.md \
+  .planning/STATE.md \
+  .planning/intel/ \
+  .planning/INGEST-CONFLICTS.md
+```
+
+(For merge mode, substitute the actual set of modified files.)
+
+Display completion:
+
+```
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+ GSD ► INGEST DOCS COMPLETE
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+```
+
+Show:
+- Mode that ran (new or merge)
+- Docs ingested (count + type breakdown)
+- Decisions locked, requirements created, constraints captured
+- Conflict report path (`.planning/INGEST-CONFLICTS.md`)
+- Next step: `/gsd-plan-phase 1` (new mode) or `/gsd-plan-phase N` (merge, pointing at the first newly-added phase)
+
+</step>
+
+---
+
+## Anti-Patterns
+
+Do NOT:
+- Violate the shared conflict-engine contract in `references/doc-conflict-engine.md` (no markdown tables, no new severity labels, no bypass of the BLOCKER gate)
+- Write PROJECT.md, REQUIREMENTS.md, ROADMAP.md, or STATE.md when BLOCKERs exist in the conflict report
+- Skip the 50-doc cap — larger sets must use `--manifest` to narrow the scope
+- Auto-resolve LOCKED-vs-LOCKED ADR contradictions — those are BLOCKERs in both modes
+- Merge competing PRD acceptance variants into a combined criterion — preserve all variants for user resolution
+- Bypass the discovery approval gate (users must see the discovered doc list before classifiers spawn)
+- Skip path validation on `SCAN_PATH` or `MANIFEST_PATH`
+- Implement `--resolve interactive` in this v1 — the flag is reserved; reject with a future-release message