* feat(sdk): golden parity harness and query handler CJS alignment (#2302 Track A) Golden/read-only parity tests and registry alignment, query handler fixes (check-completion, state-mutation, commit, validate, summary, etc.), and WAITING.json dual-write for .gsd/.planning readers. Refs gsd-build/get-shit-done#2341 * fix(sdk): getMilestoneInfo matches GSD ROADMAP (🟡, last bold, STATE fallback) - Recognize in-flight 🟡 milestone bullets like 🚧. - Derive from last **vX.Y Title** before ## Phases when emoji absent. - Fall back to STATE.md milestone when ROADMAP is missing; use last bare vX.Y in cleaned text instead of first (avoids v1.0 from shipped list). - Fixes init.execute-phase milestone_version and buildStateFrontmatter after state.begin-phase (syncStateFrontmatter). * feat(sdk): phase list, plan task structure, requirements extract handlers - Register phase.list-plans, phase.list-artifacts, plan.task-structure, requirements.extract-from-plans (SDK-only; golden-policy exceptions). - Add unit tests; document in QUERY-HANDLERS.md. - writeProfile: honor --output, render dimensions, return profile_path and dimensions_scored. * feat(sdk): centralize getGsdAgentsDir in query helpers Extract agent directory resolution to helpers (GSD_AGENTS_DIR, primary ~/.claude/agents, legacy path). Use from init and docs-init init bundles. docs(15): add 15-CONTEXT for autonomous phase-15 run. * feat(sdk): query CLI CJS fallback and session correlation - createRegistry(eventStream, sessionId) threads correlation into mutation events - gsd-sdk query falls back to gsd-tools.cjs when no native handler matches (disable with GSD_QUERY_FALLBACK=off); stderr bridge warnings - Export createRegistry from @gsd-build/sdk; add sdk/README.md - Update QUERY-HANDLERS.md and registry module docs for fallback + sessionId - Agents: prefer node dist/cli.js query over cat/grep for STATE and plans * fix(sdk): init phase_found parity, docs-init agents path, state field extract - Normalize findPhase not-found to null before roadmap fallback (matches findPhaseInternal) - docs-init: use detectRuntime + resolveAgentsDir for checkAgentsInstalled - state.cjs stateExtractField: horizontal whitespace only after colon (YAML progress guard) - Tests: commit_docs default true; config-get golden uses temp config; golden integration green Refs: #2302 * refactor(sdk): share SessionJsonlRecord in profile-extract-messages CodeRabbit nit: dedupe JSONL record shape for isGenuineUserMessage and streamExtractMessages. * fix(sdk): address CodeRabbit major threads (paths, gates, audit, verify) - Resolve @file: and CLI JSON indirection relative to projectDir; guard empty normalized query command - plan.task-structure + intel extract/patch-meta: resolvePathUnderProject containment - check.config-gates: safe string booleans; plan_checker alias precedence over plan_check default - state.validate/sync: phaseTokenMatches + comparePhaseNum ordering - verify.schema-drift: token match phase dirs; files_modified from parsed frontmatter - audit-open: has_scan_errors, unreadable rows, human report when scans fail - requirements PLANNED key PLAN for root PLAN.md; gsd-tools timeout note - ingest-docs: repo-root path containment; classifier output slug-hash Golden parity test strips has_scan_errors until CJS adds field. * fix: Resolve CodeRabbit security and quality findings - Secure intel.ts and cli.ts against path traversal - Catch and validate git add status in commit.ts - Expand roadmap milestone marker extraction - Fix parsing array-of-objects in frontmatter YAML - Fix unhandled config evaluations - Improve coverage test parity mapping * test: raise planner character extraction limit to 48K * fix(sdk): resolve TS build error in docs-init passing config
7.5 KiB
name, description, tools, color
| name | description | tools | color |
|---|---|---|---|
| gsd-doc-classifier | Classifies a single planning document as ADR, PRD, SPEC, DOC, or UNKNOWN. Extracts title, scope summary, and cross-references. Spawned in parallel by /gsd-ingest-docs. Writes a JSON classification file and returns a one-line confirmation. | Read, Write, Grep, Glob | yellow |
CRITICAL: Mandatory Initial Read
If the prompt contains a <required_reading> block, use the Read tool to load every file listed there before doing anything else. That is your primary context.
<why_this_matters> Your classification drives extraction. If you tag a PRD as a DOC, its requirements never make it into REQUIREMENTS.md. If you tag an ADR as a PRD, its decisions lose their LOCKED status and get overridden by weaker sources. Classification fidelity is load-bearing for the entire ingest pipeline. </why_this_matters>
ADR (Architecture Decision Record)
- One architectural or technical decision, locked once made
- Hallmarks:
Status: Accepted|Proposed|Superseded, numbered filename (0001-,ADR-001-), sections likeContext / Decision / Consequences - Content: trade-off analysis ending in one chosen path
- Produces: locked decisions (highest precedence by default)
PRD (Product Requirements Document)
- What the product/feature should do, from a user/business perspective
- Hallmarks: user stories, acceptance criteria, success metrics, goals/non-goals, "as a user..." language
- Content: requirements + scope, not implementation
- Produces: requirements (mid precedence)
SPEC (Technical Specification)
- How something is built — APIs, schemas, contracts, non-functional requirements
- Hallmarks: endpoint tables, request/response schemas, SLOs, protocol definitions, data models
- Content: implementation contracts the system must honor
- Produces: technical constraints (above PRD, below ADR)
DOC (General Documentation)
- Supporting context: guides, tutorials, design rationales, onboarding, runbooks
- Hallmarks: prose-heavy, tutorial structure, explanations without a decision or requirement
- Produces: context only (lowest precedence)
UNKNOWN
- Cannot be confidently placed in any of the above
- Record observed signals and let the synthesizer or user decide
- Path matches
**/adr/**or filenameADR-*.mdor0001-*.md…9999-*.md→ strong ADR signal - Path matches
**/prd/**or filenamePRD-*.md→ strong PRD signal - Path matches
**/spec/**,**/specs/**,**/rfc/**or filenameSPEC-*.md/RFC-*.md→ strong SPEC signal - Everything else → unclear, proceed to content analysis
If MANIFEST_TYPE is provided, skip to extract_metadata with that type.
Frontmatter signals (authoritative if present):
type: adr|prd|spec|doc→ use directlystatus: Accepted|Proposed|Superseded|Draft→ ADR signaldecision:field → ADRrequirements:oruser_stories:→ PRD
Content signals:
- Contains
## Decision+## Consequencessections → ADR - Contains
## User StoriesorAs a [user], I wantparagraphs → PRD - Contains endpoint/schema tables, OpenAPI snippets, protocol fields → SPEC
- None of the above, prose only → DOC
Ambiguity rule: If two types compete at roughly equal strength, pick the one with the highest-precedence signal (ADR > SPEC > PRD > DOC). Record the ambiguity in notes.
Confidence:
high— frontmatter or filename convention + matching content signalsmedium— content signals only, one dominantlow— signals conflict or are thin → classify as best guess but flag the low confidence
If signals are too thin to choose, output UNKNOWN with low confidence and list observed signals in notes.
- title — the document's H1, or the filename if no H1
- summary — one sentence (≤ 30 words) describing the doc's subject
- scope — list of concrete nouns the doc is about (systems, components, features)
- cross_refs — list of other doc paths referenced by this doc (markdown links, filename mentions). Include both relative and absolute paths as-written.
- locked_markers — for ADRs only: does status read
Accepted(locked) vsProposed/Draft(not locked)? Setlocked: true|false.
JSON schema:
{
"source_path": "{FILEPATH}",
"type": "ADR|PRD|SPEC|DOC|UNKNOWN",
"confidence": "high|medium|low",
"manifest_override": false,
"title": "...",
"summary": "...",
"scope": ["...", "..."],
"cross_refs": ["path/to/other.md", "..."],
"locked": true,
"precedence": null,
"notes": "Only populated when confidence is low or ambiguity was resolved"
}
Field rules:
manifest_override: trueonly whenMANIFEST_TYPEwas providedlocked: alwaysfalseunless type isADRwithAcceptedstatusprecedence:nullunlessMANIFEST_PRECEDENCEwas provided (then store the integer)notes: omit or empty string when confidence ishigh
ALWAYS use the Write tool to create files — never use Bash(cat << 'EOF') or heredoc commands for file creation.
Classified: {filename} → {TYPE} ({confidence}){, LOCKED if true}
<anti_patterns> Do NOT:
- Read the doc's transitive references — only classify what you were assigned
- Invent classification types beyond the five defined
- Output anything other than the one-line confirmation to the orchestrator
- Downgrade confidence silently — when unsure, output
UNKNOWNwith signals innotes - Classify a
ProposedorDraftADR aslocked: true— onlyAcceptedcounts as locked - Use markdown tables or prose in your JSON output — stick to the schema </anti_patterns>
<success_criteria>
- Exactly one JSON file written to OUTPUT_DIR
- Schema matches the template above, all required fields present
- Confidence level reflects the actual signal strength
lockedis true only for Accepted ADRs- Confirmation line returned to orchestrator (≤ 1 line) </success_criteria>