feat: add Nyquist validation layer to plan-phase pipeline

Adds automated feedback architecture research to gsd-phase-researcher,
enforces it as Dimension 8 in gsd-plan-checker, and introduces
{phase}-VALIDATION.md as the per-phase validation contract.

Ensures every phase plan includes automated verify commands before
execution begins. Opt-out via workflow.nyquist_validation: false.

Refs #122 (partial), related to #117

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Author: Bantuson
Date: 2026-02-20 13:46:39 +02:00
Parent: 3cf26d69ee
Commit: e0f9c738c9
9 changed files with 352 additions and 13 deletions


@@ -296,6 +296,37 @@ Verified patterns from official sources:
- What's unclear: [the gap]
- Recommendation: [how to handle]
## Validation Architecture
> Skip this section entirely if workflow.nyquist_validation is false in .planning/config.json
### Test Framework
| Property | Value |
|----------|-------|
| Framework | {framework name + version} |
| Config file | {path or "none — see Wave 0"} |
| Quick run command | `{command}` |
| Full suite command | `{command}` |
| Estimated runtime | ~{N} seconds |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| REQ-XX | {behavior description} | unit | `pytest tests/test_{module}.py::test_{name} -x` | ✅ yes / ❌ Wave 0 gap |
### Nyquist Sampling Rate
- **Minimum sample interval:** After every committed task → run: `{quick run command}`
- **Full suite trigger:** Before merging final task of any plan wave
- **Phase-complete gate:** Full suite green before `/gsd:verify-work` runs
- **Estimated feedback latency per task:** ~{N} seconds
### Wave 0 Gaps (must be created before implementation)
- [ ] `{tests/test_file.py}` — covers REQ-{XX}
- [ ] `{tests/conftest.py}` — shared fixtures for phase {N}
- [ ] Framework install: `{command}` — if no framework detected
*(If no gaps: "None — existing test infrastructure covers all phase requirements")*
## Sources
### Primary (HIGH confidence)
@@ -335,6 +366,8 @@ INIT=$(node ~/.claude/get-shit-done/bin/gsd-tools.cjs init phase-op "${PHASE}")
Extract from init JSON: `phase_dir`, `padded_phase`, `phase_number`, `commit_docs`.
Also check the Nyquist validation config — read `.planning/config.json` and check whether `workflow.nyquist_validation` is `true`. If `true`, include the Validation Architecture section in the RESEARCH.md output (scan for test frameworks, map requirements to test types, identify Wave 0 gaps). If `false`, omit the Validation Architecture section entirely.
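That gate can be sketched as a single comparison (helper name hypothetical):

```javascript
// Decide whether to emit the Validation Architecture section.
// `config` is the parsed .planning/config.json, or null if the file is absent.
// Default is enabled; only an explicit `false` opts out.
function nyquistEnabled(config) {
  return config?.workflow?.nyquist_validation !== false;
}
```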
Then read CONTEXT.md if exists:
```bash
cat "$phase_dir"/*-CONTEXT.md 2>/dev/null
```
@@ -367,7 +400,33 @@ Based on phase description, identify what needs investigating:
For each domain: Context7 first → Official docs → WebSearch → Cross-verify. Document findings with confidence levels as you go.
-## Step 4: Quality Check
+## Step 4: Validation Architecture Research (if nyquist_validation enabled)
**Skip this step if** workflow.nyquist_validation is false in config.
This step answers: "How will Claude's executor know, within seconds of committing each task, whether the output is correct?"
### Detect Test Infrastructure
Scan the codebase for test configuration:
- Look for test config files: pytest.ini, pyproject.toml, jest.config.*, vitest.config.*, etc.
- Look for test directories: test/, tests/, __tests__/
- Look for test files: *.test.*, *.spec.*
- Check package.json scripts for test commands
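A minimal detection sketch over a flat file listing (patterns illustrative, not exhaustive; framework inference is a heuristic):

```javascript
// Given repo-relative paths, report likely test infrastructure.
// Mirrors the checklist above: config files, test dirs, test files.
function detectTestInfra(paths) {
  const configPatterns = [
    ["pytest", /(^|\/)(pytest\.ini|pyproject\.toml)$/],
    ["jest", /(^|\/)jest\.config\.[cm]?[jt]s$/],
    ["vitest", /(^|\/)vitest\.config\.[cm]?[jt]s$/],
  ];
  const frameworks = configPatterns
    .filter(([, re]) => paths.some((p) => re.test(p)))
    .map(([name]) => name);
  const inTestDir = paths.some((p) => /(^|\/)(tests?|__tests__)\//.test(p));
  const looksLikeTest = paths.some(
    (p) => /\.(test|spec)\.[^./]+$/.test(p) || /(^|\/)test_[^/]+\.py$/.test(p)
  );
  return { frameworks, hasTests: inTestDir || looksLikeTest };
}
```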
### Map Requirements to Tests
For each requirement in <phase_requirements>:
- Identify the behavior to verify
- Determine test type: unit / integration / contract / smoke / e2e / manual-only
- Specify the automated command to run that test in < 30 seconds
- Flag if only verifiable manually (justify why)
### Identify Wave 0 Gaps
List test files, fixtures, or utilities that must be created BEFORE implementation:
- Missing test files for phase requirements
- Missing test framework configuration
- Missing shared fixtures or test utilities
## Step 5: Quality Check
- [ ] All domains investigated
- [ ] Negative claims verified
@@ -375,7 +434,7 @@ For each domain: Context7 first → Official docs → WebSearch → Cross-verify
- [ ] Confidence levels assigned honestly
- [ ] "What might I have missed?" review
-## Step 5: Write RESEARCH.md
+## Step 6: Write RESEARCH.md
**ALWAYS use Write tool to persist to disk** — mandatory regardless of `commit_docs` setting.
@@ -414,13 +473,13 @@ Write to: `$PHASE_DIR/$PADDED_PHASE-RESEARCH.md`
⚠️ `commit_docs` controls git only, NOT file writing. Always write first.
-## Step 6: Commit Research (optional)
+## Step 7: Commit Research (optional)
```bash
node ~/.claude/get-shit-done/bin/gsd-tools.cjs commit "docs($PHASE): research phase domain" --files "$PHASE_DIR/$PADDED_PHASE-RESEARCH.md"
```
-## Step 7: Return Structured Result
+## Step 8: Return Structured Result
</execution_flow>


@@ -312,6 +312,105 @@ issue:
fix_hint: "Remove search task - belongs in future phase per user decision"
```
## Dimension 8: Nyquist Compliance
<dimension_8_skip_condition>
Skip this entire dimension if:
- workflow.nyquist_validation is false in .planning/config.json
- The phase being checked has no RESEARCH.md (researcher was skipped)
- The RESEARCH.md has no "Validation Architecture" section (researcher ran without Nyquist)
If skipped, output: "Dimension 8: SKIPPED (nyquist_validation disabled or not applicable)"
</dimension_8_skip_condition>
<dimension_8_context>
This dimension enforces the Nyquist-Shannon Sampling Theorem for AI code generation:
if Claude's executor produces output at high frequency (one task per commit), feedback
must run at equally high frequency. A plan that produces code without pre-defined
automated verification is under-sampled — errors will be statistically missed.
The gsd-phase-researcher already determined WHAT to test. This dimension verifies
that the planner correctly incorporated that information into the actual task plans.
</dimension_8_context>
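For reference, the theorem this dimension borrows its name from, with the loose mapping used here (an analogy, not a derivation):

```latex
% Nyquist–Shannon: a band-limited signal is fully recoverable when
%   f_s \ge 2 f_{\max}.
% Mapped loosely onto execution: if tasks land at rate f_{\text{commit}},
% feedback must sample at least that often to avoid missed errors:
f_{\text{test}} \ge f_{\text{commit}}
\quad\Longleftrightarrow\quad
\text{at least one automated verify run per committed task}
```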
### Check 8a — Automated Verify Presence
For EACH `<task>` element in EACH plan file for this phase:
1. Does `<verify>` contain an `<automated>` command (or structured equivalent)?
2. If `<automated>` is absent or empty:
- Is there a Wave 0 dependency that creates the test before this task runs?
- If no Wave 0 dependency exists → **BLOCKING FAIL**
3. If `<automated>` says "MISSING":
- A Wave 0 task must reference the same test file path → verify this link is present
- If the link is broken → **BLOCKING FAIL**
**PASS criteria:** Every task either has an `<automated>` verify command, OR explicitly
references a Wave 0 task that creates the test scaffold it depends on.
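One possible shape for Check 8a (field names are assumptions, since the task schema only specifies `<verify>`):

```javascript
// Check 8a: each task needs a real <automated> command, or an explicit
// Wave 0 dependency that creates its test scaffold first.
// Task shape assumed: { id, automated, wave0Dep }.
function automatedVerifyPresence(tasks, wave0TaskIds) {
  const failures = [];
  for (const t of tasks) {
    const hasCommand =
      typeof t.automated === "string" &&
      t.automated.trim() !== "" &&
      !t.automated.startsWith("MISSING");
    const hasScaffold = t.wave0Dep != null && wave0TaskIds.has(t.wave0Dep);
    if (!hasCommand && !hasScaffold) failures.push(t.id); // BLOCKING FAIL
  }
  return failures; // empty array means PASS
}
```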
### Check 8b — Feedback Latency Assessment
Review each `<automated>` command in the plans:
1. Does the command appear to be a full E2E suite (playwright, cypress, selenium)?
- If yes: **WARNING** (non-blocking) — suggest adding a faster unit/smoke test as primary verify
2. Does the command include `--watchAll` or equivalent watch mode flags?
- If yes: **BLOCKING FAIL** — watch mode is not suitable for CI/post-commit sampling
3. Does the command include `sleep`, `wait`, or arbitrary delays > 30 seconds?
- If yes: **WARNING** — flag as latency risk
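A sketch of the 8b triage (pattern lists illustrative, not exhaustive):

```javascript
// Classify an <automated> verify command per Check 8b.
function assessLatency(command) {
  if (/--watch(All)?\b/.test(command)) {
    return { status: "FAIL", reason: "watch mode never exits" }; // blocking
  }
  if (/\b(playwright|cypress|selenium)\b/i.test(command)) {
    return { status: "WARN", reason: "full E2E suite as primary verify" };
  }
  const sleep = command.match(/\bsleep\s+(\d+)/);
  if (sleep && Number(sleep[1]) > 30) {
    return { status: "WARN", reason: "arbitrary delay over 30s" };
  }
  return { status: "PASS" };
}
```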
### Check 8c — Sampling Continuity
Review ALL tasks across ALL plans for this phase in wave order:
1. Map each task to its wave number
2. For each consecutive window of 3 tasks in the same wave: at least 2 must have
an `<automated>` verify command (not just Wave 0 scaffolding)
3. If any such window falls below that threshold (fewer than 2 of 3 tasks with automated verify): **BLOCKING FAIL**
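The 2-of-3 window rule can be sketched as a sliding-window scan (task shape assumed):

```javascript
// Check 8c: within every window of 3 consecutive tasks in the same wave,
// at least 2 must carry an <automated> verify command.
// `tasks` is an ordered array of { wave: number, automated: boolean }.
function samplingContinuity(tasks) {
  const failures = [];
  for (let i = 0; i + 2 < tasks.length; i++) {
    const win = tasks.slice(i, i + 3);
    if (!win.every((t) => t.wave === win[0].wave)) continue; // window spans waves
    const verified = win.filter((t) => t.automated).length;
    if (verified < 2) failures.push({ startIndex: i, verified });
  }
  return failures; // empty array means PASS
}
```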
### Check 8d — Wave 0 Completeness
If any plan contains `<automated>MISSING</automated>` or references Wave 0:
1. Does a Wave 0 task exist for every MISSING reference?
2. Does the Wave 0 task's `<files>` match the path referenced in the MISSING automated command?
3. Is the Wave 0 task in a plan that executes BEFORE the dependent task?
**FAIL condition:** Any MISSING automated verify without a matching Wave 0 task.
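A sketch of the 8d matching (shapes assumed; the MISSING format follows the planner's convention):

```javascript
// Check 8d: every MISSING automated verify must be backed by a Wave 0 task
// that creates the referenced test file and runs in an earlier wave.
// Shapes assumed: { id, wave, automated } for tasks, { wave, files } for Wave 0.
function wave0Completeness(tasks, wave0Tasks) {
  const missingRe = /^MISSING\b.*?(\S+\.(?:py|ts|js|go))/;
  const gaps = [];
  for (const task of tasks) {
    const m = missingRe.exec(task.automated ?? "");
    if (!m) continue; // task has a real command, nothing to match
    const testFile = m[1];
    const scaffold = wave0Tasks.find(
      (w) => w.files.includes(testFile) && w.wave < task.wave
    );
    if (!scaffold) gaps.push({ task: task.id, testFile }); // FAIL condition
  }
  return gaps; // empty array means PASS
}
```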
### Dimension 8 Output Block
Include this block in the plan-checker report:
```
## Dimension 8: Nyquist Compliance
### Automated Verify Coverage
| Task | Plan | Wave | Automated Command | Latency | Status |
|------|------|------|-------------------|---------|--------|
| {task name} | {plan} | {wave} | `{command}` | ~{N}s | ✅ PASS / ❌ FAIL |
### Sampling Continuity Check
Wave {N}: {X}/{Y} tasks verified → ✅ PASS / ❌ FAIL
### Wave 0 Completeness
- {test file} → Wave 0 task present ✅ / MISSING ❌
### Overall Nyquist Status: ✅ PASS / ❌ FAIL
### Revision Instructions (if FAIL)
Return to planner with the following required changes:
{list of specific fixes needed}
```
### Revision Loop Behavior
If Dimension 8 FAILS:
- Return to `gsd-planner` with the specific revision instructions above
- The planner must address ALL failing checks before returning
- This follows the same loop behavior as existing dimensions
- Maximum 3 revision loops for Dimension 8 before escalating to user
</verification_dimensions>
<verification_process>
@@ -329,6 +428,8 @@ Orchestrator provides CONTEXT.md content in the verification prompt. If provided
```bash
ls "$phase_dir"/*-PLAN.md 2>/dev/null
# Read research for Nyquist validation data
cat "$phase_dir"/*-RESEARCH.md 2>/dev/null
node ~/.claude/get-shit-done/bin/gsd-tools.cjs roadmap get-phase "$phase_number"
ls "$phase_dir"/*-BRIEF.md 2>/dev/null
```


@@ -157,9 +157,21 @@ Every task has four required fields:
- Good: "Create POST endpoint accepting {email, password}, validates using bcrypt against User table, returns JWT in httpOnly cookie with 15-min expiry. Use jose library (not jsonwebtoken - CommonJS issues with Edge runtime)."
- Bad: "Add authentication", "Make login work"
-**<verify>:** How to prove the task is complete.
-- Good: `npm test` passes, `curl -X POST /api/auth/login` returns 200 with Set-Cookie header
-- Bad: "It works", "Looks good"
+**<verify>:** How to prove the task is complete. Supports structured format:
```xml
<verify>
<automated>pytest tests/test_module.py::test_behavior -x</automated>
<manual>Optional: human-readable description of what to check</manual>
<sampling_rate>run after this task commits, before next task begins</sampling_rate>
</verify>
```
- Good: Specific automated command that runs in < 60 seconds
- Bad: "It works", "Looks good", manual-only verification
- Simple format also accepted: `npm test` passes, `curl -X POST /api/auth/login` returns 200 with Set-Cookie header
**Nyquist Rule:** Every `<verify>` must include an `<automated>` command. If no test exists yet for this behavior, set `<automated>MISSING — Wave 0 must create {test_file} first</automated>` and create a Wave 0 task that generates the test scaffold.
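A hypothetical Wave 0 pairing under this rule (element attributes and file names are illustrative, not part of the documented schema):

```xml
<!-- Wave 0 scaffold task: creates the test stub first -->
<task id="3-00-01" wave="0">
  <files>tests/test_password_reset.py</files>
  <verify>
    <automated>pytest tests/test_password_reset.py --collect-only -q</automated>
  </verify>
</task>

<!-- Implementation task that depends on it -->
<task id="3-01-02" wave="1">
  <verify>
    <automated>MISSING — Wave 0 must create tests/test_password_reset.py first</automated>
  </verify>
</task>
```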
**<done>:** Acceptance criteria - measurable state of completion.
- Good: "Valid credentials return 200 + JWT cookie, invalid credentials return 401"


@@ -97,6 +97,24 @@ A detailed reference for workflows, troubleshooting, and configuration. For quic
└── Done
```
### Validation Architecture (Nyquist Layer)
During plan-phase research, GSD now maps automated test coverage to each phase
requirement before any code is written. This ensures that when Claude's executor
commits a task, a feedback mechanism already exists to verify it within seconds.
The researcher detects your existing test infrastructure, maps each requirement to
a specific test command, and identifies any test scaffolding that must be created
before implementation begins (Wave 0 tasks).
The plan-checker enforces this as an 8th verification dimension: plans where tasks
lack automated verify commands will not be approved.
**Output:** `{phase}-VALIDATION.md` — the feedback contract for the phase.
**Disable:** Set `workflow.nyquist_validation: false` in `/gsd:settings` for
rapid prototyping phases where test infrastructure isn't the focus.
### Execution Wave Coordination
```
@@ -206,7 +224,8 @@ GSD stores project settings in `.planning/config.json`. Configure during `/gsd:n
"workflow": {
"research": true,
"plan_check": true,
-"verifier": true
+"verifier": true,
+"nyquist_validation": true
},
"git": {
"branching_strategy": "none",
@@ -240,6 +259,7 @@ GSD stores project settings in `.planning/config.json`. Configure during `/gsd:n
| `workflow.research` | `true`, `false` | `true` | Domain investigation before planning |
| `workflow.plan_check` | `true`, `false` | `true` | Plan verification loop (up to 3 iterations) |
| `workflow.verifier` | `true`, `false` | `true` | Post-execution verification against phase goals |
| `workflow.nyquist_validation` | `true`, `false` | `true` | Validation architecture research during plan-phase; 8th plan-check dimension |
Disable these to speed up phases in familiar domains or when conserving tokens.
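A sketch of how these toggles might be resolved against the documented defaults (helper name hypothetical; `??` matters here because `||` would override an explicit `false`):

```javascript
// Merge workflow toggles from a parsed config.json over the documented defaults.
const WORKFLOW_DEFAULTS = {
  research: true,
  plan_check: true,
  verifier: true,
  nyquist_validation: true,
};

function resolveWorkflow(config) {
  const fromFile = config?.workflow ?? {};
  return Object.fromEntries(
    Object.entries(WORKFLOW_DEFAULTS).map(([key, dflt]) => [
      key,
      fromFile[key] ?? dflt, // explicit false survives; unset falls back
    ])
  );
}
```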


@@ -167,6 +167,7 @@ function loadConfig(cwd) {
verifier: true,
parallelization: true,
brave_search: false,
nyquist_validation: true,
};
try {
@@ -199,6 +200,7 @@ function loadConfig(cwd) {
plan_checker: get('plan_checker', { section: 'workflow', field: 'plan_check' }) ?? defaults.plan_checker,
verifier: get('verifier', { section: 'workflow', field: 'verifier' }) ?? defaults.verifier,
parallelization,
nyquist_validation: get('nyquist_validation', { section: 'workflow', field: 'nyquist_validation' }) ?? defaults.nyquist_validation,
brave_search: get('brave_search') ?? defaults.brave_search,
};
} catch {
@@ -4319,6 +4321,7 @@ function cmdInitPlanPhase(cwd, phase, raw) {
research_enabled: config.research,
plan_checker_enabled: config.plan_checker,
commit_docs: config.commit_docs,
nyquist_validation_enabled: config.nyquist_validation,
// Phase info
phase_found: !!phaseInfo,


@@ -0,0 +1,104 @@
---
phase: {N}
slug: {phase-slug}
status: draft
nyquist_compliant: false
wave_0_complete: false
created: {date}
---
# Phase {N} — Validation Strategy
> Generated by `gsd-phase-researcher` during `/gsd:plan-phase {N}`.
> Updated by `gsd-plan-checker` after plan approval.
> Governs feedback sampling during `/gsd:execute-phase {N}`.
---
## Test Infrastructure
| Property | Value |
|----------|-------|
| **Framework** | {pytest 7.x / jest 29.x / vitest / go test / other} |
| **Config file** | {path/to/pytest.ini or "none — Wave 0 installs"} |
| **Quick run command** | `{e.g., pytest -x --tb=short}` |
| **Full suite command** | `{e.g., pytest tests/ --tb=short}` |
| **Estimated runtime** | ~{N} seconds |
| **CI pipeline** | {.github/workflows/test.yml — exists / needs creation} |
---
## Nyquist Sampling Rate
> The minimum feedback frequency required to reliably catch errors in this phase.
- **After every task commit:** Run `{quick run command}`
- **After every plan wave:** Run `{full suite command}`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Maximum acceptable task feedback latency:** {N} seconds
---
## Per-Task Verification Map
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| {N}-01-01 | 01 | 1 | REQ-{XX} | unit | `pytest tests/test_{module}.py::test_{name} -x` | ✅ / ❌ W0 | ⬜ pending |
| {N}-01-02 | 01 | 1 | REQ-{XX} | integration | `pytest tests/test_{flow}.py -x` | ✅ / ❌ W0 | ⬜ pending |
| {N}-02-01 | 02 | 2 | REQ-{XX} | smoke | `curl -s {endpoint} \| grep {expected}` | ✅ N/A | ⬜ pending |
*Status values: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*
---
## Wave 0 Requirements
> Test scaffolding committed BEFORE any implementation task. Executor runs Wave 0 first.
- [ ] `{tests/test_file.py}` — stubs for REQ-{XX}, REQ-{XX}
- [ ] `{tests/conftest.py}` — shared fixtures
- [ ] `{framework install}` — if no framework detected
*If none required: "Existing infrastructure covers all phase requirements — no Wave 0 test tasks needed."*
---
## Manual-Only Verifications
> Behaviors that genuinely cannot be automated, with justification.
> These are surfaced during `/gsd:verify-work` UAT.
| Behavior | Requirement | Why Manual | Test Instructions |
|----------|-------------|------------|-------------------|
| {behavior} | REQ-{XX} | {reason: visual, third-party auth, physical device...} | {step-by-step} |
*If none: "All phase behaviors have automated verification coverage."*
---
## Validation Sign-Off
Updated by `gsd-plan-checker` when plans are approved:
- [ ] All tasks have `<automated>` verify commands or Wave 0 dependencies
- [ ] No 3 consecutive implementation tasks without automated verify (sampling continuity)
- [ ] Wave 0 test files cover all MISSING references
- [ ] No watch-mode flags in any automated command
- [ ] Feedback latency per task: < {N}s ✅
- [ ] `nyquist_compliant: true` set in frontmatter
**Plan-checker approval:** {pending / approved on YYYY-MM-DD}
---
## Execution Tracking
Updated during `/gsd:execute-phase {N}`:
| Wave | Tasks | Tests Run | Pass | Fail | Sampling Status |
|------|-------|-----------|------|------|-----------------|
| 0 | {N} | — | — | — | scaffold |
| 1 | {N} | {command} | {N} | {N} | ✅ sampled |
| 2 | {N} | {command} | {N} | {N} | ✅ sampled |
**Phase validation complete:** {pending / YYYY-MM-DD HH:MM}


@@ -5,7 +5,8 @@
"research": true,
"plan_check": true,
"verifier": true,
-"auto_advance": false
+"auto_advance": false,
+"nyquist_validation": true
},
"planning": {
"commit_docs": true,


@@ -18,7 +18,7 @@ Load all context in one call (paths only to minimize orchestrator context):
INIT=$(node ~/.claude/get-shit-done/bin/gsd-tools.cjs init plan-phase "$PHASE")
```
-Parse JSON for: `researcher_model`, `planner_model`, `checker_model`, `research_enabled`, `plan_checker_enabled`, `commit_docs`, `phase_found`, `phase_dir`, `phase_number`, `phase_name`, `phase_slug`, `padded_phase`, `has_research`, `has_context`, `has_plans`, `plan_count`, `planning_exists`, `roadmap_exists`.
+Parse JSON for: `researcher_model`, `planner_model`, `checker_model`, `research_enabled`, `plan_checker_enabled`, `nyquist_validation_enabled`, `commit_docs`, `phase_found`, `phase_dir`, `phase_number`, `phase_name`, `phase_slug`, `padded_phase`, `has_research`, `has_context`, `has_plans`, `plan_count`, `planning_exists`, `roadmap_exists`.
**File paths (for <files_to_read> blocks):** `state_path`, `roadmap_path`, `requirements_path`, `context_path`, `research_path`, `verification_path`, `uat_path`. These are null if files don't exist.
@@ -128,6 +128,31 @@ Task(
- **`## RESEARCH COMPLETE`:** Display confirmation, continue to step 6
- **`## RESEARCH BLOCKED`:** Display blocker, offer: 1) Provide context, 2) Skip research, 3) Abort
## 5.5. Create Validation Strategy (if Nyquist enabled)
**Skip if:** `nyquist_validation_enabled` is false from INIT JSON.
After researcher completes, check if RESEARCH.md contains a Validation Architecture section:
```bash
grep -l "## Validation Architecture" "${PHASE_DIR}"/*-RESEARCH.md 2>/dev/null
```
**If found:**
1. Read validation template from `~/.claude/get-shit-done/templates/VALIDATION.md`
2. Write to `${PHASE_DIR}/${PADDED_PHASE}-VALIDATION.md`
3. Fill frontmatter: replace `{N}` with phase number, `{phase-slug}` with phase slug, `{date}` with current date
4. If `commit_docs` is true:
```bash
node ~/.claude/get-shit-done/bin/gsd-tools.cjs commit-docs "docs(phase-${PHASE}): add validation strategy"
```
**If not found (and nyquist enabled):** Display warning:
```
⚠ Nyquist validation enabled but researcher did not produce a Validation Architecture section.
Continuing without validation strategy. Plans may fail Dimension 8 check.
```
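Step 3's placeholder fill is plain substitution scoped to the frontmatter (helper name hypothetical):

```javascript
// Fill ONLY the frontmatter placeholders of the VALIDATION.md template.
// Body placeholders (e.g. "~{N} seconds") are left for the researcher/checker.
function fillValidationFrontmatter(template, { phase, slug, date }) {
  const end = template.indexOf("---", 3); // frontmatter closes at the second ---
  if (end === -1) return template;
  const head = template
    .slice(0, end)
    .replaceAll("{N}", String(phase))
    .replaceAll("{phase-slug}", slug)
    .replaceAll("{date}", date);
  return head + template.slice(end);
}
```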
## 6. Check Existing Plans
```bash
```
@@ -240,6 +265,7 @@ Checker prompt:
- {roadmap_path} (Roadmap)
- {requirements_path} (Requirements)
- {context_path} (USER DECISIONS from /gsd:discuss-phase)
- {research_path} (Technical Research — includes Validation Architecture)
</files_to_read>
**Phase requirement IDs (MUST ALL be covered):** {phase_req_ids}


@@ -28,6 +28,7 @@ Parse current values (default to `true` if not present):
- `workflow.research` — spawn researcher during plan-phase
- `workflow.plan_check` — spawn plan checker during plan-phase
- `workflow.verifier` — spawn verifier during execute-phase
- `workflow.nyquist_validation` — validation architecture research during plan-phase
- `model_profile` — which model each agent uses (default: `balanced`)
- `git.branching_strategy` — branching approach (default: `"none"`)
</step>
@@ -83,6 +84,15 @@ AskUserQuestion([
{ label: "Yes", description: "Chain stages via Task() subagents (same isolation)" }
]
},
{
question: "Enable Nyquist Validation? (researches test coverage during planning)",
header: "Nyquist",
multiSelect: false,
options: [
{ label: "Yes (Recommended)", description: "Research automated test coverage during plan-phase. Adds validation requirements to plans. Blocks approval if tasks lack automated verify." },
{ label: "No", description: "Skip validation research. Good for rapid prototyping or no-test phases." }
]
},
{
question: "Git branching strategy?",
header: "Branching",
@@ -108,7 +118,8 @@ Merge new settings into existing config.json:
"research": true/false,
"plan_check": true/false,
"verifier": true/false,
-"auto_advance": true/false
+"auto_advance": true/false,
+"nyquist_validation": true/false
},
"git": {
"branching_strategy": "none" | "phase" | "milestone"
@@ -155,7 +166,8 @@ Write `~/.gsd/defaults.json` with:
"research": <current>,
"plan_check": <current>,
"verifier": <current>,
"auto_advance": <current>
"auto_advance": <current>,
"nyquist_validation": <current>
}
}
```
@@ -176,6 +188,7 @@ Display:
| Plan Checker | {On/Off} |
| Execution Verifier | {On/Off} |
| Auto-Advance | {On/Off} |
| Nyquist Validation | {On/Off} |
| Git Branching | {None/Per Phase/Per Milestone} |
| Saved as Defaults | {Yes/No} |
@@ -193,7 +206,7 @@ Quick commands:
<success_criteria>
- [ ] Current config read
-- [ ] User presented with 6 settings (profile + 4 workflow toggles + git branching)
+- [ ] User presented with 7 settings (profile + 5 workflow toggles + git branching)
- [ ] Config updated with model_profile, workflow, and git sections
- [ ] User offered to save as global defaults (~/.gsd/defaults.json)
- [ ] Changes confirmed to user