get-shit-done/tests/config-field-docs.test.cjs
Fana 33575ba91d feat: /gsd-ai-integration-phase + /gsd-eval-review — AI framework selection and eval coverage layer (#1971)
* feat: /gsd:ai-phase + /gsd:eval-review — AI evals and framework selection layer

Adds a structured AI development layer to GSD with 5 new agents, 2 new
commands, 2 new workflows, 2 reference files, and 1 template.

Commands:
- /gsd:ai-phase [N] — pre-planning AI design contract (inserts between
  discuss-phase and plan-phase). Orchestrates 4 agents in sequence:
  framework-selector → ai-researcher → domain-researcher → eval-planner.
  Output: AI-SPEC.md with framework decision, implementation guidance,
  domain expert context, and evaluation strategy.
- /gsd:eval-review [N] — retroactive eval coverage audit. Scores each
  planned eval dimension as COVERED/PARTIAL/MISSING. Output: EVAL-REVIEW.md
  with 0-100 score, verdict, and remediation plan.
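
As a rough illustration of how per-dimension statuses could roll up into a 0-100 score, here is a minimal sketch. The weights (COVERED = 1, PARTIAL = 0.5, MISSING = 0), the function name `scoreEvalCoverage`, and the example dimension names are assumptions for illustration only; the formula gsd-eval-auditor actually uses is not stated in this commit.

```javascript
// Hypothetical weights; the real auditor may weight statuses or dimensions differently.
const STATUS_WEIGHT = { COVERED: 1, PARTIAL: 0.5, MISSING: 0 };

// Average the per-dimension weights and scale to 0-100.
function scoreEvalCoverage(dimensions) {
  if (dimensions.length === 0) return 0;
  const total = dimensions.reduce((sum, d) => {
    if (!Object.hasOwn(STATUS_WEIGHT, d.status)) {
      throw new Error(`Unknown status: ${d.status}`);
    }
    return sum + STATUS_WEIGHT[d.status];
  }, 0);
  return Math.round((total / dimensions.length) * 100);
}

// Example dimension names are invented for the sketch.
const exampleDimensions = [
  { name: 'faithfulness', status: 'COVERED' },
  { name: 'latency', status: 'PARTIAL' },
  { name: 'safety', status: 'MISSING' },
];
const exampleScore = scoreEvalCoverage(exampleDimensions); // (1 + 0.5 + 0) / 3 → 50
```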

Agents:
- gsd-framework-selector: interactive decision matrix (6 questions) →
  scored framework recommendation for CrewAI, LlamaIndex, LangChain,
  LangGraph, OpenAI Agents SDK, Claude Agent SDK, AutoGen/AG2, Haystack
- gsd-ai-researcher: fetches official framework docs + writes AI systems
  best practices (Pydantic structured outputs, async-first, prompt
  discipline, context window management, cost/latency budget)
- gsd-domain-researcher: researches business domain and use-case context —
  surfaces domain expert evaluation criteria, industry failure modes,
  regulatory constraints, and practitioner rubric ingredients before
  eval-planner writes measurable criteria
- gsd-eval-planner: designs evaluation strategy grounded in domain context;
  defaults to Arize Phoenix (tracing) + RAGAS (RAG eval) with detect-first
  guard for existing tooling
- gsd-eval-auditor: retroactive codebase scan → scores eval coverage

Integration points:
- plan-phase: non-blocking nudge (step 4.5) when AI keywords detected and
  no AI-SPEC.md present
- settings: new workflow.ai_phase toggle (default on)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: refine ai-integration-phase layer — rename, house style, consistency fixes

Amends the ai-evals framework layer (df8cb6c) with post-review improvements
before opening upstream PR.

Rename /gsd:ai-phase → /gsd:ai-integration-phase:
- Renamed commands/gsd/ai-phase.md → ai-integration-phase.md
- Renamed get-shit-done/workflows/ai-phase.md → ai-integration-phase.md
- Updated config key: workflow.ai_phase → workflow.ai_integration_phase
- Updated repair action: addAiPhaseKey → addAiIntegrationPhaseKey
- Updated all 84 cross-references across agents, workflows, templates, tests

Consistency fixes (same class as PR #1380 review):
- commands/gsd: objective described 3-agent chain, missing gsd-domain-researcher
- workflows/ai-integration-phase: purpose tag described 3-agent chain + "locks
  three things" — updated to 4 agents + 4 outputs
- workflows/ai-integration-phase: missing DOMAIN_MODEL resolve-model call in
  step 1 (domain-researcher was spawned in step 7.5 with no model variable)
- workflows/ai-integration-phase: fractional step ## 7.5 renumbered to integers
  (steps 8–12 shifted)

Agent house style (GSD meta-prompting conformance):
- All 5 new agents refactored to execution_flow + step name="" structure
- Role blocks compressed to 2 lines (removed verbose "Core responsibilities")
- Added skills: frontmatter to all 5 agents (agent-frontmatter tests)
- Added # hooks: commented pattern to file-writing agents
- Added ALWAYS use Write tool anti-heredoc instruction to file-writing agents
- Line reductions: ai-researcher −41%, domain-researcher −25%, eval-planner −26%,
  eval-auditor −25%, framework-selector −9%

Test coverage (tests/ai-evals.test.cjs — 48 tests):
- CONFIG: workflow.ai_integration_phase defaults and config-set/get
- HEALTH: W010 warning emission and addAiIntegrationPhaseKey repair
- TEMPLATE: AI-SPEC.md section completeness (10 sections)
- COMMAND: ai-integration-phase + eval-review frontmatter validity
- AGENTS: all 5 new agent files exist
- REFERENCES: ai-evals.md + ai-frameworks.md exist and are non-empty
- WORKFLOW: plan-phase nudge integration, workflow files exist + agent coverage

603/603 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add Google ADK to framework selector and reference matrix

Google ADK (released March 2025) was missing from the framework options.
Adds Python + Java multi-agent framework optimised for Gemini / Vertex AI.

- get-shit-done/references/ai-frameworks.md: add Google ADK profile (type,
  language, model support, best for, avoid if, strengths, weaknesses, eval
  concerns); update Quick Picks, By System Type, and By Model Commitment tables
- agents/gsd-framework-selector.md: add "Google (Gemini)" to model provider
  interview question
- agents/gsd-ai-researcher.md: add Google ADK docs URL to documentation_sources

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: adapt to upstream conventions post-rebase

- Remove skills: frontmatter from all 5 new agents (upstream changed
  convention — skills: breaks Gemini CLI and must not be present)
- Add workflow.ai_integration_phase to VALID_CONFIG_KEYS whitelist in
  config.cjs (config-set blocked unknown keys)
- Add ai_integration_phase: true to CONFIG_DEFAULTS in core.cjs
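
The whitelist behaviour described above can be sketched as follows. This is a simplified illustration, not the code in config.cjs: the `setConfigValue` helper, the use of a `Set`, and the contents of the key list are assumptions. Only the identifier `VALID_CONFIG_KEYS` and the key `workflow.ai_integration_phase` come from this commit.

```javascript
// Illustrative subset; the real VALID_CONFIG_KEYS in config.cjs is larger.
const VALID_CONFIG_KEYS = new Set([
  'workflow.research',
  'workflow.ai_integration_phase',
]);

// Hypothetical helper: set a dotted key on a config object, rejecting unknown keys.
function setConfigValue(config, dottedKey, value) {
  if (!VALID_CONFIG_KEYS.has(dottedKey)) {
    throw new Error(`Unknown config key: ${dottedKey}`);
  }
  const parts = dottedKey.split('.');
  let node = config;
  // Walk (and create) intermediate objects for every segment but the last.
  for (const part of parts.slice(0, -1)) {
    if (typeof node[part] !== 'object' || node[part] === null) node[part] = {};
    node = node[part];
  }
  node[parts[parts.length - 1]] = value;
  return config;
}

const cfg = setConfigValue({}, 'workflow.ai_integration_phase', false);
```

Before the key was whitelisted, a call like the last line would have been rejected in the same way as any other unknown key.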

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: rephrase 4b.1 line to avoid false-positive in prompt-injection scan

"contract as a Pydantic model" matched the `act as a` pattern case-insensitively.
Rephrased to "output schema using a Pydantic model".
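
The false positive is easy to reproduce. The snippet below is a minimal sketch: `ACT_AS_PATTERN` is a stand-in for the scanner's pattern (its exact definition is an assumption), and the word-boundary variant is shown only as one possible way to avoid this class of match, not as what the scanner does.

```javascript
// Stand-in for the case-insensitive `act as a` pattern named in the commit message.
const ACT_AS_PATTERN = /act as a/i;

const originalPhrase = 'contract as a Pydantic model';
const rephrased = 'output schema using a Pydantic model';

// "contr[act as a] Pydantic model" contains the pattern as a substring.
const originalMatches = ACT_AS_PATTERN.test(originalPhrase);
const rephrasedMatches = ACT_AS_PATTERN.test(rephrased);

// Assumption: a word-boundary anchor would not match inside "contract".
const BOUNDED_PATTERN = /\bact as a\b/i;
const boundedMatches = BOUNDED_PATTERN.test(originalPhrase);
```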

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: adapt to upstream conventions (W016, colon refs, config docs)

- Restore verify.cjs from upstream to recover W010-W015 + cmdValidateAgents,
  which were lost when a rebase conflict was resolved with --theirs
- Add W016 (workflow.ai_integration_phase absent) inside the config try block
  to avoid colliding with upstream's W010 agent-installation check
- Add addAiIntegrationPhaseKey repair case mirroring addNyquistKey pattern
- Replace /gsd: colon format with /gsd- hyphen format across all new files
  (agents, workflows, templates, verify.cjs) per stale-colon-refs guard (#1748)
- Add workflow.ai_integration_phase to planning-config.md reference table
- Add ai_integration_phase → workflow.ai_integration_phase to NAMESPACE_MAP
  in config-field-docs.test.cjs so CONFIG_DEFAULTS coverage check passes
- Update ai-evals tests to use W016 instead of W010
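
The W016 warning and its repair can be pictured with the sketch below. Everything here is hypothetical except the warning code `W016`, the repair name `addAiIntegrationPhaseKey`, and the default value `true`; the function shapes and the warning object format are assumptions, not the code in verify.cjs.

```javascript
// Hypothetical health check: warn when workflow.ai_integration_phase is absent.
function checkAiIntegrationPhase(config) {
  const warnings = [];
  if (config.workflow?.ai_integration_phase === undefined) {
    warnings.push({ code: 'W016', repair: 'addAiIntegrationPhaseKey' });
  }
  return warnings;
}

// Hypothetical repair: add the key with its default (true).
// Existing workflow settings, including an explicit false, are preserved.
function addAiIntegrationPhaseKey(config) {
  return {
    ...config,
    workflow: { ai_integration_phase: true, ...config.workflow },
  };
}

const incomplete = { workflow: { research: true } };
const warningsBefore = checkAiIntegrationPhase(incomplete);
const repaired = addAiIntegrationPhaseKey(incomplete);
const warningsAfter = checkAiIntegrationPhase(repaired);
```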

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add 5 new agents to E2E Copilot install expected list

gsd-ai-researcher, gsd-domain-researcher, gsd-eval-auditor,
gsd-eval-planner, gsd-framework-selector added to the hardcoded
expected agent list in copilot-install.test.cjs (#1890).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 10:49:00 -04:00


/**
 * Verify planning-config.md documents all config fields from source code.
 */
const { describe, test, before } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');
const REFERENCE_PATH = path.join(__dirname, '..', 'get-shit-done', 'references', 'planning-config.md');
const CORE_PATH = path.join(__dirname, '..', 'get-shit-done', 'bin', 'lib', 'core.cjs');
describe('config-field-docs', () => {
  let content;

  before(() => {
    content = fs.readFileSync(REFERENCE_PATH, 'utf-8');
  });

  test('contains Complete Field Reference section', () => {
    assert.ok(
      content.includes('## Complete Field Reference'),
      'planning-config.md must contain a "Complete Field Reference" heading'
    );
  });

  test('documents at least 15 config fields in tables', () => {
    // Count table rows that start with | `<key>` (field rows, not header/separator)
    const fieldRows = content.match(/^\| `[a-z_][a-z0-9_.]*` \|/gm);
    assert.ok(fieldRows, 'Expected markdown table rows with backtick-quoted keys');
    assert.ok(
      fieldRows.length >= 15,
      `Expected at least 15 documented fields, found ${fieldRows.length}`
    );
  });

  test('contains example configurations', () => {
    assert.ok(
      content.includes('## Example Configurations'),
      'planning-config.md must contain an "Example Configurations" section'
    );
    // Verify at least one JSON code block with a model_profile key
    assert.ok(
      content.includes('"model_profile"'),
      'Example configurations must include model_profile'
    );
  });

  test('contains field interactions section', () => {
    assert.ok(
      content.includes('## Field Interactions'),
      'planning-config.md must contain a "Field Interactions" section'
    );
  });

  test('every CONFIG_DEFAULTS key appears in the doc', () => {
    // Extract CONFIG_DEFAULTS keys from core.cjs source
    const coreSource = fs.readFileSync(CORE_PATH, 'utf-8');
    const defaultsMatch = coreSource.match(
      /const CONFIG_DEFAULTS\s*=\s*\{([\s\S]*?)\n\};/
    );
    assert.ok(defaultsMatch, 'Could not find CONFIG_DEFAULTS in core.cjs');

    const body = defaultsMatch[1];
    // Match property keys (word characters before the colon)
    const keys = [...body.matchAll(/^\s*(\w+)\s*:/gm)].map(m => m[1]);
    assert.ok(keys.length > 0, 'Could not extract any keys from CONFIG_DEFAULTS');

    // CONFIG_DEFAULTS uses flat keys; the doc may use namespaced equivalents.
    // Map flat keys to the namespace forms used in config.json and the doc.
    const NAMESPACE_MAP = {
      research: 'workflow.research',
      plan_checker: 'workflow.plan_check',
      verifier: 'workflow.verifier',
      nyquist_validation: 'workflow.nyquist_validation',
      ai_integration_phase: 'workflow.ai_integration_phase',
      text_mode: 'workflow.text_mode',
      subagent_timeout: 'workflow.subagent_timeout',
      branching_strategy: 'git.branching_strategy',
      phase_branch_template: 'git.phase_branch_template',
      milestone_branch_template: 'git.milestone_branch_template',
      quick_branch_template: 'git.quick_branch_template',
    };

    const missing = keys.filter(k => {
      // Check both bare key and namespaced form
      if (content.includes(`\`${k}\``)) return false;
      const ns = NAMESPACE_MAP[k];
      if (ns && content.includes(`\`${ns}\``)) return false;
      return true;
    });
    assert.deepStrictEqual(
      missing,
      [],
      `CONFIG_DEFAULTS keys missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents workflow namespace fields', () => {
    const workflowFields = [
      'workflow.research',
      'workflow.plan_check',
      'workflow.verifier',
      'workflow.nyquist_validation',
      'workflow.use_worktrees',
      'workflow.subagent_timeout',
      'workflow.text_mode',
    ];
    const missing = workflowFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `Workflow fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents git namespace fields', () => {
    const gitFields = [
      'git.branching_strategy',
      'git.base_branch',
      'git.phase_branch_template',
      'git.milestone_branch_template',
    ];
    const missing = gitFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `Git fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents KNOWN_TOP_LEVEL internal fields not in CONFIG_DEFAULTS', () => {
    // These fields are in KNOWN_TOP_LEVEL (core.cjs) and read by loadConfig()
    // but not in CONFIG_DEFAULTS, so the CONFIG_DEFAULTS test doesn't cover them.
    const internalFields = [
      'model_overrides',
      'agent_skills',
    ];
    const missing = internalFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `KNOWN_TOP_LEVEL internal fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents sub_repos field (CONFIG_DEFAULTS, no namespace form)', () => {
    // sub_repos is in CONFIG_DEFAULTS but has no NAMESPACE_MAP entry
    // (it uses a planning.sub_repos nested lookup but is documented as a
    // top-level field). Verify it explicitly since the NAMESPACE_MAP path
    // would silently skip it.
    assert.ok(
      content.includes('`sub_repos`'),
      'planning-config.md must document the sub_repos field'
    );
  });

  test('documents features.thinking_partner field', () => {
    // features.thinking_partner is in VALID_CONFIG_KEYS (config.cjs) and
    // used by discuss-phase.md and plan-phase.md for conditional extended
    // thinking at workflow decision points.
    assert.ok(
      content.includes('`features.thinking_partner`'),
      'planning-config.md must document the features.thinking_partner field'
    );
  });

  test('mode field documents correct allowed values', () => {
    // mode values are "interactive" and "yolo" per templates/config.json
    // and workflows/new-project.md — NOT "code-first"/"plan-first"/"hybrid"
    assert.ok(
      content.includes('"interactive"') && content.includes('"yolo"'),
      'mode field must document "interactive" and "yolo" as allowed values'
    );
    assert.ok(
      !content.includes('"code-first"'),
      'mode field must NOT document non-existent "code-first" value'
    );
  });

  test('discuss_mode field documents correct allowed values', () => {
    // discuss_mode values are "discuss" and "assumptions" per workflows/settings.md
    // NOT "auto" or "analyze" (those are CLI flags, not config values)
    assert.ok(
      content.includes('"assumptions"'),
      'discuss_mode must document "assumptions" as an allowed value'
    );
  });

  test('documents plan_checker alias for workflow.plan_check', () => {
    // plan_checker is the flat-key form in CONFIG_DEFAULTS; workflow.plan_check
    // is the canonical namespaced form. The doc should mention the alias.
    assert.ok(
      content.includes('`workflow.plan_check`'),
      'planning-config.md must document workflow.plan_check'
    );
    assert.ok(
      content.includes('plan_checker'),
      'planning-config.md must mention the plan_checker flat-key alias'
    );
  });
});