mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-04-25 17:25:23 +02:00
* feat: /gsd:ai-phase + /gsd:eval-review - AI evals and framework selection layer

  Adds a structured AI development layer to GSD with 5 new agents, 2 new commands, 2 new workflows, 2 reference files, and 1 template.

  Commands:
  - /gsd:ai-phase [N]: pre-planning AI design contract (inserts between discuss-phase and plan-phase). Orchestrates 4 agents in sequence: framework-selector → ai-researcher → domain-researcher → eval-planner. Output: AI-SPEC.md with framework decision, implementation guidance, domain expert context, and evaluation strategy.
  - /gsd:eval-review [N]: retroactive eval coverage audit. Scores each planned eval dimension as COVERED/PARTIAL/MISSING. Output: EVAL-REVIEW.md with 0-100 score, verdict, and remediation plan.

  Agents:
  - gsd-framework-selector: interactive decision matrix (6 questions) → scored framework recommendation for CrewAI, LlamaIndex, LangChain, LangGraph, OpenAI Agents SDK, Claude Agent SDK, AutoGen/AG2, Haystack
  - gsd-ai-researcher: fetches official framework docs + writes AI systems best practices (Pydantic structured outputs, async-first, prompt discipline, context window management, cost/latency budget)
  - gsd-domain-researcher: researches business domain and use-case context; surfaces domain expert evaluation criteria, industry failure modes, regulatory constraints, and practitioner rubric ingredients before eval-planner writes measurable criteria
  - gsd-eval-planner: designs evaluation strategy grounded in domain context; defaults to Arize Phoenix (tracing) + RAGAS (RAG eval) with detect-first guard for existing tooling
  - gsd-eval-auditor: retroactive codebase scan → scores eval coverage

  Integration points:
  - plan-phase: non-blocking nudge (step 4.5) when AI keywords detected and no AI-SPEC.md present
  - settings: new workflow.ai_phase toggle (default on)

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: refine ai-integration-phase layer - rename, house style, consistency fixes

  Amends the ai-evals framework layer (df8cb6c) with post-review improvements before opening upstream PR.

  Rename /gsd:ai-phase → /gsd:ai-integration-phase:
  - Renamed commands/gsd/ai-phase.md → ai-integration-phase.md
  - Renamed get-shit-done/workflows/ai-phase.md → ai-integration-phase.md
  - Updated config key: workflow.ai_phase → workflow.ai_integration_phase
  - Updated repair action: addAiPhaseKey → addAiIntegrationPhaseKey
  - Updated all 84 cross-references across agents, workflows, templates, tests

  Consistency fixes (same class as PR #1380 review):
  - commands/gsd: objective described 3-agent chain, missing gsd-domain-researcher
  - workflows/ai-integration-phase: purpose tag described 3-agent chain + "locks three things"; updated to 4 agents + 4 outputs
  - workflows/ai-integration-phase: missing DOMAIN_MODEL resolve-model call in step 1 (domain-researcher was spawned in step 7.5 with no model variable)
  - workflows/ai-integration-phase: fractional step ## 7.5 renumbered to integers (steps 8–12 shifted)

  Agent house style (GSD meta-prompting conformance):
  - All 5 new agents refactored to execution_flow + step name="" structure
  - Role blocks compressed to 2 lines (removed verbose "Core responsibilities")
  - Added skills: frontmatter to all 5 agents (agent-frontmatter tests)
  - Added # hooks: commented pattern to file-writing agents
  - Added ALWAYS use Write tool anti-heredoc instruction to file-writing agents
  - Line reductions: ai-researcher −41%, domain-researcher −25%, eval-planner −26%, eval-auditor −25%, framework-selector −9%

  Test coverage (tests/ai-evals.test.cjs, 48 tests):
  - CONFIG: workflow.ai_integration_phase defaults and config-set/get
  - HEALTH: W010 warning emission and addAiIntegrationPhaseKey repair
  - TEMPLATE: AI-SPEC.md section completeness (10 sections)
  - COMMAND: ai-integration-phase + eval-review frontmatter validity
  - AGENTS: all 5 new agent files exist
  - REFERENCES: ai-evals.md + ai-frameworks.md exist and are non-empty
  - WORKFLOW: plan-phase nudge integration, workflow files exist + agent coverage

  603/603 tests passing.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add Google ADK to framework selector and reference matrix

  Google ADK (released March 2025) was missing from the framework options. Adds Python + Java multi-agent framework optimised for Gemini / Vertex AI.
  - get-shit-done/references/ai-frameworks.md: add Google ADK profile (type, language, model support, best for, avoid if, strengths, weaknesses, eval concerns); update Quick Picks, By System Type, and By Model Commitment tables
  - agents/gsd-framework-selector.md: add "Google (Gemini)" to model provider interview question
  - agents/gsd-ai-researcher.md: add Google ADK docs URL to documentation_sources

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: adapt to upstream conventions post-rebase
  - Remove skills: frontmatter from all 5 new agents (upstream changed convention; skills: breaks Gemini CLI and must not be present)
  - Add workflow.ai_integration_phase to VALID_CONFIG_KEYS whitelist in config.cjs (config-set blocked unknown keys)
  - Add ai_integration_phase: true to CONFIG_DEFAULTS in core.cjs

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: rephrase 4b.1 line to avoid false-positive in prompt-injection scan

  "contract as a Pydantic model" matched the `act as a` pattern case-insensitively. Rephrased to "output schema using a Pydantic model".

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: adapt to upstream conventions (W016, colon refs, config docs)
  - Replace verify.cjs from upstream to restore W010-W015 + cmdValidateAgents, lost when rebase conflict was resolved with --theirs
  - Add W016 (workflow.ai_integration_phase absent) inside the config try block, avoids collision with upstream's W010 agent-installation check
  - Add addAiIntegrationPhaseKey repair case mirroring addNyquistKey pattern
  - Replace /gsd: colon format with /gsd- hyphen format across all new files (agents, workflows, templates, verify.cjs) per stale-colon-refs guard (#1748)
  - Add workflow.ai_integration_phase to planning-config.md reference table
  - Add ai_integration_phase → workflow.ai_integration_phase to NAMESPACE_MAP in config-field-docs.test.cjs so CONFIG_DEFAULTS coverage check passes
  - Update ai-evals tests to use W016 instead of W010

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add 5 new agents to E2E Copilot install expected list

  gsd-ai-researcher, gsd-domain-researcher, gsd-eval-auditor, gsd-eval-planner, gsd-framework-selector added to the hardcoded expected agent list in copilot-install.test.cjs (#1890).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
203 lines
7.2 KiB
JavaScript
/**
 * Verify planning-config.md documents all config fields from source code.
 */

const { describe, test, before } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const path = require('path');

const REFERENCE_PATH = path.join(__dirname, '..', 'get-shit-done', 'references', 'planning-config.md');
const CORE_PATH = path.join(__dirname, '..', 'get-shit-done', 'bin', 'lib', 'core.cjs');

describe('config-field-docs', () => {
  let content;

  before(() => {
    content = fs.readFileSync(REFERENCE_PATH, 'utf-8');
  });

  test('contains Complete Field Reference section', () => {
    assert.ok(
      content.includes('## Complete Field Reference'),
      'planning-config.md must contain a "Complete Field Reference" heading'
    );
  });

  test('documents at least 15 config fields in tables', () => {
    // Count table rows that start with | `<key>` (field rows, not header/separator)
    const fieldRows = content.match(/^\| `[a-z_][a-z0-9_.]*` \|/gm);
    assert.ok(fieldRows, 'Expected markdown table rows with backtick-quoted keys');
    assert.ok(
      fieldRows.length >= 15,
      `Expected at least 15 documented fields, found ${fieldRows.length}`
    );
  });

  test('contains example configurations', () => {
    assert.ok(
      content.includes('## Example Configurations'),
      'planning-config.md must contain an "Example Configurations" section'
    );
    // Verify at least one JSON code block with a model_profile key
    assert.ok(
      content.includes('"model_profile"'),
      'Example configurations must include model_profile'
    );
  });

  test('contains field interactions section', () => {
    assert.ok(
      content.includes('## Field Interactions'),
      'planning-config.md must contain a "Field Interactions" section'
    );
  });

  test('every CONFIG_DEFAULTS key appears in the doc', () => {
    // Extract CONFIG_DEFAULTS keys from core.cjs source
    const coreSource = fs.readFileSync(CORE_PATH, 'utf-8');
    const defaultsMatch = coreSource.match(
      /const CONFIG_DEFAULTS\s*=\s*\{([\s\S]*?)\n\};/
    );
    assert.ok(defaultsMatch, 'Could not find CONFIG_DEFAULTS in core.cjs');

    const body = defaultsMatch[1];
    // Match property keys (word characters before the colon)
    const keys = [...body.matchAll(/^\s*(\w+)\s*:/gm)].map(m => m[1]);
    assert.ok(keys.length > 0, 'Could not extract any keys from CONFIG_DEFAULTS');

    // CONFIG_DEFAULTS uses flat keys; the doc may use namespaced equivalents.
    // Map flat keys to the namespace forms used in config.json and the doc.
    const NAMESPACE_MAP = {
      research: 'workflow.research',
      plan_checker: 'workflow.plan_check',
      verifier: 'workflow.verifier',
      nyquist_validation: 'workflow.nyquist_validation',
      ai_integration_phase: 'workflow.ai_integration_phase',
      text_mode: 'workflow.text_mode',
      subagent_timeout: 'workflow.subagent_timeout',
      branching_strategy: 'git.branching_strategy',
      phase_branch_template: 'git.phase_branch_template',
      milestone_branch_template: 'git.milestone_branch_template',
      quick_branch_template: 'git.quick_branch_template',
    };

    const missing = keys.filter(k => {
      // Check both bare key and namespaced form
      if (content.includes(`\`${k}\``)) return false;
      const ns = NAMESPACE_MAP[k];
      if (ns && content.includes(`\`${ns}\``)) return false;
      return true;
    });
    assert.deepStrictEqual(
      missing,
      [],
      `CONFIG_DEFAULTS keys missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents workflow namespace fields', () => {
    const workflowFields = [
      'workflow.research',
      'workflow.plan_check',
      'workflow.verifier',
      'workflow.nyquist_validation',
      'workflow.use_worktrees',
      'workflow.subagent_timeout',
      'workflow.text_mode',
    ];
    const missing = workflowFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `Workflow fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents git namespace fields', () => {
    const gitFields = [
      'git.branching_strategy',
      'git.base_branch',
      'git.phase_branch_template',
      'git.milestone_branch_template',
    ];
    const missing = gitFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `Git fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents KNOWN_TOP_LEVEL internal fields not in CONFIG_DEFAULTS', () => {
    // These fields are in KNOWN_TOP_LEVEL (core.cjs) and read by loadConfig()
    // but not in CONFIG_DEFAULTS, so the CONFIG_DEFAULTS test doesn't cover them.
    const internalFields = [
      'model_overrides',
      'agent_skills',
    ];
    const missing = internalFields.filter(f => !content.includes(`\`${f}\``));
    assert.deepStrictEqual(
      missing,
      [],
      `KNOWN_TOP_LEVEL internal fields missing from planning-config.md: ${missing.join(', ')}`
    );
  });

  test('documents sub_repos field (CONFIG_DEFAULTS, no namespace form)', () => {
    // sub_repos is in CONFIG_DEFAULTS but has no NAMESPACE_MAP entry
    // (it uses a planning.sub_repos nested lookup but is documented as a
    // top-level field). Verify it explicitly since the NAMESPACE_MAP path
    // would silently skip it.
    assert.ok(
      content.includes('`sub_repos`'),
      'planning-config.md must document the sub_repos field'
    );
  });

  test('documents features.thinking_partner field', () => {
    // features.thinking_partner is in VALID_CONFIG_KEYS (config.cjs) and
    // used by discuss-phase.md and plan-phase.md for conditional extended
    // thinking at workflow decision points.
    assert.ok(
      content.includes('`features.thinking_partner`'),
      'planning-config.md must document the features.thinking_partner field'
    );
  });

  test('mode field documents correct allowed values', () => {
    // mode values are "interactive" and "yolo" per templates/config.json
    // and workflows/new-project.md; NOT "code-first"/"plan-first"/"hybrid"
    assert.ok(
      content.includes('"interactive"') && content.includes('"yolo"'),
      'mode field must document "interactive" and "yolo" as allowed values'
    );
    assert.ok(
      !content.includes('"code-first"'),
      'mode field must NOT document non-existent "code-first" value'
    );
  });

  test('discuss_mode field documents correct allowed values', () => {
    // discuss_mode values are "discuss" and "assumptions" per workflows/settings.md
    // NOT "auto" or "analyze" (those are CLI flags, not config values)
    assert.ok(
      content.includes('"assumptions"'),
      'discuss_mode must document "assumptions" as an allowed value'
    );
  });

  test('documents plan_checker alias for workflow.plan_check', () => {
    // plan_checker is the flat-key form in CONFIG_DEFAULTS; workflow.plan_check
    // is the canonical namespaced form. The doc should mention the alias.
    assert.ok(
      content.includes('`workflow.plan_check`'),
      'planning-config.md must document workflow.plan_check'
    );
    assert.ok(
      content.includes('plan_checker'),
      'planning-config.md must mention the plan_checker flat-key alias'
    );
  });
});