mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-05-13 18:46:38 +02:00
* feat(#2792): namespace meta-skills retargeted at the post-#2790 surface

This branch is now based on #2790's HEAD (the consolidation PR) instead of main, and every routing table targets the consolidated surface so a user routed by a namespace meta-skill never lands at a deleted / folded sub-skill.

Cross-PR inconsistencies the original PR #2825 carried (vs #2790):
- ns-ideate routed to gsd-note / gsd-add-todo / gsd-add-backlog / gsd-plant-seed → all folded into gsd-capture by #2790. Now routes to gsd-capture (the parent picks the mode from the user's intent).
- ns-context routed to gsd-scan and gsd-intel → folded into gsd-map-codebase --fast / --query by #2790. Now routes to those flag forms.
- ns-manage routed all workspace intent to gsd-list-workspaces (a list-only entry) → CR also flagged the over-narrow target. #2790 folds into gsd-workspace; routing now points there.
- ns-workflow routed to gsd-research-phase → deleted outright by #2790. Removed.
- ns-project routed to gsd-plan-milestone-gaps → deleted outright by #2790. Removed.
- None of the namespaces previously surfaced #2790's new consolidated skills (gsd-capture, gsd-phase, gsd-config, gsd-workspace, gsd-progress). All five are now reachable through the routers.
- extract_learnings → extract-learnings (canonicalized by #2858).

Defect fixes within the namespace skills:
- Hyphen-form `name:` (gsd-workflow, …) per the canonical naming contract — the colon-form addressed CR's drift complaint.
- `Skill` added to allowed-tools on every router. The body instructs "Invoke the matched skill directly using the Skill tool" — without Skill in the permission list the meta-skill cannot route at all.

New regression guard in tests/enh-2792-namespace-skills.test.cjs: every gsd-* token in any namespace router's table column resolves to a surviving commands/gsd/*.md file (or to a known consolidated parent for flag-form targets like gsd-map-codebase --fast). This single test would have caught every dead-end route the original PR shipped with.

Skill-count cap in tests/enh-2790-skill-consolidation.test.cjs now filters out ns-*.md from its <= 63 cap. Namespace routers are descriptor-only entries, not part of the consolidation surface that cap is policing — they have their own contract in tests/enh-2792-namespace-skills.test.cjs.

INVENTORY.md gains a "Namespace Meta-Skills" section with the 6 router rows; INVENTORY-MANIFEST.json gains 6 entries; the headline count moves 59 → 65 to match.

Out of scope for this rebase: the gsd-health --context flag (PR #2825 advertised the contract but didn't implement it). That's a separate feature concern and is left untouched here.

5908/5908 on `npm test`.

* feat(#2792): implement gsd-health --context utilization guard

The original PR #2825 advertised a `--context` flag on gsd-health with a 60%/70% utilization threshold table but never implemented the workflow logic — CR caught it as a contract leak, the rebase deferred it. This commit closes the gap with TDD red/green/refactor.

Math layer (pure):
- get-shit-done/bin/lib/context-utilization.cjs
  classifyContextUtilization(tokensUsed, contextWindow) → { percent, state }
  State boundaries use the exact ratio: < 60% healthy / 60–70% warning / ≥ 70% critical (fracture point)
  Display percent rounded for humans. Throws TypeError on non-integer or out-of-range inputs.
- STATES = Object.freeze({ HEALTHY, WARNING, CRITICAL }) exported so callers reference the names by symbol, not by literal string.

SDK CLI integration:
- get-shit-done/bin/gsd-tools.cjs
  `validate context --tokens-used N --context-window M [--json]` routes to the classifier, owns the recommendation copy (the classifier intentionally does not — keeps the renderer free to evolve without touching the math layer or its tests), and uses core.output's rawValue path for the sync-flush guarantee.
- sdk/src/query/validate.ts + sdk/src/query/index.ts
  TypeScript validateContext handler registered at 'validate.context' and 'validate context'. Mirrors the CJS classifier inline (15 lines of arithmetic; not worth a shared cross-language module).

User-facing wiring:
- commands/gsd/health.md frontmatter advertises --context, body documents the three-state threshold table.
- get-shit-done/workflows/health.md adds a `context_check` step that's reached only when --context is set. Step calls `gsd-sdk query validate.context` with self-reported tokensUsed and contextWindow, prints the SDK output verbatim, and ends. Includes a TEXT_MODE plain-text fallback for non-Claude runtimes per #2012.

Tests:
- tests/context-utilization.test.cjs (17 tests) — pure-function contract: state thresholds at every boundary, percent rounding, input validation, return-shape (no recommendation field — that's the renderer's job).
- tests/validate-context.test.cjs (9 tests) — SDK CLI plumbing: arg parsing errors, JSON vs human rendering, recommendation copy pinned per state.
- tests/enh-2792-namespace-skills.test.cjs (4 new tests) — markdown contract: --context advertised in argument-hint, threshold table in command body, context_check step exists in workflow, step invokes gsd-sdk query validate.context with both flags.

Inventory bookkeeping:
- docs/INVENTORY.md "CLI Modules" 31 → 32; new row for context-utilization.cjs.
- docs/INVENTORY-MANIFEST.json mirror.

5939/5939 on `npm test`.
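The math layer described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the contents of context-utilization.cjs: the function and export names come from the commit message, while the lowercase state strings are taken from the CLI tests and the exact validation rules (for example, rejecting `tokensUsed > contextWindow`) are assumptions.

```javascript
'use strict';

// Assumed state values; the tests in this file only confirm the lowercase strings.
const STATES = Object.freeze({
  HEALTHY: 'healthy',
  WARNING: 'warning',
  CRITICAL: 'critical',
});

// Hypothetical helper: reject anything that is not a non-negative integer.
function assertNonNegativeInteger(value, name) {
  if (!Number.isInteger(value) || value < 0) {
    throw new TypeError(`${name} must be a non-negative integer, got: ${value}`);
  }
}

function classifyContextUtilization(tokensUsed, contextWindow) {
  assertNonNegativeInteger(tokensUsed, 'tokensUsed');
  assertNonNegativeInteger(contextWindow, 'contextWindow');
  if (contextWindow === 0) {
    throw new TypeError('contextWindow must be greater than zero');
  }
  // Assumption: "out-of-range" includes using more tokens than the window holds.
  if (tokensUsed > contextWindow) {
    throw new TypeError('tokensUsed must not exceed contextWindow');
  }

  // Compare with integer cross-multiplication so the 60% and 70% boundaries
  // are decided on the exact ratio, not on a rounded or floating-point value.
  let state = STATES.HEALTHY;
  if (tokensUsed * 10 >= contextWindow * 7) {
    state = STATES.CRITICAL;
  } else if (tokensUsed * 10 >= contextWindow * 6) {
    state = STATES.WARNING;
  }

  // The percent is rounded for display only; it never feeds the state decision.
  const percent = Math.round((tokensUsed * 100) / contextWindow);
  return { percent, state };
}

console.log(classifyContextUtilization(50000, 200000));  // { percent: 25, state: 'healthy' }
console.log(classifyContextUtilization(140000, 200000)); // { percent: 70, state: 'critical' }
```

Deciding the state from the unrounded ratio matters near the boundaries: a value such as 59.6% would display as 60% but, under the stated thresholds, should still classify as healthy.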
91 lines
4.0 KiB
JavaScript
'use strict';

/**
 * SDK CLI integration tests for `gsd-tools validate context`.
 *
 * The pure classifier's behavior is covered by
 * tests/context-utilization.test.cjs — these tests focus on what the CLI
 * adds on top: argument parsing, JSON vs human-readable rendering,
 * recommendation-string formatting, and exit-code semantics.
 */

const { describe, test } = require('node:test');
const assert = require('node:assert/strict');
const { runGsdTools } = require('./helpers.cjs');

describe('gsd-tools validate context — CLI argument errors', () => {
  test('missing --tokens-used fails with named flag in stderr', () => {
    const r = runGsdTools(['validate', 'context', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });

  test('missing --context-window fails with named flag in stderr', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '100000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /context-window/i);
  });

  test('non-numeric --tokens-used reports the offending flag', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', 'abc', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });

  test('negative --tokens-used reports the offending flag', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '-1', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });
});

describe('gsd-tools validate context — JSON vs human rendering', () => {
  test('--json emits the classifier result plus a recommendation field', () => {
    // Single round-trip test confirms (a) classifier integration,
    // (b) JSON serialization, and (c) recommendation lookup. Per-state
    // classifier behavior is covered by context-utilization.test.cjs.
    const r = runGsdTools(['validate', 'context', '--tokens-used', '50000', '--context-window', '200000', '--json']);
    assert.strictEqual(r.success, true, `expected success, got: ${r.error}`);
    const obj = JSON.parse(r.output);
    assert.deepStrictEqual(Object.keys(obj).sort(), ['percent', 'recommendation', 'state']);
    assert.strictEqual(obj.percent, 25);
    assert.strictEqual(obj.state, 'healthy');
    assert.strictEqual(obj.recommendation, null);
  });

  test('human mode (default) prints percent, state, and recommendation', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '140000', '--context-window', '200000']);
    assert.strictEqual(r.success, true);
    assert.match(r.output, /70%/);
    assert.match(r.output, /critical/);
    assert.match(r.output, /\/gsd-thread/);
  });

  test('human mode omits the recommendation line for healthy state', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '40000', '--context-window', '200000']);
    assert.strictEqual(r.success, true);
    assert.match(r.output, /20%/);
    assert.match(r.output, /healthy/);
    assert.doesNotMatch(r.output, /\/gsd-thread/, 'healthy output must not nag the user');
  });
});

describe('gsd-tools validate context — recommendation copy per state', () => {
  // The CLI owns the recommendation strings (the classifier does not).
  // These tests pin the wording so a regression to the prose is caught.
  test('warning state recommends /gsd-thread', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '130000', '--context-window', '200000', '--json']);
    const obj = JSON.parse(r.output);
    assert.strictEqual(obj.state, 'warning');
    assert.match(obj.recommendation, /\/gsd-thread/);
  });

  test('critical state names the fracture-point reasoning risk', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '160000', '--context-window', '200000', '--json']);
    const obj = JSON.parse(r.output);
    assert.strictEqual(obj.state, 'critical');
    assert.match(obj.recommendation, /\/gsd-thread/);
    assert.match(obj.recommendation, /reasoning|degrade|fracture/i);
  });
});
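The tests above pin only a few properties of the CLI renderer: the JSON key set, a `null` recommendation for the healthy state, and the presence of `/gsd-thread` (plus a reasoning-risk phrase for critical) in the recommendation copy. A renderer satisfying those constraints might look like the sketch below. The names `RECOMMENDATIONS` and `renderContextReport`, the human-readable layout, and the exact wording are all hypothetical; the real strings live in gsd-tools.cjs and are not shown in this file.

```javascript
'use strict';

// Hypothetical recommendation copy. Only the substrings the tests match on
// ("/gsd-thread", and "reasoning"/"degrade"/"fracture" for critical) are
// grounded in the test file; the rest of the wording is assumed.
const RECOMMENDATIONS = Object.freeze({
  healthy: null,
  warning: 'Context is filling up. Consider continuing in a fresh thread with /gsd-thread.',
  critical: 'Past the fracture point: reasoning quality may degrade. Start a fresh thread with /gsd-thread.',
});

// Hypothetical renderer: takes the classifier result ({ percent, state }) and
// returns either a JSON string or a short human-readable report.
function renderContextReport(result, { json = false } = {}) {
  const recommendation = RECOMMENDATIONS[result.state] ?? null;
  if (json) {
    return JSON.stringify({ percent: result.percent, state: result.state, recommendation });
  }
  const lines = [`Context utilization: ${result.percent}% (${result.state})`];
  // Healthy output carries no recommendation line at all.
  if (recommendation !== null) {
    lines.push(recommendation);
  }
  return lines.join('\n');
}

console.log(renderContextReport({ percent: 70, state: 'critical' }));
console.log(renderContextReport({ percent: 25, state: 'healthy' }, { json: true }));
```

Keeping the copy in a lookup owned by the renderer matches the split the commit message describes: the classifier returns only `{ percent, state }`, so the wording can change without touching the math layer or its tests.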