get-shit-done/tests/validate-context.test.cjs
Tom Boucher 5fdc950eb7 feat(#2792): namespace meta-skills + keyword-tag descriptions + context utilization guard (#2825)
* feat(#2792): namespace meta-skills retargeted at the post-#2790 surface

This branch is now based on #2790's HEAD (the consolidation PR) instead
of main, and every routing table targets the consolidated surface so a
user routed by a namespace meta-skill never lands at a deleted /
folded sub-skill.

Cross-PR inconsistencies the original PR #2825 carried (vs #2790):

  - ns-ideate routed to gsd-note / gsd-add-todo / gsd-add-backlog /
    gsd-plant-seed → all folded into gsd-capture by #2790. Now routes
    to gsd-capture (the parent picks the mode from the user's intent).
  - ns-context routed to gsd-scan and gsd-intel → folded into
    gsd-map-codebase --fast / --query by #2790. Now routes to those
    flag forms.
  - ns-manage routed all workspace intent to gsd-list-workspaces (a
    list-only entry), an over-narrow target that CR also flagged.
    #2790 folds it into gsd-workspace; routing now points there.
  - ns-workflow routed to gsd-research-phase → deleted outright by
    #2790. Removed.
  - ns-project routed to gsd-plan-milestone-gaps → deleted outright by
    #2790. Removed.
  - None of the namespaces previously surfaced #2790's new consolidated
    skills (gsd-capture, gsd-phase, gsd-config, gsd-workspace,
    gsd-progress). All five are now reachable through the routers.
  - extract_learnings → extract-learnings (canonicalized by #2858).

Defect fixes within the namespace skills:

  - Hyphen-form `name:` (gsd-workflow, …) per the canonical naming
    contract; replacing the colon form addresses CR's drift complaint.
  - `Skill` added to allowed-tools on every router. The body instructs
    "Invoke the matched skill directly using the Skill tool" — without
    Skill in the permission list the meta-skill cannot route at all.

New regression guard in tests/enh-2792-namespace-skills.test.cjs: every
gsd-* token in any namespace router's table column resolves to a
surviving commands/gsd/*.md file (or to a known consolidated parent for
flag-form targets like gsd-map-codebase --fast). This single test would
have caught every dead-end route the original PR shipped with.
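
A minimal sketch of the idea behind that guard, not the test that ships in
tests/enh-2792-namespace-skills.test.cjs. The helper names
(extractRouteTargets, findDeadRoutes) and the sample table are hypothetical,
and the sketch scans every cell of a table row rather than one column. It
also assumes the set of surviving skill names has already been built
elsewhere (for example from the commands/gsd/*.md listing).

```javascript
'use strict';

// Hypothetical helper: collect every gsd-* name mentioned in the rows of
// a markdown routing table. Flag forms such as `gsd-map-codebase --fast`
// reduce to their parent name because the pattern stops at whitespace.
function extractRouteTargets(markdown) {
  const targets = new Set();
  for (const line of markdown.split('\n')) {
    if (!line.trim().startsWith('|')) continue; // only table rows
    for (const match of line.matchAll(/\bgsd-[a-z0-9]+(?:-[a-z0-9]+)*/g)) {
      targets.add(match[0]);
    }
  }
  return targets;
}

// Hypothetical helper: names routed to by the table that no longer exist.
function findDeadRoutes(markdown, survivingSkills) {
  return [...extractRouteTargets(markdown)]
    .filter((name) => !survivingSkills.has(name))
    .sort();
}

// Illustrative router table; the rows are made up for this sketch.
const sampleTable = [
  '| Intent | Skill |',
  '| --- | --- |',
  '| jot an idea | gsd-capture |',
  '| quick scan | gsd-map-codebase --fast |',
  '| research a phase | gsd-research-phase |',
].join('\n');

const surviving = new Set(['gsd-capture', 'gsd-map-codebase']);
const deadRoutes = findDeadRoutes(sampleTable, surviving);
console.log(deadRoutes);
```

A guard of this shape fails as soon as a router row names a skill that was
deleted or folded, which is the class of defect listed above.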

Skill-count cap in tests/enh-2790-skill-consolidation.test.cjs now
filters out ns-*.md from its <= 63 cap. Namespace routers are
descriptor-only entries, not part of the consolidation surface that cap
is policing — they have their own contract in
tests/enh-2792-namespace-skills.test.cjs.

INVENTORY.md gains a "Namespace Meta-Skills" section with the 6 router
rows; INVENTORY-MANIFEST.json gains 6 entries; the headline count moves
59 → 65 to match.

Out of scope for this rebase: the gsd-health --context flag (PR #2825
advertised the contract but didn't implement it). That's a separate
feature concern and is left untouched here.

5908/5908 on `npm test`.

* feat(#2792): implement gsd-health --context utilization guard

The original PR #2825 advertised a `--context` flag on gsd-health with a
60%/70% utilization threshold table but never implemented the workflow
logic. CR caught it as a contract leak and the rebase deferred it. This
commit closes the gap with TDD red/green/refactor.

Math layer (pure):
  - get-shit-done/bin/lib/context-utilization.cjs
    classifyContextUtilization(tokensUsed, contextWindow) →
      { percent, state }
    State boundaries use the exact (unrounded) ratio:
      < 60% healthy / 60% to < 70% warning / ≥ 70% critical (fracture point)
    The displayed percent is rounded for humans. Throws TypeError on
    non-integer or out-of-range inputs.
  - STATES = Object.freeze({ HEALTHY, WARNING, CRITICAL }) exported
    so callers reference the names by symbol, not by literal string.
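
A minimal sketch of a classifier with this contract, not the code in
context-utilization.cjs. The function name, the STATES export, the state
strings and the 60%/70% boundaries come from this commit; the error
messages, the exact meaning of "out-of-range" (assumed here: negative
tokensUsed or non-positive contextWindow) and the integer cross-multiplied
comparison are assumptions of the sketch.

```javascript
'use strict';

const STATES = Object.freeze({
  HEALTHY: 'healthy',
  WARNING: 'warning',
  CRITICAL: 'critical',
});

function classifyContextUtilization(tokensUsed, contextWindow) {
  // Assumed validation rules; the commit only says "non-integer or
  // out-of-range inputs" throw TypeError.
  if (!Number.isInteger(tokensUsed) || tokensUsed < 0) {
    throw new TypeError('tokensUsed must be a non-negative integer');
  }
  if (!Number.isInteger(contextWindow) || contextWindow <= 0) {
    throw new TypeError('contextWindow must be a positive integer');
  }

  // Compare on the exact ratio, never on the rounded percent.
  // Cross-multiplying keeps the comparison in integer arithmetic.
  let state = STATES.HEALTHY;
  if (tokensUsed * 10 >= contextWindow * 7) {
    state = STATES.CRITICAL;
  } else if (tokensUsed * 10 >= contextWindow * 6) {
    state = STATES.WARNING;
  }

  // The percent is for display only, so rounding is safe here.
  const percent = Math.round((tokensUsed / contextWindow) * 100);
  return { percent, state };
}

console.log(classifyContextUtilization(119999, 200000));
```

The last call shows why the boundary uses the exact ratio: 119999 of 200000
tokens displays as 60% but is still below the warning threshold, so the
state stays healthy.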

SDK CLI integration:
  - get-shit-done/bin/gsd-tools.cjs
    `validate context --tokens-used N --context-window M [--json]`
    routes to the classifier, owns the recommendation copy (the
    classifier intentionally does not — keeps the renderer free to
    evolve without touching the math layer or its tests), and uses
    core.output's rawValue path for the sync-flush guarantee.
  - sdk/src/query/validate.ts + sdk/src/query/index.ts
    TypeScript validateContext handler registered at 'validate.context'
    and 'validate context'. Mirrors the CJS classifier inline (15 lines
    of arithmetic; not worth a shared cross-language module).
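
A rough sketch of the split described above, where the renderer owns the
recommendation copy and the classifier result passes through untouched. It
is not the code in gsd-tools.cjs: renderContextReport and RECOMMENDATIONS
are hypothetical names, and the recommendation strings are placeholder
wording chosen only to satisfy the patterns the tests pin (/gsd-thread, and
a reasoning/degrade/fracture mention for the critical state).

```javascript
'use strict';

// Placeholder copy; the real strings live in the CLI and are not quoted
// in this commit message.
const RECOMMENDATIONS = Object.freeze({
  healthy: null,
  warning: 'Context is filling up; consider running /gsd-thread.',
  critical: 'Reasoning quality may degrade beyond this point; run /gsd-thread.',
});

// Hypothetical renderer: takes a { percent, state } classifier result and
// returns the text the CLI would print.
function renderContextReport(result, { json = false } = {}) {
  const recommendation = RECOMMENDATIONS[result.state] ?? null;
  if (json) {
    return JSON.stringify({ ...result, recommendation });
  }
  const lines = [`Context utilization: ${result.percent}% (${result.state})`];
  if (recommendation) {
    lines.push(recommendation); // healthy output carries no recommendation
  }
  return lines.join('\n');
}

console.log(renderContextReport({ percent: 70, state: 'critical' }));
console.log(renderContextReport({ percent: 25, state: 'healthy' }, { json: true }));
```

Keeping the copy in a lookup table beside the renderer means the wording
can change without touching the classifier or its tests.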

User-facing wiring:
  - commands/gsd/health.md frontmatter advertises --context, body
    documents the three-state threshold table.
  - get-shit-done/workflows/health.md adds a `context_check` step
    that's reached only when --context is set. Step calls
    `gsd-sdk query validate.context` with self-reported tokensUsed and
    contextWindow, prints the SDK output verbatim, and ends. Includes
    a TEXT_MODE plain-text fallback for non-Claude runtimes per #2012.

Tests:
  - tests/context-utilization.test.cjs (17 tests) — pure-function
    contract: state thresholds at every boundary, percent rounding,
    input validation, return-shape (no recommendation field — that's
    the renderer's job).
  - tests/validate-context.test.cjs (9 tests) — SDK CLI plumbing:
    arg parsing errors, JSON vs human rendering, recommendation copy
    pinned per state.
  - tests/enh-2792-namespace-skills.test.cjs (4 new tests) — markdown
    contract: --context advertised in argument-hint, threshold table
    in command body, context_check step exists in workflow, step
    invokes gsd-sdk query validate.context with both flags.

Inventory bookkeeping:
  - docs/INVENTORY.md "CLI Modules" 31 → 32; new row for
    context-utilization.cjs.
  - docs/INVENTORY-MANIFEST.json mirror.

5939/5939 on `npm test`.
2026-04-30 01:04:41 -04:00

'use strict';

/**
 * SDK CLI integration tests for `gsd-tools validate context`.
 *
 * The pure classifier's behavior is covered by
 * tests/context-utilization.test.cjs — these tests focus on what the CLI
 * adds on top: argument parsing, JSON vs human-readable rendering,
 * recommendation-string formatting, and exit-code semantics.
 */

const { describe, test } = require('node:test');
const assert = require('node:assert/strict');
const { runGsdTools } = require('./helpers.cjs');

describe('gsd-tools validate context — CLI argument errors', () => {
  test('missing --tokens-used fails with named flag in stderr', () => {
    const r = runGsdTools(['validate', 'context', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });

  test('missing --context-window fails with named flag in stderr', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '100000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /context-window/i);
  });

  test('non-numeric --tokens-used reports the offending flag', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', 'abc', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });

  test('negative --tokens-used reports the offending flag', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '-1', '--context-window', '200000']);
    assert.strictEqual(r.success, false);
    assert.match(r.error, /tokens-used/i);
  });
});

describe('gsd-tools validate context — JSON vs human rendering', () => {
  test('--json emits the classifier result plus a recommendation field', () => {
    // Single round-trip test confirms (a) classifier integration,
    // (b) JSON serialization, and (c) recommendation lookup. Per-state
    // classifier behavior is covered by context-utilization.test.cjs.
    const r = runGsdTools(['validate', 'context', '--tokens-used', '50000', '--context-window', '200000', '--json']);
    assert.strictEqual(r.success, true, `expected success, got: ${r.error}`);
    const obj = JSON.parse(r.output);
    assert.deepStrictEqual(Object.keys(obj).sort(), ['percent', 'recommendation', 'state']);
    assert.strictEqual(obj.percent, 25);
    assert.strictEqual(obj.state, 'healthy');
    assert.strictEqual(obj.recommendation, null);
  });

  test('human mode (default) prints percent, state, and recommendation', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '140000', '--context-window', '200000']);
    assert.strictEqual(r.success, true);
    assert.match(r.output, /70%/);
    assert.match(r.output, /critical/);
    assert.match(r.output, /\/gsd-thread/);
  });

  test('human mode omits the recommendation line for healthy state', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '40000', '--context-window', '200000']);
    assert.strictEqual(r.success, true);
    assert.match(r.output, /20%/);
    assert.match(r.output, /healthy/);
    assert.doesNotMatch(r.output, /\/gsd-thread/, 'healthy output must not nag the user');
  });
});

describe('gsd-tools validate context — recommendation copy per state', () => {
  // The CLI owns the recommendation strings (the classifier does not).
  // These tests pin the wording so a regression to the prose is caught.
  test('warning state recommends /gsd-thread', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '130000', '--context-window', '200000', '--json']);
    const obj = JSON.parse(r.output);
    assert.strictEqual(obj.state, 'warning');
    assert.match(obj.recommendation, /\/gsd-thread/);
  });

  test('critical state names the fracture-point reasoning risk', () => {
    const r = runGsdTools(['validate', 'context', '--tokens-used', '160000', '--context-window', '200000', '--json']);
    const obj = JSON.parse(r.output);
    assert.strictEqual(obj.state, 'critical');
    assert.match(obj.recommendation, /\/gsd-thread/);
    assert.match(obj.recommendation, /reasoning|degrade|fracture/i);
  });
});
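
The tests above depend on `runGsdTools` from ./helpers.cjs, which is not
shown in this file. Judging only from how it is used here, it returns an
object with `success`, `output` and `error` fields. The sketch below is one
plausible shape for such a helper, not the repository's implementation:
`runNodeCli` is a hypothetical name, and mapping exit status 0 to `success`
and trimming stdout/stderr are assumptions.

```javascript
'use strict';

const { spawnSync } = require('node:child_process');

// Hypothetical stand-in for a CLI test helper: runs the current Node
// binary with the given arguments and captures the outcome synchronously.
function runNodeCli(nodeArgs) {
  const result = spawnSync(process.execPath, nodeArgs, { encoding: 'utf8' });
  return {
    success: result.status === 0,            // assumed: exit code 0 means success
    output: (result.stdout || '').trim(),    // assumed: stdout, trimmed
    error: (result.stderr || '').trim(),     // assumed: stderr, trimmed
  };
}

// Inline scripts stand in for a real CLI entry point.
const ok = runNodeCli(['-e', 'console.log("ok")']);
const failed = runNodeCli(['-e', 'console.error("missing --tokens-used"); process.exitCode = 1']);
console.log(ok, failed);
```

With a helper of this shape, a test would pass the CLI script path followed
by the arguments, for example
`runNodeCli(['path/to/cli.cjs', 'validate', 'context', '--json'])`, where
the script path is a placeholder.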