feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517) (#2609)

* feat: make model profiles runtime-aware for Codex/non-Claude runtimes (closes #2517)

Adds an optional top-level `runtime` config key plus a
`model_profile_overrides[runtime][tier]` map. When `runtime` is set,
profile tiers (opus/sonnet/haiku) resolve to runtime-native model IDs
(and reasoning_effort where supported) instead of bare Claude aliases.

Codex defaults from the spec:
  opus   -> gpt-5.4        reasoning_effort: xhigh
  sonnet -> gpt-5.3-codex  reasoning_effort: medium
  haiku  -> gpt-5.4-mini   reasoning_effort: medium

Claude defaults mirror MODEL_ALIAS_MAP. Unknown runtimes fall back to
the Claude-alias safe default rather than emit IDs the runtime cannot
accept. reasoning_effort is only emitted into Codex install paths;
never returned from resolveModelInternal and never written to Claude
agent frontmatter.

Backwards compatible: any user without `runtime` set sees identical
behavior — the new branch is gated on `config.runtime != null`.

Precedence (highest to lowest):
  1. per-agent model_overrides
  2. runtime-aware tier resolution (when `runtime` is set)
  3. resolve_model_ids: "omit"
  4. Claude-native default
  5. inherit (literal passthrough)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(#2517): address adversarial review of #2609 (findings 1-16)

Addresses all 16 findings from the adversarial review of PR #2609.
Each finding is enumerated below with its resolution.

CRITICAL
- F1: readGsdRuntimeProfileResolver(targetDir) now probes per-project
  .planning/config.json AND ~/.gsd/defaults.json with per-project winning,
  so the PR's headline claim ("set runtime in project config and Codex
  TOML emit picks it up") actually holds end-to-end.
- F2: resolveTierEntry field-merges user overrides with built-in defaults.
  The CONFIGURATION.md string-shorthand example
    `{ codex: { opus: "gpt-5-pro" } }`
  now keeps reasoning_effort from the built-in entry. Partial-object
  overrides like `{ opus: { reasoning_effort: 'low' } }` keep the
  built-in model. Both paths regression-tested.

MAJOR
- F3: resolveReasoningEffortInternal gates strictly on the
  RUNTIMES_WITH_REASONING_EFFORT allowlist regardless of override
  presence. Override + unknown-runtime no longer leaks reasoning_effort.
- F4: runtime:"claude" is now a no-op for resolution (it is the implicit
  default). It no longer hijacks resolve_model_ids:"omit". Existing
  tests for `runtime:"claude"` returning Claude IDs were rewritten to
  reflect the no-op semantics; new test asserts the omit case returns "".
- F5: _readGsdConfigFile in install.js writes a stderr warning on JSON
  parse failure instead of silently returning null. Read failure and
  parse failure are warned separately. Library require is hoisted to top
  of install.js so it is not co-mingled with config-read failure modes.
- F6: install.js requires for core.cjs / model-profiles.cjs are hoisted
  to the top of the file with __dirname-based absolute paths so global
  npm install works regardless of cwd. Test asserts both lib paths exist
  relative to install.js __dirname.
- F7: docs/CONFIGURATION.md `runtime` row no longer lists `opencode` as
  a valid runtime — install-path emission for non-Codex runtimes is
  explicitly out of scope per #2517 / #2612, and the doc now points at
  #2612 for the follow-on work. resolveModelInternal still accepts any
  runtime string (back-compat) and falls back safely for unknown values.
- F8: Tests now isolate HOME (and GSD_HOME) to a per-test tmpdir so the
  developer's real ~/.gsd/defaults.json cannot bleed into assertions.
  Same pattern CodeRabbit caught on PRs #2603 / #2604.
- F9: `runtime` and `model_profile_overrides` documented as flat-only
  in core.cjs comments — not routed through `get()` because they are
  top-level keys per docs/CONFIGURATION.md and introducing nested
  resolution for two new keys was not worth the edge-case surface.
- F10/F13: loadConfig now invokes _warnUnknownProfileOverrides on the
  raw parsed config so direct .planning/config.json edits surface
  unknown runtime values (e.g. typo `runtime: "codx"`) and unknown
  tier values (e.g. `model_profile_overrides.codex.banana`) at read
  time. Warnings only — preserves back-compat for runtimes added
  later. Per-process warning cache prevents log spam across repeated
  loadConfig calls.

MINOR / NIT
- F11: Removed dead `tier || 'sonnet'` defensive shortcut. The local
  is now `const alias = tier;` with a comment explaining why `tier`
  is guaranteed truthy at that point (every MODEL_PROFILES entry
  defines `balanced`, the fallback profile).
- F12: Extracted resolveTierEntry() in core.cjs as the single source
  of truth for runtime-aware tier resolution. core.cjs and bin/install.js
  both consume it — no duplicated lookup logic between the two files.
- F14: Added regression tests for findings #1, #2, #3, #4, #6, #10, #13
  in tests/issue-2517-runtime-aware-profiles.test.cjs. Each must-fix
  path has a corresponding test that fails against the pre-fix code
  and passes against the post-fix code.
- F15: docs/CONFIGURATION.md `model_profile` row cross-references
  #1713 / #1806 next to the `adaptive` enum value.
- F16: RUNTIME_PROFILE_MAP remains in core.cjs as the single source of
  truth; install.js imports it through the exported resolveTierEntry
  helper rather than carrying its own copy. Doc files (CONFIGURATION.md,
  USER-GUIDE.md, settings.md) intentionally still embed the IDs as text
  — code comment in core.cjs flags that those doc files must be updated
  whenever the constant changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Tom Boucher
2026-04-22 23:00:37 -04:00
committed by GitHub
parent 41dc475c46
commit cc17886c51
8 changed files with 1099 additions and 20 deletions

View File

@@ -57,6 +57,20 @@ const claudeToCopilotTools = {
// Get version from package.json
const pkg = require('../package.json');
// #2517 — runtime-aware tier resolution shared with core.cjs.
// Hoisted to top with absolute __dirname-based paths so `gsd install codex` works
// when invoked via npm global install (cwd is the user's project, not the gsd repo
// root). Inline `require('../get-shit-done/...')` from inside install functions
// works only because Node resolves it relative to the install.js file regardless
// of cwd, but keeping the require at the top makes the dependency explicit and
// surfaces resolution failures at process start instead of at first install call.
const _gsdLibDir = path.join(__dirname, '..', 'get-shit-done', 'bin', 'lib');
const { MODEL_PROFILES: GSD_MODEL_PROFILES } = require(path.join(_gsdLibDir, 'model-profiles.cjs'));
const {
RUNTIME_PROFILE_MAP: GSD_RUNTIME_PROFILE_MAP,
resolveTierEntry: gsdResolveTierEntry,
} = require(path.join(_gsdLibDir, 'core.cjs'));
// Parse args
const args = process.argv.slice(2);
const hasGlobal = args.includes('--global') || args.includes('-g');
@@ -620,6 +634,115 @@ function readGsdGlobalModelOverrides() {
}
}
/**
* #2517 — Read a single GSD config file (defaults.json or per-project
* config.json) into a plain object, returning null on missing/empty files
* and warning to stderr on JSON parse failures so silent corruption can't
* mask broken configs (review finding #5).
*/
function _readGsdConfigFile(absPath, label) {
if (!fs.existsSync(absPath)) return null;
let raw;
try {
raw = fs.readFileSync(absPath, 'utf-8');
} catch (err) {
process.stderr.write(`gsd: warning — could not read ${label} (${absPath}): ${err.message}\n`);
return null;
}
try {
return JSON.parse(raw);
} catch (err) {
process.stderr.write(`gsd: warning — invalid JSON in ${label} (${absPath}): ${err.message}\n`);
return null;
}
}
/**
* #2517 — Build a runtime-aware tier resolver for the install path.
*
* Probes BOTH per-project `<targetDir>/.planning/config.json` AND
* `~/.gsd/defaults.json`, with per-project keys winning over global. This
* matches `loadConfig`'s precedence and is the only way the PR's headline claim
* — "set runtime in .planning/config.json and the Codex TOML emit picks it up"
* — actually holds end-to-end (review finding #1).
*
* `targetDir` should be the consuming runtime's install root — install code
* passes `path.dirname(<runtime root>)` so `.planning/config.json` resolves
* relative to the user's project. When `targetDir` is null/undefined, only the
* global defaults are consulted.
*
* Returns null if no `runtime` is configured (preserves prior behavior — only
* model_overrides is embedded, no tier/reasoning-effort inference). Returns
* null when `model_profile` is `inherit` so the literal alias passes through
* unchanged.
*
* Returns { runtime, resolve(agentName) -> { model, reasoning_effort? } | null }
*/
function readGsdRuntimeProfileResolver(targetDir = null) {
const homeDefaults = _readGsdConfigFile(
path.join(os.homedir(), '.gsd', 'defaults.json'),
'~/.gsd/defaults.json'
);
// Per-project config probe. Resolve the project root by walking up from
// targetDir until we hit a `.planning/` directory; this covers both the
// common case (caller passes the project root) and the case where caller
// passes a nested install dir like `<root>/.codex/`.
let projectConfig = null;
if (targetDir) {
let probeDir = path.resolve(targetDir);
for (let depth = 0; depth < 8; depth += 1) {
const candidate = path.join(probeDir, '.planning', 'config.json');
if (fs.existsSync(candidate)) {
projectConfig = _readGsdConfigFile(candidate, '.planning/config.json');
break;
}
const parent = path.dirname(probeDir);
if (parent === probeDir) break;
probeDir = parent;
}
}
// Per-project wins. Only fall back to ~/.gsd/defaults.json when the project
// didn't set the field. Field-level merge (not whole-object replace) so a
// user can keep `runtime` global while overriding only `model_profile` per
// project, and vice versa.
const merged = {
runtime:
(projectConfig && projectConfig.runtime) ||
(homeDefaults && homeDefaults.runtime) ||
null,
model_profile:
(projectConfig && projectConfig.model_profile) ||
(homeDefaults && homeDefaults.model_profile) ||
'balanced',
model_profile_overrides:
(projectConfig && projectConfig.model_profile_overrides) ||
(homeDefaults && homeDefaults.model_profile_overrides) ||
null,
};
if (!merged.runtime) return null;
const profile = String(merged.model_profile).toLowerCase();
if (profile === 'inherit') return null;
return {
runtime: merged.runtime,
resolve(agentName) {
const agentModels = GSD_MODEL_PROFILES[agentName];
if (!agentModels) return null;
const tier = agentModels[profile] || agentModels.balanced;
if (!tier) return null;
return gsdResolveTierEntry({
runtime: merged.runtime,
tier,
overrides: merged.model_profile_overrides,
});
},
};
}
// Cache for attribution settings (populated once per runtime during install)
const attributionCache = new Map();
@@ -1789,7 +1912,7 @@ purpose: ${toSingleLine(description)}
* Sets required agent metadata, sandbox_mode, and developer_instructions
* from the agent markdown content.
*/
function generateCodexAgentToml(agentName, agentContent, modelOverrides = null) {
function generateCodexAgentToml(agentName, agentContent, modelOverrides = null, runtimeResolver = null) {
const sandboxMode = CODEX_AGENT_SANDBOX[agentName] || 'read-only';
const { frontmatter, body } = extractFrontmatterAndBody(agentContent);
const frontmatterText = frontmatter || '';
@@ -1808,9 +1931,20 @@ function generateCodexAgentToml(agentName, agentContent, modelOverrides = null)
// Embed model override when configured in ~/.gsd/defaults.json so that
// model_overrides is respected on Codex (which uses static TOML, not inline
// Task() model parameters). See #2256.
// Precedence: per-agent model_overrides > runtime-aware tier resolution (#2517).
const modelOverride = modelOverrides?.[resolvedName] || modelOverrides?.[agentName];
if (modelOverride) {
lines.push(`model = ${JSON.stringify(modelOverride)}`);
} else if (runtimeResolver) {
// #2517 — runtime-aware tier resolution. Embeds Codex-native model + reasoning_effort
// from RUNTIME_PROFILE_MAP / model_profile_overrides for the configured tier.
const entry = runtimeResolver.resolve(resolvedName) || runtimeResolver.resolve(agentName);
if (entry?.model) {
lines.push(`model = ${JSON.stringify(entry.model)}`);
if (entry.reasoning_effort) {
lines.push(`model_reasoning_effort = ${JSON.stringify(entry.reasoning_effort)}`);
}
}
}
// Agent prompts contain raw backslashes in regexes and shell snippets.
@@ -3050,8 +3184,16 @@ function installCodexConfig(targetDir, agentsSrc) {
// Pass model overrides from ~/.gsd/defaults.json so Codex TOML files
// embed the configured model — Codex cannot receive model inline (#2256).
// #2517 — also pass the runtime-aware tier resolver so profile tiers can
// resolve to Codex-native model IDs + reasoning_effort when `runtime: "codex"`
// is set in defaults.json.
const modelOverrides = readGsdGlobalModelOverrides();
const tomlContent = generateCodexAgentToml(name, content, modelOverrides);
// Pass `targetDir` so per-project .planning/config.json wins over global
// ~/.gsd/defaults.json — without this, the PR's headline claim that
// setting runtime in the project config reaches the Codex emit path is
// false (review finding #1).
const runtimeResolver = readGsdRuntimeProfileResolver(targetDir);
const tomlContent = generateCodexAgentToml(name, content, modelOverrides, runtimeResolver);
fs.writeFileSync(path.join(agentsTomlDir, `${name}.toml`), tomlContent);
}
@@ -6943,6 +7085,7 @@ if (process.env.GSD_TEST_MODE) {
stripGsdFromCodexConfig,
mergeCodexConfig,
installCodexConfig,
readGsdRuntimeProfileResolver,
install,
uninstall,
convertClaudeCommandToCodexSkill,

View File

@@ -111,7 +111,9 @@ GSD stores project settings in `.planning/config.json`. Created during `/gsd-new
|---------|------|---------|---------|-------------|
| `mode` | enum | `interactive`, `yolo` | `interactive` | `yolo` auto-approves decisions; `interactive` confirms at each step |
| `granularity` | enum | `coarse`, `standard`, `fine` | `standard` | Controls phase count: `coarse` (3-5), `standard` (5-8), `fine` (8-12) |
| `model_profile` | enum | `quality`, `balanced`, `budget`, `inherit` | `balanced` | Model tier for each agent (see [Model Profiles](#model-profiles)) |
| `model_profile` | enum | `quality`, `balanced`, `budget`, `adaptive`, `inherit` | `balanced` | Model tier for each agent (see [Model Profiles](#model-profiles)). `adaptive` was added per [#1713](https://github.com/gsd-build/get-shit-done/issues/1713) / [#1806](https://github.com/gsd-build/get-shit-done/issues/1806) and resolves the same way as the other tiers under runtime-aware profiles. |
| `runtime` | string | `claude`, `codex`, or any string | (none) | Active runtime for [runtime-aware profile resolution](#runtime-aware-profiles-2517). When set, profile tiers (opus/sonnet/haiku) resolve to runtime-native model IDs. Today only the Codex install path emits per-agent model IDs from this resolver; other runtimes (`opencode`, `gemini`, `qwen`, `copilot`, …) consume the resolver at spawn time and gain dedicated install-path support in [#2612](https://github.com/gsd-build/get-shit-done/issues/2612). When unset (default), behavior is unchanged from prior versions. Added in v1.39 |
| `model_profile_overrides.<runtime>.<tier>` | string \| object | per-runtime tier override | (none) | Override the runtime-aware tier mapping for a specific `(runtime, tier)`. Tier is one of `opus`, `sonnet`, `haiku`. Value is either a model ID string (e.g. `"gpt-5-pro"`) or `{ model, reasoning_effort }`. See [Runtime-Aware Profiles](#runtime-aware-profiles-2517). Added in v1.39 |
| `project_code` | string | any short string | (none) | Prefix for phase directory names (e.g., `"ABC"` produces `ABC-01-setup/`). Added in v1.31 |
| `response_language` | string | language code | (none) | Language for agent responses (e.g., `"pt"`, `"ko"`, `"ja"`). Propagates to all spawned agents for cross-phase language consistency. Added in v1.32 |
| `context_window` | number | any integer | `200000` | Context window size in tokens. Set `1000000` for 1M-context models (e.g., `claude-opus-4-7[1m]`). Values `>= 500000` enable adaptive context enrichment (full-body reads of prior SUMMARY.md, deeper anti-pattern reads). Configured via `/gsd-settings-advanced`. |
@@ -588,6 +590,64 @@ The intent is the same as the Claude profile tiers -- use a stronger model for p
| `true` | Maps aliases to full Claude model IDs (`claude-opus-4-6`) | Claude Code with API that requires full IDs |
| `"omit"` | Returns empty string (runtime picks its default) | Non-Claude runtimes (Codex, OpenCode, Gemini CLI, Kilo) |
### Runtime-Aware Profiles (#2517)
When `runtime` is set, profile tiers (`opus`/`sonnet`/`haiku`) resolve to runtime-native model IDs instead of Claude aliases. This lets a single shared `.planning/config.json` work cleanly across Claude and Codex.
**Built-in tier maps:**
| Runtime | `opus` | `sonnet` | `haiku` | reasoning_effort |
|---------|--------|----------|---------|------------------|
| `claude` | `claude-opus-4-6` | `claude-sonnet-4-6` | `claude-haiku-4-5` | (not used) |
| `codex` | `gpt-5.4` | `gpt-5.3-codex` | `gpt-5.4-mini` | `xhigh` / `medium` / `medium` |
**Codex example** — one config, tiered models, no large `model_overrides` block:
```json
{
"runtime": "codex",
"model_profile": "balanced"
}
```
This resolves `gsd-planner``gpt-5.4` (xhigh), `gsd-executor``gpt-5.3-codex` (medium), `gsd-codebase-mapper``gpt-5.4-mini` (medium). The Codex installer embeds `model = "..."` and `model_reasoning_effort = "..."` in each generated agent TOML.
**Claude example** — explicit opt-in resolves to full Claude IDs (no `resolve_model_ids: true` needed):
```json
{
"runtime": "claude",
"model_profile": "quality"
}
```
**Per-runtime overrides** — replace one or more tier defaults:
```json
{
"runtime": "codex",
"model_profile": "quality",
"model_profile_overrides": {
"codex": {
"opus": "gpt-5-pro",
"haiku": { "model": "gpt-5-nano", "reasoning_effort": "low" }
}
}
}
```
**Precedence (highest to lowest):**
1. `model_overrides[<agent>]` — explicit per-agent ID always wins.
2. **Runtime-aware tier resolution** (this section) — when `runtime` is set and profile is not `inherit`.
3. `resolve_model_ids: "omit"` — returns empty string when no `runtime` is set.
4. Claude-native default — `model_profile` tier as alias (current default).
5. `inherit` — propagates literal `inherit` for `Task(model="inherit")` semantics.
**Backwards compatibility.** Setups without `runtime` set see zero behavior change — every existing config continues to work identically. Codex installs that auto-set `resolve_model_ids: "omit"` continue to omit the model field unless the user opts in by setting `runtime: "codex"`.
**Unknown runtimes.** If `runtime` is set to a value with no built-in tier map and no `model_profile_overrides[<runtime>]`, GSD falls back to the Claude-alias safe default rather than emit a model ID the runtime cannot accept. To support a new runtime, populate `model_profile_overrides.<runtime>.{opus,sonnet,haiku}` with valid IDs.
### Profile Philosophy
| Profile | Philosophy | When to Use |

View File

@@ -754,6 +754,19 @@ To assign different models to different agents on a non-Claude runtime, add `mod
The installer auto-configures `resolve_model_ids: "omit"` for Gemini CLI, OpenCode, Kilo, and Codex. If you're manually setting up a non-Claude runtime, add it to `.planning/config.json` yourself.
#### Switching from Claude to Codex with one config change (#2517)
If you want tiered models on Codex without writing a large `model_overrides` block, set `runtime: "codex"` and pick a profile:
```json
{
"runtime": "codex",
"model_profile": "balanced"
}
```
GSD will resolve each agent's tier (`opus`/`sonnet`/`haiku`) to the Codex-native model and reasoning effort defined in the runtime tier map (`gpt-5.4` xhigh / `gpt-5.3-codex` medium / `gpt-5.4-mini` medium). The Codex installer embeds both `model` and `model_reasoning_effort` into each agent's TOML automatically. To override a single tier, add `model_profile_overrides.codex.<tier>`. See [Runtime-Aware Profiles](CONFIGURATION.md#runtime-aware-profiles-2517).
See the [Configuration Reference](CONFIGURATION.md#non-claude-runtimes-codex-opencode-gemini-cli-kilo) for the full explanation.
### Installing for Cline

View File

@@ -62,6 +62,8 @@ const VALID_CONFIG_KEYS = new Set([
'graphify.build_timeout',
'claude_md_path',
'claude_md_assembly.mode',
// #2517 — runtime-aware model profiles
'runtime',
]);
/**
@@ -73,6 +75,10 @@ const DYNAMIC_KEY_PATTERNS = [
{ test: (k) => /^review\.models\.[a-zA-Z0-9_-]+$/.test(k), description: 'review.models.<cli-name>' },
{ test: (k) => /^features\.[a-zA-Z0-9_]+$/.test(k), description: 'features.<feature_name>' },
{ test: (k) => /^claude_md_assembly\.blocks\.[a-zA-Z0-9_]+$/.test(k), description: 'claude_md_assembly.blocks.<section>' },
// #2517 — runtime-aware model profile overrides: model_profile_overrides.<runtime>.<tier>
// <runtime> is a free string (so users can map non-built-in runtimes); <tier> is enum-restricted.
{ test: (k) => /^model_profile_overrides\.[a-zA-Z0-9_-]+\.(opus|sonnet|haiku)$/.test(k),
description: 'model_profile_overrides.<runtime>.<opus|sonnet|haiku>' },
];
/**

View File

@@ -339,6 +339,13 @@ function loadConfig(cwd) {
);
}
// #2517 — Validate runtime/tier values for keys that loadConfig handles but
// can be edited directly into config.json (bypassing config-set's enum check).
// This catches typos like `runtime: "codx"` and `model_profile_overrides.codex.banana`
// at read time without rejecting back-compat values from new runtimes
// (review findings #10, #13).
_warnUnknownProfileOverrides(parsed, '.planning/config.json');
const get = (key, nested) => {
if (parsed[key] !== undefined) return parsed[key];
if (nested && parsed[nested.section] && parsed[nested.section][nested.field] !== undefined) {
@@ -390,6 +397,18 @@ function loadConfig(cwd) {
project_code: get('project_code') ?? defaults.project_code,
subagent_timeout: get('subagent_timeout', { section: 'workflow', field: 'subagent_timeout' }) ?? defaults.subagent_timeout,
model_overrides: parsed.model_overrides || null,
// #2517 — runtime-aware profiles. `runtime` defaults to null (back-compat).
// When null, resolveModelInternal preserves today's Claude-native behavior.
// NOTE: `runtime` and `model_profile_overrides` are intentionally read
// flat-only (not via `get()` with a workflow.X fallback) — they are
// top-level keys per docs/CONFIGURATION.md. The lighter-touch decision
// here was to document the constraint rather than introduce nested
// resolution edge cases for two new keys (review finding #9). The
// schema validation in `_warnUnknownProfileOverrides` runs against the
// raw `parsed` blob, so direct `.planning/config.json` edits surface
// unknown runtime/tier names at load time, not silently (review finding #10).
runtime: parsed.runtime || null,
model_profile_overrides: parsed.model_profile_overrides || null,
agent_skills: parsed.agent_skills || {},
manager: parsed.manager || {},
response_language: get('response_language') || null,
@@ -1449,32 +1468,220 @@ const MODEL_ALIAS_MAP = {
'haiku': 'claude-haiku-4-5',
};
/**
* #2517 — runtime-aware tier resolution.
* Maps `model_profile` tiers (opus/sonnet/haiku) to runtime-native model IDs and
* (where supported) reasoning_effort settings.
*
* Each entry: { model: <id>, reasoning_effort?: <level> }
*
* `claude` mirrors MODEL_ALIAS_MAP — present for symmetry so `runtime: "claude"`
* resolves through the same code path. `codex` defaults are taken from the spec
* in #2517. Unknown runtimes fall back to the Claude alias to avoid emitting
* provider-specific IDs the runtime cannot accept.
*/
const RUNTIME_PROFILE_MAP = {
claude: {
opus: { model: 'claude-opus-4-6' },
sonnet: { model: 'claude-sonnet-4-6' },
haiku: { model: 'claude-haiku-4-5' },
},
codex: {
opus: { model: 'gpt-5.4', reasoning_effort: 'xhigh' },
sonnet: { model: 'gpt-5.3-codex', reasoning_effort: 'medium' },
haiku: { model: 'gpt-5.4-mini', reasoning_effort: 'medium' },
},
};
const RUNTIMES_WITH_REASONING_EFFORT = new Set(['codex']);
/**
* Tier enum allowed under `model_profile_overrides[runtime][tier]`. Mirrors the
* regex in `config-schema.cjs` (DYNAMIC_KEY_PATTERNS) so loadConfig surfaces the
* same constraint at read time, not only at config-set time (review finding #10).
*/
const RUNTIME_OVERRIDE_TIERS = new Set(['opus', 'sonnet', 'haiku']);
/**
* Allowlist of runtime names the install pipeline currently knows how to emit
* native model IDs for. Synced with `getDirName` in `bin/install.js` and the
* runtime list in `docs/CONFIGURATION.md`. Free-string runtimes outside this
* set are still accepted (#2517 deliberately leaves the runtime field open) —
* a warning fires once at loadConfig so a typo like `runtime: "codx"` does not
* silently fall back to Claude defaults (review findings #10, #13).
*/
const KNOWN_RUNTIMES = new Set([
'claude', 'codex', 'opencode', 'kilo', 'gemini', 'qwen',
'copilot', 'cursor', 'windsurf', 'augment', 'trae', 'codebuddy',
'antigravity', 'cline',
]);
const _warnedConfigKeys = new Set();
/**
* Emit a one-time stderr warning for unknown runtime/tier keys in a parsed
* config blob. Idempotent across calls — the same (file, key) pair only warns
* once per process so loadConfig can be called repeatedly without spamming.
*
* Does NOT reject — preserves back-compat for users on a runtime not yet in the
* allowlist (the new-runtime case must always be possible without code changes).
*/
function _warnUnknownProfileOverrides(parsed, configLabel) {
if (!parsed || typeof parsed !== 'object') return;
const runtime = parsed.runtime;
if (runtime && typeof runtime === 'string' && !KNOWN_RUNTIMES.has(runtime)) {
const key = `${configLabel}::runtime::${runtime}`;
if (!_warnedConfigKeys.has(key)) {
_warnedConfigKeys.add(key);
try {
process.stderr.write(
`gsd: warning — config key "runtime" has unknown value "${runtime}". ` +
`Known runtimes: ${[...KNOWN_RUNTIMES].sort().join(', ')}. ` +
`Resolution will fall back to safe defaults. (#2517)\n`
);
} catch { /* stderr might be closed in some test harnesses */ }
}
}
const overrides = parsed.model_profile_overrides;
if (!overrides || typeof overrides !== 'object') return;
for (const [overrideRuntime, tierMap] of Object.entries(overrides)) {
if (!KNOWN_RUNTIMES.has(overrideRuntime)) {
const key = `${configLabel}::override-runtime::${overrideRuntime}`;
if (!_warnedConfigKeys.has(key)) {
_warnedConfigKeys.add(key);
try {
process.stderr.write(
`gsd: warning — model_profile_overrides.${overrideRuntime}.* uses ` +
`unknown runtime "${overrideRuntime}". Known runtimes: ` +
`${[...KNOWN_RUNTIMES].sort().join(', ')}. (#2517)\n`
);
} catch { /* ok */ }
}
}
if (!tierMap || typeof tierMap !== 'object') continue;
for (const tierName of Object.keys(tierMap)) {
if (!RUNTIME_OVERRIDE_TIERS.has(tierName)) {
const key = `${configLabel}::override-tier::${overrideRuntime}.${tierName}`;
if (!_warnedConfigKeys.has(key)) {
_warnedConfigKeys.add(key);
try {
process.stderr.write(
`gsd: warning — model_profile_overrides.${overrideRuntime}.${tierName} ` +
`uses unknown tier "${tierName}". Allowed tiers: opus, sonnet, haiku. (#2517)\n`
);
} catch { /* ok */ }
}
}
}
}
}
// Internal helper exposed for tests so per-process warning state can be reset
// between cases that intentionally exercise the warning path repeatedly.
function _resetRuntimeWarningCacheForTests() {
_warnedConfigKeys.clear();
}
/**
* #2517 — Resolve the runtime-aware tier entry for (runtime, tier).
*
* Single source of truth shared by core.cjs (resolveModelInternal /
* resolveReasoningEffortInternal) and bin/install.js (Codex/OpenCode TOML emit
* paths). Always merges built-in defaults with user overrides at the field
* level so partial overrides keep the unspecified fields:
*
* `{ codex: { opus: "gpt-5-pro" } }` keeps reasoning_effort: 'xhigh'
* `{ codex: { opus: { reasoning_effort: 'low' } } }` keeps model: 'gpt-5.4'
*
* Without this field-merge, the documented string-shorthand example silently
* dropped reasoning_effort and a partial-object override silently dropped the
* model — both reported as critical findings in the #2609 review.
*
* Inputs:
* - runtime: string (e.g. 'codex', 'claude', 'opencode')
* - tier: 'opus' | 'sonnet' | 'haiku'
* - overrides: optional `model_profile_overrides` blob (may be null/undefined)
*
* Returns `{ model: string, reasoning_effort?: string } | null`.
*/
function resolveTierEntry({ runtime, tier, overrides }) {
if (!runtime || !tier) return null;
const builtin = RUNTIME_PROFILE_MAP[runtime]?.[tier] || null;
const userRaw = overrides?.[runtime]?.[tier];
// String shorthand from CONFIGURATION.md examples — `{ codex: { opus: "gpt-5-pro" } }`.
// Treat as `{ model: "gpt-5-pro" }` so the field-merge below still preserves
// reasoning_effort from the built-in defaults.
let userEntry = null;
if (userRaw) {
userEntry = typeof userRaw === 'string' ? { model: userRaw } : userRaw;
}
if (!builtin && !userEntry) return null;
// Field-merge: user fields win, built-in fills the gaps.
return { ...(builtin || {}), ...(userEntry || {}) };
}
/**
* Convenience wrapper used by resolveModelInternal / resolveReasoningEffortInternal.
* Pulls runtime + overrides out of a loaded config and delegates to resolveTierEntry.
*/
function _resolveRuntimeTier(config, tier) {
return resolveTierEntry({
runtime: config.runtime,
tier,
overrides: config.model_profile_overrides,
});
}
function resolveModelInternal(cwd, agentType) {
const config = loadConfig(cwd);
// Check per-agent override first — always respected regardless of resolve_model_ids.
// 1. Per-agent override — always respected; highest precedence.
// Users who set fully-qualified model IDs (e.g., "openai/gpt-5.4") get exactly that.
const override = config.model_overrides?.[agentType];
if (override) {
return override;
}
// resolve_model_ids: "omit" — return empty string so the runtime uses its configured
// default model. For non-Claude runtimes (OpenCode, Codex, etc.) that don't recognize
// Claude aliases (opus/sonnet/haiku/inherit). Set automatically during install. See #1156.
// 2. Compute the tier (opus/sonnet/haiku) for this agent under the active profile.
const profile = String(config.model_profile || 'balanced').toLowerCase();
const agentModels = MODEL_PROFILES[agentType];
const tier = agentModels ? (agentModels[profile] || agentModels['balanced']) : null;
// 3. Runtime-aware resolution (#2517) — only when `runtime` is explicitly set
// to a non-Claude runtime. `runtime: "claude"` is the implicit default and is
// treated as a no-op here so it does not silently override `resolve_model_ids:
// "omit"` (review finding #4). Deliberate ordering for non-Claude runtimes:
// explicit opt-in beats `resolve_model_ids: "omit"` so users on Codex installs
// that auto-set "omit" can still flip on tiered behavior by setting runtime
// alone. inherit profile is preserved verbatim.
if (config.runtime && config.runtime !== 'claude' && profile !== 'inherit' && tier) {
const entry = _resolveRuntimeTier(config, tier);
if (entry?.model) return entry.model;
// Unknown runtime with no user-supplied overrides — fall through to Claude-safe
// default rather than emit an ID the runtime can't accept.
}
// 4. resolve_model_ids: "omit" — return empty string so the runtime uses its
// configured default model. For non-Claude runtimes (OpenCode, Codex, etc.) that
// don't recognize Claude aliases. Set automatically during install. See #1156.
if (config.resolve_model_ids === 'omit') {
return '';
}
// Fall back to profile lookup
const profile = String(config.model_profile || 'balanced').toLowerCase();
const agentModels = MODEL_PROFILES[agentType];
// 5. Profile lookup (Claude-native default).
if (!agentModels) return 'sonnet';
if (profile === 'inherit') return 'inherit';
const alias = agentModels[profile] || agentModels['balanced'] || 'sonnet';
// `tier` is guaranteed truthy here: agentModels exists, and MODEL_PROFILES
// entries always define `balanced`, so `agentModels[profile] || agentModels.balanced`
// resolves to a string. Keep the local for readability — no defensive fallback.
const alias = tier;
// resolve_model_ids: true — map alias to full Claude model ID
// Prevents 404s when the Task tool passes aliases directly to the API
// resolve_model_ids: true — map alias to full Claude model ID.
// Prevents 404s when the Task tool passes aliases directly to the API.
if (config.resolve_model_ids) {
return MODEL_ALIAS_MAP[alias] || alias;
}
@@ -1482,6 +1689,41 @@ function resolveModelInternal(cwd, agentType) {
return alias;
}
/**
* #2517 — Resolve runtime-specific reasoning_effort for an agent.
* Returns null unless:
* - `runtime` is explicitly set in config,
* - the runtime supports reasoning_effort (currently: codex),
* - profile is not 'inherit',
* - the resolved tier entry has a `reasoning_effort` value.
*
* Never returns a value for Claude — keeps reasoning_effort out of Claude spawn paths.
*/
function resolveReasoningEffortInternal(cwd, agentType) {
const config = loadConfig(cwd);
if (!config.runtime) return null;
// Strict allowlist: reasoning_effort only propagates for runtimes whose
// install path actually accepts it. Adding a new runtime here is the only
// way to enable effort propagation — overrides cannot bypass the gate.
// Without this, a typo in `runtime` (e.g. `"codx"`) plus a user override
// for that typo would leak `xhigh` into a Claude or unknown install
// (review finding #3).
if (!RUNTIMES_WITH_REASONING_EFFORT.has(config.runtime)) return null;
// Per-agent override means user supplied a fully-qualified ID; reasoning_effort
// for that case must be set via per-agent mechanism, not tier inference.
if (config.model_overrides?.[agentType]) return null;
const profile = String(config.model_profile || 'balanced').toLowerCase();
if (profile === 'inherit') return null;
const agentModels = MODEL_PROFILES[agentType];
if (!agentModels) return null;
const tier = agentModels[profile] || agentModels['balanced'];
if (!tier) return null;
const entry = _resolveRuntimeTier(config, tier);
return entry?.reasoning_effort || null;
}
// ─── Summary body helpers ─────────────────────────────────────────────────
/**
@@ -1760,6 +2002,13 @@ module.exports = {
getArchivedPhaseDirs,
getRoadmapPhaseInternal,
resolveModelInternal,
resolveReasoningEffortInternal,
RUNTIME_PROFILE_MAP,
RUNTIMES_WITH_REASONING_EFFORT,
KNOWN_RUNTIMES,
RUNTIME_OVERRIDE_TIERS,
resolveTierEntry,
_resetRuntimeWarningCacheForTests,
pathExistsInternal,
generateSlugInternal,
getMilestoneInfo,

View File

@@ -63,11 +63,17 @@ Parse current values (default to `true` if not present):
**Non-Claude runtime note:** If `TEXT_MODE` is active (i.e. the runtime is non-Claude), prepend the following notice before the model profile question:
```
Note: Quality, Balanced, and Budget profiles select Claude model tiers (Opus/Sonnet/Haiku).
On non-Claude runtimes (Codex, Gemini CLI, etc.) these profiles have no effect on actual
model selection — GSD agents will use the runtime's default model.
Choose "Inherit" to use the session model for all agents, or configure model_overrides
manually in .planning/config.json to target specific models for this runtime.
Note: Quality, Balanced, Budget, and Adaptive profiles assign semantic tiers
(Opus/Sonnet/Haiku) to each agent. When `runtime` is set in .planning/config.json,
tiers resolve to runtime-native model IDs — on Codex that's gpt-5.4 / gpt-5.3-codex /
gpt-5.4-mini with appropriate reasoning effort. See "Runtime-Aware Profiles" in
docs/CONFIGURATION.md.
If `runtime` is unset on a non-Claude runtime, the profile tiers have no effect on
actual model selection — agents use the runtime's default model. Choose "Inherit" to
force session-model behavior, set `runtime` + a profile to get tiered models, or
configure `model_overrides` manually in .planning/config.json to target specific
models per agent.
```
Use AskUserQuestion with current values pre-selected. Questions are grouped into six visual sections; the first question in each section carries the section-denoting `header` field (AskUserQuestion renders abbreviated section tags for grouping, max 12 chars).

View File

@@ -162,15 +162,25 @@ describe('loadConfig', () => {
const { VALID_CONFIG_KEYS } = require('../get-shit-done/bin/lib/config.cjs');
// Every top-level key from VALID_CONFIG_KEYS should be recognized
const topLevelKeys = [...VALID_CONFIG_KEYS].map(k => k.split('.')[0]);
// For value-validated keys (e.g. `runtime` enforces an enum at loadConfig
// time, see #2517 review finding #10), seed a known-good value so the
// value-validation warning doesn't fire — this test only checks that the
// key NAME is recognized, not whether the value itself is valid.
const KEY_VALID_VALUES = { runtime: 'codex' };
for (const key of topLevelKeys) {
writeConfig({ [key]: 'test-value' });
const value = KEY_VALID_VALUES[key] ?? 'test-value';
writeConfig({ [key]: value });
const origWrite = process.stderr.write;
let stderrOutput = '';
process.stderr.write = (chunk) => { stderrOutput += chunk; };
try {
loadConfig(tmpDir);
// Look only for the unknown-KEY warning shape, not any incidental match
// (the value-validation warning emitted by #2517 mentions key names too).
const unknownKeyWarning = stderrOutput.includes('unknown config key(s)') &&
stderrOutput.includes(key);
assert.ok(
!stderrOutput.includes(key),
!unknownKeyWarning,
`VALID_CONFIG_KEYS key "${key}" should not trigger unknown-key warning`
);
} finally {

View File

@@ -0,0 +1,592 @@
/**
* Issue #2517 — runtime-aware model profile resolution.
*
* Today, profile tiers (opus/sonnet/haiku) only resolve to Claude IDs. On Codex /
* other runtimes, users must use `inherit` or write large `model_overrides` blocks.
*
* This adds a `runtime` config key + `model_profile_overrides[runtime][tier]` map.
* When `runtime` is set to a non-Claude value, profile tiers resolve to runtime-
* native model IDs.
*
* Codex: opus -> gpt-5.4 (xhigh), sonnet -> gpt-5.3-codex (medium), haiku -> gpt-5.4-mini (medium)
*
* `runtime: "claude"` is the implicit default and is treated as a no-op for
* resolution — it does not override `resolve_model_ids: "omit"` or any other
* Claude-native semantics (review finding #4).
*
* `inherit` keeps current behavior. Unknown runtimes fall back safely (do NOT emit
* provider-specific IDs the runtime can't accept) and trigger a one-shot stderr
* warning so typos like `runtime: "codx"` surface immediately (review finding #13).
*
* HOME isolation: every test sets `process.env.HOME` to a per-suite tmpdir so the
* developer's real `~/.gsd/defaults.json` cannot bleed into assertions
* (review finding #8 / pattern from CodeRabbit on PRs #2603, #2604).
*/
'use strict';
const { describe, test, beforeEach, afterEach } = require('node:test');
const assert = require('node:assert/strict');
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createTempProject, cleanup } = require('./helpers.cjs');
const {
resolveModelInternal,
resolveReasoningEffortInternal,
resolveTierEntry,
RUNTIME_PROFILE_MAP,
KNOWN_RUNTIMES,
_resetRuntimeWarningCacheForTests,
} = require('../get-shit-done/bin/lib/core.cjs');
const { isValidConfigKey } = require('../get-shit-done/bin/lib/config-schema.cjs');
function writeConfig(tmpDir, obj) {
fs.writeFileSync(
path.join(tmpDir, '.planning', 'config.json'),
JSON.stringify(obj, null, 2)
);
}
// ─── Shared HOME isolation (#2517 review finding #8) ────────────────────────
// Without this, a developer's real `~/.gsd/defaults.json` (e.g. one with
// `runtime: codex` set) silently overrides test assertions about back-compat
// behavior. Capture HOME, point it at an isolated tmpdir for the duration of
// each test, restore on teardown.
let _origHome;
let _origGsdHome;
let _isolatedHome;
function isolateHome() {
_origHome = process.env.HOME;
_origGsdHome = process.env.GSD_HOME;
_isolatedHome = fs.mkdtempSync(path.join(os.tmpdir(), 'gsd-home-iso-'));
process.env.HOME = _isolatedHome;
process.env.GSD_HOME = _isolatedHome;
}
function restoreHome() {
if (_origHome === undefined) delete process.env.HOME; else process.env.HOME = _origHome;
if (_origGsdHome === undefined) delete process.env.GSD_HOME; else process.env.GSD_HOME = _origGsdHome;
if (_isolatedHome) fs.rmSync(_isolatedHome, { recursive: true, force: true });
_isolatedHome = null;
}
// ─── Backwards compatibility — no `runtime` set ─────────────────────────────
describe('issue #2517: backwards compat — no runtime key set', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('balanced profile returns Claude alias when runtime absent', () => {
writeConfig(tmpDir, { model_profile: 'balanced' });
// gsd-planner balanced -> opus
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'opus');
});
test('inherit profile still returns "inherit" with no runtime', () => {
writeConfig(tmpDir, { model_profile: 'inherit' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'inherit');
});
test('resolve_model_ids:true still maps alias -> full Claude ID with no runtime', () => {
writeConfig(tmpDir, { model_profile: 'balanced', resolve_model_ids: true });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'claude-opus-4-6');
});
test('resolve_model_ids:"omit" still returns "" with no runtime', () => {
writeConfig(tmpDir, { model_profile: 'balanced', resolve_model_ids: 'omit' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), '');
});
test('reasoning_effort returns null when runtime absent', () => {
writeConfig(tmpDir, { model_profile: 'balanced' });
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), null);
});
test('adaptive profile still works without runtime (#1713/#1806)', () => {
writeConfig(tmpDir, { model_profile: 'adaptive' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'opus');
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-codebase-mapper'), 'haiku');
});
});
// ─── runtime: "claude" — no-op (preserves Claude-native semantics) ──────────
describe('issue #2517: runtime "claude" is a no-op for resolution (finding #4)', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('runtime:"claude" + balanced returns the alias, not the resolved Claude ID', () => {
// `runtime: "claude"` is the implicit default — it must not silently flip
// resolve_model_ids on. The alias passes through identically to the unset case.
writeConfig(tmpDir, { runtime: 'claude', model_profile: 'balanced' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'opus');
});
test('runtime:"claude" + resolve_model_ids:"omit" returns "" (finding #4 regression)', () => {
// The pre-fix bug: runtime:"claude" hijacked the resolution chain and
// returned `claude-opus-4-6` even when the user explicitly asked for the
// omit semantics.
writeConfig(tmpDir, {
runtime: 'claude',
model_profile: 'quality',
resolve_model_ids: 'omit',
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), '');
});
test('runtime:"claude" + resolve_model_ids:true maps alias -> full Claude ID', () => {
writeConfig(tmpDir, {
runtime: 'claude',
model_profile: 'quality',
resolve_model_ids: true,
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'claude-opus-4-6');
});
test('reasoning_effort is null on Claude (never leaks)', () => {
writeConfig(tmpDir, { runtime: 'claude', model_profile: 'quality' });
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), null);
});
});
// ─── runtime: "codex" — resolves tiers to Codex IDs + reasoning_effort ──────
describe('issue #2517: runtime "codex" — Codex tier resolution', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('opus tier -> gpt-5.4 with reasoning_effort xhigh', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
// gsd-planner quality -> opus -> gpt-5.4
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), 'xhigh');
});
test('sonnet tier -> gpt-5.3-codex with reasoning_effort medium', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'balanced' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-roadmapper'), 'gpt-5.3-codex');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-roadmapper'), 'medium');
});
test('haiku tier -> gpt-5.4-mini with reasoning_effort medium', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'budget' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-codebase-mapper'), 'gpt-5.4-mini');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-codebase-mapper'), 'medium');
});
test('adaptive profile resolves on Codex (no #1713/#1806 regression)', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'adaptive' });
// gsd-planner adaptive -> opus -> gpt-5.4
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4');
// gsd-codebase-mapper adaptive -> haiku -> gpt-5.4-mini
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-codebase-mapper'), 'gpt-5.4-mini');
});
test('inherit profile still returns "inherit" on Codex', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'inherit' });
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'inherit');
// No reasoning_effort when inherit
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), null);
});
test('runtime:"codex" beats resolve_model_ids:"omit" (explicit non-Claude opt-in wins)', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
resolve_model_ids: 'omit',
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4');
});
});
// ─── Precedence chain ───────────────────────────────────────────────────────
describe('issue #2517: precedence chain', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('per-agent model_overrides wins over runtime tier resolution', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_overrides: { 'gsd-planner': 'gpt-5.4-mini' },
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4-mini');
});
test('model_profile_overrides[runtime][tier] beats built-in defaults', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: {
codex: { opus: 'gpt-5-pro' },
},
});
// gsd-planner quality -> opus -> overridden to gpt-5-pro
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5-pro');
// haiku not overridden — fall back to spec defaults
// gsd-codebase-mapper quality -> sonnet -> gpt-5.3-codex
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-codebase-mapper'), 'gpt-5.3-codex');
});
test('partial profile_overrides — only opus overridden, sonnet uses default', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'balanced',
model_profile_overrides: {
codex: { opus: 'gpt-5-pro' }, // only opus overridden
},
});
// gsd-planner balanced -> opus -> overridden to gpt-5-pro
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5-pro');
// gsd-roadmapper balanced -> sonnet -> spec default
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-roadmapper'), 'gpt-5.3-codex');
});
test('per-agent override beats profile override beats default', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: { codex: { opus: 'gpt-5-pro' } },
model_overrides: { 'gsd-planner': 'custom-model' },
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'custom-model');
});
});
// ─── Field-merge semantics — review findings #2 ─────────────────────────────
describe('issue #2517: field-merge of overrides with built-in defaults (finding #2)', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('string-shorthand override keeps reasoning_effort from built-in (CONFIGURATION.md example)', () => {
// `{ codex: { opus: "gpt-5-pro" } }` is the documented shorthand. Pre-fix,
// it silently dropped reasoning_effort. Post-fix, the model is overridden
// and reasoning_effort comes from the built-in entry.
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: { codex: { opus: 'gpt-5-pro' } },
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5-pro');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), 'xhigh');
});
test('partial-object override (no model) keeps model from built-in', () => {
// `{ codex: { opus: { reasoning_effort: "low" } } }` previously dropped
// the model entirely (returned undefined and fell through). Post-fix, the
// built-in `gpt-5.4` model is preserved and `low` reasoning_effort wins.
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: { codex: { opus: { reasoning_effort: 'low' } } },
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), 'low');
});
test('full-object override replaces both fields', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: {
codex: { opus: { model: 'custom-model', reasoning_effort: 'minimal' } },
},
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'custom-model');
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), 'minimal');
});
test('resolveTierEntry helper: shorthand merge', () => {
// Direct unit-test of the shared helper used by core + install.js.
const entry = resolveTierEntry({
runtime: 'codex',
tier: 'opus',
overrides: { codex: { opus: 'gpt-5-pro' } },
});
assert.deepStrictEqual(entry, { model: 'gpt-5-pro', reasoning_effort: 'xhigh' });
});
test('resolveTierEntry helper: partial-object merge keeps built-in model', () => {
const entry = resolveTierEntry({
runtime: 'codex',
tier: 'opus',
overrides: { codex: { opus: { reasoning_effort: 'low' } } },
});
assert.deepStrictEqual(entry, { model: 'gpt-5.4', reasoning_effort: 'low' });
});
test('resolveTierEntry helper: unknown runtime + no overrides -> null', () => {
const entry = resolveTierEntry({
runtime: 'mystery',
tier: 'opus',
overrides: null,
});
assert.strictEqual(entry, null);
});
});
// ─── reasoning_effort allowlist (review finding #3) ─────────────────────────
describe('issue #2517: reasoning_effort allowlist gates regardless of overrides (finding #3)', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('unknown runtime with overrides supplying reasoning_effort yields null effort', () => {
// Pre-fix: `if (!overrides) return null` left a hole — overrides for an
// unknown runtime made effort propagate, defeating the typo guard.
writeConfig(tmpDir, {
runtime: 'mystery',
model_profile: 'quality',
model_profile_overrides: {
mystery: { opus: { model: 'mystery-opus', reasoning_effort: 'xhigh' } },
},
});
// Model still resolves (overrides are honored).
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'mystery-opus');
// …but reasoning_effort does NOT propagate to a runtime not in the allowlist.
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), null);
});
test('typo runtime "codx" with overrides yields null effort (no leak into install path)', () => {
writeConfig(tmpDir, {
runtime: 'codx',
model_profile: 'quality',
model_profile_overrides: { codx: { opus: { model: 'gpt-5.4', reasoning_effort: 'xhigh' } } },
});
assert.strictEqual(resolveReasoningEffortInternal(tmpDir, 'gsd-planner'), null);
});
});
// ─── Unknown runtime / unknown tier ─────────────────────────────────────────
describe('issue #2517: unknown runtime + safe fallback', () => {
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('unknown runtime falls back to Claude-alias safe default (no Codex IDs leaked)', () => {
writeConfig(tmpDir, { runtime: 'mystery-runtime', model_profile: 'quality' });
// Should NOT emit gpt-5.4 — should fall back to Claude alias
const resolved = resolveModelInternal(tmpDir, 'gsd-planner');
assert.notStrictEqual(resolved, 'gpt-5.4');
assert.strictEqual(resolved, 'opus');
});
test('unknown runtime + user-provided overrides for that runtime — uses overrides', () => {
writeConfig(tmpDir, {
runtime: 'mystery-runtime',
model_profile: 'quality',
model_profile_overrides: {
'mystery-runtime': { opus: 'mystery-opus' },
},
});
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'mystery-opus');
});
test('runtime:"codex" but missing model_profile_overrides[codex] uses spec defaults', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
// No model_profile_overrides at all — built-in Codex defaults take over
assert.strictEqual(resolveModelInternal(tmpDir, 'gsd-planner'), 'gpt-5.4');
});
});
// ─── Schema validation (config-set time + load time) ────────────────────────
describe('issue #2517: VALID_CONFIG_KEYS schema', () => {
test('"runtime" is a valid config key', () => {
assert.strictEqual(isValidConfigKey('runtime'), true);
});
test('model_profile_overrides.codex.opus is valid', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.codex.opus'), true);
});
test('model_profile_overrides.codex.sonnet is valid', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.codex.sonnet'), true);
});
test('model_profile_overrides.codex.haiku is valid', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.codex.haiku'), true);
});
test('model_profile_overrides.claude.opus is valid', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.claude.opus'), true);
});
test('model_profile_overrides with unknown runtime is valid (free-string runtime)', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.acme.opus'), true);
});
test('model_profile_overrides with bogus tier is rejected', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.codex.banana'), false);
});
test('model_profile_overrides without tier is rejected', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides.codex'), false);
});
test('model_profile_overrides root key alone is rejected (must include runtime+tier)', () => {
assert.strictEqual(isValidConfigKey('model_profile_overrides'), false);
});
});
// ─── loadConfig validation warnings (review findings #10, #13) ──────────────
describe('issue #2517: loadConfig warns on unknown runtime/tier (findings #10, #13)', () => {
const { loadConfig } = require('../get-shit-done/bin/lib/core.cjs');
let tmpDir;
let origWrite;
let captured;
beforeEach(() => {
isolateHome();
tmpDir = createTempProject();
_resetRuntimeWarningCacheForTests();
captured = [];
origWrite = process.stderr.write.bind(process.stderr);
process.stderr.write = (chunk) => { captured.push(String(chunk)); return true; };
});
afterEach(() => { process.stderr.write = origWrite; cleanup(tmpDir); restoreHome(); });
test('unknown runtime triggers a stderr warning', () => {
writeConfig(tmpDir, { runtime: 'codx', model_profile: 'quality' });
loadConfig(tmpDir);
const joined = captured.join('');
assert.match(joined, /unknown value "codx"/);
});
test('known runtime does NOT trigger a runtime warning', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
loadConfig(tmpDir);
const joined = captured.join('');
assert.doesNotMatch(joined, /unknown value/);
});
test('unknown tier in overrides triggers a stderr warning', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile_overrides: { codex: { banana: 'whatever' } },
});
loadConfig(tmpDir);
const joined = captured.join('');
assert.match(joined, /unknown tier "banana"/);
});
test('unknown runtime in overrides triggers a stderr warning', () => {
writeConfig(tmpDir, {
runtime: 'codex',
model_profile_overrides: { mystery: { opus: 'whatever' } },
});
loadConfig(tmpDir);
const joined = captured.join('');
assert.match(joined, /model_profile_overrides\.mystery\.\* uses unknown runtime/);
});
test('every name in KNOWN_RUNTIMES survives the warning gate', () => {
// Smoke check: `KNOWN_RUNTIMES` must list every runtime `bin/install.js`
// emits for, otherwise legitimate users get spammed at every loadConfig.
for (const r of KNOWN_RUNTIMES) {
assert.ok(typeof r === 'string' && r.length > 0);
}
});
});
// ─── End-to-end: per-project config -> Codex TOML emit (finding #1) ─────────
describe('issue #2517: install end-to-end — per-project config reaches Codex TOML (finding #1)', () => {
// Load install.js in test-mode so its module exports are populated.
const prevTestMode = process.env.GSD_TEST_MODE;
process.env.GSD_TEST_MODE = '1';
const installMod = require('../bin/install.js');
if (prevTestMode === undefined) delete process.env.GSD_TEST_MODE;
else process.env.GSD_TEST_MODE = prevTestMode;
const { readGsdRuntimeProfileResolver, generateCodexAgentToml } = installMod;
let tmpDir;
beforeEach(() => { isolateHome(); tmpDir = createTempProject(); _resetRuntimeWarningCacheForTests(); });
afterEach(() => { cleanup(tmpDir); restoreHome(); });
test('readGsdRuntimeProfileResolver picks up runtime from .planning/config.json', () => {
// No ~/.gsd/defaults.json (HOME is isolated tmpdir). Per-project config alone
// must drive the resolver — pre-fix, it returned null.
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
const resolver = readGsdRuntimeProfileResolver(tmpDir);
assert.ok(resolver, 'expected a resolver from per-project config');
assert.strictEqual(resolver.runtime, 'codex');
const entry = resolver.resolve('gsd-planner');
assert.deepStrictEqual(entry, { model: 'gpt-5.4', reasoning_effort: 'xhigh' });
});
test('per-project config wins over global ~/.gsd/defaults.json', () => {
fs.mkdirSync(path.join(_isolatedHome, '.gsd'), { recursive: true });
fs.writeFileSync(
path.join(_isolatedHome, '.gsd', 'defaults.json'),
JSON.stringify({ runtime: 'claude', model_profile: 'budget' })
);
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
const resolver = readGsdRuntimeProfileResolver(tmpDir);
assert.strictEqual(resolver.runtime, 'codex');
const entry = resolver.resolve('gsd-planner');
assert.strictEqual(entry.model, 'gpt-5.4');
});
test('generated Codex TOML embeds model = and model_reasoning_effort = lines', () => {
writeConfig(tmpDir, { runtime: 'codex', model_profile: 'quality' });
const resolver = readGsdRuntimeProfileResolver(tmpDir);
const toml = generateCodexAgentToml(
'gsd-planner',
'---\nname: gsd-planner\ndescription: Planner agent\n---\nBody.\n',
null,
resolver
);
assert.match(toml, /^model = "gpt-5\.4"$/m);
assert.match(toml, /^model_reasoning_effort = "xhigh"$/m);
});
test('generated TOML omits reasoning_effort when runtime has none', () => {
// For a known runtime with model but no reasoning_effort, only model is emitted.
// Use the user-override path to simulate this with codex (no built-in returns
// model alone, so fabricate via override of an unknown-runtime entry).
writeConfig(tmpDir, {
runtime: 'codex',
model_profile: 'quality',
model_profile_overrides: { codex: { opus: { model: 'custom', reasoning_effort: '' } } },
});
const resolver = readGsdRuntimeProfileResolver(tmpDir);
const toml = generateCodexAgentToml(
'gsd-planner',
'---\nname: gsd-planner\n---\nBody.\n',
null,
resolver
);
assert.match(toml, /^model = "custom"$/m);
assert.doesNotMatch(toml, /model_reasoning_effort/);
});
test('resolver returns null with no global, no per-project config', () => {
// Sanity: nothing configured -> nothing emitted. Pre-existing back-compat.
const resolver = readGsdRuntimeProfileResolver(tmpDir);
assert.strictEqual(resolver, null);
});
test('inline require paths resolve relative to install.js __dirname (finding #6)', () => {
// Defensive: assert the lib files install.js requires actually exist at
// resolver-construction time. Catches accidental relative-path drift in CI.
const installDir = path.dirname(require.resolve('../bin/install.js'));
const libDir = path.join(installDir, '..', 'get-shit-done', 'bin', 'lib');
assert.ok(fs.existsSync(path.join(libDir, 'core.cjs')));
assert.ok(fs.existsSync(path.join(libDir, 'model-profiles.cjs')));
});
});
// ─── RUNTIME_PROFILE_MAP single source of truth (finding #16) ───────────────
describe('issue #2517: RUNTIME_PROFILE_MAP single source of truth (finding #16)', () => {
test('install.js consumes the same map as core.cjs', () => {
// `bin/install.js` must NOT carry its own duplicate copy of the map.
// The shared resolver imported in install.js exposes `runtime` and the
// entries through `resolveTierEntry`, so any future drift between the two
// files would surface as a test failure here rather than a silent bug.
const codexOpus = RUNTIME_PROFILE_MAP.codex?.opus;
assert.deepStrictEqual(codexOpus, { model: 'gpt-5.4', reasoning_effort: 'xhigh' });
const claudeOpus = RUNTIME_PROFILE_MAP.claude?.opus;
assert.deepStrictEqual(claudeOpus, { model: 'claude-opus-4-6' });
});
});