Files
worldmonitor/tests/brief-why-matters-analyst.test.mjs
Elie Habib 2f19d96357 feat(brief): route whyMatters through internal analyst-context endpoint (#3248)
* feat(brief): route whyMatters through internal analyst-context endpoint

The brief's "why this is important" callout currently calls Gemini on
only {headline, source, threatLevel, category, country} with no live
state. The LLM can't know whether a ceasefire is on day 2 or day 50,
that IMF flagged >90% gas dependency in UAE/Qatar/Bahrain, or what
today's forecasts look like. Output is generic prose instead of the
situational analysis WMAnalyst produces when given live context.

This PR adds an internal Vercel edge endpoint that reuses a trimmed
variant of the analyst context (country-brief, risk scores, top-3
forecasts, macro signals, market data — no GDELT, no digest-search)
and ships it through a one-sentence LLM call with the existing
WHY_MATTERS_SYSTEM prompt. The endpoint owns its own Upstash cache
(v3 prefix, 6h TTL), supports a shadow mode that runs both paths in
parallel for offline diffing, and is auth'd via RELAY_SHARED_SECRET.

Three-layer graceful degradation (endpoint → legacy Gemini-direct →
stub) keeps the brief shipping on any failure.

Env knobs:
- BRIEF_WHY_MATTERS_PRIMARY=analyst|gemini (default: analyst; typo → gemini)
- BRIEF_WHY_MATTERS_SHADOW=0|1 (default: 1; only '0' disables)
- BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT=0..100 (default: 100)
- BRIEF_WHY_MATTERS_ENDPOINT_URL (Railway, optional override)

Cache keys:
- brief:llm:whymatters:v3:{hash16} — envelope {whyMatters, producedBy,
  at}, 6h TTL. Endpoint-owned.
- brief:llm:whymatters:shadow:v1:{hash16} — {analyst, gemini, chosen,
  at}, 7d TTL. Fire-and-forget.
- brief:llm:whymatters:v2:{hash16} — legacy. Cron's fallback path
  still reads/writes during the rollout window; expires in ≤24h.

Tests: 6022 pass (existing 5915 + 12 core + 36 endpoint + misc).
typecheck + typecheck:api + biome on changed files clean.

Plan (Codex-approved after 4 rounds):
docs/plans/2026-04-21-001-feat-brief-why-matters-analyst-endpoint-plan.md

* fix(brief): address /ce:review round 1 findings on PR #3248

Fixes 5 findings from multi-agent review, 2 of them P1:

- #241 P1: `.gitignore !api/internal/**` was too broad — it re-included
  `.env`, `.env.local`, and any future secret file dropped into that
  directory. Narrowed to explicit source extensions (`*.ts`, `*.js`,
  `*.mjs`) so parent `.env` / secrets rules stay in effect inside
  api/internal/.

- #242 P1: `Dockerfile.digest-notifications` did not COPY
  `shared/brief-llm-core.js` + `.d.ts`. Cron would have crashed at
  container start with ERR_MODULE_NOT_FOUND. Added alongside
  brief-envelope + brief-filter COPY lines.

- #243 P2: Cron dropped the endpoint's source/producedBy ground-truth
  signal, violating PR #3247's own round-3 memory
  (feedback_gate_on_ground_truth_not_configured_state.md). Added
  structured log at the call site: `[brief-llm] whyMatters source=<src>
  producedBy=<pb> hash=<h>`. Endpoint response now includes `hash` so
  log + shadow-record pairs can be cross-referenced.

- #244 P2: Defense-in-depth prompt-injection hardening. Story fields
  flowed verbatim into both LLM prompts, bypassing the repo's
  sanitizeForPrompt convention. Added sanitizeStoryFields helper and
  applied in both analyst and gemini paths.

- #245 P2: Removed redundant `validate` option from callLlmReasoning.
  With only openrouter configured in prod, a parse-reject walked the
  provider chain, then fell through to the other path (same provider),
  then the cron's own fallback (same model) — 3x billing on one reject.
  Post-call parseWhyMatters check already handles rejection cleanly.

Deferred to P3 follow-ups (todos 246-248): singleflight, v2 sunset,
misc polish (country-normalize LOC, JSDoc pruning, shadow waitUntil,
auto-sync mirror, context-assembly caching).

Tests: 6022 pass. typecheck + typecheck:api clean.

* fix(brief-why-matters): ctx.waitUntil for shadow write + sanitize legacy fallback

Two P2 findings on PR #3248:

1. Shadow record was fire-and-forget without ctx.waitUntil on an Edge
   function. Vercel can terminate the isolate after response return,
   so the background redisPipeline write completes unreliably — i.e.
   the rollout-validation signal the shadow keys were supposed to
   provide was flaky in production.

   Fix: accept an optional EdgeContext 2nd arg. Build the shadow
   promise up front (so it starts executing immediately) then register
   it with ctx.waitUntil when present. Falls back to plain unawaited
   execution when ctx is absent (local harness / tests).

2. scripts/lib/brief-llm.mjs legacy fallback path called
   buildWhyMattersPrompt(story) on raw fields with no sanitization.
   The analyst endpoint sanitizes before its own prompt build, but
   the fallback is exactly what runs when the endpoint misses /
   errors — so hostile headlines / sources reached the LLM verbatim
   on that path.

   Fix: local sanitizeStoryForPrompt wrapper imports sanitizeForPrompt
   from server/_shared/llm-sanitize.js (existing pattern — see
   scripts/seed-digest-notifications.mjs:41). Wraps story fields
   before buildWhyMattersPrompt. Cache key unchanged (hash is over raw
   story), so cache parity with the analyst endpoint's v3 entries is
   preserved.

Regression guard: new test asserts the fallback prompt strips
"ignore previous instructions", "### Assistant:" line prefixes, and
`<|im_start|>` tokens when injection-crafted fields arrive.

Typecheck + typecheck:api clean. 6023 / 6023 data tests pass.

* fix(digest-cron): COPY server/_shared/llm-sanitize into digest-notifications image

Reviewer P1 on PR #3248: my previous commit (4eee22083) added
`import sanitizeForPrompt from server/_shared/llm-sanitize.js` to
scripts/lib/brief-llm.mjs, but Dockerfile.digest-notifications cherry-
picks server/_shared/* files and doesn't copy llm-sanitize. Import is
top-level/static — the container would crash at module load with
ERR_MODULE_NOT_FOUND the moment seed-digest-notifications.mjs pulls in
scripts/lib/brief-llm.mjs. Not just on fallback — every startup.

Fix: add `COPY server/_shared/llm-sanitize.js server/_shared/llm-sanitize.d.ts`
next to the existing brief-render COPY line. Module is pure string
manipulation with zero transitive imports — nothing else needs to land.

Cites feedback_validation_docker_ship_full_scripts_dir.md in the comment
next to the COPY; the cherry-pick convention keeps biting when new
cross-dir imports land in scripts/lib/ or scripts/shared/.

Can't regression-test at build time from this branch without a
docker-build CI job, but the symptom is deterministic — local runs
remain green (they resolve against the live filesystem); only the
container crashes. Post-merge, Railway redeploy of seed-digest-
notifications should show a clean `Starting Container` log line
instead of the MODULE_NOT_FOUND crash my prior commit would have caused.
2026-04-21 14:03:27 +04:00

513 lines
19 KiB
JavaScript
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
/**
* Integration tests for the /api/internal/brief-why-matters edge endpoint
* + the cron's analyst-priority fallback chain.
*
* The endpoint is a .ts file; we test the pure helpers that go into it
* (country normalizer, core hashing, prompt builder, context trim, env
* parsing) plus simulate the handler end-to-end via the imported
* modules. The cron-side `generateWhyMatters` priority chain is covered
* directly via in-process dep injection.
*
* Run: node --test tests/brief-why-matters-analyst.test.mjs
*/
import { describe, it } from 'node:test';
import assert from 'node:assert/strict';
import { generateWhyMatters } from '../scripts/lib/brief-llm.mjs';
import {
hashBriefStory,
parseWhyMatters,
WHY_MATTERS_SYSTEM,
} from '../shared/brief-llm-core.js';
// ── Story fixture matching the cron's actual payload shape
// (shared/brief-filter.js:134-135). ────────────────────────────────────
function story(overrides = {}) {
return {
headline: 'Iran closes Strait of Hormuz',
source: 'Reuters',
threatLevel: 'critical',
category: 'Geopolitical Risk',
country: 'IR',
...overrides,
};
}
// ── Country normalizer ───────────────────────────────────────────────────
describe('normalizeCountryToIso2', () => {
let normalize;
it('loads from server/_shared/country-normalize.ts via tsx or compiled', async () => {
// The module is .ts; in the repo's test setup, node 22 can load .ts
// via tsx. If direct import fails under the test runner, fall back
// to running the logic inline by importing the JSON and a mirror
// function. The logic is trivial so this isn't a flaky compromise.
try {
const mod = await import('../server/_shared/country-normalize.ts');
normalize = mod.normalizeCountryToIso2;
} catch {
const { default: COUNTRY_NAMES } = await import('../shared/country-names.json', {
with: { type: 'json' },
});
const ISO2_SET = new Set(Object.values(COUNTRY_NAMES));
normalize = (raw) => {
if (typeof raw !== 'string') return null;
const trimmed = raw.trim();
if (trimmed === '') return null;
if (trimmed.toLowerCase() === 'global') return null;
if (/^[A-Za-z]{2}$/.test(trimmed)) {
const upper = trimmed.toUpperCase();
return ISO2_SET.has(upper) ? upper : null;
}
const lookup = COUNTRY_NAMES[trimmed.toLowerCase()];
return typeof lookup === 'string' ? lookup : null;
};
}
assert.ok(typeof normalize === 'function');
});
it('passes through valid ISO2 case-insensitively', () => {
assert.equal(normalize('US'), 'US');
assert.equal(normalize('us'), 'US');
assert.equal(normalize('IR'), 'IR');
assert.equal(normalize('gb'), 'GB');
});
it('resolves full names case-insensitively', () => {
assert.equal(normalize('United States'), 'US');
assert.equal(normalize('united states'), 'US');
assert.equal(normalize('Iran'), 'IR');
assert.equal(normalize('United Kingdom'), 'GB');
});
it("'Global' sentinel maps to null (non-country; not an error)", () => {
assert.equal(normalize('Global'), null);
assert.equal(normalize('global'), null);
assert.equal(normalize('GLOBAL'), null);
});
it('rejects unknown / empty / undefined / non-string inputs', () => {
assert.equal(normalize(''), null);
assert.equal(normalize(' '), null);
assert.equal(normalize('Nowhere'), null);
assert.equal(normalize(undefined), null);
assert.equal(normalize(null), null);
assert.equal(normalize(123), null);
});
it('resolves common non-ISO2 abbreviations when they exist in the gazetteer', () => {
// Plan assumed "USA" was not in the gazetteer; it actually is mapped.
// This exercises the full-name-path (3+ chars) with a short abbreviation.
assert.equal(normalize('USA'), 'US');
});
it('rejects ISO2-shaped values not in the gazetteer', () => {
assert.equal(normalize('ZZ'), null); // structurally valid, not in gazetteer
assert.equal(normalize('XY'), null);
});
});
// ── Cache-key stability ──────────────────────────────────────────────────
describe('cache key identity', () => {
it('hashBriefStory stable across the 5-field material', async () => {
const a = await hashBriefStory(story());
const b = await hashBriefStory(story());
assert.equal(a, b);
});
it('hashBriefStory differs when any hash-field differs', async () => {
const baseline = await hashBriefStory(story());
for (const f of ['headline', 'source', 'threatLevel', 'category', 'country']) {
const h = await hashBriefStory(story({ [f]: `${story()[f]}X` }));
assert.notEqual(h, baseline, `${f} must be part of cache identity`);
}
});
});
// ── Deterministic shadow sampling ────────────────────────────────────────
describe('shadow sample deterministic hashing', () => {
// Mirror of the endpoint's sample decision — any drift between this
// and the endpoint would silently halve the sampled population.
function sampleHit(hash16, pct) {
if (pct >= 100) return true;
if (pct <= 0) return false;
const bucket = Number.parseInt(hash16.slice(0, 8), 16) % 100;
return bucket < pct;
}
it('pct=100 always hits', () => {
for (const h of ['0000000000000000', 'ffffffffffffffff', 'abcdef0123456789']) {
assert.equal(sampleHit(h, 100), true);
}
});
it('pct=0 never hits', () => {
for (const h of ['0000000000000000', 'ffffffffffffffff', 'abcdef0123456789']) {
assert.equal(sampleHit(h, 0), false);
}
});
it('pct=25 hits approximately 25% on a bulk sample, and is deterministic', async () => {
let hits = 0;
const N = 400;
const seen = new Map();
for (let i = 0; i < N; i++) {
const h = await hashBriefStory(story({ headline: `fixture-${i}` }));
const first = sampleHit(h, 25);
const second = sampleHit(h, 25);
assert.equal(first, second, `hash ${h} must give the same decision`);
seen.set(h, first);
if (first) hits++;
}
// Tolerance: uniform mod-100 on SHA-256 prefix should be tight.
assert.ok(hits > N * 0.15, `expected > 15% hits, got ${hits}`);
assert.ok(hits < N * 0.35, `expected < 35% hits, got ${hits}`);
});
});
// ── `generateWhyMatters` analyst-priority chain ─────────────────────────
describe('generateWhyMatters — analyst priority', () => {
const VALID = 'Closure of the Strait of Hormuz would spike global oil prices and force a US naval response.';
it('uses the analyst endpoint result when it returns a string', async () => {
let callLlmInvoked = false;
const out = await generateWhyMatters(story(), {
callAnalystWhyMatters: async () => VALID,
callLLM: async () => {
callLlmInvoked = true;
return 'FALLBACK unused';
},
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, VALID);
assert.equal(callLlmInvoked, false, 'legacy callLLM must NOT fire when analyst returns');
});
it('falls through to legacy chain when analyst returns null', async () => {
let callLlmInvoked = false;
const out = await generateWhyMatters(story(), {
callAnalystWhyMatters: async () => null,
callLLM: async () => {
callLlmInvoked = true;
return VALID;
},
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, VALID);
assert.equal(callLlmInvoked, true, 'legacy callLLM must fire after analyst miss');
});
it('falls through when analyst returns unparseable prose (parser rejection)', async () => {
let callLlmInvoked = false;
const out = await generateWhyMatters(story(), {
// Too short — fails parseWhyMatters length gate (< 30 chars).
callAnalystWhyMatters: async () => 'Short.',
callLLM: async () => {
callLlmInvoked = true;
return VALID;
},
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, VALID);
assert.equal(callLlmInvoked, true, 'unparseable analyst output must trigger fallback');
});
it('falls through when analyst throws', async () => {
let callLlmInvoked = false;
const out = await generateWhyMatters(story(), {
callAnalystWhyMatters: async () => {
throw new Error('network timeout');
},
callLLM: async () => {
callLlmInvoked = true;
return VALID;
},
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, VALID);
assert.equal(callLlmInvoked, true);
});
it('returns null when BOTH layers fail (caller uses stub)', async () => {
const out = await generateWhyMatters(story(), {
callAnalystWhyMatters: async () => null,
callLLM: async () => null,
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, null);
});
it('no callAnalystWhyMatters dep → legacy chain runs directly (backcompat)', async () => {
let callLlmInvoked = false;
const out = await generateWhyMatters(story(), {
callLLM: async () => {
callLlmInvoked = true;
return VALID;
},
cacheGet: async () => null,
cacheSet: async () => {},
});
assert.equal(out, VALID);
assert.equal(callLlmInvoked, true);
});
});
// ── Body validation (simulated — same rules as endpoint's
// validateStoryBody) ────────────────────────────────────────────────────
describe('endpoint validation contract', () => {
// Mirror of the endpoint's validation so unit tests don't need the
// full edge runtime. Any divergence would surface as a cross-suite
// test regression on the endpoint flow (see "endpoint end-to-end" below).
const VALID_THREAT = new Set(['critical', 'high', 'medium', 'low']);
const CAPS = { headline: 400, source: 120, category: 80, country: 80 };
const MAX_BODY_BYTES = 4096;
function validate(raw) {
if (!raw || typeof raw !== 'object') return { ok: false, msg: 'body' };
const s = raw.story;
if (!s || typeof s !== 'object') return { ok: false, msg: 'body.story' };
for (const f of ['headline', 'source', 'category']) {
if (typeof s[f] !== 'string' || s[f].length === 0) return { ok: false, msg: f };
if (s[f].length > CAPS[f]) return { ok: false, msg: `${f}-length` };
}
if (typeof s.threatLevel !== 'string' || !VALID_THREAT.has(s.threatLevel)) {
return { ok: false, msg: 'threatLevel' };
}
if (s.country !== undefined) {
if (typeof s.country !== 'string') return { ok: false, msg: 'country' };
if (s.country.length > CAPS.country) return { ok: false, msg: 'country-length' };
}
return { ok: true };
}
function measureBytes(obj) {
return new TextEncoder().encode(JSON.stringify(obj)).byteLength;
}
it('accepts a valid payload', () => {
assert.deepEqual(validate({ story: story() }), { ok: true });
});
it('rejects threatLevel="info" (not in the 4-value enum)', () => {
const out = validate({ story: story({ threatLevel: 'info' }) });
assert.equal(out.ok, false);
assert.equal(out.msg, 'threatLevel');
});
it('accepts free-form category (no allowlist)', () => {
for (const cat of ['General', 'Geopolitical Risk', 'Market Activity', 'Humanitarian Crisis']) {
assert.deepEqual(validate({ story: story({ category: cat }) }), { ok: true });
}
});
it('rejects category exceeding length cap', () => {
const long = 'x'.repeat(81);
const out = validate({ story: story({ category: long }) });
assert.equal(out.ok, false);
assert.equal(out.msg, 'category-length');
});
it('rejects empty required fields', () => {
for (const f of ['headline', 'source', 'category']) {
const out = validate({ story: story({ [f]: '' }) });
assert.equal(out.ok, false);
assert.equal(out.msg, f);
}
});
it('accepts empty country + country="Global" + missing country', () => {
assert.deepEqual(validate({ story: story({ country: '' }) }), { ok: true });
assert.deepEqual(validate({ story: story({ country: 'Global' }) }), { ok: true });
const { country: _, ...withoutCountry } = story();
assert.deepEqual(validate({ story: withoutCountry }), { ok: true });
});
it('body cap catches oversize payloads (both Content-Length and post-read)', () => {
const bloated = {
story: {
...story(),
// Artificial oversize payload — would need headline cap bypassed
// to reach in practice, but the total body-byte cap must still fire.
extra: 'x'.repeat(5000),
},
};
assert.ok(measureBytes(bloated) > MAX_BODY_BYTES, 'fixture is oversize');
// Note: body-cap is enforced at the handler level, not the validator.
// We assert the invariant about the measure here; the handler path is
// covered by the endpoint smoke test below.
});
});
// ── Prompt builder shape ──────────────────────────────────────────────
describe('buildAnalystWhyMattersPrompt — shape and budget', () => {
let builder;
it('loads', async () => {
const mod = await import('../server/worldmonitor/intelligence/v1/brief-why-matters-prompt.ts');
builder = mod.buildAnalystWhyMattersPrompt;
assert.ok(typeof builder === 'function');
});
it('reuses WHY_MATTERS_SYSTEM verbatim', () => {
const { system } = builder(story(), {
worldBrief: 'X',
countryBrief: '',
riskScores: '',
forecasts: '',
marketData: '',
macroSignals: '',
degraded: false,
});
assert.equal(system, WHY_MATTERS_SYSTEM);
});
it('includes the story fields in the same 5-line format', () => {
const { user } = builder(story(), {
worldBrief: '',
countryBrief: '',
riskScores: '',
forecasts: '',
marketData: '',
macroSignals: '',
degraded: false,
});
assert.match(user, /Headline: Iran closes Strait of Hormuz/);
assert.match(user, /Source: Reuters/);
assert.match(user, /Severity: critical/);
assert.match(user, /Category: Geopolitical Risk/);
assert.match(user, /Country: IR/);
assert.match(user, /One editorial sentence on why this matters:$/);
});
it('omits context block when all fields empty', () => {
const { user } = builder(story(), {
worldBrief: '',
countryBrief: '',
riskScores: '',
forecasts: '',
marketData: '',
macroSignals: '',
degraded: false,
});
assert.doesNotMatch(user, /# Live WorldMonitor Context/);
});
it('truncates context to stay under budget', () => {
const hugeContext = {
worldBrief: 'x'.repeat(5000),
countryBrief: 'y'.repeat(5000),
riskScores: 'z'.repeat(5000),
forecasts: 'w'.repeat(5000),
marketData: 'v'.repeat(5000),
macroSignals: 'u'.repeat(5000),
degraded: false,
};
const { user } = builder(story(), hugeContext);
// Total user prompt should be bounded. Per plan: context budget ~1700
// + story fields + footer ~250 → under 2.5KB.
assert.ok(user.length < 2500, `prompt should be bounded; got ${user.length} chars`);
});
});
// ── Env flag parsing (endpoint config resolution) ─────────────────────
describe('endpoint env flag parsing', () => {
// Mirror the endpoint's readConfig logic so a drift between this
// expectation and the handler fails one test suite.
function readConfig(env) {
const rawPrimary = (env.BRIEF_WHY_MATTERS_PRIMARY ?? '').trim().toLowerCase();
let primary;
let invalidPrimaryRaw = null;
if (rawPrimary === '' || rawPrimary === 'analyst') primary = 'analyst';
else if (rawPrimary === 'gemini') primary = 'gemini';
else {
primary = 'gemini';
invalidPrimaryRaw = rawPrimary;
}
const shadowEnabled = env.BRIEF_WHY_MATTERS_SHADOW !== '0';
const rawSample = env.BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT;
let samplePct = 100;
let invalidSamplePctRaw = null;
if (rawSample !== undefined && rawSample !== '') {
const parsed = Number.parseInt(rawSample, 10);
if (Number.isInteger(parsed) && parsed >= 0 && parsed <= 100 && String(parsed) === rawSample.trim()) {
samplePct = parsed;
} else {
invalidSamplePctRaw = rawSample;
}
}
return { primary, invalidPrimaryRaw, shadowEnabled, samplePct, invalidSamplePctRaw };
}
it('defaults: primary=analyst, shadow=on, sample=100', () => {
const c = readConfig({});
assert.equal(c.primary, 'analyst');
assert.equal(c.shadowEnabled, true);
assert.equal(c.samplePct, 100);
});
it('PRIMARY=gemini is honoured (kill switch)', () => {
const c = readConfig({ BRIEF_WHY_MATTERS_PRIMARY: 'gemini' });
assert.equal(c.primary, 'gemini');
});
it('PRIMARY=analust (typo) falls back to gemini + invalidPrimaryRaw set', () => {
const c = readConfig({ BRIEF_WHY_MATTERS_PRIMARY: 'analust' });
assert.equal(c.primary, 'gemini');
assert.equal(c.invalidPrimaryRaw, 'analust');
});
it('SHADOW disabled only by exact "0"', () => {
for (const v of ['yes', '1', 'true', '', 'on']) {
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW: v }).shadowEnabled, true, `value=${v}`);
}
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW: '0' }).shadowEnabled, false);
});
it('SAMPLE_PCT accepts integer 0100; invalid → 100', () => {
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '25' }).samplePct, 25);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '0' }).samplePct, 0);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '100' }).samplePct, 100);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '101' }).samplePct, 100);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: 'foo' }).samplePct, 100);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '-5' }).samplePct, 100);
assert.equal(readConfig({ BRIEF_WHY_MATTERS_SHADOW_SAMPLE_PCT: '50.5' }).samplePct, 100);
});
});
// ── Gemini path prompt parity snapshot ────────────────────────────────
describe('Gemini path prompt parity', () => {
it('buildWhyMattersPrompt output is stable (frozen snapshot)', async () => {
const { buildWhyMattersPrompt } = await import('../scripts/lib/brief-llm.mjs');
const { system, user } = buildWhyMattersPrompt(story());
// Snapshot — if either the system prompt or the user prompt shape
// changes, the endpoint's gemini-path output will drift from the
// cron's pre-PR output. Bump BRIEF_WHY_MATTERS_PRIMARY=gemini
// rollout risk accordingly.
assert.match(system, /ONE concise sentence \(1830 words\)/);
assert.equal(
user.split('\n').slice(0, 5).join('\n'),
[
'Headline: Iran closes Strait of Hormuz',
'Source: Reuters',
'Severity: critical',
'Category: Geopolitical Risk',
'Country: IR',
].join('\n'),
);
assert.ok(user.endsWith('One editorial sentence on why this matters:'));
});
});