mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* fix(insights): trust cluster rank, stop LLM from re-picking top story

  WORLD BRIEF panel published "Iran's new supreme leader was seriously wounded, leading him to delegate power to the Revolutionary Guards. This development comes amid an ongoing war with Israel." to every visitor for 3h. Payload: openrouter / gemini-2.5-flash.

  Root cause: callLLM sent all 10 clustered headlines with "pick the ONE most significant and summarize ONLY that story". Clustering ranked the Lebanon journalist killing #1 (2 corroborating sources); the News24 Iran rumor ranked #3 (1 source). Gemini overrode the rank, picked #3, and embellished with war framing from story #4. Objective rank (sourceCount, velocity, isAlert) lost to model vibe.

  Fix: shrink the LLM's job to phrasing. Clustering already ranks — pass only topStories[0].primaryTitle and instruct the model to rewrite it using ONLY facts from the headline. No name/place/context invention.

  Also:
  - temperature 0.3 -> 0.1 (factual summary, not creative)
  - CACHE_TTL 3h -> 30m so a bad brief ages out in one cron cycle
  - drop dead MAX_HEADLINES const

  Payload shape unchanged; frontend untouched.

* fix(insights): corroboration gate + revert TTL + drop unconditional WHERE

  Follow-up to review feedback on the ranking contract, TTL, and prompt:

  1. Corroboration gate (P1a). scoreImportance() in scripts/_clustering.mjs is keyword-heavy (violence +125 on a single word, flashpoint +75, ^1.5 multiplier when both hit), so a single-source sensational rumor can outrank a 2-source lead purely on lexical signals. Blindly trusting topStories[0] would let the ranker's keyword bias still pick bad stories. Walk topStories for sourceCount >= 2 instead — corroboration becomes a hard requirement, not a tiebreaker. If no cluster qualifies, publish status=degraded with no brief (the frontend already handles this).

  2. CACHE_TTL back to 10800 (P1b). A 30m TTL equals one cron cadence, so the key expires on any missed or delayed run and /api/bootstrap loses insights entirely (api/bootstrap.js reads news:insights:v1 directly, with no LKG across a TTL gap). The short TTL was defense-in-depth against bad content; the real safety is now upstream (corroboration gate + grounded prompt), so the LKG window doesn't need to be sacrificed for it.

  3. Prompt: location conditional (P2). "Use ONLY facts present" + "Lead with WHAT happened and WHERE" conflicted for headlines without an explicit location and pushed the model toward inferred-place hallucination. Replaced with "Include a location, person, or organization ONLY if it appears in the headline."

* test(insights): lock corroboration gate + grounded-prompt invariants

  Review P2: the corroboration gate and the prompt's no-invention rules had no tests, so future edits to selectTopStories() ordering or prompt text could silently reintroduce the original hallucination.

  Extract the brief-selection helper and prompt builders into a pure module (scripts/_insights-brief.mjs) so tests can import them without triggering seed-insights.mjs's top-level runSeed() call:
  - pickBriefCluster(topStories) returns the first sourceCount >= 2 cluster
  - briefSystemPrompt(dateISO) returns the system prompt
  - briefUserPrompt(headline) returns the user prompt

  Regression tests (tests/seed-insights-brief.test.mjs, 12 cases) lock:
  - pickBriefCluster skips single-source rumors even when ranked above a multi-sourced lead (explicit regression: News24 Iran supreme leader 2026-04-23 scenario with realistic scores)
  - pickBriefCluster tolerates missing/null entries
  - briefSystemPrompt forbids invented facts and proper nouns
  - briefSystemPrompt's "location" rule is conditional (no unconditional "Lead with WHAT and WHERE" directive that would push the model toward place inference when the headline has no location)
  - briefSystemPrompt does not contain "pick the most important" style language (ranking is done by pickBriefCluster upstream)
  - briefUserPrompt passes the headline verbatim and instructs "only facts from this headline"

  Also fix a misleading comment on CACHE_TTL: corroboration is gated at brief-selection time, not on the topStories payload itself (which still includes single-source clusters rendered as the headline list).

  test:data: 6657/6657 pass (was 6645; +12).
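The extracted module itself is not shown on this page. A minimal sketch consistent with the contract described in the commits (function names and prompt wording inferred from the commit messages and the test invariants; the real scripts/_insights-brief.mjs may differ) could look like:

```javascript
// Hypothetical sketch of scripts/_insights-brief.mjs, reconstructed from the
// commit messages above; not the actual implementation.

function pickBriefCluster(topStories) {
  if (!Array.isArray(topStories)) return null;
  for (const cluster of topStories) {
    // Corroboration gate: a missing sourceCount defaults to 1, so clusters
    // with unknown corroboration are never briefed on.
    if (cluster && (cluster.sourceCount ?? 1) >= 2) return cluster;
  }
  return null; // caller publishes status=degraded with no brief
}

function briefSystemPrompt(dateISO) {
  return [
    `You write a one-sentence news brief. Today is ${dateISO}.`,
    'Use ONLY facts present in the headline.',
    'Do not invent proper nouns, numbers, or context.',
    'Include a location, person, or organization ONLY if it appears in the headline.',
  ].join(' ');
}

function briefUserPrompt(headline) {
  return `Rewrite this headline as a single-sentence brief, using only facts from this headline:\n${headline}`;
}

// The real module would export these named bindings:
// export { pickBriefCluster, briefSystemPrompt, briefUserPrompt };
```

Note that the sketch deliberately contains no ranking language ("pick the most significant") and no unconditional "WHERE" directive, matching the invariants the test file below locks in.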
105 lines
3.7 KiB
JavaScript
import { describe, it } from 'node:test';
import assert from 'node:assert/strict';
import {
  pickBriefCluster,
  briefSystemPrompt,
  briefUserPrompt,
} from '../scripts/_insights-brief.mjs';

describe('pickBriefCluster', () => {
  it('returns null for empty/non-array input', () => {
    assert.equal(pickBriefCluster([]), null);
    assert.equal(pickBriefCluster(null), null);
    assert.equal(pickBriefCluster(undefined), null);
  });

  it('returns null when every cluster is single-source', () => {
    const top = [
      { sourceCount: 1, primaryTitle: 'A' },
      { sourceCount: 1, primaryTitle: 'B' },
    ];
    assert.equal(pickBriefCluster(top), null);
  });

  it('returns the first cluster with sourceCount >= 2', () => {
    const top = [
      { sourceCount: 1, primaryTitle: 'A' },
      { sourceCount: 3, primaryTitle: 'B' },
      { sourceCount: 2, primaryTitle: 'C' },
    ];
    assert.equal(pickBriefCluster(top).primaryTitle, 'B');
  });

  it('skips a higher-ranked single-source rumor for a lower-ranked multi-sourced lead (regression: News24 Iran supreme leader 2026-04-23)', () => {
    const top = [
      {
        sourceCount: 1,
        primaryTitle: 'Iran new supreme leader seriously wounded, delegates power to Revolutionary Guards',
        importanceScore: 350,
      },
      {
        sourceCount: 2,
        primaryTitle: 'Lebanon leaders accuse Israel of war crime after journalist killed',
        importanceScore: 300,
      },
    ];
    const picked = pickBriefCluster(top);
    assert.ok(picked, 'expected a multi-source cluster to be picked');
    assert.match(picked.primaryTitle, /Lebanon/);
    assert.doesNotMatch(picked.primaryTitle, /supreme leader/);
  });

  it('treats a missing sourceCount as 1 (safe default — do not brief on unknown corroboration)', () => {
    const top = [
      { primaryTitle: 'A' }, // no sourceCount field
      { sourceCount: 2, primaryTitle: 'B' },
    ];
    assert.equal(pickBriefCluster(top).primaryTitle, 'B');
  });

  it('tolerates a null/undefined entry without throwing', () => {
    const top = [null, undefined, { sourceCount: 2, primaryTitle: 'A' }];
    assert.equal(pickBriefCluster(top).primaryTitle, 'A');
  });
});

describe('briefSystemPrompt', () => {
  const prompt = briefSystemPrompt('2026-04-24');

  it('includes the injected date', () => {
    assert.match(prompt, /2026-04-24/);
  });

  it('forbids inventing facts absent from the headline', () => {
    assert.match(prompt, /Use ONLY facts present/);
    assert.match(prompt, /Do not invent proper nouns/);
  });

  it('makes location conditional — no unconditional "WHERE" directive', () => {
    // Regression: P2 review finding. "Lead with WHAT happened and WHERE" + "use ONLY facts"
    // conflicted for headlines with no location, pushing the model to confabulate one.
    assert.doesNotMatch(prompt, /Lead with WHAT happened and WHERE/);
    assert.match(prompt, /ONLY if it appears in the headline/);
  });

  it('does not ask the LLM to rank/pick from multiple headlines', () => {
    // Regression: the original prompt said "Pick the ONE most significant headline".
    // Ranking is now done by pickBriefCluster upstream.
    assert.doesNotMatch(prompt, /Pick the ONE most significant/);
    assert.doesNotMatch(prompt, /Each numbered headline/i);
    assert.doesNotMatch(prompt, /summarize ONLY that story/i);
  });
});

describe('briefUserPrompt', () => {
  it('passes the headline verbatim', () => {
    const headline = 'Iran launches missile strikes on targets in Syria';
    const out = briefUserPrompt(headline);
    assert.ok(out.includes(headline));
  });

  it('instructs using only facts from the provided headline', () => {
    assert.match(briefUserPrompt('X'), /only facts from this headline/i);
  });
});