worldmonitor/shared
Elie Habib 81536cb395 feat(brief): source links, LLM descriptions, strip suffix (envelope v2) (#3181)
* feat(brief): source links, LLM descriptions, strip publisher suffix (envelope v2)

Three coordinated fixes to the magazine content pipeline.

1. Headlines were ending with " - AP News" / " | Reuters" etc. because
   the composer passed RSS titles through verbatim. Added
   stripHeadlineSuffix() in brief-compose.mjs. It is conservative:
   it strips only when the trailing token matches primarySource
   (case-insensitive), so a real subtitle that happens to contain a
   dash still survives.

2. Story descriptions were the headline verbatim. Added
   generateStoryDescription to brief-llm.mjs, plumbed into
   enrichBriefEnvelopeWithLLM: one additional LLM call per story,
   cached 24h on a v1 key covering headline, source, severity,
   category, country. Cache hits are revalidated via
   parseStoryDescription so a bad row cannot flow to the envelope.
   Falls through to the cleaned headline on any failure.

3. Source attribution was plain text, no outgoing link. Bumped
   BRIEF_ENVELOPE_VERSION to 2, added BriefStory.sourceUrl. The
   composer now plumbs story:track:v1.link through
   digestStoryToUpstreamTopStory, UpstreamTopStory.primaryLink,
   filterTopStories, BriefStory.sourceUrl. The renderer wraps the
   Source line in an anchor with target=_blank, rel="noopener
   noreferrer", and UTM params (utm_source=worldmonitor,
   utm_medium=brief, utm_campaign=<issueDate>,
   utm_content=story-<rank>). UTM appending is idempotent, and
   publisher-attributed URLs keep their own utm_source.
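
The UTM behaviour in item 3 can be sketched roughly as follows. This is a
minimal illustration, not the renderer's actual code: appendUtm,
escapeAttr, and renderSourceAnchor are hypothetical names, and "fill only
the missing keys" is an assumed way to get both idempotence and
preservation of a publisher's own utm_source.

```javascript
// Hypothetical helper: add UTM params without overwriting existing ones.
function appendUtm(rawUrl, issueDate, rank) {
  const url = new URL(rawUrl);
  const defaults = {
    utm_source: 'worldmonitor',
    utm_medium: 'brief',
    utm_campaign: issueDate,
    utm_content: `story-${rank}`,
  };
  for (const [key, value] of Object.entries(defaults)) {
    // Only fill missing keys: a publisher's utm_source survives, and a
    // second pass over an already-tagged URL changes nothing.
    if (!url.searchParams.has(key)) url.searchParams.set(key, value);
  }
  return url.toString();
}

// Minimal escaping so "&" in the query string is valid inside href="...".
function escapeAttr(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical renderer fragment for the Source line.
function renderSourceAnchor(source, sourceUrl, issueDate, rank) {
  const href = escapeAttr(appendUtm(sourceUrl, issueDate, rank));
  return `<a href="${href}" target="_blank" rel="noopener noreferrer">${escapeAttr(source)}</a>`;
}
```

Calling appendUtm on its own output returns the same string, which is the
idempotence property the renderer tests assert.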

Envelope validation gains a validateSourceUrl step (https/http only,
no userinfo credentials, parseable absolute URL). Stories without a
valid upstream link are dropped by filterTopStories rather than
shipping with an unlinked source.
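
A sketch of what such a validation step might look like, assuming the
standard WHATWG URL parser. The return convention (normalized string or
null) and the keepLinkedStories helper are assumptions for illustration;
the commit only states the three constraints and that unlinked stories are
dropped.

```javascript
// Returns the normalized URL string, or null if the value is rejected.
function validateSourceUrl(value) {
  if (typeof value !== 'string' || value.trim() === '') return null;
  let url;
  try {
    url = new URL(value); // throws on relative or malformed input
  } catch {
    return null;
  }
  // Only web schemes; rejects javascript:, data:, ftp:, and so on.
  if (url.protocol !== 'https:' && url.protocol !== 'http:') return null;
  // No userinfo credentials (https://user:pw@host/...).
  if (url.username !== '' || url.password !== '') return null;
  return url.toString();
}

// Hypothetical stand-in for the drop step in filterTopStories: stories
// without a valid primaryLink never reach the envelope.
function keepLinkedStories(stories) {
  return stories.filter((story) => validateSourceUrl(story.primaryLink) !== null);
}
```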

Tests: renderer tests grow from 30 to 38. New assertions cover UTM
presence on every anchor, HTML-escaping of ampersands in hrefs,
pre-existing UTM preservation, and all four validator rejection
modes. New composer tests cover suffix stripping, link plumb-through,
and v2 drop-on-no-link behaviour. New LLM tests for
generateStoryDescription cover cache hit/miss, revalidation of bad
rows, 24h TTL, and null-on-failure.
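
For illustration, the suffix stripping from item 1 that the composer tests
exercise might look like the sketch below. The separator set (" - " and
" | ") is taken from the examples in this message; the regular expression
and the exact comparison are assumptions, not the code in
brief-compose.mjs.

```javascript
// Strip a trailing " - Publisher" / " | Publisher" only when the trailing
// token equals primarySource (case-insensitive). Anything else is kept.
function stripHeadlineSuffix(title, primarySource) {
  if (typeof title !== 'string' || !primarySource) return title;
  // Greedy first group, so the match lands on the LAST separator.
  const match = title.match(/^(.*\S)\s+[-|]\s+(\S.*)$/);
  if (!match) return title;
  const [, head, tail] = match;
  const sameSource =
    tail.trim().toLowerCase() === String(primarySource).trim().toLowerCase();
  return sameSource ? head : title;
}

stripHeadlineSuffix('Quake strikes region - AP News', 'AP News');
// → 'Quake strikes region'
stripHeadlineSuffix('Storm - live updates', 'Reuters');
// → 'Storm - live updates' (a real subtitle survives)
```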

* fix(brief): v1 back-compat window on renderer + consolidate story hash helper

Two P1/P2 review findings on #3181.

P1 (v1 back-compat). Bumping BRIEF_ENVELOPE_VERSION 1 to 2 made every
v1 envelope still resident in Redis under the 7-day TTL fail
assertBriefEnvelope. The hosted /api/brief route would 404 "expired"
and the /api/latest-brief preview would downgrade to "composing",
breaking already-issued links from the preceding week.

Fix: renderer now accepts SUPPORTED_ENVELOPE_VERSIONS = Set([1, 2])
on READ. BRIEF_ENVELOPE_VERSION stays at 2 and is the only version
the composer ever writes. BriefStory.sourceUrl is required when
version === 2 and absent on v1; when rendering a v1 story the source
line degrades to plain text (no anchor), matching pre-v2 appearance.
Once the 7-day TTL window has passed, the set can shrink to [2] in a
follow-up.
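
The read/write asymmetry can be sketched as follows. The two constants are
named in this commit; the envelope shape ({ version }), the function names,
and the returned descriptor objects are assumptions for illustration only.

```javascript
const BRIEF_ENVELOPE_VERSION = 2; // the only version the composer writes
const SUPPORTED_ENVELOPE_VERSIONS = new Set([1, 2]); // accepted on read

// Hypothetical read-side check: reject anything outside the supported set.
function assertSupportedVersion(envelope) {
  if (!SUPPORTED_ENVELOPE_VERSIONS.has(envelope.version)) {
    const supported = [...SUPPORTED_ENVELOPE_VERSIONS].join(', ');
    throw new Error(
      `unsupported envelope version ${envelope.version} (supported: ${supported})`,
    );
  }
}

// Hypothetical render decision for the Source line. v2 requires sourceUrl;
// v1 degrades to plain text, matching the pre-v2 appearance.
function describeSourceLine(story, version) {
  if (version === 2) {
    if (typeof story.sourceUrl !== 'string' || story.sourceUrl === '') {
      throw new Error('v2 story requires sourceUrl');
    }
    return { kind: 'link', text: story.source, href: story.sourceUrl };
  }
  return { kind: 'text', text: story.source };
}
```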

P2 (hash dedup). hashStoryDescription was byte-identical to hashStory,
inviting silent drift if one prompt gains a field the other forgets.
Consolidated into hashBriefStory. Cache key separation remains via
the distinct prefixes (brief:llm:whymatters:v2:/brief:llm:description:v1:).
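
A rough sketch of the consolidated shape: one hash helper, two key
prefixes. The prefixes and the hashed fields come from this message. The
digest itself is a stand-in (32-bit FNV-1a, chosen only to keep the sketch
import-free); the real helper's hash function is not stated here, and the
key-builder names are hypothetical.

```javascript
// Stand-in digest: 32-bit FNV-1a over UTF-16 code units. NOT the real hash.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

// Single source of truth for which story fields feed the cache key.
function hashBriefStory(story) {
  const material = JSON.stringify([
    story.headline ?? '',
    story.source ?? '',
    story.severity ?? '',
    story.category ?? '',
    story.country ?? '',
  ]);
  return fnv1a(material);
}

// Separation between the two caches comes from the prefix, not the hash.
const WHY_MATTERS_PREFIX = 'brief:llm:whymatters:v2:';
const DESCRIPTION_PREFIX = 'brief:llm:description:v1:';

const whyMattersKey = (story) => WHY_MATTERS_PREFIX + hashBriefStory(story);
const descriptionKey = (story) => DESCRIPTION_PREFIX + hashBriefStory(story);
```

With a shared helper, adding a field to the key material changes both
caches at once, which is the drift the review finding was guarding against.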

Tests: adds 3 v1 back-compat assertions (plain source line, field
validation still runs, defensive sourceUrl check), updates the
version-mismatch assertion to match the new supported-set message.
161/161 pass (was 158). Full test:data 5706/5706.
2026-04-18 21:49:17 +04:00