mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
P1 — False-positive PARITY REGRESSION for AI-digest opt-out users
(scripts/seed-digest-notifications.mjs)
When rule.aiDigestEnabled === false, briefLead is intentionally
null (no summary in channel bodies), but envLead still reads the
envelope's stub lead. The string comparison null !== '<stub lead>'
fired channels_equal=false on every tick for every opted-out user
— flooding the parity log with noise and risking the PARITY
REGRESSION alert becoming useless.
The WARN was already gated by `briefLead && envLead` (so no Sentry
flood), but the LOG line still misled operators counting
channels_equal=false. Gate the entire parity-log block on the same
condition that governs briefLead population:
if (AI_DIGEST_ENABLED && rule.aiDigestEnabled !== false) {
// … parity log + warn …
}
Opt-out users now produce no parity-log line at all (correct —
there's no canonical synthesis to compare against).
P4 — greetingBucket '' fallback semantics
(scripts/lib/brief-llm.mjs)
Doc-only — Greptile flagged that unrecognised greetings collapse
to '' (a single bucket). Added a comment clarifying this is
intentional: '' is a stable fourth bucket, not a sentinel for
"missing data". A user whose greeting flips between recognised
and unrecognised values gets different cache keys, which is
correct (those produce visibly different leads).
Other Greptile findings (no code change — replied via PR comment):
- P2 (double-fetch in [...sortedDue, ...sortedAll]): already
addressed in helper extraction commit df3563080 of PR #3396 —
see `seen` Set dedupe at scripts/lib/digest-orchestration-helpers.mjs:103.
- P2 (parity check no-op for opted-in): outdated as written —
after 5d10cee86's per-rule synthesis, briefLead is per-rule and
envLead is winner-rule's envelope.lead. They diverge for
non-winner rules (legitimate); agree for winner rules (cache-
shared via generateDigestProse). The check still serves its
documented purpose for cache-drift detection.
Stacked on the merged PR #3396; opens as a follow-up since the
parent branch is now closed.
Test results: 7012/7012 (was 7006 pre-rebase onto post-merge main).
This commit is contained in:
@@ -344,6 +344,16 @@ const DIGEST_PROSE_SYSTEM_BASE =
|
||||
* locales; the bucket collapses them to one of three slots so the
|
||||
* cache key only changes when the time-of-day window changes.
|
||||
*
|
||||
* Unrecognised greetings (locale-specific phrases the keyword
|
||||
* heuristic doesn't match, empty strings after locale changes,
|
||||
* non-string inputs) collapse to the literal `''` slot. This is
|
||||
* INTENTIONAL — it's a stable fourth bucket, not a sentinel for
|
||||
* "missing data". A user whose greeting flips between a recognised
|
||||
* value (e.g. "Good morning") and an unrecognised one (e.g. a
|
||||
* locale-specific phrase) will get different cache keys, which is
|
||||
* correct: those produce visibly different leads. Greptile P2 on
|
||||
* PR #3396 raised the visibility, kept the behaviour.
|
||||
*
|
||||
* @param {string|null|undefined} greeting
|
||||
* @returns {'morning' | 'afternoon' | 'evening' | ''}
|
||||
*/
|
||||
|
||||
@@ -1767,7 +1767,13 @@ async function main() {
|
||||
console.log(
|
||||
`[digest] Sent ${stories.length} stories to ${rule.userId} (${rule.variant}, ${rule.digestMode})`,
|
||||
);
|
||||
// Parity observability. Two distinct properties to track:
|
||||
// Parity observability. Gated on AI_DIGEST_ENABLED + per-rule
|
||||
// aiDigestEnabled — without this guard, opt-out users (briefLead
|
||||
// is intentionally null) trigger PARITY REGRESSION every tick
|
||||
// (null !== '<envelope stub lead>'), flooding Sentry with
|
||||
// false positives. Greptile P1 on PR #3396.
|
||||
//
|
||||
// Two distinct properties to track:
|
||||
//
|
||||
// 1. CHANNEL parity (load-bearing): for ONE send, every channel
|
||||
// body of THIS rule (email HTML + plain text + Telegram +
|
||||
@@ -1777,41 +1783,44 @@ async function main() {
|
||||
//
|
||||
// 2. WINNER parity (informational): when `winner_match=true`,
|
||||
// THIS rule is the same one the magazine envelope was
|
||||
// composed from — so channel lead == magazine lead. When
|
||||
// `winner_match=false`, this is a non-winner rule send;
|
||||
// channel lead reflects this rule's pool while the magazine
|
||||
// URL points at the winner's editorial. Expected divergence,
|
||||
// not a regression.
|
||||
// composed from — so channel lead == magazine lead (cache-
|
||||
// shared via generateDigestProse). When `winner_match=false`,
|
||||
// this is a non-winner rule send; channel lead reflects this
|
||||
// rule's pool while the magazine URL points at the winner's
|
||||
// editorial. Expected divergence, not a regression.
|
||||
//
|
||||
// PARITY REGRESSION fires only when winner_match=true AND the
|
||||
// channel lead differs from the envelope lead (the canonical-
|
||||
// synthesis contract has actually broken).
|
||||
const envLead = brief?.envelope?.data?.digest?.lead ?? '';
|
||||
const winnerVariant = brief?.chosenVariant ?? '';
|
||||
const winnerMatch = winnerVariant === (rule.variant ?? 'full');
|
||||
const channelsEqual = briefLead === envLead;
|
||||
const publicLead = brief?.envelope?.data?.digest?.publicLead ?? '';
|
||||
console.log(
|
||||
`[digest] brief lead parity user=${rule.userId} ` +
|
||||
`rule=${rule.variant ?? 'full'}:${rule.sensitivity ?? 'high'}:${rule.lang ?? 'en'} ` +
|
||||
`winner_match=${winnerMatch} ` +
|
||||
`synthesis_level=${synthesisLevel} ` +
|
||||
`exec_len=${(briefLead ?? '').length} ` +
|
||||
`brief_lead_len=${envLead.length} ` +
|
||||
`channels_equal=${channelsEqual} ` +
|
||||
`public_lead_len=${publicLead.length}`,
|
||||
);
|
||||
if (winnerMatch && !channelsEqual && briefLead && envLead) {
|
||||
// Sentry alert candidate — winner_match=true means this rule
|
||||
// composed the envelope, so its channel lead MUST match the
|
||||
// envelope lead. Mismatch = canonical-synthesis cache drift
|
||||
// or code regression. Logged loudly so Sentry's console-
|
||||
// breadcrumb hook surfaces it without an explicit
|
||||
// captureMessage call.
|
||||
console.warn(
|
||||
`[digest] PARITY REGRESSION user=${rule.userId} — winner-rule channel lead != envelope lead. ` +
|
||||
`Investigate: cache drift between compose pass and send pass?`,
|
||||
// synthesis cache row has drifted between compose and send
|
||||
// passes — a real contract break).
|
||||
if (AI_DIGEST_ENABLED && rule.aiDigestEnabled !== false) {
|
||||
const envLead = brief?.envelope?.data?.digest?.lead ?? '';
|
||||
const winnerVariant = brief?.chosenVariant ?? '';
|
||||
const winnerMatch = winnerVariant === (rule.variant ?? 'full');
|
||||
const channelsEqual = briefLead === envLead;
|
||||
const publicLead = brief?.envelope?.data?.digest?.publicLead ?? '';
|
||||
console.log(
|
||||
`[digest] brief lead parity user=${rule.userId} ` +
|
||||
`rule=${rule.variant ?? 'full'}:${rule.sensitivity ?? 'high'}:${rule.lang ?? 'en'} ` +
|
||||
`winner_match=${winnerMatch} ` +
|
||||
`synthesis_level=${synthesisLevel} ` +
|
||||
`exec_len=${(briefLead ?? '').length} ` +
|
||||
`brief_lead_len=${envLead.length} ` +
|
||||
`channels_equal=${channelsEqual} ` +
|
||||
`public_lead_len=${publicLead.length}`,
|
||||
);
|
||||
if (winnerMatch && !channelsEqual && briefLead && envLead) {
|
||||
// Sentry alert candidate — winner_match=true means this rule
|
||||
// composed the envelope, so its channel lead MUST match the
|
||||
// envelope lead. Mismatch = canonical-synthesis cache drift
|
||||
// or code regression. Logged loudly so Sentry's console-
|
||||
// breadcrumb hook surfaces it without an explicit
|
||||
// captureMessage call.
|
||||
console.warn(
|
||||
`[digest] PARITY REGRESSION user=${rule.userId} — winner-rule channel lead != envelope lead. ` +
|
||||
`Investigate: cache drift between compose pass and send pass?`,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user