worldmonitor/.gitignore
Elie Habib 9c14820c69 fix(digest): brief filter-drop instrumentation + cache-key correctness (#3387)
* fix(digest): include sensitivity in digestFor cache key

buildDigest filters by rule.sensitivity BEFORE dedup, but digestFor
memoized only on (variant, lang, windowStart). Stricter-sensitivity
users in a shared bucket inherited the pool built by whichever
looser-sensitivity user populated the cache first, producing the wrong
story set and defeating downstream topic-grouping adjacency once
filterTopStories re-applied sensitivity.

Solution 1 from docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md.
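
A minimal sketch of the cache-key fix. The digestFor name and the key
fields come from this commit; buildDigest here is a hypothetical
stand-in for the real pool builder, and the key layout is illustrative:

```javascript
// Hypothetical stand-in for the real pool builder.
function buildDigest({ variant, lang, windowStart, sensitivity }) {
  return { variant, lang, windowStart, sensitivity, stories: [] };
}

const digestCache = new Map();

function digestFor(variant, lang, windowStart, sensitivity) {
  // Before the fix the key omitted sensitivity, so a stricter-sensitivity
  // user in the same (variant, lang, windowStart) bucket reused the pool
  // built for a looser sensitivity.
  const key = `${variant}:${lang}:${windowStart}:${sensitivity ?? 'high'}`;
  if (!digestCache.has(key)) {
    digestCache.set(key, buildDigest({ variant, lang, windowStart, sensitivity }));
  }
  return digestCache.get(key);
}
```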

* feat(digest): instrument per-user filterTopStories drops

Adds an optional onDrop metrics callback to filterTopStories and threads
it through composeBriefFromDigestStories. The seeder aggregates counts
per composed brief and emits one structured log line per user per tick:

  [digest] brief filter drops user=<id> sensitivity=<s> in=<count>
    dropped_severity=<n> dropped_url=<n> dropped_headline=<n>
    dropped_shape=<n> out=<count>

This telemetry decides whether the conditional Solution 3 (post-filter
regroup) is warranted by quantifying how often post-group filter drops
puncture multi-member topics in production. No behaviour change for
callers that omit onDrop.

Solution 0 from docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md.
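
A sketch of how the seeder-side aggregation might produce the log line
above. The line format and counter names are from this commit; the
wrapper shape and the compose-callback signature are assumptions:

```javascript
// Aggregates per-brief drop counts and emits one structured log line.
// `compose` stands in for composeBriefFromDigestStories with onDrop threaded
// through; its (stories, onDrop) signature is assumed for illustration.
function composeWithDropLog(userId, sensitivity, stories, compose) {
  const drops = { severity: 0, url: 0, headline: 0, shape: 0 };
  const out = compose(stories, (reason) => { drops[reason] += 1; });
  console.log(
    `[digest] brief filter drops user=${userId} sensitivity=${sensitivity} ` +
    `in=${stories.length} dropped_severity=${drops.severity} dropped_url=${drops.url} ` +
    `dropped_headline=${drops.headline} dropped_shape=${drops.shape} out=${out.length}`
  );
  return out;
}
```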

* fix(digest): close two Sol-0 instrumentation gaps from code review

Review surfaced two P2 gaps in the filter-drop telemetry that weakened
its diagnostic purpose for Sol-3 gating:

1. Cap-truncation silent drop: filterTopStories broke on
   `out.length >= maxStories` BEFORE the onDrop emit sites, so up to
   (DIGEST_MAX_ITEMS - MAX_STORIES_PER_USER) stories per user were
   invisible. Added a 'cap' reason to DropMetricsFn and emit one event
   per skipped story so `in - out - sum(dropped_*) == 0` reconciles.

2. Wipeout invisibility: composeAndStoreBriefForUser only logged drop
   stats for the WINNING candidate. When every candidate composed to
   null, the log line never fired — exactly the wipeout case Sol-0
   was meant to surface. Now tracks per-candidate drops and emits an
   aggregate `outcome=wipeout` line covering all attempts.
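
The cap fix can be sketched as follows. filterTopStories and the drop
reasons are from this PR; the predicates, story shape, and default limit
are illustrative assumptions:

```javascript
// Emit a 'cap' drop for each story skipped past the limit instead of
// breaking before the emit sites, so in - out - sum(dropped_*) reconciles.
function filterTopStories(stories, { maxStories = 5, onDrop } = {}) {
  const out = [];
  for (const story of stories) {
    if (out.length >= maxStories) { onDrop?.('cap', story); continue; } // was: break
    if (!['critical', 'high'].includes(story.severity)) { onDrop?.('severity', story); continue; }
    if (!story.url) { onDrop?.('url', story); continue; }
    out.push(story);
  }
  return out;
}
```

With 'cap' counted, every input story is either shipped or attributed to
a drop reason, which is the reconciliation property the review asked for.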

Also tightens the digest-cache-key sensitivity regex test to anchor
inside the cache-key template literal (it would otherwise match the
unrelated `chosenCandidate.sensitivity ?? 'high'` in the new log line).

PR review residuals from
docs/plans/2026-04-24-004-fix-brief-topic-adjacency-defects-plan.md
ce-code-review run 20260424-232911-37a2d5df.

* chore: ignore .context/ ce-code-review run artifacts

The ce-code-review skill writes per-run artifacts (reviewer JSON,
synthesis.md, metadata.json) under .context/compound-engineering/.
These are local-only — neither tracked nor linted.

* fix(digest): emit per-attempt filter-drop rows, not per-user

Addresses two PR #3387 review findings:

- P2: Earlier candidates that composed to null (wiped out by post-group
  filtering) had their dropStats silently discarded when a later
  candidate shipped — exactly the signal Sol-0 was meant to surface.
- P3: outcome=wipeout row was labeled with allCandidateDrops[0]
  .sensitivity, misleading when candidates within one user have
  different sensitivities.

Fix: emit one structured row per attempted candidate, tagged with that
candidate's own sensitivity and variant. Outcome is shipped|rejected.
A wipeout is now detectable as "all rows for this user are rejected
within the tick" — no aggregate-row ambiguity. Removes the
allCandidateDrops accumulator entirely.
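
A sketch of the per-attempt emission. The row fields and the
shipped|rejected outcomes come from this commit; the candidate shape,
compose-callback signature, and logger are assumptions:

```javascript
// One structured row per attempted candidate, tagged with that candidate's
// own sensitivity and variant. `compose` stands in for the real composer.
function composeAndStoreBriefForUser(userId, candidates, compose, log) {
  for (const candidate of candidates) {
    const drops = { severity: 0, url: 0, headline: 0, shape: 0, cap: 0 };
    const brief = compose(candidate, (reason) => { drops[reason] += 1; });
    log({
      userId,
      sensitivity: candidate.sensitivity ?? 'high',
      variant: candidate.variant,
      outcome: brief ? 'shipped' : 'rejected',
      drops,
    });
    if (brief) return brief; // first candidate that composes wins
  }
  return null; // all rows rejected for this user => wipeout is visible
}
```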

* fix(digest): align composeBriefFromDigestStories sensitivity default to 'high'

Addresses PR #3387 review (P2): composeBriefFromDigestStories defaulted
to `?? 'all'` while buildDigest, the digestFor cache key, and the new
per-attempt log line all default to `?? 'high'`. The mismatch is
harmless in production (the live cron path pre-filters the pool) but:

- A non-prefiltered caller with undefined sensitivity would silently
  ship medium/low stories.
- Per-attempt telemetry labels the attempt as `sensitivity=high` while
  compose actually applied 'all' — operators are misled.

Aligning compose to 'high' makes the four sites agree and the telemetry
honest. Production output is byte-identical (input pool was already
'high'-filtered upstream).

Adds 3 regression tests asserting the new default: critical/high admitted,
medium/low dropped, and onDrop fires reason=severity for the dropped
levels (locks in alignment with per-attempt telemetry).
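
The whole alignment reduces to one expression. The `?? 'high'` idiom is
from this commit; the helper name is illustrative (the real code inlines
the expression at each of the four sites):

```javascript
// A missing rule sensitivity now means 'high' everywhere:
// buildDigest, the digestFor cache key, compose, and the per-attempt log.
const effectiveSensitivity = (rule) => rule.sensitivity ?? 'high';
```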

* fix(digest): align remaining sensitivity defaults to 'high'

Addresses PR #3387 review (P2 + P3): three more sites still defaulted
missing sensitivity to 'all' while compose/buildDigest/cache/log now
treat it as 'high'.

P2 — compareRules (scripts/lib/brief-compose.mjs:35-36): the rank
function used to default to 'all', placing legacy undefined-sensitivity
rules FIRST in the candidate order. Compose then applied a 'high'
filter to them, shipping a narrow brief while an explicit 'all' rule
for the same user was never tried. Aligned to 'high' so the rank
matches what compose actually applies.
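
A sketch of the re-aligned rank. The ordering facts (explicit 'all'
first, undefined ties with explicit 'high', updatedAt as the tiebreak)
come from the regression tests described below; the field names, numeric
ranks, and tiebreak direction are assumptions:

```javascript
// Broader sensitivity sorts first so an explicit 'all' rule is tried
// before any 'high'-equivalent rule.
const SENSITIVITY_RANK = { all: 0, high: 1 };

function compareRules(a, b) {
  // Undefined sensitivity now ranks as 'high', matching what compose applies.
  const ra = SENSITIVITY_RANK[a.sensitivity ?? 'high'];
  const rb = SENSITIVITY_RANK[b.sensitivity ?? 'high'];
  if (ra !== rb) return ra - rb;                   // explicit 'all' first
  return (b.updatedAt ?? 0) - (a.updatedAt ?? 0);  // newer rule wins ties (assumed direction)
}
```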

P3 — enrichBriefEnvelopeWithLLM (scripts/lib/brief-llm.mjs:526):
the digest prompt and cache key still used 'all' for legacy rules,
misleading personalization ("Reader sensitivity level: all" while the
brief contains only critical/high stories) and busting the cache for
legacy vs explicit-'all' rows that should share entries.

Also aligns the @deprecated composeBriefForRule (line 164) for
consistency, since tests still import it.

3 new regression tests in tests/brief-composer-rule-dedup.test.mjs
lock in the new ranking: explicit 'all' beats undefined-sensitivity,
undefined-sensitivity ties with explicit 'high' (decided by updatedAt),
and groupEligibleRulesByUser candidate order respects the rank.

6853/6853 tests pass (was 6850 → +3).
2026-04-25 00:23:29 +04:00


node_modules/
.idea/
dist/
public/blog/
.DS_Store
*.log
.env
.env.local
.playwright-mcp/
*-network.txt
.vercel
api/\[domain\]/v1/\[rpc\].js
api/\[\[...path\]\].js
.claude/
.cursor/
CLAUDE.md
.env.vercel-backup
.env.vercel-export
.agent/
.factory/
.windsurf/
skills/
!api/skills/
ideas/
docs/internal/
docs/ideation/
internal/
# Exception: api/internal/ hosts Vercel edge endpoints that must be tracked
# (e.g. api/internal/brief-why-matters.ts — RELAY_SHARED_SECRET-auth'd
# endpoints for internal callers like the Railway digest cron).
# Scoped to SOURCE FILE TYPES ONLY so the parent `.env` / secrets ignore
# rules stay in effect inside this directory. Do NOT widen to `**`.
!api/internal/
!api/internal/*.ts
!api/internal/*.js
!api/internal/*.mjs
test-results/
src-tauri/sidecar/node/*
!src-tauri/sidecar/node/.gitkeep
# AI planning session state
.planning/
# Compiled sebuf gateway bundle (built by scripts/build-sidecar-sebuf.mjs)
api/[[][[].*.js
# Compiled sidecar domain handler bundles (built by scripts/build-sidecar-handlers.mjs)
api/*/v1/\[rpc\].js
.claudedocs/
# Large generated data files (reproduced by scripts/)
scripts/data/pizzint-processed.json
scripts/data/osm-military-processed.json
scripts/data/military-bases-final.json
scripts/data/dedup-dropped-pairs.json
scripts/data/pizzint-partial.json
scripts/data/gpsjam-latest.json
scripts/data/mirta-raw.geojson
scripts/data/osm-military-raw.json
scripts/data/forecast-replays/
# Iran events data (sensitive, not for public repo)
scripts/data/iran-events-latest.json
# Military bases rebuild script (references external Supabase URLs)
scripts/rebuild-military-bases.mjs
.wrangler
# Build artifacts (generated by esbuild/tsc, not source code)
api/data/city-coords.js
# OpenAPI bundle copied at build time from docs/api/ for native Vercel serve
/public/openapi.yaml
# Runtime artifacts (generated by sidecar/tools, not source code)
api-cache.json
verbose-mode.json
skills-lock.json
tmp/
.context/
# Local planning documents (not for public repo)
docs/plans/
docs/brainstorms/
playground-pricing.html