mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
93eca7bbbf0f8dab501ba6e76fa9db21c2d4dff3
7 Commits
93eca7bbbf
fix(digest): dense-fill topicOf with -1 sentinel + surface missed indices
Greptile P2 on PR #3247: `new Array(top.length)` creates a sparse array.
If a future injected clusterFn doesn't cover every input index,
topicOf[i] would be undefined, which then silently poisons the phase-1
aggregates (topicSize[undefined] / topicMax[undefined]) and degrades
the topic sort without any observable failure.

Fill with -1 so absence is unambiguous, then validate after clusterFn
runs and throw if any index is still -1. The outer try/catch captures
the error and returns {reps: top, topicCount: top.length, error},
matching the existing contract — primary order is preserved, no crash.

No behavior change today: singleLinkCluster's union-find guarantees
every index is covered. This just guards the invariant for future
clusterFn injections.
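A minimal sketch of the sentinel pattern described above (the surrounding function name and shapes are hypothetical; only `topicOf`, `clusterFn`, and the -1 fill follow the commit):

```javascript
// Dense -1 fill: any index clusterFn misses stays -1 and is surfaced
// as an error instead of silently poisoning downstream aggregates.
function assignTopics(top, clusterFn) {
  const topicOf = new Array(top.length).fill(-1); // dense, never sparse
  const clusters = clusterFn(top); // groups of input indices
  clusters.forEach((members, topicId) => {
    for (const i of members) topicOf[i] = topicId;
  });
  const missed = topicOf
    .map((t, i) => (t === -1 ? i : null))
    .filter((i) => i !== null);
  if (missed.length > 0) {
    throw new Error(`clusterFn missed indices: ${missed.join(',')}`);
  }
  return topicOf;
}
```

An outer try/catch (as in the commit) would turn that throw into a returned error while keeping primary order.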
f5205bdb57
fix(digest): truthful typo warn + gate grouping on actual embeddings
Two P2 bugs from round-3 review:

P2a: warn text lied. After the typo→jaccard fix, the warn still said
"defaulting to embed", which is the opposite of what the code now does.
During an outage, operators reading the warn get told the wrong thing.
Updated to "falling back to jaccard (safe rollback path)" to match the
actual behavior.

P2b: shouldGroupTopics gate used a stale signal. Gating on
cfg.mode === 'embed' worked for configured-jaccard (kill switch) but
NOT for runtime Jaccard fallback. When the embed path throws inside
deduplicateStories, it falls back to Jaccard but cfg.mode is still
'embed'. The gate passed, groupTopicsPostDedup ran with an empty
embeddingByHash, and the caller emitted a misleading "topic grouping
failed: missing embedding" warn ON TOP of the legitimate "falling back
to Jaccard" warn.

Ground-truth fix: gate on `embeddingByHash.size > 0`. The Map is the
authoritative signal for "primary embed path produced vectors" —
populated only on success, empty in both fallback paths (configured
and runtime). One gate, both paths clean.

Added regression test: "runtime Jaccard fallback returns empty
embeddingByHash + empty logSummary" — proves the ground-truth invariant
the caller relies on, so a future change can't re-introduce the leak.

Tests: 5915 pass (+1 new). typecheck, typecheck:api, biome clean.
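The stale gate versus the ground-truth gate can be contrasted in a small sketch (`cfg` and `embeddingByHash` follow the names in the message; the functions themselves are illustrative):

```javascript
// Stale gate: still true after a runtime Jaccard fallback, because the
// configured mode never changes when the embed path throws.
function shouldGroupTopicsStale(cfg) {
  return cfg.mode === 'embed';
}

// Ground-truth gate: the Map is populated only when the primary embed
// path produced vectors, so it is empty on BOTH fallback paths.
function shouldGroupTopics(embeddingByHash) {
  return embeddingByHash.size > 0;
}
```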
d234452e5c
fix(digest): two-phase topic sort + typo-safe mode fallback
Two P1/P2 bugs found in post-PR-#3247 review:
P1: groupTopicsPostDedup did NOT guarantee contiguous topic blocks.
The global sort key (topicSize, topicMax, repScore, titleHashHex) fell
through to per-rep repScore when two topics tied on size and max,
interleaving members. Two same-size-same-max topics {A90,A80} and
{B90,B70} output as [A90,B90,A80,B70] — broken contiguity, breaking
the editorial promise of the PR.
Fix: two-phase sort. Phase 1 orders TOPICS by
(topicSize DESC, topicMax DESC, topicTieHash ASC)
where topicTieHash = min titleHash among the topic's members — a
topic-level invariant, not a rep-level one. Phase 2 orders members
within each topic by (repScore DESC, titleHashHex ASC). Concatenate
in topic order. Members of the same topic CANNOT interleave with any
other topic's members. Added regression test with the exact fixture.
P2: DIGEST_DEDUP_MODE typos failed OPEN to embed, not Jaccard. File
header documented "non-{embed,jaccard} → jaccard with warn" but
readOrchestratorConfig mapped typos to mode='embed'. Operator scenario:
during an embed outage sets DIGEST_DEDUP_MODE=jacard — kill switch
silently stays off.
Fix: typo / unrecognised value resolves to mode='jaccard' (safer),
matching the documented contract. invalidModeRaw warn still fires so
operators see the typo. Added 3 parsing tests (typo, garbage, empty).
Tests: 5914 pass (was 5910 + 4 new). typecheck, typecheck:api, biome clean.
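The two-phase sort above can be sketched minimally. The rep shape `{topic, score, titleHash}` is hypothetical (the real code derives topic membership from the clusterer), but the keys match the commit:

```javascript
// Phase 1: order TOPICS by (size DESC, max DESC, min-titleHash ASC).
// Phase 2: order members by (score DESC, titleHash ASC). Concatenating
// in topic order makes interleaving impossible by construction.
function twoPhaseSort(reps) {
  const byTopic = new Map();
  for (const r of reps) {
    if (!byTopic.has(r.topic)) byTopic.set(r.topic, []);
    byTopic.get(r.topic).push(r);
  }
  const topics = [...byTopic.values()].sort((a, b) => {
    if (a.length !== b.length) return b.length - a.length;   // topicSize DESC
    const maxA = Math.max(...a.map((r) => r.score));
    const maxB = Math.max(...b.map((r) => r.score));
    if (maxA !== maxB) return maxB - maxA;                   // topicMax DESC
    const tieA = a.map((r) => r.titleHash).sort()[0];        // topic-level invariant
    const tieB = b.map((r) => r.titleHash).sort()[0];
    return tieA < tieB ? -1 : 1;                             // topicTieHash ASC
  });
  return topics.flatMap((members) =>
    members.sort((x, y) => y.score - x.score || (x.titleHash < y.titleHash ? -1 : 1)),
  );
}
```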
38541c1075
chore(digest): address /ce:review round 1 findings
Fixes 5 findings raised by the multi-agent review of PR #3247:

- #234 P2: Drop dead `deps.log` shim from `deduplicateStories` — caller
  now owns the log line so the param rotted (no test used it
  post-#3247). Removed from JSDoc + signature logic.
- #236 P3: Add defensive warn when `winningIdx === undefined` during
  sidecar `embeddingByHash` population. Shouldn't fire with the current
  `materializeCluster` contract, but catches a future refactor where a
  synthesised rep would silently skip topic grouping.
- #237 P3: Skip `groupTopicsPostDedup` when `cfg.mode === 'jaccard'` —
  the kill-switch path returns an empty `embeddingByHash`, and running
  the secondary pass on it would log a noisy "missing embedding" warn
  every tick. Gate the call site; passthrough primary order.
- #240a P3: Remove dead `top ?? []` fallback after `!Array.isArray(top)`
  already handled the falsy case. Replaced with an explicit Array check
  for the rare "falsy but also not-array" input (defence-in-depth).
- #240b P3: Delete two redundant test blocks — the `titleHashHex
  tiebreak` 2-rep fixture (permutation-invariance at 15-rep scale
  already covers this invariant) and the `caller log-line format (regex
  splice)` describe block (the regex lives in
  seed-digest-notifications.mjs, not brief-dedup; the full-flow
  envelope-cleanliness test exercises the caller end-to-end).

Deferred to follow-up: #235 (structured logParts vs regex splice), #238
(plumb cfg to avoid double env-read), #239 (repo-wide env .trim
pattern).

Tests: 5910 pass (was 5913; -3 redundant tests removed). typecheck,
typecheck:api, biome all clean.
8fe6284c4f
feat(digest): topic-grouped brief ordering (size-first)
Brief composer currently surfaces top-N stories in raw currentScore
DESC order. On topic-dominant news days (e.g. 2026-04-20 20:00 brief)
related stories scatter — 4 Hormuz angles at positions 1/3/8/11 with
unrelated stories wedged between them.

Secondary clustering pass on already-sliced top-30 reps at a looser
cosine threshold (default 0.45), then re-orders by a total key:
(topicSize DESC, topicMax DESC, repScore DESC, titleHashHex ASC). The
dominant thread leads; within-thread order is score DESC; ties are
deterministic. Hidden behind DIGEST_DEDUP_TOPIC_GROUPING (default 1 —
kill switch = '0', no deploy).

Design notes:
- Post-slice placement bounds work to N ≤ 30 (~0.4 ms) and avoids
  reshuffling reps that never surface.
- Sidecar Map<hash, number[]> returned from deduplicateStories — no
  hidden __embedding fields on the user-facing Rep (would otherwise
  risk leaking into the brief envelope).
- groupTopicsPostDedup is pure: no I/O, no logging, injected clusterFn
  for testability. Errors are RETURNED (not thrown) so a helper bug
  cannot cascade into the outer Jaccard fallback boundary.
- Caller owns logging: deduplicateStories returns logSummary; caller
  splices ` topics=N ` after `clusters=M ` via a simple regex, emits
  one log line per tick.

Env:
- DIGEST_DEDUP_TOPIC_GROUPING = '0' disables (default on)
- DIGEST_DEDUP_TOPIC_THRESHOLD = float in (0,1], default 0.45

Tests: 55 in tests/brief-dedup-embedding.test.mjs (was 33, +22 new):
size-first ordering, topicMax tiebreak, within-topic score, titleHash
determinism, kill switch, permutation invariance, empty/singleton,
injected clusterer throws, missing embedding, materialized-rep keying,
envelope cleanliness (JSON.stringify has no _embedding / __ /
embeddingByHash), log-line splice regex, pre-slice input size, and 6
env-parsing cases.

Full verification: npm run test:data (5913 pass), typecheck,
typecheck:api, biome check on changed files — all clean. Pre-existing
main() complexity (74) unchanged.
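The errors-are-returned contract from the design notes can be sketched as follows (names follow the message; the internals are a stand-in for the real grouping logic):

```javascript
// A helper bug degrades to passthrough in primary order instead of
// cascading into the outer Jaccard fallback boundary. The caller
// inspects `error` and logs; the tick never crashes.
function groupTopicsPostDedup(reps, embeddingByHash, clusterFn) {
  try {
    const clusters = clusterFn(reps, embeddingByHash);
    return { reps: clusters.flat(), topicCount: clusters.length, error: null };
  } catch (err) {
    return { reps, topicCount: reps.length, error: err }; // passthrough
  }
}
```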
d1ebc84c6c
feat(digest-dedup): single-link clustering (F1 0.73 vs 0.53 complete-link) (#3234)
Problem
-------
The post-threshold-tuning brief at
/api/brief/user_3BovQ1tYlaz2YIGYAdDPXGFBgKy/2026-04-20-1532 still
showed 4 copies of "US seizes Iranian ship", 3 copies of the Hormuz
closure, and 2 copies of the oil-price story — despite running the
calibrated 0.55 threshold.
Root cause: complete-link is too strict for wire-headline clustering.
Pairwise cosines in the 4-way ship-seizure cluster:
1 <-> 5: 0.632 5 <-> 8: 0.692
1 <-> 8: 0.500 5 <-> 10: 0.656
1 <-> 10: 0.554 8 <-> 10: 0.510
Complete-link requires EVERY pair to clear threshold. Pair 1<->8 at
0.500 fails so the whole 4-way cluster can't form, and all 4 stories
bubble up as separate reps, eating 4 slots of the 12-story brief.
Measured on the 12 real titles from that brief:
Algorithm                  | Clusters | F1    | P    | R
---------------------------|----------|-------|------|-----------------
complete-link @ 0.55 (was) | 7        | 0.526 | 0.56 | 0.50
complete-link @ 0.50       | 6        | 0.435 | 0.38 | 0.50
single-link @ 0.55         | 4        | 0.435 | 0.28 | 1.00  over-merge
single-link @ 0.60         | 6        | 0.727 | 0.67 | 0.80  winner
Change
------
scripts/lib/brief-dedup-embed.mjs:
New singleLinkCluster(items, {cosineThreshold, vetoFn}) using
union-find. Chain merges through strong intermediates when a
direct pair is weak; respects the entity veto (blocked pairs
don't union). O(N^2 alpha(N)); permutation-invariant by
construction.
scripts/lib/brief-dedup.mjs:
New DIGEST_DEDUP_CLUSTERING env var (default 'single', set
'complete' to revert). readOrchestratorConfig returns 'clustering'
field. Dispatch at call site picks the right function. Structured
log line now includes clustering=<algo>.
tests/brief-dedup-embedding.test.mjs:
+8 regressions:
- singleLinkCluster chains the 4-way through a bridge
- veto blocks unions even when cosine passes
- permutation-invariance property test (5 shuffles)
- empty-input
- DIGEST_DEDUP_CLUSTERING default is 'single'
- DIGEST_DEDUP_CLUSTERING=complete kill switch works
- unrecognised values fall back to 'single'
- log line includes clustering=<algo>
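A minimal union-find sketch of the single-link behaviour described above (item shape and the inline cosine over unit vectors are assumptions, not the module's real API):

```javascript
// Single-link via union-find: any pair clearing the threshold unions,
// unless the veto blocks it. Chaining through strong intermediates
// falls out of union-find; result is permutation-invariant.
function singleLinkCluster(items, { cosineThreshold, vetoFn = () => false }) {
  const parent = items.map((_, i) => i);
  const find = (x) => (parent[x] === x ? x : (parent[x] = find(parent[x])));
  const union = (a, b) => { parent[find(a)] = find(b); };
  const cosine = (u, v) => u.reduce((s, ui, k) => s + ui * v[k], 0); // unit vectors
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (vetoFn(items[i], items[j])) continue;  // blocked pairs never union
      if (cosine(items[i].vec, items[j].vec) >= cosineThreshold) union(i, j);
    }
  }
  const groups = new Map();
  items.forEach((_, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(i);
  });
  return [...groups.values()];
}
```

With a weak direct pair (A↔C below threshold) but strong A↔B and B↔C, all three land in one cluster, which is exactly the chaining complete-link forbids.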
Bridge-pollution risk note
--------------------------
The original plan rejected single-link to avoid the Jaccard-era
"bridge pollution" (A~B=0.6, B~C=0.6, A~C=0.3 all chain through a
mixed-topic B). With text-embedding-3-small at cosine >= 0.60, a
bridge must be semantically real — the probe showed a 37% F1 bump
with no new FPs on the production case. Setting
DIGEST_DEDUP_CLUSTERING=complete on Railway is the instant rollback
if a bad day ever surfaces chaining.
Operator activation
-------------------
After merge, on Railway seed-digest-notifications service:
DIGEST_DEDUP_COSINE_THRESHOLD=0.60
No other changes needed — clustering=single is the default.
Verification
------------
- npm run test:data 5825/5825 pass
- tests/brief-dedup-embedding 53/53 pass (45 existing + 8 new)
- typecheck + typecheck:api clean
- biome check on changed files clean
Post-Deploy Monitoring & Validation
-----------------------------------
- Grep '[digest] dedup mode=embed clustering=single' in Railway logs
— confirms the new algo is live
- Expect clusters= to drop further on bulk ticks (stories=700+):
current ~23 on 84-story ticks -> expected ~15-18
- Manually open next brief post-deploy, visually verify ship-seizure
/ Hormuz / oil stories no longer duplicate
- Rollback: DIGEST_DEDUP_CLUSTERING=complete on Railway (instant,
no deploy), next cron tick reverts to old behaviour
- Validation window: 24h
- Owner: koala73
Related
-------
- #3200 embedding-based dedup (introduced complete-link)
- #3224 DIGEST_SCORE_MIN floor (the low-importance half of the fix)
305dc5ef36
feat(digest-dedup): Phase A — embedding-based dedup scaffolding (no-op) (#3200)
* feat(digest-dedup): Phase A — embedding-based dedup scaffolding (no-op)
Replaces the inline Jaccard story-dedup in seed-digest-notifications
with an orchestrator that can run Jaccard, shadow, or full embedding
modes. Ships with DIGEST_DEDUP_MODE=jaccard as the default so
production behaviour is unchanged until Phase C shadow + Phase D flip.
New modules (scripts/lib/):
- brief-dedup-consts.mjs tunables + cache prefix + __constants bag
- brief-dedup-jaccard.mjs verbatim 0.55-threshold extract (fallback)
- entity-gazetteer.mjs cities/regions gazetteer + common-caps
- brief-embedding.mjs OpenRouter /embeddings client with Upstash
cache, all-or-nothing timeout, cosineSimilarity
- brief-dedup-embed.mjs complete-link clustering + entity veto (pure)
- brief-dedup.mjs orchestrator, env read at call entry,
shadow archive, structured log line
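For reference, the conventional cosine similarity the thresholds above refer to (a sketch only; the module's actual implementation may differ):

```javascript
// cos(u, v) = u·v / (|u||v|); 1 for identical direction, 0 orthogonal.
function cosineSimilarity(u, v) {
  let dot = 0, nu = 0, nv = 0;
  for (let k = 0; k < u.length; k++) {
    dot += u[k] * v[k];
    nu += u[k] * u[k];
    nv += v[k] * v[k];
  }
  return dot / (Math.sqrt(nu) * Math.sqrt(nv));
}
```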
Operator tools (scripts/tools/):
- calibrate-dedup-threshold.mjs offline calibration runner + histogram
- golden-pair-validator.mjs live-embedder drift detector (nightly CI)
- shadow-sample.mjs Sample A/B CSV emitter over SCAN archive
Tests:
- brief-dedup-jaccard.test.mjs migrated from regex-harness to direct
import plus orchestrator parity tests (22)
- brief-dedup-embedding.test.mjs 9 plan scenarios incl. 10-permutation
property test, complete-link non-chain (21)
- brief-dedup-golden.test.mjs 20-pair mocked canary (21)
Workflows:
- .github/workflows/dedup-golden-pairs.yml nightly live-embedder canary
(07:17 UTC), opens issue on drift
Deviation from plan: the shouldVeto("Iran closes Hormuz", "Tehran
shuts Hormuz") case can't return true under a single coherent
classification (country-in-A vs capital-in-B sit on different sides
of the actor/location boundary). Gazetteer follows the plan's
"countries are actors" intent; the test is updated to assert false
with a comment pointing at the irreducible capital-country
coreference limitation.
Verification:
- npm run test:data 5825/5825 pass
- tests/edge-functions 171/171 pass
- typecheck + typecheck:api clean
- biome check on new files clean
- lint:md 0 errors
Phase B (calibration), Phase C (shadow), and Phase D (flip) are
subsequent PRs.
* refactor(digest-dedup): address review findings 193-199
Fresh-eyes review found 3 P1s, 3 P2s, and a P3 bundle across
kieran-typescript, security-sentinel, performance-oracle, architecture-
strategist, and code-simplicity reviewers. Fixes below; all 64 dedup
tests + 5825 data tests + 171 edge-function tests still green.
P1 #193 - dedup regex + redis pipeline duplication
- Extract defaultRedisPipeline into scripts/lib/_upstash-pipeline.mjs;
both orchestrator and embedding client import from there.
- normalizeForEmbedding now delegates to stripSourceSuffix from the
Jaccard module so the outlet allow-list is single-sourced.
P1 #194 - embedding timeout floor + negative-budget path
- callEmbeddingsApi throws EmbeddingTimeoutError when timeoutMs<=0
instead of opening a doomed 250ms fetch.
- Removed Math.max(250, ...) floor that let wall-clock cap overshoot.
P1 #195 - dead env getters
- Deleted getMode / isRemoteEmbedEnabled / isEntityVetoEnabled /
getCosineThreshold / getWallClockMs from brief-dedup-consts.mjs
(zero callers; orchestrator reimplements inline).
P2 #196 - orchestrator cleanup bundle
- Removed re-exports at bottom of brief-dedup.mjs.
- Extracted materializeCluster into brief-dedup-jaccard.mjs; both
the fallback and orchestrator use the shared helper.
- Deleted clusterWithEntityVeto wrapper; orchestrator inlines the
vetoFn wiring at the single call site.
- Shadow mode now runs Jaccard exactly once per tick (was twice).
- Fallback warn line carries reason=ErrorName so operators can
filter timeout vs provider vs shape errors.
- Invalid DIGEST_DEDUP_MODE values emit a warn once per run (vs
silently falling to jaccard).
P2 #197 - workflow + shadow-sample hardening
- dedup-golden-pairs.yml body composition no longer relies on a
heredoc that would command-substitute validator stdout. Switched
to printf with sanitised LOG_TAIL (printable ASCII only) and
--body-file so crafted fixture text cannot escape into the runner.
- shadow-sample.mjs Upstash helper enforces a hardcoded command
allowlist (SCAN | GET | EXISTS).
P2 #198 - test + observability polish
- Scenarios 2 and 3 deep-equal returned clusters against the Jaccard
expected shape, not just length. Also assert the reason= field.
P3 #199 - nits
- Removed __constants test-bag; jaccard tests use named imports.
- Renamed deps.apiKey to deps._apiKey in embedding client.
- Added @pre JSDoc on diffClustersByHash about unique-hash contract.
- Deferred: mocked golden-pair test removal, gazetteer JSON migration,
scripts/tools AGENTS.md doc note.
Todos 193-199 moved from pending to complete.
Verification:
- npm run test:data 5825/5825 pass
- tests/edge-functions 171/171 pass
- typecheck + typecheck:api clean
- biome check on changed files clean
* fix(digest-dedup): address Greptile P2 findings on PR #3200
1. brief-embedding.mjs: wrap fetch lookup as
`(...args) => globalThis.fetch(...args)` instead of aliasing bare
`fetch`. Aliasing captures the binding at module-load time, so
later instrumentation / Edge-runtime shims don't see the wrapper —
same class of bug as the banned `fetch.bind(globalThis)` pattern
flagged in AGENTS.md.
2. dedup-golden-pairs.yml: `gh issue create --label "..." || true`
silently swallowed the failure when any of dedup/canary/p1 labels
didn't pre-exist, breaking the drift alert channel while leaving
the job red in the Actions UI. Switched to repeated `--label`
flags + `--create-label` so any missing label is auto-created on
first drift, and dropped the `|| true` so a legitimate failure
(network / auth) surfaces instead of hiding.
Both fixes are P2-style per Greptile (confidence 5/5, no P0/P1);
applied pre-merge so the nightly canary is usable from day one.
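The binding-capture difference in fix 1 can be demonstrated in isolation (a contrived sketch with hypothetical names; the real client simply wraps the call the same way):

```javascript
// Aliasing captures the binding at module-load time; the arrow wrapper
// resolves globalThis.fetch on every call, so later shims are visible.
const aliased = globalThis.fetch;                        // captured now
const wrapped = (...args) => globalThis.fetch(...args);  // looked up per call

const realFetch = globalThis.fetch;
globalThis.fetch = () => 'shimmed';       // e.g. an Edge-runtime shim installed later
const viaWrapper = wrapped();             // the wrapper sees the shim
const aliasIsStale = aliased !== globalThis.fetch; // the alias holds the old binding
globalThis.fetch = realFetch;             // restore
```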
* fix(digest-dedup): two P1s found on PR #3200
P1 — canary classifier must match production
Nightly golden-pair validator was checking a hardcoded threshold
(default 0.60) and always applied the entity veto, while the actual
dedup path at runtime reads DIGEST_DEDUP_COSINE_THRESHOLD and
DIGEST_DEDUP_ENTITY_VETO_ENABLED from env at every call. A Phase
C/D env flip could make the canary green while prod was wrong or
red while prod was healthy, defeating the whole point of a drift
detector.
Fix:
- golden-pair-validator.mjs now calls readOrchestratorConfig(process.env)
— the same helper the orchestrator uses — so any classifier knob
added later is picked up automatically. The threshold and veto-
enabled flags are sourced from env by default; a --threshold CLI
flag still overrides for manual calibration sweeps.
- dedup-golden-pairs.yml sources DIGEST_DEDUP_COSINE_THRESHOLD and
DIGEST_DEDUP_ENTITY_VETO_ENABLED from GitHub repo variables (vars.*),
which operators must keep in lockstep with Railway. The
workflow_dispatch threshold input now defaults to empty; the
scheduled canary always uses the production-parity config.
- Validator log line prints the effective config + source so nightly
output makes the classifier visible.
P1 — shadow archive writes were fail-open
`defaultRedisPipeline()` returns null on timeout / auth / HTTP
failure. `writeShadowArchive()` only had a try/catch, so the null
result was silently treated as success. A Phase C rollout could
log clean "mode=shadow … disagreements=X" lines every tick while
the Upstash archive received zero writes — and Sample B labelling
would then find no batches, silently killing calibration.
Fix:
- writeShadowArchive now inspects the pipeline return. null result,
non-array response, per-command {error}, or a cell without
{result: "OK"} all return {ok: false, reason}.
- Orchestrator emits a warn line with the failure reason, and the
structured log line carries archive_write=ok|failed so operators
can grep for failed ticks.
- Regression test in brief-dedup-embedding.test.mjs simulates the
null-pipeline contract and asserts both the warn and the structured
field land.
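The fail-closed inspection can be sketched as follows (the return shapes — null on transport failure, per-command `{error}` cells, `{result: "OK"}` on success — follow the commit; the function name is a stand-in):

```javascript
// Treat anything other than an array of {result: "OK"} cells as a
// failed archive write, with a greppable reason.
function inspectPipelineResult(result) {
  if (result === null) return { ok: false, reason: 'null-pipeline' };
  if (!Array.isArray(result)) return { ok: false, reason: 'non-array-response' };
  for (const cell of result) {
    if (cell && cell.error) return { ok: false, reason: `command-error:${cell.error}` };
    if (!cell || cell.result !== 'OK') return { ok: false, reason: 'not-ok-cell' };
  }
  return { ok: true };
}
```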
Verification:
- test:data 5825/5825 pass
- dedup suites 65/65 pass (new: archive-fail regression)
- typecheck + api clean
- biome check clean on changed files
* fix(digest-dedup): two more P1s found on PR #3200
P1 — canary must also honour DIGEST_DEDUP_MODE + REMOTE_EMBED_ENABLED
The prior round fixed the threshold/veto knobs but left the canary
running embeddings regardless of whether production could actually
reach the embed path. If Railway has DIGEST_DEDUP_MODE=jaccard or
DIGEST_DEDUP_REMOTE_EMBED_ENABLED=0, production never calls the
classifier, so a drift signal is meaningless — or worse, a live
OpenRouter issue flags the canary while prod is obliviously fine.
Fix:
- golden-pair-validator.mjs reads mode + remoteEmbedEnabled from the
same readOrchestratorConfig() helper the orchestrator uses. When
either says "embed path inactive in prod", the validator logs an
explicit skip line and exits 0. The nightly workflow then shows
green, which is the correct signal ("nothing to drift against").
- A --force CLI flag remains for manual dispatch during staged
rollouts.
- dedup-golden-pairs.yml sources DIGEST_DEDUP_MODE and
DIGEST_DEDUP_REMOTE_EMBED_ENABLED from GitHub repo variables
alongside the threshold and veto-enabled knobs, so all four
classifier gates stay in lockstep with Railway.
- Validator log line now prints mode + remoteEmbedEnabled so the
canary output surfaces which classifier it validated.
P1 — shadow-sample Sample A was biased by SCAN order
enumerate-and-dedup added every seen pair to a dedup key BEFORE
filtering by agreement. If the same pair appeared in an agreeing
batch first and a disagreeing batch later, the disagreeing
occurrence was silently dropped. SCAN order is unspecified, so
Sample A could omit real disagreement pairs.
Fix:
- Extracted the enumeration into a pure `enumeratePairs(archives, mode)`
export so the logic is testable. Mode filter runs BEFORE the dedup
check: agreeing pairs are skipped entirely under
--mode disagreements, so any later disagreeing occurrence can
still claim the dedup slot.
- Added tests/brief-dedup-shadow-sample.test.mjs with 5 regression
cases: agreement-then-disagreement, reversed order (symmetry),
always-agreed omission, population enumeration, cross-batch dedup.
- isMain guard added so importing the module for tests does not
kick off the CLI scan path.
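The filter-before-dedup ordering can be sketched as follows (the archive and pair shapes are assumptions; only the mode-before-dedup ordering follows the commit):

```javascript
// Mode filter runs BEFORE the dedup check, so an agreeing occurrence
// seen first cannot claim the dedup slot away from a later
// disagreeing occurrence of the same pair.
function enumeratePairs(archives, mode) {
  const seen = new Set();
  const out = [];
  for (const batch of archives) {
    for (const pair of batch.pairs) {
      if (mode === 'disagreements' && pair.agreed) continue; // filter first
      const key = [pair.a, pair.b].sort().join('|');         // symmetric key
      if (seen.has(key)) continue;                           // dedup second
      seen.add(key);
      out.push(pair);
    }
  }
  return out;
}
```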
Verification:
- test:data 5825/5825 pass
- dedup suites 70/70 pass (5 new shadow-sample regressions)
- typecheck + api clean
- biome check clean on changed files
Operator follow-up before Phase C:
Set all FOUR dedup repo variables in GitHub alongside Railway:
DIGEST_DEDUP_MODE, DIGEST_DEDUP_REMOTE_EMBED_ENABLED,
DIGEST_DEDUP_COSINE_THRESHOLD, DIGEST_DEDUP_ENTITY_VETO_ENABLED
* refactor(digest-dedup): Railway is the single source of truth for dedup config
Fair user pushback: asking operators to set four DIGEST_DEDUP_*
values in BOTH Railway (where the cron runs) AND GitHub repo
variables (where the canary runs) is architectural debt. Two
copies of the same truth will always drift.
Solution: the digest cron publishes its resolved config to Upstash
on every tick under brief:dedup:config:v1 (2h TTL). The nightly
golden-pair canary reads that key instead of env vars. Railway
stays the sole source of truth; no parallel repo variables to
maintain. A missing/expired key signals "cron hasn't run" and
the canary skips with exit 0 — better than validating against
hardcoded defaults that might diverge from prod.
Changes:
- brief-dedup-consts.mjs: new ACTIVE_CONFIG_KEY + TTL constants.
- brief-dedup.mjs: new publishActiveConfig() fires at the start of
every deduplicateStories() call (before the mode short-circuit,
so jaccard ticks also publish a "mode=jaccard" signal the canary
can read). Fire-and-forget; archive-write error semantics still
apply if the operator wants stricter tracking.
- golden-pair-validator.mjs: removed readOrchestratorConfig(env)
path. Now calls fetchActiveConfigFromUpstash() and either
validates against that config, skips when the embed path is
inactive, or skips when the key is missing (with --force
override for manual dispatch).
- dedup-golden-pairs.yml: dropped the four DIGEST_DEDUP_* env lines
and the corresponding repo-variable dependency. Only the three
Upstash + OpenRouter secrets remain.
- tests: two new regressions assert config is published on every
tick (shadow AND jaccard modes) with the right shape + TTL.
Operator onboarding now takes one action: set the four
DIGEST_DEDUP_* variables on the Railway seed-digest-notifications
service. Nothing to set in GitHub beyond the existing
OPENROUTER_API_KEY / UPSTASH_* secrets.
Verification:
- test:data 5825/5825 pass
- dedup suites 72/72 pass (2 new config-publish regressions)
- typecheck + api clean
- biome check clean on changed files
* refactor(digest-dedup): ship embed directly, drop phases/canary/shadow
User feedback: "i dont need multiple phases and shit, we go directly
to embed". Fair. Ripping out the overengineering I accumulated:
DELETED
- .github/workflows/dedup-golden-pairs.yml (nightly canary)
- scripts/tools/golden-pair-validator.mjs
- scripts/tools/shadow-sample.mjs
- scripts/tools/calibrate-dedup-threshold.mjs
- tests/fixtures/brief-dedup-golden-pairs.json
- tests/brief-dedup-golden.test.mjs
- tests/brief-dedup-shadow-sample.test.mjs
SIMPLIFIED
- brief-dedup.mjs: removed shadow mode, publishActiveConfig,
writeShadowArchive, diffClustersByHash, jaccardRepsToClusterHashes,
and the DIGEST_DEDUP_REMOTE_EMBED_ENABLED knob. MODE is now
binary: `embed` (default) or `jaccard` (instant kill switch).
- brief-dedup-consts.mjs: dropped SHADOW_ARCHIVE_*, ACTIVE_CONFIG_*.
- Default flipped: DIGEST_DEDUP_MODE unset = embed (prod path).
Railway deploy with OPENROUTER_API_KEY set = embeddings live on
next cron tick. Set MODE=jaccard on Railway to revert instantly.
Orchestrator still falls back to Jaccard on any embed-path failure
(timeout, provider outage, missing API key, bad response). Fallback
warn carries reason=<ErrorName>. The cron never fails because
embeddings flaked. All 64 dedup tests + 5825 data tests still green.
Net diff: -1,407 lines.
Operator single action: set OPENROUTER_API_KEY on Railway's
seed-digest-notifications service (already present) and ship. No
GH Actions, no shadow archives, no labelling sprints. If the 0.60
threshold turns out wrong, tune DIGEST_DEDUP_COSINE_THRESHOLD on
Railway — takes effect on next tick, no redeploy.
* fix(digest-dedup): multi-word location phrases in the entity veto
Extractor was whitespace-tokenising and only single-token matching
against LOCATION_GAZETTEER, silently making every multi-word entry
unreachable:
extractEntities("Houthis strike ship in Red Sea")
→ { locations: [], actors: ['houthis','red','sea'] } ✗
shouldVeto("Houthis strike ship in Red Sea",
"US escorts convoy in Red Sea") → false ✗
With MODE=embed as the default, that turned off the main
anti-overmerge safety rail for bodies of water, regions, and
compound city names — exactly the P07-Hormuz / Houthis-Red-Sea
headlines the veto was designed to cover.
Fix: greedy longest-phrase scan with a sliding window. At each
token position try the longest multi-word phrase first (down to
2), require first AND last tokens to be capitalised (so lowercase
prose like "the middle east" doesn't falsely match while headline
"Middle East" does), lowercase connectors in between are fine
("Strait of Hormuz" → phrase "strait of hormuz" ✓). Falls back to
single-token lookup when no multi-word phrase fits.
Now:
extractEntities("Houthis strike ship in Red Sea")
→ { locations: ['red sea'], actors: ['houthis'] } ✓
shouldVeto(Red-Sea-Houthis, Red-Sea-US) → true ✓
Complexity still O(N · MAX_PHRASE_LEN) — MAX_PHRASE_LEN is 4
(longest gazetteer entry: "ho chi minh city"), so this is
effectively O(N).
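The greedy longest-phrase scan can be sketched with a toy gazetteer (the Set contents and function name here are illustrative; the capitalisation rules follow the commit):

```javascript
const GAZETTEER = new Set(['red sea', 'strait of hormuz', 'abu dhabi', 'middle east']);
const MAX_PHRASE_LEN = 4;

// At each position try the longest phrase first (down to 2 tokens);
// first AND last tokens must be capitalised, lowercase connectors in
// between are fine. Falls back to single-token lookup.
function extractLocations(title) {
  const tokens = title.split(/\s+/);
  const capped = (t) => /^[A-Z]/.test(t);
  const found = [];
  let i = 0;
  while (i < tokens.length) {
    let matched = 0;
    for (let len = Math.min(MAX_PHRASE_LEN, tokens.length - i); len >= 2; len--) {
      const slice = tokens.slice(i, i + len);
      if (!capped(slice[0]) || !capped(slice[len - 1])) continue;
      const phrase = slice.join(' ').toLowerCase();
      if (GAZETTEER.has(phrase)) { found.push(phrase); matched = len; break; }
    }
    if (matched === 0 && GAZETTEER.has(tokens[i].toLowerCase())) {
      found.push(tokens[i].toLowerCase()); // single-token fallback
      matched = 1;
    }
    i += Math.max(1, matched);
  }
  return found;
}
```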
Added 5 regression tests covering Red Sea, South China Sea,
Strait of Hormuz (lowercase-connector case), Abu Dhabi, and
New York, plus the Houthis-vs-US veto reproducer from the P1.
All 5825 data tests + 45 dedup tests green; lint + typecheck clean.