Files
worldmonitor/todos/056-pending-p3-stable-hash-unnecessary-disease-seed.md
Elie Habib e7ba05553d fix(health): disease outbreaks CDC/Outbreak feeds, VPD tracker seed, BOOTSTRAP_KEYS gold standard (#2378)
* feat(panels): Disease Outbreaks, Shipping Stress, Social Velocity, nuclear test site monitoring

- Add HealthService proto with ListDiseaseOutbreaks RPC (WHO + ProMED RSS)
- Add GetShippingStress RPC to SupplyChainService (Yahoo Finance carrier ETFs)
- Add GetSocialVelocity RPC to IntelligenceService (Reddit r/worldnews + r/geopolitics)
- Enrich earthquake seed with Haversine nuclear test-site proximity scoring
- Add 5 nuclear test sites to NUCLEAR_FACILITIES (Punggye-ri, Lop Nur, Novaya Zemlya, Nevada NTS, Semipalatinsk)
- Add shipping stress + social velocity seed loops to ais-relay.cjs
- Add seed-disease-outbreaks.mjs Railway cron script
- Wire all new RPCs: edge functions, handlers, gateway cache tiers, health.js STANDALONE_KEYS/SEED_META

* fix(relay): apply gold standard retry/TTL-extend pattern to shipping-stress and social-velocity seeders

* fix(review): address all PR #2375 review findings

- health.js: shippingStress maxStaleMin 30→45 (3x interval), socialVelocity 20→30 (3x interval)
- health.js: remove shippingStress/diseaseOutbreaks/socialVelocity from ON_DEMAND_KEYS (relay/cron seeds, not on-demand)
- cache-keys.ts: add shippingStress, diseaseOutbreaks, socialVelocity to BOOTSTRAP_CACHE_KEYS
- ais-relay.cjs: stressScore formula 50→40 (neutral market = moderate, not elevated)
- ais-relay.cjs: fetchedAt Date.now() (consistent with other seeders)
- ais-relay.cjs: deduplicate cross-subreddit article URLs in social velocity loop
- seed-disease-outbreaks.mjs: WHO URL → specific DON RSS endpoint (not dead general news feed)
- seed-disease-outbreaks.mjs: validate() requires outbreaks.length >= 1 (reject empty array)
- seed-disease-outbreaks.mjs: stable id using hash(link) not array index
- seed-disease-outbreaks.mjs: RSS regexes use [\s\S]*? for CDATA multiline content
- seed-earthquakes.mjs: Lop Nur coordinates corrected (41.39,89.03 not 41.75,88.35)
- seed-earthquakes.mjs: sourceVersion bumped to usgs-4.5-day-nuclear-v1
- earthquake.proto: fields 8-11 marked optional (distinguish not-enriched from enriched=false/0)
- buf generate: regenerate seismology service stubs

* revert(cache-keys): don't add new keys to bootstrap without frontend consumers

* fix(panels): address all P1/P2/P3 review findings for PR #2375

- proto: add INT64_ENCODING_NUMBER annotation + sebuf import to get_shipping_stress.proto (run make generate)
- bootstrap: register shippingStress (fast), socialVelocity (fast), diseaseOutbreaks (slow) in api/bootstrap.js + cache-keys.ts
- relay: update WIDGET_SYSTEM_PROMPT with new bootstrap keys and live RPCs for health/supply-chain/intelligence
- seeder: remove broken ProMED feed URL (promedmail.org/feed/ returns HTML 404); add 500K size guard to fetchRssItems; replace private COUNTRY_CODE_MAP with shared geo-extract.mjs; remove permanently-empty location field; bump sourceVersion to who-don-rss-v2
- handlers: remove dead .catch from all 3 new RPC handlers; fix stressLevel fallback to low; fix fetchedAt fallback to 0
- services: add fetchShippingStress, disease-outbreaks.ts, social-velocity.ts with getHydratedData consumers

* fix(health): move seeded keys to BOOTSTRAP_KEYS, add VPD tracker seed and feeds

- Reclassify diseaseOutbreaks, shippingStress, socialVelocity from
  STANDALONE_KEYS to BOOTSTRAP_KEYS so health endpoint reports CRIT
  (not WARN) when their seeds miss a cycle
- Add vpdTrackerRealtime and vpdTrackerHistorical to BOOTSTRAP_KEYS
  with SEED_META entries (maxStaleMin: 2880 = 2x daily interval)
- Fix seed-disease-outbreaks: add CDC and Outbreak News Today feeds
  alongside WHO, populate location field from title parsing, fix TTL
  to 259200s (3x daily interval per gold standard)
- Add seed-vpd-tracker.mjs: scrapes Think Global Health VPD Tracker
  bundle (1,827 realtime alerts + 25,960 historical WHO records),
  writes both Redis keys in one runSeed call via extraKeys
- Add review todos 049-059 from PR #2375 code review
2026-03-27 22:47:24 +04:00

1.9 KiB

status, priority, issue_id, tags, dependencies
status priority issue_id tags dependencies
pending p3 056
code-review
quality
seeding
disease-outbreaks
simplicity
pr-2375

Problem Statement

scripts/seed-disease-outbreaks.mjs implements a custom stableHash function (djb2 variant) to generate IDs for disease outbreak items. Since WHO DON RSS items each have a unique <link> URL (the WHO article URL), using the URL directly as the item ID — or a simple truncation of it — would be stable, readable, and require no custom hashing code.

Findings

  • File: scripts/seed-disease-outbreaks.mjsstableHash(title + pubDate) used to generate item IDs
  • Each WHO DON item has: a unique <link> field (e.g., https://www.who.int/emergencies/disease-outbreak-news/item/...)
  • The WHO item URL slug is already a stable unique identifier — no hash needed
  • Impact: Custom hash function adds ~10 lines of unnecessary code; URL-based IDs would be human-readable in Redis and easier to debug

Proposed Solutions

Option A: Use WHO item URL slug as ID (Recommended)

// Instead of: id: stableHash(title + pubDate)
// Use: id: link.split('/').pop() || stableHash(title)
const id = item.link?.split('/item/')[1]?.replace(/[^a-z0-9-]/gi, '') || stableHash(title);
  • Effort: Trivial
  • Risk: Very low — IDs are stable as long as WHO URL structure doesn't change (they have been stable for years)

Option B: Remove stableHash, use title + pubDate substring

Generate IDs from a truncated, URL-encoded version of the title + date without a hash function.

  • Effort: Trivial
  • Risk: Very low

Acceptance Criteria

  • stableHash function removed or replaced with simpler ID generation
  • Item IDs remain stable across re-runs (same item → same ID)

Work Log

  • 2026-03-27: Identified by simplicity-reviewer agent during PR #2375 review.