worldmonitor/todos/059-pending-p3-disease-keywords-duplicated-in-detect-function.md
Elie Habib e7ba05553d fix(health): disease outbreaks CDC/Outbreak feeds, VPD tracker seed, BOOTSTRAP_KEYS gold standard (#2378)
* feat(panels): Disease Outbreaks, Shipping Stress, Social Velocity, nuclear test site monitoring

- Add HealthService proto with ListDiseaseOutbreaks RPC (WHO + ProMED RSS)
- Add GetShippingStress RPC to SupplyChainService (Yahoo Finance carrier ETFs)
- Add GetSocialVelocity RPC to IntelligenceService (Reddit r/worldnews + r/geopolitics)
- Enrich earthquake seed with Haversine nuclear test-site proximity scoring
- Add 5 nuclear test sites to NUCLEAR_FACILITIES (Punggye-ri, Lop Nur, Novaya Zemlya, Nevada NTS, Semipalatinsk)
- Add shipping stress + social velocity seed loops to ais-relay.cjs
- Add seed-disease-outbreaks.mjs Railway cron script
- Wire all new RPCs: edge functions, handlers, gateway cache tiers, health.js STANDALONE_KEYS/SEED_META

* fix(relay): apply gold standard retry/TTL-extend pattern to shipping-stress and social-velocity seeders

* fix(review): address all PR #2375 review findings

- health.js: shippingStress maxStaleMin 30→45 (3x interval), socialVelocity 20→30 (3x interval)
- health.js: remove shippingStress/diseaseOutbreaks/socialVelocity from ON_DEMAND_KEYS (relay/cron seeds, not on-demand)
- cache-keys.ts: add shippingStress, diseaseOutbreaks, socialVelocity to BOOTSTRAP_CACHE_KEYS
- ais-relay.cjs: stressScore formula 50→40 (neutral market = moderate, not elevated)
- ais-relay.cjs: fetchedAt Date.now() (consistent with other seeders)
- ais-relay.cjs: deduplicate cross-subreddit article URLs in social velocity loop
- seed-disease-outbreaks.mjs: WHO URL → specific DON RSS endpoint (not dead general news feed)
- seed-disease-outbreaks.mjs: validate() requires outbreaks.length >= 1 (reject empty array)
- seed-disease-outbreaks.mjs: stable id using hash(link) not array index
- seed-disease-outbreaks.mjs: RSS regexes use [\s\S]*? for CDATA multiline content
- seed-earthquakes.mjs: Lop Nur coordinates corrected (41.39,89.03 not 41.75,88.35)
- seed-earthquakes.mjs: sourceVersion bumped to usgs-4.5-day-nuclear-v1
- earthquake.proto: fields 8-11 marked optional (distinguish not-enriched from enriched=false/0)
- buf generate: regenerate seismology service stubs

* revert(cache-keys): don't add new keys to bootstrap without frontend consumers

* fix(panels): address all P1/P2/P3 review findings for PR #2375

- proto: add INT64_ENCODING_NUMBER annotation + sebuf import to get_shipping_stress.proto (run make generate)
- bootstrap: register shippingStress (fast), socialVelocity (fast), diseaseOutbreaks (slow) in api/bootstrap.js + cache-keys.ts
- relay: update WIDGET_SYSTEM_PROMPT with new bootstrap keys and live RPCs for health/supply-chain/intelligence
- seeder: remove broken ProMED feed URL (promedmail.org/feed/ returns HTML 404); add 500K size guard to fetchRssItems; replace private COUNTRY_CODE_MAP with shared geo-extract.mjs; remove permanently-empty location field; bump sourceVersion to who-don-rss-v2
- handlers: remove dead .catch from all 3 new RPC handlers; fix stressLevel fallback to low; fix fetchedAt fallback to 0
- services: add fetchShippingStress, disease-outbreaks.ts, social-velocity.ts with getHydratedData consumers

* fix(health): move seeded keys to BOOTSTRAP_KEYS, add VPD tracker seed and feeds

- Reclassify diseaseOutbreaks, shippingStress, socialVelocity from
  STANDALONE_KEYS to BOOTSTRAP_KEYS so health endpoint reports CRIT
  (not WARN) when their seeds miss a cycle
- Add vpdTrackerRealtime and vpdTrackerHistorical to BOOTSTRAP_KEYS
  with SEED_META entries (maxStaleMin: 2880 = 2x daily interval)
- Fix seed-disease-outbreaks: add CDC and Outbreak News Today feeds
  alongside WHO, populate location field from title parsing, fix TTL
  to 259200s (3x daily interval per gold standard)
- Add seed-vpd-tracker.mjs: scrapes Think Global Health VPD Tracker
  bundle (1,827 realtime alerts + 25,960 historical WHO records),
  writes both Redis keys in one runSeed call via extraKeys
- Add review todos 049-059 from PR #2375 code review
2026-03-27 22:47:24 +04:00


status: pending
priority: p3
issue_id: 059
tags: code-review, quality, seeding, disease-outbreaks, duplication, pr-2375
dependencies: none

Problem Statement

scripts/seed-disease-outbreaks.mjs maintains two parallel lists that overlap significantly: a diseaseKeywords array (or constant) and the keyword list embedded inside the detectDisease() function. Any update to supported disease keywords must be made in both places, or the two lists drift out of sync.

Findings

  • File: scripts/seed-disease-outbreaks.mjs. The diseaseKeywords constant and the detectDisease() function both enumerate disease names/keywords
  • Overlap: The function's keyword list appears to be a superset or duplicate of diseaseKeywords
  • Impact: Adding a new disease (e.g., an MPOX variant) requires two edits; omitting one causes inconsistent behavior between code that reads diseaseKeywords directly and code that calls detectDisease()
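
The drift described in the findings can be illustrated with a minimal sketch. The list contents and the name detectDiseaseDuplicated are hypothetical; the real keywords in scripts/seed-disease-outbreaks.mjs are not reproduced in this todo.

```javascript
// Hypothetical contents: the real lists in seed-disease-outbreaks.mjs are not shown here.
const diseaseKeywords = ['mpox', 'ebola', 'cholera'];

// Stand-in for a function that carries its own, separately maintained list.
function detectDiseaseDuplicated(text) {
  // 'measles' was added here but not to diseaseKeywords above.
  const embedded = ['mpox', 'ebola', 'cholera', 'measles'];
  const lower = text.toLowerCase();
  return embedded.find((k) => lower.includes(k)) || null;
}

const title = 'Measles outbreak reported';

// A consumer that reads the array directly misses the new keyword...
const viaArray = diseaseKeywords.some((k) => title.toLowerCase().includes(k));
// ...while a consumer that calls the function finds it.
const viaFunction = detectDiseaseDuplicated(title);

console.log({ viaArray, viaFunction });
```

The two consumers disagree about the same title, which is the inconsistency this todo aims to remove.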

Proposed Solutions

Option A: Remove standalone array, have detectDisease() be the single source (Recommended)

If diseaseKeywords is only used to drive detectDisease(), inline the array into the function and export only the function.

  • Effort: Small (consolidate + verify no other consumers of the array)
  • Risk: Very low
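
A minimal sketch of Option A, assuming the array has no other consumers. The keyword values are illustrative placeholders, not the script's actual list.

```javascript
// Option A sketch: the keyword list lives only inside the function.
// Keyword values are illustrative assumptions, not the script's real list.
function detectDisease(text) {
  const keywords = ['mpox', 'ebola', 'cholera'];
  // Tolerate null/undefined input, which a missing RSS title might produce.
  const lower = String(text ?? '').toLowerCase();
  // Return the first keyword contained in the text, or null if none match.
  return keywords.find((k) => lower.includes(k)) ?? null;
}

// In the .mjs module this function would be the only export,
// e.g. `export { detectDisease };` (omitted so the snippet runs standalone).
console.log(detectDisease('Cholera cases rise in region'));
console.log(detectDisease('Unrelated headline'));
```

With this shape nothing outside the function can read the list, so a second copy cannot drift from it. The trade-off is that any future consumer that needs the raw list would have to go through a new export.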

Option B: Make detectDisease() use the diseaseKeywords array

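// Single source of truth for supported disease keywords (remaining entries elided).
const DISEASE_KEYWORDS = ['mpox', 'ebola', 'cholera' /* , ... */];
// Returns the first keyword found in the text, or null if none match.
function detectDisease(text) {
  const lower = text.toLowerCase(); // lowercase once, not once per keyword
  return DISEASE_KEYWORDS.find((k) => lower.includes(k)) || null;
}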
  • Effort: Trivial
  • Risk: Very low; leaves a clean single source of truth

Acceptance Criteria

  • Disease keyword list exists in exactly one place
  • detectDisease() uses that single list
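
One way to check the second criterion is a small self-test, sketched here under the assumption that the list is kept as a module-level DISEASE_KEYWORDS constant as in Option B. The keyword values and the sample sentence are illustrative only.

```javascript
// Illustrative values; the real list lives in scripts/seed-disease-outbreaks.mjs.
const DISEASE_KEYWORDS = Object.freeze(['mpox', 'ebola', 'cholera']);

function detectDisease(text) {
  const lower = text.toLowerCase();
  return DISEASE_KEYWORDS.find((k) => lower.includes(k)) || null;
}

// Every keyword in the single list should be detectable by detectDisease().
// A keyword that comes back as something else (or null) signals a second list
// or a shadowing entry earlier in the array.
const undetected = DISEASE_KEYWORDS.filter(
  (k) => detectDisease(`Report: ${k} cases`) !== k
);

console.log(undetected.length === 0 ? 'ok' : `undetected: ${undetected.join(', ')}`);
```

A check like this does not prove the list exists in only one place; that part of the criteria still needs a search of the script for duplicate keyword literals.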

Work Log

  • 2026-03-27: Identified by simplicity-reviewer agent during PR #2375 review.