Files
worldmonitor/todos/059-pending-p3-disease-keywords-duplicated-in-detect-function.md
Elie Habib e7ba05553d fix(health): disease outbreaks CDC/Outbreak feeds, VPD tracker seed, BOOTSTRAP_KEYS gold standard (#2378)
* feat(panels): Disease Outbreaks, Shipping Stress, Social Velocity, nuclear test site monitoring

- Add HealthService proto with ListDiseaseOutbreaks RPC (WHO + ProMED RSS)
- Add GetShippingStress RPC to SupplyChainService (Yahoo Finance carrier ETFs)
- Add GetSocialVelocity RPC to IntelligenceService (Reddit r/worldnews + r/geopolitics)
- Enrich earthquake seed with Haversine nuclear test-site proximity scoring
- Add 5 nuclear test sites to NUCLEAR_FACILITIES (Punggye-ri, Lop Nur, Novaya Zemlya, Nevada NTS, Semipalatinsk)
- Add shipping stress + social velocity seed loops to ais-relay.cjs
- Add seed-disease-outbreaks.mjs Railway cron script
- Wire all new RPCs: edge functions, handlers, gateway cache tiers, health.js STANDALONE_KEYS/SEED_META

* fix(relay): apply gold standard retry/TTL-extend pattern to shipping-stress and social-velocity seeders

* fix(review): address all PR #2375 review findings

- health.js: shippingStress maxStaleMin 30→45 (3x interval), socialVelocity 20→30 (3x interval)
- health.js: remove shippingStress/diseaseOutbreaks/socialVelocity from ON_DEMAND_KEYS (relay/cron seeds, not on-demand)
- cache-keys.ts: add shippingStress, diseaseOutbreaks, socialVelocity to BOOTSTRAP_CACHE_KEYS
- ais-relay.cjs: stressScore formula 50→40 (neutral market = moderate, not elevated)
- ais-relay.cjs: fetchedAt Date.now() (consistent with other seeders)
- ais-relay.cjs: deduplicate cross-subreddit article URLs in social velocity loop
- seed-disease-outbreaks.mjs: WHO URL → specific DON RSS endpoint (not dead general news feed)
- seed-disease-outbreaks.mjs: validate() requires outbreaks.length >= 1 (reject empty array)
- seed-disease-outbreaks.mjs: stable id using hash(link) not array index
- seed-disease-outbreaks.mjs: RSS regexes use [\s\S]*? for CDATA multiline content
- seed-earthquakes.mjs: Lop Nur coordinates corrected (41.39,89.03 not 41.75,88.35)
- seed-earthquakes.mjs: sourceVersion bumped to usgs-4.5-day-nuclear-v1
- earthquake.proto: fields 8-11 marked optional (distinguish not-enriched from enriched=false/0)
- buf generate: regenerate seismology service stubs

* revert(cache-keys): don't add new keys to bootstrap without frontend consumers

* fix(panels): address all P1/P2/P3 review findings for PR #2375

- proto: add INT64_ENCODING_NUMBER annotation + sebuf import to get_shipping_stress.proto (run make generate)
- bootstrap: register shippingStress (fast), socialVelocity (fast), diseaseOutbreaks (slow) in api/bootstrap.js + cache-keys.ts
- relay: update WIDGET_SYSTEM_PROMPT with new bootstrap keys and live RPCs for health/supply-chain/intelligence
- seeder: remove broken ProMED feed URL (promedmail.org/feed/ returns HTML 404); add 500K size guard to fetchRssItems; replace private COUNTRY_CODE_MAP with shared geo-extract.mjs; remove permanently-empty location field; bump sourceVersion to who-don-rss-v2
- handlers: remove dead .catch from all 3 new RPC handlers; fix stressLevel fallback to low; fix fetchedAt fallback to 0
- services: add fetchShippingStress, disease-outbreaks.ts, social-velocity.ts with getHydratedData consumers

* fix(health): move seeded keys to BOOTSTRAP_KEYS, add VPD tracker seed and feeds

- Reclassify diseaseOutbreaks, shippingStress, socialVelocity from
  STANDALONE_KEYS to BOOTSTRAP_KEYS so health endpoint reports CRIT
  (not WARN) when their seeds miss a cycle
- Add vpdTrackerRealtime and vpdTrackerHistorical to BOOTSTRAP_KEYS
  with SEED_META entries (maxStaleMin: 2880 = 2x daily interval)
- Fix seed-disease-outbreaks: add CDC and Outbreak News Today feeds
  alongside WHO, populate location field from title parsing, fix TTL
  to 259200s (3x daily interval per gold standard)
- Add seed-vpd-tracker.mjs: scrapes Think Global Health VPD Tracker
  bundle (1,827 realtime alerts + 25,960 historical WHO records),
  writes both Redis keys in one runSeed call via extraKeys
- Add review todos 049-059 from PR #2375 code review
2026-03-27 22:47:24 +04:00

48 lines
1.8 KiB
Markdown

---
status: pending
priority: p3
issue_id: "059"
tags: [code-review, quality, seeding, disease-outbreaks, duplication, pr-2375]
dependencies: []
---
## Problem Statement
`scripts/seed-disease-outbreaks.mjs` maintains two parallel lists that overlap significantly: a `diseaseKeywords` array (or constant) and the keyword list embedded inside the `detectDisease()` function. Any update to supported disease keywords must be made in both places, or the two lists drift out of sync.
## Findings
- **File:** `scripts/seed-disease-outbreaks.mjs``diseaseKeywords` constant and `detectDisease()` function both enumerate disease names/keywords
- **Overlap:** The function's keyword list appears to be a superset or duplicate of `diseaseKeywords`
- **Impact:** Adding a new disease (e.g., MPOX variant) requires two edits; omitting one causes inconsistent behavior between any code that uses `diseaseKeywords` directly vs calls `detectDisease()`
## Proposed Solutions
**Option A: Remove standalone array, have detectDisease() be the single source (Recommended)**
If `diseaseKeywords` is only used to drive `detectDisease()`, inline the array into the function and export only the function.
- **Effort:** Small (consolidate + verify no other consumers of the array)
- **Risk:** Very low
**Option B: Make detectDisease() use the diseaseKeywords array**
```javascript
const DISEASE_KEYWORDS = ['mpox', 'ebola', 'cholera', ...];
function detectDisease(text) {
return DISEASE_KEYWORDS.find(k => text.toLowerCase().includes(k)) || null;
}
```
- **Effort:** Trivial
- **Risk:** Very low — clean single source of truth
## Acceptance Criteria
- [ ] Disease keyword list exists in exactly one place
- [ ] `detectDisease()` uses that single list
## Work Log
- 2026-03-27: Identified by simplicity-reviewer agent during PR #2375 review.