mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-05-13 18:46:21 +02:00
* feat(seeds): add Railway seed scripts for economic and trade endpoints
Two new seed scripts to eliminate Vercel edge external API calls:
seed-economy.mjs:
- EIA energy prices (WTI, Brent) -> economic:energy:v1:all
- EIA energy capacity (Solar, Wind, Coal) -> economic:capacity:v1:COL,SUN,WND:20
- FRED series (10 series) -> economic:fred:v1:<id>:120
- Macro signals (Yahoo, Alternative.me, Mempool) -> economic:macro-signals:v1
seed-supply-chain-trade.mjs:
- Shipping rates (FRED) -> supply_chain:shipping:v2
- Trade barriers (WTO tariff gap) -> trade:barriers:v1:tariff-gap:50
- Trade restrictions (WTO MFN overview) -> trade:restrictions:v1:tariff-overview:50
- Trade flows (WTO, 15 major reporters) -> trade:flows:v1:<reporter>:000:10
- Tariff trends (WTO, 15 major reporters) -> trade:tariffs:v1:<reporter>:all:10
Cache keys match handler patterns exactly so cachedFetchJson finds
pre-seeded data and avoids live external API calls from Vercel edge.
* feat(seeds): add seed-aviation.mjs for airport ops and aviation news
Seeds 2 aviation endpoints with predictable default params:
- getAirportOpsSummary (AviationStack + NOTAM) -> aviation:ops-summary:v1:CDG,ESB,FRA,IST,LHR,SAW
- listAviationNews (9 RSS feeds, 24h window) -> aviation:news::24:v1
NOT seeded (inherently on-demand, user-specific inputs):
- getFlightStatus: specific flight number lookup
- trackAircraft: bounding-box or icao24 queries
- listAirportFlights: arbitrary airport+direction+limit combos
- getCarrierOps: depends on listAirportFlights with variable params
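The ops-summary key listed above only matches the handler if both sides build it from the same canonical airport order. A minimal sketch of that derivation, assuming the handler also sorts the IATA codes alphabetically (`opsSummaryKey` is a hypothetical helper name used for illustration):

```javascript
// Sketch only: `opsSummaryKey` is a hypothetical helper, not part of the codebase.
function opsSummaryKey(airports) {
  // Copy before sorting so the caller's array order is left untouched.
  const sorted = [...airports].sort();
  return `aviation:ops-summary:v1:${sorted.join(',')}`;
}

console.log(opsSummaryKey(['IST', 'ESB', 'SAW', 'LHR', 'FRA', 'CDG']));
// prints "aviation:ops-summary:v1:CDG,ESB,FRA,IST,LHR,SAW"
```

Sorting makes the key independent of the order in which the default airports are declared.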
* feat(seeds): add seed-conflict-intel.mjs for ACLED, HAPI, and PizzINT
Seeds 3 conflict/intelligence endpoints with predictable default params:
- listAcledEvents (all countries, last 30 days) -> conflict:acled:v1:all:0:0
- getHumanitarianSummary (20 top conflict countries) -> conflict:humanitarian:v1:<CC>
- getPizzintStatus (base + GDELT variants) -> intel:pizzint:v1:base, intel:pizzint:v1:gdelt
NOT seeded (inherently on-demand, LLM or user-specific inputs):
- classifyEvent: per-headline LLM classification
- deductSituation: per-query LLM deduction
- getCountryIntelBrief: per-country LLM brief with context hash
- getCountryFacts: per-country REST Countries + Wikidata + Wikipedia
- searchGdeltDocuments: per-query GDELT search
Requires: ACLED_EMAIL, ACLED_KEY, UPSTASH_REDIS_REST_URL/TOKEN
* feat(seeds): add seed-research.mjs for arXiv, HN, tech events, trending repos
Seeds 4 research endpoints:
- listArxivPapers (cs.AI, cs.CL, cs.CR) -> research:arxiv:v1:<cat>::50
- listHackernewsItems (top, best feeds) -> research:hackernews:v1:<feed>:30
- listTechEvents (Techmeme ICS + dev.events RSS) -> research:tech-events:v1
- listTrendingRepos (python, javascript, typescript) -> research:trending:v1:<lang>:daily:50
The tech events key is also seeded by the relay; this script provides backup
hydration and keeps the key warm even if the relay hasn't run yet.
Requires: UPSTASH_REDIS_REST_URL/TOKEN
* feat(seeds): add seed-military-maritime-news.mjs for USNI and nav warnings
Seeds 2 endpoints with predictable default params:
- USNI Fleet Report (WordPress JSON API) -> usni-fleet:sebuf:v1 + stale backup
- Navigational Warnings (NGA broadcast, all areas) -> maritime:navwarnings:v1:all
NOT seeded (inherently on-demand):
- getAircraftDetails/batch: per-icao24 Wingbits lookup
- listMilitaryFlights: bounding-box query (quantized 1-degree grid)
- getVesselSnapshot: in-memory cache, reads from relay /ais-snapshot
- listFeedDigest: per-feed-URL RSS caching (hundreds of feeds, relay proxied)
- summarizeArticle: per-article LLM summarization
Requires: UPSTASH_REDIS_REST_URL/TOKEN
* feat(seeds): add seed-infra.mjs warm-ping for service statuses and cable health
Uses warm-ping pattern (calls Vercel RPC from Railway) because:
- list-service-statuses: 30 status page parsers with 8 custom formats
- get-cable-health: NGA text analysis with cable name matching + proximity
Replicating this logic in a standalone script is fragile and duplicative.
NOT seeded (on-demand):
- search-imagery: per-bbox/datetime STAC query
- get-giving-summary: hardcoded baselines, no external fetches
- get-webcam-image: per-webcamId Windy API lookup
* fix(seeds): move secondary key writes before process.exit, fix data shapes
Critical bugs found in code review:
1. runSeed() calls process.exit(0) after primary key write, so .then()
callbacks were dead code. All secondary keys (FRED, macro signals,
trade data, HAPI summaries, pizzint, HN, trending, etc.) were NEVER
written. Fix: move writeExtraKey calls inside fetchAll() before return.
2. FRED cache key used :120 suffix but handler default is :0 (req.limit||0).
Fixed to :0 so seed matches handler cache key for default requests.
3. USNI and nav warnings seed parsers produced wrong data shapes vs handler
(different field names, missing fields). Converted to warm-ping pattern
(like seed-infra.mjs) to avoid shape divergence.
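The ordering problem in bug 1 can be illustrated with a small stand-in. This is a sketch under assumptions: `runSeedSketch` and `fetchAllSketch` are hypothetical miniatures, and `exit` is an injected substitute for `process.exit` so the example can run to completion.

```javascript
// Hypothetical miniature of the runSeed() flow. `exit` stands in for
// process.exit so the sketch can run without terminating the process.
async function runSeedSketch(fetchAll, log, exit) {
  const data = await fetchAll(log);
  log.push('primary key written');
  exit(0); // in the real script, work chained after runSeed() never runs past this point
  return data;
}

// Fixed pattern: secondary keys are written inside fetchAll(), before it returns.
async function fetchAllSketch(log) {
  const data = { summaries: ['placeholder'] };
  log.push('secondary key written');
  return data;
}

const demoLog = [];
runSeedSketch(fetchAllSketch, demoLog, (code) => demoLog.push(`exit(${code})`))
  .then(() => console.log(demoLog.join(' -> ')));
```

Because the secondary write happens inside the fetch function, it is recorded before the exit call regardless of what is chained afterwards.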
* fix(seeds): reduce GDELT 429 rate limiting in seed-gdelt-intel
Problems from logs: every topic fetch hits 429, runs take 3-5min,
4th run failed fatally after 12min of cascading retries.
Fixes:
- Increase inter-topic delay: 12s -> 20s (GDELT needs longer cooldown)
- Increase initial backoff: 10s -> 20s, with 15s increments per retry
- Graceful degradation: exhausted retries return empty topic instead of
throwing (prevents withRetry from restarting ALL topics from scratch)
- Align TTL with health.js: 3600s -> 7200s (matches maxStaleMin:120)
- Validation allows partial success (3/6 topics minimum)
Cron interval should also be increased from 30min to 2h on Railway
to match the new 2h TTL.
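The retry timing and partial-success rule described above could look roughly like this. It is a sketch, assuming the backoff grows linearly (20s, then +15s per retry) and that an exhausted topic is represented as an object with an empty `articles` array; the names and the topic shape are illustrative, not taken from seed-gdelt-intel.

```javascript
// Assumed interpretation: linear backoff, 20s for the first retry, +15s for each later one.
const INITIAL_BACKOFF_MS = 20_000;
const BACKOFF_STEP_MS = 15_000;
const MIN_TOPICS_OK = 3; // "3/6 topics minimum" from the commit message

// retryIndex is 0 for the first retry, 1 for the second, and so on.
function backoffMs(retryIndex) {
  return INITIAL_BACKOFF_MS + retryIndex * BACKOFF_STEP_MS;
}

// Hypothetical shape: each topic is { id, articles: [] }; an exhausted topic comes back empty.
function hasEnoughTopics(topics) {
  return topics.filter((t) => t.articles.length > 0).length >= MIN_TOPICS_OK;
}
```

Returning an empty topic instead of throwing keeps one rate-limited topic from discarding the topics that already succeeded.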
* fix(seeds): 4 bugs from review - ACLED auth, NOTAM key, infra precedence, curated events
P1: ACLED auth used wrong endpoint (api/acled/token) and env vars (ACLED_KEY).
Fixed to match server/acled-auth.ts: ACLED_EMAIL+ACLED_PASSWORD via /oauth/token,
with ACLED_ACCESS_TOKEN static fallback.
P1: Aviation NOTAM key was aviation:notam-closures:v1, handler reads
aviation:notam:closures:v2. Fixed key to match _shared.ts.
P2: Infra warm-ping had operator precedence bug in nullish coalescing:
(a ?? b) ? c : d instead of a ?? (b ? c : d). Added parens.
P2: Research seed missed curated conferences that the handler appends
(CURATED_EVENTS in list-tech-events.ts). Added same curated events so
seeded data matches what the handler would produce.
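The precedence issue in the infra warm-ping fix is easy to reproduce in isolation. In JavaScript the conditional operator binds more loosely than `??`, so the unparenthesized form groups differently from what was intended. A small demonstration with placeholder values (`demo` is a hypothetical function, not code from seed-infra.mjs):

```javascript
function demo(a, b) {
  // Without parentheses this parses as (a ?? b) ? 'c' : 'd'.
  const unparenthesized = a ?? b ? 'c' : 'd';
  // Intended grouping: fall back to the conditional only when `a` is null or undefined.
  const intended = a ?? (b ? 'c' : 'd');
  return { unparenthesized, intended };
}

console.log(demo('explicit', false));
// prints { unparenthesized: 'c', intended: 'explicit' }
```

The two forms only agree when `a` is null or undefined, which is why the bug can go unnoticed in testing.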
* fix(seeds): add seed-meta freshness metadata for all secondary keys
Added writeExtraKeyWithMeta() to _seed-utils.mjs that writes both the
data key and a seed-meta:<key> freshness metadata entry. All secondary
key writes in seed scripts now use this helper so health.js can track
freshness for: energy capacity, FRED series, macro signals, trade
barriers/restrictions/flows/tariffs, aviation news, HAPI summaries,
PizzINT, arXiv categories, HN feeds, tech events, trending repos.
Previously only the primary key per script got seed-meta (via runSeed),
leaving secondary keys operationally invisible to health monitoring.
* fix(seeds): align seed-meta keys with health.js conventions
P1: writeExtraKeyWithMeta wrote seed-meta:<full-cache-key> (e.g.,
seed-meta:economic:macro-signals:v1), but health.js expects normalized
names without version suffixes (seed-meta:economic:macro-signals).
Fixed by stripping trailing :v\d+ from key. Added metaKeyOverride
param for cases needing explicit control.
P1: shipping seed used runSeed('supply-chain', 'shipping-trade', ...)
producing seed-meta:supply-chain:shipping-trade, but health.js expects
seed-meta:supply_chain:shipping. Fixed domain/resource to match.
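The key normalization described above can be sketched as follows. This assumes the override, when given, is the complete seed-meta key; `seedMetaKey` is a hypothetical name and the real writeExtraKeyWithMeta() signature may differ.

```javascript
// Sketch only: derives the seed-meta key that health.js is expected to read.
function seedMetaKey(cacheKey, metaKeyOverride) {
  // Assumption: an override is the full meta key and is used verbatim.
  if (metaKeyOverride) return metaKeyOverride;
  // Strip only a trailing version segment such as ":v1"; versions mid-key are left alone.
  return `seed-meta:${cacheKey.replace(/:v\d+$/, '')}`;
}

console.log(seedMetaKey('economic:macro-signals:v1'));
// prints "seed-meta:economic:macro-signals"
```

Keys whose version segment is not at the end are unchanged by the regex, which is the kind of case an explicit override would cover.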
* fix(seeds): only write seed-meta after successful data key write
writeExtraKey() now returns false on failure. writeExtraKeyWithMeta()
skips seed-meta write when the data write fails, preventing false-positive
health reports for keys like macro-signals and tech-events.
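A minimal sketch of the write-then-meta ordering, assuming an injected `writeKey(key, value, ttl)` function that resolves to `true` on success and `false` on failure. The function name, the metadata fields, and the meta key format here are illustrative, not the actual _seed-utils.mjs implementation.

```javascript
// Sketch only: `writeKey` is an injected, hypothetical writer returning a boolean.
async function writeWithMetaSketch(writeKey, key, data, ttl, recordCount) {
  const ok = await writeKey(key, data, ttl);
  // Skip the freshness entry when the data write failed, so health checks
  // never see a fresh timestamp for data that was not stored.
  if (!ok) return false;
  await writeKey(`seed-meta:${key}`, { fetchedAt: Date.now(), recordCount }, ttl);
  return true;
}
```

With a writer that always fails, the helper performs exactly one write attempt and no metadata entry is created.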
272 lines
10 KiB
JavaScript
Executable File
#!/usr/bin/env node

/**
 * Seed aviation data to Redis for the 2 seedable aviation endpoints:
 * - getAirportOpsSummary (AviationStack delays + NOTAM closures)
 * - listAviationNews (RSS feeds)
 *
 * NOT seeded (inherently on-demand, user-specific inputs):
 * - getFlightStatus (specific flight number lookup)
 * - trackAircraft (bounding-box or icao24 lookup)
 * - listAirportFlights (arbitrary airport + direction + limit combos)
 * - getCarrierOps (depends on listAirportFlights with variable params)
 */

import { loadEnvFile, CHROME_UA, runSeed, writeExtraKeyWithMeta, sleep } from './_seed-utils.mjs';

loadEnvFile(import.meta.url);

const DEFAULT_AIRPORTS = ['IST', 'ESB', 'SAW', 'LHR', 'FRA', 'CDG'];
const OPS_CACHE_KEY = `aviation:ops-summary:v1:${[...DEFAULT_AIRPORTS].sort().join(',')}`;
const NEWS_CACHE_KEY = 'aviation:news::24:v1'; // empty entities, 24h window
const OPS_TTL = 300;
const NEWS_TTL = 900;

const AVIATIONSTACK_URL = 'https://api.aviationstack.com/v1/flights';

// ─── Airport Ops Summary (AviationStack + NOTAM) ───

async function fetchAviationStackFlights(airports) {
  const apiKey = process.env.AVIATIONSTACK_API;
  if (!apiKey) return { alerts: [], healthy: false };

  const alerts = [];
  for (const iata of airports) {
    try {
      const params = new URLSearchParams({
        access_key: apiKey, dep_iata: iata, limit: '100',
      });
      const resp = await fetch(`${AVIATIONSTACK_URL}?${params}`, {
        headers: { 'User-Agent': CHROME_UA },
        signal: AbortSignal.timeout(10_000),
      });
      if (!resp.ok) { console.warn(` AviationStack ${iata}: HTTP ${resp.status}`); continue; }
      const json = await resp.json();
      if (json.error) { console.warn(` AviationStack ${iata}: ${json.error.message}`); continue; }
      const flights = json.data || [];
      const total = flights.length;
      const delayed = flights.filter(f => (f.departure?.delay ?? 0) > 0);
      const cancelled = flights.filter(f => f.flight_status === 'cancelled');
      const totalDelay = delayed.reduce((s, f) => s + (f.departure?.delay ?? 0), 0);

      alerts.push({
        iata,
        totalFlights: total,
        delayedFlightsPct: total > 0 ? Math.round((delayed.length / total) * 1000) / 10 : 0,
        avgDelayMinutes: delayed.length > 0 ? Math.round(totalDelay / delayed.length) : 0,
        cancelledFlights: cancelled.length,
        reason: delayed.length > 3 ? 'Multiple delays reported' : '',
      });
      await sleep(300); // rate limit
    } catch (e) {
      console.warn(` AviationStack ${iata}: ${e.message}`);
    }
  }
  return { alerts, healthy: alerts.length > 0 };
}

async function fetchNotamClosures() {
  try {
    const { url, token } = getRedisCredentialsFromEnv();
    const resp = await fetch(`${url}/get/aviation:notam:closures:v2`, {
      headers: { Authorization: `Bearer ${token}` },
      signal: AbortSignal.timeout(5_000),
    });
    if (!resp.ok) return null;
    const data = await resp.json();
    return data.result ? JSON.parse(data.result) : null;
  } catch {
    return null;
  }
}

function getRedisCredentialsFromEnv() {
  return {
    url: process.env.UPSTASH_REDIS_REST_URL,
    token: process.env.UPSTASH_REDIS_REST_TOKEN,
  };
}

function determineSeverity(avgDelay, delayPct) {
  if (avgDelay > 90 || delayPct > 50) return 'severe';
  if (avgDelay > 60 || delayPct > 35) return 'major';
  if (avgDelay > 30 || delayPct > 20) return 'moderate';
  if (avgDelay > 15 || delayPct > 10) return 'minor';
  return 'normal';
}

function severityFromCancelRate(rate) {
  if (rate > 20) return 'severe';
  if (rate > 10) return 'major';
  if (rate > 5) return 'moderate';
  if (rate > 2) return 'minor';
  return 'normal';
}

async function fetchAirportOpsSummary() {
  const now = Date.now();
  const avResult = await fetchAviationStackFlights(DEFAULT_AIRPORTS);

  let notamClosedIcaos = new Set();
  let notamRestrictedIcaos = new Set();
  let notamReasons = {};
  const notamData = await fetchNotamClosures();
  if (notamData) {
    notamClosedIcaos = new Set(notamData.closedIcaos || []);
    notamRestrictedIcaos = new Set(notamData.restrictedIcaos || []);
    notamReasons = notamData.reasons || {};
  }

  // We don't have full MONITORED_AIRPORTS config here, build minimal map
  const ICAO_MAP = { IST: 'LTFM', ESB: 'LTAC', SAW: 'LTFJ', LHR: 'EGLL', FRA: 'EDDF', CDG: 'LFPG' };
  const NAME_MAP = { IST: 'Istanbul Airport', ESB: 'Esenboga', SAW: 'Sabiha Gokcen', LHR: 'Heathrow', FRA: 'Frankfurt', CDG: 'Charles de Gaulle' };

  const summaries = [];
  for (const iata of DEFAULT_AIRPORTS) {
    const icao = ICAO_MAP[iata] || '';
    const alert = avResult.alerts.find(a => a.iata === iata);
    const isClosed = notamClosedIcaos.has(icao);
    const isRestricted = notamRestrictedIcaos.has(icao);
    const notamText = notamReasons[icao];

    const delayPct = alert?.delayedFlightsPct ?? 0;
    const avgDelay = alert?.avgDelayMinutes ?? 0;
    const cancelledFlights = alert?.cancelledFlights ?? 0;
    const totalFlights = alert?.totalFlights ?? 0;
    const cancelRate = totalFlights > 0 ? (cancelledFlights / totalFlights) * 100 : 0;

    const cancelSev = severityFromCancelRate(cancelRate);
    const delaySev = determineSeverity(avgDelay, delayPct);
    const notamFloor = isClosed ? (totalFlights === 0 ? 'severe' : 'moderate') : isRestricted ? 'minor' : 'normal';
    const sevOrder = ['normal', 'minor', 'moderate', 'major', 'severe'];
    const sevStr = sevOrder[Math.max(sevOrder.indexOf(cancelSev), sevOrder.indexOf(delaySev), sevOrder.indexOf(notamFloor))] ?? 'normal';

    const notamFlags = [];
    if (isClosed) notamFlags.push('CLOSED');
    if (isRestricted) notamFlags.push('RESTRICTED');
    if (notamText) notamFlags.push('NOTAM');

    const topDelayReasons = [];
    if (alert?.reason) topDelayReasons.push(alert.reason);
    if ((isClosed || isRestricted) && notamText) topDelayReasons.push(notamText.slice(0, 80));

    summaries.push({
      iata, icao, name: NAME_MAP[iata] || iata, timezone: 'UTC',
      delayPct, avgDelayMinutes: avgDelay,
      cancellationRate: Math.round(cancelRate * 10) / 10,
      totalFlights, closureStatus: isClosed, notamFlags,
      severity: `FLIGHT_DELAY_SEVERITY_${sevStr.toUpperCase()}`,
      topDelayReasons,
      source: avResult.healthy ? 'aviationstack' : 'simulated',
      updatedAt: now,
    });
  }
  console.log(` Airport ops: ${summaries.length} airports, ${avResult.alerts.length} with live data`);
  return { summaries };
}

// ─── Aviation News (RSS) ───

const AVIATION_RSS_FEEDS = [
  { url: 'https://www.flightglobal.com/rss', name: 'FlightGlobal' },
  { url: 'https://simpleflying.com/feed/', name: 'Simple Flying' },
  { url: 'https://aerotime.aero/feed', name: 'AeroTime' },
  { url: 'https://thepointsguy.com/feed/', name: 'The Points Guy' },
  { url: 'https://airlinegeeks.com/feed/', name: 'Airline Geeks' },
  { url: 'https://onemileatatime.com/feed/', name: 'One Mile at a Time' },
  { url: 'https://viewfromthewing.com/feed/', name: 'View from the Wing' },
  { url: 'https://www.aviationpros.com/rss', name: 'Aviation Pros' },
  { url: 'https://www.aviationweek.com/rss', name: 'Aviation Week' },
];

function parseRssItems(xml, sourceName) {
  try {
    // Lightweight XML parse for RSS items
    const items = [];
    const itemRegex = /<item[\s>]([\s\S]*?)<\/item>/gi;
    let match;
    while ((match = itemRegex.exec(xml)) !== null) {
      const block = match[1];
      const title = block.match(/<title[^>]*>([\s\S]*?)<\/title>/i)?.[1]?.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, '$1').trim() || '';
      const link = block.match(/<link[^>]*>([\s\S]*?)<\/link>/i)?.[1]?.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, '$1').trim() || '';
      const pubDate = block.match(/<pubDate[^>]*>([\s\S]*?)<\/pubDate>/i)?.[1]?.trim() || '';
      const desc = block.match(/<description[^>]*>([\s\S]*?)<\/description>/i)?.[1]?.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, '$1').trim() || '';
      if (title && link) items.push({ title, link, pubDate, description: desc, _source: sourceName });
    }
    return items.slice(0, 30);
  } catch {
    return [];
  }
}

async function fetchAviationNews() {
  const now = Date.now();
  const cutoff = now - 24 * 60 * 60 * 1000;
  const allItems = [];

  await Promise.allSettled(
    AVIATION_RSS_FEEDS.map(async (feed) => {
      try {
        const resp = await fetch(feed.url, {
          headers: { 'User-Agent': CHROME_UA, Accept: 'application/rss+xml, application/xml, text/xml, */*' },
          signal: AbortSignal.timeout(8_000),
        });
        if (!resp.ok) return;
        const xml = await resp.text();
        allItems.push(...parseRssItems(xml, feed.name));
      } catch { /* skip */ }
    }),
  );

  const items = allItems
    .map((item) => {
      // Date parsing does not throw; an unparseable pubDate yields NaN, which is
      // treated as "unknown" (0) and later replaced with the current time.
      const parsed = item.pubDate ? new Date(item.pubDate).getTime() : NaN;
      const publishedAt = Number.isNaN(parsed) ? 0 : parsed;
      if (publishedAt && publishedAt < cutoff) return null;
      const snippet = (item.description || '').replace(/<[^>]+>/g, '').slice(0, 200);
      return {
        id: Buffer.from(item.link).toString('base64').slice(0, 32),
        title: item.title, url: item.link, sourceName: item._source,
        publishedAt: publishedAt || now, snippet,
        matchedEntities: [], imageUrl: '',
      };
    })
    .filter(Boolean)
    .sort((a, b) => b.publishedAt - a.publishedAt);

  console.log(` Aviation news: ${items.length} articles from ${AVIATION_RSS_FEEDS.length} feeds`);
  return { items };
}

// ─── Main ───

async function fetchAll() {
  const [ops, news] = await Promise.allSettled([
    fetchAirportOpsSummary(),
    fetchAviationNews(),
  ]);

  const opsData = ops.status === 'fulfilled' ? ops.value : null;
  const newsData = news.status === 'fulfilled' ? news.value : null;

  if (!opsData && !newsData) throw new Error('All aviation fetches failed');

  // Write secondary keys BEFORE returning (runSeed calls process.exit after primary write)
  if (newsData?.items?.length > 0) await writeExtraKeyWithMeta(NEWS_CACHE_KEY, newsData, NEWS_TTL, newsData.items.length);

  return opsData || { summaries: [] };
}

function validate(data) {
  return data?.summaries?.length > 0;
}

runSeed('aviation', 'ops-news', OPS_CACHE_KEY, fetchAll, {
  validateFn: validate,
  ttlSeconds: OPS_TTL,
  sourceVersion: 'aviationstack-rss',
}).catch((err) => {
  console.error('FATAL:', err.message || err);
  process.exit(1);
});