mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
58e42aadf951d8ea35403cbb60a6cb1792713d61
3 Commits
58e42aadf9
chore(api): enforce sebuf contract + migrate drifting endpoints (#3207) (#3242)
* chore(api): enforce sebuf contract via exceptions manifest (#3207)
Adds api/api-route-exceptions.json as the single source of truth for non-proto /api/ endpoints, with scripts/enforce-sebuf-api-contract.mjs gating every PR via npm run lint:api-contract.
Fixes the root-only blind spot in the prior allowlist (tests/edge-functions.test.mjs), which only scanned top-level *.js files and missed nested paths and .ts endpoints — the gap that let api/supply-chain/v1/country-products.ts and friends drift under proto domain URL prefixes unchallenged.
Checks both directions: every api/<domain>/v<N>/[rpc].ts must pair with a generated service_server.ts (so a deleted proto fails CI), and every generated service must have an HTTP gateway (no orphaned generated code).
Manifest entries require category + reason + owner, with removal_issue mandatory for temporary categories (deferred, migration-pending) and forbidden for permanent ones. .github/CODEOWNERS pins the manifest to @SebastienMelki so new exceptions don't slip through review.
The manifest only shrinks: migration-pending entries (19 today) will be removed as subsequent commits in this PR land each migration.
* refactor(maritime): migrate /api/ais-snapshot → maritime/v1.GetVesselSnapshot (#3207)
The proto VesselSnapshot was carrying density + disruptions, but the frontend also needed sequence, relay status, and candidate_reports to drive the position-callback system. Those only lived on the raw relay passthrough, so the client had to keep hitting /api/ais-snapshot whenever callbacks were registered and fall back to the proto RPC only when the relay URL was gone. This commit pushes all three missing fields through the proto contract and collapses the dual fetch path into one proto client call.
Proto changes (proto/worldmonitor/maritime/v1/):
- VesselSnapshot gains sequence, status, candidate_reports.
- GetVesselSnapshotRequest gains include_candidates (query: include_candidates).
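The manifest-entry rules described in the first commit (category + reason + owner, removal_issue mandatory for temporary categories and forbidden for permanent ones) can be sketched as a small validator. This is a hedged illustration: the field names follow the commit message, but the actual schema in api/api-route-exceptions.json and the checks in scripts/enforce-sebuf-api-contract.mjs may differ.

```typescript
// Illustrative sketch of the exceptions-manifest entry check described above.
// Category names come from the commit message; everything else is assumed.
type ExceptionEntry = {
  path: string;
  category: 'deferred' | 'migration-pending' | 'permanent';
  reason: string;
  owner: string;
  removal_issue?: string;
};

const TEMPORARY = new Set(['deferred', 'migration-pending']);

function validateEntry(e: ExceptionEntry): string[] {
  const errors: string[] = [];
  if (!e.reason) errors.push(`${e.path}: missing reason`);
  if (!e.owner) errors.push(`${e.path}: missing owner`);
  // removal_issue is mandatory for temporary categories...
  if (TEMPORARY.has(e.category) && !e.removal_issue) {
    errors.push(`${e.path}: temporary category requires removal_issue`);
  }
  // ...and forbidden for permanent ones.
  if (!TEMPORARY.has(e.category) && e.removal_issue) {
    errors.push(`${e.path}: permanent category must not set removal_issue`);
  }
  return errors;
}
```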
Handler (server/worldmonitor/maritime/v1/get-vessel-snapshot.ts):
- Forwards include_candidates to ?candidates=... on the relay.
- Separate 5-min in-memory caches for the candidates=on and candidates=off variants; they have very different payload sizes and should not share a slot.
- Per-request in-flight dedup preserved per variant.
Frontend (src/services/maritime/index.ts):
- fetchSnapshotPayload now calls MaritimeServiceClient.getVesselSnapshot directly with includeCandidates threaded through. The raw-relay path, SNAPSHOT_PROXY_URL, DIRECT_RAILWAY_SNAPSHOT_URL and LOCAL_SNAPSHOT_FALLBACK are gone — production already routed via Vercel, the "direct" branch only ever fired on localhost, and the proto gateway covers both.
- New toLegacyCandidateReport helper mirrors toDensityZone/toDisruptionEvent.
api/ais-snapshot.js deleted; manifest entry removed.
Codegen was scoped to worldmonitor.maritime.v1 only (buf generate --path) — regenerating the full tree drops // @ts-nocheck from every client/server file and surfaces pre-existing type errors across 30+ unrelated services, which is not in scope for this PR.
Shape-diff vs legacy payload:
- disruptions / density: proto carries the same fields, just with the GeoCoordinates wrapper and enum strings (remapped client-side via the existing toDisruptionEvent / toDensityZone helpers).
- sequence, status.{connected,vessels,messages}: now populated from the proto response — previously hardcoded to 0/false in the prior proto fallback.
- candidateReports: same shape; optional numeric fields come through as 0 instead of undefined, which the legacy consumer already handled.
* refactor(sanctions): migrate /api/sanctions-entity-search → LookupSanctionEntity (#3207)
The proto docstring already claimed "OFAC + OpenSanctions" coverage, but the handler only fuzzy-matched a local OFAC Redis index — narrower than the legacy /api/sanctions-entity-search, which proxied OpenSanctions live (the source advertised in docs/api-proxies.mdx).
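The primary/fallback pattern this sanctions migration adopts can be sketched as follows. This is a hedged illustration with injected lookup functions, not the repo's actual handler: the real lookup-entity.ts hits api.opensanctions.org with a timeout and falls back to the local OFAC fuzzy match, flipping a `source` field so clients know which index answered.

```typescript
// Hedged sketch of a primary/fallback entity lookup; function names and
// result shapes here are illustrative assumptions, not the repo's API.
type LookupResult = { source: 'opensanctions' | 'ofac'; entities: unknown[] };

async function lookupWithFallback(
  primary: () => Promise<unknown[]>, // e.g. live OpenSanctions search with a timeout
  fallback: () => Promise<unknown[]>, // e.g. local OFAC fuzzy match
): Promise<LookupResult> {
  try {
    return { source: 'opensanctions', entities: await primary() };
  } catch {
    // Primary unreachable or rate-limiting: answer from the local index and
    // let the source field tell the client which index replied.
    return { source: 'ofac', entities: await fallback() };
  }
}
```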
Deleting the legacy without expanding the handler would have been a silent coverage regression for external consumers.
Handler changes (server/worldmonitor/sanctions/v1/lookup-entity.ts):
- Primary path: live search against api.opensanctions.org/search/default with an 8s timeout and the same User-Agent the legacy edge fn used.
- Fallback path: the existing OFAC local fuzzy match, kept intact for when OpenSanctions is unreachable / rate-limiting.
- Response source field flips between 'opensanctions' (happy path) and 'ofac' (fallback) so clients can tell which index answered.
- Query validation tightened: rejects q > 200 chars (matches the legacy cap).
Rate limiting:
- Added /api/sanctions/v1/lookup-entity to ENDPOINT_RATE_POLICIES at 30/min per IP — matches the legacy createIpRateLimiter budget. The gateway already enforces per-endpoint policies via checkEndpointRateLimit.
Docs:
- docs/api-proxies.mdx — dropped the /api/sanctions-entity-search row (plus the orphaned /api/ais-snapshot row left over from the previous commit in this PR).
- docs/panels/sanctions-pressure.mdx — points at the new RPC URL and describes the OpenSanctions-primary / OFAC-fallback semantics.
api/sanctions-entity-search.js deleted; manifest entry removed.
* refactor(military): migrate /api/military-flights → ListMilitaryFlights (#3207)
Legacy /api/military-flights read a pre-baked Redis blob written by the seed-military-flights cron and returned flights in a flat app-friendly shape (lat/lon, lowercase enums, lastSeenMs). The proto RPC takes a bbox, fetches OpenSky live, classifies server-side, and returns nested GeoCoordinates + MILITARY_*_TYPE_* enum strings + lastSeenAt — same data, different contract.
fetchFromRedis in src/services/military-flights.ts was doing nothing sebuf-aware. Renamed it to fetchViaProto and rewrote it to:
- Instantiate MilitaryServiceClient against getRpcBaseUrl().
- Iterate MILITARY_QUERY_REGIONS (PACIFIC + WESTERN) in parallel — the same regions the desktop OpenSky path and the seed cron already use, so dashboard coverage tracks the analytic pipeline.
- Dedup by hexCode across regions.
- Map proto → app shape via the new mapProtoFlight helper plus three reverse enum maps (AIRCRAFT_TYPE_REVERSE, OPERATOR_REVERSE, CONFIDENCE_REVERSE).
The seed cron (scripts/seed-military-flights.mjs) stays put: it feeds regional-snapshot mobility, cross-source signals, correlation, and the health freshness check (api/health.js: 'military:flights:v1'). None of those read the legacy HTTP endpoint; they read the Redis key directly. The proto handler uses its own per-bbox cache keys under the same prefix, so dashboard traffic no longer races the seed cron's blob — the two paths diverge by a small refresh lag, which is acceptable.
Docs: dropped the /api/military-flights row from docs/api-proxies.mdx.
api/military-flights.js deleted; manifest entry removed.
Shape-diff vs legacy:
- f.location.{latitude,longitude} → f.lat, f.lon
- f.aircraftType: MILITARY_AIRCRAFT_TYPE_TANKER → 'tanker' via reverse map
- f.operator: MILITARY_OPERATOR_USAF → 'usaf' via reverse map
- f.confidence: MILITARY_CONFIDENCE_LOW → 'low' via reverse map
- f.lastSeenAt (number) → f.lastSeen (Date)
- f.enrichment → f.enriched (with field renames)
- Extra fields registration / aircraftModel / origin / destination / firstSeenAt now flow through where the proto populates them.
* fix(supply-chain): thread includeCandidates through chokepoint status (#3207)
Caught by the tsconfig.api.json typecheck in the pre-push hook (not covered by the plain tsc --noEmit run that ran before I pushed the ais-snapshot commit). The chokepoint status handler calls getVesselSnapshot internally with a static no-auth request — now required to include the new includeCandidates bool from the proto extension. Passing false: server-internal callers don't need per-vessel reports.
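The shape-diff above boils down to reverse enum maps plus a flattening mapper. A minimal sketch, using only the enum values named in the commit message (the real AIRCRAFT_TYPE_REVERSE / CONFIDENCE_REVERSE tables and mapProtoFlight cover more fields and values):

```typescript
// Sketch of the proto → app-shape mapping; enum entries beyond the ones
// named in the shape-diff above are assumptions.
const AIRCRAFT_TYPE_REVERSE: Record<string, string> = {
  MILITARY_AIRCRAFT_TYPE_TANKER: 'tanker',
};
const CONFIDENCE_REVERSE: Record<string, string> = {
  MILITARY_CONFIDENCE_LOW: 'low',
};

// Illustrative mini-mapper mirroring the shape-diff (not the full helper).
function mapProtoFlightSketch(f: {
  location: { latitude: number; longitude: number };
  aircraftType: string;
  confidence: string;
  lastSeenAt: number;
}) {
  return {
    lat: f.location.latitude,            // f.location.{latitude,longitude} → lat/lon
    lon: f.location.longitude,
    aircraftType: AIRCRAFT_TYPE_REVERSE[f.aircraftType] ?? 'unknown', // fallback assumed
    confidence: CONFIDENCE_REVERSE[f.confidence] ?? 'unknown',        // fallback assumed
    lastSeen: new Date(f.lastSeenAt),    // number → Date
  };
}
```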
* test(maritime): update getVesselSnapshot cache assertions (#3207)
The ais-snapshot migration replaced the single cachedSnapshot/cacheTimestamp pair with a per-variant cache so candidates-on and candidates-off payloads don't evict each other. The pre-push hook surfaced that tests/server-handlers still asserted the old variable names. Rewrote the assertions to match the new shape while preserving the invariants they actually guard:
- Freshness check against slot TTL.
- Cache read before relay call.
- Per-slot in-flight dedup.
- Stale-serve on relay failure (result ?? slot.snapshot).
* chore(proto): restore // @ts-nocheck on regenerated maritime files (#3207)
I ran 'buf generate --path worldmonitor/maritime/v1' to scope the proto regen to the one service I was changing (avoiding the toolchain drift that drops @ts-nocheck from 60+ unrelated files — a separate issue). But the repo convention is the 'make generate' target, which runs buf and then sed-prepends '// @ts-nocheck' to every generated .ts file. My scoped command skipped the sed step. The proto-check CI enforces the sed output, so the two maritime files need the directive restored.
* refactor(enrichment): decommission /api/enrichment/{company,signals} legacy edge fns (#3207)
Both endpoints were already ported to IntelligenceService:
- getCompanyEnrichment (/api/intelligence/v1/get-company-enrichment)
- listCompanySignals (/api/intelligence/v1/list-company-signals)
No frontend callers of the legacy /api/enrichment/* paths exist.
Removes:
- api/enrichment/company.js, signals.js, _domain.js
- api-route-exceptions.json migration-pending entries (58 remain)
- docs/api-proxies.mdx rows for /api/enrichment/{company,signals}
- docs/architecture.mdx reference updated to the IntelligenceService RPCs
Verified: typecheck, typecheck:api, lint:api-contract (89 files / 58 entries), lint:boundaries, tests/edge-functions.test.mjs (136 pass), tests/enrichment-caching.test.mjs (14 pass — still guards the intelligence/v1 handlers), make generate is zero-diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(leads): migrate /api/{contact,register-interest} → LeadsService (#3207)
New leads/v1 sebuf service with two POST RPCs:
- SubmitContact → /api/leads/v1/submit-contact
- RegisterInterest → /api/leads/v1/register-interest
Handler logic ported 1:1 from api/contact.js + api/register-interest.js:
- Turnstile verification (desktop-source bypass preserved)
- Honeypot (website field) silently accepts without upstream calls
- Free-email-domain gate on SubmitContact (422 ApiError)
- validateEmail (disposable/offensive/typo-TLD/MX) on RegisterInterest
- Convex writes via ConvexHttpClient (contactMessages:submit, registerInterest:register)
- Resend notification + confirmation emails (HTML templates unchanged)
Shared helpers moved to server/_shared/:
- turnstile.ts (getClientIp + verifyTurnstile)
- email-validation.ts (disposable/offensive/MX checks)
Rate limits preserved via ENDPOINT_RATE_POLICIES:
- submit-contact: 3/hour per IP (was in-memory 3/hr)
- register-interest: 5/hour per IP (was in-memory 5/hr; desktop sources previously capped at 2/hr via a shared in-memory map — now 5/hr like everyone else, accepting the small regression in exchange for Upstash-backed global limiting)
Callers updated:
- pro-test/src/App.tsx contact form → new submit-contact path
- src-tauri/sidecar/local-api-server.mjs cloud-fallback rewrites /api/register-interest → /api/leads/v1/register-interest when proxying; keeps the local path for older desktop builds
- src/services/runtime.ts isKeyFreeApiTarget allows both old and new paths through the WORLDMONITOR_API_KEY-optional gate
Tests:
- tests/contact-handler.test.mjs rewritten to call the submitContact handler directly; asserts on ValidationError / ApiError
- tests/email-validation.test.mjs + tests/turnstile.test.mjs point at the new server/_shared/ modules
Deleted: api/contact.js, api/register-interest.js, api/_ip-rate-limit.js, api/_turnstile.js, api/_email-validation.js, api/_turnstile.test.mjs. Manifest entries removed (58 → 56). Docs updated (api-platform, api-commerce, usage-rate-limits).
Verified: npm run typecheck + typecheck:api + lint:api-contract (88 files / 56 entries) + lint:boundaries pass; full test:data (5852 tests) passes; make generate is zero-diff.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(pro-test): rebuild bundle for leads/v1 contact form (#3207)
Updates the enterprise contact form to POST to /api/leads/v1/submit-contact (old path /api/contact removed in the previous commit). Bundle is rebuilt from the pro-test/src/App.tsx source change in
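The honeypot and free-email-domain gates ported in the leads migration can be sketched as a small pre-check. This is a hedged illustration: the domain list, field names, and return shape are assumptions; the real handler returns a 422 ApiError and runs the full validateEmail chain.

```typescript
// Illustrative pre-check for SubmitContact; domain list and shapes assumed.
const FREE_EMAIL_DOMAINS = new Set(['gmail.com', 'yahoo.com', 'outlook.com']);

type GateResult = { accept: boolean; status: number };

function gateContact(body: { email: string; website?: string }): GateResult {
  // Honeypot: bots fill the hidden `website` field; silently "accept"
  // with a 200 without making any upstream calls.
  if (body.website) return { accept: false, status: 200 };
  const domain = body.email.split('@')[1]?.toLowerCase() ?? '';
  // Free-email-domain gate: rejected with a 422, per the commit message.
  if (FREE_EMAIL_DOMAINS.has(domain)) return { accept: false, status: 422 };
  return { accept: true, status: 200 };
}
```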
96fca1dc2b
fix(supply-chain): popup-keyed history re-query + dataAvailable flag (#3187)
* fix(supply-chain): popup-keyed history re-query + dataAvailable flag for partial coverage
Two P1 findings on #3185 post-merge review:
1. MapPopup cross-chokepoint history contamination
Popup's async history resolve re-queried [data-transit-chart] without a cpId key. User opens popup A → fetch starts for cpA; user opens popup B before it resolves → cpA's history mounts into cpB's chart container.
Fix: add data-transit-chart-id keyed by cpId; re-query by it on resolve. Mirrors SupplyChainPanel's existing data-chart-cp-id pattern.
2. Partial portwatch coverage still looked healthy
The previous fix emits all 13 canonical summaries (zero-state fill for missing IDs) and records pwCovered in seed-meta, but:
- get-chokepoint-status still zero-filled missing chokepoints and cached the response as healthy — the panel rendered silent empty rows.
- api/health.js only degrades on recordCount=0, so 10/13 partial coverage read as OK despite the UI hiding entire chokepoints.
Fix:
- proto: TransitSummary.data_available (field 12). Writer tags with Boolean(cpData). Status RPC passes it through; defaults to true for pre-fix payloads (absence = covered).
- Status RPC writes seed-meta recordCount as the covered count (not the shape size), and flips response-level upstreamUnavailable on partial coverage.
- api/health.js: new minRecordCount field on SEED_META entries + new COVERAGE_PARTIAL status (warn rollup). The chokepoints entry declares minRecordCount: 13; recordCount < 13 → COVERAGE_PARTIAL.
- Client (panel + popup): skip stats/chart rendering when !dataAvailable; show "Transit data unavailable (upstream partial)" microcopy so users understand the gap.
5759/5759 data tests pass. Typecheck + typecheck:api clean.
* fix(supply-chain): guarantee Simulate Closure button exits Computing state
User reports "Simulate Closure does nothing beyond write Computing…" — the button sticks at Computing forever. Two causes:
1. The scenario worker appears down (0 scenario-result:* keys in Redis in the last 24h, the full 24h TTL window).
Railway-side — separate intervention needed to redeploy scripts/scenario-worker.mjs.
2. The client leaked the "Computing…" state on multiple exit paths:
- The signal.aborted early-return inside the poll loop never reset the button. A second click fired abort on the first → the first returned without resetting → the button stayed "Computing…" until the next render.
- The !this.content.isConnected early-return also skipped the reset (less user-visible, but the same class of bug).
- The catch block swallowed AbortError without resetting.
- POST /run had no hard timeout — a hanging edge function left the button in Computing indefinitely.
Fix:
- resetButton(text) helper touches the btn only if still connected; applied on every exit path (abort, timeout, post-success, catch).
- AbortSignal.any([caller, AbortSignal.timeout(20_000)]) on POST /run.
- console.error on failure so Simulate Closure errors surface in ops.
- Error message includes "scenario worker may be down" on loop timeout so operators see the right suspect.
Backend observations (for follow-up):
- The Hormuz backend is healthy (/api/health chokepoints OK, 13 records, 1 min old; the live RPC has hormuz_strait.riskLevel=critical, wow=-22, flowEstimate present; GetChokepointHistory returns 174 entries). The user-reported "Hormuz empty" is likely browser/CDN stale cache from before PR #3185; a hard refresh should resolve it.
- scenario-worker.mjs has zero result keys in 24h. The Railway service needs verification/redeployment.
* fix(scenario): wrong Upstash RPUSH format silently broke every Simulate Closure
The Railway scenario-worker log shows every job failing field validation since at least 03:06Z today:
[scenario-worker] Job failed field validation, discarding: ["{\"jobId\":\"scenario:1776535792087:cynxx5v4\",...
The leading [" in the payload is the smoking gun. api/scenario/v1/run.ts was POSTing to /rpush/{key} with body `[payload]`, expecting Upstash to unpack the array and push one string value.
Upstash does NOT parse that form — it stored the literal `["{...}"]` string as a single list value. The worker BLMOVEs the literal string → JSON.parse → array → destructuring `{jobId, scenarioId, iso2}` on an array returns undefined for all three → every job is discarded without writing a result. The client poll returns `pending` for the full 60s timeout, then (on the prior client code path) leaked the stuck "Computing…" button state indefinitely.
Fix: use the standard Upstash REST command format — POST to the base URL with body `["RPUSH", key, value]`. Matches scripts/ais-relay.cjs upstashLpush.
After this, the scenario-queue:pending list stores the raw payload string, BLMOVE returns the payload, JSON.parse gives the object, validation passes, computeScenario runs, the result key gets written, and the client poll sees `done`.
Zero result keys existed in prod Redis in the last 24h (24h TTL on scenario-result:*) — confirming the fix addresses the production outage.
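The corrected enqueue can be sketched as follows: the Upstash REST API accepts a whole command as a JSON array POSTed to the base URL, so the payload string is pushed verbatim instead of being stored as the literal `["{...}"]`. The URL and token handling here are illustrative, not the repo's actual run.ts code.

```typescript
// Build the Upstash REST command body: the command array itself, not
// `[payload]` POSTed to /rpush/{key} (which stores the array literal).
function buildRpushCommand(key: string, payload: string): string {
  return JSON.stringify(['RPUSH', key, payload]);
}

// Illustrative enqueue; baseUrl/token wiring is assumed.
async function enqueueJob(baseUrl: string, token: string, job: object) {
  const payload = JSON.stringify(job);
  return fetch(baseUrl, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` },
    body: buildRpushCommand('scenario-queue:pending', payload),
  });
}
```

With this shape, BLMOVE on the worker side returns the raw payload string and JSON.parse yields the job object, not a one-element array.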
12ea629a63
feat(supply-chain): Sprint C — scenario engine (templates, job API, Railway worker, map activation) (#2890)
* feat(supply-chain): Sprint C — scenario engine templates, job API, Railway worker, map activation
Adds the async scenario engine for supply chain disruption modelling:
- src/config/scenario-templates.ts: 6 pre-built ScenarioTemplate definitions
(Taiwan Strait closure, Suez+BaB simultaneous, Panama drought, Hormuz blockade,
Russia Baltic grain suspension, US electronics tariff shock) with costShockMultiplier
and optional HS2 sector scoping. Exports ScenarioVisualState + ScenarioResult
types (no UI imports, avoids MapContainer <-> DeckGLMap circular dep).
- api/scenario/v1/run.ts: PRO-gated edge function — validates scenarioId against
template registry and iso2 format, enqueues job to Redis scenario-queue:pending
via RPUSH. Returns {jobId, status:'pending'} HTTP 202.
- api/scenario/v1/status.ts: Edge function — validates jobId via regex to prevent
Redis key injection, reads scenario-result:{jobId}. Returns {status:'pending'}
when unprocessed, or full worker result when done.
- scripts/scenario-worker.mjs: Always-on Railway worker using BLMOVE LEFT RIGHT for
atomic FIFO dequeue+claim. Idempotency check before compute. Writes result with
24h TTL; writes {status:'failed'} on error; always cleans processing list in finally.
- DeckGLMap.ts: scenarioState field + setScenarioState(). createTradeRoutesLayer()
overrides arc color to orange for segments whose route waypoints intersect scenario
disruptedChokepointIds. Null state restores normal colors.
- MapContainer.ts: activateScenario(id, result) and deactivateScenario() broadcast
ScenarioVisualState to DeckGLMap. Globe/SVG deferred to Sprint D (best-effort).
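The jobId validation in status.ts guards against Redis key injection before the jobId is interpolated into a key. A minimal sketch: the exact regex in the repo may differ; this pattern just matches the scenario:{timestamp}:{suffix} shape visible in the worker logs above.

```typescript
// Hedged sketch of the status.ts jobId guard; the real pattern may differ.
const JOB_ID_RE = /^scenario:\d+:[a-z0-9]+$/;

function resultKeyFor(jobId: string): string | null {
  // Reject anything that could smuggle separators or commands into the key.
  if (!JOB_ID_RE.test(jobId)) return null;
  return `scenario-result:${jobId}`;
}
```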
🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.com/claude-code) + Compound Engineering v2.49.0
Co-Authored-By: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>
* fix(supply-chain): move scenario-templates to server/ to satisfy arch boundary
api/ edge functions may not import from src/ app code. Move the authoritative
scenario-templates.ts to server/worldmonitor/supply-chain/v1/ and replace
src/config/scenario-templates.ts with a type-only re-export for src/ consumers.
* fix(supply-chain): guard scenario-worker runWorker() behind isMain check
Without the isMain guard, the pre-push hook picks up scenario-worker.mjs
as a seed test candidate (non-matching lines pass through sed unchanged)
and starts the long-running worker process, causing push failures.
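The isMain guard described above can be sketched with a small helper: only start the long-running loop when the file is executed directly, not when it is merely loaded by tooling. The helper name is illustrative; the worker itself is a .mjs file.

```typescript
// Sketch of an isMain check for an ES module entry point.
import { pathToFileURL } from 'node:url';

function isMainModule(metaUrl: string, argv1: string | undefined): boolean {
  // True only when this module's URL matches the script Node was launched with.
  return argv1 !== undefined && metaUrl === pathToFileURL(argv1).href;
}

// In scenario-worker.mjs (sketch):
// if (isMainModule(import.meta.url, process.argv[1])) runWorker();
```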
* fix(pre-push): filter non-matching lines from seed test selector
The sed transform passes non-matching lines (e.g. scenario-worker.mjs)
through unchanged. Adding grep "^tests/" ensures only successfully
transformed test paths are passed to the test runner.
* fix(supply-chain): address PR #2890 review findings — worker data shapes + status PRO gate
Three bugs found in PR #2890 code review:
1. [High] scenario-worker.mjs read wrong cache shape for exposure data.
supply-chain:exposure:{iso2}:{hs2}:v1 caches GetCountryChokepointIndexResponse
({ iso2, hs2, exposures: [{chokepointId, exposureScore}], ... }), not a
chokepointId-keyed object. Worker now iterates data.exposures[], filters by
template.affectedChokepointIds, and ranks by exposureScore (importValue does
not exist on ChokepointExposureEntry). adjustedImpact = exposureScore x
(disruptionPct/100) x costShockMultiplier.
2. [Medium] api/scenario/v1/status.ts was not PRO-gated, allowing anyone with
a valid jobId to retrieve full premium scenario results. Added isCallerPremium()
check; returns HTTP 403 for non-PRO callers, matching run.ts behavior.
3. [Low] Worker parsed chokepoint status cache as Array but actual shape is
{ chokepoints: [], fetchedAt, upstreamUnavailable }. Fixed to access
cpData.chokepoints array.
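The corrected exposure math from finding 1 can be sketched as follows (shapes per the commit message; the numbers and function name are illustrative):

```typescript
// Sketch of the fixed worker computation: iterate the cached
// GetCountryChokepointIndexResponse.exposures[], filter to the template's
// affected chokepoints, scale, and rank by impact.
type ExposureEntry = { chokepointId: string; exposureScore: number };

function adjustedImpacts(
  exposures: ExposureEntry[],
  affectedChokepointIds: string[],
  disruptionPct: number,
  costShockMultiplier: number,
): { chokepointId: string; adjustedImpact: number }[] {
  const affected = new Set(affectedChokepointIds);
  return exposures
    .filter((e) => affected.has(e.chokepointId))
    .map((e) => ({
      chokepointId: e.chokepointId,
      // adjustedImpact = exposureScore x (disruptionPct/100) x costShockMultiplier
      adjustedImpact: e.exposureScore * (disruptionPct / 100) * costShockMultiplier,
    }))
    .sort((a, b) => b.adjustedImpact - a.adjustedImpact); // rank by exposureScore-derived impact
}
```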
* fix(scenario): per-country impactPct + O(1) route lookup in arc layer
- impactPct now reflects each country's relative share of the worst-hit
country (0-100) instead of the flat template.disruptionPct for all countries
- Pre-build routeId→waypoints Map in createTradeRoutesLayer() so
getColor() is O(1) per segment instead of O(n) per frame
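The O(1) lookup above amounts to building the Map once at layer creation so per-segment color checks are a Map.get instead of a scan per frame. A minimal sketch, with illustrative types:

```typescript
// Pre-build routeId → waypoints once (sketch of the createTradeRoutesLayer
// optimization); types are assumptions, not the repo's actual shapes.
type Route = { id: string; waypoints: string[] };

function buildRouteWaypointMap(routes: Route[]): Map<string, string[]> {
  const byId = new Map<string, string[]>();
  for (const r of routes) byId.set(r.id, r.waypoints);
  return byId;
}

// Per-segment check used by getColor(): orange when any waypoint
// intersects the scenario's disrupted chokepoint set.
function isDisrupted(
  byId: Map<string, string[]>,
  routeId: string,
  disruptedChokepointIds: Set<string>,
): boolean {
  return (byId.get(routeId) ?? []).some((w) => disruptedChokepointIds.has(w));
}
```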
* fix(scenario): rate limit, pipeline GETs, error sanitization, processing state, orphan drain
- Add per-user rate limit (10 jobs/min) + queue depth cap to run.ts
- Replace 594 sequential Redis GETs with single Upstash pipeline call
- Sanitize worker err.message to 'computation_error' in failed results
- Remove dead validateApiKey() calls (isCallerPremium covers this)
- Write processing state before computeScenario() starts
- Add SIGTERM handler + startup orphan drain to worker loop
- Validate dequeued job payload fields before use as Redis key fragments
- Fix maxImpact divide-by-zero with Math.max(..., 1)
- Hoist routeWaypoints Map to module level in DeckGLMap
- Add GET /api/scenario/v1/templates discovery endpoint
- Fix template sync comment to reference correct authoritative file
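Collapsing the sequential GETs into one pipeline call can be sketched as below. The Upstash REST API exposes a pipeline endpoint that takes an array of commands and returns results in order; the `/pipeline` path is per Upstash's documented REST interface, while the key names and wiring here are illustrative.

```typescript
// Build one pipeline body covering every key, instead of N round trips.
function buildPipelineBody(keys: string[]): string {
  return JSON.stringify(keys.map((k) => ['GET', k]));
}

// Illustrative batched read; baseUrl/token handling is assumed.
async function pipelineGet(baseUrl: string, token: string, keys: string[]) {
  const res = await fetch(`${baseUrl}/pipeline`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` },
    body: buildPipelineBody(keys), // one request instead of keys.length GETs
  });
  return res.json();
}
```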
* docs(plan): mark Sprint C complete, record deferrals to Sprint D
- Sprint status table added: Sprints 0-2 merged, C ready to merge (#2890), A/B/D not started
- Sprint C checklist: 4 ACs checked off, panel UI + tariff-shock visual deferred
- Sprint D section updated to carry over Sprint C visual deferrals
- PR #2890 added to Related PRs
---------
Co-authored-by: Claude Sonnet 4.6 (200K context) <noreply@anthropic.com>