feat(resilience): three-pillar schema + schemaVersion v2.0 feature flag (Phase 2 T2.1) (#2977)

* feat(resilience): three-pillar schema + schemaVersion v2.0 feature flag (Phase 2 T2.1)

Ships the Phase 2 T2.1 schema slice of the country-resilience
reference-grade upgrade plan. Adds the three-pillar response shape
(StructuralReadiness, LiveShockExposure, RecoveryCapacity) as a new
`pillars` field on GetResilienceScoreResponse alongside a `schemaVersion`
string field, both gated behind the RESILIENCE_SCHEMA_V2_ENABLED env
flag (default false).

This PR is schema + plumbing only. Pillars ship with score=0,
coverage=0; real aggregation lands in PR 4 (T2.3). No behavior change
at the v1 default, which preserves widget / map / Country Brief
compatibility for one release cycle per the plan.

What this PR commits

- Proto: new ResiliencePillar message (id, score, weight, coverage,
  domains). New `pillars` repeated field + `schema_version` string
  field on GetResilienceScoreResponse. No renumbering or mutation of
  existing fields.
- Generated TS: regenerated service_server.ts, service_client.ts, and
  OpenAPI JSON/YAML via `make generate`.
- New module server/worldmonitor/resilience/v1/_pillar-membership.ts:
  declarative PILLAR_DOMAINS map + PILLAR_WEIGHTS map + ordered
  iteration list. Single source of truth for the pillar structure that
  PR 4 will import. Note: pillar membership uses the runtime
  ResilienceDomainId values (kebab-case domain ids that already ship in
  v1), not the long-form pillar names from the plan example.
  - structural-readiness (0.40): economic, infrastructure, social-governance
  - live-shock-exposure  (0.35): energy, health-food
  - recovery-capacity    (0.25): empty until PR 3 adds the new dimensions
- Response builder: new buildPillarList helper emits shaped-but-empty
  pillars when the v2 flag is on, empty array when off. Response
  literal fallback paths in _shared.ts and the LOCKED_PREVIEW fixture
  in resilience-widget-utils.ts updated to include pillars: [] and
  schemaVersion: '1.0' to satisfy the generated TS types.
- Tests: 13 new pillar-schema unit cases (membership invariants,
  weight sum=1.0, disjoint sets, empty recovery pillar, buildPillarList
  flag-off / flag-on / shuffled-order / partial-domain-set) + 3
  response-shape cases on the release-gate test pinning the v1 default
  shape and the new field presence on the wire.
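
A minimal sketch of how the membership map and the shaped-but-empty
builder described above could fit together. The pillar ids, weights,
and domain ids come from this message; the type names, the
PILLAR_ORDER constant, the DomainScore shape, and passing the flag as
a boolean argument are assumptions for illustration, not the contents
of _pillar-membership.ts.

```typescript
// Hypothetical types: only the pillar ids, weights, and domain ids are taken from the commit message.
type PillarId = 'structural-readiness' | 'live-shock-exposure' | 'recovery-capacity';
type DomainId = 'economic' | 'infrastructure' | 'social-governance' | 'energy' | 'health-food';

interface DomainScore { id: DomainId; score: number; }
interface Pillar {
  id: PillarId;
  score: number;
  weight: number;
  coverage: number;
  domains: DomainScore[];
}

// Ordered iteration list so output order never depends on input order.
const PILLAR_ORDER: readonly PillarId[] = [
  'structural-readiness',
  'live-shock-exposure',
  'recovery-capacity',
];

const PILLAR_DOMAINS: Record<PillarId, readonly DomainId[]> = {
  'structural-readiness': ['economic', 'infrastructure', 'social-governance'],
  'live-shock-exposure': ['energy', 'health-food'],
  'recovery-capacity': [], // empty until the new dimensions land
};

const PILLAR_WEIGHTS: Record<PillarId, number> = {
  'structural-readiness': 0.4,
  'live-shock-exposure': 0.35,
  'recovery-capacity': 0.25,
};

// Assumed signature: the flag is passed in rather than read from the environment.
function buildPillarList(domains: readonly DomainScore[], v2Enabled: boolean): Pillar[] {
  if (!v2Enabled) return [];
  return PILLAR_ORDER.map((id) => {
    const members: DomainScore[] = [];
    // Walk the membership list first so member order follows PILLAR_DOMAINS.
    for (const domainId of PILLAR_DOMAINS[id]) {
      for (const d of domains) {
        if (d.id === domainId) members.push(d);
      }
    }
    // score and coverage stay 0: aggregation is out of scope for this slice.
    return { id, score: 0, weight: PILLAR_WEIGHTS[id], coverage: 0, domains: members };
  });
}
```

Iterating the membership list rather than the input keeps the output
stable under shuffled or partial domain sets, which is the property
the shuffled-order and partial-domain-set cases are meant to pin.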

What is deliberately NOT in this PR

- No aggregation logic: score/coverage on pillars stay 0 until PR 4.
- No cache key bump: schema is additive with proto3 defaults.
- No changes to overallScore/baselineScore/stressScore (parallel for
  one release cycle).
- No new seeders or dimensions (PR 3 / T2.2b).
- No tiering registry changes (PR 2 / T2.2a).
- No widget rendering (Phase 3 T3.6).

Verified

- make generate clean, new ResiliencePillar interface in regenerated
  client + server + OpenAPI artifacts
- typecheck + typecheck:api clean
- tests/resilience-pillar-schema.test.mts: 13/13 passing
- tests/resilience-release-gate.test.mts: 14/14 passing (3 new T2.1
  cases + 11 prior)
- full resilience suite: 283/283 passing
- npm run test:data: 4539/4539 passing
- npm run lint: exit 0

* fix(resilience): cache-flag decoupling + freshness error-status guard (#2977 P1+P2)

Two Greptile review findings addressed:

P1: RESILIENCE_SCHEMA_V2_ENABLED changed the cached response shape
but the cache key did not encode the flag state. Flipping the flag
on a warm cache served stale v1.0 payloads until the 6h TTL expired.

Fix: always compute and cache the v2 superset (with pillars and
schemaVersion='2.0'). Apply the flag as a response-time gate: when
off, strip pillars to [] and downgrade schemaVersion to '1.0' before
returning. This decouples the cache from the flag and makes flag
flips take effect immediately without waiting for TTL expiry.
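
The response-time gate can be sketched roughly as follows. The
CachedScore shape and the applySchemaFlag name are hypothetical; the
real handler and its types are not shown in this message.

```typescript
// Hypothetical minimal shapes; only pillars and schemaVersion are named in the commit message.
interface PillarStub { id: string; score: number; weight: number; coverage: number; }
interface CachedScore {
  overallScore: number;
  pillars: PillarStub[];
  schemaVersion: string;
}

// The cache always holds the v2 superset. The flag is applied only when
// building the outgoing response, so a flag flip needs no cache invalidation.
function applySchemaFlag(cached: CachedScore, v2Enabled: boolean): CachedScore {
  if (v2Enabled) return cached;
  // Return a copy so the cached v2 entry is never mutated.
  return { ...cached, pillars: [], schemaVersion: '1.0' };
}
```

Returning a copy in the flag-off path matters: mutating the cached
object in place would reintroduce the coupling between flag state and
cache contents that this fix removes.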

P2: readFreshnessMap in _dimension-freshness.ts trusted fetchedAt
without checking status. The resilience-static seeder writes
fetchedAt: Date.now() on BOTH success and error paths (status: 'ok'
vs 'error'), so a failed seed run that preserved old data via
extendExistingTtl made the freshness badges show 'fresh' for what
is actually stale data.

Fix: skip seed-meta entries where status !== 'ok'. When the meta
is skipped, the dimension has no freshness data and classifies as
stale, matching api/health.js behavior. Added a test case that
verifies error-status entries are excluded from the freshness map.
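
A rough sketch of the status guard, assuming seed-meta entries carry
fetchedAt and status as described above. The function names, the
record-based map, and the maxAgeMs parameter are hypothetical
stand-ins, not the actual signatures in _dimension-freshness.ts.

```typescript
// Assumed seed-meta shape: fetchedAt is written on both success and error paths.
interface SeedMeta { fetchedAt: number; status: 'ok' | 'error'; }

// Hypothetical stand-in for the freshness reader: keep only entries whose seed run succeeded.
function freshnessFromSeedMeta(
  metaByDimension: Record<string, SeedMeta | undefined>,
): Record<string, number | undefined> {
  const fresh: Record<string, number | undefined> = {};
  for (const dimension of Object.keys(metaByDimension)) {
    const meta = metaByDimension[dimension];
    // An error run still stamps fetchedAt, so the timestamp alone cannot be trusted.
    if (!meta || meta.status !== 'ok') continue;
    fresh[dimension] = meta.fetchedAt;
  }
  return fresh;
}

// A dimension with no freshness entry classifies as stale.
function classifyFreshness(
  fresh: Record<string, number | undefined>,
  dimension: string,
  now: number,
  maxAgeMs: number,
): 'fresh' | 'stale' {
  const fetchedAt = fresh[dimension];
  if (fetchedAt === undefined) return 'stale';
  return now - fetchedAt <= maxAgeMs ? 'fresh' : 'stale';
}
```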
Elie Habib
2026-04-12 09:51:54 +04:00
committed by GitHub
parent e1b3796939
commit 39d5199ae0
13 changed files with 557 additions and 3 deletions