Phase 3 PR1: Regime drift history (writer + RPC) (#2981)

* feat(intelligence): regime drift history (Phase 3 PR1)

Phase 3 PR1 of the Regional Intelligence Model. Adds an append-only
regime transition log per region plus a premium-gated RPC to read it.

## What landed

### New writer module: scripts/regional-snapshot/regime-history.mjs

Single public entry point:

  recordRegimeTransition(region, snapshot, diff, opts?)
    -> { recorded, entry, pushed, trimmed }

Pure builder + Redis-ops orchestrator + dependency-injected publisher.

Flow:
  1. buildTransitionEntry() returns null when diff has no regime_changed
     (steady-state snapshots produce no entry — pure transition stream)
  2. publishTransitionWithOps() LPUSHes onto
     intelligence:regime-history:v1:{region}, then LTRIMs to keep the
     most recent REGIME_HISTORY_MAX (100) entries
  3. defaultPublisher binds real Upstash REST calls; tests inject an
     in-memory ops object for offline coverage

LTRIM failure is non-fatal — entry already landed, next cycle will
re-trim. LPUSH failure short-circuits and reports pushed=false. The
recorder NEVER throws and is wrapped in its own try/catch in the seed
loop so snapshot persist is never blocked.
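
The flow and failure semantics above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in regime-history.mjs: the `region.id` field, the `generated_at` fallback, the boolean-returning `lpush`/`ltrim` ops shape, and the `makeMemoryOps` helper are all hypothetical and inferred only from the descriptions in this message.

```javascript
// Hypothetical sketch; the real module is scripts/regional-snapshot/regime-history.mjs.
const REGIME_HISTORY_KEY_PREFIX = 'intelligence:regime-history:v1:';
const REGIME_HISTORY_MAX = 100;

// Pure builder: returns null unless the diff reports a regime change.
function buildTransitionEntry(region, snapshot, diff) {
  if (!region || !snapshot || !diff || !diff.regime_changed) return null;
  const regime = snapshot.regime ?? {};
  return {
    region_id: region.id, // assumption: region is an object with an id
    label: regime.label ?? '',
    previous_label: regime.previous_label ?? '',
    transitioned_at: regime.transitioned_at ?? snapshot.generated_at ?? 0,
    transition_driver: regime.transition_driver ?? '',
    snapshot_id: snapshot.snapshot_id ?? '',
  };
}

// Orchestrator: LPUSH first, then LTRIM to the newest REGIME_HISTORY_MAX entries.
async function publishTransitionWithOps(entry, ops) {
  if (!entry || !entry.region_id) return { pushed: false, trimmed: false };
  const key = REGIME_HISTORY_KEY_PREFIX + entry.region_id;
  let pushed = false;
  try {
    pushed = Boolean(await ops.lpush(key, JSON.stringify(entry)));
  } catch {
    pushed = false;
  }
  if (!pushed) return { pushed: false, trimmed: false }; // short-circuit: LTRIM is skipped
  let trimmed = false;
  try {
    trimmed = Boolean(await ops.ltrim(key, 0, REGIME_HISTORY_MAX - 1));
  } catch {
    trimmed = false; // non-fatal: the entry landed, the next cycle re-trims
  }
  return { pushed, trimmed };
}

// In-memory ops object of the kind a test could inject instead of Upstash calls.
function makeMemoryOps() {
  const lists = new Map();
  return {
    lists,
    async lpush(key, value) {
      const list = lists.get(key) ?? [];
      list.unshift(value); // newest entry goes to the head, like Redis LPUSH
      lists.set(key, list);
      return true;
    },
    async ltrim(key, start, stop) {
      const list = lists.get(key) ?? [];
      lists.set(key, list.slice(start, stop + 1)); // Redis stop index is inclusive
      return true;
    },
  };
}
```

Keeping the builder pure and the Redis calls behind an injected ops object is what lets the tests listed below run offline.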

### seed-regional-snapshots.mjs hook

Added a regime-history call alongside the existing alert-emitter call,
right after persistSnapshot success. Same best-effort contract:
unconditional try/catch, log warn on throw, continue main loop.
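
A minimal sketch of that best-effort contract, assuming a simplified seed loop. The `seedRegions` function, the `{ snapshot, diff }` return shape of `persistSnapshot`, and the log prefix are hypothetical; only the try/catch, warn, and continue behaviour comes from the description above.

```javascript
// Hypothetical seed-loop sketch; dependencies are passed in so the example is self-contained.
async function seedRegions(regions, { persistSnapshot, recordRegimeTransition, log = console }) {
  const persisted = [];
  for (const region of regions) {
    // Assumption: persistSnapshot resolves with the snapshot and its diff.
    const { snapshot, diff } = await persistSnapshot(region);
    persisted.push(region.id);
    try {
      await recordRegimeTransition(region, snapshot, diff);
    } catch (err) {
      // Best-effort: warn and carry on with the next region.
      log.warn(`[regime-history] ${region.id}: ${err?.message ?? err}`);
    }
  }
  return persisted;
}
```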

### Proto + RPC: GetRegimeHistory

  proto/worldmonitor/intelligence/v1/get_regime_history.proto

  - GetRegimeHistoryRequest { region_id, limit (0..100) }
  - GetRegimeHistoryResponse { transitions: RegimeTransition[] }
  - RegimeTransition { region_id, label, previous_label,
                       transitioned_at, transition_driver, snapshot_id }

region_id validated as strict kebab-case (same regex as
get-regional-snapshot). limit capped server-side at MAX_LIMIT=100,
defaulting to 50 when omitted.
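
The validation rules can be sketched as below. The regex and the 1..32 length bounds are taken from the generated OpenAPI schema in this commit; the function names and `DEFAULT_LIMIT` are hypothetical, and the real handler may structure this differently.

```javascript
// Pattern and length bounds as they appear in the generated OpenAPI schema.
const REGION_ID_PATTERN = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/;
const DEFAULT_LIMIT = 50; // hypothetical constant name
const MAX_LIMIT = 100;

// Strict kebab-case: lowercase alphanumeric groups joined by single hyphens.
function isValidRegionId(regionId) {
  return typeof regionId === 'string'
    && regionId.length >= 1
    && regionId.length <= 32
    && REGION_ID_PATTERN.test(regionId);
}

// Omitted, non-integer, or <= 0 falls back to the default; anything larger is capped.
function resolveLimit(limit) {
  if (!Number.isInteger(limit) || limit <= 0) return DEFAULT_LIMIT;
  return Math.min(limit, MAX_LIMIT);
}
```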

Added to IntelligenceService in service.proto. Generated openapi
JSON/YAML committed via `make generate`.

### Server handler: server/worldmonitor/intelligence/v1/get-regime-history.ts

LRANGE-based read (newest-first because the writer LPUSHes). The adapter
is a dedicated exported function, adaptTransition(raw), for testability.
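
A rough sketch of what such an adapter might look like, based on the RegimeTransition fields and the adapter test descriptions in this message. The `parseEntries` helper is hypothetical and stands in for the handler's malformed-entry filter; the actual implementation in get-regime-history.ts may differ.

```javascript
// Maps a stored snake_case entry to the camelCase RPC shape, with empty/zero defaults.
function adaptTransition(raw) {
  const at = Number(raw?.transitioned_at);
  return {
    regionId: String(raw?.region_id ?? ''),
    label: String(raw?.label ?? ''),
    previousLabel: String(raw?.previous_label ?? ''),
    transitionedAt: Number.isFinite(at) ? at : 0, // non-numeric values become 0
    transitionDriver: String(raw?.transition_driver ?? ''),
    snapshotId: String(raw?.snapshot_id ?? ''),
  };
}

// Hypothetical helper: parse LRANGE results and drop entries that are not JSON objects.
function parseEntries(rawItems) {
  const out = [];
  for (const item of rawItems ?? []) {
    try {
      const parsed = JSON.parse(item);
      if (parsed && typeof parsed === 'object' && !Array.isArray(parsed)) {
        out.push(adaptTransition(parsed));
      }
    } catch {
      // Malformed entry: skip it rather than fail the whole read.
    }
  }
  return out;
}
```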

LRANGE helper is inlined here because server/_shared/redis.ts has no
list helpers yet — this is the first list-reading handler in the
intelligence service. If a second list reader lands, the helper can
be promoted to a shared util.

Empty list / Redis miss / failed JSON parse all return
{ transitions: [] } so the client can distinguish "never changed" from
"upstream broken" via the HTTP status code, not the body.

Registered in handler.ts.

### Premium gating + cache tier

  src/shared/premium-paths.ts:   added /api/intelligence/v1/get-regime-history
  server/gateway.ts RPC_CACHE_TIER: same path with 'slow' tier (matches
                                    route-parity contract enforced by
                                    tests/route-cache-tier.test.mjs)

## Tests — 44 new unit tests

tests/regional-snapshot-regime-history.test.mjs (22 tests):

  buildTransitionEntry (7):
    - null on missing diff/region/snapshot
    - returns entry on regime change
    - first-ever transition (empty previous_label)
    - falls back to generated_at when transitioned_at is missing
    - preserves snapshot_id

  publishTransitionWithOps (8):
    - happy path (LPUSH + LTRIM both succeed)
    - canonical key prefix
    - LTRIM uses REGIME_HISTORY_MAX-1 stop
    - LPUSH failure → not pushed, LTRIM not called
    - LTRIM failure → pushed=true, trimmed=false (non-fatal)
    - LPUSH/LTRIM throwing caught and reported
    - null/empty entry → no-op

  recordRegimeTransition (5):
    - no-op on no regime change
    - records on regime change
    - publisher returning false → recorded=false
    - publisher exceptions swallowed
    - critical escalation labels preserved

  module constants (2): key prefix + max are valid

tests/get-regime-history.test.mts (22 tests):

  adaptTransition (4):
    - all fields snake → camel
    - missing fields → empty/zero defaults
    - first-ever transition shape preserved
    - non-numeric transitioned_at → 0

  handler structural checks (7): canonical key prefix, LRANGE usage,
    adapter export, handler export signature, MAX_LIMIT cap matches
    writer, missing-region short-circuit, malformed-entry filter

  intelligence handler registration (2): import + registration

  security wiring (2): premium path + cache-tier entry

  proto definition (7): RPC method declared, import wired, request
    shape, kebab regex, limit bounds, RegimeTransition fields,
    response shape

## Verification

- node --test tests/regional-snapshot-regime-history.test.mjs: 22/22 pass
- npx tsx --test tests/get-regime-history.test.mts: 22/22 pass
- npm run test:data: 4621/4621 pass
- npm run typecheck: clean
- npm run typecheck:api: clean
- biome lint on touched files: clean

## Deferred to future iterations

- Phase 3 PR2: weekly regional briefs LLM seeder (consumes regime history
  to highlight drift events in the weekly summary)
- Phase 3 PR3: UI block in RegionalIntelligenceBoard for regime drift
  timeline (can ride alongside or after PR2)
- Drift analytics: % of last N days spent in each regime, transition
  frequency rolling window, regime cycle detection
- Alert triggers on drift cycles (e.g., "thrashed between regimes 3 times
  in 7 days")

* fix(intelligence): address 2 review findings on #2981

P2 #1 — transition_driver always empty in the live path

buildRegimeState(balance, previousLabel, '') at Step 11 passed an empty
driver because the diff has not been computed at that point. The regime-history
recorder reads snapshot.regime.transition_driver which was therefore
always '' in production, despite tests exercising synthetic fixtures
with a populated driver.

Fix: after Step 15 derives triggerReason via inferTriggerReason(diff),
backfill regime.transition_driver = triggerReason when a genuine regime
change occurred. This ensures both the persisted snapshot's regime block
AND the regime-history entry carry the real driver (e.g., 'regime_shift',
'trigger_activation', 'corridor_break').

Added 2 regression tests: populated driver flows through, and pre-fix
empty-driver snapshots remain back-compatible.
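
The backfill step might look roughly like this. The function name and the in-place mutation are assumptions for illustration; only the rule (copy triggerReason onto regime.transition_driver when the regime genuinely changed) comes from the fix description.

```javascript
// Hypothetical sketch: runs after the diff and triggerReason are available.
function backfillTransitionDriver(snapshot, diff, triggerReason) {
  if (diff?.regime_changed && snapshot?.regime && triggerReason) {
    snapshot.regime.transition_driver = triggerReason;
  }
  return snapshot; // unchanged when there was no regime change
}
```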

P2 #2 — Redis failure returns cached false-empty history

get-regime-history.ts returned 200 {transitions:[]} on LRANGE failure.
The gateway caches 200 GET responses at the slow tier, so a transient
Upstash outage would be pinned as a false-empty history until the cache
TTL expired.

Fix: when redisLrange returns null (Redis unavailable or credentials
missing), the response now includes upstreamUnavailable: true in the
body. The gateway already checks for this flag in the response body
(line 434) and sets Cache-Control: no-store, so transient failures are
not cached.
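
The interaction between the handler flag and the gateway can be sketched as below. Both function names are hypothetical, and the tier header is passed in rather than guessed, since the slow-tier Cache-Control value is not stated here.

```javascript
// Handler side: a null LRANGE result means Redis was unavailable, not "no history".
function buildHistoryResponse(lrangeResult, adapt) {
  if (lrangeResult === null) {
    return { transitions: [], upstreamUnavailable: true };
  }
  return { transitions: lrangeResult.map(adapt) };
}

// Gateway side (assumed shape): flagged bodies are never cached.
function cacheControlFor(body, tierHeader) {
  return body && body.upstreamUnavailable === true ? 'no-store' : tierHeader;
}
```

With this split, a genuinely empty history still gets the normal tier header, while a transient outage is retried on the next request.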

Added 1 structural test asserting the upstreamUnavailable flag is set.

Verification:
- 24/24 writer tests, 23/23 handler tests, 4624/4624 full suite pass
- npm run typecheck: clean
- biome lint on touched files: clean

* fix(intelligence): correct misleading 'log once per region' comment (Greptile P2)

Author: Elie Habib
Date: 2026-04-12 07:58:01 +04:00
Committed by: GitHub
Parent: 6dab59faba
Commit: 19d67cea94
14 changed files with 1252 additions and 1 deletions


@@ -829,6 +829,55 @@ paths:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
  /api/intelligence/v1/get-regime-history:
    get:
      tags:
        - IntelligenceService
      summary: GetRegimeHistory
      description: |-
        GetRegimeHistory returns the region's regime transition log newest-first.
        Entries are append-only from the seed writer, recorded only when
        diffRegionalSnapshot reports regime_changed. Premium-gated.
      operationId: GetRegimeHistory
      parameters:
        - name: region_id
          in: query
          description: |-
            Display region id (e.g. "mena", "east-asia", "europe"). See shared/geography.js.
            Kebab-case: lowercase alphanumeric groups separated by single hyphens, no
            trailing or consecutive hyphens.
          required: false
          schema:
            type: string
        - name: limit
          in: query
          description: |-
            Optional cap on how many entries to return. Defaults to 50 server-side
            when omitted or <= 0. Hard cap enforced by the handler at 100 (= the
            writer-side LTRIM cap in regime-history.mjs).
          required: false
          schema:
            type: integer
            format: int32
      responses:
        "200":
          description: Successful response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetRegimeHistoryResponse'
        "400":
          description: Validation error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ValidationError'
        default:
          description: Error response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  schemas:
    Error:
@@ -2940,3 +2989,75 @@ components:
          items:
            type: string
      description: NarrativeSection is one block of narrative text plus its supporting evidence.
    GetRegimeHistoryRequest:
      type: object
      properties:
        regionId:
          type: string
          maxLength: 32
          minLength: 1
          pattern: ^[a-z][a-z0-9]*(-[a-z0-9]+)*$
          description: |-
            Display region id (e.g. "mena", "east-asia", "europe"). See shared/geography.js.
            Kebab-case: lowercase alphanumeric groups separated by single hyphens, no
            trailing or consecutive hyphens.
        limit:
          type: integer
          maximum: 100
          minimum: 0
          format: int32
          description: |-
            Optional cap on how many entries to return. Defaults to 50 server-side
            when omitted or <= 0. Hard cap enforced by the handler at 100 (= the
            writer-side LTRIM cap in regime-history.mjs).
      required:
        - regionId
      description: |-
        GetRegimeHistoryRequest asks for the recent regime transition log for a
        region. Returns newest-first. Phase 3 PR1 — see
        scripts/regional-snapshot/regime-history.mjs for the writer that
        populates the underlying Redis list on every regime change.
    GetRegimeHistoryResponse:
      type: object
      properties:
        transitions:
          type: array
          items:
            $ref: '#/components/schemas/RegimeTransition'
      description: |-
        GetRegimeHistoryResponse returns the region's regime transition log
        newest-first. The list is append-only from the seed writer's perspective:
        only diffs with regime_changed set produce an entry, so this is a pure
        transition stream (no steady-state noise).
    RegimeTransition:
      type: object
      properties:
        regionId:
          type: string
        label:
          type: string
          description: Current regime label (the label the region just moved INTO).
        previousLabel:
          type: string
          description: |-
            Previous regime label (the label the region was in before). Empty for
            the first-ever recorded transition for a region.
        transitionedAt:
          type: integer
          format: int64
          description: |-
            Unix ms when the transition was recorded. Mirrors
            snapshot.regime.transitioned_at when available.. Warning: Values > 2^53 may lose precision in JavaScript
        transitionDriver:
          type: string
          description: |-
            Free-text driver string from the seed writer (e.g. "cross_source_surge").
            May be empty.
        snapshotId:
          type: string
          description: |-
            Snapshot id that materialized this transition. Points back to the
            full snapshot via intelligence:snapshot-by-id:v1:{snapshot_id}.
      description: |-
        RegimeTransition is a single recorded regime change moment. One of these
        lands in the log each time diffRegionalSnapshot() reports regime_changed.