mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
main
2 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
abdcdb581f |
feat(resilience): SWF manifest expansion + KIA split + new schema fields (#3391)
* feat(resilience): SWF manifest expansion + KIA split + new schema fields
Phase 1 of plan 2026-04-25-001 (Codex-approved round 5). Manifest-only
data correction; no construct change, no cache prefix bump.
Schema additions (loader-validated, misplacement-rejected):
- top-level: aum_usd, aum_year, aum_verified (primary-source AUM)
- under classification: aum_pct_of_audited (fraction multiplier),
excluded_overlaps_with_reserves (boolean; documentation-only)
Manifest expansion (13 → 21 funds, 6 → 13 countries):
- UAE: +ICD ($320B verified), +ADQ ($199B verified), +EIA (unverified —
loaded for documentation, excluded from scoring per data-integrity rule)
- KW: kia split into kia-grf (5%, access=0.9) + kia-fgf (95%, access=0.20).
Corrects ~18× over-statement of crisis-deployable Kuwait sovereign
wealth (audit found combined-AUM × 0.7 access applied $750B as
"deployable" against ~$15B actual GRF stabilization capacity).
- CN: +CIC ($1.35T), +NSSF ($400B, statutorily-gated 0.20 tier),
+SAFE-IC ($417B, excluded — overlaps SAFE FX reserves)
- HK: +HKMA-EF ($498B, excluded — overlaps HKMA reserves)
- KR: +KIC ($182B, IFSWF full member)
- AU: +Future Fund ($192B, pension-locked)
- OM: +OIA ($50B, IFSWF member)
- BH: +Mumtalakat ($19B)
- TL: +Petroleum Fund ($22B, GPFG-style high-transparency)
Re-audits (Phase 1E):
- ADIA access 0.3 → 0.4 (rubric flagged; ruler-discretionary deployment
empirically demonstrated)
- Mubadala access 0.4 → 0.5 (rubric flagged); transparency 0.6 → 0.7
(LM=10 + IFSWF full member alignment)
Rubric (docs/methodology/swf-classification-rubric.md):
- New "Statutorily-gated long-horizon" 0.20 access tier added between
0.1 (sanctions/frozen) and 0.3 (intergenerational/ruler-discretionary).
Anchored by KIA-FGF (Decree 106 of 1976; Council-of-Ministers + Emir
decree gate; crossed once in extremis during COVID).
Seeder:
- Two new pure helpers: shouldSkipFundForBuffer (excluded/unverified
decision) and applyAumPctOfAudited (sleeve fraction multiplier)
- Manifest-AUM bypass: if aum_verified=true AND aum_usd present, use
that value directly (skip Wikipedia)
- Skip funds with excluded_overlaps_with_reserves=true (no
double-counting against reserveAdequacy / liquidReserveAdequacy)
- Skip funds with aum_verified=false (load for documentation only)
Tests (+25 net):
- 15 schema-extension tests (misplacement rejection, value-range gates,
rationale-pairing coherence, backward-compat with pre-PR entries)
- 10 helper tests (shouldSkipFundForBuffer + applyAumPctOfAudited
predicates and arithmetic; KIA-GRF + KIA-FGF sum equals combined AUM)
- Existing manifest test updated for the kia → kia-grf+kia-fgf split
Full suite: 6,940 tests pass (+50 net), typecheck clean, no new lint.
Predicted ranking deltas (informational, NOT acceptance criteria per
plan §"Hard non-goals"):
- AE sovFiscBuf likely 39 → 47-49 (Phase 1A + 1E)
- KW sovFiscBuf likely 98 → 53-57 (Phase 1B)
- CN, HK (excluded), KR, AU acquire newly-defined sovFiscBuf scores
- GCC ordering shifts toward QA > KW > AE; AE-KW gap likely 6 → ~3-4
Real outcome will be measured post-deploy via cohort audit per plan
§Phase 4.
* fix(resilience): completeness denominator excludes documentation-only funds
PR-3391 review (P1 catch): the per-country `expectedFunds` denominator
counted ALL manifest entries (`funds.length`) including those skipped
from buffer scoring by design — `excluded_overlaps_with_reserves: true`
(SAFE-IC, HKMA-EF) and `aum_verified: false` (EIA). Result: countries
with mixed scorable + non-scorable rosters showed `completeness < 1.0`
even when every scorable fund matched. UAE (4 scorable + EIA) would
show 0.8; CN (CIC + NSSF + SAFE-IC excluded) would show 0.67. The
downstream scorer then derated those countries' coverage based on a
fake-partial signal.
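The 0.8 and 0.67 figures above can be reproduced with a minimal sketch. The helper name `shouldSkipFundForBuffer` comes from the commit message, but its body here, the `completenessRatio` function, and the shape of the fund objects are assumptions made for illustration, not the seeder's actual code.

```javascript
// Assumed predicate: a fund is documentation-only when it is flagged as
// overlapping reserves or when its AUM is explicitly unverified.
function shouldSkipFundForBuffer(fund) {
  const excluded = fund.classification?.excluded_overlaps_with_reserves === true;
  const unverified = fund.aum_verified === false;
  return excluded || unverified;
}

// Hypothetical helper: ratio of matched funds to expected funds.
// scorableOnly=false counts every manifest entry (the reported bug);
// scorableOnly=true counts only funds that can be scored.
function completenessRatio(funds, matchedIds, scorableOnly) {
  const expected = scorableOnly
    ? funds.filter((f) => !shouldSkipFundForBuffer(f))
    : funds;
  if (expected.length === 0) return null; // nothing to expect for this country
  const matched = expected.filter((f) => matchedIds.has(f.id)).length;
  return matched / expected.length;
}

// Illustrative rosters (ids only; real entries carry many more fields).
const uae = [
  { id: 'adia' }, { id: 'mubadala' }, { id: 'icd' }, { id: 'adq' },
  { id: 'eia', aum_verified: false }, // documentation-only
];
const cn = [
  { id: 'cic' }, { id: 'nssf' },
  { id: 'safe-ic', classification: { excluded_overlaps_with_reserves: true } },
];
const uaeMatched = new Set(['adia', 'mubadala', 'icd', 'adq']);
const cnMatched = new Set(['cic', 'nssf']);

console.log(completenessRatio(uae, uaeMatched, false)); // raw denominator
console.log(completenessRatio(uae, uaeMatched, true));  // scorable denominator
console.log(completenessRatio(cn, cnMatched, false));
console.log(completenessRatio(cn, cnMatched, true));
```

With the raw denominator the skipped funds can never match, so they only ever pull the ratio down; filtering them out of the expected set removes them from both sides of the ratio.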
Three call sites all carried the same bug:
- per-country `expectedFunds` in fetchSovereignWealth main loop
- `expectedFundsTotal` + `expectedCountries` in buildCoverageSummary
- `countManifestFundsForCountry` (missing-country path)
All three now filter via `shouldSkipFundForBuffer` to count only
scorable manifest entries. Documentation-only funds are neither
expected nor matched — they don't appear in the ratio at all.
Tests added (+4):
- AE complete with all 4 scorable matched (EIA documented but excluded)
- CN complete with CIC + NSSF matched (SAFE-IC documented but excluded)
- Missing-country path returns scorable count not raw manifest count
- Country with ONLY documentation-only entries excluded from
expectedCountries
Full suite: 6,944 tests pass (+4 net), typecheck clean.
* fix(resilience): address Greptile P2s on PR #3391 manifest
Three review findings, all in the manifest YAML:
1. **KIA-GRF access 0.9 → 0.7** (rubric alignment): GRF deployment
requires active Council-of-Ministers authorization (2020 COVID
precedent demonstrates this), not rule-triggered automatic
deployment. The rubric's 0.9 tier ("Pure automatic stabilization")
is reserved for funds where political authorization is post-hoc /
symbolic (Chile ESSF candidate). KIA-GRF correctly fits 0.7
("Explicit stabilization with rule") — the same tier the pre-split
combined-KIA was assigned. Updated rationale clarifies the tier
choice. Rubric's 0.7 precedent column already lists "KIA General
Reserve Fund" — now consistent with the manifest.
2. **Duplicate `# ── Australia ──` header before Oman** (copy-paste
artifact): removed the orphaned header at the Oman section; added
proper `# ── Australia ──` header above the Future Fund entry where
it actually belongs (after Timor-Leste).
3. **NSSF `aum_pct_of_audited: 1.0` removed** (no-op): a multiplier of
1.0 is identity. The schema field is OPTIONAL and only meant for
fund-of-funds split entries (e.g. KIA-GRF/FGF). Setting it to 1.0
forced the loader to require an `aum_pct_of_audited` rationale
paragraph with no computational benefit. Both the field and the
paragraph are now removed; NSSF remains a single-sleeve entry that
scores its full audited AUM.
Full suite: 6,944 tests pass, typecheck clean. |
||
|
|
8032dc3a04 |
feat(resilience): PR 2 pre-scorer — SWF manifest + seeder (8/8 funds) (#3305)
* feat(resilience): PR 2 scaffolding — SWF classification manifest + seeder skeleton
Plan §3.4. First of multiple commits for PR 2 (fiscal-buffer split
and sovereign-wealth integration). This commit is SCAFFOLDING ONLY:
no dimension wiring, no scorer, no cache-keys entry yet. The goal is
to land the reviewer-facing metadata and the seeder's three-tier
source shape so an external SWF practitioner can critique before we
wire the scorer.
What is in:
1. docs/methodology/swf-classification-manifest.yaml — authoritative
per-fund classification for the `sovereignFiscalBuffer` dimension.
First-pass estimates for the 8 funds named in plan §3.4 table:
Norway GPFG, UAE ADIA + Mubadala, Saudi PIF, Kuwait KIA,
Qatar QIA, Singapore GIC + Temasek. Each fund carries:
- three-component classification (access, liquidity, transparency)
each on [0, 1], with rationale text citing the mandate / fiscal
rule / asset-mix / transparency-index evidence
- source URLs for audit
Fund-candidates deferred for external-reviewer decision are listed
in a trailing comment block (CIC, NWF, SOFAZ, NSIA, Future Fund,
NZ Super, ESSF, etc.).
external_review_status: PENDING — flip to REVIEWED on sign-off.
2. scripts/shared/swf-manifest-loader.mjs — YAML parser + strict schema
validator. Fails loudly on any deviation (out-of-range scores,
non-ISO2 countries, missing rationale, duplicate fund IDs, wrong
manifest version). Single source of truth for the seeder, future
scorer, and methodology-doc linter.
3. scripts/seed-sovereign-wealth.mjs — seeder shell with the three-tier
source priority from plan §3.4:
1. Official fund disclosures (MoF, central-bank, annual reports)
2. IFSWF member filings
3. SWFI public fund-rankings page (license-free fallback, scraped)
Tiers 1-3 are all stubbed (return null) in this commit — the
seeder publishes a well-formed empty payload so the scorer IMPUTE
fallback can be exercised end-to-end without live data.
emptyDataIsFailure: false is set deliberately so pre-wiring cron
runs do not poison seed-meta (see
feedback_strict_floor_validate_fail_poisons_seed_meta.md).
SWFI scrape target is documented in the file header with the
exact URL and a 2.5s inter-request interval. The scraper itself
lands in the next commit after the external reviewer signs off
on the manifest.
4. tests/swf-classification-manifest.test.mjs — 14 tests exercising
both the shipped YAML (plan §3.4 required-fund presence, [0,1]
bounds, rationale length, source citations, multi-fund country
handling) and the validator's schema enforcement (rejects out-
of-range scores, non-ISO2 codes, missing rationale, empty sources,
duplicates, wrong version, invalid review status).
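The "fails loudly" validation described in item 2 could look roughly like the sketch below. It is an assumption-laden illustration: the function names, the single `rationale` field, and the error wording are invented here, and only the checks themselves (score bounds, ISO2 country, rationale, sources, duplicate ids) follow the list above.

```javascript
// Hypothetical per-fund validator; returns a list of human-readable errors.
function validateFund(fund, seenIds) {
  const errors = [];
  if (typeof fund.id !== 'string' || fund.id === '') {
    errors.push('fund id missing');
  } else if (seenIds.has(fund.id)) {
    errors.push(`duplicate fund id: ${fund.id}`);
  } else {
    seenIds.add(fund.id);
  }
  // ISO 3166-1 alpha-2 shape check only (two uppercase letters).
  if (typeof fund.country !== 'string' || !/^[A-Z]{2}$/.test(fund.country)) {
    errors.push(`${fund.id}: country must be an ISO2 code`);
  }
  // Each classification component must be a number on [0, 1].
  for (const key of ['access', 'liquidity', 'transparency']) {
    const value = fund.classification?.[key];
    if (typeof value !== 'number' || Number.isNaN(value) || value < 0 || value > 1) {
      errors.push(`${fund.id}: ${key} must be a number in [0, 1]`);
    }
  }
  if (typeof fund.rationale !== 'string' || fund.rationale.trim() === '') {
    errors.push(`${fund.id}: rationale missing`);
  }
  if (!Array.isArray(fund.sources) || fund.sources.length === 0) {
    errors.push(`${fund.id}: at least one source URL required`);
  }
  return errors;
}

// Throws on the first invalid manifest so a bad file cannot reach the seeder.
function validateManifest(manifest) {
  const seenIds = new Set();
  const errors = manifest.funds.flatMap((fund) => validateFund(fund, seenIds));
  if (errors.length > 0) {
    throw new Error(`invalid SWF manifest:\n${errors.join('\n')}`);
  }
  return manifest;
}

// Illustrative entry; the scores and URL are placeholders, not manifest data.
const exampleFund = {
  id: 'example-fund',
  country: 'NO',
  classification: { access: 0.5, liquidity: 0.5, transparency: 0.5 },
  rationale: 'Placeholder rationale text.',
  sources: ['https://example.org/annual-report'],
};
validateManifest({ funds: [exampleFund] });
```

Collecting every error before throwing is a design choice of this sketch: a reviewer editing the YAML sees all problems in one run rather than fixing them one at a time.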
Out of scope for this commit (follow-ups, in order):
- Implement SWFI scrape + IFSWF parse + per-fund official endpoints
- Add `liquidReserveAdequacy` and `sovereignFiscalBuffer` dimensions
to RESILIENCE_DIMENSION_ORDER, registry, and scorers
- Retire `reserveAdequacy` via RESILIENCE_RETIRED_DIMENSIONS
- cache-keys.ts + api/bootstrap.js + api/health.js wiring (new
seed key needs ON_DEMAND_KEYS gating per Railway-cron bake-in rule)
- Recovery-domain weight rebalance + Spearman sensitivity rerun
- Methodology doc: rewrite the reserveAdequacy section
Tests: 508/508 pass (resilience suite + new manifest tests).
Typecheck clean on both tsconfig.json and tsconfig.api.json.
No external-facing behavior change — all files are new + isolated.
* feat(resilience): PR 2 commit 2 — Wikipedia SWF scraper + SWFI pivot
Implements Tier 3 of the sovereignFiscalBuffer seeder. Tier 1 (official
disclosures) and Tier 2 (IFSWF filings) remain stubbed — they require
per-fund bespoke adapters and will land incrementally.
SWFI pivot
----------
The plan's original Tier 3 target was
https://www.swfinstitute.org/fund-rankings/sovereign-wealth-fund. Live
check on 2026-04-23: the page's <tbody> is empty and AUM is gated
behind a lead-capture form (name + company + job title). SWFI per-fund
/profile/<id> pages are similarly barren. The "public fund rankings"
is effectively no longer public; scraping the lead-gated surface would
require submitting fabricated contact info (TOS violation, legally
questionable), so Tier 3 pivots to Wikipedia.
Wikipedia is legally clean (CC-BY-SA 4.0, attribution required — see
WIKIPEDIA_SOURCE_ATTRIBUTION in the seeder) and structurally scrapable.
The SWFI Linaburg-Maduell Transparency Index mentioned in manifest
rationale text is a SEPARATE SWFI publication (public index scores),
not the fund-rankings paywall — those citations stay valid.
What is in
----------
1. scripts/seed-sovereign-wealth.mjs — Wikipedia scraper implementation:
- parseWikipediaRankingsTable(html) — exported pure function so
the parser is unit-testable without a live fetch. Extracts the
wikitable, parses per-fund rows (Country, Abbrev, Fund name,
Assets USD B, Inception, Origin).
- Strip-HTML helper strips <sup> tags to SPACES (not empty) so
`302.0<sup>41</sup>` stays `302.0 41` — otherwise the decimal
value and its trailing footnote ref get welded into `302.041`,
which the Assets regex mis-parses.
- matchWikipediaRecord(fund, cache) — abbrev + fund-name lookup
with country disambiguation: lookup maps are now
Map<key, Record[]> (list) rather than Map<key, Record>, and the
matcher filters the list by manifest country before returning.
This is the exact fix for the PIF collision:
"PIF" resolves to BOTH Saudi Arabia's Public Investment Fund
(~USD 925B) and Palestine's Palestine Investment Fund (~USD 900M)
on the live article. Without country-filtering, Map.set silently
overwrites one with the other, so Saudi PIF would return
Palestine's AUM — three orders of magnitude wrong.
- When the country disambiguator cannot pick, returns null rather
than a best-guess. Seeder logs the unmatched fund; the IMPUTE
path handles it gracefully.
2. docs/methodology/swf-classification-manifest.yaml — added
`wikipedia` hints block to each of the 8 funds (abbrev and/or
fund_name, matching Wikipedia's canonical naming).
3. scripts/shared/swf-manifest-loader.mjs — optional `wikipedia` field
in the schema: `abbrev` and `fund_name` both optional strings, but
at least one must be present if the block is provided.
4. tests/seed-sovereign-wealth.test.mjs — 12 tests exercising:
- fixture-based parser: abbrev/name indexing, HTML + footnote
stripping, decimal AUM, malformed rows skipped, missing-table error
- abbrev-collision handling: both candidates retained in the list
- country-disambiguation matcher: Saudi PIF correctly picked from
a Saudi-vs-Palestine collision fixture (the exact live bug)
- ambiguous lookup with unknown country returns null, not wrong record
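The footnote-stripping behaviour described in item 1 can be illustrated with a short sketch. The function names and regular expressions below are assumptions; the only detail taken from the commit message is that tags become spaces so that `302.0<sup>41</sup>` reads as `302.0 41`.

```javascript
// Hypothetical helper: replace every tag with a space, then collapse runs of
// whitespace. Replacing with '' instead would weld "302.0" and "41" together.
function stripTagsToSpaces(cellHtml) {
  return cellHtml.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Hypothetical helper: read the leading decimal number of a cleaned cell.
function parseLeadingNumber(text) {
  const match = /^(\d[\d,]*(?:\.\d+)?)/.exec(text);
  return match ? Number(match[1].replace(/,/g, '')) : null;
}

const cell = '302.0<sup>41</sup>';
const cleaned = stripTagsToSpaces(cell);       // "302.0 41"
const welded = cell.replace(/<[^>]+>/g, '');   // "302.041" (the failure mode)

console.log(parseLeadingNumber(cleaned)); // 302
console.log(parseLeadingNumber(welded));  // 302.041
```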
Live verification against the shipped Wikipedia article: 7/8 funds
matched with the correct country; Saudi PIF now correctly returns
USD 925B (not Palestine's USD 0.9B) because of the country-
disambiguation fix. Temasek is the one miss — Wikipedia does not
classify it as an SWF (practitioner debate; it lists under "state
holding companies" instead). Falls through to IMPUTE in the scorer
until Tier 1/2 adapters land with an official-disclosure source.
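The PIF collision and its fix can be sketched as follows. This is a simplified illustration, not the shipped `matchWikipediaRecord`: the function names, the record shape, and the `countryName` field on the fund are assumptions (the real manifest stores ISO2 codes, so some mapping to Wikipedia's country names would be needed). The AUM figures are the approximate values quoted in the commit message.

```javascript
// Build an abbrev index that keeps every record per key (Map<key, Record[]>),
// so a second "PIF" does not overwrite the first.
function buildAbbrevIndex(records) {
  const index = new Map();
  for (const record of records) {
    const key = record.abbrev.toLowerCase();
    const list = index.get(key) ?? [];
    list.push(record);
    index.set(key, list);
  }
  return index;
}

// Return the single candidate in the fund's country, or null when the
// country filter cannot pick exactly one record.
function matchByAbbrev(fund, index) {
  const candidates = index.get(fund.abbrev.toLowerCase()) ?? [];
  const sameCountry = candidates.filter((r) => r.country === fund.countryName);
  return sameCountry.length === 1 ? sameCountry[0] : null;
}

const records = [
  { abbrev: 'PIF', country: 'Saudi Arabia', aumUsdBillions: 925 },
  { abbrev: 'PIF', country: 'Palestine', aumUsdBillions: 0.9 },
];
const index = buildAbbrevIndex(records);

const saudiPif = matchByAbbrev({ abbrev: 'PIF', countryName: 'Saudi Arabia' }, index);
console.log(saudiPif.aumUsdBillions); // 925

// An unknown country yields null rather than a best guess.
console.log(matchByAbbrev({ abbrev: 'PIF', countryName: 'Atlantis' }, index)); // null
```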
Tests: 522/522 pass (resilience + manifest + scraper).
Typecheck clean on both tsconfig.json and tsconfig.api.json.
Still stubbed for later commits:
- Tier 1 per-fund official-disclosure adapters (incl. Temasek)
- Tier 2 IFSWF secretariat parser
- Dimension wiring (liquidReserveAdequacy, sovereignFiscalBuffer)
- reserveAdequacy retirement via RESILIENCE_RETIRED_DIMENSIONS
- cache-keys / bootstrap / health.js wiring (ON_DEMAND_KEYS until bake-in)
- Recovery-domain weight rebalance + Spearman sensitivity rerun
* feat(resilience): PR 2 commit 3 — Wikipedia infobox fallback + FX → 8/8 match
Closes the Temasek gap. The Wikipedia list article excludes Temasek on
editorial grounds (classified as a "state holding company" rather than
an SWF), so the Tier-3 list-only path topped out at 7/8 funds matched.
This commit adds Tier 3b — per-fund Wikipedia article infobox scrape
— and a baked-in FX table to handle non-USD infobox currencies.
Live verification on the shipped Wikipedia articles: 8/8 funds matched.
Temasek: S$ 434B → US$ 321B via infobox + SGD→USD FX.
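The conversion step behind that figure can be sketched as below. The rates are illustrative assumptions chosen to be consistent with the numbers quoted in this commit message (S$ 434B to roughly US$ 321B, NOK 18.7T to roughly USD 1.74T); they are not the values in the seeder's FX_TO_USD table, and the function and constant names are hypothetical.

```javascript
// Illustrative rates only; the real table and its review date live in the seeder.
const ILLUSTRATIVE_FX_TO_USD = { USD: 1.0, SGD: 0.74, NOK: 0.093 };
const UNIT_MULTIPLIER = { million: 1e6, billion: 1e9, trillion: 1e12 };

// Convert "NUMBER unit" in a given currency to USD. Returns null when the
// currency or unit is not in the tables (a simplification of this sketch;
// the commit describes a USD fallback for unknown currencies).
function toUsd(amount, unit, currency) {
  const fxRate = ILLUSTRATIVE_FX_TO_USD[currency];
  const multiplier = UNIT_MULTIPLIER[unit];
  if (fxRate === undefined || multiplier === undefined) return null;
  return { aumUsd: amount * multiplier * fxRate, fxRate };
}

const temasek = toUsd(434, 'billion', 'SGD');
console.log(Math.round(temasek.aumUsd / 1e9)); // billions of USD, rounded

const gpfg = toUsd(18.7, 'trillion', 'NOK');
console.log(Math.round(gpfg.aumUsd / 1e10) / 100); // trillions of USD, 2 d.p.
```

Returning the applied `fxRate` alongside the converted value mirrors the intent stated in the commit of keeping stale rates visible at audit time.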
Implementation
1. scripts/seed-sovereign-wealth.mjs
- FX_TO_USD table (USD, SGD, NOK, EUR, GBP, AED, SAR, KWD, QAR)
with FX_RATES_REVIEWED_AT='2026-04-23' committed into the seed
payload so stale rates are visible at audit time.
- CURRENCY_SYMBOL_TO_ISO ordered list — US$ tested before S$ before
bare $, and $ / kr require a space + digit neighbor to avoid
false-matches in rich prose.
- detectCurrency(text) exported pure for unit testing.
- parseWikipediaArticleInfobox(html) exported pure — scans rows
for "Total assets" / "Assets under management" / "AUM" / "Net
assets" / "Net portfolio value" labels, extracts "NUMBER (trillion
| billion | million) (YEAR)" values, applies FX conversion.
- fetchWikipediaInfobox(fund) — per-fund article fetch, gated on
the manifest's wikipedia.article_url hint.
- sourceMix split into {official, ifswf, wikipedia_list,
wikipedia_infobox} counters so the seed payload shows which tier
delivered each fund.
- Source priority chain: official → ifswf → wikipedia_list →
wikipedia_infobox. Infobox last because it is N network round-
trips; amortizing over the list article cache first minimizes
live traffic.
2. docs/methodology/swf-classification-manifest.yaml
- Temasek entry gains wikipedia.article_url:
https://en.wikipedia.org/wiki/Temasek_Holdings with an inline
comment explaining why the list-article path misses.
3. scripts/shared/swf-manifest-loader.mjs
- article_url optional field; validator rejects anything that is
not a https://<lang>.wikipedia.org/... URL so a typo cannot
silently wire the seeder to an off-site fetch.
4. tests/seed-sovereign-wealth.test.mjs (10 new tests, 38/38 pass)
- detectCurrency distinguishes US$ vs S$ vs bare $.
- parseWikipediaArticleInfobox extracts Temasek S$ 434B → US$ 321B
with year tag from "(2025)".
- USD-native row pass-through with fxRate=1.0.
- NOK trillion conversion (NOK 18.7T → USD 1.74T).
- Returns null when no AUM row / no infobox at all.
- Documents the unknown-currency → USD fallback contract.
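Why the symbol list has to be ordered can be shown with a small sketch. The patterns below are one possible reading of "US$ tested before S$ before bare $" and of the space-plus-digit rule for `$` and `kr`; they are assumptions, not the contents of CURRENCY_SYMBOL_TO_ISO, and this `detectCurrency` returns null for unknown input and leaves any fallback to the caller.

```javascript
// Ordered most-specific first: "S$" is a substring of "US$", so testing S$
// first would misclassify US dollar amounts as Singapore dollars.
const CURRENCY_PATTERNS = [
  [/US\$/, 'USD'],
  [/S\$/, 'SGD'],
  // Bare "$" and "kr" only count when they sit next to a digit, so prose
  // such as "costs $more" or "Denmark" does not trigger a match.
  [/(?:^|\s)\$\s?\d/, 'USD'],
  [/(?:^|\s)kr\s?\d/i, 'NOK'],
];

// Return the ISO code of the first matching pattern, or null if none match.
function detectCurrency(text, patterns = CURRENCY_PATTERNS) {
  for (const [pattern, iso] of patterns) {
    if (pattern.test(text)) return iso;
  }
  return null;
}

console.log(detectCurrency('US$ 321 billion'));    // USD
console.log(detectCurrency('S$ 434 billion'));     // SGD
console.log(detectCurrency('$ 12 billion'));       // USD
console.log(detectCurrency('kr 18.7 trillion'));   // NOK
console.log(detectCurrency('costs $more'));        // null

// Swapping the first two entries reproduces the misclassification.
const wrongOrder = [CURRENCY_PATTERNS[1], CURRENCY_PATTERNS[0]];
console.log(detectCurrency('US$ 321 billion', wrongOrder)); // SGD
```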
Tests: 532/532 pass (full resilience + manifest + scraper suite).
Typecheck clean on both tsconfig.json and tsconfig.api.json.
Still stubbed for later commits:
- Tier 1 per-fund official-disclosure adapters
- Tier 2 IFSWF secretariat parser
- Dimension wiring (liquidReserveAdequacy, sovereignFiscalBuffer)
- reserveAdequacy retirement via RESILIENCE_RETIRED_DIMENSIONS
- cache-keys / bootstrap / health.js wiring (ON_DEMAND_KEYS)
- Recovery-domain weight rebalance + Spearman sensitivity rerun
* refactor(resilience): reuse project-shared FX infrastructure for SWF seeder
Self-caught duplication from the previous commit (
|