mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(seed-contract): PR 1 foundation — envelope helpers + contract validators + static conformance test
Adds the foundational pieces for the unified seed contract rollout described in
docs/plans/2026-04-14-002-fix-runseed-zero-record-lockout-plan.md. Behavior-
preserving by construction: legacy-shape Redis values unwrap as { _seed: null,
data: raw } and pass through every helper unchanged.
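The passthrough rule above can be sketched as follows. This is a minimal illustration that agrees with the cases in tests/seed-envelope.test.mjs, not the contents of scripts/_seed-envelope-source.mjs; the isSeedMeta helper and the "numeric fetchedAt marks an envelope" rule are assumptions.

```javascript
// Hypothetical sketch; the real helper lives in scripts/_seed-envelope-source.mjs.
function isSeedMeta(meta) {
  // Assumption: a numeric fetchedAt is what distinguishes a real envelope.
  return meta !== null && typeof meta === 'object' && typeof meta.fetchedAt === 'number';
}

function unwrapEnvelope(raw) {
  if (raw === null || raw === undefined) return { _seed: null, data: null };

  let value = raw;
  if (typeof raw === 'string') {
    try {
      value = JSON.parse(raw);
    } catch {
      // Not JSON: the string itself is treated as legacy data.
      return { _seed: null, data: raw };
    }
  }

  const looksWrapped =
    value !== null && typeof value === 'object' && !Array.isArray(value) && isSeedMeta(value._seed);
  if (looksWrapped) return { _seed: value._seed, data: value.data };

  // Legacy shape (or malformed _seed): the whole value becomes the data.
  return { _seed: null, data: value };
}
```

Because every non-envelope input falls into the last branch, readers can call the helper unconditionally while writers are still producing legacy values.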
New files:
- scripts/_seed-envelope-source.mjs — single source of truth for unwrapEnvelope,
stripSeedEnvelope, buildEnvelope.
- api/_seed-envelope.js — edge-safe mirror (AGENTS.md:80 forbids api/* importing
from server/).
- server/_shared/seed-envelope.ts — TS mirror with SeedMeta, SeedEnvelope,
UnwrapResult types.
- scripts/_seed-contract.mjs — SeedContractError + validateDescriptor (10
required fields, 10 optional, unknown-field rejection) + resolveRecordCount
(non-negative integer or throw).
- scripts/verify-seed-envelope-parity.mjs — diffs function bodies between the
two JS copies; TS copy guarded by tsc.
- tests/seed-envelope.test.mjs — 14 tests for the three helpers (null,
legacy-passthrough, stringified JSON, round-trip).
- tests/seed-contract.test.mjs — 25 tests for validateDescriptor/
resolveRecordCount + a soft-warn conformance scan that STATICALLY parses
scripts/seed-*.mjs (never dynamic import — several seeders process.exit() at
module load). Currently logs 91 seeders awaiting declareRecords migration.
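As a rough illustration of the required-field check and unknown-field rejection described for validateDescriptor, assuming a plain object descriptor. The field lists below are a small subset taken from names mentioned in this message (which list each name belongs to is partly assumed), and validateDescriptorShape is a hypothetical name, not the function in scripts/_seed-contract.mjs.

```javascript
// Illustrative subsets only; the real whitelists are longer.
const REQUIRED_FIELDS = ['domain', 'resource', 'canonicalKey', 'sourceVersion', 'ttlSeconds', 'maxStaleMin'];
const OPTIONAL_FIELDS = ['metaTtlSeconds'];

function validateDescriptorShape(descriptor) {
  const allowed = new Set([...REQUIRED_FIELDS, ...OPTIONAL_FIELDS]);
  const missing = REQUIRED_FIELDS.filter((field) => !(field in descriptor));
  const unknown = Object.keys(descriptor).filter((field) => !allowed.has(field));

  if (missing.length > 0) {
    throw new Error(`missing required field(s): ${missing.join(', ')}`);
  }
  if (unknown.length > 0) {
    // A typo such as `ttlSecond` fails loudly here instead of being ignored.
    throw new Error(`unknown field(s): ${unknown.join(', ')}`);
  }
  return descriptor;
}
```

Rejecting unknown keys is what makes the whitelist matter: any option a seeder already passes has to be listed before validation is wired in.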
Wiring (minimal, behavior-preserving):
- api/health.js: imports unwrapEnvelope; routes readSeedMeta's parsed value
through it. Legacy meta has no _seed wrapper → passes through unchanged.
- scripts/_bundle-runner.mjs: readSectionFreshness prefers envelope at
section.canonicalKey when present, falls back to the existing
seed-meta:<key> read via section.seedMetaKey (unchanged path today since no
bundle defines canonicalKey yet).
No seeder modified. No writes changed. All 5279 existing data tests still
green; both typechecks clean; parity verifier green; 39 new tests pass.
PR 2 will migrate seeders, bundles, and readers to envelope semantics. PR 3
removes the legacy path and hard-fails the conformance test.
* fix(seed-contract): address PR #3095 review — metaTtlSeconds opt, bundle fallback, strict conformance mode
Review findings applied:
P1 — metaTtlSeconds missing from OPTIONAL_FIELDS whitelist.
scripts/seed-jodi-gas.mjs:250 passes metaTtlSeconds to runSeed(); field is
consumed by _seed-utils writeSeedMeta. Without it in the whitelist, PR 2's
validateDescriptor wiring would throw 'unknown field' the moment jodi-gas
migrates. Added with a 'removed in PR 3' note.
P2 — Bundle canonicalKey short-circuit over-runs during migration.
readSectionFreshness previously returned null if canonicalKey had no envelope
yet, even when a legacy seed-meta key was also declared — making every cron
re-run the section. Fixed to fall through to seedMetaKey on null envelope so
the transition state is safe.
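The fall-through can be pictured like this. It is a sketch under stated assumptions: the `read` callback stands in for the runner's Redis lookup (and is synchronous only to keep the example short), and the returned shape and the `fetchedAt` field on legacy seed-meta are hypothetical.

```javascript
// Hypothetical sketch of the post-fix lookup order in readSectionFreshness.
function readSectionFreshness(section, read) {
  if (section.canonicalKey) {
    const value = read(section.canonicalKey);
    const seed = value && typeof value === 'object' ? value._seed : null;
    if (seed && typeof seed.fetchedAt === 'number') {
      return { fetchedAt: seed.fetchedAt, source: 'envelope' };
    }
    // No envelope yet: fall through to the legacy key instead of returning null.
  }
  if (section.seedMetaKey) {
    const meta = read(section.seedMetaKey);
    if (meta && typeof meta.fetchedAt === 'number') {
      return { fetchedAt: meta.fetchedAt, source: 'seed-meta' };
    }
  }
  return null;
}
```

With both keys declared, a section only reports "no freshness" when neither the envelope nor the legacy meta exists, so a half-migrated bundle does not re-run on every cron tick.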
P3 — Conformance soft-warn signal was invisible in CI.
tests/seed-contract.test.mjs now emits a t.diagnostic summary line
('N/M seeders export declareRecords') visible on every run and gates hard-fail
behind SEED_CONTRACT_STRICT=1 so PR 3 can flip to strict without more code.
Nitpick — parity regex missed 'export async function'.
Added '(?:async\s+)?' to scripts/verify-seed-envelope-parity.mjs function
extraction regex.
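For illustration, a pattern with the optional async group might look like the one below. The surrounding regex is hypothetical; only the `(?:async\s+)?` fragment comes from this change.

```javascript
// Hypothetical header pattern; the verifier's real regex may differ.
const FUNCTION_HEADER = /export\s+(?:async\s+)?function\s+([A-Za-z_$][\w$]*)\s*\(/g;

function exportedFunctionNames(source) {
  // matchAll requires the `g` flag; capture group 1 is the function name.
  return Array.from(source.matchAll(FUNCTION_HEADER), (match) => match[1]);
}
```

Without the optional group, an `export async function` helper would be skipped entirely and could drift between the two copies unnoticed.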
Verified: 39 tests green, parity verifier green, strict mode correctly
hard-fails with 91 seeders missing (expected during PR 1).
* fix(seed-contract): address review round 2 — NaN/empty-string validation, Error cause, parity CI wiring
P2 — Non-finite ttlSeconds/maxStaleMin bypassed validation.
`typeof NaN === 'number'` and `NaN <= 0` evaluates to false, so a NaN duration
passed the old typeof+<=0 checks and would have poisoned TTLs once
validateDescriptor is wired into runSeed. Now gated by Number.isFinite,
which rejects NaN and ±Infinity. Tests added for NaN/Infinity on both
fields.
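The gap is easy to reproduce in isolation. Both functions below are hypothetical stand-ins for the old and new checks, written only to show why a typeof test plus a `<= 0` rejection lets non-finite values through.

```javascript
// Stand-in for the pre-fix shape: reject non-numbers and values <= 0.
function oldCheckAccepts(value) {
  if (typeof value !== 'number') return false;
  if (value <= 0) return false; // NaN <= 0 is false, so NaN is not rejected
  return true;
}

// Stand-in for the post-fix shape: also require a finite number.
function newCheckAccepts(value) {
  return typeof value === 'number' && Number.isFinite(value) && value > 0;
}

console.log(oldCheckAccepts(NaN), newCheckAccepts(NaN)); // true false
console.log(oldCheckAccepts(Infinity), newCheckAccepts(Infinity)); // true false
console.log(oldCheckAccepts(3600), newCheckAccepts(3600)); // true true
```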
P2 — Empty/whitespace-only strings for domain/resource/canonicalKey/sourceVersion
bypassed validation. Added .trim() === '' rejection + tests per field.
This mattered because canonicalKey='' would have landed writes at the
empty key and seed-meta under a blank resource namespace.
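A sketch of the blank-string rejection, assuming the four fields named above are plain string properties on the descriptor. The function name and return shape are hypothetical.

```javascript
const STRING_FIELDS = ['domain', 'resource', 'canonicalKey', 'sourceVersion'];

// Returns the names of string fields that are missing, non-string, or blank.
function blankStringFields(descriptor) {
  return STRING_FIELDS.filter((field) => {
    const value = descriptor[field];
    return typeof value !== 'string' || value.trim() === '';
  });
}
```

A caller would throw when the returned list is non-empty, so `canonicalKey: '   '` is caught before any write is attempted.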
P3 — SeedContractError silently dropped the Error v2 cause option.
Constructor now forwards { cause } through super() so err.cause works
with standard tooling (Node's default stack printer, Sentry chained-cause
serialization). resolveRecordCount's manual err.cause = err assignment
was replaced with the options-bag form. Test added for both constructor
direct-use and the resolveRecordCount wrap path.
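The options-bag form looks roughly like this. It is a sketch rather than the code in scripts/_seed-contract.mjs: the resolveRecordCount signature shown here (a declareRecords callback plus the data) is an assumption, and `cause` support requires a runtime that implements it (Node 16.9 or later).

```javascript
class SeedContractError extends Error {
  constructor(message, options = {}) {
    // Forward `cause` through super() so err.cause is set by the runtime.
    super(message, 'cause' in options ? { cause: options.cause } : undefined);
    this.name = 'SeedContractError';
  }
}

// Assumed signature: declareRecords(data) returns the record count.
function resolveRecordCount(declareRecords, data) {
  let count;
  try {
    count = declareRecords(data);
  } catch (err) {
    // Options-bag form replaces a manual `wrapped.cause = err` assignment.
    throw new SeedContractError('declareRecords threw', { cause: err });
  }
  if (!Number.isInteger(count) || count < 0) {
    throw new SeedContractError(`recordCount must be a non-negative integer, got ${String(count)}`);
  }
  return count;
}
```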
P3 — Parity verifier was not on an automated path. Added
tests/seed-envelope-parity.test.mjs which spawns scripts/verify-seed-envelope-parity.mjs
via execFile; non-zero exit (drift) → test fails. Now runs as part of
`npm run test:data` (tsx --test tests/*.test.mjs). Drift injection
confirmed: sed -i modifying api/_seed-envelope.js makes the test fail
with 'Command failed' from execFile.
51 tests total (was 39). All green on clean tree.
* fix(seed-contract): conformance test checks full descriptor, not just declareRecords
Previous conformance check green-lit any seeder that exported
declareRecords, even if the runSeed(...) call-site omitted other
validateDescriptor-required opts (validateFn, ttlSeconds, sourceVersion,
schemaVersion, maxStaleMin). That would have produced a false readiness
signal for PR 3's strict flip: test goes green, but wiring
validateDescriptor() into runSeed in PR 2 would still throw at runtime
across the fleet.
Examples verified on the PR head:
- scripts/seed-cot.mjs:188-192 — no sourceVersion/schemaVersion/maxStaleMin
- scripts/seed-market-breadth.mjs:121-124 — same
- scripts/seed-jodi-gas.mjs:248-253 — no schemaVersion/maxStaleMin
Now the conformance test:
1. AST-lite extracts the runSeed(...) call site with balanced parens,
tolerating strings and comments.
2. Checks every REQUIRED_OPTS_FIELDS entry (validateFn, declareRecords,
ttlSeconds, sourceVersion, schemaVersion, maxStaleMin) is present as
an object key in that call-site.
3. Emits a per-file diagnostic listing missing fields.
4. Migration signal is now accurate: 0/91 seeders fully satisfy the
descriptor (was claiming 0/91 missing just declareRecords). Matches
the underlying validateDescriptor behavior.
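Step 2 can be approximated with a per-field key test over the extracted call text. The field list is the one given above; the function name and the regex are a hypothetical sketch, not the scanner in tests/seed-contract.test.mjs.

```javascript
const REQUIRED_OPTS_FIELDS = [
  'validateFn', 'declareRecords', 'ttlSeconds', 'sourceVersion', 'schemaVersion', 'maxStaleMin',
];

// Returns the required fields that do not appear as an object key in `callSource`.
function missingDescriptorFields(callSource) {
  return REQUIRED_OPTS_FIELDS.filter((field) => {
    // Matches `field: value` as well as shorthand `field,` and `field }`.
    const asKey = new RegExp(`(?:^|[\\s,{])${field}\\s*(?::|,|\\})`);
    return !asKey.test(callSource);
  });
}

const call = "runSeed('d', 'r', KEY, fetchAll, { validateFn, ttlSeconds: 3600, sourceVersion: 'v1' })";
console.log(missingDescriptorFields(call));
// [ 'declareRecords', 'schemaVersion', 'maxStaleMin' ]
```

Listing the missing names per file is what turns the summary count into an actionable migration checklist.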
Verified: strict mode (SEED_CONTRACT_STRICT=1) surfaces 'opt:schemaVersion,
opt:maxStaleMin' as missing fields per seeder — actionable for PR 2
migration work. 51 tests total (unchanged count; behavior change is in
which seeders the one conformance test considers migrated).
* fix(seed-contract): strip comments/strings before parsing runSeed() call site
The conformance scanner located the first 'runSeed(' substring in the raw
source, which caught commented-out mentions upstream of the real call.
Offending files where this produced false 'incomplete' diagnoses:
- scripts/seed-bis-data.mjs:209 // runSeed() calls process.exit(0)…
real call at :220
- scripts/seed-economy.mjs:788 header comment mentioning runSeed()
real call at :891
Three files had the same pattern. Under strict mode these would have been
false hard failures in PR 3 even when the real descriptor was migrated.
Fix:
- stripCommentsAndStrings(src) produces a view where block comments, line
comments, and string/template literals are replaced with spaces (line
feeds preserved). Indices stay aligned with the original source so
extractRunSeedCall can match against the stripped view and then slice
the original source for the real call body.
- descriptorFieldsPresent() also runs its field-presence regex against
the stripped call body so '// TODO: validateFn' inside the call doesn't
fool the check.
- hasRunSeedCall() uses the stripped view too, which correctly excludes
5 seeders that only mentioned runSeed in comments. Count dropped
91→86 real callers.
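The index-preserving idea can be sketched as below. This is a simplified illustration, not the test file's implementation: template literals are blanked wholesale (interpolations included) and regex literals are not recognised.

```javascript
// Replace comments and string/template literals with spaces, keeping newlines,
// so every index in the result lines up with the original source.
function stripCommentsAndStrings(src) {
  const out = src.split('');
  const blank = (from, to) => {
    for (let k = from; k < to; k++) if (out[k] !== '\n') out[k] = ' ';
  };
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    const next = src[i + 1];
    let end = -1;
    if (ch === '/' && next === '/') {
      end = src.indexOf('\n', i);
      if (end === -1) end = src.length;
    } else if (ch === '/' && next === '*') {
      end = src.indexOf('*/', i + 2);
      end = end === -1 ? src.length : end + 2;
    } else if (ch === '"' || ch === "'" || ch === '`') {
      end = i + 1;
      while (end < src.length && src[end] !== ch) {
        end += src[end] === '\\' ? 2 : 1; // skip escaped characters
      }
      end = Math.min(end + 1, src.length); // include the closing quote
    }
    if (end === -1) {
      i++;
    } else {
      blank(i, end);
      i = end;
    }
  }
  return out.join('');
}

const source = "// runSeed() exits the process\nrunSeed('bis', 'data');\n";
const stripped = stripCommentsAndStrings(source);
console.log(source.indexOf('runSeed('), stripped.indexOf('runSeed(')); // 3 31
```

Because lengths match, the position found in the stripped view can be used directly to slice the call body out of the original text.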
Added 4 targeted tests covering:
- runSeed() inside a line comment ahead of the real call
- runSeed() inside a block comment
- runSeed() inside a string literal ("don't call runSeed() directly")
- descriptor field names inside an inline comment don't count as present
Verified on the actual files: seed-bis-data.mjs first real runSeed( in
stripped source is at line 220 (was line 209 before fix).
40 tests total, all green.
* fix(seed-contract): parity verifier survives unbalanced braces in string/template literals
Addresses Greptile P2 on PR #3095: the body extractor in
scripts/verify-seed-envelope-parity.mjs counted raw { and } on every
character. A future helper body that legitimately contains
`const marker = '{'` would have pushed depth past zero at the literal
brace and truncated the body — silently masking drift in the rest of
the function.
Extracted the scan into scanBalanced(source, start, open, close) which
skips characters inside line comments, block comments, and string /
template literals (with escape handling and template-literal ${} recursion
for interpolation). Call sites in extractFunctions updated to use the new
scanner for both the arg-list parens and the function body braces.
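A compact sketch of such a scanner follows. The (source, start, open, close) parameters mirror the description above, but the return convention (index of the closing delimiter, or -1) and the two helper names are assumptions; regex literals are not handled.

```javascript
// Assumes source[start] is the opening delimiter.
function scanBalanced(source, start, open, close) {
  let depth = 0;
  let i = start;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '/' && next === '/') {
      const newline = source.indexOf('\n', i);
      i = newline === -1 ? source.length : newline + 1;
    } else if (ch === '/' && next === '*') {
      const end = source.indexOf('*/', i + 2);
      i = end === -1 ? source.length : end + 2;
    } else if (ch === '"' || ch === "'") {
      i = skipQuoted(source, i);
    } else if (ch === '`') {
      i = skipTemplate(source, i);
    } else {
      if (ch === open) depth++;
      else if (ch === close && --depth === 0) return i;
      i++;
    }
  }
  return -1;
}

// Returns the index just past the closing quote of a '...' or "..." literal.
function skipQuoted(source, start) {
  const quote = source[start];
  let i = start + 1;
  while (i < source.length && source[i] !== quote) {
    i += source[i] === '\\' ? 2 : 1; // honour backslash escapes
  }
  return i + 1;
}

// Returns the index just past the closing backtick, recursing into ${...}.
function skipTemplate(source, start) {
  let i = start + 1;
  while (i < source.length && source[i] !== '`') {
    if (source[i] === '\\') {
      i += 2;
    } else if (source[i] === '$' && source[i + 1] === '{') {
      const end = scanBalanced(source, i + 1, '{', '}');
      if (end === -1) return source.length;
      i = end + 1;
    } else {
      i++;
    }
  }
  return i + 1;
}
```

Recursing for `${}` matters because an interpolation is real code again: braces inside it must balance, while braces in the literal text around it must be ignored.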
Made extractFunctions and scanBalanced exported so the new test file
can exercise them directly. Gated main() behind an isMain check so
importing the module from tests doesn't trigger process.exit.
New tests in tests/seed-envelope-parity.test.mjs:
- extractFunctions tolerates unbalanced braces in string literals
- same for template literals
- same for braces inside block comments
- same for braces inside line comments
- scanBalanced respects backslash-escapes inside strings
- scanBalanced recurses into template-literal ${} interpolation
Also addresses the other two Greptile P2s which were already fixed in
earlier commits on this branch:
- Empty-string gap (99646dd9a): .trim()==='' rejection added
- SeedContractError cause drop (99646dd9a): constructor forwards cause
through super's options bag per Error v2 spec
61 tests green. Both typechecks clean.
119 lines
4.1 KiB
JavaScript
import test from 'node:test';
import assert from 'node:assert/strict';

import {
  unwrapEnvelope,
  stripSeedEnvelope,
  buildEnvelope,
} from '../scripts/_seed-envelope-source.mjs';

test('unwrapEnvelope: null input → null envelope + null data', () => {
  assert.deepEqual(unwrapEnvelope(null), { _seed: null, data: null });
  assert.deepEqual(unwrapEnvelope(undefined), { _seed: null, data: null });
});

test('unwrapEnvelope: legacy raw value passes through as data', () => {
  assert.deepEqual(unwrapEnvelope({ events: [1, 2, 3] }), {
    _seed: null,
    data: { events: [1, 2, 3] },
  });
});

test('unwrapEnvelope: legacy array passes through as data', () => {
  assert.deepEqual(unwrapEnvelope([1, 2, 3]), { _seed: null, data: [1, 2, 3] });
});

test('unwrapEnvelope: envelope shape parses _seed + data', () => {
  const wrapped = {
    _seed: { fetchedAt: 1_700_000_000_000, recordCount: 5, sourceVersion: 'v1', schemaVersion: 1, state: 'OK' },
    data: { events: [{ id: 1 }] },
  };
  const out = unwrapEnvelope(wrapped);
  assert.equal(out._seed.fetchedAt, 1_700_000_000_000);
  assert.equal(out._seed.state, 'OK');
  assert.deepEqual(out.data, { events: [{ id: 1 }] });
});

test('unwrapEnvelope: malformed _seed block (missing fetchedAt) → treated as legacy', () => {
  const bogus = { _seed: { sourceVersion: 'v1' }, data: { x: 1 } };
  const out = unwrapEnvelope(bogus);
  assert.equal(out._seed, null);
  // Falls through the `_seed` branch: whole object becomes `data`.
  assert.deepEqual(out.data, bogus);
});

test('unwrapEnvelope: stringified JSON is parsed', () => {
  const wrapped = JSON.stringify({
    _seed: { fetchedAt: 123, recordCount: 0, sourceVersion: 'v1', schemaVersion: 1, state: 'OK_ZERO' },
    data: [],
  });
  const out = unwrapEnvelope(wrapped);
  assert.equal(out._seed.state, 'OK_ZERO');
  assert.deepEqual(out.data, []);
});

test('unwrapEnvelope: stringified garbage → legacy passthrough', () => {
  const out = unwrapEnvelope('not json at all');
  assert.equal(out._seed, null);
  assert.equal(out.data, 'not json at all');
});

test('stripSeedEnvelope: returns data only', () => {
  const wrapped = {
    _seed: { fetchedAt: 1, recordCount: 1, sourceVersion: 'v1', schemaVersion: 1, state: 'OK' },
    data: { hello: 'world' },
  };
  assert.deepEqual(stripSeedEnvelope(wrapped), { hello: 'world' });
});

test('stripSeedEnvelope: legacy value passes through unchanged', () => {
  const legacy = { events: [1, 2] };
  assert.deepEqual(stripSeedEnvelope(legacy), legacy);
});

test('stripSeedEnvelope: null → null', () => {
  assert.equal(stripSeedEnvelope(null), null);
});

test('buildEnvelope: minimal OK build', () => {
  const env = buildEnvelope({
    fetchedAt: 1, recordCount: 5, sourceVersion: 'v1', schemaVersion: 1, state: 'OK',
    data: { events: [] },
  });
  assert.equal(env._seed.state, 'OK');
  assert.equal(env._seed.recordCount, 5);
  assert.deepEqual(env.data, { events: [] });
  assert.equal(env._seed.failedDatasets, undefined);
});

test('buildEnvelope: ERROR state carries failedDatasets + errorReason', () => {
  const env = buildEnvelope({
    fetchedAt: 1, recordCount: 0, sourceVersion: 'v1', schemaVersion: 1, state: 'ERROR',
    failedDatasets: ['wgi', 'fao'],
    errorReason: 'upstream 503',
    data: null,
  });
  assert.equal(env._seed.state, 'ERROR');
  assert.deepEqual(env._seed.failedDatasets, ['wgi', 'fao']);
  assert.equal(env._seed.errorReason, 'upstream 503');
});

test('buildEnvelope: groupId included for multi-key group writes', () => {
  const env = buildEnvelope({
    fetchedAt: 1, recordCount: 222, sourceVersion: 'v7', schemaVersion: 1, state: 'OK',
    groupId: 'resilience-static:2026-04-14',
    data: { countries: {} },
  });
  assert.equal(env._seed.groupId, 'resilience-static:2026-04-14');
});

test('unwrapEnvelope round-trips buildEnvelope output', () => {
  const env = buildEnvelope({
    fetchedAt: 42, recordCount: 3, sourceVersion: 'v9', schemaVersion: 2, state: 'OK',
    data: { items: [{ a: 1 }, { b: 2 }, { c: 3 }] },
  });
  const out = unwrapEnvelope(env);
  assert.deepEqual(out._seed, env._seed);
  assert.deepEqual(out.data, env.data);
});