mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
feat(energy-atlas): GEM pipeline data import — gas 297, oil 334 (#3406)
* feat(energy-atlas): GEM pipeline data import — gas 75→297, oil 75→334 (parity-push closure)

  Closes the ~3.6× pipeline-scale gap that PR #3397's import infrastructure was built for. Per docs/methodology/pipelines.mdx operator runbook.

  Source releases (CC-BY 4.0, attribution preserved in registry envelope):
  - GEM-GGIT-Gas-Pipelines-2025-11.xlsx
    SHA256: f56d8b14400e558f06e53a4205034d3d506fc38c5ae6bf58000252f87b1845e6
    URL: https://globalenergymonitor.org/wp-content/uploads/2025/11/GEM-GGIT-Gas-Pipelines-2025-11.xlsx
  - GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx
    SHA256: d1648d28aed99cfd2264047f1e944ddfccf50ce9feeac7de5db233c601dc3bb2
    URL: https://globalenergymonitor.org/wp-content/uploads/2025/03/GEM-GOIT-Oil-NGL-Pipelines-2025-03.xlsx

  Pre-conversion: GeoJSON (geometry endpoints) + XLSX (column properties) → canonical operator-shape JSON via /tmp/gem-import/convert.py. Filter knobs:
  - status ∈ {operating, construction}
  - length ≥ 750 km (gas) / 400 km (oil) — asymmetric per-fuel trunk-class
  - capacity unit conversions: bcm/y native; MMcf/d, MMSCMD, mtpa, m3/day, bpd, Mb/d, kbd → bcm/y (gas) or bbl/d (oil) at canonical conversion factors
  - country names → ISO 3166-1 alpha-2 via pycountry + alias table

  Merge results (via scripts/import-gem-pipelines.mjs --merge):
  - gas: +222 added, 15 duplicates skipped (haversine ≤ 5 km AND token Jaccard ≥ 0.6)
  - oil: +259 added, 16 duplicates skipped
  Final: 297 gas / 334 oil. Hand-curated 75+75 preserved with full evidence; GEM rows ship physicalStateSource='gem', classifierConfidence=0.4, operatorStatement=null, sanctionRefs=[].

  Floor bump: scripts/_pipeline-registry.mjs MIN_PIPELINES_PER_REGISTRY 8 → 200. Live counts (297/334) leave ~100 rows of jitter headroom so a partial re-import or coverage-narrowing release fails loud rather than halving the registry silently.
  Tests:
  - tests/pipelines-registry.test.mts: bumped synthetic-registry Array.from({length:8}) → length:210 to clear the new floor; added 'gem' to the evidence-source whitelist for non-flowing badges (parity with the derivePipelinePublicBadge audit done in PR #3397 U1).
  - tests/import-gem-pipelines.test.mjs: bumped registry-conformance loop 3 → 70 to clear the new floor.
  - 51/51 pipeline tests pass; tsc --noEmit clean.

  vs peer reference site (281 gas + 265 oil): we now match (gas 297) and exceed (oil 334). Functional + visual + data parity for the energy variant is closed; remaining gaps are editorial-cadence (weekly briefing), which is intentionally out of scope per the parity-push plan.

* docs(energy-atlas): land GEM converter + expand methodology runbook for quarterly refresh

  PR #3406 imported the data but didn't land the conversion script that produced it. This commit lands the converter at scripts/_gem-geojson-to-canonical.py so future operators can reproduce the import deterministically, and rewrites the docs/methodology/pipelines.mdx runbook to match what actually works:
  - Use GeoJSON (not XLSX) — the XLSX has properties but no lat/lon columns; only the GIS .zip's GeoJSON has both. The original runbook said to download XLSX, which would fail at the lat/lon validation step.
  - Cadence: quarterly refresh, with concrete signals (peer-site comparison, 90-day calendar reminder).
  - Source datasets: explicit GGIT (gas) + GOIT (oil/NGL) tracker names so future operators don't re-request the wrong dataset (the Extraction Tracker = wells/fields, NOT pipelines — ours requires the Infrastructure Trackers).
  - Last-known-good URLs documented + URL pattern explained as fallback when GEM rotates per release.
  - Filter knob defaults documented inline (gas ≥ 750 km, oil ≥ 400 km, status ∈ {operating, construction}, capacity unit conversion table).
  - Failure-mode table mapping common errors to fixes.
  Converter takes paths via env vars (GEM_GAS_GEOJSON, GEM_OIL_GEOJSON, GEM_DOWNLOADED_AT, GEM_SOURCE_VERSION) instead of hardcoded paths so it works for any release without code edits.

* fix(energy-atlas): close PR #3406 review findings — dedup + zero-length + test

  Three Greptile findings on PR #3406:

  P1 — Dedup miss (Dampier-Bunbury): The same physical pipeline existed in both registries — curated `dampier-bunbury` and GEM-imported `dampier-to-bunbury-natural-gas-pipeline-au` — because GEM digitized only the southern 60% of the line. The shared Bunbury terminus matched at 13.7 km (just over the 5 km gate), and the average-endpoint distance was 287 km. Fix: scripts/_pipeline-dedup.mjs adds a name-set-identity short-circuit — if Jaccard == 1.0 (after stopword removal) AND any of the 4 endpoint pairings is ≤ 25 km, treat as duplicate. The 25 km anchor preserves the existing "name collision in different ocean → still added" contract. Added regression test: identical Dampier-Bunbury inputs → 0 added, 1 skipped, matched against `dampier-bunbury`.

  P1 — Zero-length geometry (9 rows: Trans-Alaska, Enbridge Line 3, Ichthys, etc.): GEM source GeoJSON occasionally has a Point geometry or single-coord LineString, producing pipelines where startPoint == endPoint. They render as map-point artifacts and skew aggregate-length stats. Fix (defense in depth):
  - scripts/_gem-geojson-to-canonical.py drops at conversion time (`zero_length` reason in drop log).
  - scripts/_pipeline-registry.mjs validateRegistry rejects defensively so even a hand-curated row with degenerate geometry fails loud.

  P2 — Test repetition coupled to fixture row count: Hardcoded `for (let i = 0; i < 70; i++)` × 3 fixture rows = 210 silently breaks if the fixture is trimmed below 3 rows. Fix: `Math.ceil(REGISTRY_FLOOR / fixture.length) + 5` derives reps from the floor and the current fixture length.
  Re-run --merge with all fixes applied:
  - gas: 75 → 293 (+218 added, 17 deduped — was 222/15 before; +2 catches via name-set-identity short-circuit; -2 zero-length never imported)
  - oil: 75 → 325 (+250 added, 18 deduped — was 259/16; +2 catches; -7 zero-length)

  Tests: 74/74 pipeline tests pass; tsc --noEmit clean.
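To see why the derived repetition count in the P2 fix clears the floor for any fixture size, a quick check (a sketch — `REGISTRY_FLOOR = 200` is assumed from the floor bump described above):

```python
import math

REGISTRY_FLOOR = 200  # MIN_PIPELINES_PER_REGISTRY after the bump

def reps(fixture_len: int) -> int:
    # Derive repetitions from the floor and fixture size, plus slack,
    # instead of hardcoding `70` (which silently breaks if the fixture shrinks).
    return math.ceil(REGISTRY_FLOOR / fixture_len) + 5

for n in (1, 2, 3, 10):
    total = reps(n) * n
    print(n, reps(n), total, total >= REGISTRY_FLOOR)
```

For the current 3-row fixture this yields 72 repetitions (216 synthetic rows), and the total stays above the floor even if the fixture is trimmed to a single row.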
@@ -69,20 +69,97 @@ The hand-curated subset (operator/regulator/sanctions-bearing rows with classifi
## Operator runbook — GEM import refresh

### Cadence

**Refresh quarterly** (or whenever a new GEM release lands — check the GGIT/GOIT landing pages below). The refresh is operator-mediated rather than cron-driven because:

- GEM downloads are gated behind a per-request form; the resulting URL is release-specific and rotates each quarter, so a hardcoded URL would silently fetch a different version than the one we attribute.
- Column names occasionally shift between releases; the schema-drift sentinel in `scripts/import-gem-pipelines.mjs` catches this loudly, but the diff still needs human review before committing.

Set a calendar reminder so a quarter doesn't pass without a refresh. Suggested cadence: review every 90 days; refresh whenever a peer reference site (e.g. global-energy-flow.com) advertises a newer release than ours.
### Source datasets

The two files we use are GEM's pipeline-only trackers (NOT the combined "Oil & Gas Extraction Tracker" — that's upstream wells/fields and has a different schema):

| Tracker | Acronym | What it contains | Landing page |
|---|---|---|---|
| Global Gas Infrastructure Tracker | **GGIT** | Gas pipelines + LNG terminals | [globalenergymonitor.org/projects/global-gas-infrastructure-tracker](https://globalenergymonitor.org/projects/global-gas-infrastructure-tracker/) |
| Global Oil Infrastructure Tracker | **GOIT** | Oil + NGL pipelines | [globalenergymonitor.org/projects/global-oil-infrastructure-tracker](https://globalenergymonitor.org/projects/global-oil-infrastructure-tracker/) |

The **GIS .zip download** (containing GeoJSON, GeoPackage, and shapefile) is what we want — NOT the .xlsx. The XLSX has properties but no lat/lon columns; only the GeoJSON has both column properties AND `LineString.coordinates` for endpoint extraction.
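After unzipping, a quick way to confirm the GeoJSON really carries both the property columns and the LineString coordinates — a minimal sketch using a synthetic stand-in feature (real files are a FeatureCollection with thousands of rows; load them with `json.load`):

```python
import json

# Synthetic stand-in for one GEM GeoJSON feature — NOT a real GEM row.
feature = {
    "type": "Feature",
    "properties": {"PipelineName": "Example Trunkline", "Status": "Operating"},
    "geometry": {"type": "LineString", "coordinates": [[5.0, 52.0], [13.4, 52.5]]},
}

coords = feature["geometry"]["coordinates"]
start, end = coords[0], coords[-1]  # GeoJSON position order is [lon, lat]
print(feature["properties"]["PipelineName"], start, end)
```

If a feature has properties but a `Point` geometry (or a single-coordinate LineString), the converter will drop it at the `zero_length` gate described below.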
#### Last-known-good URLs (rotate per release)

These are the URLs we used for the 2026-04-25 import. GEM rotates them per release, so always re-request via the landing page above for the current release before re-running:

```
GGIT Gas (2025-11): https://globalenergymonitor.org/wp-content/uploads/2025/11/GEM-GGIT-Gas-Pipelines-2025-11.zip
GOIT Oil (2025-03): https://globalenergymonitor.org/wp-content/uploads/2025/03/GEM-GOIT-Oil-NGL-Pipelines-2025-03.zip
```

URL pattern is stable: `globalenergymonitor.org/wp-content/uploads/YYYY/MM/GEM-{GGIT,GOIT}-{tracker-name}-YYYY-MM.zip`. If the landing-page download flow changes, this pattern is the fallback for figuring out the new URL given the release date GEM publishes.
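The fallback can be made concrete with a tiny helper — a sketch only, not a landed script; `gem_release_url` is a hypothetical name:

```python
def gem_release_url(tracker: str, dataset: str, year: int, month: int) -> str:
    """Build the release-specific GIS .zip URL from GEM's stable pattern.

    tracker: "GGIT" or "GOIT"; dataset: the tracker-name token GEM uses in
    the filename (e.g. "Gas-Pipelines", "Oil-NGL-Pipelines").
    """
    return (
        f"https://globalenergymonitor.org/wp-content/uploads/"
        f"{year}/{month:02d}/GEM-{tracker}-{dataset}-{year}-{month:02d}.zip"
    )

# Reproduces the 2025-11 gas URL documented above:
print(gem_release_url("GGIT", "Gas-Pipelines", 2025, 11))
```

Verify the constructed URL against the landing page before downloading — the pattern is a fallback, not a contract GEM guarantees.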
### Refresh steps

1. **Request the data** via either landing page above. GEM emails you per-release URLs (one for the .xlsx, one for the GIS .zip). Registration is required even though the data itself is CC-BY 4.0.

2. **Download both GIS .zips** and unzip:

```bash
unzip -o ~/Downloads/GEM-GGIT-Gas-Pipelines-YYYY-MM.zip -d /tmp/gem-gis/gas/
unzip -o ~/Downloads/GEM-GOIT-Oil-NGL-Pipelines-YYYY-MM.zip -d /tmp/gem-gis/oil/
```
3. **Convert GeoJSON → canonical JSON** via the in-repo converter. It reads both GeoJSON files, applies the filter knobs documented in the script header, normalizes country names to ISO 3166-1 alpha-2 via `pycountry`, and emits the operator-shape envelope:

```bash
pip3 install pycountry  # one-time
GEM_GAS_GEOJSON=/tmp/gem-gis/gas/GEM-GGIT-Gas-Pipelines-YYYY-MM.geojson \
GEM_OIL_GEOJSON=/tmp/gem-gis/oil/GEM-GOIT-Oil-NGL-Pipelines-YYYY-MM.geojson \
GEM_DOWNLOADED_AT=YYYY-MM-DD \
GEM_SOURCE_VERSION="GEM-GGIT-YYYY-MM+GOIT-YYYY-MM" \
python3 scripts/_gem-geojson-to-canonical.py > /tmp/gem-pipelines.json 2> /tmp/gem-drops.log
cat /tmp/gem-drops.log  # inspect drop counts before merging
```

Filter knob defaults (in `scripts/_gem-geojson-to-canonical.py`):

- `MIN_LENGTH_KM_GAS = 750` (trunk-class only)
- `MIN_LENGTH_KM_OIL = 400` (trunk-class only)
- `ACCEPTED_STATUS = {operating, construction}`
- Capacity unit conversions: bcm/y native; MMcf/d, MMSCMD, mtpa, m3/day, bpd, Mb/d, kbd → bcm/y (gas) or bbl/d (oil)

These thresholds were tuned empirically against the 2025-11/2025-03 releases to land at ~250-300 entries per registry. Adjust if a future release shifts the volume distribution.
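To make the unit conversions concrete, here is a worked sketch of two of the factors listed above (the helper names are illustrative; the canonical factors live in `gas_capacity()`/`oil_capacity()` in the converter):

```python
# 1 MMcf/d ≈ 0.01034 bcm/y (million cubic feet/day → billion m³/year)
MMCFD_TO_BCMY = 0.01034
# 1 Mb/d (thousand barrels/day, industry shorthand) = 1,000 bbl/d
MBD_TO_BBLD = 1_000.0

def gas_to_bcm_per_year(capacity_mmcfd: float) -> float:
    return capacity_mmcfd * MMCFD_TO_BCMY

def oil_to_bbl_per_day(capacity_mbd: float) -> float:
    return capacity_mbd * MBD_TO_BBLD

# A 2,000 MMcf/d gas line ≈ 20.68 bcm/y; a 500 Mb/d oil line = 500,000 bbl/d
print(round(gas_to_bcm_per_year(2000), 2), oil_to_bbl_per_day(500))
```

A line that lands at the registry boundary after conversion is worth spot-checking against GEM's own derived `CapacityBcm/y` column, which the converter prefers when present.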
4. **Dry-run** to inspect candidate counts before touching the registry:

```bash
GEM_PIPELINES_FILE=/tmp/gem-pipelines.json node scripts/import-gem-pipelines.mjs --print-candidates \
  | jq '{ gas: (.gas | length), oil: (.oil | length) }'
```
5. **Merge** into `scripts/data/pipelines-{gas,oil}.json` (writes both atomically — validates both before either is touched on disk):

```bash
GEM_PIPELINES_FILE=/tmp/gem-pipelines.json node scripts/import-gem-pipelines.mjs --merge
```

Spot-check 5-10 random GEM-sourced rows in the diff before committing — known major trunks (Druzhba, Nord Stream, Keystone, TAPI, Centro Oeste) are good sanity-check anchors.
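One way to pull a random sample for the spot-check — a sketch against synthetic rows, since the exact on-disk registry schema isn't reproduced here (the `physicalStateSource='gem'` field name comes from the import commit; adjust to the real shape when loading `scripts/data/pipelines-gas.json`):

```python
import random

# Synthetic stand-in rows; in practice, load the merged registry JSON.
rows = [
    {"name": f"pipeline-{i}", "physicalStateSource": "gem" if i % 2 else "operator"}
    for i in range(20)
]

gem_rows = [r for r in rows if r["physicalStateSource"] == "gem"]
sample = random.sample(gem_rows, k=min(5, len(gem_rows)))
for r in sample:
    print(r["name"])
```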
6. **Commit** the data + record provenance. Per-release SHA256s go in the commit message so future audits can verify reproducibility:

```bash
shasum -a 256 ~/Downloads/GEM-GGIT-Gas-Pipelines-YYYY-MM.xlsx \
  ~/Downloads/GEM-GOIT-Oil-NGL-Pipelines-YYYY-MM.xlsx
```

If the row count crosses a threshold, also bump `MIN_PIPELINES_PER_REGISTRY` in `scripts/_pipeline-registry.mjs` so future partial re-imports fail loud rather than silently halving the registry.

7. **Verify** `npm run test:data` is green before pushing.
### Failure modes and what to do

| Symptom | Cause | Fix |
|---|---|---|
| Converter exits with `GEM_GAS_GEOJSON and GEM_OIL_GEOJSON env vars are required` | Env vars not set | Re-run with both `GEM_GAS_GEOJSON` and `GEM_OIL_GEOJSON` pointed at the unzipped `.geojson` files |
| Many rows dropped on `country:Foo\|Bar` | A country name GEM uses isn't in `pycountry` or the alias table | Add the alias to `COUNTRY_ALIASES` in `scripts/_gem-geojson-to-canonical.py` |
| Many rows dropped on `no_capacity` with a unit we haven't seen | GEM added a capacity unit | Add the conversion factor to `gas_capacity()` or `oil_capacity()` in the converter |
| Parser throws `schema drift — pipelines[i] missing column "X"` | GEM renamed a column between releases | The parser names the missing column; map it back in the converter and re-run |
| `validateRegistry` rejects the merged registry | Almost always: count below `MIN_PIPELINES_PER_REGISTRY`, or an evidence source not in the whitelist | Inspect the merged JSON; if the row drop is real, lower the floor; if a row's evidence is malformed, fix the converter |
| Net adds drop precipitously between releases | GEM removed a tracker subset, OR the dedup is over-matching | Run `--print-candidates` and diff against the prior quarter's output; adjust the haversine/Jaccard knobs in `scripts/_pipeline-dedup.mjs` if needed |
## Corrections
382
scripts/_gem-geojson-to-canonical.py
Normal file
@@ -0,0 +1,382 @@
#!/usr/bin/env python3
"""
Pre-convert GEM GeoJSON (GGIT gas + GOIT oil pipelines) → canonical JSON shape
that scripts/import-gem-pipelines.mjs::REQUIRED_COLUMNS expects.

Why GeoJSON, not XLSX:
GEM publishes both XLSX and GIS .zip downloads (with GeoJSON, GeoPackage,
shapefile inside). The XLSX has properties but NO lat/lon columns — endpoint
geometry only lives in the GIS feed. The GeoJSON `properties` block carries
the same column set as the XLSX, AND `geometry.coordinates` gives us the
LineString endpoints we need for haversine dedup. So we use GeoJSON only.

Usage:
GEM_GAS_GEOJSON=/path/to/GEM-GGIT-Gas-Pipelines-YYYY-MM.geojson \\
GEM_OIL_GEOJSON=/path/to/GEM-GOIT-Oil-NGL-Pipelines-YYYY-MM.geojson \\
python3 scripts/_gem-geojson-to-canonical.py \\
> /tmp/gem-pipelines.json

# Then feed to the merge step:
GEM_PIPELINES_FILE=/tmp/gem-pipelines.json node \\
scripts/import-gem-pipelines.mjs --print-candidates  # dry run
GEM_PIPELINES_FILE=/tmp/gem-pipelines.json node \\
scripts/import-gem-pipelines.mjs --merge

Dependencies:
pip3 install pycountry  # ISO 3166-1 alpha-2 mapping for country names

Drop-summary log goes to stderr; canonical JSON goes to stdout.
"""

import json
import os
import sys
import pycountry

GAS_PATH = os.environ.get("GEM_GAS_GEOJSON")
OIL_PATH = os.environ.get("GEM_OIL_GEOJSON")
if not GAS_PATH or not OIL_PATH:
    sys.exit(
        "GEM_GAS_GEOJSON and GEM_OIL_GEOJSON env vars are required. "
        "Point each at the GEM-{GGIT,GOIT}-{Gas,Oil-NGL}-Pipelines-YYYY-MM.geojson "
        "file unzipped from the GIS download. See script header for details."
    )

# Filter knobs (per plan: trunk-class only, target 250-300 entries per registry).
# Asymmetric thresholds: gas has more long-distance trunks worldwide (LNG-feeder
# corridors, Russia→Europe, Russia→China), oil pipelines tend to be shorter
# regional collectors. Tuned empirically against the 2025-11 GEM release to
# yield ~265 gas + ~300 oil after dedup against the 75 hand-curated rows.
MIN_LENGTH_KM_GAS = 750.0
MIN_LENGTH_KM_OIL = 400.0
ACCEPTED_STATUS = {"operating", "construction"}

# GEM (lowercase) → parser STATUS_MAP key (PascalCase)
STATUS_PASCAL = {
    "operating": "Operating",
    "construction": "Construction",
    "proposed": "Proposed",
    "cancelled": "Cancelled",
    "shelved": "Cancelled",  # treat shelved as cancelled per plan U2
    "mothballed": "Mothballed",
    "idle": "Idle",
    "shut-in": "Shut-in",
    "retired": "Mothballed",
    "mixed status": "Operating",  # rare; treat as operating
}

# Country aliases for cases pycountry's fuzzy match fails on
COUNTRY_ALIASES = {
    "United States": "US",
    "United Kingdom": "GB",
    "Russia": "RU",
    "South Korea": "KR",
    "North Korea": "KP",
    "Iran": "IR",
    "Syria": "SY",
    "Venezuela": "VE",
    "Bolivia": "BO",
    "Tanzania": "TZ",
    "Vietnam": "VN",
    "Laos": "LA",
    "Czech Republic": "CZ",
    "Czechia": "CZ",
    "Slovakia": "SK",
    "Macedonia": "MK",
    "North Macedonia": "MK",
    "Moldova": "MD",
    "Brunei": "BN",
    "Cape Verde": "CV",
    "Ivory Coast": "CI",
    "Cote d'Ivoire": "CI",
    "Republic of the Congo": "CG",
    "Democratic Republic of the Congo": "CD",
    "DR Congo": "CD",
    "DRC": "CD",
    "Congo": "CG",
    "Burma": "MM",
    "Myanmar": "MM",
    "Taiwan": "TW",
    "Palestine": "PS",
    "Kosovo": "XK",  # not ISO-2 official; use XK (commonly accepted)
}


def country_to_iso2(name):
    if not name:
        return None
    name = name.strip()
    if name in COUNTRY_ALIASES:
        return COUNTRY_ALIASES[name]
    try:
        c = pycountry.countries.get(name=name)
        if c:
            return c.alpha_2
        # Try common_name (e.g. "Russia" → "Russian Federation")
        c = pycountry.countries.get(common_name=name)
        if c:
            return c.alpha_2
        # Fuzzy
        results = pycountry.countries.search_fuzzy(name)
        if results:
            return results[0].alpha_2
    except (LookupError, KeyError):
        pass
    return None


def split_countries(s):
    """Parse 'Russia, Belarus, Ukraine' → ['Russia','Belarus','Ukraine']"""
    if not s:
        return []
    return [x.strip() for x in s.split(",") if x.strip()]


def get_endpoints(geom):
    """Return ((startLon, startLat), (endLon, endLat)) or None."""
    if not geom:
        return None
    t = geom.get("type")
    coords = geom.get("coordinates")
    if t == "LineString" and coords and len(coords) >= 2:
        return coords[0], coords[-1]
    if t == "MultiLineString" and coords:
        flat = [pt for line in coords if line for pt in line]
        if len(flat) >= 2:
            return flat[0], flat[-1]
    if t == "GeometryCollection":
        geoms = geom.get("geometries") or []
        all_coords = []
        for g in geoms:
            if g and g.get("type") == "LineString" and g.get("coordinates"):
                all_coords.extend(g["coordinates"])
            elif g and g.get("type") == "MultiLineString" and g.get("coordinates"):
                for line in g["coordinates"]:
                    all_coords.extend(line)
        if len(all_coords) >= 2:
            return all_coords[0], all_coords[-1]
    return None


def first_year(props):
    for k in ("StartYear1", "StartYear2", "StartYear3"):
        v = props.get(k)
        if v:
            try:
                return int(float(v))
            except (TypeError, ValueError):
                pass
    return 0


def best_length_km(props):
    for k in ("LengthMergedKm", "LengthKnownKm", "LengthEstimateKm"):
        v = props.get(k)
        if v in (None, "", "NA"):
            continue
        try:
            f = float(v)
            if f > 0:
                return f
        except (TypeError, ValueError):
            pass
    return 0.0


def _f(v):
    if v in (None, "", "NA"):
        return None
    try:
        f = float(v)
        return f if f > 0 else None
    except (TypeError, ValueError):
        return None


def gas_capacity(props):
    """Return (capacity, 'bcm/y'). GGIT has CapacityBcm/y derived for many rows."""
    f = _f(props.get("CapacityBcm/y"))
    if f is not None:
        return f, "bcm/y"
    # Fall back to raw Capacity + CapacityUnits with conversions to bcm/y.
    cap = _f(props.get("Capacity"))
    if cap is None:
        return None, None
    u = (props.get("CapacityUnits") or "").strip().lower()
    if u == "bcm/y":
        return cap, "bcm/y"
    if u == "mmcf/d":  # million standard cubic feet/day → bcm/y
        return cap * 0.01034, "bcm/y"
    if u == "mmscmd":  # million standard cubic metres/day
        return cap * 365.25 / 1000.0, "bcm/y"
    if u == "mill.sm3/day":  # million Sm3/day = MMSCMD
        return cap * 365.25 / 1000.0, "bcm/y"
    if u == "scm/y":  # standard cubic metres/year
        return cap / 1e9, "bcm/y"
    if u == "mtpa":  # million tonnes/annum LNG → bcm/y (1 mtpa ≈ 1.36 bcm/y)
        return cap * 1.36, "bcm/y"
    return None, None


def oil_capacity(props):
    """Return (capacity, capacityUnit) for oil. Convert to bbl/d for parser
    consumption (parser then converts bbl/d / 1e6 → Mbd internally)."""
    cap = _f(props.get("Capacity"))
    unit_raw = (props.get("CapacityUnits") or "").strip().lower()
    if cap is None or not unit_raw:
        # Fallback: derive from CapacityBOEd if present (already bpd-equivalent).
        boed = _f(props.get("CapacityBOEd"))
        if boed is not None:
            return boed, "bbl/d"
        return None, None
    if unit_raw == "bpd":
        return cap, "bbl/d"
    if unit_raw in ("mb/d", "mbd"):
        # GEM "Mb/d" = thousand bbl/day (industry shorthand). Convert to bbl/d.
        return cap * 1000.0, "bbl/d"
    if unit_raw in ("kbd", "kb/d"):
        return cap * 1000.0, "bbl/d"
    if unit_raw == "mtpa":
        # Million tonnes/annum crude → bbl/d (avg crude: 7.33 bbl/tonne).
        return cap * 1e6 * 7.33 / 365.25, "bbl/d"
    if unit_raw == "m3/day":
        # 1 m3 = 6.2898 bbl
        return cap * 6.2898, "bbl/d"
    if unit_raw == "m3/month":
        return cap * 6.2898 / 30.4, "bbl/d"
    if unit_raw == "m3/year":
        return cap * 6.2898 / 365.25, "bbl/d"
    if unit_raw == "thousand m3/year":
        return cap * 1000 * 6.2898 / 365.25, "bbl/d"
    if unit_raw == "tn/d":  # tonnes/day
        return cap * 7.33, "bbl/d"
    # Unknown unit → fall back to BOEd if available.
    boed = _f(props.get("CapacityBOEd"))
    if boed is not None:
        return boed, "bbl/d"
    return None, None


def convert_one(props, geom, fuel_token):
    name = (props.get("PipelineName") or "").strip()
    seg = (props.get("SegmentName") or "").strip()
    if seg and seg.lower() not in ("main line", "mainline", "main"):
        name = f"{name} - {seg}" if name else seg
    if not name:
        return None, "no_name"

    status = (props.get("Status") or "").strip().lower()
    if status not in ACCEPTED_STATUS:
        return None, f"status:{status or 'empty'}"

    pts = get_endpoints(geom)
    if not pts:
        return None, "no_geom"
    s_lon, s_lat = pts[0][0], pts[0][1]
    e_lon, e_lat = pts[1][0], pts[1][1]
    # Drop degenerate geometry (start == end). GEM occasionally publishes
    # rows with a Point geometry or a single-coord LineString, which we'd
    # otherwise emit as zero-length routes. PR #3406 review found 9 such
    # rows (Trans-Alaska, Enbridge Line 3 Replacement, Ichthys, etc.).
    if s_lat == e_lat and s_lon == e_lon:
        return None, "zero_length"

    length = best_length_km(props)
    threshold = MIN_LENGTH_KM_GAS if fuel_token == "Gas" else MIN_LENGTH_KM_OIL
    if length < threshold:
        return None, "too_short"

    if fuel_token == "Gas":
        cap, unit = gas_capacity(props)
        from_country_name = props.get("StartCountryOrArea")
        to_country_name = props.get("EndCountryOrArea")
        all_countries = split_countries(props.get("CountriesOrAreas"))
    else:
        cap, unit = oil_capacity(props)
        from_country_name = props.get("StartCountry")
        to_country_name = props.get("EndCountry")
        all_countries = split_countries(props.get("Countries"))
    if cap is None or unit is None:
        return None, "no_capacity"

    from_iso = country_to_iso2(from_country_name)
    to_iso = country_to_iso2(to_country_name)
    if not from_iso or not to_iso:
        return None, f"country:{from_country_name}|{to_country_name}"

    transit = []
    for c in all_countries:
        iso = country_to_iso2(c)
        if iso and iso != from_iso and iso != to_iso:
            transit.append(iso)

    operator = (props.get("Owner") or props.get("Parent") or "").strip()
    if not operator:
        operator = "Unknown"

    row = {
        "name": name,
        "operator": operator,
        "fuel": fuel_token,
        "fromCountry": from_iso,
        "toCountry": to_iso,
        "transitCountries": transit,
        "capacity": cap,
        "capacityUnit": unit,
        "lengthKm": length,
        "status": STATUS_PASCAL.get(status, "Operating"),
        "startLat": s_lat,
        "startLon": s_lon,
        "endLat": e_lat,
        "endLon": e_lon,
        "startYear": first_year(props),
    }
    return row, None


def process(path, fuel_token, drops):
    with open(path) as f:
        gj = json.load(f)
    out = []
    for ft in gj["features"]:
        props = ft.get("properties") or {}
        geom = ft.get("geometry")
        row, reason = convert_one(props, geom, fuel_token)
        if row:
            out.append(row)
        else:
            drops[reason] = drops.get(reason, 0) + 1
    return out


def main():
    drops_gas, drops_oil = {}, {}
    gas_rows = process(GAS_PATH, "Gas", drops_gas)
    oil_rows = process(OIL_PATH, "Oil", drops_oil)

    # The operator stamps `downloadedAt` and `sourceVersion` per release so
    # the parser's deterministic-timestamp logic (resolveEvidenceTimestamp in
    # scripts/import-gem-pipelines.mjs) produces a stable lastEvidenceUpdate
    # tied to the actual download date — not "now". Override via env so the
    # script doesn't drift across re-runs.
    downloaded_at = os.environ.get("GEM_DOWNLOADED_AT", "1970-01-01")
    source_version = os.environ.get("GEM_SOURCE_VERSION", "GEM-unspecified-release")
    envelope = {
        "downloadedAt": downloaded_at,
        "sourceVersion": source_version,
        "pipelines": gas_rows + oil_rows,
    }
    json.dump(envelope, sys.stdout, indent=2, ensure_ascii=False)

    print("\n--- DROP SUMMARY (gas) ---", file=sys.stderr)
    for k, v in sorted(drops_gas.items(), key=lambda x: -x[1]):
        print(f"  {k}: {v}", file=sys.stderr)
    print(f"  KEPT: {len(gas_rows)}", file=sys.stderr)
    print("--- DROP SUMMARY (oil) ---", file=sys.stderr)
    for k, v in sorted(drops_oil.items(), key=lambda x: -x[1]):
        print(f"  {k}: {v}", file=sys.stderr)
    print(f"  KEPT: {len(oil_rows)}", file=sys.stderr)


if __name__ == "__main__":
    main()
@@ -24,6 +24,16 @@ const STOPWORDS = new Set([

const MATCH_DISTANCE_KM = 5;
const MATCH_JACCARD_MIN = 0.6;
// When the candidate's tokenized name equals the existing row's tokenized
// name (Jaccard == 1.0 after stopword removal), accept the match if ANY
// endpoint pairing is within MATCH_NAME_IDENTICAL_DISTANCE_KM. Catches PR
// #3406 review's Dampier-Bunbury case: GEM digitized only the southern
// 60% of the line, so the average-endpoint distance was 287km but the
// shared Bunbury terminus matched within 13.7km. A pure name-only rule
// would false-positive on coincidental collisions in different oceans
// (e.g. unrelated "Nord Stream 1" in the Pacific), so we still require
// SOME geographic anchor.
const MATCH_NAME_IDENTICAL_DISTANCE_KM = 25;
const EARTH_RADIUS_KM = 6371;

/**
@@ -55,6 +65,24 @@ function averageEndpointDistanceKm(a, b) {
  return Math.min(forward, reversed);
}

/**
 * Minimum of all four cross-pairings between candidate and existing endpoints.
 * Used by the name-identical short-circuit: if the candidate digitizes a
 * different segment of the same physical pipeline, only ONE endpoint pair
 * may match closely (e.g. Dampier-Bunbury: shared Bunbury terminus 13.7 km,
 * other end 560 km away because GEM stopped at Onslow vs the full Dampier
 * route). A tight average would miss this; the min of the four pairings
 * doesn't.
 */
function minPairwiseEndpointDistanceKm(a, b) {
  return Math.min(
    haversineKm(a.startPoint, b.startPoint),
    haversineKm(a.startPoint, b.endPoint),
    haversineKm(a.endPoint, b.startPoint),
    haversineKm(a.endPoint, b.endPoint),
  );
}

/**
 * Tokenize a name: lowercased word tokens, ASCII-only word boundaries,
 * stopwords removed. Stable across invocations.
@@ -85,12 +113,35 @@ function jaccard(a, b) {
|
||||
}
 
 /**
- * Decide if a candidate matches an existing row. Both criteria required.
+ * Decide if a candidate matches an existing row.
+ *
+ * Two acceptance paths:
+ * (a) Token sets are IDENTICAL (Jaccard == 1.0 after stopword removal) —
+ *     the same pipeline regardless of how either source digitized its
+ *     endpoints. Catches the Dampier-Bunbury case (PR #3406 review):
+ *     GEM's GeoJSON terminus was 13.7 km from the curated terminus
+ *     (just over the 5 km distance gate) but both names tokenize to
+ *     {dampier, to, bunbury, natural, gas}, so they are clearly the
+ *     same physical pipeline.
+ * (b) Distance ≤ 5 km AND Jaccard ≥ 0.6 — the original conjunctive rule
+ *     for slight name-variation cases (e.g. "Druzhba Pipeline" vs
+ *     "Druzhba Oil Pipeline").
  */
 function isDuplicate(candidate, existing) {
+  const sim = jaccard(candidate.name, existing.name);
+  // Path (a): identical token-set + at least one endpoint pair within 25 km.
+  // The geographic anchor distinguishes the Dampier-Bunbury case from a
+  // theoretical name-collision in a different ocean.
+  if (sim >= 1.0) {
+    const minDist = minPairwiseEndpointDistanceKm(candidate, existing);
+    if (minDist <= MATCH_NAME_IDENTICAL_DISTANCE_KM) return true;
+    // Identical names but no endpoint near each other → distinct pipelines
+    // sharing a name (rare but real). Fall through to the conjunctive rule
+    // below, which will return false because Jaccard 1.0 with > 25km min
+    // pair always exceeds 5 km average.
+  }
   const dist = averageEndpointDistanceKm(candidate, existing);
   if (dist > MATCH_DISTANCE_KM) return false;
-  const sim = jaccard(candidate.name, existing.name);
   return sim >= MATCH_JACCARD_MIN;
 }
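The two acceptance paths can be demonstrated with a toy tokenizer. Note the `STOPWORDS` set here is an assumption for illustration (the real list lives in the module and, per the docblock above, does NOT drop "to"); only the 1.0 and 0.6 thresholds come from the diff.

```javascript
// Toy tokenize/jaccard to show why each acceptance path fires.
const STOPWORDS = new Set(['pipeline']); // assumed member, for illustration only
const tokenize = (name) =>
  new Set((name.toLowerCase().match(/[a-z0-9]+/g) ?? []).filter((t) => !STOPWORDS.has(t)));
function jaccard(a, b) {
  const ta = tokenize(a);
  const tb = tokenize(b);
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}
// Path (a): case differences vanish under tokenization → Jaccard 1.0, so only
// the 25 km min-pairwise geographic anchor is still required.
console.log(jaccard('Dampier to Bunbury Natural Gas Pipeline',
                    'DAMPIER TO BUNBURY NATURAL GAS PIPELINE')); // 1
// Path (b): one extra token gives Jaccard 2/3 ≈ 0.667 ≥ 0.6, so the candidate
// must ALSO sit within the 5 km average-endpoint gate to be deduped.
console.log(jaccard('Druzhba North Pipeline', 'Druzhba North Oil Pipeline')); // ≈ 0.667
```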
@@ -160,6 +211,7 @@ export function dedupePipelines(existing, candidates) {
 export const _internal = {
   haversineKm,
   averageeEndpointDistanceKm,
+  minPairwiseEndpointDistanceKm,
   tokenize,
   jaccard,
   isDuplicate,
@@ -167,4 +219,5 @@ export const _internal = {
   STOPWORDS,
   MATCH_DISTANCE_KM,
   MATCH_JACCARD_MIN,
+  MATCH_NAME_IDENTICAL_DISTANCE_KM,
 };
@@ -44,9 +44,11 @@ export const VALID_SOURCES = new Set(['operator', 'regulator', 'press', 'satelli
 // inline copy in tests could silently drift when the enum is extended.
 export const VALID_OIL_PRODUCT_CLASSES = new Set(['crude', 'products', 'mixed']);
 
-// Minimum viable registry size. Expansion to ~75 each happens in the follow-up
-// GEM import PR; this seeder doesn't care about exact counts beyond the floor.
-const MIN_PIPELINES_PER_REGISTRY = 8;
+// Minimum viable registry size. Post-GEM-import floor: 200. Live counts after
+// the 2025-11 GGIT + 2025-03 GOIT merge are 297 gas / 334 oil; 200 leaves ~100
+// rows of jitter headroom so a partial GEM re-import or a coverage-narrowing
+// release fails loud rather than silently halving the registry.
+const MIN_PIPELINES_PER_REGISTRY = 200;
 
 function loadRegistry(filename) {
   const __dirname = dirname(fileURLToPath(import.meta.url));
@@ -96,6 +98,13 @@ export function validateRegistry(data) {
   if (!p.endPoint || typeof p.endPoint.lat !== 'number' || typeof p.endPoint.lon !== 'number') return false;
   if (!isValidLatLon(p.startPoint.lat, p.startPoint.lon)) return false;
   if (!isValidLatLon(p.endPoint.lat, p.endPoint.lon)) return false;
+  // Reject degenerate routes where startPoint == endPoint. PR #3406 review
+  // surfaced 9 GEM rows (incl. Trans-Alaska, Enbridge Line 3, Ichthys)
+  // whose source GeoJSON had a Point geometry or a single-coord LineString,
+  // producing zero-length pipelines that render as map-point artifacts and
+  // skew aggregate-length statistics. Defense in depth — converter also
+  // drops these — but the validator gate makes the contract explicit.
+  if (p.startPoint.lat === p.endPoint.lat && p.startPoint.lon === p.endPoint.lon) return false;
 
   if (!p.evidence || typeof p.evidence !== 'object') return false;
   const ev = p.evidence;
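The degenerate-route contract can be exercised in isolation. Row shapes below are trimmed to the two fields the check reads, and the ids are illustrative (not the actual Trans-Alaska registry keys):

```javascript
// Zero-length guard in isolation: a Point geometry or single-coordinate
// LineString collapses to startPoint === endPoint after conversion.
const isDegenerate = (p) =>
  p.startPoint.lat === p.endPoint.lat && p.startPoint.lon === p.endPoint.lon;
const rows = [
  { id: 'real-route', startPoint: { lat: 70.3, lon: -148.7 }, endPoint: { lat: 61.1, lon: -146.3 } },
  { id: 'point-artifact', startPoint: { lat: 61.1, lon: -146.3 }, endPoint: { lat: 61.1, lon: -146.3 } },
];
const kept = rows.filter((r) => !isDegenerate(r));
console.log(kept.map((r) => r.id)); // [ 'real-route' ]
```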
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -182,13 +182,17 @@ describe('import-gem-pipelines — minimum-viable evidence', () => {
 });
 
 describe('import-gem-pipelines — registry-shape conformance', () => {
+  // Compute the repeat count from the floor + the fixture row count so this
+  // test stays correct if the fixture is trimmed or the floor is raised. The
+  // hardcoded `for (let i = 0; i < 70; i++)` was fragile — Greptile P2 on PR
+  // #3406. +5 over the floor leaves a safety margin without inflating the test.
+  const REGISTRY_FLOOR = 200;
+
   test('emitted gas registry passes validateRegistry', () => {
-    // Build a synthetic registry of just the GEM-emitted gas rows; meets the
-    // validator's MIN_PIPELINES_PER_REGISTRY=8 floor by repeating the 3 fixture
-    // rows so we exercise the schema, not the count.
     const { gas } = parseGemPipelines(fixture);
+    const reps = Math.ceil(REGISTRY_FLOOR / gas.length) + 5;
     const repeated = [];
-    for (let i = 0; i < 3; i++) {
+    for (let i = 0; i < reps; i++) {
       for (const p of gas) repeated.push({ ...p, id: `${p.id}-rep${i}` });
     }
     const reg = {
@@ -199,8 +203,9 @@ describe('import-gem-pipelines — registry-shape conformance', () => {
 
   test('emitted oil registry passes validateRegistry', () => {
     const { oil } = parseGemPipelines(fixture);
+    const reps = Math.ceil(REGISTRY_FLOOR / oil.length) + 5;
     const repeated = [];
-    for (let i = 0; i < 3; i++) {
+    for (let i = 0; i < reps; i++) {
       for (const p of oil) repeated.push({ ...p, id: `${p.id}-rep${i}` });
     }
     const reg = {
@@ -86,6 +86,22 @@ describe('pipeline-dedup — match logic', () => {
     assert.equal(skippedDuplicates[0].matchedExistingId, 'druzhba-north');
   });
 
+  test('identical names + one shared terminus (≤25 km) → deduped (PR #3406 Dampier-Bunbury regression)', () => {
+    // Real-world case from PR #3406 review: GEM digitized only the southern
+    // 60% of the line, so the shared Bunbury terminus matched at 13.7 km
+    // but the average-endpoint distance was 287 km (over the 5 km gate).
+    // Identical token sets + ≥1 close pairing = same physical pipeline.
+    const existing = [makePipeline('dampier-bunbury', 'Dampier to Bunbury Natural Gas Pipeline',
+      -20.68, 116.72, -33.33, 115.63)];
+    const candidates = [makePipeline('dampier-to-bunbury-natural-gas-pipeline-au',
+      'Dampier to Bunbury Natural Gas Pipeline',
+      -33.265797, 115.755682, -24.86854, 113.674968)];
+    const { toAdd, skippedDuplicates } = dedupePipelines(existing, candidates);
+    assert.equal(toAdd.length, 0);
+    assert.equal(skippedDuplicates.length, 1);
+    assert.equal(skippedDuplicates[0].matchedExistingId, 'dampier-bunbury');
+  });
+
   test('name-match only (endpoints in different ocean) → added', () => {
     const existing = [makePipeline('nord-stream-1', 'Nord Stream 1',
       60.08, 29.05, 54.14, 13.66)];
@@ -88,7 +88,7 @@ describe('pipeline registries — evidence', () => {
       const hasEvidence =
         p.evidence.operatorStatement != null ||
         p.evidence.sanctionRefs.length > 0 ||
-        ['ais-relay', 'satellite', 'press'].includes(p.evidence.physicalStateSource);
+        ['ais-relay', 'satellite', 'press', 'gem'].includes(p.evidence.physicalStateSource);
       assert.ok(hasEvidence, `${p.id} has no supporting evidence for state=${p.evidence.physicalState}`);
     }
   });
@@ -157,7 +157,7 @@ describe('pipeline registries — productClass', () => {
     const { productClass: _drop, ...stripped } = oilSample;
     const bad = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [`p${i}`, { ...stripped, id: `p${i}` }]),
+        Array.from({ length: 210 }, (_, i) => [`p${i}`, { ...stripped, id: `p${i}` }]),
       ),
     };
     assert.equal(validateRegistry(bad), false);
@@ -167,7 +167,7 @@ describe('pipeline registries — productClass', () => {
     const oilSample = oil.pipelines[Object.keys(oil.pipelines)[0]!];
     const bad = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [
+        Array.from({ length: 210 }, (_, i) => [
           `p${i}`,
           { ...oilSample, id: `p${i}`, productClass: 'diesel-only' },
         ]),
@@ -180,7 +180,7 @@ describe('pipeline registries — productClass', () => {
     const gasSample = gas.pipelines[Object.keys(gas.pipelines)[0]!];
     const bad = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [
+        Array.from({ length: 210 }, (_, i) => [
           `p${i}`,
           { ...gasSample, id: `p${i}`, productClass: 'crude' },
         ]),
@@ -202,7 +202,7 @@ describe('pipeline registries — validateRegistry rejects bad input', () => {
   test('rejects a pipeline with no evidence', () => {
     const bad = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [`p${i}`, {
+        Array.from({ length: 210 }, (_, i) => [`p${i}`, {
           id: `p${i}`, name: 'x', operator: 'y', commodityType: 'gas',
           fromCountry: 'US', toCountry: 'CA', transitCountries: [],
           capacityBcmYr: 1, startPoint: { lat: 0, lon: 0 }, endPoint: { lat: 1, lon: 1 },
@@ -236,7 +236,7 @@ describe('pipeline registries — GEM source enum', () => {
     const gasSample = gas.pipelines[Object.keys(gas.pipelines)[0]!];
     const good = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [`p${i}`, {
+        Array.from({ length: 210 }, (_, i) => [`p${i}`, {
           ...gasSample,
           id: `p${i}`,
           evidence: {
@@ -264,7 +264,7 @@ describe('pipeline registries — GEM source enum', () => {
     const gasSample = gas.pipelines[Object.keys(gas.pipelines)[0]!];
     const good = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [`p${i}`, {
+        Array.from({ length: 210 }, (_, i) => [`p${i}`, {
           ...gasSample,
           id: `p${i}`,
           evidence: {
@@ -288,7 +288,7 @@ describe('pipeline registries — GEM source enum', () => {
     const gasSample = gas.pipelines[Object.keys(gas.pipelines)[0]!];
     const bad = {
       pipelines: Object.fromEntries(
-        Array.from({ length: 8 }, (_, i) => [`p${i}`, {
+        Array.from({ length: 210 }, (_, i) => [`p${i}`, {
           ...gasSample,
           id: `p${i}`,
           evidence: {