worldmonitor/consumer-prices-core
Elie Habib 56dba6c159 feat(consumer-prices): product pinning, BigBasket fix, spread threshold, retailer sync (#2136)
* feat(consumer-prices): product pinning, BigBasket fix, spread threshold, retailer sync

Implements scraper stability plan to prevent URL churn between runs:

Product pinning (core fix):
- Migration 007: adds pin_disabled_at, consecutive_out_of_stock, pin_error_count columns
- After the first successful Exa+Firecrawl match, reuse the stored URL directly on subsequent
  runs without re-running Exa. Stale pins are soft-disabled (never deleted) after 3 consecutive
  out-of-stock (OOS) results or 3 fetch errors, triggering automatic Exa rediscovery on the next run.
- On direct-path failure, falls back to normal Exa flow in the same run.
- Compound map key "basketSlug:canonicalName" prevents collisions in multi-basket markets.
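
A minimal sketch of the stale-pin bookkeeping and the compound key described above. Only the three column names and the threshold of 3 come from this commit; the `PinState` shape, the `FetchOutcome` type, `updatePin`, `pinKey`, and the choice to reset both counters on a successful fetch are assumptions for illustration.

```typescript
// Hypothetical in-memory mirror of the columns added by migration 007.
interface PinState {
  pinDisabledAt: Date | null;
  consecutiveOutOfStock: number;
  pinErrorCount: number;
}

// "3x OOS or 3x fetch errors" per the commit message.
const STALE_THRESHOLD = 3;

type FetchOutcome = "ok" | "out_of_stock" | "fetch_error";

// Returns the next pin state after one direct fetch of the pinned URL.
function updatePin(pin: PinState, outcome: FetchOutcome, now: Date): PinState {
  const next: PinState = { ...pin };
  if (outcome === "ok") {
    // Assumption: a healthy fetch clears both counters.
    next.consecutiveOutOfStock = 0;
    next.pinErrorCount = 0;
  } else if (outcome === "out_of_stock") {
    next.consecutiveOutOfStock += 1;
  } else {
    next.pinErrorCount += 1;
  }
  if (
    next.consecutiveOutOfStock >= STALE_THRESHOLD ||
    next.pinErrorCount >= STALE_THRESHOLD
  ) {
    // Soft-disable: the row is kept, only the timestamp is set.
    next.pinDisabledAt = now;
  }
  return next;
}

// Compound key so the same canonical name in two baskets does not collide.
function pinKey(basketSlug: string, canonicalName: string): string {
  return `${basketSlug}:${canonicalName}`;
}
```

Once `pinDisabledAt` is set, a caller would skip the direct path for that product and let the normal Exa flow rediscover a URL on the next run.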

Retailer active-state sync:
- getOrCreateRetailer now writes active=config.enabled on every upsert.
- scrapeAll iterates ALL configs (not just enabled) so disabled retailers get synced to DB.
- Eliminates need for manual SQL hotfixes to set active=false.
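
The sync behaviour above can be sketched with an in-memory map standing in for the retailers table. The function names echo the ones in this commit, but the signatures, the `RetailerConfig` and `RetailerRow` shapes, and the `Sketch` suffixes are hypothetical.

```typescript
interface RetailerConfig {
  slug: string;
  enabled: boolean;
}

interface RetailerRow {
  slug: string;
  active: boolean;
}

// Upsert that always mirrors config.enabled into the stored active flag.
function getOrCreateRetailerSketch(
  db: Map<string, RetailerRow>,
  config: RetailerConfig,
): RetailerRow {
  const row: RetailerRow = { slug: config.slug, active: config.enabled };
  db.set(config.slug, row);
  return row;
}

// Iterates every config so disabled retailers are synced, but only
// enabled ones are actually scraped.
function scrapeAllSketch(
  db: Map<string, RetailerRow>,
  configs: RetailerConfig[],
  scrape: (config: RetailerConfig) => void,
): void {
  for (const config of configs) {
    getOrCreateRetailerSketch(db, config);
    if (!config.enabled) continue;
    scrape(config);
  }
}
```

Because the upsert runs before the enabled check, flipping `enabled` to false in config is enough to set `active` to false on the next run.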

Analytics correctness:
- All product_matches reads in aggregate, validate, worldmonitor snapshot now filter
  pin_disabled_at IS NULL so soft-disabled stale matches don't skew indices.
- getBaselinePrices adds missing match_status IN ('auto','approved') guard.
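
As a rough illustration, the two guards above amount to the following row predicate. The real code applies them in SQL; the `ProductMatchRow` shape and `isUsableMatch` are assumptions, and only the column names and status values come from this commit.

```typescript
interface ProductMatchRow {
  match_status: string;
  pin_disabled_at: string | null;
}

// Mirrors: pin_disabled_at IS NULL AND match_status IN ('auto','approved')
function isUsableMatch(row: ProductMatchRow): boolean {
  const statusOk = row.match_status === "auto" || row.match_status === "approved";
  return row.pin_disabled_at === null && statusOk;
}
```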

BigBasket IN fix:
- inStockFromPrice: true flag overrides out-of-stock when price > 0.
- BigBasket gates on delivery pincode, not product availability — Firecrawl misread
  the pincode gate as out-of-stock for all 12 basket items.
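
A minimal sketch of the override, assuming the flag lives on a per-retailer config object. `ParsedListing`, `RetailerFlags`, and `resolveInStock` are hypothetical names; only `inStockFromPrice` and the "price > 0" rule come from this commit.

```typescript
interface ParsedListing {
  price: number | null;
  inStock: boolean; // as reported by the extractor
}

interface RetailerFlags {
  inStockFromPrice?: boolean;
}

// When the flag is set, a positive price wins over an out-of-stock reading.
function resolveInStock(listing: ParsedListing, flags: RetailerFlags): boolean {
  if (
    flags.inStockFromPrice === true &&
    listing.price !== null &&
    listing.price > 0
  ) {
    return true;
  }
  return listing.inStock;
}
```

Retailers without the flag keep the extractor's reading unchanged, so the override stays scoped to sites with a location gate.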

Spread reliability:
- Minimum 4 common categories required to compute retailer_spread_pct.
- Writes an explicit 0 when below the threshold to prevent a stale, noisy value from persisting.
- The US spread of 134.8%, computed from a single cooking_oil pair, is now suppressed.
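
A sketch of the threshold gate. The commit does not state how `retailer_spread_pct` is computed, so the formula here (max versus min retailer total over the common categories) is an assumption, as are `CategoryPrices` and `retailerSpreadPct`; only the minimum of 4 common categories and the explicit 0 come from this commit.

```typescript
const MIN_COMMON_CATEGORIES = 4;

// category name -> price for one retailer
type CategoryPrices = Record<string, number>;

function retailerSpreadPct(retailers: CategoryPrices[]): number {
  const first = retailers[0];
  if (first === undefined || retailers.length < 2) return 0;

  // Categories priced by every retailer.
  const common = Object.keys(first).filter((cat) =>
    retailers.every((r) => cat in r),
  );
  // Below the threshold, return an explicit 0 instead of a noisy value.
  if (common.length < MIN_COMMON_CATEGORIES) return 0;

  // Assumed formula: compare basket totals over the common categories.
  const totals = retailers.map((r) =>
    common.reduce((sum, cat) => sum + (r[cat] ?? 0), 0),
  );
  const min = Math.min(...totals);
  const max = Math.max(...totals);
  return min > 0 ? ((max - min) / min) * 100 : 0;
}
```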

Other:
- Tamimi SA: better Exa query template targeting tamimimarkets.com directly.
- Remove KE from frontend MARKETS array (basket data preserved in DB).
- 13 new vitest unit tests covering pinning, inStockFromPrice, host validation.

* fix(scrape): distinguish wasDirectHit from isDirect to close pin-fallback loop

When a pinned target falls back to Exa (fetchTarget sets direct:false in
payload), the old code read isDirect from target.metadata (always true),
causing two bugs:
1. upsertProductMatch was skipped → new Exa URL never pinned
2. Stale-pin counters reset on any pin target → broken pins never disabled

Introduce wasDirectHit = isDirect && rawPayload.direct === true so:
- Stale-pin maintenance (OOS/reset) only fires when pin URL was actually used
- Exa fallback (isDirect && !wasDirectHit) fires handlePinError on old pin
- upsertProductMatch guard uses !wasDirectHit so fallback results get pinned
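
The three rules above can be summarized as a small decision function. This is a sketch: the `direct` field on `target.metadata`, the interface shapes, and `classifyPinResult` are assumptions, while the `wasDirectHit` expression and the three outcomes follow the description in this commit.

```typescript
interface ScrapeTarget {
  metadata: { direct?: boolean }; // true whenever the target came from a pin
}

interface RawPayload {
  direct?: boolean; // fetchTarget sets this to false on Exa fallback
}

interface PinDecision {
  wasDirectHit: boolean;
  runStalePinMaintenance: boolean; // OOS increment / counter reset
  callHandlePinError: boolean; // fallback happened, old pin failed
  callUpsertProductMatch: boolean; // pin the URL that was actually used
}

function classifyPinResult(
  target: ScrapeTarget,
  rawPayload: RawPayload,
): PinDecision {
  const isDirect = target.metadata.direct === true;
  const wasDirectHit = isDirect && rawPayload.direct === true;
  return {
    wasDirectHit,
    runStalePinMaintenance: wasDirectHit,
    callHandlePinError: isDirect && !wasDirectHit,
    callUpsertProductMatch: !wasDirectHit,
  };
}
```

With this split, a pinned target that fell back to Exa both records an error against the old pin and stores the newly discovered URL, which closes the loop described above.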
2026-03-23 14:19:30 +04:00