mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(consumer-prices): product pinning, BigBasket fix, spread threshold, retailer sync
Implements the scraper stability plan to prevent URL churn between runs:
Product pinning (core fix):
- Migration 007: adds pin_disabled_at, consecutive_out_of_stock, pin_error_count columns
- After the first successful Exa+Firecrawl match, reuse the stored URL directly on subsequent
runs without re-running Exa. Stale pins are soft-disabled (never deleted) after 3 consecutive
out-of-stock results or 3 fetch errors, triggering automatic Exa rediscovery on the next run.
- On direct-path failure, falls back to normal Exa flow in the same run.
- Compound map key "basketSlug:canonicalName" prevents collisions in multi-basket markets.
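
A minimal sketch of the soft-disable rule and the compound key described above. The field names mirror the migration 007 columns, but the PinState shape, the applyOutcome helper, and the choice to reset both counters on an in-stock hit are assumptions made for illustration, not the repository's actual code.

```typescript
// Hypothetical in-memory view of a product_matches row (names follow migration 007).
interface PinState {
  url: string;
  pinDisabledAt: Date | null;
  consecutiveOutOfStock: number;
  pinErrorCount: number;
}

type FetchOutcome = "in_stock" | "out_of_stock" | "fetch_error";

// "3x OOS or 3x fetch errors" from the commit text.
const STALE_THRESHOLD = 3;

// Compound key so the same canonical name in two baskets does not collide.
function pinKey(basketSlug: string, canonicalName: string): string {
  return `${basketSlug}:${canonicalName}`;
}

// Returns the updated pin; the pin is soft-disabled (timestamped), never deleted.
function applyOutcome(pin: PinState, outcome: FetchOutcome, now: Date): PinState {
  const next: PinState = { ...pin };
  if (outcome === "in_stock") {
    // Assumption: a healthy direct hit clears both counters.
    next.consecutiveOutOfStock = 0;
    next.pinErrorCount = 0;
  } else if (outcome === "out_of_stock") {
    next.consecutiveOutOfStock += 1;
  } else {
    next.pinErrorCount += 1;
  }
  const stale =
    next.consecutiveOutOfStock >= STALE_THRESHOLD ||
    next.pinErrorCount >= STALE_THRESHOLD;
  if (stale && next.pinDisabledAt === null) {
    next.pinDisabledAt = now; // the next run would fall back to Exa rediscovery
  }
  return next;
}
```

Keeping the disabled row rather than deleting it preserves the match history while letting readers exclude it with a pin_disabled_at IS NULL filter.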
Retailer active-state sync:
- getOrCreateRetailer now writes active=config.enabled on every upsert.
- scrapeAll iterates ALL configs (not just enabled) so disabled retailers get synced to DB.
- Eliminates need for manual SQL hotfixes to set active=false.
Analytics correctness:
- All product_matches reads in aggregate, validate, and the worldmonitor snapshot now filter on
pin_disabled_at IS NULL so soft-disabled stale matches don't skew indices.
- getBaselinePrices adds missing match_status IN ('auto','approved') guard.
BigBasket IN fix:
- inStockFromPrice: true flag overrides out-of-stock when price > 0.
- BigBasket gates on delivery pincode, not product availability; Firecrawl misread
the pincode gate as out-of-stock for all 12 basket items.
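
The override can be sketched as below. Only the inStockFromPrice flag name comes from this commit; the ExtractedProduct and RetailerFlags shapes and the resolveInStock helper are hypothetical.

```typescript
// Hypothetical shape of what the extractor returns for one product page.
interface ExtractedProduct {
  price: number | null;
  inStock: boolean;
}

// Hypothetical per-retailer config; only inStockFromPrice is named in the commit.
interface RetailerFlags {
  inStockFromPrice?: boolean;
}

// When the flag is set, a positive price wins over the extracted stock status.
function resolveInStock(product: ExtractedProduct, flags: RetailerFlags): boolean {
  if (flags.inStockFromPrice === true && product.price !== null && product.price > 0) {
    return true;
  }
  return product.inStock;
}
```

Without the flag, or when no positive price was extracted, the extracted stock status is used unchanged.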
Spread reliability:
- Minimum 4 common categories required to compute retailer_spread_pct.
- Writes an explicit 0 when below the threshold to prevent a stale, noisy value from persisting.
- US spread 134.8% from single cooking_oil pair is now suppressed.
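
A sketch of the threshold gate, under stated assumptions: the commit does not give the spread formula, so (max - min) / min over per-retailer totals of the common categories is used here purely as a stand-in, and all names below are hypothetical.

```typescript
// Minimum number of categories priced by every retailer (from the commit text).
const MIN_COMMON_CATEGORIES = 4;

// category -> price for one retailer (hypothetical shape).
type CategoryPrices = Map<string, number>;

// Categories that every retailer has a price for.
function commonCategories(retailers: CategoryPrices[]): string[] {
  const first = retailers[0];
  if (first === undefined) return [];
  const rest = retailers.slice(1);
  return Array.from(first.keys()).filter((c) => rest.every((r) => r.has(c)));
}

// Returns an explicit 0 when the overlap is too small to be meaningful.
function retailerSpreadPct(retailers: CategoryPrices[]): number {
  const common = commonCategories(retailers);
  if (retailers.length < 2 || common.length < MIN_COMMON_CATEGORIES) return 0;
  const totals = retailers.map((r) =>
    common.reduce((sum, c) => sum + (r.get(c) ?? 0), 0),
  );
  const min = Math.min(...totals);
  const max = Math.max(...totals);
  // Assumed formula, not confirmed by the source.
  return min > 0 ? ((max - min) / min) * 100 : 0;
}
```

Returning 0 instead of skipping the write means a previously stored value computed from too few categories is overwritten rather than left in place.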
Other:
- Tamimi SA: better Exa query template targeting tamimimarkets.com directly.
- Remove KE from frontend MARKETS array (basket data preserved in DB).
- 13 new vitest unit tests covering pinning, inStockFromPrice, host validation.
* fix(scrape): distinguish wasDirectHit from isDirect to close pin-fallback loop
When a pinned target falls back to Exa (fetchTarget sets direct:false in
payload), the old code read isDirect from target.metadata (always true),
causing two bugs:
1. upsertProductMatch was skipped → new Exa URL never pinned
2. Stale-pin counters reset on any pin target → broken pins never disabled
Introduce wasDirectHit = isDirect && rawPayload.direct === true so:
- Stale-pin maintenance (OOS/reset) only fires when pin URL was actually used
- Exa fallback (isDirect && !wasDirectHit) fires handlePinError on old pin
- upsertProductMatch guard uses !wasDirectHit so fallback results get pinned
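
The three branches can be summarised in a small decision helper. This is a sketch, not the actual scrape code: the names isDirect, wasDirectHit, rawPayload.direct, handlePinError, and upsertProductMatch come from the commit text, while the Target and RawPayload shapes (including the metadata.direct field) and the classifyPinResult helper are assumptions.

```typescript
// Hypothetical shapes; only the `direct` payload flag is named in the commit.
interface Target {
  metadata: { direct?: boolean };
}
interface RawPayload {
  direct?: boolean;
}

interface PinDecision {
  isDirect: boolean;
  wasDirectHit: boolean;
  runStalePinMaintenance: boolean; // OOS counting / counter reset on the pin
  callHandlePinError: boolean; // old pin failed and Exa fallback was used
  callUpsertProductMatch: boolean; // pin the URL that actually produced data
}

function classifyPinResult(target: Target, rawPayload: RawPayload): PinDecision {
  // isDirect: the target was built from a stored pin.
  const isDirect = target.metadata.direct === true;
  // wasDirectHit: the pinned URL itself was fetched successfully.
  const wasDirectHit = isDirect && rawPayload.direct === true;
  return {
    isDirect,
    wasDirectHit,
    runStalePinMaintenance: wasDirectHit,
    callHandlePinError: isDirect && !wasDirectHit,
    callUpsertProductMatch: !wasDirectHit,
  };
}
```

For a pinned target that fell back to Exa, this yields callHandlePinError and callUpsertProductMatch both true, which is the case the old isDirect-only check handled incorrectly.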