Elie Habib 261d40dfa1 fix(consumer-prices): block restaurant URLs and reject null productName extractions (#2156)
Two data-quality regressions found in the 2026-03-23 scrape:

1. ananinja_sa: Exa returns restaurant menu URLs (e.g. /restaurants/burger-king)
   alongside product pages. isTitlePlausible passes "onion rings" for "Onions 1kg"
   because "onion" is a substring. Add urlPathContains: /product/ to restrict Exa
   results to product pages only, matching the existing pattern used by tamimi/carrefour.

2. search.ts parseListing: when Firecrawl can't extract a productName, we fell back
   to the canonical name as rawTitle. This silently stored unverifiable matches
   (raw_title = canonical_name) as real prices; for example, Noon matched a chicken
   product as "Eggs Fresh 12 Pack". Reject the result outright when productName is missing.
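
A minimal sketch of the two guards described above, not the actual diff. The types
Extraction and Listing and the helpers pathContains and parseListingSketch are
hypothetical names chosen for illustration; the real search.ts signatures and the way
the urlPathContains setting is consumed are not shown in this commit.

```typescript
// Hypothetical shapes; the real types in search.ts may differ.
interface Extraction {
  productName: string | null;
  price: number | null;
}

interface Listing {
  rawTitle: string;
  price: number;
  url: string;
}

// Guard 1: keep only URLs whose path contains the required segment,
// e.g. "/product/", so "/restaurants/burger-king" is dropped.
function pathContains(url: string, segment: string): boolean {
  try {
    return new URL(url).pathname.includes(segment);
  } catch {
    return false; // unparseable URLs are treated as non-matching
  }
}

// Guard 2: reject the result when no productName was extracted,
// instead of falling back to the canonical name as rawTitle.
function parseListingSketch(url: string, extraction: Extraction): Listing | null {
  const name = extraction.productName?.trim();
  if (!name || extraction.price === null) return null;
  return { rawTitle: name, price: extraction.price, url };
}
```

With these guards a restaurant URL never reaches title matching, and a missing
productName yields no listing rather than a row where raw_title equals canonical_name.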

DB: manually disabled 30 known-bad pins (26 canonical-name=raw-title failures,
2 restaurant URLs, 2 bundle-size mismatches) via pin_disabled_at.
2026-03-23 21:47:53 +04:00