mirror of https://github.com/koala73/worldmonitor.git (synced 2026-04-25 17:14:57 +02:00)
* fix(cache): align Redis digest + RSS feed TTLs to CF CDN TTL
RSS feed TTL 600s → 3600s; digest TTL 900s → 3600s.
CF CDN caches at 3600s, so Redis expiring earlier caused every hourly
CF revalidation to hit a cold origin and run the full buildDigest()
pipeline (75 feeds, up to 25s). Aligning both to 3600s ensures CF
revalidation gets a warm Redis hit and returns immediately.
* fix(cache): emit only non-ok feedStatuses; update proto comment + make generate
Digest was emitting 'ok' for every successful feed (~50 entries, ~1-2KB
per response). No in-repo client reads feedStatuses values. Changed to
only emit 'empty' and 'timeout'; absent key implies ok.
Updated proto comment to document the absence-implies-ok contract and
ran make generate to regenerate docs/api/ OpenAPI files.
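
A minimal sketch of how a client could read feedStatuses under the absence-implies-ok contract described above. The helper name `feedStatusOf` and the `FeedStatus` type are hypothetical and not taken from the repository; only the values 'empty' and 'timeout' come from the commit text.

```typescript
// Hypothetical client-side helper; not part of the repository.
type FeedStatus = "ok" | "empty" | "timeout";

function feedStatusOf(
  feedStatuses: Record<string, string>,
  feedName: string,
): FeedStatus {
  const raw = feedStatuses[feedName];
  // Only 'empty' and 'timeout' are emitted by the digest.
  // A missing key (or any unexpected value) is read as ok.
  return raw === "empty" || raw === "timeout" ? raw : "ok";
}
```

Treating unknown values as ok is a design choice of this sketch; a stricter client might want to log them instead.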
* fix(cache): add slow-browser tier; move digest route to it
New 'slow-browser' tier is identical to 'slow' but adds max-age=300,
letting browsers skip the network for 5 minutes. Without max-age,
browsers ignore s-maxage and send conditional If-None-Match on every
20-min poll — each costing 1 billable edge request even for 304s.
Scoped only to list-feed-digest (a safe polling endpoint). Premium
user-triggered endpoints (analyze-stock, backtest-stock) stay on 'slow'
where browser caching is inappropriate.
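
The difference between the two tiers can be sketched as a small header builder. This is an illustration under assumptions: the function name `cacheControlFor`, the `public` directive, and the 1800s shared lifetime are placeholders for this example, and the real tier table in the gateway may differ. Only max-age=300 on 'slow-browser' and its absence on 'slow' come from the text above.

```typescript
type CacheTier = "slow" | "slow-browser";

// Assumed values for illustration only.
const SHARED_MAX_AGE_SECONDS = 1800;
const BROWSER_MAX_AGE_SECONDS = 300;

function cacheControlFor(tier: CacheTier): string {
  const directives = ["public", `s-maxage=${SHARED_MAX_AGE_SECONDS}`];
  if (tier === "slow-browser") {
    // Only this tier lets the browser reuse its local copy without a request.
    directives.splice(1, 0, `max-age=${BROWSER_MAX_AGE_SECONDS}`);
  }
  return directives.join(", ");
}
```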
* test: regression tests for feedStatuses and slow-browser tier
- digest-no-reclassify: assert buildDigest does not write 'ok' to feedStatuses
- route-cache-tier: include slow-browser in tier regex; assert slow-browser
has max-age and slow tier does not
* fix(cache): add variant to per-feed RSS cache key
rss:feed:v1:${url} was shared across variants even though classifyByKeyword()
bakes variant-specific threat/category labels into the cached ParsedItem[].
Feeds shared between full and tech variants (Verge, Ars, HN, etc.) had
whichever variant populated the cache first control the other variant's
classifications for the full 3600s TTL — turning a pre-existing 10-minute
bleed-through into a 1-hour accuracy bug for the tech dashboard.
Fix: key is now rss:feed:v1:${variant}:${url}.
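
As a sketch, the new key shape can be expressed as a small builder. The function name `rssFeedCacheKey` is hypothetical; the key format itself is the one stated above.

```typescript
// Hypothetical helper name; the key layout follows the commit text.
function rssFeedCacheKey(variant: string, url: string): string {
  // The variant is part of the key because the cached items carry
  // variant-specific classifications.
  return `rss:feed:v1:${variant}:${url}`;
}
```

With this shape, a feed shared by the full and tech variants resolves to two separate cache entries, so neither variant can overwrite the other's labels.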
* fix(cache): bypass browser HTTP cache on digest fetch
max-age=300 on the slow-browser tier lets browsers serve the digest
from their HTTP cache for up to 5 minutes, including on explicit
in-app refresh (window.location.reload) or page reload after a
breaking event. Users would see stale data until the TTL expired.
Add cache: 'no-cache' to tryFetchDigest() so every fetch revalidates
against CF edge. CF returns 304 (minimal cost) when data is unchanged,
or 200 with the current digest. s-maxage and CF-level caching are
unaffected; max-age still benefits browser back/forward cache.
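
A minimal sketch of the request this describes, assuming the digest is fetched from the list-feed-digest route with `variant` and `lang` query parameters as in the OpenAPI file. The helper `buildDigestRequest` is hypothetical; the body of the real tryFetchDigest() is not shown in this commit.

```typescript
// Hypothetical helper; builds the URL and fetch options for a digest request.
interface DigestRequest {
  url: string;
  init: { cache: "no-cache" };
}

function buildDigestRequest(variant: string, lang: string): DigestRequest {
  const query =
    `variant=${encodeURIComponent(variant)}&lang=${encodeURIComponent(lang)}`;
  return {
    url: `/api/news/v1/list-feed-digest?${query}`,
    // 'no-cache' makes the browser revalidate with the server before
    // reusing a stored response, instead of trusting max-age.
    init: { cache: "no-cache" },
  };
}
```

In a browser the result would be passed straight to fetch, for example `const { url, init } = buildDigestRequest("tech", "en"); const res = await fetch(url, init);`.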
* fix(cache): 15-min consistent TTL + degrade guard for digest
Issue 1 — TTL alignment: Redis digest TTL reverted to 900s (from 3600).
slow-browser tier reduced from s-maxage=1800/CDN=3600 to s-maxage=900 on
both sides, matching the Redis TTL. The freshness window is now consistently
15 minutes across Redis, Vercel edge, and CF CDN. max-age=300 (browser
local) is kept to avoid unnecessary revalidations on tab switch.
Issue 2 — Cache poisoning: replaced cachedFetchJson in listFeedDigest with
explicit getCachedJson/setCachedJson. After buildDigest(), if total items
across all categories is 0 the response is treated as degraded: Redis write
is skipped and markNoCacheResponse(ctx.request) is called so the gateway
sets Cache-Control: no-store instead of the normal tier headers. This
prevents a transient bad run from poisoning Redis and browser/CDN for the
full TTL. Error paths also call markNoCacheResponse.
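
The degrade guard in Issue 2 can be sketched as a pure decision function. The names `countItems`, `decideCaching`, and `CacheDecision` are hypothetical; the actual handler uses getCachedJson/setCachedJson and markNoCacheResponse, whose signatures are not shown here, so this example only models the decision and not the I/O.

```typescript
// Shapes follow the ListFeedDigestResponse schema; only the fields used here are declared.
interface CategoryBucket {
  items?: unknown[];
}

interface Digest {
  categories: Record<string, CategoryBucket>;
}

// Hypothetical result type: what the handler should do with a freshly built digest.
interface CacheDecision {
  writeToRedis: boolean;
  noStore: boolean; // true means respond with Cache-Control: no-store
}

function countItems(digest: Digest): number {
  let total = 0;
  for (const name of Object.keys(digest.categories)) {
    total += digest.categories[name]?.items?.length ?? 0;
  }
  return total;
}

function decideCaching(digest: Digest): CacheDecision {
  // Zero items across all categories is treated as a degraded run.
  const degraded = countItems(digest) === 0;
  return { writeToRedis: !degraded, noStore: degraded };
}
```

Keeping the decision separate from the Redis and gateway calls makes the rule easy to cover with a regression test.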
338 lines
14 KiB
YAML
openapi: 3.1.0
info:
    title: NewsService API
    version: 1.0.0
paths:
    /api/news/v1/summarize-article:
        post:
            tags:
                - NewsService
            summary: SummarizeArticle
            description: SummarizeArticle generates an LLM summary with provider selection and fallback support.
            operationId: SummarizeArticle
            requestBody:
                content:
                    application/json:
                        schema:
                            $ref: '#/components/schemas/SummarizeArticleRequest'
                required: true
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/SummarizeArticleResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
    /api/news/v1/summarize-article-cache:
        get:
            tags:
                - NewsService
            summary: GetSummarizeArticleCache
            description: GetSummarizeArticleCache looks up a cached summary by deterministic key (CDN-cacheable GET).
            operationId: GetSummarizeArticleCache
            parameters:
                - name: cache_key
                  in: query
                  description: Deterministic cache key computed by buildSummaryCacheKey().
                  required: false
                  schema:
                      type: string
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/SummarizeArticleResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
    /api/news/v1/list-feed-digest:
        get:
            tags:
                - NewsService
            summary: ListFeedDigest
            description: ListFeedDigest returns a pre-aggregated digest of all RSS feeds for a site variant.
            operationId: ListFeedDigest
            parameters:
                - name: variant
                  in: query
                  description: 'Site variant: full, tech, finance, happy'
                  required: false
                  schema:
                      type: string
                - name: lang
                  in: query
                  description: ISO 639-1 language code (en, fr, ar, etc.)
                  required: false
                  schema:
                      type: string
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ListFeedDigestResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
components:
    schemas:
        Error:
            type: object
            properties:
                message:
                    type: string
                    description: Error message (e.g., 'user not found', 'database connection failed')
            description: Error is returned when a handler encounters an error. It contains a simple error message that the developer can customize.
        FieldViolation:
            type: object
            properties:
                field:
                    type: string
                    description: The field path that failed validation (e.g., 'user.email' for nested fields). For header validation, this will be the header name (e.g., 'X-API-Key')
                description:
                    type: string
                    description: Human-readable description of the validation violation (e.g., 'must be a valid email address', 'required field missing')
            required:
                - field
                - description
            description: FieldViolation describes a single validation error for a specific field.
        ValidationError:
            type: object
            properties:
                violations:
                    type: array
                    items:
                        $ref: '#/components/schemas/FieldViolation'
                    description: List of validation violations
            required:
                - violations
            description: ValidationError is returned when request validation fails. It contains a list of field violations describing what went wrong.
        SummarizeArticleRequest:
            type: object
            properties:
                provider:
                    type: string
                    minLength: 1
                    description: 'LLM provider: "ollama", "groq", "openrouter"'
                headlines:
                    type: array
                    items:
                        type: string
                        minItems: 1
                        description: Headlines to summarize (max 8 used).
                    minItems: 1
                mode:
                    type: string
                    description: 'Summarization mode: "brief", "analysis", "translate", "" (default).'
                geoContext:
                    type: string
                    description: Geographic signal context to include in the prompt.
                variant:
                    type: string
                    description: 'Variant: "full", "tech", or target language for translate mode.'
                lang:
                    type: string
                    description: Output language code, default "en".
            required:
                - provider
            description: SummarizeArticleRequest specifies parameters for LLM article summarization.
        SummarizeArticleResponse:
            type: object
            properties:
                summary:
                    type: string
                    description: The generated summary text.
                model:
                    type: string
                    description: Model identifier used for generation.
                provider:
                    type: string
                    description: Provider that produced the result (or "cache").
                tokens:
                    type: integer
                    format: int32
                    description: Token count from the LLM response.
                fallback:
                    type: boolean
                    description: Whether the client should try the next provider in the fallback chain.
                error:
                    type: string
                    description: Error message if the request failed.
                errorType:
                    type: string
                    description: Error type/name (e.g. "TypeError").
                status:
                    type: string
                    enum:
                        - SUMMARIZE_STATUS_UNSPECIFIED
                        - SUMMARIZE_STATUS_SUCCESS
                        - SUMMARIZE_STATUS_CACHED
                        - SUMMARIZE_STATUS_SKIPPED
                        - SUMMARIZE_STATUS_ERROR
                    description: SummarizeStatus indicates the outcome of a summarization request.
                statusDetail:
                    type: string
                    description: Human-readable detail for non-success statuses (skip reason, etc.).
            description: SummarizeArticleResponse contains the LLM summarization result.
        GetSummarizeArticleCacheRequest:
            type: object
            properties:
                cacheKey:
                    type: string
                    description: Deterministic cache key computed by buildSummaryCacheKey().
            description: GetSummarizeArticleCacheRequest looks up a pre-computed summary by cache key.
        ListFeedDigestRequest:
            type: object
            properties:
                variant:
                    type: string
                    description: 'Site variant: full, tech, finance, happy'
                lang:
                    type: string
                    description: ISO 639-1 language code (en, fr, ar, etc.)
        ListFeedDigestResponse:
            type: object
            properties:
                categories:
                    type: object
                    additionalProperties:
                        $ref: '#/components/schemas/CategoryBucket'
                    description: Per-category buckets — keys match category names from feed config
                feedStatuses:
                    type: object
                    additionalProperties:
                        type: string
                    description: |-
                        Per-feed status — only non-ok states emitted; absent key implies ok.
                        Values: empty (feed returned 0 items), timeout (timed out during fetch).
                generatedAt:
                    type: string
                    description: ISO 8601 timestamp of when this digest was generated
        CategoriesEntry:
            type: object
            properties:
                key:
                    type: string
                value:
                    $ref: '#/components/schemas/CategoryBucket'
        FeedStatusesEntry:
            type: object
            properties:
                key:
                    type: string
                value:
                    type: string
        CategoryBucket:
            type: object
            properties:
                items:
                    type: array
                    items:
                        $ref: '#/components/schemas/NewsItem'
        NewsItem:
            type: object
            properties:
                source:
                    type: string
                    minLength: 1
                    description: Source feed name.
                title:
                    type: string
                    minLength: 1
                    description: Article headline.
                link:
                    type: string
                    description: Article URL.
                publishedAt:
                    type: integer
                    format: int64
                    description: 'Publication time, as Unix epoch milliseconds. Warning: Values > 2^53 may lose precision in JavaScript'
                isAlert:
                    type: boolean
                    description: Whether this article triggered an alert condition.
                threat:
                    $ref: '#/components/schemas/ThreatClassification'
                location:
                    $ref: '#/components/schemas/GeoCoordinates'
                locationName:
                    type: string
                    description: Human-readable location name.
            required:
                - source
                - title
            description: NewsItem represents a single news article from RSS feed aggregation.
        ThreatClassification:
            type: object
            properties:
                level:
                    type: string
                    enum:
                        - THREAT_LEVEL_UNSPECIFIED
                        - THREAT_LEVEL_LOW
                        - THREAT_LEVEL_MEDIUM
                        - THREAT_LEVEL_HIGH
                        - THREAT_LEVEL_CRITICAL
                    description: ThreatLevel represents the assessed threat level of a news event.
                category:
                    type: string
                    description: Event category.
                confidence:
                    type: number
                    maximum: 1
                    minimum: 0
                    format: double
                    description: Confidence score (0.0 to 1.0).
                source:
                    type: string
                    description: Classification source — "keyword", "ml", or "llm".
            description: ThreatClassification represents an AI-assessed threat level for a news item.
        GeoCoordinates:
            type: object
            properties:
                latitude:
                    type: number
                    maximum: 90
                    minimum: -90
                    format: double
                    description: Latitude in decimal degrees (-90 to 90).
                longitude:
                    type: number
                    maximum: 180
                    minimum: -180
                    format: double
                    description: Longitude in decimal degrees (-180 to 180).
            description: GeoCoordinates represents a geographic location using WGS84 coordinates.