worldmonitor/docs/api/NewsService.openapi.yaml
Elie Habib dcf73385ca fix(scoring): rebalance formula weights severity 55%, corroboration 15% (#3144)
* fix(scoring): rebalance formula weights severity 55%, corroboration 15%

PR A of the scoring recalibration plan (docs/plans/2026-04-17-002).

The v2 shadow-log recalibration (690 items, Pearson 0.413) showed the
formula compresses scores into a narrow 30-70 range, making the 85
critical gate unreachable and the 65 high gate marginal. Root cause:
corroboration at 30% weight penalizes breaking single-source news
(the most important alerts) while severity at 40% doesn't separate
critical from high enough.

Weight change:
  BEFORE: severity 0.40 + sourceTier 0.20 + corroboration 0.30 + recency 0.10
  AFTER:  severity 0.55 + sourceTier 0.20 + corroboration 0.15 + recency 0.10

Expected effect: critical/tier1/fresh rises from 76 to 88 (clears 85
gate). critical/tier2/fresh rises from 71 to 83 (recommend lowering
critical gate to 80 at activation time). high/tier2/fresh rises from
61 to 69 (clears 65 gate). The HIGH-CRITICAL gap widens from 10 to
14 points for same-tier items.
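
The arithmetic behind these numbers can be sketched as a plain weighted sum. The commit message does not list the per-component sub-scores, so the values below (critical = 100, high = 75, tier1 = 100, tier2 = 75, single-source corroboration = 20, fresh = 100) are assumptions, back-solved because they reproduce the figures quoted above. The names `Weights`, `Components`, and `importanceScore` are hypothetical and may not match the real implementation.

```typescript
// Weights expressed as integer percentages to avoid floating-point drift.
interface Weights {
  severity: number;
  sourceTier: number;
  corroboration: number;
  recency: number;
}

// Component sub-scores on a 0-100 scale (assumed scale, not confirmed by the source).
type Components = Weights;

const BEFORE: Weights = { severity: 40, sourceTier: 20, corroboration: 30, recency: 10 };
const AFTER: Weights = { severity: 55, sourceTier: 20, corroboration: 15, recency: 10 };

// Weighted sum of the four components, rounded to an integer score.
function importanceScore(w: Weights, c: Components): number {
  const weighted =
    w.severity * c.severity +
    w.sourceTier * c.sourceTier +
    w.corroboration * c.corroboration +
    w.recency * c.recency;
  return Math.round(weighted / 100);
}

// Assumed sub-scores for single-source, fresh items.
const criticalTier1: Components = { severity: 100, sourceTier: 100, corroboration: 20, recency: 100 };
const criticalTier2: Components = { severity: 100, sourceTier: 75, corroboration: 20, recency: 100 };
const highTier2: Components = { severity: 75, sourceTier: 75, corroboration: 20, recency: 100 };

console.log(importanceScore(BEFORE, criticalTier1), importanceScore(AFTER, criticalTier1)); // 76 88
console.log(importanceScore(BEFORE, criticalTier2), importanceScore(AFTER, criticalTier2)); // 71 83
console.log(importanceScore(BEFORE, highTier2), importanceScore(AFTER, highTier2)); // 61 69
```

Under these assumed inputs, the same-tier HIGH-CRITICAL gap is 71 - 61 = 10 before and 83 - 69 = 14 after, matching the figures above.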

Also:
- Bumps shadow-log key from v2 to v3 for a clean recalibration dataset
  (v2 had old-weight scores that would contaminate the 48h soak)
- Updates proto/news_item.proto formula comment to reflect new weights
- Updates cache-keys.ts documentation

No cache migration needed: the classify cache stores {level, category},
not scores. Scores are computed at read time from the stored level +
the formula, so new digest requests immediately produce new scores.

Gates remain OFF. After 48h of v3 data, re-run:
  node scripts/shadow-score-report.mjs
  node scripts/shadow-score-rank.mjs sample 25

🤖 Generated with Claude Opus 4.6 via Claude Code + Compound Engineering v2.49.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: regenerate proto OpenAPI docs for weight rebalance

* fix(scoring): bump SHADOW_SCORE_LOG_KEY export to v3

The exported constant in cache-keys.ts was left at v2 while the relay's
local constant was bumped to v3. Anyone importing the export (or grep-
discovering it) would get a stale key. Architecture review flagged this.

* fix(scoring): update test + stale comments for shadow-log v3

Review found the regression test still asserted v2 key, causing CI
failure. Also fixed stale v1/v2 references in report script header,
default-key comment, report title render, and shouldNotify docstring.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 17:43:39 +04:00

openapi: 3.1.0
info:
    title: NewsService API
    version: 1.0.0
paths:
    /api/news/v1/summarize-article:
        post:
            tags:
                - NewsService
            summary: SummarizeArticle
            description: SummarizeArticle generates an LLM summary with provider selection and fallback support.
            operationId: SummarizeArticle
            requestBody:
                content:
                    application/json:
                        schema:
                            $ref: '#/components/schemas/SummarizeArticleRequest'
                required: true
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/SummarizeArticleResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
    /api/news/v1/summarize-article-cache:
        get:
            tags:
                - NewsService
            summary: GetSummarizeArticleCache
            description: GetSummarizeArticleCache looks up a cached summary by deterministic key (CDN-cacheable GET).
            operationId: GetSummarizeArticleCache
            parameters:
                - name: cache_key
                  in: query
                  description: Deterministic cache key computed by buildSummaryCacheKey().
                  required: false
                  schema:
                    type: string
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/SummarizeArticleResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
    /api/news/v1/list-feed-digest:
        get:
            tags:
                - NewsService
            summary: ListFeedDigest
            description: ListFeedDigest returns a pre-aggregated digest of all RSS feeds for a site variant.
            operationId: ListFeedDigest
            parameters:
                - name: variant
                  in: query
                  description: 'Site variant: full, tech, finance, happy'
                  required: false
                  schema:
                    type: string
                - name: lang
                  in: query
                  description: ISO 639-1 language code (en, fr, ar, etc.)
                  required: false
                  schema:
                    type: string
            responses:
                "200":
                    description: Successful response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ListFeedDigestResponse'
                "400":
                    description: Validation error
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/ValidationError'
                default:
                    description: Error response
                    content:
                        application/json:
                            schema:
                                $ref: '#/components/schemas/Error'
components:
    schemas:
        Error:
            type: object
            properties:
                message:
                    type: string
                    description: Error message (e.g., 'user not found', 'database connection failed')
            description: Error is returned when a handler encounters an error. It contains a simple error message that the developer can customize.
        FieldViolation:
            type: object
            properties:
                field:
                    type: string
                    description: The field path that failed validation (e.g., 'user.email' for nested fields). For header validation, this will be the header name (e.g., 'X-API-Key')
                description:
                    type: string
                    description: Human-readable description of the validation violation (e.g., 'must be a valid email address', 'required field missing')
            required:
                - field
                - description
            description: FieldViolation describes a single validation error for a specific field.
        ValidationError:
            type: object
            properties:
                violations:
                    type: array
                    items:
                        $ref: '#/components/schemas/FieldViolation'
                    description: List of validation violations
            required:
                - violations
            description: ValidationError is returned when request validation fails. It contains a list of field violations describing what went wrong.
        SummarizeArticleRequest:
            type: object
            properties:
                provider:
                    type: string
                    minLength: 1
                    description: 'LLM provider: "ollama", "groq", "openrouter"'
                headlines:
                    type: array
                    items:
                        type: string
                        minItems: 1
                    description: Headlines to summarize (max 8 used).
                    minItems: 1
                mode:
                    type: string
                    description: 'Summarization mode: "brief", "analysis", "translate", "" (default).'
                geoContext:
                    type: string
                    description: Geographic signal context to include in the prompt.
                variant:
                    type: string
                    description: 'Variant: "full", "tech", or target language for translate mode.'
                lang:
                    type: string
                    description: Output language code, default "en".
                systemAppend:
                    type: string
                    description: Optional system prompt append for analytical framework instructions.
            required:
                - provider
            description: SummarizeArticleRequest specifies parameters for LLM article summarization.
        SummarizeArticleResponse:
            type: object
            properties:
                summary:
                    type: string
                    description: The generated summary text.
                model:
                    type: string
                    description: Model identifier used for generation.
                provider:
                    type: string
                    description: Provider that produced the result (or "cache").
                tokens:
                    type: integer
                    format: int32
                    description: Token count from the LLM response.
                fallback:
                    type: boolean
                    description: Whether the client should try the next provider in the fallback chain.
                error:
                    type: string
                    description: Error message if the request failed.
                errorType:
                    type: string
                    description: Error type/name (e.g. "TypeError").
                status:
                    type: string
                    enum:
                        - SUMMARIZE_STATUS_UNSPECIFIED
                        - SUMMARIZE_STATUS_SUCCESS
                        - SUMMARIZE_STATUS_CACHED
                        - SUMMARIZE_STATUS_SKIPPED
                        - SUMMARIZE_STATUS_ERROR
                    description: SummarizeStatus indicates the outcome of a summarization request.
                statusDetail:
                    type: string
                    description: Human-readable detail for non-success statuses (skip reason, etc.).
            description: SummarizeArticleResponse contains the LLM summarization result.
        GetSummarizeArticleCacheRequest:
            type: object
            properties:
                cacheKey:
                    type: string
                    description: Deterministic cache key computed by buildSummaryCacheKey().
            description: GetSummarizeArticleCacheRequest looks up a pre-computed summary by cache key.
        ListFeedDigestRequest:
            type: object
            properties:
                variant:
                    type: string
                    description: 'Site variant: full, tech, finance, happy'
                lang:
                    type: string
                    description: ISO 639-1 language code (en, fr, ar, etc.)
        ListFeedDigestResponse:
            type: object
            properties:
                categories:
                    type: object
                    additionalProperties:
                        $ref: '#/components/schemas/CategoryBucket'
                    description: Per-category buckets — keys match category names from feed config
                feedStatuses:
                    type: object
                    additionalProperties:
                        type: string
                    description: |-
                        Per-feed status — only non-ok states emitted; absent key implies ok.
                         Values: empty (feed returned 0 items), timeout (timed out during fetch).
                generatedAt:
                    type: string
                    description: ISO 8601 timestamp of when this digest was generated
        CategoriesEntry:
            type: object
            properties:
                key:
                    type: string
                value:
                    $ref: '#/components/schemas/CategoryBucket'
        FeedStatusesEntry:
            type: object
            properties:
                key:
                    type: string
                value:
                    type: string
        CategoryBucket:
            type: object
            properties:
                items:
                    type: array
                    items:
                        $ref: '#/components/schemas/NewsItem'
        NewsItem:
            type: object
            properties:
                source:
                    type: string
                    minLength: 1
                    description: Source feed name.
                title:
                    type: string
                    minLength: 1
                    description: Article headline.
                link:
                    type: string
                    description: Article URL.
                publishedAt:
                    type: integer
                    format: int64
                    description: 'Publication time, as Unix epoch milliseconds. Warning: Values > 2^53 may lose precision in JavaScript'
                isAlert:
                    type: boolean
                    description: Whether this article triggered an alert condition.
                threat:
                    $ref: '#/components/schemas/ThreatClassification'
                location:
                    $ref: '#/components/schemas/GeoCoordinates'
                locationName:
                    type: string
                    description: Human-readable location name.
                importanceScore:
                    type: integer
                    format: int32
                    description: 'Composite importance score (0-100): severity × 55% + source tier × 20% + corroboration × 15% + recency × 10%.'
                corroborationCount:
                    type: integer
                    format: int32
                    description: Number of distinct sources that reported the same story in this digest cycle.
                storyMeta:
                    $ref: '#/components/schemas/StoryMeta'
            required:
                - source
                - title
            description: NewsItem represents a single news article from RSS feed aggregation.
        ThreatClassification:
            type: object
            properties:
                level:
                    type: string
                    enum:
                        - THREAT_LEVEL_UNSPECIFIED
                        - THREAT_LEVEL_LOW
                        - THREAT_LEVEL_MEDIUM
                        - THREAT_LEVEL_HIGH
                        - THREAT_LEVEL_CRITICAL
                    description: ThreatLevel represents the assessed threat level of a news event.
                category:
                    type: string
                    description: Event category.
                confidence:
                    type: number
                    maximum: 1
                    minimum: 0
                    format: double
                    description: Confidence score (0.0 to 1.0).
                source:
                    type: string
                    description: Classification source — "keyword", "ml", or "llm".
            description: ThreatClassification represents an AI-assessed threat level for a news item.
        GeoCoordinates:
            type: object
            properties:
                latitude:
                    type: number
                    maximum: 90
                    minimum: -90
                    format: double
                    description: Latitude in decimal degrees (-90 to 90).
                longitude:
                    type: number
                    maximum: 180
                    minimum: -180
                    format: double
                    description: Longitude in decimal degrees (-180 to 180).
            description: GeoCoordinates represents a geographic location using WGS84 coordinates.
        StoryMeta:
            type: object
            properties:
                firstSeen:
                    type: integer
                    format: int64
                    description: 'Epoch ms when the story first appeared in any digest cycle. Warning: Values > 2^53 may lose precision in JavaScript'
                mentionCount:
                    type: integer
                    format: int32
                    description: Total number of digest cycles in which this story appeared.
                sourceCount:
                    type: integer
                    format: int32
                    description: Number of unique sources that reported this story (cached from Redis Set).
                phase:
                    type: string
                    enum:
                        - STORY_PHASE_UNSPECIFIED
                        - STORY_PHASE_BREAKING
                        - STORY_PHASE_DEVELOPING
                        - STORY_PHASE_SUSTAINED
                        - STORY_PHASE_FADING
                    description: StoryPhase represents the lifecycle stage of a tracked news story.
            description: StoryMeta carries cross-cycle persistence data attached to each news item.
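
A client consuming the ListFeedDigest response described by this spec might look roughly like the sketch below. It is a minimal illustration, not the project's actual client: the interfaces cover only a subset of the schema fields, and the helper names `digestPath`, `feedStatusOf`, and `topItems` are hypothetical. It shows building the query string from the two documented parameters, applying the "absent key implies ok" rule for `feedStatuses`, and ranking items by `importanceScore`.

```typescript
// Subset of the schemas in the spec; only the fields used here are declared.
interface NewsItem {
  source: string;
  title: string;
  publishedAt?: number; // Unix epoch milliseconds (int64 in the spec)
  importanceScore?: number; // 0-100 composite score
}

interface CategoryBucket {
  items?: NewsItem[];
}

interface ListFeedDigestResponse {
  categories?: Record<string, CategoryBucket>;
  feedStatuses?: Record<string, string>;
  generatedAt?: string;
}

// Build the request path from the documented `variant` and `lang` query parameters.
function digestPath(variant: string, lang: string): string {
  const query = new URLSearchParams({ variant, lang });
  return `/api/news/v1/list-feed-digest?${query.toString()}`;
}

// Only non-ok states are emitted, so a missing key is treated as "ok".
function feedStatusOf(digest: ListFeedDigestResponse, feed: string): string {
  return digest.feedStatuses?.[feed] ?? "ok";
}

// Flatten every category bucket and return the highest-scoring items first.
function topItems(digest: ListFeedDigestResponse, limit: number): NewsItem[] {
  const all: NewsItem[] = [];
  for (const bucket of Object.values(digest.categories ?? {})) {
    for (const item of bucket.items ?? []) {
      all.push(item);
    }
  }
  all.sort((a, b) => (b.importanceScore ?? 0) - (a.importanceScore ?? 0));
  return all.slice(0, limit);
}

// Hypothetical sample payload, for illustration only.
const sample: ListFeedDigestResponse = {
  categories: {
    world: { items: [{ source: "Feed A", title: "First", importanceScore: 69 }] },
    tech: { items: [{ source: "Feed B", title: "Second", importanceScore: 88 }] },
  },
  feedStatuses: { "Feed C": "timeout" },
  generatedAt: "2026-04-17T13:43:39Z",
};

console.log(digestPath("full", "en"));
console.log(feedStatusOf(sample, "Feed A"), feedStatusOf(sample, "Feed C"));
console.log(topItems(sample, 1)[0].title);
```

Keeping the helpers free of network calls makes them easy to unit-test; the actual fetch against `digestPath(...)` can be layered on top with whatever HTTP client the caller prefers.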