mirror of
https://github.com/koala73/worldmonitor.git
synced 2026-04-25 17:14:57 +02:00
* feat(intelligence): DeepEar QW-1/QW-3/QW-4: Polymarket context injection, dual-model routing, robust JSON parsing

  QW-1: Inject crowd-calibrated Polymarket/Kalshi odds into deduction prompts
  - Fetch prediction:markets-bootstrap:v1 from Redis and keyword-score markets against user query; top-7 matches appended as structured context block
  - Hash prediction context into cache key so odds movements trigger fresh LLM calls
  - Sanitize market titles with sanitizeHeadline() before prompt injection
  - deductSituation now uses callLlmReasoning (explicit reasoning-tier routing)

  QW-3: Dual-model LLM routing via env vars
  - callLlmTool (LLM_TOOL_PROVIDER/MODEL, default groq) for extraction tasks
  - callLlmReasoning (LLM_REASONING_PROVIDER/MODEL, default openrouter) for synthesis
  - Private callLlmProfile factory eliminates duplicated wrapper bodies

  QW-4: Shared _llm-json.mjs utility for robust LLM JSON parsing
  - cleanJsonText: strips C-style comments and trailing commas before JSON.parse
  - extractFirstJsonObject/Array: private extractFirstDelimited walker (was 2x23 LOC)
  - Removes duplicate function definitions from seed-forecasts.mjs
  - All JSON.parse call sites in tryParseImpactExpansionCandidate and tryParseStructuredCandidate now use cleanJsonText for comment/comma tolerance

* fix(intelligence): address PR #2421 review findings
  - P1: getCachedJson(PREDICTION_BOOTSTRAP_KEY, true): seed keys must be read raw (unprefixed); without this the prediction-odds block never activates outside production
  - P2: validate LLM_TOOL_PROVIDER / LLM_REASONING_PROVIDER env value against PROVIDER_SET before use; log a warning and fall back to defaultProvider on typo instead of silently rerouting
  - Filter: lower word-length threshold from >3 to >1 so short but critical terms like "war", "oil", "EU", "AI", "Fed", "USD" match prediction markets
  - Suggestion: bucket yesPrice to 5% increments before building the context string to reduce 1%-movement cache churn (20 bands vs 100)

* fix(intelligence): guard empty prediction context header when all titles sanitize to empty
45 lines
1.5 KiB
JavaScript
/**
 * Shared LLM JSON extraction utilities.
 *
 * Handles the common failure modes of LLM JSON output:
 * - C-style line and block comments (// ..., /* ... *\/)
 * - Trailing commas before } and ]
 * - Partial outputs (brace/bracket extraction as fallback)
 *
 * Note: cleanJsonText strips `//` best-effort and is not string-aware, so it
 * will incorrectly strip the rest of the line after a `//` inside a JSON string
 * value (e.g. a URL such as "https://..."). Acceptable for LLM output.
 */

/**
 * Strip C-style comments and trailing commas from a JSON-like string.
 */
export function cleanJsonText(text) {
  return text
    .replace(/\/\*[\s\S]*?\*\//g, '')
    .replace(/\/\/[^\n]*/g, '')
    .replace(/,(\s*[}\]])/g, '$1')
    .trim();
}

/**
 * Return the first balanced `open`...`close` span in the cleaned text.
 * Delimiters inside double-quoted strings are ignored. Returns '' when no
 * opener is found, and everything from the opener onward when the span
 * never closes (truncated output).
 */
function extractFirstDelimited(text, open, close) {
  const cleaned = cleanJsonText(text);
  const start = cleaned.indexOf(open);
  if (start === -1) return '';
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < cleaned.length; i++) {
    const char = cleaned[i];
    if (escaped) { escaped = false; continue; }
    if (char === '\\') { escaped = true; continue; }
    if (char === '"') { inString = !inString; continue; }
    if (inString) continue;
    if (char === open) depth++;
    if (char === close && --depth === 0) return cleaned.slice(start, i + 1);
  }
  return cleaned.slice(start);
}

export const extractFirstJsonObject = (text) => extractFirstDelimited(text, '{', '}');
export const extractFirstJsonArray = (text) => extractFirstDelimited(text, '[', ']');