mirror of
https://github.com/thedotmack/claude-mem
synced 2026-04-25 17:15:04 +02:00
Revert "revert: roll back v12.3.3 (Issue Blowout 2026)"
This reverts commit bfc7de377a.
This commit is contained in:
228
ISSUE-BLOWOUT-TODO.md
Normal file
@@ -0,0 +1,228 @@
# Issue Blowout 2026 - Running TODO

Branch: `issue-blowout-2026` (merged as PR #2079)
Strategy: Cynical dev. Every bug report is suspect — look for overengineered band-aids as root cause.
Test gate: After every build-and-sync, verify observations are flowing.
Released: **v12.3.2** on 2026-04-19

## Instructions for Continuation

### Workflow per issue

1. Use `/make-plan` and `/do` to attack each issue's root cause
2. Be cynical — most bug reports are surface-level; the real issue is usually overengineered band-aids
3. After every `npm run build-and-sync`, verify observations flow:

   ```bash
   sleep 5 && sqlite3 ~/.claude-mem/claude-mem.db "SELECT COUNT(*) FROM observations WHERE created_at_epoch > (strftime('%s','now') - 120) * 1000"
   ```

4. If observations stop flowing, that's a regression — fix it before continuing
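The cutoff arithmetic in the gate query can be sketched in TypeScript; `recentCutoffMs` is a hypothetical helper mirroring `(strftime('%s','now') - 120) * 1000`, not code from the repo:

```typescript
// Regression gate: an observation "flows" if its created_at_epoch (ms)
// is newer than now minus the 120-second window, scaled to milliseconds.
function recentCutoffMs(nowEpochSec: number, windowSec = 120): number {
  return (nowEpochSec - windowSec) * 1000;
}

function isRecent(createdAtEpochMs: number, nowEpochSec: number): boolean {
  return createdAtEpochMs > recentCutoffMs(nowEpochSec);
}
```

The `* 1000` matters because `created_at_epoch` is stored in milliseconds while `strftime('%s','now')` returns seconds.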

### Docker isolation

- **Port 37777**: Host's live bun worker (YOUR claude-mem instance — don't touch)
- **Port 37778**: Another agent's docker container (`claude-mem-dev`) — hands off
- **Your docker**: Use tag `claude-mem:blowout`, data dir `.docker-blowout-data/`

  ```bash
  TAG=claude-mem:blowout docker/claude-mem/build.sh
  HOST_MEM_DIR=$(pwd)/.docker-blowout-data TAG=claude-mem:blowout docker/claude-mem/run.sh
  ```

- Check observations in docker DB:

  ```bash
  sqlite3 .docker-blowout-data/claude-mem.db 'select count(*) from observations'
  ```

### PR → Review → Merge → Release cycle

1. Create PR from feature branch to main
2. Start review loop: `/loop 2m` to check and resolve review comments
   - CodeRabbit and Greptile post inline comments — read, fix, commit, push, reply
   - `claude-review` is a CI check — just needs to pass
   - CodeRabbit can take 5-10 min to process after each push
3. When all reviews pass: `gh pr merge <PR#> --repo thedotmack/claude-mem --squash --delete-branch --admin`
4. Close resolved issues: `for issue in <numbers>; do gh issue close $issue --repo thedotmack/claude-mem --comment "Fixed in PR #XXXX"; done`
5. Version bump:

   ```bash
   cd ~/Scripts/claude-mem
   git pull origin main
   # Run /version-bump patch (or use the skill: claude-mem:version-bump)
   # It handles: version files → build → commit → tag → push → gh release → changelog
   ```

### Key files in the codebase

- **Parser**: `src/sdk/parser.ts` — observation and summary XML parsing
- **Prompts**: `src/sdk/prompts.ts` — LLM prompt templates (observation, summary, continuation)
- **ResponseProcessor**: `src/services/worker/agents/ResponseProcessor.ts` — unified response handler
- **SessionManager**: `src/services/worker/SessionManager.ts` — queue, sessions, circuit breaker
- **SessionSearch**: `src/services/sqlite/SessionSearch.ts` — FTS5 and filter queries
- **SearchManager**: `src/services/worker/SearchManager.ts` — hybrid Chroma+SQLite orchestration
- **Worker service**: `src/services/worker-service.ts` — periodic reapers, startup
- **Summarize hook**: `src/cli/handlers/summarize.ts` — Stop hook entry point
- **SessionRoutes**: `src/services/worker/http/routes/SessionRoutes.ts` — HTTP API
- **ViewerRoutes**: `src/services/worker/http/routes/ViewerRoutes.ts` — /health endpoint
- **Agents**: `src/services/worker/SDKAgent.ts`, `GeminiAgent.ts`, `OpenRouterAgent.ts`
- **Modes**: `plugin/modes/code.json` — prompt field values for the default mode
- **Migrations**: `src/services/sqlite/migrations/runner.ts`
- **PendingMessageStore**: `src/services/sqlite/PendingMessageStore.ts` — queue persistence

## Completed Phases 2-5 (16 more issues — this session)

| # | Component | Issue | Resolution |
|---|-----------|-------|------------|
| 2053 | worker | Generator restart guard strands pending messages | FIXED — Time-windowed RestartGuard replaces flat counter (10 restarts/60s window, 5min decay) |
| 1868 | worker | SDK pool deadlock: idle sessions monopolize slots | FIXED — evictIdlestSession() callback in waitForSlot() preempts idle sessions |
| 1876 | worker | MCP loopback self-check fails; crash misclassification | FIXED — process.execPath replaces bare 'node'; removed false "exited unexpectedly" log |
| 1901 | hooks | Summarize stop hook exits code 2 on errors | FIXED — workerHttpRequest wrapped in try/catch, exits gracefully |
| 1907 | hooks | Linux/WSL session-init before worker healthy | FIXED — health-check curl loop added to UserPromptSubmit hook; HTTP call wrapped |
| 1896 | hooks | PreToolUse file-context caps Read to limit:1 | CLOSED — already fixed (mtime comparison at file-context.ts:255-267) |
| 1903 | hooks | PostToolUse/Stop/SessionEnd never fire | CLOSED — no-repro (hooks.json correct; Claude Code 12.0.1 platform bug) |
| 1932 | security | Admin endpoints spoofable requireLocalhost | FIXED — bearer token auth on all API endpoints |
| 1933 | security | Unauthenticated HTTP API exposes 30+ endpoints | FIXED — auto-generated token at ~/.claude-mem/worker-auth-token (mode 0600) |
| 1934 | security | watch.context.path written without validation | FIXED — path traversal protection validates against project root / data dir |
| 1935 | security | Unbounded input, no rate limits | FIXED — 5MB body limit (was 50MB), 300 req/min/IP rate limiter |
| 1936 | security | Multi-user macOS shared port cross-user MCP | FIXED — per-user port derivation from UID (37700 + uid%100) |
| 1911 | search | search()/timeline() cross-project results | FIXED — project filter passed to Chroma queries and timeline anchor searches |
| 1912 | search | /api/search per-type endpoints ignore project | FIXED — project $or clause added to searchObservations/Sessions/UserPrompts |
| 1914 | search | Imported observations invisible to MCP search | FIXED — ChromaSync.syncObservation() called after import |
| 1918 | search | SessionStart "no previous sessions" on fresh sessions | FIXED — session-init cwd fallback matches context.ts (process.cwd()) |
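The #2053 fix swaps a flat lifetime counter for a time-windowed guard. A minimal sketch of that idea (class and method names here are illustrative assumptions, not the real `RestartGuard`):

```typescript
// Hypothetical sliding-window restart guard: permit at most `limit`
// restarts inside a rolling window, so old restarts "decay" out instead
// of permanently tripping a lifetime counter.
class WindowedRestartGuard {
  private restarts: number[] = []; // epoch-ms timestamps of recent restarts

  constructor(private limit = 10, private windowMs = 60_000) {}

  // True if another restart is permitted right now.
  canRestart(now = Date.now()): boolean {
    // Drop entries that fell out of the window — this is the decay.
    this.restarts = this.restarts.filter(t => now - t < this.windowMs);
    return this.restarts.length < this.limit;
  }

  recordRestart(now = Date.now()): void {
    this.restarts.push(now);
  }
}
```

The point of the window: a generator that crashed ten times last week should still be restartable today, while a tight crash loop (10 restarts in 60 s) is cut off before it strands queued messages.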

## Completed (9 issues — PR #2079, v12.3.2)

| # | Component | Issue | Resolution |
|---|-----------|-------|------------|
| 1908 | summarizer | parseSummary discards output when LLM emits observation tags | CLOSED — already fixed by Gen 3 coercion (coerceObservationToSummary in parser.ts) |
| 1953 | db | Migration 7 rebuilds table every startup | CLOSED — already fixed by commit 59ce0fc5 (origin !== 'pk' filter) |
| 1916 | search | /api/search/by-concept emits malformed SQL | FIXED — concept→concepts remap in SearchManager.normalizeParams() |
| 1913 | search | Text search returns empty when ChromaDB disabled | FIXED — FTS5 keyword fallback in SessionSearch + SearchManager |
| 2048 | search | Text queries should fall back to FTS5 when Chroma disabled | FIXED — same as #1913 |
| 1957 | db | pending_messages: failed rows never purged | FIXED — periodic clearFailed() in stale session reaper (every 2 min) |
| 1956 | db | WAL grows unbounded, no checkpoint schedule | FIXED — journal_size_limit=4MB + periodic wal_checkpoint(PASSIVE) |
| 1874 | worker | processAgentResponse deletes queued messages on non-XML output | FIXED — mark messages failed (with retry) instead of confirming |
| 1867 | worker | Queue processor dies while /health stays green | FIXED — activeSessions count added to /health endpoint |

Also fixed (not an issue): docker/claude-mem/run.sh nounset-safe TTY_ARGS expansion.
Also fixed (Greptile review): cached isFts5Available() at construction time.

## Remaining — CRITICAL (6)

| # | Component | Issue |
|---|-----------|-------|
| 1925 | mcp | chroma-mcp subprocess leak via null-before-close |
| 1926 | mcp | chroma-mcp stdio handshake broken across all versions |
| 1942 | auth | Default model not resolved on Bedrock/Vertex/Azure |
| 1943 | auth | SDK pipeline rejects Bedrock auth |
| 1880 | windows | Ghost LISTEN socket on port 37777 after crash |
| 1887 | windows | Failing worker blocks Claude Code MCP 10+ min in hook-restart loop |

## Remaining — HIGH (32)

| # | Component | Issue |
|---|-----------|-------|
| 1869 | worker | No mid-session auto-restart after inner crash |
| 1870 | worker | Stop hook blocks ~110s when SDK pool saturated |
| 1871 | worker | generateContext opens fresh SessionStore per call |
| 1875 | worker | Spawns uvx/node/claude by bare name; silent fail in non-interactive |
| 1877 | worker | Cross-session context bleed in same project dir |
| 1879 | worker | Session completion races in-flight summarize |
| 1890 | sdk-pool | SDK session resume during summarize causes context-overflow |
| 1892 | sdk-pool | Memory agent prompt defeats cache (dynamic before static) |
| 1895 | hooks | Stop hook spins 110s when worker older than v12.1.0 |
| 1897 | hooks | PreToolUse:Read lacks PATH export and cache-path lookup |
| 1899 | hooks | SessionStart additionalContext >10KB truncated to 2KB |
| 1902 | hooks | Stop and PostToolUse hooks synchronously block up to 120s |
| 1904 | hooks | UserPromptSubmit hooks skipped in git worktree sessions |
| 1905 | hooks | saved_hook_context entries peg CPU at 100% on session load |
| 1906 | hooks | PR #1229 fallback path points to source, not cache |
| 1909 | summarizer | Summarize hook doesn't recognize Gemini transcripts |
| 1921 | mcp | Root .mcp.json is empty, mcp-search never registers |
| 1922 | mcp | MCP server uses 3s timeout for corpus prime/query |
| 1929 | installer | "Update now" fails for cache-only installs |
| 1930 | installer | Windows 11 ships smart-explore without tree-sitter |
| 1937 | observer | JSONL files accumulate indefinitely, tens of GB |
| 1938 | observer | Observer background sessions burn tokens with no budget |
| 1939 | cross-platform | Project key uses basename(cwd), fragmenting worktrees |
| 1941 | cross-platform | Linux worker with live-but-unhealthy PID blocks restart |
| 1944 | auth | ANTHROPIC_AUTH_TOKEN not forwarded to SDK subprocess |
| 1945 | auth | Vertex AI CLI auth fails silently on expired OAuth |
| 1947 | plugin-lifecycle | OpenCode tool args as plain objects not Zod schemas |
| 1948 | plugin-lifecycle | OpenClaw installer "plugin not found" |
| 1949 | plugin-lifecycle | OpenClaw per-agent memory isolation broken |
| 1950 | plugin-lifecycle | OpenClaw missing skills, session drift, workspaceDir loss |
| 1952 | db | ON UPDATE CASCADE rewrites historical session attribution |
| 1954 | db | observation_feedback schema mismatch source vs compiled |
| 1958 | viewer | Settings model dropdown destroys precise model IDs |
| 1881-1888 | windows | 8 Windows-specific bugs (paths, spawning, timeouts) |

## Remaining — MEDIUM (22)

| # | Component | Issue |
|---|-----------|-------|
| 1872 | worker | Gemini 400/401 triggers 2-min crash-recovery loop |
| 1873 | worker | worker-service.cjs killed by SIGKILL (unbounded heap) |
| 1878 | worker | Logger caches log file path, never rotates |
| 1891 | sdk-pool | Mode prompts in user messages, not system prompt |
| 1893 | sdk-pool | SDK sub-agents hardcoded permissionMode:"default" |
| 1894 | hooks | SessionStart can't find claude at ~/.local/bin |
| 1898 | hooks | SessionStart health-check uses hardcoded port 37777 |
| 1900 | hooks | Setup hook references non-existent scripts/setup.sh |
| 1910 | summarizer | Summary prompt leaks observation tags, ignores user_prompt |
| 1915 | search | Search results not deduplicated |
| 1917 | search | $CMEM context preview shows oldest instead of newest |
| 1920 | search | Context footer "ID" ambiguous across 3 ID spaces |
| 1923 | mcp | smart_outline empty for .txt files |
| 1924 | mcp | chroma-mcp child not terminated on exit |
| 1927 | mcp | chroma-mcp fails on WSL with ALL_PROXY=socks5 |
| 1928 | installer | BranchManager.pullUpdates() fails on cache-layout |
| 1931 | installer | npm run worker:status ENOENT .claude/package.json |
| 1940 | cross-platform | cmux.app wrapper "Claude executable not found" |
| 1946 | auth | OpenRouter 401 Missing Authentication header |
| 1955 | db | Duplicate observations bypass content-hash dedup |
| 1959 | viewer | SSE new_prompt broadcast dies after /reload-plugins |
| 1961 | misc | Traditional Chinese falls back to Simplified |

## Remaining — LOW (3)

| # | Component | Issue |
|---|-----------|-------|
| 1919 | search | Shared jsts tree-sitter query applies TS-only to JS |
| 1951 | plugin-lifecycle | OpenClaw lifecycle events stored as observations |
| 1960 | misc | OpenRouter URL hardcoded |

## Remaining — NON-LABELED (1)

| # | Component | Issue |
|---|-----------|-------|
| 2054 | installer | installCLI version-pinned alias can't self-update |

## Suggested Next Attack Order

### Phase 2: Worker stability — DONE
### Phase 3: Hooks reliability — DONE
### Phase 4: Security hardening — DONE
### Phase 5: Search remaining — DONE

### Phase 6: MCP + Auth

- #1925, #1926, #1942, #1943

### Phase 7: Windows

- #1880, #1887, #1881-1888

### Phase 8: MCP / Chroma

- #1925, #1926, #2046, #1921

### Phase 9: Everything else

- Remaining hooks, installer, windows, observer, viewer, auth, plugin-lifecycle

## Progress Log

| Time | Action | Result |
|------|--------|--------|
| 9:40p | #1908 analyzed | Already fixed by Gen 3 coercion. Closed. |
| 9:51p | #1916 fixed | concept→concepts remap in normalizeParams |
| 9:53p | #1913/#2048 fixed | FTS5 fallback in SessionSearch + SearchManager |
| 9:57p | #1953 closed | Already fixed by commit 59ce0fc5 |
| 9:57p | #1957 fixed | Periodic clearFailed() in stale session reaper |
| 9:58p | #1956 fixed | journal_size_limit + periodic WAL checkpoint |
| 10:01p | #1874 fixed | Non-XML responses mark messages failed instead of confirming |
| 10:01p | #1867 fixed | Health endpoint includes activeSessions count |
| 10:02p | build-and-sync | Observations flowing. No regression. |
| 10:03p | PR #2079 created | 2 commits pushed |
| 10:06p | Greptile review | 2 comments — cached isFts5Available(). Fixed + pushed. |
| 10:20p | PR #2079 merged | All reviews passed (CodeRabbit, Greptile, claude-review) |
| 10:25p | v12.3.2 released | Tag pushed, GitHub release created, CHANGELOG updated |

```diff
@@ -56,7 +56,7 @@ else
 fi
 
 # Pick -it only when a TTY is attached (keeps non-interactive callers working).
-# Initialize with a no-op flag so the array is never empty (nounset-safe).
+# Initialize empty; expansion below safely omits args when the array is unset/empty.
 TTY_ARGS=()
 [[ -t 0 && -t 1 ]] && TTY_ARGS=(-it)
```

```diff
@@ -24,12 +24,12 @@
 },
 {
   "type": "command",
-  "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; curl -sf http://localhost:37777/health >/dev/null 2>&1 || true; echo '{\"continue\":true,\"suppressOutput\":true}'",
+  "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" start; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1 && break; sleep 1; done; curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1 || true; echo '{\"continue\":true,\"suppressOutput\":true}'",
   "timeout": 60
 },
 {
   "type": "command",
-  "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:37777/health >/dev/null 2>&1 && break; sleep 1; done; if curl -sf http://localhost:37777/health >/dev/null 2>&1; then node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context || true; fi",
+  "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1 && break; sleep 1; done; if curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1; then node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code context || true; fi",
   "timeout": 60
 }
]
```

```diff
@@ -40,7 +40,7 @@
 "hooks": [
   {
     "type": "command",
-    "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
+    "command": "export PATH=\"$($SHELL -lc 'echo $PATH' 2>/dev/null):$PATH\"; _R=\"${CLAUDE_PLUGIN_ROOT}\"; [ -z \"$_R\" ] && _R=$(ls -dt $HOME/.claude/plugins/cache/thedotmack/claude-mem/[0-9]*/ 2>/dev/null | head -1); _R=\"${_R%/}\"; [ -z \"$_R\" ] && _R=\"$HOME/.claude/plugins/marketplaces/thedotmack/plugin\"; _HEALTH=0; curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1 && _HEALTH=1 || for i in 1 2 3 4 5 6 7 8 9 10; do sleep 1; curl -sf http://localhost:$((37700 + $(id -u 2>/dev/null || echo 77) % 100))/health >/dev/null 2>&1 && _HEALTH=1 && break; done; [ \"$_HEALTH\" = \"1\" ] && node \"$_R/scripts/bun-runner.js\" \"$_R/scripts/worker-service.cjs\" hook claude-code session-init",
     "timeout": 60
   }
 ]
```
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
```diff
@@ -108,12 +108,22 @@ try {
 // Trigger worker restart after file sync
 console.log('\n🔄 Triggering worker restart...');
 const http = require('http');
+const fs = require('fs');
+const os = require('os');
+// Read auth token for API auth (#1932/#1933)
+const dataDir = process.env.CLAUDE_MEM_DATA_DIR || require('path').join(os.homedir(), '.claude-mem');
+let authToken = '';
+try { authToken = fs.readFileSync(require('path').join(dataDir, 'worker-auth-token'), 'utf-8').trim(); } catch {}
+// Use per-user port derivation (#1936)
+const uid = typeof process.getuid === 'function' ? process.getuid() : 77;
+const workerPort = parseInt(process.env.CLAUDE_MEM_WORKER_PORT || String(37700 + (uid % 100)), 10);
 const req = http.request({
   hostname: '127.0.0.1',
-  port: 37777,
+  port: workerPort,
   path: '/api/admin/restart',
   method: 'POST',
-  timeout: 2000
+  timeout: 2000,
+  headers: authToken ? { 'Authorization': `Bearer ${authToken}` } : {}
 }, (res) => {
   if (res.statusCode === 200) {
     console.log('\x1b[32m%s\x1b[0m', '✓ Worker restart triggered');
```
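The per-user derivation used above reduces to one function. A sketch (the function name is illustrative):

```typescript
// Per-user worker port (#1936): base 37700 plus UID modulo 100, so distinct
// users on a shared host land on distinct ports in the 37700-37799 range.
function perUserWorkerPort(uid: number): number {
  return 37700 + (uid % 100);
}
```

Note the fallback uid of 77 (used when `process.getuid` or `id -u` is unavailable) maps to 37777, which keeps single-user installs on the legacy port.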

```diff
@@ -44,7 +44,8 @@ export const sessionInitHandler: EventHandler = {
   return { continue: true, suppressOutput: true, exitCode: HOOK_EXIT_CODES.SUCCESS };
 }
 
-const { sessionId, cwd, prompt: rawPrompt } = input;
+const { sessionId, prompt: rawPrompt } = input;
+const cwd = input.cwd ?? process.cwd(); // Match context.ts fallback (#1918)
 
 // Guard: Codex CLI and other platforms may not provide a session_id (#744)
 if (!sessionId) {
```

```diff
@@ -69,16 +70,23 @@
 logger.debug('HOOK', 'session-init: Calling /api/sessions/init', { contentSessionId: sessionId, project });
 
 // Initialize session via HTTP - handles DB operations and privacy checks
-const initResponse = await workerHttpRequest('/api/sessions/init', {
-  method: 'POST',
-  headers: { 'Content-Type': 'application/json' },
-  body: JSON.stringify({
-    contentSessionId: sessionId,
-    project,
-    prompt,
-    platformSource
-  })
-});
+let initResponse: Response;
+try {
+  initResponse = await workerHttpRequest('/api/sessions/init', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({
+      contentSessionId: sessionId,
+      project,
+      prompt,
+      platformSource
+    })
+  });
+} catch (err) {
+  // Worker unreachable — on Linux/WSL, hook may fire before worker is healthy (#1907)
+  logger.warn('HOOK', `session-init: worker request failed: ${err instanceof Error ? err.message : err}`);
+  return { continue: true, suppressOutput: true, exitCode: HOOK_EXIT_CODES.SUCCESS };
+}
 
 if (!initResponse.ok) {
   // Log but don't throw - a worker 500 should not block the user's prompt
```

```diff
@@ -84,16 +84,24 @@ export const summarizeHandler: EventHandler = {
 const platformSource = normalizePlatformSource(input.platform);
 
 // 1. Queue summarize request — worker returns immediately with { status: 'queued' }
-const response = await workerHttpRequest('/api/sessions/summarize', {
-  method: 'POST',
-  headers: { 'Content-Type': 'application/json' },
-  body: JSON.stringify({
-    contentSessionId: sessionId,
-    last_assistant_message: lastAssistantMessage,
-    platformSource
-  }),
-  timeoutMs: SUMMARIZE_TIMEOUT_MS
-});
+let response: Response;
+try {
+  response = await workerHttpRequest('/api/sessions/summarize', {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({
+      contentSessionId: sessionId,
+      last_assistant_message: lastAssistantMessage,
+      platformSource
+    }),
+    timeoutMs: SUMMARIZE_TIMEOUT_MS
+  });
+} catch (err) {
+  // Network error, worker crash, or timeout — exit gracefully instead of
+  // bubbling to hook runner which exits code 2 and blocks session exit (#1901)
+  logger.warn('HOOK', `Stop hook: summarize request failed: ${err instanceof Error ? err.message : err}`);
+  return { continue: true, suppressOutput: true, exitCode: HOOK_EXIT_CODES.SUCCESS };
+}
 
 if (!response.ok) {
   return { continue: true, suppressOutput: true };
```

```diff
@@ -101,6 +101,8 @@ const MAX_TOOL_RESPONSE_LENGTH = 1000;
 // Worker HTTP Client
 // ============================================================================
 
+const JSON_HEADERS: Record<string, string> = { "Content-Type": "application/json" };
+
 async function workerPost(
   path: string,
   body: Record<string, unknown>,
@@ -109,7 +111,7 @@ async function workerPost(
 try {
   response = await fetch(`${WORKER_BASE_URL}${path}`, {
     method: "POST",
-    headers: { "Content-Type": "application/json" },
+    headers: JSON_HEADERS,
     body: JSON.stringify(body),
   });
 } catch (error: unknown) {
@@ -134,7 +136,7 @@ function workerPostFireAndForget(
 ): void {
 fetch(`${WORKER_BASE_URL}${path}`, {
   method: "POST",
-  headers: { "Content-Type": "application/json" },
+  headers: JSON_HEADERS,
   body: JSON.stringify(body),
 }).catch((error: unknown) => {
   const message = error instanceof Error ? error.message : String(error);
@@ -146,7 +148,7 @@ function workerPostFireAndForget(
 
 async function workerGetText(path: string): Promise<string | null> {
 try {
-  const response = await fetch(`${WORKER_BASE_URL}${path}`);
+  const response = await fetch(`${WORKER_BASE_URL}${path}`, { headers: JSON_HEADERS });
   if (!response.ok) {
     console.warn(`[claude-mem] Worker GET ${path} returned ${response.status}`);
     return null;
```

```diff
@@ -477,6 +477,25 @@ export class PendingMessageStore {
   return result.changes;
 }
 
+/**
+ * Clear failed messages older than the given threshold.
+ * Preserves recent failures for inspection and manual retry.
+ * @param thresholdMs - Only delete failures older than this many milliseconds
+ * @returns Number of messages deleted
+ */
+clearFailedOlderThan(thresholdMs: number): number {
+  const cutoff = Date.now() - thresholdMs;
+  // Use COALESCE to prefer the most recent failure timestamp over creation time.
+  // failed_at_epoch is set by session-level failures, completed_at_epoch by markFailed().
+  const stmt = this.db.prepare(`
+    DELETE FROM pending_messages
+    WHERE status = 'failed'
+      AND COALESCE(failed_at_epoch, completed_at_epoch, started_processing_at_epoch, created_at_epoch) < ?
+  `);
+  const result = stmt.run(cutoff);
+  return result.changes;
+}
+
 /**
  * Clear all pending, processing, and failed messages from the queue
  * Keeps only processed messages (for history)
```

```diff
@@ -36,7 +36,7 @@ export class SessionSearch {
 // Cache FTS5 availability once at construction (avoids DDL probe on every query)
 this._fts5Available = this.isFts5Available();
 
-// Ensure FTS tables exist
+// Ensure FTS tables exist — may downgrade _fts5Available if creation fails
 this.ensureFTSTables();
 }
```

```diff
@@ -84,6 +84,7 @@ export class SessionSearch {
   logger.info('DB', 'FTS5 tables created successfully');
 } catch (error) {
+  // FTS5 creation failed at runtime despite probe succeeding — degrade gracefully
+  this._fts5Available = false;
   logger.warn('DB', 'FTS5 table creation failed — search will use ChromaDB and LIKE queries', {}, error instanceof Error ? error : undefined);
 }
 }
```

```diff
@@ -327,14 +328,17 @@ export class SessionSearch {
   LIMIT ? OFFSET ?
 `;
 
-params.unshift(query);
+// Escape FTS5 special characters: wrap in quotes to treat as literal phrase
+const escapedQuery = '"' + query.replace(/"/g, '""') + '"';
+params.unshift(escapedQuery);
 params.push(limit, offset);
 
 try {
   return this.db.prepare(sql).all(...params) as ObservationSearchResult[];
 } catch (error) {
-  logger.warn('DB', 'FTS5 observation search failed, returning empty', {}, error instanceof Error ? error : undefined);
-  return [];
+  // Re-throw so callers can distinguish FTS failure from "no results"
+  logger.warn('DB', 'FTS5 observation search failed', {}, error instanceof Error ? error : undefined);
+  throw error;
 }
 }
```
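The escaping above turns a raw user query into a single FTS5 phrase: internal double quotes are doubled, then the whole string is wrapped in quotes. Isolated as a standalone sketch:

```typescript
// Quote a raw query as one FTS5 phrase so tokens like AND, OR, NEAR, *,
// and - are matched literally instead of parsed as FTS5 query syntax.
function toFts5Phrase(query: string): string {
  return '"' + query.replace(/"/g, '""') + '"';
}
```

Without this, a query such as `foo AND "bar` is a syntax error inside `MATCH`, which is exactly the failure mode the catch block now re-throws.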

```diff
@@ -383,7 +387,9 @@ export class SessionSearch {
 
 const orderClause = orderBy === 'date_asc'
   ? 'ORDER BY s.created_at_epoch ASC'
-  : 'ORDER BY session_summaries_fts.rank ASC';
+  : orderBy === 'date_desc'
+    ? 'ORDER BY s.created_at_epoch DESC'
+    : 'ORDER BY session_summaries_fts.rank ASC';
 
 const sql = `
   SELECT s.*, s.discovery_tokens
@@ -395,14 +401,17 @@ export class SessionSearch {
   LIMIT ? OFFSET ?
 `;
 
-params.unshift(query);
+// Escape FTS5 special characters: wrap in quotes to treat as literal phrase
+const escapedQuery = '"' + query.replace(/"/g, '""') + '"';
+params.unshift(escapedQuery);
 params.push(limit, offset);
 
 try {
   return this.db.prepare(sql).all(...params) as SessionSummarySearchResult[];
 } catch (error) {
-  logger.warn('DB', 'FTS5 session search failed, returning empty', {}, error instanceof Error ? error : undefined);
-  return [];
+  // Re-throw so callers can distinguish FTS failure from "no results"
+  logger.warn('DB', 'FTS5 session search failed', {}, error instanceof Error ? error : undefined);
+  throw error;
 }
 }
```

```diff
@@ -645,8 +654,10 @@ export class SessionSearch {
 }
 
 // LIKE fallback for user prompts text search (no FTS table for this entity)
-baseConditions.push('up.prompt_text LIKE ?');
-params.push(`%${query}%`);
+// Escape LIKE metacharacters so %, _, and \ in user input are treated as literals
+const escapedQuery = query.replace(/[\\%_]/g, '\\$&');
+baseConditions.push("up.prompt_text LIKE ? ESCAPE '\\'");
+params.push(`%${escapedQuery}%`);
 
 const whereClause = `WHERE ${baseConditions.join(' AND ')}`;
 const orderClause = orderBy === 'date_asc'
```
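The LIKE escaping above is a one-liner worth seeing on its own: it prefixes `%`, `_`, and the escape character itself with a backslash, which pairs with the `ESCAPE '\'` clause in the SQL.

```typescript
// Escape LIKE metacharacters (%, _, and backslash) so user input is
// matched literally under `... LIKE ? ESCAPE '\'`.
function escapeLikePattern(query: string): string {
  return query.replace(/[\\%_]/g, '\\$&');
}
```

The escape character must be included in its own character class, otherwise a user-supplied backslash could neutralize the escaping of a following `%` or `_`.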

```diff
@@ -1,8 +1,10 @@
+import path from 'path';
 import { sessionInitHandler } from '../../cli/handlers/session-init.js';
 import { observationHandler } from '../../cli/handlers/observation.js';
 import { fileEditHandler } from '../../cli/handlers/file-edit.js';
 import { sessionCompleteHandler } from '../../cli/handlers/session-complete.js';
 import { ensureWorkerRunning, workerHttpRequest } from '../../shared/worker-utils.js';
+import { DATA_DIR } from '../../shared/paths.js';
 import { logger } from '../../utils/logger.js';
 import { getProjectContext } from '../../utils/project-name.js';
 import { writeAgentsMd } from '../../utils/agents-md-utils.js';
@@ -357,6 +359,19 @@ export class TranscriptEventProcessor {
 const contextUrl = `/api/context/inject?projects=${encodeURIComponent(projectsParam)}&platformSource=${encodeURIComponent(session.platformSource)}`;
 const agentsPath = expandHomePath(watch.context.path ?? `${cwd}/AGENTS.md`);
 
+// Validate resolved path stays within allowed directories (#1934)
+const resolvedAgentsPath = path.resolve(agentsPath);
+const allowedRoots = [path.resolve(cwd), path.resolve(DATA_DIR)];
+const isPathSafe = allowedRoots.some(root => resolvedAgentsPath.startsWith(root + path.sep) || resolvedAgentsPath === root);
+if (!isPathSafe) {
+  logger.warn('SECURITY', 'Rejected path traversal attempt in watch.context.path', {
+    original: watch.context.path,
+    resolved: resolvedAgentsPath,
+    allowedRoots
+  });
+  return;
+}
+
 let response: Awaited<ReturnType<typeof workerHttpRequest>>;
 try {
   response = await workerHttpRequest(contextUrl);
```
|
||||
|
||||
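The containment check added in #1934 above follows a standard pattern: resolve first, then require the resolved path to equal a root or start with `root + separator` (the separator matters, otherwise `/srv/app` would "contain" `/srv/application`). A standalone sketch of the same check, with an illustrative function name and POSIX paths assumed in the examples:

```typescript
import path from 'path';

// Prefix-based path containment: a candidate is safe only if, after
// resolution, it equals one of the roots or sits strictly below one.
function isWithinRoots(candidate: string, roots: string[]): boolean {
  const resolved = path.resolve(candidate);
  return roots
    .map(r => path.resolve(r))
    .some(root => resolved === root || resolved.startsWith(root + path.sep));
}
```

Resolving before comparing is what defeats `../` traversal segments in the input.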
@@ -28,6 +28,7 @@ import { sanitizeEnv } from '../supervisor/env-sanitizer.js';
 // ensure the worker daemon is up without importing this entire module — which
 // transitively pulls in the SQLite database layer via ChromaSync/DatabaseManager.
 import { ensureWorkerStarted as ensureWorkerStartedShared } from './worker-spawner.js';
+import { RestartGuard } from './worker/RestartGuard.js';

 // Re-export for backward compatibility — canonical implementation in shared/plugin-state.ts
 export { isPluginDisabledInClaudeSettings } from '../shared/plugin-state.js';
@@ -482,7 +483,7 @@ export class WorkerService {
   // Best-effort loopback MCP self-check
   getSupervisor().assertCanSpawn('mcp server');
   const transport = new StdioClientTransport({
-    command: 'node',
+    command: process.execPath, // Use resolved path, not bare 'node' which fails on non-interactive PATH (#1876)
     args: [mcpServerPath],
     env: sanitizeEnv(process.env)
   });
@@ -558,12 +559,14 @@ export class WorkerService {
     }
   }

-  // Purge failed pending messages to prevent unbounded queue growth (#1957)
+  // Purge stale failed pending messages to prevent unbounded queue growth (#1957)
+  // Only remove failures older than 1 hour to preserve recent failures for inspection/retry
   try {
     const pendingStore = this.sessionManager.getPendingMessageStore();
-    const purged = pendingStore.clearFailed();
+    const FAILED_MESSAGE_RETENTION_MS = 60 * 60 * 1000; // 1 hour
+    const purged = pendingStore.clearFailedOlderThan(FAILED_MESSAGE_RETENTION_MS);
     if (purged > 0) {
-      logger.info('SYSTEM', `Purged ${purged} failed pending messages`);
+      logger.info('SYSTEM', `Purged ${purged} stale failed pending messages (older than 1h)`);
     }
   } catch (e) {
     if (e instanceof Error) {
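The behavioral change in this hunk — purge by age instead of purging everything — can be modeled in isolation. This is a standalone in-memory sketch under assumed types; the real `clearFailedOlderThan` is a method on the SQLite-backed pending-message store, not this function:

```typescript
// Age-based purge: keep failures younger than the retention window so
// they remain available for inspection/retry; drop only stale ones.
interface FailedMessage { id: number; failedAtEpochMs: number }

function clearFailedOlderThan(
  failed: FailedMessage[],
  retentionMs: number,
  now: number = Date.now()
): { kept: FailedMessage[]; purgedCount: number } {
  const kept = failed.filter(m => now - m.failedAtEpochMs < retentionMs);
  return { kept, purgedCount: failed.length - kept.length };
}
```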
@@ -816,17 +819,19 @@ export class WorkerService {
     }
     // Fall through to pending-work restart below
   }
-  const MAX_PENDING_RESTARTS = 3;

   if (pendingCount > 0) {
-    // Track consecutive pending-work restarts to prevent infinite loops (e.g. FK errors)
-    session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1;
+    // Windowed restart guard: only blocks tight-loop restarts, not spread-out ones (#2053)
+    if (!session.restartGuard) session.restartGuard = new RestartGuard();
+    const restartAllowed = session.restartGuard.recordRestart();
+    session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1; // Keep for logging

-    if (session.consecutiveRestarts > MAX_PENDING_RESTARTS) {
-      logger.error('SYSTEM', 'Exceeded max pending-work restarts, stopping to prevent infinite loop', {
+    if (!restartAllowed) {
+      logger.error('SYSTEM', 'Restart guard tripped: too many restarts in window, stopping to prevent runaway costs', {
         sessionId: session.sessionDbId,
         pendingCount,
-        consecutiveRestarts: session.consecutiveRestarts
+        restartsInWindow: session.restartGuard.restartsInWindow,
+        windowMs: session.restartGuard.windowMs,
+        maxRestarts: session.restartGuard.maxRestarts
       });
       session.consecutiveRestarts = 0;
       this.terminateSession(session.sessionDbId, 'max_restarts_exceeded');
@@ -846,6 +851,7 @@ export class WorkerService {
   } else {
     // Successful completion with no pending work — clean up session
     // removeSessionImmediate fires onSessionDeletedCallback → broadcastProcessingStatus()
+    session.restartGuard?.recordSuccess();
     session.consecutiveRestarts = 0;
     this.sessionManager.removeSessionImmediate(session.sessionDbId);
   }
@@ -3,6 +3,7 @@
 */

 import type { Response } from 'express';
+import type { RestartGuard } from './worker/RestartGuard.js';

 // ============================================================================
 // Active Session Types
@@ -34,7 +35,8 @@ export interface ActiveSession {
   earliestPendingTimestamp: number | null; // Original timestamp of earliest pending message (for accurate observation timestamps)
   conversationHistory: ConversationMessage[]; // Shared conversation history for provider switching
   currentProvider: 'claude' | 'gemini' | 'openrouter' | null; // Track which provider is currently running
-  consecutiveRestarts: number; // Track consecutive restart attempts to prevent infinite loops
+  consecutiveRestarts: number; // DEPRECATED: use restartGuard. Kept for logging compat.
+  restartGuard?: RestartGuard;
   forceInit?: boolean; // Force fresh SDK session (skip resume)
   idleTimedOut?: boolean; // Set when session exits due to idle timeout (prevents restart loop)
   lastGeneratorActivity: number; // Timestamp of last generator progress (for stale detection, Issue #1099)
@@ -115,10 +115,15 @@ function notifySlotAvailable(): void {
  * Wait for a pool slot to become available (promise-based, not polling)
  * @param maxConcurrent Max number of concurrent agents
  * @param timeoutMs Max time to wait before giving up
+ * @param evictIdleSession Optional callback to evict an idle session when all slots are full (#1868)
  */
 const TOTAL_PROCESS_HARD_CAP = 10;

-export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_000): Promise<void> {
+export async function waitForSlot(
+  maxConcurrent: number,
+  timeoutMs: number = 60_000,
+  evictIdleSession?: () => boolean
+): Promise<void> {
   // Hard cap: refuse to spawn if too many processes exist regardless of pool accounting
   const activeCount = getActiveCount();
   if (activeCount >= TOTAL_PROCESS_HARD_CAP) {
@@ -127,6 +132,17 @@ export async function waitForSlot(maxConcurrent: number, timeoutMs: number = 60_
   if (activeCount < maxConcurrent) return;

+  // Try to evict an idle session before waiting (#1868)
+  // Idle sessions hold pool slots during their 3-min idle timeout, blocking new sessions
+  // that would timeout after 60s. Eviction aborts the idle session asynchronously —
+  // the freed slot is picked up by the waiter mechanism below.
+  if (evictIdleSession) {
+    const evicted = evictIdleSession();
+    if (evicted) {
+      logger.info('PROCESS', 'Evicted idle session to free pool slot for waiting request');
+    }
+  }
+
   logger.info('PROCESS', `Pool limit reached (${activeCount}/${maxConcurrent}), waiting for slot...`);

   return new Promise<void>((resolve, reject) => {
70
src/services/worker/RestartGuard.ts
Normal file
@@ -0,0 +1,70 @@
+/**
+ * Time-windowed restart guard.
+ * Prevents tight-loop restarts (bug) while allowing legitimate occasional restarts
+ * over long sessions. Replaces the flat consecutiveRestarts counter that stranded
+ * pending messages after just 3 restarts over any timeframe (#2053).
+ */
+
+const RESTART_WINDOW_MS = 60_000; // Only count restarts within last 60 seconds
+const MAX_WINDOWED_RESTARTS = 10; // 10 restarts in 60s = runaway loop
+const DECAY_AFTER_SUCCESS_MS = 5 * 60_000; // Clear history after 5min of uninterrupted success
+
+export class RestartGuard {
+  private restartTimestamps: number[] = [];
+  private lastSuccessfulProcessing: number | null = null;
+
+  /**
+   * Record a restart and check if the guard should trip.
+   * @returns true if the restart is ALLOWED, false if it should be BLOCKED
+   */
+  recordRestart(): boolean {
+    const now = Date.now();
+
+    // Decay: clear history only after real success + 5min of uninterrupted success
+    if (this.lastSuccessfulProcessing !== null
+        && now - this.lastSuccessfulProcessing >= DECAY_AFTER_SUCCESS_MS) {
+      this.restartTimestamps = [];
+      this.lastSuccessfulProcessing = null;
+    }
+
+    // Prune old timestamps outside the window
+    this.restartTimestamps = this.restartTimestamps.filter(
+      ts => now - ts < RESTART_WINDOW_MS
+    );
+
+    // Record this restart
+    this.restartTimestamps.push(now);
+
+    // Check if we've exceeded the cap within the window
+    return this.restartTimestamps.length <= MAX_WINDOWED_RESTARTS;
+  }
+
+  /**
+   * Call when a message is successfully processed to update the success timestamp.
+   */
+  recordSuccess(): void {
+    this.lastSuccessfulProcessing = Date.now();
+  }
+
+  /**
+   * Get the number of restarts in the current window (for logging).
+   */
+  get restartsInWindow(): number {
+    const now = Date.now();
+    return this.restartTimestamps.filter(ts => now - ts < RESTART_WINDOW_MS).length;
+  }
+
+  /**
+   * Get the window size in ms (for logging).
+   */
+  get windowMs(): number {
+    return RESTART_WINDOW_MS;
+  }
+
+  /**
+   * Get the max allowed restarts (for logging).
+   */
+  get maxRestarts(): number {
+    return MAX_WINDOWED_RESTARTS;
+  }
+}
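The core of the new RestartGuard file is a sliding-window counter. A minimal standalone sketch of just that mechanism, with an injectable clock so the trip condition can be tested deterministically (the class name and parameterization here are ours; the real guard hardcodes its constants and uses `Date.now()`):

```typescript
// Sliding-window counter mirroring RestartGuard.recordRestart():
// returns true while the number of events in the window stays at or
// under the cap; old events age out, so spread-out restarts never trip it.
class WindowedCounter {
  private timestamps: number[] = [];
  constructor(private windowMs: number, private max: number) {}

  record(now: number): boolean {
    this.timestamps = this.timestamps.filter(ts => now - ts < this.windowMs);
    this.timestamps.push(now);
    return this.timestamps.length <= this.max;
  }
}
```

This is what distinguishes the new guard from the flat `consecutiveRestarts` counter it replaces: only *tight-loop* restarts accumulate.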
@@ -90,9 +90,11 @@ export class SDKAgent {
   }

   // Wait for agent pool slot (configurable via CLAUDE_MEM_MAX_CONCURRENT_AGENTS)
+  // Pass idle session eviction callback to prevent pool deadlock (#1868):
+  // idle sessions hold slots during 3-min idle wait, blocking new sessions
   const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
   const maxConcurrent = parseInt(settings.CLAUDE_MEM_MAX_CONCURRENT_AGENTS, 10) || 2;
-  await waitForSlot(maxConcurrent);
+  await waitForSlot(maxConcurrent, 60_000, () => this.sessionManager.evictIdlestSession());

   // Build isolated environment from ~/.claude-mem/.env
   // This prevents Issue #733: random ANTHROPIC_API_KEY from project .env files
@@ -67,8 +67,20 @@ export class SearchManager {
     return await this.chromaSync.queryChroma(query, limit, whereFilter);
   }

-  private async searchChromaForTimeline(query: string, ninetyDaysAgo: number): Promise<ObservationSearchResult[]> {
-    const chromaResults = await this.queryChroma(query, 100);
+  private async searchChromaForTimeline(query: string, ninetyDaysAgo: number, project?: string): Promise<ObservationSearchResult[]> {
+    // Build where filter scoped to observations only + project if provided
+    let whereFilter: Record<string, any> = { doc_type: 'observation' };
+    if (project) {
+      const projectFilter = {
+        $or: [
+          { project },
+          { merged_into_project: project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }
+
+    const chromaResults = await this.queryChroma(query, 100, whereFilter);
     logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults?.ids?.length ?? 0 });

     if (chromaResults?.ids && chromaResults.ids.length > 0) {
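The same where-filter shape is rebuilt inline in several hunks of this commit: always scope by `doc_type`, and when a project is given, accept documents whose `project` OR `merged_into_project` matches. A standalone sketch of that construction (the helper function is ours; the diff builds the object inline each time):

```typescript
// Build a Chroma metadata filter: { doc_type } alone, or wrapped in
// $and with a $or project clause when a project scope is requested.
function buildWhereFilter(docType: string, project?: string): Record<string, any> {
  let whereFilter: Record<string, any> = { doc_type: docType };
  if (project) {
    const projectFilter = {
      $or: [{ project }, { merged_into_project: project }]
    };
    whereFilter = { $and: [whereFilter, projectFilter] };
  }
  return whereFilter;
}
```

The `merged_into_project` branch keeps observations from merged projects visible under their new project name.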
@@ -78,7 +90,7 @@ export class SearchManager {
     });

     if (recentIds.length > 0) {
-      return this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: 1 });
+      return this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: 1, project });
     }
   }
   return [];
@@ -286,14 +298,20 @@ export class SearchManager {
   // ChromaDB not initialized - fall back to FTS5 keyword search (#1913, #2048)
   else if (query) {
     logger.debug('SEARCH', 'ChromaDB not initialized — falling back to FTS5 keyword search', {});
-    if (searchObservations) {
-      observations = this.sessionSearch.searchObservations(query, { ...options, type: obs_type, concepts, files });
-    }
-    if (searchSessions) {
-      sessions = this.sessionSearch.searchSessions(query, options);
-    }
-    if (searchPrompts) {
-      prompts = this.sessionSearch.searchUserPrompts(query, options);
+    try {
+      if (searchObservations) {
+        observations = this.sessionSearch.searchObservations(query, { ...options, type: obs_type, concepts, files });
+      }
+      if (searchSessions) {
+        sessions = this.sessionSearch.searchSessions(query, options);
+      }
+      if (searchPrompts) {
+        prompts = this.sessionSearch.searchUserPrompts(query, options);
+      }
+    } catch (ftsError) {
+      const errorObject = ftsError instanceof Error ? ftsError : new Error(String(ftsError));
+      logger.error('WORKER', 'FTS5 fallback search failed', {}, errorObject);
+      chromaFailed = true;
     }
   }
@@ -469,13 +487,25 @@ export class SearchManager {
     logger.debug('SEARCH', 'Using hybrid semantic search for timeline query', {});
     const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
     try {
-      results = await this.searchChromaForTimeline(query, ninetyDaysAgo);
+      results = await this.searchChromaForTimeline(query, ninetyDaysAgo, project);
     } catch (chromaError) {
       const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
       logger.error('WORKER', 'Chroma search failed for timeline, continuing without semantic results', {}, errorObject);
     }
   }

+  // FTS fallback when Chroma is unavailable or returned no results
+  if (results.length === 0) {
+    try {
+      const ftsResults = this.sessionSearch.searchObservations(query, { project, limit: 1 });
+      if (ftsResults.length > 0) {
+        results = ftsResults;
+      }
+    } catch (ftsError) {
+      logger.warn('SEARCH', 'FTS fallback failed for timeline', {}, ftsError instanceof Error ? ftsError : undefined);
+    }
+  }
+
   if (results.length === 0) {
     return {
       content: [{
@@ -927,26 +957,55 @@ export class SearchManager {
   if (this.chromaSync) {
     logger.debug('SEARCH', 'Using hybrid semantic search (Chroma + SQLite)', {});

+    // Build Chroma where filter with doc_type and project scope
+    let whereFilter: Record<string, any> = { doc_type: 'observation' };
+    if (options.project) {
+      const projectFilter = {
+        $or: [
+          { project: options.project },
+          { merged_into_project: options.project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }
+
     // Step 1: Chroma semantic search (top 100)
-    const chromaResults = await this.queryChroma(query, 100);
-    logger.debug('SEARCH', 'Chroma returned semantic matches', { matchCount: chromaResults.ids.length });
+    try {
+      const chromaResults = await this.queryChroma(query, 100, whereFilter);
+      logger.debug('SEARCH', 'Chroma returned semantic matches', { matchCount: chromaResults.ids.length });

-    if (chromaResults.ids.length > 0) {
-      // Step 2: Filter by recency (90 days)
-      const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-      const recentIds = chromaResults.ids.filter((_id, idx) => {
-        const meta = chromaResults.metadatas[idx];
-        return meta && meta.created_at_epoch > ninetyDaysAgo;
-      });
+      if (chromaResults.ids.length > 0) {
+        // Step 2: Filter by recency (90 days)
+        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+        const recentIds = chromaResults.ids.filter((_id, idx) => {
+          const meta = chromaResults.metadatas[idx];
+          return meta && meta.created_at_epoch > ninetyDaysAgo;
+        });

-      logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });

-      // Step 3: Hydrate from SQLite in temporal order
-      if (recentIds.length > 0) {
-        const limit = options.limit || 20;
-        results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit });
-        logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+        // Step 3: Hydrate from SQLite in temporal order
+        if (recentIds.length > 0) {
+          const limit = options.limit || 20;
+          results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+          logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+        }
       }
+    } catch (chromaError) {
+      const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+      logger.error('WORKER', 'Chroma search failed for observations, falling back to FTS', {}, errorObject);
     }
   }

+  // FTS fallback when Chroma is unavailable or returned no results
+  if (results.length === 0) {
+    try {
+      const ftsResults = this.sessionSearch.searchObservations(query, options);
+      if (ftsResults.length > 0) {
+        results = ftsResults;
+      }
+    } catch (ftsError) {
+      logger.warn('SEARCH', 'FTS fallback failed for observations', {}, ftsError instanceof Error ? ftsError : undefined);
+    }
+  }
+
@@ -984,26 +1043,55 @@ export class SearchManager {
   if (this.chromaSync) {
     logger.debug('SEARCH', 'Using hybrid semantic search for sessions', {});

+    // Build Chroma where filter with doc_type and project scope
+    let whereFilter: Record<string, any> = { doc_type: 'session_summary' };
+    if (options.project) {
+      const projectFilter = {
+        $or: [
+          { project: options.project },
+          { merged_into_project: options.project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }
+
     // Step 1: Chroma semantic search (top 100)
-    const chromaResults = await this.queryChroma(query, 100, { doc_type: 'session_summary' });
-    logger.debug('SEARCH', 'Chroma returned semantic matches for sessions', { matchCount: chromaResults.ids.length });
+    try {
+      const chromaResults = await this.queryChroma(query, 100, whereFilter);
+      logger.debug('SEARCH', 'Chroma returned semantic matches for sessions', { matchCount: chromaResults.ids.length });

-    if (chromaResults.ids.length > 0) {
-      // Step 2: Filter by recency (90 days)
-      const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-      const recentIds = chromaResults.ids.filter((_id, idx) => {
-        const meta = chromaResults.metadatas[idx];
-        return meta && meta.created_at_epoch > ninetyDaysAgo;
-      });
+      if (chromaResults.ids.length > 0) {
+        // Step 2: Filter by recency (90 days)
+        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+        const recentIds = chromaResults.ids.filter((_id, idx) => {
+          const meta = chromaResults.metadatas[idx];
+          return meta && meta.created_at_epoch > ninetyDaysAgo;
+        });

-      logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });

-      // Step 3: Hydrate from SQLite in temporal order
-      if (recentIds.length > 0) {
-        const limit = options.limit || 20;
-        results = this.sessionStore.getSessionSummariesByIds(recentIds, { orderBy: 'date_desc', limit });
-        logger.debug('SEARCH', 'Hydrated sessions from SQLite', { count: results.length });
+        // Step 3: Hydrate from SQLite in temporal order
+        if (recentIds.length > 0) {
+          const limit = options.limit || 20;
+          results = this.sessionStore.getSessionSummariesByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+          logger.debug('SEARCH', 'Hydrated sessions from SQLite', { count: results.length });
+        }
       }
+    } catch (chromaError) {
+      const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+      logger.error('WORKER', 'Chroma search failed for sessions, falling back to FTS', {}, errorObject);
     }
   }

+  // FTS fallback when Chroma is unavailable or returned no results
+  if (results.length === 0) {
+    try {
+      const ftsResults = this.sessionSearch.searchSessions(query, options);
+      if (ftsResults.length > 0) {
+        results = ftsResults;
+      }
+    } catch (ftsError) {
+      logger.warn('SEARCH', 'FTS fallback failed for sessions', {}, ftsError instanceof Error ? ftsError : undefined);
+    }
+  }
+
@@ -1041,26 +1129,55 @@ export class SearchManager {
   if (this.chromaSync) {
     logger.debug('SEARCH', 'Using hybrid semantic search for user prompts', {});

+    // Build Chroma where filter with doc_type and project scope
+    let whereFilter: Record<string, any> = { doc_type: 'user_prompt' };
+    if (options.project) {
+      const projectFilter = {
+        $or: [
+          { project: options.project },
+          { merged_into_project: options.project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }
+
     // Step 1: Chroma semantic search (top 100)
-    const chromaResults = await this.queryChroma(query, 100, { doc_type: 'user_prompt' });
-    logger.debug('SEARCH', 'Chroma returned semantic matches for prompts', { matchCount: chromaResults.ids.length });
+    try {
+      const chromaResults = await this.queryChroma(query, 100, whereFilter);
+      logger.debug('SEARCH', 'Chroma returned semantic matches for prompts', { matchCount: chromaResults.ids.length });

-    if (chromaResults.ids.length > 0) {
-      // Step 2: Filter by recency (90 days)
-      const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-      const recentIds = chromaResults.ids.filter((_id, idx) => {
-        const meta = chromaResults.metadatas[idx];
-        return meta && meta.created_at_epoch > ninetyDaysAgo;
-      });
+      if (chromaResults.ids.length > 0) {
+        // Step 2: Filter by recency (90 days)
+        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+        const recentIds = chromaResults.ids.filter((_id, idx) => {
+          const meta = chromaResults.metadatas[idx];
+          return meta && meta.created_at_epoch > ninetyDaysAgo;
+        });

-      logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });

-      // Step 3: Hydrate from SQLite in temporal order
-      if (recentIds.length > 0) {
-        const limit = options.limit || 20;
-        results = this.sessionStore.getUserPromptsByIds(recentIds, { orderBy: 'date_desc', limit });
-        logger.debug('SEARCH', 'Hydrated user prompts from SQLite', { count: results.length });
+        // Step 3: Hydrate from SQLite in temporal order
+        if (recentIds.length > 0) {
+          const limit = options.limit || 20;
+          results = this.sessionStore.getUserPromptsByIds(recentIds, { orderBy: 'date_desc', limit, project: options.project });
+          logger.debug('SEARCH', 'Hydrated user prompts from SQLite', { count: results.length });
+        }
       }
+    } catch (chromaError) {
+      const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+      logger.error('WORKER', 'Chroma search failed for user prompts, falling back to FTS', {}, errorObject);
     }
   }

+  // FTS fallback when Chroma is unavailable or returned no results
+  if (results.length === 0 && query) {
+    try {
+      const ftsResults = this.sessionSearch.searchUserPrompts(query, options);
+      if (ftsResults.length > 0) {
+        results = ftsResults;
+      }
+    } catch (ftsError) {
+      logger.warn('SEARCH', 'FTS fallback failed for user prompts', {}, ftsError instanceof Error ? ftsError : undefined);
+    }
+  }
+
@@ -1702,23 +1819,53 @@ export class SearchManager {
   // Use hybrid search if available
   if (this.chromaSync) {
     logger.debug('SEARCH', 'Using hybrid semantic search for timeline query', {});
-    const chromaResults = await this.queryChroma(query, 100);
-    logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults.ids.length });
+    // Build Chroma where filter scoped to observations + project if provided
+    let whereFilter: Record<string, any> = { doc_type: 'observation' };
+    if (project) {
+      const projectFilter = {
+        $or: [
+          { project },
+          { merged_into_project: project }
+        ]
+      };
+      whereFilter = { $and: [whereFilter, projectFilter] };
+    }

-    if (chromaResults.ids.length > 0) {
-      // Filter by recency (90 days)
-      const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
-      const recentIds = chromaResults.ids.filter((_id, idx) => {
-        const meta = chromaResults.metadatas[idx];
-        return meta && meta.created_at_epoch > ninetyDaysAgo;
-      });
+    try {
+      const chromaResults = await this.queryChroma(query, 100, whereFilter);
+      logger.debug('SEARCH', 'Chroma returned semantic matches for timeline', { matchCount: chromaResults.ids.length });

-      logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+      if (chromaResults.ids.length > 0) {
+        // Filter by recency (90 days)
+        const ninetyDaysAgo = Date.now() - SEARCH_CONSTANTS.RECENCY_WINDOW_MS;
+        const recentIds = chromaResults.ids.filter((_id, idx) => {
+          const meta = chromaResults.metadatas[idx];
+          return meta && meta.created_at_epoch > ninetyDaysAgo;
+        });

-      if (recentIds.length > 0) {
-        results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: mode === 'auto' ? 1 : limit });
-        logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+        logger.debug('SEARCH', 'Results within 90-day window', { count: recentIds.length });
+
+        if (recentIds.length > 0) {
+          results = this.sessionStore.getObservationsByIds(recentIds, { orderBy: 'date_desc', limit: mode === 'auto' ? 1 : limit, project });
+          logger.debug('SEARCH', 'Hydrated observations from SQLite', { count: results.length });
+        }
       }
+    } catch (chromaError) {
+      const errorObject = chromaError instanceof Error ? chromaError : new Error(String(chromaError));
+      logger.error('WORKER', 'Chroma search failed for timeline by query, falling back to FTS', {}, errorObject);
     }
   }

+  // FTS fallback when Chroma is unavailable or returned no results
+  if (results.length === 0) {
+    try {
+      const ftsResults = this.sessionSearch.searchObservations(query, { project, limit: mode === 'auto' ? 1 : limit });
+      if (ftsResults.length > 0) {
+        results = ftsResults;
+      }
+    } catch (ftsError) {
+      logger.warn('SEARCH', 'FTS fallback failed for timeline by query', {}, ftsError instanceof Error ? ftsError : undefined);
+    }
+  }
+
@@ -17,6 +17,7 @@ import { SessionQueueProcessor } from '../queue/SessionQueueProcessor.js';
 import { getProcessBySession, ensureProcessExit } from './ProcessRegistry.js';
 import { getSupervisor } from '../../supervisor/index.js';
 import { MAX_CONSECUTIVE_SUMMARY_FAILURES } from '../../sdk/prompts.js';
+import { RestartGuard } from './RestartGuard.js';

 /** Idle threshold before a stuck generator (zombie subprocess) is force-killed. */
 export const MAX_GENERATOR_IDLE_MS = 5 * 60 * 1000; // 5 minutes
@@ -224,7 +225,8 @@ export class SessionManager {
   earliestPendingTimestamp: null,
   conversationHistory: [], // Initialize empty - will be populated by agents
   currentProvider: null, // Will be set when generator starts
-  consecutiveRestarts: 0, // Track consecutive restart attempts to prevent infinite loops
+  consecutiveRestarts: 0, // DEPRECATED: use restartGuard. Kept for logging compat.
+  restartGuard: new RestartGuard(),
   processingMessageIds: [], // CLAIM-CONFIRM: Track message IDs for confirmProcessed()
   lastGeneratorActivity: Date.now(), // Initialize for stale detection (Issue #1099)
   consecutiveSummaryFailures: 0, // Circuit breaker for summary retry loop (#1633)
@@ -465,6 +467,44 @@ export class SessionManager {
     }
   }

+  /**
+   * Evict the idlest session to free a pool slot (#1868).
+   * An "idle" session has an active generator but no pending work — it's sitting
+   * in the 3-min idle wait before subprocess cleanup. Evicting it triggers abort
+   * which kills the subprocess and frees the pool slot for a waiting new session.
+   * @returns true if a session was evicted, false if no idle sessions found
+   */
+  evictIdlestSession(): boolean {
+    let idlestSessionId: number | null = null;
+    let oldestActivity = Infinity;
+
+    for (const [sessionDbId, session] of this.sessions) {
+      if (!session.generatorPromise) continue; // No generator = no slot held
+      const pendingCount = this.getPendingStore().getPendingCount(sessionDbId);
+      if (pendingCount > 0) continue; // Has work to do, don't evict
+
+      // Pick the session with the oldest lastGeneratorActivity (idlest)
+      if (session.lastGeneratorActivity < oldestActivity) {
+        oldestActivity = session.lastGeneratorActivity;
+        idlestSessionId = sessionDbId;
+      }
+    }
+
+    if (idlestSessionId === null) return false;
+
+    const session = this.sessions.get(idlestSessionId);
+    if (!session) return false;
+
+    logger.info('SESSION', 'Evicting idle session to free pool slot for new request (#1868)', {
+      sessionDbId: idlestSessionId,
+      idleDurationMs: Date.now() - oldestActivity
+    });
+
+    session.idleTimedOut = true;
+    session.abortController.abort();
+    return true;
+  }
+
   /**
    * Reap sessions with no active generator and no pending work that have been idle too long.
    * Also reaps sessions whose generator has been stuck (no lastGeneratorActivity update) for
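The selection logic inside `evictIdlestSession()` can be isolated from the session map and abort machinery. A standalone sketch with model types of our own (the real method walks `this.sessions` and consults the pending store):

```typescript
// Among sessions that hold a pool slot (generator running) and have no
// pending work, pick the one with the oldest activity timestamp.
interface SessionInfo {
  id: number;
  hasGenerator: boolean;
  pendingCount: number;
  lastActivity: number;
}

function pickIdlestSession(sessions: SessionInfo[]): number | null {
  let idlest: number | null = null;
  let oldest = Infinity;
  for (const s of sessions) {
    if (!s.hasGenerator || s.pendingCount > 0) continue; // ineligible
    if (s.lastActivity < oldest) {
      oldest = s.lastActivity;
      idlest = s.id;
    }
  }
  return idlest;
}
```

Sessions with pending work are never candidates, so eviction only ever reclaims slots that were waiting out the idle timeout anyway.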
@@ -207,6 +207,8 @@ export async function processAgentResponse(
   }
   if (session.processingMessageIds.length > 0) {
     logger.debug('QUEUE', `CONFIRMED_BATCH | sessionDbId=${session.sessionDbId} | count=${session.processingMessageIds.length} | ids=[${session.processingMessageIds.join(',')}]`);
+    // Record successful processing so restart guard decay is anchored to real successes
+    session.restartGuard?.recordSuccess();
   }
   // Clear the tracking array after confirmation
   session.processingMessageIds = [];
```diff
@@ -21,8 +21,8 @@ export function createMiddleware(
 ): RequestHandler[] {
   const middlewares: RequestHandler[] = [];
 
-  // JSON parsing with 50mb limit
-  middlewares.push(express.json({ limit: '50mb' }));
+  // JSON parsing with 5mb limit (#1935)
+  middlewares.push(express.json({ limit: '5mb' }));
 
   // CORS - restrict to localhost origins only
   middlewares.push(cors({
@@ -42,6 +42,39 @@ export function createMiddleware(
     credentials: false
   }));
 
+  // Simple in-memory rate limiter (#1935)
+  const requestCounts = new Map<string, { count: number; resetAt: number }>();
+  const RATE_LIMIT_WINDOW_MS = 60_000; // 1 minute
+  const RATE_LIMIT_MAX_REQUESTS = 300; // 300 requests per minute per IP
+
+  const rateLimiter: RequestHandler = (req, res, next) => {
+    const clientIp = req.ip || 'unknown';
+    const now = Date.now();
+    let entry = requestCounts.get(clientIp);
+
+    if (!entry || now >= entry.resetAt) {
+      entry = { count: 0, resetAt: now + RATE_LIMIT_WINDOW_MS };
+      requestCounts.set(clientIp, entry);
+    }
+
+    // Lazy cleanup: remove expired entries when map grows large
+    if (requestCounts.size > 100) {
+      for (const [ip, e] of requestCounts) {
+        if (now >= e.resetAt) requestCounts.delete(ip);
+      }
+    }
+
+    entry.count++;
+    if (entry.count > RATE_LIMIT_MAX_REQUESTS) {
+      res.status(429).json({ error: 'Rate limit exceeded' });
+      return;
+    }
+
+    next();
+  };
+
+  middlewares.push(rateLimiter);
+
   // HTTP request/response logging
   middlewares.push((req: Request, res: Response, next: NextFunction) => {
     // Skip logging for static assets, health checks, and polling endpoints
```
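The limiter added above is a fixed-window counter per client IP: each key gets a counter that resets `RATE_LIMIT_WINDOW_MS` after the window opens, and requests past the cap inside that window get a 429. A framework-free sketch of the same windowing (the class name and the injected `now` argument are illustrative additions for testability, not part of the diff):

```typescript
// Fixed-window rate limiter: allows up to `max` hits per `windowMs` per key.
class FixedWindowLimiter {
  private counts = new Map<string, { count: number; resetAt: number }>();

  constructor(private max: number, private windowMs: number) {}

  // Returns true if the hit is allowed, false if the key is over the limit.
  hit(key: string, now: number): boolean {
    let entry = this.counts.get(key);
    if (!entry || now >= entry.resetAt) {
      // Window expired (or first hit): start a fresh window for this key.
      entry = { count: 0, resetAt: now + this.windowMs };
      this.counts.set(key, entry);
    }
    entry.count++;
    return entry.count <= this.max;
  }
}
```

Fixed windows allow short bursts of up to 2× the cap across a window boundary; that is the usual trade-off against the bookkeeping cost of a sliding window.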
```diff
@@ -382,11 +382,13 @@ export class DataRoutes extends BaseRouteHandler {
     }
 
     // Import observations (depends on sessions)
+    const importedObservations: Array<{ id: number; obs: typeof observations[0] }> = [];
     if (Array.isArray(observations)) {
       for (const obs of observations) {
         const result = store.importObservation(obs);
         if (result.imported) {
           stats.observationsImported++;
+          importedObservations.push({ id: result.id, obs });
         } else {
           stats.observationsSkipped++;
         }
@@ -398,6 +400,53 @@ export class DataRoutes extends BaseRouteHandler {
     if (stats.observationsImported > 0) {
       store.rebuildObservationsFTSIndex();
     }
 
+    // Sync imported observations to ChromaDB for vector search.
+    // Fire-and-forget: Chroma sync failure should not block the import response.
+    // Bounded concurrency to prevent overwhelming Chroma on large imports.
+    const chromaSync = this.dbManager.getChromaSync();
+    if (chromaSync && importedObservations.length > 0) {
+      const CHROMA_SYNC_CONCURRENCY = 8;
+      const safeParseJson = (val: string | null): string[] => {
+        if (!val) return [];
+        try { return JSON.parse(val); } catch { return []; }
+      };
+
+      const syncOne = async ({ id, obs }: { id: number; obs: any }) => {
+        const parsedObs = {
+          type: obs.type || 'discovery',
+          title: obs.title || null,
+          subtitle: obs.subtitle || null,
+          facts: safeParseJson(obs.facts),
+          narrative: obs.narrative || null,
+          concepts: safeParseJson(obs.concepts),
+          files_read: safeParseJson(obs.files_read),
+          files_modified: safeParseJson(obs.files_modified),
+        };
+
+        await chromaSync.syncObservation(
+          id,
+          obs.memory_session_id,
+          obs.project,
+          parsedObs,
+          obs.prompt_number || 0,
+          obs.created_at_epoch,
+          obs.discovery_tokens || 0
+        ).catch(err => {
+          logger.error('CHROMA', 'Import ChromaDB sync failed', { id }, err as Error);
+        });
+      };
+
+      // Fire-and-forget: process in batches but don't block the response
+      (async () => {
+        for (let i = 0; i < importedObservations.length; i += CHROMA_SYNC_CONCURRENCY) {
+          const batch = importedObservations.slice(i, i + CHROMA_SYNC_CONCURRENCY);
+          await Promise.all(batch.map(syncOne));
+        }
+      })().catch(err => {
+        logger.error('CHROMA', 'Import ChromaDB batch sync failed', {}, err as Error);
+      });
+    }
   }
 
   // Import prompts (depends on sessions)
```
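The fire-and-forget sync above bounds concurrency by slicing the imported rows into fixed-size batches and awaiting each batch fully before starting the next. A generic sketch of that pattern:

```typescript
// Process items in batches of `concurrency`, awaiting each batch fully
// before starting the next. Simpler than a worker pool, at the cost of
// idle slots whenever a batch contains one slow item.
async function processInBatches<T>(
  items: T[],
  concurrency: number,
  fn: (item: T) => Promise<void>
): Promise<void> {
  for (let i = 0; i < items.length; i += concurrency) {
    const batch = items.slice(i, i + concurrency);
    await Promise.all(batch.map(fn));
  }
}
```

One slow item stalls its whole batch, so effective throughput can dip below the bound; the diff accepts that trade-off in exchange for very little bookkeeping.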
```diff
@@ -24,6 +24,7 @@ import { USER_SETTINGS_PATH } from '../../../../shared/paths.js';
 import { getProcessBySession, ensureProcessExit } from '../../ProcessRegistry.js';
 import { getProjectContext } from '../../../../utils/project-name.js';
 import { normalizePlatformSource } from '../../../../shared/platform-source.js';
+import { RestartGuard } from '../../RestartGuard.js';
 
 export class SessionRoutes extends BaseRouteHandler {
   private completionHandler: SessionCompletionHandler;
@@ -279,9 +280,10 @@ export class SessionRoutes extends BaseRouteHandler {
 
       if (wasAborted) {
         logger.info('SESSION', `Generator aborted`, { sessionId: sessionDbId });
-      } else {
-        logger.error('SESSION', `Generator exited unexpectedly`, { sessionId: sessionDbId });
       }
+      // Don't log "exited unexpectedly" here — a non-abort exit is normal when
+      // the SDK subprocess completes its work. The crash-recovery block below
+      // checks pendingCount to distinguish real crashes from clean exits (#1876).
 
       session.generatorPromise = null;
       session.currentProvider = null;
@@ -290,7 +292,6 @@ export class SessionRoutes extends BaseRouteHandler {
       // Crash recovery: If not aborted and still has work, restart (with limit)
       if (!wasAborted) {
         const pendingStore = this.sessionManager.getPendingMessageStore();
-        const MAX_CONSECUTIVE_RESTARTS = 3;
 
         let pendingCount: number;
         try {
@@ -309,14 +310,18 @@ export class SessionRoutes extends BaseRouteHandler {
           return;
         }
 
-        session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1;
+        // Windowed restart guard: only blocks tight-loop restarts, not spread-out ones (#2053)
+        if (!session.restartGuard) session.restartGuard = new RestartGuard();
+        const restartAllowed = session.restartGuard.recordRestart();
+        session.consecutiveRestarts = (session.consecutiveRestarts || 0) + 1; // Keep for logging
 
-        if (session.consecutiveRestarts > MAX_CONSECUTIVE_RESTARTS) {
-          logger.error('SESSION', `CRITICAL: Generator restart limit exceeded - stopping to prevent runaway costs`, {
+        if (!restartAllowed) {
+          logger.error('SESSION', `CRITICAL: Restart guard tripped — too many restarts in window, stopping to prevent runaway costs`, {
             sessionId: sessionDbId,
             pendingCount,
             consecutiveRestarts: session.consecutiveRestarts,
-            maxRestarts: MAX_CONSECUTIVE_RESTARTS,
+            restartsInWindow: session.restartGuard.restartsInWindow,
+            windowMs: session.restartGuard.windowMs,
+            maxRestarts: session.restartGuard.maxRestarts,
             action: 'Generator will NOT restart. Check logs for root cause. Messages remain in pending state.'
           });
           // Don't restart - abort to prevent further API calls
@@ -328,7 +333,8 @@ export class SessionRoutes extends BaseRouteHandler {
           sessionId: sessionDbId,
           pendingCount,
           consecutiveRestarts: session.consecutiveRestarts,
-          maxRestarts: MAX_CONSECUTIVE_RESTARTS
+          restartsInWindow: session.restartGuard!.restartsInWindow,
+          maxRestarts: session.restartGuard!.maxRestarts
         });
 
         // Abort OLD controller before replacing to prevent child process leaks
```
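The guard swap above replaces a lifetime restart cap with a windowed one: only restarts that cluster inside a time window trip the guard, and recorded successes reset the count so spread-out restarts never accumulate. The real `RestartGuard` implementation is not shown in this diff; the sketch below is a hypothetical reading of the interface the diff calls (`recordRestart`, `recordSuccess`, `restartsInWindow`, `windowMs`, `maxRestarts`), with defaults and an injectable clock chosen purely for illustration:

```typescript
// Hypothetical sketch — not the real RestartGuard from ../../RestartGuard.js.
class RestartGuard {
  private restarts: number[] = []; // epoch ms of recent restarts

  constructor(
    public readonly maxRestarts = 3,
    public readonly windowMs = 60_000,
    private now: () => number = Date.now
  ) {}

  get restartsInWindow(): number {
    const cutoff = this.now() - this.windowMs;
    this.restarts = this.restarts.filter(t => t > cutoff); // decay old entries
    return this.restarts.length;
  }

  // Record a restart; false means the window is saturated and the caller should stop.
  recordRestart(): boolean {
    this.restarts.push(this.now());
    return this.restartsInWindow <= this.maxRestarts;
  }

  // A confirmed successful batch clears the window (assumed semantics).
  recordSuccess(): void {
    this.restarts = [];
  }
}
```

The key property is that three restarts spread over an hour behave differently from three restarts in ten seconds, which a plain `consecutiveRestarts` counter cannot distinguish.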
```diff
@@ -85,7 +85,7 @@ export class SettingsDefaultsManager {
   private static readonly DEFAULTS: SettingsDefaults = {
     CLAUDE_MEM_MODEL: 'claude-sonnet-4-6',
     CLAUDE_MEM_CONTEXT_OBSERVATIONS: '50',
-    CLAUDE_MEM_WORKER_PORT: '37777',
+    CLAUDE_MEM_WORKER_PORT: String(37700 + ((process.getuid?.() ?? 77) % 100)),
     CLAUDE_MEM_WORKER_HOST: '127.0.0.1',
     CLAUDE_MEM_SKIP_TOOLS: 'ListMcpResourcesTool,SlashCommand,Skill,TodoWrite,AskUserQuestion',
     // AI Provider Configuration
```
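The new default derives a per-user port from the numeric uid, so multiple users on one host no longer collide on a shared 37777: `37700 + (uid % 100)` always lands in 37700–37799, and the `?? 77` fallback means platforms without `process.getuid` (e.g. Windows) keep the previous default of 37777. A sketch of the mapping:

```typescript
// Map a uid (or the 77 fallback) into the 37700–37799 worker port range.
function workerPortForUid(uid: number | undefined): number {
  return 37700 + ((uid ?? 77) % 100);
}
```

Two uids that differ by a multiple of 100 still collide, so this is collision avoidance for typical multi-user hosts rather than a guarantee.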
```diff
@@ -1,4 +1,5 @@
 import React, { useState, useEffect, useCallback, useRef, useMemo } from 'react';
+import { authFetch } from '../utils/api';
 
 // Log levels and components matching the logger.ts definitions
 type LogLevel = 'DEBUG' | 'INFO' | 'WARN' | 'ERROR';
@@ -133,7 +134,7 @@ export function LogsDrawer({ isOpen, onClose }: LogsDrawerProps) {
     setIsLoading(true);
     setError(null);
     try {
-      const response = await fetch('/api/logs');
+      const response = await authFetch('/api/logs');
       if (!response.ok) {
         throw new Error(`Failed to fetch logs: ${response.statusText}`);
       }
@@ -158,7 +159,7 @@ export function LogsDrawer({ isOpen, onClose }: LogsDrawerProps) {
     setIsLoading(true);
     setError(null);
     try {
-      const response = await fetch('/api/logs/clear', { method: 'POST' });
+      const response = await authFetch('/api/logs/clear', { method: 'POST' });
       if (!response.ok) {
         throw new Error(`Failed to clear logs: ${response.statusText}`);
       }
```
```diff
@@ -1,5 +1,6 @@
 import { useState, useEffect, useCallback } from 'react';
 import type { ProjectCatalog, Settings } from '../types';
+import { authFetch } from '../utils/api';
 
 interface UseContextPreviewResult {
   preview: string;
@@ -39,7 +40,7 @@ export function useContextPreview(settings: Settings): UseContextPreviewResult {
   async function fetchProjects() {
     let data: ProjectCatalog;
     try {
-      const response = await fetch('/api/projects');
+      const response = await authFetch('/api/projects');
       data = await response.json() as ProjectCatalog;
     } catch (err: unknown) {
       console.error('Failed to fetch projects:', err instanceof Error ? err.message : String(err));
@@ -100,7 +101,7 @@ export function useContextPreview(settings: Settings): UseContextPreviewResult {
   }
 
   try {
-    const response = await fetch(`/api/context/preview?${params}`);
+    const response = await authFetch(`/api/context/preview?${params}`);
     const text = await response.text();
 
     if (response.ok) {
```
||||
```diff
@@ -2,6 +2,7 @@ import { useState, useCallback, useRef } from 'react';
 import { Observation, Summary, UserPrompt } from '../types';
 import { UI } from '../constants/ui';
 import { API_ENDPOINTS } from '../constants/api';
+import { authFetch } from '../utils/api';
 
 interface PaginationState {
   isLoading: boolean;
@@ -68,7 +69,7 @@ function usePaginationFor(endpoint: string, dataType: DataType, currentFilter: s
     params.append('platformSource', currentSource);
   }
 
-  const response = await fetch(`${endpoint}?${params}`);
+  const response = await authFetch(`${endpoint}?${params}`);
 
   if (!response.ok) {
     throw new Error(`Failed to load ${dataType}: ${response.statusText}`);
```
||||
```diff
@@ -3,6 +3,7 @@ import { Settings } from '../types';
 import { DEFAULT_SETTINGS } from '../constants/settings';
 import { API_ENDPOINTS } from '../constants/api';
 import { TIMING } from '../constants/timing';
+import { authFetch } from '../utils/api';
 
 export function useSettings() {
   const [settings, setSettings] = useState<Settings>(DEFAULT_SETTINGS);
@@ -11,8 +12,13 @@ export function useSettings() {
 
   useEffect(() => {
     // Load initial settings
-    fetch(API_ENDPOINTS.SETTINGS)
-      .then(res => res.json())
+    authFetch(API_ENDPOINTS.SETTINGS)
+      .then(async res => {
+        if (!res.ok) {
+          throw new Error(`Failed to load settings (${res.status})`);
+        }
+        return res.json();
+      })
      .then(data => {
        // Use ?? (nullish coalescing) instead of || so that falsy values
        // like '0', 'false', and '' from the backend are preserved.
@@ -60,20 +66,30 @@ export function useSettings() {
     setIsSaving(true);
     setSaveStatus('Saving...');
 
-    const response = await fetch(API_ENDPOINTS.SETTINGS, {
-      method: 'POST',
-      headers: { 'Content-Type': 'application/json' },
-      body: JSON.stringify(newSettings)
-    });
+    try {
+      const response = await authFetch(API_ENDPOINTS.SETTINGS, {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify(newSettings)
+      });
 
-    const result = await response.json();
+      if (!response.ok) {
+        setSaveStatus(`✗ Error: ${response.status === 401 ? 'Unauthorized' : response.statusText}`);
+        setIsSaving(false);
+        return;
+      }
 
-    if (result.success) {
-      setSettings(newSettings);
-      setSaveStatus('✓ Saved');
-      setTimeout(() => setSaveStatus(''), TIMING.SAVE_STATUS_DISPLAY_DURATION_MS);
-    } else {
-      setSaveStatus(`✗ Error: ${result.error}`);
+      const result = await response.json();
+
+      if (result.success) {
+        setSettings(newSettings);
+        setSaveStatus('✓ Saved');
+        setTimeout(() => setSaveStatus(''), TIMING.SAVE_STATUS_DISPLAY_DURATION_MS);
+      } else {
+        setSaveStatus(`✗ Error: ${result.error}`);
+      }
+    } catch (error) {
+      setSaveStatus(`✗ Error: ${error instanceof Error ? error.message : 'Network error'}`);
+    }
 
     setIsSaving(false);
```
||||
```diff
@@ -1,13 +1,14 @@
 import { useState, useEffect, useCallback } from 'react';
 import { Stats } from '../types';
 import { API_ENDPOINTS } from '../constants/api';
+import { authFetch } from '../utils/api';
 
 export function useStats() {
   const [stats, setStats] = useState<Stats>({});
 
   const loadStats = useCallback(async () => {
     try {
-      const response = await fetch(API_ENDPOINTS.STATS);
+      const response = await authFetch(API_ENDPOINTS.STATS);
       const data = await response.json();
       setStats(data);
     } catch (error: unknown) {
```
||||
src/ui/viewer/utils/api.ts (new file, 7 lines)

```diff
@@ -0,0 +1,7 @@
+/**
+ * Fetch wrapper for viewer API calls.
+ * Worker is localhost-only; no auth header needed.
+ */
+export function authFetch(input: RequestInfo | URL, init?: RequestInit): Promise<Response> {
+  return fetch(input, init);
+}
```