The project's working changelog regenerator is `scripts/generate-changelog.js`
(not the stdin-based bundled script), exposed via `npm run changelog:generate`.
Prior wording pointed to a broken path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The skill previously listed 3 manifest paths and omitted `npm publish`
entirely, which meant `npx claude-mem@X.Y.Z` only resolved when someone
ran publish out-of-band. Now the skill:
- Enumerates all 6 version-bearing files (package.json, plugin/package.json,
.claude-plugin/marketplace.json, .claude-plugin/plugin.json,
plugin/.claude-plugin/plugin.json, .codex-plugin/plugin.json).
- Adds an explicit `npm publish` step with `npm view claude-mem@X.Y.Z version`
verification so the npx-distributed version is the one users actually pin.
- Documents `npm run release:patch|minor|major` (np helper) as an alternative.
- Adds `git grep` pre-flight so new manifests are discovered automatically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: security observation types + Telegram notifier
Adds two severity-axis security observation types (security_alert, security_note)
to the code mode and a fire-and-forget Telegram notifier that posts when a saved
observation matches configured type or concept triggers. Default trigger fires on
security_alert only; notifier is disabled until BOT_TOKEN and CHAT_ID are set.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(telegram): honor CLAUDE_MEM_TELEGRAM_ENABLED master toggle
Adds an explicit on/off flag (default 'true') so users can disable the
notifier without clearing credentials.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
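The trigger rule described in the two commits above can be sketched as a pure predicate. This is an illustration only, not the project's notifier: the `Observation` and `NotifierConfig` shapes and the `shouldNotify` name are hypothetical. Only the defaults (trigger on `security_alert`, disabled until both credentials are set, master toggle on by default) come from the commit messages.

```typescript
// Hypothetical shapes for illustration; the real types are not shown in this log.
interface Observation {
  type: string;
  concepts: string[];
}

interface NotifierConfig {
  enabled: boolean;          // mirrors CLAUDE_MEM_TELEGRAM_ENABLED (default 'true')
  botToken?: string;         // BOT_TOKEN
  chatId?: string;           // CHAT_ID
  triggerTypes: string[];    // default trigger: security_alert only
  triggerConcepts: string[]; // no concept triggers by default
}

const DEFAULT_CONFIG: NotifierConfig = {
  enabled: true,
  triggerTypes: ['security_alert'],
  triggerConcepts: [],
};

function shouldNotify(obs: Observation, config: NotifierConfig): boolean {
  // The master toggle lets users switch the notifier off without clearing credentials.
  if (!config.enabled) return false;
  // The notifier stays disabled until both credentials are present.
  if (!config.botToken || !config.chatId) return false;
  // Fire on a matching type, or on any matching concept.
  if (config.triggerTypes.indexOf(obs.type) !== -1) return true;
  return obs.concepts.some((c) => config.triggerConcepts.indexOf(c) !== -1);
}
```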
* perf(stop-hook): make summarize handler fire-and-forget
Stop hook previously blocked the Claude Code session for up to 110
seconds while polling the worker for summary completion. The handler
now returns as soon as the enqueue POST is acked.
- summarize.ts: drop the 500ms polling loop and /api/sessions/complete
call; tighten SUMMARIZE_TIMEOUT_MS from 300s to 5s since the worker
acks the enqueue synchronously.
- SessionCompletionHandler: extract idempotent finalizeSession() for
DB mark + orphaned-pending-queue drain + broadcast. completeByDbId
now delegates so the /api/sessions/complete HTTP route is backward
compatible.
- SessionRoutes: wire finalizeSession into the SDK-agent generator's
finally block, gated on lastSummaryStored + empty pending queue so
only Stop events produce finalize (not every idle tick).
- WorkerService: own the single SessionCompletionHandler instance and
inject it into SessionRoutes to avoid duplicate construction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
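The idempotent `finalizeSession()` described above can be sketched with in-memory stand-ins. Everything here except the method name and the three effects (DB mark, pending-queue drain, broadcast) is an assumption made for illustration; the real handler works against the database and a broadcaster that this log does not show.

```typescript
// Hypothetical in-memory stand-ins for the session table and pending queue.
type SessionStatus = 'active' | 'completed';

interface SessionRow {
  id: number;
  status: SessionStatus;
}

class FinalizeSketch {
  public broadcasts: number[] = [];
  private sessions: Record<number, SessionRow>;
  private pendingQueue: Record<number, string[]>;

  constructor(sessions: Record<number, SessionRow>, pendingQueue: Record<number, string[]>) {
    this.sessions = sessions;
    this.pendingQueue = pendingQueue;
  }

  // Safe to call more than once: only the first call for a session has effects.
  finalizeSession(sessionDbId: number): boolean {
    const row = this.sessions[sessionDbId];
    if (!row) return false;
    // Idempotency guard: this only works if the lookup actually returns `status`.
    if (row.status === 'completed') return false;

    row.status = 'completed';            // DB mark
    this.pendingQueue[sessionDbId] = []; // drain orphaned pending messages
    this.broadcasts.push(sessionDbId);   // broadcast completion exactly once
    return true;
  }
}
```

Returning a boolean lets a caller such as `completeByDbId` tell a first finalize from a repeat without a second lookup.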
* fix(pr2084): address reviewer findings
CodeRabbit:
- SessionStore.getSessionById now returns status; without it, the
finalizeSession idempotency guard always evaluated false and
re-fired drain/broadcast on every call.
- worker-service.ts: three call sites that remove the in-memory session
after finalizeSession now do so only on success. On failure the
session is left in place so the 60s orphan reaper can retry; removing
it would orphan an 'active' DB row indefinitely under the fire-and-
forget Stop hook.
- runFallbackForTerminatedSession no longer emits a second
session_completed event; finalizeSession already broadcasts one.
The explicit broadcast now runs only on the finalize-failure fallback.
Greptile:
- TelegramNotifier reads via loadFromFile(USER_SETTINGS_PATH) so values
in ~/.claude-mem/settings.json actually take effect; SettingsDefaultsManager.get()
alone skipped the file and silently ignored user-configured credentials.
- Emoji is derived from obs.type (security_alert → 🚨, security_note → 🔐,
fallback 🔔) instead of hardcoded 🚨 for every observation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(hooks): worker-port mismatch on Windows and settings.json overrides (#2086)
Hooks computed the health-check port as `$((37700 + $(id -u) % 100))`,
ignoring ~/.claude-mem/settings.json. Two failure modes resulted:
1. Users upgrading from pre-per-uid builds kept CLAUDE_MEM_WORKER_PORT
set to '37777' in settings.json. The worker bound 37777 (settings
wins), but hooks queried 37701 (uid 501 on macOS), so every
SessionStart/UserPromptSubmit health check failed.
2. Windows Git Bash/PowerShell returns a real Windows UID for 'id -u'
(e.g. 209), producing port 37709 while the Node worker fell back
to 37777 (process.getuid?.() ?? 77). Every prompt hit the 60s hook
timeout.
hooks.json now resolves the port in this order, matching how the
worker itself resolves it:
1. sed CLAUDE_MEM_WORKER_PORT from ~/.claude-mem/settings.json
2. If absent, and uname is MINGW/CYGWIN/MSYS → 37777
3. Otherwise 37700 + (id -u || 77) % 100
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
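The resolution order above can be sketched as a pure function. The hook itself is shell; this TypeScript version only illustrates the ordering, and the `PortInputs` shape and `resolveWorkerPort` name are hypothetical.

```typescript
// Hypothetical inputs gathered by the caller; names are for illustration only.
interface PortInputs {
  settingsPort?: string; // CLAUDE_MEM_WORKER_PORT from ~/.claude-mem/settings.json, if present
  unameS: string;        // output of `uname -s`, e.g. 'Darwin' or 'MINGW64_NT-10.0'
  uid?: number;          // output of `id -u`; undefined when unavailable
}

function resolveWorkerPort(inputs: PortInputs): number {
  // 1. A port in settings.json wins.
  if (inputs.settingsPort !== undefined) {
    const parsed = parseInt(inputs.settingsPort, 10);
    if (!isNaN(parsed)) return parsed;
  }
  // 2. MINGW/CYGWIN/MSYS: use the fixed port the Node worker falls back to.
  if (/^(MINGW|CYGWIN|MSYS)/.test(inputs.unameS)) return 37777;
  // 3. Per-uid default: 37700 + (uid, or 77 when unknown) % 100.
  const uid = inputs.uid !== undefined ? inputs.uid : 77;
  return 37700 + (uid % 100);
}
```

With these rules, uid 501 on macOS resolves to 37701 unless settings.json pins a port, and a Git Bash uid of 209 no longer yields 37709.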
* fix(pr2084): sync DatabaseManager.getSessionById return type
CodeRabbit round 2: the DatabaseManager.getSessionById return type
was missing platform_source, custom_title, and status fields that
SessionStore.getSessionById actually returns. Structural typing
hid the mismatch at compile time, but it prevents callers going
through DatabaseManager from seeing the status field that the
idempotency guard in SessionCompletionHandler relies on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(pr2084): hooks honor env vars and host; looser port regex (#2086 followup)
CodeRabbit round 3: match the worker's env > file > defaults precedence
and resolve host the same way as port.
- Env: CLAUDE_MEM_WORKER_PORT and CLAUDE_MEM_WORKER_HOST win first.
- File: sed now accepts both quoted ('"37777"') and unquoted (37777)
JSON values for the port; a separate sed reads CLAUDE_MEM_WORKER_HOST.
- Defaults: port per-uid formula (Windows: 37777), host 127.0.0.1.
- Health-check URL uses the resolved $HOST instead of hardcoded localhost.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
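A sketch of the env > file > defaults precedence and of a pattern that tolerates both quoted and unquoted port values. The regular expression is illustrative; the hook's actual sed expression is not shown in this log, and the function names are hypothetical.

```typescript
// Accepts both "CLAUDE_MEM_WORKER_PORT": "37777" and "CLAUDE_MEM_WORKER_PORT": 37777.
const PORT_PATTERN = /"CLAUDE_MEM_WORKER_PORT"\s*:\s*"?(\d+)"?/;

function readPortFromSettings(settingsJsonText: string): string | undefined {
  const match = PORT_PATTERN.exec(settingsJsonText);
  return match ? match[1] : undefined;
}

// Precedence: environment variable, then settings file, then the computed default.
function pickPort(
  envPort: string | undefined,
  settingsJsonText: string,
  defaultPort: number
): number {
  const candidate = envPort || readPortFromSettings(settingsJsonText);
  return candidate ? parseInt(candidate, 10) : defaultPort;
}
```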
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: detect PID reuse in worker start-guard to survive container restarts
The 'Worker already running' guard checked PID liveness with kill(0), which
false-positives when a persistent PID file outlives the PID namespace (docker
stop / docker start, pm2 graceful reloads). The new worker comes up with the
same low PID (e.g. 11) as the old one, kill(0) says 'alive', and the worker
refuses to start against its own prior incarnation.
Capture a process-start token alongside the PID and verify identity, not just
liveness:
- Linux: /proc/<pid>/stat field 22 (starttime, jiffies since boot)
- macOS/POSIX: `ps -p <pid> -o lstart=`
- Windows: unchanged (returns null, falls back to liveness)
PID files written by older versions are token-less, so verifyPidFileOwnership
falls back to the current liveness-only behavior for backwards compatibility.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
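The Linux branch and the identity check can be sketched as two pure functions. The names `parseStartTimeFromStat`, `PidInfoSketch`, and `isSameProcess` are hypothetical, and the real helpers read `/proc` and the PID file themselves; this only shows the parsing and comparison logic.

```typescript
// Field 22 of /proc/<pid>/stat is starttime. The second field (comm) is wrapped in
// parentheses and may itself contain spaces or ')', so split after the LAST ')'.
function parseStartTimeFromStat(statLine: string): string | null {
  const closeParen = statLine.lastIndexOf(')');
  if (closeParen === -1) return null;
  const rest = statLine.slice(closeParen + 1).trim().split(/\s+/);
  // rest[0] is field 3 (state), so field 22 sits at index 22 - 3 = 19.
  const startTime = rest[19];
  return startTime !== undefined && /^\d+$/.test(startTime) ? startTime : null;
}

// Hypothetical PID-file shape; files written by older versions have no token.
interface PidInfoSketch {
  pid: number;
  startToken?: string;
}

function isSameProcess(
  info: PidInfoSketch,
  alive: boolean,
  currentToken: string | null
): boolean {
  if (!alive) return false;
  // Token-less PID file, or a platform without tokens: fall back to liveness only.
  if (!info.startToken || currentToken === null) return true;
  // Same PID but a different start time means the PID was reused.
  return info.startToken === currentToken;
}
```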
* refactor: apply review feedback to PID identity helpers
- Collapse ProcessManager re-export down to a single import/export statement.
- Make verifyPidFileOwnership a type predicate (info is PidInfo) so callers
don't need non-null assertions on the narrowed value.
- Drop the `!` assertions at the worker-service GUARD 1 call site now that
the predicate narrows.
- Tighten the captureProcessStartToken platform doc comment to enumerate
process.platform values explicitly.
No behavior change — esbuild output is byte-identical (type-only edits).
Addresses items 1-3 of the claude-review comment on PR #2082.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: pin LC_ALL=C for `ps lstart=` in captureProcessStartToken
Without a locale pin, `ps -o lstart=` emits month/weekday names in the
system locale. A bind-mounted PID file written under one locale and read
under another would hash to different tokens and the live worker would
incorrectly appear stale — reintroducing the very bug this helper exists
to prevent.
Flagged by Greptile on PR #2082.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor: address second-round review on PID identity helpers
- verifyPidFileOwnership: log a DEBUG diagnostic when the PID is alive but
the start-token mismatches. Without it, callers can't distinguish the
"process dead" path from the "PID reused" path in production logs — the
exact case this helper exists to catch.
- writePidFile: drop the redundant `?? undefined` coercion. `null` and
`undefined` are both falsy for the subsequent ternary, so the coercion
was purely cosmetic noise that suggested an important distinction.
- Add a unit test for the win32 fallback path in captureProcessStartToken
(mocks process.platform) — previously uncovered in CI.
Addresses items 1, 2, and 5 of the second claude-review on PR #2082.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048)
- Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916)
- Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048)
- Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956)
- Add periodic clearFailed() to purge stale pending_messages (#1957)
- Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874)
- ResponseProcessor: mark messages as failed (with retry) instead of confirming
when the LLM returns non-XML garbage (auth errors, rate limits) (#1874)
- Health endpoint: include activeSessions count for queue liveness monitoring (#1867)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: cache isFts5Available() at construction time
Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text
query. Result is now cached in _fts5Available at construction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve worker stability bugs — pool deadlock, MCP loopback, restart guard (#1868, #1876, #2053)
- Replace flat consecutiveRestarts counter with time-windowed RestartGuard:
only counts restarts within 60s window (cap=10), decays after 5min of
success. Prevents stranding pending messages on long-running sessions. (#2053)
- Add idle session eviction to pool slot allocation: when all slots are full,
evict the idlest session (no pending work, oldest activity) to free a slot
for new requests, preventing 60s timeout deadlock. (#1868)
- Fix MCP loopback self-check: use process.execPath instead of bare 'node'
which fails on non-interactive PATH. Fix crash misclassification by removing
false "Generator exited unexpectedly" error log on normal completion. (#1876)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
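A minimal sketch of the time-windowed counting idea behind RestartGuard, assuming a sliding 60s window with a cap of 10. The class and method names are hypothetical, the clock is injected for clarity, and the 5-minute success decay is deliberately left out because its exact semantics are not spelled out in this log.

```typescript
const RESTART_WINDOW_MS = 60 * 1000; // only restarts inside this window count
const MAX_RESTARTS_IN_WINDOW = 10;   // cap

class RestartWindowSketch {
  private restartTimes: number[] = [];

  // Records a restart at time `now` (ms) and reports whether it is within the cap.
  recordRestart(now: number): boolean {
    // Drop restarts that have aged out of the window, unlike a flat counter.
    this.restartTimes = this.restartTimes.filter((t) => now - t < RESTART_WINDOW_MS);
    this.restartTimes.push(now);
    return this.restartTimes.length <= MAX_RESTARTS_IN_WINDOW;
  }
}
```

Because old restarts age out, a long-running session that restarts occasionally never reaches the cap, which is the stranding problem the flat counter had.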
* fix: resolve hooks reliability bugs — summarize exit code, session-init health wait (#1896, #1901, #1903, #1907)
- Wrap summarize hook's workerHttpRequest in try/catch to prevent exit
code 2 (blocking error) on network failures or malformed responses.
Session exit no longer blocks on worker errors. (#1901)
- Add health-check wait loop to UserPromptSubmit session-init command in
hooks.json. On Linux/WSL where hook ordering fires UserPromptSubmit
before SessionStart, session-init now waits up to 10s for worker health
before proceeding. Also wrap session-init HTTP call in try/catch. (#1907)
- Close #1896 as already-fixed: mtime comparison at file-context.ts:255-267
bypasses truncation when file is newer than latest observation.
- Close #1903 as no-repro: hooks.json correctly declares all hook events.
Issue was Claude Code 12.0.1/macOS platform event-dispatch bug.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: security hardening — bearer auth, path validation, rate limits, per-user port (#1932, #1933, #1934, #1935, #1936)
- Add bearer token auth to all API endpoints: auto-generated 32-byte
token stored at ~/.claude-mem/worker-auth-token (mode 0600). All hook,
MCP, viewer, and OpenCode requests include Authorization header.
Health/readiness endpoints exempt for polling. (#1932, #1933)
- Add path traversal protection: watch.context.path validated against
project root and ~/.claude-mem/ before write. Rejects ../../../etc
style attacks. (#1934)
- Reduce JSON body limit from 50MB to 5MB. Add in-memory rate limiter
(300 req/min/IP) to prevent abuse. (#1935)
- Derive default worker port from UID (37700 + uid%100) to prevent
cross-user data leakage on multi-user macOS. Windows falls back to
37777. Shell hooks use same formula via id -u. (#1936)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve search project filtering and import Chroma sync (#1911, #1912, #1914, #1918)
- Fix per-type search endpoints to pass project filter to Chroma queries
and SQLite hydration. searchObservations/Sessions/UserPrompts now use
$or clause matching project + merged_into_project. (#1912)
- Fix timeline/search methods to pass project to Chroma anchor queries.
Prevents cross-project result leakage when project param omitted. (#1911)
- Sync imported observations to ChromaDB after FTS rebuild. Import
endpoint now calls chromaSync.syncObservation() for each imported
row, making them visible to MCP search(). (#1914)
- Fix session-init cwd fallback to match context.ts (process.cwd()).
Prevents project key mismatch that caused "no previous sessions"
on fresh sessions. (#1918)
- Fix sync-marketplace restart to include auth token and per-user port.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve all CodeRabbit and Greptile review comments on PR #2080
- Fix run.sh comment mismatch (no-op flag vs empty array)
- Gate session-init on health check success (prevent running when worker unreachable)
- Fix date_desc ordering ignored in FTS session search
- Age-scope failed message purge (1h retention) instead of clearing all
- Anchor RestartGuard decay to real successes (null init, not Date.now())
- Add recordSuccess() calls in ResponseProcessor and completion path
- Prevent caller headers from overriding bearer auth token
- Add lazy cleanup for rate limiter map to prevent unbounded growth
- Bound post-import Chroma sync with concurrency limit of 8
- Add doc_type:'observation' filter to Chroma queries feeding observation hydration
- Add FTS fallback to all specialized search handlers (observations, sessions, prompts, timeline)
- Add response.ok check and error handling in viewer saveSettings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CodeRabbit round-2 review comments
- Use failure timestamp (COALESCE) instead of created_at_epoch for stale purge
- Downgrade _fts5Available flag when FTS table creation fails
- Escape FTS5 MATCH input by quoting user queries as literal phrases
- Escape LIKE metacharacters (%, _, \) in prompt text search
- Add response.ok check in initial settings load (matches save flow)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
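The two escaping steps above can be sketched as follows. The function names are hypothetical and the project's own helpers may differ; the quoting rules themselves are standard SQLite behavior (a double quote inside an FTS5 string is escaped by doubling it, and LIKE needs an explicit `ESCAPE` clause for a backslash escape to apply).

```typescript
// Quote a user query as a single FTS5 string so that operators such as AND, OR,
// NEAR, '*', and ':' are treated as plain text rather than query syntax.
function toFts5Phrase(userQuery: string): string {
  return '"' + userQuery.replace(/"/g, '""') + '"';
}

// Escape LIKE metacharacters (%, _, and the escape character itself).
// The SQL must declare the escape character, e.g. `... LIKE ? ESCAPE '\'`.
function escapeLike(userText: string): string {
  return userText.replace(/[\\%_]/g, (ch) => '\\' + ch);
}
```

Both results are still passed as bound parameters; the escaping changes how SQLite interprets the pattern, not how the value reaches the statement.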
* fix: resolve CodeRabbit round-3 review comments
- Include failed_at_epoch in COALESCE for age-scoped purge
- Re-throw FTS5 errors so callers can distinguish failure from no-results
- Wrap all FTS fallback calls in SearchManager with try/catch
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* refactor: remove bearer auth and platform_source from context inject
Bearer token auth (#1932/#1933) added friction for all localhost API
clients with no benefit — the worker already binds localhost-only (CORS
restriction + host binding). Removed auth-token module, requireAuth
middleware, and Authorization headers from all internal callers.
platform_source filtering from the /api/context/inject path was never
used by any caller and silently filtered out observations. The underlying
platform_source column stays; only the query-time filter and its plumbing
through ContextBuilder, ObservationCompiler, SearchRoutes, context.ts,
and transcripts/processor.ts are removed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: resolve CodeRabbit + Greptile + claude-review comments on PR #2081
- middleware.ts: drop 'Authorization' from CORS allowedHeaders (Greptile)
- middleware.ts: rate limiter falls back to req.socket.remoteAddress; add Retry-After on 429 (claude-review)
- SearchRoutes.ts: drop leftover platformSource read+pass in handleContextPreview (Greptile)
- .docker-blowout-data/: stop tracking the empty SQLite placeholder and gitignore the dir (claude-review)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: tighten rate limiter — correct boundary + drop dead cleanup branch
- `entry.count >= RATE_LIMIT_MAX_REQUESTS` so the 300th request is the
first rejected (was 301).
- Removed the `requestCounts.size > 100` lazy-cleanup block — on a
localhost-only server the map tops out at 1–2 entries, so the branch
was dead code.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: rate limiter correctly allows exactly 300 req/min; doc localhost scope
- Check `entry.count >= max` BEFORE incrementing so the cap matches the
comment: 300 requests pass, the 301st gets 429.
- Added a comment noting the limiter is effectively a global cap on a
localhost-only worker (all callers share the 127.0.0.1/::1 bucket).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
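The boundary fix can be illustrated with a small fixed-window limiter. This is a standalone sketch, not the middleware itself: `allowRequest`, the `WindowEntry` shape, and the injected `now` argument are assumptions, while `RATE_LIMIT_MAX_REQUESTS` and `requestCounts` reuse names mentioned in these commits.

```typescript
const RATE_LIMIT_MAX_REQUESTS = 300;
const RATE_LIMIT_WINDOW_MS = 60 * 1000;

interface WindowEntry {
  count: number;
  windowStart: number;
}

// Keyed by client IP; on a localhost-only worker this is effectively one global bucket.
const requestCounts: Record<string, WindowEntry> = {};

function allowRequest(ip: string, now: number): boolean {
  let entry = requestCounts[ip];
  if (!entry || now - entry.windowStart >= RATE_LIMIT_WINDOW_MS) {
    entry = { count: 0, windowStart: now };
    requestCounts[ip] = entry;
  }
  // Check BEFORE incrementing: requests 1..300 pass, request 301 is rejected.
  if (entry.count >= RATE_LIMIT_MAX_REQUESTS) return false;
  entry.count += 1;
  return true;
}
```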
* fix: normalise IPv4-mapped IPv6 in rate limiter client IP
Strip the `::ffff:` prefix so a localhost caller routed as
`::ffff:127.0.0.1` shares a bucket with `127.0.0.1`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
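A sketch of the normalisation, assuming the client address arrives as a plain string; the function name is hypothetical.

```typescript
// Strip the IPv4-mapped IPv6 prefix so '::ffff:127.0.0.1' and '127.0.0.1'
// land in the same rate-limiter bucket. Other addresses pass through unchanged.
function normalizeClientIp(ip: string): string {
  const prefix = '::ffff:';
  return ip.toLowerCase().indexOf(prefix) === 0 ? ip.slice(prefix.length) : ip;
}
```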
* fix: size-guarded prune of rate limiter map for non-localhost deploys
Prune expired entries only when the map exceeds 1000 keys and we're
already doing a window reset, so the cost is zero on the localhost hot
path (1–2 keys) and the map can't grow unbounded if the worker is ever
bound on a non-loopback interface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removes the 300 req/min rate limiter from the worker's HTTP middleware.
The worker is localhost-only (enforced via CORS), so rate limiting was
pointless security theater — but it broke the viewer, which polls logs
and stats frequently enough to trip the limit within seconds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SessionStart context injection regressed in v12.3.3 — no memory
context is being delivered to new sessions. Rolling back to the
v12.3.2 tree state while the regression is investigated.
Reverts #2080.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048)
- Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916)
- Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048)
- Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956)
- Add periodic clearFailed() to purge stale pending_messages (#1957)
- Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874)
- ResponseProcessor: mark messages as failed (with retry) instead of confirming
when the LLM returns non-XML garbage (auth errors, rate limits) (#1874)
- Health endpoint: include activeSessions count for queue liveness monitoring (#1867)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: cache isFts5Available() at construction time
Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text
query. Result is now cached in _fts5Available at construction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve worker stability bugs — pool deadlock, MCP loopback, restart guard (#1868, #1876, #2053)
- Replace flat consecutiveRestarts counter with time-windowed RestartGuard:
only counts restarts within 60s window (cap=10), decays after 5min of
success. Prevents stranding pending messages on long-running sessions. (#2053)
- Add idle session eviction to pool slot allocation: when all slots are full,
evict the idlest session (no pending work, oldest activity) to free a slot
for new requests, preventing 60s timeout deadlock. (#1868)
- Fix MCP loopback self-check: use process.execPath instead of bare 'node'
which fails on non-interactive PATH. Fix crash misclassification by removing
false "Generator exited unexpectedly" error log on normal completion. (#1876)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve hooks reliability bugs — summarize exit code, session-init health wait (#1896, #1901, #1903, #1907)
- Wrap summarize hook's workerHttpRequest in try/catch to prevent exit
code 2 (blocking error) on network failures or malformed responses.
Session exit no longer blocks on worker errors. (#1901)
- Add health-check wait loop to UserPromptSubmit session-init command in
hooks.json. On Linux/WSL where hook ordering fires UserPromptSubmit
before SessionStart, session-init now waits up to 10s for worker health
before proceeding. Also wrap session-init HTTP call in try/catch. (#1907)
- Close #1896 as already-fixed: mtime comparison at file-context.ts:255-267
bypasses truncation when file is newer than latest observation.
- Close #1903 as no-repro: hooks.json correctly declares all hook events.
Issue was Claude Code 12.0.1/macOS platform event-dispatch bug.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: security hardening — bearer auth, path validation, rate limits, per-user port (#1932, #1933, #1934, #1935, #1936)
- Add bearer token auth to all API endpoints: auto-generated 32-byte
token stored at ~/.claude-mem/worker-auth-token (mode 0600). All hook,
MCP, viewer, and OpenCode requests include Authorization header.
Health/readiness endpoints exempt for polling. (#1932, #1933)
- Add path traversal protection: watch.context.path validated against
project root and ~/.claude-mem/ before write. Rejects ../../../etc
style attacks. (#1934)
- Reduce JSON body limit from 50MB to 5MB. Add in-memory rate limiter
(300 req/min/IP) to prevent abuse. (#1935)
- Derive default worker port from UID (37700 + uid%100) to prevent
cross-user data leakage on multi-user macOS. Windows falls back to
37777. Shell hooks use same formula via id -u. (#1936)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve search project filtering and import Chroma sync (#1911, #1912, #1914, #1918)
- Fix per-type search endpoints to pass project filter to Chroma queries
and SQLite hydration. searchObservations/Sessions/UserPrompts now use
$or clause matching project + merged_into_project. (#1912)
- Fix timeline/search methods to pass project to Chroma anchor queries.
Prevents cross-project result leakage when project param omitted. (#1911)
- Sync imported observations to ChromaDB after FTS rebuild. Import
endpoint now calls chromaSync.syncObservation() for each imported
row, making them visible to MCP search(). (#1914)
- Fix session-init cwd fallback to match context.ts (process.cwd()).
Prevents project key mismatch that caused "no previous sessions"
on fresh sessions. (#1918)
- Fix sync-marketplace restart to include auth token and per-user port.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve all CodeRabbit and Greptile review comments on PR #2080
- Fix run.sh comment mismatch (no-op flag vs empty array)
- Gate session-init on health check success (prevent running when worker unreachable)
- Fix date_desc ordering ignored in FTS session search
- Age-scope failed message purge (1h retention) instead of clearing all
- Anchor RestartGuard decay to real successes (null init, not Date.now())
- Add recordSuccess() calls in ResponseProcessor and completion path
- Prevent caller headers from overriding bearer auth token
- Add lazy cleanup for rate limiter map to prevent unbounded growth
- Bound post-import Chroma sync with concurrency limit of 8
- Add doc_type:'observation' filter to Chroma queries feeding observation hydration
- Add FTS fallback to all specialized search handlers (observations, sessions, prompts, timeline)
- Add response.ok check and error handling in viewer saveSettings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CodeRabbit round-2 review comments
- Use failure timestamp (COALESCE) instead of created_at_epoch for stale purge
- Downgrade _fts5Available flag when FTS table creation fails
- Escape FTS5 MATCH input by quoting user queries as literal phrases
- Escape LIKE metacharacters (%, _, \) in prompt text search
- Add response.ok check in initial settings load (matches save flow)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve CodeRabbit round-3 review comments
- Include failed_at_epoch in COALESCE for age-scoped purge
- Re-throw FTS5 errors so callers can distinguish failure from no-results
- Wrap all FTS fallback calls in SearchManager with try/catch
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: resolve search, database, and docker bugs (#1913, #1916, #1956, #1957, #2048)
- Fix concept/concepts param mismatch in SearchManager.normalizeParams (#1916)
- Add FTS5 keyword fallback when ChromaDB is unavailable (#1913, #2048)
- Add periodic WAL checkpoint and journal_size_limit to prevent unbounded WAL growth (#1956)
- Add periodic clearFailed() to purge stale pending_messages (#1957)
- Fix nounset-safe TTY_ARGS expansion in docker/claude-mem/run.sh
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: prevent silent data loss on non-XML responses, add queue info to /health (#1867, #1874)
- ResponseProcessor: mark messages as failed (with retry) instead of confirming
when the LLM returns non-XML garbage (auth errors, rate limits) (#1874)
- Health endpoint: include activeSessions count for queue liveness monitoring (#1867)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: cache isFts5Available() at construction time
Addresses Greptile review: avoid DDL probe (CREATE + DROP) on every text
query. Result is now cached in _fts5Available at construction.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The extracted helper methods (handleInitResponse, processObservationMessage,
processSummaryMessage) lost the conversationHistory.push calls for assistant
replies, breaking multi-turn context for queryOpenRouterMultiTurn.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove spurious console.error in logger JSON.parse catch (expected control flow)
- Remove debug logging from hot PID cleanup loop (approved override)
- Replace unsafe `error as Error` casts with instanceof checks in ChromaSync, GeminiAgent, OpenRouterAgent
- Wrap non-Error FTS failures with new Error(String()) instead of dropping details
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(evals): SWE-bench Docker scaffolding for claude-mem resolve-rate measurement
Adds evals/swebench/ scaffolding per .claude/plans/swebench-claude-mem-docker.md.
Agent image builds Claude Code 2.1.114 + locally-built claude-mem plugin;
run-instance.sh executes the two-turn ingest/fix protocol per instance;
run-batch.py orchestrates parallel Docker runs with per-instance isolation;
eval.sh wraps the upstream SWE-bench harness; summarize.py aggregates reports.
Orchestrator owns JSONL writes under a lock to avoid racy concurrent appends;
agent writes its authoritative diff to CLAUDE_MEM_OUTPUT_DIR (/scratch in
container mode) and the orchestrator reads it back. Scaffolding only — no
Docker build or smoke test run yet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(evals): OAuth credential mounting for Claude Max/Pro subscriptions
Skips per-call API billing by extracting OAuth creds from host Keychain
(macOS) or ~/.claude/.credentials.json (Linux) and bind-mounting them
read-only into each agent container. Creds are copied into HOME=$SCRATCH/.claude
at container start so the per-instance isolation model still holds.
Adds run-batch.py --auth {oauth,api-key,auto} (auto prefers OAuth, falls
back to API key). run-instance.sh accepts either ANTHROPIC_API_KEY or
CLAUDE_MEM_CREDENTIALS_FILE. smoke-test.sh runs one instance end-to-end
using OAuth for quick verification before batch runs.
Caveat surfaced in docstrings: Max/Pro has per-window usage limits and is
framed for individual developer use — batch evaluation may exhaust the
quota or raise compliance questions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(docker): basic claude-mem container for ad-hoc testing
Adds docker/claude-mem/ with a fresh spin-up image:
- Dockerfile: FROM node:20 (reproduces anthropics/claude-code .devcontainer
pattern — Anthropic ships the Dockerfile, not a pullable image); layers
Bun + uv + locally-built plugin/; runs as non-root node user
- entrypoint.sh: seeds OAuth creds from CLAUDE_MEM_CREDENTIALS_FILE into
$HOME/.claude/.credentials.json, then exec's the command (default: bash)
- build.sh: npm run build + docker build
- run.sh: interactive launcher; auto-extracts OAuth from macOS Keychain
(security find-generic-password) or ~/.claude/.credentials.json on Linux,
mounts host .docker-claude-mem-data/ at /home/node/.claude-mem so the
observations DB survives container exit
Validated end-to-end: PostToolUse hook fires, queue enqueues, worker's SDK
compression runs under subscription OAuth, observations row lands with
populated facts/concepts/files_read, Chroma sync triggers.
Also updates .gitignore/.dockerignore for the new runtime-output paths.
Built plugin artifacts refreshed by the build step.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(evals/swebench): non-root user, OAuth mount, Lite dataset default
- Dockerfile.agent: switch to non-root `node` user (uid 1000); Claude Code
refuses --permission-mode bypassPermissions when euid==0, which made every
agent run exit 1 before producing a diff. Also move Bun + uv installs to
system paths so the non-root user can exec them.
- run-batch.py: add extract_oauth_credentials() that pulls from macOS
Keychain / Linux ~/.claude/.credentials.json into a temp file and bind-
mounts it at /auth/.credentials.json:ro with CLAUDE_MEM_CREDENTIALS_FILE.
New --auth {oauth,api-key,auto} flag. New --dataset flag so the batch can
target SWE-bench_Lite without editing the script.
- smoke-test.sh: default DATASET to princeton-nlp/SWE-bench_Lite (Lite
contains sympy__sympy-24152, Verified does not); accept DATASET env
override.
Caveat surfaced during testing: Max/Pro subscriptions have per-window usage
limits; running 5 instances in parallel with the "read every source file"
ingest prompt exhausted the 5h window within ~25 minutes (3/5 hit HTTP 429).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address PR #2076 review comments
- docker/claude-mem/run.sh: chmod 600 (not 644) on extracted OAuth creds
to match what `claude login` writes; avoids exposing tokens to other
host users. Verified readable inside the container under Docker
Desktop's UID translation.
- docker/claude-mem/Dockerfile: pin Bun + uv via --build-arg BUN_VERSION
/ UV_VERSION (defaults: 1.3.12, 0.11.7). Bun via `bash -s "bun-v<V>"`;
uv via versioned installer URL `https://astral.sh/uv/<V>/install.sh`.
- evals/swebench/smoke-test.sh: pipe JSON through stdin to `python3 -c`
so paths with spaces/special chars can't break shell interpolation.
- evals/swebench/run-batch.py: add --overwrite flag; abort by default
when predictions.jsonl for the run-id already exists, preventing
accidental silent discard of partial results.
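
A minimal sketch of the `--overwrite` guard from the last bullet. It is not the actual run-batch.py code: the `prepare_predictions_file` helper, the `--run-id` flag name, and the run-directory layout are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def prepare_predictions_file(run_dir: Path, overwrite: bool) -> Path:
    """Return the predictions path, refusing to clobber an existing file.

    Hypothetical helper; the real script's layout may differ.
    """
    predictions = run_dir / "predictions.jsonl"
    if predictions.exists() and not overwrite:
        # Abort by default so partial results are never discarded silently.
        raise SystemExit(
            f"{predictions} already exists; pass --overwrite to replace it"
        )
    run_dir.mkdir(parents=True, exist_ok=True)
    predictions.write_text("")  # start the run from an empty file
    return predictions


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--run-id", required=True)  # assumed flag name
    parser.add_argument("--overwrite", action="store_true")
    return parser
```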
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address coderabbit review on PR #2076
Actionable (4):
- Dockerfile uv install: wrap `chmod ... || true` in braces so the trailing
`|| true` no longer masks failures from `curl|sh` (in the shell, && and ||
have equal precedence and associate left, so `a && b || true` also swallows
a failure of `a`). Applied to both docker/claude-mem/ and
evals/swebench/Dockerfile.agent. Added `set -eux` to the RUN lines.
- docker/claude-mem/Dockerfile: drop unused `sudo` apt package (~2 MB).
- run-batch.py: name each agent container (`swebench-agent-<id>-<pid>-<tid>`)
and force-remove via `docker rm -f <name>` in the TimeoutExpired handler
so timed-out runs don't leave orphan containers.
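
The container-naming bullet above can be sketched roughly as follows. This is an illustration under assumptions, not the run-batch.py implementation: `run_agent`, its parameters, and the injectable `run` argument are hypothetical; only the name pattern and the `docker rm -f <name>` cleanup come from the description.

```python
import os
import subprocess
import threading


def run_agent(instance_id: str, image: str, timeout_s: float, run=subprocess.run) -> bool:
    """Run one agent container; force-remove it by name if the run times out.

    `run` is injectable so the sketch can be exercised without Docker.
    """
    name = f"swebench-agent-{instance_id}-{os.getpid()}-{threading.get_ident()}"
    try:
        run(["docker", "run", "--rm", "--name", name, image],
            timeout=timeout_s, check=True)
        return True
    except subprocess.TimeoutExpired:
        # On timeout only the docker CLI client is killed; the container can
        # keep running, so remove it explicitly by its known name.
        run(["docker", "rm", "-f", name], check=False)
        return False
```

Including the pid and thread id in the name keeps parallel workers from colliding on the same instance id.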
Nitpicks (2):
- smoke-test.sh: collapse 3 python3 invocations into 1 — parse the instance
JSON once, print `repo base_commit`, and write problem.txt in the same
call.
- run-instance.sh: shallow clone via `--depth 1 --no-single-branch` +
`fetch --depth 1 origin $BASE_COMMIT`. Falls back to a full clone if the
server rejects the by-commit fetch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address second coderabbit review on PR #2076
Actionable (3):
- docker/claude-mem/run.sh: on macOS, fall back to ~/.claude/.credentials.json
when the Keychain lookup misses (some setups still have file-only creds).
Unified into a single creds_obtained gate so the error surface lists both
sources tried.
- docker/claude-mem/run.sh: drop `exec docker run` — `exec` replaces the shell
so the EXIT trap (`rm -f "$CREDS_FILE"`) never fires and the extracted
OAuth JSON leaks to disk until tmpfs cleanup. Run as a child instead so
the trap runs on exit.
- evals/swebench/smoke-test.sh: actually enforce the TIMEOUT env var. Pick
`timeout` or `gtimeout` (coreutils on macOS), fall back to uncapped with
a warning. Name the container so exit-124 from timeout can `docker rm -f`
it deterministically.
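
The `exec` vs. child-process point has a direct Python analogue, sketched here for illustration only (run.sh itself is a shell script). `run_with_temp_creds` is a hypothetical helper, and the `CREDS_FILE` environment variable is used here only to hand the path to the child.

```python
import os
import subprocess
import tempfile


def run_with_temp_creds(creds_json: str, cmd: list[str]) -> int:
    """Write creds to a private temp file, run cmd as a child, always clean up."""
    fd, path = tempfile.mkstemp(suffix=".json")  # mkstemp creates the file 0600
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(creds_json)
        env = {**os.environ, "CREDS_FILE": path}
        # Run as a child rather than os.execvp(): exec would replace this
        # process, and the finally block below would never run.
        return subprocess.run(cmd, env=env).returncode
    finally:
        os.unlink(path)
```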
Nitpick from the same review (consolidated python3 calls in smoke-test.sh)
was already addressed in the prior commit ef621e00.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address third coderabbit review on PR #2076
Actionable (1):
- evals/swebench/smoke-test.sh: the consolidated python heredoc had competing
stdin redirections — `<<'PY'` (script body) AND `< "$INSTANCE_JSON"` (data).
The heredoc won, so `json.load(sys.stdin)` saw an empty stream and the parse
would have failed at runtime. Pass INSTANCE_JSON as argv[2] and `open()` it
inside the script instead; the heredoc is now only the script body, which
is what `python3 -` needs.
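
A sketch of what the heredoc body might look like after this fix, assuming the instance JSON carries the SWE-bench fields `repo`, `base_commit`, and `problem_statement`. The function names and the use of argv[1] for the problem.txt destination are assumptions; the source only states that INSTANCE_JSON arrives as argv[2].

```python
import json
from pathlib import Path


def extract_instance(instance_json: str, problem_out: str) -> str:
    """Parse the instance file once, write problem.txt, return 'repo base_commit'."""
    # The data comes from a path, not stdin, so the heredoc can own stdin.
    with open(instance_json, encoding="utf-8") as fh:
        instance = json.load(fh)
    Path(problem_out).write_text(instance["problem_statement"], encoding="utf-8")
    return f'{instance["repo"]} {instance["base_commit"]}'


def main(argv: list[str]) -> None:
    # With `python3 - OUT JSON`, sys.argv is ['-', OUT, JSON].
    # argv[1] as the output path is an assumption of this sketch.
    print(extract_instance(argv[2], argv[1]))
```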
Nitpicks (2):
- evals/swebench/smoke-test.sh: macOS Keychain lookup now falls through to
~/.claude/.credentials.json on miss (matches docker/claude-mem/run.sh).
- evals/swebench/run-batch.py: extract_oauth_credentials() no longer
early-returns on Darwin keychain miss; falls through to the on-disk creds
file so macOS setups with file-only credentials work in batch mode too.
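
The fall-through described in the nitpicks could look roughly like this. It is a sketch, not the real `extract_oauth_credentials()`: the Keychain service name, the injectable `keychain` and `system` parameters, and the error wording are assumptions.

```python
import platform
import subprocess
from pathlib import Path
from typing import Callable, Optional


def read_keychain() -> Optional[str]:
    """Ask the macOS Keychain for the credentials blob; None on a miss.

    The service name below is an assumption for illustration only.
    """
    result = subprocess.run(
        ["security", "find-generic-password", "-s", "Claude Code-credentials", "-w"],
        capture_output=True, text=True,
    )
    blob = result.stdout.strip()
    return blob if result.returncode == 0 and blob else None


def extract_oauth_credentials(
    keychain: Callable[[], Optional[str]] = read_keychain,
    creds_file: Path = Path.home() / ".claude" / ".credentials.json",
    system: Optional[str] = None,
) -> str:
    system = system or platform.system()
    tried = []
    if system == "Darwin":
        tried.append("macOS Keychain")
        blob = keychain()
        if blob:
            return blob
        # Keychain miss: fall through to the on-disk file instead of returning.
    tried.append(str(creds_file))
    if creds_file.is_file():
        return creds_file.read_text()
    raise RuntimeError("no OAuth credentials found; tried: " + ", ".join(tried))
```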
Functional spot-check of the parse fix confirmed: REPO/BASE_COMMIT populated
and problem.txt written from a synthetic INSTANCE_JSON.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: gitignore runtime state files
.claude/scheduled_tasks.lock is a PID+sessionId lock written by Claude
Code's cron scheduler every session. It got accidentally checked in during
the v12.0.0 bump and has been churning phantom diffs in every PR since.
Untrack it and ignore.
plugin/.cli-installed is a timestamp marker the claude-mem installer drops
to record when the plugin was installed. Never belonged in version control.
Ignore it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: add trailing newline to .gitignore
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parseSummary runs on every agent response, not just summary turns. When the
turn is a normal observation, the LLM correctly emits <observation> and no
<summary> — but the fallthrough branch from #1345 treated this as prompt
misbehavior and logged "prompt conditioning may need strengthening" every
time. That assumption stopped holding after #1633 refactored the caller to
always invoke parseSummary with a coerceFromObservation flag.
Gate the whole observation-on-summary path on coerceFromObservation. On a
real summary turn, coercion still runs and logs the legitimate "coercion
failed" warning when the response has no usable content. On an observation
turn, parseSummary returns null silently, which is the correct behavior.
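
The gating can be illustrated with a small Python sketch (the project code is TypeScript; this mirrors only the control flow described above). The regexes, the `warn` callback, and returning the raw tag content are simplifications assumed for the example.

```python
import re
from typing import Optional

SUMMARY_RE = re.compile(r"<summary>(.*?)</summary>", re.DOTALL)
OBSERVATION_RE = re.compile(r"<observation>(.*?)</observation>", re.DOTALL)


def parse_summary(text: str, coerce_from_observation: bool, warn=print) -> Optional[str]:
    match = SUMMARY_RE.search(text)
    if match:
        return match.group(1).strip()
    if not coerce_from_observation:
        # Ordinary observation turn: no <summary> is expected, so stay silent.
        return None
    obs = OBSERVATION_RE.search(text)
    if obs and obs.group(1).strip():
        # Summary turn that came back as an observation: coerce its content.
        return obs.group(1).strip()
    warn("coercion failed: response has no usable content")
    return None
```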
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: disable subagent summaries and label subagent observations
Detect Claude Code subagent hook context via `agent_id`/`agent_type` on
stdin, short-circuit the Stop-hook summary path when present, and thread
the subagent identity end-to-end onto observation rows (new `agent_type`
and `agent_id` columns, migration 010 at version 27). Main-session rows
remain NULL; content-hash dedup is unchanged.
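
For illustration, an idempotent column-adding migration of this shape can be sketched with Python's sqlite3 module. The real migration is TypeScript; the `add_agent_columns` helper and the TEXT column type are assumptions.

```python
import sqlite3

NEW_COLUMNS = ("agent_type", "agent_id")


def add_agent_columns(db: sqlite3.Connection) -> list[str]:
    """Add nullable agent columns to `observations`; safe to run repeatedly."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in db.execute("PRAGMA table_info(observations)")}
    added = []
    for column in NEW_COLUMNS:
        if column not in existing:
            # No default value, so existing and main-session rows stay NULL.
            db.execute(f"ALTER TABLE observations ADD COLUMN {column} TEXT")
            added.append(column)
    db.commit()
    return added
```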
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address PR #2073 review feedback
- Narrow the summarize subagent guard to agentId only, so main sessions
started with --agent still own their summary (agentType without agentId
means a main session).
- Remove now-dead agentId/agentType spreads from the summarize POST body.
- Always overwrite pendingAgentId/pendingAgentType in SDK/Gemini/OpenRouter
agents (clears stale subagent identity on main-session messages after
a subagent message in the same batch).
- Add idx_observations_agent_id index in migration 010 + the mirror
migration in SessionStore + the runner.
- Replace console.log in migration010 with logger.debug.
- Update summarize test: agentType alone no longer short-circuits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: address CodeRabbit + claude-review iteration 4 feedback
- SessionRoutes.handleSummarizeByClaudeId: narrow worker-side guard to
agentId only (matches hook-side). agentType alone = --agent main
session, which still owns its summary.
- ResponseProcessor: wrap storeObservations in try/finally so
pendingAgentId/Type clear even if storage throws. Prevents stale
subagent identity from leaking into the next batch on error.
- SessionStore.importObservation + bulk.importObservation: persist
agent_type/agent_id so backup/import round-trips preserve subagent
attribution.
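
The try/finally pattern from the ResponseProcessor bullet, sketched in Python for illustration. The class and attribute names are hypothetical stand-ins for the TypeScript originals.

```python
class ResponseProcessorSketch:
    def __init__(self, store):
        self.store = store  # callable(observations, agent_id, agent_type)
        self.pending_agent_id = None
        self.pending_agent_type = None

    def process(self, observations):
        try:
            self.store(observations, self.pending_agent_id, self.pending_agent_type)
        finally:
            # Runs on success and on error, so a failed batch cannot pass its
            # subagent identity on to the next one.
            self.pending_agent_id = None
            self.pending_agent_type = None
```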
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* polish: claude-review iteration 5 cleanup
- Use ?? not || for nullable subagent fields in PendingMessageStore
(prevents treating empty string as null).
- Simplify observation.ts body spread — include fields unconditionally;
JSON.stringify drops undefined anyway.
- Narrow any[] to Array<{ name: string }> in migration010 column checks.
- Add trailing newline to migrations.ts.
- Document in observations/store.ts why the dedup hash intentionally
excludes agent fields.
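
The `??` vs. `||` distinction in the first bullet has a close Python analogue, shown here as an illustration only: `or` replaces every falsy value, while an explicit `is None` check replaces only a missing one. Python's None stands in for both null and undefined.

```python
def or_fallback(value, fallback=None):
    # Analogue of `value || fallback`: "" is falsy, so it is replaced too.
    return value or fallback


def nullish_fallback(value, fallback=None):
    # Analogue of `value ?? fallback`: only a missing value is replaced.
    return fallback if value is None else value
```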
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* polish: claude-review iteration 7 feedback
- claude-code adapter: add 128-char safety cap on agent_id/agent_type
so a malformed Claude Code payload cannot balloon DB rows. Empty
strings are now also treated as absent.
- migration010: state-aware debug log lists only columns actually
added; idempotent re-runs log "already present; ensured indexes".
- Add 3 adapter tests covering the length cap boundary and empty-string
rejection.
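
A sketch of the adapter-side normalization, in Python for illustration. It assumes over-long values are dropped rather than truncated, which the description above does not specify; `normalize_agent_field` and the constant name are hypothetical.

```python
from typing import Optional

MAX_AGENT_FIELD_LENGTH = 128  # cap taken from the description above


def normalize_agent_field(value: object) -> Optional[str]:
    """Return a usable agent_id/agent_type, or None if absent or malformed.

    Assumption: values longer than the cap are dropped, not truncated.
    """
    if not isinstance(value, str) or value == "":
        return None
    if len(value) > MAX_AGENT_FIELD_LENGTH:
        return None
    return value
```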
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf: skip subagent summary before worker bootstrap
Move the agentId short-circuit above ensureWorkerRunning() so a Stop
hook fired inside a subagent does not trigger worker startup just to
return early. Addresses CodeRabbit nit on summarize.ts:36-47.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Initial plan
* fix: break infinite summary-retry loop (#1633)
Three-part fix:
1. Parser coercion: When the LLM returns <observation> tags instead of
<summary>, coerce observation content into summary fields (root cause fix)
2. Stronger summary prompt: Add clearer tag requirements with warnings
3. Circuit breaker: Track consecutive summary failures per session,
skip further attempts after 3 failures to prevent unbounded prompt growth
Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77
Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>
* refactor: extract shared constants for summary mode marker and failure threshold
Addresses code review feedback: SUMMARY_MODE_MARKER and
MAX_CONSECUTIVE_SUMMARY_FAILURES are now defined once in sdk/prompts.ts
and imported by ResponseProcessor and SessionManager.
Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/e345e8ec-bc97-4eaa-94bd-6e951fda8f77
Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>
* fix: guard summary failure counter on summaryExpected (Greptile P1)
The circuit breaker counter previously incremented on any response
containing <observation> or <summary> tags — which matches virtually
every normal observation response. After 3 observations the breaker
would open and permanently block summarization, reproducing the
data-loss scenario #1633 was meant to prevent.
Gate the increment block on summaryExpected (already computed for
parseSummary coercion) so the counter only tracks actual summary
attempts.
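
The gated counter can be sketched as follows (Python for illustration; the real code is TypeScript). The `SummaryBreaker` class and its method names are hypothetical; the threshold of 3 and the constant name come from the commits above.

```python
MAX_CONSECUTIVE_SUMMARY_FAILURES = 3


class SummaryBreaker:
    def __init__(self):
        self.consecutive_failures = 0  # in-memory only

    def record(self, summary_expected: bool, summary_stored: bool) -> None:
        if not summary_expected:
            return  # ordinary observation turn: not a summary attempt
        if summary_stored:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= MAX_CONSECUTIVE_SUMMARY_FAILURES
```

Without the `summary_expected` check, three ordinary observation turns would be enough to open the breaker.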
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test: cover circuit-breaker + apply review polish
- Use findLast / at(-1) for last-user-message lookup instead of
filter + index (O(1) common case).
- Drop redundant `|| 0` fallback — field is required and initialized.
- Add comment noting counter is ephemeral by design.
- Add ResponseProcessor tests covering:
* counter NOT incrementing on normal observation responses
(regression guard for the Greptile P1)
* counter incrementing when a summary was expected but missing
* counter resetting to 0 on successful summary storage
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: iterate all observation blocks; don't count skip_summary as failure
Addresses CodeRabbit review on #2072:
- coerceObservationToSummary now iterates all <observation> blocks
with a global regex and returns the first block that has title,
narrative, or facts. Previously, an empty leading observation
would short-circuit and discard populated follow-ups.
- Circuit-breaker counter now treats explicit <skip_summary/> as
neutral — neither a failure nor a success — so a run that happens
to end on a skip doesn't punish the session or mask a prior bad
streak. Real failures (no summary, no skip) still increment.
- Tests added for both cases.
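
The block iteration from the first bullet, sketched in Python for illustration. It shows only how the first populated block is selected; how its fields map onto summary fields is left out, and the helper names are hypothetical.

```python
import re
from typing import Optional

OBSERVATION_RE = re.compile(r"<observation>(.*?)</observation>", re.DOTALL)
FIELDS = ("title", "narrative", "facts")


def _field(block: str, name: str) -> str:
    m = re.search(rf"<{name}>(.*?)</{name}>", block, re.DOTALL)
    return m.group(1).strip() if m else ""


def first_populated_observation(text: str) -> Optional[dict]:
    # finditer visits every block, so an empty leading block is skipped
    # instead of ending the search.
    for match in OBSERVATION_RE.finditer(text):
        fields = {name: _field(match.group(1), name) for name in FIELDS}
        if any(fields.values()):
            return fields
    return None
```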
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test: reference SUMMARY_MODE_MARKER constant instead of hardcoded string
Addresses CodeRabbit nitpick: tests should pull the marker from the
canonical source so they don't silently drift when the constant is
renamed or edited.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: also coerce observations when <summary> has empty sub-tags
When the LLM wraps an empty <summary></summary> around real observation
content, the #1360 empty-subtag guard rejects the summary and returns
null — which would lose the observation content and resurrect the
#1633 retry loop. Fall back to coerceObservationToSummary in that
branch too, mirroring the unmatched-<summary> path.
Adds a test covering the empty-summary-wraps-observation case and
a guard test for empty summary with no observation content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>
Co-authored-by: Alex Newman <thedotmack@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conductor workspace setup is no longer needed - plugins handle hook
registration directly via plugin/hooks/hooks.json. The shim was copying
a stale settings.local.json into every worktree, registering dead hook
paths (save-hook.js, new-hook.js, summary-hook.js) that no longer exist.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Script now reads existing CHANGELOG.md, skips releases already documented,
only fetches bodies for new releases, and prepends them. Pass --full to
force complete regeneration.
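
A rough sketch of the incremental behaviour, in Python for illustration (the script itself is JavaScript). The heading convention, the shape of `releases`, and the `fetch_body` callback are assumptions, and the sketch ignores any title block at the top of the file.

```python
import re

# Assumed heading convention for this sketch: "## [1.2.3]" or "## v1.2.3".
VERSION_HEADING = re.compile(r"^## \[?v?(\d+\.\d+\.\d+)\]?", re.MULTILINE)


def update_changelog(existing: str, releases: list[dict], fetch_body, full: bool = False) -> str:
    """Prepend sections for releases that `existing` does not document yet.

    `releases` is newest-first, each {"tag": "v1.2.3"}; `fetch_body(tag)` is a
    hypothetical callback returning the release notes for one tag.
    """
    documented = set() if full else set(VERSION_HEADING.findall(existing))
    new_sections = [
        # fetch_body runs only for releases that pass the filter below.
        f"## [{r['tag'].lstrip('v')}]\n\n{fetch_body(r['tag'])}\n"
        for r in releases
        if r["tag"].lstrip("v") not in documented
    ]
    base = "" if full else existing
    return "\n".join(new_sections + ([base] if base else []))
```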
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>