eliott/claude-mem

mirror of https://github.com/thedotmack/claude-mem synced 2026-04-25 17:15:04 +02:00

Author	SHA1	Message	Date
Alex Newman	c648d5d8d2	feat: Knowledge Agents — queryable corpora from claude-mem (#1653 ) * feat: add knowledge agent types, store, builder, and renderer Phase 1 of Knowledge Agents feature. Introduces corpus compilation pipeline that filters observations from the database into portable corpus files stored at ~/.claude-mem/corpora/. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add corpus CRUD HTTP endpoints and wire into worker service Phase 2 of Knowledge Agents. Adds CorpusRoutes with 5 endpoints (build, list, get, delete, rebuild) and registers them during worker background initialization alongside SearchRoutes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add KnowledgeAgent with V1 SDK prime/query/reprime Phase 3 of Knowledge Agents. Uses Agent SDK V1 query() with resume and disallowedTools for Q&A-only knowledge sessions. Auto-reprimes on session expiry. Adds prime, query, and reprime HTTP endpoints to CorpusRoutes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add MCP tools and skill for knowledge agents Phase 4 of Knowledge Agents. Adds build_corpus, list_corpora, prime_corpus, and query_corpus MCP tools delegating to worker HTTP endpoints. Includes /knowledge-agent skill with workflow docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: handle SDK process exit in KnowledgeAgent, add e2e test The Agent SDK may throw after yielding all messages when the Claude process exits with a non-zero code. Now tolerates this if session_id/answer were already captured. Adds comprehensive e2e test script (31 assertions) orchestrated via tmux-cli. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: use settings model ID instead of hardcoded model in KnowledgeAgent Reads CLAUDE_MEM_MODEL from user settings via getModelId(), matching the existing SDKAgent pattern. No more hardcoded model assumptions. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: improve knowledge agents developer experience Add public documentation page, rebuild/reprime MCP tools, and actionable error messages. DX review scored knowledge agents 4/10 — core engineering works (31/31 e2e) but the feature was invisible. This addresses discoverability (docs, cross-links), API completeness (missing MCP tools), and error quality (fix/example fields in error responses). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs: add quick start guide to knowledge agents page Covers the three main use cases upfront: creating an agent, asking a single question, and starting a fresh conversation with reprime. Includes keeping-it-current section for rebuild + reprime workflow. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address code review issues — path traversal, session safety, prompt injection - Block path traversal in CorpusStore with alphanumeric name validation and resolved path check - Harden system prompt against instruction injection from untrusted corpus content - Validate question field as non-empty string in query endpoint - Only persist session_id after successful prime (not null on failure) - Persist refreshed session_id after query execution - Only auto-reprime on session resume errors, not all query failures - Add fenced code block language tags to SKILL.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address remaining code review issues — e2e robustness, MCP validation, docs - Harden e2e curl wrappers with connect-timeout, fallback to HTTP 000 on transport failure - Use curl_post wrapper consistently for all long-running POST calls - Add runtime name validation to all corpus MCP tool handlers - Fix docs: soften hallucination guarantee to probabilistic claim - Fix architecture diagram: add missing rebuild_corpus and reprime_corpus tools Co-Authored-By: Claude Opus 4.6 (1M context) 
<noreply@anthropic.com> * fix: enforce string[] type in safeParseJsonArray for corpus data integrity Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add blank line before fenced code blocks in SKILL.md maintenance section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 17:30:20 -07:00
Alex Newman	29f2d0bc02	chore: bump version to 12.0.1 Patch release for the MCP server bun:sqlite crash fix landed in PR #1645 (commit `abd55977`). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 18:10:04 -07:00
Alex Newman	abd55977ca	fix(mcp): MCP server crashes with Cannot find module 'bun:sqlite' under Node (#1645 ) * fix(mcp): MCP server crashes with Cannot find module 'bun:sqlite' under Node The MCP server bundle (mcp-server.cjs) ships with `#!/usr/bin/env node` so it must run under Node, but commit `2b60dd29` added an import of `ensureWorkerStarted` from worker-service.ts. That import transitively pulls in DatabaseManager → bun:sqlite, blowing up at top-level require under Node. The bundle ballooned from ~358KB (v11.0.1) to ~1.96MB (v12.0.0) and crashed on every spawn, breaking the MCP server entirely for Codex/MCP-only clients and any flow that boots the MCP tool surface. Fix: 1. Extract `ensureWorkerStarted` and the Windows spawn-cooldown helpers into a new lightweight module `src/services/worker-spawner.ts` that only imports from infrastructure/ProcessManager, infrastructure/HealthMonitor, shared/, and utils/logger — no SQLite, no ChromaSync, no DatabaseManager. 2. The new helper takes the worker script path explicitly so callers running under Node (mcp-server) can pass `worker-service.cjs` while callers already inside the worker (worker-service self-spawn) pass `__filename`. worker-service.ts keeps a thin wrapper for back-compat. 3. mcp-server.ts now imports from worker-spawner.js and resolves WORKER_SCRIPT_PATH via __dirname so the daemon can be auto-started for MCP-only clients without dragging in the entire worker bundle. 4. resolveWorkerRuntimePath() now searches for Bun on every platform (not just Windows). worker-service.cjs requires Bun at runtime, so when the spawner is invoked from a Node process the Unix branch can no longer fall through to process.execPath (= node). 5. spawnDaemon's Unix branch now calls resolveWorkerRuntimePath() instead of hardcoding process.execPath, fixing the same Node-spawning-Node bug for the actual subprocess launch on Linux/macOS. 
After: - mcp-server.cjs is 384KB again with zero `bun:sqlite` references - node mcp-server.cjs initializes and serves tools/list + tools/call (verified via JSON-RPC against the running worker) - ProcessManager test suite updated for the new cross-platform Bun resolution behavior; full suite has the same pre-existing failures as main, no regressions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(mcp): address PR #1645 review feedback (round 1) Per Claude Code Review on PR #1645: 1. mcp-server.ts: log a warning when both __dirname and import.meta.url resolution fail. The cwd() fallback is essentially dead code for the CJS bundle but if it ever fires it gives the user a breadcrumb instead of a silently-wrong WORKER_SCRIPT_PATH. 2. mcp-server.ts: existsSync check on WORKER_SCRIPT_PATH at module load. Surfaces a clear "worker-service.cjs not found at expected path" log line for partial installs / dev environments instead of letting the failure surface as a generic spawnDaemon error later. 3. ProcessManager.ts: explanatory comment on the Windows `return 0` sentinel in spawnDaemon. Documents that PowerShell Start-Process doesn't return a PID and that callers MUST use `pid === undefined` for failure detection — never falsy checks like `if (!pid)`. Items 4 (no direct unit tests for the worker-spawner Windows cooldown helpers) and 5 (process-manager.test.ts uses real ~/.claude-mem path) are deferred — the reviewer flagged the latter as out of scope, and the former needs an injectable-I/O refactor that isn't appropriate for a hotfix bugfix PR. Verified: build clean, mcp-server.cjs still 384KB / zero bun:sqlite, JSON-RPC tools/list still returns the 7-tool surface, ProcessManager test suite still 43/43. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(spawner): mkdir CLAUDE_MEM_DATA_DIR before writing Windows cooldown marker Per CodeRabbit on PR #1645: on a fresh user profile, the data dir may not exist yet when markWorkerSpawnAttempted() runs. writeFileSync would throw ENOENT, the catch would swallow it, and the marker would never be created — defeating the popup-loop protection this helper exists to provide. mkdirSync(dir, { recursive: true }) is a no-op when the directory already exists, so it's safe to call on every spawn attempt. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * docs(spawner): add APPROVED OVERRIDE annotations for cooldown marker catches Per CodeRabbit on PR #1645: silent catch blocks at spawn-cooldown sites should carry the APPROVED OVERRIDE annotation that the rest of the codebase uses (see ProcessManager.ts:689, BaseRouteHandler.ts:82, ChromaSync.ts:288). Both catches are intentional best-effort: - markWorkerSpawnAttempted: if mkdir/writeFileSync fails, the worker spawn itself will almost certainly fail too. Surfacing that downstream is far more useful than a noisy log line about a lock file. - clearWorkerSpawnAttempted: a stale marker is harmless. Worst case is one suppressed retry within the cooldown window, then self-heals. No behaviour change. Resolves the second half of CodeRabbit's lines 38-65 comment on worker-spawner.ts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): address PR #1645 review feedback (round 2) Round 2 of Claude Code Review feedback on PR #1645: Build guardrail (most important — protects the regression this PR fixes): - scripts/build-hooks.js: post-build check that fails the build if mcp-server.cjs ever contains a `bun:sqlite` reference. This is the exact regression PR #1645 fixed; future contributors will get an immediate, actionable error if a transitive import re-introduces it. Verified the check trips when violated. 
Code clarity: - src/servers/mcp-server.ts: drop dead `_originalLog` capture — it was never restored. Less code is fewer bugs. - src/servers/mcp-server.ts: elevate `cwd()` fallback log from WARN to ERROR. Per reviewer: a wrong WORKER_SCRIPT_PATH means worker auto-start silently fails, so the breadcrumb should be loud and searchable. - src/services/worker-service.ts: extended doc comment on the `ensureWorkerStartedShared(port, __filename)` wrapper explaining why `__filename` is the correct script path here (CJS bundle = compiled worker-service.cjs) and why mcp-server.ts can't use the same trick. - src/services/infrastructure/ProcessManager.ts: inline comment on the `env.BUN === 'bun'` bare-command guard explaining why it's reachable even though `isBunExecutablePath('bun')` is true (pathExists returns false for relative names, so the second branch is what fires). Coverage: - src/services/infrastructure/ProcessManager.ts: add `/usr/bin/bun` to the Linux candidate paths so apt-installed Bun on Debian/Ubuntu is found without falling through to the PATH lookup. Out-of-scope items (deferred with rationale in PR replies): - Unit tests for ensureWorkerStarted / Windows cooldown helpers — needs injectable-I/O refactor unsuitable for a hotfix. - Sentinel object for Windows spawnDaemon `0` — broader API change. - Windows Scoop install path — follow-up for a future PR. - runOneTimeChromaMigration placement, aggressiveStartupCleanup, console.log redirect timing, platform timeout multiplier — all pre-existing and unrelated to this regression. Verified: build clean, guardrail trips on simulated violation, mcp-server.cjs still 0 bun:sqlite refs, ProcessManager tests 43/43. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): address PR #1645 review feedback (round 3) Round 3 of Claude Code Review feedback on PR #1645: ProcessManager.ts: improve actionability of "Bun not found" errors Both Windows and Unix branches of spawnDaemon previously logged a vague "Failed to locate Bun runtime" message when resolveWorkerRuntimePath() returned null. Replaced with an actionable message that names the install URL and explains why Bun is required (worker uses bun:sqlite). The existing null-guard at the call sites already prevents passing null to child_process.spawn — only the error text changed. scripts/build-hooks.js: refine bun:sqlite guardrail to match actual require() calls only The previous coarse `includes('bun:sqlite')` check tripped on its own improved error message, which legitimately mentions "bun:sqlite" by name. Switched to a regex that matches `require("bun:sqlite")` / `require('bun:sqlite')` (with optional whitespace, handles both quote styles, handles minified output) so error messages and inline comments can reference the module name without false positives. Verified the regex still trips on real violations (both spaced and minified forms) and correctly ignores string-literal mentions. Other round-3 items (verified, not changed): - TOOL_ENDPOINT_MAP: reviewer flagged as dead code, but it IS used at lines 250 and 263 by the search and timeline tool handlers. False positive — kept as-is. - if (!pid) callsites: grepped src/, zero offenders. The Windows `0` PID sentinel contract is safe; only the in-line documentation comment in ProcessManager.ts mentions the anti-pattern. - callWorkerAPIPost double-wrapping: pre-existing intentional behavior (only used by /api/observations/batch which returns raw data, not the MCP {content:[...]} shape). Unrelated to this regression. 
- Snap path / startParentHeartbeat / main().catch / test for non- existent workerScriptPath / etc — pre-existing or out of scope for this hotfix, deferred per established disposition. Verified: build clean, guardrail still trips on real violations, mcp-server.cjs has 0 require("bun:sqlite") calls, JSON-RPC tools/list returns the 7-tool surface, ProcessManager tests 43/43. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(spawnDaemon): contract test for Windows 0 PID success sentinel Per CodeRabbit nitpick on PR #1645 commit `7a96b3b9`: add a focused test that documents the spawnDaemon return contract so any future contributor who introduces `if (!pid)` against a spawnDaemon return value (or its wrapper) sees a failing assertion explaining why the falsy check is incorrect. The test deliberately exercises the JS-level semantics rather than mocking PowerShell — a true mocked Windows test would require refactoring spawnDaemon to take an injectable execSync, which is a larger change than this hotfix should carry. The contract assertions here catch the same regression class (treating Windows success as failure) without that refactor. Verified: bun test tests/infrastructure/process-manager.test.ts now passes 44/44 (was 43/43). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): address PR #1645 review feedback (round 4) Round 4 of Claude Code Review feedback on PR #1645 (review of round-3 commit `193286f9`): tests/infrastructure/process-manager.test.ts: replace require('fs') with the already-imported statSync. Reviewer correctly flagged that the file uses ESM-style named imports everywhere else and the inline require() calls would break under strict ESM. Two callsites updated in the touchPidFile test. src/services/infrastructure/ProcessManager.ts: hoist resolveWorkerRuntimePath() and the `Bun runtime not found` error handling out of both branches in spawnDaemon. 
Both Windows and Unix branches need the same Bun lookup, and resolving once before the OS branch split avoids a duplicate execSync('which bun')/where bun in the no-well-known-path fallback. The error message is also DRY now — single source of truth instead of two near-identical strings. CodeRabbit confirmed in its previous reply that "All actionable items across all four review rounds are fully resolved" — these two minor items from claude-review of round 3 are the only remaining cleanup. Verified: build clean, ProcessManager tests still 44/44. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): address PR #1645 review feedback (round 5) Round 5 of Claude Code Review feedback on PR #1645: src/services/worker-spawner.ts: drop `export` from internal helpers `shouldSkipSpawnOnWindows`, `markWorkerSpawnAttempted`, and `clearWorkerSpawnAttempted` were exported even though they were private in worker-service.ts and nothing outside this module needs them. Removing the `export` keyword keeps the public surface to just `ensureWorkerStarted` and prevents future callers from bypassing the spawn lifecycle. scripts/build-hooks.js: broaden guardrail to all bun:* modules Previously the regex only caught `require("bun:sqlite")`, but every module in the `bun:` namespace (bun:ffi, bun:test, etc.) is Bun-only and would crash mcp-server.cjs the same way under Node. Generalized the regex to `require("bun:[a-z][a-z0-9_-]")` so a transitive import of any Bun-only module fails the build instead of shipping a broken bundle. Verified the new regex still trips on bun:sqlite, bun:ffi, bun:test, and correctly ignores string-literal mentions in error messages. src/servers/mcp-server.ts: attribute root cause when dirname resolution fails Previously, if `__dirname`/`import.meta.url` resolution failed and we fell back to `process.cwd()`, the user would see two warnings: an error about the dirname fallback AND a separate warning about the missing worker bundle. 
The second warning hides the root cause — someone debugging would assume the install is broken when really it's a dirname-resolution failure. Track the failure with a flag and emit a single root-cause-attributing log line in the existence-check branch instead. The dirname fallback paths are still functionally unreachable in CJS deployment; this just makes the failure mode unmistakable if it ever does fire. Out of scope (consistent with prior rounds): - darwin/linux split for non-Windows candidate paths (benign today) - Integration test for non-existent workerScriptPath (test coverage gap deferred since rounds 1-2) - Defer existsSync check to first ensureWorkerStarted call (current module-init check is the loud signal we want) Already addressed in earlier rounds: - resolveWorkerRuntimePath() called twice in spawnDaemon → hoisted in round 4 (`b2c114b4`) - _originalLog dead code → removed in round 2 (`7a96b3b9`) Verified: build clean, broadened guardrail trips on bun:sqlite, bun:ffi, and bun:test (and ignores string literals), MCP server serves the 7-tool surface, ProcessManager tests still 44/44. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(mcp): address PR #1645 review feedback (round 6) Round 6 of Claude Code Review feedback on PR #1645: src/services/worker-spawner.ts: validate workerScriptPath at entry Add an empty-string + existsSync guard at the top of ensureWorkerStarted. Without this, a partial install or upstream path-resolution regression just surfaces as a low-signal child_process error from spawnDaemon. The explicit log line at the entry point makes that class of bug much easier to diagnose. The mcp-server.ts module-init existsSync check already covers this for the MCP-server caller, but defending at the spawner level reinforces the contract for any future caller. 
src/services/worker-spawner.ts: document SettingsDefaultsManager dependency boundary in the module header The spawner imports from SettingsDefaultsManager, ProcessManager, and HealthMonitor. None of those currently touch bun:sqlite, but if any of them ever does, the spawner's SQLite-free contract silently breaks. The build guardrail in build-hooks.js is the only thing that catches it. Header comment now flags this so future contributors audit transitive imports when adding helpers from the shared/infrastructure layers. src/services/infrastructure/ProcessManager.ts: add /snap/bin/bun Ubuntu Snap install path. Now alongside the existing apt path (/usr/bin/bun) and Homebrew/Linuxbrew paths. The PATH lookup catches it as fallback, but listing it explicitly avoids paying for an execSync('which bun') in the common case. src/servers/mcp-server.ts: elevate missing-bundle log warn → error A missing worker-service.cjs means EVERY MCP tool call that needs the worker silently fails. That's a broken-install state, not a transient condition — match the severity of the dirname-fallback branch above (which is already ERROR). Out of scope (consistent with prior rounds, reviewer agrees these are appropriately deferred): - Streaming bundle read in build-hooks.js (nit at current 384KB size) - Unit tests for ensureWorkerStarted / cooldown helpers - Integration test for non-existent workerScriptPath Verified: build clean, broadened guardrail still trips on bun:* imports and ignores string literals, MCP server serves the 7-tool surface, ProcessManager tests still 44/44. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): defer WORKER_SCRIPT_PATH check to first call (round 7) Round 7 of Claude Code Review feedback on PR #1645: src/servers/mcp-server.ts: extract module-level existsSync check into checkWorkerScriptPath() and call it lazily from ensureWorkerConnection() instead of at module load. 
The early-warning intent is preserved (the check still fires before any actual spawn attempt), but tests/tools that import this module without booting the MCP server no longer see noisy ERROR-level log lines for a worker bundle they never intended to start. The check is cheap and idempotent, so calling it on every auto-start attempt is fine. The two failure-mode branches (dirname-resolution failure vs simple missing-bundle) remain unchanged — the function body is identical to the previous module-level if-block, just hoisted into a function and called from ensureWorkerConnection(). False positive (no change needed): - Reviewer flagged `mkdirSync` as a dead import in worker-spawner.ts, but it IS used at line 71 in markWorkerSpawnAttempted (the round-1 ENOENT fix CodeRabbit explicitly asked for). Out of scope: - Volta path (~/.volta/bin/bun) — PATH fallback handles it; nit per reviewer - worker-spawner.ts unit tests — needs injectable I/O, deferred consistently since round 1 Verified: build clean, tests 44/44, smoke test 7-tool surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): address PR #1645 review feedback (round 8) Round 8 of Claude Code Review feedback on PR #1645: tests/services/worker-spawner.test.ts: NEW FILE — unit tests for the ensureWorkerStarted entry-point validation guards added in round 6. Covers the empty-string and non-existent-path cases without requiring the broader injectable-I/O refactor that the deeper spawn lifecycle tests would need. 2 new passing tests. src/services/infrastructure/ProcessManager.ts: memoize resolveWorkerRuntimePath() for the no-options call site (which is what spawnDaemon uses). Caches both successful resolutions and the not-found result so repeated spawn attempts (crash loops, health thrashing) don't repeatedly hit statSync on candidate paths. Tests that pass options bypass the cache entirely so existing test cases remain deterministic. 
Added resetWorkerRuntimePathCache() exported for test isolation only. src/servers/mcp-server.ts: rename checkWorkerScriptPath() → warnIfWorkerScriptMissing(). Per reviewer: the old name implied a boolean check but the function returns void and has side effects. New name is more accurate. DEFENDED (no change made): - Reviewer asked to elevate process.cwd() fallback to a synchronous throw at module load. This conflicts with round 7 feedback which asked to defer the existsSync check to first call to avoid noisy test logs. The current lazy approach is the right compromise: it fires before any actual spawn attempt, attributes the root cause, and doesn't pollute test imports. Throwing at module load would crash before stdio is wired up, which is much harder to debug than the lazy log line. - Reviewer asked to grep for `if (!pid)` callsites — already verified in round 3, zero offenders in src/. Out of scope: - Volta path (~/.volta/bin/bun) — PATH fallback handles it; reviewer marked as nit - Deeper unit tests for ensureWorkerStarted spawn lifecycle (PID file cleanup, health checks, etc.) — needs injectable I/O, deferred consistently since round 1 Verified: build clean, ProcessManager tests still 44/44, new worker-spawner tests 2/2, smoke test serves 7 tools. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(spawner): clear Windows cooldown marker on all healthy paths (round 9) Round 9 of PR #1645 review feedback. src/services/worker-spawner.ts: clear stale Windows cooldown marker on every healthy-return path Per CodeRabbit (genuine bug): The .worker-start-attempted marker was previously only cleared after a spawn initiated by ensureWorkerStarted itself succeeded. If a previous auto-start failed, then the worker became healthy via another session or a manual start, the early-return success branches (existing live PID, fast-path health check, port-in-use waitForHealth) would leave the stale marker behind. 
A subsequent genuine outage inside the 2-minute cooldown window would then be incorrectly suppressed on Windows. Now calls clearWorkerSpawnAttempted() on all three healthy success paths in addition to the existing post-spawn path. The function is already a no-op on non-Windows, so the change is risk-free for Linux and macOS callers. src/servers/mcp-server.ts: more actionable error when auto-start fails Per claude-review: when ensureWorkerStarted returns false (or throws), the caller currently logs a generic "Worker auto-start failed" line. Updated both error sites to explicitly call out which MCP tools will fail (search/timeline/get_observations) and to point at earlier log lines for the specific cause. Helps users distinguish "worker is just not running" from "tools are broken". DEFENDED (no change): - Sentinel object for Windows spawnDaemon 0 PID — broader API change, out of scope, deferred consistently since round 1 - Spawner lifecycle tests beyond input validation — needs injectable I/O, deferred consistently - Concurrent cooldown marker race on Windows — pre-existing, out of scope - stripHardcodedDirname() regex fragility assertion — pre-existing, out of scope Verified: build clean, ProcessManager tests 44/44, worker-spawner tests 2/2, smoke test 7-tool surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(spawner): don't cache null Bun-not-found result (round 10) Round 10 of PR #1645 review feedback. src/services/infrastructure/ProcessManager.ts: only cache successful resolveWorkerRuntimePath() results Genuine bug from claude-review: the round-8 memoization cached BOTH successful resolutions AND the not-found `null` result. 
If Bun isn't on PATH at the moment the MCP server first tries to spawn the worker — e.g., on a fresh install where the user installs Bun in another terminal and retries — every subsequent ensureWorkerConnection call would return the cached `null` and fail with a misleading "Bun not found" error even though Bun is now available. The fix is the one-line change the reviewer suggested: only cache when `result !== null`. Crash loops still get the fast-path memoized success; recovery from a fresh-install Bun install still works. src/servers/mcp-server.ts: rename warnIfWorkerScriptMissing → errorIfWorkerScriptMissing Per claude-review: the function uses logger.error but the name says "warn" — name/level mismatch. Renamed to match. The function still serves the same purpose (defensive lazy check), just with an accurate name. DEFENDED (no change): - Discriminated union for mcpServerDirResolutionFailed flag — current approach works, the noise is minimal, and the alternative would add type complexity for a path that's functionally unreachable in CJS deployment - macOS /usr/local/bin/bun "missing" — already in the Linux/macOS candidate list at line 137 (false positive from reviewer) - nix store path — out of scope, PATH fallback handles it - Long build-hooks.js error message — verbosity is intentional, this message only fires on a real regression and the diagnostic value is worth the line wrap - Spawner lifecycle test coverage gap — needs injectable I/O, deferred consistently Verified: build clean, ProcessManager tests 44/44, worker-spawner tests 2/2, smoke test 7-tool surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(mcp): bundle size budget guardrail (round 11) Round 11 of PR #1645 review feedback. 
scripts/build-hooks.js: secondary bundle-size budget guardrail Per claude-review: the existing `require("bun:")` regex catches the specific regression class we already know about, but if esbuild ever changes how it emits external module specifiers, the regex could silently miss the regression. A bundle-size budget catches the structural symptom (worker-service.ts dragged into the bundle blew the size from ~358KB to ~1.96MB) regardless of how the imports look. Set the ceiling at 600KB. Current size is ~384KB; the broken v12.0.0 bundle was ~1920KB. Plenty of headroom for legitimate growth without incentivizing bundle bloat or false positives. Both guardrails fire independently — one is regex-based, one is size-based — so a regression has to defeat both to ship. tests/services/worker-spawner.test.ts: comment about port irrelevance Per claude-review: the hardcoded port values in the validation-guard tests are arbitrary because the path validation short-circuits before any network I/O. Added a comment explaining this so future readers don't waste time wondering why specific ports were picked. DEFENDED (no change): - clearWorkerSpawnAttempted on the unhealthy-live-PID return path: reviewer asked to clear the marker here too, but the current behavior is correct. The marker tracks "recently attempted a spawn" and exists to prevent rapid PowerShell-popup loops. If a wedged process is currently using the port, the spawn isn't actually happening on this code path (the helper returns false without reaching the spawn step). When the wedged process eventually dies and a subsequent call hits the spawn path, the marker correctly suppresses repeated retry attempts within the 2-minute cooldown. Clearing the marker on the unhealthy-return path would defeat exactly the popup-loop protection the marker exists to provide. - execSync in lookupBinaryInPath blocks event loop: pre-existing concern, not introduced by this PR. Reviewer notes "fires once, result cached". 
Not in scope for a hotfix. - Tracking issue for spawner lifecycle test gap: out of scope for this PR; the gap is documented in the test file's header comment with a back-reference to PR #1645. Verified: build clean, both guardrails functional (size budget is under the new ceiling), ProcessManager tests 44/44, worker-spawner tests 2/2, smoke test 7-tool surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> fix(mcp): eliminate double error log when worker bundle is missing (round 12) Round 12 of PR #1645 review feedback. src/servers/mcp-server.ts: errorIfWorkerScriptMissing() now only logs when the dirname-fallback attribution path is needed Previously a missing worker-service.cjs would produce two ERROR log lines on the same code path: 1. errorIfWorkerScriptMissing() in ensureWorkerConnection() 2. The existsSync guard inside ensureWorkerStarted() The simple "missing bundle" case is fully covered by the spawner's own existsSync guard. The mcp-server.ts function now ONLY logs when mcpServerDirResolutionFailed is true — that's the mcp-server-specific root-cause attribution that the spawner cannot provide on its own. Net effect: same single error log per bug class, cleaner triage. DEFENDED (no change): - mkdirSync error propagation in markWorkerSpawnAttempted: reviewer worried that mkdirSync/writeFileSync exceptions could escape, but the entire body is already wrapped in try/catch with an APPROVED OVERRIDE annotation. False positive. - clearWorkerSpawnAttempted on healthy paths: reviewer asked a clarifying question, not a change request. The behavior is intentional — the cooldown marker exists to prevent rapid PowerShell-popup loops from a series of failed spawns; a healthy worker means the marker has served its purpose and a future outage should NOT be suppressed. Will explain in PR reply. 
- __filename ESM concern in worker-service.ts wrapper: already documented in round 4 with an extended comment about the CJS bundle context and why mcp-server.ts can't use the same trick. - Spawn lifecycle integration tests: deferred consistently since round 1; gap is documented in worker-spawner.test.ts header. Verified: build clean, ProcessManager tests 44/44, worker-spawner tests 2/2, smoke test 7-tool surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(spawner): add bare-command BUN env override coverage Final round of PR #1645 review feedback: while preparing to merge, I noticed CodeRabbit's round-5 CHANGES_REQUESTED review on commit `3570d2f0` included an unaddressed nitpick — the env-driven bare-command branch in resolveWorkerRuntimePath() (returning a bare 'bun' unchanged when BUN or BUN_PATH is set that way) had no test coverage and could regress without any failing assertion. Added a focused test that exercises the env: { BUN: 'bun' } branch specifically. 47/47 tests pass (was 46/46). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 18:08:36 -07:00
Alex Newman	25ccf46ac0	chore: bump version to 12.0.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:22:27 -07:00
Alex Newman	a0e895b53b	fix: enhance title sanitization per PR #1641 review (round 4) Collapse multiple whitespace, trim, and increase max length to 160 chars for observation titles in file-context deny reason. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:18:22 -07:00
Alex Newman	753a993647	fix: address PR #1641 review comments (round 3) - Fix migration version conflict: addSessionPlatformSourceColumn now uses v25 - Sanitize observation titles in file-context deny reason (strip newlines, limit length) - Guard json_each() with LIKE '[%' check for legacy bare-path rows - Guard /stream SSE endpoint with 503 before DB initialization - Scope bun-runner signal exit handling to start subcommand only - Normalize platformSource at route boundary in DataRoutes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:16:41 -07:00
Alex Newman	d0676aa049	feat: file-read gate allows Edit, add legacy-peer-deps for grammar install - Change file-read gate from deny to allow with limit:1, injecting the observation timeline as additionalContext. Edit now works on gated files since the file registers as "read" with near-zero token cost. - Add updatedInput to HookResult type for PreToolUse hooks. - Add .npmrc with legacy-peer-deps=true for tree-sitter peer dep conflicts. - Add --legacy-peer-deps to npm fallback paths in smart-install.js so end users without bun can install the 24 grammar packages. - Rebuild plugin artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:06:07 -07:00
Alex Newman	25bb93a995	fix: address PR #1641 review comments (round 2) - Remove duplicate TranscriptWatcher/config imports in worker-service.ts - Use normalizePlatformSource in handleSessionInitByClaudeId for consistency - Don't skip DB completion when session not in memory (completeByClaudeId) - Add try-catch around fetch in useContextPreview refresh callback - Deduplicate store.getAllProjects() call in DataRoutes - Fix malformed comment separators in migration runner - Fix missing closing brace and JSDoc opener (merge artifact) in migration runner Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 13:22:58 -07:00
Alex Newman	c21e49d9fa	fix: address PR review comments and add file read gate docs Fix indentation bugs flagged in PR review (SettingsDefaultsManager, MigrationRunner), add current date/time to file read gate timeline so the model can judge observation recency, and add documentation for the file read gate feature. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 13:09:46 -07:00
Alex Newman	cbb68ad9e1	fix: worker startup crash and missing observation columns Two bugs fixed: 1. SessionCompletionHandler called dbManager.getSessionStore() during WorkerService construction, before DB initialization. Changed to accept DatabaseManager and defer the call to runtime. 2. migration009 (generated_by_model, relevance_count columns) only ran via the deprecated MigrationRunner path, never through SessionStore's migration chain. Added addObservationModelColumns() to SessionStore constructor. Checks column existence directly since schema_versions may have been marked applied without the ALTER TABLE succeeding. Also removed duplicate transcriptWatcher declaration and shutdown block (merge artifact). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 12:20:10 -07:00
Alex Newman	052da384b2	chore: rebuild worker-service.cjs with filePath escaping fix Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 11:17:21 -07:00
Alex Newman	e3475180cd	fix: address PR review: day sort, path canonicalization, dead code cleanup - Sort within-day observations chronologically (was specificity-ordered) - Canonicalize relative paths to POSIX format before DB lookup - Skip projects param when allProjects is empty (prevents cross-project leaks) - Remove dead stderrMessage field and hook-command block (unused after permissionDecision switch) - Type permissionDecision as 'allow' | 'deny' union instead of string - Remove redundant non-null assertions in getObservationsByFilePath - Add edit guidance to deny message (use sed via Bash with smart tools) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 01:59:30 -07:00
Alex Newman	ef1b427a2a	fix: update timeline deny message to route to smart tools The deny reason is the routing surface — show all cheaper exits: semantic priming from the timeline, get_observations for details, and smart_outline/smart_unfold for current code structure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 00:25:55 -07:00
Alex Newman	455aeaf654	fix: remove per-session gate, use permissionDecision deny for every read The per-session FileReadGate was never requested and broke the cost savings loop — subsequent reads in the same session silently bypassed the timeline, hiding newly created observations. Now the timeline fires on every read that has observations, using the hook contract's permissionDecision: "deny" with the timeline as the reason (exit 0 + JSON) instead of exit code 2 + stderr. - Delete FileReadGate.ts entirely - Remove /api/file-context/gate endpoint from DataRoutes - Switch handler from exit code 2 to permissionDecision: "deny" - Restore permissionDecision fields to HookResult - Eliminate one HTTP round-trip per read (no gate check needed) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 22:05:40 -07:00
Alex Newman	31910fb265	fix: address PR review feedback: path safety, SQL injection, gate scoping - Resolve relative filePath against input.cwd before statSync; early-return on ENOENT - Replace LIKE '%path%' with exact json_each equality to prevent false matches - Sanitize and parameterize LIMIT to prevent NaN SQL errors - Fix day-sorting to use earliest epoch in group, not first (specificity-sorted) item - Use exact path equality in deduplicateObservations instead of substring includes - Scope FileReadGate by session+cwd to prevent worktree collisions - Refresh lastAccess TTL on active sessions; throttle prune to every 50 calls - Type params as (string | number)[] instead of any[] - Remove unused permissionDecision fields from HookResult Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 17:29:59 -07:00
Alex Newman	a60f79c44d	feat: file-size threshold and observation dedup for timeline gate - Skip gate for files under 1,500 bytes — timeline (~370 tokens) costs more than just reading small files directly - Deduplicate observations by memory_session_id (one per session) - Rank by specificity: files_modified > files_read, fewer tagged files > many - Fetch 40 candidates, dedup/score down to 15 for display - Reduce default by-file query limit from 30 to 15 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 13:29:28 -07:00
Alex Newman	18aa5dc4e7	chore: bump version to 11.0.1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 21:08:32 -07:00
Alex Newman	0f9745535a	fix: disable semantic inject by default — experimental feature not ready for all users The per-prompt Chroma vector search injection on UserPromptSubmit adds latency and context noise. Disable by default while we iterate on a more precise file-context approach. Users can still opt in via CLAUDE_MEM_SEMANTIC_INJECT=true. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 21:05:03 -07:00
Alex Newman	a7ebc35ee0	chore: bump version to 11.0.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 19:39:28 -07:00
Alex Newman	3b34feb779	chore: rebuild plugin artifacts for v10.7.2 with Alessandro's stability PRs (#1607) Rebuilt worker-service, mcp-server, and viewer-bundle to include: - SIGTERM drain for orphaned pending messages (#1567) - Multi-machine sync script (#1570) - 3 upstream bug fixes: summarize loop, ChromaSync duplicates, TOCTOU port check (#1566) - Semantic context injection via Chroma (#1568) - Tier routing by queue complexity (#1569) - Architecture overview + production guide docs (#1574) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 19:36:32 -07:00
Alex Newman	b385570884	chore: bump version to 10.7.2 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 19:22:50 -07:00
Alex Newman	29ef3f5603	fix: downgrade concept-type cleanup log from error to debug (#1606) The parser correctly strips observation types from concepts arrays when the LLM ignores the prompt instruction. This is routine data normalization, not an error, so downgrade to debug to reduce log noise. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 19:21:38 -07:00
Alex Newman	bedca129ac	chore: bump version to 10.7.1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 18:59:00 -07:00
Alex Newman	c5129ed016	chore: bump version to 10.7.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:58:05 -07:00
Alex Newman	2495f98496	refactor: consolidate MCP factory, add non-TTY support, auto-detect transcript watchers - Phase 1: Replace 5 duplicate MCP installers with config-driven factory, extract shared context-injection and json-utils utilities, fix process.execPath usage - Phase 2: Add non-TTY fallback for @clack/prompts to prevent ENOENT in CI/Docker - Phase 3: Wire GeminiCliHooksInstaller through hook command framework with adapter - Phase 4: Auto-start transcript watchers on worker boot when config exists Net -107 lines via DRY consolidation of duplicated installer logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 00:35:55 -07:00
Alex Newman	a2ac116aac	fix: move summary wait + session-complete into Stop hook to prevent lost summaries SessionEnd has a 1.5s hardcoded cap from Claude Code (CLAUDE_CODE_SESSIONEND_HOOKS_TIMEOUT_MS), making it unsuitable for waiting on async work. Previously, the Stop hook would fire-and-forget the summarize request, then SessionEnd would immediately call deleteSession — aborting the SDK agent mid-summary. Now the Stop hook (120s timeout, no cap) owns the full lifecycle: 1. Queue summarize request 2. Poll new GET /api/sessions/status endpoint until queue drains 3. Call /api/sessions/complete after summary finishes SessionEnd is now a true fire-and-forget fallback (process.exit(0) immediately). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 14:05:53 -07:00
Alex Newman	8265fc7aa1	Merge remote-tracking branch 'origin/thedotmack/npx-gemini-cli' into thedotmack/npx-gemini-cli Resolve merge conflicts in adapter index, gemini-cli adapter, and rebuilt CJS artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 13:47:49 -07:00
Alex Newman	76a880a3d6	feat: update install CLI, ESM compat, and Gemini CLI docs Fixes CursorHooksInstaller ESM compatibility, updates install command with improved path resolution, and refreshes built plugin artifacts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 12:38:45 -07:00
Alex Newman	ddb57ea598	chore: bump version to 10.6.3 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 19:49:24 -07:00
Alex Newman	0321f4266d	fix: remove import.meta.url banner from CJS files run by Node.js The MCP server (#!/usr/bin/env node) and context generator run under Node.js, where import.meta.url throws SyntaxError in CJS mode. Only the worker-service needs the banner since it runs under Bun. CJS files under Node.js already have __dirname/__filename natively. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 18:32:43 -07:00
Alex Newman	07ab7000a8	fix: patch 7 critical bugs affecting all non-dev-machine users and Windows 1. Fix esbuild inlining build-machine __dirname as string literal — use CJS-compatible runtime banner with require("node:url").fileURLToPath across worker-service, mcp-server, and context-generator builds. 2. Fix isMainModule check missing .cjs extension and Windows backslash path normalization. 3. Wrap extractLastMessage in try-catch to prevent infinite Stop hook feedback loop on malformed transcripts (exit 0 instead of exit 2). 4. Replace heavy SessionEnd hook (Node→Bun→1.7MB CJS→HTTP) with lightweight inline node -e one-liner (~200ms vs >1s). 5. Add 7 Gemini/OpenRouter error patterns to unrecoverablePatterns circuit breaker to prevent 77K+ retry loops on expired API keys. 6. Preserve CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_CODE_GIT_BASH_PATH in sanitizeEnv instead of stripping them with the CLAUDE_CODE_ prefix. 7. Use PowerShell -EncodedCommand for spawnDaemon to fix path quoting when Windows usernames contain spaces. Closes #1515, #1495, #1475, #1465, #1500, #1513, #1512, #1450, #1460, #1486, #1449, #1481, #1451, #1480, #1453, #1445 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 15:20:29 -07:00
Conductor	5621b67ccd	Saving uncommitted changes before archiving	2026-03-26 19:35:27 -07:00
Alex Newman	0524fa83cd	chore: bump version to 10.6.2 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 14:14:09 -07:00
Alex Newman	4d7bec4d05	fix: stop spinner from spinning forever (#1440 ) * fix: stop spinner from spinning forever due to orphaned DB messages The activity spinner never stopped because isAnySessionProcessing() queried ALL pending/processing messages in the database, including orphaned messages from dead sessions that no generator would ever process. Root cause: isAnySessionProcessing() used hasAnyPendingWork() which is a global DB scan. Changed it to use getTotalQueueDepth() which only checks sessions in the active in-memory Map. Additional fixes: - Add terminateSession() to enforce restart-or-terminate invariant - Fix 3 zombie paths in .finally() handler that left sessions alive - Clean up idle sessions from memory on successful completion - Remove redundant bare isProcessing:true broadcast - Replace inline require() with proper accessor - Add 8 regression tests for session termination invariant Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings — idle-timeout race, double broadcast, query amplification - Move pendingCount check before idle-timeout termination to prevent abandoning fresh messages that arrive between idle abort and .finally() - Move broadcastProcessingStatus() inside restart branch only — the else branch already broadcasts via removeSessionImmediate callback - Compute queueDepth once in broadcastProcessingStatus() and derive isProcessing from it, eliminating redundant double iteration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 14:13:10 -07:00
Alex Newman	9f529a30f5	feat: strip <system_instruction> tags before DB storage (#1398) * feat: strip <system_instruction> tags before database storage Extends the existing tag-stripping mechanism (used for <private> and <claude-mem-context>) to also filter Conductor-injected system instructions, preventing them from being persisted in the observation database. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: also strip <system-instruction> (hyphen variant) before DB storage Conductor uses both <system_instruction> and <system-instruction> tag formats. This adds the hyphen variant to the same stripping mechanism. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 12:08:25 -07:00
Alex Newman	d54e574251	chore: bump version to 10.6.1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 14:36:23 -07:00
Alex Newman	7e07210635	feat: add timeline-report skill with token economics, compress context output 53% ## Summary - New timeline-report skill for generating narrative project history reports - Compressed markdown context output ~53% (tables → flat compact lines, verbose labels → terse format) - Added `full=true` param to /api/context/inject for fetching all observations - Split TimelineRenderer into separate markdown/color rendering paths - Removed arbitrary file write vulnerability (dump_to_file param) - Fixed timestamp ditto marker leaking across session summary boundaries ## Review - Rebased on main (v10.6.0) to preserve OpenClaw system prompt injection - Reviewed by /review (gstack) + /octo:review (Codex, Gemini, Claude fleet) - Security fix (dump_to_file removal) confirmed by all 3 reviewers - Timestamp bug caught by Codex, fixed 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-03-18 13:57:20 -07:00
Alex Newman	8c79b99384	chore: bump version to 10.6.0 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-17 17:15:27 -07:00
Alex Newman	5ccd81b8a3	chore: bump version to 10.5.6 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 14:54:32 -07:00
Alex Newman	678ae1e7d3	chore: rebuild worker-service.cjs with latest source changes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 14:53:03 -07:00
Alex Newman	80a8c90a1a	feat: add embedded Process Supervisor for unified process lifecycle (#1370 ) * feat: add embedded Process Supervisor for unified process lifecycle management Consolidates scattered process management (ProcessManager, GracefulShutdown, HealthMonitor, ProcessRegistry) into a unified src/supervisor/ module. New: ProcessRegistry with JSON persistence, env sanitizer (strips CLAUDECODE_* vars), graceful shutdown cascade (SIGTERM → 5s wait → SIGKILL with tree-kill on Windows), PID file liveness validation, and singleton Supervisor API. Fixes #1352 (worker inherits CLAUDECODE env causing nested sessions) Fixes #1356 (zombie TCP socket after Windows reboot) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add session-scoped process reaping to supervisor Adds reapSession(sessionId) to ProcessRegistry for killing session-tagged processes on session end. SessionManager.deleteSession() now triggers reaping. Tightens orphan reaper interval from 60s to 30s. Fixes #1351 (MCP server processes leak on session end) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Unix domain socket support for worker communication Introduces socket-manager.ts for UDS-based worker communication, eliminating port 37777 collisions between concurrent sessions. Worker listens on ~/.claude-mem/sockets/worker.sock by default with TCP fallback. All hook handlers, MCP server, health checks, and admin commands updated to use socket-aware workerHttpRequest(). Backwards compatible — settings can force TCP mode via CLAUDE_MEM_WORKER_TRANSPORT=tcp. Fixes #1346 (port 37777 collision across concurrent sessions) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: remove in-process worker fallback from hook command Removes the fallback path where hook scripts started WorkerService in-process, making the worker a grandchild of Claude Code (killed by sandbox). 
Hooks now always delegate to ensureWorkerStarted() which spawns a fully detached daemon. Fixes #1249 (grandchild process killed by sandbox) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add health checker and /api/admin/doctor endpoint Adds 30-second periodic health sweep that prunes dead processes from the supervisor registry and cleans stale socket files. Adds /api/admin/doctor endpoint exposing supervisor state, process liveness, and environment health. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test: add comprehensive supervisor test suite 64 tests covering all supervisor modules: process registry (18 tests), env sanitizer (8), shutdown cascade (10), socket manager (15), health checker (5), and supervisor API (6). Includes persistence, isolation, edge cases, and cross-module integration scenarios. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: revert Unix domain socket transport, restore TCP on port 37777 The socket-manager introduced UDS as default transport, but this broke the HTTP server's TCP accessibility (viewer UI, curl, external monitoring). Since there's only ever one worker process handling all sessions, the port collision rationale for UDS doesn't apply. Reverts to TCP-only, removing ~900 lines of unnecessary complexity. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * chore: remove dead code found in pre-landing review Remove unused `acceptingSpawns` field from Supervisor class (written but never read — assertCanSpawn uses stopPromise instead) and unused `buildWorkerUrl` import from context handler. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * updated gitignore * fix: address PR review feedback - downgrade HTTP logging, clean up gitignore, harden supervisor - Downgrade request/response HTTP logging from info to debug to reduce noise - Remove unused getWorkerPort imports, use buildWorkerUrl helper - Export ENV_PREFIXES/ENV_EXACT_MATCHES from env-sanitizer, reuse in Server.ts - Fix isPidAlive(0) returning true (should be false) - Add shutdownInitiated flag to prevent signal handler race condition - Make validateWorkerPidFile testable with pidFilePath option - Remove unused dataDir from ShutdownCascadeOptions - Upgrade reapSession log from debug to warn - Rename zombiePidFiles to deadProcessPids (returns actual PIDs) - Clean up gitignore: remove duplicate datasets/, stale ~/ and http/ patterns - Fix tests to use temp directories instead of relying on real PID file Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 14:49:23 -07:00
Vincent Leraitre	237a4c37f8	fix: always pass --ssl flag to chroma-mcp in remote mode (#1286) * fix: always pass --ssl flag to chroma-mcp in remote mode The chroma-mcp CLI defaults to SSL when using --client-type http. When CLAUDE_MEM_CHROMA_SSL is false (the common case for local ChromaDB servers), buildCommandArgs() omitted --ssl entirely, causing chroma-mcp to attempt an SSL connection to a plain HTTP server and fail with "Could not connect to a Chroma server". Always pass --ssl with an explicit true/false value so the user's CLAUDE_MEM_CHROMA_SSL setting is faithfully forwarded. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * test: add regression tests for ChromaMcpManager SSL flag fix Adds 4 focused test cases verifying buildCommandArgs() produces correct --ssl args, covering SSL=false, SSL=true, unset (defaults to false), and local mode (no --ssl flag). Requested by @xkonjin in PR #1286 review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: rebuild checked-in bundles to include SSL flag fix Rebuild all bundles against upstream/main so the --ssl <true|false> fix is present in the runtime artifacts that hooks and the marketplace plugin actually execute. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 20:03:58 -07:00
Alex Newman	79bc3c85b3	chore: bump version to 10.5.5 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 03:02:03 -07:00
Alex Newman	6581d2ef45	fix: unify mode type/concept loading to always use mode definition (#1316) * fix: unify mode type/concept loading to always use mode definition Code mode previously read observation types/concepts from settings.json while non-code modes read from their mode JSON definition. This caused stale filters to persist when switching between modes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: remove dead observation type/concept settings constants CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES and OBSERVATION_CONCEPTS are no longer read by ContextConfigLoader since all modes now use their mode definition. Removes the constants, defaults, UI controls, and the now-empty observation-metadata.ts file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 03:00:20 -07:00
Alex Newman	3af68b7dfe	chore: bump version to 10.5.4 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-08 21:00:15 -07:00
Alex Newman	a32151a166	chore: bump version to 10.5.3	2026-03-08 19:35:56 -07:00
Alex Newman	6c7acfbc1c	chore: bump version to 10.5.2 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 22:23:15 -05:00
Alex Newman	a5e86ad4ab	chore: bump version to 10.5.1 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 21:08:19 -05:00
Alex Newman	272391ec9d	chore: bump version to 10.5.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 21:02:55 -05:00
Alex Newman	0e502dbd21	feat: add smart-explore AST-based code navigation (#1244 ) * feat: add smart-file-read module for token-optimized semantic code search - Created package.json for the smart-file-read module with dependencies and scripts. - Implemented parser.ts for code structure parsing using tree-sitter, supporting multiple languages. - Developed search.ts for searching code files and symbols with grep-style and structural matching. - Added test-run.mjs for testing search and outline functionalities. - Configured TypeScript with tsconfig.json for strict type checking and module resolution. * fix: update .gitignore to include _tree-sitter and remove unused subproject * feat: add preliminary results and skill recommendation for smart-explore module * chore: remove outdated plan.md file detailing session start hook issues * feat: update Smart File Read integration plan and skill documentation for smart-explore * feat: migrate Smart File Read to web-tree-sitter WASM for cross-platform compatibility * refactor: switch to tree-sitter CLI for parsing and enhance search functionality - Updated `parser.ts` to utilize the tree-sitter CLI for AST extraction instead of native bindings, improving compatibility and performance. - Removed grammar loading logic and replaced it with a path resolution for grammar packages. - Implemented batch parsing in `parseFilesBatch` to handle multiple files in a single CLI call, enhancing search speed. - Refactored `searchCodebase` to collect files and parse them in batches, streamlining the search process. - Adjusted symbol extraction logic to accommodate the new parsing method and ensure accurate symbol matching. 
* feat: update Smart File Read integration plan to utilize tree-sitter CLI for improved performance and cross-platform compatibility * feat: add smart-file-read parser and search to src/services Copy validated tree-sitter CLI-based parser and search modules from smart-file-read prototype into the claude-mem source tree for MCP tool integration. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: register smart_search, smart_unfold, smart_outline MCP tools Add 3 tree-sitter AST-based code exploration tools to the MCP server. Direct execution (no HTTP delegation) — they call parser/search functions directly for sub-second response times. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add tree-sitter CLI deps to build system and plugin runtime Externalize tree-sitter packages in esbuild MCP server build. Add 10 grammar packages + CLI to plugin package.json for runtime install. Remove unused @chroma-core/default-embed from plugin deps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: create smart-explore skill with 3-layer workflow docs Progressive disclosure workflow: search -> outline -> unfold. Documents all 3 MCP tools with parameters and token economics. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add comprehensive documentation for the smart-explore feature - Introduced a detailed technical reference covering the architecture, parser, search engine, and tool registration for the smart-explore feature in claude-mem. - Documented the three-layer workflow: search, outline, and unfold, along with their respective MCP tools. - Explained the parsing process using tree-sitter, including language support, query patterns, and symbol extraction. - Outlined the search module's functionality, including file discovery, batch parsing, and relevance scoring. - Provided insights into build system integration and token economics for efficient code exploration. 
* chore: remove experiment artifacts, prototypes, and plan files Remove A/B test docs, prototype smart-file-read directory, and implementation plans. Keep only production code. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * refactor: simplify hooks configuration and remove setup script * fix: use execFileSync to prevent command injection in tree-sitter parser Replaces execSync shell string with execFileSync + argument array, eliminating shell interpretation of file paths. Also corrects file_pattern description from "Glob pattern" to "Substring filter". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 21:00:26 -05:00
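
The last fix in the smart-explore commit above (replacing an execSync shell string with execFileSync plus an argument array) can be illustrated with a minimal TypeScript sketch. Everything here is an assumption for illustration: `runWithArgv` is a hypothetical helper, not the repository's parser code, and Node itself stands in for the tree-sitter CLI so the example runs wherever Node is installed. It shows that a file name containing shell metacharacters reaches the child process as one literal argument.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical helper (not from the repository): run a binary with an
// argument array. execFileSync does not spawn a shell by default, so
// characters such as ';' or '$' inside an argument are never interpreted.
function runWithArgv(binary: string, args: string[]): string {
  return execFileSync(binary, args, { encoding: "utf8" });
}

// A file name that would be dangerous if interpolated into a shell string.
const hostilePath = "src/a.ts; echo INJECTED";

// Stand-in for the real CLI: a Node one-liner that writes back its first argument.
const echoFirstArg = "process.stdout.write(process.argv[1])";

// The child receives hostilePath as a single argv entry, unchanged.
const echoed = runWithArgv(process.execPath, ["-e", echoFirstArg, hostilePath]);

console.log(echoed === hostilePath);
```

By contrast, concatenating the same path into a single command string for execSync would hand the `;` to the shell, which is the class of problem the commit describes.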
