Commit Graph

657 Commits

Author SHA1 Message Date
Alex Newman
79bc3c85b3 chore: bump version to 10.5.5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 03:02:03 -07:00
Alex Newman
6581d2ef45 fix: unify mode type/concept loading to always use mode definition (#1316)
* fix: unify mode type/concept loading to always use mode definition

Code mode previously read observation types/concepts from settings.json
while non-code modes read from their mode JSON definition. This caused
stale filters to persist when switching between modes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: remove dead observation type/concept settings constants

CLAUDE_MEM_CONTEXT_OBSERVATION_TYPES and OBSERVATION_CONCEPTS are no
longer read by ContextConfigLoader since all modes now use their mode
definition. Removes the constants, defaults, UI controls, and the
now-empty observation-metadata.ts file.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 03:00:20 -07:00
Alex Newman
3af68b7dfe chore: bump version to 10.5.4
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 21:00:15 -07:00
Alex Newman
e9b4f75fb2 fix: restore modes to plugin/modes/ from erroneous plugin/hooks/modes/ location
Modes were incorrectly moved to plugin/hooks/modes/ in v10.5.3, breaking
the release. This restores them to the correct plugin/modes/ directory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 20:52:24 -07:00
Alex Newman
a32151a166 chore: bump version to 10.5.3 2026-03-08 19:35:56 -07:00
Alex Newman
97ea9e45fc feat: add law-study mode for law students (#1305)
Adds a new claude-mem mode tailored for law school study sessions, with
observation types for case holdings, issue patterns, prof frameworks,
doctrine/rule synthesis, argument structures, and cross-case connections.
Includes a chill variant and a CLAUDE.md template for use as a legal
study partner.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-08 19:35:21 -07:00
Alex Newman
6c7acfbc1c chore: bump version to 10.5.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 22:23:15 -05:00
Alex Newman
44a7b2fcb9 docs: add smart-explore benchmark report and update skill with benchmark data
Add public benchmark report documenting the A/B comparison between Smart
Explore and the standard Explore agent (17.8x cheaper discovery, 19.4x
cheaper targeted reads). Update SKILL.md with benchmark-accurate token
economics, completeness guarantee, map-first principle, and Explore agent
escalation guidance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 22:22:41 -05:00
Alex Newman
a5e86ad4ab chore: bump version to 10.5.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:08:19 -05:00
Alex Newman
d93bde059e fix: restore hooks.json to pre-smart-explore configuration
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:08:01 -05:00
Alex Newman
272391ec9d chore: bump version to 10.5.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:02:55 -05:00
Alex Newman
0e502dbd21 feat: add smart-explore AST-based code navigation (#1244)
* feat: add smart-file-read module for token-optimized semantic code search

- Created package.json for the smart-file-read module with dependencies and scripts.
- Implemented parser.ts for code structure parsing using tree-sitter, supporting multiple languages.
- Developed search.ts for searching code files and symbols with grep-style and structural matching.
- Added test-run.mjs for testing search and outline functionalities.
- Configured TypeScript with tsconfig.json for strict type checking and module resolution.

* fix: update .gitignore to include _tree-sitter and remove unused subproject

* feat: add preliminary results and skill recommendation for smart-explore module

* chore: remove outdated plan.md file detailing session start hook issues

* feat: update Smart File Read integration plan and skill documentation for smart-explore

* feat: migrate Smart File Read to web-tree-sitter WASM for cross-platform compatibility

* refactor: switch to tree-sitter CLI for parsing and enhance search functionality

- Updated `parser.ts` to utilize the tree-sitter CLI for AST extraction instead of native bindings, improving compatibility and performance.
- Removed grammar loading logic and replaced it with a path resolution for grammar packages.
- Implemented batch parsing in `parseFilesBatch` to handle multiple files in a single CLI call, enhancing search speed.
- Refactored `searchCodebase` to collect files and parse them in batches, streamlining the search process.
- Adjusted symbol extraction logic to accommodate the new parsing method and ensure accurate symbol matching.

* feat: update Smart File Read integration plan to utilize tree-sitter CLI for improved performance and cross-platform compatibility

* feat: add smart-file-read parser and search to src/services

Copy validated tree-sitter CLI-based parser and search modules from
smart-file-read prototype into the claude-mem source tree for MCP
tool integration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: register smart_search, smart_unfold, smart_outline MCP tools

Add 3 tree-sitter AST-based code exploration tools to the MCP server.
Direct execution (no HTTP delegation) — they call parser/search
functions directly for sub-second response times.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add tree-sitter CLI deps to build system and plugin runtime

Externalize tree-sitter packages in esbuild MCP server build. Add
10 grammar packages + CLI to plugin package.json for runtime install.
Remove unused @chroma-core/default-embed from plugin deps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: create smart-explore skill with 3-layer workflow docs

Progressive disclosure workflow: search -> outline -> unfold.
Documents all 3 MCP tools with parameters and token economics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add comprehensive documentation for the smart-explore feature

- Introduced a detailed technical reference covering the architecture, parser, search engine, and tool registration for the smart-explore feature in claude-mem.
- Documented the three-layer workflow: search, outline, and unfold, along with their respective MCP tools.
- Explained the parsing process using tree-sitter, including language support, query patterns, and symbol extraction.
- Outlined the search module's functionality, including file discovery, batch parsing, and relevance scoring.
- Provided insights into build system integration and token economics for efficient code exploration.

* chore: remove experiment artifacts, prototypes, and plan files

Remove A/B test docs, prototype smart-file-read directory, and
implementation plans. Keep only production code.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: simplify hooks configuration and remove setup script

* fix: use execFileSync to prevent command injection in tree-sitter parser

Replaces execSync shell string with execFileSync + argument array,
eliminating shell interpretation of file paths. Also corrects
file_pattern description from "Glob pattern" to "Substring filter".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:00:26 -05:00
Alex Newman
50d1dfb7ee chore: bump version to 10.4.4
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 19:55:44 -05:00
Alex Newman
dd1b812443 chore: bump version to 10.4.3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:34:28 -05:00
Alex Newman
ad3d236cec fix: resolve hook crashes and CLAUDE_PLUGIN_ROOT fallback (#1215, #1220) (#1229)
* fix: resolve PostToolUse hook crashes and 5s latency (#1220)

Three compounding bugs caused hook failures:

1. Missing break statements in worker-service.ts switch — if async
   code threw before process.exit(), execution fell through to
   subsequent cases. Added break to all 7 cases missing them.

2. Unhandled promise rejection on main() — added .catch() that logs
   the error and exits 0 (per project exit code strategy: don't block
   Claude Code or leave Windows Terminal tabs open).

3. Redundant start commands in hooks.json — PostToolUse,
   UserPromptSubmit, and Stop groups each had a standalone start
   command that was redundant (the hook case already calls
   ensureWorkerStarted internally). The redundant start also caused
   5s latency via bun-runner.js collectStdin() timeout since Claude
   Code never closes stdin.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add CLAUDE_PLUGIN_ROOT fallback for Stop hooks (#1215)

Upstream Claude Code bug (anthropics/claude-code#24529) leaves
CLAUDE_PLUGIN_ROOT unset for Stop hooks on macOS and ALL hooks
on Linux. Two-layer defense:

1. Shell-level: hooks.json commands now use inline fallback
   _R="${CLAUDE_PLUGIN_ROOT}"; [ -z "$_R" ] && _R="$HOME/...";
   falling back to the known marketplace install path.

2. Script-level: bun-runner.js self-resolves plugin root from
   its own filesystem location via import.meta.url, and fixes
   broken /scripts/... paths that result from empty expansion.

Added test to verify all hook commands include the fallback path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:31:26 -05:00
Alex Newman
4144010264 chore: bump version to 10.4.1 2026-02-23 22:13:11 -05:00
Alex Newman
3c4486e69e feat: convert make-plan and do commands to skills (#1216) 2026-02-23 22:08:21 -05:00
Alex Newman
23f30d35b9 chore: bump version to 10.4.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 19:39:06 -05:00
Alex Newman
c6f932988a Fix 30+ root-cause bugs across 10 triage phases (#1214)
* MAESTRO: fix ChromaDB core issues — Python pinning, Windows paths, disable toggle, metadata sanitization, transport errors

- Add --python version pinning to uvx args in both local and remote mode (fixes #1196, #1206, #1208)
- Convert backslash paths to forward slashes for --data-dir on Windows (fixes #1199)
- Add CLAUDE_MEM_CHROMA_ENABLED setting for SQLite-only fallback mode (fixes #707)
- Sanitize metadata in addDocuments() to filter null/undefined/empty values (fixes #1183, #1188)
- Wrap callTool() in try/catch for transport errors with auto-reconnect (fixes #1162)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix data integrity — content-hash deduplication, project name collision, empty project guard, stuck isProcessing

- Add SHA-256 content-hash deduplication to observations INSERT (store.ts, transactions.ts, SessionStore.ts)
- Add content_hash column via migration 22 with backfill and index
- Fix project name collision: getCurrentProjectName() now returns parent/basename
- Guard against empty project string with cwd-derived fallback
- Fix stuck isProcessing: hasAnyPendingWork() resets processing messages older than 5 minutes
- Add 12 new tests covering all four fixes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix hook lifecycle — stderr suppression, output isolation, conversation pollution prevention

- Suppress process.stderr.write in hookCommand() to prevent Claude Code showing diagnostic
  output as error UI (#1181). Restores stderr in finally block for worker-continues case.
- Convert console.error() to logger.warn()/error() in hook-command.ts and handlers/index.ts
  so all diagnostics route to log file instead of stderr.
- Verified all 7 handlers return suppressOutput: true (prevents conversation pollution #598, #784).
- Verified session-complete is a recognized event type (fixes #984).
- Verified unknown event types return no-op handler with exit 0 (graceful degradation).
- Added 10 new tests in tests/hook-lifecycle.test.ts covering event dispatch, adapter defaults,
  stderr suppression, and standard response constants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix worker lifecycle — restart loop coordination, stale transport retry, ENOENT shutdown race

- Add PID file mtime guard to prevent concurrent restart storms (#1145):
  isPidFileRecent() + touchPidFile() coordinate across sessions
- Add transparent retry in ChromaMcpManager.callTool() on transport
  error — reconnects and retries once instead of failing (#1131)
- Wrap getInstalledPluginVersion() with ENOENT/EBUSY handling (#1042)
- Verified ChromaMcpManager.stop() already called on all shutdown paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Windows platform support — uvx.cmd spawn, PowerShell $_ elimination, windowsHide, FTS5 fallback

- Route uvx spawn through cmd.exe /c on Windows since MCP SDK lacks shell:true (#1190, #1192, #1199)
- Replace all PowerShell Where-Object {$_} pipelines with WQL -Filter server-side filtering (#1024, #1062)
- Add windowsHide: true to all exec/spawn calls missing it to prevent console popups (#1048)
- Add FTS5 runtime probe with graceful fallback when unavailable on Windows (#791)
- Guard FTS5 table creation in migrations, SessionSearch, and SessionStore with try/catch

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix skills/ distribution — build-time verification and regression tests (#1187)

Add post-build verification in build-hooks.js that fails if critical
distribution files (skills, hooks, plugin manifest) are missing. Add
10 regression tests covering skill file presence, YAML frontmatter,
hooks.json integrity, and package.json files field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix MigrationRunner schema initialization (#979) — version conflict between parallel migration systems

Root cause: old DatabaseManager migrations 1-7 shared schema_versions table with
MigrationRunner's 4-22, causing version number collisions (5=drop tables vs add column,
6=FTS5 vs prompt tracking, 7=discovery_tokens vs remove UNIQUE).  initializeSchema()
was gated behind maxApplied===0, so core tables were never created when old versions
were present.

Fixes:
- initializeSchema() always creates core tables via CREATE TABLE IF NOT EXISTS
- Migrations 5-7 check actual DB state (columns/constraints) not just version tracking
- Crash-safe temp table rebuilds (DROP IF EXISTS _new before CREATE)
- Added missing migration 21 (ON UPDATE CASCADE) to MigrationRunner
- Added ON UPDATE CASCADE to FK definitions in initializeSchema()
- All changes applied to both runner.ts and SessionStore.ts

Tests: 13 new tests in migration-runner.test.ts covering fresh DB, idempotency,
version conflicts, crash recovery, FK constraints, and data integrity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix 21 test failures — stale mocks, outdated assertions, missing OpenClaw guards

Server tests (12): Added missing workerPath and getAiStatus to ServerOptions
mocks after interface expansion. ChromaSync tests (3): Updated to verify
transport cleanup in ChromaMcpManager after architecture refactor. OpenClaw (2):
Added memory_ tool skipping and response truncation to prevent recursive loops
and oversized payloads. MarkdownFormatter (2): Updated assertions to match
current output. SettingsDefaultsManager (1): Used correct default key for
getBool test. Logger standards (1): Excluded CLI transcript command from
background service check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Codex CLI compatibility (#744) — session_id fallbacks, unknown platform tolerance, undefined guard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Cursor IDE integration (#838, #1049) — adapter field fallbacks, tolerant session-init validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix /api/logs OOM (#1203) — tail-read replaces full-file readFileSync

Replace readFileSync (loads entire file into memory) with readLastLines()
that reads only from the end of the file in expanding chunks (64KB → 10MB cap).
Prevents OOM on large log files while preserving the same API response shape.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix Settings CORS error (#1029) — explicit methods and allowedHeaders in CORS config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: add session custom_title for agent attribution (#1213) — migration 23, endpoint + store support

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: prevent CLAUDE.md/AGENTS.md writes inside .git/ directories (#1165)

Add .git path guard to all 4 write sites to prevent ref corruption when
paths resolve inside .git internals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix plugin disabled state not respected (#781) — early exit check in all hook entry points

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix UserPromptSubmit context re-injection on every turn (#1079) — contextInjected session flag

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* MAESTRO: fix stale AbortController queue stall (#1099) — lastGeneratorActivity tracking + 30s timeout

Three-layer fix:
1. Added lastGeneratorActivity timestamp to ActiveSession, updated by
   processAgentResponse (all agents), getMessageIterator (queue yields),
   and startGeneratorWithProvider (generator launch)
2. Added stale generator detection in ensureGeneratorRunning — if no
   activity for >30s, aborts stale controller, resets state, restarts
3. Added AbortSignal.timeout(30000) in deleteSession to prevent
   indefinite hang when awaiting a stuck generator promise

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 19:34:35 -05:00
Alex Newman
50eeed97e7 MAESTRO: fix cache-based install — resolveRoot, post-install verification, CLI path
Replace hardcoded marketplace path in plugin/scripts/smart-install.js with
resolveRoot() that uses CLAUDE_PLUGIN_ROOT env var (set by Claude Code for
all hooks), with fallback to script location and legacy paths. Fixes #1128,
#1166 where cache installs couldn't find or install node_modules.

Also fixes installCLI() path (ROOT/plugin/scripts/ → ROOT/scripts/) and adds
verifyCriticalModules() post-install check with npm fallback on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:28:51 -05:00
Alex Newman
2db9d0e383 chore: bump version to 10.3.3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 03:47:48 -05:00
Alex Newman
c2c3e3069c chore: bump version to 10.3.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 03:32:22 -05:00
Alex Newman
7966c6cba9 fix: rename save_memory and fix MCP search instructions + startup hook (#1210)
* fix: rename save_memory to save_observation and fix MCP search instructions

Stop the primary agent from proactively saving memories by renaming
save_memory to save_observation with a neutral description. Remove
"Saving Memories" section from SKILL.md. Update context formatters
and output styles to reference the mem-search skill instead of raw
MCP tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: split SessionStart hooks so smart-install failure doesn't block worker start

smart-install.js and worker-start were in the same hook group, so if
smart-install exited non-zero the worker never started. Split into
separate hook groups so they run independently.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: worker startup waits for readiness before hooks fire

Move initializationCompleteFlag to set after DB/search init (not MCP),
add waitForReadiness() polling /api/readiness, and extract shared
pollEndpointUntilOk helper to DRY up health/readiness checks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 03:30:31 -05:00
Alex Newman
097035de6c chore: bump version to 10.3.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 20:12:17 -05:00
Alex Newman
e788fd3676 fix: prevent duplicate worker daemons and zombie processes (#1178)
* fix: prevent duplicate worker daemons and zombie processes

Three root causes of chroma-mcp timeouts:

1. HTTP shutdown (POST /api/admin/shutdown) closed resources but never
   called process.exit(). Zombie workers stayed alive, background tasks
   reconnected to chroma-mcp, spawning duplicate subprocesses that all
   contended for the same persistent data directory.

2. No guard against concurrent daemon startup. When hooks fired
   simultaneously, multiple daemons started before either wrote a PID
   file. The loser got EADDRINUSE but stayed alive because signal
   handlers registered in the constructor prevented exit.

3. Corrupt 147GB HNSW index file caused all chroma queries to timeout
   (MCP error -32001). Data fix: deleted corrupt collection, backfill
   rebuilds from SQLite.

Code fixes:
- Add PID-based guard in daemon startup: exit if PID file process alive
- Add port-based guard in daemon startup: exit if port already bound
  (runs before WorkerService constructor registers keepalive handlers)
- Add process.exit(0) after HTTP shutdown/restart completes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: aggressive startup cleanup and one-time chroma wipe for upgrade

Kill orphaned worker-service.cjs and chroma-mcp processes immediately
at startup (no age gate) while keeping 30-min threshold for mcp-server.
Wipe corrupt chroma data once on upgrade from pre-v10.3 versions —
backfill rebuilds from SQLite automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: wrap shutdown handlers in try/finally to guarantee process.exit

If onShutdown() or onRestart() threw, process.exit(0) was never reached,
leaving the daemon alive as a zombie. Also removed redundant require('fs')
calls in process-manager tests where ESM imports already existed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 20:10:28 -05:00
Alex Newman
91b48a6481 chore: bump version to 10.3.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:33:52 -05:00
Alex Newman
40daf8f3fa feat: replace WASM embeddings with persistent chroma-mcp MCP connection (#1176)
* feat: replace WASM embeddings with persistent chroma-mcp MCP connection

Replace ChromaServerManager (npx chroma run + chromadb npm + ONNX/WASM)
with ChromaMcpManager, a singleton stdio MCP client that communicates with
chroma-mcp via uvx. This eliminates native binary issues, segfaults, and
WASM embedding failures that plagued cross-platform installs.

Key changes:
- Add ChromaMcpManager: singleton MCP client with lazy connect, auto-reconnect,
  connection lock, and Zscaler SSL cert support
- Rewrite ChromaSync to use MCP tool calls instead of chromadb npm client
- Handle chroma-mcp's non-JSON responses (plain text success/error messages)
- Treat "collection already exists" as idempotent success
- Wire ChromaMcpManager into GracefulShutdown for clean subprocess teardown
- Delete ChromaServerManager (no longer needed)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review — connection guard leak, timer leak, async reset

- Clear connecting guard in finally block to prevent permanent reconnection block
- Clear timeout after successful connection to prevent timer leak
- Make reset() async to await stop() before nullifying instance
- Delete obsolete chroma-server-manager test (imports deleted class)
- Update graceful-shutdown test to use chromaMcpManager property name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: prevent chroma-mcp spawn storm — zombie cleanup, stale onclose guard, reconnect backoff

Three bugs caused chroma-mcp processes to accumulate (92+ observed):

1. Zombie on timeout: failed connections left subprocess alive because
   only the timer was cleared, not the transport. Now catch block
   explicitly closes transport+client before rethrowing.

2. Stale onclose race: old transport's onclose handler captured `this`
   and overwrote the current connection reference after reconnect,
   orphaning the new subprocess. Now guarded with reference check.

3. No backoff: every failure triggered immediate reconnect. With
   backfill doing hundreds of MCP calls, this created rapid-fire
   spawning. Added 10s backoff on both connection failure and
   unexpected process death.

Also includes ChromaSync fixes from PR review:
- queryChroma deduplication now preserves index-aligned arrays
- SQL injection guard on backfill ID exclusion lists

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:32:38 -05:00
Alex Newman
ea683a4e6c chore: bump version to 10.2.6
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:41:46 -05:00
Alex Newman
5d79bb7a7a fix: prevent zombie process accumulation by verifying subprocess exit (#1168) (#1175)
Two changes fix the observer process resource leak:

1. Add ensureProcessExit to generator finally blocks in SessionRoutes and
   worker-service, matching the pattern already working in SDKAgent.

2. Add stale session reaper (every 2m) that removes sessions with no active
   generator and no pending work after 15m idle. This unblocks the orphan
   reaper which previously skipped processes for "active" sessions.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 16:33:23 -05:00
Alex Newman
149f548667 chore: bump version to 10.2.5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 23:17:08 -05:00
Alex Newman
b88251bc8b fix: self-healing claimNextMessage prevents stuck processing messages (#1159)
* fix: self-healing claimNextMessage prevents stuck processing messages

claimAndDelete → claimNextMessage with atomic self-healing: resets stale
processing messages (>60s) back to pending before claiming. Eliminates
stuck messages from generator crashes without external timers. Removes
redundant idle-timeout reset in worker-service.ts. Adds QUEUE to logger
Component type.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update stale comments in SessionQueueProcessor to reflect claim-confirm pattern

Comments still referenced the old claim-and-delete pattern after the
claimNextMessage rename. Updated to accurately describe the current
lifecycle where messages are marked as processing and stay in DB until
confirmProcessed() is called.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: move Date.now() inside transaction and extract stale threshold constant

- Move Date.now() inside claimNextMessage transaction closure so timestamp
  is fresh if WAL contention causes retry
- Extract STALE_PROCESSING_THRESHOLD_MS to module-level constant
- Add comment clarifying strict < boundary semantics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 23:15:46 -05:00
Alex Newman
b1cfc85333 chore: bump version to 10.2.4
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:49:10 -05:00
Alex Newman
ca8421611c fix: backfill Chroma vector DB for all projects on startup (#1154)
* fix: backfill all Chroma projects on worker startup

ChromaSync.ensureBackfilled() existed but was never called. After
v10.2.2's bun cache clear destroyed the ONNX model cache, Chroma only
had ~2 days of embeddings while SQLite had 49k+ observations.

- Add static backfillAllProjects() to ChromaSync — iterates all projects
  in SQLite, creates temporary ChromaSync per project, runs smart diff
- Call backfillAllProjects() fire-and-forget on worker startup
- Add 'CHROMA_SYNC' to logger Component type (pre-existing gap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: sanitize project names for Chroma collection naming

Replace characters outside [a-zA-Z0-9._-] with underscores so projects
like "YC Stuff" map to collection "cm__YC_Stuff" instead of failing
Chroma's collection name validation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: route backfill to shared cm__claude-mem collection, harden sanitization

- Use single ChromaSync('claude-mem') in backfillAllProjects() instead of
  per-project instances, matching how DatabaseManager and SearchManager
  operate — fixes critical bug where backfilled data landed in orphaned
  collections that no search path reads from
- Strip trailing non-alphanumeric chars from sanitized collection names
  to satisfy Chroma's end-character constraint
- Guard backfill behind Chroma server readiness to avoid N spurious error
  logs when Chroma failed to start
- Use CHROMA_SYNC log component consistently for backfill messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: pass project as parameter to ensureBackfilled instead of mutating instance state

Eliminates shared mutable state in backfillAllProjects() loop. Project
scoping is now passed explicitly via parameter to both ensureBackfilled()
and getExistingChromaIds(), keeping a single Chroma connection while
avoiding fragile instance property mutation across iterations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 22:47:46 -05:00
Alex Newman
b446f2630e chore: bump version to 10.2.3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:55:09 -05:00
Alex Newman
224567f980 fix: prevent ONNX model cache corruption from bun cache clears
Remove nuclear `bun pm cache rm` from smart-install.js and
sync-marketplace.cjs (only needed for removed sharp dependency).
Add `bun install` in cache version directory after sync so worker
can resolve dependencies. Move HuggingFace model cache to
~/.claude-mem/models/ so reinstalls don't corrupt it. Add self-healing
retry for Protobuf parsing failures.

Fixes recurring issues #1104, #1105, #1110.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:54:17 -05:00
Alex Newman
613f0e9795 chore: bump version to 10.2.2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 19:21:39 -05:00
Alex Newman
df36ce68df chore: bump version to 10.2.1
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:08:33 -05:00
Alex Newman
f24251118e fix: bun install, node-addon-api for sharp, consolidate PendingMessageStore (#1140)
* fix: use bun install in sync, add node-addon-api for sharp, consolidate PendingMessageStore

- Switch sync-marketplace from npm to bun install
- Add node-addon-api as dev dep so sharp builds under bun
- Consolidate duplicate PendingMessageStore instantiation in worker-service finally block

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* build assets

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 18:05:42 -05:00
Alex Newman
854bf922a4 chore: bump version to 10.2.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 16:40:10 -05:00
yczc3999
ddc25372c1 fix(linux): buffer stdin in Node.js before passing to Bun (#646) (#977)
On Linux, Bun's libuv calls fstat() on inherited pipe file descriptors
and crashes with EINVAL when the pipe originates from Claude Code's hook
system. This causes all PostToolUse hooks to fail silently, preventing
observations from being recorded.

The fix reads stdin entirely in the Node.js parent process (bun-runner.js)
before spawning Bun, then writes the buffered data to a fresh pipe created
by Node's child_process.spawn(). Bun receives a standard pipe that it can
fstat() without errors.

Changes:
- Add collectStdin() to buffer piped input in Node.js with 5s safety timeout
- Change stdio from 'inherit' to ['pipe'|'ignore', 'inherit', 'inherit']
- Write buffered stdin to child.stdin then close for proper EOF signaling
- Handle edge cases: TTY stdin, no stdin, read errors

Fixes #646

Co-authored-by: yczc3999 <zxfgds@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:30:42 -05:00
Alex Newman
327dd44992 chore: bump version to 10.1.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:17:34 -05:00
Alex Newman
676a3d175e fix: make context and colored timeline fetches truly parallel
Address PR #1125 review feedback - both fetches now start simultaneously
via Promise.all instead of sequential-then-parallel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:11:25 -05:00
Alex Newman
34358ab33d feat: add systemMessage support for SessionStart hook and tune defaults
Add systemMessage field to HookResult so SessionStart can display a
colored timeline directly to the user in the CLI. The handler now
parallel-fetches both markdown (for Claude context) and ANSI-colored
(for user display) timelines, appending a viewer URL link.

Also update default settings to hide verbose token columns (read/work
tokens, savings amount) and disable full observation expansion, keeping
the cleaner index-only view by default.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:05:13 -05:00
Alex Newman
51abe5d1ff chore: bump version to 10.0.8
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:33:20 -05:00
Alex Newman
055888e181 fix: address PR review feedback for subprocess cleanup and binary resolution
Wrap SDK query loop in try/finally so subprocess cleanup runs on error paths.
Swap Chroma binary check order to try project-level .bin first (common case).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:24:00 -05:00
Alex Newman
67ba17cc8a fix: use WASM backend for Chroma embeddings to fix cross-platform issues
Chroma requires client-side embeddings — the server is storage only.
The previous commit incorrectly removed @chroma-core/default-embed.

Uses DefaultEmbeddingFunction({ wasm: true }) which forces the WASM
backend instead of native ONNX binaries. Same model (all-MiniLM-L6-v2),
same embeddings, but works on all platforms without segfaults or
ENOENT errors (#1104, #1105, #1110).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:14:21 -05:00
Alex Newman
e1ef14dbcc fix: resolve orphaned subprocesses and Chroma HTTP regressions
- Add subprocess cleanup after SDK query loop completes, using existing
  ProcessRegistry infrastructure (getProcessBySession + ensureProcessExit)
- Replace npx-based Chroma binary spawning with absolute path resolution
  via require.resolve, falling back to npx with explicit cwd (#1120)
- Remove @chroma-core/default-embed client-side dependency; let Chroma
  HTTP server handle embeddings server-side (#1104, #1105, #1110)

Closes #1010, #1089, #1090, #1068, #1120, #1104, #1105, #1110

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:04:52 -05:00
Alex Newman
0ac4c7b8a9 chore: bump version to 10.0.7
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 16:52:11 -05:00
Alex Newman
c27314f896 fix: address PR review comments for chroma server lifecycle 2026-02-13 23:39:30 -05:00
Alex Newman
1b68c55763 fix: resolve SDK spawn failures and sharp native binary crashes
- Strip CLAUDECODE env var from SDK subprocesses to prevent "cannot be
  launched inside another Claude Code session" error (Claude Code 2.1.42+)
- Lazy-load @chroma-core/default-embed to avoid eagerly pulling in
  sharp native binaries at bundle startup (fixes ERR_DLOPEN_FAILED)
- Add stderr capture to SDK spawn for diagnosing future process failures
- Exclude lockfiles from marketplace rsync and delete stale lockfiles
  before npm install to prevent native dep version mismatches

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 22:47:27 -05:00