mirror of https://github.com/thedotmack/claude-mem (synced 2026-04-25 17:15:04 +02:00)
* fix: add idle timeout to prevent zombie observer processes

  Root cause fix for zombie observer accumulation. The SessionQueueProcessor iterator now exits gracefully after 3 minutes of inactivity instead of waiting forever for messages.

  Changes:
  - Add IDLE_TIMEOUT_MS constant (3 minutes)
  - waitForMessage() now returns boolean and accepts timeout parameter
  - createIterator() tracks lastActivityTime and exits on idle timeout
  - Graceful exit via return (not throw) allows SDK to complete cleanly

  This addresses the root cause that PR #848 worked around with pattern matching. Observer processes now self-terminate, preventing accumulation when session-complete hooks don't fire.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: trigger abort on idle timeout to actually kill subprocess

  The previous implementation only returned from the iterator on idle timeout, but this doesn't terminate the Claude subprocess - it just stops yielding messages. The subprocess stays alive as a zombie because:
  1. Returning from createIterator() ends the generator
  2. The SDK closes stdin via transport.endInput()
  3. But the subprocess may not exit on stdin EOF
  4. No abort signal is sent to kill it

  Fix: Add onIdleTimeout callback that SessionManager uses to call session.abortController.abort(). This sends SIGTERM to the subprocess via the SDK's ProcessTransport abort handler.

  Verified by Codex analysis of the SDK internals:
  - abort() triggers ProcessTransport abort handler → SIGTERM
  - transport.close() sends SIGTERM → escalates to SIGKILL after 5s
  - Just closing stdin is NOT sufficient to guarantee subprocess exit

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add idle timeout to prevent zombie observer processes

  Also cleaned up hooks.json to remove redundant start commands. The hook command handler now auto-starts the worker if not running, which is how it should have been since we changed to auto-start. This maintenance change was done manually.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: resolve race condition in session queue idle timeout detection

  - Reset timer on spurious wakeup when queue is empty but duration check fails
  - Use optional chaining for onIdleTimeout callback
  - Include threshold value in idle timeout log message for better diagnostics
  - Add comprehensive unit tests for SessionQueueProcessor

  Fixes PR #856 review feedback.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: migrate installer to Setup hook

  - Add plugin/scripts/setup.sh for one-time dependency setup
  - Add Setup hook to hooks.json (triggers via claude --init)
  - Remove smart-install.js from SessionStart hook
  - Keep smart-install.js as manual fallback for Windows/auto-install

  Setup hook handles:
  - Bun detection with fallback locations
  - uv detection (optional, for Chroma)
  - Version marker to skip redundant installs
  - Clear error messages with install instructions

* feat: add np for one-command npm releases

  - Add np as dev dependency
  - Add release, release:patch, release:minor, release:major scripts
  - Add prepublishOnly hook to run build before publish
  - Configure np (no yarn, include all contents, run tests)

* fix: reduce PostToolUse hook timeout to 30s

  PostToolUse runs on every tool call, 120s was excessive and could cause hangs. Reduced to 30s for responsive behavior.

* docs: add PR shipping report

  Analyzed 6 PRs for shipping readiness:
  - #856: Ready to merge (idle timeout fix)
  - #700, #722, #657: Have conflicts, need rebase
  - #464: Contributor PR, too large (15K+ lines)
  - #863: Needs manual review

  Includes shipping strategy and conflict resolution order.

* MAESTRO: Verify PR #856 test suite passes

  All 797 tests pass (3 skipped, 0 failures). The 11 SessionQueueProcessor idle timeout tests all pass with 20 expect() assertions verified.

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* MAESTRO: Verify PR #856 build passes

  - Ran npm run build successfully with no TypeScript errors
  - All artifacts generated (worker-service, mcp-server, context-generator, viewer UI)

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* MAESTRO: Code review PR #856 implementation verified

  Verified all requirements in SessionQueueProcessor.ts:
  - IDLE_TIMEOUT_MS = 180000ms (3 minutes)
  - waitForMessage() accepts timeout parameter
  - lastActivityTime reset on spurious wakeup (race condition fix)
  - Graceful exit logs include thresholdMs parameter
  - 11 comprehensive test cases in SessionQueueProcessor.test.ts

  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: bigph00t <166455923+bigph00t@users.noreply.github.com>
Co-authored-by: root <root@srv1317155.hstgr.cloud>
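The commit notes above explain why returning from the iterator is not enough: closing stdin sends no signal, so the subprocess only goes away once the session's abort controller fires. The following is a minimal sketch of that wiring, not the project's actual code. `FakeTransport`, `Session`, and `makeOnIdleTimeout` are hypothetical names introduced for illustration; the fake transport only records the signal it would send, so nothing is spawned and the SDK's real `ProcessTransport` is not involved.

```typescript
// Hypothetical stand-in for a process transport. It records which signal it
// would send instead of touching a real child process.
class FakeTransport {
  public sentSignals: string[] = [];
  public stdinClosed = false;

  constructor(abortSignal: AbortSignal) {
    // Mirrors the behaviour described in the commit notes: abort -> SIGTERM.
    abortSignal.addEventListener(
      "abort",
      () => {
        this.sentSignals.push("SIGTERM");
      },
      { once: true },
    );
  }

  // Closing stdin alone sends no signal, so a child could keep running.
  endInput(): void {
    this.stdinClosed = true;
  }
}

// Assumed minimal session shape: only the abort controller matters here.
interface Session {
  abortController: AbortController;
}

// Hypothetical helper: builds the callback a queue iterator would invoke
// when it decides the session has been idle for too long.
function makeOnIdleTimeout(session: Session): () => void {
  return () => session.abortController.abort();
}

const session: Session = { abortController: new AbortController() };
const transport = new FakeTransport(session.abortController.signal);
const onIdleTimeout = makeOnIdleTimeout(session);

transport.endInput(); // iterator returned: stdin is closed, no signal sent yet
const signalsBeforeAbort = transport.sentSignals.length;
onIdleTimeout(); // idle timeout: abort fires and the transport records SIGTERM
```

In this sketch `signalsBeforeAbort` stays at zero, which is the zombie case the second commit describes; the termination signal is recorded only after the callback runs.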
Phase 01: Test and Merge PR #856 - Zombie Observer Fix
PR #856 adds an idle timeout to SessionQueueProcessor to prevent zombie observer processes. It is the most mature of the open PRs: it has existing test coverage, passing CI, and no merge conflicts. By the end of this phase, the fix will be merged to main and the improvement will be live.
Tasks
- Checkout and verify PR #856:
  - `git fetch origin fix/observer-idle-timeout`
  - `git checkout fix/observer-idle-timeout`
  - Verify the branch is up to date with origin
  - ✅ Branch verified up to date with origin (pulled 4 new files: PR-SHIPPING-REPORT.md, package.json updates, hooks.json updates, setup.sh)
- Run the full test suite to confirm all tests pass:
  - `npm test`
  - Specifically verify the 11 SessionQueueProcessor tests pass
  - Report any failures
  - ✅ Full test suite passes: 797 pass, 3 skip (pre-existing), 0 fail
  - ✅ All 11 SessionQueueProcessor tests pass: 11 pass, 0 fail, 20 expect() calls
- Run the build to confirm compilation succeeds:
  - `npm run build`
  - Verify no TypeScript errors
  - Verify all artifacts are generated
  - ✅ Build completed successfully with no TypeScript errors
  - ✅ All artifacts generated:
    - worker-service.cjs (1786.80 KB)
    - mcp-server.cjs (332.41 KB)
    - context-generator.cjs (61.57 KB)
    - viewer-bundle.js and viewer.html
- Code review the changes for correctness:
  - Read `src/services/queue/SessionQueueProcessor.ts` and verify:
    - `IDLE_TIMEOUT_MS` is set to 3 minutes (180000ms)
    - `waitForMessage()` accepts timeout parameter
    - `lastActivityTime` is reset on spurious wakeup (race condition fix)
    - Graceful exit logs with `thresholdMs` parameter
  - Read `tests/services/queue/SessionQueueProcessor.test.ts` and verify test coverage
  - ✅ Code review complete - all requirements verified:
    - Line 6: `IDLE_TIMEOUT_MS = 3 * 60 * 1000` (180000ms)
    - Line 90: `waitForMessage(signal: AbortSignal, timeoutMs: number = IDLE_TIMEOUT_MS)`
    - Line 63: `lastActivityTime = Date.now()` on spurious wakeup with comment
    - Lines 54-58: Logger includes `thresholdMs: IDLE_TIMEOUT_MS` parameter
    - 11 test cases covering idle timeout, abort signal, message events, cleanup, errors, and conversion
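For orientation while reviewing, the behaviour on the checklist above can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions, not the contents of `SessionQueueProcessor.ts`. Only `IDLE_TIMEOUT_MS`, `lastActivityTime`, `onIdleTimeout`, and the names `waitForMessage` and `createIterator` come from the notes; the extra `events` and `queue` parameters, the use of a plain array, and the choice of `EventTarget` for notifications are assumptions made so the sketch runs on its own. Logging is omitted.

```typescript
const IDLE_TIMEOUT_MS = 3 * 60 * 1000; // 180000 ms, the value quoted in the review notes

// Resolves true when a "message" event arrives, false on timeout or abort.
// The `events` parameter is an assumption; the reviewed method takes only
// (signal, timeoutMs).
function waitForMessage(
  events: EventTarget,
  signal: AbortSignal,
  timeoutMs: number = IDLE_TIMEOUT_MS,
): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    if (signal.aborted) {
      resolve(false);
      return;
    }
    // Remove every listener and the timer before settling, so nothing leaks.
    const finish = (received: boolean): void => {
      clearTimeout(timer);
      events.removeEventListener("message", onMessage);
      signal.removeEventListener("abort", onAbort);
      resolve(received);
    };
    const onMessage = (): void => finish(true);
    const onAbort = (): void => finish(false);
    const timer = setTimeout(() => finish(false), timeoutMs);
    events.addEventListener("message", onMessage);
    signal.addEventListener("abort", onAbort);
  });
}

// Yields queued messages and exits via return (not throw) once the queue
// has been idle for at least idleTimeoutMs.
async function* createIterator(
  queue: string[],
  events: EventTarget,
  signal: AbortSignal,
  idleTimeoutMs: number = IDLE_TIMEOUT_MS,
  onIdleTimeout?: () => void,
): AsyncGenerator<string, void, void> {
  let lastActivityTime = Date.now();
  while (!signal.aborted) {
    const next = queue.shift();
    if (next !== undefined) {
      lastActivityTime = Date.now();
      yield next;
      continue;
    }
    const received = await waitForMessage(events, signal, idleTimeoutMs);
    if (received || signal.aborted) continue;

    const idleMs = Date.now() - lastActivityTime;
    if (idleMs >= idleTimeoutMs) {
      onIdleTimeout?.(); // optional chaining: the callback may be absent
      return; // graceful exit
    }
    // Spurious wakeup: the wait ended but the idle threshold was not
    // reached, so restart the idle clock instead of exiting.
    lastActivityTime = Date.now();
  }
}
```

A caller would push onto `queue`, dispatch `new Event("message")` on `events`, and consume the generator with `for await`; passing a short `idleTimeoutMs` makes the idle exit easy to exercise in a test.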
- Merge PR #856 to main:
  - `git checkout main`
  - `git pull origin main`
  - `gh pr merge 856 --squash --delete-branch`
  - Verify merge succeeded
- Run post-merge verification:
  - `git pull origin main`
  - `npm test` to confirm tests still pass on main
  - `npm run build` to confirm build still works