mirror of https://github.com/thedotmack/claude-mem
synced 2026-04-25 17:15:04 +02:00

- Issue #514: Documented analysis of orphaned observer session files, including root cause, evidence, and recommended fixes.
- Issue #517: Analyzed PowerShell escaping issues in cleanupOrphanedProcesses() on Windows, with recommended fixes using WMIC.
- Issue #520: Confirmed resolution of stuck messages issue through architectural changes to a claim-and-delete pattern.
- Issue #527: Identified detection failure of uv on Apple Silicon Macs with Homebrew installation; proposed path updates for detection.
- Issue #532: Analyzed memory leak issues in SessionManager, detailing session cleanup and conversationHistory growth concerns, with recommended fixes.
File: docs/reports/2026-01-04--issue-514-orphaned-sessions-analysis.md (new, 292 lines)

# Issue #514: Orphaned Observer Session Files Analysis

**Date:** January 4, 2026
**Status:** PARTIALLY RESOLVED - Root cause understood; a fix was made but reverted
**Original Issue:** 13,000+ orphaned .jsonl session files created over 2 days

---

## Executive Summary

Issue #514 reported that the plugin created 13,000+ orphaned session .jsonl files in `~/.claude/projects/<project>/`. Each file contained only an initialization message with no actual observations. The hypothesis was that `startSessionProcessor()` in startup-recovery created new observer sessions in a loop.

**Current State:** The issue was **fixed in commit 9a7f662** with a deterministic `mem-${contentSessionId}` prefix approach, but this fix was **reverted in commit f9197b5** because the SDK does not accept custom session IDs. The current code uses a NULL-based initialization pattern that can still create orphaned sessions under certain conditions.

---

## Evidence: Current File Analysis

Filesystem analysis of `~/.claude/projects/-Users-alexnewman-Scripts-claude-mem/`:

| Line Count | Number of Files |
|------------|-----------------|
| 0 lines (empty) | 407 |
| 1 line | **12,562** |
| 2 lines | 3,199 |
| 3+ lines | 3,546 |
| **Total** | **19,714** |

The 12,562 single-line files are consistent with the issue description: sessions that initialized but never received any observations.

Sample single-line file content:

```json
{"type":"queue-operation","operation":"dequeue","timestamp":"2025-12-28T20:41:25.484Z","sessionId":"00081a3b-9485-48a4-89f0-fd4dfccd3ac9"}
```

---

## Root Cause Analysis

### The Problem Chain

1. **Worker startup calls `processPendingQueues()`** (line 281 in worker-service.ts)
2. For each session with pending messages, it calls `initializeSession()` and then `startSessionProcessor()`
3. `startSessionProcessor()` invokes `sdkAgent.startSession()`, which calls the Claude Agent SDK `query()` function
4. **If `memorySessionId` is NULL**, no `resume` parameter is passed to `query()`
5. **The SDK creates a NEW .jsonl file** for each `query()` call made without a `resume` parameter
6. **If the query aborts before receiving a response** (timeout, crash, abort signal), the `memorySessionId` is never captured
7. On the next startup the cycle repeats, creating yet another orphaned file

### Why Sessions Abort Before Capturing memorySessionId

Looking at the `startSessionProcessor()` flow:

```typescript
// worker-service.ts lines 301-321
private startSessionProcessor(session, source) {
  session.generatorPromise = this.sdkAgent.startSession(session, this)
    .catch(error => { /* error handling */ })
    .finally(() => {
      session.generatorPromise = null;
      this.broadcastProcessingStatus();
    });
}
```

And `processPendingQueues()`:

```typescript
// worker-service.ts lines 347-371
for (const sessionDbId of orphanedSessionIds) {
  const session = this.sessionManager.initializeSession(sessionDbId);
  this.startSessionProcessor(session, 'startup-recovery');
  await new Promise(resolve => setTimeout(resolve, 100)); // 100ms delay between sessions
}
```

The problem: starting 50 sessions with pending messages in rapid succession (only a 100 ms delay between starts) means:

- All 50 SDK queries start nearly simultaneously
- The SDK creates 50 new .jsonl files (since none have a memorySessionId yet)
- If any query fails or aborts before the first response, its memorySessionId is never captured
- On the next startup, those sessions get new files again

---

## Code Flow: Where .jsonl Files Are Created

The .jsonl files are created by the **Claude Agent SDK** (`@anthropic-ai/claude-agent-sdk`), not by claude-mem directly.

When `query()` is called in SDKAgent.ts:

```typescript
// SDKAgent.ts lines 89-99
const queryResult = query({
  prompt: messageGenerator,
  options: {
    model: modelId,
    // Resume with captured memorySessionId (null on first prompt, real ID on subsequent)
    ...(hasRealMemorySessionId && { resume: session.memorySessionId }),
    disallowedTools,
    abortController: session.abortController,
    pathToClaudeCodeExecutable: claudePath
  }
});
```
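
The conditional-spread idiom above is the crux of the bug: when the guard is false, `false` spreads as nothing, so the `resume` key is absent from the options object entirely (not merely `undefined`). A minimal standalone sketch of that behavior, with a placeholder model value:

```typescript
// Minimal sketch of the conditional-spread idiom used in SDKAgent.ts.
// When hasRealMemorySessionId is false, `false && {...}` evaluates to false,
// and spreading a boolean contributes no properties at all.
function buildOptions(memorySessionId: string | null): Record<string, unknown> {
  const hasRealMemorySessionId = memorySessionId !== null;
  return {
    model: "model-id-placeholder", // stand-in value, not the real config
    ...(hasRealMemorySessionId && { resume: memorySessionId }),
  };
}
```

With a null ID the resulting object has no `resume` key at all, which is exactly the condition under which the SDK starts a brand-new session file.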

**Key insight:** If `hasRealMemorySessionId` is false (memorySessionId is null), no `resume` parameter is passed. The SDK then generates a new UUID and creates a new file at:
`~/.claude/projects/<dashed-cwd>/<new-uuid>.jsonl`

---
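
The exact path mapping is internal to the SDK; judging from the directory name observed in the evidence section (`-Users-alexnewman-Scripts-claude-mem`), it appears to replace path separators with dashes. A hypothetical reconstruction, for illustration only:

```typescript
import { join } from "node:path";
import { homedir } from "node:os";

// Hypothetical reconstruction of the SDK's session-file path; the real
// mapping is internal to @anthropic-ai/claude-agent-sdk. The observed
// directory name suggests '/' in the cwd simply becomes '-'.
function sessionFilePath(cwd: string, sessionUuid: string): string {
  const dashedCwd = cwd.replace(/\//g, "-");
  return join(homedir(), ".claude", "projects", dashedCwd, `${sessionUuid}.jsonl`);
}
```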

## Fix History

### Commit 9a7f662: The Original Fix (Reverted)

```
fix(sdk): always pass deterministic session ID to prevent orphaned files

Fixes #514 - Excessive observer sessions created during startup-recovery

Root cause: When memorySessionId was null, no `resume` parameter was passed
to the SDK's query(). This caused the SDK to create a NEW session file on
every call. If queries aborted before capturing the SDK's session_id, the
placeholder remained, leading to cascading creation of 13,000+ orphaned files.

Fix:
- Generate deterministic ID `mem-${contentSessionId}` upfront
- Always pass it to `resume` parameter
- Persist immediately to database before query starts
- If SDK returns different ID, capture and use that going forward
```

**This fix was correct in approach**: always passing a `resume` parameter prevents new file creation.

### Commit f9197b5: The Revert

```
fix(sdk): restore session continuity via robust capture-and-resume strategy

Replaces the deterministic 'mem-' ID approach with a capture-based strategy:
1. Passes 'resume' parameter ONLY when a verified memory session ID exists
2. Captures SDK-generated session ID when it differs from current ID
3. Ensures subsequent prompts resume the correctly captured session ID

This resolves the issue where new sessions were created for every message
due to failure to capture/resume the initial session ID, without introducing
potentially invalid deterministic IDs.
```

**The revert message indicates that the SDK rejected the `mem-` prefix IDs.**

### Commit 005b0f8: Current NULL-based Pattern

Changed `memory_session_id` initialization from `contentSessionId` (placeholder) to `NULL`:

- Simpler logic: `!!session.memorySessionId` instead of `memorySessionId !== contentSessionId`
- But it still creates new files on the first prompt of each session

---

## Relationship with Issue #520 (Stuck Messages)

**Issue #520 is related but distinct:**

| Aspect | Issue #514 (Orphaned Files) | Issue #520 (Stuck Messages) |
|--------|-----------------------------|-----------------------------|
| Problem | Too many .jsonl files | Messages never processed |
| Root Cause | SDK creates new file per query without resume | Old claim-process-mark pattern left messages in 'processing' state |
| Status | Partially resolved | **Fully resolved** |
| Fix | Need deterministic resume IDs | Changed to claim-and-delete pattern |

**Connection:** Both issues relate to startup-recovery. Issue #520's fix (the claim-and-delete pattern) doesn't create the loop that #514 describes, but #514 can still occur when:

1. Sessions have pending messages
2. Recovery starts the generator
3. The generator aborts before capturing memorySessionId
4. The next startup repeats the cycle

---

## v8.5.7 Status

**v8.5.7 did NOT fully address Issue #514.** Its major changes were:

- Modular architecture refactor
- NULL-based initialization pattern
- Comprehensive test coverage

The deterministic `mem-` prefix fix (9a7f662) was reverted before v8.5.7.

---

## Recommended Fix

### Option 1: Reintroduce Deterministic IDs with SDK Validation

```typescript
// SDKAgent.ts - In startSession()
async startSession(session: ActiveSession, worker?: WorkerRef): Promise<void> {
  // Generate deterministic ID based on database session ID (not UUID-based contentSessionId)
  // Format: "mem-<sessionDbId>" is short and unlikely to conflict
  const deterministicMemoryId = session.memorySessionId || `mem-${session.sessionDbId}`;

  // Always pass resume to prevent orphaned sessions
  const queryResult = query({
    prompt: messageGenerator,
    options: {
      model: modelId,
      resume: deterministicMemoryId, // ALWAYS pass, even if SDK might reject
      disallowedTools,
      abortController: session.abortController,
      pathToClaudeCodeExecutable: claudePath
    }
  });

  // Capture whatever ID the SDK actually uses
  for await (const message of queryResult) {
    if (message.session_id && message.session_id !== session.memorySessionId) {
      session.memorySessionId = message.session_id;
      this.dbManager.getSessionStore().updateMemorySessionId(
        session.sessionDbId,
        message.session_id
      );
    }
    // ... rest of processing
  }
}
```

### Option 2: Limit Recovery Scope

Prevent the recovery loop by limiting how many times a session can be recovered:

```typescript
// In processPendingQueues()
for (const sessionDbId of orphanedSessionIds) {
  // Check if this session was already recovered recently
  const dbSession = this.dbManager.getSessionById(sessionDbId);
  const recoveryAttempts = dbSession.recovery_attempts || 0;

  if (recoveryAttempts >= 3) {
    logger.warn('SYSTEM', 'Session exceeded max recovery attempts, skipping', {
      sessionDbId,
      recoveryAttempts
    });
    continue;
  }

  // Increment recovery counter
  this.dbManager.getSessionStore().incrementRecoveryAttempts(sessionDbId);

  // ... rest of recovery
}
```
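
Note that `recovery_attempts` and `incrementRecoveryAttempts()` above are proposed additions, not existing API. The cap logic itself is simple; an in-memory sketch of the same guard:

```typescript
// In-memory sketch of the proposed recovery cap. The real proposal stores
// the counter in the sessions table; this standalone version just shows the
// gating logic.
class RecoveryGuard {
  private attempts = new Map<number, number>();
  private maxAttempts: number;

  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts;
  }

  // Returns true (and records the attempt) while the session may still be
  // recovered; returns false once the cap is reached.
  tryRecover(sessionDbId: number): boolean {
    const used = this.attempts.get(sessionDbId) ?? 0;
    if (used >= this.maxAttempts) return false;
    this.attempts.set(sessionDbId, used + 1);
    return true;
  }
}
```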

### Option 3: Clean Up Old Files (Mitigation, Not a Fix)

Add a cleanup script that removes orphaned .jsonl files:

```bash
# Remove files older than 7 days that contain at most 1 line
find ~/.claude/projects/ -name "*.jsonl" -mtime +7 \
  -exec sh -c '[ $(wc -l < "$1") -le 1 ] && rm "$1"' _ {} \;
```

---

## Files Involved

| File | Role |
|------|------|
| `src/services/worker-service.ts` | `startSessionProcessor()`, `processPendingQueues()` |
| `src/services/worker/SDKAgent.ts` | `startSession()`, `query()` call with `resume` parameter |
| `src/services/worker/SessionManager.ts` | `initializeSession()`, session lifecycle |
| `src/services/sqlite/sessions/create.ts` | `createSDKSession()`, NULL-based initialization |
| `src/services/sqlite/PendingMessageStore.ts` | `getSessionsWithPendingMessages()` |

---

## Conclusion

Issue #514 was correctly diagnosed. The fix in commit 9a7f662 took the right approach but was reverted because the SDK does not accept arbitrary custom IDs. The current NULL-based pattern (005b0f8) is cleaner but does not prevent orphaned files when queries abort before capturing the SDK's session ID.

**Recommendation:** Reintroduce the deterministic ID approach with proper handling of SDK rejections (Option 1). If the SDK rejects the ID and returns a different one, capture and persist that ID immediately. This ensures at most one .jsonl file per database session, even across crashes and restarts.

---

## Appendix: Git Commit References

| Commit | Description |
|--------|-------------|
| 9a7f662 | Original fix: deterministic `mem-` prefix IDs (REVERTED) |
| f9197b5 | Revert: capture-based strategy without deterministic IDs |
| 005b0f8 | NULL-based initialization pattern (current) |
| d72a81e | Queue refactoring (related to #520) |
| eb1a78b | Claim-and-delete pattern (fixes #520) |

---

# Issue #517 Analysis: Windows PowerShell Escaping in cleanupOrphanedProcesses()

**Date:** 2026-01-04
**Version Analyzed:** 8.5.7
**Status:** NOT FIXED - Issue still present

## Summary

The reported issue involves PowerShell's `$_` variable being interpreted by Bash before PowerShell receives it when the command runs in a Git Bash or WSL environment on Windows. This causes `cleanupOrphanedProcesses()` to fail during worker initialization.

## Current State

The `cleanupOrphanedProcesses()` function is located in:

- **File:** `/Users/alexnewman/Scripts/claude-mem/src/services/infrastructure/ProcessManager.ts`
- **Lines:** 164-251

### Problematic Code (Lines 170-172)

```typescript
if (isWindows) {
  // Windows: Use PowerShell Get-CimInstance to find chroma-mcp processes
  const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`;
  const { stdout } = await execAsync(cmd, { timeout: 60000 });
```

`$_.Name` and `$_.CommandLine` contain `$_`, a special variable in both PowerShell and Bash. When this command string is executed via Node.js `child_process.exec()` in a Git Bash or WSL environment, Bash may expand `$_` as its own special variable (the last argument of the previous command) before the string ever reaches PowerShell.
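
One way to sidestep the escaping problem entirely, independent of the WMIC rewrite recommended below, is to avoid routing the command through a shell at all: `child_process.execFile()` passes each argument directly to the spawned process, so no Bash layer ever parses the string. A sketch (invocation shown for Windows only):

```typescript
import { execFile } from "node:child_process";

// The PowerShell pipeline as a single argv element. Because execFile spawns
// the target process with an argv array and no intermediate shell, neither
// Bash nor cmd.exe gets a chance to expand `$_` before PowerShell sees it.
const psScript =
  "Get-CimInstance Win32_Process | " +
  "Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | " +
  "Select-Object -ExpandProperty ProcessId";

const psArgs = ["-NoProfile", "-NonInteractive", "-Command", psScript];

// On Windows the call would be: execFile("powershell", psArgs, callback)
void execFile; // actual invocation is Windows-only; referenced to show the entry point
```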

### Additional Occurrence (Lines 91-92)

A similar issue exists in `getChildProcesses()`:

```typescript
const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${parentPid} } | Select-Object -ExpandProperty ProcessId"`;
```

## Error Handling Analysis

Both functions have try-catch blocks with non-blocking error handling:

- Lines 208-212: `cleanupOrphanedProcesses()` catches errors, logs a warning, then returns
- Lines 98-102: `getChildProcesses()` catches errors, logs a warning, and returns an empty array

While this prevents worker initialization from crashing, it means orphaned process cleanup fails silently on affected Windows environments.

## Recommended Fix

Replace the PowerShell commands with WMIC (Windows Management Instrumentation Command-line), which does not use `$_` syntax:

### For cleanupOrphanedProcesses() (Line 171)

**Current:**
```typescript
const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -like '*python*' -and $_.CommandLine -like '*chroma-mcp*' } | Select-Object -ExpandProperty ProcessId"`;
```

**Recommended:**
```typescript
const cmd = `wmic process where "name like '%python%' and commandline like '%chroma-mcp%'" get processid /format:list`;
```

### For getChildProcesses() (Line 91)

**Current:**
```typescript
const cmd = `powershell -Command "Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq ${parentPid} } | Select-Object -ExpandProperty ProcessId"`;
```

**Recommended:**
```typescript
const cmd = `wmic process where "parentprocessid=${parentPid}" get processid /format:list`;
```

### Implementation Notes

1. WMIC output differs from PowerShell's: parse the `ProcessId=12345` key=value lines
2. WMIC is deprecated in newer Windows versions but still widely available
3. Alternative: keep PowerShell and escape properly (`$$_` or `\$_`, depending on the invoking shell)
4. Consider using the `powershell -NoProfile -NonInteractive` flags for faster execution
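
A sketch of note 1: WMIC's `/format:list` output consists of `KEY=VALUE` lines with CRLF line endings and blank lines between records (that output shape is an assumption here, not verified against every Windows version), so the PIDs can be extracted like this:

```typescript
// Parse WMIC /format:list output into process IDs. Assumes the output is
// CRLF-separated KEY=VALUE lines, with blank lines between records and
// informational text (e.g. "No Instance(s) Available.") when nothing matches.
function parseWmicProcessIds(stdout: string): number[] {
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.startsWith("ProcessId="))
    .map((line) => Number(line.slice("ProcessId=".length)))
    .filter((pid) => Number.isInteger(pid) && pid > 0);
}
```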

## Impact Assessment

- **Severity:** Medium - orphaned process cleanup fails silently
- **Scope:** Windows users running in Git Bash, WSL, or mixed shell environments
- **Workaround:** None currently - users must manually kill orphaned chroma-mcp processes

## Files to Modify

1. `/src/services/infrastructure/ProcessManager.ts` (lines 91-92, 171-172)

---

File: docs/reports/2026-01-04--issue-520-stuck-messages-analysis.md (new, 210 lines)

# Issue #520: Stuck Messages Analysis

**Date:** January 4, 2026
**Status:** RESOLVED - Issue no longer exists in current codebase
**Original Issue:** Messages in 'processing' status never recovered after worker crash

---

## Executive Summary

The issue described in GitHub #520 has been **fully resolved** in the current codebase through a fundamental architectural change. The system now uses a **claim-and-delete** pattern instead of the old **claim-process-mark** pattern, which eliminates the stuck 'processing' state problem entirely.

---

## Original Issue Description

The issue claimed that after a worker crash:

1. `getSessionsWithPendingMessages()` returns sessions with `status IN ('pending', 'processing')`
2. But `claimNextMessage()` only looks for `status = 'pending'`
3. So 'processing' messages are orphaned

**Proposed Fix:** Add `resetStuckMessages(0)` at the start of `processPendingQueues()`

---

## Current Code Analysis

### 1. Queue Processing Pattern: Claim-and-Delete

The current architecture uses `claimAndDelete()` instead of `claimNextMessage()`:

**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/sqlite/PendingMessageStore.ts`

```typescript
// Lines 85-104
claimAndDelete(sessionDbId: number): PersistentPendingMessage | null {
  const claimTx = this.db.transaction((sessionId: number) => {
    const peekStmt = this.db.prepare(`
      SELECT * FROM pending_messages
      WHERE session_db_id = ? AND status = 'pending'
      ORDER BY id ASC
      LIMIT 1
    `);
    const msg = peekStmt.get(sessionId) as PersistentPendingMessage | null;

    if (msg) {
      // Delete immediately - no "processing" state needed
      const deleteStmt = this.db.prepare('DELETE FROM pending_messages WHERE id = ?');
      deleteStmt.run(msg.id);
    }
    return msg;
  });

  return claimTx(sessionDbId) as PersistentPendingMessage | null;
}
```

**Key insight:** Messages are atomically selected and deleted in a single transaction. There is no 'processing' state for messages being actively worked on; they simply no longer exist in the database.
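
The essence of the pattern can be shown with a toy in-memory queue standing in for the SQLite table: because claiming a message and removing it from the queue are a single step, no row can ever be stranded between "claimed" and "done".

```typescript
interface QueuedMessage { id: number; payload: string; }

// Toy model of claim-and-delete (the real store does this inside a SQLite
// transaction). There is no "mark as processing" step for a crash to
// interrupt; the worst case is losing the one message currently in memory.
class ToyQueue {
  private rows: QueuedMessage[] = [];
  enqueue(msg: QueuedMessage): void { this.rows.push(msg); }
  claimAndDelete(): QueuedMessage | null {
    return this.rows.shift() ?? null; // oldest first; select + delete in one step
  }
  size(): number { return this.rows.length; }
}
```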

### 2. Iterator Uses claimAndDelete

**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/queue/SessionQueueProcessor.ts`

```typescript
// Lines 18-38
async *createIterator(sessionDbId: number, signal: AbortSignal): AsyncIterableIterator<PendingMessageWithId> {
  while (!signal.aborted) {
    try {
      // Atomically claim AND DELETE next message from DB
      // Message is now in memory only - no "processing" state tracking needed
      const persistentMessage = this.store.claimAndDelete(sessionDbId);

      if (persistentMessage) {
        // Yield the message for processing (it's already deleted from queue)
        yield this.toPendingMessageWithId(persistentMessage);
      } else {
        // Queue empty - wait for wake-up event
        await this.waitForMessage(signal);
      }
    } catch (error) {
      // ... error handling
    }
  }
}
```

### 3. getSessionsWithPendingMessages Still Checks Both States

**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/sqlite/PendingMessageStore.ts`

```typescript
// Lines 319-326
getSessionsWithPendingMessages(): number[] {
  const stmt = this.db.prepare(`
    SELECT DISTINCT session_db_id FROM pending_messages
    WHERE status IN ('pending', 'processing')
  `);
  const results = stmt.all() as { session_db_id: number }[];
  return results.map(r => r.session_db_id);
}
```

**This is technically vestigial code**: with the claim-and-delete pattern, messages should never be in the 'processing' state. However, it provides backward compatibility and defense in depth.

### 4. Startup Recovery Still Exists

**File:** `/Users/alexnewman/Scripts/claude-mem/src/services/worker-service.ts`

```typescript
// Lines 236-242
// Recover stuck messages from previous crashes
const { PendingMessageStore } = await import('./sqlite/PendingMessageStore.js');
const pendingStore = new PendingMessageStore(this.dbManager.getSessionStore().db, 3);
const STUCK_THRESHOLD_MS = 5 * 60 * 1000;
const resetCount = pendingStore.resetStuckMessages(STUCK_THRESHOLD_MS);
if (resetCount > 0) {
  logger.info('SYSTEM', `Recovered ${resetCount} stuck messages from previous session`, { thresholdMinutes: 5 });
}
```

This runs BEFORE `processPendingQueues()` is called (line 281), which addresses the original fix request.

---

## Verification of Issue Status

### Does the Issue Exist?

**NO** - the issue as described no longer exists because:

1. **No 'processing' state during normal operation**: With claim-and-delete, messages go directly from 'pending' to deleted. They never enter a 'processing' state.

2. **Startup recovery handles legacy stuck messages**: Even if 'processing' messages exist (from old code or edge cases), `resetStuckMessages()` is called BEFORE `processPendingQueues()` in `initializeBackground()` (lines 236-241 run before line 281).

3. **Architecture fundamentally changed**: The old `claimNextMessage()` function that only looked for `status = 'pending'` no longer exists. It was replaced with `claimAndDelete()`.

### GeminiAgent and OpenRouterAgent Behavior

Both agents use the same `SessionManager.getMessageIterator()`, which calls `SessionQueueProcessor.createIterator()`, which in turn uses `claimAndDelete()`. All three agents (SDKAgent, GeminiAgent, OpenRouterAgent) use identical queue processing:

```typescript
// GeminiAgent.ts:174, OpenRouterAgent.ts:134
for await (const message of this.sessionManager.getMessageIterator(session.sessionDbId)) {
  // ...
}
```

They do NOT handle recovery differently; they all rely on the shared infrastructure.

### What v8.5.7 Changed

Looking at the git history:

```
v8.5.7 (ac03901):
- Minor ESM/CommonJS compatibility fix for isMainModule detection
- No queue-related changes

v8.5.6 -> v8.5.7:
- f21ea97 refactor: decompose monolith into modular architecture with comprehensive test suite (#538)
```

The major refactor happened before v8.5.7. The claim-and-delete pattern was already in place.

---

## Timeline of Resolution

Based on git history, the issue was likely resolved through these commits:

1. **b8ce27b** - `feat(queue): Simplify queue processing and enhance reliability`
2. **eb1a78b** - `fix: eliminate duplicate observations by simplifying message queue`
3. **d72a81e** - `Refactor session queue processing and database interactions`

These commits appear to have introduced the claim-and-delete pattern that eliminates the original bug.

---
## Conclusion
|
||||
|
||||
**Issue #520 should be closed as resolved.**
|
||||
|
||||
The described bug (`claimNextMessage()` only checking `status = 'pending'`) no longer exists because:
|
||||
|
||||
1. `claimNextMessage()` was replaced with `claimAndDelete()` which atomically removes messages
|
||||
2. `resetStuckMessages()` is already called at startup BEFORE `processPendingQueues()`
|
||||
3. The 'processing' status is now only used for legacy compatibility and edge cases
|
||||
|
||||
### No Fix Needed
|
||||
|
||||
The proposed fix ("Add `resetStuckMessages(0)` at start of `processPendingQueues()`") is:
|
||||
|
||||
1. **Unnecessary** - The recovery happens in `initializeBackground()` before `processPendingQueues()` is called
|
||||
2. **Using wrong threshold** - `resetStuckMessages(0)` would reset ALL processing messages immediately, which could cause issues if called during normal operation (not just startup)
|
||||
|
||||
The current implementation with a 5-minute threshold is more robust - it only recovers truly stuck messages, not messages that are actively being processed.
|
||||
|
||||
---
|
||||
|
||||
## Appendix: File References
|
||||
|
||||
| Component | File | Key Lines |
|
||||
|-----------|------|-----------|
|
||||
| claimAndDelete | `src/services/sqlite/PendingMessageStore.ts` | 85-104 |
|
||||
| Queue Iterator | `src/services/queue/SessionQueueProcessor.ts` | 18-38 |
|
||||
| Startup Recovery | `src/services/worker-service.ts` | 236-242 |
|
||||
| processPendingQueues | `src/services/worker-service.ts` | 326-375 |
|
||||
| getSessionsWithPendingMessages | `src/services/sqlite/PendingMessageStore.ts` | 319-326 |
|
||||
| resetStuckMessages | `src/services/sqlite/PendingMessageStore.ts` | 279-290 |
|
||||

---

File: docs/reports/2026-01-04--issue-527-uv-homebrew-analysis.md (new, 112 lines)

# Issue #527: uv Detection Fails on Apple Silicon Macs with Homebrew Installation

**Date**: 2026-01-04
**Issue**: GitHub Issue #527
**Status**: Confirmed - Fix Required

## Summary

The `isUvInstalled()` function fails to detect uv when it is installed via Homebrew on an Apple Silicon Mac because it does not check the `/opt/homebrew/bin/uv` path.
## Analysis
|
||||
|
||||
### Files Affected
|
||||
|
||||
Two copies of `smart-install.js` exist in the codebase:
|
||||
|
||||
1. **Source file**: `/Users/alexnewman/Scripts/claude-mem/scripts/smart-install.js`
|
||||
2. **Built/deployed file**: `/Users/alexnewman/Scripts/claude-mem/plugin/scripts/smart-install.js`
|
||||
|
||||
### Current uv Path Detection
|
||||
|
||||
**Source file (`scripts/smart-install.js`)** - Lines 22-24:
|
||||
```javascript
|
||||
const UV_COMMON_PATHS = IS_WINDOWS
|
||||
? [join(homedir(), '.local', 'bin', 'uv.exe'), join(homedir(), '.cargo', 'bin', 'uv.exe')]
|
||||
: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv'];
|
||||
```
|
||||
|
||||
**Plugin file (`plugin/scripts/smart-install.js`)** - Lines 103-105:
|
||||
```javascript
|
||||
const uvPaths = IS_WINDOWS
|
||||
? [join(homedir(), '.local', 'bin', 'uv.exe'), join(homedir(), '.cargo', 'bin', 'uv.exe')]
|
||||
: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv'];
|
||||
```
|
||||
|
||||

### Paths Currently Checked (Unix/macOS)

| Path | Installer | Architecture |
|------|-----------|--------------|
| `~/.local/bin/uv` | Official installer | Any |
| `~/.cargo/bin/uv` | Cargo/Rust install | Any |
| `/usr/local/bin/uv` | Homebrew (Intel) | x86_64 |

### Missing Path

| Path | Installer | Architecture |
|------|-----------|--------------|
| `/opt/homebrew/bin/uv` | Homebrew (Apple Silicon) | arm64 |

## Root Cause

Homebrew installs to different prefixes depending on architecture:

- **Intel Macs (x86_64)**: `/usr/local/bin/`
- **Apple Silicon Macs (arm64)**: `/opt/homebrew/bin/`

The current implementation only includes the Intel Homebrew path, so detection fails on Apple Silicon when:

1. uv is installed via `brew install uv`
2. The user's shell PATH is not available during script execution (common in non-interactive contexts)

## Impact

Users on Apple Silicon Macs who installed uv via Homebrew will:

1. See "uv not found" errors
2. Have uv unnecessarily reinstalled via the official installer
3. End up with duplicate installations

## Recommended Fix

Add `/opt/homebrew/bin/uv` to the Unix paths array.

### Source file (`scripts/smart-install.js`) - Line 24

**Before:**
```javascript
: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv'];
```

**After:**
```javascript
: [join(homedir(), '.local', 'bin', 'uv'), join(homedir(), '.cargo', 'bin', 'uv'), '/usr/local/bin/uv', '/opt/homebrew/bin/uv'];
```

### Plugin file (`plugin/scripts/smart-install.js`) - Lines 103-105 and 222-224

The same fix should be applied in both locations where `uvPaths` is defined:

- Line 105 in `isUvInstalled()`
- Line 224 in `installUv()`

### Note: Bun Has the Same Issue

The Bun detection has the same gap:

**Current (`scripts/smart-install.js` line 20):**
```javascript
: [join(homedir(), '.bun', 'bin', 'bun'), '/usr/local/bin/bun'];
```

**Should also add:**
```javascript
: [join(homedir(), '.bun', 'bin', 'bun'), '/usr/local/bin/bun', '/opt/homebrew/bin/bun'];
```
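
The detection step behind the fix amounts to probing a fixed list of candidate paths with `existsSync`. A sketch with the arm64 Homebrew path included (`findFirstExisting` is an illustrative helper, not the actual smart-install.js API):

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";
import { homedir } from "node:os";

// Candidate uv locations on Unix/macOS, including the arm64 Homebrew prefix.
const uvUnixPaths = [
  join(homedir(), ".local", "bin", "uv"),
  join(homedir(), ".cargo", "bin", "uv"),
  "/usr/local/bin/uv",
  "/opt/homebrew/bin/uv", // Homebrew prefix on Apple Silicon
];

// Return the first candidate path that exists on disk, or null if none do.
function findFirstExisting(paths: string[]): string | null {
  for (const candidate of paths) {
    if (existsSync(candidate)) return candidate;
  }
  return null;
}
```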

## Verification

After the fix, verify by:

1. Installing uv via Homebrew on an Apple Silicon Mac
2. Running the smart-install script
3. Confirming uv is detected without attempting reinstallation

## Conclusion

**A fix is required.** The `/opt/homebrew/bin/uv` path is missing from both files. This is a simple one-line addition to the path arrays. The same fix should also be applied to the Bun detection paths for consistency.
324
docs/reports/2026-01-04--issue-532-memory-leak-analysis.md
Normal file
@@ -0,0 +1,324 @@
# Issue #532: Memory Leak in SessionManager - Analysis Report

**Date**: 2026-01-04
**Issue**: Memory leak causing 54GB+ VS Code memory consumption after several days of use
**Reported Root Causes**:

1. Sessions never auto-cleanup after the SDK agent completes
2. `conversationHistory` array grows unbounded (never trimmed)

---

## Executive Summary

This analysis confirms **both issues exist in the current codebase** (v8.5.7). While v8.5.7 included a major modular refactor, it did **not address either memory leak issue**. The `SessionManager` holds sessions indefinitely in memory with no TTL/cleanup mechanism, and `conversationHistory` arrays grow unbounded within each session (with only OpenRouter implementing partial mitigation).

---

## 1. SessionManager Session Storage Analysis

### Location

`/Users/alexnewman/Scripts/claude-mem/src/services/worker/SessionManager.ts`

### Current Implementation

```typescript
export class SessionManager {
  private sessions: Map<number, ActiveSession> = new Map();
  private sessionQueues: Map<number, EventEmitter> = new Map();
  // ...
}
```

Sessions are stored in an in-memory `Map<number, ActiveSession>` with the session database ID as the key.

### Session Lifecycle

| Event | Method | Behavior |
|-------|--------|----------|
| Session created | `initializeSession()` | Added to `this.sessions` Map (line 152) |
| Session deleted | `deleteSession()` | Removed from `this.sessions` Map (line 293) |
| Worker shutdown | `shutdownAll()` | Calls `deleteSession()` on all sessions |

### The Problem: No Automatic Cleanup

Looking at `/Users/alexnewman/Scripts/claude-mem/src/services/worker/http/routes/SessionRoutes.ts` (lines 213-216), the session completion handling has this comment:

```typescript
// NOTE: We do NOT delete the session here anymore.
// The generator waits for events, so if it exited, it's either aborted or crashed.
// Idle sessions stay in memory (ActiveSession is small) to listen for future events.
```

**Critical Finding**: Sessions are **intentionally never deleted** after the SDK agent completes. They persist indefinitely "to listen for future events."

### When Sessions ARE Deleted

Sessions are only deleted when:

1. An explicit `DELETE /sessions/:sessionDbId` HTTP request arrives (manual cleanup)
2. A `POST /sessions/:sessionDbId/complete` HTTP request arrives (cleanup-hook callback)
3. The worker service shuts down (`shutdownAll()`)

There is **NO automatic cleanup mechanism** based on:

- Session age/TTL
- Session inactivity timeout
- Memory pressure
- Completed/failed status

---

## 2. conversationHistory Analysis

### Location

`/Users/alexnewman/Scripts/claude-mem/src/services/worker-types.ts` (line 34)

### Type Definition

```typescript
export interface ActiveSession {
  // ...
  conversationHistory: ConversationMessage[]; // Shared conversation history for provider switching
  // ...
}
```

### Usage Pattern

The `conversationHistory` array is populated by three agent implementations plus the shared `ResponseProcessor`:

1. **SDKAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/SDKAgent.ts`)
   - Adds user messages at lines 247, 280, 302
   - Assistant responses added via `ResponseProcessor`

2. **GeminiAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/GeminiAgent.ts`)
   - Adds user messages at lines 143, 196, 232
   - Adds assistant responses at lines 148, 202, 238

3. **OpenRouterAgent** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/OpenRouterAgent.ts`)
   - Adds user messages at lines 103, 155, 191
   - Adds assistant responses at lines 108, 161, 197
   - **Implements truncation**: see `truncateHistory()` at lines 262-301

4. **ResponseProcessor** (`/Users/alexnewman/Scripts/claude-mem/src/services/worker/agents/ResponseProcessor.ts`)
   - Adds assistant responses at line 57

### The Problem: Unbounded Growth

**For the Claude SDK and Gemini agents**, there is **no limit or trimming** of `conversationHistory`. Every message is `push()`ed without checking array size.

**OpenRouter ONLY** has mitigation via `truncateHistory()`:

```typescript
private truncateHistory(history: ConversationMessage[]): ConversationMessage[] {
  const MAX_CONTEXT_MESSAGES = parseInt(settings.CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES) || 20;
  const MAX_ESTIMATED_TOKENS = parseInt(settings.CLAUDE_MEM_OPENROUTER_MAX_TOKENS) || 100000;

  // Sliding window: keep most recent messages within limits
  // ...
}
```

However, this only truncates the copy sent to the OpenRouter API - **it does NOT truncate the actual `session.conversationHistory` array**. The original array still grows unbounded.

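The distinction can be shown in a few lines. A stripped-down illustration (plain strings stand in for `ConversationMessage`; `truncateForApi` is a hypothetical name, not the real method):

```typescript
// Why a copy-based mitigation doesn't cap memory: `slice` allocates a new
// array for the outgoing request; the session's own history is untouched.
function truncateForApi(history: string[], maxMessages: number): string[] {
  return history.slice(-maxMessages); // sliding window over the most recent messages
}

const history: string[] = [];
for (let i = 0; i < 100; i++) history.push(`msg ${i}`);

const sent = truncateForApi(history, 20);
console.log(sent.length, history.length); // 20 100 — the original keeps growing
```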
### Memory Impact Calculation

Each `ConversationMessage` contains:

- `role`: 'user' | 'assistant' (small string)
- `content`: string (can be very large - full prompts/responses)

A typical session with 100 tool uses could have:

- 1 init prompt (~2KB)
- 100 observation prompts (~5KB each = 500KB)
- 100 responses (~1KB each = 100KB)
- 1 summary prompt + response (~5KB)

**Per session**: ~600KB in `conversationHistory` alone

After several days with many sessions, this adds up to gigabytes.

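The estimate above can be sanity-checked directly (the message sizes are the report's assumed figures, not measurements):

```typescript
// Back-of-envelope check of the per-session conversationHistory cost.
const KB = 1024;
const perSessionBytes =
  2 * KB          // 1 init prompt (~2KB)
  + 100 * 5 * KB  // 100 observation prompts (~5KB each)
  + 100 * 1 * KB  // 100 responses (~1KB each)
  + 5 * KB;       // 1 summary prompt + response (~5KB)

console.log(`${perSessionBytes / KB} KB`); // 607 KB, i.e. ~600KB per session
```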
---
## 3. v8.5.7 Refactor Assessment

The v8.5.7 release (2026-01-04) focused on modular architecture refactoring:

### What v8.5.7 DID:

- Extracted SQLite repositories into `/src/services/sqlite/`
- Extracted worker agents into `/src/services/worker/agents/`
- Extracted search strategies into `/src/services/worker/search/`
- Extracted context generation into `/src/services/context/`
- Extracted infrastructure into `/src/services/infrastructure/`
- Added 595 tests across 36 test files

### What v8.5.7 DID NOT address:

- No session TTL or automatic cleanup mechanism
- No `conversationHistory` size limits for Claude SDK or Gemini
- No memory pressure monitoring for sessions
- The "sessions stay in memory" design comment was already present

**Relevant v8.5.2 Note**: There was a related fix for an SDK Agent child process memory leak (orphaned Claude processes), but that addressed process cleanup, not in-memory session state.

---

## 4. Specific Code Locations Requiring Fixes

### Fix Location 1: SessionManager needs a cleanup mechanism

**File**: `/Users/alexnewman/Scripts/claude-mem/src/services/worker/SessionManager.ts`

Add automatic session cleanup based on:

- Session completion (when the generator finishes and no pending work remains)
- Session age TTL (e.g., 1 hour after last activity)
- Memory pressure (configurable max sessions)

### Fix Location 2: conversationHistory needs bounds

**Files**:

- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/SDKAgent.ts`
- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/GeminiAgent.ts`
- `/Users/alexnewman/Scripts/claude-mem/src/services/worker/agents/ResponseProcessor.ts`

Apply sliding window truncation similar to OpenRouterAgent's approach, but mutate the original array.

### Fix Location 3: Session cleanup on completion

**File**: `/Users/alexnewman/Scripts/claude-mem/src/services/worker/http/routes/SessionRoutes.ts`

Remove the design decision to keep idle sessions in memory. Add a cleanup timer after the generator completes.

---

## 5. Recommended Fixes

### Fix 1: Add Session TTL and Cleanup Timer

```typescript
// In SessionManager.ts

private readonly SESSION_TTL_MS = 60 * 60 * 1000; // 1 hour
private cleanupTimers: Map<number, NodeJS.Timeout> = new Map();

/**
 * Schedule automatic cleanup for idle sessions
 */
scheduleSessionCleanup(sessionDbId: number): void {
  // Clear existing timer if any
  const existingTimer = this.cleanupTimers.get(sessionDbId);
  if (existingTimer) {
    clearTimeout(existingTimer);
  }

  // Schedule cleanup after TTL
  const timer = setTimeout(() => {
    const session = this.sessions.get(sessionDbId);
    if (session && !session.generatorPromise) {
      // Only delete if no active generator
      this.deleteSession(sessionDbId);
      logger.info('SESSION', 'Session auto-cleaned due to TTL', { sessionDbId });
    }
  }, this.SESSION_TTL_MS);

  this.cleanupTimers.set(sessionDbId, timer);
}

/**
 * Cancel cleanup timer (call when session receives new work)
 */
cancelSessionCleanup(sessionDbId: number): void {
  const timer = this.cleanupTimers.get(sessionDbId);
  if (timer) {
    clearTimeout(timer);
    this.cleanupTimers.delete(sessionDbId);
  }
}
```
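
One caveat when wiring this in: `deleteSession()` should also release any pending timer, or the `cleanupTimers` map itself leaks entries. A self-contained sketch of that bookkeeping (the `TimerBookkeeping` class is a hypothetical stand-in, not code from the repository):

```typescript
// Hypothetical companion change: always release the timer alongside the session.
class TimerBookkeeping {
  private sessions = new Map<number, object>();
  private cleanupTimers = new Map<number, ReturnType<typeof setTimeout>>();

  add(id: number): void {
    this.sessions.set(id, {});
  }

  schedule(id: number, ttlMs: number): void {
    clearTimeout(this.cleanupTimers.get(id)); // replace any existing timer
    this.cleanupTimers.set(id, setTimeout(() => this.delete(id), ttlMs));
  }

  delete(id: number): void {
    const timer = this.cleanupTimers.get(id);
    if (timer) clearTimeout(timer); // release the timer entry too
    this.cleanupTimers.delete(id);
    this.sessions.delete(id);
  }

  counts(): [number, number] {
    return [this.sessions.size, this.cleanupTimers.size];
  }
}

const b = new TimerBookkeeping();
b.add(1);
b.schedule(1, 60_000);
b.delete(1);
console.log(b.counts()); // both maps empty — no stray timer left behind
```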

### Fix 2: Add conversationHistory Bounds

```typescript
// In src/services/worker/SessionManager.ts or a new utility file

const MAX_CONVERSATION_HISTORY_LENGTH = 50; // Configurable

/**
 * Trim conversation history to prevent unbounded growth
 * Keeps the most recent messages
 */
export function trimConversationHistory(session: ActiveSession): void {
  if (session.conversationHistory.length > MAX_CONVERSATION_HISTORY_LENGTH) {
    const toRemove = session.conversationHistory.length - MAX_CONVERSATION_HISTORY_LENGTH;
    session.conversationHistory.splice(0, toRemove);
    logger.debug('SESSION', 'Trimmed conversation history', {
      sessionDbId: session.sessionDbId,
      removed: toRemove,
      remaining: session.conversationHistory.length
    });
  }
}
```

Then call this after each message is added in SDKAgent, GeminiAgent, and ResponseProcessor.
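
Hooked in after every `push()`, the bound behaves as a sliding window. A self-contained demonstration with a deliberately tiny limit (the real limit would be 50; the minimal interfaces here are reconstructions for the demo):

```typescript
interface ConversationMessage { role: 'user' | 'assistant'; content: string; }
interface ActiveSession { sessionDbId: number; conversationHistory: ConversationMessage[]; }

const MAX_HISTORY = 3; // tiny bound for illustration only

function trimConversationHistory(session: ActiveSession): void {
  const excess = session.conversationHistory.length - MAX_HISTORY;
  if (excess > 0) session.conversationHistory.splice(0, excess); // drop oldest in place
}

const session: ActiveSession = { sessionDbId: 1, conversationHistory: [] };
for (let i = 0; i < 5; i++) {
  session.conversationHistory.push({ role: 'user', content: `msg ${i}` });
  trimConversationHistory(session);
}
console.log(session.conversationHistory.map(m => m.content)); // ['msg 2', 'msg 3', 'msg 4']
```

Because `splice` mutates the array itself, memory is actually reclaimed, unlike the copy-based OpenRouter approach.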

### Fix 3: Update SessionRoutes Generator Completion

```typescript
// In SessionRoutes.ts, update the finally block (around line 164)

.finally(() => {
  const sessionDbId = session.sessionDbId;
  const wasAborted = session.abortController.signal.aborted;

  if (wasAborted) {
    logger.info('SESSION', `Generator aborted`, { sessionId: sessionDbId });
  } else {
    logger.info('SESSION', `Generator completed naturally`, { sessionId: sessionDbId });
  }

  session.generatorPromise = null;
  session.currentProvider = null;
  this.workerService.broadcastProcessingStatus();

  // Check for pending work
  const pendingStore = this.sessionManager.getPendingMessageStore();
  const pendingCount = pendingStore.getPendingCount(sessionDbId);

  if (pendingCount > 0 && !wasAborted) {
    // Restart for pending work
    // ... existing restart logic ...
  } else {
    // No pending work - schedule cleanup instead of keeping forever
    this.sessionManager.scheduleSessionCleanup(sessionDbId);
  }
});
```

---

## 6. Configuration Recommendations

Add these to the `settings.json` defaults:

```json
{
  "CLAUDE_MEM_SESSION_TTL_MINUTES": 60,
  "CLAUDE_MEM_MAX_CONVERSATION_HISTORY": 50,
  "CLAUDE_MEM_MAX_ACTIVE_SESSIONS": 100
}
```

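Read with the same parse-or-default pattern `truncateHistory()` already uses, these would resolve as follows (the `settings` object here is a stand-in for the loaded settings file, shown empty so the defaults apply):

```typescript
// Stand-in for the parsed settings.json; empty means "use defaults".
const settings: Record<string, string | undefined> = {};

const SESSION_TTL_MS =
  (parseInt(settings.CLAUDE_MEM_SESSION_TTL_MINUTES ?? '') || 60) * 60 * 1000;
const MAX_CONVERSATION_HISTORY =
  parseInt(settings.CLAUDE_MEM_MAX_CONVERSATION_HISTORY ?? '') || 50;
const MAX_ACTIVE_SESSIONS =
  parseInt(settings.CLAUDE_MEM_MAX_ACTIVE_SESSIONS ?? '') || 100;

console.log(SESSION_TTL_MS, MAX_CONVERSATION_HISTORY, MAX_ACTIVE_SESSIONS); // 3600000 50 100
```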
---
## 7. Testing Recommendations

Add tests for:

1. Session cleanup after TTL expires
2. `conversationHistory` trimming at various sizes
3. Memory monitoring under sustained load
4. Cleanup timer cancellation on new work

---

## Summary

| Issue | Status in v8.5.7 | Fix Required |
|-------|------------------|--------------|
| Sessions never auto-cleanup | NOT FIXED | Yes - add TTL/cleanup mechanism |
| conversationHistory unbounded | NOT FIXED (except partial OpenRouter mitigation) | Yes - add trimming to all agents |

Both memory leaks are confirmed to exist in the current codebase and require the fixes outlined above.

@@ -1,6 +1,6 @@
 {
   "name": "claude-mem-plugin",
-  "version": "8.5.6",
+  "version": "8.5.7",
   "private": true,
   "description": "Runtime dependencies for claude-mem bundled hooks",
   "type": "module",
File diff suppressed because one or more lines are too long