mirror of
https://github.com/thedotmack/claude-mem
synced 2026-04-25 17:15:04 +02:00
# Architecture Evolution: The Journey from v3 to v5

## The Problem We Solved

**Goal:** Create a memory system that makes Claude smarter across sessions without the user noticing it exists.

**Challenge:** How do you observe AI agent behavior, compress it intelligently, and serve it back at the right time - all without slowing down or interfering with the main workflow?

This is the story of how claude-mem evolved from a simple idea to a production-ready system, and the key architectural decisions that made it work.

---

## v5.x: Maturity and User Experience

After establishing the solid v4 architecture, v5.x focused on user experience, visualization, and polish.

### v5.1.2: Theme Toggle (November 2025)

**What Changed**: Added light/dark mode theme toggle to viewer UI

**New Features**:
- User-selectable theme preference (light, dark, system)
- Persistent theme settings in localStorage
- Smooth theme transitions
- System preference detection

**Implementation**:
```typescript
// Theme context with persistence
type Theme = 'light' | 'dark' | 'system';

const ThemeProvider = ({ children }) => {
  const [theme, setTheme] = useState<Theme>(() => {
    // getItem returns string | null, so narrow it before falling back
    return (localStorage.getItem('claude-mem-theme') as Theme | null) || 'system';
  });

  // Persist every change so the choice survives reloads
  useEffect(() => {
    localStorage.setItem('claude-mem-theme', theme);
  }, [theme]);

  return (
    <ThemeContext.Provider value={{ theme, setTheme }}>
      {children}
    </ThemeContext.Provider>
  );
};
```

**Why It Matters**: Users working in different lighting conditions can now customize the viewer for comfort.

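The `system` option implies one more step that the provider snippet does not show: mapping the stored preference to the theme that is actually rendered. The following is a minimal sketch under stated assumptions, not the viewer's real code. `parsePreference` and `resolveTheme` are hypothetical helpers, and the OS preference is passed in as a boolean (in a browser it could come from `window.matchMedia('(prefers-color-scheme: dark)').matches`).

```typescript
type ThemePreference = 'light' | 'dark' | 'system';
type AppliedTheme = 'light' | 'dark';

// Hypothetical helper: narrows an untrusted stored string to a valid preference.
function parsePreference(stored: string | null): ThemePreference {
  return stored === 'light' || stored === 'dark' || stored === 'system'
    ? stored
    : 'system'; // unknown or missing values fall back to the system setting
}

// Hypothetical helper: maps the preference to the theme that gets rendered.
function resolveTheme(pref: ThemePreference, systemPrefersDark: boolean): AppliedTheme {
  if (pref === 'system') {
    return systemPrefersDark ? 'dark' : 'light';
  }
  return pref;
}
```

Keeping both helpers free of DOM access makes them easy to test; only the caller needs to touch `localStorage` or `matchMedia`.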
### v5.1.1: Worker Startup Fix (November 2025) - Now Deprecated

**Note**: This section describes a historical PM2-based approach that has been replaced with Bun in later versions.

**The Problem**: Worker startup failed on Windows with an ENOENT error when using PM2

**Historical Solution**: Used the full path to the PM2 binary instead of relying on PATH

**Current Approach**: The project now uses Bun for process management, which provides better cross-platform compatibility and eliminates these PATH-related issues.

**Impact**: Cross-platform compatibility was restored, so Windows users can use claude-mem without these startup failures.

### v5.1.0: Web-Based Viewer UI (October 2025)

**The Breakthrough**: Real-time visualization of memory stream

**What We Built**:
- React-based web UI at http://localhost:37777
- Server-Sent Events (SSE) for real-time updates
- Infinite scroll pagination
- Project filtering
- Settings persistence (sidebar state, selected project)
- Auto-reconnection with exponential backoff
- GPU-accelerated animations

**New Worker Endpoints** (8 additions):
```
GET  /                    # Serves viewer HTML
GET  /stream              # SSE real-time updates
GET  /api/prompts         # Paginated user prompts
GET  /api/observations    # Paginated observations
GET  /api/summaries       # Paginated session summaries
GET  /api/stats           # Database statistics
GET  /api/settings        # User settings
POST /api/settings        # Save settings
```

**Database Enhancements**:
```typescript
// New SessionStore methods for viewer
getRecentPrompts(limit, offset, project?)
getRecentObservations(limit, offset, project?)
getRecentSummaries(limit, offset, project?)
getStats()
getUniqueProjects()
```

**React Architecture**:
```
src/ui/viewer/
├── components/
│   ├── Header.tsx          # Navigation + stats
│   ├── Sidebar.tsx         # Project filter
│   ├── Feed.tsx            # Infinite scroll
│   └── cards/
│       ├── ObservationCard.tsx
│       ├── PromptCard.tsx
│       ├── SummaryCard.tsx
│       └── SkeletonCard.tsx
├── hooks/
│   ├── useSSE.ts           # Real-time events
│   ├── usePagination.ts    # Infinite scroll
│   ├── useSettings.ts      # Persistence
│   └── useStats.ts         # Statistics
└── utils/
    ├── merge.ts            # Data deduplication
    └── format.ts           # Display formatting
```

**Build Process**:
```typescript
// esbuild bundles everything into single HTML file
esbuild.build({
  entryPoints: ['src/ui/viewer/index.tsx'],
  bundle: true,
  outfile: 'plugin/ui/viewer.html',
  loader: { '.tsx': 'tsx', '.woff2': 'dataurl' },
  define: { 'process.env.NODE_ENV': '"production"' },
});
```

**Why It Matters**: Users can now see exactly what's being captured in real-time, making the memory system transparent and debuggable.

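The feature list mentions auto-reconnection with exponential backoff but does not show the schedule. As a rough sketch only: the helper name `reconnectDelay` and the 1-second base and 30-second cap are assumptions chosen for illustration, not values taken from claude-mem.

```typescript
// Hypothetical helper: delay in ms before reconnect attempt number `attempt` (0-based).
// The delay doubles with each failed attempt and is capped at maxMs.
function reconnectDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponential = baseMs * 2 ** Math.max(0, attempt);
  return Math.min(exponential, maxMs);
}

// Delays for the first six attempts; the sixth would be 32s and is capped at 30s.
const schedule = [0, 1, 2, 3, 4, 5].map((n) => reconnectDelay(n));
```

Browsers' built-in `EventSource` already retries after a dropped connection, so a custom schedule like this typically means closing the source in `onerror` and reopening it after the computed delay.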
### v5.0.3: Smart Install Caching (October 2025)

**The Problem**: `npm install` ran on every SessionStart (2-5 seconds)

**The Insight**: Dependencies rarely change between sessions

**The Solution**: Version-based caching
```typescript
import { existsSync, readFileSync, writeFileSync } from 'fs';

// Check version marker before installing
const currentVersion = getPackageVersion();
// The marker does not exist on first run, so guard the read
const installedVersion = existsSync('.install-version')
  ? readFileSync('.install-version', 'utf-8').trim()
  : null;

if (currentVersion !== installedVersion) {
  // Only install if the version changed or no marker exists yet
  await runNpmInstall();
  writeFileSync('.install-version', currentVersion);
}
```

**Cached Check Logic**:
1. Does `node_modules` exist?
2. Does `.install-version` match `package.json` version?
3. Is `better-sqlite3` present? (Legacy: now uses bun:sqlite which requires no installation)

**Impact**:
- SessionStart hook: 2-5 seconds → 10ms (at least 99.5% less time)
- Only installs on: first run, version change, missing deps
- Better Windows error messages with build tool help

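The cached check reduces to a small decision that can be separated from the file system calls. The sketch below is an illustration under assumptions: `InstallState` and `needsInstall` are hypothetical names, and the legacy `better-sqlite3` check is left out because the document notes it no longer applies.

```typescript
// Hypothetical snapshot of what the installer would read from disk.
interface InstallState {
  nodeModulesExists: boolean;
  installedVersion: string | null; // contents of .install-version, or null if the file is missing
  packageVersion: string;          // version field from package.json
}

// Returns true when a fresh install is required.
function needsInstall(state: InstallState): boolean {
  if (!state.nodeModulesExists) return true;        // first run or deleted dependencies
  if (state.installedVersion === null) return true; // no marker written yet
  return state.installedVersion.trim() !== state.packageVersion; // version changed
}
```

Only when this returns `true` would the slow path (install, then rewrite the marker) run; every other SessionStart takes the fast path.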
### v5.0.2: Worker Health Checks (October 2025)

**What Changed**: More robust worker startup and monitoring

**New Features**:
```typescript
// Health check endpoint
app.get('/health', (req, res) => {
  res.json({
    status: 'ok',
    uptime: process.uptime(),
    port: WORKER_PORT,
    memory: process.memoryUsage(),
  });
});

// Smart worker startup
async function ensureWorkerHealthy() {
  const healthy = await isWorkerHealthy(1000);
  if (!healthy) {
    await startWorker();
    await waitForWorkerHealth(10000);
  }
}
```

**Benefits**:
- Graceful degradation when worker is down
- Auto-recovery from crashes
- Better error messages for debugging

### v5.0.1: Stability Improvements (October 2025)

**What Changed**: Various bug fixes and stability enhancements

**Key Fixes**:
- Fixed race conditions in observation queue processing
- Improved error handling in SDK worker
- Better cleanup of stale worker processes
- Enhanced logging for debugging

### v5.0.0: Hybrid Search Architecture (October 2025)

**The Evolution**: SQLite FTS5 + Chroma vector search

**What We Added**:
```
┌─────────────────────────────────────────────────────────┐
│                    HYBRID SEARCH                        │
│                                                         │
│  Text Query → SQLite FTS5 (keyword matching)            │
│                  ↓                                      │
│               Chroma Vector Search (semantic)           │
│                  ↓                                      │
│               Merge + Re-rank Results                   │
└─────────────────────────────────────────────────────────┘
```

**New Dependencies**:
- `chromadb` - Vector database for semantic search
- Python 3.8+ - Required by chromadb

**MCP Tools Enhancement**:
```typescript
// Chroma-backed semantic search
search_observations({
  query: "authentication bug",
  useSemanticSearch: true  // Uses Chroma
});

// Falls back to FTS5 if Chroma unavailable
```

**Why Hybrid**:
- FTS5: Fast keyword matching, no dependencies
- Chroma: Semantic understanding, finds related concepts
- Graceful degradation: Works without Chroma (FTS5 only)

**Trade-offs**:
- Added Python dependency (optional)
- Increased installation complexity
- Better search relevance

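The diagram ends with a "Merge + Re-rank" step, but this document does not say how the two result lists are combined. One widely used option is reciprocal rank fusion, sketched here purely as an assumption about what such a step could look like: `fuseRankings`, the constant `k = 60`, and the sample ids are all illustrative and not taken from claude-mem.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
// Ids that rank well in several lists accumulate the highest scores.
function fuseRankings(rankings: number[][], k = 60): number[] {
  const scores = new Map<number, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1]) // highest fused score first
    .map(([id]) => id);
}

const ftsHits = [3, 1, 7];    // observation ids from keyword search, best first
const vectorHits = [7, 3, 9]; // observation ids from semantic search, best first
const merged = fuseRankings([ftsHits, vectorHits]);
// Ids 3 and 7 appear in both lists, so they rank ahead of 1 and 9.
```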
---

## v1-v2: The Naive Approach

### The First Attempt: Dump Everything

**Architecture:**
```
PostToolUse Hook → Save raw tool outputs → Retrieve everything on startup
```

**What we learned:**
- ❌ Context pollution (thousands of tokens of irrelevant data)
- ❌ No compression (raw tool outputs are verbose)
- ❌ No search (had to scan everything linearly)
- ✅ Proved the concept: Memory across sessions is valuable

**Example of what went wrong:**
```
SessionStart loaded:
- 150 file read operations
- 80 grep searches
- 45 bash commands
- Total: ~35,000 tokens
- Relevant to current task: ~500 tokens (1.4%)
```

---

## v3: Smart Compression, Wrong Architecture

### The Breakthrough: AI-Powered Compression

**New idea:** Use Claude itself to compress observations

**Architecture:**
```
PostToolUse Hook → Queue observation → SDK Worker → AI compression → Store insights
```

**What we added:**
1. **Claude Agent SDK integration** - Use AI to compress observations
2. **Background worker** - Don't block main session
3. **Structured observations** - Extract facts, decisions, insights
4. **Session summaries** - Generate comprehensive summaries

**What worked:**
- ✅ Compression ratio: 10:1 to 100:1
- ✅ Semantic understanding (not just keyword matching)
- ✅ Background processing (hooks stayed fast)
- ✅ Search became useful

**What didn't work:**
- ❌ Still loaded everything upfront
- ❌ Session ID management was broken
- ❌ Aggressive cleanup interrupted summaries
- ❌ Multiple SDK sessions per Claude Code session

---

## The Key Realizations

### Realization 1: Progressive Disclosure

**Problem:** Even compressed observations can pollute context if you load them all.

**Insight:** Humans don't read everything before starting work. Why should AI?

**Solution:** Show an index first, fetch details on-demand.

```
❌ Old: Load 50 observations (8,500 tokens)
✅ New: Show index of 50 observations (800 tokens)
        Agent fetches 2-3 relevant ones (300 tokens)
        Total: 1,100 tokens vs 8,500 tokens
```

**Impact:**
- 87% reduction in context usage
- 100% relevance (only fetch what's needed)
- Agent autonomy (decides what's relevant)

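The 87% figure follows directly from the token counts in the example. A short worked calculation, using only those numbers (the variable names are illustrative):

```typescript
// Token counts taken from the example in this section.
const fullLoadTokens = 8500; // loading all 50 observations up front
const indexTokens = 800;     // index of the same 50 observations
const fetchedTokens = 300;   // the 2-3 observations fetched on demand

const progressiveTokens = indexTokens + fetchedTokens;    // index plus fetched details
const reduction = 1 - progressiveTokens / fullLoadTokens; // fraction of context saved
const reductionPercent = Math.round(reduction * 100);     // rounds to 87
```

The saving depends on how many observations the agent actually fetches: each extra fetch adds its full size back, so the index only pays off when most entries stay unopened.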
### Realization 2: Session ID Chaos

**Problem:** SDK session IDs change on every turn.

**What we thought:**
```typescript
// ❌ Wrong assumption
UserPromptSubmit → Capture session ID once → Use forever
```

**Reality:**
```typescript
// ✅ Actual behavior
Turn 1: session_abc123
Turn 2: session_def456
Turn 3: session_ghi789
```

**Why this matters:**
- Can't resume sessions without tracking ID updates
- Session state gets lost between turns
- Observations get orphaned

**Solution:**
```typescript
// Capture from system init message
for await (const msg of response) {
  if (msg.type === 'system' && msg.subtype === 'init') {
    sdkSessionId = msg.session_id;
    await updateSessionId(sessionId, sdkSessionId);
  }
}
```

### Realization 3: Graceful vs Aggressive Cleanup

**v3 approach:**
```typescript
// ❌ Aggressive: Kill worker immediately
SessionEnd → DELETE /worker/session → Worker stops
```

**Problems:**
- Summary generation interrupted mid-process
- Pending observations lost
- Race conditions everywhere

**v4 approach:**
```typescript
// ✅ Graceful: Let worker finish
SessionEnd → Mark session complete → Worker finishes → Exit naturally
```

**Benefits:**
- Summaries complete successfully
- No lost observations
- Clean state transitions

**Code:**
```typescript
// v3: Aggressive (renamed here so both versions can sit in one block)
async function sessionEndAggressive(sessionId: string) {
  await fetch(`http://localhost:37777/sessions/${sessionId}`, {
    method: 'DELETE'
  });
}

// v4: Graceful - only mark the session complete; the worker exits on its own
async function sessionEndGraceful(sessionId: string) {
  await db.run(
    'UPDATE sdk_sessions SET completed_at = ? WHERE id = ?',
    [Date.now(), sessionId]
  );
}
```

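The graceful approach only works if the worker itself decides when to stop. A minimal sketch of that exit condition follows, assuming a hypothetical in-memory `WorkerSession` shape in place of the real database rows: the worker leaves only when the session is marked complete and nothing is left in the queue.

```typescript
// Hypothetical in-memory stand-in for the session row and its observation queue.
interface WorkerSession {
  completedAt: number | null; // set by the SessionEnd hook
  pending: string[];          // queued observations not yet processed
}

// Exit only when the session is marked complete AND the queue is empty.
function shouldWorkerExit(session: WorkerSession): boolean {
  return session.completedAt !== null && session.pending.length === 0;
}

// One worker iteration: process a single pending observation, if any.
function workerStep(session: WorkerSession, done: string[]): void {
  const next = session.pending.shift();
  if (next !== undefined) done.push(next);
}

// SessionEnd has already fired, but two observations are still queued.
const endedSession: WorkerSession = { completedAt: 1700000000000, pending: ['obs-1', 'obs-2'] };
const handled: string[] = [];
while (!shouldWorkerExit(endedSession)) {
  workerStep(endedSession, handled);
}
// Both queued observations are handled before the loop ends.
```

A real worker would sleep between polls while the session is still open; the loop here terminates immediately only because the session is already marked complete.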
### Realization 4: One Session, Not Many

**Problem:** We were creating multiple SDK sessions per Claude Code session.

**What we thought:**
```
Claude Code session → Create SDK session per observation → 100+ SDK sessions
```

**Reality should be:**
```
Claude Code session → ONE long-running SDK session → Streaming input
```

**Why this matters:**
- SDK maintains conversation state
- Context accumulates naturally
- Much more efficient

**Implementation:**
```typescript
// ✅ Streaming Input Mode
async function* messageGenerator(): AsyncIterable<UserMessage> {
  // Initial prompt
  yield {
    role: "user",
    content: "You are a memory assistant..."
  };

  // Then continuously yield observations
  while (session.status === 'active') {
    const observations = await pollQueue();
    for (const obs of observations) {
      yield {
        role: "user",
        content: formatObservation(obs)
      };
    }
    await sleep(1000);
  }
}

const response = query({
  prompt: messageGenerator(),
  options: { maxTurns: 1000 }
});
```

---

## v4: The Architecture That Works

### The Core Design

```
┌─────────────────────────────────────────────────────────┐
│                   CLAUDE CODE SESSION                   │
│  User → Claude → Tools (Read, Edit, Write, Bash)        │
│                        ↓                                │
│                  PostToolUse Hook                       │
│                (queues observation)                     │
└─────────────────────────────────────────────────────────┘
                          ↓ SQLite queue
┌─────────────────────────────────────────────────────────┐
│                   SDK WORKER PROCESS                    │
│  ONE streaming session per Claude Code session          │
│                                                         │
│  AsyncIterable<UserMessage>                             │
│    → Yields observations from queue                     │
│    → SDK compresses via AI                              │
│    → Parses XML responses                               │
│    → Stores in database                                 │
└─────────────────────────────────────────────────────────┘
                          ↓ SQLite storage
┌─────────────────────────────────────────────────────────┐
│                      NEXT SESSION                       │
│  SessionStart Hook                                      │
│    → Queries database                                   │
│    → Returns progressive disclosure index               │
│    → Agent fetches details via MCP                      │
└─────────────────────────────────────────────────────────┘
```

### The Five Hook Architecture

<Tabs>
  <Tab title="SessionStart">
  **Purpose:** Inject context from previous sessions

  **Timing:** When Claude Code starts

  **What it does:**
  - Queries last 10 session summaries
  - Formats as progressive disclosure index
  - Injects into context via stdout

  **Key change from v3:**
  - ✅ Index format (not full details)
  - ✅ Token counts visible
  - ✅ MCP search instructions included
  </Tab>

  <Tab title="UserPromptSubmit">
  **Purpose:** Initialize session tracking

  **Timing:** Before Claude processes prompt

  **What it does:**
  - Creates session record
  - Saves raw user prompt (v4.2.0+)
  - Starts worker if needed

  **Key change from v3:**
  - ✅ Stores raw prompts for search
  - ✅ Auto-starts worker service
  </Tab>

  <Tab title="PostToolUse">
  **Purpose:** Capture tool observations

  **Timing:** After every tool execution

  **What it does:**
  - Enqueues observation in database
  - Returns immediately

  **Key change from v3:**
  - ✅ Just enqueues (doesn't process)
  - ✅ Worker handles all AI calls
  </Tab>

  <Tab title="Summary">
  **Purpose:** Generate session summaries

  **Timing:** Worker-triggered (mid-session)

  **What it does:**
  - Gathers observations
  - Sends to Claude for summarization
  - Stores structured summary

  **Key change from v3:**
  - ✅ Multiple summaries per session
  - ✅ Summaries are checkpoints, not endings
  </Tab>

  <Tab title="SessionEnd">
  **Purpose:** Graceful cleanup

  **Timing:** When session ends

  **What it does:**
  - Marks session complete
  - Lets worker finish processing

  **Key change from v3:**
  - ✅ Graceful (not aggressive)
  - ✅ No DELETE requests
  - ✅ Worker finishes naturally
  </Tab>
</Tabs>

### Database Schema Evolution

**v3 schema:**
```sql
-- Simple, flat structure
CREATE TABLE observations (
  id INTEGER PRIMARY KEY,
  session_id TEXT,
  text TEXT,
  created_at INTEGER
);
```

**v4 schema:**
```sql
-- Rich, structured schema
CREATE TABLE observations (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  session_id TEXT NOT NULL,
  project TEXT NOT NULL,

  -- Progressive disclosure metadata
  title TEXT NOT NULL,
  subtitle TEXT,
  type TEXT NOT NULL,  -- decision, bugfix, feature, etc.

  -- Content
  narrative TEXT NOT NULL,
  facts TEXT,          -- JSON array

  -- Searchability
  concepts TEXT,       -- JSON array of tags
  files_read TEXT,     -- JSON array
  files_modified TEXT, -- JSON array

  -- Timestamps
  created_at TEXT NOT NULL,
  created_at_epoch INTEGER NOT NULL,

  FOREIGN KEY(session_id) REFERENCES sdk_sessions(id)
);

-- FTS5 for full-text search
CREATE VIRTUAL TABLE observations_fts USING fts5(
  title, subtitle, narrative, facts, concepts,
  content=observations
);

-- Auto-sync triggers
CREATE TRIGGER observations_ai AFTER INSERT ON observations BEGIN
  INSERT INTO observations_fts(rowid, title, subtitle, narrative, facts, concepts)
  VALUES (new.id, new.title, new.subtitle, new.narrative, new.facts, new.concepts);
END;
```

**What changed:**
- ✅ Structured fields (title, subtitle, type)
- ✅ FTS5 full-text search
- ✅ Project-scoped queries
- ✅ Rich metadata for progressive disclosure

### Worker Service Redesign

**v3 worker:**
```typescript
// Multiple short SDK sessions
app.post('/process', async (req, res) => {
  const response = await query({
    prompt: buildPrompt(req.body),
    options: { maxTurns: 1 }
  });

  for await (const msg of response) {
    // Process single observation
  }

  res.json({ success: true });
});
```

**v4 worker:**
```typescript
// ONE long-running SDK session
async function runWorker(sessionId: string) {
  const response = query({
    prompt: messageGenerator(),  // AsyncIterable
    options: { maxTurns: 1000 }
  });

  for await (const msg of response) {
    if (msg.type === 'text') {
      parseObservations(msg.content);
      parseSummaries(msg.content);
    }
  }
}
```

**Benefits:**
- Maintains conversation state
- SDK handles context automatically
- More efficient (fewer API calls)
- Natural multi-turn flow

---

## Critical Fixes Along the Way

### Fix 1: Context Injection Pollution (v4.3.1)

**Problem:** SessionStart hook output polluted with npm install logs

```bash
# Hook output contained:
npm WARN deprecated ...
npm WARN deprecated ...
{"hookSpecificOutput": {"additionalContext": "..."}}
```

**Why it broke:**
- Claude Code expects clean JSON or plain text
- stderr/stdout from npm install mixed with hook output
- Context didn't inject properly

**Solution:**
```json
{
  "command": "npm install --loglevel=silent && node context-hook.js"
}
```

**Result:** Clean JSON output, context injection works

### Fix 2: Double Shebang Issue (v4.3.1)

**Problem:** Hook executables had duplicate shebangs

```javascript
#!/usr/bin/env node
#!/usr/bin/env node  // ← Duplicate!

// Rest of code...
```

**Why it happened:**
- Source files had shebang
- esbuild added another shebang during build

**Solution:**
```typescript
// Remove shebangs from source files
// Let esbuild add them during build
```

**Result:** Clean executables, no parsing errors

### Fix 3: FTS5 Injection Vulnerability (v4.2.3)

**Problem:** User input passed directly to FTS5 query

```typescript
// ❌ Vulnerable
const results = db.query(
  `SELECT * FROM observations_fts WHERE observations_fts MATCH '${userQuery}'`
);
```

**Attack:**
```typescript
userQuery = "'; DROP TABLE observations; --"
```

**Solution:**
```typescript
// ✅ Safe: Use parameterized queries
const results = db.query(
  'SELECT * FROM observations_fts WHERE observations_fts MATCH ?',
  [userQuery]
);
```

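Parameter binding stops the SQL injection, but the bound string is still parsed by FTS5's own query syntax, so input containing characters such as `"` or bare keywords like `AND` can raise a syntax error or change what is matched. One common mitigation, sketched here with a hypothetical `toFtsPhraseQuery` helper (an assumption for illustration, not the project's code), is to quote each whitespace-separated term as an FTS5 string and double any embedded double quotes:

```typescript
// Hypothetical helper: turns raw user input into a query made only of quoted FTS5 strings.
// Inside an FTS5 string, a literal double quote is written as two double quotes.
function toFtsPhraseQuery(input: string): string {
  return input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(' ');
}

// Quotes and the bare keyword AND are neutralised into plain search terms.
const safeQuery = toFtsPhraseQuery(`auth "bug" AND`);
```

The result would still be passed as a bound parameter. An empty or whitespace-only input yields an empty string, which a caller would need to handle before running `MATCH`.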
### Fix 4: NOT NULL Constraint Violation (v4.2.8)

**Problem:** Session creation failed when prompt was empty

```sql
INSERT INTO sdk_sessions (claude_session_id, user_prompt, ...)
VALUES ('abc123', NULL, ...)  -- ❌ user_prompt is NOT NULL
```

**Solution:**
```typescript
// Allow NULL user_prompts
user_prompt: input.prompt ?? null
```

**Schema change:**
```sql
-- Before
user_prompt TEXT NOT NULL

-- After
user_prompt TEXT  -- Nullable
```

---

## Performance Improvements

### Optimization 1: Prepared Statements

**Before:**
```typescript
for (const obs of observations) {
  db.run(`INSERT INTO observations (...) VALUES (?, ?, ...)`, [obs.id, obs.text, ...]);
}
```

**After:**
```typescript
const stmt = db.prepare(`INSERT INTO observations (...) VALUES (?, ?, ...)`);
for (const obs of observations) {
  stmt.run([obs.id, obs.text, ...]);
}
stmt.finalize();
```

**Impact:** 5x faster bulk inserts

### Optimization 2: FTS5 Indexing

**Before:**
```typescript
// Manual full-text search
const results = db.query(
  `SELECT * FROM observations WHERE text LIKE '%${query}%'`
);
```

**After:**
```typescript
// FTS5 virtual table
const results = db.query(
  `SELECT * FROM observations_fts WHERE observations_fts MATCH ?`,
  [query]
);
```

**Impact:** 100x faster searches on large datasets

### Optimization 3: Index Format Default

**Before:**
```typescript
// Always return full observations
search_observations({ query: "hooks" });
// Returns: 5,000 tokens
```

**After:**
```typescript
// Default to index format
search_observations({ query: "hooks", format: "index" });
// Returns: 200 tokens

// Fetch full only when needed
search_observations({ query: "hooks", format: "full", limit: 1 });
// Returns: 150 tokens
```

**Impact:** 25x reduction in average search result size

---

## What We Learned

### Lesson 1: Context is Precious

**Principle:** Every token you put in the context window costs attention.

**Application:**
- Progressive disclosure reduces waste by 87%
- Index-first approach gives agent control
- Token counts make costs visible

### Lesson 2: Session State is Complicated

**Principle:** Distributed state is hard. The SDK handles it better than we can.

**Application:**
- Use SDK's built-in session resumption
- Don't try to manually reconstruct state
- Track session IDs from init messages

### Lesson 3: Graceful Beats Aggressive

**Principle:** Let processes finish their work before terminating.

**Application:**
- Graceful cleanup prevents data loss
- Workers finish important operations
- Clean state transitions reduce bugs

### Lesson 4: AI is the Compressor

**Principle:** Don't compress manually. Let AI do semantic compression.

**Application:**
- 10:1 to 100:1 compression ratios
- Semantic understanding, not keyword extraction
- Structured outputs (XML parsing)

### Lesson 5: Progressive Everything

**Principle:** Show metadata first, fetch details on-demand.

**Application:**
- Progressive disclosure in context injection
- Index format in search results
- Layer 1 (titles) → Layer 2 (summaries) → Layer 3 (full details)

---

## The Road Ahead

### Planned: Adaptive Index Size

```typescript
SessionStart({ source: "startup" }):
  → Show last 10 sessions (normal)

SessionStart({ source: "resume" }):
  → Show only current session (minimal)

SessionStart({ source: "compact" }):
  → Show last 20 sessions (comprehensive)
```

### Planned: Relevance Scoring

```typescript
// Use embeddings to pre-sort index by semantic relevance
search_observations({
  query: "authentication bug",
  sort: "relevance"  // Based on embeddings
});
```

### Planned: Multi-Project Context

```typescript
// Cross-project pattern recognition
search_observations({
  query: "API rate limiting",
  projects: ["api-gateway", "user-service", "billing-service"]
});
```

### Planned: Collaborative Memory

```typescript
// Team-shared observations (optional)
createObservation({
  title: "Rate limit: 100 req/min",
  scope: "team"  // vs "user"
});
```

---

## Migration Guide: v3 → v5

### Step 1: Backup Database

```bash
cp ~/.claude-mem/claude-mem.db ~/.claude-mem/claude-mem-v3-backup.db
```

### Step 2: Pull Latest Code

```bash
cd ~/.claude/plugins/marketplaces/thedotmack
git pull
```

### Step 3: Update Plugin

```bash
/plugin update claude-mem
```

**What happens automatically:**
- Dependencies update (including new ones like chromadb for v5.0.0+)
- Database schema migrations run automatically
- Worker service restarts with new code
- Smart install caching activates (v5.0.3+)

### Step 4: Test

```bash
# Start Claude Code
claude

# Check that context is injected
# (Should see progressive disclosure index with v5 viewer link)

# Open viewer UI (v5.1.0+)
open http://localhost:37777

# Submit a prompt and watch real-time updates in viewer
```

### Step 5: Explore New Features

```bash
# View memory stream in browser (v5.1.0+)
open http://localhost:37777

# Toggle theme (v5.1.2+)
# Click theme button in viewer header

# Check worker health
npm run worker:status
curl http://localhost:37777/health
```

---

## Key Metrics

### v3 Performance

| Metric | Value |
|--------|-------|
| Context usage per session | ~25,000 tokens |
| Relevant context | ~2,000 tokens (8%) |
| Hook execution time | ~200ms |
| Search latency | ~500ms (LIKE queries) |

### v4 Performance

| Metric | Value |
|--------|-------|
| Context usage per session | ~1,100 tokens |
| Relevant context | ~1,100 tokens (100%) |
| Hook execution time | ~45ms |
| Search latency | ~15ms (FTS5) |

### v5 Performance

| Metric | Value |
|--------|-------|
| Context usage per session | ~1,100 tokens |
| Relevant context | ~1,100 tokens (100%) |
| Hook execution time | ~10ms (cached install) |
| Search latency | ~12ms (FTS5) or ~25ms (hybrid) |
| Viewer UI load time | ~50ms (bundled HTML) |
| SSE update latency | ~5ms (real-time) |

**v3 → v4 Improvements:**
- 96% reduction in context usage (~25,000 → ~1,100 tokens)
- 12.5x increase in relevant-context share (8% → 100%)
- 4.4x faster hooks (~200ms → ~45ms)
- 33x faster search (~500ms → ~15ms)

**v4 → v5 Improvements:**
- 78% faster hooks (smart caching)
- Real-time visualization (viewer UI)
- Better search relevance (hybrid)
- Enhanced UX (theme toggle, persistence)

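The headline improvement figures can be re-derived from the three tables. A short worked calculation using only the table values (the object and variable names are illustrative):

```typescript
// Figures copied from the v3, v4 and v5 performance tables.
const v3 = { contextTokens: 25000, relevantShare: 0.08, hookMs: 200, searchMs: 500 };
const v4 = { contextTokens: 1100, relevantShare: 1.0, hookMs: 45, searchMs: 15 };
const v5 = { hookMs: 10 };

const contextReduction = 1 - v4.contextTokens / v3.contextTokens; // about 0.956, i.e. 96%
const relevanceGain = v4.relevantShare / v3.relevantShare;        // about 12.5x
const hookSpeedup = v3.hookMs / v4.hookMs;                        // about 4.4x
const searchSpeedup = v3.searchMs / v4.searchMs;                  // about 33x
const v5HookReduction = 1 - v5.hookMs / v4.hookMs;                // about 0.78, i.e. 78%
```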
---

## Conclusion

The journey from v3 to v5 was about understanding these fundamental truths:

1. **Context is finite** - Progressive disclosure respects attention budget
2. **AI is the compressor** - Semantic understanding beats keyword extraction
3. **Agents are smart** - Let them decide what to fetch
4. **State is hard** - Use SDK's built-in mechanisms
5. **Graceful wins** - Let processes finish cleanly

The result is a memory system that's both powerful and invisible. Users never notice it working - Claude just gets smarter over time.

**v5 adds visibility**: Now users CAN see the memory system working if they want (via viewer UI), but it's still non-intrusive.

---

## Further Reading

- [Progressive Disclosure](progressive-disclosure) - The philosophy behind v4
- [Hooks Architecture](hooks-architecture) - How hooks power the system
- [Context Engineering](context-engineering) - Foundational principles
- [Worker Service](/architecture/worker-service) - Real-time visualization (v5.1.0+)

---

*This architecture evolution reflects hundreds of hours of experimentation, dozens of dead ends, and the invaluable experience of real-world usage. v5 is the architecture that emerged from understanding what actually works - and making it visible to users.*