Files
claude-mem/scripts/format-transcript-context.ts
Alex Newman 68290a9121 Performance improvements: Token reduction and enhanced summaries (#101)
* refactor: Reduce continuation prompt by ~95 lines to cut token usage

Removed redundant instructions from continuation prompt that were originally
added to mitigate a session continuity issue. That issue has since been
resolved, making these detailed instructions unnecessary on every continuation.

Changes:
- Reduced continuation prompt from ~106 lines to ~11 lines (~95 line reduction)
- Changed "User's Goal:" to "Next Prompt in Session:" (more accurate framing)
- Removed redundant WHAT TO RECORD, WHEN TO SKIP, and OUTPUT FORMAT sections
- Kept concise reminder: "Continue generating observations and progress summaries..."
- Initial prompt still contains all detailed instructions

Impact:
- Significant token savings on every continuation prompt
- Faster context injection with no loss of functionality
- Instructions remain comprehensive in initial prompt

Files modified:
- src/sdk/prompts.ts (buildContinuationPrompt function)
- plugin/scripts/worker-service.cjs (compiled output)
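For reference, the trimmed builder might look roughly like this. Only the function name `buildContinuationPrompt` and the "Next Prompt in Session:" framing come from this commit; the prompt body below is illustrative, not the actual source:

```typescript
// Hypothetical sketch of the trimmed continuation prompt builder.
// The actual wording lives in src/sdk/prompts.ts; this only shows the shape:
// a short header plus a one-line reminder instead of the old ~106-line prompt.
function buildContinuationPrompt(nextPrompt: string): string {
  return [
    '## Next Prompt in Session:',
    '',
    nextPrompt,
    '',
    'Continue generating observations and progress summaries',
    'following the instructions from the initial prompt.',
  ].join('\n');
}
```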

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* refactor: Enhance observation and summary prompts for clarity and token efficiency

* Enhance prompt clarity and instructions in prompts.ts

- Added a reminder to think about instructions before starting work.
- Simplified the continuation prompt instruction by removing "for this ongoing session."

* feat: Enhance settings.json with permissions and deny access to sensitive files

refactor: Remove PLAN-full-observation-display.md and PR_SUMMARY.md as they are no longer needed

chore: Delete SECURITY_SUMMARY.md since it is redundant after recent changes

fix: Update worker-service.cjs to streamline observation generation instructions

cleanup: Remove src-analysis.md and src-tree.md for a cleaner codebase

refactor: Modify prompts.ts to clarify instructions for memory processing

* refactor: Remove legacy worker service implementation

* feat: Enhance summary hook to extract last assistant message and improve logging

- Added function to extract the last assistant message from the transcript.
- Updated summary hook to include last assistant message in the summary request.
- Modified SDKSession interface to store last assistant message.
- Adjusted buildSummaryPrompt to utilize last assistant message for generating summaries.
- Updated worker service and session manager to handle last assistant message in summarize requests.
- Introduced silentDebug utility for improved logging and diagnostics throughout the summary process.

* docs: Add comprehensive implementation plan for ROI metrics feature

Added detailed implementation plan covering:
- Token usage capture from Agent SDK
- Database schema changes (migration #8)
- Discovery cost tracking per observation
- Context hook display with ROI metrics
- Testing and rollout strategy

Timeline: ~20 hours over 4 days
Goal: Empirical data for YC application amendment

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* feat: Add transcript processing scripts for analysis and formatting

- Implemented `dump-transcript-readable.ts` to generate a readable markdown dump of transcripts, excluding certain entry types.
- Created `extract-rich-context-examples.ts` to extract and showcase rich context examples from transcripts, highlighting user requests and assistant reasoning.
- Developed `format-transcript-context.ts` to format transcript context into a structured markdown format for improved observation generation.
- Added `test-transcript-parser.ts` for validating data extraction from transcript JSONL files, including statistics and error reporting.
- Introduced `transcript-to-markdown.ts` for a complete representation of transcript data in markdown format, showing all context data.
- Enhanced type definitions in `transcript.ts` to support new features and ensure type safety.
- Built `transcript-parser.ts` to handle parsing of transcript JSONL files, including error handling and data extraction methods.

* Refactor hooks and SDKAgent for improved observation handling

- Updated `new-hook.ts` to clean user prompts by stripping leading slashes for better semantic clarity.
- Enhanced `save-hook.ts` to include additional tools in the SKIP_TOOLS set, preventing unnecessary observations from certain command invocations.
- Modified `prompts.ts` to change the structure of observation prompts, emphasizing the observational role and providing a detailed XML output format for observations.
- Adjusted `SDKAgent.ts` to enforce stricter tool usage restrictions, ensuring the memory agent operates solely as an observer without any tool access.
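The two hook tweaks above reduce to small predicates; this is a hedged sketch where `SKIP_TOOLS` is named in the commit but its entries, and both function names, are illustrative:

```typescript
// new-hook.ts tweak: strip leading slashes so "/compact now" reads as
// "compact now" when used as semantic context.
function cleanUserPrompt(prompt: string): string {
  return prompt.replace(/^\/+/, '').trim();
}

// save-hook.ts tweak: skip observation generation for noisy command-style
// tool invocations. The set entries here are illustrative examples only.
const SKIP_TOOLS = new Set(['TodoWrite', 'AskUserQuestion']);

function shouldObserve(toolName: string): boolean {
  return !SKIP_TOOLS.has(toolName);
}
```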

* feat: Enhance session initialization to accept user prompts and prompt numbers

- Updated `handleSessionInit` in `worker-service.ts` to extract `userPrompt` and `promptNumber` from the request body and pass them to `initializeSession`.
- Modified `initializeSession` in `SessionManager.ts` to handle optional `currentUserPrompt` and `promptNumber` parameters.
- Added logic to update the existing session's `userPrompt` and `lastPromptNumber` if a `currentUserPrompt` is provided.
- Implemented debug logging for session initialization and updates to track user prompts and prompt numbers.
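The update logic in the bullets above can be sketched as follows; the field names `userPrompt` and `lastPromptNumber` come from the commit message, while the `Session` shape and in-memory store are assumptions:

```typescript
interface Session {
  sessionId: string;
  userPrompt?: string;
  lastPromptNumber?: number;
}

// Hypothetical in-memory store standing in for SessionManager's state.
const sessions = new Map<string, Session>();

function initializeSession(
  sessionId: string,
  currentUserPrompt?: string,
  promptNumber?: number
): Session {
  let session = sessions.get(sessionId);
  if (!session) {
    session = { sessionId };
    sessions.set(sessionId, session);
  }
  // Only update the existing session when a new prompt is actually provided,
  // so a bare re-init never clears previously recorded prompt state.
  if (currentUserPrompt !== undefined) {
    session.userPrompt = currentUserPrompt;
    session.lastPromptNumber = promptNumber;
  }
  return session;
}
```

The guard on `currentUserPrompt` is the key design point: re-initializing an existing session without a prompt is a no-op rather than a reset.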

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-13 18:22:44 -05:00


#!/usr/bin/env tsx
/**
 * Format Transcript Context
 *
 * Parses a Claude Code transcript and formats it to show rich contextual data
 * that could be used for improved observation generation.
 */
import { TranscriptParser } from '../src/utils/transcript-parser.js';
import { writeFileSync } from 'fs';
import { basename } from 'path';
interface ConversationTurn {
  turnNumber: number;
  userMessage?: {
    content: string;
    timestamp: string;
  };
  assistantMessage?: {
    textContent: string;
    thinkingContent?: string;
    toolUses: Array<{
      name: string;
      input: any;
      timestamp: string;
    }>;
    timestamp: string;
  };
  toolResults?: Array<{
    toolName: string;
    result: any;
    timestamp: string;
  }>;
}
function extractConversationTurns(parser: TranscriptParser): ConversationTurn[] {
  const entries = parser.getAllEntries();
  const turns: ConversationTurn[] = [];
  let currentTurn: ConversationTurn | null = null;
  let turnNumber = 0;

  for (const entry of entries) {
    // User messages start a new turn
    if (entry.type === 'user') {
      // If a previous turn exists, push it
      if (currentTurn) {
        turns.push(currentTurn);
      }

      // Start a new turn
      turnNumber++;
      currentTurn = {
        turnNumber,
        toolResults: []
      };

      // Extract user text (skip tool results)
      if (typeof entry.content === 'string') {
        currentTurn.userMessage = {
          content: entry.content,
          timestamp: entry.timestamp
        };
      } else if (Array.isArray(entry.content)) {
        const textContent = entry.content
          .filter((c: any) => c.type === 'text')
          .map((c: any) => c.text)
          .join('\n');
        if (textContent.trim()) {
          currentTurn.userMessage = {
            content: textContent,
            timestamp: entry.timestamp
          };
        }

        // Extract tool results
        const toolResults = entry.content.filter((c: any) => c.type === 'tool_result');
        for (const result of toolResults) {
          currentTurn.toolResults!.push({
            toolName: result.tool_use_id || 'unknown',
            result: result.content,
            timestamp: entry.timestamp
          });
        }
      }
    }

    // Assistant messages (the latest assistant entry in a turn wins)
    if (entry.type === 'assistant' && currentTurn) {
      if (!Array.isArray(entry.content)) continue;
      const textBlocks = entry.content.filter((c: any) => c.type === 'text');
      const thinkingBlocks = entry.content.filter((c: any) => c.type === 'thinking');
      const toolUseBlocks = entry.content.filter((c: any) => c.type === 'tool_use');
      currentTurn.assistantMessage = {
        textContent: textBlocks.map((c: any) => c.text).join('\n'),
        thinkingContent: thinkingBlocks.map((c: any) => c.thinking).join('\n'),
        toolUses: toolUseBlocks.map((t: any) => ({
          name: t.name,
          input: t.input,
          timestamp: entry.timestamp
        })),
        timestamp: entry.timestamp
      };
    }
  }

  // Push the last turn
  if (currentTurn) {
    turns.push(currentTurn);
  }

  return turns;
}
function formatTurnToMarkdown(turn: ConversationTurn): string {
  let md = '';
  md += `## Turn ${turn.turnNumber}\n\n`;

  // User message
  if (turn.userMessage) {
    md += `### 👤 User Request\n`;
    md += `**Time:** ${new Date(turn.userMessage.timestamp).toLocaleString()}\n\n`;
    md += '```\n';
    md += turn.userMessage.content.substring(0, 500);
    if (turn.userMessage.content.length > 500) {
      md += '\n... (truncated)';
    }
    md += '\n```\n\n';
  }

  // Assistant response
  if (turn.assistantMessage) {
    md += `### 🤖 Assistant Response\n`;
    md += `**Time:** ${new Date(turn.assistantMessage.timestamp).toLocaleString()}\n\n`;

    // Text content
    if (turn.assistantMessage.textContent.trim()) {
      md += '**Response:**\n```\n';
      md += turn.assistantMessage.textContent.substring(0, 500);
      if (turn.assistantMessage.textContent.length > 500) {
        md += '\n... (truncated)';
      }
      md += '\n```\n\n';
    }

    // Thinking
    if (turn.assistantMessage.thinkingContent?.trim()) {
      md += '**Thinking:**\n```\n';
      md += turn.assistantMessage.thinkingContent.substring(0, 300);
      if (turn.assistantMessage.thinkingContent.length > 300) {
        md += '\n... (truncated)';
      }
      md += '\n```\n\n';
    }

    // Tool uses
    if (turn.assistantMessage.toolUses.length > 0) {
      md += `**Tools Used:** ${turn.assistantMessage.toolUses.length}\n\n`;
      for (const tool of turn.assistantMessage.toolUses) {
        md += `- **${tool.name}**\n`;
        md += ` \`\`\`json\n`;
        const inputStr = JSON.stringify(tool.input, null, 2);
        md += inputStr.substring(0, 200);
        if (inputStr.length > 200) {
          md += '\n ... (truncated)';
        }
        md += '\n ```\n';
      }
      md += '\n';
    }
  }

  // Tool results summary
  if (turn.toolResults && turn.toolResults.length > 0) {
    md += `**Tool Results:** ${turn.toolResults.length} results received\n\n`;
  }

  md += '---\n\n';
  return md;
}
function formatTranscriptToMarkdown(transcriptPath: string): string {
  const parser = new TranscriptParser(transcriptPath);
  const turns = extractConversationTurns(parser);
  const stats = parser.getParseStats();
  const tokens = parser.getTotalTokenUsage();

  let md = `# Transcript Context Analysis\n\n`;
  md += `**File:** ${basename(transcriptPath)}\n`;
  md += `**Parsed:** ${new Date().toLocaleString()}\n\n`;

  md += `## Statistics\n\n`;
  md += `- Total entries: ${stats.totalLines}\n`;
  md += `- Successfully parsed: ${stats.parsedEntries}\n`;
  md += `- Failed lines: ${stats.failedLines}\n`;
  md += `- Conversation turns: ${turns.length}\n\n`;

  md += `## Token Usage\n\n`;
  md += `- Input tokens: ${tokens.inputTokens.toLocaleString()}\n`;
  md += `- Output tokens: ${tokens.outputTokens.toLocaleString()}\n`;
  md += `- Cache creation: ${tokens.cacheCreationTokens.toLocaleString()}\n`;
  md += `- Cache read: ${tokens.cacheReadTokens.toLocaleString()}\n`;
  const totalTokens = tokens.inputTokens + tokens.outputTokens;
  md += `- Total (input + output): ${totalTokens.toLocaleString()}\n\n`;

  md += `---\n\n`;
  md += `# Conversation Turns\n\n`;

  // Format each turn, limited to the first 20 for readability
  for (const turn of turns.slice(0, 20)) {
    md += formatTurnToMarkdown(turn);
  }
  if (turns.length > 20) {
    md += `\n_... ${turns.length - 20} more turns omitted for brevity_\n`;
  }

  return md;
}
// Main execution
const transcriptPath = process.argv[2];
if (!transcriptPath) {
  console.error('Usage: tsx scripts/format-transcript-context.ts <path-to-transcript.jsonl>');
  process.exit(1);
}

console.log(`Parsing transcript: ${transcriptPath}`);
const markdown = formatTranscriptToMarkdown(transcriptPath);
const outputPath = transcriptPath.replace('.jsonl', '-formatted.md');
writeFileSync(outputPath, markdown, 'utf-8');
console.log(`\nFormatted transcript written to: ${outputPath}`);
console.log(`\nOpen with: cat "${outputPath}"\n`);