mirror of
https://github.com/thedotmack/claude-mem
synced 2026-04-25 17:15:04 +02:00
222 lines
6.9 KiB
Plaintext
222 lines
6.9 KiB
Plaintext
# Context Engineering for AI Agents: Best Practices Cheat Sheet
|
|
|
|
## Core Principle
|
|
**Find the smallest possible set of high-signal tokens that maximize the likelihood of your desired outcome.**
|
|
|
|
---
|
|
|
|
## Context Engineering vs Prompt Engineering
|
|
|
|
**Prompt Engineering**: Writing and organizing LLM instructions for optimal outcomes (one-time task)
|
|
|
|
**Context Engineering**: Curating and maintaining the optimal set of tokens during inference across multiple turns (iterative process)
|
|
|
|
Context engineering manages:
|
|
- System instructions
|
|
- Tools
|
|
- Model Context Protocol (MCP)
|
|
- External data
|
|
- Message history
|
|
- Runtime data retrieval
|
|
|
|
---
|
|
|
|
## The Problem: Context Rot
|
|
|
|
**Key Insight**: LLMs have an "attention budget" that gets depleted as context grows
|
|
|
|
- Every token attends to every other token (n² relationships)
|
|
- As context length increases, model accuracy decreases
|
|
- Models have less training experience with longer sequences
|
|
- Context must be treated as a finite resource with diminishing marginal returns
|
|
|
|
---
|
|
|
|
## System Prompts: Find the "Right Altitude"
|
|
|
|
### The Goldilocks Zone
|
|
|
|
**Too Prescriptive** ❌
|
|
- Hardcoded if-else logic
|
|
- Brittle and fragile
|
|
- High maintenance complexity
|
|
|
|
**Too Vague** ❌
|
|
- High-level guidance without concrete signals
|
|
- Falsely assumes shared context
|
|
- Lacks actionable direction
|
|
|
|
**Just Right** ✅
|
|
- Specific enough to guide behavior effectively
|
|
- Flexible enough to provide strong heuristics
|
|
- Minimal set of information that fully outlines expected behavior
|
|
|
|
### Best Practices
|
|
- Use simple, direct language
|
|
- Organize into distinct sections (`<background_information>`, `<instructions>`, `## Tool guidance`, etc.)
|
|
- Use XML tags or Markdown headers for structure
|
|
- Start with minimal prompt, add based on failure modes
|
|
- Note: Minimal ≠ short (provide sufficient information upfront)
|
|
|
|
---
|
|
|
|
## Tools: Minimal and Clear
|
|
|
|
### Design Principles
|
|
- **Self-contained**: Each tool has a single, clear purpose
|
|
- **Robust to error**: Handle edge cases gracefully
|
|
- **Extremely clear**: Intended use is unambiguous
|
|
- **Token-efficient**: Returns relevant information without bloat
|
|
- **Descriptive parameters**: Unambiguous input names (e.g., `user_id` not `user`)
|
|
|
|
### Critical Rule
|
|
**If a human engineer can't definitively say which tool to use in a given situation, an AI agent can't be expected to do better.**
|
|
|
|
### Common Failure Modes to Avoid
|
|
- Bloated tool sets covering too much functionality
|
|
- Tools with overlapping purposes
|
|
- Ambiguous decision points about which tool to use
|
|
|
|
---
|
|
|
|
## Examples: Diverse, Not Exhaustive
|
|
|
|
**Do** ✅
|
|
- Curate a set of diverse, canonical examples
|
|
- Show expected behavior effectively
|
|
- Think "pictures worth a thousand words"
|
|
|
|
**Don't** ❌
|
|
- Stuff in a laundry list of edge cases
|
|
- Try to articulate every possible rule
|
|
- Overwhelm with exhaustive scenarios
|
|
|
|
---
|
|
|
|
## Context Retrieval Strategies
|
|
|
|
### Just-In-Time Context (Recommended for Agents)
|
|
**Approach**: Maintain lightweight identifiers (file paths, queries, links) and dynamically load data at runtime
|
|
|
|
**Benefits**:
|
|
- Avoids context pollution
|
|
- Enables progressive disclosure
|
|
- Mirrors human cognition (we don't memorize everything)
|
|
- Leverages metadata (file names, folder structure, timestamps)
|
|
- Agents discover context incrementally
|
|
|
|
**Trade-offs**:
|
|
- Slower than pre-computed retrieval
|
|
- Requires proper tool guidance to avoid dead-ends
|
|
|
|
### Pre-Inference Retrieval (Traditional RAG)
|
|
**Approach**: Use embedding-based retrieval to surface context before inference
|
|
|
|
**When to Use**: Static content that won't change during interaction
|
|
|
|
### Hybrid Strategy (Best of Both)
|
|
**Approach**: Retrieve some data upfront, enable autonomous exploration as needed
|
|
|
|
**Example**: Claude Code loads CLAUDE.md files upfront, uses glob/grep for just-in-time retrieval
|
|
|
|
**Rule of Thumb**: "Do the simplest thing that works"
|
|
|
|
---
|
|
|
|
## Long-Horizon Tasks: Three Techniques
|
|
|
|
### 1. Compaction
|
|
**What**: Summarize conversation nearing context limit, reinitiate with summary
|
|
|
|
**Implementation**:
|
|
- Pass message history to model for compression
|
|
- Preserve critical details (architectural decisions, bugs, implementation)
|
|
- Discard redundant outputs
|
|
- Continue with compressed context + recently accessed files
|
|
|
|
**Tuning Process**:
|
|
1. **First**: Maximize recall (capture all relevant information)
|
|
2. **Then**: Improve precision (eliminate superfluous content)
|
|
|
|
**Low-Hanging Fruit**: Clear old tool calls and results
|
|
|
|
**Best For**: Tasks requiring extensive back-and-forth
|
|
|
|
### 2. Structured Note-Taking (Agentic Memory)
|
|
**What**: Agent writes notes persisted outside context window, retrieved later
|
|
|
|
**Examples**:
|
|
- To-do lists
|
|
- NOTES.md files
|
|
- Game state tracking (Pokémon example: tracking 1,234 steps of training)
|
|
- Project progress logs
|
|
|
|
**Benefits**:
|
|
- Persistent memory with minimal overhead
|
|
- Maintains critical context across tool calls
|
|
- Enables multi-hour coherent strategies
|
|
|
|
**Best For**: Iterative development with clear milestones
|
|
|
|
### 3. Sub-Agent Architectures
|
|
**What**: Specialized sub-agents handle focused tasks with clean context windows
|
|
|
|
**How It Works**:
|
|
- Main agent coordinates high-level plan
|
|
- Sub-agents perform deep technical work
|
|
- Sub-agents explore extensively (tens of thousands of tokens)
|
|
- Return condensed summaries (1,000-2,000 tokens)
|
|
|
|
**Benefits**:
|
|
- Clear separation of concerns
|
|
- Parallel exploration
|
|
- Detailed context remains isolated
|
|
|
|
**Best For**: Complex research and analysis tasks
|
|
|
|
---
|
|
|
|
## Quick Decision Framework
|
|
|
|
| Scenario | Recommended Approach |
|
|
|----------|---------------------|
|
|
| Static content | Pre-inference retrieval or hybrid |
|
|
| Dynamic exploration needed | Just-in-time context |
|
|
| Extended back-and-forth | Compaction |
|
|
| Iterative development | Structured note-taking |
|
|
| Complex research | Sub-agent architectures |
|
|
| Rapid model improvement | "Do the simplest thing that works" |
|
|
|
|
---
|
|
|
|
## Key Takeaways
|
|
|
|
1. **Context is finite**: Treat it as a precious resource with an attention budget
|
|
2. **Think holistically**: Consider the entire state available to the LLM
|
|
3. **Stay minimal**: More context isn't always better
|
|
4. **Be iterative**: Context curation happens each time you pass to the model
|
|
5. **Design for autonomy**: As models improve, let them act intelligently
|
|
6. **Start simple**: Test with minimal setup, add based on failure modes
|
|
|
|
---
|
|
|
|
## Anti-Patterns to Avoid
|
|
|
|
- ❌ Cramming everything into prompts
|
|
- ❌ Creating brittle if-else logic
|
|
- ❌ Building bloated tool sets
|
|
- ❌ Stuffing exhaustive edge cases as examples
|
|
- ❌ Assuming larger context windows solve everything
|
|
- ❌ Ignoring context pollution over long interactions
|
|
|
|
---
|
|
|
|
## Remember
|
|
|
|
> "Even as models continue to improve, the challenge of maintaining coherence across extended interactions will remain central to building more effective agents."
|
|
|
|
Context engineering will evolve, but the core principle stays the same: **optimize signal-to-noise ratio in your token budget**.
|
|
|
|
---
|
|
|
|
*Based on Anthropic's "Effective context engineering for AI agents" (September 2025)* |