mirror of
https://github.com/glittercowboy/get-shit-done
synced 2026-05-14 11:06:35 +02:00
* docs(sdk): recommend 1-hour cache TTL for system prompts (#1980)

  Add sdk/docs/caching.md with prompt caching best practices for API users building on GSD patterns. Recommends 1-hour TTL for executor, planner, and verifier system prompts which are large and stable across requests within a session.

  The default 5-minute TTL expires during human review pauses between phases. 1-hour TTL costs 2x on cache miss but pays for itself after 3 hits; GSD phases typically involve dozens of requests per hour.

  Closes #1980

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(sdk): fix ttl type to string per Anthropic API spec

  The Anthropic extended caching API requires ttl as a string ('1h'), not an integer (3600). Corrects both code examples in caching.md. Review feedback on #2055 from @trek-e.

* docs(sdk): fix second ttl value in direct-api example to string '1h'

  Follow-up to trek-e's re-review on #2055. The first fix corrected the Agent SDK integration example (line 16) but missed the second code block (line 60) that shows the direct Claude API call. Both now use ttl: '1h' (string) as the Anthropic extended caching API requires; integer forms like ttl: 3600 are silently ignored by the API and the cache never activates.

  Closes #1980

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
69 lines
2.6 KiB
Markdown
# Prompt Caching Best Practices
When building applications on the GSD SDK, system prompts that include workflow instructions (executor prompts, planner context, verification rules) are large and stable across requests. Prompt caching avoids re-processing these on every API call.
## Recommended: 1-Hour Cache TTL
Use `cache_control` with a 1-hour TTL on system prompts that include GSD workflow content:
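```typescript
import Anthropic from '@anthropic-ai/sdk';

// executorPrompt and messages are assumed to be defined by the caller.
const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const response = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 4096, // required by the Messages API; the value here is illustrative
  system: [
    {
      type: 'text',
      text: executorPrompt, // GSD workflow instructions: large, stable across requests
      cache_control: { type: 'ephemeral', ttl: '1h' }, // ttl is a string, not a number
    },
  ],
  messages,
});
```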
### Why 1 hour instead of the default 5 minutes
GSD workflows involve human review pauses between phases — discussing results, checking verification output, deciding next steps. The default 5-minute TTL expires during these pauses, forcing full re-processing of the system prompt on the next request.
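
The effect of review pauses on the cache can be sketched with a small simulation. This is a minimal model, not GSD or Anthropic code: `countCacheMisses` is a hypothetical helper, the request timeline is made up for illustration, and it assumes that every request (write or hit) restarts the TTL window.

```typescript
// Hypothetical helper: counts how many requests find the cache cold for a given TTL.
// Times are minutes since the start of the session.
function countCacheMisses(requestTimesMin: number[], ttlMin: number): number {
  let misses = 0;
  let expiresAt = -Infinity; // no cache entry exists before the first request
  for (const t of requestTimesMin) {
    if (t >= expiresAt) misses += 1; // entry absent or expired, so the prompt is written again
    expiresAt = t + ttlMin; // both a write and a hit restart the TTL window
  }
  return misses;
}

// Illustrative session: three bursts of work separated by 12- and 20-minute review pauses.
const session = [0, 1, 2, 14, 15, 35, 36];

console.log(countCacheMisses(session, 5)); // 3: the cache is rewritten after each pause
console.log(countCacheMisses(session, 60)); // 1: only the first request writes the cache
```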
With a 1-hour TTL:
- **Cost:** 2x the base input price for each cache write, i.e. on every cache miss (vs. 1.25x for the 5-minute TTL)
- **Break-even:** Pays for itself within 3 cache hits per hour. Cache reads are billed at roughly 0.1x the base input price, so compared with no caching the 2x write is already recovered by the second hit
- **GSD usage pattern:** Phase execution involves dozens of requests per hour, well above break-even
- **Cache refresh:** Every cache hit resets the TTL at no cost, so active sessions keep the cache warm throughout
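
The cost trade-off in the list above can be worked through numerically. The sketch below is an illustration under stated assumptions: the write multipliers (1.25x and 2x) come from the list, the 0.1x cache-read multiplier is assumed from Anthropic's published pricing, and `cachedCost`, `breakEvenHits`, and the 7-request session are hypothetical names and numbers chosen for the example. Costs are expressed in units of one uncached pass over the system prompt.

```typescript
// Price multipliers relative to one uncached pass over the prompt.
const WRITE_5M = 1.25; // cache write, 5-minute TTL
const WRITE_1H = 2.0; // cache write, 1-hour TTL
const CACHE_READ = 0.1; // cache hit (assumed multiplier)

// Total prompt cost for a session with `writes` cache writes and `hits` cache reads.
function cachedCost(writes: number, hits: number, writeMultiplier: number): number {
  return writes * writeMultiplier + hits * CACHE_READ;
}

// Smallest number of hits at which one cache write is no more expensive than
// sending the prompt uncached on every request.
function breakEvenHits(writeMultiplier: number): number {
  // cached: writeMultiplier + hits * CACHE_READ, uncached: 1 + hits
  return Math.ceil((writeMultiplier - 1) / (1 - CACHE_READ));
}

// Illustrative session: 7 requests with two review pauses longer than 5 minutes.
const requests = 7;
const cost5m = cachedCost(3, 4, WRITE_5M); // cache rewritten after each pause
const cost1h = cachedCost(1, 6, WRITE_1H); // single write, six hits

console.log(`uncached: ${requests.toFixed(2)}`);
console.log(`5m TTL:   ${cost5m.toFixed(2)}`);
console.log(`1h TTL:   ${cost1h.toFixed(2)}`);
console.log(`break-even hits (1h): ${breakEvenHits(WRITE_1H)}`);
```

Under these assumptions the session costs about 4.15 units with the 5-minute TTL and about 2.60 with the 1-hour TTL, against 7 uncached, and the 1-hour write breaks even against no caching at the second hit, so three hits per hour leaves a margin.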
### Which prompts to cache
| Prompt | Cache? | Reason |
|--------|--------|--------|
| Executor system prompt | Yes | Large (~10K tokens), identical across tasks in a phase |
| Planner system prompt | Yes | Large, stable within a planning session |
| Verifier system prompt | Yes | Large, stable within a verification session |
| User/task-specific content | No | Changes per request |
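
One way to apply the table is to mark only the stable system prompt as cacheable and leave per-request content unmarked. The following is a minimal sketch: `cachedSystem` and `buildRequest` are hypothetical helpers, not part of the GSD SDK, and the local types mirror only the fields used in this document rather than the full Anthropic SDK types.

```typescript
// Local types covering only the fields used here (assumed shapes for illustration).
type CacheControl = { type: 'ephemeral'; ttl: '5m' | '1h' };
type SystemBlock = { type: 'text'; text: string; cache_control?: CacheControl };

// Hypothetical helper: wraps a stable system prompt in a single cacheable block.
function cachedSystem(prompt: string, ttl: '5m' | '1h' = '1h'): SystemBlock[] {
  return [{ type: 'text', text: prompt, cache_control: { type: 'ephemeral', ttl } }];
}

// Hypothetical helper: builds the cache-relevant part of a request body.
function buildRequest(systemPrompt: string, taskText: string) {
  return {
    system: cachedSystem(systemPrompt),
    // Task-specific content changes per request, so it carries no cache_control.
    messages: [{ role: 'user' as const, content: taskText }],
  };
}

const req = buildRequest('...executor workflow instructions...', 'Implement task 3 of the plan');
console.log(JSON.stringify(req, null, 2));
```

Keeping the marker on the system block alone means executor, planner, and verifier prompts can each be cached under the same helper, while the task text varies freely.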
### SDK integration point
In `session-runner.ts`, the `systemPrompt.append` field carries the executor/planner prompt. When using the Claude API directly (outside the Agent SDK's `query()` helper), wrap this content with `cache_control`:
```typescript
// In runPlanSession / runPhaseStepSession, the systemPrompt is:
systemPrompt: {
  type: 'preset',
  preset: 'claude_code',
  append: executorPrompt, // <-- this is the content to cache
}

// When calling the API directly, convert to:
system: [
  {
    type: 'text',
    text: executorPrompt,
    cache_control: { type: 'ephemeral', ttl: '1h' },
  },
]
```
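
After wiring in `cache_control`, it can be worth confirming that the cache is actually being written and read. The sketch below assumes the `usage` object on a Messages API response exposes `cache_creation_input_tokens` and `cache_read_input_tokens`; `cacheStatus` and the sample values are hypothetical and only illustrate the check.

```typescript
// Subset of the response `usage` object that is relevant to caching.
interface CacheUsage {
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

type CacheStatus = 'hit' | 'write' | 'inactive';

// Hypothetical helper: classifies a single response by its cache activity.
// A response that both reads and writes cache tokens is reported as a hit.
function cacheStatus(usage: CacheUsage): CacheStatus {
  if ((usage.cache_read_input_tokens ?? 0) > 0) return 'hit';
  if ((usage.cache_creation_input_tokens ?? 0) > 0) return 'write';
  return 'inactive';
}

// Illustrative values: the first request writes the cache, the second reads it.
console.log(cacheStatus({ cache_creation_input_tokens: 10000, cache_read_input_tokens: 0 }));
console.log(cacheStatus({ cache_creation_input_tokens: 0, cache_read_input_tokens: 10000 }));
```

If every response in a session reports `inactive`, the `cache_control` block is likely not taking effect and the request shape is worth rechecking.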
## References
- [Anthropic Prompt Caching documentation](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)
- [Extended caching (1-hour TTL)](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#extended-caching)