mirror of
https://github.com/thedotmack/claude-mem
synced 2026-04-25 17:15:04 +02:00
fix: correct Gemini model name from gemini-3-flash to gemini-3-flash-preview
The Gemini API requires the -preview suffix for the Gemini 3 Flash model. gemini-3-flash does not exist - only gemini-3-flash-preview is available. This was causing 404 errors when users selected this model option. Closes #831 Co-Authored-By: Glucksberg <markuscontasul@gmail.com> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -26,7 +26,7 @@ Settings are managed in `~/.claude-mem/settings.json`. The file is auto-created
|
||||
| Setting | Default | Description |
|
||||
|-------------------------------|---------------------------------|---------------------------------------|
|
||||
| `CLAUDE_MEM_GEMINI_API_KEY` | — | Gemini API key ([get free key](https://aistudio.google.com/app/apikey)) |
|
||||
| `CLAUDE_MEM_GEMINI_MODEL` | `gemini-2.5-flash-lite` | Gemini model: `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash` |
|
||||
| `CLAUDE_MEM_GEMINI_MODEL` | `gemini-2.5-flash-lite` | Gemini model: `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash-preview` |
|
||||
|
||||
See [Gemini Provider](usage/gemini-provider) for detailed configuration and free tier information.
|
||||
|
||||
|
||||
@@ -103,7 +103,7 @@ Open http://localhost:37777 to see the memory viewer.
|
||||
|-------|---------------|-------|
|
||||
| `gemini-2.5-flash-lite` | 10 (4,000 with billing) | **Default.** Fastest, highest free tier RPM |
|
||||
| `gemini-2.5-flash` | 5 (1,000 with billing) | Higher capability |
|
||||
| `gemini-3-flash` | 5 (1,000 with billing) | Latest model |
|
||||
| `gemini-3-flash-preview` | 5 (1,000 with billing) | Latest model |
|
||||
|
||||
To change the model, update your settings:
|
||||
|
||||
|
||||
@@ -39,7 +39,7 @@ Claude-mem supports Google's Gemini API as an alternative to the Claude Agent SD
|
||||
|---------|--------|---------|-------------|
|
||||
| `CLAUDE_MEM_PROVIDER` | `claude`, `gemini` | `claude` | AI provider for observation extraction |
|
||||
| `CLAUDE_MEM_GEMINI_API_KEY` | string | — | Your Gemini API key |
|
||||
| `CLAUDE_MEM_GEMINI_MODEL` | `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash` | `gemini-2.5-flash-lite` | Gemini model to use |
|
||||
| `CLAUDE_MEM_GEMINI_MODEL` | `gemini-2.5-flash-lite`, `gemini-2.5-flash`, `gemini-3-flash-preview` | `gemini-2.5-flash-lite` | Gemini model to use |
|
||||
| `CLAUDE_MEM_GEMINI_BILLING_ENABLED` | `true`, `false` | `false` | Skip rate limiting if billing is enabled on Google Cloud |
|
||||
|
||||
### Using the Settings UI
|
||||
@@ -79,7 +79,7 @@ The settings file takes precedence over the environment variable.
|
||||
|-------|--------------|-------|
|
||||
| `gemini-2.5-flash-lite` | 10 | Default, recommended for free tier (highest RPM) |
|
||||
| `gemini-2.5-flash` | 5 | Higher capability, lower rate limit |
|
||||
| `gemini-3-flash` | 5 | Latest model, lower rate limit |
|
||||
| `gemini-3-flash-preview` | 5 | Latest model, lower rate limit |
|
||||
|
||||
## Provider Switching
|
||||
|
||||
@@ -139,7 +139,7 @@ Google has two rate limit tiers for free usage:
|
||||
|-------|-----|-----|
|
||||
| gemini-2.5-flash-lite | 10 | 250K |
|
||||
| gemini-2.5-flash | 5 | 250K |
|
||||
| gemini-3-flash | 5 | 250K |
|
||||
| gemini-3-flash-preview | 5 | 250K |
|
||||
|
||||
Claude-mem enforces these limits automatically with built-in delays between requests. Processing may be slower but stays within limits.
|
||||
|
||||
@@ -149,7 +149,7 @@ Claude-mem enforces these limits automatically with built-in delays between requ
|
||||
|-------|-----|-----|
|
||||
| gemini-2.5-flash-lite | 4,000 | 4M |
|
||||
| gemini-2.5-flash | 1,000 | 1M |
|
||||
| gemini-3-flash | 1,000 | 1M |
|
||||
| gemini-3-flash-preview | 1,000 | 1M |
|
||||
|
||||
<Tip>
|
||||
**Recommended**: Enable billing on your Google Cloud project to unlock much higher rate limits. You won't be charged unless you exceed the generous free quota. This allows claude-mem to process observations instantly instead of waiting between requests.
|
||||
|
||||
Reference in New Issue
Block a user