From 071b26c53d1c2f6ceef6ee73b548af4cc5373891 Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 7 Apr 2026 18:07:22 +0000 Subject: [PATCH] Simplify plan: two-call pattern instead of tool loop abstraction MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The reconciliation needs at most two LLM calls (get_accounts then get_transactions), matching the existing Responder one-shot pattern. No ToolLoop class needed — just use chat_response + FunctionToolCaller directly. https://claude.ai/code/session_01MA6WZuUES4tcv3sShDH8bv --- PLAN.md | 168 ++++++++++++++++++++------------------------------------ 1 file changed, 59 insertions(+), 109 deletions(-) diff --git a/PLAN.md b/PLAN.md index 3b0f0c10e..4eacec9e6 100644 --- a/PLAN.md +++ b/PLAN.md @@ -9,84 +9,33 @@ This duplicates the tool-calling orchestration already built into the assistant - Executing functions via `FunctionToolCaller.fulfill_requests` - Passing results back to the LLM for follow-up -The fix is to extract a reusable tool-calling loop from the existing assistant infrastructure so `PdfProcessor` can delegate tool orchestration rather than reimplementing it. - ## Architecture -The current assistant layer has a **one-shot** tool-call design (`Responder` calls the LLM, executes tools once, calls LLM again with results — no further recursion). The PDF reconciliation needs a **multi-turn** loop (up to N iterations). Rather than making `Responder` itself support multi-turn (it's tied to chat streaming/events), we'll create a lightweight, synchronous tool-calling loop that both contexts can use. +The reconciliation flow is straightforward — **not** an unbounded multi-turn loop: + +1. The LLM calls `get_accounts` to find a matching account for the statement +2. If a match is found, the LLM calls `get_transactions` to fetch synced transactions for comparison +3. 
The LLM produces the reconciliation result
+
+This is at most **two LLM calls** (one round of tool execution plus a follow-up), the same one-shot pattern already implemented in `Assistant::Responder#handle_follow_up_response`. Because there is only one follow-up, the reconciliation instructions should prompt the LLM to request both tools in a single round when it needs transaction data.
+
+Rather than building a new `ToolLoop` abstraction, we should reuse the existing tool-calling infrastructure directly. `PdfProcessor` just needs to:
+1. Make an LLM call via the provider's `chat_response` (which returns parsed `ChatFunctionRequest` objects)
+2. Execute those requests via `FunctionToolCaller.fulfill_requests`
+3. Make one follow-up `chat_response` call with the results
+
+This mirrors exactly what `Responder#handle_follow_up_response` already does, but synchronously and without streaming/events.
 
 ## Steps
 
-### Step 1: Create `Assistant::ToolLoop` — a reusable synchronous tool-calling loop
-
-**New file:** `app/models/assistant/tool_loop.rb`
-
-This class encapsulates the "call LLM → detect tool requests → execute tools → feed results back → repeat" loop in a provider-agnostic way. It operates synchronously (no streaming) and supports a configurable max-iterations cap.
-
-```ruby
-class Assistant::ToolLoop
-  DEFAULT_MAX_ITERATIONS = 8
-
-  def initialize(llm:, function_tool_caller:, max_iterations: DEFAULT_MAX_ITERATIONS)
-    @llm = llm
-    @function_tool_caller = function_tool_caller
-    @max_iterations = max_iterations
-  end
-
-  # Runs the tool loop synchronously.
-  # Returns the final ChatResponse (with no remaining function_requests).
-  def run(prompt:, model:, instructions: nil, family: nil)
-    function_results = []
-    iterations = 0
-    response = nil
-
-    loop do
-      break if iterations >= @max_iterations
-
-      provider_response = @llm.chat_response(
-        prompt,
-        model: model,
-        instructions: instructions,
-        functions: @function_tool_caller.function_definitions,
-        function_results: function_results,
-        family: family
-      )
-
-      raise provider_response.error if !provider_response.success?
- response = provider_response.data - - break if response.function_requests.empty? - - tool_calls = @function_tool_caller.fulfill_requests(response.function_requests) - function_results = tool_calls.map(&:to_result) - iterations += 1 - end - - response - end -end -``` - -Key properties: -- Uses the existing `LlmConcept#chat_response` interface (provider-agnostic) -- Uses the existing `FunctionToolCaller` for dispatching (no manual function routing) -- Returns the final `ChatResponse` with the LLM's text answer -- Capped iterations prevent runaway spend - -### Step 2: Refactor `PdfProcessor` to use `Assistant::ToolLoop` +### Step 1: Add a synchronous tool-call helper to `PdfProcessor` **File:** `app/models/provider/openai/pdf_processor.rb` -Changes: -1. **Remove** `run_reconciliation_tool_loop` and `execute_reconciliation_tool_call` methods -2. **Remove** `reconciliation_tools` method that manually builds tool definitions -3. **Add** a method that instantiates an `Assistant::ToolLoop` with the appropriate functions and calls `run` -4. The reconciliation prompt/instructions stay in `PdfProcessor` (that's domain-specific and belongs here) - -The `PdfProcessor` needs access to a `family` to instantiate `Assistant::Function::GetAccounts` and `GetTransactions` (they require a `user`). The PR already passes `family` into `PdfProcessor`. We'll use `family.users.first` (or an admin) as the `tool_user`, consistent with the PR's existing approach. 
+Replace `run_reconciliation_tool_loop`, `execute_reconciliation_tool_call`, and `reconciliation_tools` with a method that uses the existing provider and function infrastructure:
 
 ```ruby
-def run_reconciliation(prompt, effective_model)
+def run_reconciliation(pdf_text, effective_model)
   user = family&.users&.find_by(role: :admin) || family&.users&.first
   return nil unless user
 
@@ -94,31 +43,48 @@ def run_reconciliation(prompt, effective_model)
     Assistant::Function::GetAccounts.new(user),
     Assistant::Function::GetTransactions.new(user)
   ]
   tool_caller = Assistant::FunctionToolCaller.new(functions)
-
-  # We need an LLM provider instance — we already have `client` but need
-  # a Provider::Openai instance. Since PdfProcessor is already inside the
-  # Openai namespace and receives the client, we construct a lightweight
-  # wrapper or reuse the provider from the registry.
   llm = Provider::Registry.get_provider(:openai)
 
-  loop = Assistant::ToolLoop.new(
-    llm: llm,
-    function_tool_caller: tool_caller,
-    max_iterations: 8
-  )
-
-  loop.run(
-    prompt: prompt,
+  # First call: LLM analyzes PDF text and requests tools (get_accounts, get_transactions)
+  response = llm.chat_response(
+    reconciliation_prompt(pdf_text),
     model: effective_model,
     instructions: reconciliation_instructions,
+    functions: tool_caller.function_definitions,
     family: family
   )
+  raise response.error unless response.success?
+
+  first_response = response.data
+  return parse_reconciliation(first_response) if first_response.function_requests.empty?
+ + # Execute requested tools + tool_calls = tool_caller.fulfill_requests(first_response.function_requests) + + # Follow-up call: LLM receives tool results, produces reconciliation + follow_up = llm.chat_response( + reconciliation_prompt(pdf_text), + model: effective_model, + instructions: reconciliation_instructions, + functions: tool_caller.function_definitions, + function_results: tool_calls.map(&:to_result), + family: family + ) + raise follow_up.error unless follow_up.success? + + parse_reconciliation(follow_up.data) end ``` -### Step 3: Update `PdfProcessingResult` to include reconciliation +Key points: +- Uses `LlmConcept#chat_response` (provider-agnostic, not raw OpenAI API) +- Uses `FunctionToolCaller` for dispatch (no manual function routing by name) +- Two calls max: initial + follow-up with tool results (same as `Responder`) +- No loop needed — the LLM gets both tools, calls what it needs, gets one follow-up + +### Step 2: Update `PdfProcessingResult` to include reconciliation **File:** `app/models/provider/llm_concept.rb` @@ -126,11 +91,7 @@ end PdfProcessingResult = Data.define(:summary, :document_type, :extracted_data, :reconciliation) ``` -Add `:reconciliation` field with a default of `nil` (via `Data.define` keyword init). 
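One wrinkle, assuming Ruby 3.2+ `Data`: `Data.define` has no built-in member defaults, so a `nil` default for `:reconciliation` needs an explicit `initialize` override. A minimal sketch (names mirror the plan; this idiom is one option, not a confirmed part of the PR):

```ruby
# Sketch only: give :reconciliation a nil default so existing call sites
# that omit it keep working. Member names mirror the plan, not the PR.
PdfProcessingResult = Data.define(:summary, :document_type, :extracted_data, :reconciliation) do
  def initialize(summary:, document_type:, extracted_data:, reconciliation: nil)
    super
  end
end

result = PdfProcessingResult.new(summary: "s", document_type: "bank_statement", extracted_data: {})
result.reconciliation # => nil
```

Without the override, every caller of `PdfProcessingResult.new` would have to pass `reconciliation:` explicitly.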
- -### Step 4: Update `build_result` in `PdfProcessor` - -Pass the reconciliation data from the tool loop response into the `PdfProcessingResult`: +### Step 3: Update `build_result` in `PdfProcessor` ```ruby def build_result(parsed, reconciliation: nil) @@ -143,7 +104,7 @@ def build_result(parsed, reconciliation: nil) end ``` -### Step 5: Keep PR #1382's domain changes (job, import, views, locales) +### Step 4: Keep PR #1382's domain changes (job, import, views, locales) The following changes from PR #1382 are domain-specific and correct — they should remain as-is: - `ProcessPdfJob` — skip extraction when reconciliation matches @@ -151,38 +112,27 @@ The following changes from PR #1382 are domain-specific and correct — they sho - View changes (`_pdf_import.html.erb`, `_nav.html.erb`) - Locale additions -### Step 6: Update tests +### Step 5: Update tests **File:** `test/models/provider/openai/pdf_processor_test.rb` - Remove tests that assert on manual tool-call loop internals -- Add tests that verify `Assistant::ToolLoop` is used -- Mock `Assistant::ToolLoop#run` to return expected reconciliation data - -**New file:** `test/models/assistant/tool_loop_test.rb` - -- Test the loop terminates when no function_requests -- Test the loop executes tools and passes results back -- Test the loop respects max_iterations -- Test error propagation - -### Step 7: Optionally refactor `Assistant::Responder` to use `ToolLoop` - -`Responder#handle_follow_up_response` currently does a single-shot tool call + follow-up. It could be refactored to delegate to `ToolLoop` for the non-streaming path, but this is a separate concern and can be deferred. The streaming requirement in `Responder` makes this non-trivial — note this as a future improvement. 
+- Test that `chat_response` is called with correct function definitions +- Test that `FunctionToolCaller.fulfill_requests` is used for tool dispatch +- Mock provider responses to return `ChatResponse` / `ChatFunctionRequest` objects ## Files Changed | File | Action | Description | |------|--------|-------------| -| `app/models/assistant/tool_loop.rb` | **Create** | Reusable synchronous tool-calling loop | -| `app/models/provider/openai/pdf_processor.rb` | **Modify** | Replace manual tool loop with `Assistant::ToolLoop` | +| `app/models/provider/openai/pdf_processor.rb` | **Modify** | Replace manual tool loop with provider-based `chat_response` + `FunctionToolCaller` | | `app/models/provider/llm_concept.rb` | **Modify** | Add `:reconciliation` to `PdfProcessingResult` | -| `test/models/assistant/tool_loop_test.rb` | **Create** | Tests for the new tool loop | | `test/models/provider/openai/pdf_processor_test.rb` | **Modify** | Update tests for refactored reconciliation | ## Key Design Decisions -1. **Separate class vs. extending Responder**: `Responder` is tightly coupled to streaming and chat events. A separate `ToolLoop` is simpler and more reusable. -2. **Provider-agnostic**: `ToolLoop` uses the `LlmConcept#chat_response` interface, so it works with any provider (OpenAI, custom, future providers). -3. **Reuses existing `FunctionToolCaller`**: No new function execution code — the same dispatch mechanism used by the chat assistant. +1. **No new abstraction needed**: The existing `chat_response` + `FunctionToolCaller` pattern is sufficient for a two-call flow. No `ToolLoop` class required. +2. **Provider-agnostic**: Uses `LlmConcept#chat_response` interface, not raw OpenAI API. +3. **Reuses existing `FunctionToolCaller`**: No new function execution code — same dispatch mechanism as the chat assistant. 4. **No changes to `Assistant::Function` classes**: `GetAccounts` and `GetTransactions` work as-is. +5. 
**Matches existing pattern**: The two-call flow (initial → execute tools → follow-up) mirrors `Responder#handle_follow_up_response` exactly.
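
The two-call flow in decision 5 can be sketched end-to-end with stand-in stubs. `FakeLLM`, `FakeToolCaller`, and `FakeResponse` below are illustrative substitutes for the app's provider, `Assistant::FunctionToolCaller`, and `ChatResponse`; only the control flow is the point:

```ruby
# Minimal self-contained sketch of "initial call -> execute tools -> follow-up".
# All classes here are fakes; they do not exist in the app.
FakeResponse = Struct.new(:text, :function_requests)

class FakeLLM
  # First call (no function_results) returns a tool request;
  # the follow-up call returns the final text answer.
  def chat_response(_prompt, function_results: [])
    if function_results.empty?
      FakeResponse.new(nil, [:get_accounts])
    else
      FakeResponse.new("reconciliation: matched", [])
    end
  end
end

class FakeToolCaller
  def fulfill_requests(requests)
    requests.map { |req| { request: req, result: "2 accounts found" } }
  end
end

llm = FakeLLM.new
tool_caller = FakeToolCaller.new

first = llm.chat_response("statement text")
final =
  if first.function_requests.empty?
    first
  else
    results = tool_caller.fulfill_requests(first.function_requests)
    llm.chat_response("statement text", function_results: results)
  end

puts final.text # => "reconciliation: matched"
```

Note there is deliberately no loop: if the follow-up response still contained function requests, they would go unanswered, which is why the instructions must steer the LLM to request everything it needs in the first round.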