mirror of https://github.com/paperclipai/paperclip synced 2026-04-25 17:25:15 +02:00

Dotta 236d11d36f [codex] Add run liveness continuations (#4083)

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution
window.
> - Long-running local agents can exhaust context or stop while still
holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path
to be durable and visible.
> - This pull request adds run liveness metadata, continuation
summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with
clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking,
and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation
summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context
through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness
continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run
ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance,
onboarding assets, and skills instructions to explain continuation
behavior.
- Addressed Greptile feedback by scoping document evidence by run,
excluding system continuation-summary documents from liveness evidence,
importing shared liveness types, surfacing hidden ledger run counts,
documenting bounded retry behavior, and moving run-ledger liveness
backfill off the request path.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/run-liveness.test.ts
server/src/__tests__/activity-service.test.ts
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-continuation-summary.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/run-continuations.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a
plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity
service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and
Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after
fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after
`0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery,
activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent
guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger
response can briefly show historical missing liveness until the
background backfill completes.
- UI screenshot coverage is not included in this packaging pass;
validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git,
GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no before/after screenshots were captured in this PR
packaging pass; the UI changes are covered by focused component tests
listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>

2026-04-20 06:01:49 -05:00

Agent Runtime Guide

Status: User-facing guide
Last updated: 2026-02-17
Audience: Operators setting up and running agents in Paperclip

1. What this system does

Agents in Paperclip do not run continuously.
They run in heartbeats: short execution windows triggered by a wakeup.

Each heartbeat:

Starts the configured agent adapter (for example, Claude CLI or Codex CLI)
Gives it the current prompt/context
Lets it work until it exits, times out, or is cancelled
Stores results (status, token usage, errors, logs)
Updates the UI live

2. When an agent wakes up

An agent can be woken up in four ways:

timer: scheduled interval (for example every 5 minutes)
assignment: when work is assigned/checked out to that agent
on_demand: manual wakeup (button/API)
automation: system-triggered wakeup for future automations

If an agent is already running, new wakeups are merged (coalesced) instead of launching duplicate runs.
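
The coalescing rule above can be sketched as a small in-memory queue. This is a minimal illustration, not Paperclip's implementation: `WakeQueue`, `WakeRequest`, and the return values are hypothetical names, and what happens to merged wakeups after a run finishes is left to the caller because this guide does not specify it.

```typescript
// Hypothetical sketch: names and shapes here are illustrative only.
type WakeSource = "timer" | "assignment" | "on_demand" | "automation";

interface WakeRequest {
  agentId: string;
  source: WakeSource;
}

class WakeQueue {
  private running = new Set<string>();
  // Wakeups that arrived while a run was active, merged per agent.
  private pending = new Map<string, Set<WakeSource>>();

  /** Returns "started" if a new run should launch, "coalesced" otherwise. */
  wake(req: WakeRequest): "started" | "coalesced" {
    if (this.running.has(req.agentId)) {
      const sources = this.pending.get(req.agentId) ?? new Set<WakeSource>();
      sources.add(req.source);
      this.pending.set(req.agentId, sources);
      return "coalesced";
    }
    this.running.add(req.agentId);
    return "started";
  }

  /** Marks the run finished and returns the distinct sources merged during it. */
  finish(agentId: string): WakeSource[] {
    this.running.delete(agentId);
    const merged = Array.from(this.pending.get(agentId) ?? []);
    this.pending.delete(agentId);
    return merged;
  }
}
```

Two assignment wakeups that arrive during one run collapse into a single pending entry, so no duplicate run is launched for the same agent.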

3. What to configure per agent

3.1 Adapter choice

Common choices:

claude_local: runs your local claude CLI
codex_local: runs your local codex CLI
process: generic shell command adapter
http: calls an external HTTP endpoint

For claude_local and codex_local, Paperclip assumes the CLI is already installed and authenticated on the host machine.

3.2 Runtime behavior

In agent runtime settings, configure heartbeat policy:

enabled: allow scheduled heartbeats
intervalSec: timer interval (0 = disabled)
wakeOnAssignment: wake when assigned work
wakeOnOnDemand: allow ping-style on-demand wakeups
wakeOnAutomation: allow system automation wakeups
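
As a rough illustration of how these fields interact, the sketch below models the policy as a typed object and checks whether a given wake source is allowed. The field names come from the list above; the `HeartbeatPolicy` type, the `allowsWake` helper, and the reading that `enabled` gates only timer wakeups are assumptions, not confirmed Paperclip behavior.

```typescript
// Hypothetical sketch: the type and helper are illustrative; only the
// field names are taken from the guide.
type WakeSource = "timer" | "assignment" | "on_demand" | "automation";

interface HeartbeatPolicy {
  enabled: boolean;          // allow scheduled heartbeats
  intervalSec: number;       // timer interval (0 = disabled)
  wakeOnAssignment: boolean; // wake when assigned work
  wakeOnOnDemand: boolean;   // allow ping-style on-demand wakeups
  wakeOnAutomation: boolean; // allow system automation wakeups
}

function allowsWake(policy: HeartbeatPolicy, source: WakeSource): boolean {
  switch (source) {
    case "timer":
      // Assumption: a timer wake needs both the flag and a positive interval.
      return policy.enabled && policy.intervalSec > 0;
    case "assignment":
      return policy.wakeOnAssignment;
    case "on_demand":
      return policy.wakeOnOnDemand;
    case "automation":
      return policy.wakeOnAutomation;
  }
}

// Example: an event-driven agent with no timer polling.
const eventDriven: HeartbeatPolicy = {
  enabled: false,
  intervalSec: 0,
  wakeOnAssignment: true,
  wakeOnOnDemand: true,
  wakeOnAutomation: false,
};
```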

3.3 Working directory and execution limits

For local adapters, set:

cwd (working directory)
timeoutSec (max runtime per heartbeat)
graceSec (time before force-kill after timeout/cancel)
optional env vars and extra CLI args
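
To make the relationship between timeoutSec and graceSec concrete, here is a small sketch that computes when a run would be asked to stop and when it would be force-killed. `planTermination` and `nextAction` are hypothetical helpers, and the sketch assumes the grace period starts at the timeout; it does not show how Paperclip actually signals the process.

```typescript
// Hypothetical sketch: helper names are illustrative, not Paperclip APIs.
interface ExecutionLimits {
  timeoutSec: number; // max runtime per heartbeat
  graceSec: number;   // time before force-kill after timeout/cancel
}

interface TerminationPlan {
  softStopAtMs: number;  // when the run is asked to stop
  forceKillAtMs: number; // when the run is killed if still alive
}

function planTermination(startedAtMs: number, limits: ExecutionLimits): TerminationPlan {
  const softStopAtMs = startedAtMs + limits.timeoutSec * 1000;
  return { softStopAtMs, forceKillAtMs: softStopAtMs + limits.graceSec * 1000 };
}

type RunAction = "continue" | "request_stop" | "force_kill";

function nextAction(nowMs: number, plan: TerminationPlan): RunAction {
  if (nowMs >= plan.forceKillAtMs) return "force_kill";
  if (nowMs >= plan.softStopAtMs) return "request_stop";
  return "continue";
}
```

With timeoutSec set to 60 and graceSec set to 10, a run still alive after 70 seconds would reach the force-kill step under these assumptions.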

3.4 Prompt templates

You can set:

promptTemplate: used for every run (first run and resumed sessions)

Templates support variables like {{agent.id}}, {{agent.name}}, and run context values.
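
The variable substitution described above might look roughly like the following. This is a sketch under assumptions: `renderTemplate` is a hypothetical helper, the real template engine may differ, and leaving unknown variables untouched is a choice made here for illustration.

```typescript
// Hypothetical sketch: not the template engine Paperclip uses.
function renderTemplate(template: string, context: Record<string, unknown>): string {
  // Matches {{ dotted.path }} with optional whitespace inside the braces.
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, path: string) => {
    let value: unknown = context;
    for (const key of path.split(".")) {
      // Stop at anything that cannot be indexed and keep the placeholder.
      if (value === null || typeof value !== "object") return match;
      value = (value as Record<string, unknown>)[key];
    }
    return value === undefined || value === null ? match : String(value);
  });
}

// Usage with illustrative values.
const rendered = renderTemplate(
  "You are {{agent.name}} ({{agent.id}}). Continue the assigned work.",
  { agent: { id: "agent-1", name: "Scout" } },
);
```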

4. Session resume behavior

Paperclip stores resumable session state per (agent, taskKey, adapterType). taskKey is derived from wakeup context (taskKey, taskId, or issueId).

A heartbeat for the same task key reuses the previous session for that task.
Different task keys for the same agent keep separate session state.
If restore fails, adapters should retry once with a fresh session and continue.
You can reset all sessions for an agent or reset one task session by task key.
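
The rules above can be sketched as a store keyed by (agent, taskKey, adapterType) plus a retry-once wrapper. Everything named here (`SessionStore`, `deriveTaskKey`, `runWithResume`) is hypothetical, the precedence order taskKey, then taskId, then issueId is an assumption, and the sketch treats any thrown error as a restore failure for simplicity.

```typescript
// Hypothetical sketch: illustrative names, not Paperclip's session code.
interface SessionKey {
  agentId: string;
  taskKey: string;
  adapterType: string;
}

// Assumption: taskKey wins over taskId, which wins over issueId.
function deriveTaskKey(ctx: { taskKey?: string; taskId?: string; issueId?: string }): string | undefined {
  return ctx.taskKey ?? ctx.taskId ?? ctx.issueId;
}

class SessionStore {
  private sessions = new Map<string, { key: SessionKey; sessionId: string }>();

  private static idOf(k: SessionKey): string {
    return JSON.stringify([k.agentId, k.taskKey, k.adapterType]);
  }

  get(key: SessionKey): string | undefined {
    return this.sessions.get(SessionStore.idOf(key))?.sessionId;
  }

  set(key: SessionKey, sessionId: string): void {
    this.sessions.set(SessionStore.idOf(key), { key, sessionId });
  }

  /** Resets one task session, or all sessions for the agent if taskKey is omitted. */
  reset(agentId: string, taskKey?: string): void {
    this.sessions.forEach((entry, id) => {
      const sameAgent = entry.key.agentId === agentId;
      const sameTask = taskKey === undefined || entry.key.taskKey === taskKey;
      if (sameAgent && sameTask) this.sessions.delete(id);
    });
  }
}

/** Runs with the stored session; on failure, retries once with a fresh one. */
function runWithResume(
  store: SessionStore,
  key: SessionKey,
  run: (previousSessionId: string | undefined) => string,
): string {
  const previous = store.get(key);
  let sessionId: string;
  try {
    sessionId = run(previous);
  } catch (err) {
    if (previous === undefined) throw err; // already fresh, nothing to retry
    sessionId = run(undefined);            // single retry with a fresh session
  }
  store.set(key, sessionId);
  return sessionId;
}
```

Keeping adapterType in the key means that switching an agent from one adapter to another does not try to resume a session the new adapter cannot read.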

Use session reset when:

you significantly changed prompt strategy
the agent is stuck in a bad loop
you want a clean restart

5. Logs, status, and run history

For each heartbeat run you get:

run status (queued, running, succeeded, failed, timed_out, cancelled)
error text and stderr/stdout excerpts
token usage/cost when available from the adapter
full logs (stored outside core run rows, optimized for large output)

In local/dev setups, full logs are stored on disk under the configured run-log path.
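
For operators scripting against run history, the statuses above can be modeled as a closed set. The status strings are taken from this guide; the `RunSummary` shape, its field names, and the two helpers are hypothetical and only illustrate that usage and cost may be absent.

```typescript
// Hypothetical sketch: only the status strings come from the guide.
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "timed_out" | "cancelled";

interface RunSummary {
  status: RunStatus;
  errorText?: string;  // present when the run reported an error
  tokensUsed?: number; // only when the adapter reports usage
  costUsd?: number;    // only when the adapter reports cost
}

const TERMINAL_STATUSES: ReadonlyArray<RunStatus> = ["succeeded", "failed", "timed_out", "cancelled"];

/** True once the run can no longer change state. */
function isTerminal(status: RunStatus): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}

/** True for outcomes an operator would normally inspect in the logs. */
function needsAttention(run: RunSummary): boolean {
  return run.status === "failed" || run.status === "timed_out";
}
```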

6. Live updates in the UI

Paperclip pushes runtime/activity updates to the browser in real time.

You should see live changes for:

agent status
heartbeat run status
task/activity updates caused by agent work
dashboard/cost/activity panels as relevant

If the connection drops, the UI reconnects automatically.

7. Common operating patterns

7.1 Simple autonomous loop

Enable timer wakeups (for example every 300s)
Keep assignment wakeups on
Use a focused prompt template that tells agents to act in the same heartbeat, leave durable progress, and mark blocked work with an owner/action
Watch run logs and adjust prompt/config over time

7.2 Event-driven loop (less constant polling)

Disable timer or set a long interval
Keep wake-on-assignment enabled
Use child issues, comments, and on-demand wakeups for handoffs instead of loops that poll agents, sessions, or processes

7.3 Safety-first loop

Short timeout
Conservative prompt
Monitor errors + cancel quickly when needed
Reset sessions when drift appears

8. Troubleshooting

If runs fail repeatedly:

Check adapter command availability (claude/codex installed and logged in).
Verify cwd exists and is accessible.
Inspect run error + stderr excerpt, then full log.
Confirm timeout is not too low.
Reset session and retry.
Pause agent if it is causing repeated bad updates.

Typical failure causes:

CLI not installed/authenticated
bad working directory
malformed adapter args/env
prompt too broad or missing constraints
process timeout

9. Security and risk notes

Local CLI adapters run unsandboxed on the host machine.

That means:

prompt instructions matter
configured credentials/env vars are sensitive
working directory permissions matter

Start with least privilege where possible, and avoid exposing secrets in broad reusable prompts unless intentionally required.

10. Minimal setup checklist

Choose adapter (claude_local or codex_local).
Set cwd to the target workspace.
Set the prompt template (promptTemplate).
Configure heartbeat policy (timer and/or assignment wakeups).
Trigger a manual wakeup.
Confirm run succeeds and session/token usage is recorded.
Watch live updates and iterate prompt/config.
