paperclip/doc/spec/agents-runtime.md
Dotta 236d11d36f [codex] Add run liveness continuations (#4083)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution
window.
> - Long-running local agents can exhaust context or stop while still
holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path
to be durable and visible.
> - This pull request adds run liveness metadata, continuation
summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with
clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking,
and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation
summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context
through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness
continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run
ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance,
onboarding assets, and skills instructions to explain continuation
behavior.
- Addressed Greptile feedback by scoping document evidence by run,
excluding system continuation-summary documents from liveness evidence,
importing shared liveness types, surfacing hidden ledger run counts,
documenting bounded retry behavior, and moving run-ledger liveness
backfill off the request path.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/run-liveness.test.ts
server/src/__tests__/activity-service.test.ts
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-continuation-summary.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/run-continuations.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a
plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity
service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and
Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after
fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after
`0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery,
activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent
guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger
response can briefly show historical runs without liveness data until
the background backfill completes.
- UI screenshot coverage is not included in this packaging pass;
validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git,
GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no before/after screenshots were captured in this PR
packaging pass; the UI changes are covered by focused component tests
listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00


Agent Runtime Guide

Status: User-facing guide
Last updated: 2026-02-17
Audience: Operators setting up and running agents in Paperclip

1. What this system does

Agents in Paperclip do not run continuously.
They run in heartbeats: short execution windows triggered by a wakeup.

Each heartbeat:

  1. Starts the configured agent adapter (for example, Claude CLI or Codex CLI)
  2. Gives it the current prompt/context
  3. Lets it work until it exits, times out, or is cancelled
  4. Stores results (status, token usage, errors, logs)
  5. Updates the UI live

2. When an agent wakes up

An agent can be woken up in four ways:

  • timer: scheduled interval (for example every 5 minutes)
  • assignment: when work is assigned/checked out to that agent
  • on_demand: manual wakeup (button/API)
  • automation: system-triggered wakeup for future automations

If an agent is already running, new wakeups are merged (coalesced) instead of launching duplicate runs.
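
The coalescing rule above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not Paperclip's implementation: WakeupCoalescer, requestWakeup, and finishRun are hypothetical names, and what the system does with merged wakeups after a run ends is not described in this guide.

```typescript
// Hypothetical sketch: class and method names are illustrative, not Paperclip's API.
type WakeupSource = "timer" | "assignment" | "on_demand" | "automation";

interface Wakeup {
  agentId: string;
  source: WakeupSource;
}

class WakeupCoalescer {
  // Agents that currently have a heartbeat run in progress.
  private running = new Set<string>();
  // Wakeup sources merged while a run was in progress, per agent.
  private pending = new Map<string, Set<WakeupSource>>();

  // Returns "started" if a new run should launch, "coalesced" if it was merged.
  requestWakeup(wakeup: Wakeup): "started" | "coalesced" {
    if (this.running.has(wakeup.agentId)) {
      const sources = this.pending.get(wakeup.agentId) ?? new Set<WakeupSource>();
      sources.add(wakeup.source);
      this.pending.set(wakeup.agentId, sources);
      return "coalesced";
    }
    this.running.add(wakeup.agentId);
    return "started";
  }

  // Marks the run as finished and returns the sources merged during it.
  // Assumption: the caller decides whether merged wakeups trigger a follow-up run.
  finishRun(agentId: string): WakeupSource[] {
    this.running.delete(agentId);
    const merged = Array.from(this.pending.get(agentId) ?? []);
    this.pending.delete(agentId);
    return merged;
  }
}
```

In this model, a second wakeup for a running agent never starts a duplicate run; it is only recorded against the run already in progress.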

3. What to configure per agent

3.1 Adapter choice

Common choices:

  • claude_local: runs your local claude CLI
  • codex_local: runs your local codex CLI
  • process: generic shell command adapter
  • http: calls an external HTTP endpoint

For claude_local and codex_local, Paperclip assumes the CLI is already installed and authenticated on the host machine.

3.2 Runtime behavior

In agent runtime settings, configure heartbeat policy:

  • enabled: allow scheduled heartbeats
  • intervalSec: timer interval (0 = disabled)
  • wakeOnAssignment: wake when assigned work
  • wakeOnOnDemand: allow ping-style on-demand wakeups
  • wakeOnAutomation: allow system automation wakeups
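
As an illustration, the policy fields above could be modelled like this. The field names come from the list; the example values, the HeartbeatPolicy type name, and the isWakeupAllowed helper are assumptions made for this sketch, and the real gating logic may differ.

```typescript
// Field names are taken from the guide; everything else is illustrative.
interface HeartbeatPolicy {
  enabled: boolean;          // allow scheduled heartbeats
  intervalSec: number;       // timer interval, 0 = disabled
  wakeOnAssignment: boolean; // wake when work is assigned
  wakeOnOnDemand: boolean;   // allow manual (button/API) wakeups
  wakeOnAutomation: boolean; // allow system automation wakeups
}

type WakeupSource = "timer" | "assignment" | "on_demand" | "automation";

// Example values only: a 5-minute timer plus assignment and manual wakeups.
const examplePolicy: HeartbeatPolicy = {
  enabled: true,
  intervalSec: 300,
  wakeOnAssignment: true,
  wakeOnOnDemand: true,
  wakeOnAutomation: false,
};

// Hypothetical helper: decides whether a wakeup source is permitted.
// Assumption: "enabled" gates only the timer, as the field description suggests.
function isWakeupAllowed(policy: HeartbeatPolicy, source: WakeupSource): boolean {
  switch (source) {
    case "timer":
      return policy.enabled && policy.intervalSec > 0;
    case "assignment":
      return policy.wakeOnAssignment;
    case "on_demand":
      return policy.wakeOnOnDemand;
    case "automation":
      return policy.wakeOnAutomation;
  }
}
```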

3.3 Working directory and execution limits

For local adapters, set:

  • cwd (working directory)
  • timeoutSec (max runtime per heartbeat)
  • graceSec (time before force-kill after timeout/cancel)
  • optional env vars and extra CLI args
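
The relationship between timeoutSec and graceSec can be worked through with a small calculation. This sketch assumes the grace window starts at the timeout or at the moment of cancellation, whichever comes first; phaseAt and the phase names are hypothetical, and the guide does not specify how the process is signalled.

```typescript
// Field names are from the guide; the helper and phase names are illustrative.
interface ExecutionLimits {
  timeoutSec: number; // max runtime per heartbeat
  graceSec: number;   // time before force-kill after timeout/cancel
}

type Phase = "running" | "grace" | "force_kill";

// Returns the phase a run is in after elapsedSec seconds.
// cancelledAtSec is optional: the elapsed time at which an operator cancelled.
function phaseAt(
  limits: ExecutionLimits,
  elapsedSec: number,
  cancelledAtSec?: number,
): Phase {
  // A stop is requested at the timeout, or earlier if the run was cancelled.
  const stopAt =
    cancelledAtSec === undefined
      ? limits.timeoutSec
      : Math.min(limits.timeoutSec, cancelledAtSec);
  if (elapsedSec < stopAt) return "running";
  if (elapsedSec < stopAt + limits.graceSec) return "grace";
  return "force_kill";
}
```

With timeoutSec = 600 and graceSec = 15, a run that ignores the stop request would be force-killed 615 seconds after it started.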

3.4 Prompt templates

You can set:

  • promptTemplate: used for every run (first run and resumed sessions)

Templates support variables like {{agent.id}}, {{agent.name}}, and run context values.
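
To make the variable syntax concrete, here is a minimal sketch of how placeholders of this form could be substituted. It is not Paperclip's renderer: renderTemplate and lookup are hypothetical helpers, and leaving unknown variables untouched is an assumption of this sketch.

```typescript
// Hypothetical helpers; only the {{dotted.path}} syntax comes from the guide.
type TemplateContext = { [key: string]: unknown };

// Resolves a dotted path such as "agent.id" against a nested context object.
function lookup(ctx: TemplateContext, path: string): unknown {
  let current: unknown = ctx;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as { [k: string]: unknown })[key];
  }
  return current;
}

// Replaces each {{path}} with its value; unknown paths are left as written.
function renderTemplate(template: string, ctx: TemplateContext): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (match: string, path: string) => {
      const value = lookup(ctx, path);
      return value === undefined || value === null ? match : String(value);
    },
  );
}

const rendered = renderTemplate(
  "You are {{agent.name}} (id {{agent.id}}).",
  { agent: { id: "a1", name: "Builder" } },
);
// rendered === "You are Builder (id a1)."
```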

4. Session resume behavior

Paperclip stores resumable session state per (agent, taskKey, adapterType). taskKey is derived from wakeup context (taskKey, taskId, or issueId).

  • A heartbeat for the same task key reuses the previous session for that task.
  • Different task keys for the same agent keep separate session state.
  • If restore fails, adapters should retry once with a fresh session and continue.
  • You can reset all sessions for an agent or reset one task session by task key.
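
The keying and reset rules above can be sketched as follows. This is an in-memory illustration only: SessionStore, deriveTaskKey, and the record shape are hypothetical, and the precedence taskKey, then taskId, then issueId is assumed from the order in which the guide lists them.

```typescript
// Hypothetical sketch of per-(agent, taskKey, adapterType) session state.
interface WakeupContext {
  taskKey?: string;
  taskId?: string;
  issueId?: string;
}

// Assumption: the first value present wins, in the order the guide lists them.
function deriveTaskKey(ctx: WakeupContext): string | undefined {
  return ctx.taskKey ?? ctx.taskId ?? ctx.issueId;
}

interface SessionRecord {
  agentId: string;
  taskKey: string;
  adapterType: string;
  sessionId: string;
}

class SessionStore {
  private records: SessionRecord[] = [];

  // Stores a session, replacing any earlier one for the same triple.
  save(record: SessionRecord): void {
    this.records = this.records.filter(
      (r) =>
        !(
          r.agentId === record.agentId &&
          r.taskKey === record.taskKey &&
          r.adapterType === record.adapterType
        ),
    );
    this.records.push(record);
  }

  // Returns the session to resume, or undefined to start fresh.
  find(agentId: string, taskKey: string, adapterType: string): string | undefined {
    const hit = this.records.find(
      (r) =>
        r.agentId === agentId &&
        r.taskKey === taskKey &&
        r.adapterType === adapterType,
    );
    return hit ? hit.sessionId : undefined;
  }

  // Resets one task session by task key.
  resetTask(agentId: string, taskKey: string): void {
    this.records = this.records.filter(
      (r) => !(r.agentId === agentId && r.taskKey === taskKey),
    );
  }

  // Resets all sessions for an agent.
  resetAgent(agentId: string): void {
    this.records = this.records.filter((r) => r.agentId !== agentId);
  }
}
```

After a reset, find returns undefined for the affected keys, which in this model means the next heartbeat starts a fresh session.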

Use session reset when:

  • you significantly changed prompt strategy
  • the agent is stuck in a bad loop
  • you want a clean restart

5. Logs, status, and run history

For each heartbeat run you get:

  • run status (queued, running, succeeded, failed, timed_out, cancelled)
  • error text and stderr/stdout excerpts
  • token usage/cost when available from the adapter
  • full logs (stored outside core run rows, optimized for large output)

In local/dev setups, full logs are stored on disk under the configured run-log path.
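
For tooling that consumes run history, the status values above map naturally onto a union type. The status strings are from the list; the RunStatus type name, the isTerminal helper, and the split into finished and unfinished statuses are assumptions of this sketch.

```typescript
// Status strings are from the guide; the type and helper names are illustrative.
type RunStatus =
  | "queued"
  | "running"
  | "succeeded"
  | "failed"
  | "timed_out"
  | "cancelled";

// Assumption: these four statuses mean the run will not change again.
const TERMINAL_STATUSES: ReadonlyArray<RunStatus> = [
  "succeeded",
  "failed",
  "timed_out",
  "cancelled",
];

// True once a run has finished, whether or not it succeeded.
function isTerminal(status: RunStatus): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}
```

A dashboard or script could use isTerminal to decide when to stop polling a run and read its final logs.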

6. Live updates in the UI

Paperclip pushes runtime/activity updates to the browser in real time.

You should see live changes for:

  • agent status
  • heartbeat run status
  • task/activity updates caused by agent work
  • dashboard/cost/activity panels as relevant

If the connection drops, the UI reconnects automatically.

7. Common operating patterns

7.1 Simple autonomous loop

  1. Enable timer wakeups (for example every 300s)
  2. Keep assignment wakeups on
  3. Use a focused prompt template that tells agents to act in the same heartbeat, leave durable progress, and mark blocked work with an owner/action
  4. Watch run logs and adjust prompt/config over time

7.2 Event-driven loop (less constant polling)

  1. Disable timer or set a long interval
  2. Keep wake-on-assignment enabled
  3. Use child issues, comments, and on-demand wakeups for handoffs instead of loops that poll agents, sessions, or processes

7.3 Safety-first loop

  1. Short timeout
  2. Conservative prompt
  3. Monitor errors and cancel runs quickly when needed
  4. Reset sessions when drift appears

8. Troubleshooting

If runs fail repeatedly:

  1. Check adapter command availability (claude/codex installed and logged in).
  2. Verify cwd exists and is accessible.
  3. Inspect run error + stderr excerpt, then full log.
  4. Confirm timeout is not too low.
  5. Reset session and retry.
  6. Pause agent if it is causing repeated bad updates.

Typical failure causes:

  • CLI not installed/authenticated
  • bad working directory
  • malformed adapter args/env
  • prompt too broad or missing constraints
  • process timeout

9. Security and risk notes

Local CLI adapters run unsandboxed on the host machine.

That means:

  • prompt instructions matter
  • configured credentials/env vars are sensitive
  • working directory permissions matter

Start with least privilege where possible, and avoid exposing secrets in broad reusable prompts unless intentionally required.

10. Minimal setup checklist

  1. Choose adapter (claude_local or codex_local).
  2. Set cwd to the target workspace.
  3. Add a prompt template (promptTemplate).
  4. Configure heartbeat policy (timer and/or assignment wakeups).
  5. Trigger a manual wakeup.
  6. Confirm run succeeds and session/token usage is recorded.
  7. Watch live updates and iterate prompt/config.