Compare commits

...

43 Commits

Author SHA1 Message Date
Devin Foley
5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta
deba60ebb2 Stabilize serialized server route tests (#4448)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server route suite is a core confidence layer for auth, issue
context, and workspace runtime behavior
> - Some route tests were doing extra module/server isolation work that
made local runs slower and more fragile
> - The stable Vitest runner also needs to pass server-relative exclude
paths to avoid accidentally re-including serialized suites
> - This pull request tightens route test isolation and runner
serialization behavior
> - The benefit is more reliable targeted and stable-route test
execution without product behavior changes

## What Changed

- Updated `run-vitest-stable.mjs` to exclude serialized server tests
using server-relative paths.
- Forced the server Vitest config to use a single worker in addition to
isolated forks.
- Simplified agent permission route tests to create per-request test
servers without shared server lifecycle state.
- Stabilized issue goal context route mocks by using static mocked
services and a sequential suite.
- Re-registered workspace runtime route mocks before cache-busted route
imports.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `node --check scripts/run-vitest-stable.mjs`

## Risks

- Low risk. This is test infrastructure only.
- The stable runner path fix changes which tests are excluded from the
non-serialized server batch, matching the server project root that
Vitest applies internally.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:27:00 -05:00
Dotta
f68e9caa9a Polish markdown external link wrapping (#4447)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI renders agent comments, PR links, issue links, and
operational markdown throughout issue threads
> - Long GitHub and external links can wrap awkwardly, leaving icons
orphaned from the text they describe
> - Small inbox visual polish also helps repeated board scanning without
changing behavior
> - This pull request glues markdown link icons to adjacent link
characters and removes a redundant inbox list border
> - The benefit is cleaner, more stable markdown and inbox rendering for
day-to-day operator review

## What Changed

- Added an external-link indicator for external markdown links.
- Kept the GitHub icon attached to the first link character so it does
not wrap onto a separate line.
- Kept the external-link icon attached to the final link character so it
does not wrap away from the URL/text.
- Added markdown rendering regressions for GitHub and external link icon
wrapping.
- Removed the extra border around the inbox list card.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Low risk. The markdown change is limited to link child rendering and
preserves existing href/target/rel behavior.
- Visual-only inbox polish.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:26:13 -05:00
Dotta
73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta
6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
- `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta
0c6961a03e Normalize escaped multiline issue and approval text (#4444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board and agent APIs accept multiline issue, approval,
interaction, and document text
> - Some callers accidentally send literal escaped newline sequences
like `\n` instead of JSON-decoded line breaks
> - That makes comments, descriptions, documents, and approval notes
render as flattened text instead of readable markdown
> - This pull request centralizes multiline text normalization in shared
validators
> - The benefit is newline-preserving API behavior across issue and
approval workflows without route-specific fixes

## What Changed

- Added a shared `multilineTextSchema` helper that normalizes escaped
`\n`, `\r\n`, and `\r` sequences to real line breaks.
- Applied the helper to issue descriptions, issue update comments, issue
comment bodies, suggested task descriptions, interaction summaries,
issue documents, approval comments, and approval decision notes.
- Added shared validator regressions for issue and approval multiline
inputs.

## Verification

- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/approval.test.ts
packages/shared/src/validators/issue.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`

## Risks

- Low risk. This only changes text fields that are explicitly multiline
user/operator content.
- If a caller intentionally wanted literal backslash-n text in these
fields, it will now render as a real line break.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 18:02:45 -05:00
Dotta
5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Dotta
9a8d219949 [codex] Stabilize tests and local maintenance assets (#4423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs

## What Changed

- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.

## Verification

- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable to source UI behavior;
the included Storybook static output is generated artifact preservation
from the source branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 15:11:42 -05:00
Devin Foley
70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta
641eb44949 [codex] Harden create-agent skill governance (#4422)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring agents is a governance-sensitive workflow because it grants
roles, adapter config, skills, and execution capability
> - The create-agent skill needs explicit templates and review guidance
so hires are auditable and not over-permissioned
> - Skill sync also needs to recognize bundled Paperclip skills
consistently for Codex local agents
> - This pull request expands create-agent role templates, adds a
security-engineer template, and documents capability/secret-handling
review requirements
> - The benefit is safer, more repeatable agent creation with clearer
approval payloads and less permission sprawl

## What Changed

- Expanded `paperclip-create-agent` guidance for template selection,
adjacent-template drafting, and role-specific review bars.
- Added a Security Engineer agent template and collaboration/safety
sections for Coder, QA, and UX Designer templates.
- Hardened draft-review guidance around desired skills, external-system
access, secrets, and confidential advisory handling.
- Updated LLM agent-configuration guidance to point hiring workflows at
the create-agent skill.
- Added tests for bundled skill sync, create-agent skill injection, hire
approval payloads, and LLM route guidance.

## Verification

- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
server/src/__tests__/codex-local-skill-injection.test.ts
server/src/__tests__/codex-local-skill-sync.test.ts
server/src/__tests__/llms-routes.test.ts
server/src/__tests__/paperclip-skill-utils.test.ts --config
server/vitest.config.ts` passed: 5 files, 23 tests.
- `git diff --check public-gh/master..pap-2228-create-agent-governance
-- . ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this primarily changes skills/docs and tests, but
it affects future hiring guidance and approval expectations.
- Reviewers should check whether the new Security Engineer template is
too broad for default company installs.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable; this PR changes
skills, docs, and server tests.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:15:28 -05:00
Dotta
77a72e28c2 [codex] Polish issue composer and long document display (#4420)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue comments and documents are the main working surface where
operators and agents collaborate
> - File drops, markdown editing, and long issue descriptions need to
feel predictable because they sit directly in the task execution loop
> - The composer had edge cases around drag targets, attachment
feedback, image drops, and long markdown content crowding the page
> - This pull request polishes the issue composer, hardens markdown
editor regressions, and adds a fold curtain for long issue
descriptions/documents
> - The benefit is a calmer issue detail surface that handles uploads
and long work products without hiding state or breaking layout

## What Changed

- Scoped issue-composer drag/drop behavior so the composer owns file
drops without turning the whole thread into a competing drop target.
- Added clearer attachment upload feedback for non-image files and
image-drop stability coverage.
- Hardened markdown editor and markdown body handling around HTML-like
tag regressions.
- Added `FoldCurtain` and wired it into issue descriptions and issue
documents so long markdown previews can expand/collapse.
- Added Storybook coverage for the fold curtain state.

## Verification

- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts`
passed: 3 files, 75 tests.
- `git diff --check public-gh/master..pap-2228-editor-composer-polish --
. ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this changes user-facing composer/drop behavior
and long markdown display.
- The fold curtain uses DOM measurement and `ResizeObserver`; reviewers
should check browser behavior for very long descriptions and documents.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshots were not newly captured during branch splitting; the
UI states are covered by component tests and a Storybook story.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:12:41 -05:00
Dotta
8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta
4fdbbeced3 [codex] Refine markdown issue reference rendering (#4382)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Task references are a core part of how operators understand issue
relationships across the UI
> - Those references appear both in markdown bodies and in sidebar
relationship panels
> - The rendering had drifted between surfaces, and inline markdown
pills were reading awkwardly inside prose and lists
> - This pull request unifies the underlying issue-reference treatment,
routes issue descriptions through `MarkdownBody`, and switches inline
markdown references to a cleaner text-link presentation
> - The benefit is more consistent issue-reference UX with better
readability in markdown-heavy views

## What Changed

- unified sidebar and markdown issue-reference rendering around the
shared issue-reference components
- routed resting issue descriptions through `MarkdownBody` so
description previews inherit the richer issue-reference treatment
- replaced inline markdown pill chrome with a cleaner inline reference
presentation for prose contexts
- added and updated UI tests for `MarkdownBody` and `InlineEditor`

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx
ui/src/components/InlineEditor.test.tsx`

## Risks

- Moderate UI risk: issue-reference rendering now differs intentionally
between inline markdown and relationship sidebars, so regressions would
show up as styling or hover-preview mismatches

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:39:21 -05:00
Dotta
7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta
35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
Devin Foley
e4995bbb1c Add SSH environment support (#4358)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation

## What Changed

- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.

## Verification

- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
  - enabled the experimental flag
  - created an SSH environment
  - created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back

## Risks

- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.

## Model Used

- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
Dotta
f98c348e2b [codex] Add issue subtree pause, cancel, and restore controls (#4332)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch extends the issue control-plane so board operators can
pause, cancel, and later restore whole issue subtrees while keeping
descendant execution and wake behavior coherent.
> - That required new hold state in the database, shared contracts,
server routes/services, and issue detail UI controls so subtree actions
are durable and auditable instead of ad hoc.
> - While this branch was in flight, `master` advanced with new
environment lifecycle work, including a new `0065_environments`
migration.
> - Before opening the PR, this branch had to be rebased onto
`paperclipai/paperclip:master` without losing the existing
subtree-control work or leaving conflicting migration numbering behind.
> - This pull request rebases the subtree pause/cancel/restore feature
cleanly onto current `master`, renumbers the hold migration to
`0066_issue_tree_holds`, and preserves the full branch diff in a single
PR.
> - The benefit is that reviewers get one clean, mergeable PR for the
subtree-control feature instead of stale branch history with migration
conflicts.

## What Changed

- Added durable issue subtree hold data structures, shared
API/types/validators, server routes/services, and UI flows for subtree
pause, cancel, and restore operations.
- Added server and UI coverage for subtree previewing, hold
creation/release, dependency-aware scheduling under holds, and issue
detail subtree controls.
- Rebased the branch onto current `paperclipai/paperclip:master` and
renumbered the branch migration from `0065_issue_tree_holds` to
`0066_issue_tree_holds` so it no longer conflicts with upstream
`0065_environments`.
- Added a small follow-up commit that makes restore requests return `200
OK` explicitly while keeping pause/cancel hold creation at `201
Created`, and updated the route test to match that contract.

## Verification

- `pnpm --filter @paperclipai/db typecheck`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `cd server && pnpm exec vitest run
src/__tests__/issue-tree-control-routes.test.ts
src/__tests__/issue-tree-control-service.test.ts
src/__tests__/issue-tree-control-service-unit.test.ts
src/__tests__/heartbeat-dependency-scheduling.test.ts`
- `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx`

## Risks

- This is a broad cross-layer change touching DB/schema, shared
contracts, server orchestration, and UI; regressions are most likely
around subtree status restoration or wake suppression/resume edge cases.
- The migration was renumbered during PR prep to avoid the new upstream
`0065_environments` conflict. Reviewers should confirm the final
`0066_issue_tree_holds` ordering is the only hold-related migration that
lands.
- The issue-tree restore endpoint now responds with `200` instead of
relying on implicit behavior, which is semantically better for a restore
operation but still changes an API detail that clients or tests could
have assumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class
tool-using coding model; exact deployment ID/context window is not
exposed inside this session).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-23 14:51:46 -05:00
Russell Dempsey
854fa81757 fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331)
## Thinking Path

> - Paperclip orchestrates AI agents; each agent runs under an adapter
that spawns a model CLI as a child process.
> - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi` and
inherits the child's shell environment — including `PATH`, which
determines what the child's bash tool can execute by name.
> - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g.
`paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke
them by name via the agent's bash tool.
> - Pi-local builds its runtime env with `ensurePathInEnv({
...process.env, ...env })` only — it never adds the installed skills'
`bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's
SKILL.md but does not augment PATH.
> - Consequence: every bash invocation of a skill helper fails with
`exit 127: command not found`. The agent then spends its heartbeat
guessing (re-reading SKILL.md, trying `find`, inventing command paths)
and either times out or gives up.
> - This PR prepends each injected skill's `bin/` directory to the child
PATH immediately before runtimeEnv is constructed.
> - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*`
skill helper can actually run those helpers.

## What Changed

- `packages/adapters/pi-local/src/server/execute.ts`: compute
`skillBinDirs` from the already-resolved `piSkillEntries`, dedupe
against the existing PATH, prepend them to whichever of `PATH` / `Path`
the merged env uses, then build `runtimeEnv`. No new helpers, no
adapter-utils changes.

## Verification

Manual repro before the fix:

1. Create a pi_local agent wired to a paperclip skill (e.g.
paperclip-control).
2. Wake the agent on an in_review issue with an AGENTS.md that starts
with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`.
3. Session file: `{ "role": "toolResult", "isError": true, "content": [{
"text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand
exited with code 127" }] }`.

After the fix: same wake; `paperclip-get-issue` resolves and returns the
issue JSON; agent proceeds.

Local commands:

```
pnpm --filter @paperclipai/adapter-pi-local typecheck   # clean
pnpm --filter @paperclipai/adapter-pi-local build       # clean
pnpm --filter @paperclipai/server exec vitest run \
  src/__tests__/pi-local-execute.test.ts \
  src/__tests__/pi-local-adapter-environment.test.ts \
  src/__tests__/pi-local-skill-sync.test.ts
# 5/5 passing
```

No new tests: the existing `pi-local-skill-sync.test.ts` covers skill
symlink injection (upstream of the PATH step), and
`pi-local-execute.test.ts` covers the spawn path; this change only
augments env on the same spawn path.

## Risks

Low. Pure PATH augmentation on the child env. Edge cases:

- Zero skills installed → no PATH change (guarded by
`skillBinDirs.length > 0`).
- Duplicate bin dirs already on PATH → deduped; no pollution on re-runs.
- Windows `Path` casing → falls back correctly when merged env uses
`Path` instead of `PATH`.
- Skill dir without `bin/` subdir → joined path simply won't resolve;
harmless.

No behavioral change for pi_local agents that don't use skill-provided
commands.

## Model Used

- Claude, `claude-opus-4-7` (1M context), extended thinking enabled,
tool use enabled. Walked pi-local/cursor-local/claude-local and
adapter-utils to isolate the gap, wrote the inlined fix, and ran
typecheck/build/test locally.

## Checklist

- [x] Thinking path from project context to this change
- [x] Model used specified
- [x] Checked ROADMAP.md — no overlap
- [x] Tests run locally, passing
- [x] Tests added — new case in
`server/src/__tests__/pi-local-execute.test.ts`; verified it fails when
the fix is reverted
- [ ] UI screenshots — N/A (backend adapter change)
- [x] Docs updated — N/A (internal adapter, no user-facing docs)
- [x] Risks documented
- [x] Will address reviewer comments before merge
2026-04-23 10:15:10 -05:00
Dotta
fe14de504c [codex] Document README architecture systems (#4250)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The public README is the first place many operators and contributors
learn what the product already includes.
> - The existing README explained the product promise but did not give a
compact, concrete tour of the major systems behind it.
> - This made Paperclip easier to underestimate as a wrapper around
agents instead of a full control plane with identity, work, execution,
governance, budgets, plugins, and portability.
> - This pull request adds an under-the-hood README section that names
those systems and shows how adapters connect into the server.
> - Greptile caught consistency gaps between the diagram and prose, so
the final version aligns the system labels and adapter examples across
both surfaces.
> - The benefit is a clearer first-read model of Paperclip's
architecture and shipped capabilities without changing runtime behavior.

## What Changed

- Added a `What's Under the Hood` section to `README.md`.
- Added an ASCII architecture diagram for the Paperclip server and
external agent adapters.
- Added a systems table covering identity, org charts, tasks, heartbeat
execution, workspaces, governance, budgets, routines, plugins,
secrets/storage, activity/events, and company portability.
- Addressed Greptile feedback by aligning diagram labels with table rows
and grouping adapter examples consistently.

## Verification

- `git diff --check public-gh/master...HEAD`
- Attempted `pnpm exec prettier --check README.md`, but this checkout
does not expose a `prettier` binary through `pnpm exec`.
- Greptile review rerun passed after addressing its two comments; review
threads are resolved.
- Remote PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, and `Greptile Review`.
- Not run locally: Vitest/build suites, because this is a README-only
documentation change and the PR's remote `verify` job ran typecheck,
tests, build, and release canary dry run.

## Risks

- Low runtime risk: documentation-only change.
- The main risk is wording drift if the README overstates or
underspecifies evolving product capabilities; the section was aligned
against the current product/spec docs and roadmap.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex / GPT-5 coding agent in a Paperclip heartbeat, with shell
and GitHub CLI tool use. Exact runtime model identifier and context
window were not exposed by the adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 09:48:19 -05:00
Michel Tomas
3d15798c22 fix(adapters/routes): apply resolveExternalAdapterRegistration on hot-install (#4324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The external adapter plugin system (#2218) lets adapters ship as npm
modules loaded via `server/src/adapters/plugin-loader.ts`; since #4296
merged, each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) and have it preserved through the init-time
load via the new `resolveExternalAdapterRegistration` helper
> - #4296 fixed the init-time IIFE path at
`server/src/adapters/registry.ts:363-369` but noted that the hot-install
path at `server/src/routes/adapters.ts:174
registerWithSessionManagement` still unconditionally overwrites
module-provided `sessionManagement` during `POST /api/adapters/install`
> - Practical impact today: an external adapter installed via the API
needs a Paperclip restart before its declared `sessionManagement` takes
effect — the IIFE runs on next boot and preserves it, but until then the
hot-install overwrite wins
> - This PR closes that parity gap: `registerWithSessionManagement`
delegates to the same `resolveExternalAdapterRegistration` helper
introduced by #4296, unifying both load paths behind one resolver
> - The benefit is consistent behaviour between cold-start and
hot-install: no "install then restart" ritual; declared
`sessionManagement` on an external module is honoured the moment `POST
/api/adapters/install` returns 201

## What Changed

- `server/src/routes/adapters.ts`: `registerWithSessionManagement`
delegates to the exported `resolveExternalAdapterRegistration` helper
(added in #4296). Honours module-provided `sessionManagement` first,
falls back to host registry lookup, defaults `undefined`. Updated the
section comment to document the parity-with-IIFE intent.
- `server/src/routes/adapters.ts`: dropped the now-unused
`getAdapterSessionManagement` import.
- `server/src/adapters/registry.ts`: updated the JSDoc on
`resolveExternalAdapterRegistration` — previously said "Exported for
unit tests; runtime callers use the IIFE below", now says the helper is
used by both the init-time IIFE and the hot-install path in
`routes/adapters.ts`. Addresses Greptile C1.
- `server/src/__tests__/adapter-routes.test.ts`: new integration test —
installs a mocked external adapter module carrying a non-trivial
`sessionManagement` declaration and asserts
`findServerAdapter(type).sessionManagement` preserves it after `POST
/api/adapters/install` returns 201.
- `server/src/__tests__/adapter-routes.test.ts`: added
`findServerAdapter` to the shared test-scope variable set so the new
test can inspect post-install registry state.

## Verification

Targeted test runs from a clean tree on
`fix/external-session-management-hot-install` (rebased onto current
`upstream/master` now that #4296 has merged):

- `pnpm test server/src/__tests__/adapter-routes.test.ts` — 6 passed
(new test + 5 pre-existing)
- `pnpm test server/src/__tests__/adapter-registry.test.ts` — 15 passed
(ensures the IIFE path from #4296 continues to behave correctly)
- `pnpm -w run test` full workspace suite — 1923 passed / 1 skipped
(unrelated skip)

End-to-end smoke on file:
[`@superbiche/cline-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/cline-paperclip-adapter)
and
[`@superbiche/qwen-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/qwen-paperclip-adapter),
both public on npm, both declare `sessionManagement`. With this PR in
place, the "restart after install" step disappears — the declared
compaction policy is active immediately after the install response.

## Risks

- Low risk. The change replaces an inline mutation with a call to a
helper that already has dedicated unit coverage (#4296 added three tests
for `resolveExternalAdapterRegistration` covering module-provided,
registry-fallback, and undefined paths). Behaviour is a strict superset
of the prior path — externals that did not declare `sessionManagement`
continue to get the hardcoded-registry lookup; externals that did
declare it now have those values preserved instead of overwritten.
- No migration impact. The stored plugin records
(`~/.paperclip/adapter-plugins.json`) are unchanged. Existing
hot-installed adapters behave correctly before and after.
- No behavioural change for builtin adapters; they hit
`registerServerAdapter` directly and never flow through
`registerWithSessionManagement`.

## Model Used

- Provider and model: Claude (Anthropic) via Claude Code
- Model ID: `claude-opus-4-7` (1M context)
- Reasoning mode: standard (no extended thinking on this PR)
- Tool use: yes — file edits, subprocess invocations for
builds/tests/git via the Claude Code harness

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — server-only change)
- [x] I have updated relevant documentation to reflect my changes (the
JSDoc on `resolveExternalAdapterRegistration` and the section comment
above `registerWithSessionManagement` now document the parity-with-IIFE
intent)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 09:45:24 -05:00
Michel Tomas
24232078fd fix(adapters/registry): honor module-provided sessionManagement for external adapters (#4296)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters are how paperclip hands work off to specific agent
runtimes; since #2218, external adapter packages can ship as npm modules
loaded via `server/src/adapters/plugin-loader.ts`
> - Each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) — but the init-time load at
`registry.ts:363-369` hard-overwrote it with a hardcoded-registry lookup
that has no entries for external types, so modules could not actually
set these fields
> - The hot-install path at `routes/adapters.ts:179` →
`registerServerAdapter` preserves module-provided `sessionManagement`,
so externals worked after `POST /api/adapters/install` — *until the next
server restart*, when the init-time IIFE wiped it back to `undefined`
> - #2218 explicitly deferred this: *"Adapter execution model, heartbeat
protocol, and session management are untouched."* This PR is the natural
follow-up for session management on the plugin-loader path
> - This PR aligns init-time registration with the hot-install path:
honor module-provided `sessionManagement` first, fall back to the
hardcoded registry when absent (so externals overriding a built-in type
still inherit its policy). Extracted as a testable helper with three
unit tests
> - The benefit is external adapters can declare session-resume
capabilities consistently across cold-start and hot-install, without
requiring upstream additions to the hardcoded registry for each new
plugin

## What Changed

- `server/src/adapters/registry.ts`: extracted the merge logic into a
new exported helper `resolveExternalAdapterRegistration()` — honors
module-provided `sessionManagement` first, falls back to
`getAdapterSessionManagement(type)`, else `undefined`. The init-time
IIFE calls the helper instead of inlining an overwrite.
- `server/src/adapters/registry.ts`: updated the section comment (lines
331–340) to reflect the new semantics and cross-reference the
hot-install path's behavior.
- `server/src/__tests__/adapter-registry.test.ts`: new
`describe("resolveExternalAdapterRegistration")` block with three tests
— module-provided value preserved, registry fallback when module omits,
`undefined` when neither provides.

## Verification

Targeted test run from a clean tree on
`fix/external-session-management`:

```
cd server && pnpm exec vitest run src/__tests__/adapter-registry.test.ts
# 1 test file, 15 tests passed, 0 failed (12 pre-existing + 3 new)
```

Full server suite via the independent review pass noted under Model
Used: **1,156 tests passed, 0 failed**.

Typecheck note: `pnpm --filter @paperclipai/server exec tsc --noEmit`
surfaces two errors in `src/services/plugin-host-services.ts:1510`
(`createInteraction` + implicit-any). Verified by `git stash` + re-run
on clean `upstream/master` — they reproduce without this PR's changes.
Pre-existing, out of scope.

## Risks

- **Low behavioral risk.** Strictly additive: externals that do NOT
provide `sessionManagement` continue to receive exactly the same value
as before (registry lookup → `undefined` for pure externals, or the
builtin's entry for externals overriding a built-in type). Only a new
capability is unlocked; no existing behavior changes for existing
adapters.
- **No breaking change.** `ServerAdapterModule.sessionManagement` was
already optional at the type level. Externals that never set it see no
difference on either path.
- **Consistency verified.** Init-time IIFE now matches the post-`POST
/api/adapters/install` behavior — a server restart no longer regresses
the field.

## Note

This is part of a broader effort to close the parity gap between
external and built-in adapters. Once externals reach 1:1 capability
coverage with internals, new-adapter contributions can increasingly be
steered toward the external-plugin path instead of the core product — a
trajectory CONTRIBUTING.md already encourages ("*If the idea fits as an
extension, prefer building it with the plugin system*").

## Model Used

- **Provider**: Anthropic
- **Model**: Claude Opus 4.7
- **Exact model ID**: `claude-opus-4-7` (1M-context variant:
`claude-opus-4-7[1m]`)
- **Context window**: 1,000,000 tokens
- **Harness**: Claude Code (Anthropic's official CLI), orchestrated by
@superbiche as human-in-the-loop. Full file-editing, shell, and `gh`
tool use, plus parallel research subagents for fact-finding against
paperclip internals (plugin-loader contract, sessionCodec reachability,
UI parser surface, Cline CLI JSON schema).
- **Independent local review**: Gemini 3.1 Pro (Google) performed a
separate verification pass on the committed branch — confirmed the
approach & necessity, ran the full workspace build, and executed the
complete server test suite (1,156 tests, all passing). Not used for
authoring; second-opinion pass only.
- **Authoring split**: @superbiche identified the gap (while mapping the
external-adapter surface for a downstream adapter build) and shaped the
plan — categorising the surface into `works / acceptable /
needs-upstream` buckets, directing the surgical-diff approach on a fresh
branch from `upstream/master`, and calling the framing ("alignment bug
between init-time IIFE and hot-install path" rather than "missing
capability"). Opus 4.7 executed the fact-finding, the diff, the tests,
and drafted this PR body — all under direct review.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work (convention-aligned bug fix on the external-adapter
plugin path introduced by #2218)
- [x] I have run tests locally and they pass (15/15 in the touched file;
1,156/1,156 full server suite via the independent Gemini 3.1 Pro review)
- [x] I have added tests where applicable (3 new for the extracted
helper)
- [x] If this change affects the UI, I have included before/after
screenshots (no UI touched)
- [x] I have updated relevant documentation to reflect my changes
(in-file comment reflects new semantics)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 07:39:43 -05:00
Devin Foley
13551b2bac Add local environment lifecycle (#4297)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Every heartbeat run needs a concrete place where the agent's adapter
process executes.
> - Today that execution location is implicitly the local machine, which
makes it hard to track, audit, and manage as a first-class runtime
concern.
> - The first step is to represent the current local execution path
explicitly without changing how users experience agent runs.
> - This pull request adds core Environment and Environment Lease
records, then routes existing local heartbeat execution through a
default `Local` environment.
> - The benefit is that local runs remain behavior-preserving while the
system now has durable environment identity, lease lifecycle tracking,
and activity records for execution placement.

## What Changed

- Added `environments` and `environment_leases` database tables, schema
exports, and migration `0065_environments.sql`.
- Added shared environment constants, TypeScript types, and validators
for environment drivers, statuses, lease policies, lease statuses, and
cleanup states.
- Added `environmentService` for listing, reading, creating, updating,
and ensuring company-scoped environments.
- Added environment lease lifecycle operations for acquire, metadata
update, single-lease release, and run-wide release.
- Updated heartbeat execution to lazily ensure a company-scoped default
`Local` environment before adapter execution.
- Updated heartbeat execution to acquire an ephemeral local environment
lease, write `paperclipEnvironment` into the run context snapshot, and
release active leases during run finalization.
- Added activity log events for environment lease acquisition and
release.
- Added tests for environment service behavior and the local heartbeat
environment lifecycle.
- Added a CI-follow-up heartbeat guard so deferred issue comment wakes
are promoted before automatic missing-comment retries, with focused
batching test coverage.

## Verification

Local verification run for this branch:

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm exec vitest run server/src/__tests__/environment-service.test.ts
server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks`

Additional reviewer/CI verification:

- Confirm `pnpm-lock.yaml` is not modified.
- Confirm `pnpm test:run` passes in CI.
- Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI.
- Confirm a local heartbeat run creates one active `Local` environment
when needed, records one lease for the run, releases the lease when the
run finishes, and includes `paperclipEnvironment` in the run context
snapshot.

Screenshots: not applicable; this PR has no UI changes.

## Risks

- Migration risk: introduces two new tables and a new migration journal
entry. Review should verify company scoping, indexes, foreign keys, and
enum defaults are correct.
- Lifecycle risk: heartbeat finalization now releases environment leases
in addition to existing runtime cleanup. A finalization bug could leave
stale active leases or mark a failed run's lease incorrectly.
- Behavior-preservation risk: local adapter execution should remain
unchanged apart from environment bookkeeping. Review should pay
attention to the heartbeat path around context snapshot updates and
final cleanup ordering.
- Activity volume risk: each heartbeat run now logs lease acquisition
and release events, increasing activity log volume by two records per
run.

## Model Used

OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection,
TypeScript implementation review, local test/build execution, and
PR-description drafting.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 20:07:41 -07:00
Dotta
b69b563aa8 [codex] Fix stale issue execution run locks (#4258)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue
checkout and execution ownership are core safety contracts.
> - The affected subsystem is the issue service and route layer that
gates agent writes by `checkoutRunId` and `executionRunId`.
> - PAP-1982 exposed a stale-lock failure mode where a terminal
heartbeat run could leave `executionRunId` pinned after checkout
ownership had moved or been cleared.
> - That stale execution lock could reject legitimate
PATCH/comment/release requests from the rightful assignee after a
harness restart.
> - This pull request centralizes terminal-run cleanup, applies it
before ownership-gated writes, and adds a board-only recovery endpoint
for operator intervention.
> - The benefit is that crashed or terminal runs no longer strand issues
behind stale execution locks, while live execution locks still block
conflicting writes.

## What Changed

- Added `issueService.clearExecutionRunIfTerminal()` to atomically lock
the issue/run rows and clear terminal or missing execution-run locks.
- Reused stale execution-lock cleanup from checkout,
`assertCheckoutOwner()`, and `release()`.
- Allowed the same assigned agent/current run to adopt an unowned
`in_progress` checkout after stale execution-lock cleanup.
- Updated release to clear `executionRunId`, `executionAgentNameKey`,
and `executionLockedAt`.
- Added board-only `POST /api/issues/:id/admin/force-release` with
company access checks, optional `clearAssignee=true`, and
`issue.admin_force_release` audit logging.
- Added embedded Postgres service tests and route integration tests for
stale-lock recovery, release behavior, and admin force-release
authorization/audit behavior.
- Documented the new force-release API in `doc/SPEC-implementation.md`.

## Verification

- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed.
- `pnpm vitest run
server/src/__tests__/issue-stale-execution-lock-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts` passed.
- `pnpm -r typecheck` passed.
- `pnpm build` passed.
- `git diff --check` passed.
- `pnpm lint` could not run because this repo has no `lint` command.
- Full `pnpm test:run` completed with 4 failures in existing route
suites: `approval-routes-idempotency.test.ts` (2),
`issue-comment-reopen-routes.test.ts` (1), and
`issue-telemetry-routes.test.ts` (1). Those same files pass when run
isolated and when run together with the new stale-lock route test, so
this appears to be a whole-suite ordering/mock-isolation issue outside
this patch path.

## Risks

- Medium: this changes ownership-gated write behavior. The new adoption
path is limited to the current run, the current assignee, `in_progress`
issues, and rows with no checkout owner after terminal-lock cleanup.
- Low: the admin force-release endpoint is board-only and
company-scoped, but misuse can intentionally clear a live lock. It
writes an audit event with prior lock IDs.
- No schema or migration changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with
terminal/tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 10:43:38 -05:00
Dotta
a957394420 [codex] Add structured issue-thread interactions (#4244)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators supervise that work through issues, comments, approvals,
and the board UI.
> - Some agent proposals need structured board/user decisions, not
hidden markdown conventions or heavyweight governed approvals.
> - Issue-thread interactions already provide a natural thread-native
surface for proposed tasks and questions.
> - This pull request extends that surface with request confirmations,
richer interaction cards, and agent/plugin/MCP helpers.
> - The benefit is that plan approvals and yes/no decisions become
explicit, auditable, and resumable without losing the single-issue
workflow.

## What Changed

- Added persisted issue-thread interactions for suggested tasks,
structured questions, and request confirmations.
- Added board UI cards for interaction review, selection, question
answers, and accept/reject confirmation flows.
- Added MCP and plugin SDK helpers for creating interaction cards from
agents/plugins.
- Updated agent wake instructions, onboarding assets, Paperclip skill
docs, and public docs to prefer structured confirmations for
issue-scoped decisions.
- Rebased the branch onto `public-gh/master` and renumbered branch
migrations to `0063` and `0064`; the idempotency migration uses `ADD
COLUMN IF NOT EXISTS` for old branch users.

## Verification

- `git diff --check public-gh/master..HEAD`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/mcp-server/src/tools.test.ts
packages/shared/src/issue-thread-interactions.test.ts
ui/src/lib/issue-thread-interactions.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/components/IssueThreadInteractionCard.test.tsx
ui/src/components/IssueChatThread.test.tsx
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79
tests passed
- `pnpm -r typecheck` -> passed, including `packages/db` migration
numbering check

## Risks

- Medium: this adds a new issue-thread interaction model across
db/shared/server/ui/plugin surfaces.
- Migration risk is reduced by placing this branch after current master
migrations (`0063`, `0064`) and making the idempotency column add
idempotent for users who applied the old branch numbering.
- UI interaction behavior is covered by component tests, but this PR
does not include browser screenshots.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and
context window are not exposed in this Paperclip run; tool use and local
shell/code execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 20:15:11 -05:00
Dotta
014aa0eb2d [codex] Clear stale queued comment targets (#4234)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators interact with agent work through issue threads and queued
comments.
> - When the selected comment target becomes stale, the composer can
keep pointing at an invalid target after thread state changes.
> - That makes follow-up comments easier to misroute and harder to
reason about.
> - This pull request clears stale queued comment targets and covers the
behavior with tests.
> - The benefit is more predictable issue-thread commenting during live
agent work.

## What Changed

- Clears queued comment targets when they no longer match the current
issue thread state.
- Adjusts issue detail comment-target handling to avoid stale target
reuse.
- Adds regression tests for optimistic issue comment target behavior.

## Verification

- `pnpm exec vitest run ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low risk; scoped to comment-target state handling in the issue UI.
- No migrations.

> Checked `ROADMAP.md`; this is a focused UI reliability fix, not a new
roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 16:50:26 -05:00
Dotta
bcbbb41a4b [codex] Harden heartbeat runtime cleanup (#4233)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime is the control-plane path that turns issue
assignments into agent runs and recovers after process exits.
> - Several edge cases could leave high-volume reads unbounded, stale
runtime services visible, blocked dependency wakes too eager, or
terminal adapter processes still around after output finished.
> - These problems make operator views noisy and make long-running agent
work less predictable.
> - This pull request tightens the runtime/read paths and adds focused
regression coverage.
> - The benefit is safer heartbeat execution and cleaner runtime state
without changing the public task model.

## What Changed

- Bounded high-volume issue/log reads in runtime code paths.
- Hardened heartbeat handling for blocked dependency wakes and terminal
run cleanup.
- Added adapter process cleanup coverage for terminal output cases.
- Added workspace runtime control tests for stale command matching and
stopped services.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
ui/src/components/WorkspaceRuntimeControls.test.tsx`

## Risks

- Medium risk because heartbeat cleanup and runtime filtering affect
active agent execution paths.
- No migrations.

> Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not
a new roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 16:48:47 -05:00
Dotta
73ef40e7be [codex] Sandbox dynamic adapter UI parsers (#4225)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common worker network, persistence, child worker, Blob/object
URL, and WebRTC escape APIs inside the parser worker bootstrap.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so transcript
updates should be watched with external adapters.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 13:42:44 -05:00
Dotta
a26e1288b6 [codex] Polish issue board workflows (#4224)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Human operators supervise that work through issue lists, issue
detail, comments, inbox groups, markdown references, and
profile/activity surfaces
> - The branch had many small UI fixes that improve the operator loop
but do not need to ship with backend runtime migrations
> - These changes belong together as board workflow polish because they
affect scanning, navigation, issue context, comment state, and markdown
clarity
> - This pull request groups the UI-only slice so it can merge
independently from runtime/backend changes
> - The benefit is a clearer board experience with better issue context,
steadier optimistic updates, and more predictable keyboard navigation

## What Changed

- Improves issue properties, sub-issue actions, blocker chips, and issue
list/detail refresh behavior.
- Adds blocker context above the issue composer and stabilizes
queued/interrupted comment UI state.
- Improves markdown issue/GitHub link rendering and opens external
markdown links in a new tab.
- Adds inbox group keyboard navigation and fold/unfold support.
- Polishes activity/avatar/profile/settings/workspace presentation
details.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownBody.test.tsx ui/src/lib/inbox.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low to medium risk: changes are UI-focused but cover high-traffic
issue and inbox surfaces.
- This branch intentionally does not include the backend runtime changes
from the companion PR; where UI calls newer API filters, unsupported
servers should continue to fail visibly through existing API error
handling.
- Visual screenshots were not captured in this heartbeat; targeted
component/helper tests cover the changed behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:25:34 -05:00
Dotta
09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
Dotta
ab9051b595 Add first-class issue references (#4214)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.

## What Changed

- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`

## Risks

- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 10:02:52 -05:00
Dotta
1954eb3048 [codex] Detect issue graph liveness deadlocks (#4209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat harness is responsible for waking agents, reconciling
issue state, and keeping execution moving.
> - Some dependency graphs can become live-locks when a blocked issue
depends on an unassigned, cancelled, or otherwise uninvokable issue.
> - Review and approval stages can also stall when the recorded
participant can no longer be resolved.
> - This pull request adds issue graph liveness classification plus
heartbeat reconciliation that creates durable escalation work for those
cases.
> - The benefit is that harness-level deadlocks become visible,
assigned, logged, and recoverable instead of silently leaving task
sequences blocked.

## What Changed

- Added an issue graph liveness classifier for blocked dependency and
invalid review participant states.
- Added heartbeat reconciliation that creates one stable escalation
issue per liveness incident, links it as a blocker, comments on the
affected issue, wakes the recommended owner, and logs activity.
- Wired startup and periodic server reconciliation for issue graph
liveness incidents.
- Added focused tests for classifier behavior, heartbeat escalation
creation/deduplication, and queued dependency wake promotion.
- Fixed queued issue wakes so a coalesced wake re-runs queue selection,
allowing dependency-unblocked work to start immediately.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts`
- Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5
tests)
- Skipped locally: embedded Postgres suites because optional package
`@embedded-postgres/darwin-x64` is not installed on this host
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`
- Greptile review loop: ran 3 times as requested; the final
Greptile-reviewed head `0a864eab` had 0 comments and all Greptile
threads were resolved. Later commits are CI/test-stability fixes after
the requested max Greptile pass count.
- GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and
`security/snyk (cryppadotta)` all passed.

## Risks

- Moderate operational risk: the reconciler creates escalation issues
automatically, so incorrect classification could create noise. Stable
incident keys and deduplication limit repeated escalation.
- Low schema risk: this uses existing issue, relation, comment, wake,
and activity log tables with no migration.
- No UI screenshots included because this change is server-side harness
behavior only.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and
context window were not exposed in this session. Used tool execution for
git, tests, typecheck, Greptile review handling, and GitHub CLI
operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 09:11:12 -05:00
Robin van Duiven
8d0c3d2fe6 fix(hermes): inject agent JWT into Hermes adapter env to fix identity attribution (#3608)
## Thinking Path

> - Paperclip orchestrates AI agents and records their actions through
auditable issue comments and API writes.
> - The local adapter registry is responsible for adapting each agent
runtime to Paperclip's server-side execution context.
> - The Hermes local adapter delegated directly to
`hermes-paperclip-adapter`, whose current execution context type
predates the server `authToken` field.
> - Without explicitly passing the run-scoped agent token and run id
into Hermes, Hermes could inherit a server or board-user
`PAPERCLIP_API_KEY` and lack a usable `PAPERCLIP_RUN_ID` for mutating
API calls.
> - That made Paperclip writes from Hermes agents risk appearing under
the wrong identity or without the correct run-scoped attribution.
> - This pull request wraps the Hermes execution call so Hermes receives
the agent run JWT as `PAPERCLIP_API_KEY` and the current execution id as
`PAPERCLIP_RUN_ID` while preserving explicit adapter configuration where
appropriate.
> - Follow-up review fixes preserve Hermes' built-in prompt when no
custom prompt template exists and document the intentional type cast.
> - The benefit is reliable agent attribution for the covered local
Hermes path without clobbering Hermes' default heartbeat/task
instructions.

## What Changed

- Wrapped `hermesLocalAdapter.execute` so `ctx.authToken` is injected
into `adapterConfig.env.PAPERCLIP_API_KEY` when no explicit Paperclip
API key is already configured.
- Injected `ctx.runId` into `adapterConfig.env.PAPERCLIP_RUN_ID` so the
auth guard's `X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID` instruction
resolves to the current run id.
- Added a Paperclip API auth guard to existing custom Hermes
`promptTemplate` values without creating a replacement prompt when no
custom template exists.
- Documented the intentional `as unknown as` cast needed until
`hermes-paperclip-adapter` ships an `AdapterExecutionContext` type that
includes `authToken`.
- Added registry tests for JWT injection, run-id injection, explicit key
preservation, default prompt preservation, and the no-`authToken`
early-return path.

## Verification

- [x] `pnpm --filter "./server" exec vitest run adapter-registry` - 8
tests passed.
- [x] `pnpm --filter "./server" typecheck` - passed.
- [x] Trigger a Hermes agent heartbeat and verify Paperclip writes
appear under the agent identity rather than a shared board-user
identity, with the correct run id on mutating requests.

## Risks

- Low migration risk: this changes only the Hermes local adapter wrapper
and tests.
- Existing explicit `adapterConfig.env.PAPERCLIP_API_KEY` values are
preserved to avoid breaking intentionally configured agents.
- `PAPERCLIP_RUN_ID` is set from `ctx.runId` for each execution so
mutating API calls use the current run id instead of a stale or literal
placeholder value.
- Prompt behavior is intentionally conservative: the auth guard is only
prepended when a custom prompt template already exists, so Hermes'
built-in default prompt remains intact for unconfigured agents.
- Remaining operational risk: the identity and run-id behavior should
still be verified with a live Hermes heartbeat before relying on it in
production.

## Model Used

- OpenAI Codex, GPT-5 family coding agent, tool use enabled for local
shell, GitHub CLI, and test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: backend-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no product docs changed; PR description updated)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Dotta <bippadotta@protonmail.com>
2026-04-21 07:18:11 -05:00
Dotta
1266954a4e [codex] Make heartbeat scheduling blocker-aware (#4157)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-driven heartbeats,
checkouts, and wake scheduling.
> - This change sits in the server heartbeat and issue services that
decide which queued runs are allowed to start.
> - Before this branch, queued heartbeats could be selected even when
their issue still had unresolved blocker relationships.
> - That let blocked descendant work compete with actually-ready work
and risked auto-checking out issues that were not dependency-ready.
> - This pull request teaches the scheduler and checkout path to consult
issue dependency readiness before claiming queued runs.
> - It also exposes dependency readiness in the agent inbox so agents
can see which assigned issues are still blocked.
> - The result is that heartbeat execution follows the DAG of blocked
dependencies instead of waking work out of order.

## What Changed

- Added `IssueDependencyReadiness` helpers to `issueService`, including
unresolved blocker lookup for single issues and bulk issue lists.
- Prevented issue checkout and `in_progress` transitions when unresolved
blockers still exist.
- Made heartbeat queued-run claiming and prioritization dependency-aware
so ready work starts before blocked descendants.
- Included dependency readiness fields in `/api/agents/me/inbox-lite`
for agent heartbeat selection.
- Added regression coverage for dependency-aware heartbeat promotion and
issue-service participation filtering.

## Verification

- `pnpm run preflight:workspace-links`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issues-service.test.ts`
- On this host, the Vitest command passed, but the embedded-Postgres
portions of those files were skipped because
`@embedded-postgres/darwin-x64` is not installed.

## Risks

- Scheduler ordering now prefers dependency-ready runs, so any hidden
assumptions about strict FIFO ordering could surface in edge cases.
- The new guardrails reject checkout or `in_progress` transitions for
blocked issues; callers depending on the old permissive behavior would
now get `422` errors.
- Local verification did not execute the embedded-Postgres integration
paths on this macOS host because the platform binary package was
missing.

> I checked `ROADMAP.md`; this is a targeted execution/scheduling fix
and does not duplicate planned roadmap feature work.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter in this
workspace. Exact backend model ID is not surfaced in the runtime here;
tool-enabled coding agent with terminal execution and repository editing
capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 16:03:57 -05:00
Hiuri Noronha
1bf2424377 fix: honor Hermes local command override (#3503)
## Summary

This fixes the Hermes local adapter so that a configured command
override is respected during both environment tests and execution.

## Problem

The Hermes adapter expects `adapterConfig.hermesCommand`, but the
generic local command path in the UI was storing
`adapterConfig.command`.

As a result, changing the command in the UI did not reliably affect
runtime behavior. In real use, the adapter could still fall back to the
default `hermes` binary.

This showed up clearly in setups where Hermes is launched through a
wrapper command rather than installed directly on the host.

## What changed

- switched the Hermes local UI adapter to the Hermes-specific config
builder
- updated the configuration form to read and write `hermesCommand` for
`hermes_local`
- preserved the override correctly in the test-environment path
- added server-side normalization from legacy `command` to
`hermesCommand`

## Compatibility

The server-side normalization keeps older saved agent configs working,
including configs that still store the value under `command`.

## Validation

Validated against a Docker-based Hermes workflow using a local wrapper
exposed through a symlinked command:

- `Command = hermes-docker`
- environment test respects the override
- runs no longer fall back to `hermes`

Typecheck also passed for both UI and server.

Co-authored-by: NoronhaH <NoronhaH@users.noreply.github.com>
2026-04-20 15:55:08 -05:00
LeonSGP
51f127f47b fix(hermes): stop advertising unsupported instructions bundles (#3908)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Local adapter capability flags decide which configuration surfaces
the UI and server expose for each adapter.
> - `hermes_local` currently advertises managed instructions bundle
support, so Paperclip exposes the AGENTS.md bundle flow for Hermes
agents.
> - The bundled `hermes-paperclip-adapter` only consumes
`promptTemplate` at runtime and does not read `instructionsFilePath`, so
that advertised bundle path silently does nothing.
> - Issue #3833 reports exactly that mismatch: users configure AGENTS.md
instructions, but Hermes only receives the built-in heartbeat prompt.
> - This pull request stops advertising managed instructions bundles for
`hermes_local` until the adapter actually consumes bundle files at
runtime.

## What Changed

- Changed the built-in `hermes_local` server adapter registration to
report `supportsInstructionsBundle: false`.
- Updated the UI's synchronous built-in capability fallback so Hermes no
longer shows the managed instructions bundle affordance on first render.
- Added regression coverage in
`server/src/__tests__/adapter-routes.test.ts` to assert that
`hermes_local` still reports skills + local JWT support, but not
instructions bundle support.

## Verification

- `git diff --check`
- `node --experimental-strip-types --input-type=module -e "import {
findActiveServerAdapter } from './server/src/adapters/index.ts'; const
adapter = findActiveServerAdapter('hermes_local');
console.log(JSON.stringify({ type: adapter?.type,
supportsInstructionsBundle: adapter?.supportsInstructionsBundle,
supportsLocalAgentJwt: adapter?.supportsLocalAgentJwt, supportsSkills:
Boolean(adapter?.listSkills || adapter?.syncSkills) }));"`
- Observed
`{"type":"hermes_local","supportsInstructionsBundle":false,"supportsLocalAgentJwt":true,"supportsSkills":true}`
- Added adapter-routes regression assertions for the Hermes capability
contract; CI should validate the full route path in a clean workspace.

## Risks

- Low risk: this only changes the advertised capability surface for
`hermes_local`.
- Behavior change: Hermes agents will no longer show the broken managed
instructions bundle UI until the underlying adapter actually supports
`instructionsFilePath`.
- Existing Hermes skill sync and local JWT behavior are unchanged.

## Model Used

- OpenAI Codex, GPT-5.4 class coding agent, medium reasoning,
terminal/git/gh tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 15:54:14 -05:00
github-actions[bot]
b94f1a1565 chore(lockfile): refresh pnpm-lock.yaml (#4139)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-20 12:15:59 -05:00
Dotta
2de893f624 [codex] add comprehensive UI Storybook coverage (#4132)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The board UI is the main operator surface, so its component and
workflow coverage needs to stay reviewable as the product grows.
> - This branch adds Storybook as a dedicated UI reference surface for
core Paperclip screens and interaction patterns.
> - That work spans Storybook infrastructure, app-level provider wiring,
and a large fixture set that can render real control-plane states
without a live backend.
> - The branch also expands coverage across agents, budgets, issues,
chat, dialogs, navigation, projects, and data visualization so future UI
changes have a concrete visual baseline.
> - This pull request packages that Storybook work on top of the latest
`master`, excludes the lockfile from the final diff per repo policy, and
fixes one fixture contract drift caught during verification.
> - The benefit is a single reviewable PR that adds broad UI
documentation and regression-surfacing coverage without losing the
existing branch work.

## What Changed

- Added Storybook 10 wiring for the UI package, including root scripts,
UI package scripts, Storybook config, preview wrappers, Tailwind
entrypoints, and setup docs.
- Added a large fixture-backed data source for Storybook so complex
board states can render without a live server.
- Added story suites covering foundations, status language,
control-plane surfaces, overview, UX labs, agent management, budget and
finance, forms and editors, issue management, navigation and layout,
chat and comments, data visualization, dialogs and modals, and
projects/goals/workspaces.
- Adjusted several UI components for Storybook parity so dialogs, menus,
keyboard shortcuts, budget markers, markdown editing, and related
surfaces render correctly in isolation.
- Rebasing work for PR assembly: replayed the branch onto current
`master`, removed `pnpm-lock.yaml` from the final PR diff, and aligned
the dashboard fixture with the current `DashboardSummary.runActivity`
API contract.

## Verification

- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/ui build-storybook`
- Manual diff audit after rebase: verified the PR no longer includes
`pnpm-lock.yaml` and now cleanly targets current `master`.
- Before/after UI note: before this branch there was no dedicated
Storybook surface for these Paperclip views; after this branch the local
Storybook build includes the new overview and domain story suites in
`ui/storybook-static`.

## Risks

- Large static fixture files can drift from shared types as dashboard
and UI contracts evolve; this PR already needed one fixture correction
for `runActivity`.
- Storybook bundle output includes some large chunks, so future growth
may need chunking work if build performance becomes an issue.
- Several component tweaks were made for isolated rendering parity, so
reviewers should spot-check key board surfaces against the live app
behavior.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Paperclip harness; exact
serving model ID is not exposed in-runtime to the agent.
- Tool-assisted workflow with terminal execution, git operations, local
typecheck/build verification, and GitHub CLI PR creation.
- Context window/reasoning mode not surfaced by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 12:13:23 -05:00
Dotta
7a329fb8bb Harden API route authorization boundaries (#4122)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The REST API is the control-plane boundary for companies, agents,
plugins, adapters, costs, invites, and issue mutations.
> - Several routes still relied on broad board or company access checks
without consistently enforcing the narrower actor, company, and
active-checkout boundaries those operations require.
> - That can allow agents or non-admin users to mutate sensitive
resources outside the intended governance path.
> - This pull request hardens the route authorization layer and adds
regression coverage for the audited API surfaces.
> - The benefit is tighter multi-company isolation, safer plugin and
adapter administration, and stronger enforcement of active issue
ownership.

## What Changed

- Added route-level authorization checks for budgets, plugin
administration/scoped routes, adapter management, company import/export,
direct agent creation, invite test resolution, and issue mutation/write
surfaces.
- Enforced active checkout ownership for agent-authenticated issue
mutations, while preserving explicit management overrides for permitted
managers.
- Restricted sensitive adapter and plugin management operations to
instance-admin or properly scoped actors.
- Tightened company portability and invite probing routes so agents
cannot cross company boundaries.
- Updated access constants and the Company Access UI copy for the new
active-checkout management grant.
- Added focused regression tests covering cross-company denial, agent
self-mutation denial, admin-only operations, and active checkout
ownership.
- Rebased the branch onto `public-gh/master` and fixed validation
fallout from the rebase: heartbeat-context route ordering and a company
import/export e2e fixture that now opts out of direct-hire approval
before using direct agent creation.
- Updated onboarding and signoff e2e setup to create seed agents through
`/agent-hires` plus board approval, so they remain compatible with the
approval-gated new-agent default.
- Addressed Greptile feedback by removing a duplicate company export API
alias, avoiding N+1 reporting-chain lookups in active-checkout override
checks, allowing agent mutations on unassigned `in_progress` issues, and
blocking NAT64 invite-probe targets.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issues-goal-context-routes.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/adapter-routes-authz.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/invite-test-resolution-route.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/agent-adapter-validation-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/invite-test-resolution-route.test.ts`
- `pnpm -r typecheck`
- `pnpm --filter server typecheck`
- `pnpm --filter ui typecheck`
- `pnpm build`
- `pnpm test:e2e -- tests/e2e/onboarding.spec.ts
tests/e2e/signoff-policy.spec.ts`
- `pnpm test:e2e -- tests/e2e/signoff-policy.spec.ts`
- `pnpm test:run` was also run. It failed under default full-suite
parallelism with two order-dependent failures in
`plugin-routes-authz.test.ts` and `routines-e2e.test.ts`; both files
passed when rerun directly together with `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/routines-e2e.test.ts`.

## Risks

- Medium risk: this changes authorization behavior across multiple
sensitive API surfaces, so callers that depended on broad board/company
access may now receive `403` or `409` until they use the correct
governance path.
- Direct agent creation now respects the company-level board-approval
requirement; integrations that need pending hires should use
`/api/companies/:companyId/agent-hires`.
- Active in-progress issue mutations now require checkout ownership or
an explicit management override, which may reveal workflow assumptions
in older automation.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex, GPT-5 coding agent, tool-using workflow with local shell,
Git, GitHub CLI, and repository tests.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:56:48 -05:00
Dotta
549ef11c14 [codex] Respect manual workspace runtime controls (#4125)
## Thinking Path

> - Paperclip orchestrates AI agents inside execution and project
workspaces
> - Workspace runtime services can be controlled manually by operators
and reused by agent runs
> - Manual start/stop state was not preserved consistently across
workspace policies and routine launches
> - Routine launches also needed branch/workspace variables to default
from the selected workspace context
> - This pull request makes runtime policy state explicit, preserves
manual control, and auto-fills routine branch variables from workspace
data
> - The benefit is less surprising workspace service behavior and fewer
manual inputs when running workspace-scoped routines

## What Changed

- Added runtime-state handling for manual workspace control across
execution and project workspace validators, routes, and services.
- Updated heartbeat/runtime startup behavior so manually stopped
services are respected.
- Auto-filled routine workspace branch variables from available
workspace context.
- Added focused server and UI tests for workspace runtime and routine
variable behavior.
- Removed muted gray background styling from workspace pages and cards
for a cleaner workspace UI.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/routines-service.test.ts
server/src/__tests__/workspace-runtime.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx`
- Result: 55 tests passed, 21 skipped. The embedded Postgres routines
tests skipped on this host with the existing PGlite/Postgres init
warning; workspace-runtime and UI tests passed.

## Risks

- Medium risk: this touches runtime service start/stop policy and
heartbeat launch behavior.
- The focused tests cover manual runtime state, routine variables, and
workspace runtime reuse paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/service verification
is sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:39:37 -05:00
Dotta
c7c1ca0c78 [codex] Clean up terminal-result adapter process groups (#4129)
## Thinking Path

> - Paperclip runs local adapter processes for agents and streams their
output into heartbeat runs
> - Some adapters can emit a terminal result before all descendant
processes have exited
> - If those descendants keep running, a heartbeat can appear complete
while the process group remains alive
> - Claude local runs need a bounded cleanup path after terminal JSON
output is observed and the child exits
> - This pull request adds terminal-result cleanup support to adapter
process utilities and wires it into the Claude local adapter
> - The benefit is fewer stranded adapter process groups after
successful terminal results

## What Changed

- Added terminal-result cleanup options to `runChildProcess`.
- Tracked child exit plus terminal output before signaling lingering
process groups.
- Added Claude local adapter configuration for terminal result cleanup
grace time.
- Added process cleanup tests covering terminal-output cleanup and noisy
non-terminal runs.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts`
- Result: 9 tests passed.

## Risks

- Medium risk: this changes adapter child-process cleanup behavior.
- The cleanup only arms after terminal result detection and child exit,
and it is covered by process-group tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why it is not applicable
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:38:57 -05:00
Dotta
56b3120971 [codex] Improve mobile org chart navigation (#4127)
## Thinking Path

> - Paperclip models companies as teams of human and AI operators
> - The org chart is the primary visual map of that company structure
> - Mobile users need to pan and inspect the chart without awkward
gestures or layout jumps
> - The roadmap also needed to reflect that the multiple-human-users
work is complete
> - This pull request improves mobile org chart gestures and updates the
roadmap references
> - The benefit is a smoother company navigation experience and docs
that match shipped multi-user support

## What Changed

- Added one-finger mobile pan handling for the org chart.
- Expanded org chart test coverage for touch gesture behavior.
- Updated README, ROADMAP, and CLI README references to mark
multiple-human-users work as complete.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run ui/src/pages/OrgChart.test.tsx`
- Result: 4 tests passed.

## Risks

- Low-medium risk: org chart pointer/touch handling changed, but the
behavior is scoped to the org chart page and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted interaction tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:35:33 -05:00
Dotta
4357a3f352 [codex] Harden dashboard run activity charts (#4126)
## Thinking Path

> - Paperclip gives operators a live view of agent work across
dashboards, transcripts, and run activity charts
> - Those views consume live run updates and aggregate run activity from
backend dashboard data
> - Missing or partial run data could make charts brittle, and live
transcript updates were heavier than needed
> - Operators need dashboard data to stay stable even when recent run
payloads are incomplete
> - This pull request hardens dashboard run aggregation, guards chart
rendering, and lightens live run update handling
> - The benefit is a more reliable dashboard during active agent
execution

## What Changed

- Added dashboard run activity types and backend aggregation coverage.
- Guarded activity chart rendering when run data is missing or partial.
- Reduced live transcript update churn in active agent and run chat
surfaces.
- Fixed issue chat avatar alignment in the thread renderer.
- Added focused dashboard, activity chart, and live transcript tests.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/dashboard-service.test.ts
ui/src/components/ActivityCharts.test.tsx
ui/src/components/transcript/useLiveRunTranscripts.test.tsx`
- Result: 8 tests passed, 1 skipped. The embedded Postgres dashboard
service test skipped on this host with the existing PGlite/Postgres init
warning; UI chart and transcript tests passed.

## Risks

- Medium-low risk: aggregation semantics changed, but the UI remains
guarded around incomplete data.
- The dashboard service test is host-skipped here, so CI should confirm
the embedded database path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:34:21 -05:00
Dotta
0f4e4b4c10 [codex] Split reusable agent hiring templates (#4124)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring new agents depends on clear, reusable operating instructions
> - The create-agent skill had one large template reference that mixed
multiple roles together
> - That made it harder to reuse, review, and adapt role-specific
instructions during governed hires
> - This pull request splits the reusable agent instruction templates
into focused role files and polishes the agent instructions pane layout
> - The benefit is faster, clearer agent hiring without bloating the
main skill document

## What Changed

- Split coder, QA, and UX designer reusable instructions into dedicated
reference files.
- Kept the index reference concise and pointed it at the role-specific
files.
- Updated the create-agent skill to describe the separated template
structure.
- Polished the agent detail instructions/package file tree layout so the
longer template references remain readable.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/ui typecheck`
- UI screenshot rationale: no screenshots attached because the visible
change is limited to the Agent detail instructions file-tree layout
(`wrapLabels` plus the side-by-side breakpoint). There is no new user
flow or state transition to demonstrate; reviewers can verify visually
by opening an agent's Instructions tab and resizing across the
single-column and side-by-side breakpoints to confirm long file names
wrap instead of truncating or overflowing.

## Risks

- Low risk: this is documentation and UI layout only.
- Main risk is stale links in the skill references; the new files are
committed in the referenced paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/type verification is
sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:33:19 -05:00
517 changed files with 102977 additions and 4182 deletions

1
.gitignore vendored
View File

@@ -3,6 +3,7 @@ node_modules/
**/node_modules
**/node_modules/
dist/
ui/storybook-static/
.env
*.tsbuildinfo
drizzle/meta/

View File

@@ -123,7 +123,9 @@ pnpm test:release-smoke
Run the browser suites only when your change touches them or when you are explicitly verifying CI/release flows.
Run this full check before claiming done:
For normal issue work, run the smallest relevant verification first. Do not default to repo-wide typecheck/build/test on every heartbeat when a narrower check is enough to prove the change.
Run this full check before claiming repo work done in a PR-ready hand-off, or when the change scope is broad enough that targeted checks are not sufficient:
```sh
pnpm -r typecheck

View File

@@ -29,6 +29,7 @@ COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-
COPY packages/adapters/opencode-local/package.json packages/adapters/opencode-local/
COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY patches/ patches/
RUN pnpm install --frozen-lockfile

111
README.md
View File

@@ -156,6 +156,115 @@ Paperclip handles the hard orchestration details correctly.
<br/>
## What's Under the Hood
Paperclip is a full control plane, not a wrapper. Before you build any of this yourself, know that it already exists:
```
┌──────────────────────────────────────────────────────────────┐
│ PAPERCLIP SERVER │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │Identity & │ │ Work & │ │ Heartbeat │ │Governance │ │
│ │ Access │ │ Tasks │ │ Execution │ │& Approvals│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Org Chart │ │Workspaces │ │ Plugins │ │ Budget │ │
│ │ & Agents │ │ & Runtime │ │ │ │ & Costs │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Routines │ │ Secrets & │ │ Activity │ │ Company │ │
│ │& Schedules│ │ Storage │ │ & Events │ │Portability│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
└──────────────────────────────────────────────────────────────┘
▲ ▲ ▲ ▲
┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
│ Claude │ │ Codex │ │ CLI │ │ HTTP/web │
│ Code │ │ │ │ agents │ │ bots │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
```
### The Systems
<table>
<tr>
<td width="50%">
**Identity & Access** — Two deployment modes (trusted local or authenticated), board users, agent API keys, short-lived run JWTs, company memberships, invite flows, and OpenClaw onboarding. Every mutating request is traced to an actor.
</td>
<td width="50%">
**Org Chart & Agents** — Agents have roles, titles, reporting lines, permissions, and budgets. Adapter examples match the diagram: Claude Code, Codex, CLI agents such as Cursor/Gemini/bash, HTTP/webhook bots such as OpenClaw, and external adapter plugins. If it can receive a heartbeat, it's hired.
</td>
</tr>
<tr>
<td>
**Work & Task System** — Issues carry company/project/goal/parent links, atomic checkout with execution locks, first-class blocker dependencies, comments, documents, attachments, work products, labels, and inbox state. No double-work, no lost context.
</td>
<td>
**Heartbeat Execution** — DB-backed wakeup queue with coalescing, budget checks, workspace resolution, secret injection, skill loading, and adapter invocation. Runs produce structured logs, cost events, session state, and audit trails. Recovery handles orphaned runs automatically.
</td>
</tr>
<tr>
<td>
**Workspaces & Runtime** — Project workspaces, isolated execution workspaces (git worktrees, operator branches), and runtime services (dev servers, preview URLs). Agents work in the right directory with the right context every time.
</td>
<td>
**Governance & Approvals** — Board approval workflows, execution policies with review/approval stages, decision tracking, budget hard-stops, agent pause/resume/terminate, and full audit logging. You're the board — nothing ships without your sign-off.
</td>
</tr>
<tr>
<td>
**Budget & Cost Control** — Token and cost tracking by company, agent, project, goal, issue, provider, and model. Scoped budget policies with warning thresholds and hard stops. Overspend pauses agents and cancels queued work automatically.
</td>
<td>
**Routines & Schedules** — Recurring tasks with cron, webhook, and API triggers. Concurrency and catch-up policies. Each routine execution creates a tracked issue and wakes the assigned agent — no manual kick-offs needed.
</td>
</tr>
<tr>
<td>
**Plugins** — Instance-wide plugin system with out-of-process workers, capability-gated host services, job scheduling, tool exposure, and UI contributions. Extend Paperclip without forking it.
</td>
<td>
**Secrets & Storage** — Instance and company secrets, encrypted local storage, provider-backed object storage, attachments, and work products. Sensitive values stay out of prompts unless a scoped run explicitly needs them.
</td>
</tr>
<tr>
<td>
**Activity & Events** — Mutating actions, heartbeat state changes, cost events, approvals, comments, and work products are recorded as durable activity so operators can audit what happened and why.
</td>
<td>
**Company Portability** — Export and import entire organizations — agents, skills, projects, routines, and issues — with secret scrubbing and collision handling. One deployment, many companies, complete data isolation.
</td>
</tr>
</table>
<br/>
## What Paperclip is not
| | |
@@ -256,7 +365,7 @@ See [doc/DEVELOPING.md](doc/DEVELOPING.md) for the full development guide.
- ✅ Scheduled Routines
- ✅ Better Budgeting
- ✅ Agent Reviews and Approvals
- Multiple Human Users
- Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Artifacts & Work Products
- ⚪ Memory / Knowledge

View File

@@ -44,7 +44,7 @@ Budgets are a core control-plane feature, not an afterthought. Better budgeting
Paperclip should support explicit review and approval stages as first-class workflow steps, not just ad hoc comments. That means reviewer routing, approval gates, change requests, and durable audit trails that fit the same task model as the rest of the control plane.
### Multiple Human Users
### Multiple Human Users
Paperclip needs a clearer path from solo operator to real human teams. That means shared board access, safer collaboration, and a better model for several humans supervising the same autonomous company.

View File

@@ -258,7 +258,7 @@ See [doc/DEVELOPING.md](https://github.com/paperclipai/paperclip/blob/master/doc
- ⚪ Artifacts & Deployments
- ⚪ CEO Chat
- ⚪ MAXIMIZER MODE
- Multiple Human Users
- Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Cloud deployments
- ⚪ Desktop App

View File

@@ -287,6 +287,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
headers: { "content-type": "application/json" },
body: JSON.stringify({ name: `CLI Export Source ${Date.now()}` }),
});
await api(apiBase, `/api/companies/${sourceCompany.id}`, {
method: "PATCH",
headers: { "content-type": "application/json" },
body: JSON.stringify({ requireBoardApprovalForNewAgents: false }),
});
const sourceAgent = await api<{ id: string; name: string }>(
apiBase,
@@ -393,10 +398,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const importedMatchingIssues = importedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(importedAgents.map((agent) => agent.name)).toContain(sourceAgent.name);
expect(importedProjects.map((project) => project.name)).toContain(sourceProject.name);
expect(importedIssues.map((issue) => issue.title)).toContain(sourceIssue.title);
expect(importedMatchingIssues).toHaveLength(1);
const previewExisting = await runCliJson<{
errors: string[];
@@ -466,11 +472,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const twiceImportedMatchingIssues = twiceImportedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(twiceImportedAgents).toHaveLength(2);
expect(new Set(twiceImportedAgents.map((agent) => agent.name)).size).toBe(2);
expect(twiceImportedProjects).toHaveLength(2);
expect(twiceImportedIssues).toHaveLength(2);
expect(twiceImportedMatchingIssues).toHaveLength(2);
expect(new Set(twiceImportedMatchingIssues.map((issue) => issue.identifier)).size).toBe(2);
const zipPath = path.join(tempRoot, "exported-company.zip");
const portableFiles: Record<string, string> = {};
@@ -498,5 +506,5 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
expect(importedFromZip.company.action).toBe("created");
expect(importedFromZip.agents.some((agent) => agent.action === "created")).toBe(true);
}, 60_000);
}, 90_000);
});

View File

@@ -0,0 +1,24 @@
import path from "node:path";
import { describe, expect, it } from "vitest";
import { collectEnvLabDoctorStatus, resolveEnvLabSshStatePath } from "../commands/env-lab.js";
describe("env-lab command", () => {
it("resolves the default SSH fixture state path under the instance root", () => {
const statePath = resolveEnvLabSshStatePath("fixture-test");
expect(statePath).toContain(
path.join("instances", "fixture-test", "env-lab", "ssh-fixture", "state.json"),
);
});
it("reports doctor status for an instance without a running fixture", async () => {
const status = await collectEnvLabDoctorStatus({ instance: "fixture-test-missing" });
expect(status.statePath).toContain(
path.join("instances", "fixture-test-missing", "env-lab", "ssh-fixture", "state.json"),
);
expect(typeof status.ssh.supported).toBe("boolean");
expect(status.ssh.running).toBe(false);
expect(status.ssh.environment).toBeNull();
});
});

View File

@@ -599,7 +599,7 @@ describe("worktree helpers", () => {
fs.rmSync(tempRoot, { recursive: true, force: true });
}
},
20000,
30000,
);
it("avoids ports already claimed by sibling worktree instance configs", async () => {
@@ -881,7 +881,7 @@ describe("worktree helpers", () => {
}
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
}, 30_000);
it("restores the current worktree config and instance data if reseed fails", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-reseed-rollback-"));
@@ -1038,7 +1038,7 @@ describe("worktree helpers", () => {
execFileSync("git", ["worktree", "remove", "--force", worktreePath], { cwd: repoRoot, stdio: "ignore" });
fs.rmSync(tempRoot, { recursive: true, force: true });
}
});
}, 15_000);
it("creates and initializes a worktree from the top-level worktree:make command", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-make-"));

View File

@@ -61,6 +61,7 @@ interface IssueUpdateOptions extends BaseClientOptions {
interface IssueCommentOptions extends BaseClientOptions {
body: string;
reopen?: boolean;
resume?: boolean;
}
interface IssueCheckoutOptions extends BaseClientOptions {
@@ -241,12 +242,14 @@ export function registerIssueCommands(program: Command): void {
.argument("<issueId>", "Issue ID")
.requiredOption("--body <text>", "Comment body")
.option("--reopen", "Reopen if issue is done/cancelled")
.option("--resume", "Request explicit follow-up and wake the assignee when resumable")
.action(async (issueId: string, opts: IssueCommentOptions) => {
try {
const ctx = resolveCommandContext(opts);
const payload = addIssueCommentSchema.parse({
body: opts.body,
reopen: opts.reopen,
resume: opts.resume,
});
const comment = await ctx.api.post<IssueComment>(`/api/issues/${issueId}/comments`, payload);
printOutput(comment, { json: ctx.json });

174
cli/src/commands/env-lab.ts Normal file
View File

@@ -0,0 +1,174 @@
import path from "node:path";
import type { Command } from "commander";
import * as p from "@clack/prompts";
import pc from "picocolors";
import {
buildSshEnvLabFixtureConfig,
getSshEnvLabSupport,
readSshEnvLabFixtureStatus,
startSshEnvLabFixture,
stopSshEnvLabFixture,
} from "@paperclipai/adapter-utils/ssh";
import { resolvePaperclipInstanceId, resolvePaperclipInstanceRoot } from "../config/home.js";
export function resolveEnvLabSshStatePath(instanceId?: string): string {
const resolvedInstanceId = resolvePaperclipInstanceId(instanceId);
return path.resolve(
resolvePaperclipInstanceRoot(resolvedInstanceId),
"env-lab",
"ssh-fixture",
"state.json",
);
}
function printJson(value: unknown) {
process.stdout.write(`${JSON.stringify(value, null, 2)}\n`);
}
function summarizeFixture(state: {
host: string;
port: number;
username: string;
workspaceDir: string;
sshdLogPath: string;
}) {
p.log.message(`Host: ${pc.cyan(state.host)}:${pc.cyan(String(state.port))}`);
p.log.message(`User: ${pc.cyan(state.username)}`);
p.log.message(`Workspace: ${pc.cyan(state.workspaceDir)}`);
p.log.message(`Log: ${pc.dim(state.sshdLogPath)}`);
}
export async function collectEnvLabDoctorStatus(opts: { instance?: string }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const [sshSupport, sshStatus] = await Promise.all([
getSshEnvLabSupport(),
readSshEnvLabFixtureStatus(statePath),
]);
const environment = sshStatus.state ? await buildSshEnvLabFixtureConfig(sshStatus.state) : null;
return {
statePath,
ssh: {
supported: sshSupport.supported,
reason: sshSupport.reason,
running: sshStatus.running,
state: sshStatus.state,
environment,
},
};
}
export async function envLabUpCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const state = await startSshEnvLabFixture({ statePath });
const environment = await buildSshEnvLabFixtureConfig(state);
if (opts.json) {
printJson({ state, environment });
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabStatusCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const status = await readSshEnvLabFixtureStatus(statePath);
const environment = status.state ? await buildSshEnvLabFixtureConfig(status.state) : null;
if (opts.json) {
printJson({ ...status, environment, statePath });
return;
}
if (!status.state || !status.running) {
p.log.info(`SSH env-lab fixture is not running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDownCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const stopped = await stopSshEnvLabFixture(statePath);
if (opts.json) {
printJson({ stopped, statePath });
return;
}
if (!stopped) {
p.log.info(`No SSH env-lab fixture was running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture stopped.");
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDoctorCommand(opts: { instance?: string; json?: boolean }) {
const status = await collectEnvLabDoctorStatus(opts);
if (opts.json) {
printJson(status);
return;
}
if (status.ssh.supported) {
p.log.success("SSH fixture prerequisites are installed.");
} else {
p.log.warn(`SSH fixture prerequisites are incomplete: ${status.ssh.reason ?? "unknown reason"}`);
}
if (status.ssh.state && status.ssh.running) {
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.ssh.state);
p.log.message(`Private key: ${pc.dim(status.ssh.state.clientPrivateKeyPath)}`);
p.log.message(`Known hosts: ${pc.dim(status.ssh.state.knownHostsPath)}`);
} else if (status.ssh.state) {
p.log.warn("SSH env-lab fixture state exists, but the process is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
} else {
p.log.info("SSH env-lab fixture is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
}
p.log.message(`Cleanup: ${pc.dim("pnpm paperclipai env-lab down")}`);
}
export function registerEnvLabCommands(program: Command) {
const envLab = program.command("env-lab").description("Deterministic local environment fixtures");
envLab
.command("up")
.description("Start the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabUpCommand);
envLab
.command("status")
.description("Show the current SSH env-lab fixture state")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabStatusCommand);
envLab
.command("down")
.description("Stop the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable stop details")
.action(envLabDownCommand);
envLab
.command("doctor")
.description("Check SSH fixture prerequisites and current status")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable diagnostic details")
.action(envLabDoctorCommand);
}

View File

@@ -1311,6 +1311,7 @@ async function seedWorktreeDatabase(input: {
backupDir: path.resolve(input.targetPaths.backupDir, "seed"),
retention: { dailyDays: 7, weeklyWeeks: 4, monthlyMonths: 1 },
filenamePrefix: `${input.instanceId}-seed`,
backupEngine: "javascript",
includeMigrationJournal: true,
excludeTables: seedPlan.excludedTables,
nullifyColumns: seedPlan.nullifyColumns,

View File

@@ -8,6 +8,7 @@ import { heartbeatRun } from "./commands/heartbeat-run.js";
import { runCommand } from "./commands/run.js";
import { bootstrapCeoInvite } from "./commands/auth-bootstrap-ceo.js";
import { dbBackupCommand } from "./commands/db-backup.js";
import { registerEnvLabCommands } from "./commands/env-lab.js";
import { registerContextCommands } from "./commands/client/context.js";
import { registerCompanyCommands } from "./commands/client/company.js";
import { registerIssueCommands } from "./commands/client/issue.js";
@@ -147,6 +148,7 @@ registerDashboardCommands(program);
registerRoutineCommands(program);
registerFeedbackCommands(program);
registerWorktreeCommands(program);
registerEnvLabCommands(program);
registerPluginCommands(program);
const auth = program.command("auth").description("Authentication and bootstrap utilities");

View File

@@ -2,7 +2,7 @@
Paperclip CLI now supports both:
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`)
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`, `env-lab`)
- control-plane client operations (issues, approvals, agents, activity, dashboard)
## Base Usage
@@ -45,6 +45,15 @@ Allow an authenticated/private hostname (for example custom Tailscale DNS):
pnpm paperclipai allowed-hostname dotta-macbook-pro
```
Bring up the default local SSH fixture for environment testing:
```sh
pnpm paperclipai env-lab up
pnpm paperclipai env-lab doctor
pnpm paperclipai env-lab status --json
pnpm paperclipai env-lab down
```
All client commands support:
- `--data-dir <path>`

View File

@@ -27,6 +27,18 @@ pnpm db:migrate
When `DATABASE_URL` is unset, this command targets the current embedded PostgreSQL instance for your active Paperclip config/instance.
Issue reference mentions follow the normal migration path: the schema migration creates the tracking table, but it does not backfill historical issue titles, descriptions, comments, or documents automatically.
To backfill existing content manually after migrating, run:
```sh
pnpm issue-references:backfill
# optional: limit to one company
pnpm issue-references:backfill -- --company <company-id>
```
Future issue, comment, and document writes sync references automatically without running the backfill command.
This mode is ideal for local development and one-command installs.
Docker note: the Docker quickstart image also uses embedded PostgreSQL by default. Persist `/paperclip` to keep DB state across container restarts (see `doc/DOCKER.md`).

View File

@@ -43,6 +43,17 @@ This starts:
`pnpm dev` and `pnpm dev:once` are now idempotent for the current repo and instance: if the matching Paperclip dev runner is already alive, Paperclip reports the existing process instead of starting a duplicate.
## Storybook
The board UI Storybook keeps stories and Storybook config under `ui/storybook/` so component review files stay out of the app source routes.
```sh
pnpm storybook
pnpm build-storybook
```
These run the `@paperclipai/ui` Storybook on port `6006` and build the static output to `ui/storybook-static/`.
Inspect or stop the current repo's managed dev runner:
```sh

View File

@@ -37,7 +37,7 @@ These decisions close open questions from `SPEC.md` for V1.
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | No automatic reassignment; work recovery stays manual/explicit |
| Recovery | No automatic reassignment; control-plane recovery may retry lost execution continuity once, then uses explicit recovery issues or human escalation |
| Agent adapters | Built-in `process` and `http` adapters |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
| Budget period | Monthly UTC calendar window |
@@ -395,7 +395,7 @@ Side effects:
- entering `done` sets `completed_at`
- entering `cancelled` sets `cancelled_at`
Detailed ownership, execution, blocker, and crash-recovery semantics are documented in `doc/execution-semantics.md`.
Detailed ownership, execution, blocker, active-run watchdog, and crash-recovery semantics are documented in `doc/execution-semantics.md`.
## 8.3 Approval Status
@@ -484,6 +484,7 @@ All endpoints are under `/api` and return JSON.
- `DELETE /issues/:issueId/documents/:key`
- `POST /issues/:issueId/checkout`
- `POST /issues/:issueId/release`
- `POST /issues/:issueId/admin/force-release` (board-only lock recovery)
- `POST /issues/:issueId/comments`
- `GET /issues/:issueId/comments`
- `POST /companies/:companyId/issues/:issueId/attachments` (multipart upload)
@@ -508,6 +509,8 @@ Server behavior:
2. if updated row count is 0, return `409` with current owner/status
3. successful checkout sets `assignee_agent_id`, `status = in_progress`, and `started_at`
`POST /issues/:issueId/admin/force-release` is an operator recovery endpoint for stale harness locks. It requires board access to the issue company, clears checkout and execution run lock fields, and may clear the agent assignee when `clearAssignee=true` is passed. The route must write an `issue.admin_force_release` activity log entry containing the previous checkout and execution run IDs.
## 10.5 Projects
- `GET /companies/:companyId/projects`

View File

@@ -1,7 +1,7 @@
# Execution Semantics
Status: Current implementation guide
Date: 2026-04-13
Date: 2026-04-23
Audience: Product and engineering
This document explains how Paperclip interprets issue assignment, issue status, execution runs, wakeups, parent/sub-issue structure, and blocker relationships.
@@ -146,6 +146,8 @@ Use it for:
- explicit waiting relationships
- automatic wakeups when all blockers resolve
Blocked issues should stay idle while blockers remain unresolved. Paperclip should not create a queued heartbeat run for that issue until the final blocker is done and the `issue_blockers_resolved` wake can start real work.
If a parent is truly waiting on a child, model that with blockers. Do not rely on the parent/child relationship alone.
## 7. Consistent Execution Path Rules
@@ -216,15 +218,81 @@ This is an active-work continuity recovery.
Startup recovery and periodic recovery are different from normal wakeup delivery.
On startup and on the periodic recovery loop, Paperclip now does three things in sequence:
On startup and on the periodic recovery loop, Paperclip now does four things in sequence:
1. reap orphaned `running` runs
2. resume persisted `queued` runs
3. reconcile stranded assigned work
4. scan silent active runs and create or update explicit watchdog review issues
That last step is what closes the gap where issue state survives a crash but the wake/run path does not.
The stranded-work pass closes the gap where issue state survives a crash but the wake/run path does not. The silent-run scan covers the separate case where a live process exists but has stopped producing observable output.
## 10. What This Does Not Mean
## 10. Silent Active-Run Watchdog
An active run can still be unhealthy even when its process is `running`. Paperclip treats prolonged output silence as a watchdog signal, not as proof that the run is failed.
The recovery service owns this contract:
- classify active-run output silence as `ok`, `suspicious`, `critical`, `snoozed`, or `not_applicable`
- collect bounded evidence from run logs, recent run events, child issues, and blockers
- preserve redaction and truncation before evidence is written to issue descriptions
- create at most one open `stale_active_run_evaluation` issue per run
- honor active snooze decisions before creating more review work
- build the `outputSilence` summary shown by live-run and active-run API responses
Suspicious silence creates a medium-priority review issue for the selected recovery owner. Critical silence raises that review issue to high priority and blocks the source issue on the explicit evaluation task without cancelling the active process.
Watchdog decisions are explicit operator/recovery-owner decisions:
- `snooze` records an operator-chosen future quiet-until time and suppresses scan-created review work during that window
- `continue` records that the current evidence is acceptable, does not cancel or mutate the active run, and sets a 30-minute default re-arm window before the watchdog evaluates the still-silent run again
- `dismissed_false_positive` records why the review was not actionable
Operators should prefer `snooze` for known time-bounded quiet periods. `continue` is only a short acknowledgement of the current evidence; if the run remains silent after the re-arm window, the periodic watchdog scan can create or update review work again.
The board can record watchdog decisions. The assigned owner of the watchdog evaluation issue can also record them. Other agents cannot.
## 11. Auto-Recover vs Explicit Recovery vs Human Escalation
Paperclip uses three different recovery outcomes, depending on how much it can safely infer.
### Auto-Recover
Auto-recovery is allowed when ownership is clear and the control plane only lost execution continuity.
Examples:
- requeue one dispatch wake for an assigned `todo` issue whose latest run failed, timed out, or was cancelled
- requeue one continuation wake for an assigned `in_progress` issue whose live execution path disappeared
- assign an orphan blocker back to its creator when that blocker is already preventing other work
Auto-recovery preserves the existing owner. It does not choose a replacement agent.
### Explicit Recovery Issue
Paperclip creates an explicit recovery issue when the system can identify a problem but cannot safely complete the work itself.
Examples:
- automatic stranded-work retry was already exhausted
- a dependency graph has an invalid/uninvokable owner, unassigned blocker, or invalid review participant
- an active run is silent past the watchdog threshold
The source issue remains visible and blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, or record the reason it is a false positive.
### Human Escalation
Human escalation is required when the next safe action depends on board judgment, budget/approval policy, or information unavailable to the control plane.
Examples:
- all candidate recovery owners are paused, terminated, pending approval, or budget-blocked
- the issue is human-owned rather than agent-owned
- the run is intentionally quiet but needs an operator decision before cancellation or continuation
In these cases Paperclip should leave a visible issue/comment trail instead of silently retrying.
## 12. What This Does Not Mean
These semantics do not change V1 into an auto-reassignment system.
@@ -238,9 +306,10 @@ The recovery model is intentionally conservative:
- preserve ownership
- retry once when the control plane lost execution continuity
- create explicit recovery work when the system can identify a bounded recovery owner/action
- escalate visibly when the system cannot safely keep going
## 11. Practical Interpretation
## 13. Practical Interpretation
For a board operator, the intended meaning is:

View File

@@ -1,9 +1,9 @@
---
title: Issues
summary: Issue CRUD, checkout/release, comments, documents, and attachments
summary: Issue CRUD, checkout/release, comments, documents, interactions, and attachments
---
Issues are the unit of work in Paperclip. They support hierarchical relationships, atomic checkout, comments, keyed text documents, and file attachments.
Issues are the unit of work in Paperclip. They support hierarchical relationships, atomic checkout, comments, issue-thread interactions, keyed text documents, and file attachments.
## List Issues
@@ -121,6 +121,65 @@ POST /api/issues/{issueId}/comments
@-mentions (`@AgentName`) in comments trigger heartbeats for the mentioned agent.
## Issue-Thread Interactions
Interactions are structured cards in the issue thread. Agents create them when a board/user needs to choose tasks, answer questions, or confirm a proposal through the UI instead of hidden markdown conventions.
### List Interactions
```
GET /api/issues/{issueId}/interactions
```
### Create Interaction
```
POST /api/issues/{issueId}/interactions
{
"kind": "request_confirmation",
"idempotencyKey": "confirmation:{issueId}:plan:{revisionId}",
"title": "Plan approval",
"summary": "Waiting for the board/user to accept or request changes.",
"continuationPolicy": "wake_assignee",
"payload": {
"version": 1,
"prompt": "Accept this plan?",
"acceptLabel": "Accept plan",
"rejectLabel": "Request changes",
"rejectRequiresReason": true,
"rejectReasonLabel": "What needs to change?",
"detailsMarkdown": "Review the latest plan document before accepting.",
"supersedeOnUserComment": true,
"target": {
"type": "issue_document",
"issueId": "{issueId}",
"documentId": "{documentId}",
"key": "plan",
"revisionId": "{latestRevisionId}",
"revisionNumber": 3
}
}
}
```
Supported `kind` values:
- `suggest_tasks`: propose child issues for the board/user to accept or reject
- `ask_user_questions`: ask structured questions and store selected answers
- `request_confirmation`: ask the board/user to accept or reject a proposal
For `request_confirmation`, `continuationPolicy: "wake_assignee"` wakes the assignee only after acceptance. Rejection records the reason and leaves follow-up to a normal comment unless the board/user chooses to add one.
### Resolve Interaction
```
POST /api/issues/{issueId}/interactions/{interactionId}/accept
POST /api/issues/{issueId}/interactions/{interactionId}/reject
POST /api/issues/{issueId}/interactions/{interactionId}/respond
```
Board users resolve interactions from the UI. Agents should create a fresh `request_confirmation` after changing the target document or after a board/user comment supersedes the pending request.
## Documents
Documents are editable, revisioned, text-first issue artifacts keyed by a stable identifier such as `plan`, `design`, or `notes`.

View File

@@ -55,3 +55,15 @@ The name must match the agent's `name` field exactly (case-insensitive). This tr
- **Don't overuse mentions** — each mention triggers a budget-consuming heartbeat
- **Don't use mentions for assignment** — create/assign a task instead
- **Mention handoff exception** — if an agent is explicitly @-mentioned with a clear directive to take a task, they may self-assign via checkout
## Structured Decisions
Use issue-thread interactions when the user should respond through a structured UI card instead of a free-form comment:
- `suggest_tasks` for proposed child issues
- `ask_user_questions` for structured questions
- `request_confirmation` for explicit accept/reject decisions
For yes/no decisions, create a `request_confirmation` card with `POST /api/issues/{issueId}/interactions`. Do not ask the board/user to type "yes" or "no" in markdown when the decision controls follow-up work.
Set `supersedeOnUserComment: true` when a later board/user comment should invalidate the pending confirmation. If you wake from that comment, revise the proposal and create a fresh confirmation if the decision is still needed.

View File

@@ -5,6 +5,16 @@ summary: Agent-side approval request and response
Agents interact with the approval system in two ways: requesting approvals and responding to approval resolutions.
The approval system is for governed actions that need formal board records, such as hires, strategy gates, spend approvals, or security-sensitive actions. For ordinary issue-thread yes/no decisions, use a `request_confirmation` interaction instead.
Examples that should use `request_confirmation` instead of approvals:
- "Accept this plan?"
- "Proceed with this issue breakdown?"
- "Use option A or reject and request changes?"
Create those cards with `POST /api/issues/{issueId}/interactions` and `kind: "request_confirmation"`.
## Requesting a Hire
Managers and CEOs can request to hire new agents:
@@ -37,6 +47,16 @@ POST /api/companies/{companyId}/approvals
}
```
## Plan Approval Cards
For normal issue implementation plans, use the issue-thread confirmation surface:
1. Update the `plan` issue document.
2. Create `request_confirmation` bound to the latest `plan` revision.
3. Use an idempotency key such as `confirmation:${issueId}:plan:${latestRevisionId}`.
4. Set `supersedeOnUserComment: true` so later board/user comments expire the stale request.
5. Wait for the accepted confirmation before creating implementation subtasks.
## Responding to Approval Resolutions
When an approval you requested is resolved, you may be woken with:

View File

@@ -70,6 +70,8 @@ Use your tools and capabilities to complete the task. If the issue is actionable
Leave durable progress in comments, documents, or work products, and include the next action before exiting. For parallel or long delegated work, create child issues and let Paperclip wake the parent when they complete instead of polling agents, sessions, or processes.
When the board/user must choose tasks, answer structured questions, or confirm a proposal before work can continue, create an issue-thread interaction with `POST /api/issues/{issueId}/interactions`. Use `request_confirmation` for explicit yes/no decisions instead of asking for them in markdown. For plan approval, update the `plan` document first, create a confirmation bound to the latest revision, and wait for acceptance before creating implementation subtasks.
### Step 8: Update Status
Always include the run ID header on state changes:
@@ -107,6 +109,7 @@ Always set `parentId` and `goalId` on subtasks.
- **Start actionable work** in the same heartbeat; planning-only exits are for planning tasks
- **Leave a clear next action** in durable issue context
- **Use child issues instead of polling** for long or parallel delegated work
- **Use `request_confirmation`** for issue-scoped yes/no decisions and plan approval cards
- **Always set parentId** on subtasks
- **Never cancel cross-team tasks** — reassign to your manager
- **Escalate when stuck** — use your chain of command

View File

@@ -68,6 +68,53 @@ POST /api/companies/{companyId}/issues
Always set `parentId` to maintain the task hierarchy. Set `goalId` when applicable.
## Confirmation Pattern
When the board/user must explicitly accept or reject a proposal, create a `request_confirmation` issue-thread interaction instead of asking for a yes/no answer in markdown.
```
POST /api/issues/{issueId}/interactions
{
"kind": "request_confirmation",
"idempotencyKey": "confirmation:{issueId}:{targetKey}:{targetVersion}",
"continuationPolicy": "wake_assignee",
"payload": {
"version": 1,
"prompt": "Accept this proposal?",
"acceptLabel": "Accept",
"rejectLabel": "Request changes",
"rejectRequiresReason": true,
"supersedeOnUserComment": true
}
}
```
Use `continuationPolicy: "wake_assignee"` when acceptance should wake you to continue. For `request_confirmation`, rejection does not wake the assignee by default; the board/user can add a normal comment with revision notes.
## Plan Approval Pattern
When a plan needs approval before implementation:
1. Create or update the issue document with key `plan`.
2. Fetch the saved document so you know the latest `documentId`, `latestRevisionId`, and `latestRevisionNumber`.
3. Create a `request_confirmation` targeting that exact `plan` revision.
4. Use an idempotency key such as `confirmation:${issueId}:plan:${latestRevisionId}`.
5. Wait for acceptance before creating implementation subtasks.
6. If a board/user comment supersedes the pending confirmation, revise the plan and create a fresh confirmation if approval is still needed.
Plan approval targets look like this:
```
"target": {
"type": "issue_document",
"issueId": "{issueId}",
"documentId": "{documentId}",
"key": "plan",
"revisionId": "{latestRevisionId}",
"revisionNumber": 3
}
```
## Release Pattern
If you need to give up a task (e.g. you realize it should go to someone else):

View File

@@ -11,13 +11,16 @@
"dev:stop": "pnpm --filter @paperclipai/server exec tsx ../scripts/dev-service.ts stop",
"dev:server": "pnpm --filter @paperclipai/server dev",
"dev:ui": "pnpm --filter @paperclipai/ui dev",
"storybook": "pnpm --filter @paperclipai/ui storybook",
"build-storybook": "pnpm --filter @paperclipai/ui build-storybook",
"build": "pnpm run preflight:workspace-links && pnpm -r build",
"typecheck": "pnpm run preflight:workspace-links && pnpm -r typecheck",
"test": "pnpm run test:run",
"test:watch": "pnpm run preflight:workspace-links && vitest",
"test:run": "pnpm run preflight:workspace-links && vitest run",
"test:run": "pnpm run preflight:workspace-links && node scripts/run-vitest-stable.mjs",
"db:generate": "pnpm --filter @paperclipai/db generate",
"db:migrate": "pnpm --filter @paperclipai/db migrate",
"issue-references:backfill": "pnpm run preflight:workspace-links && tsx scripts/backfill-issue-reference-mentions.ts",
"secrets:migrate-inline-env": "tsx scripts/migrate-inline-env-secrets.ts",
"db:backup": "./scripts/backup-db.sh",
"paperclipai": "node cli/node_modules/tsx/dist/cli.mjs cli/src/index.ts",

View File

@@ -0,0 +1,152 @@
import path from "node:path";
import {
prepareSandboxManagedRuntime,
type PreparedSandboxManagedRuntime,
type SandboxManagedRuntimeAsset,
type SandboxManagedRuntimeClient,
type SandboxRemoteExecutionSpec,
} from "./sandbox-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
export interface CommandManagedRuntimeRunner {
execute(input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}): Promise<RunProcessResult>;
}
export interface CommandManagedRuntimeSpec {
providerKey?: string | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
paperclipApiUrl?: string | null;
}
export type CommandManagedRuntimeAsset = SandboxManagedRuntimeAsset;
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function requireSuccessfulResult(result: RunProcessResult, action: string): void {
if (result.exitCode === 0 && !result.timedOut) return;
const stderr = result.stderr.trim();
const detail = stderr.length > 0 ? `: ${stderr}` : "";
throw new Error(`${action} failed with exit code ${result.exitCode ?? "null"}${detail}`);
}
function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
timeoutMs: number;
}): SandboxManagedRuntimeClient {
const runShell = async (script: string, opts: { stdin?: string; timeoutMs?: number } = {}) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", script],
cwd: input.remoteCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
requireSuccessfulResult(result, script);
return result;
};
return {
makeDir: async (remotePath) => {
await runShell(`mkdir -p ${shellQuote(remotePath)}`);
},
writeFile: async (remotePath, bytes) => {
const body = toBuffer(bytes).toString("base64");
await runShell(
`mkdir -p ${shellQuote(path.posix.dirname(remotePath))} && base64 -d > ${shellQuote(remotePath)}`,
{ stdin: body },
);
},
readFile: async (remotePath) => {
const result = await runShell(`base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64");
},
remove: async (remotePath) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
cwd: input.remoteCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
},
run: async (command, options) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", command],
cwd: input.remoteCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
},
};
}
export async function prepareCommandManagedRuntime(input: {
runner: CommandManagedRuntimeRunner;
spec: CommandManagedRuntimeSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: CommandManagedRuntimeAsset[];
installCommand?: string | null;
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
sandboxId: input.spec.leaseId ?? "managed",
remoteCwd: workspaceRemoteDir,
timeoutMs,
apiKey: null,
paperclipApiUrl: input.spec.paperclipApiUrl ?? null,
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
remoteCwd: workspaceRemoteDir,
timeoutMs,
});
if (input.installCommand?.trim()) {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", input.installCommand.trim()],
cwd: workspaceRemoteDir,
timeoutMs,
});
requireSuccessfulResult(result, input.installCommand.trim());
}
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: input.workspaceExclude,
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}

View File

@@ -0,0 +1,96 @@
import { describe, expect, it, vi } from "vitest";
import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetToRemoteSpec,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
type AdapterSandboxExecutionTarget,
} from "./execution-target.js";
describe("sandbox adapter execution targets", () => {
it("executes through the provider-neutral runner without a remote spec", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "ok\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
timeoutMs: 30_000,
runner,
};
expect(adapterExecutionTargetToRemoteSpec(target)).toBeNull();
const result = await runAdapterExecutionTargetProcess("run-1", target, "agent-cli", ["--json"], {
cwd: "/local/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
});
expect(result.stdout).toBe("ok\n");
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "agent-cli",
args: ["--json"],
cwd: "/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutMs: 5000,
}));
expect(adapterExecutionTargetSessionIdentity(target)).toEqual({
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
});
});
it("runs shell commands through the same runner", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/home/sandbox",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetShellCommand("run-2", target, 'printf %s "$HOME"', {
cwd: "/local/workspace",
env: {},
timeoutSec: 7,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-lc", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
});

View File

@@ -0,0 +1,161 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import * as ssh from "./ssh.js";
import {
adapterExecutionTargetUsesManagedHome,
runAdapterExecutionTargetShellCommand,
} from "./execution-target.js";
describe("runAdapterExecutionTargetShellCommand", () => {
afterEach(() => {
vi.restoreAllMocks();
});
it("quotes remote shell commands with the shared SSH quoting helper", async () => {
const runSshCommandSpy = vi.spyOn(ssh, "runSshCommand").mockResolvedValue({
stdout: "",
stderr: "",
});
await runAdapterExecutionTargetShellCommand(
"run-1",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
`printf '%s\\n' "$HOME" && echo "it's ok"`,
{
cwd: "/tmp/local",
env: {},
},
);
expect(runSshCommandSpy).toHaveBeenCalledWith(
expect.objectContaining({
host: "ssh.example.test",
username: "ssh-user",
}),
`sh -lc ${ssh.shellQuote(`printf '%s\\n' "$HOME" && echo "it's ok"`)}`,
expect.any(Object),
);
});
it("returns a timedOut result when the SSH shell command times out", async () => {
vi.spyOn(ssh, "runSshCommand").mockRejectedValue(Object.assign(new Error("timed out"), {
code: "ETIMEDOUT",
stdout: "partial stdout",
stderr: "partial stderr",
signal: "SIGTERM",
}));
const onLog = vi.fn(async () => {});
const result = await runAdapterExecutionTargetShellCommand(
"run-2",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"sleep 10",
{
cwd: "/tmp/local",
env: {},
onLog,
},
);
expect(result).toMatchObject({
exitCode: null,
signal: "SIGTERM",
timedOut: true,
stdout: "partial stdout",
stderr: "partial stderr",
});
expect(onLog).toHaveBeenCalledWith("stdout", "partial stdout");
expect(onLog).toHaveBeenCalledWith("stderr", "partial stderr");
});
it("returns the SSH process exit code for non-zero remote command failures", async () => {
vi.spyOn(ssh, "runSshCommand").mockRejectedValue(Object.assign(new Error("non-zero exit"), {
code: 17,
stdout: "partial stdout",
stderr: "partial stderr",
signal: null,
}));
const onLog = vi.fn(async () => {});
const result = await runAdapterExecutionTargetShellCommand(
"run-3",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"false",
{
cwd: "/tmp/local",
env: {},
onLog,
},
);
expect(result).toMatchObject({
exitCode: 17,
signal: null,
timedOut: false,
stdout: "partial stdout",
stderr: "partial stderr",
});
expect(onLog).toHaveBeenCalledWith("stdout", "partial stdout");
expect(onLog).toHaveBeenCalledWith("stderr", "partial stderr");
});
it("keeps managed homes disabled for both local and SSH targets", () => {
expect(adapterExecutionTargetUsesManagedHome(null)).toBe(false);
expect(adapterExecutionTargetUsesManagedHome({
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
})).toBe(false);
});
});

View File

@@ -0,0 +1,516 @@
import path from "node:path";
import type { SshRemoteExecutionSpec } from "./ssh.js";
import {
prepareCommandManagedRuntime,
type CommandManagedRuntimeRunner,
} from "./command-managed-runtime.js";
import {
buildRemoteExecutionSessionIdentity,
prepareRemoteManagedRuntime,
remoteExecutionSessionMatches,
type RemoteManagedRuntimeAsset,
} from "./remote-managed-runtime.js";
import { parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
import {
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
type RunProcessResult,
type TerminalResultCleanupOptions,
} from "./server-utils.js";
export interface AdapterLocalExecutionTarget {
kind: "local";
environmentId?: string | null;
leaseId?: string | null;
}
export interface AdapterSshExecutionTarget {
kind: "remote";
transport: "ssh";
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
paperclipApiUrl?: string | null;
spec: SshRemoteExecutionSpec;
}
export interface AdapterSandboxExecutionTarget {
kind: "remote";
transport: "sandbox";
providerKey?: string | null;
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
paperclipApiUrl?: string | null;
timeoutMs?: number | null;
runner?: CommandManagedRuntimeRunner;
}
export type AdapterExecutionTarget =
| AdapterLocalExecutionTarget
| AdapterSshExecutionTarget
| AdapterSandboxExecutionTarget;
export type AdapterRemoteExecutionSpec = SshRemoteExecutionSpec;
export type AdapterManagedRuntimeAsset = RemoteManagedRuntimeAsset;
export interface PreparedAdapterExecutionTargetRuntime {
target: AdapterExecutionTarget;
runtimeRootDir: string | null;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
export interface AdapterExecutionTargetProcessOptions {
cwd: string;
env: Record<string, string>;
stdin?: string;
timeoutSec: number;
graceSec: number;
onLog: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; processGroupId: number | null; startedAt: string }) => Promise<void>;
terminalResultCleanup?: TerminalResultCleanupOptions;
}
export interface AdapterExecutionTargetShellOptions {
cwd: string;
env: Record<string, string>;
timeoutSec?: number;
graceSec?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
}
function parseObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function readString(value: unknown): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function readStringMeta(parsed: Record<string, unknown>, key: string): string | null {
return readString(parsed[key]);
}
function isAdapterExecutionTargetInstance(value: unknown): value is AdapterExecutionTarget {
const parsed = parseObject(value);
if (parsed.kind === "local") return true;
if (parsed.kind !== "remote") return false;
if (parsed.transport === "ssh") return parseSshRemoteExecutionSpec(parseObject(parsed.spec)) !== null;
if (parsed.transport !== "sandbox") return false;
return readStringMeta(parsed, "remoteCwd") !== null;
}
export function adapterExecutionTargetToRemoteSpec(
target: AdapterExecutionTarget | null | undefined,
): AdapterRemoteExecutionSpec | null {
return target?.kind === "remote" && target.transport === "ssh" ? target.spec : null;
}
export function adapterExecutionTargetIsRemote(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote";
}
export function adapterExecutionTargetUsesManagedHome(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote" && target.transport === "sandbox";
}
export function adapterExecutionTargetRemoteCwd(
target: AdapterExecutionTarget | null | undefined,
localCwd: string,
): string {
return target?.kind === "remote" ? target.remoteCwd : localCwd;
}
export function adapterExecutionTargetPaperclipApiUrl(
target: AdapterExecutionTarget | null | undefined,
): string | null {
if (target?.kind !== "remote") return null;
if (target.transport === "ssh") return target.paperclipApiUrl ?? target.spec.paperclipApiUrl ?? null;
return target.paperclipApiUrl ?? null;
}
export function describeAdapterExecutionTarget(
target: AdapterExecutionTarget | null | undefined,
): string {
if (!target || target.kind === "local") return "local environment";
if (target.transport === "ssh") {
return `SSH environment ${target.spec.username}@${target.spec.host}:${target.spec.port}`;
}
return `sandbox environment${target.providerKey ? ` (${target.providerKey})` : ""}`;
}
function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandManagedRuntimeRunner {
if (target.runner) return target.runner;
throw new Error(
"Sandbox execution target is missing its provider runtime runner. Sandbox commands must execute through the environment runtime.",
);
}
export async function ensureAdapterExecutionTargetCommandResolvable(
command: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
env: NodeJS.ProcessEnv,
) {
if (target?.kind === "remote" && target.transport === "sandbox") {
return;
}
await ensureCommandResolvable(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function resolveAdapterExecutionTargetCommandForLogs(
command: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
env: NodeJS.ProcessEnv,
): Promise<string> {
if (target?.kind === "remote" && target.transport === "sandbox") {
return `sandbox://${target.providerKey ?? "provider"}/${target.leaseId ?? "lease"}/${target.remoteCwd} :: ${command}`;
}
return await resolveCommandForLogs(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function runAdapterExecutionTargetProcess(
runId: string,
target: AdapterExecutionTarget | null | undefined,
command: string,
args: string[],
options: AdapterExecutionTargetProcessOptions,
): Promise<RunProcessResult> {
if (target?.kind === "remote" && target.transport === "sandbox") {
const runner = requireSandboxRunner(target);
return await runner.execute({
command,
args,
cwd: target.remoteCwd,
env: options.env,
stdin: options.stdin,
timeoutMs: options.timeoutSec > 0 ? options.timeoutSec * 1000 : target.timeoutMs ?? undefined,
onLog: options.onLog,
onSpawn: options.onSpawn
? async (meta) => options.onSpawn?.({ ...meta, processGroupId: null })
: undefined,
});
}
return await runChildProcess(runId, command, args, {
cwd: options.cwd,
env: options.env,
stdin: options.stdin,
timeoutSec: options.timeoutSec,
graceSec: options.graceSec,
onLog: options.onLog,
onSpawn: options.onSpawn,
terminalResultCleanup: options.terminalResultCleanup,
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function runAdapterExecutionTargetShellCommand(
runId: string,
target: AdapterExecutionTarget | null | undefined,
command: string,
options: AdapterExecutionTargetShellOptions,
): Promise<RunProcessResult> {
const onLog = options.onLog ?? (async () => {});
if (target?.kind === "remote") {
const startedAt = new Date().toISOString();
if (target.transport === "ssh") {
try {
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
timeoutMs: (options.timeoutSec ?? 15) * 1000,
});
if (result.stdout) await onLog("stdout", result.stdout);
if (result.stderr) await onLog("stderr", result.stderr);
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const timedOutError = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
signal?: string | null;
};
const stdout = timedOutError.stdout ?? "";
const stderr = timedOutError.stderr ?? "";
if (typeof timedOutError.code === "number") {
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: timedOutError.code,
signal: timedOutError.signal ?? null,
timedOut: false,
stdout,
stderr,
pid: null,
startedAt,
};
}
if (timedOutError.code !== "ETIMEDOUT") {
throw error;
}
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: null,
signal: timedOutError.signal ?? null,
timedOut: true,
stdout,
stderr,
pid: null,
startedAt,
};
}
}
return await requireSandboxRunner(target).execute({
command: "sh",
args: ["-lc", command],
cwd: target.remoteCwd,
env: options.env,
timeoutMs: (options.timeoutSec ?? 15) * 1000,
onLog,
});
}
return await runAdapterExecutionTargetProcess(
runId,
target,
"sh",
["-lc", command],
{
cwd: options.cwd,
env: options.env,
timeoutSec: options.timeoutSec ?? 15,
graceSec: options.graceSec ?? 5,
onLog,
},
);
}
export async function readAdapterExecutionTargetHomeDir(
runId: string,
target: AdapterExecutionTarget | null | undefined,
options: AdapterExecutionTargetShellOptions,
): Promise<string | null> {
const result = await runAdapterExecutionTargetShellCommand(
runId,
target,
'printf %s "$HOME"',
options,
);
const homeDir = result.stdout.trim();
return homeDir.length > 0 ? homeDir : null;
}
export async function ensureAdapterExecutionTargetFile(
runId: string,
target: AdapterExecutionTarget | null | undefined,
filePath: string,
options: AdapterExecutionTargetShellOptions,
): Promise<void> {
await runAdapterExecutionTargetShellCommand(
runId,
target,
`mkdir -p ${shellQuote(path.posix.dirname(filePath))} && : > ${shellQuote(filePath)}`,
options,
);
}
export function adapterExecutionTargetSessionIdentity(
target: AdapterExecutionTarget | null | undefined,
): Record<string, unknown> | null {
if (!target || target.kind === "local") return null;
if (target.transport === "ssh") return buildRemoteExecutionSessionIdentity(target.spec);
return {
transport: "sandbox",
providerKey: target.providerKey ?? null,
environmentId: target.environmentId ?? null,
leaseId: target.leaseId ?? null,
remoteCwd: target.remoteCwd,
...(target.paperclipApiUrl ? { paperclipApiUrl: target.paperclipApiUrl } : {}),
};
}
export function adapterExecutionTargetSessionMatches(
saved: unknown,
target: AdapterExecutionTarget | null | undefined,
): boolean {
if (!target || target.kind === "local") {
return Object.keys(parseObject(saved)).length === 0;
}
if (target.transport === "ssh") return remoteExecutionSessionMatches(saved, target.spec);
const current = adapterExecutionTargetSessionIdentity(target);
const parsedSaved = parseObject(saved);
return (
readStringMeta(parsedSaved, "transport") === current?.transport &&
readStringMeta(parsedSaved, "providerKey") === current?.providerKey &&
readStringMeta(parsedSaved, "environmentId") === current?.environmentId &&
readStringMeta(parsedSaved, "leaseId") === current?.leaseId &&
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd &&
readStringMeta(parsedSaved, "paperclipApiUrl") === (current?.paperclipApiUrl ?? null)
);
}
export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTarget | null {
const parsed = parseObject(value);
const kind = readStringMeta(parsed, "kind");
if (kind === "local") {
return {
kind: "local",
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
};
}
if (kind === "remote" && readStringMeta(parsed, "transport") === "ssh") {
const spec = parseSshRemoteExecutionSpec(parseObject(parsed.spec));
if (!spec) return null;
return {
kind: "remote",
transport: "ssh",
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd: spec.remoteCwd,
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl") ?? spec.paperclipApiUrl ?? null,
spec,
};
}
if (kind === "remote" && readStringMeta(parsed, "transport") === "sandbox") {
const remoteCwd = readStringMeta(parsed, "remoteCwd");
if (!remoteCwd) return null;
return {
kind: "remote",
transport: "sandbox",
providerKey: readStringMeta(parsed, "providerKey"),
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd,
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl"),
timeoutMs: typeof parsed.timeoutMs === "number" ? parsed.timeoutMs : null,
};
}
return null;
}
export function adapterExecutionTargetFromRemoteExecution(
remoteExecution: unknown,
metadata: Pick<AdapterLocalExecutionTarget, "environmentId" | "leaseId"> = {},
): AdapterExecutionTarget | null {
const parsed = parseObject(remoteExecution);
const ssh = parseSshRemoteExecutionSpec(parsed);
if (ssh) {
return {
kind: "remote",
transport: "ssh",
environmentId: metadata.environmentId ?? null,
leaseId: metadata.leaseId ?? null,
remoteCwd: ssh.remoteCwd,
paperclipApiUrl: ssh.paperclipApiUrl ?? null,
spec: ssh,
};
}
return null;
}
export function readAdapterExecutionTarget(input: {
executionTarget?: unknown;
legacyRemoteExecution?: unknown;
}): AdapterExecutionTarget | null {
if (isAdapterExecutionTargetInstance(input.executionTarget)) {
return input.executionTarget;
}
return (
parseAdapterExecutionTarget(input.executionTarget) ??
adapterExecutionTargetFromRemoteExecution(input.legacyRemoteExecution)
);
}
export async function prepareAdapterExecutionTargetRuntime(input: {
target: AdapterExecutionTarget | null | undefined;
adapterKey: string;
workspaceLocalDir: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: AdapterManagedRuntimeAsset[];
installCommand?: string | null;
}): Promise<PreparedAdapterExecutionTargetRuntime> {
const target = input.target ?? { kind: "local" as const };
if (target.kind === "local") {
return {
target,
runtimeRootDir: null,
assetDirs: {},
restoreWorkspace: async () => {},
};
}
if (target.transport === "ssh") {
const prepared = await prepareRemoteManagedRuntime({
spec: target.spec,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
assets: input.assets,
});
return {
target,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
};
}
const prepared = await prepareCommandManagedRuntime({
runner: requireSandboxRunner(target),
spec: {
providerKey: target.providerKey,
leaseId: target.leaseId,
remoteCwd: target.remoteCwd,
timeoutMs: target.timeoutMs,
paperclipApiUrl: target.paperclipApiUrl,
},
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceExclude: input.workspaceExclude,
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
installCommand: input.installCommand,
});
return {
target,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
};
}
export function runtimeAssetDir(
prepared: Pick<PreparedAdapterExecutionTargetRuntime, "assetDirs">,
key: string,
fallbackRemoteCwd: string,
): string {
return prepared.assetDirs[key] ?? path.posix.join(fallbackRemoteCwd, ".paperclip-runtime", key);
}

View File

@@ -0,0 +1,118 @@
import path from "node:path";
import {
type SshRemoteExecutionSpec,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
} from "./ssh.js";
export interface RemoteManagedRuntimeAsset {
key: string;
localDir: string;
followSymlinks?: boolean;
exclude?: string[];
}
export interface PreparedRemoteManagedRuntime {
spec: SshRemoteExecutionSpec;
workspaceLocalDir: string;
workspaceRemoteDir: string;
runtimeRootDir: string;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
function asObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function asString(value: unknown): string {
return typeof value === "string" ? value : "";
}
function asNumber(value: unknown): number {
return typeof value === "number" ? value : Number(value);
}
export function buildRemoteExecutionSessionIdentity(spec: SshRemoteExecutionSpec | null) {
if (!spec) return null;
return {
transport: "ssh",
host: spec.host,
port: spec.port,
username: spec.username,
remoteCwd: spec.remoteCwd,
...(spec.paperclipApiUrl ? { paperclipApiUrl: spec.paperclipApiUrl } : {}),
} as const;
}
export function remoteExecutionSessionMatches(saved: unknown, current: SshRemoteExecutionSpec | null): boolean {
const currentIdentity = buildRemoteExecutionSessionIdentity(current);
if (!currentIdentity) return false;
const parsedSaved = asObject(saved);
return (
asString(parsedSaved.transport) === currentIdentity.transport &&
asString(parsedSaved.host) === currentIdentity.host &&
asNumber(parsedSaved.port) === currentIdentity.port &&
asString(parsedSaved.username) === currentIdentity.username &&
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd &&
asString(parsedSaved.paperclipApiUrl) === asString(currentIdentity.paperclipApiUrl)
);
}
export async function prepareRemoteManagedRuntime(input: {
spec: SshRemoteExecutionSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
assets?: RemoteManagedRuntimeAsset[];
}): Promise<PreparedRemoteManagedRuntime> {
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeRootDir = path.posix.join(workspaceRemoteDir, ".paperclip-runtime", input.adapterKey);
await prepareWorkspaceForSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
const assetDirs: Record<string, string> = {};
try {
for (const asset of input.assets ?? []) {
const remoteDir = path.posix.join(runtimeRootDir, asset.key);
assetDirs[asset.key] = remoteDir;
await syncDirectoryToSsh({
spec: input.spec,
localDir: asset.localDir,
remoteDir,
followSymlinks: asset.followSymlinks,
exclude: asset.exclude,
});
}
} catch (error) {
await restoreWorkspaceFromSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
throw error;
}
return {
spec: input.spec,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
runtimeRootDir,
assetDirs,
restoreWorkspace: async () => {
await restoreWorkspaceFromSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
},
};
}

View File

@@ -0,0 +1,126 @@
import { lstat, mkdir, mkdtemp, readFile, rm, symlink, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile as execFileCallback } from "node:child_process";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import {
mirrorDirectory,
prepareSandboxManagedRuntime,
type SandboxManagedRuntimeClient,
} from "./sandbox-managed-runtime.js";
const execFile = promisify(execFileCallback);
describe("sandbox managed runtime", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("preserves excluded local workspace artifacts during restore mirroring", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-sandbox-restore-"));
cleanupDirs.push(rootDir);
const sourceDir = path.join(rootDir, "source");
const targetDir = path.join(rootDir, "target");
await mkdir(path.join(sourceDir, "src"), { recursive: true });
await mkdir(path.join(targetDir, ".claude"), { recursive: true });
await mkdir(path.join(targetDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(sourceDir, "src", "app.ts"), "export const value = 2;\n", "utf8");
await writeFile(path.join(targetDir, "stale.txt"), "remove me\n", "utf8");
await writeFile(path.join(targetDir, ".claude", "settings.json"), "{\"keep\":true}\n", "utf8");
await writeFile(path.join(targetDir, ".claude.json"), "{\"keep\":true}\n", "utf8");
await writeFile(path.join(targetDir, ".paperclip-runtime", "state.json"), "{}\n", "utf8");
await mirrorDirectory(sourceDir, targetDir, {
preserveAbsent: [".paperclip-runtime", ".claude", ".claude.json"],
});
await expect(readFile(path.join(targetDir, "src", "app.ts"), "utf8")).resolves.toBe("export const value = 2;\n");
await expect(readFile(path.join(targetDir, ".claude", "settings.json"), "utf8")).resolves.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(targetDir, ".claude.json"), "utf8")).resolves.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(targetDir, ".paperclip-runtime", "state.json"), "utf8")).resolves.toBe("{}\n");
await expect(readFile(path.join(targetDir, "stale.txt"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
});
it("syncs workspace and assets through a provider-neutral sandbox client", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-sandbox-managed-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const localAssetsDir = path.join(rootDir, "local-assets");
const linkedAssetPath = path.join(rootDir, "linked-skill.md");
await mkdir(path.join(localWorkspaceDir, ".claude"), { recursive: true });
await mkdir(localAssetsDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
await writeFile(path.join(localWorkspaceDir, "._README.md"), "appledouble\n", "utf8");
await writeFile(path.join(localWorkspaceDir, ".claude", "settings.json"), "{\"local\":true}\n", "utf8");
await writeFile(linkedAssetPath, "skill body\n", "utf8");
await symlink(linkedAssetPath, path.join(localAssetsDir, "skill.md"));
const client: SandboxManagedRuntimeClient = {
makeDir: async (remotePath) => {
await mkdir(remotePath, { recursive: true });
},
writeFile: async (remotePath, bytes) => {
await mkdir(path.dirname(remotePath), { recursive: true });
await writeFile(remotePath, Buffer.from(bytes));
},
readFile: async (remotePath) => await readFile(remotePath),
remove: async (remotePath) => {
await rm(remotePath, { recursive: true, force: true });
},
run: async (command) => {
await execFile("sh", ["-lc", command], {
maxBuffer: 32 * 1024 * 1024,
});
},
};
const prepared = await prepareSandboxManagedRuntime({
spec: {
transport: "sandbox",
provider: "test",
sandboxId: "sandbox-1",
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
apiKey: null,
},
adapterKey: "test-adapter",
client,
workspaceLocalDir: localWorkspaceDir,
workspaceExclude: [".claude"],
preserveAbsentOnRestore: [".claude"],
assets: [{
key: "skills",
localDir: localAssetsDir,
followSymlinks: true,
}],
});
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
await expect(readFile(path.join(remoteWorkspaceDir, "._README.md"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(remoteWorkspaceDir, ".claude", "settings.json"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(prepared.assetDirs.skills, "skill.md"), "utf8")).resolves.toBe("skill body\n");
expect((await lstat(path.join(prepared.assetDirs.skills, "skill.md"))).isFile()).toBe(true);
await writeFile(path.join(remoteWorkspaceDir, "README.md"), "remote workspace\n", "utf8");
await writeFile(path.join(remoteWorkspaceDir, "remote-only.txt"), "sync back\n", "utf8");
await mkdir(path.join(localWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "{}\n", "utf8");
await writeFile(path.join(localWorkspaceDir, "local-stale.txt"), "remove\n", "utf8");
await prepared.restoreWorkspace();
await expect(readFile(path.join(localWorkspaceDir, "README.md"), "utf8")).resolves.toBe("remote workspace\n");
await expect(readFile(path.join(localWorkspaceDir, "remote-only.txt"), "utf8")).resolves.toBe("sync back\n");
await expect(readFile(path.join(localWorkspaceDir, "local-stale.txt"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(localWorkspaceDir, ".claude", "settings.json"), "utf8")).resolves.toBe("{\"local\":true}\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).resolves.toBe("{}\n");
});
});

View File

@@ -0,0 +1,338 @@
import { execFile as execFileCallback } from "node:child_process";
import { constants as fsConstants, promises as fs } from "node:fs";
import os from "node:os";
import path from "node:path";
import { promisify } from "node:util";
const execFile = promisify(execFileCallback);
export interface SandboxRemoteExecutionSpec {
transport: "sandbox";
provider: string;
sandboxId: string;
remoteCwd: string;
timeoutMs: number;
apiKey: string | null;
paperclipApiUrl?: string | null;
}
export interface SandboxManagedRuntimeAsset {
key: string;
localDir: string;
followSymlinks?: boolean;
exclude?: string[];
}
export interface SandboxManagedRuntimeClient {
makeDir(remotePath: string): Promise<void>;
writeFile(remotePath: string, bytes: ArrayBuffer): Promise<void>;
readFile(remotePath: string): Promise<Buffer | Uint8Array | ArrayBuffer>;
remove(remotePath: string): Promise<void>;
run(command: string, options: { timeoutMs: number }): Promise<void>;
}
export interface PreparedSandboxManagedRuntime {
spec: SandboxRemoteExecutionSpec;
workspaceLocalDir: string;
workspaceRemoteDir: string;
runtimeRootDir: string;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
function asObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function asString(value: unknown): string {
return typeof value === "string" ? value : "";
}
function asNumber(value: unknown): number {
return typeof value === "number" ? value : Number(value);
}
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'\"'\"'`)}'`;
}
export function parseSandboxRemoteExecutionSpec(value: unknown): SandboxRemoteExecutionSpec | null {
const parsed = asObject(value);
const transport = asString(parsed.transport).trim();
const provider = asString(parsed.provider).trim();
const sandboxId = asString(parsed.sandboxId).trim();
const remoteCwd = asString(parsed.remoteCwd).trim();
const timeoutMs = asNumber(parsed.timeoutMs);
if (
transport !== "sandbox" ||
provider.length === 0 ||
sandboxId.length === 0 ||
remoteCwd.length === 0 ||
!Number.isFinite(timeoutMs) ||
timeoutMs <= 0
) {
return null;
}
return {
transport: "sandbox",
provider,
sandboxId,
remoteCwd,
timeoutMs,
apiKey: asString(parsed.apiKey).trim() || null,
paperclipApiUrl: asString(parsed.paperclipApiUrl).trim() || null,
};
}
export function buildSandboxExecutionSessionIdentity(spec: SandboxRemoteExecutionSpec | null) {
if (!spec) return null;
return {
transport: "sandbox",
provider: spec.provider,
sandboxId: spec.sandboxId,
remoteCwd: spec.remoteCwd,
...(spec.paperclipApiUrl ? { paperclipApiUrl: spec.paperclipApiUrl } : {}),
} as const;
}
export function sandboxExecutionSessionMatches(saved: unknown, current: SandboxRemoteExecutionSpec | null): boolean {
const currentIdentity = buildSandboxExecutionSessionIdentity(current);
if (!currentIdentity) return false;
const parsedSaved = asObject(saved);
return (
asString(parsedSaved.transport) === currentIdentity.transport &&
asString(parsedSaved.provider) === currentIdentity.provider &&
asString(parsedSaved.sandboxId) === currentIdentity.sandboxId &&
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd &&
asString(parsedSaved.paperclipApiUrl) === asString(currentIdentity.paperclipApiUrl)
);
}
async function withTempDir<T>(prefix: string, fn: (dir: string) => Promise<T>): Promise<T> {
const dir = await fs.mkdtemp(path.join(os.tmpdir(), prefix));
try {
return await fn(dir);
} finally {
await fs.rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
}
async function execTar(args: string[]): Promise<void> {
await execFile("tar", args, {
env: {
...process.env,
COPYFILE_DISABLE: "1",
},
maxBuffer: 32 * 1024 * 1024,
});
}
async function createTarballFromDirectory(input: {
localDir: string;
archivePath: string;
exclude?: string[];
followSymlinks?: boolean;
}): Promise<void> {
const excludeArgs = ["._*", ...(input.exclude ?? [])].flatMap((entry) => ["--exclude", entry]);
await execTar([
"-c",
...(input.followSymlinks ? ["-h"] : []),
"-f",
input.archivePath,
"-C",
input.localDir,
...excludeArgs,
".",
]);
}
async function extractTarballToDirectory(input: {
archivePath: string;
localDir: string;
}): Promise<void> {
await fs.mkdir(input.localDir, { recursive: true });
await execTar(["-xf", input.archivePath, "-C", input.localDir]);
}
async function walkDirectory(root: string, relative = ""): Promise<string[]> {
const current = path.join(root, relative);
const entries = await fs.readdir(current, { withFileTypes: true }).catch(() => []);
const out: string[] = [];
for (const entry of entries) {
const nextRelative = relative ? path.posix.join(relative, entry.name) : entry.name;
out.push(nextRelative);
if (entry.isDirectory()) {
out.push(...(await walkDirectory(root, nextRelative)));
}
}
return out.sort((left, right) => right.length - left.length);
}
function isRelativePathOrDescendant(relative: string, candidate: string): boolean {
return relative === candidate || relative.startsWith(`${candidate}/`);
}
export async function mirrorDirectory(
sourceDir: string,
targetDir: string,
options: { preserveAbsent?: string[] } = {},
): Promise<void> {
await fs.mkdir(targetDir, { recursive: true });
const preserveAbsent = new Set(options.preserveAbsent ?? []);
const shouldPreserveAbsent = (relative: string) =>
[...preserveAbsent].some((candidate) => isRelativePathOrDescendant(relative, candidate));
const sourceEntries = new Set(await walkDirectory(sourceDir));
const targetEntries = await walkDirectory(targetDir);
for (const relative of targetEntries) {
if (shouldPreserveAbsent(relative)) continue;
if (!sourceEntries.has(relative)) {
await fs.rm(path.join(targetDir, relative), { recursive: true, force: true }).catch(() => undefined);
}
}
const copyEntry = async (relative: string) => {
const sourcePath = path.join(sourceDir, relative);
const targetPath = path.join(targetDir, relative);
const stats = await fs.lstat(sourcePath);
if (stats.isDirectory()) {
await fs.mkdir(targetPath, { recursive: true });
return;
}
await fs.mkdir(path.dirname(targetPath), { recursive: true });
await fs.rm(targetPath, { recursive: true, force: true }).catch(() => undefined);
if (stats.isSymbolicLink()) {
const linkTarget = await fs.readlink(sourcePath);
await fs.symlink(linkTarget, targetPath);
return;
}
await fs.copyFile(sourcePath, targetPath, fsConstants.COPYFILE_FICLONE).catch(async () => {
await fs.copyFile(sourcePath, targetPath);
});
await fs.chmod(targetPath, stats.mode);
};
const entries = (await walkDirectory(sourceDir)).sort((left, right) => left.localeCompare(right));
for (const relative of entries) {
await copyEntry(relative);
}
}
function toArrayBuffer(bytes: Buffer): ArrayBuffer {
return Uint8Array.from(bytes).buffer;
}
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function tarExcludeFlags(exclude: string[] | undefined): string {
return ["._*", ...(exclude ?? [])].map((entry) => `--exclude ${shellQuote(entry)}`).join(" ");
}
export async function prepareSandboxManagedRuntime(input: {
spec: SandboxRemoteExecutionSpec;
adapterKey: string;
client: SandboxManagedRuntimeClient;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: SandboxManagedRuntimeAsset[];
}): Promise<PreparedSandboxManagedRuntime> {
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeRootDir = path.posix.join(workspaceRemoteDir, ".paperclip-runtime", input.adapterKey);
await withTempDir("paperclip-sandbox-sync-", async (tempDir) => {
const workspaceTarPath = path.join(tempDir, "workspace.tar");
await createTarballFromDirectory({
localDir: input.workspaceLocalDir,
archivePath: workspaceTarPath,
exclude: input.workspaceExclude,
});
const workspaceTarBytes = await fs.readFile(workspaceTarPath);
const remoteWorkspaceTar = path.posix.join(runtimeRootDir, "workspace-upload.tar");
await input.client.makeDir(runtimeRootDir);
await input.client.writeFile(remoteWorkspaceTar, toArrayBuffer(workspaceTarBytes));
const preservedNames = new Set([".paperclip-runtime", ...(input.preserveAbsentOnRestore ?? [])]);
const findPreserveArgs = [...preservedNames].map((entry) => `! -name ${shellQuote(entry)}`).join(" ");
await input.client.run(
`sh -lc ${shellQuote(
`mkdir -p ${shellQuote(workspaceRemoteDir)} && ` +
`find ${shellQuote(workspaceRemoteDir)} -mindepth 1 -maxdepth 1 ${findPreserveArgs} -exec rm -rf -- {} + && ` +
`tar -xf ${shellQuote(remoteWorkspaceTar)} -C ${shellQuote(workspaceRemoteDir)} && ` +
`rm -f ${shellQuote(remoteWorkspaceTar)}`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
for (const asset of input.assets ?? []) {
const assetTarPath = path.join(tempDir, `${asset.key}.tar`);
await createTarballFromDirectory({
localDir: asset.localDir,
archivePath: assetTarPath,
followSymlinks: asset.followSymlinks,
exclude: asset.exclude,
});
const assetTarBytes = await fs.readFile(assetTarPath);
const remoteAssetDir = path.posix.join(runtimeRootDir, asset.key);
const remoteAssetTar = path.posix.join(runtimeRootDir, `${asset.key}-upload.tar`);
await input.client.writeFile(remoteAssetTar, toArrayBuffer(assetTarBytes));
await input.client.run(
`sh -lc ${shellQuote(
`rm -rf ${shellQuote(remoteAssetDir)} && ` +
`mkdir -p ${shellQuote(remoteAssetDir)} && ` +
`tar -xf ${shellQuote(remoteAssetTar)} -C ${shellQuote(remoteAssetDir)} && ` +
`rm -f ${shellQuote(remoteAssetTar)}`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
}
});
const assetDirs = Object.fromEntries(
(input.assets ?? []).map((asset) => [asset.key, path.posix.join(runtimeRootDir, asset.key)]),
);
return {
spec: input.spec,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
runtimeRootDir,
assetDirs,
restoreWorkspace: async () => {
await withTempDir("paperclip-sandbox-restore-", async (tempDir) => {
const remoteWorkspaceTar = path.posix.join(runtimeRootDir, "workspace-download.tar");
await input.client.run(
`sh -lc ${shellQuote(
`mkdir -p ${shellQuote(runtimeRootDir)} && ` +
`tar -cf ${shellQuote(remoteWorkspaceTar)} -C ${shellQuote(workspaceRemoteDir)} ` +
`${tarExcludeFlags(input.workspaceExclude)} .`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
const archiveBytes = await input.client.readFile(remoteWorkspaceTar);
await input.client.remove(remoteWorkspaceTar).catch(() => undefined);
const localArchivePath = path.join(tempDir, "workspace.tar");
const extractedDir = path.join(tempDir, "workspace");
await fs.writeFile(localArchivePath, toBuffer(archiveBytes));
await extractTarballToDirectory({
archivePath: localArchivePath,
localDir: extractedDir,
});
await mirrorDirectory(extractedDir, input.workspaceLocalDir, {
preserveAbsent: [".paperclip-runtime", ...(input.preserveAbsentOnRestore ?? [])],
});
});
},
};
}

View File

@@ -4,6 +4,7 @@ import {
appendWithByteCap,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
renderPaperclipWakePrompt,
runningProcesses,
runChildProcess,
stringifyPaperclipWakePayload,
} from "./server-utils.js";
@@ -26,6 +27,17 @@ async function waitForPidExit(pid: number, timeoutMs = 2_000) {
return !isPidAlive(pid);
}
async function waitForTextMatch(read: () => string, pattern: RegExp, timeoutMs = 1_000) {
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
const value = read();
const match = value.match(pattern);
if (match) return match;
await new Promise((resolve) => setTimeout(resolve, 25));
}
return read().match(pattern);
}
describe("runChildProcess", () => {
it("does not arm a timeout when timeoutSec is 0", async () => {
const result = await runChildProcess(
@@ -110,15 +122,131 @@ describe("runChildProcess", () => {
expect(await waitForPidExit(descendantPid!, 2_000)).toBe(true);
});
});
describe("appendWithByteCap", () => {
it("keeps valid UTF-8 when trimming through multibyte text", () => {
const output = appendWithByteCap("prefix ", "hello — world", 7);
it.skipIf(process.platform === "win32")("cleans up a lingering process group after terminal output and child exit", async () => {
const result = await runChildProcess(
randomUUID(),
process.execPath,
[
"-e",
[
"const { spawn } = require('node:child_process');",
"const child = spawn(process.execPath, ['-e', 'setInterval(() => {}, 1000)'], { stdio: ['ignore', 'inherit', 'ignore'] });",
"process.stdout.write(`descendant:${child.pid}\\n`);",
"process.stdout.write(`${JSON.stringify({ type: 'result', result: 'done' })}\\n`);",
"setTimeout(() => process.exit(0), 25);",
].join(" "),
],
{
cwd: process.cwd(),
env: {},
timeoutSec: 0,
graceSec: 1,
onLog: async () => {},
terminalResultCleanup: {
graceMs: 100,
hasTerminalResult: ({ stdout }) => stdout.includes('"type":"result"'),
},
},
);
expect(output).not.toContain("\uFFFD");
expect(Buffer.from(output, "utf8").toString("utf8")).toBe(output);
expect(Buffer.byteLength(output, "utf8")).toBeLessThanOrEqual(7);
const descendantPid = Number.parseInt(result.stdout.match(/descendant:(\d+)/)?.[1] ?? "", 10);
expect(result.timedOut).toBe(false);
expect(result.exitCode).toBe(0);
expect(Number.isInteger(descendantPid) && descendantPid > 0).toBe(true);
expect(await waitForPidExit(descendantPid, 2_000)).toBe(true);
});
it.skipIf(process.platform === "win32")("cleans up a still-running child after terminal output", async () => {
const result = await runChildProcess(
randomUUID(),
process.execPath,
[
"-e",
[
"process.stdout.write(`${JSON.stringify({ type: 'result', result: 'done' })}\\n`);",
"setInterval(() => {}, 1000);",
].join(" "),
],
{
cwd: process.cwd(),
env: {},
timeoutSec: 0,
graceSec: 1,
onLog: async () => {},
terminalResultCleanup: {
graceMs: 100,
hasTerminalResult: ({ stdout }) => stdout.includes('"type":"result"'),
},
},
);
expect(result.timedOut).toBe(false);
expect(result.signal).toBe("SIGTERM");
expect(result.stdout).toContain('"type":"result"');
});
it.skipIf(process.platform === "win32")("does not clean up noisy runs that have no terminal output", async () => {
const runId = randomUUID();
let observed = "";
const resultPromise = runChildProcess(
runId,
process.execPath,
[
"-e",
[
"const { spawn } = require('node:child_process');",
"const child = spawn(process.execPath, ['-e', \"setInterval(() => process.stdout.write('noise\\\\n'), 50)\"], { stdio: ['ignore', 'inherit', 'ignore'] });",
"process.stdout.write(`descendant:${child.pid}\\n`);",
"setTimeout(() => process.exit(0), 25);",
].join(" "),
],
{
cwd: process.cwd(),
env: {},
timeoutSec: 0,
graceSec: 1,
onLog: async (_stream, chunk) => {
observed += chunk;
},
terminalResultCleanup: {
graceMs: 50,
hasTerminalResult: ({ stdout }) => stdout.includes('"type":"result"'),
},
},
);
const pidMatch = await waitForTextMatch(() => observed, /descendant:(\d+)/);
const descendantPid = Number.parseInt(pidMatch?.[1] ?? "", 10);
expect(Number.isInteger(descendantPid) && descendantPid > 0).toBe(true);
const race = await Promise.race([
resultPromise.then(() => "settled" as const),
new Promise<"pending">((resolve) => setTimeout(() => resolve("pending"), 300)),
]);
expect(race).toBe("pending");
expect(isPidAlive(descendantPid)).toBe(true);
const running = runningProcesses.get(runId) as
| { child: { kill(signal: NodeJS.Signals): boolean }; processGroupId: number | null }
| undefined;
try {
if (running?.processGroupId) {
process.kill(-running.processGroupId, "SIGKILL");
} else {
running?.child.kill("SIGKILL");
}
await resultPromise;
} finally {
runningProcesses.delete(runId);
if (isPidAlive(descendantPid)) {
try {
process.kill(descendantPid, "SIGKILL");
} catch {
// Ignore cleanup races.
}
}
}
});
});
@@ -126,8 +254,14 @@ describe("renderPaperclipWakePrompt", () => {
it("keeps the default local-agent prompt action-oriented", () => {
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Start actionable work in this heartbeat");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("do not stop at a plan");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Prefer the smallest verification that proves the change");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Use child issues");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("instead of polling agents, sessions, or processes");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Create child issues directly when you know what needs to be done");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("POST /api/issues/{issueId}/interactions");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("kind suggest_tasks, ask_user_questions, or request_confirmation");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("confirmation:{issueId}:plan:{revisionId}");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Wait for acceptance before creating implementation subtasks");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain(
"Respect budget, pause/cancel, approval gates, and company boundaries",
);
@@ -157,6 +291,70 @@ describe("renderPaperclipWakePrompt", () => {
expect(prompt).toContain("mark blocked work with the unblock owner/action");
});
it("renders dependency-blocked interaction guidance", () => {
const prompt = renderPaperclipWakePrompt({
reason: "issue_commented",
issue: {
id: "issue-1",
identifier: "PAP-1703",
title: "Blocked parent",
status: "todo",
},
dependencyBlockedInteraction: true,
unresolvedBlockerIssueIds: ["blocker-1"],
unresolvedBlockerSummaries: [
{
id: "blocker-1",
identifier: "PAP-1723",
title: "Finish blocker",
status: "todo",
priority: "medium",
},
],
commentWindow: {
requestedCount: 1,
includedCount: 1,
missingCount: 0,
},
commentIds: ["comment-1"],
latestCommentId: "comment-1",
comments: [{ id: "comment-1", body: "hello" }],
fallbackFetchNeeded: false,
});
expect(prompt).toContain("dependency-blocked interaction: yes");
expect(prompt).toContain("respond or triage the human comment");
expect(prompt).toContain("PAP-1723 Finish blocker (todo)");
});
it("renders loose review request instructions for execution handoffs", () => {
const prompt = renderPaperclipWakePrompt({
reason: "execution_review_requested",
issue: {
id: "issue-1",
identifier: "PAP-2011",
title: "Review request handoff",
status: "in_review",
},
executionStage: {
wakeRole: "reviewer",
stageId: "stage-1",
stageType: "review",
currentParticipant: { type: "agent", agentId: "agent-1" },
returnAssignee: { type: "agent", agentId: "agent-2" },
reviewRequest: {
instructions: "Please focus on edge cases and leave a short risk summary.",
},
allowedActions: ["approve", "request_changes"],
},
fallbackFetchNeeded: false,
});
expect(prompt).toContain("Review request instructions:");
expect(prompt).toContain("Please focus on edge cases and leave a short risk summary.");
expect(prompt).toContain("You are waking as the active reviewer for this issue.");
});
it("includes continuation and child issue summaries in structured wake context", () => {
const payload = {
reason: "issue_children_completed",
@@ -226,3 +424,13 @@ describe("renderPaperclipWakePrompt", () => {
expect(prompt).toContain("Added the helper route and tests.");
});
});
describe("appendWithByteCap", () => {
it("keeps valid UTF-8 when trimming through multibyte text", () => {
const output = appendWithByteCap("prefix ", "hello — world", 7);
expect(output).not.toContain("\uFFFD");
expect(Buffer.from(output, "utf8").toString("utf8")).toBe(output);
expect(Buffer.byteLength(output, "utf8")).toBeLessThanOrEqual(7);
});
});

View File

@@ -1,6 +1,7 @@
import { spawn, type ChildProcess } from "node:child_process";
import { constants as fsConstants, promises as fs, type Dirent } from "node:fs";
import path from "node:path";
import { buildSshSpawnTarget, type SshRemoteExecutionSpec } from "./ssh.js";
import type {
AdapterSkillEntry,
AdapterSkillSnapshot,
@@ -16,6 +17,11 @@ export interface RunProcessResult {
startedAt: string | null;
}
export interface TerminalResultCleanupOptions {
hasTerminalResult: (output: { stdout: string; stderr: string }) => boolean;
graceMs?: number;
}
interface RunningProcess {
child: ChildProcess;
graceSec: number;
@@ -25,10 +31,18 @@ interface RunningProcess {
interface SpawnTarget {
command: string;
args: string[];
cwd?: string;
cleanup?: () => Promise<void>;
}
type RemoteExecutionSpec = SshRemoteExecutionSpec;
type ChildProcessWithEvents = ChildProcess & {
on(event: "error", listener: (err: Error) => void): ChildProcess;
on(
event: "exit",
listener: (code: number | null, signal: NodeJS.Signals | null) => void,
): ChildProcess;
on(
event: "close",
listener: (code: number | null, signal: NodeJS.Signals | null) => void,
@@ -60,6 +74,7 @@ function signalRunningProcess(
export const runningProcesses = new Map<string, RunningProcess>();
export const MAX_CAPTURE_BYTES = 4 * 1024 * 1024;
export const MAX_EXCERPT_BYTES = 32 * 1024;
const TERMINAL_RESULT_SCAN_OVERLAP_CHARS = 64 * 1024;
const SENSITIVE_ENV_KEY = /(key|token|secret|password|passwd|authorization|cookie)/i;
const PAPERCLIP_SKILL_ROOT_RELATIVE_CANDIDATES = [
"../../skills",
@@ -72,7 +87,13 @@ export const DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE = [
"Execution contract:",
"- Start actionable work in this heartbeat; do not stop at a plan unless the issue asks for planning.",
"- Leave durable progress in comments, documents, or work products with a clear next action.",
"- Prefer the smallest verification that proves the change; do not default to full workspace typecheck/build/test on every heartbeat unless the task scope warrants it.",
"- Use child issues for parallel or long delegated work instead of polling agents, sessions, or processes.",
"- If woken by a human comment on a dependency-blocked issue, respond or triage the comment without treating the blocked deliverable work as unblocked.",
"- Create child issues directly when you know what needs to be done; use issue-thread interactions when the board/user must choose suggested tasks, answer structured questions, or confirm a proposal.",
"- To ask for that input, create an interaction on the current issue with POST /api/issues/{issueId}/interactions using kind suggest_tasks, ask_user_questions, or request_confirmation. Use continuationPolicy wake_assignee when you need to resume after a response; for request_confirmation this resumes only after acceptance.",
"- When you intentionally restart follow-up work on a completed assigned issue, include structured `resume: true` with the POST /api/issues/{issueId}/comments or PATCH /api/issues/{issueId} comment payload. Generic agent comments on closed issues are inert by default.",
"- For plan approval, update the plan document first, then create request_confirmation targeting the latest plan revision with idempotencyKey confirmation:{issueId}:plan:{revisionId}. Wait for acceptance before creating implementation subtasks, and create a fresh confirmation after superseding board/user comments if approval is still needed.",
"- If blocked, mark the issue blocked and name the unblock owner and action.",
"- Respect budget, pause/cancel, approval gates, and company boundaries.",
].join("\n");
@@ -263,6 +284,9 @@ type PaperclipWakeExecutionStage = {
stageType: string | null;
currentParticipant: PaperclipWakeExecutionPrincipal | null;
returnAssignee: PaperclipWakeExecutionPrincipal | null;
reviewRequest: {
instructions: string;
} | null;
lastDecisionOutcome: string | null;
allowedActions: string[];
};
@@ -303,10 +327,30 @@ type PaperclipWakeChildIssueSummary = {
summary: string | null;
};
type PaperclipWakeBlockerSummary = {
id: string | null;
identifier: string | null;
title: string | null;
status: string | null;
priority: string | null;
};
type PaperclipWakeTreeHoldSummary = {
holdId: string | null;
rootIssueId: string | null;
mode: string | null;
reason: string | null;
};
type PaperclipWakePayload = {
reason: string | null;
issue: PaperclipWakeIssue | null;
checkedOutByHarness: boolean;
dependencyBlockedInteraction: boolean;
treeHoldInteraction: boolean;
activeTreeHold: PaperclipWakeTreeHoldSummary | null;
unresolvedBlockerIssueIds: string[];
unresolvedBlockerSummaries: PaperclipWakeBlockerSummary[];
executionStage: PaperclipWakeExecutionStage | null;
continuationSummary: PaperclipWakeContinuationSummary | null;
livenessContinuation: PaperclipWakeLivenessContinuation | null;
@@ -399,6 +443,27 @@ function normalizePaperclipWakeChildIssueSummary(value: unknown): PaperclipWakeC
return { id, identifier, title, status, priority, summary };
}
function normalizePaperclipWakeBlockerSummary(value: unknown): PaperclipWakeBlockerSummary | null {
const blocker = parseObject(value);
const id = asString(blocker.id, "").trim() || null;
const identifier = asString(blocker.identifier, "").trim() || null;
const title = asString(blocker.title, "").trim() || null;
const status = asString(blocker.status, "").trim() || null;
const priority = asString(blocker.priority, "").trim() || null;
if (!id && !identifier && !title && !status) return null;
return { id, identifier, title, status, priority };
}
function normalizePaperclipWakeTreeHoldSummary(value: unknown): PaperclipWakeTreeHoldSummary | null {
const hold = parseObject(value);
const holdId = asString(hold.holdId, "").trim() || null;
const rootIssueId = asString(hold.rootIssueId, "").trim() || null;
const mode = asString(hold.mode, "").trim() || null;
const reason = asString(hold.reason, "").trim() || null;
if (!holdId && !rootIssueId && !mode && !reason) return null;
return { holdId, rootIssueId, mode, reason };
}
function normalizePaperclipWakeExecutionPrincipal(value: unknown): PaperclipWakeExecutionPrincipal | null {
const principal = parseObject(value);
const typeRaw = asString(principal.type, "").trim().toLowerCase();
@@ -424,11 +489,14 @@ function normalizePaperclipWakeExecutionStage(value: unknown): PaperclipWakeExec
: [];
const currentParticipant = normalizePaperclipWakeExecutionPrincipal(stage.currentParticipant);
const returnAssignee = normalizePaperclipWakeExecutionPrincipal(stage.returnAssignee);
const reviewRequestRaw = parseObject(stage.reviewRequest);
const reviewInstructions = asString(reviewRequestRaw.instructions, "").trim();
const reviewRequest = reviewInstructions ? { instructions: reviewInstructions } : null;
const stageId = asString(stage.stageId, "").trim() || null;
const stageType = asString(stage.stageType, "").trim() || null;
const lastDecisionOutcome = asString(stage.lastDecisionOutcome, "").trim() || null;
if (!wakeRole && !stageId && !stageType && !currentParticipant && !returnAssignee && !lastDecisionOutcome && allowedActions.length === 0) {
if (!wakeRole && !stageId && !stageType && !currentParticipant && !returnAssignee && !reviewRequest && !lastDecisionOutcome && allowedActions.length === 0) {
return null;
}
@@ -438,6 +506,7 @@ function normalizePaperclipWakeExecutionStage(value: unknown): PaperclipWakeExec
stageType,
currentParticipant,
returnAssignee,
reviewRequest,
lastDecisionOutcome,
allowedActions,
};
@@ -464,8 +533,19 @@ export function normalizePaperclipWakePayload(value: unknown): PaperclipWakePayl
.map((entry) => normalizePaperclipWakeChildIssueSummary(entry))
.filter((entry): entry is PaperclipWakeChildIssueSummary => Boolean(entry))
: [];
const unresolvedBlockerIssueIds = Array.isArray(payload.unresolvedBlockerIssueIds)
? payload.unresolvedBlockerIssueIds
.map((entry) => asString(entry, "").trim())
.filter(Boolean)
: [];
const unresolvedBlockerSummaries = Array.isArray(payload.unresolvedBlockerSummaries)
? payload.unresolvedBlockerSummaries
.map((entry) => normalizePaperclipWakeBlockerSummary(entry))
.filter((entry): entry is PaperclipWakeBlockerSummary => Boolean(entry))
: [];
if (comments.length === 0 && commentIds.length === 0 && childIssueSummaries.length === 0 && !executionStage && !continuationSummary && !livenessContinuation && !normalizePaperclipWakeIssue(payload.issue)) {
const activeTreeHold = normalizePaperclipWakeTreeHoldSummary(payload.activeTreeHold);
if (comments.length === 0 && commentIds.length === 0 && childIssueSummaries.length === 0 && unresolvedBlockerIssueIds.length === 0 && unresolvedBlockerSummaries.length === 0 && !activeTreeHold && !executionStage && !continuationSummary && !livenessContinuation && !normalizePaperclipWakeIssue(payload.issue)) {
return null;
}
@@ -473,6 +553,11 @@ export function normalizePaperclipWakePayload(value: unknown): PaperclipWakePayl
reason: asString(payload.reason, "").trim() || null,
issue: normalizePaperclipWakeIssue(payload.issue),
checkedOutByHarness: asBoolean(payload.checkedOutByHarness, false),
dependencyBlockedInteraction: asBoolean(payload.dependencyBlockedInteraction, false),
treeHoldInteraction: asBoolean(payload.treeHoldInteraction, false),
activeTreeHold,
unresolvedBlockerIssueIds,
unresolvedBlockerSummaries,
executionStage,
continuationSummary,
livenessContinuation,
@@ -553,6 +638,26 @@ export function renderPaperclipWakePrompt(
if (normalized.checkedOutByHarness) {
lines.push("- checkout: already claimed by the harness for this run");
}
if (normalized.dependencyBlockedInteraction) {
lines.push("- dependency-blocked interaction: yes");
lines.push("- execution scope: respond or triage the human comment; do not treat blocker-dependent deliverable work as unblocked");
if (normalized.unresolvedBlockerSummaries.length > 0) {
const blockers = normalized.unresolvedBlockerSummaries
.map((blocker) => `${blocker.identifier ?? blocker.id ?? "unknown"}${blocker.title ? ` ${blocker.title}` : ""}${blocker.status ? ` (${blocker.status})` : ""}`)
.join("; ");
lines.push(`- unresolved blockers: ${blockers}`);
} else if (normalized.unresolvedBlockerIssueIds.length > 0) {
lines.push(`- unresolved blocker issue ids: ${normalized.unresolvedBlockerIssueIds.join(", ")}`);
}
}
if (normalized.treeHoldInteraction) {
lines.push("- tree-hold interaction: yes");
lines.push("- execution scope: respond or triage the human comment; the subtree remains paused until an explicit resume action");
if (normalized.activeTreeHold) {
const hold = normalized.activeTreeHold;
lines.push(`- active tree hold: ${hold.holdId ?? "unknown"}${hold.rootIssueId ? ` rooted at ${hold.rootIssueId}` : ""}${hold.mode ? ` (${hold.mode})` : ""}`);
}
}
if (normalized.missingCount > 0) {
lines.push(`- omitted comments: ${normalized.missingCount}`);
}
@@ -568,6 +673,13 @@ export function renderPaperclipWakePrompt(
if (executionStage.allowedActions.length > 0) {
lines.push(`- allowed actions: ${executionStage.allowedActions.join(", ")}`);
}
if (executionStage.reviewRequest) {
lines.push(
"",
"Review request instructions:",
executionStage.reviewRequest.instructions,
);
}
lines.push("");
if (executionStage.wakeRole === "reviewer" || executionStage.wakeRole === "approver") {
lines.push(
@@ -715,11 +827,26 @@ export function buildPaperclipEnv(agent: { id: string; companyId: string }): Rec
process.env.PAPERCLIP_LISTEN_HOST ?? process.env.HOST ?? "localhost",
);
const runtimePort = process.env.PAPERCLIP_LISTEN_PORT ?? process.env.PORT ?? "3100";
const apiUrl = process.env.PAPERCLIP_API_URL ?? `http://${runtimeHost}:${runtimePort}`;
const apiUrl =
process.env.PAPERCLIP_RUNTIME_API_URL ??
process.env.PAPERCLIP_API_URL ??
`http://${runtimeHost}:${runtimePort}`;
vars.PAPERCLIP_API_URL = apiUrl;
return vars;
}
export function sanitizeInheritedPaperclipEnv(baseEnv: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
const env: NodeJS.ProcessEnv = { ...baseEnv };
for (const key of Object.keys(env)) {
if (!key.startsWith("PAPERCLIP_")) continue;
if (key === "PAPERCLIP_RUNTIME_API_URL") continue;
if (key === "PAPERCLIP_LISTEN_HOST") continue;
if (key === "PAPERCLIP_LISTEN_PORT") continue;
delete env[key];
}
return env;
}
export function defaultPathForPlatform() {
if (process.platform === "win32") {
return "C:\\Windows\\System32;C:\\Windows;C:\\Windows\\System32\\Wbem";
@@ -768,7 +895,18 @@ async function resolveCommandPath(command: string, cwd: string, env: NodeJS.Proc
return null;
}
export async function resolveCommandForLogs(command: string, cwd: string, env: NodeJS.ProcessEnv): Promise<string> {
export async function resolveCommandForLogs(
command: string,
cwd: string,
env: NodeJS.ProcessEnv,
options: {
remoteExecution?: RemoteExecutionSpec | null;
} = {},
): Promise<string> {
const remote = options.remoteExecution ?? null;
if (remote) {
return `ssh://${remote.username}@${remote.host}:${remote.port}/${remote.remoteCwd} :: ${command}`;
}
return (await resolveCommandPath(command, cwd, env)) ?? command;
}
@@ -788,7 +926,33 @@ async function resolveSpawnTarget(
args: string[],
cwd: string,
env: NodeJS.ProcessEnv,
options: {
remoteExecution?: RemoteExecutionSpec | null;
remoteEnv?: Record<string, string> | null;
} = {},
): Promise<SpawnTarget> {
const remote = options.remoteExecution ?? null;
if (remote) {
const sshResolved = await resolveCommandPath("ssh", process.cwd(), env);
if (!sshResolved) {
throw new Error('Command not found in PATH: "ssh"');
}
const spawnTarget = await buildSshSpawnTarget({
spec: remote,
command,
args,
env: Object.fromEntries(
Object.entries(options.remoteEnv ?? {}).filter((entry): entry is [string, string] => typeof entry[1] === "string"),
),
});
return {
command: sshResolved,
args: spawnTarget.args,
cwd: process.cwd(),
cleanup: spawnTarget.cleanup,
};
}
const resolved = await resolveCommandPath(command, cwd, env);
const executable = resolved ?? command;
@@ -1215,7 +1379,19 @@ export async function removeMaintainerOnlySkillSymlinks(
}
}
export async function ensureCommandResolvable(command: string, cwd: string, env: NodeJS.ProcessEnv) {
export async function ensureCommandResolvable(
command: string,
cwd: string,
env: NodeJS.ProcessEnv,
options: {
remoteExecution?: RemoteExecutionSpec | null;
} = {},
) {
if (options.remoteExecution) {
const resolvedSsh = await resolveCommandPath("ssh", process.cwd(), env);
if (resolvedSsh) return;
throw new Error('Command not found in PATH: "ssh"');
}
const resolved = await resolveCommandPath(command, cwd, env);
if (resolved) return;
if (command.includes("/") || command.includes("\\")) {
@@ -1237,13 +1413,17 @@ export async function runChildProcess(
onLog: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onLogError?: (err: unknown, runId: string, message: string) => void;
onSpawn?: (meta: { pid: number; processGroupId: number | null; startedAt: string }) => Promise<void>;
terminalResultCleanup?: TerminalResultCleanupOptions;
stdin?: string;
remoteExecution?: RemoteExecutionSpec | null;
},
): Promise<RunProcessResult> {
const onLogError = opts.onLogError ?? ((err, id, msg) => console.warn({ err, runId: id }, msg));
return new Promise<RunProcessResult>((resolve, reject) => {
const rawMerged: NodeJS.ProcessEnv = { ...process.env, ...opts.env };
const rawMerged: NodeJS.ProcessEnv = {
...sanitizeInheritedPaperclipEnv(process.env),
...opts.env,
};
// Strip Claude Code nesting-guard env vars so spawned `claude` processes
// don't refuse to start with "cannot be launched inside another session".
@@ -1261,10 +1441,13 @@ export async function runChildProcess(
}
const mergedEnv = ensurePathInEnv(rawMerged);
void resolveSpawnTarget(command, args, opts.cwd, mergedEnv)
void resolveSpawnTarget(command, args, opts.cwd, mergedEnv, {
remoteExecution: opts.remoteExecution ?? null,
remoteEnv: opts.remoteExecution ? opts.env : null,
})
.then((target) => {
const child = spawn(target.command, target.args, {
cwd: opts.cwd,
cwd: target.cwd ?? opts.cwd,
env: mergedEnv,
detached: process.platform !== "win32",
shell: false,
@@ -1286,11 +1469,60 @@ export async function runChildProcess(
let stdout = "";
let stderr = "";
let logChain: Promise<void> = Promise.resolve();
let terminalResultSeen = false;
let terminalCleanupStarted = false;
let terminalCleanupTimer: NodeJS.Timeout | null = null;
let terminalCleanupKillTimer: NodeJS.Timeout | null = null;
let terminalResultStdoutScanOffset = 0;
let terminalResultStderrScanOffset = 0;
const clearTerminalCleanupTimers = () => {
if (terminalCleanupTimer) clearTimeout(terminalCleanupTimer);
if (terminalCleanupKillTimer) clearTimeout(terminalCleanupKillTimer);
terminalCleanupTimer = null;
terminalCleanupKillTimer = null;
};
const maybeArmTerminalResultCleanup = () => {
const terminalCleanup = opts.terminalResultCleanup;
if (!terminalCleanup || terminalCleanupStarted || timedOut) return;
if (!terminalResultSeen) {
const stdoutStart = Math.max(0, terminalResultStdoutScanOffset - TERMINAL_RESULT_SCAN_OVERLAP_CHARS);
const stderrStart = Math.max(0, terminalResultStderrScanOffset - TERMINAL_RESULT_SCAN_OVERLAP_CHARS);
const scanOutput = {
stdout: stdout.slice(stdoutStart),
stderr: stderr.slice(stderrStart),
};
terminalResultStdoutScanOffset = stdout.length;
terminalResultStderrScanOffset = stderr.length;
if (scanOutput.stdout.length === 0 && scanOutput.stderr.length === 0) return;
try {
terminalResultSeen = terminalCleanup.hasTerminalResult(scanOutput);
} catch (err) {
onLogError(err, runId, "failed to inspect terminal adapter output");
}
}
if (!terminalResultSeen) return;
if (terminalCleanupTimer) return;
const graceMs = Math.max(0, terminalCleanup.graceMs ?? 5_000);
terminalCleanupTimer = setTimeout(() => {
terminalCleanupTimer = null;
if (terminalCleanupStarted || timedOut) return;
terminalCleanupStarted = true;
signalRunningProcess({ child, processGroupId }, "SIGTERM");
terminalCleanupKillTimer = setTimeout(() => {
terminalCleanupKillTimer = null;
signalRunningProcess({ child, processGroupId }, "SIGKILL");
}, Math.max(1, opts.graceSec) * 1000);
}, graceMs);
};
const timeout =
opts.timeoutSec > 0
? setTimeout(() => {
timedOut = true;
clearTerminalCleanupTimers();
signalRunningProcess({ child, processGroupId }, "SIGTERM");
setTimeout(() => {
signalRunningProcess({ child, processGroupId }, "SIGKILL");
@@ -1304,10 +1536,14 @@ export async function runChildProcess(
readable.pause();
const text = String(chunk);
stdout = appendWithCap(stdout, text);
maybeArmTerminalResultCleanup();
logChain = logChain
.then(() => opts.onLog("stdout", text))
.catch((err) => onLogError(err, runId, "failed to append stdout log chunk"))
.finally(() => resumeReadable(readable));
.finally(() => {
maybeArmTerminalResultCleanup();
resumeReadable(readable);
});
});
child.stderr?.on("data", (chunk: unknown) => {
@@ -1316,10 +1552,14 @@ export async function runChildProcess(
readable.pause();
const text = String(chunk);
stderr = appendWithCap(stderr, text);
maybeArmTerminalResultCleanup();
logChain = logChain
.then(() => opts.onLog("stderr", text))
.catch((err) => onLogError(err, runId, "failed to append stderr log chunk"))
.finally(() => resumeReadable(readable));
.finally(() => {
maybeArmTerminalResultCleanup();
resumeReadable(readable);
});
});
const stdin = child.stdin;
@@ -1333,7 +1573,9 @@ export async function runChildProcess(
child.on("error", (err: Error) => {
if (timeout) clearTimeout(timeout);
clearTerminalCleanupTimers();
runningProcesses.delete(runId);
void target.cleanup?.();
const errno = (err as NodeJS.ErrnoException).code;
const pathValue = mergedEnv.PATH ?? mergedEnv.Path ?? "";
const msg =
@@ -1343,19 +1585,28 @@ export async function runChildProcess(
reject(new Error(msg));
});
child.on("exit", () => {
maybeArmTerminalResultCleanup();
});
child.on("close", (code: number | null, signal: NodeJS.Signals | null) => {
if (timeout) clearTimeout(timeout);
clearTerminalCleanupTimers();
runningProcesses.delete(runId);
void logChain.finally(() => {
resolve({
exitCode: code,
signal,
timedOut,
stdout,
stderr,
pid: child.pid ?? null,
startedAt,
});
void Promise.resolve()
.then(() => target.cleanup?.())
.finally(() => {
resolve({
exitCode: code,
signal,
timedOut,
stdout,
stderr,
pid: child.pid ?? null,
startedAt,
});
});
});
});
})

View File

@@ -0,0 +1,275 @@
import { execFile } from "node:child_process";
import { mkdir, mkdtemp, rm, symlink, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it } from "vitest";
import {
buildSshSpawnTarget,
buildSshEnvLabFixtureConfig,
getSshEnvLabSupport,
prepareWorkspaceForSshExecution,
readSshEnvLabFixtureStatus,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
startSshEnvLabFixture,
stopSshEnvLabFixture,
} from "./ssh.js";
async function git(cwd: string, args: string[]): Promise<string> {
return await new Promise((resolve, reject) => {
execFile("git", ["-C", cwd, ...args], (error, stdout, stderr) => {
if (error) {
reject(new Error((stderr || stdout || error.message).trim()));
return;
}
resolve(stdout.trim());
});
});
}
describe("ssh env-lab fixture", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("starts an isolated sshd fixture and executes commands through it", async () => {
const support = await getSshEnvLabSupport();
if (!support.supported) {
console.warn(
`Skipping SSH env-lab fixture test: ${support.reason ?? "unsupported environment"}`,
);
return;
}
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-ssh-fixture-"));
cleanupDirs.push(rootDir);
const statePath = path.join(rootDir, "state.json");
const started = await startSshEnvLabFixture({ statePath });
const config = await buildSshEnvLabFixtureConfig(started);
const quotedWorkspace = JSON.stringify(started.workspaceDir);
const result = await runSshCommand(
config,
`sh -lc 'cd ${quotedWorkspace} && pwd'`,
);
expect(result.stdout.trim()).toBe(started.workspaceDir);
const status = await readSshEnvLabFixtureStatus(statePath);
expect(status.running).toBe(true);
await stopSshEnvLabFixture(statePath);
const stopped = await readSshEnvLabFixtureStatus(statePath);
expect(stopped.running).toBe(false);
});
it("does not treat an unrelated reused pid as the running fixture", async () => {
const support = await getSshEnvLabSupport();
if (!support.supported) {
console.warn(
`Skipping SSH env-lab fixture test: ${support.reason ?? "unsupported environment"}`,
);
return;
}
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-ssh-fixture-"));
cleanupDirs.push(rootDir);
const statePath = path.join(rootDir, "state.json");
const started = await startSshEnvLabFixture({ statePath });
await stopSshEnvLabFixture(statePath);
await mkdir(path.dirname(statePath), { recursive: true });
await writeFile(
statePath,
JSON.stringify({ ...started, pid: process.pid }, null, 2),
{ mode: 0o600 },
);
const staleStatus = await readSshEnvLabFixtureStatus(statePath);
expect(staleStatus.running).toBe(false);
const restarted = await startSshEnvLabFixture({ statePath });
expect(restarted.pid).not.toBe(process.pid);
await stopSshEnvLabFixture(statePath);
});
it("rejects invalid environment variable keys when constructing SSH spawn targets", async () => {
await expect(
buildSshSpawnTarget({
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
command: "env",
args: [],
env: {
"BAD KEY": "value",
},
}),
).rejects.toThrow("Invalid SSH environment variable key: BAD KEY");
});
it("syncs a local directory into the remote fixture workspace", async () => {
const support = await getSshEnvLabSupport();
if (!support.supported) {
console.warn(
`Skipping SSH env-lab fixture test: ${support.reason ?? "unsupported environment"}`,
);
return;
}
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-ssh-fixture-"));
cleanupDirs.push(rootDir);
const statePath = path.join(rootDir, "state.json");
const localDir = path.join(rootDir, "local-overlay");
await mkdir(localDir, { recursive: true });
await writeFile(path.join(localDir, "message.txt"), "hello from paperclip\n", "utf8");
await writeFile(path.join(localDir, "._message.txt"), "should never sync\n", "utf8");
const started = await startSshEnvLabFixture({ statePath });
const config = await buildSshEnvLabFixtureConfig(started);
const remoteDir = path.posix.join(started.workspaceDir, "overlay");
await syncDirectoryToSsh({
spec: {
...config,
remoteCwd: started.workspaceDir,
},
localDir,
remoteDir,
});
const result = await runSshCommand(
config,
`sh -lc 'cat ${JSON.stringify(path.posix.join(remoteDir, "message.txt"))} && if [ -e ${JSON.stringify(path.posix.join(remoteDir, "._message.txt"))} ]; then echo appledouble-present; fi'`,
);
expect(result.stdout).toContain("hello from paperclip");
expect(result.stdout).not.toContain("appledouble-present");
});
it("can dereference local symlinks while syncing to the remote fixture", async () => {
const support = await getSshEnvLabSupport();
if (!support.supported) {
console.warn(
`Skipping SSH symlink sync test: ${support.reason ?? "unsupported environment"}`,
);
return;
}
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-ssh-fixture-"));
cleanupDirs.push(rootDir);
const statePath = path.join(rootDir, "state.json");
const sourceDir = path.join(rootDir, "source");
const localDir = path.join(rootDir, "local-overlay");
await mkdir(sourceDir, { recursive: true });
await mkdir(localDir, { recursive: true });
await writeFile(path.join(sourceDir, "auth.json"), "{\"token\":\"secret\"}\n", "utf8");
await symlink(path.join(sourceDir, "auth.json"), path.join(localDir, "auth.json"));
const started = await startSshEnvLabFixture({ statePath });
const config = await buildSshEnvLabFixtureConfig(started);
const remoteDir = path.posix.join(started.workspaceDir, "overlay-follow-links");
await syncDirectoryToSsh({
spec: {
...config,
remoteCwd: started.workspaceDir,
},
localDir,
remoteDir,
followSymlinks: true,
});
const result = await runSshCommand(
config,
`sh -lc 'if [ -L ${JSON.stringify(path.posix.join(remoteDir, "auth.json"))} ]; then echo symlink; else echo regular; fi && cat ${JSON.stringify(path.posix.join(remoteDir, "auth.json"))}'`,
);
expect(result.stdout).toContain("regular");
expect(result.stdout).toContain("{\"token\":\"secret\"}");
});
it("round-trips a git workspace through the SSH fixture", async () => {
const support = await getSshEnvLabSupport();
if (!support.supported) {
console.warn(
`Skipping SSH workspace round-trip test: ${support.reason ?? "unsupported environment"}`,
);
return;
}
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-ssh-fixture-"));
cleanupDirs.push(rootDir);
const statePath = path.join(rootDir, "state.json");
const localRepo = path.join(rootDir, "local-workspace");
await mkdir(localRepo, { recursive: true });
await git(localRepo, ["init", "-b", "main"]);
await git(localRepo, ["config", "user.name", "Paperclip Test"]);
await git(localRepo, ["config", "user.email", "test@paperclip.dev"]);
await writeFile(path.join(localRepo, "tracked.txt"), "base\n", "utf8");
await writeFile(path.join(localRepo, "._tracked.txt"), "should stay local only\n", "utf8");
await git(localRepo, ["add", "tracked.txt"]);
await git(localRepo, ["commit", "-m", "initial"]);
const originalHead = await git(localRepo, ["rev-parse", "HEAD"]);
await writeFile(path.join(localRepo, "tracked.txt"), "dirty local\n", "utf8");
await writeFile(path.join(localRepo, "untracked.txt"), "from local\n", "utf8");
const started = await startSshEnvLabFixture({ statePath });
const config = await buildSshEnvLabFixtureConfig(started);
const spec = {
...config,
remoteCwd: started.workspaceDir,
} as const;
await prepareWorkspaceForSshExecution({
spec,
localDir: localRepo,
remoteDir: started.workspaceDir,
});
const remoteStatus = await runSshCommand(
config,
`sh -lc 'cd ${JSON.stringify(started.workspaceDir)} && git status --short'`,
);
expect(remoteStatus.stdout).toContain("M tracked.txt");
expect(remoteStatus.stdout).toContain("?? untracked.txt");
expect(remoteStatus.stdout).not.toContain("._tracked.txt");
await runSshCommand(
config,
`sh -lc 'cd ${JSON.stringify(started.workspaceDir)} && git config user.name "Paperclip SSH" && git config user.email "ssh@paperclip.dev" && git add tracked.txt untracked.txt && git commit -m "remote update" >/dev/null && printf "remote dirty\\n" > tracked.txt && printf "remote extra\\n" > remote-only.txt'`,
{ timeoutMs: 30_000, maxBuffer: 256 * 1024 },
);
await restoreWorkspaceFromSshExecution({
spec,
localDir: localRepo,
remoteDir: started.workspaceDir,
});
const restoredHead = await git(localRepo, ["rev-parse", "HEAD"]);
expect(restoredHead).not.toBe(originalHead);
expect(await git(localRepo, ["log", "-1", "--pretty=%s"])).toBe("remote update");
expect(await git(localRepo, ["status", "--short"])).toContain("M tracked.txt");
expect(await git(localRepo, ["status", "--short"])).not.toContain("._tracked.txt");
});
});

File diff suppressed because it is too large Load Diff

View File

@@ -2,6 +2,9 @@
// Minimal adapter-facing interfaces (no drizzle dependency)
// ---------------------------------------------------------------------------
import type { SshRemoteExecutionSpec } from "./ssh.js";
import type { AdapterExecutionTarget } from "./execution-target.js";
export interface AdapterAgent {
id: string;
companyId: string;
@@ -61,12 +64,16 @@ export interface AdapterRuntimeServiceReport {
healthStatus?: "unknown" | "healthy" | "unhealthy";
}
export type AdapterExecutionErrorFamily = "transient_upstream";
export interface AdapterExecutionResult {
exitCode: number | null;
signal: string | null;
timedOut: boolean;
errorMessage?: string | null;
errorCode?: string | null;
errorFamily?: AdapterExecutionErrorFamily | null;
retryNotBefore?: string | null;
errorMeta?: Record<string, unknown>;
usage?: UsageSummary;
/**
@@ -118,6 +125,14 @@ export interface AdapterExecutionContext {
runtime: AdapterRuntime;
config: Record<string, unknown>;
context: Record<string, unknown>;
executionTarget?: AdapterExecutionTarget | null;
/**
* Legacy remote transport view. Prefer `executionTarget`, which is the
* provider-neutral contract produced by core runtime code.
*/
executionTransport?: {
remoteExecution?: Record<string, unknown> | null;
};
onLog: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onMeta?: (meta: AdapterInvocationMeta) => Promise<void>;
onSpawn?: (meta: { pid: number; processGroupId: number | null; startedAt: string }) => Promise<void>;
@@ -300,6 +315,13 @@ export interface ServerAdapterModule {
supportsLocalAgentJwt?: boolean;
models?: AdapterModel[];
listModels?: () => Promise<AdapterModel[]>;
/**
* Optional explicit refresh hook for model discovery.
* Use this when the adapter caches discovered models and needs a bypass path
* so the UI can fetch newly released models without waiting for cache expiry
* or a Paperclip code update.
*/
refreshModels?: () => Promise<AdapterModel[]>;
agentConfigurationDoc?: string;
/**
* Optional lifecycle hook when an agent is approved/hired (join-request or hire_agent approval).
@@ -417,6 +439,7 @@ export interface CreateConfigValues {
workspaceBranchTemplate?: string;
worktreeParentDir?: string;
runtimeServicesJson?: string;
defaultEnvironmentId?: string;
maxTurnsPerRun: number;
heartbeatEnabled: boolean;
intervalSec: number;

View File

@@ -0,0 +1,262 @@
import { mkdir, mkdtemp, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: [
JSON.stringify({ type: "system", subtype: "init", session_id: "claude-session-1", model: "claude-sonnet" }),
JSON.stringify({ type: "assistant", session_id: "claude-session-1", message: { content: [{ type: "text", text: "hello" }] } }),
JSON.stringify({ type: "result", session_id: "claude-session-1", result: "hello", usage: { input_tokens: 1, cache_read_input_tokens: 0, output_tokens: 1 } }),
].join("\n"),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: claude"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("claude remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs Claude runtime assets, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-claude-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
const instructionsPath = path.join(rootDir, "instructions.md");
await mkdir(workspaceDir, { recursive: true });
await writeFile(instructionsPath, "Use the remote workspace.\n", "utf8");
await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Claude Coder",
adapterType: "claude_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "claude",
instructionsFilePath: instructionsPath,
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
paperclipApiUrl: "http://198.51.100.10:3102",
},
},
onLog: async () => {},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledWith(expect.objectContaining({
localDir: workspaceDir,
remoteDir: "/remote/workspace",
}));
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/claude/skills",
followSymlinks: true,
}));
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toContain("--append-system-prompt-file");
expect(call?.[2]).toContain("/remote/workspace/.paperclip-runtime/claude/skills/agent-instructions.md");
expect(call?.[2]).toContain("--add-dir");
expect(call?.[2]).toContain("/remote/workspace/.paperclip-runtime/claude/skills");
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://198.51.100.10:3102");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledWith(expect.objectContaining({
localDir: workspaceDir,
remoteDir: "/remote/workspace",
}));
});
it("does not resume saved Claude sessions for remote SSH execution without a matching remote identity", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-claude-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-no-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Claude Coder",
adapterType: "claude_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "claude",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).not.toContain("--resume");
});
it("resumes saved Claude sessions for remote SSH execution when the remote identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-claude-remote-resume-match-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Claude Coder",
adapterType: "claude_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "claude",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--resume");
expect(call?.[2]).toContain("session-123");
});
});

View File

@@ -3,6 +3,20 @@ import path from "node:path";
import { fileURLToPath } from "node:url";
import type { AdapterExecutionContext, AdapterExecutionResult } from "@paperclipai/adapter-utils";
import type { RunProcessResult } from "@paperclipai/adapter-utils/server-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
@@ -15,20 +29,19 @@ import {
joinPromptSections,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
resolveCommandForLogs,
renderTemplate,
renderPaperclipWakePrompt,
stringifyPaperclipWakePayload,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
parseClaudeStreamJson,
describeClaudeFailure,
detectClaudeLoginRequired,
extractClaudeRetryNotBefore,
isClaudeMaxTurnsResult,
isClaudeTransientUpstreamError,
isClaudeUnknownSessionError,
} from "./parse.js";
import { resolveClaudeDesiredSkillNames } from "./skills.js";
@@ -42,6 +55,7 @@ interface ClaudeExecutionInput {
agent: AdapterExecutionContext["agent"];
config: Record<string, unknown>;
context: Record<string, unknown>;
executionTarget?: ReturnType<typeof readAdapterExecutionTarget>;
authToken?: string;
}
@@ -92,7 +106,7 @@ function resolveClaudeBillingType(env: Record<string, string>): "api" | "subscri
}
async function buildClaudeRuntimeConfig(input: ClaudeExecutionInput): Promise<ClaudeRuntimeConfig> {
const { runId, agent, config, context, authToken } = input;
const { runId, agent, config, context, executionTarget, authToken } = input;
const command = asString(config.command, "claude");
const workspaceContext = parseObject(context.paperclipWorkspace);
@@ -218,6 +232,10 @@ async function buildClaudeRuntimeConfig(input: ClaudeExecutionInput): Promise<Cl
if (runtimePrimaryUrl) {
env.PAPERCLIP_RUNTIME_PRIMARY_URL = runtimePrimaryUrl;
}
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) {
env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
}
for (const [key, value] of Object.entries(envConfig)) {
if (typeof value === "string") env[key] = value;
@@ -228,8 +246,8 @@ async function buildClaudeRuntimeConfig(input: ClaudeExecutionInput): Promise<Cl
}
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME", "CLAUDE_CONFIG_DIR"],
@@ -276,7 +294,7 @@ export async function runClaudeLogin(input: {
authToken: input.authToken,
});
const proc = await runChildProcess(input.runId, runtime.command, ["login"], {
const proc = await runAdapterExecutionTargetProcess(input.runId, null, runtime.command, ["login"], {
cwd: runtime.cwd,
env: runtime.env,
timeoutSec: runtime.timeoutSec,
@@ -298,6 +316,11 @@ export async function runClaudeLogin(input: {
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
@@ -315,6 +338,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
agent,
config,
context,
executionTarget,
authToken,
});
const {
@@ -330,6 +354,11 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
graceSec,
extraArgs,
} = runtimeConfig;
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
const terminalResultCleanupGraceMs = Math.max(
0,
asNumber(config.terminalResultCleanupGraceMs, 5_000),
);
const effectiveEnv = Object.fromEntries(
Object.entries({ ...process.env, ...env }).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
@@ -365,27 +394,74 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
instructionsContents: combinedInstructionsContents,
onLog,
});
const effectiveInstructionsFilePath = promptBundle.instructionsFilePath ?? undefined;
const preparedExecutionTargetRuntime = executionTargetIsRemote
? await (async () => {
await onLog(
"stdout",
`[paperclip] Syncing workspace and Claude runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
return await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "claude",
workspaceLocalDir: cwd,
assets: [
{
key: "skills",
localDir: promptBundle.addDir,
followSymlinks: true,
},
],
});
})()
: null;
const restoreRemoteWorkspace = preparedExecutionTargetRuntime
? () => preparedExecutionTargetRuntime.restoreWorkspace()
: null;
const effectivePromptBundleAddDir = executionTargetIsRemote
? preparedExecutionTargetRuntime?.assetDirs.skills ??
path.posix.join(effectiveExecutionCwd, ".paperclip-runtime", "claude", "skills")
: promptBundle.addDir;
const effectiveInstructionsFilePath = promptBundle.instructionsFilePath
? executionTargetIsRemote
? path.posix.join(effectivePromptBundleAddDir, path.basename(promptBundle.instructionsFilePath))
: promptBundle.instructionsFilePath
: undefined;
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const runtimePromptBundleKey = asString(runtimeSessionParams.promptBundleKey, "");
const hasMatchingPromptBundle =
runtimePromptBundleKey.length === 0 || runtimePromptBundleKey === promptBundle.bundleKey;
const canResumeSession =
runtimeSessionId.length > 0 &&
hasMatchingPromptBundle &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionId = canResumeSession ? runtimeSessionId : null;
if (
executionTargetIsRemote &&
runtimeSessionId &&
runtimeSessionCwd.length > 0 &&
path.resolve(runtimeSessionCwd) !== path.resolve(cwd)
!canResumeSession
) {
await onLog(
"stdout",
`[paperclip] Claude session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
`[paperclip] Claude session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (
runtimeSessionId &&
runtimeSessionCwd.length > 0 &&
path.resolve(runtimeSessionCwd) !== path.resolve(effectiveExecutionCwd)
) {
await onLog(
"stdout",
`[paperclip] Claude session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Claude session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
if (runtimeSessionId && runtimePromptBundleKey.length > 0 && runtimePromptBundleKey !== promptBundle.bundleKey) {
@@ -412,10 +488,12 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const shouldUseResumeDeltaPrompt = Boolean(sessionId) && wakePrompt.length > 0;
const renderedPrompt = shouldUseResumeDeltaPrompt ? "" : renderTemplate(promptTemplate, templateData);
const sessionHandoffNote = asString(context.paperclipSessionHandoffMarkdown, "").trim();
const taskContextNote = asString(context.paperclipTaskMarkdown, "").trim();
const prompt = joinPromptSections([
renderedBootstrapPrompt,
wakePrompt,
sessionHandoffNote,
taskContextNote,
renderedPrompt,
]);
const promptMetrics = {
@@ -423,6 +501,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
bootstrapPromptChars: renderedBootstrapPrompt.length,
wakePromptChars: wakePrompt.length,
sessionHandoffChars: sessionHandoffNote.length,
taskContextChars: taskContextNote.length,
heartbeatPromptChars: renderedPrompt.length,
};
@@ -448,7 +527,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (attemptInstructionsFilePath && !resumeSessionId) {
args.push("--append-system-prompt-file", attemptInstructionsFilePath);
}
args.push("--add-dir", promptBundle.addDir);
args.push("--add-dir", effectivePromptBundleAddDir);
if (extraArgs.length > 0) args.push(...extraArgs);
return args;
};
@@ -485,7 +564,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "claude_local",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandArgs: args,
commandNotes,
env: loggedEnv,
@@ -495,7 +574,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
});
}
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env,
stdin: prompt,
@@ -503,6 +582,10 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
graceSec,
onSpawn,
onLog,
terminalResultCleanup: {
graceMs: terminalResultCleanupGraceMs,
hasTerminalResult: ({ stdout }) => parseClaudeStreamJson(stdout).resultJson !== null,
},
});
const parsedStream = parseClaudeStreamJson(proc.stdout);
@@ -544,16 +627,48 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}
if (!parsed) {
const fallbackErrorMessage = parseFallbackErrorMessage(proc);
const transientUpstream =
!loginMeta.requiresLogin &&
(proc.exitCode ?? 0) !== 0 &&
isClaudeTransientUpstreamError({
parsed: null,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage: fallbackErrorMessage,
});
const transientRetryNotBefore = transientUpstream
? extractClaudeRetryNotBefore({
parsed: null,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage: fallbackErrorMessage,
})
: null;
const errorCode = loginMeta.requiresLogin
? "claude_auth_required"
: transientUpstream
? "claude_transient_upstream"
: null;
return {
exitCode: proc.exitCode,
signal: proc.signal,
timedOut: false,
errorMessage: parseFallbackErrorMessage(proc),
errorCode: loginMeta.requiresLogin ? "claude_auth_required" : null,
errorMessage: fallbackErrorMessage,
errorCode,
errorFamily: transientUpstream ? "transient_upstream" : null,
retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
errorMeta,
resultJson: {
stdout: proc.stdout,
stderr: proc.stderr,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore
? { retryNotBefore: transientRetryNotBefore.toISOString() }
: {}),
...(transientRetryNotBefore
? { transientRetryNotBefore: transientRetryNotBefore.toISOString() }
: {}),
},
clearSession: Boolean(opts.clearSessionOnMissingSession),
};
@@ -576,24 +691,61 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
cwd,
cwd: effectiveExecutionCwd,
promptBundleKey: promptBundle.bundleKey,
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
} as Record<string, unknown>)
: null;
const clearSessionForMaxTurns = isClaudeMaxTurnsResult(parsed);
const parsedIsError = asBoolean(parsed.is_error, false);
const failed = (proc.exitCode ?? 0) !== 0 || parsedIsError;
const errorMessage = failed
? describeClaudeFailure(parsed) ?? `Claude exited with code ${proc.exitCode ?? -1}`
: null;
const transientUpstream =
failed &&
!loginMeta.requiresLogin &&
isClaudeTransientUpstreamError({
parsed,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage,
});
const transientRetryNotBefore = transientUpstream
? extractClaudeRetryNotBefore({
parsed,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage,
})
: null;
const resolvedErrorCode = loginMeta.requiresLogin
? "claude_auth_required"
: transientUpstream
? "claude_transient_upstream"
: null;
const mergedResultJson: Record<string, unknown> = {
...parsed,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore ? { retryNotBefore: transientRetryNotBefore.toISOString() } : {}),
...(transientRetryNotBefore ? { transientRetryNotBefore: transientRetryNotBefore.toISOString() } : {}),
};
return {
exitCode: proc.exitCode,
signal: proc.signal,
timedOut: false,
errorMessage:
(proc.exitCode ?? 0) === 0
? null
: describeClaudeFailure(parsed) ?? `Claude exited with code ${proc.exitCode ?? -1}`,
errorCode: loginMeta.requiresLogin ? "claude_auth_required" : null,
errorMessage,
errorCode: resolvedErrorCode,
errorFamily: transientUpstream ? "transient_upstream" : null,
retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
errorMeta,
usage,
sessionId: resolvedSessionId,
@@ -604,27 +756,37 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
model: parsedStream.model || asString(parsed.model, model),
billingType,
costUsd: parsedStream.costUsd ?? asNumber(parsed.total_cost_usd, 0),
resultJson: parsed,
resultJson: mergedResultJson,
summary: parsedStream.summary || asString(parsed.result, ""),
clearSession: clearSessionForMaxTurns || Boolean(opts.clearSessionOnMissingSession && !resolvedSessionId),
};
};
const initial = await runAttempt(sessionId ?? null);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
initial.parsed &&
isClaudeUnknownSessionError(initial.parsed)
) {
await onLog(
"stdout",
`[paperclip] Claude resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toAdapterResult(retry, { fallbackSessionId: null, clearSessionOnMissingSession: true });
}
try {
const initial = await runAttempt(sessionId ?? null);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
initial.parsed &&
isClaudeUnknownSessionError(initial.parsed)
) {
await onLog(
"stdout",
`[paperclip] Claude resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toAdapterResult(retry, { fallbackSessionId: null, clearSessionOnMissingSession: true });
}
return toAdapterResult(initial, { fallbackSessionId: runtimeSessionId || runtime.sessionId });
return toAdapterResult(initial, { fallbackSessionId: runtimeSessionId || runtime.sessionId });
} finally {
if (restoreRemoteWorkspace) {
await onLog(
"stdout",
`[paperclip] Restoring workspace changes from ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
await restoreRemoteWorkspace();
}
}
}

View File

@@ -0,0 +1,123 @@
import { describe, expect, it } from "vitest";
import {
extractClaudeRetryNotBefore,
isClaudeTransientUpstreamError,
} from "./parse.js";
describe("isClaudeTransientUpstreamError", () => {
it("classifies the 'out of extra usage' subscription window failure as transient", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "You're out of extra usage · resets 4pm (America/Chicago)",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
result: "You're out of extra usage. Resets at 4pm (America/Chicago).",
},
}),
).toBe(true);
});
it("classifies Anthropic API rate_limit_error and overloaded_error as transient", () => {
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
errors: [{ type: "rate_limit_error", message: "Rate limit reached for requests." }],
},
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
errors: [{ type: "overloaded_error", message: "Overloaded" }],
},
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
stderr: "HTTP 429: Too Many Requests",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
stderr: "Bedrock ThrottlingException: slow down",
}),
).toBe(true);
});
it("classifies the subscription 5-hour / weekly limit wording", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "Claude usage limit reached — weekly limit reached. Try again in 2 days.",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
errorMessage: "5-hour limit reached.",
}),
).toBe(true);
});
it("does not classify login/auth failures as transient", () => {
expect(
isClaudeTransientUpstreamError({
stderr: "Please log in. Run `claude login` first.",
}),
).toBe(false);
});
it("does not classify max-turns or unknown-session as transient", () => {
expect(
isClaudeTransientUpstreamError({
parsed: { subtype: "error_max_turns", result: "Maximum turns reached." },
}),
).toBe(false);
expect(
isClaudeTransientUpstreamError({
parsed: {
result: "No conversation found with session id abc-123",
errors: [{ message: "No conversation found with session id abc-123" }],
},
}),
).toBe(false);
});
it("does not classify deterministic validation errors as transient", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "Invalid request_error: Unknown parameter 'foo'.",
}),
).toBe(false);
});
});
describe("extractClaudeRetryNotBefore", () => {
it("parses the 'resets 4pm' hint in its explicit timezone", () => {
const now = new Date("2026-04-22T15:15:00.000Z");
const extracted = extractClaudeRetryNotBefore(
{ errorMessage: "You're out of extra usage · resets 4pm (America/Chicago)" },
now,
);
expect(extracted?.toISOString()).toBe("2026-04-22T21:00:00.000Z");
});
it("rolls forward past midnight when the reset time has already passed today", () => {
const now = new Date("2026-04-22T23:30:00.000Z");
const extracted = extractClaudeRetryNotBefore(
{ errorMessage: "Usage limit reached. Resets at 3:15 AM (UTC)." },
now,
);
expect(extracted?.toISOString()).toBe("2026-04-23T03:15:00.000Z");
});
it("returns null when no reset hint is present", () => {
expect(
extractClaudeRetryNotBefore({ errorMessage: "Overloaded. Try again later." }, new Date()),
).toBeNull();
});
});

View File

@@ -1,9 +1,19 @@
import type { UsageSummary } from "@paperclipai/adapter-utils";
import { asString, asNumber, parseObject, parseJson } from "@paperclipai/adapter-utils/server-utils";
import {
asString,
asNumber,
parseObject,
parseJson,
} from "@paperclipai/adapter-utils/server-utils";
const CLAUDE_AUTH_REQUIRED_RE = /(?:not\s+logged\s+in|please\s+log\s+in|please\s+run\s+`?claude\s+login`?|login\s+required|requires\s+login|unauthorized|authentication\s+required)/i;
const URL_RE = /(https?:\/\/[^\s'"`<>()[\]{};,!?]+[^\s'"`<>()[\]{};,!.?:]+)/gi;
const CLAUDE_TRANSIENT_UPSTREAM_RE =
/(?:rate[-\s]?limit(?:ed)?|rate_limit_error|too\s+many\s+requests|\b429\b|overloaded(?:_error)?|server\s+overloaded|service\s+unavailable|\b503\b|\b529\b|high\s+demand|try\s+again\s+later|temporarily\s+unavailable|throttl(?:ed|ing)|throttlingexception|servicequotaexceededexception|out\s+of\s+extra\s+usage|extra\s+usage\b|claude\s+usage\s+limit\s+reached|5[-\s]?hour\s+limit\s+reached|weekly\s+limit\s+reached|usage\s+limit\s+reached|usage\s+cap\s+reached)/i;
const CLAUDE_EXTRA_USAGE_RESET_RE =
/(?:out\s+of\s+extra\s+usage|extra\s+usage|usage\s+limit\s+reached|usage\s+cap\s+reached|5[-\s]?hour\s+limit\s+reached|weekly\s+limit\s+reached|claude\s+usage\s+limit\s+reached)[\s\S]{0,80}?\bresets?\s+(?:at\s+)?([^\n()]+?)(?:\s*\(([^)]+)\))?(?:[.!]|\n|$)/i;
export function parseClaudeStreamJson(stdout: string) {
let sessionId: string | null = null;
let model = "";
@@ -177,3 +187,197 @@ export function isClaudeUnknownSessionError(parsed: Record<string, unknown>): bo
/no conversation found with session id|unknown session|session .* not found/i.test(msg),
);
}
function buildClaudeTransientHaystack(input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): string {
const parsed = input.parsed ?? null;
const resultText = parsed ? asString(parsed.result, "") : "";
const parsedErrors = parsed ? extractClaudeErrorMessages(parsed) : [];
return [
input.errorMessage ?? "",
resultText,
...parsedErrors,
input.stdout ?? "",
input.stderr ?? "",
]
.join("\n")
.split(/\r?\n/)
.map((line) => line.trim())
.filter(Boolean)
.join("\n");
}
function readTimeZoneParts(date: Date, timeZone: string) {
const values = new Map(
new Intl.DateTimeFormat("en-US", {
timeZone,
hourCycle: "h23",
year: "numeric",
month: "2-digit",
day: "2-digit",
hour: "2-digit",
minute: "2-digit",
}).formatToParts(date).map((part) => [part.type, part.value]),
);
return {
year: Number.parseInt(values.get("year") ?? "", 10),
month: Number.parseInt(values.get("month") ?? "", 10),
day: Number.parseInt(values.get("day") ?? "", 10),
hour: Number.parseInt(values.get("hour") ?? "", 10),
minute: Number.parseInt(values.get("minute") ?? "", 10),
};
}
function normalizeResetTimeZone(timeZoneHint: string | null | undefined): string | null {
const normalized = timeZoneHint?.trim();
if (!normalized) return null;
if (/^(?:utc|gmt)$/i.test(normalized)) return "UTC";
try {
new Intl.DateTimeFormat("en-US", { timeZone: normalized }).format(new Date(0));
return normalized;
} catch {
return null;
}
}
function dateFromTimeZoneWallClock(input: {
year: number;
month: number;
day: number;
hour: number;
minute: number;
timeZone: string;
}): Date | null {
let candidate = new Date(Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0));
const targetUtc = Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0);
for (let attempt = 0; attempt < 4; attempt += 1) {
const actual = readTimeZoneParts(candidate, input.timeZone);
const actualUtc = Date.UTC(actual.year, actual.month - 1, actual.day, actual.hour, actual.minute, 0, 0);
const offsetMs = targetUtc - actualUtc;
if (offsetMs === 0) break;
candidate = new Date(candidate.getTime() + offsetMs);
}
const verified = readTimeZoneParts(candidate, input.timeZone);
if (
verified.year !== input.year ||
verified.month !== input.month ||
verified.day !== input.day ||
verified.hour !== input.hour ||
verified.minute !== input.minute
) {
return null;
}
return candidate;
}
function nextClockTimeInTimeZone(input: {
now: Date;
hour: number;
minute: number;
timeZoneHint: string;
}): Date | null {
const timeZone = normalizeResetTimeZone(input.timeZoneHint);
if (!timeZone) return null;
const nowParts = readTimeZoneParts(input.now, timeZone);
let retryAt = dateFromTimeZoneWallClock({
year: nowParts.year,
month: nowParts.month,
day: nowParts.day,
hour: input.hour,
minute: input.minute,
timeZone,
});
if (!retryAt) return null;
if (retryAt.getTime() <= input.now.getTime()) {
const nextDay = new Date(Date.UTC(nowParts.year, nowParts.month - 1, nowParts.day + 1, 0, 0, 0, 0));
retryAt = dateFromTimeZoneWallClock({
year: nextDay.getUTCFullYear(),
month: nextDay.getUTCMonth() + 1,
day: nextDay.getUTCDate(),
hour: input.hour,
minute: input.minute,
timeZone,
});
}
return retryAt;
}
function parseClaudeResetClockTime(clockText: string, now: Date, timeZoneHint?: string | null): Date | null {
const normalized = clockText.trim().replace(/\s+/g, " ");
const match = normalized.match(/^(\d{1,2})(?::(\d{2}))?\s*([ap])\.?\s*m\.?/i);
if (!match) return null;
const hour12 = Number.parseInt(match[1] ?? "", 10);
const minute = Number.parseInt(match[2] ?? "0", 10);
if (!Number.isInteger(hour12) || hour12 < 1 || hour12 > 12) return null;
if (!Number.isInteger(minute) || minute < 0 || minute > 59) return null;
let hour24 = hour12 % 12;
if ((match[3] ?? "").toLowerCase() === "p") hour24 += 12;
if (timeZoneHint) {
const explicitRetryAt = nextClockTimeInTimeZone({
now,
hour: hour24,
minute,
timeZoneHint,
});
if (explicitRetryAt) return explicitRetryAt;
}
const retryAt = new Date(now);
retryAt.setHours(hour24, minute, 0, 0);
if (retryAt.getTime() <= now.getTime()) {
retryAt.setDate(retryAt.getDate() + 1);
}
return retryAt;
}
export function extractClaudeRetryNotBefore(
input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
},
now = new Date(),
): Date | null {
const haystack = buildClaudeTransientHaystack(input);
const match = haystack.match(CLAUDE_EXTRA_USAGE_RESET_RE);
if (!match) return null;
return parseClaudeResetClockTime(match[1] ?? "", now, match[2]);
}
export function isClaudeTransientUpstreamError(input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): boolean {
const parsed = input.parsed ?? null;
// Deterministic failures are handled by their own classifiers.
if (parsed && (isClaudeMaxTurnsResult(parsed) || isClaudeUnknownSessionError(parsed))) {
return false;
}
const loginMeta = detectClaudeLoginRequired({
parsed,
stdout: input.stdout ?? "",
stderr: input.stderr ?? "",
});
if (loginMeta.requiresLogin) return false;
const haystack = buildClaudeTransientHaystack(input);
if (!haystack) return false;
return CLAUDE_TRANSIENT_UPSTREAM_RE.test(haystack);
}

View File

@@ -0,0 +1,7 @@
import { defineConfig } from "vitest/config";
export default defineConfig({
test: {
environment: "node",
},
});

View File

@@ -4,7 +4,23 @@ export const DEFAULT_CODEX_LOCAL_MODEL = "gpt-5.3-codex";
export const DEFAULT_CODEX_LOCAL_BYPASS_APPROVALS_AND_SANDBOX = true;
export const CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS = ["gpt-5.4"] as const;
function normalizeModelId(model: string | null | undefined): string {
return typeof model === "string" ? model.trim() : "";
}
export function isCodexLocalKnownModel(model: string | null | undefined): boolean {
const normalizedModel = normalizeModelId(model);
if (!normalizedModel) return false;
return models.some((entry) => entry.id === normalizedModel);
}
export function isCodexLocalManualModel(model: string | null | undefined): boolean {
const normalizedModel = normalizeModelId(model);
return Boolean(normalizedModel) && !isCodexLocalKnownModel(normalizedModel);
}
export function isCodexLocalFastModeSupported(model: string | null | undefined): boolean {
if (isCodexLocalManualModel(model)) return true;
const normalizedModel = typeof model === "string" ? model.trim() : "";
return CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.includes(
normalizedModel as (typeof CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS)[number],
@@ -35,7 +51,7 @@ Core fields:
- modelReasoningEffort (string, optional): reasoning effort override (minimal|low|medium|high|xhigh) passed via -c model_reasoning_effort=...
- promptTemplate (string, optional): run prompt template
- search (boolean, optional): run codex with --search
- fastMode (boolean, optional): enable Codex Fast mode; currently supported on GPT-5.4 only and consumes credits faster
- fastMode (boolean, optional): enable Codex Fast mode; supported on GPT-5.4 and passed through for manual model IDs
- dangerouslyBypassApprovalsAndSandbox (boolean, optional): run with bypass flag
- command (string, optional): defaults to "codex"
- extraArgs (string[], optional): additional CLI args
@@ -54,6 +70,6 @@ Notes:
- Paperclip injects desired local skills into the effective CODEX_HOME/skills/ directory at execution time so Codex can discover "$paperclip" and related skills without polluting the project working directory. In managed-home mode (the default) this is ~/.paperclip/instances/<id>/companies/<companyId>/codex-home/skills/; when CODEX_HOME is explicitly overridden in adapter config, that override is used instead.
- Unless explicitly overridden in adapter config, Paperclip runs Codex with a per-company managed CODEX_HOME under the active Paperclip instance and seeds auth/config from the shared Codex home (the CODEX_HOME env var, when set, or ~/.codex).
- Some model/tool combinations reject certain effort levels (for example minimal with web search enabled).
- Fast mode is currently supported on GPT-5.4 only. When enabled, Paperclip applies \`service_tier="fast"\` and \`features.fast_mode=true\`.
- Fast mode is supported on GPT-5.4 and manual model IDs. When enabled for those models, Paperclip applies \`service_tier="fast"\` and \`features.fast_mode=true\`.
- When Paperclip realizes a workspace/runtime for a run, it injects PAPERCLIP_WORKSPACE_* and PAPERCLIP_RUNTIME_* env vars for agent-side tooling.
`;

View File

@@ -26,6 +26,28 @@ describe("buildCodexExecArgs", () => {
]);
});
it("enables Codex fast mode overrides for manual models", () => {
const result = buildCodexExecArgs({
model: "gpt-5.5",
fastMode: true,
});
expect(result.fastModeRequested).toBe(true);
expect(result.fastModeApplied).toBe(true);
expect(result.fastModeIgnoredReason).toBeNull();
expect(result.args).toEqual([
"exec",
"--json",
"--model",
"gpt-5.5",
"-c",
'service_tier="fast"',
"-c",
"features.fast_mode=true",
"-",
]);
});
it("ignores fast mode for unsupported models", () => {
const result = buildCodexExecArgs({
model: "gpt-5.3-codex",
@@ -34,7 +56,9 @@ describe("buildCodexExecArgs", () => {
expect(result.fastModeRequested).toBe(true);
expect(result.fastModeApplied).toBe(false);
expect(result.fastModeIgnoredReason).toContain("currently only supported on gpt-5.4");
expect(result.fastModeIgnoredReason).toContain(
"currently only supported on gpt-5.4 or manually configured model IDs",
);
expect(result.args).toEqual([
"exec",
"--json",

View File

@@ -25,7 +25,7 @@ function asRecord(value: unknown): Record<string, unknown> {
}
function formatFastModeSupportedModels(): string {
return CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.join(", ");
return `${CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.join(", ")} or manually configured model IDs`;
}
export function buildCodexExecArgs(

View File

@@ -0,0 +1,359 @@
import { mkdir, mkdtemp, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 1,
signal: null,
timedOut: false,
stdout: "",
stderr: "remote failure",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "/usr/bin/codex"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("codex remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs CODEX_HOME, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-codex-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
const codexHomeDir = path.join(rootDir, "codex-home");
await mkdir(workspaceDir, { recursive: true });
await mkdir(codexHomeDir, { recursive: true });
await writeFile(path.join(rootDir, "instructions.md"), "Use the remote workspace.\n", "utf8");
await writeFile(path.join(codexHomeDir, "auth.json"), "{}", "utf8");
await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "CodexCoder",
adapterType: "codex_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "codex",
env: {
CODEX_HOME: codexHomeDir,
},
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledWith(expect.objectContaining({
localDir: workspaceDir,
remoteDir: "/remote/workspace",
}));
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
localDir: codexHomeDir,
remoteDir: "/remote/workspace/.paperclip-runtime/codex/home",
followSymlinks: true,
}));
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[3].env.CODEX_HOME).toBe("/remote/workspace/.paperclip-runtime/codex/home");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledWith(expect.objectContaining({
localDir: workspaceDir,
remoteDir: "/remote/workspace",
}));
});
it("does not resume saved Codex sessions for remote SSH execution without a matching remote identity", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-codex-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
const codexHomeDir = path.join(rootDir, "codex-home");
await mkdir(workspaceDir, { recursive: true });
await mkdir(codexHomeDir, { recursive: true });
await writeFile(path.join(codexHomeDir, "auth.json"), "{}", "utf8");
await execute({
runId: "run-ssh-no-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "CodexCoder",
adapterType: "codex_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "codex",
env: {
CODEX_HOME: codexHomeDir,
},
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toEqual([
"exec",
"--json",
"-",
]);
});
it("resumes saved Codex sessions for remote SSH execution when the remote identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-codex-remote-resume-match-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
const codexHomeDir = path.join(rootDir, "codex-home");
await mkdir(workspaceDir, { recursive: true });
await mkdir(codexHomeDir, { recursive: true });
await writeFile(path.join(codexHomeDir, "auth.json"), "{}", "utf8");
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "CodexCoder",
adapterType: "codex_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "codex",
env: {
CODEX_HOME: codexHomeDir,
},
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toEqual([
"exec",
"--json",
"resume",
"session-123",
"-",
]);
});
it("uses the provider-neutral execution target contract for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-codex-target-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
const codexHomeDir = path.join(rootDir, "codex-home");
await mkdir(workspaceDir, { recursive: true });
await mkdir(codexHomeDir, { recursive: true });
await writeFile(path.join(codexHomeDir, "auth.json"), "{}", "utf8");
await execute({
runId: "run-target",
agent: {
id: "agent-1",
companyId: "company-1",
name: "CodexCoder",
adapterType: "codex_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "codex",
env: {
CODEX_HOME: codexHomeDir,
},
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTarget: {
kind: "remote",
transport: "ssh",
remoteCwd: "/remote/workspace",
spec: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(runChildProcess).toHaveBeenCalledTimes(1);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toEqual([
"exec",
"--json",
"resume",
"session-123",
"-",
]);
expect(call?.[3].env.CODEX_HOME).toBe("/remote/workspace/.paperclip-runtime/codex/home");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
});
});

View File

@@ -2,6 +2,19 @@ import fs from "node:fs/promises";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { inferOpenAiCompatibleBiller, type AdapterExecutionContext, type AdapterExecutionResult } from "@paperclipai/adapter-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
@@ -9,20 +22,22 @@ import {
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePaperclipSkillSymlink,
ensurePathInEnv,
readPaperclipRuntimeSkillEntries,
resolveCommandForLogs,
resolvePaperclipDesiredSkillNames,
renderTemplate,
renderPaperclipWakePrompt,
stringifyPaperclipWakePayload,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
joinPromptSections,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import { parseCodexJsonl, isCodexUnknownSessionError } from "./parse.js";
import {
parseCodexJsonl,
extractCodexRetryNotBefore,
isCodexTransientUpstreamError,
isCodexUnknownSessionError,
} from "./parse.js";
import { pathExists, prepareManagedCodexHome, resolveManagedCodexHomeDir, resolveSharedCodexHomeDir } from "./codex-home.js";
import { resolveCodexDesiredSkillNames } from "./skills.js";
import { buildCodexExecArgs } from "./codex-args.js";
@@ -149,6 +164,52 @@ type EnsureCodexSkillsInjectedOptions = {
linkSkill?: (source: string, target: string) => Promise<void>;
};
type CodexTransientFallbackMode =
| "same_session"
| "safer_invocation"
| "fresh_session"
| "fresh_session_safer_invocation";
function readCodexTransientFallbackMode(context: Record<string, unknown>): CodexTransientFallbackMode | null {
const value = asString(context.codexTransientFallbackMode, "").trim();
switch (value) {
case "same_session":
case "safer_invocation":
case "fresh_session":
case "fresh_session_safer_invocation":
return value;
default:
return null;
}
}
function fallbackModeUsesSaferInvocation(mode: CodexTransientFallbackMode | null): boolean {
return mode === "safer_invocation" || mode === "fresh_session_safer_invocation";
}
function fallbackModeUsesFreshSession(mode: CodexTransientFallbackMode | null): boolean {
return mode === "fresh_session" || mode === "fresh_session_safer_invocation";
}
function buildCodexTransientHandoffNote(input: {
previousSessionId: string | null;
fallbackMode: CodexTransientFallbackMode;
continuationSummaryBody: string | null;
}): string {
return [
"Paperclip session handoff:",
input.previousSessionId ? `- Previous session: ${input.previousSessionId}` : "",
"- Rotation reason: repeated Codex transient remote-compaction failures",
`- Fallback mode: ${input.fallbackMode}`,
input.continuationSummaryBody
? `- Issue continuation summary: ${input.continuationSummaryBody.slice(0, 1_500)}`
: "",
"Continue from the current task state. Rebuild only the minimum context you need.",
]
.filter(Boolean)
.join("\n");
}
export async function ensureCodexSkillsInjected(
onLog: AdapterExecutionContext["onLog"],
options: EnsureCodexSkillsInjectedOptions = {},
@@ -255,6 +316,11 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const effectiveWorkspaceCwd = useConfiguredInsteadOfAgentHome ? "" : workspaceCwd;
const cwd = effectiveWorkspaceCwd || configuredCwd || process.cwd();
const envConfig = parseObject(config.env);
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const configuredCodexHome =
typeof envConfig.CODEX_HOME === "string" && envConfig.CODEX_HOME.trim().length > 0
? path.resolve(envConfig.CODEX_HOME.trim())
@@ -278,10 +344,37 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
desiredSkillNames,
},
);
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
const preparedExecutionTargetRuntime = executionTargetIsRemote
? await (async () => {
await onLog(
"stdout",
`[paperclip] Syncing workspace and CODEX_HOME to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
return await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "codex",
workspaceLocalDir: cwd,
assets: [
{
key: "home",
localDir: effectiveCodexHome,
followSymlinks: true,
},
],
});
})()
: null;
const restoreRemoteWorkspace = preparedExecutionTargetRuntime
? () => preparedExecutionTargetRuntime.restoreWorkspace()
: null;
const remoteCodexHome = executionTargetIsRemote
? preparedExecutionTargetRuntime?.assetDirs.home ??
path.posix.join(effectiveExecutionCwd, ".paperclip-runtime", "codex", "home")
: null;
const hasExplicitApiKey =
typeof envConfig.PAPERCLIP_API_KEY === "string" && envConfig.PAPERCLIP_API_KEY.trim().length > 0;
const env: Record<string, string> = { ...buildPaperclipEnv(agent) };
env.CODEX_HOME = effectiveCodexHome;
env.PAPERCLIP_RUN_ID = runId;
const wakeTaskId =
(typeof context.taskId === "string" && context.taskId.trim().length > 0 && context.taskId.trim()) ||
@@ -367,9 +460,14 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (runtimePrimaryUrl) {
env.PAPERCLIP_RUNTIME_PRIMARY_URL = runtimePrimaryUrl;
}
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) {
env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
}
for (const [k, v] of Object.entries(envConfig)) {
if (typeof v === "string") env[k] = v;
}
env.CODEX_HOME = remoteCodexHome ?? effectiveCodexHome;
if (!hasExplicitApiKey && authToken) {
env.PAPERCLIP_API_KEY = authToken;
}
@@ -380,8 +478,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
const billingType = resolveCodexBillingType(effectiveEnv);
const runtimeEnv = ensurePathInEnv(effectiveEnv);
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
@@ -394,14 +492,24 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
const sessionId = canResumeSession ? runtimeSessionId : null;
if (runtimeSessionId && !canResumeSession) {
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const codexTransientFallbackMode = readCodexTransientFallbackMode(context);
const forceSaferInvocation = fallbackModeUsesSaferInvocation(codexTransientFallbackMode);
const forceFreshSession = fallbackModeUsesFreshSession(codexTransientFallbackMode);
const sessionId = canResumeSession && !forceFreshSession ? runtimeSessionId : null;
if (executionTargetIsRemote && runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Codex session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
`[paperclip] Codex session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Codex session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
const instructionsFilePath = asString(config.instructionsFilePath, "").trim();
@@ -444,28 +552,66 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const shouldUseResumeDeltaPrompt = Boolean(sessionId) && wakePrompt.length > 0;
const promptInstructionsPrefix = shouldUseResumeDeltaPrompt ? "" : instructionsPrefix;
instructionsChars = promptInstructionsPrefix.length;
const continuationSummary = parseObject(context.paperclipContinuationSummary);
const continuationSummaryBody = asString(continuationSummary.body, "").trim() || null;
const codexFallbackHandoffNote =
forceFreshSession
? buildCodexTransientHandoffNote({
previousSessionId: runtimeSessionId || runtime.sessionId || null,
fallbackMode: codexTransientFallbackMode ?? "fresh_session",
continuationSummaryBody,
})
: "";
const commandNotes = (() => {
if (!instructionsFilePath) {
return [repoAgentsNote];
const notes = [repoAgentsNote];
if (forceSaferInvocation) {
notes.push("Codex transient fallback requested safer invocation settings for this retry.");
}
if (forceFreshSession) {
notes.push("Codex transient fallback forced a fresh session with a continuation handoff.");
}
return notes;
}
if (instructionsPrefix.length > 0) {
if (shouldUseResumeDeltaPrompt) {
return [
const notes = [
`Loaded agent instructions from ${instructionsFilePath}`,
"Skipped stdin instruction reinjection because an existing Codex session is being resumed with a wake delta.",
repoAgentsNote,
];
if (forceSaferInvocation) {
notes.push("Codex transient fallback requested safer invocation settings for this retry.");
}
if (forceFreshSession) {
notes.push("Codex transient fallback forced a fresh session with a continuation handoff.");
}
return notes;
}
return [
const notes = [
`Loaded agent instructions from ${instructionsFilePath}`,
`Prepended instructions + path directive to stdin prompt (relative references from ${instructionsDir}).`,
repoAgentsNote,
];
if (forceSaferInvocation) {
notes.push("Codex transient fallback requested safer invocation settings for this retry.");
}
if (forceFreshSession) {
notes.push("Codex transient fallback forced a fresh session with a continuation handoff.");
}
return notes;
}
return [
const notes = [
`Configured instructionsFilePath ${instructionsFilePath}, but file could not be read; continuing without injected instructions.`,
repoAgentsNote,
];
if (forceSaferInvocation) {
notes.push("Codex transient fallback requested safer invocation settings for this retry.");
}
if (forceFreshSession) {
notes.push("Codex transient fallback forced a fresh session with a continuation handoff.");
}
return notes;
})();
const renderedPrompt = shouldUseResumeDeltaPrompt ? "" : renderTemplate(promptTemplate, templateData);
const sessionHandoffNote = asString(context.paperclipSessionHandoffMarkdown, "").trim();
@@ -473,6 +619,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
promptInstructionsPrefix,
renderedBootstrapPrompt,
wakePrompt,
codexFallbackHandoffNote,
sessionHandoffNote,
renderedPrompt,
]);
@@ -486,7 +633,10 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
const runAttempt = async (resumeSessionId: string | null) => {
const execArgs = buildCodexExecArgs(config, { resumeSessionId });
const execArgs = buildCodexExecArgs(
forceSaferInvocation ? { ...config, fastMode: false } : config,
{ resumeSessionId },
);
const args = execArgs.args;
const commandNotesWithFastMode =
execArgs.fastModeIgnoredReason == null
@@ -496,7 +646,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "codex_local",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandNotes: commandNotesWithFastMode,
commandArgs: args.map((value, idx) => {
if (idx === args.length - 1 && value !== "-") return `<prompt ${prompt.length} chars>`;
@@ -509,7 +659,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
});
}
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env,
stdin: prompt,
@@ -540,6 +690,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const toResult = (
attempt: { proc: { exitCode: number | null; signal: string | null; timedOut: boolean; stdout: string; stderr: string }; rawStderr: string; parsed: ReturnType<typeof parseCodexJsonl> },
clearSessionOnMissingSession = false,
isRetry = false,
): AdapterExecutionResult => {
if (attempt.proc.timedOut) {
return {
@@ -551,11 +702,19 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
}
const resolvedSessionId = attempt.parsed.sessionId ?? runtimeSessionId ?? runtime.sessionId ?? null;
const canFallbackToRuntimeSession = !isRetry && !forceFreshSession;
const resolvedSessionId =
attempt.parsed.sessionId ??
(canFallbackToRuntimeSession ? (runtimeSessionId ?? runtime.sessionId ?? null) : null);
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
cwd,
cwd: effectiveExecutionCwd,
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
@@ -567,6 +726,21 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
parsedError ||
stderrLine ||
`Codex exited with code ${attempt.proc.exitCode ?? -1}`;
const transientRetryNotBefore =
(attempt.proc.exitCode ?? 0) !== 0
? extractCodexRetryNotBefore({
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
errorMessage: fallbackErrorMessage,
})
: null;
const transientUpstream =
(attempt.proc.exitCode ?? 0) !== 0 &&
isCodexTransientUpstreamError({
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
errorMessage: fallbackErrorMessage,
});
return {
exitCode: attempt.proc.exitCode,
@@ -576,6 +750,12 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
(attempt.proc.exitCode ?? 0) === 0
? null
: fallbackErrorMessage,
errorCode:
transientUpstream
? "codex_transient_upstream"
: null,
errorFamily: transientUpstream ? "transient_upstream" : null,
retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
usage: attempt.parsed.usage,
sessionId: resolvedSessionId,
sessionParams: resolvedSessionParams,
@@ -588,26 +768,39 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
resultJson: {
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore ? { retryNotBefore: transientRetryNotBefore.toISOString() } : {}),
...(transientRetryNotBefore ? { transientRetryNotBefore: transientRetryNotBefore.toISOString() } : {}),
},
summary: attempt.parsed.summary,
clearSession: Boolean(clearSessionOnMissingSession && !resolvedSessionId),
clearSession: Boolean((clearSessionOnMissingSession || forceFreshSession) && !resolvedSessionId),
};
};
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isCodexUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] Codex resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
}
try {
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isCodexUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] Codex resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true, true);
}
return toResult(initial);
return toResult(initial, false, false);
} finally {
if (restoreRemoteWorkspace) {
await onLog(
"stdout",
`[paperclip] Restoring workspace changes from ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
await restoreRemoteWorkspace();
}
}
}

View File

@@ -1,7 +1,7 @@
export { execute, ensureCodexSkillsInjected } from "./execute.js";
export { listCodexSkills, syncCodexSkills } from "./skills.js";
export { testEnvironment } from "./test.js";
export { parseCodexJsonl, isCodexUnknownSessionError } from "./parse.js";
export { parseCodexJsonl, isCodexTransientUpstreamError, isCodexUnknownSessionError } from "./parse.js";
export {
getQuotaWindows,
readCodexAuthInfo,

View File

@@ -1,5 +1,10 @@
import { describe, expect, it } from "vitest";
import { isCodexUnknownSessionError, parseCodexJsonl } from "./parse.js";
import {
extractCodexRetryNotBefore,
isCodexTransientUpstreamError,
isCodexUnknownSessionError,
parseCodexJsonl,
} from "./parse.js";
describe("parseCodexJsonl", () => {
it("captures session id, assistant summary, usage, and error message", () => {
@@ -81,3 +86,55 @@ describe("isCodexUnknownSessionError", () => {
expect(isCodexUnknownSessionError("", "model overloaded")).toBe(false);
});
});
describe("isCodexTransientUpstreamError", () => {
it("classifies the remote-compaction high-demand failure as transient upstream", () => {
expect(
isCodexTransientUpstreamError({
errorMessage:
"Error running remote compact task: We're currently experiencing high demand, which may cause temporary errors.",
}),
).toBe(true);
expect(
isCodexTransientUpstreamError({
stderr: "We're currently experiencing high demand, which may cause temporary errors.",
}),
).toBe(true);
});
it("classifies usage-limit windows as transient and extracts the retry time", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM.";
const now = new Date(2026, 3, 22, 22, 29, 2);
expect(isCodexTransientUpstreamError({ errorMessage })).toBe(true);
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.getTime()).toBe(
new Date(2026, 3, 22, 23, 31, 0, 0).getTime(),
);
});
it("parses explicit timezone hints on usage-limit retry windows", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM (America/Chicago).";
const now = new Date("2026-04-23T03:29:02.000Z");
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.toISOString()).toBe(
"2026-04-23T04:31:00.000Z",
);
});
it("does not classify deterministic compaction errors as transient", () => {
expect(
isCodexTransientUpstreamError({
errorMessage: [
"Error running remote compact task: {",
' "error": {',
' "message": "Unknown parameter: \'prompt_cache_retention\'.",',
' "type": "invalid_request_error",',
' "param": "prompt_cache_retention",',
' "code": "unknown_parameter"',
" }",
"}",
].join("\n"),
}),
).toBe(false);
});
});

View File

@@ -1,4 +1,15 @@
import { asString, asNumber, parseObject, parseJson } from "@paperclipai/adapter-utils/server-utils";
import {
asString,
asNumber,
parseObject,
parseJson,
} from "@paperclipai/adapter-utils/server-utils";
const CODEX_TRANSIENT_UPSTREAM_RE =
/(?:we(?:'|)re\s+currently\s+experiencing\s+high\s+demand|temporary\s+errors|rate[-\s]?limit(?:ed)?|too\s+many\s+requests|\b429\b|server\s+overloaded|service\s+unavailable|try\s+again\s+later)/i;
const CODEX_REMOTE_COMPACTION_RE = /remote\s+compact\s+task/i;
const CODEX_USAGE_LIMIT_RE =
/you(?:'|)ve hit your usage limit for .+\.\s+switch to another model now,\s+or try again at\s+([^.!\n]+)(?:[.!]|\n|$)/i;
export function parseCodexJsonl(stdout: string) {
let sessionId: string | null = null;
@@ -71,3 +82,180 @@ export function isCodexUnknownSessionError(stdout: string, stderr: string): bool
haystack,
);
}
function buildCodexErrorHaystack(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): string {
return [
input.errorMessage ?? "",
input.stdout ?? "",
input.stderr ?? "",
]
.join("\n")
.split(/\r?\n/)
.map((line) => line.trim())
.filter(Boolean)
.join("\n");
}
function readTimeZoneParts(date: Date, timeZone: string) {
const values = new Map(
new Intl.DateTimeFormat("en-US", {
timeZone,
hourCycle: "h23",
year: "numeric",
month: "2-digit",
day: "2-digit",
hour: "2-digit",
minute: "2-digit",
}).formatToParts(date).map((part) => [part.type, part.value]),
);
return {
year: Number.parseInt(values.get("year") ?? "", 10),
month: Number.parseInt(values.get("month") ?? "", 10),
day: Number.parseInt(values.get("day") ?? "", 10),
hour: Number.parseInt(values.get("hour") ?? "", 10),
minute: Number.parseInt(values.get("minute") ?? "", 10),
};
}
function normalizeResetTimeZone(timeZoneHint: string | null | undefined): string | null {
const normalized = timeZoneHint?.trim();
if (!normalized) return null;
if (/^(?:utc|gmt)$/i.test(normalized)) return "UTC";
try {
new Intl.DateTimeFormat("en-US", { timeZone: normalized }).format(new Date(0));
return normalized;
} catch {
return null;
}
}
function dateFromTimeZoneWallClock(input: {
year: number;
month: number;
day: number;
hour: number;
minute: number;
timeZone: string;
}): Date | null {
let candidate = new Date(Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0));
const targetUtc = Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0);
for (let attempt = 0; attempt < 4; attempt += 1) {
const actual = readTimeZoneParts(candidate, input.timeZone);
const actualUtc = Date.UTC(actual.year, actual.month - 1, actual.day, actual.hour, actual.minute, 0, 0);
const offsetMs = targetUtc - actualUtc;
if (offsetMs === 0) break;
candidate = new Date(candidate.getTime() + offsetMs);
}
const verified = readTimeZoneParts(candidate, input.timeZone);
if (
verified.year !== input.year ||
verified.month !== input.month ||
verified.day !== input.day ||
verified.hour !== input.hour ||
verified.minute !== input.minute
) {
return null;
}
return candidate;
}
function nextClockTimeInTimeZone(input: {
now: Date;
hour: number;
minute: number;
timeZoneHint: string;
}): Date | null {
const timeZone = normalizeResetTimeZone(input.timeZoneHint);
if (!timeZone) return null;
const nowParts = readTimeZoneParts(input.now, timeZone);
let retryAt = dateFromTimeZoneWallClock({
year: nowParts.year,
month: nowParts.month,
day: nowParts.day,
hour: input.hour,
minute: input.minute,
timeZone,
});
if (!retryAt) return null;
if (retryAt.getTime() <= input.now.getTime()) {
const nextDay = new Date(Date.UTC(nowParts.year, nowParts.month - 1, nowParts.day + 1, 0, 0, 0, 0));
retryAt = dateFromTimeZoneWallClock({
year: nextDay.getUTCFullYear(),
month: nextDay.getUTCMonth() + 1,
day: nextDay.getUTCDate(),
hour: input.hour,
minute: input.minute,
timeZone,
});
}
return retryAt;
}
function parseLocalClockTime(clockText: string, now: Date): Date | null {
const normalized = clockText.trim();
const match = normalized.match(/^(\d{1,2})(?::(\d{2}))?\s*([ap])\.?\s*m\.?(?:\s*\(([^)]+)\)|\s+([A-Z]{2,5}))?$/i);
if (!match) return null;
const hour12 = Number.parseInt(match[1] ?? "", 10);
const minute = Number.parseInt(match[2] ?? "0", 10);
if (!Number.isInteger(hour12) || hour12 < 1 || hour12 > 12) return null;
if (!Number.isInteger(minute) || minute < 0 || minute > 59) return null;
let hour24 = hour12 % 12;
if ((match[3] ?? "").toLowerCase() === "p") hour24 += 12;
const timeZoneHint = match[4] ?? match[5];
if (timeZoneHint) {
const explicitRetryAt = nextClockTimeInTimeZone({
now,
hour: hour24,
minute,
timeZoneHint,
});
if (explicitRetryAt) return explicitRetryAt;
}
const retryAt = new Date(now);
retryAt.setHours(hour24, minute, 0, 0);
if (retryAt.getTime() <= now.getTime()) {
retryAt.setDate(retryAt.getDate() + 1);
}
return retryAt;
}
export function extractCodexRetryNotBefore(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}, now = new Date()): Date | null {
const haystack = buildCodexErrorHaystack(input);
const usageLimitMatch = haystack.match(CODEX_USAGE_LIMIT_RE);
if (!usageLimitMatch) return null;
return parseLocalClockTime(usageLimitMatch[1] ?? "", now);
}
export function isCodexTransientUpstreamError(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): boolean {
const haystack = buildCodexErrorHaystack(input);
if (extractCodexRetryNotBefore(input) != null) return true;
if (!CODEX_TRANSIENT_UPSTREAM_RE.test(haystack)) return false;
// Keep automatic retries scoped to the observed remote-compaction/high-demand
// failure shape, plus explicit usage-limit windows that tell us when retrying
// becomes safe again.
return CODEX_REMOTE_COMPACTION_RE.test(haystack) || /high\s+demand|temporary\s+errors/i.test(haystack);
}

View File

@@ -146,7 +146,7 @@ export async function testEnvironment(
code: "codex_fast_mode_unsupported_model",
level: "warn",
message: execArgs.fastModeIgnoredReason,
hint: "Switch the agent model to GPT-5.4 to enable Codex Fast mode.",
hint: "Switch the agent model to GPT-5.4 or enter a manual model ID to enable Codex Fast mode.",
});
}

View File

@@ -0,0 +1,268 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: [
JSON.stringify({ type: "system", session_id: "cursor-session-1" }),
JSON.stringify({ type: "assistant", text: "hello" }),
JSON.stringify({ type: "result", is_error: false, result: "hello", session_id: "cursor-session-1" }),
].join("\n"),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: agent"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
runSshCommand: vi.fn(async () => ({
stdout: "/home/agent",
stderr: "",
exitCode: 0,
})),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("cursor remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs Cursor skills, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
const result = await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
paperclipApiUrl: "http://198.51.100.10:3102",
},
},
onLog: async () => {},
});
expect(result.sessionParams).toMatchObject({
sessionId: "cursor-session-1",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
paperclipApiUrl: "http://198.51.100.10:3102",
},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/cursor/skills",
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".cursor/skills"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toContain("--workspace");
expect(call?.[2]).toContain("/remote/workspace");
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://198.51.100.10:3102");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
});
it("resumes saved Cursor sessions for remote SSH execution only when the identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--resume");
expect(call?.[2]).toContain("session-123");
});
it("restores the remote workspace if skills sync fails after workspace prep", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-remote-sync-fail-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
syncDirectoryToSsh.mockRejectedValueOnce(new Error("sync failed"));
await expect(execute({
runId: "run-sync-fail",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Cursor Builder",
adapterType: "cursor",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "agent",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
})).rejects.toThrow("sync failed");
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
expect(runChildProcess).not.toHaveBeenCalled();
});
});

View File

@@ -3,6 +3,22 @@ import os from "node:os";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { inferOpenAiCompatibleBiller, type AdapterExecutionContext, type AdapterExecutionResult } from "@paperclipai/adapter-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
readAdapterExecutionTargetHomeDir,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
@@ -11,11 +27,9 @@ import {
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePaperclipSkillSymlink,
ensurePathInEnv,
readPaperclipRuntimeSkillEntries,
resolveCommandForLogs,
resolvePaperclipDesiredSkillNames,
removeMaintainerOnlySkillSymlinks,
renderTemplate,
@@ -23,7 +37,6 @@ import {
stringifyPaperclipWakePayload,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
joinPromptSections,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import { DEFAULT_CURSOR_LOCAL_MODEL } from "../index.js";
import { parseCursorJsonl, isCursorUnknownSessionError } from "./parse.js";
@@ -97,6 +110,19 @@ function cursorSkillsHome(): string {
return path.join(os.homedir(), ".cursor", "skills");
}
async function buildCursorSkillsDir(config: Record<string, unknown>): Promise<string> {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-cursor-skills-"));
const target = path.join(tmp, "skills");
await fs.mkdir(target, { recursive: true });
const availableEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredNames = new Set(resolvePaperclipDesiredSkillNames(config, availableEntries));
for (const entry of availableEntries) {
if (!desiredNames.has(entry.key)) continue;
await fs.symlink(entry.source, path.join(target, entry.runtimeName));
}
return target;
}
type EnsureCursorSkillsInjectedOptions = {
skillsDir?: string | null;
skillsEntries?: Array<{ key: string; runtimeName: string; source: string }>;
@@ -162,6 +188,11 @@ export async function ensureCursorSkillsInjected(
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
@@ -190,9 +221,11 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
const cursorSkillEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredCursorSkillNames = resolvePaperclipDesiredSkillNames(config, cursorSkillEntries);
await ensureCursorSkillsInjected(onLog, {
skillsEntries: cursorSkillEntries.filter((entry) => desiredCursorSkillNames.includes(entry.key)),
});
if (!executionTargetIsRemote) {
await ensureCursorSkillsInjected(onLog, {
skillsEntries: cursorSkillEntries.filter((entry) => desiredCursorSkillNames.includes(entry.key)),
});
}
const envConfig = parseObject(config.env);
const hasExplicitApiKey =
@@ -265,6 +298,10 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (workspaceHints.length > 0) {
env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
}
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) {
env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
}
for (const [k, v] of Object.entries(envConfig)) {
if (typeof v === "string") env[k] = v;
}
@@ -278,8 +315,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
const billingType = resolveCursorBillingType(effectiveEnv);
const runtimeEnv = ensurePathInEnv(effectiveEnv);
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
@@ -294,18 +331,77 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return asStringArray(config.args);
})();
const autoTrustEnabled = !hasCursorTrustBypassArg(extraArgs);
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let localSkillsDir: string | null = null;
if (executionTargetIsRemote) {
try {
localSkillsDir = await buildCursorSkillsDir(config);
await onLog(
"stdout",
`[paperclip] Syncing workspace and Cursor runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
const preparedExecutionTargetRuntime = await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "cursor",
workspaceLocalDir: cwd,
assets: [{
key: "skills",
localDir: localSkillsDir,
followSymlinks: true,
}],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
}
const remoteHomeDir = managedHome && preparedExecutionTargetRuntime.runtimeRootDir
? preparedExecutionTargetRuntime.runtimeRootDir
: await readAdapterExecutionTargetHomeDir(runId, executionTarget, {
cwd,
env,
timeoutSec,
graceSec,
onLog,
});
if (remoteHomeDir && preparedExecutionTargetRuntime.assetDirs.skills) {
const remoteSkillsDir = path.posix.join(remoteHomeDir, ".cursor", "skills");
await runAdapterExecutionTargetShellCommand(
runId,
executionTarget,
`mkdir -p ${JSON.stringify(path.posix.dirname(remoteSkillsDir))} && rm -rf ${JSON.stringify(remoteSkillsDir)} && cp -a ${JSON.stringify(preparedExecutionTargetRuntime.assetDirs.skills)} ${JSON.stringify(remoteSkillsDir)}`,
{ cwd, env, timeoutSec, graceSec, onLog },
);
}
} catch (error) {
await Promise.allSettled([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(localSkillsDir, { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
throw error;
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionId = canResumeSession ? runtimeSessionId : null;
if (runtimeSessionId && !canResumeSession) {
if (executionTargetIsRemote && runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Cursor session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
`[paperclip] Cursor session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Cursor session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
@@ -387,7 +483,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
const buildArgs = (resumeSessionId: string | null) => {
const args = ["-p", "--output-format", "stream-json", "--workspace", cwd];
const args = ["-p", "--output-format", "stream-json", "--workspace", effectiveExecutionCwd];
if (resumeSessionId) args.push("--resume", resumeSessionId);
if (model) args.push("--model", model);
if (mode) args.push("--mode", mode);
@@ -402,7 +498,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "cursor",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandNotes,
commandArgs: args,
env: loggedEnv,
@@ -436,7 +532,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}
};
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env,
timeoutSec,
@@ -488,10 +584,15 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
cwd,
cwd: effectiveExecutionCwd,
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
} as Record<string, unknown>)
: null;
const parsedError = typeof attempt.parsed.errorMessage === "string" ? attempt.parsed.errorMessage.trim() : "";
@@ -527,20 +628,32 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
};
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isCursorUnknownSessionError(initial.proc.stdout, initial.proc.stderr)
) {
await onLog(
"stdout",
`[paperclip] Cursor resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
try {
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isCursorUnknownSessionError(initial.proc.stdout, initial.proc.stderr)
) {
await onLog(
"stdout",
`[paperclip] Cursor resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
}
return toResult(initial);
} finally {
if (restoreRemoteWorkspace) {
await onLog(
"stdout",
`[paperclip] Restoring workspace changes from ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
await restoreRemoteWorkspace();
}
if (localSkillsDir) {
await fs.rm(localSkillsDir, { recursive: true, force: true }).catch(() => undefined);
}
}
return toResult(initial);
}

View File

@@ -0,0 +1,272 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: [
JSON.stringify({ type: "system", subtype: "init", session_id: "gemini-session-1", model: "gemini-2.5-pro" }),
JSON.stringify({ type: "assistant", message: { content: [{ type: "output_text", text: "hello" }] } }),
JSON.stringify({
type: "result",
subtype: "success",
session_id: "gemini-session-1",
usage: { promptTokenCount: 1, cachedContentTokenCount: 0, candidatesTokenCount: 1 },
result: "hello",
}),
].join("\n"),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: gemini"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
runSshCommand: vi.fn(async () => ({
stdout: "/home/agent",
stderr: "",
exitCode: 0,
})),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("gemini remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs Gemini skills, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-gemini-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
const result = await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Gemini Builder",
adapterType: "gemini_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "gemini",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
paperclipApiUrl: "http://198.51.100.10:3102",
},
},
onLog: async () => {},
});
expect(result.sessionParams).toMatchObject({
sessionId: "gemini-session-1",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
paperclipApiUrl: "http://198.51.100.10:3102",
},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/gemini/skills",
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".gemini/skills"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://198.51.100.10:3102");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
});
it("resumes saved Gemini sessions for remote SSH execution only when the identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-gemini-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Gemini Builder",
adapterType: "gemini_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "gemini",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--resume");
expect(call?.[2]).toContain("session-123");
});
it("restores the remote workspace if skills sync fails after workspace prep", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-gemini-remote-sync-fail-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
syncDirectoryToSsh.mockRejectedValueOnce(new Error("sync failed"));
await expect(execute({
runId: "run-sync-fail",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Gemini Builder",
adapterType: "gemini_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "gemini",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
})).rejects.toThrow("sync failed");
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
expect(runChildProcess).not.toHaveBeenCalled();
});
});

View File

@@ -4,6 +4,22 @@ import os from "node:os";
import path from "node:path";
import { fileURLToPath } from "node:url";
import type { AdapterExecutionContext, AdapterExecutionResult } from "@paperclipai/adapter-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
readAdapterExecutionTargetHomeDir,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
} from "@paperclipai/adapter-utils/execution-target";
import {
asBoolean,
asNumber,
@@ -12,12 +28,10 @@ import {
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePaperclipSkillSymlink,
joinPromptSections,
ensurePathInEnv,
readPaperclipRuntimeSkillEntries,
resolveCommandForLogs,
resolvePaperclipDesiredSkillNames,
removeMaintainerOnlySkillSymlinks,
parseObject,
@@ -136,8 +150,28 @@ async function ensureGeminiSkillsInjected(
}
}
async function buildGeminiSkillsDir(
config: Record<string, unknown>,
): Promise<string> {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-gemini-skills-"));
const target = path.join(tmp, "skills");
await fs.mkdir(target, { recursive: true });
const availableEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredNames = new Set(resolvePaperclipDesiredSkillNames(config, availableEntries));
for (const entry of availableEntries) {
if (!desiredNames.has(entry.key)) continue;
await fs.symlink(entry.source, path.join(target, entry.runtimeName));
}
return target;
}
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
@@ -166,7 +200,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
const geminiSkillEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredGeminiSkillNames = resolvePaperclipDesiredSkillNames(config, geminiSkillEntries);
await ensureGeminiSkillsInjected(onLog, geminiSkillEntries, desiredGeminiSkillNames);
if (!executionTargetIsRemote) {
await ensureGeminiSkillsInjected(onLog, geminiSkillEntries, desiredGeminiSkillNames);
}
const envConfig = parseObject(config.env);
const hasExplicitApiKey =
@@ -211,6 +247,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
for (const [key, value] of Object.entries(envConfig)) {
if (typeof value === "string") env[key] = value;
@@ -225,8 +263,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
const billingType = resolveGeminiBillingType(effectiveEnv);
const runtimeEnv = ensurePathInEnv(effectiveEnv);
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
@@ -240,18 +278,78 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (fromExtraArgs.length > 0) return fromExtraArgs;
return asStringArray(config.args);
})();
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let remoteSkillsDir: string | null = null;
let localSkillsDir: string | null = null;
if (executionTargetIsRemote) {
try {
localSkillsDir = await buildGeminiSkillsDir(config);
await onLog(
"stdout",
`[paperclip] Syncing workspace and Gemini runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
const preparedExecutionTargetRuntime = await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "gemini",
workspaceLocalDir: cwd,
assets: [{
key: "skills",
localDir: localSkillsDir,
followSymlinks: true,
}],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
}
const remoteHomeDir = managedHome && preparedExecutionTargetRuntime.runtimeRootDir
? preparedExecutionTargetRuntime.runtimeRootDir
: await readAdapterExecutionTargetHomeDir(runId, executionTarget, {
cwd,
env,
timeoutSec,
graceSec,
onLog,
});
if (remoteHomeDir && preparedExecutionTargetRuntime.assetDirs.skills) {
remoteSkillsDir = path.posix.join(remoteHomeDir, ".gemini", "skills");
await runAdapterExecutionTargetShellCommand(
runId,
executionTarget,
`mkdir -p ${JSON.stringify(path.posix.dirname(remoteSkillsDir))} && rm -rf ${JSON.stringify(remoteSkillsDir)} && cp -a ${JSON.stringify(preparedExecutionTargetRuntime.assetDirs.skills)} ${JSON.stringify(remoteSkillsDir)}`,
{ cwd, env, timeoutSec, graceSec, onLog },
);
}
} catch (error) {
await Promise.allSettled([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
throw error;
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionId = canResumeSession ? runtimeSessionId : null;
if (runtimeSessionId && !canResumeSession) {
if (executionTargetIsRemote && runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Gemini session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
`[paperclip] Gemini session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Gemini session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
@@ -350,7 +448,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "gemini_local",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandNotes,
commandArgs: args.map((value, index) => (
index === args.length - 1 ? `<prompt ${prompt.length} chars>` : value
@@ -362,7 +460,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
});
}
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env,
timeoutSec,
@@ -416,10 +514,15 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
cwd,
cwd: effectiveExecutionCwd,
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
} as Record<string, unknown>)
: null;
const parsedError = typeof attempt.parsed.errorMessage === "string" ? attempt.parsed.errorMessage.trim() : "";
@@ -458,20 +561,27 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
};
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isGeminiUnknownSessionError(initial.proc.stdout, initial.proc.stderr)
) {
await onLog(
"stdout",
`[paperclip] Gemini resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true, true);
}
try {
const initial = await runAttempt(sessionId);
if (
sessionId &&
!initial.proc.timedOut &&
(initial.proc.exitCode ?? 0) !== 0 &&
isGeminiUnknownSessionError(initial.proc.stdout, initial.proc.stderr)
) {
await onLog(
"stdout",
`[paperclip] Gemini resume session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true, true);
}
return toResult(initial);
return toResult(initial);
} finally {
await Promise.all([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
}
}

View File

@@ -422,6 +422,8 @@ function buildWakeText(
" - GET /api/issues/{issueId}/comments",
" - Execute the issue instructions exactly. If the issue is actionable, take concrete action in this run; do not stop at a plan unless planning was requested.",
" - Leave durable progress with a clear next action. Use child issues for long or parallel delegated work instead of polling agents, sessions, or processes.",
" - Create child issues directly when you know what needs to be done; use POST /api/issues/{issueId}/interactions with kind suggest_tasks, ask_user_questions, or request_confirmation when the board/user must choose, answer, or confirm before you can continue.",
" - For plan approval, update the plan document first, then create request_confirmation targeting the latest plan revision with idempotencyKey confirmation:{issueId}:plan:{revisionId}; wait for acceptance before creating implementation subtasks.",
" - If blocked, PATCH /api/issues/{issueId} with {\"status\":\"blocked\",\"comment\":\"what is blocked, who owns the unblock, and the next action\"}.",
" - If instructions require a comment, POST /api/issues/{issueId}/comments with {\"body\":\"...\"}.",
" - PATCH /api/issues/{issueId} with {\"status\":\"done\",\"comment\":\"what changed and why\"}.",

View File

@@ -0,0 +1,225 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: [
JSON.stringify({ type: "step_start", sessionID: "session_123" }),
JSON.stringify({ type: "text", sessionID: "session_123", part: { text: "hello" } }),
JSON.stringify({
type: "step_finish",
sessionID: "session_123",
part: { cost: 0.001, tokens: { input: 1, output: 1, reasoning: 0, cache: { read: 0, write: 0 } } },
}),
].join("\n"),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: opencode"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
runSshCommand: vi.fn(async () => ({
stdout: "/home/agent",
stderr: "",
exitCode: 0,
})),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("opencode remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs OpenCode skills, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-opencode-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
const result = await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "OpenCode Builder",
adapterType: "opencode_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "opencode",
model: "opencode/gpt-5-nano",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
paperclipApiUrl: "http://198.51.100.10:3102",
},
},
onLog: async () => {},
});
expect(result.sessionParams).toMatchObject({
sessionId: "session_123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
paperclipApiUrl: "http://198.51.100.10:3102",
},
});
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(2);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/opencode/xdgConfig",
}));
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/opencode/skills",
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".claude/skills"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://198.51.100.10:3102");
expect(call?.[3].env.XDG_CONFIG_HOME).toBe("/remote/workspace/.paperclip-runtime/opencode/xdgConfig");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
});
it("resumes saved OpenCode sessions for remote SSH execution only when the identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-opencode-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "OpenCode Builder",
adapterType: "opencode_local",
adapterConfig: {},
},
runtime: {
sessionId: "session-123",
sessionParams: {
sessionId: "session-123",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "opencode",
model: "opencode/gpt-5-nano",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--session");
expect(call?.[2]).toContain("session-123");
});
});

View File

@@ -3,6 +3,22 @@ import os from "node:os";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { inferOpenAiCompatibleBiller, type AdapterExecutionContext, type AdapterExecutionResult } from "@paperclipai/adapter-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
readAdapterExecutionTargetHomeDir,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
@@ -12,10 +28,8 @@ import {
joinPromptSections,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePaperclipSkillSymlink,
ensurePathInEnv,
resolveCommandForLogs,
renderTemplate,
renderPaperclipWakePrompt,
stringifyPaperclipWakePayload,
@@ -93,8 +107,26 @@ async function ensureOpenCodeSkillsInjected(
}
}
async function buildOpenCodeSkillsDir(config: Record<string, unknown>): Promise<string> {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-opencode-skills-"));
const target = path.join(tmp, "skills");
await fs.mkdir(target, { recursive: true });
const availableEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredNames = new Set(resolvePaperclipDesiredSkillNames(config, availableEntries));
for (const entry of availableEntries) {
if (!desiredNames.has(entry.key)) continue;
await fs.symlink(entry.source, path.join(target, entry.runtimeName));
}
return target;
}
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
@@ -123,11 +155,13 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
const openCodeSkillEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredOpenCodeSkillNames = resolvePaperclipDesiredSkillNames(config, openCodeSkillEntries);
await ensureOpenCodeSkillsInjected(
onLog,
openCodeSkillEntries,
desiredOpenCodeSkillNames,
);
if (!executionTargetIsRemote) {
await ensureOpenCodeSkillsInjected(
onLog,
openCodeSkillEntries,
desiredOpenCodeSkillNames,
);
}
const envConfig = parseObject(config.env);
const hasExplicitApiKey =
@@ -172,6 +206,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
for (const [key, value] of Object.entries(envConfig)) {
if (typeof value === "string") env[key] = value;
@@ -185,26 +221,30 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
env.PAPERCLIP_API_KEY = authToken;
}
const preparedRuntimeConfig = await prepareOpenCodeRuntimeConfig({ env, config });
const localRuntimeConfigHome =
preparedRuntimeConfig.notes.length > 0 ? preparedRuntimeConfig.env.XDG_CONFIG_HOME : "";
try {
const runtimeEnv = Object.fromEntries(
Object.entries(ensurePathInEnv({ ...process.env, ...preparedRuntimeConfig.env })).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
),
);
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(preparedRuntimeConfig.env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
await ensureOpenCodeModelConfiguredAndAvailable({
model,
command,
cwd,
env: runtimeEnv,
});
if (!executionTargetIsRemote) {
await ensureOpenCodeModelConfiguredAndAvailable({
model,
command,
cwd,
env: runtimeEnv,
});
}
const timeoutSec = asNumber(config.timeoutSec, 0);
const graceSec = asNumber(config.graceSec, 20);
@@ -213,18 +253,80 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (fromExtraArgs.length > 0) return fromExtraArgs;
return asStringArray(config.args);
})();
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let localSkillsDir: string | null = null;
if (executionTargetIsRemote) {
localSkillsDir = await buildOpenCodeSkillsDir(config);
await onLog(
"stdout",
`[paperclip] Syncing workspace and OpenCode runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
const preparedExecutionTargetRuntime = await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "opencode",
workspaceLocalDir: cwd,
assets: [
{
key: "skills",
localDir: localSkillsDir,
followSymlinks: true,
},
...(localRuntimeConfigHome
? [{
key: "xdgConfig",
localDir: localRuntimeConfigHome,
}]
: []),
],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
preparedRuntimeConfig.env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
}
if (localRuntimeConfigHome && preparedExecutionTargetRuntime.assetDirs.xdgConfig) {
preparedRuntimeConfig.env.XDG_CONFIG_HOME = preparedExecutionTargetRuntime.assetDirs.xdgConfig;
}
const remoteHomeDir = managedHome && preparedExecutionTargetRuntime.runtimeRootDir
? preparedExecutionTargetRuntime.runtimeRootDir
: await readAdapterExecutionTargetHomeDir(runId, executionTarget, {
cwd,
env: preparedRuntimeConfig.env,
timeoutSec,
graceSec,
onLog,
});
if (remoteHomeDir && preparedExecutionTargetRuntime.assetDirs.skills) {
const remoteSkillsDir = path.posix.join(remoteHomeDir, ".claude", "skills");
await runAdapterExecutionTargetShellCommand(
runId,
executionTarget,
`mkdir -p ${JSON.stringify(path.posix.dirname(remoteSkillsDir))} && rm -rf ${JSON.stringify(remoteSkillsDir)} && cp -a ${JSON.stringify(preparedExecutionTargetRuntime.assetDirs.skills)} ${JSON.stringify(remoteSkillsDir)}`,
{ cwd, env: preparedRuntimeConfig.env, timeoutSec, graceSec, onLog },
);
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionId = canResumeSession ? runtimeSessionId : null;
if (runtimeSessionId && !canResumeSession) {
if (executionTargetIsRemote && runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] OpenCode session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
`[paperclip] OpenCode session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`,
);
} else if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] OpenCode session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
const instructionsFilePath = asString(config.instructionsFilePath, "").trim();
@@ -314,7 +416,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "opencode_local",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandNotes,
commandArgs: [...args, `<stdin prompt ${prompt.length} chars>`],
env: loggedEnv,
@@ -324,9 +426,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
});
}
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env: runtimeEnv,
env: preparedRuntimeConfig.env,
stdin: prompt,
timeoutSec,
graceSec,
@@ -364,10 +466,15 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const resolvedSessionParams = resolvedSessionId
? ({
sessionId: resolvedSessionId,
cwd,
cwd: effectiveExecutionCwd,
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
} as Record<string, unknown>)
: null;
@@ -408,23 +515,30 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
};
const initial = await runAttempt(sessionId);
const initialFailed =
!initial.proc.timedOut && ((initial.proc.exitCode ?? 0) !== 0 || Boolean(initial.parsed.errorMessage));
if (
sessionId &&
initialFailed &&
isOpenCodeUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] OpenCode session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
}
try {
const initial = await runAttempt(sessionId);
const initialFailed =
!initial.proc.timedOut && ((initial.proc.exitCode ?? 0) !== 0 || Boolean(initial.parsed.errorMessage));
if (
sessionId &&
initialFailed &&
isOpenCodeUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] OpenCode session "${sessionId}" is unavailable; retrying with a fresh session.\n`,
);
const retry = await runAttempt(null);
return toResult(retry, true);
}
return toResult(initial);
return toResult(initial);
} finally {
await Promise.all([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
}
} finally {
await preparedRuntimeConfig.cleanup();
}

View File

@@ -40,6 +40,33 @@ describe("parseOpenCodeJsonl", () => {
});
expect(parsed.costUsd).toBeCloseTo(0.0025, 6);
expect(parsed.errorMessage).toContain("model unavailable");
expect(parsed.toolErrors).toEqual([]);
});
it("keeps failed tool calls separate from fatal run errors", () => {
const stdout = [
JSON.stringify({
type: "tool_use",
sessionID: "session_123",
part: {
state: {
status: "error",
error: "File not found: e2b-adapter-result.txt",
},
},
}),
JSON.stringify({
type: "text",
sessionID: "session_123",
part: { text: "Recovered and completed the task" },
}),
].join("\n");
const parsed = parseOpenCodeJsonl(stdout);
expect(parsed.sessionId).toBe("session_123");
expect(parsed.summary).toBe("Recovered and completed the task");
expect(parsed.errorMessage).toBeNull();
expect(parsed.toolErrors).toEqual(["File not found: e2b-adapter-result.txt"]);
});
it("detects unknown session errors", () => {

View File

@@ -23,6 +23,7 @@ export function parseOpenCodeJsonl(stdout: string) {
let sessionId: string | null = null;
const messages: string[] = [];
const errors: string[] = [];
const toolErrors: string[] = [];
const usage = {
inputTokens: 0,
cachedInputTokens: 0,
@@ -65,7 +66,7 @@ export function parseOpenCodeJsonl(stdout: string) {
const state = parseObject(part.state);
if (asString(state.status, "") === "error") {
const text = asString(state.error, "").trim();
if (text) errors.push(text);
if (text) toolErrors.push(text);
}
continue;
}
@@ -83,6 +84,7 @@ export function parseOpenCodeJsonl(stdout: string) {
usage,
costUsd,
errorMessage: errors.length > 0 ? errors.join("\n") : null,
toolErrors,
};
}

View File

@@ -0,0 +1,229 @@
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
const {
runChildProcess,
ensureCommandResolvable,
resolveCommandForLogs,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
} = vi.hoisted(() => ({
runChildProcess: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: JSON.stringify({
type: "turn_end",
message: {
role: "assistant",
content: "done",
usage: {
input: 10,
output: 20,
cacheRead: 0,
cost: { total: 0.01 },
},
},
toolResults: [],
}),
stderr: "",
pid: 123,
startedAt: new Date().toISOString(),
})),
ensureCommandResolvable: vi.fn(async () => undefined),
resolveCommandForLogs: vi.fn(async () => "ssh://fixture@127.0.0.1:2222/remote/workspace :: pi"),
prepareWorkspaceForSshExecution: vi.fn(async () => undefined),
restoreWorkspaceFromSshExecution: vi.fn(async () => undefined),
runSshCommand: vi.fn(async () => ({
stdout: "",
stderr: "",
exitCode: 0,
})),
syncDirectoryToSsh: vi.fn(async () => undefined),
}));
vi.mock("@paperclipai/adapter-utils/server-utils", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/server-utils")>(
"@paperclipai/adapter-utils/server-utils",
);
return {
...actual,
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
};
});
vi.mock("@paperclipai/adapter-utils/ssh", async () => {
const actual = await vi.importActual<typeof import("@paperclipai/adapter-utils/ssh")>(
"@paperclipai/adapter-utils/ssh",
);
return {
...actual,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
runSshCommand,
syncDirectoryToSsh,
};
});
import { execute } from "./execute.js";
describe("pi remote execution", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.clearAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("prepares the workspace, syncs Pi skills, and restores workspace changes for remote SSH execution", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-pi-remote-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
const result = await execute({
runId: "run-1",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Pi Builder",
adapterType: "pi_local",
adapterConfig: {},
},
runtime: {
sessionId: null,
sessionParams: null,
sessionDisplayId: null,
taskKey: null,
},
config: {
command: "pi",
model: "openai/gpt-5.4-mini",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
paperclipApiUrl: "http://198.51.100.10:3102",
},
},
onLog: async () => {},
});
expect(result.sessionParams).toMatchObject({
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
paperclipApiUrl: "http://198.51.100.10:3102",
},
});
expect(String(result.sessionId)).toContain("/remote/workspace/.paperclip-runtime/pi/sessions/");
expect(prepareWorkspaceForSshExecution).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledTimes(1);
expect(syncDirectoryToSsh).toHaveBeenCalledWith(expect.objectContaining({
remoteDir: "/remote/workspace/.paperclip-runtime/pi/skills",
followSymlinks: true,
}));
expect(runSshCommand).toHaveBeenCalledWith(
expect.anything(),
expect.stringContaining(".paperclip-runtime/pi/sessions"),
expect.anything(),
);
const call = runChildProcess.mock.calls[0] as unknown as
| [string, string, string[], { env: Record<string, string>; remoteExecution?: { remoteCwd: string } | null }]
| undefined;
expect(call?.[2]).toContain("--session");
expect(call?.[2]).toContain("--skill");
expect(call?.[2]).toContain("/remote/workspace/.paperclip-runtime/pi/skills");
expect(call?.[3].env.PAPERCLIP_API_URL).toBe("http://198.51.100.10:3102");
expect(call?.[3].remoteExecution?.remoteCwd).toBe("/remote/workspace");
expect(restoreWorkspaceFromSshExecution).toHaveBeenCalledTimes(1);
});
it("resumes saved Pi sessions for remote SSH execution only when the identity matches", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-pi-remote-resume-"));
cleanupDirs.push(rootDir);
const workspaceDir = path.join(rootDir, "workspace");
await mkdir(workspaceDir, { recursive: true });
await execute({
runId: "run-ssh-resume",
agent: {
id: "agent-1",
companyId: "company-1",
name: "Pi Builder",
adapterType: "pi_local",
adapterConfig: {},
},
runtime: {
sessionId: "/remote/workspace/.paperclip-runtime/pi/sessions/session-123.jsonl",
sessionParams: {
sessionId: "/remote/workspace/.paperclip-runtime/pi/sessions/session-123.jsonl",
cwd: "/remote/workspace",
remoteExecution: {
transport: "ssh",
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteCwd: "/remote/workspace",
},
},
sessionDisplayId: "session-123",
taskKey: null,
},
config: {
command: "pi",
model: "openai/gpt-5.4-mini",
},
context: {
paperclipWorkspace: {
cwd: workspaceDir,
source: "project_primary",
},
},
executionTransport: {
remoteExecution: {
host: "127.0.0.1",
port: 2222,
username: "fixture",
remoteWorkspacePath: "/remote/workspace",
remoteCwd: "/remote/workspace",
privateKey: "PRIVATE KEY",
knownHosts: "[127.0.0.1]:2222 ssh-ed25519 AAAA",
strictHostKeyChecking: true,
},
},
onLog: async () => {},
});
const call = runChildProcess.mock.calls[0] as unknown as [string, string, string[]] | undefined;
expect(call?.[2]).toContain("--session");
expect(call?.[2]).toContain("/remote/workspace/.paperclip-runtime/pi/sessions/session-123.jsonl");
});
});

View File

@@ -3,6 +3,21 @@ import os from "node:os";
import path from "node:path";
import { fileURLToPath } from "node:url";
import { inferOpenAiCompatibleBiller, type AdapterExecutionContext, type AdapterExecutionResult } from "@paperclipai/adapter-utils";
import {
adapterExecutionTargetIsRemote,
adapterExecutionTargetPaperclipApiUrl,
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetFile,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
@@ -12,11 +27,9 @@ import {
joinPromptSections,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePaperclipSkillSymlink,
ensurePathInEnv,
readPaperclipRuntimeSkillEntries,
resolveCommandForLogs,
resolvePaperclipDesiredSkillNames,
removeMaintainerOnlySkillSymlinks,
renderTemplate,
@@ -95,6 +108,19 @@ async function ensurePiSkillsInjected(
}
}
async function buildPiSkillsDir(config: Record<string, unknown>): Promise<string> {
const tmp = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-pi-skills-"));
const target = path.join(tmp, "skills");
await fs.mkdir(target, { recursive: true });
const availableEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredNames = new Set(resolvePaperclipDesiredSkillNames(config, availableEntries));
for (const entry of availableEntries) {
if (!desiredNames.has(entry.key)) continue;
await fs.symlink(entry.source, path.join(target, entry.runtimeName));
}
return target;
}
function resolvePiBiller(env: Record<string, string>, provider: string | null): string {
return inferOpenAiCompatibleBiller(env, null) ?? provider ?? "unknown";
}
@@ -109,8 +135,18 @@ function buildSessionPath(agentId: string, timestamp: string): string {
return path.join(PAPERCLIP_SESSIONS_DIR, `${safeTimestamp}-${agentId}.jsonl`);
}
function buildRemoteSessionPath(runtimeRootDir: string, agentId: string, timestamp: string): string {
const safeTimestamp = timestamp.replace(/[:.]/g, "-");
return path.posix.join(runtimeRootDir, "sessions", `${safeTimestamp}-${agentId}.jsonl`);
}
export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExecutionResult> {
const { runId, agent, runtime, config, context, onLog, onMeta, onSpawn, authToken } = ctx;
const executionTarget = readAdapterExecutionTarget({
executionTarget: ctx.executionTarget,
legacyRemoteExecution: ctx.executionTransport?.remoteExecution,
});
const executionTargetIsRemote = adapterExecutionTargetIsRemote(executionTarget);
const promptTemplate = asString(
config.promptTemplate,
@@ -140,15 +176,18 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const useConfiguredInsteadOfAgentHome = workspaceSource === "agent_home" && configuredCwd.length > 0;
const effectiveWorkspaceCwd = useConfiguredInsteadOfAgentHome ? "" : workspaceCwd;
const cwd = effectiveWorkspaceCwd || configuredCwd || process.cwd();
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
// Ensure sessions directory exists
await ensureSessionsDir();
// Inject skills
if (!executionTargetIsRemote) {
await ensureSessionsDir();
}
const piSkillEntries = await readPaperclipRuntimeSkillEntries(config, __moduleDir);
const desiredPiSkillNames = resolvePaperclipDesiredSkillNames(config, piSkillEntries);
await ensurePiSkillsInjected(onLog, piSkillEntries, desiredPiSkillNames);
if (!executionTargetIsRemote) {
await ensurePiSkillsInjected(onLog, piSkillEntries, desiredPiSkillNames);
}
// Build environment
const envConfig = parseObject(config.env);
@@ -156,7 +195,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
typeof envConfig.PAPERCLIP_API_KEY === "string" && envConfig.PAPERCLIP_API_KEY.trim().length > 0;
const env: Record<string, string> = { ...buildPaperclipEnv(agent) };
env.PAPERCLIP_RUN_ID = runId;
const wakeTaskId =
(typeof context.taskId === "string" && context.taskId.trim().length > 0 && context.taskId.trim()) ||
(typeof context.issueId === "string" && context.issueId.trim().length > 0 && context.issueId.trim()) ||
@@ -196,6 +235,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
for (const [key, value] of Object.entries(envConfig)) {
if (typeof value === "string") env[key] = value;
@@ -203,27 +244,51 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (!hasExplicitApiKey && authToken) {
env.PAPERCLIP_API_KEY = authToken;
}
// Prepend installed skill `bin/` dirs to PATH so an agent's bash tool can
// invoke skill binaries (e.g. `paperclip-get-issue`) by name. Without this,
// any pi_local agent whose AGENTS.md calls a skill command via bash hits
// exit 127 "command not found". Only include skills that ensurePiSkillsInjected
// actually linked — otherwise non-injected skills' binaries would be reachable
// to the agent.
const injectedSkillKeys = new Set(desiredPiSkillNames);
const skillBinDirs = piSkillEntries
.filter((entry) => injectedSkillKeys.has(entry.key) && entry.source.length > 0)
.map((entry) => path.join(entry.source, "bin"));
const mergedEnv = ensurePathInEnv({ ...process.env, ...env });
const pathKey =
typeof mergedEnv.Path === "string" && mergedEnv.Path.length > 0 && !mergedEnv.PATH
? "Path"
: "PATH";
const basePath = mergedEnv[pathKey] ?? "";
if (skillBinDirs.length > 0) {
const existing = basePath.split(path.delimiter).filter(Boolean);
const additions = skillBinDirs.filter((dir) => !existing.includes(dir));
if (additions.length > 0) {
mergedEnv[pathKey] = [...additions, basePath].filter(Boolean).join(path.delimiter);
}
}
const runtimeEnv = Object.fromEntries(
Object.entries(ensurePathInEnv({ ...process.env, ...env })).filter(
Object.entries(mergedEnv).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
),
);
await ensureCommandResolvable(command, cwd, runtimeEnv);
const resolvedCommand = await resolveCommandForLogs(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
// Validate model is available before execution
await ensurePiModelConfiguredAndAvailable({
model,
command,
cwd,
env: runtimeEnv,
});
if (!executionTargetIsRemote) {
await ensurePiModelConfiguredAndAvailable({
model,
command,
cwd,
env: runtimeEnv,
});
}
const timeoutSec = asNumber(config.timeoutSec, 0);
const graceSec = asNumber(config.graceSec, 20);
@@ -232,31 +297,84 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (fromExtraArgs.length > 0) return fromExtraArgs;
return asStringArray(config.args);
})();
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let remoteRuntimeRootDir: string | null = null;
let localSkillsDir: string | null = null;
let remoteSkillsDir: string | null = null;
if (executionTargetIsRemote) {
try {
localSkillsDir = await buildPiSkillsDir(config);
await onLog(
"stdout",
`[paperclip] Syncing workspace and Pi runtime assets to ${describeAdapterExecutionTarget(executionTarget)}.\n`,
);
const preparedRemoteRuntime = await prepareAdapterExecutionTargetRuntime({
target: executionTarget,
adapterKey: "pi",
workspaceLocalDir: cwd,
assets: [
{
key: "skills",
localDir: localSkillsDir,
followSymlinks: true,
},
],
});
restoreRemoteWorkspace = () => preparedRemoteRuntime.restoreWorkspace();
if (adapterExecutionTargetUsesManagedHome(executionTarget) && preparedRemoteRuntime.runtimeRootDir) {
env.HOME = preparedRemoteRuntime.runtimeRootDir;
}
remoteRuntimeRootDir = preparedRemoteRuntime.runtimeRootDir;
remoteSkillsDir = preparedRemoteRuntime.assetDirs.skills ?? null;
} catch (error) {
await Promise.allSettled([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
throw error;
}
}
// Handle session
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
const runtimeSessionCwd = asString(runtimeSessionParams.cwd, "");
const runtimeRemoteExecution = parseObject(runtimeSessionParams.remoteExecution);
const canResumeSession =
runtimeSessionId.length > 0 &&
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(cwd));
const sessionPath = canResumeSession ? runtimeSessionId : buildSessionPath(agent.id, new Date().toISOString());
(runtimeSessionCwd.length === 0 || path.resolve(runtimeSessionCwd) === path.resolve(effectiveExecutionCwd)) &&
adapterExecutionTargetSessionMatches(runtimeRemoteExecution, executionTarget);
const sessionPath = canResumeSession
? runtimeSessionId
: executionTargetIsRemote && remoteRuntimeRootDir
? buildRemoteSessionPath(remoteRuntimeRootDir, agent.id, new Date().toISOString())
: buildSessionPath(agent.id, new Date().toISOString());
if (runtimeSessionId && !canResumeSession) {
await onLog(
"stdout",
`[paperclip] Pi session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${cwd}".\n`,
executionTargetIsRemote
? `[paperclip] Pi session "${runtimeSessionId}" does not match the current remote execution identity and will not be resumed in "${effectiveExecutionCwd}". Starting a fresh remote session.\n`
: `[paperclip] Pi session "${runtimeSessionId}" was saved for cwd "${runtimeSessionCwd}" and will not be resumed in "${effectiveExecutionCwd}".\n`,
);
}
// Ensure session file exists (Pi requires this on first run)
if (!canResumeSession) {
try {
await fs.writeFile(sessionPath, "", { flag: "wx" });
} catch (err) {
// File may already exist, that's ok
if ((err as NodeJS.ErrnoException).code !== "EEXIST") {
throw err;
if (executionTargetIsRemote) {
await ensureAdapterExecutionTargetFile(runId, executionTarget, sessionPath, {
cwd,
env,
timeoutSec: 15,
graceSec: 5,
onLog,
});
} else {
try {
await fs.writeFile(sessionPath, "", { flag: "wx" });
} catch (err) {
if ((err as NodeJS.ErrnoException).code !== "EEXIST") {
throw err;
}
}
}
}
@@ -267,7 +385,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
? path.resolve(cwd, instructionsFilePath)
: "";
const instructionsFileDir = instructionsFilePath ? `${path.dirname(instructionsFilePath)}/` : "";
let systemPromptExtension = "";
let instructionsReadFailed = false;
if (resolvedInstructionsFilePath) {
@@ -341,26 +459,24 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const buildArgs = (sessionFile: string): string[] => {
const args: string[] = [];
// Use JSON mode for structured output with print mode (non-interactive)
args.push("--mode", "json");
args.push("-p"); // Non-interactive mode: process prompt and exit
// Use --append-system-prompt to extend Pi's default system prompt
args.push("--append-system-prompt", renderedSystemPromptExtension);
if (provider) args.push("--provider", provider);
if (modelId) args.push("--model", modelId);
if (thinking) args.push("--thinking", thinking);
args.push("--tools", "read,bash,edit,write,grep,find,ls");
args.push("--session", sessionFile);
// Add Paperclip skills directory so Pi can load the paperclip skill
args.push("--skill", PI_AGENT_SKILLS_DIR);
args.push("--skill", remoteSkillsDir ?? PI_AGENT_SKILLS_DIR);
if (extraArgs.length > 0) args.push(...extraArgs);
// Add the user prompt as the last argument
args.push(userPrompt);
@@ -373,7 +489,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onMeta({
adapterType: "pi_local",
command: resolvedCommand,
cwd,
cwd: effectiveExecutionCwd,
commandNotes,
commandArgs: args,
env: loggedEnv,
@@ -391,13 +507,13 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
await onLog(stream, chunk);
return;
}
// Buffer stdout and emit only complete lines
stdoutBuffer += chunk;
const lines = stdoutBuffer.split("\n");
// Keep the last (potentially incomplete) line in the buffer
stdoutBuffer = lines.pop() || "";
// Emit complete lines
for (const line of lines) {
if (line) {
@@ -406,20 +522,20 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}
};
const proc = await runChildProcess(runId, command, args, {
const proc = await runAdapterExecutionTargetProcess(runId, executionTarget, command, args, {
cwd,
env: runtimeEnv,
env: executionTargetIsRemote ? env : runtimeEnv,
timeoutSec,
graceSec,
onSpawn,
onLog: bufferedOnLog,
});
// Flush any remaining buffer content
if (stdoutBuffer) {
await onLog("stdout", stdoutBuffer);
}
return {
proc,
rawStderr: proc.stderr,
@@ -447,7 +563,18 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const resolvedSessionId = clearSessionOnMissingSession ? null : sessionPath;
const resolvedSessionParams = resolvedSessionId
? { sessionId: resolvedSessionId, cwd }
? {
sessionId: resolvedSessionId,
cwd: effectiveExecutionCwd,
...(workspaceId ? { workspaceId } : {}),
...(workspaceRepoUrl ? { repoUrl: workspaceRepoUrl } : {}),
...(workspaceRepoRef ? { repoRef: workspaceRepoRef } : {}),
...(executionTargetIsRemote
? {
remoteExecution: adapterExecutionTargetSessionIdentity(executionTarget),
}
: {}),
}
: null;
const stderrLine = firstNonEmptyLine(attempt.proc.stderr);
@@ -483,30 +610,49 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
};
};
const initial = await runAttempt(sessionPath);
const initialFailed =
!initial.proc.timedOut && ((initial.proc.exitCode ?? 0) !== 0 || initial.parsed.errors.length > 0);
if (
canResumeSession &&
initialFailed &&
isPiUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] Pi session "${runtimeSessionId}" is unavailable; retrying with a fresh session.\n`,
);
const newSessionPath = buildSessionPath(agent.id, new Date().toISOString());
try {
await fs.writeFile(newSessionPath, "", { flag: "wx" });
} catch (err) {
if ((err as NodeJS.ErrnoException).code !== "EEXIST") {
throw err;
}
}
const retry = await runAttempt(newSessionPath);
return toResult(retry, true);
}
try {
const initial = await runAttempt(sessionPath);
const initialFailed =
!initial.proc.timedOut && ((initial.proc.exitCode ?? 0) !== 0 || initial.parsed.errors.length > 0);
return toResult(initial);
if (
canResumeSession &&
initialFailed &&
isPiUnknownSessionError(initial.proc.stdout, initial.rawStderr)
) {
await onLog(
"stdout",
`[paperclip] Pi session "${runtimeSessionId}" is unavailable; retrying with a fresh session.\n`,
);
const newSessionPath = executionTargetIsRemote && remoteRuntimeRootDir
? buildRemoteSessionPath(remoteRuntimeRootDir, agent.id, new Date().toISOString())
: buildSessionPath(agent.id, new Date().toISOString());
if (executionTargetIsRemote) {
await ensureAdapterExecutionTargetFile(runId, executionTarget, newSessionPath, {
cwd,
env,
timeoutSec: 15,
graceSec: 5,
onLog,
});
} else {
try {
await fs.writeFile(newSessionPath, "", { flag: "wx" });
} catch (err) {
if ((err as NodeJS.ErrnoException).code !== "EEXIST") {
throw err;
}
}
}
const retry = await runAttempt(newSessionPath);
return toResult(retry, true);
}
return toResult(initial);
} finally {
await Promise.all([
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
}
}

View File

@@ -31,4 +31,5 @@ export {
formatEmbeddedPostgresError,
} from "./embedded-postgres-error.js";
export { issueRelations } from "./schema/issue_relations.js";
export { issueReferenceMentions } from "./schema/issue_reference_mentions.js";
export * from "./schema/index.js";

View File

@@ -0,0 +1,50 @@
CREATE TABLE IF NOT EXISTS "issue_reference_mentions" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"source_issue_id" uuid NOT NULL,
"target_issue_id" uuid NOT NULL,
"source_kind" text NOT NULL,
"source_record_id" uuid,
"document_key" text,
"matched_text" text,
"created_at" timestamp with time zone DEFAULT now() NOT NULL,
"updated_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_reference_mentions_company_id_companies_id_fk') THEN
ALTER TABLE "issue_reference_mentions" ADD CONSTRAINT "issue_reference_mentions_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_reference_mentions_source_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_reference_mentions" ADD CONSTRAINT "issue_reference_mentions_source_issue_id_issues_id_fk" FOREIGN KEY ("source_issue_id") REFERENCES "public"."issues"("id") ON DELETE cascade ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_reference_mentions_target_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_reference_mentions" ADD CONSTRAINT "issue_reference_mentions_target_issue_id_issues_id_fk" FOREIGN KEY ("target_issue_id") REFERENCES "public"."issues"("id") ON DELETE cascade ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_reference_mentions_company_source_issue_idx" ON "issue_reference_mentions" USING btree ("company_id","source_issue_id");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_reference_mentions_company_target_issue_idx" ON "issue_reference_mentions" USING btree ("company_id","target_issue_id");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_reference_mentions_company_issue_pair_idx" ON "issue_reference_mentions" USING btree ("company_id","source_issue_id","target_issue_id");--> statement-breakpoint
DELETE FROM "issue_reference_mentions"
WHERE "id" IN (
SELECT "id"
FROM (
SELECT
"id",
row_number() OVER (
PARTITION BY "company_id", "source_issue_id", "target_issue_id", "source_kind", "source_record_id"
ORDER BY "created_at", "id"
) AS "row_number"
FROM "issue_reference_mentions"
) AS "duplicates"
WHERE "duplicates"."row_number" > 1
);--> statement-breakpoint
DROP INDEX IF EXISTS "issue_reference_mentions_company_source_mention_uq";--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issue_reference_mentions_company_source_mention_record_uq" ON "issue_reference_mentions" USING btree ("company_id","source_issue_id","target_issue_id","source_kind","source_record_id") WHERE "source_record_id" IS NOT NULL;--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issue_reference_mentions_company_source_mention_null_record_uq" ON "issue_reference_mentions" USING btree ("company_id","source_issue_id","target_issue_id","source_kind") WHERE "source_record_id" IS NULL;

View File

@@ -0,0 +1,3 @@
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "scheduled_retry_at" timestamp with time zone;--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "scheduled_retry_attempt" integer DEFAULT 0 NOT NULL;--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "scheduled_retry_reason" text;

View File

@@ -0,0 +1,9 @@
ALTER TABLE "routine_runs" ADD COLUMN IF NOT EXISTS "dispatch_fingerprint" text;--> statement-breakpoint
ALTER TABLE "issues" ADD COLUMN IF NOT EXISTS "origin_fingerprint" text DEFAULT 'default' NOT NULL;--> statement-breakpoint
DROP INDEX IF EXISTS "issues_open_routine_execution_uq";--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issues_open_routine_execution_uq" ON "issues" USING btree ("company_id","origin_kind","origin_id","origin_fingerprint") WHERE "issues"."origin_kind" = 'routine_execution'
and "issues"."origin_id" is not null
and "issues"."hidden_at" is null
and "issues"."execution_run_id" is not null
and "issues"."status" in ('backlog', 'todo', 'in_progress', 'in_review', 'blocked');--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "routine_runs_dispatch_fingerprint_idx" ON "routine_runs" USING btree ("routine_id","dispatch_fingerprint");

View File

@@ -0,0 +1,65 @@
CREATE TABLE IF NOT EXISTS "issue_thread_interactions" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"issue_id" uuid NOT NULL,
"kind" text NOT NULL,
"status" text DEFAULT 'pending' NOT NULL,
"continuation_policy" text DEFAULT 'wake_assignee' NOT NULL,
"source_comment_id" uuid,
"source_run_id" uuid,
"title" text,
"summary" text,
"created_by_agent_id" uuid,
"created_by_user_id" text,
"resolved_by_agent_id" uuid,
"resolved_by_user_id" text,
"payload" jsonb NOT NULL,
"result" jsonb,
"resolved_at" timestamp with time zone,
"created_at" timestamp with time zone DEFAULT now() NOT NULL,
"updated_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_company_id_companies_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_issue_id_issues_id_fk" FOREIGN KEY ("issue_id") REFERENCES "public"."issues"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_source_comment_id_issue_comments_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_source_comment_id_issue_comments_id_fk" FOREIGN KEY ("source_comment_id") REFERENCES "public"."issue_comments"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_source_run_id_heartbeat_runs_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_source_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("source_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_created_by_agent_id_agents_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_created_by_agent_id_agents_id_fk" FOREIGN KEY ("created_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_thread_interactions_resolved_by_agent_id_agents_id_fk') THEN
ALTER TABLE "issue_thread_interactions" ADD CONSTRAINT "issue_thread_interactions_resolved_by_agent_id_agents_id_fk" FOREIGN KEY ("resolved_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_thread_interactions_issue_idx" ON "issue_thread_interactions" USING btree ("issue_id");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_thread_interactions_company_issue_created_at_idx" ON "issue_thread_interactions" USING btree ("company_id","issue_id","created_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_thread_interactions_company_issue_status_idx" ON "issue_thread_interactions" USING btree ("company_id","issue_id","status");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_thread_interactions_source_comment_idx" ON "issue_thread_interactions" USING btree ("source_comment_id");

View File

@@ -0,0 +1,4 @@
ALTER TABLE "issue_thread_interactions" ADD COLUMN IF NOT EXISTS "idempotency_key" text;--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issue_thread_interactions_company_issue_idempotency_uq"
ON "issue_thread_interactions" USING btree ("company_id","issue_id","idempotency_key")
WHERE "issue_thread_interactions"."idempotency_key" IS NOT NULL;

View File

@@ -0,0 +1,50 @@
CREATE TABLE "environments" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"name" text NOT NULL,
"description" text,
"driver" text DEFAULT 'local' NOT NULL,
"status" text DEFAULT 'active' NOT NULL,
"config" jsonb DEFAULT '{}'::jsonb NOT NULL,
"metadata" jsonb,
"created_at" timestamp with time zone DEFAULT now() NOT NULL,
"updated_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
CREATE TABLE "environment_leases" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"environment_id" uuid NOT NULL,
"execution_workspace_id" uuid,
"issue_id" uuid,
"heartbeat_run_id" uuid,
"status" text DEFAULT 'active' NOT NULL,
"lease_policy" text DEFAULT 'ephemeral' NOT NULL,
"provider" text,
"provider_lease_id" text,
"acquired_at" timestamp with time zone DEFAULT now() NOT NULL,
"last_used_at" timestamp with time zone DEFAULT now() NOT NULL,
"expires_at" timestamp with time zone,
"released_at" timestamp with time zone,
"failure_reason" text,
"cleanup_status" text,
"metadata" jsonb,
"created_at" timestamp with time zone DEFAULT now() NOT NULL,
"updated_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
ALTER TABLE "environments" ADD CONSTRAINT "environments_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
ALTER TABLE "environment_leases" ADD CONSTRAINT "environment_leases_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
ALTER TABLE "environment_leases" ADD CONSTRAINT "environment_leases_environment_id_environments_id_fk" FOREIGN KEY ("environment_id") REFERENCES "public"."environments"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
ALTER TABLE "environment_leases" ADD CONSTRAINT "environment_leases_execution_workspace_id_execution_workspaces_id_fk" FOREIGN KEY ("execution_workspace_id") REFERENCES "public"."execution_workspaces"("id") ON DELETE set null ON UPDATE no action;--> statement-breakpoint
ALTER TABLE "environment_leases" ADD CONSTRAINT "environment_leases_issue_id_issues_id_fk" FOREIGN KEY ("issue_id") REFERENCES "public"."issues"("id") ON DELETE set null ON UPDATE no action;--> statement-breakpoint
ALTER TABLE "environment_leases" ADD CONSTRAINT "environment_leases_heartbeat_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("heartbeat_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;--> statement-breakpoint
CREATE INDEX "environments_company_status_idx" ON "environments" USING btree ("company_id","status");--> statement-breakpoint
CREATE UNIQUE INDEX "environments_company_driver_idx" ON "environments" USING btree ("company_id","driver");--> statement-breakpoint
CREATE INDEX "environments_company_name_idx" ON "environments" USING btree ("company_id","name");--> statement-breakpoint
CREATE INDEX "environment_leases_company_environment_status_idx" ON "environment_leases" USING btree ("company_id","environment_id","status");--> statement-breakpoint
CREATE INDEX "environment_leases_company_execution_workspace_idx" ON "environment_leases" USING btree ("company_id","execution_workspace_id");--> statement-breakpoint
CREATE INDEX "environment_leases_company_issue_idx" ON "environment_leases" USING btree ("company_id","issue_id");--> statement-breakpoint
CREATE INDEX "environment_leases_heartbeat_run_idx" ON "environment_leases" USING btree ("heartbeat_run_id");--> statement-breakpoint
CREATE INDEX "environment_leases_company_last_used_idx" ON "environment_leases" USING btree ("company_id","last_used_at");--> statement-breakpoint
CREATE INDEX "environment_leases_provider_lease_idx" ON "environment_leases" USING btree ("provider_lease_id");

View File

@@ -0,0 +1,107 @@
CREATE TABLE IF NOT EXISTS "issue_tree_holds" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"root_issue_id" uuid NOT NULL,
"mode" text NOT NULL,
"status" text DEFAULT 'active' NOT NULL,
"reason" text,
"release_policy" jsonb,
"created_by_actor_type" text DEFAULT 'system' NOT NULL,
"created_by_agent_id" uuid,
"created_by_user_id" text,
"created_by_run_id" uuid,
"released_at" timestamp with time zone,
"released_by_actor_type" text,
"released_by_agent_id" uuid,
"released_by_user_id" text,
"released_by_run_id" uuid,
"release_reason" text,
"release_metadata" jsonb,
"created_at" timestamp with time zone DEFAULT now() NOT NULL,
"updated_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "issue_tree_hold_members" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"hold_id" uuid NOT NULL,
"issue_id" uuid NOT NULL,
"parent_issue_id" uuid,
"depth" integer DEFAULT 0 NOT NULL,
"issue_identifier" text,
"issue_title" text NOT NULL,
"issue_status" text NOT NULL,
"assignee_agent_id" uuid,
"assignee_user_id" text,
"active_run_id" uuid,
"active_run_status" text,
"skipped" boolean DEFAULT false NOT NULL,
"skip_reason" text,
"created_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_company_id_companies_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_root_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_root_issue_id_issues_id_fk" FOREIGN KEY ("root_issue_id") REFERENCES "public"."issues"("id") ON DELETE cascade ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_created_by_agent_id_agents_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_created_by_agent_id_agents_id_fk" FOREIGN KEY ("created_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_created_by_run_id_heartbeat_runs_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_created_by_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("created_by_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_released_by_agent_id_agents_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_released_by_agent_id_agents_id_fk" FOREIGN KEY ("released_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_holds_released_by_run_id_heartbeat_runs_id_fk') THEN
ALTER TABLE "issue_tree_holds" ADD CONSTRAINT "issue_tree_holds_released_by_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("released_by_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_company_id_companies_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_hold_id_issue_tree_holds_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_hold_id_issue_tree_holds_id_fk" FOREIGN KEY ("hold_id") REFERENCES "public"."issue_tree_holds"("id") ON DELETE cascade ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_issue_id_issues_id_fk" FOREIGN KEY ("issue_id") REFERENCES "public"."issues"("id") ON DELETE cascade ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_parent_issue_id_issues_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_parent_issue_id_issues_id_fk" FOREIGN KEY ("parent_issue_id") REFERENCES "public"."issues"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_assignee_agent_id_agents_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_assignee_agent_id_agents_id_fk" FOREIGN KEY ("assignee_agent_id") REFERENCES "public"."agents"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
DO $$ BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_constraint WHERE conname = 'issue_tree_hold_members_active_run_id_heartbeat_runs_id_fk') THEN
ALTER TABLE "issue_tree_hold_members" ADD CONSTRAINT "issue_tree_hold_members_active_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("active_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
END IF;
END $$;--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_tree_holds_company_root_status_idx" ON "issue_tree_holds" USING btree ("company_id","root_issue_id","status");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_tree_holds_company_status_mode_idx" ON "issue_tree_holds" USING btree ("company_id","status","mode");--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issue_tree_hold_members_hold_issue_uq" ON "issue_tree_hold_members" USING btree ("hold_id","issue_id");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_tree_hold_members_company_issue_idx" ON "issue_tree_hold_members" USING btree ("company_id","issue_id");--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "issue_tree_hold_members_hold_depth_idx" ON "issue_tree_hold_members" USING btree ("hold_id","depth");

View File

@@ -0,0 +1,3 @@
ALTER TABLE "agents" ADD COLUMN "default_environment_id" uuid;
ALTER TABLE "agents" ADD CONSTRAINT "agents_default_environment_id_environments_id_fk" FOREIGN KEY ("default_environment_id") REFERENCES "public"."environments"("id") ON DELETE set null ON UPDATE no action;
CREATE INDEX "agents_company_default_environment_idx" ON "agents" USING btree ("company_id","default_environment_id");

View File

@@ -0,0 +1,2 @@
DROP INDEX IF EXISTS "environments_company_driver_idx";--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "environments_company_driver_idx" ON "environments" USING btree ("company_id","driver") WHERE "driver" = 'local';

View File

@@ -0,0 +1,13 @@
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_liveness_recovery_incident_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'harness_liveness_escalation'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');
--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_liveness_recovery_leaf_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_fingerprint")
WHERE "origin_kind" = 'harness_liveness_escalation'
AND "origin_fingerprint" <> 'default'
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');

View File

@@ -0,0 +1,70 @@
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_at" timestamp with time zone;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_seq" integer DEFAULT 0 NOT NULL;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_stream" text;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_bytes" bigint;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_runs_company_status_last_output_idx"
ON "heartbeat_runs" USING btree ("company_id","status","last_output_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_runs_company_status_process_started_idx"
ON "heartbeat_runs" USING btree ("company_id","status","process_started_at");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "heartbeat_run_watchdog_decisions" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"run_id" uuid NOT NULL,
"evaluation_issue_id" uuid,
"decision" text NOT NULL,
"snoozed_until" timestamp with time zone,
"reason" text,
"created_by_agent_id" uuid,
"created_by_user_id" text,
"created_by_run_id" uuid,
"created_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE cascade ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_evaluation_issue_id_issues_id_fk" FOREIGN KEY ("evaluation_issue_id") REFERENCES "public"."issues"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_created_by_agent_id_agents_id_fk" FOREIGN KEY ("created_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_created_by_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("created_by_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_run_watchdog_decisions_company_run_created_idx"
ON "heartbeat_run_watchdog_decisions" USING btree ("company_id","run_id","created_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_run_watchdog_decisions_company_run_snooze_idx"
ON "heartbeat_run_watchdog_decisions" USING btree ("company_id","run_id","snoozed_until");
--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_stale_run_evaluation_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'stale_active_run_evaluation'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -421,6 +421,83 @@
"when": 1776542246000,
"tag": "0059_plugin_database_namespaces",
"breakpoints": true
},
{
"idx": 60,
"version": "7",
"when": 1776717606743,
"tag": "0060_orange_annihilus",
"breakpoints": true
},
{
"idx": 61,
"version": "7",
"when": 1776785165389,
"tag": "0061_lively_thor_girl",
"breakpoints": true
},
{
"idx": 62,
"version": "7",
"when": 1776780000000,
"tag": "0062_routine_run_dispatch_fingerprint",
"breakpoints": true
},
{
"idx": 63,
"version": "7",
"when": 1776780001000,
"tag": "0063_issue_thread_interactions",
"breakpoints": true
},
{
"idx": 64,
"version": "7",
"when": 1776780002000,
"tag": "0064_issue_thread_interaction_idempotency",
"breakpoints": true
},
{
"idx": 65,
"version": "7",
"when": 1776903900000,
"tag": "0065_environments",
"breakpoints": true
},
{
"idx": 66,
"version": "7",
"when": 1776903901000,
"tag": "0066_issue_tree_holds",
"breakpoints": true
},
{
"idx": 67,
"version": "7",
"when": 1776904200000,
"tag": "0067_agent_default_environment",
"breakpoints": true
},
{
"idx": 68,
"version": "7",
"when": 1776959400000,
"tag": "0068_environment_local_driver_unique",
"breakpoints": true
},
{
"idx": 69,
"version": "7",
"when": 1776780003000,
"tag": "0069_liveness_recovery_dedupe",
"breakpoints": true
},
{
"idx": 70,
"version": "7",
"when": 1776780004000,
"tag": "0070_active_run_output_watchdog",
"breakpoints": true
}
]
}

View File

@@ -9,6 +9,7 @@ import {
index,
} from "drizzle-orm/pg-core";
import { companies } from "./companies.js";
import { environments } from "./environments.js";
export const agents = pgTable(
"agents",
@@ -25,6 +26,7 @@ export const agents = pgTable(
adapterType: text("adapter_type").notNull().default("process"),
adapterConfig: jsonb("adapter_config").$type<Record<string, unknown>>().notNull().default({}),
runtimeConfig: jsonb("runtime_config").$type<Record<string, unknown>>().notNull().default({}),
defaultEnvironmentId: uuid("default_environment_id").references(() => environments.id, { onDelete: "set null" }),
budgetMonthlyCents: integer("budget_monthly_cents").notNull().default(0),
spentMonthlyCents: integer("spent_monthly_cents").notNull().default(0),
pauseReason: text("pause_reason"),
@@ -38,5 +40,6 @@ export const agents = pgTable(
(table) => ({
companyStatusIdx: index("agents_company_status_idx").on(table.companyId, table.status),
companyReportsToIdx: index("agents_company_reports_to_idx").on(table.companyId, table.reportsTo),
companyDefaultEnvironmentIdx: index("agents_company_default_environment_idx").on(table.companyId, table.defaultEnvironmentId),
}),
);

View File

@@ -0,0 +1,46 @@
import { index, jsonb, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";
import { companies } from "./companies.js";
import { environments } from "./environments.js";
import { executionWorkspaces } from "./execution_workspaces.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issues } from "./issues.js";
export const environmentLeases = pgTable(
"environment_leases",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id, { onDelete: "cascade" }),
environmentId: uuid("environment_id").notNull().references(() => environments.id, { onDelete: "cascade" }),
executionWorkspaceId: uuid("execution_workspace_id").references(() => executionWorkspaces.id, { onDelete: "set null" }),
issueId: uuid("issue_id").references(() => issues.id, { onDelete: "set null" }),
heartbeatRunId: uuid("heartbeat_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
status: text("status").notNull().default("active"),
leasePolicy: text("lease_policy").notNull().default("ephemeral"),
provider: text("provider"),
providerLeaseId: text("provider_lease_id"),
acquiredAt: timestamp("acquired_at", { withTimezone: true }).notNull().defaultNow(),
lastUsedAt: timestamp("last_used_at", { withTimezone: true }).notNull().defaultNow(),
expiresAt: timestamp("expires_at", { withTimezone: true }),
releasedAt: timestamp("released_at", { withTimezone: true }),
failureReason: text("failure_reason"),
cleanupStatus: text("cleanup_status"),
metadata: jsonb("metadata").$type<Record<string, unknown>>(),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyEnvironmentStatusIdx: index("environment_leases_company_environment_status_idx").on(
table.companyId,
table.environmentId,
table.status,
),
companyExecutionWorkspaceIdx: index("environment_leases_company_execution_workspace_idx").on(
table.companyId,
table.executionWorkspaceId,
),
companyIssueIdx: index("environment_leases_company_issue_idx").on(table.companyId, table.issueId),
heartbeatRunIdx: index("environment_leases_heartbeat_run_idx").on(table.heartbeatRunId),
companyLastUsedIdx: index("environment_leases_company_last_used_idx").on(table.companyId, table.lastUsedAt),
providerLeaseIdx: index("environment_leases_provider_lease_idx").on(table.providerLeaseId),
}),
);

View File

@@ -0,0 +1,26 @@
import { sql } from "drizzle-orm";
import { index, jsonb, pgTable, text, timestamp, uniqueIndex, uuid } from "drizzle-orm/pg-core";
import { companies } from "./companies.js";
export const environments = pgTable(
"environments",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id, { onDelete: "cascade" }),
name: text("name").notNull(),
description: text("description"),
driver: text("driver").notNull().default("local"),
status: text("status").notNull().default("active"),
config: jsonb("config").$type<Record<string, unknown>>().notNull().default({}),
metadata: jsonb("metadata").$type<Record<string, unknown>>(),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyStatusIdx: index("environments_company_status_idx").on(table.companyId, table.status),
companyDriverIdx: uniqueIndex("environments_company_driver_idx")
.on(table.companyId, table.driver)
.where(sql`${table.driver} = 'local'`),
companyNameIdx: index("environments_company_name_idx").on(table.companyId, table.name),
}),
);

View File

@@ -0,0 +1,34 @@
import { index, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issues } from "./issues.js";
export const heartbeatRunWatchdogDecisions = pgTable(
"heartbeat_run_watchdog_decisions",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
runId: uuid("run_id").notNull().references(() => heartbeatRuns.id, { onDelete: "cascade" }),
evaluationIssueId: uuid("evaluation_issue_id").references(() => issues.id, { onDelete: "set null" }),
decision: text("decision").notNull(),
snoozedUntil: timestamp("snoozed_until", { withTimezone: true }),
reason: text("reason"),
createdByAgentId: uuid("created_by_agent_id").references(() => agents.id, { onDelete: "set null" }),
createdByUserId: text("created_by_user_id"),
createdByRunId: uuid("created_by_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyRunCreatedIdx: index("heartbeat_run_watchdog_decisions_company_run_created_idx").on(
table.companyId,
table.runId,
table.createdAt,
),
companyRunSnoozeIdx: index("heartbeat_run_watchdog_decisions_company_run_snooze_idx").on(
table.companyId,
table.runId,
table.snoozedUntil,
),
}),
);

View File

@@ -34,10 +34,17 @@ export const heartbeatRuns = pgTable(
processPid: integer("process_pid"),
processGroupId: integer("process_group_id"),
processStartedAt: timestamp("process_started_at", { withTimezone: true }),
lastOutputAt: timestamp("last_output_at", { withTimezone: true }),
lastOutputSeq: integer("last_output_seq").notNull().default(0),
lastOutputStream: text("last_output_stream"),
lastOutputBytes: bigint("last_output_bytes", { mode: "number" }),
retryOfRunId: uuid("retry_of_run_id").references((): AnyPgColumn => heartbeatRuns.id, {
onDelete: "set null",
}),
processLossRetryCount: integer("process_loss_retry_count").notNull().default(0),
scheduledRetryAt: timestamp("scheduled_retry_at", { withTimezone: true }),
scheduledRetryAttempt: integer("scheduled_retry_attempt").notNull().default(0),
scheduledRetryReason: text("scheduled_retry_reason"),
issueCommentStatus: text("issue_comment_status").notNull().default("not_applicable"),
issueCommentSatisfiedByCommentId: uuid("issue_comment_satisfied_by_comment_id"),
issueCommentRetryQueuedAt: timestamp("issue_comment_retry_queued_at", { withTimezone: true }),
@@ -61,5 +68,15 @@ export const heartbeatRuns = pgTable(
table.livenessState,
table.createdAt,
),
companyStatusLastOutputIdx: index("heartbeat_runs_company_status_last_output_idx").on(
table.companyId,
table.status,
table.lastOutputAt,
),
companyStatusProcessStartedIdx: index("heartbeat_runs_company_status_process_started_idx").on(
table.companyId,
table.status,
table.processStartedAt,
),
}),
);

View File

@@ -22,11 +22,14 @@ export { agentWakeupRequests } from "./agent_wakeup_requests.js";
export { projects } from "./projects.js";
export { projectWorkspaces } from "./project_workspaces.js";
export { executionWorkspaces } from "./execution_workspaces.js";
export { environments } from "./environments.js";
export { environmentLeases } from "./environment_leases.js";
export { workspaceOperations } from "./workspace_operations.js";
export { workspaceRuntimeServices } from "./workspace_runtime_services.js";
export { projectGoals } from "./project_goals.js";
export { goals } from "./goals.js";
export { issues } from "./issues.js";
export { issueReferenceMentions } from "./issue_reference_mentions.js";
export { issueRelations } from "./issue_relations.js";
export { routines, routineTriggers, routineRuns } from "./routines.js";
export { issueWorkProducts } from "./issue_work_products.js";
@@ -34,6 +37,9 @@ export { labels } from "./labels.js";
export { issueLabels } from "./issue_labels.js";
export { issueApprovals } from "./issue_approvals.js";
export { issueComments } from "./issue_comments.js";
export { issueThreadInteractions } from "./issue_thread_interactions.js";
export { issueTreeHolds } from "./issue_tree_holds.js";
export { issueTreeHoldMembers } from "./issue_tree_hold_members.js";
export { issueExecutionDecisions } from "./issue_execution_decisions.js";
export { issueInboxArchives } from "./issue_inbox_archives.js";
export { inboxDismissals } from "./inbox_dismissals.js";
@@ -47,6 +53,7 @@ export { documentRevisions } from "./document_revisions.js";
export { issueDocuments } from "./issue_documents.js";
export { heartbeatRuns } from "./heartbeat_runs.js";
export { heartbeatRunEvents } from "./heartbeat_run_events.js";
export { heartbeatRunWatchdogDecisions } from "./heartbeat_run_watchdog_decisions.js";
export { costEvents } from "./cost_events.js";
export { financeEvents } from "./finance_events.js";
export { approvals } from "./approvals.js";

View File

@@ -0,0 +1,48 @@
import { sql } from "drizzle-orm";
import { index, pgTable, text, timestamp, uniqueIndex, uuid } from "drizzle-orm/pg-core";
import { companies } from "./companies.js";
import { issues } from "./issues.js";
export const issueReferenceMentions = pgTable(
"issue_reference_mentions",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
sourceIssueId: uuid("source_issue_id").notNull().references(() => issues.id, { onDelete: "cascade" }),
targetIssueId: uuid("target_issue_id").notNull().references(() => issues.id, { onDelete: "cascade" }),
sourceKind: text("source_kind").$type<"title" | "description" | "comment" | "document">().notNull(),
sourceRecordId: uuid("source_record_id"),
documentKey: text("document_key"),
matchedText: text("matched_text"),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companySourceIssueIdx: index("issue_reference_mentions_company_source_issue_idx").on(
table.companyId,
table.sourceIssueId,
),
companyTargetIssueIdx: index("issue_reference_mentions_company_target_issue_idx").on(
table.companyId,
table.targetIssueId,
),
companyIssuePairIdx: index("issue_reference_mentions_company_issue_pair_idx").on(
table.companyId,
table.sourceIssueId,
table.targetIssueId,
),
companySourceMentionWithRecordUq: uniqueIndex("issue_reference_mentions_company_source_mention_record_uq").on(
table.companyId,
table.sourceIssueId,
table.targetIssueId,
table.sourceKind,
table.sourceRecordId,
).where(sql`${table.sourceRecordId} is not null`),
companySourceMentionWithoutRecordUq: uniqueIndex("issue_reference_mentions_company_source_mention_null_record_uq").on(
table.companyId,
table.sourceIssueId,
table.targetIssueId,
table.sourceKind,
).where(sql`${table.sourceRecordId} is null`),
}),
);

View File

@@ -0,0 +1,54 @@
import type {
IssueThreadInteractionPayload,
IssueThreadInteractionResult,
} from "@paperclipai/shared";
import { sql } from "drizzle-orm";
import { pgTable, uuid, text, timestamp, jsonb, index, uniqueIndex } from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issueComments } from "./issue_comments.js";
import { issues } from "./issues.js";
export const issueThreadInteractions = pgTable(
"issue_thread_interactions",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
issueId: uuid("issue_id").notNull().references(() => issues.id),
kind: text("kind").notNull(),
status: text("status").notNull().default("pending"),
continuationPolicy: text("continuation_policy").notNull().default("wake_assignee"),
idempotencyKey: text("idempotency_key"),
sourceCommentId: uuid("source_comment_id").references(() => issueComments.id, { onDelete: "set null" }),
sourceRunId: uuid("source_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
title: text("title"),
summary: text("summary"),
createdByAgentId: uuid("created_by_agent_id").references(() => agents.id),
createdByUserId: text("created_by_user_id"),
resolvedByAgentId: uuid("resolved_by_agent_id").references(() => agents.id),
resolvedByUserId: text("resolved_by_user_id"),
payload: jsonb("payload").$type<IssueThreadInteractionPayload>().notNull(),
result: jsonb("result").$type<IssueThreadInteractionResult>(),
resolvedAt: timestamp("resolved_at", { withTimezone: true }),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
issueIdx: index("issue_thread_interactions_issue_idx").on(table.issueId),
companyIssueCreatedAtIdx: index("issue_thread_interactions_company_issue_created_at_idx").on(
table.companyId,
table.issueId,
table.createdAt,
),
companyIssueStatusIdx: index("issue_thread_interactions_company_issue_status_idx").on(
table.companyId,
table.issueId,
table.status,
),
companyIssueIdempotencyUq: uniqueIndex("issue_thread_interactions_company_issue_idempotency_uq")
.on(table.companyId, table.issueId, table.idempotencyKey)
.where(sql`${table.idempotencyKey} IS NOT NULL`),
sourceCommentIdx: index("issue_thread_interactions_source_comment_idx").on(table.sourceCommentId),
}),
);

View File

@@ -0,0 +1,33 @@
import { index, pgTable, text, timestamp, uniqueIndex, uuid, boolean, integer } from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issues } from "./issues.js";
import { issueTreeHolds } from "./issue_tree_holds.js";
export const issueTreeHoldMembers = pgTable(
"issue_tree_hold_members",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
holdId: uuid("hold_id").notNull().references(() => issueTreeHolds.id, { onDelete: "cascade" }),
issueId: uuid("issue_id").notNull().references(() => issues.id, { onDelete: "cascade" }),
parentIssueId: uuid("parent_issue_id").references(() => issues.id, { onDelete: "set null" }),
depth: integer("depth").notNull().default(0),
issueIdentifier: text("issue_identifier"),
issueTitle: text("issue_title").notNull(),
issueStatus: text("issue_status").notNull(),
assigneeAgentId: uuid("assignee_agent_id").references(() => agents.id, { onDelete: "set null" }),
assigneeUserId: text("assignee_user_id"),
activeRunId: uuid("active_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
activeRunStatus: text("active_run_status"),
skipped: boolean("skipped").notNull().default(false),
skipReason: text("skip_reason"),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
holdIssueUniqueIdx: uniqueIndex("issue_tree_hold_members_hold_issue_uq").on(table.holdId, table.issueId),
companyIssueIdx: index("issue_tree_hold_members_company_issue_idx").on(table.companyId, table.issueId),
holdDepthIdx: index("issue_tree_hold_members_hold_depth_idx").on(table.holdId, table.depth),
}),
);

View File

@@ -0,0 +1,39 @@
import { index, jsonb, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issues } from "./issues.js";
export const issueTreeHolds = pgTable(
"issue_tree_holds",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
rootIssueId: uuid("root_issue_id").notNull().references(() => issues.id, { onDelete: "cascade" }),
mode: text("mode").notNull(),
status: text("status").notNull().default("active"),
reason: text("reason"),
releasePolicy: jsonb("release_policy").$type<Record<string, unknown>>(),
createdByActorType: text("created_by_actor_type").notNull().default("system"),
createdByAgentId: uuid("created_by_agent_id").references(() => agents.id, { onDelete: "set null" }),
createdByUserId: text("created_by_user_id"),
createdByRunId: uuid("created_by_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
releasedAt: timestamp("released_at", { withTimezone: true }),
releasedByActorType: text("released_by_actor_type"),
releasedByAgentId: uuid("released_by_agent_id").references(() => agents.id, { onDelete: "set null" }),
releasedByUserId: text("released_by_user_id"),
releasedByRunId: uuid("released_by_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
releaseReason: text("release_reason"),
releaseMetadata: jsonb("release_metadata").$type<Record<string, unknown>>(),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyRootStatusIdx: index("issue_tree_holds_company_root_status_idx").on(
table.companyId,
table.rootIssueId,
table.status,
),
companyStatusModeIdx: index("issue_tree_holds_company_status_mode_idx").on(table.companyId, table.status, table.mode),
}),
);

View File

@@ -44,6 +44,7 @@ export const issues = pgTable(
originKind: text("origin_kind").notNull().default("manual"),
originId: text("origin_id"),
originRunId: text("origin_run_id"),
originFingerprint: text("origin_fingerprint").notNull().default("default"),
requestDepth: integer("request_depth").notNull().default(0),
billingCode: text("billing_code"),
assigneeAdapterOverrides: jsonb("assignee_adapter_overrides").$type<Record<string, unknown>>(),
@@ -82,7 +83,7 @@ export const issues = pgTable(
identifierSearchIdx: index("issues_identifier_search_idx").using("gin", table.identifier.op("gin_trgm_ops")),
descriptionSearchIdx: index("issues_description_search_idx").using("gin", table.description.op("gin_trgm_ops")),
openRoutineExecutionIdx: uniqueIndex("issues_open_routine_execution_uq")
.on(table.companyId, table.originKind, table.originId)
.on(table.companyId, table.originKind, table.originId, table.originFingerprint)
.where(
sql`${table.originKind} = 'routine_execution'
and ${table.originId} is not null
@@ -90,5 +91,29 @@ export const issues = pgTable(
and ${table.executionRunId} is not null
and ${table.status} in ('backlog', 'todo', 'in_progress', 'in_review', 'blocked')`,
),
activeLivenessRecoveryIncidentIdx: uniqueIndex("issues_active_liveness_recovery_incident_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeLivenessRecoveryLeafIdx: uniqueIndex("issues_active_liveness_recovery_leaf_uq")
.on(table.companyId, table.originKind, table.originFingerprint)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originFingerprint} <> 'default'
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeStaleRunEvaluationIdx: uniqueIndex("issues_active_stale_run_evaluation_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stale_active_run_evaluation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
}),
);

View File

@@ -96,6 +96,7 @@ export const routineRuns = pgTable(
triggeredAt: timestamp("triggered_at", { withTimezone: true }).notNull().defaultNow(),
idempotencyKey: text("idempotency_key"),
triggerPayload: jsonb("trigger_payload").$type<Record<string, unknown>>(),
dispatchFingerprint: text("dispatch_fingerprint"),
linkedIssueId: uuid("linked_issue_id").references(() => issues.id, { onDelete: "set null" }),
coalescedIntoRunId: uuid("coalesced_into_run_id"),
failureReason: text("failure_reason"),
@@ -106,6 +107,7 @@ export const routineRuns = pgTable(
(table) => ({
companyRoutineIdx: index("routine_runs_company_routine_idx").on(table.companyId, table.routineId, table.createdAt),
triggerIdx: index("routine_runs_trigger_idx").on(table.triggerId, table.createdAt),
dispatchFingerprintIdx: index("routine_runs_dispatch_fingerprint_idx").on(table.routineId, table.dispatchFingerprint),
linkedIssueIdx: index("routine_runs_linked_issue_idx").on(table.linkedIssueId),
idempotencyIdx: index("routine_runs_trigger_idempotency_idx").on(table.triggerId, table.idempotencyKey),
}),

View File

@@ -33,77 +33,56 @@ export type EmbeddedPostgresTestDatabase = {
let embeddedPostgresSupportPromise: Promise<EmbeddedPostgresTestSupport> | null = null;
const DEFAULT_PAPERCLIP_EMBEDDED_POSTGRES_PORT = 54329;
function getReservedTestPorts(): Set<number> {
const configuredPorts = [
DEFAULT_PAPERCLIP_EMBEDDED_POSTGRES_PORT,
Number.parseInt(process.env.PAPERCLIP_EMBEDDED_POSTGRES_PORT ?? "", 10),
...String(process.env.PAPERCLIP_TEST_POSTGRES_RESERVED_PORTS ?? "")
.split(",")
.map((value) => Number.parseInt(value.trim(), 10)),
];
return new Set(configuredPorts.filter((port) => Number.isInteger(port) && port > 0 && port <= 65535));
}
async function getEmbeddedPostgresCtor(): Promise<EmbeddedPostgresCtor> {
const mod = await import("embedded-postgres");
return mod.default as EmbeddedPostgresCtor;
}
async function getAvailablePort(): Promise<number> {
return await new Promise((resolve, reject) => {
const server = net.createServer();
server.unref();
server.on("error", reject);
server.listen(0, "127.0.0.1", () => {
const address = server.address();
if (!address || typeof address === "string") {
server.close(() => reject(new Error("Failed to allocate test port")));
return;
}
const { port } = address;
server.close((error) => {
if (error) reject(error);
else resolve(port);
const reservedPorts = getReservedTestPorts();
for (let attempt = 0; attempt < 20; attempt += 1) {
const port = await new Promise<number>((resolve, reject) => {
const server = net.createServer();
server.unref();
server.on("error", reject);
server.listen(0, "127.0.0.1", () => {
const address = server.address();
if (!address || typeof address === "string") {
server.close(() => reject(new Error("Failed to allocate test port")));
return;
}
const { port } = address;
server.close((error) => {
if (error) reject(error);
else resolve(port);
});
});
});
});
}
function formatEmbeddedPostgresError(error: unknown): string {
if (error instanceof Error && error.message.length > 0) return error.message;
if (typeof error === "string" && error.length > 0) return error;
return "embedded Postgres startup failed";
}
async function probeEmbeddedPostgresSupport(): Promise<EmbeddedPostgresTestSupport> {
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-embedded-postgres-probe-"));
const port = await getAvailablePort();
const EmbeddedPostgres = await getEmbeddedPostgresCtor();
const instance = new EmbeddedPostgres({
databaseDir: dataDir,
user: "paperclip",
password: "paperclip",
port,
persistent: true,
initdbFlags: ["--encoding=UTF8", "--locale=C", "--lc-messages=C"],
onLog: () => {},
onError: () => {},
});
try {
await instance.initialise();
await instance.start();
return { supported: true };
} catch (error) {
return {
supported: false,
reason: formatEmbeddedPostgresError(error),
};
} finally {
await instance.stop().catch(() => {});
fs.rmSync(dataDir, { recursive: true, force: true });
if (!reservedPorts.has(port)) return port;
}
throw new Error(
`Failed to allocate embedded Postgres test port outside reserved Paperclip ports: ${[
...reservedPorts,
].join(", ")}`,
);
}
export async function getEmbeddedPostgresTestSupport(): Promise<EmbeddedPostgresTestSupport> {
if (!embeddedPostgresSupportPromise) {
embeddedPostgresSupportPromise = probeEmbeddedPostgresSupport();
}
return await embeddedPostgresSupportPromise;
}
export async function startEmbeddedPostgresTestDatabase(
tempDirPrefix: string,
): Promise<EmbeddedPostgresTestDatabase> {
async function createEmbeddedPostgresTestInstance(tempDirPrefix: string) {
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), tempDirPrefix));
const port = await getAvailablePort();
const EmbeddedPostgres = await getEmbeddedPostgresCtor();
@@ -118,6 +97,51 @@ export async function startEmbeddedPostgresTestDatabase(
onError: () => {},
});
return { dataDir, port, instance };
}
function cleanupEmbeddedPostgresTestDirs(dataDir: string) {
fs.rmSync(dataDir, { recursive: true, force: true });
}
function formatEmbeddedPostgresError(error: unknown): string {
if (error instanceof Error && error.message.length > 0) return error.message;
if (typeof error === "string" && error.length > 0) return error;
return "embedded Postgres startup failed";
}
async function probeEmbeddedPostgresSupport(): Promise<EmbeddedPostgresTestSupport> {
const { dataDir, instance } = await createEmbeddedPostgresTestInstance(
"paperclip-embedded-postgres-probe-",
);
try {
await instance.initialise();
await instance.start();
return { supported: true };
} catch (error) {
return {
supported: false,
reason: formatEmbeddedPostgresError(error),
};
} finally {
await instance.stop().catch(() => {});
cleanupEmbeddedPostgresTestDirs(dataDir);
}
}
export async function getEmbeddedPostgresTestSupport(): Promise<EmbeddedPostgresTestSupport> {
if (!embeddedPostgresSupportPromise) {
embeddedPostgresSupportPromise = probeEmbeddedPostgresSupport();
}
return await embeddedPostgresSupportPromise;
}
export async function startEmbeddedPostgresTestDatabase(
tempDirPrefix: string,
): Promise<EmbeddedPostgresTestDatabase> {
const { dataDir, port, instance } = await createEmbeddedPostgresTestInstance(tempDirPrefix);
try {
await instance.initialise();
await instance.start();
@@ -131,12 +155,12 @@ export async function startEmbeddedPostgresTestDatabase(
connectionString,
cleanup: async () => {
await instance.stop().catch(() => {});
fs.rmSync(dataDir, { recursive: true, force: true });
cleanupEmbeddedPostgresTestDirs(dataDir);
},
};
} catch (error) {
await instance.stop().catch(() => {});
fs.rmSync(dataDir, { recursive: true, force: true });
cleanupEmbeddedPostgresTestDirs(dataDir);
throw new Error(
`Failed to start embedded PostgreSQL test database: ${formatEmbeddedPostgresError(error)}`,
);

View File

@@ -47,6 +47,8 @@ Read tools:
- `paperclipListDocumentRevisions`
- `paperclipListProjects`
- `paperclipGetProject`
- `paperclipGetIssueWorkspaceRuntime`
- `paperclipWaitForIssueWorkspaceService`
- `paperclipListGoals`
- `paperclipGetGoal`
- `paperclipListApprovals`
@@ -61,8 +63,12 @@ Write tools:
- `paperclipCheckoutIssue`
- `paperclipReleaseIssue`
- `paperclipAddComment`
- `paperclipSuggestTasks`
- `paperclipAskUserQuestions`
- `paperclipRequestConfirmation`
- `paperclipUpsertIssueDocument`
- `paperclipRestoreIssueDocumentRevision`
- `paperclipControlIssueWorkspaceServices`
- `paperclipCreateApproval`
- `paperclipLinkIssueApproval`
- `paperclipUnlinkIssueApproval`

View File

@@ -107,6 +107,165 @@ describe("paperclip MCP tools", () => {
});
});
it("controls issue workspace services through the current execution workspace", async () => {
const fetchMock = vi.fn()
.mockResolvedValueOnce(mockJsonResponse({
currentExecutionWorkspace: {
id: "44444444-4444-4444-8444-444444444444",
runtimeServices: [],
},
}))
.mockResolvedValueOnce(mockJsonResponse({
operation: { id: "operation-1" },
workspace: {
id: "44444444-4444-4444-8444-444444444444",
runtimeServices: [
{
id: "55555555-5555-4555-8555-555555555555",
serviceName: "web",
status: "running",
url: "http://127.0.0.1:5173",
},
],
},
}));
vi.stubGlobal("fetch", fetchMock);
const tool = getTool("paperclipControlIssueWorkspaceServices");
await tool.execute({
issueId: "PAP-1135",
action: "restart",
workspaceCommandId: "web",
});
expect(fetchMock).toHaveBeenCalledTimes(2);
const [lookupUrl, lookupInit] = fetchMock.mock.calls[0] as [string, RequestInit];
expect(String(lookupUrl)).toBe("http://localhost:3100/api/issues/PAP-1135/heartbeat-context");
expect(lookupInit.method).toBe("GET");
const [controlUrl, controlInit] = fetchMock.mock.calls[1] as [string, RequestInit];
expect(String(controlUrl)).toBe(
"http://localhost:3100/api/execution-workspaces/44444444-4444-4444-8444-444444444444/runtime-services/restart",
);
expect(controlInit.method).toBe("POST");
expect(JSON.parse(String(controlInit.body))).toEqual({
workspaceCommandId: "web",
});
});
it("waits for an issue workspace runtime service URL", async () => {
const fetchMock = vi.fn()
.mockResolvedValueOnce(mockJsonResponse({
currentExecutionWorkspace: {
id: "44444444-4444-4444-8444-444444444444",
runtimeServices: [
{
id: "55555555-5555-4555-8555-555555555555",
serviceName: "web",
status: "running",
healthStatus: "healthy",
url: "http://127.0.0.1:5173",
},
],
},
}));
vi.stubGlobal("fetch", fetchMock);
const tool = getTool("paperclipWaitForIssueWorkspaceService");
const response = await tool.execute({
issueId: "PAP-1135",
serviceName: "web",
timeoutSeconds: 1,
});
expect(fetchMock).toHaveBeenCalledTimes(1);
expect(response.content[0]?.text).toContain("http://127.0.0.1:5173");
});
it("creates suggest_tasks interactions with the expected issue-scoped payload", async () => {
const fetchMock = vi.fn().mockResolvedValue(
mockJsonResponse({ id: "interaction-1", kind: "suggest_tasks" }),
);
vi.stubGlobal("fetch", fetchMock);
const tool = getTool("paperclipSuggestTasks");
await tool.execute({
issueId: "PAP-1135",
idempotencyKey: "run-1:suggest",
payload: {
version: 1,
tasks: [{ clientKey: "task-1", title: "One" }],
},
});
const [url, init] = fetchMock.mock.calls[0] as [string, RequestInit];
expect(String(url)).toBe("http://localhost:3100/api/issues/PAP-1135/interactions");
expect(init.method).toBe("POST");
expect(JSON.parse(String(init.body))).toEqual({
kind: "suggest_tasks",
continuationPolicy: "wake_assignee",
idempotencyKey: "run-1:suggest",
payload: {
version: 1,
tasks: [{ clientKey: "task-1", title: "One" }],
},
});
});
it("creates request_confirmation interactions with plan target payloads", async () => {
const fetchMock = vi.fn().mockResolvedValue(
mockJsonResponse({ id: "interaction-1", kind: "request_confirmation" }),
);
vi.stubGlobal("fetch", fetchMock);
const tool = getTool("paperclipRequestConfirmation");
await tool.execute({
issueId: "PAP-1135",
idempotencyKey: "confirmation:PAP-1135:plan:33333333-3333-4333-8333-333333333333",
title: "Plan approval",
payload: {
version: 1,
prompt: "Accept this plan?",
acceptLabel: "Accept plan",
allowDeclineReason: true,
rejectLabel: "Request changes",
rejectRequiresReason: true,
supersedeOnUserComment: true,
target: {
type: "issue_document",
key: "plan",
revisionId: "33333333-3333-4333-8333-333333333333",
revisionNumber: 3,
},
},
});
const [url, init] = fetchMock.mock.calls[0] as [string, RequestInit];
expect(String(url)).toBe("http://localhost:3100/api/issues/PAP-1135/interactions");
expect(init.method).toBe("POST");
expect(JSON.parse(String(init.body))).toEqual({
kind: "request_confirmation",
continuationPolicy: "none",
idempotencyKey: "confirmation:PAP-1135:plan:33333333-3333-4333-8333-333333333333",
title: "Plan approval",
payload: {
version: 1,
prompt: "Accept this plan?",
acceptLabel: "Accept plan",
allowDeclineReason: true,
rejectLabel: "Request changes",
rejectRequiresReason: true,
supersedeOnUserComment: true,
target: {
type: "issue_document",
key: "plan",
revisionId: "33333333-3333-4333-8333-333333333333",
revisionNumber: 3,
},
},
});
});
it("creates approvals with the expected company-scoped payload", async () => {
const fetchMock = vi.fn().mockResolvedValue(
mockJsonResponse({ id: "approval-1" }),

View File

@@ -1,9 +1,13 @@
import { z } from "zod";
import {
addIssueCommentSchema,
askUserQuestionsPayloadSchema,
checkoutIssueSchema,
createApprovalSchema,
createIssueSchema,
issueThreadInteractionContinuationPolicySchema,
requestConfirmationPayloadSchema,
suggestTasksPayloadSchema,
updateIssueSchema,
upsertIssueDocumentSchema,
linkIssueApprovalSchema,
@@ -107,6 +111,39 @@ const addCommentToolSchema = z.object({
issueId: issueIdSchema,
}).merge(addIssueCommentSchema);
const createSuggestTasksToolSchema = z.object({
issueId: issueIdSchema,
idempotencyKey: z.string().trim().max(255).nullable().optional(),
sourceCommentId: z.string().uuid().nullable().optional(),
sourceRunId: z.string().uuid().nullable().optional(),
title: z.string().trim().max(240).nullable().optional(),
summary: z.string().trim().max(1000).nullable().optional(),
continuationPolicy: issueThreadInteractionContinuationPolicySchema.optional().default("wake_assignee"),
payload: suggestTasksPayloadSchema,
});
const createAskUserQuestionsToolSchema = z.object({
issueId: issueIdSchema,
idempotencyKey: z.string().trim().max(255).nullable().optional(),
sourceCommentId: z.string().uuid().nullable().optional(),
sourceRunId: z.string().uuid().nullable().optional(),
title: z.string().trim().max(240).nullable().optional(),
summary: z.string().trim().max(1000).nullable().optional(),
continuationPolicy: issueThreadInteractionContinuationPolicySchema.optional().default("wake_assignee"),
payload: askUserQuestionsPayloadSchema,
});
const createRequestConfirmationToolSchema = z.object({
issueId: issueIdSchema,
idempotencyKey: z.string().trim().max(255).nullable().optional(),
sourceCommentId: z.string().uuid().nullable().optional(),
sourceRunId: z.string().uuid().nullable().optional(),
title: z.string().trim().max(240).nullable().optional(),
summary: z.string().trim().max(1000).nullable().optional(),
continuationPolicy: issueThreadInteractionContinuationPolicySchema.optional().default("none"),
payload: requestConfirmationPayloadSchema,
});
const approvalDecisionSchema = z.object({
approvalId: approvalIdSchema,
action: z.enum(["approve", "reject", "requestRevision", "resubmit"]),
@@ -124,6 +161,66 @@ const apiRequestSchema = z.object({
jsonBody: z.string().optional(),
});
const workspaceRuntimeControlTargetSchema = z.object({
workspaceCommandId: z.string().min(1).optional().nullable(),
runtimeServiceId: z.string().uuid().optional().nullable(),
serviceIndex: z.number().int().nonnegative().optional().nullable(),
});
const issueWorkspaceRuntimeControlSchema = z.object({
issueId: issueIdSchema,
action: z.enum(["start", "stop", "restart"]),
}).merge(workspaceRuntimeControlTargetSchema);
const waitForIssueWorkspaceServiceSchema = z.object({
issueId: issueIdSchema,
runtimeServiceId: z.string().uuid().optional().nullable(),
serviceName: z.string().min(1).optional().nullable(),
timeoutSeconds: z.number().int().positive().max(300).optional(),
});
function sleep(ms: number) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
function readCurrentExecutionWorkspace(context: unknown): Record<string, unknown> | null {
if (!context || typeof context !== "object") return null;
const workspace = (context as { currentExecutionWorkspace?: unknown }).currentExecutionWorkspace;
return workspace && typeof workspace === "object" ? workspace as Record<string, unknown> : null;
}
function readWorkspaceRuntimeServices(workspace: Record<string, unknown> | null): Array<Record<string, unknown>> {
const raw = workspace?.runtimeServices;
return Array.isArray(raw)
? raw.filter((entry): entry is Record<string, unknown> => Boolean(entry) && typeof entry === "object")
: [];
}
function selectRuntimeService(
services: Array<Record<string, unknown>>,
input: { runtimeServiceId?: string | null; serviceName?: string | null },
) {
if (input.runtimeServiceId) {
return services.find((service) => service.id === input.runtimeServiceId) ?? null;
}
if (input.serviceName) {
return services.find((service) => service.serviceName === input.serviceName) ?? null;
}
return services.find((service) => service.status === "running" || service.status === "starting")
?? services[0]
?? null;
}
async function getIssueWorkspaceRuntime(client: PaperclipApiClient, issueId: string) {
const context = await client.requestJson("GET", `/issues/${encodeURIComponent(issueId)}/heartbeat-context`);
const workspace = readCurrentExecutionWorkspace(context);
return {
context,
workspace,
runtimeServices: readWorkspaceRuntimeServices(workspace),
};
}
export function createToolDefinitions(client: PaperclipApiClient): ToolDefinition[] {
return [
makeTool(
@@ -247,6 +344,55 @@ export function createToolDefinitions(client: PaperclipApiClient): ToolDefinitio
return client.requestJson("GET", `/projects/${encodeURIComponent(projectId)}${qs}`);
},
),
makeTool(
"paperclipGetIssueWorkspaceRuntime",
"Get the current execution workspace and runtime services for an issue, including service URLs",
z.object({ issueId: issueIdSchema }),
async ({ issueId }) => getIssueWorkspaceRuntime(client, issueId),
),
makeTool(
"paperclipControlIssueWorkspaceServices",
"Start, stop, or restart the current issue execution workspace runtime services",
issueWorkspaceRuntimeControlSchema,
async ({ issueId, action, ...target }) => {
const runtime = await getIssueWorkspaceRuntime(client, issueId);
const workspaceId = typeof runtime.workspace?.id === "string" ? runtime.workspace.id : null;
if (!workspaceId) {
throw new Error("Issue has no current execution workspace");
}
return client.requestJson(
"POST",
`/execution-workspaces/${encodeURIComponent(workspaceId)}/runtime-services/${action}`,
{ body: target },
);
},
),
makeTool(
"paperclipWaitForIssueWorkspaceService",
"Wait until an issue execution workspace runtime service is running and has a URL when one is exposed",
waitForIssueWorkspaceServiceSchema,
async ({ issueId, runtimeServiceId, serviceName, timeoutSeconds }) => {
const deadline = Date.now() + (timeoutSeconds ?? 60) * 1000;
let latest: Awaited<ReturnType<typeof getIssueWorkspaceRuntime>> | null = null;
while (Date.now() <= deadline) {
latest = await getIssueWorkspaceRuntime(client, issueId);
const service = selectRuntimeService(latest.runtimeServices, { runtimeServiceId, serviceName });
if (service?.status === "running" && service.healthStatus !== "unhealthy") {
return {
workspace: latest.workspace,
service,
};
}
await sleep(1000);
}
return {
timedOut: true,
latestWorkspace: latest?.workspace ?? null,
latestRuntimeServices: latest?.runtimeServices ?? [],
};
},
),
makeTool(
"paperclipListGoals",
"List goals in a company",
@@ -304,7 +450,7 @@ export function createToolDefinitions(client: PaperclipApiClient): ToolDefinitio
),
makeTool(
"paperclipUpdateIssue",
"Patch an issue, optionally including a comment",
"Patch an issue, optionally including a comment; include resume=true when intentionally requesting follow-up on resumable closed work",
updateIssueToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("PATCH", `/issues/${encodeURIComponent(issueId)}`, { body }),
@@ -329,11 +475,47 @@ export function createToolDefinitions(client: PaperclipApiClient): ToolDefinitio
),
makeTool(
"paperclipAddComment",
"Add a comment to an issue",
"Add a comment to an issue; include resume=true when intentionally requesting follow-up on resumable closed work",
addCommentToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("POST", `/issues/${encodeURIComponent(issueId)}/comments`, { body }),
),
makeTool(
"paperclipSuggestTasks",
"Create a suggest_tasks interaction on an issue",
createSuggestTasksToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("POST", `/issues/${encodeURIComponent(issueId)}/interactions`, {
body: {
kind: "suggest_tasks",
...body,
},
}),
),
makeTool(
"paperclipAskUserQuestions",
"Create an ask_user_questions interaction on an issue",
createAskUserQuestionsToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("POST", `/issues/${encodeURIComponent(issueId)}/interactions`, {
body: {
kind: "ask_user_questions",
...body,
},
}),
),
makeTool(
"paperclipRequestConfirmation",
"Create a request_confirmation interaction on an issue",
createRequestConfirmationToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("POST", `/issues/${encodeURIComponent(issueId)}/interactions`, {
body: {
kind: "request_confirmation",
...body,
},
}),
),
makeTool(
"paperclipUpsertIssueDocument",
"Create or update an issue document",

View File

@@ -4,9 +4,9 @@ import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
const VALID_TEMPLATES = ["default", "connector", "workspace"] as const;
const VALID_TEMPLATES = ["default", "connector", "workspace", "environment"] as const;
type PluginTemplate = (typeof VALID_TEMPLATES)[number];
const VALID_CATEGORIES = new Set(["connector", "workspace", "automation", "ui"] as const);
const VALID_CATEGORIES = new Set(["connector", "workspace", "automation", "ui", "environment"] as const);
export interface ScaffoldPluginOptions {
pluginName: string;
@@ -15,7 +15,7 @@ export interface ScaffoldPluginOptions {
displayName?: string;
description?: string;
author?: string;
category?: "connector" | "workspace" | "automation" | "ui";
category?: "connector" | "workspace" | "automation" | "ui" | "environment";
sdkPath?: string;
}
@@ -138,7 +138,7 @@ export function scaffoldPluginProject(options: ScaffoldPluginOptions): string {
const displayName = options.displayName ?? makeDisplayName(options.pluginName);
const description = options.description ?? "A Paperclip plugin";
const author = options.author ?? "Plugin Author";
const category = options.category ?? (template === "workspace" ? "workspace" : "connector");
const category = options.category ?? (template === "workspace" ? "workspace" : template === "environment" ? "environment" : "connector");
const manifestId = packageToManifestId(options.pluginName);
const localSdkPath = path.resolve(options.sdkPath ?? getLocalSdkPackagePath());
const localSharedPath = getLocalSharedPackagePath(localSdkPath);
@@ -296,9 +296,231 @@ export default defineConfig({
`,
);
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
if (template === "environment") {
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: ${quote(manifestId)},
apiVersion: 1,
version: "0.1.0",
displayName: ${quote(displayName)},
description: ${quote(description)},
author: ${quote(author)},
categories: [${quote(category)}],
capabilities: [
"environment.drivers.register",
"plugin.state.read",
"plugin.state.write"
],
entrypoints: {
worker: "./dist/worker.js",
ui: "./dist/ui"
},
environmentDrivers: [
{
driverKey: ${quote(manifestId + "-driver")},
displayName: ${quote(displayName + " Driver")}
}
],
ui: {
slots: [
{
type: "dashboardWidget",
id: "health-widget",
displayName: ${quote(`${displayName} Health`)},
exportName: "DashboardWidget"
}
]
}
};
export default manifest;
`,
);
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
import type {
PluginEnvironmentValidateConfigParams,
PluginEnvironmentProbeParams,
PluginEnvironmentAcquireLeaseParams,
PluginEnvironmentResumeLeaseParams,
PluginEnvironmentReleaseLeaseParams,
PluginEnvironmentDestroyLeaseParams,
PluginEnvironmentRealizeWorkspaceParams,
PluginEnvironmentExecuteParams,
} from "@paperclipai/plugin-sdk";
const plugin = definePlugin({
async setup(ctx) {
ctx.data.register("health", async () => {
return { status: "ok", checkedAt: new Date().toISOString() };
});
},
async onHealth() {
return { status: "ok", message: "Environment plugin worker is running" };
},
async onEnvironmentValidateConfig(params: PluginEnvironmentValidateConfigParams) {
if (!params.config || typeof params.config !== "object") {
return { ok: false, errors: ["Config must be a non-null object"] };
}
return { ok: true, normalizedConfig: params.config };
},
async onEnvironmentProbe(_params: PluginEnvironmentProbeParams) {
return { ok: true, summary: "Environment is reachable" };
},
async onEnvironmentAcquireLease(params: PluginEnvironmentAcquireLeaseParams) {
const providerLeaseId = \`lease-\${params.runId}-\${Date.now()}\`;
return {
providerLeaseId,
metadata: { acquiredAt: new Date().toISOString() },
};
},
async onEnvironmentResumeLease(params: PluginEnvironmentResumeLeaseParams) {
return {
providerLeaseId: params.providerLeaseId,
metadata: { ...params.leaseMetadata, resumed: true },
};
},
async onEnvironmentReleaseLease(_params: PluginEnvironmentReleaseLeaseParams) {
// Release provider-side resources here
},
async onEnvironmentDestroyLease(_params: PluginEnvironmentDestroyLeaseParams) {
// Destroy provider-side resources here
},
async onEnvironmentRealizeWorkspace(params: PluginEnvironmentRealizeWorkspaceParams) {
const cwd = params.workspace.remotePath ?? params.workspace.localPath ?? "/tmp/workspace";
return { cwd, metadata: { realized: true } };
},
async onEnvironmentExecute(params: PluginEnvironmentExecuteParams) {
// Replace this with real command execution against your provider
return {
exitCode: 0,
timedOut: false,
stdout: \`Executed: \${params.command}\`,
stderr: "",
};
},
});
export default plugin;
runWorker(plugin, import.meta.url);
`,
);
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
type HealthData = {
status: "ok" | "degraded" | "error";
checkedAt: string;
};
export function DashboardWidget(_props: PluginWidgetProps) {
const { data, loading, error } = usePluginData<HealthData>("health");
if (loading) return <div>Loading environment health...</div>;
if (error) return <div>Plugin error: {error.message}</div>;
return (
<div style={{ display: "grid", gap: "0.5rem" }}>
<strong>${displayName}</strong>
<div>Health: {data?.status ?? "unknown"}</div>
<div>Checked: {data?.checkedAt ?? "never"}</div>
</div>
);
}
`,
);
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
import {
createEnvironmentTestHarness,
createFakeEnvironmentDriver,
assertEnvironmentEventOrder,
assertLeaseLifecycle,
} from "@paperclipai/plugin-sdk/testing";
import manifest from "../src/manifest.js";
import plugin from "../src/worker.js";
const ENV_ID = "env-test-1";
const BASE_PARAMS = {
driverKey: manifest.environmentDrivers![0].driverKey,
companyId: "co-1",
environmentId: ENV_ID,
config: {},
};
describe("environment plugin scaffold", () => {
it("validates config", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const result = await plugin.definition.onEnvironmentValidateConfig!({
driverKey: BASE_PARAMS.driverKey,
config: { host: "test" },
});
expect(result.ok).toBe(true);
});
it("probes the environment", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const result = await plugin.definition.onEnvironmentProbe!(BASE_PARAMS);
expect(result.ok).toBe(true);
});
it("runs a full lease lifecycle through the harness", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const lease = await harness.acquireLease({ ...BASE_PARAMS, runId: "run-1" });
expect(lease.providerLeaseId).toBeTruthy();
await harness.realizeWorkspace({
...BASE_PARAMS,
lease,
workspace: { localPath: "/tmp/test" },
});
await harness.releaseLease({
...BASE_PARAMS,
providerLeaseId: lease.providerLeaseId,
});
assertEnvironmentEventOrder(harness.environmentEvents, [
"acquireLease",
"realizeWorkspace",
"releaseLease",
]);
assertLeaseLifecycle(harness.environmentEvents, ENV_ID);
});
});
`,
);
} else {
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: ${quote(manifestId)},
@@ -331,11 +553,11 @@ const manifest: PaperclipPluginManifestV1 = {
export default manifest;
`,
);
);
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
const plugin = definePlugin({
async setup(ctx) {
@@ -363,11 +585,11 @@ const plugin = definePlugin({
export default plugin;
runWorker(plugin, import.meta.url);
`,
);
);
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginAction, usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginAction, usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
type HealthData = {
status: "ok" | "degraded" | "error";
@@ -391,11 +613,11 @@ export function DashboardWidget(_props: PluginWidgetProps) {
);
}
`,
);
);
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
import { createTestHarness } from "@paperclipai/plugin-sdk/testing";
import manifest from "../src/manifest.js";
import plugin from "../src/worker.js";
@@ -416,7 +638,8 @@ describe("plugin scaffold", () => {
});
});
`,
);
);
}
writeFile(
path.join(outputDir, "README.md"),

View File

@@ -0,0 +1,29 @@
{
"name": "@paperclipai/plugin-fake-sandbox",
"version": "0.1.0",
"description": "First-party deterministic fake sandbox provider plugin for Paperclip environments",
"type": "module",
"private": true,
"exports": {
".": "./src/index.ts"
},
"paperclipPlugin": {
"manifest": "./dist/manifest.js",
"worker": "./dist/worker.js"
},
"scripts": {
"prebuild": "node ../../../scripts/ensure-plugin-build-deps.mjs",
"build": "tsc",
"clean": "rm -rf dist",
"typecheck": "pnpm --filter @paperclipai/plugin-sdk build && tsc --noEmit",
"test": "vitest run --config vitest.config.ts"
},
"dependencies": {
"@paperclipai/plugin-sdk": "workspace:*"
},
"devDependencies": {
"@types/node": "^24.6.0",
"typescript": "^5.7.3",
"vitest": "^3.2.4"
}
}

View File

@@ -0,0 +1,2 @@
export { default as manifest } from "./manifest.js";
export { default as plugin } from "./plugin.js";

View File

@@ -0,0 +1,50 @@
import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const PLUGIN_ID = "paperclip.fake-sandbox-provider";
const PLUGIN_VERSION = "0.1.0";
const manifest: PaperclipPluginManifestV1 = {
id: PLUGIN_ID,
apiVersion: 1,
version: PLUGIN_VERSION,
displayName: "Fake Sandbox Provider",
description:
"First-party deterministic sandbox provider plugin for exercising Paperclip provider-plugin integration without external infrastructure.",
author: "Paperclip",
categories: ["automation"],
capabilities: ["environment.drivers.register"],
entrypoints: {
worker: "./dist/worker.js",
},
environmentDrivers: [
{
driverKey: "fake-plugin",
kind: "sandbox_provider",
displayName: "Fake Sandbox Provider",
description:
"Runs commands in an isolated local temporary directory while exercising the sandbox provider plugin lifecycle.",
configSchema: {
type: "object",
properties: {
image: {
type: "string",
description: "Deterministic fake image label for metadata and matching.",
default: "fake:latest",
},
timeoutMs: {
type: "number",
description: "Command timeout in milliseconds.",
default: 300000,
},
reuseLease: {
type: "boolean",
description: "Whether to reuse fake leases by environment id.",
default: false,
},
},
},
},
],
};
export default manifest;

View File

@@ -0,0 +1,228 @@
import { describe, expect, it } from "vitest";
import {
assertEnvironmentEventOrder,
createEnvironmentTestHarness,
} from "@paperclipai/plugin-sdk/testing";
import manifest from "./manifest.js";
import plugin from "./plugin.js";
describe("fake sandbox provider plugin", () => {
it("runs a deterministic provider lifecycle through environment hooks", async () => {
const definition = plugin.definition;
const harness = createEnvironmentTestHarness({
manifest,
environmentDriver: {
driverKey: "fake-plugin",
onValidateConfig: definition.onEnvironmentValidateConfig,
onProbe: definition.onEnvironmentProbe,
onAcquireLease: definition.onEnvironmentAcquireLease,
onResumeLease: definition.onEnvironmentResumeLease,
onReleaseLease: definition.onEnvironmentReleaseLease,
onDestroyLease: definition.onEnvironmentDestroyLease,
onRealizeWorkspace: definition.onEnvironmentRealizeWorkspace,
onExecute: definition.onEnvironmentExecute,
},
});
const base = {
driverKey: "fake-plugin",
companyId: "company-1",
environmentId: "env-1",
config: { image: "fake:test", reuseLease: false },
};
const validation = await harness.validateConfig({
driverKey: "fake-plugin",
config: base.config,
});
expect(validation).toMatchObject({
ok: true,
normalizedConfig: { image: "fake:test", reuseLease: false },
});
const probe = await harness.probe(base);
expect(probe).toMatchObject({
ok: true,
metadata: { provider: "fake-plugin", image: "fake:test" },
});
const lease = await harness.acquireLease({ ...base, runId: "run-1" });
expect(lease.providerLeaseId).toContain("fake-plugin://run-1/");
const realized = await harness.realizeWorkspace({
...base,
lease,
workspace: { mode: "isolated_workspace" },
});
expect(realized.cwd).toContain("paperclip-fake-sandbox-");
const executed = await harness.execute({
...base,
lease,
command: "sh",
args: ["-lc", "printf fake-plugin-ok"],
cwd: realized.cwd,
timeoutMs: 10_000,
});
expect(executed).toMatchObject({
exitCode: 0,
timedOut: false,
stdout: "fake-plugin-ok",
});
await harness.destroyLease({
...base,
providerLeaseId: lease.providerLeaseId,
});
assertEnvironmentEventOrder(harness.environmentEvents, [
"validateConfig",
"probe",
"acquireLease",
"realizeWorkspace",
"execute",
"destroyLease",
]);
});
it("does not expose host-only environment variables to executed commands", async () => {
const previousSecret = process.env.PAPERCLIP_FAKE_PLUGIN_HOST_SECRET;
process.env.PAPERCLIP_FAKE_PLUGIN_HOST_SECRET = "should-not-leak";
try {
const definition = plugin.definition;
const harness = createEnvironmentTestHarness({
manifest,
environmentDriver: {
driverKey: "fake-plugin",
onAcquireLease: definition.onEnvironmentAcquireLease,
onDestroyLease: definition.onEnvironmentDestroyLease,
onRealizeWorkspace: definition.onEnvironmentRealizeWorkspace,
onExecute: definition.onEnvironmentExecute,
},
});
const base = {
driverKey: "fake-plugin",
companyId: "company-1",
environmentId: "env-1",
config: { image: "fake:test", reuseLease: false },
};
const lease = await harness.acquireLease({ ...base, runId: "run-1" });
const realized = await harness.realizeWorkspace({
...base,
lease,
workspace: { mode: "isolated_workspace" },
});
const executed = await harness.execute({
...base,
lease,
command: "sh",
args: ["-lc", "test -z \"${PAPERCLIP_FAKE_PLUGIN_HOST_SECRET+x}\" && printf \"$EXPLICIT_ONLY\""],
cwd: realized.cwd,
env: { EXPLICIT_ONLY: "visible" },
timeoutMs: 10_000,
});
expect(executed).toMatchObject({
exitCode: 0,
timedOut: false,
stdout: "visible",
});
await harness.destroyLease({
...base,
providerLeaseId: lease.providerLeaseId,
});
} finally {
if (previousSecret === undefined) {
delete process.env.PAPERCLIP_FAKE_PLUGIN_HOST_SECRET;
} else {
process.env.PAPERCLIP_FAKE_PLUGIN_HOST_SECRET = previousSecret;
}
}
});
it("includes /usr/local/bin in the default PATH when no PATH override is provided", async () => {
const definition = plugin.definition;
const harness = createEnvironmentTestHarness({
manifest,
environmentDriver: {
driverKey: "fake-plugin",
onAcquireLease: definition.onEnvironmentAcquireLease,
onDestroyLease: definition.onEnvironmentDestroyLease,
onRealizeWorkspace: definition.onEnvironmentRealizeWorkspace,
onExecute: definition.onEnvironmentExecute,
},
});
const base = {
driverKey: "fake-plugin",
companyId: "company-1",
environmentId: "env-1",
config: { image: "fake:test", reuseLease: false },
};
const lease = await harness.acquireLease({ ...base, runId: "run-1" });
const realized = await harness.realizeWorkspace({
...base,
lease,
workspace: { mode: "isolated_workspace" },
});
const executed = await harness.execute({
...base,
lease,
command: "sh",
args: ["-lc", "printf %s \"$PATH\""],
cwd: realized.cwd,
timeoutMs: 10_000,
});
expect(executed.stdout).toContain("/usr/local/bin");
await harness.destroyLease({
...base,
providerLeaseId: lease.providerLeaseId,
});
});
it("escalates to SIGKILL after timeout if the child ignores SIGTERM", async () => {
const definition = plugin.definition;
const harness = createEnvironmentTestHarness({
manifest,
environmentDriver: {
driverKey: "fake-plugin",
onAcquireLease: definition.onEnvironmentAcquireLease,
onDestroyLease: definition.onEnvironmentDestroyLease,
onRealizeWorkspace: definition.onEnvironmentRealizeWorkspace,
onExecute: definition.onEnvironmentExecute,
},
});
const base = {
driverKey: "fake-plugin",
companyId: "company-1",
environmentId: "env-1",
config: { image: "fake:test", reuseLease: false },
};
const lease = await harness.acquireLease({ ...base, runId: "run-1" });
const realized = await harness.realizeWorkspace({
...base,
lease,
workspace: { mode: "isolated_workspace" },
});
const executed = await harness.execute({
...base,
lease,
command: "sh",
args: ["-lc", "trap '' TERM; while :; do sleep 1; done"],
cwd: realized.cwd,
timeoutMs: 100,
});
expect(executed.timedOut).toBe(true);
expect(executed.exitCode).toBeNull();
await harness.destroyLease({
...base,
providerLeaseId: lease.providerLeaseId,
});
});
});

View File

@@ -0,0 +1,282 @@
import { randomUUID } from "node:crypto";
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { spawn } from "node:child_process";
import { definePlugin } from "@paperclipai/plugin-sdk";
import type {
PluginEnvironmentAcquireLeaseParams,
PluginEnvironmentDestroyLeaseParams,
PluginEnvironmentExecuteParams,
PluginEnvironmentExecuteResult,
PluginEnvironmentLease,
PluginEnvironmentProbeParams,
PluginEnvironmentProbeResult,
PluginEnvironmentRealizeWorkspaceParams,
PluginEnvironmentRealizeWorkspaceResult,
PluginEnvironmentReleaseLeaseParams,
PluginEnvironmentResumeLeaseParams,
PluginEnvironmentValidateConfigParams,
PluginEnvironmentValidationResult,
} from "@paperclipai/plugin-sdk";
interface FakeDriverConfig {
image: string;
timeoutMs: number;
reuseLease: boolean;
}
interface FakeLeaseState {
providerLeaseId: string;
rootDir: string;
remoteCwd: string;
image: string;
reuseLease: boolean;
}
const leases = new Map<string, FakeLeaseState>();
const DEFAULT_FAKE_SANDBOX_PATH = "/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin";
const FAKE_SANDBOX_SIGKILL_GRACE_MS = 250;
function parseConfig(raw: Record<string, unknown>): FakeDriverConfig {
return {
image: typeof raw.image === "string" && raw.image.trim().length > 0 ? raw.image.trim() : "fake:latest",
timeoutMs: typeof raw.timeoutMs === "number" && Number.isFinite(raw.timeoutMs) ? raw.timeoutMs : 300_000,
reuseLease: raw.reuseLease === true,
};
}
async function createLeaseState(input: {
providerLeaseId: string;
image: string;
reuseLease: boolean;
}): Promise<FakeLeaseState> {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-fake-sandbox-"));
const remoteCwd = path.join(rootDir, "workspace");
await mkdir(remoteCwd, { recursive: true });
const state = {
providerLeaseId: input.providerLeaseId,
rootDir,
remoteCwd,
image: input.image,
reuseLease: input.reuseLease,
};
leases.set(input.providerLeaseId, state);
return state;
}
function leaseMetadata(state: FakeLeaseState) {
return {
provider: "fake-plugin",
image: state.image,
reuseLease: state.reuseLease,
remoteCwd: state.remoteCwd,
fakeRootDir: state.rootDir,
};
}
async function removeLease(providerLeaseId: string | null | undefined): Promise<void> {
if (!providerLeaseId) return;
const state = leases.get(providerLeaseId);
leases.delete(providerLeaseId);
if (state) {
await rm(state.rootDir, { recursive: true, force: true });
}
}
function buildCommandLine(command: string, args: string[] | undefined): string {
return [command, ...(args ?? [])].join(" ");
}
function buildCommandEnvironment(explicitEnv: Record<string, string> | undefined): Record<string, string> {
return {
PATH: explicitEnv?.PATH ?? DEFAULT_FAKE_SANDBOX_PATH,
...(explicitEnv ?? {}),
};
}
async function runCommand(params: PluginEnvironmentExecuteParams, timeoutMs: number): Promise<PluginEnvironmentExecuteResult> {
const cwd = typeof params.cwd === "string" && params.cwd.length > 0 ? params.cwd : process.cwd();
const startedAt = new Date().toISOString();
return await new Promise((resolve, reject) => {
const child = spawn(params.command, params.args ?? [], {
cwd,
env: buildCommandEnvironment(params.env),
shell: false,
stdio: [params.stdin != null ? "pipe" : "ignore", "pipe", "pipe"],
});
let stdout = "";
let stderr = "";
let timedOut = false;
let killTimer: NodeJS.Timeout | null = null;
const timer = timeoutMs > 0
? setTimeout(() => {
timedOut = true;
child.kill("SIGTERM");
killTimer = setTimeout(() => {
child.kill("SIGKILL");
}, FAKE_SANDBOX_SIGKILL_GRACE_MS);
}, timeoutMs)
: null;
child.stdout?.on("data", (chunk) => {
stdout += String(chunk);
});
child.stderr?.on("data", (chunk) => {
stderr += String(chunk);
});
child.on("error", (error) => {
if (timer) clearTimeout(timer);
if (killTimer) clearTimeout(killTimer);
reject(error);
});
child.on("close", (code, signal) => {
if (timer) clearTimeout(timer);
if (killTimer) clearTimeout(killTimer);
resolve({
exitCode: timedOut ? null : code,
signal,
timedOut,
stdout,
stderr,
metadata: {
startedAt,
commandLine: buildCommandLine(params.command, params.args),
},
});
});
if (params.stdin != null && child.stdin) {
child.stdin.write(params.stdin);
child.stdin.end();
}
});
}
const plugin = definePlugin({
async setup(ctx) {
ctx.logger.info("Fake sandbox provider plugin ready");
},
async onHealth() {
return { status: "ok", message: "Fake sandbox provider plugin healthy" };
},
async onEnvironmentValidateConfig(
params: PluginEnvironmentValidateConfigParams,
): Promise<PluginEnvironmentValidationResult> {
const config = parseConfig(params.config);
return {
ok: true,
normalizedConfig: { ...config },
};
},
async onEnvironmentProbe(
params: PluginEnvironmentProbeParams,
): Promise<PluginEnvironmentProbeResult> {
const config = parseConfig(params.config);
return {
ok: true,
summary: `Fake sandbox provider is ready for image ${config.image}.`,
metadata: {
provider: "fake-plugin",
image: config.image,
timeoutMs: config.timeoutMs,
reuseLease: config.reuseLease,
},
};
},
async onEnvironmentAcquireLease(
params: PluginEnvironmentAcquireLeaseParams,
): Promise<PluginEnvironmentLease> {
const config = parseConfig(params.config);
const providerLeaseId = config.reuseLease
? `fake-plugin://${params.environmentId}`
: `fake-plugin://${params.runId}/${randomUUID()}`;
const existing = leases.get(providerLeaseId);
const state = existing ?? await createLeaseState({
providerLeaseId,
image: config.image,
reuseLease: config.reuseLease,
});
return {
providerLeaseId,
metadata: {
...leaseMetadata(state),
resumedLease: Boolean(existing),
},
};
},
async onEnvironmentResumeLease(
params: PluginEnvironmentResumeLeaseParams,
): Promise<PluginEnvironmentLease> {
const config = parseConfig(params.config);
const existing = leases.get(params.providerLeaseId);
const state = existing ?? await createLeaseState({
providerLeaseId: params.providerLeaseId,
image: config.image,
reuseLease: config.reuseLease,
});
return {
providerLeaseId: state.providerLeaseId,
metadata: {
...leaseMetadata(state),
resumedLease: true,
},
};
},
async onEnvironmentReleaseLease(
params: PluginEnvironmentReleaseLeaseParams,
): Promise<void> {
const config = parseConfig(params.config);
if (!config.reuseLease) {
await removeLease(params.providerLeaseId);
}
},
async onEnvironmentDestroyLease(
params: PluginEnvironmentDestroyLeaseParams,
): Promise<void> {
await removeLease(params.providerLeaseId);
},
async onEnvironmentRealizeWorkspace(
params: PluginEnvironmentRealizeWorkspaceParams,
): Promise<PluginEnvironmentRealizeWorkspaceResult> {
const state = params.lease.providerLeaseId
? leases.get(params.lease.providerLeaseId)
: null;
const remoteCwd =
state?.remoteCwd ??
(typeof params.lease.metadata?.remoteCwd === "string" ? params.lease.metadata.remoteCwd : null) ??
params.workspace.remotePath ??
params.workspace.localPath ??
path.join(os.tmpdir(), "paperclip-fake-sandbox-workspace");
await mkdir(remoteCwd, { recursive: true });
return {
cwd: remoteCwd,
metadata: {
provider: "fake-plugin",
remoteCwd,
},
};
},
async onEnvironmentExecute(
params: PluginEnvironmentExecuteParams,
): Promise<PluginEnvironmentExecuteResult> {
const config = parseConfig(params.config);
return await runCommand(params, params.timeoutMs ?? config.timeoutMs);
},
});
export default plugin;

View File

@@ -0,0 +1,5 @@
import { runWorker } from "@paperclipai/plugin-sdk";
import plugin from "./plugin.js";
export default plugin;
runWorker(plugin, import.meta.url);

View File

@@ -0,0 +1,10 @@
{
"extends": "../../../tsconfig.json",
"compilerOptions": {
"outDir": "dist",
"rootDir": "src",
"lib": ["ES2023"],
"types": ["node", "vitest"]
},
"include": ["src"]
}

Some files were not shown because too many files have changed in this diff Show More