Compare commits

...

247 Commits

Author SHA1 Message Date
Dotta
bcfea025e8 Wake assignee after interaction cancellation
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 10:16:18 -05:00
Dotta
773ba92e93 Stabilize new issue dialog submit test
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:59:46 -05:00
Dotta
d34b367bec Stabilize paused issue detail assertion
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:54:28 -05:00
Dotta
7bf9fde510 Allow workflow-controlled review handoff
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:49:21 -05:00
Dotta
2ed7901f1b Default live run polling cap
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:38:57 -05:00
Dotta
1cc3838261 Fix PR validation follow-ups
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:31:06 -05:00
Dotta
83d8a62d5a Remove lockfile from PAP-2946 rollup PR
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:21:51 -05:00
Dotta
10398a933b Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-30 09:01:37 -05:00
Devin Foley
c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge
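
The timeout pass-through can be sketched as follows (a minimal illustration, not the actual driver code; `resolveRpcTimeout` and the 120s default are assumptions):

```typescript
// Hypothetical shape of the plugin environment RPC options.
interface RpcOptions {
  timeoutMs?: number; // timeout carried on the environment lease, if any
}

const DEFAULT_RPC_TIMEOUT_MS = 120_000; // illustrative fallback, not the real default

// Before the fix the driver always used the hardcoded default; after, a
// configured timeout from the lease wins and the default is only a fallback.
function resolveRpcTimeout(opts: RpcOptions): number {
  return opts.timeoutMs ?? DEFAULT_RPC_TIMEOUT_MS;
}
```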

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley
a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export
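
The request policy described above can be sketched as a pure check (names, the `/api/` prefix, and the size limit are illustrative assumptions, not the bridge's actual code):

```typescript
// Hypothetical policy check mirroring the bridge rules: API-path allowlist,
// content-type validation, and a request body size cap.
interface BridgeRequest {
  path: string;
  contentType: string;
  bodyBytes: number;
}

const MAX_BODY_BYTES = 1_000_000; // illustrative limit
const ALLOWED_CONTENT_TYPES = ["application/json"];

function isRequestAllowed(req: BridgeRequest): boolean {
  if (!req.path.startsWith("/api/")) return false; // reject non-API paths
  if (!ALLOWED_CONTENT_TYPES.includes(req.contentType)) return false;
  if (req.bodyBytes > MAX_BODY_BYTES) return false;
  return true;
}
```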

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley
4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup
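
The insert-to-upsert change can be illustrated in memory (an analogy for SQL `INSERT ... ON CONFLICT (run_id) DO UPDATE`; the row shape and names are assumptions, not the heartbeat service's actual schema):

```typescript
// With a plain insert, a second concurrent writer for the same run ID throws
// a duplicate-key error; with upsert, it updates the existing row instead.
type RuntimeState = { runId: string; status: string };

const table = new Map<string, RuntimeState>();

function insertRuntimeState(row: RuntimeState): void {
  if (table.has(row.runId)) throw new Error("duplicate key: " + row.runId);
  table.set(row.runId, row);
}

function upsertRuntimeState(row: RuntimeState): void {
  table.set(row.runId, row); // insert or update, no duplicate-key failure
}
```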

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state fix changes an insert to an upsert, which
could mask a legitimate duplicate if two different runs produced the same
key, but the run ID being part of the key prevents this. The plugin
startup await is bounded by the existing registration timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Devin Foley
f9cf1d2f6a Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs
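
The resolution rule for the default command can be sketched as (a simplified illustration; the real `prepareCursorSandboxCommand` also probes the binary and adjusts PATH):

```typescript
// On sandbox targets the default "cursor-agent" command is rewritten to the
// per-user install path under the remote $HOME, while explicit custom
// commands pass through untouched.
const DEFAULT_COMMAND = "cursor-agent";

function resolveSandboxCommand(remoteHome: string, requested: string): string {
  if (requested !== DEFAULT_COMMAND) return requested; // preserve overrides
  return `${remoteHome}/.local/bin/cursor-agent`;
}
```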

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley
a0f5cbffd7 Harden release flow with registry verification and dist-tag checks (#4800)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early

## What Changed

- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow
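
The core comparison the script performs can be sketched like this (function name and message formats are illustrative; the real script also detects orphaned tags and queries the registry):

```typescript
// Compare a package's published dist-tags (tag -> version, as returned in an
// npm registry packument) against the expected assignments.
function findDistTagMismatches(
  published: Record<string, string>,
  expected: Record<string, string>,
): string[] {
  const problems: string[] = [];
  for (const [tag, version] of Object.entries(expected)) {
    if (published[tag] === undefined) {
      problems.push(`missing tag ${tag}`);
    } else if (published[tag] !== version) {
      problems.push(`tag ${tag} points at ${published[tag]}, expected ${version}`);
    }
  }
  return problems;
}
```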

## Verification

- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push

## Risks

- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:20 -07:00
Devin Foley
367d4cab72 Fix SSH callback URL selection for LAN and private networks (#4799)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run on remote hosts via SSH environments
> - When a remote agent needs to call back to the Paperclip API, it
needs a reachable URL
> - But the runtime API URL candidate builder did not account for
private network topologies where the server is only reachable via LAN or
VPN addresses
> - Agents on SSH hosts were failing to connect because the callback URL
pointed to localhost or an unreachable address
> - This PR fixes callback URL selection to honor `PAPERCLIP_API_URL`,
prefer LAN-reachable candidates, filter unreachable link-local
addresses, and include interface hosts in onboarding invite URLs
> - The benefit is SSH-based agents can reliably reach the Paperclip API
on private networks without manual URL configuration

## What Changed

- `runtime-api.ts`: Added `PAPERCLIP_API_URL` as a first-priority
candidate in `buildRuntimeApiCandidateUrls`; extracted
`collectReachableInterfaceHosts` to enumerate non-loopback,
non-link-local network interface IPs with IPv4 preference
- `server/src/index.ts`: Export `PAPERCLIP_API_URL` from the server
environment so it is available to callback candidate resolution
- `server/src/routes/access.ts`: Include LAN interface hosts in
onboarding invite connection candidates
- `server/src/config.ts`: Attempted auto-allowing LAN interface hosts,
then reverted to the per-instance allowlist approach (both commits
included for history clarity)
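
The interface filter can be sketched as follows (the input shape mirrors Node's `os.networkInterfaces()` entries, but this simplified function and its name are assumptions):

```typescript
// Drop loopback and IPv4 link-local (169.254.0.0/16) addresses, keep
// LAN-reachable IPv4 candidates, per the IPv4 preference described above.
interface InterfaceAddress {
  address: string;
  family: "IPv4" | "IPv6";
  internal: boolean; // loopback interfaces are marked internal
}

function collectReachableHosts(addrs: InterfaceAddress[]): string[] {
  return addrs
    .filter((a) => !a.internal) // no loopback
    .filter((a) => a.family === "IPv4") // IPv4 preference
    .filter((a) => !a.address.startsWith("169.254.")) // no link-local
    .map((a) => a.address);
}
```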

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
LAN candidate ordering and link-local filtering
- `pnpm typecheck` — clean
- Manual: start a Paperclip server on a machine with a LAN IP, create an
SSH environment pointing to another host on the same LAN, verify the
agent's callback URL uses the LAN IP rather than localhost

## Risks

- Low-medium. The candidate list now includes more addresses (all
non-loopback LAN interfaces). These are candidates for the agent to try,
not an allowlist — the server's allowed hostnames still gate which
origins are accepted. Ordering change (LAN preferred over loopback)
could affect existing setups where localhost was intentionally
preferred.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:17 -07:00
Devin Foley
9b99d30330 Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow
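
The optional-parameter fallback can be sketched as (a simplified illustration; the real `resolveTestExecutionTarget` also handles cwd fallback for remote environments):

```typescript
// An explicit environmentId selects that environment; otherwise the test
// falls back to local execution, so existing callers are unaffected.
type ExecutionTarget = { kind: "local" } | { kind: "environment"; id: string };

function resolveTestTarget(environmentId?: string): ExecutionTarget {
  if (environmentId) return { kind: "environment", id: environmentId };
  return { kind: "local" };
}
```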

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
Dotta
703e268c96 Stop liveness recovery from refreshing covered blockers
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 17:08:54 -05:00
Dotta
e60f2f7b5e Cover liveness blocker escalations after revert
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 16:56:02 -05:00
Dotta
dd57bfba21 Revert "Split run progress from liveness disposition"
This reverts commit 55518170ad.
2026-04-29 16:02:22 -05:00
Dotta
caf37a355d Revert "PAP-2731 Implement sky explicit-wait UI surfaces"
This reverts commit 08a9e0caf0.
2026-04-29 16:02:11 -05:00
Dotta
0a345e9648 Handle liveness escalation blockers as waiting
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 14:29:47 -05:00
Dotta
398d85488c Guard liveness recovery by issue creation time 2026-04-29 14:24:30 -05:00
Dotta
41a989e00c Make database backups full logical dumps
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 14:20:19 -05:00
Dotta
ba88e86f63 Cancel queued runs for cancelled issues 2026-04-29 14:15:24 -05:00
Dotta
ed51123e3c PAP-2852 Add space-width gap between caret and mention menu
Per visual feedback the menu was sitting flush against the cursor;
introduce MENTION_MENU_CARET_GAP (10px) so the popover floats roughly
one space-width to the right of the caret.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 14:14:59 -05:00
Dotta
2ba694d10f Clarify explicit waiting posture for plans 2026-04-29 14:02:53 -05:00
Dotta
02dd5309b3 Cover live-run polling cap
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 14:01:06 -05:00
Dotta
33bcaa5a4c Allow recovery owners to cancel stale runs
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 14:00:36 -05:00
Dotta
229a4c60a2 Limit legacy disposition preview lookback
Default the legacy execution-disposition preview to recently created issues, keep an explicit --all escape hatch, and cover the lookback behavior with a targeted test.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 13:54:24 -05:00
Dotta
b13c9f5fba Limit legacy disposition preview to recent issues
Default the legacy execution-disposition preview to a 48-hour lookback and require --all for full historical scans. This keeps repair passes from treating stale blocked/review rows as current cleanup work unless the operator explicitly asks for a backfill.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 13:35:23 -05:00
Dotta
4c5eaeadd5 PAP-2822 Walk back visible disposition badges per board feedback
The board accepted the canonical executionDisposition payload + classifier
but found the visible badges in IssueRow / IssueProperties / Storybook too
noisy on production surfaces. UI-only walkback: keep the data layer, drop
every visible badge.

- Delete IssueDispositionBadge component and its test.
- Drop the badge import, suppression rule, and trailing-slot references in
  IssueRow (desktop + mobile). Existing waiting pill, status-icon ring, and
  blockerAttention copy remain unchanged.
- Drop the badge IIFE and related imports from IssueProperties so the
  Status row only renders the StatusIcon again.
- Prune the badge cases from IssueRow.test (Live, Stalled, dispatchable,
  Resuming, explicit-waiting suppression) so the file still type-checks.
- Storybook: remove "Canonical disposition badges" and "IssueRow with
  disposition badge inline" sections plus DispositionRowSurface and
  dispositionRowFixtures. The rest of Status Language is unchanged.

Kept: executionDisposition on issue payloads, the resolver and its test,
and the PAP-2818 contract doc — future features can consume the field
without re-running the classifier work.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 13:25:59 -05:00
Dotta
7690c21148 Add legacy execution disposition preview 2026-04-29 13:11:49 -05:00
Dotta
7fc94107cf PAP-2818 Document execution disposition contract
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 13:04:55 -05:00
Dotta
326fd2f6fb Allow cancelling question interactions
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 11:54:37 -05:00
Dotta
0b85f56d27 PAP-2853 Close 8px flip-gap on mention menu
Align CSS max-height with MENTION_MENU_HEIGHT (208) so the rendered
menu fills the layout-reserved space; previously max-h-[200px] clipped
the box 8px short of the JS layout budget, leaving a visible gap above
the caret line on the flipped path.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 11:48:01 -05:00
Dotta
aa1628008e PAP-2822 Address UX review on disposition badge
- Drop the dispatchable "Ready" variant from the badge: the only place it ever
  rendered was Storybook (IssueRow and IssueProperties already returned null
  for it). dispositionCategory now returns null for dispatchable.
- Pull agent_continuable into a discrete "Resuming" category with a muted
  indigo treatment so it doesn't share the rose alarm family with Recovery /
  Needs attention / Stalled, and surface "Resuming · n/m" on the visible
  badge instead of hiding the attempt counter in the tooltip.
- Differentiate human_escalation_required by owner on the visible label:
  Needs board / Needs manager / Needs owner / Needs external. Color stays in
  the rose family but the actionable bit is no longer hover-only.
- Bump invalid (Stalled) to a solid rose-600 fill so it stands out against
  the rest of the rose family at scan distance, satisfying the parent plan's
  acceptance criterion that the board can spot a stalled review leaf without
  reading the full thread.
- Mirror the IssueRow suppression rule on IssueProperties so the Status row
  doesn't double-render the same waiting/recovery signal as the chat-thread
  IssueBlockedNotice.
- Add aria-label={tooltip} on the badge so screen readers get the path
  detail, not just the category label.
- Storybook: rename and split the disposition matrix to cover all four
  escalation owners, both Resuming attempts, and add a populated IssueRow
  surface that shows the badge in production layout context (Live, Resuming,
  Needs board, Needs manager, Stalled).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 11:45:11 -05:00
Dotta
925ac9a3c1 PAP-2852 Anchor mention menu to caret and allow touch scrolling
Mention menu now sits on the same line as the caret and to the right of
the cursor (tracking each typed character instead of staying glued to
the @ marker). When the menu would overflow below the viewport it flips
above the caret line instead of covering it.

Touch handling no longer preventDefaults on touchstart, so the dropdown
scrolls natively on mobile; selection runs only on touchend if the
finger lifted with negligible movement (a tap, not a scroll).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 11:16:43 -05:00
Dotta
6ac545bae7 PAP-2822 Render execution disposition badge in board UI
Adds IssueDispositionBadge that maps the canonical disposition to a
small palette of categories (live, ready, waiting, blocked-chain,
recovery, needs attention, stalled). Renders the badge in the inbox
row and on the issue detail status row, suppressing it where the
existing waiting/recovery copy already conveys the same signal so we
don't double up. Storybook gains a section showing every category for
UX review.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:50:32 -05:00
Dotta
2b3a7f683b PAP-2822 Expose canonical executionDisposition on issue payloads
Adds a per-issue read-model resolver that loads the disposition state
vector (active runs, queued wakes, agents, interactions, approvals,
recovery/productivity-review issues, latest run evidence) and runs the
shared classifier. Wires it into issue list, detail, and
heartbeat-context responses, and surfaces it on the shared Issue type
as `executionDisposition` so the board can read execution health
without inferring it from status alone.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:50:25 -05:00
Dotta
cbd091b3ef Harden agent issue authz
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:40:32 -05:00
Dotta
ed6600b9d1 Hide redundant sibling blocker chips
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:38:29 -05:00
Dotta
613812d1c0 Add terminal bench loop skill smoke 2026-04-29 10:33:13 -05:00
Dotta
04c85b6db6 Integrate canonical recovery dispositions
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:28:23 -05:00
Dotta
d354ecadfa PAP-2820 Enforce issue transition dispositions 2026-04-29 10:26:50 -05:00
Dotta
9413f55381 Add canonical issue execution disposition classifier
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 10:19:22 -05:00
Dotta
628d879978 Move issue cost summary above activity feed
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 09:49:51 -05:00
Dotta
eb6957fa86 Fix mention chip padding cascade 2026-04-29 09:38:50 -05:00
Dotta
229487afef Unify issue activity ledger
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-29 08:49:35 -05:00
Dotta
ffe319a3f7 PAP-2789 Simplify waiting banner Owner/Resume rows and fix Jump anchor
The explicit-wait IssueBlockedNotice rendered Target/Owner/Resume as a row
of pills. Drop the Target pill entirely and render the remaining Owner and
Resume entries as a plain dl/dt/dd block — the value matters more than the
chip styling for these long-form details.

The "Jump to questions/confirmation/approval" affordance referenced anchor
ids that did not exist in the DOM (`issue-thread-interaction-{id}` and
`issue-approval-{id}`); the actual anchors are `interaction-{id}` and
`approval-{id}`. Switch to those ids and use a button with
scrollIntoView({behavior:"smooth"}) so the jump actually scrolls to the
target.

Add a small ExplicitWaitingJumpAnchor component that flips the arrow ↑/↓
based on whether the target is above or below the banner, recomputing on
scroll/resize.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 22:34:52 -05:00
Dotta
bcc88b561a PAP-2778 Wrap waiting pills instead of overflowing
The Target/Owner/Resume chips in the explicit-wait IssueBlockedNotice
banner used inline-flex with `truncate max-w-[Nrem]` on inner spans, so
on narrow viewports a single chip (especially Resume's full sentence)
extended past the container edge. Cap each chip at max-w-full and let
the value span wrap on word boundaries; desktop layout is unchanged.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 21:07:29 -05:00
Dotta
9703830313 PAP-2753 Add copy button to markdown code blocks
Wraps fenced code blocks in MarkdownBody with a relative container and an
absolute-positioned Copy button (top-right) that copies the block contents
to the clipboard via navigator.clipboard with an execCommand fallback for
non-secure contexts. The button is hidden by default and reveals on hover
or focus, swaps to a check icon and "Copied!" label for ~1.5s after
clicking, and falls back to "Copy failed" when the clipboard write throws.
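
The write-with-fallback flow looks roughly like this (a hypothetical sketch: `writeViaApi` and `writeViaExecCommand` stand in for `navigator.clipboard.writeText` and the `document.execCommand("copy")` path, which need a browser environment):

```typescript
// Try the async Clipboard API first; fall back to the legacy path in
// non-secure contexts; report failure when either path rejects.
async function copyWithFallback(
  text: string,
  writeViaApi: ((t: string) => Promise<void>) | undefined,
  writeViaExecCommand: (t: string) => boolean,
): Promise<"Copied!" | "Copy failed"> {
  try {
    if (writeViaApi) {
      await writeViaApi(text);
    } else if (!writeViaExecCommand(text)) {
      return "Copy failed";
    }
    return "Copied!";
  } catch {
    return "Copy failed";
  }
}
```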

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 19:05:52 -05:00
Dotta
08a9e0caf0 PAP-2731 Implement sky explicit-wait UI surfaces
Replaces the rose liveness-break banner for issues whose blocker-attention
read model reports `state="covered"` with `reason="explicit_waiting"`.

- IssueBlockedNotice: add a sky-toned banner with Hourglass icon, kind-aware
  headline/body copy (plan confirmation, user response, suggested tasks,
  board approval, generic fallback), Target/Owner/Resume pill row, a "Jump
  to confirmation" anchor, and a sky chip strip for parent-of-leaf chain
  waits. Rose banner still wins on tie via the existing precedence.
- StatusIcon: add a sky ring + bottom-right nub for explicit_waiting and
  emit `data-blocker-attention-state="explicit_waiting"`. Tooltip is
  "Waiting on board/user/agent" or prefixed "Blocked · ..." when blocked.
- IssueRow: add a sky `Waiting · {owner}` pill (collapses to `Waiting` on
  mobile) when the row's attention reason is explicit_waiting. Owner label
  resolves via new optional agent/user maps wired in by IssuesList & Inbox.
- IssueDetail/IssueChatThread: lift the linkedApprovals query into the chat
  tab and pass interactions/approvals/agentMap/userLabelMap/currentUserId
  through to the new banner.
- Storybook + tests: extend coveredBlockedMatrix with sky cells, add four
  new BlockedNotice fixtures (plan, user response, board approval, chain),
  and cover the new states with unit tests on StatusIcon, IssueBlockedNotice,
  and IssueRow.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 19:00:13 -05:00
Dotta
5adbe69381 Harden pending wait recovery gating
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 18:31:24 -05:00
Dotta
7f9ae0adcd Harden recovery wait boundaries 2026-04-28 18:20:46 -05:00
Dotta
c1fd7521a8 PAP-2729 Handle interaction waiting lifecycle
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 18:13:05 -05:00
Dotta
1e68cb9be0 Handle explicit waits in liveness read models
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 18:06:29 -05:00
Dotta
b72ab70a4f Fix pause resume QA blockers
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 18:02:28 -05:00
Dotta
3494e84a29 Add v2026.428.0 release changelog (#4665)
## Summary

- Adds `releases/v2026.428.0.md` covering the diff between `v2026.427.0`
and `origin/master` (seven merged PRs).
- Generated via `.agents/skills/release-changelog/SKILL.md`.
- Flags the additive migrations `0071_default_hire_approval_off` and
`0072_large_sandman` plus the new-companies hire-approval default flip
in the upgrade guide.

## Notes

- Highlight: pause/resume actions in the sidebar agents panel
([#4616](https://github.com/paperclipai/paperclip/pull/4616)).
- Improvements: assigned-todo recovery dispatch, recovery issue
hardening, hire-approval opt-in default, inline selector keyboard
handling.
- Fixes: manual routine inbox visibility, stale company skill refresh
rejection, stale stored company-selection cleanup.

## Test plan

- [ ] Reviewer confirms section coverage and PR-to-bullet attribution.
- [ ] Confirm the file lands at `releases/v2026.428.0.md`.
- [ ] Confirm no canary suffix in title or filename.

Source issue: [PAP-2599](/PAP/issues/PAP-2599)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:40:20 -05:00
Dotta
7a57a54b77 Implement issue pause resume UI
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 17:38:21 -05:00
Dotta
2834e356d9 PAP-2720 Load full thread before jump-to-latest
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 17:23:36 -05:00
Dotta
6b7f6ce4b8 [codex] Split PR #4692 UI/QoL updates (#4701)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane.
> - The affected surface is the board UI for issue threads, issue lists,
routines, dialogs, navigation, and issue review indicators.
> - Closed PR #4692 bundled backend, schema, docs, workflow, and UI/QoL
work into one oversized change set.
> - Greptile could not keep reviewing that broad PR because it exceeded
the 100-file review limit and mixed unrelated concerns.
> - This pull request extracts the UI/QoL slice into a fresh branch
under the review limit while leaving workflow and lockfile churn out.
> - The benefit is a focused review path for the board UI performance
and workflow improvements without reopening the oversized PR.

## What Changed

- Added long issue-thread virtualization, scroll-container binding,
anchor preservation, latest-comment jump targeting, and related
regression/perf fixtures.
- Improved issue list scalability with scroll-based loading, server
offset parameters, and pagination-focused UI tests.
- Reduced new issue dialog typing churn and split dialog action
subscriptions so broad layout/nav surfaces avoid unnecessary renders.
- Added routine variables help and routine description mention options
for users, agents, and projects.
- Added productivity review badge/link UI and fixed the badge to use
Paperclip's company-prefixed router link.
- Kept the split PR below Greptile's review limit and excluded
`.github/workflows/pr.yml` and `pnpm-lock.yaml`.

## Verification

- `pnpm install --no-frozen-lockfile` in the clean worktree to install
`@tanstack/react-virtual` locally without committing lockfile churn.
- `pnpm --filter @paperclipai/ui exec vitest run --config
vitest.config.ts src/components/IssueChatThread.test.tsx
src/components/IssuesList.test.tsx
src/components/NewIssueDialog.test.tsx src/pages/Routines.test.tsx
src/pages/Issues.test.tsx` passed: 5 files, 83 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- `git diff --check origin/master..HEAD` passed.
- Split-scope checks: 53 changed files; no `.github/workflows/pr.yml`;
no `pnpm-lock.yaml`.
- Screenshots were not captured in this heartbeat; the changes are
primarily virtualization, routing, pagination, and editor behavior
covered by focused regression tests.

## Risks

- Moderate UI risk because issue-thread virtualization changes scroll
behavior on long conversations; regression tests cover anchor jumps,
latest-comment targeting, row metadata, and short-thread fallback.
- Moderate integration risk because the issue-list offset parameter and
productivity review field depend on matching API behavior.
- Dependency risk: the UI package adds `@tanstack/react-virtual` while
repository policy keeps `pnpm-lock.yaml` out of PRs, so CI must resolve
dependency changes through the repo's normal lockfile policy.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository and
GitHub workflow. Exact runtime context window was not exposed by the
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:18:58 -05:00
Dotta
8d1647b039 PAP-2672 Converge Jump-to-latest on the absolute newest comment
The user reported having to click Jump-to-latest ~10 times before the
viewport actually pinned to the latest comment on long threads (e.g.
PAP-2536). The virtualizer estimates 220px per unmeasured row, so
totalSize is hugely underestimated until rows render and get measured;
a single smooth scroll lands above the real bottom and each subsequent
click walks closer as more rows mount and expand.

- IssueChatThread.handleJumpToLatest now awaits an optional
  onRefreshLatestComments before resolving the latest target so any
  comment that arrived after the initial load (or was dropped on a
  live-update reconnect) is in the loaded set first.
- scrollToLatestCommentWithSettle issues the initial smooth scroll,
  then on an 80 ms tick re-issues scrollIntoView({ block: "end" }) on
  the latest comment row for up to 4 s, exiting once the row's bottom
  matches the scroll container's bottom (within 4 px) and scrollTop +
  scrollHeight have been stable for 3 ticks. If the row hasn't been
  mounted into the virtualizer buffer, scrollToIndex is used to nudge
  the offset until it is.
- IssueDetail destructures `refetch` from the comments useInfiniteQuery
  and threads refetchLatestComments through to IssueChatThread.
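
The settle condition in the second bullet can be written as a pure predicate (hypothetical names; the component samples these values from the scroll container and the latest comment row on each 80 ms tick):

```typescript
// The jump is considered settled once the row bottom is flush with the
// container bottom (within 4px) and scrollTop/scrollHeight have held
// steady for 3 consecutive ticks.
interface SettleSample {
  rowBottom: number;
  containerBottom: number;
  scrollTop: number;
  scrollHeight: number;
}

function isSettled(samples: SettleSample[]): boolean {
  if (samples.length < 3) return false;
  const recent = samples.slice(-3);
  const last = recent[recent.length - 1];
  const flush = Math.abs(last.rowBottom - last.containerBottom) <= 4;
  const stable = recent.every(
    (s) => s.scrollTop === last.scrollTop && s.scrollHeight === last.scrollHeight,
  );
  return flush && stable;
}
```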

QA verified on PAP-2720: single-click convergence in 327-407 ms across
5 successive trials, comment-fa353646 lands flush with the viewport
bottom (-0.44 px sub-pixel), stopped-scroll stability holds.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 17:09:41 -05:00
Dotta
de4ce9828d PAP-2643 Phase 3: gate run-liveness continuation and scheduled retry promotion under pause holds
Suppress bounded run-liveness continuation wakes when an active pause hold
covers the issue, and cancel due scheduled retries with errorCode `issue_paused`
instead of promoting them only to be cancelled at claim time. Adds regression
coverage for both gates plus the no-pause-hold passthrough invariant.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 16:57:28 -05:00
Dotta
1991ec9d6f [codex] Split backend control-plane QoL slice (#4700)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
backend task ownership, recovery, review visibility, and company-scoped
limits need to stay enforceable without UI-only coupling.
> - Closed PR #4692 bundled those backend changes with UI workflow,
docs, skills, workflow, and lockfile churn.
> - PAP-2694 asks for a clean backend/control-plane slice from that
closed branch.
> - This branch starts from current `master` and mines only the `cli`,
`packages/db`, `packages/shared`, and `server` contracts/tests needed
for the backend behavior.
> - It explicitly excludes UI workflow/performance work,
`.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills,
package-script, adapter UI build-config, and perf fixture script
changes; the only UI files are fixture/test updates required by the
tightened shared `Company` contract.
> - The benefit is a smaller reviewable PR that preserves the
control-plane fixes while staying under Greptile's 100-file review
limit.

## What Changed

- Added company-scoped attachment-size limits through DB
schema/migrations, shared company portability contracts, CLI
import/export coverage, and server attachment upload enforcement.
- Added productivity review service/API behavior for no-comment streak,
long-active, and high-churn review issues, including request-depth
clamping and issue summary exposure.
- Hardened issue ownership and recovery/control-plane paths: peer-agent
mutation denial, issue tree pause/resume behavior, stranded recovery
origins, and related activity/test coverage.
- Preserved related backend contract updates for routine timestamp
variables and managed agent instruction bundles because they live in
shared/server contracts from the source branch.
- Addressed Greptile feedback by making `Company.attachmentMaxBytes`
non-optional, simplifying review request-depth clamping, fixing the
migration final newline, and enforcing the process-level attachment cap
as the final ceiling for uploads.
- Added minimal company fixtures needed for repo-wide typecheck/build
and kept the PR to 66 changed files with forbidden/non-slice paths
excluded.
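
The "final ceiling" rule from the Greptile follow-up amounts to taking the minimum of the two caps (a hypothetical sketch; the real enforcement lives in the server upload path):

```typescript
// Company-scoped attachment limits are honored, but the process-level cap
// is the final ceiling for any upload.
function effectiveAttachmentMaxBytes(
  companyMaxBytes: number,
  processMaxBytes: number,
): number {
  return Math.min(companyMaxBytes, processMaxBytes);
}
```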

## Verification

- `pnpm install --frozen-lockfile`
- `git diff --check origin/master..HEAD`
- `git diff --name-only origin/master..HEAD | wc -l` -> 66 files
- `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml
pnpm-lock.yaml package.json doc skills .agents scripts
packages/adapters` -> no output
- `pnpm exec vitest run --config vitest.config.ts
packages/shared/src/validators/issue.test.ts
packages/shared/src/routine-variables.test.ts
packages/shared/src/adapter-types.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
cli/src/__tests__/company.test.ts
server/src/__tests__/productivity-review-service.test.ts
server/src/__tests__/issue-tree-control-service.test.ts
server/src/__tests__/issue-tree-control-routes.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/issue-attachment-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests
passed
- `pnpm exec vitest run --config vitest.config.ts
cli/src/__tests__/company-delete.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18
tests passed
- `pnpm exec vitest run --config vitest.config.ts
server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests
passed
- `pnpm --filter @paperclipai/db typecheck && pnpm --filter
@paperclipai/shared typecheck && pnpm --filter @paperclipai/server
typecheck && pnpm --filter paperclipai typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck && pnpm --filter
@paperclipai/ui build`

## Risks

- Includes migrations `0073_shiny_salo.sql` and
`0074_striped_genesis.sql`; merge ordering matters if another PR adds
migrations first.
- This is intentionally backend-only apart from fixture/test updates
forced by shared type correctness; UI affordances from PR #4692 are not
present here and should land in separate UI slices.
- The worktree install emitted plugin SDK bin-link warnings for unbuilt
plugin packages, but the targeted tests and package typechecks completed
successfully.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected; check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
Dotta
47d862b4b2 Show agent models on agents page
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 16:32:41 -05:00
Dotta
d9f540c331 [codex] Refresh docs and agent skills (#4693)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane
> - Contributors and agents need docs and skills that match the current
V1 behavior
> - The source branch included documentation updates alongside
implementation work
> - Keeping docs and skill guidance separate makes the implementation PR
easier to review
> - This pull request refreshes the V1 docs and agent-operating guidance
without changing runtime behavior
> - The benefit is current contributor guidance that can merge
independently from code changes

## What Changed

- Refreshed V1 product, goal, implementation, database, and development
documentation.
- Updated the Paperclip heartbeat skill guidance and create-agent skill
references.
- Added the Paperclip plan-to-task conversion skill.
- Updated release changelog skill guidance.

## Verification

- `git diff --check public-gh/master..HEAD` passed in the PR worktree
after the Greptile fix.
- Greptile Review passed on head `673317ed` with zero unresolved review
threads.
- GitHub PR checks passed on head `673317ed`: `policy`, `verify`, `e2e`,
and `security/snyk (cryppadotta)`.

## Risks

- Low runtime risk because this branch only changes docs and skill
guidance.
- Documentation may need follow-up wording adjustments if reviewers want
a different framing for V1 behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:12:03 -05:00
Dotta
38b57c5d56 Render rose recovery_needed surface across StatusIcon, BlockedNotice, and rows
StatusIcon adds an AlertTriangle overlay on the red-ring blocked variant
when the read model reports recovery_needed; tooltip and aria-label
spell out the liveness break and which leaf is invalid.

IssueBlockedNotice gets a rose container variant with two copy
templates: parent-perspective ("paused at a liveness break in [leaf]")
and leaf-perspective ("productive run that exited without queueing a
continuation"). The sub-row label flips to "Liveness break at" and an
owner pill renders with the next-action verb derived from the
nextActionHint.

IssueRow surfaces "liveness break at {leaf}" in the meta line for
recovery_needed parents so inbox/MyIssues rows show the actionable leaf
without opening the parent.

The status-language Storybook matrix gains the four new state/reason
combinations and parent/leaf-perspective notice variants. Targeted
vitest cases cover both the icon overlay and notice rendering.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 15:59:14 -05:00
Dotta
3ea2ff4187 Surface invalid-leaf recovery_needed in blockerAttention read model
A blocked parent like PAP-2089 currently reports a generic
needs_attention chain even when its leaf is the productive-run-stopped
shape from PAP-2674's plan. Extend listIssueBlockerAttentionMap to
fetch each candidate leaf's latest issue-linked succeeded run plus its
productivity-review state, classify the leaf as recovery_needed when it
matches the shape, and propagate the leaf identifier, next-action
owner, and CTA hint up to the root. The leaf itself also surfaces
recovery_needed on its own attention so the leaf-level notice can
render when an inbox wake lands directly on it.

Adds two integration cases (productive_run_stopped and
continuation_exhausted) to the embedded-Postgres regression suite.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 15:59:03 -05:00
Dotta
7d475f85bf Add recovery_needed and next-action surface to IssueBlockerAttention
Phase 5 of the liveness plan needs the blocker read model to distinguish
a "productive run stopped without continuation" leaf from generic
attention or stalled-review chains. Extend the shared type so server and
UI share one vocabulary for the new surface (state, reasons, owner,
hint).

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 15:58:54 -05:00
Dotta
2656a86dae Recover productive stranded runs 2026-04-28 15:30:19 -05:00
Dotta
4b309a9e3b Classify runnable liveness follow-up
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 15:27:32 -05:00
Dotta
55518170ad Split run progress from liveness disposition 2026-04-28 15:23:52 -05:00
Dotta
90e6a394c7 Fix productivity review badge routing
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 15:12:20 -05:00
Dotta
45f1bc293b test: encode productive terminal liveness regression
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:35:31 -05:00
Dotta
930a468772 Cover issue chat latest comment jump
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:24:07 -05:00
Dotta
b0d6ec582d Cover agent issue mutation ownership
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:24:03 -05:00
Dotta
14a149c79f Harden productivity review request depth
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:24:00 -05:00
Dotta
9465d04755 Fix subtree resume pause release
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:23:56 -05:00
Dotta
ec2c5537fc Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:19:24 -05:00
Dotta
b6cf7dd91a Enable routine description mentions
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 13:05:48 -05:00
Dotta
e71b27fbd4 Stabilize infinite issue paging
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:57:49 -05:00
Dotta
889cf660ac Page issues list with scroll offsets
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:52:09 -05:00
Dotta
19c786a20b docs: refresh v1 implementation docs
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:49:00 -05:00
Dotta
9c2dbab2fc PAP-2660: Bind issue chat virtualizer to <main> scroll container
The Phase 3 virtualizer (PAP-2645) used useWindowVirtualizer, but on the
real issue page the chat lives inside `<main id="main-content"
overflow-auto>`. Window scroll never moves on desktop, so the virtualizer
only ever mounted the first 7 rows and the chat area went blank past the
first viewport. Detect the actual scroll container at mount via DOM walk
and rekey the inner virtualizer to use element-mode observers/scroll-fn
when an overflow ancestor exists; fall back to window mode for mobile and
the auth-free perf fixture, where the document scrolls.
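
The mount-time walk is essentially this (a hypothetical sketch over a simplified node shape; the component walks real `parentElement` chains and reads computed `overflow`):

```typescript
// Climb parents until an ancestor that actually scrolls (overflow
// auto/scroll) is found; otherwise fall back to window mode.
interface NodeLike {
  overflowY: "visible" | "hidden" | "auto" | "scroll";
  parent: NodeLike | null;
}

function findScrollAncestor(start: NodeLike): NodeLike | null {
  for (let node = start.parent; node; node = node.parent) {
    if (node.overflowY === "auto" || node.overflowY === "scroll") return node;
  }
  return null; // no overflow ancestor: use window-mode virtualization
}
```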

Adds a regression test that mounts IssueChatThread inside an overflow-auto
ancestor and asserts jump-to-latest drives that element's scrollTo, not
window.scrollTo.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:45:27 -05:00
Dotta
1cb1d7c61b docs: update goal and product accuracy
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:44:54 -05:00
Dotta
9a2c345866 PAP-2659: Warn before sending issue reply with no assignee
If the issue reply composer would post without any assignee, the first
Send press now surfaces a warning toast instead of submitting. A second
press confirms the user wants to proceed unassigned, and changing the
assignee resets the confirmation.
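
The confirmation logic is a small two-state machine (hypothetical names; the real composer also raises the warning toast):

```typescript
// First unassigned Send press arms the warning; the second press submits;
// changing the assignee disarms it.
interface ComposerState {
  armed: boolean;
}

type SendOutcome = "warn" | "submit";

function pressSend(state: ComposerState, hasAssignee: boolean): SendOutcome {
  if (hasAssignee || state.armed) return "submit";
  state.armed = true;
  return "warn";
}

function onAssigneeChanged(state: ComposerState): void {
  state.armed = false;
}
```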

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 12:07:03 -05:00
Dotta
09923b01b2 Make issues page load more on scroll
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:48:02 -05:00
Dotta
39ba2797fb Preserve virtualized issue chat anchors
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:44:03 -05:00
Dotta
4c02abd919 PAP-2645: Virtualize long issue chat threads
Add @tanstack/react-virtual and gate IssueChatThread on a 150-row
threshold. Threads shorter than the threshold still use the direct
render path; longer threads window the merged feed via
useWindowVirtualizer with dynamic measurement and overscan, so the
document scroll continues to represent the full conversation height
without introducing a nested scrollbar.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:29:41 -05:00
Dotta
9a35d3bd0d Isolate issue thread row rendering
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:17:10 -05:00
Dotta
a2be4580ae Add long-thread issue chat perf fixture
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:11:26 -05:00
Dotta
647c2c487e Tighten shared project workspace runtime stop/restart auth
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 11:02:27 -05:00
Dotta
42e1b1bfa1 Split dialog action subscriptions
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 10:33:32 -05:00
Dotta
e090d9a77b PAP-2615: rename source-header productivity-review pill to "Under review"
UX feedback split the badge by role: on the source issue the linked pill
describes a state ("Under review"), while on the productivity-review issue
itself the static pill identifies the kind of issue ("Productivity review").
Tooltip and aria-label still surface the formal name on hover.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:44:03 -05:00
Dotta
484d37feb9 Reject prompt templates for new bundled agents
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:36:12 -05:00
Dotta
b779e43272 PAP-2614: Surface productivity review state in issue UI
Adds a productivityReview lookup to single-issue and list responses, plus
a ProductivityReviewBadge component that surfaces the open productivity
review on the source issue's header (with trigger/no-comment-streak
detail in the tooltip), and a compact eye glyph next to the StatusIcon
on IssueRow so operators can spot yellow tasks in any list. The review
issue itself shows a static "Productivity review" pill in the header
when its originKind matches.

Also extends the shared IssueOriginKind union with the recovery-side
kinds (harness_liveness_escalation, issue_productivity_review,
stranded_issue_recovery) so UI checks against issue.originKind type-check.

Wires the new helper through GET /issues/:id, GET /companies/:id/issues
(via issueService.list), and GET /issues/:id/heartbeat-context. Adds a
status-language Storybook story matrix for UX review.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:27:16 -05:00
Dotta
e7d32f19cd Clear legacy prompt templates for bundled agents
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:15:53 -05:00
Dotta
930b717ba5 PAP-2607: nudge inline mention chips up 1px
Reviewer follow-up — pills sat slightly low after the baseline-to-middle
switch. position: relative + top: -1px raises them to optically center
with the surrounding cap height.
2026-04-28 09:09:12 -05:00
Dotta
8b7cf25d59 test: cover first-attempt stranded recovery
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:03:08 -05:00
Dotta
82a558df86 Implement productivity review reconciler
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 09:00:02 -05:00
Dotta
2176de4b48 Separate successful continuations from recovery
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 08:48:55 -05:00
Dotta
6df68f5305 PAP-2607: align inline mention chips on text baseline
Inline-flex baseline behavior is browser-dependent, which made skill/agent/issue
chips in MarkdownBody and the MDXEditor hang below the surrounding text. Switch
to vertical-align: middle and drop the vertical padding so the pill sits
centered on the line with the prose around it.

Adds a skill-chip line to the MarkdownBody storybook fixture so this regresses
loudly next time.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 08:19:22 -05:00
Dotta
e3869bda4c Allow configurable issue attachment limits
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 08:17:52 -05:00
Dotta
559a843a25 Improve new issue dialog typing performance
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 07:58:05 -05:00
Dotta
48a7c43e23 PAP-2605: unify routine variables container and add help modal
Wrap the variables editor in a single rounded container so the
"Variables" header and the per-variable cards share one border with
internal dividers, instead of stacking as two visually separate
containers when expanded.

Add a help button to the variable instructions hint that opens a modal
explaining custom {{variable_name}} placeholders and the built-in
{{date}} and {{timestamp}} variables.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 07:42:30 -05:00
Dotta
e46d0559c8 empty section if needed 2026-04-28 07:41:54 -05:00
Dotta
3d31751fe7 exclude founders from contributors list 2026-04-28 07:41:20 -05:00
Dotta
af903a3d61 Add routines sort control 2026-04-28 07:39:29 -05:00
Dotta
fcee7e129e convert plans to tasks 2026-04-28 07:37:08 -05:00
Dotta
3f8e2a6a0d paperclip plan skill in access 2026-04-28 07:27:37 -05:00
Dotta
4401454bee planning skills 2026-04-28 07:25:29 -05:00
Dotta
55364835e9 Add {{timestamp}} built-in routine variable
Renders as a human-readable date and time in UTC (e.g. "April 28,
2026 at 12:17 PM UTC"), complementing the existing {{date}} variable.
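
Rendering along these lines can be sketched with locale-aware formatting pinned to UTC (a hypothetical sketch, not the actual implementation; exact spacing around AM/PM varies by ICU version):

```typescript
// Render a Date as a human-readable UTC date and time,
// e.g. roughly "April 28, 2026 at 12:17 PM UTC".
function renderTimestamp(now: Date): string {
  const date = now.toLocaleDateString("en-US", {
    timeZone: "UTC",
    year: "numeric",
    month: "long",
    day: "numeric",
  });
  const time = now.toLocaleTimeString("en-US", {
    timeZone: "UTC",
    hour: "numeric",
    minute: "2-digit",
  });
  return `${date} at ${time} UTC`;
}
```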

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-28 07:21:31 -05:00
Dotta
d0bdbe11a9 Stabilize inline selector keyboard handling (#4617)
## Thinking Path

> - Paperclip's board UI relies on compact selectors for frequent issue
and agent edits.
> - Inline selectors often live inside larger keyboard-aware surfaces
such as composers and popovers.
> - Arrow, enter, tab, and escape keys handled by the selector should
not leak to parent document shortcuts.
> - Stale company selection should also stay hidden until the company
list confirms it is valid.
> - This pull request tightens inline selector keyboard handling and
adds regression coverage for stale company bootstrap behavior.
> - The benefit is fewer accidental parent interactions and safer
company-scoped UI initialization.

## What Changed

- Added a stable empty `recentOptionIds` default so selector filtering
does not get a new array every render.
- Mirrored highlighted option state into a ref so Enter/Tab commits the
current highlighted option reliably after keyboard navigation.
- Stopped propagation for selector-owned navigation/commit/escape keys.
- Added jsdom regressions for inline selector keyboard handling and
CompanyProvider stale selection behavior.

## Verification

- `pnpm exec vitest run ui/src/components/InlineEntitySelector.test.tsx
ui/src/context/CompanyContext.test.tsx`
- Targeted selector and CompanyProvider tests pass cleanly without React
`act(...)` warnings.
- Screenshots not attached: this is keyboard/state behavior covered by
component tests.

## Risks

- Low risk: changes are scoped to inline selector key handling and
tests. The main behavior shift is intentionally preventing handled
selector keys from reaching parent listeners.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:04:35 -05:00
Dotta
43b0f2ae58 Add pause and resume actions to sidebar agents (#4616)
## Thinking Path

> - Paperclip operators need fast control over the agents running their
company.
> - The sidebar is the persistent place operators scan agent state while
navigating the board UI.
> - Agent pause and resume already exist as control-plane actions, but
sidebar users had to navigate away to use them.
> - This pull request adds a compact per-agent action menu in the
sidebar.
> - It keeps edit, pause, and resume close to the visible agent list
while preserving existing navigation behavior.
> - The benefit is faster operator intervention when an agent needs to
be paused or restarted.

## What Changed

- Refactored sidebar agent rows into a small item component with a
hover/focus action menu.
- Added edit, pause, and resume actions using existing agent API calls
and cache invalidation keys.
- Added success/error toasts for pause and resume mutations.
- Tracked pause/resume pending state per agent so one active mutation
does not disable every sidebar row.
- Disabled direct sidebar resume for budget-paused agents and labeled
that state clearly.
- Added jsdom coverage for active-agent pause, paused-agent resume,
per-agent pending state, and budget-paused resume protection.
- Added visual review artifacts for the default row and opened action
menu:
  - [Sidebar row](docs/pr-screenshots/pr-4616/sidebar-agent-row.png)
- [Sidebar action
menu](docs/pr-screenshots/pr-4616/sidebar-agent-actions.png)
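
The per-agent pending rule and the budget-paused guard can be sketched as below. Names (`PendingTracker`, `pausedByBudget`) are illustrative assumptions, not the shipped component API:

```typescript
// Track pending pause/resume mutations per agent id, so one in-flight
// mutation only disables its own sidebar row instead of every row.
class PendingTracker {
  private pending = new Set<string>();
  start(agentId: string): void { this.pending.add(agentId); }
  finish(agentId: string): void { this.pending.delete(agentId); }
  isDisabled(agentId: string): boolean { return this.pending.has(agentId); }
}

// Budget-paused agents must not be resumable directly from the sidebar;
// `pausedByBudget` is an assumed field name for that state.
function canResumeFromSidebar(agent: { paused: boolean; pausedByBudget: boolean }): boolean {
  return agent.paused && !agent.pausedByBudget;
}
```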

## Verification

- `pnpm exec vitest run ui/src/components/SidebarAgents.test.tsx`
- `pnpm --filter @paperclipai/server prepare:ui-dist`
- Browser screenshot pass against a temporary local trusted instance at
`http://127.0.0.1:3102` using Playwright.

## Risks

- Low risk: UI-only addition using existing agent pause/resume
endpoints. The main risk is layout crowding for very narrow sidebars,
mitigated by the icon-only trigger and existing truncation.
- Budget-paused agents now require a non-sidebar path to resume, which
is intentional to avoid accidentally restarting agents stopped by budget
enforcement.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell/browser access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:54 -05:00
Dotta
f88f538e6d Keep manual routine runs visible in the runner inbox (#4615)
## Thinking Path

> - Paperclip coordinates recurring agent work through scheduled and
manual routines.
> - Manual routine runs are board-initiated work and should stay visible
to the human who kicked them off.
> - Routine execution issues are agent-assigned, so they can be filtered
away from a board user's inbox unless the user is recorded as touching
the work.
> - Coalesced or skipped active routine runs have the same visibility
problem because they reuse an existing live issue.
> - This pull request carries the manual runner actor into routine
dispatch and touches the linked issue for that user's inbox.
> - The benefit is that manually triggered routine work stays
discoverable by the operator who started it.

## What Changed

- Passed the board or agent actor from the routine run route into the
routine service.
- Recorded manual board runners as `createdByUserId` on fresh routine
execution issues.
- Touched coalesced or skipped active routine issues for the manual
runner by updating read state and clearing that user's inbox archive.
- Added route and service regressions for manual routine run actor
propagation and inbox visibility.
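
A minimal sketch of the attribution rule, with illustrative names (`attributeManualRun`, `archivedInboxOf`) rather than the real service API:

```typescript
type Actor = { kind: "board"; userId: string } | { kind: "agent"; agentId: string };

interface RoutineIssue {
  createdByUserId?: string;
  archivedInboxOf: Set<string>; // users who archived this issue from their inbox
}

// Fresh execution issues record the manual board runner as creator;
// coalesced/skipped runs that reuse a live issue only re-surface it
// in that runner's inbox.
function attributeManualRun(issue: RoutineIssue, actor: Actor, reusedExisting: boolean): void {
  if (actor.kind !== "board") return; // agent-triggered runs change nothing here
  if (!reusedExisting) issue.createdByUserId = actor.userId;
  issue.archivedInboxOf.delete(actor.userId); // keep the run discoverable either way
}
```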

## Verification

- `pnpm exec vitest run server/src/__tests__/routines-routes.test.ts
server/src/__tests__/routines-service.test.ts`

## Risks

- Low risk: the change is scoped to manual routine runs and only updates
issue attribution/read-state metadata for the initiating actor.
- No migrations.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:24 -05:00
Dotta
d3123be870 Fold sub-issue blocker count into chip
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:20 -05:00
Dotta
68c37660f0 Dispatch assigned todo work during recovery sweeps (#4614)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agent assignments must reliably turn into heartbeat work without
board operators manually nudging stuck tasks.
> - The stranded-assignment recovery sweep already handles failed or
lost runs.
> - But assigned `todo` issues with no prior run could sit idle because
there was nothing to retry or recover.
> - This pull request dispatches those never-started assigned todos as
normal assignment wakes.
> - The benefit is that recovery fixes missed initial dispatches without
creating unnecessary recovery issues.

## What Changed

- Added an initial assigned-todo dispatch path to the recovery service
when an assigned `todo` issue has no heartbeat run yet.
- Reused invocation budget hard-stop checks before dispatching or
requeueing recovery work.
- Counted `assignmentDispatched` in startup/scheduled recovery logs.
- Added heartbeat recovery regressions for first dispatch, duplicate
queued wake prevention, budget-blocked skips, and paused-agent skips.
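
The dispatch decision described above reduces to a small guard chain. The field names here are assumptions for illustration, not the recovery service's real schema:

```typescript
interface AssignedIssue {
  status: string;
  assigneeAgentId: string | null;
  hasHeartbeatRun: boolean;    // a prior run means existing retry/recovery applies
  agentPaused: boolean;
  budgetHardStopped: boolean;  // invocation budget hard-stop check
  wakeAlreadyQueued: boolean;
}

function recoveryAction(issue: AssignedIssue): "dispatch" | "skip" {
  if (issue.status !== "todo" || !issue.assigneeAgentId) return "skip";
  if (issue.hasHeartbeatRun) return "skip"; // existing recovery paths own this case
  if (issue.agentPaused || issue.budgetHardStopped) return "skip";
  if (issue.wakeAlreadyQueued) return "skip"; // prevent duplicate queued wakes
  return "dispatch"; // never-started assigned todo: send a normal assignment wake
}
```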

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low to medium risk: this changes liveness recovery behavior for
assigned `todo` issues, but it stays on the existing assignment wake
path and skips paused or budget-blocked agents.
- No migrations.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 20:02:44 -05:00
Dotta
8b5cfb2613 Collapse extra sub-issue blocker chips
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 19:45:07 -05:00
Dotta
e670b510d5 Fix budget skip in assigned todo recovery
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 15:45:50 -05:00
Dotta
f20e7fa790 Add assigned todo dispatch backstop
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 15:35:51 -05:00
Dotta
45529bca74 Fix assignee selector arrow navigation 2026-04-27 15:27:27 -05:00
Dotta
aefdf5ed23 Add agent sidebar pause menu
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 15:19:18 -05:00
Dotta
0c170712e1 Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 15:12:02 -05:00
Dotta
7a9b3a6037 [codex] Harden recovery issue handling (#4600)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.
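
The duplicate-prevention invariant behaves roughly like the in-memory model below. The real guarantee is the partial unique index added by migration `0072_large_sandman` (its exact column and predicate definitions are not shown in this PR body, so treat this as an assumption-level sketch of the behavior, not the schema):

```typescript
// At most one active stranded-recovery issue per source issue: a second
// create attempt for the same source is rejected while the first is active.
class ActiveRecoveryIndex {
  private activeBySource = new Map<string, string>();

  create(sourceIssueId: string, recoveryIssueId: string): boolean {
    if (this.activeBySource.has(sourceIssueId)) return false; // would violate the unique index
    this.activeBySource.set(sourceIssueId, recoveryIssueId);
    return true;
  }

  resolve(sourceIssueId: string): void {
    // Once inactive, the partial index predicate no longer matches,
    // so a later recovery issue for the same source is allowed again.
    this.activeBySource.delete(sourceIssueId);
  }
}
```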

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low but this PR intentionally owns both new
migrations to avoid separate PR migration-journal conflicts.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 15:02:47 -05:00
Dotta
6ccf80bcf2 [codex] Reject stale company skill refreshes (#4601)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the reusable agent capability layer
> - Skill inventory refresh work can outlive the company it was
requested for
> - Without an explicit company existence check, stale refreshes can
continue into bundled/local skill cleanup for deleted or missing
companies
> - This pull request makes company-skill listing fail fast when the
company no longer exists
> - The benefit is clearer API behavior and less stale background work
against missing company scope

## What Changed

- Added a company existence check before `companySkillService.list()`
refreshes bundled and local-path skill state.
- Added regression coverage asserting missing companies return `404
Company not found`.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.

## Risks

- Low risk. Existing callers for valid companies are unchanged.
- Missing-company callers now receive an explicit 404 instead of
continuing refresh work.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 13:19:38 -05:00
Dotta
d95968a9f8 [codex] Ignore stale stored company selections (#4602)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI is the operator’s control surface for selecting the
active company
> - A company id stored in localStorage can become stale across resets,
imports, or deleted companies
> - Exposing that stale id before companies load can briefly put
downstream UI in an invalid company scope
> - This pull request defers selected-company exposure until the loaded
company list validates the stored id
> - The benefit is a cleaner company-selection bootstrap path and fewer
transient invalid API requests

## What Changed

- Initialized `CompanyProvider` selection as `null` until companies
finish loading.
- Reused a stored company id only when it exists in the loaded
selectable company list.
- Cleared storage and selected state when no companies are available.
- Added jsdom regression coverage for stale stored ids before and after
company loading.
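
The bootstrap rule can be expressed as a pure function. `resolveSelectedCompany` is an illustrative name, and the fallback to the first company for a stale stored id is an assumption (the PR only specifies that a stale id is not reused):

```typescript
function resolveSelectedCompany(
  storedId: string | null,
  companies: { id: string }[] | null // null while the company list is still loading
): { selectedId: string | null; clearStorage: boolean } {
  if (companies === null) {
    return { selectedId: null, clearStorage: false }; // defer exposure until loaded
  }
  if (companies.length === 0) {
    return { selectedId: null, clearStorage: true }; // nothing selectable: clear storage
  }
  const storedIsValid = storedId !== null && companies.some((c) => c.id === storedId);
  // Assumed fallback: pick the first selectable company when the stored id is stale.
  return { selectedId: storedIsValid ? storedId : companies[0].id, clearStorage: false };
}
```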

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/context/CompanyContext.test.tsx`

## Risks

- Low risk. The change only affects selection bootstrap and keeps valid
stored selections intact.
- There may be a slightly longer initial `null` selected-company state
while the company list is loading.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 13:18:21 -05:00
Dotta
748db32478 Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 12:53:30 -05:00
Dotta
b028c4866b Fix manual routine run inbox visibility
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 11:33:57 -05:00
Dotta
d424ada8d3 Redact recovery failure details from issue copy
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 11:18:16 -05:00
Dotta
b2e619a1a8 test: cover recursive recovery prevention 2026-04-27 10:59:15 -05:00
Dotta
22c23e1957 Add stranded recovery uniqueness index
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-27 10:54:42 -05:00
Dotta
4850ae158b Guard stranded recovery recursion 2026-04-27 10:52:08 -05:00
Dotta
15c0ce3722 Add Twitter/X link to READMEs (PAP-2475) (#4593)
## Summary

- Adds [@papercliping on X](https://x.com/papercliping) next to the
existing Discord link in the top-of-file nav and the bottom
**Community** list of both `README.md` and `cli/README.md`.
- The `cli/README.md` change keeps the npm-published readme consistent
with the GitHub one.

Resolves PAP-2475.

## Test plan

- [ ] Render `README.md` on GitHub and confirm the new Twitter link in
the header strip and the Community section.
- [ ] Render `cli/README.md` (preview on GitHub or via npm) and confirm
the same.
- [ ] Click both Twitter links and verify they land on
`https://x.com/papercliping`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:35:33 -05:00
Dotta
6e8966a954 Fix stale company skill requests on fresh instances 2026-04-27 09:25:23 -05:00
Dotta
ecd92af001 release: v2026.427.0 notes (#4590)
## Summary

- Adds `releases/v2026.427.0.md` covering the 77 PRs since `v2026.416.0`
- Highlights: multi-user access + invites, structured issue-thread
interactions, run liveness continuations, sub-issues as a workflow
checklist, issue subtree pause/cancel/restore, first-class issue
references
- Beta Features: Environments + pluggable sandbox providers (incl.
`@paperclipai/plugin-e2b`)
- Notes 14 additive migrations (`0057`–`0070`) in the Upgrade Guide; no
breaking changes flagged

Source issue: [PAP-2476](/PAP/issues/PAP-2476)

## Test plan
- [ ] Spot-check that each linked PR actually landed in the v2026.427.0
range
- [ ] Confirm migration list matches `server/src/database/migrations`
- [ ] Verify rendered markdown looks right on GitHub

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:16:20 -05:00
Dotta
215b6cd161 [codex] Add security role route coverage (#4589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent creation accepts roles that become part of the agent contract
and telemetry.
> - The shared role list already includes the security role.
> - Direct agent creation should preserve that role through route
handling and analytics metadata.
> - This pull request adds route coverage for creating a security-role
agent and asserting telemetry receives the same role.
> - The benefit is regression coverage for security agents without
changing the production route behavior.

## What Changed

- Added a server route test that creates an agent with `role:
"security"`.
- Asserted the create payload and telemetry metadata preserve `security`
as the agent role.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-skills-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`

## Risks

- Low risk; test-only coverage.
- No runtime behavior, schema, or API contract changes.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:49:59 -05:00
Dotta
53396f272a [codex] Fix sub-issue progress summary styling (#4588)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The issue list and issue detail surfaces summarize child/sub-issue
progress for operators.
> - Those summaries need to be compact and visually consistent because
they appear in dense lists.
> - The progress strip is most useful when there are multiple sub-issues
to compare, so the summary intentionally stays hidden for a single
sub-issue.
> - This pull request tightens the sub-issue progress summary styling
and updates the related tests.
> - The benefit is a cleaner, more scannable task list without changing
task ownership, status, or workflow behavior.

## What Changed

- Adjusted sub-issue progress summary copy/styling in the issue list and
detail summary helpers.
- Rendered the progress summary only for two or more child issues; a
single child issue still appears in the normal sub-issue list without a
redundant progress strip.
- Updated the UI tests that assert the rendered summary behavior.
- Clarified the two-plus-child threshold in code with a named constant.
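
The threshold reduces to a named constant plus a guard, roughly as below (the constant name is assumed, not the one in the codebase):

```typescript
// A single child needs no aggregate progress strip; the summary is only
// useful when there are multiple sub-issues to compare.
const MIN_CHILDREN_FOR_PROGRESS_SUMMARY = 2;

function shouldRenderProgressSummary(childCount: number): boolean {
  return childCount >= MIN_CHILDREN_FOR_PROGRESS_SUMMARY;
}
```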

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssuesList.test.tsx
ui/src/lib/issue-detail-subissues.test.ts`

## Screenshots

![Before/after comparison of sub-issue progress summary
styling](cd26b5bd63/pap-2449-subissue-progress-before-after.svg)

## Risks

- Low risk; this is a small UI presentation change with focused test
coverage.
- The intentional threshold change means parents with exactly one child
no longer show the aggregate progress strip, avoiding redundant summary
chrome while keeping the child visible in the list.
- No schema or API behavior changes.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:48:26 -05:00
Dotta
fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.
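
A sketch of how such settings might be normalized with the controls named above. The field names, defaults, and bounds are assumptions for illustration, not the shipped contract:

```typescript
interface LivenessRecoverySettings {
  autoRecoveryEnabled: boolean;
  minIntervalMinutes: number;
  lookbackHours: number;
}

const DEFAULTS: LivenessRecoverySettings = {
  autoRecoveryEnabled: false, // default keeps existing advisory-only behavior
  minIntervalMinutes: 30,
  lookbackHours: 24,
};

// Merge partial operator input over defaults and clamp invalid numbers
// back to the default values.
function normalizeSettings(input: Partial<LivenessRecoverySettings>): LivenessRecoverySettings {
  const merged = { ...DEFAULTS, ...input };
  if (!Number.isFinite(merged.minIntervalMinutes) || merged.minIntervalMinutes < 1) {
    merged.minIntervalMinutes = DEFAULTS.minIntervalMinutes;
  }
  if (!Number.isFinite(merged.lookbackHours) || merged.lookbackHours < 1) {
    merged.lookbackHours = DEFAULTS.lookbackHours;
  }
  return merged;
}
```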

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B
f0f9460d1d docs: AWS ECS Fargate deployment runbook (#3897)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies and ships a
>   "local-first, cloud-ready" deployment model
> - The deploy docs currently cover local/Docker but not a production
>   cloud target, so teams asking "how do I put this behind a real domain"
>   have no canonical path
> - We already support Docker images, RDS-compatible Postgres, and an EFS
>   storage profile, so AWS ECS Fargate is a natural fit
> - Without a runbook, each team reinvents VPC, security groups, TLS, and
>   secrets wiring and usually gets at least one step wrong
> - This pull request adds `docs/deploy/aws-ecs.md`, an ECS
>   task-definition template, and an `.env.aws.example`, cross-linked from
>   the deploy overview
> - The benefit is a single, reproducible ~$110/mo path to a production
>   deployment, plus a full teardown for throwaway environments

## What Changed

- New `docs/deploy/aws-ecs.md` — an 11-step ECS Fargate runbook covering
  ECR, VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the
  deployment circuit breaker enabled
- New `docker/ecs-task-definition.json` — Fargate-ready task definition
  with `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder
  tokens
- New `docker/.env.aws.example` — documents every non-secret env var the
  ECS deployment needs
- `docs/deploy/overview.md` — one-line cross-reference to the new guide
- Greptile feedback addressed in follow-up commits:
  - `containerName` in the service-create call now matches
    `paperclip-server` in the task definition
  - HTTP :80 listener added that 301-redirects to :443
  - Dedicated RDS DB subnet group created before `create-db-instance`
  - EFS teardown polls on mount-target deletion instead of `sleep 30`

## Verification

- Walked every step of the runbook against the task definition to confirm
  variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`, `$TG_ARN`,
  `$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`, `$RDS_ENDPOINT`,
  etc.) are defined before they are referenced
- Confirmed the `containerName` in Step 10 (`paperclip-server`) matches
  `docker/ecs-task-definition.json` line 11
- Confirmed the `sed` placeholder substitution in Step 8 matches the
  tokens in the task definition template
- Teardown order was checked in reverse-dependency order: ECS service →
  listeners → target group → ALB → RDS (waits for deletion) → DB subnet
  group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM →
  log group

## Risks

- **Low risk for the repo.** Docs-only change plus two template files
  under `docker/`; no runtime code paths are touched and nothing is
  imported by the build.
- **Risk for users who follow the runbook:** AWS bills accrue immediately
  once RDS/ALB/EFS exist. The runbook calls this out and includes a full
  teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`, `<REGION>`,
  `<EFS_ID>`, `<DOMAIN>`) are documented so nothing is silently
  hard-coded.

## Model Used

- Claude (Anthropic), model `claude-opus-4-6`, ~200K context window,
extended thinking mode on, used with tool access (file edit, shell) via
Claude Code. The Greptile follow-up commits were authored the same way.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass — N/A for docs/config
templates; validated by reading
- [x] I have added or updated tests where applicable — N/A for docs
- [x] If this change affects the UI, I have included before/after
screenshots — N/A, no UI
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 08:41:47 -05:00
Dotta
1d8c7a09b8 [codex] Add security role route regression (#4586)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows.
> - Agent creation is one of the core board/operator surfaces for
defining who works in a company.
> - The shared taxonomy now includes a first-class `security` agent
role.
> - Direct agent creation must preserve that role through default
instruction materialization and telemetry.
> - A prior replacement PR covered this path, but Greptile identified
that the route-test mock could let a future patch object shadow the
regression.
> - This pull request reopens the narrow regression coverage from
current `master` with the mock ordering fixed.
> - The benefit is a focused guardrail that keeps `security` role
creation observable without expanding the production diff.

## What Changed

- Added a direct agent creation route regression test for `role:
"security"`.
- Verified telemetry receives `agentRole: "security"` after the default
instruction materialization update path.
- Ordered the regression mock as `...patch` before `role: "security"` so
future patch fields cannot shadow the asserted role.
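The ordering fix comes down to object-literal semantics: the last occurrence of a key wins. A minimal illustration (the payload shape here is hypothetical, not the real route-test mock):

```typescript
// A future patch object that happens to carry its own role.
const patch = { name: "sec-bot", role: "builder" };

// Fragile ordering: writing the asserted role first lets any later
// patch field silently overwrite it, so the test would stop
// exercising the `security` role without anyone noticing.
const shadowed = Object.assign({ role: "security" }, patch);

// Fixed ordering from this PR: spread `patch` first, then write the
// asserted role last so it always survives.
const safe = { ...patch, role: "security" };

console.log(shadowed.role, safe.role); // "builder" "security"
```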

## Verification

- `pnpm install --frozen-lockfile` to link dependencies in the fresh
worktree; it completed with existing plugin SDK bin warnings.
- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
packages/shared/src/adapter-types.test.ts`

## Risks

- Low risk. This is test-only coverage and does not change runtime
behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR; feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell
and repository editing capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
test-only regression)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:11:52 -05:00
Devin Foley
d2cbe2cb23 Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `paperclip-dev` skill is the canonical reference agents read
before doing development work on the Paperclip repo itself
> - Today the skill assumes feature branches get pushed to `origin` (=
`paperclipai/paperclip`), which clutters the upstream branch list when
contributors actually have personal forks
> - This is the standard open-source contribution pattern (push to fork,
PR upstream) and the skill should reflect it
> - This pull request adds a "Forks — Prefer Pushing to a User Fork"
section that teaches agents to detect a fork remote, push there, and
only fall back to `origin` when no fork is configured
> - The benefit is cleaner upstream branch hygiene and behavior that
matches typical contributor workflows without any code/runtime change

## What Changed

- Added a new **Forks — Prefer Pushing to a User Fork** section to
`skills/paperclip-dev/SKILL.md` covering:
- How to detect a user fork via `git remote -v` (treat any
non-`paperclipai` GitHub remote as the fork)
  - How to push to the fork (`git push -u <fork-remote> HEAD`)
- How to create the PR from the fork (`gh pr create --repo
paperclipai/paperclip --head <fork-owner>:<branch>`)
- The no-fork fallback (push to `origin`, do not auto-create a fork —
ask first)
  - Keeping the fork's `master` in sync
- Added a reinforcing entry to the **Common Mistakes** table linking
back to the new section

## Verification

- Docs-only change to a single markdown skill file. Reviewer can confirm
by reading the diff in `skills/paperclip-dev/SKILL.md`:
- New `## Forks — Prefer Pushing to a User Fork` section sits between
`## Worktrees` and `## Pull Requests`
  - New row appended to the `## Common Mistakes` table
- No tests, no build, no runtime behavior affected.

## Risks

- Low risk. Documentation-only edit. The instructions are advisory —
they only change agent behavior on future runs that read the skill.

## Model Used

- Provider: Anthropic (Claude)
- Model ID: `claude-opus-4-7` (Claude Opus 4.7)
- Capabilities: tool use (file read/edit, shell, git, gh CLI), extended
reasoning
- Context: invoked via Claude Code / Paperclip heartbeat for issue
PAPA-139

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change; no
test surface)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 22:19:07 -07:00
Dotta
82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley
868d08903e test: isolate CLI company import e2e state (#4560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and its
CLI import/export path is part of how operators move company state
safely between environments.
> - The `paperclipai company import/export` e2e test is supposed to
validate that portability flow inside a hermetic harness, not against a
developer's live Paperclip home.
> - This regression showed nested CLI subprocesses could silently fall
back to ambient `PAPERCLIP_*` state and mutate a real local instance by
creating extra companies such as `CLI-1-Roundtrip-Test`.
> - The first job was to pin the test subprocesses to isolated config,
home, instance, auth, and context paths, and to add a regression
assertion that proves the nested CLI writes stay inside the test-owned
state.
> - Once the PR was up, CI and Greptile exposed two follow-on issues
that were blocking merge: plugin SDK typecheck bootstrap was racing
across packages in fresh CI, and the new lock helper needed one more fix
to release its lock on failure.
> - This pull request therefore ends up doing two tightly related
things: fixing the original CLI isolation leak, and hardening the
supporting typecheck/bootstrap path enough for the fix to verify cleanly
in CI.
> - The benefit is that the portability e2e test is now actually
isolated, and the PR verification path is stable enough to catch
regressions instead of introducing its own nondeterministic failures.

## What Changed

- Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so
nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`,
`PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling
back to ambient machine state.
- Added a regression assertion around `paperclipai context set --json`,
then cleared the temporary `context.json` so the isolation check and the
later export/import flow stay independent.
- Passed the same isolated `HOME` into the server subprocess so both
sides of the e2e harness are symmetric.
- Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and
switched the server/plugin example `typecheck` scripts to use that
helper instead of launching concurrent raw `@paperclipai/plugin-sdk`
builds.
- Fixed the helper failure path so it releases the lock before exiting
non-zero, which prevents stale-lock timeouts during parallel typecheck
runs.
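A sketch of the isolation pattern the hardened test uses; the helper name and exact shape are illustrative, and only the env var names come from the test:

```typescript
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Build a child-process env where every Paperclip state path points
// into a throwaway temp directory instead of the developer's machine.
function isolatedCliEnv(base: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
  const home = mkdtempSync(join(tmpdir(), "paperclip-e2e-"));
  return {
    ...base,
    HOME: home, // throwaway HOME so nothing leaks to the real user profile
    PAPERCLIP_HOME: join(home, "paperclip"),
    PAPERCLIP_CONFIG: join(home, "config.json"),
    PAPERCLIP_CONTEXT: join(home, "context.json"),
    PAPERCLIP_AUTH_STORE: join(home, "auth.json"),
    PAPERCLIP_INSTANCE_ID: "e2e-test-instance",
  };
}

const env = isolatedCliEnv(process.env);
console.log(env.PAPERCLIP_HOME!.startsWith(env.HOME!)); // true
```

Passing the same `env` to both the nested CLI subprocesses and the server subprocess is what keeps the two sides of the harness symmetric.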

## Verification

- `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts
--project paperclipai`
- `pnpm --filter paperclipai typecheck`
- `pnpm -r typecheck`
- PR checks now pass on the current head, including `policy`, `verify`,
`e2e`, `security/snyk`, and `Greptile Review`.

## Risks

- Low risk. The product-facing behavior change is scoped to test harness
code in the CLI e2e suite.
- The CI stabilization changes only affect bootstrap/typecheck helper
paths for the server and plugin/example packages, but they do touch
shared verification plumbing; the main risk is changing how fresh build
artifacts are prepared in local/CI typecheck runs.

## Model Used

- Anthropic Claude via Paperclip `claude_local`, model
`claude-opus-4-7`, high-effort local coding agent, used for the initial
implementation and first peer-reviewed verification.
- OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high
reasoning-effort local coding agent with tool use, used for CI triage,
Greptile follow-up fixes, verification, and PR maintenance.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 19:10:01 -07:00
Devin Foley
1d9f7a5149 Fix flaky heartbeat recovery teardown CI failure (#4559)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The linked CI job is in the server test/recovery path, where
heartbeat runs and issue cleanup need to leave the control plane in a
consistent state even when retries fail.
> - In this case the failure was not runtime product behavior but test
teardown behavior inside `heartbeat-process-recovery.test.ts`.
> - The failing GitHub Actions job showed a foreign-key race on
`company_skills_company_id_companies_id_fk` while the test tried to
delete the parent company record.
> - The surrounding teardown code already uses bounded retry cleanup for
other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because
this test file intentionally exercises asynchronous recovery flows.
> - This pull request applies that same retry pattern to the final
`db.delete(companies)` step, re-clearing `companySkills` before each
retry.
> - The benefit is a targeted fix for the CI flake without changing
runtime behavior or expanding the scope beyond the failing teardown
path.

## What Changed

- Wrapped the final `db.delete(companies)` call in
`server/src/__tests__/heartbeat-process-recovery.test.ts` with the same
5-attempt retry pattern already used elsewhere in that teardown.
- Re-cleared `companySkills` before each company-delete retry so
late-arriving FK-dependent rows do not mask the real test result.
- Verified the fix against the originally failing
`heartbeat-process-recovery` test file and the broader `pnpm test:run`
command under CI-like env conditions.
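The retry pattern described above, sketched with stand-in callbacks for the real Drizzle delete calls:

```typescript
// Bounded teardown retry: re-clear FK-dependent child rows before each
// attempt, then retry the parent delete. The two callbacks are
// placeholders for the real db.delete(companySkills)/db.delete(companies)
// calls in the test file.
async function deleteCompaniesWithRetry(
  clearCompanySkills: () => Promise<void>,
  deleteCompanies: () => Promise<void>,
  attempts = 5,
): Promise<void> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      await clearCompanySkills(); // late-arriving FK rows from async recovery flows
      await deleteCompanies();
      return;
    } catch (err) {
      lastError = err; // e.g. the FK violation seen in the CI flake
      await new Promise((r) => setTimeout(r, 50 * (i + 1)));
    }
  }
  throw lastError;
}
```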

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
- Re-ran `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times
locally; the previously failing teardown stayed green.
- `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u
PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u
PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u
PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u
PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run`

## Risks

- Low risk. The change is test-only and scoped to teardown retry
behavior in a single server test file.
- If the underlying async cleanup behavior changes again, this test
could still become flaky in a different way, but this PR addresses the
specific FK race seen in the linked CI job.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR; feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode,
with tool use for shell, git, HTTP API calls, and patch application.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:30:20 -07:00
Devin Foley
8145141c55 Fix external issue URL rewriting in markdown (#4558)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Issue and comment rendering is part of the board UI where humans
supervise and inspect agent work.
> - External Paperclip issue URLs can appear in comments as references
to other runs, review threads, or remote test environments.
> - Those links must preserve their full destination, including origin,
port, and `#comment-...` fragments, or the operator is taken to the
wrong place.
> - The bug here was that absolute `http(s)` issue URLs were being
normalized into internal `/issues/...` routes in the markdown path.
> - This pull request stops rewriting absolute URLs while keeping
internal issue-reference behavior for relative paths and identifiers.
> - The benefit is that authored external links now navigate exactly
where the operator expects, especially for remote test and
comment-deep-link workflows.

## What Changed

- Stopped `ui/src/lib/issue-reference.ts` from treating absolute
`http(s)` URLs as internal issue paths.
- Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute
`http(s)` URLs are never reclassified as issue mention chips.
- Updated `ui/src/lib/issue-reference.test.ts` to cover absolute
Paperclip URLs with preserved origin, port, and comment hash.
- Updated `ui/src/components/MarkdownBody.test.tsx` to assert the
reported URL renders as an external link, not an internal `/issues/...`
href.
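The core of the fix is a guard that short-circuits before any issue-path normalization; a simplified sketch (not the real `issue-reference.ts` logic, which handles more reference forms):

```typescript
// Only relative references are normalized to internal routes; absolute
// http(s) URLs keep their origin, port, and #fragment untouched.
function toInternalIssueHref(href: string): string {
  if (/^https?:\/\//i.test(href)) {
    return href; // absolute URL: never rewrite into /issues/...
  }
  const match = href.match(/([A-Z]+-\d+)/); // e.g. PAPA-115
  return match ? `/issues/${match[1]}` : href;
}

console.log(
  toInternalIssueHref(
    "http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-1",
  ),
); // full external URL preserved

console.log(toInternalIssueHref("PAPA-115")); // "/issues/PAPA-115"
```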

## Verification

- `pnpm exec vitest run ui/src/lib/issue-reference.test.ts
ui/src/components/MarkdownBody.test.tsx`
- Expected result: `2` files passed, `37` tests passed.
- Manual spot-check from the issue report path: a URL like
`http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...`
should remain an external link with its full destination preserved.

## Risks

- Low risk. The change narrows when Paperclip rewrites URLs, so the main
risk is if some existing workflow depended on absolute `http(s)`
Paperclip URLs being converted into internal issue links. The added
regression coverage is aimed at preventing that from regressing
silently.

## Model Used

- OpenAI Codex local agent via Paperclip `codex_local`
- Backing model family: GPT-5-based Codex runtime
- Exact backend model ID/version: not exposed by this adapter/runtime
surface
- Context window: not exposed by this adapter/runtime surface
- Capabilities used: tool use, shell command execution, code editing,
git operations, and local test execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:19:23 -07:00
Devin Foley
54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.
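The fixed predicate encodes lexicographic tuple comparison on `(createdAt, id)`. A plain-TypeScript restatement of that logic (the real fix expresses it with Drizzle's typed `or`/`and`/`lt`/`eq` builders rather than raw SQL):

```typescript
interface CommentRow {
  id: string;
  createdAt: number;
}

// True when `row` sorts strictly before `anchor` in the (createdAt, id)
// tuple order, i.e. it belongs to an earlier page when paging desc.
function isEarlierThanAnchor(row: CommentRow, anchor: CommentRow): boolean {
  return (
    row.createdAt < anchor.createdAt ||
    (row.createdAt === anchor.createdAt && row.id < anchor.id)
  );
}

const anchor: CommentRow = { id: "b", createdAt: 100 };
const rows: CommentRow[] = [
  { id: "a", createdAt: 100 }, // ties on createdAt, breaks earlier on id
  { id: "c", createdAt: 100 }, // ties on createdAt, sorts after anchor
  { id: "z", createdAt: 50 },  // strictly older
];
const earlier = rows.filter((r) => isEarlierThanAnchor(r, anchor));
console.log(earlier.map((r) => r.id)); // ["a", "z"]
```

The `id` tie-break is what keeps pagination stable when many comments share a `createdAt` timestamp.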

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
- `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
- `GET
/api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Devin Foley
b2496c8067 fix(auth): trust allowed hostname port variants on detected listen port (#4554)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
authenticated board access has to be predictable across local and
worktree deployments.
> - This change sits in the authenticated-mode server startup and Better
Auth origin-trust wiring.
> - The original auth branch fixed one real gap by adding port-qualified
trusted origins for allowed hostnames on non-default ports.
> - Review of that branch found a second-order bug: trusted origins were
still derived from the configured port before startup detected the
actual listen port.
> - In isolated worktrees, that meant a common `3100 -> 3101` port shift
could still leave Better Auth trusting the stale origin.
> - This pull request keeps the original allowed-hostname port-variant
fix, then moves trust derivation onto the resolved listen port and adds
regression coverage around startup wiring.
> - The benefit is that authenticated sessions keep working on allowed
private hostnames even when Paperclip has to auto-shift to a different
local port.

## What Changed

- Added `:port` trusted-origin variants for authenticated-mode
`allowedHostnames` when Paperclip runs on non-default ports.
- Changed authenticated startup so `listenPort` is detected before
Better Auth initialization, and explicit auth base URLs are rewritten
before auth startup.
- Updated `deriveAuthTrustedOrigins()` to accept the resolved listen
port so Better Auth trusts the actual browser origin instead of the
stale configured port.
- Added focused regression coverage in
`server/src/__tests__/better-auth.test.ts` and
`server/src/__tests__/server-startup-feedback-export.test.ts`.
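A hypothetical sketch of the derivation contract: trusted origins are built from the resolved `listenPort`, with `:port` variants added on non-default ports (the function name, scheme handling, and default port below are assumptions, not the real `deriveAuthTrustedOrigins` signature):

```typescript
// Derive trusted origins from allowed hostnames plus the *resolved*
// listen port. The bug was deriving from the configured port before
// detect-port shifted it (e.g. 3100 -> 3101 in isolated worktrees).
function deriveTrustedOrigins(
  allowedHostnames: string[],
  listenPort: number,
  defaultPort = 3100, // assumed default for this sketch
): string[] {
  const origins: string[] = [];
  for (const host of allowedHostnames) {
    origins.push(`http://${host}`);
    if (listenPort !== defaultPort) {
      origins.push(`http://${host}:${listenPort}`); // port-qualified variant
    }
  }
  return origins;
}

console.log(deriveTrustedOrigins(["paperclip.internal"], 3101));
// ["http://paperclip.internal", "http://paperclip.internal:3101"]
```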

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts`
- Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after
the follow-up fix landed and found no remaining issues.

## Risks

- Low risk: this only affects authenticated-mode origin derivation and
startup ordering around detected listen ports.
- Main behavioral shift: startup no longer mutates `config.port` to the
selected port; it now carries `requestedListenPort` separately and uses
`listenPort` where runtime behavior needs the resolved value.
- If another path was implicitly relying on `config.port` being
overwritten during startup, that path would need follow-up, though the
current startup/test coverage did not reveal one.

> I checked `ROADMAP.md` and did not find an overlapping planned core
work item for this auth trusted-origin port handling fix.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agents for implementation and
review. Exact backend model ID/context window were not surfaced in this
run context; work was performed through the Codex local adapter with
tool use, code execution, and review passes.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 15:40:39 -07:00
Devin Foley
08af830430 Tighten publicBaseUrl port rewriting (#4553)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
local and authenticated deployment behavior has to stay predictable
under port rebinding and worktree isolation.
> - This change sits in the server/worktree configuration path that
derives runtime URLs and auth origins from `auth.publicBaseUrl`.
> - The original hostname-port rewrite change fixed one real gap for
private/tailnet host:port worktree setups, but it widened the rewrite
rule too far.
> - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or
reverse-proxy URLs by turning a stable origin like
`https://paperclip.example` into a local listen-port URL.
> - Paperclip's auth and trusted-origin handling depend on that URL
staying semantically correct, so this had to be narrowed before merge.
> - This pull request tightens the rewrite rule to explicit-port URLs
only and adds regression coverage across the CLI helper, worktree config
persistence, and server startup path.
> - The benefit is that private host:port worktree flows still work,
while public/default-port URLs remain stable and safe.

## What Changed

- Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`,
`server/src/worktree-config.ts`, and `server/src/index.ts` so it only
rewrites URLs that already include an explicit port.
- Removed the old loopback-only hostname gate from the CLI/worktree
helpers and replaced it with the more precise `parsed.port` guard.
- Updated CLI helper coverage to assert that explicit-port non-loopback
URLs still rewrite while no-port public URLs stay unchanged.
- Expanded `server/src/__tests__/worktree-config.test.ts` to cover
explicit-port rewrite and no-port stability for both persisted worktree
config and in-memory runtime port selection.
- Added startup-path coverage in
`server/src/__tests__/server-startup-feedback-export.test.ts` for
`detect-port` rebinding with both explicit-port and no-port
`auth.publicBaseUrl` values.
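The narrowed rule reduces to one guard on the parsed port; a simplified sketch of the helper's shape (the real implementation lives in `cli/src/commands/worktree-lib.ts` and its server counterparts):

```typescript
// Only rewrite URLs that already declare an explicit port. URLs with no
// port (public or reverse-proxy origins) are returned untouched.
function rewriteLocalUrlPort(rawUrl: string, listenPort: number): string {
  const parsed = new URL(rawUrl);
  if (parsed.port === "") {
    return rawUrl; // no explicit port: stable public origin, leave alone
  }
  parsed.port = String(listenPort);
  return parsed.toString();
}

console.log(rewriteLocalUrlPort("http://dev.tailnet.example:3100/", 3101));
// "http://dev.tailnet.example:3101/"
console.log(rewriteLocalUrlPort("https://paperclip.example/", 3101));
// "https://paperclip.example/" (public origin preserved)
```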

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `npx vitest run
server/src/__tests__/server-startup-feedback-export.test.ts`
- `npx vitest run cli/src/__tests__/worktree.test.ts
server/src/__tests__/worktree-config.test.ts`
- All of the above were run locally in this issue worktree and passed.

## Risks

- Low risk. The behavior change is deliberately narrower than the
reviewed broad-host rewrite and is guarded by regression coverage for
both the explicit-port and no-port cases.
- The main remaining risk is behavioral only if another code path starts
depending on port rewriting for URLs that never declared a port, which
would be a separate bug.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR; feature PRs that overlap
with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent using `gpt-5.4` with high reasoning effort,
tool use, shell execution, and file editing.
- Anthropic Claude local agent using `claude-opus-4-6` for follow-up
code review approval on the implementation issue.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 14:29:22 -07:00
Devin Foley
d47ffa87f0 Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The local adapter layer is responsible for turning Paperclip runtime
context into the environment seen by the child agent process.
> - The CEO onboarding bundle tells the agent where to read and write
its persistent memory and fact files.
> - That bundle was using `./memory/...` and `./life/...`, which only
works when the process cwd happens to equal the agent home directory.
> - At the same time, six local adapters each duplicated the same
workspace-env propagation logic, including `AGENT_HOME`, which makes
this contract easy to drift.
> - This pull request fixes the CEO instructions to use
`$AGENT_HOME/...` and centralizes workspace-env propagation in one
shared helper with shared tests.
> - The benefit is a real bug fix for agent memory paths plus a single
tested contract that makes future built-in adapter work less likely to
forget `AGENT_HOME`.

## What Changed

- Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use
`$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of
cwd-relative `./memory/...` and `./life/...`.
- Added `applyPaperclipWorkspaceEnv(...)` in
`packages/adapter-utils/src/server-utils.ts` to centralize
`PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation.
- Added shared helper coverage in
`packages/adapter-utils/src/server-utils.test.ts` for both populated and
skip-empty cases.
- Switched the built-in local adapters (`claude_local`, `codex_local`,
`cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to
the shared helper instead of inline env assignment blocks.
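A sketch of the shared helper's contract as described above (the signature is assumed; the real code is in `packages/adapter-utils/src/server-utils.ts`):

```typescript
// Copy only non-empty workspace values into the child env, preserving
// the prior "set only non-empty strings" semantics so unset vars are
// never clobbered with empty strings.
function applyPaperclipWorkspaceEnv(
  childEnv: NodeJS.ProcessEnv,
  workspace: Record<string, string | undefined>,
): NodeJS.ProcessEnv {
  for (const [key, value] of Object.entries(workspace)) {
    if (typeof value === "string" && value !== "") {
      childEnv[key] = value;
    }
  }
  return childEnv;
}

const env = applyPaperclipWorkspaceEnv(
  { PATH: "/usr/bin" },
  { AGENT_HOME: "/agents/ceo", PAPERCLIP_WORKSPACE_DIR: "" },
);
console.log(env.AGENT_HOME, "PAPERCLIP_WORKSPACE_DIR" in env);
// "/agents/ceo" false
```

Centralizing this in one tested helper is what keeps the six adapters from drifting on the `AGENT_HOME` contract.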

## Verification

- `pnpm install`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- Result: 7 test files passed, 31 tests passed, 0 failures.

## Risks

- Low risk.
- The only behavioral surface is the shared env propagation refactor
across six adapters; if the helper diverged from prior semantics, an
adapter could miss a workspace env var.
- The shared helper test plus the affected adapter execute tests reduce
that risk, and the helper preserves the prior "set only non-empty
strings" behavior.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted
coding workflow with shell execution, file patching, git operations, and
API interaction. The exact backend model identifier and context window
are not surfaced by this local runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 13:57:35 -07:00
Devin Foley
d1484551ee Add open-source hygiene note to paperclip-dev skill (#4541)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The `paperclip-dev` skill is part of the contributor and agent
workflow layer that tells developers how to work in this repository
safely.
> - That skill already references the public upstream `origin`, but it
did not explicitly say that pushes there must be treated as publishable
open-source output.
> - Without that reminder, contributors are more likely to leak secrets,
PII, private logs, machine-local config, or noisy throwaway git history
into the public repo.
> - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout
near the top of the skill, before the git workflow guidance.
> - The benefit is clearer safety guidance for contributors and less
accidental disclosure or branch/commit noise on the upstream project.

## What Changed

- Added an `OPEN SOURCE HYGIENE` callout near the top of
`skills/paperclip-dev/SKILL.md`.
- Explicitly warned that anything pushed to `origin` must be
publishable.
- Called out avoiding secrets, API keys, PII, private logs,
machine-local config, and noisy throwaway branches or checkpoint
commits.

## Verification

- N/A — docs-only change; no automated verification applies.

## Risks

- Low risk. This is a docs-only change in a skill file; the main risk is
wording tone or placement, not runtime behavior.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based
coding agent runtime. Exact backend serving model ID is not exposed in
this heartbeat environment. Tool use, shell execution, and patch
application were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 12:14:49 -07:00
Devin Foley
91333ec86f feat: add paperclip-dev skill with optional bundled skill support (#3854)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents working on the Paperclip codebase itself need guidance on dev
workflows: server lifecycle, worktrees, builds, database ops,
diagnostics
> - There was no bundled skill covering these workflows — agents had to
figure it out from scratch each time
> - Additionally, not every skill should be force-installed on every
agent — a dev-focused skill should be opt-in
> - This PR adds a `paperclip-dev` skill with `required: false`
frontmatter so it ships with Paperclip but isn't auto-installed
> - The skill's PR section references canonical files
(`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of
duplicating their content, with gated instructions that force agents to
read those files before creating any PR
> - The benefit is that developers (human or agent) can opt in to
structured dev guidance without polluting the default agent skill set or
creating drift between duplicated docs

## What Changed

- Added `skills/paperclip-dev/SKILL.md` covering server management,
worktree lifecycle, builds, database ops, diagnostics, agent operations,
and common mistakes
- The Pull Requests section uses gated, reference-based instructions —
agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and
`CONTRIBUTING.md` before running `gh pr create`, with a brief checklist
of required section names (no content duplication)
- Updated `packages/adapter-utils/src/server-utils.ts` to respect
`required: false` frontmatter — optional skills are bundled but not
auto-installed on agents
- Added test in `server/src/__tests__/paperclip-skill-utils.test.ts`
verifying that optional skills are excluded from the default install set

## Verification

```bash
# Run tests
pnpm test

# Manual verification: create a fresh worktree without seeding
npx paperclipai worktree:make test-optional-skill --no-seed
cd ~/paperclip-test-optional-skill
eval "$(npx paperclipai worktree env)"
npx paperclipai run

# Verify paperclip-dev appears in company skill library but is NOT auto-assigned
# Call listPaperclipSkillEntries() — paperclip-dev should show required: false
# Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set

# Cleanup
npx paperclipai worktree:cleanup test-optional-skill
```

## Risks

- Low risk. The `required` field defaults to `true` when absent, so all
existing skills behave identically. Only the new `paperclip-dev` skill
sets `required: false`.
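
The default described above can be sketched as a one-line predicate
(illustrative only; the real frontmatter handling lives in
`server-utils.ts`):

```typescript
// Assumed frontmatter shape; `required` is optional and defaults to true.
interface SkillFrontmatter {
  name: string;
  required?: boolean;
}

function isAutoInstalled(skill: SkillFrontmatter): boolean {
  // Absent `required` means true, so all existing skills keep
  // auto-install; only an explicit `required: false` opts out.
  return skill.required !== false;
}
```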

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and
extended context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 11:06:13 -07:00
Dotta
c036bbfa98 Add first-class security agent role to taxonomy (#4532)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies, so agent
metadata is part of the platform's governance and audit surface.
> - The shared agent taxonomy in `@paperclipai/shared` is the source of
truth for allowed agent roles and their UI labels.
> - The current taxonomy lacks a `security` role, which causes Security
Engineer hires to collapse into `engineer`.
> - That breaks separation-of-duties evidence in telemetry and weakens
role-level audit fidelity even though it does not directly change
permissions.
> - This pull request adds `security` as a first-class shared role and
covers the prior rejection path with a regression test.
> - The benefit is that Security Engineer agents can now be persisted
and rendered under the correct role without schema or permission churn.

## What Changed

- Added `security` to `AGENT_ROLES` in
`packages/shared/src/constants.ts`.
- Added the `Security` display label to `AGENT_ROLE_LABELS` so existing
UI consumers render the new role automatically.
- Added a shared validator regression test proving `createAgentSchema`
accepts `role: "security"` and that the label stays stable.

## Verification

- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/shared exec vitest run
src/adapter-types.test.ts`

## Risks

- Low risk. This is a shared enum expansion with no database migration
and no permission-model change.
- Residual risk: this PR does not backfill existing agents already
persisted as `engineer`; it only fixes new validations and labels going
forward.

> I checked `ROADMAP.md` and `doc/` for overlap and did not find an
existing planned item covering this taxonomy fix.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, with tool use and local
code execution enabled. The runtime did not surface a separate
context-window identifier in agent metadata.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 07:52:05 -05:00
Dotta
df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.
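
Dependency-aware ordering of this kind is typically a topological sort;
the sketch below (Kahn's algorithm over a hypothetical `blockedBy` field)
illustrates the idea, not the actual `workflow-sort` utilities:

```typescript
// Assumed child-issue shape: each issue lists sibling ids that block it.
interface ChildIssue {
  id: string;
  blockedBy: string[];
}

function workflowOrder(issues: ChildIssue[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const issue of issues) {
    indegree.set(issue.id, issue.blockedBy.length);
    for (const dep of issue.blockedBy) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), issue.id]);
    }
  }
  // Start with unblocked issues, releasing dependents as each completes.
  const ready = issues.filter((i) => i.blockedBy.length === 0).map((i) => i.id);
  const ordered: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    ordered.push(id);
    for (const next of dependents.get(id) ?? []) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  return ordered; // cyclic issues drop out; real code may surface them instead
}
```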

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process fell back to
port 6008 because 6006 was occupied, and Vite then failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley
40782f703d Fix release packaging for standalone public packages (#4494)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
sandbox-provider work just moved E2B into a standalone publishable
plugin package.
> - That plugin is intentionally excluded from the root pnpm workspace
so it can model third-party install behavior without forcing lockfile
churn in the main repo.
> - The merged architecture change exposed a follow-up release problem:
the canary publish workflow tried to publish `@paperclipai/plugin-e2b`,
but the tarball had no `dist/` payload because standalone public
packages were not being built in the release path.
> - That means the release pipeline needed a packaging fix in core
release tooling, not another architectural change in the sandbox
provider itself.
> - This pull request adds a generic release step for public packages
that live outside the pnpm workspace, instead of hardcoding E2B-specific
behavior into the release script.
> - The benefit is that standalone publishable packages can be built and
packed correctly during release, including future sandbox-provider
plugins that follow the same pattern.

## What Changed

- Added `scripts/build-standalone-public-packages.mjs` to discover
public packages outside the pnpm workspace, run a clean package-local
install, and build them before publish.
- Updated `scripts/release.sh` to invoke that helper immediately after
the normal workspace build step.
- Kept the behavior generic by driving off the existing public package
map and pnpm workspace patterns rather than special-casing
`@paperclipai/plugin-e2b`.

## Verification

- `rm -rf packages/plugins/sandbox-providers/e2b/dist`
- `node ./scripts/build-standalone-public-packages.mjs`
- `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run`
- Confirm the tarball now includes the rebuilt `dist/` files instead of
only `README.md` / `package.json`

## Risks

- Low risk: this only changes the release build path for public packages
outside the pnpm workspace.
- The helper performs a clean package-local install for each standalone
public package, so release time may increase slightly as more such
packages are added.


## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 12:16:23 -07:00
Devin Foley
4ef969f084 Add E2B sandbox provider plugin (#4452)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox environments are part of that execution layer, and the
recent core refactor moved provider-specific behavior to a generic
plugin seam
> - This pull request adds a dedicated `@paperclipai/plugin-e2b` package
so E2B can live entirely outside core host code
> - Because the feature is still unreleased, the plugin should model
third-party packaging directly instead of carrying extra
backward-compatibility complexity in core or the workspace lockfile
> - This branch therefore makes the E2B provider a standalone
publishable package, documents the package-local dev flow, and keeps the
publish manifest/runtime dependency story correct
> - The benefit is that E2B becomes a true plugin reference
implementation that can be installed by package name without reopening
core Paperclip code

## What Changed

- Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox
provider plugin package
- Implemented config validation, lease acquire/resume/release/destroy
handlers, workspace realization, and command execution for E2B sandboxes
- Excluded the E2B plugin package from the root workspace so the repo no
longer needs `pnpm-lock.yaml` churn for its third-party dependency graph
- Added package-local development/install support plus a prepack
manifest generator so the published tarball still declares
`@paperclipai/plugin-sdk` and `e2b` runtime dependencies
- Addressed review feedback by fixing sandbox cleanup on acquire
failures, rejecting blank templates, normalizing fractional `timeoutMs`,
and always passing the configured template name to the E2B SDK
- Updated focused Vitest coverage for config normalization, validation,
acquire cleanup, command execution, and lease release behavior
- Updated the Dockerfile deps stage to copy the E2B package manifest so
the policy check stays in sync
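
The review-feedback fixes around config validation can be sketched like
this (field names and error messages are assumptions; the real validation
lives in the plugin package):

```typescript
// Illustrative normalization: reject blank templates, round fractional
// timeouts to whole milliseconds instead of rejecting them.
interface E2bConfigInput {
  template?: string;
  timeoutMs?: number;
}

function normalizeE2bConfig(
  input: E2bConfigInput,
): { template: string; timeoutMs?: number } {
  const template = input.template?.trim() ?? "";
  if (template.length === 0) {
    throw new Error("E2B template must be a non-empty string");
  }
  let timeoutMs: number | undefined;
  if (input.timeoutMs !== undefined) {
    timeoutMs = Math.round(input.timeoutMs);
    if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
      throw new Error("timeoutMs must be a positive number");
    }
  }
  // The configured template is always passed through to the E2B SDK.
  return { template, timeoutMs };
}
```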

## Verification

- `cd packages/plugins/paperclip-plugin-e2b && pnpm install
--ignore-workspace --no-lockfile`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm build`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
test`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
typecheck`
- `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run`

## Risks

- The package now relies on a prepack manifest rewrite so the
publish-time dependency list stays correct while the repo-local dev
manifest stays workspace-light
- The current repo snapshot is still unreleased, so the generated
publish manifest points at the repo SDK version until the normal release
flow rewrites versions before publish
- Real-world E2B environments may still expose edge cases around
lifecycle timing or sandbox metadata beyond the mocked unit coverage


## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 11:01:11 -07:00
Dotta
ecdb43b64f Fix heartbeat concurrency normalization
Honor `maxConcurrentRuns` values below the default so a configured value of 1 keeps same-agent assignment wakeups queued while a run is active. Add a regression test covering the queued second-assignment case.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-25 12:36:18 -05:00
Dotta
b522672423 Default new hires to approved
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-25 10:35:58 -05:00
Devin Foley
5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.
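
The generic secret-ref handling reduces to classifying config fields by
the schema's secret marker; the manifest shape below is an assumption for
illustration, not the real `configSchema` format:

```typescript
// Hypothetical schema entry: fields flagged `secret` become secret refs.
interface ConfigField {
  name: string;
  secret?: boolean;
}

function splitSecretFields(
  schema: ConfigField[],
  config: Record<string, string>,
): { plain: Record<string, string>; secrets: Record<string, string> } {
  const secretNames = new Set(schema.filter((f) => f.secret).map((f) => f.name));
  const plain: Record<string, string> = {};
  const secrets: Record<string, string> = {};
  for (const [key, value] of Object.entries(config)) {
    // Schema-declared secret fields are stored as Paperclip secrets;
    // everything else persists as plain provider config.
    (secretNames.has(key) ? secrets : plain)[key] = value;
  }
  return { plain, secrets };
}
```

This is why the risk note above matters: a schema that omits the secret
marker would persist a credential as plain config.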

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta
f985b38c62 Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 19:35:17 -05:00
Dotta
deba60ebb2 Stabilize serialized server route tests (#4448)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server route suite is a core confidence layer for auth, issue
context, and workspace runtime behavior
> - Some route tests were doing extra module/server isolation work that
made local runs slower and more fragile
> - The stable Vitest runner also needs to pass server-relative exclude
paths to avoid accidentally re-including serialized suites
> - This pull request tightens route test isolation and runner
serialization behavior
> - The benefit is more reliable targeted and stable-route test
execution without product behavior changes

## What Changed

- Updated `run-vitest-stable.mjs` to exclude serialized server tests
using server-relative paths.
- Forced the server Vitest config to use a single worker in addition to
isolated forks.
- Simplified agent permission route tests to create per-request test
servers without shared server lifecycle state.
- Stabilized issue goal context route mocks by using static mocked
services and a sequential suite.
- Re-registered workspace runtime route mocks before cache-busted route
imports.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `node --check scripts/run-vitest-stable.mjs`

## Risks

- Low risk. This is test infrastructure only.
- The stable runner path fix changes which tests are excluded from the
non-serialized server batch, matching the server project root that
Vitest applies internally.


## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:27:00 -05:00
Dotta
f68e9caa9a Polish markdown external link wrapping (#4447)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI renders agent comments, PR links, issue links, and
operational markdown throughout issue threads
> - Long GitHub and external links can wrap awkwardly, leaving icons
orphaned from the text they describe
> - Small inbox visual polish also helps repeated board scanning without
changing behavior
> - This pull request glues markdown link icons to adjacent link
characters and removes a redundant inbox list border
> - The benefit is cleaner, more stable markdown and inbox rendering for
day-to-day operator review

## What Changed

- Added an external-link indicator for external markdown links.
- Kept the GitHub icon attached to the first link character so it does
not wrap onto a separate line.
- Kept the external-link icon attached to the final link character so it
does not wrap away from the URL/text.
- Added markdown rendering regressions for GitHub and external link icon
wrapping.
- Removed the extra border around the inbox list card.
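
The no-wrap trick is to bind the icon to the adjacent link character; a
plain-string version of the trailing-icon case (the real component
renders React elements, not HTML strings) might look like:

```typescript
// Wrap the final character of the link text together with the icon in a
// no-wrap span so the icon can never wrap onto its own line.
function glueTrailingIcon(text: string, icon: string): string {
  if (text.length === 0) return icon;
  const head = text.slice(0, -1);
  const tail = text.slice(-1);
  return `${head}<span style="white-space:nowrap">${tail}${icon}</span>`;
}
```

The GitHub-icon case is the mirror image, gluing the icon to the first
character instead.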

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Low risk. The markdown change is limited to link child rendering and
preserves existing href/target/rel behavior.
- Visual-only inbox polish.


## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:26:13 -05:00
Dotta
73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.


## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta
6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.
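The staleness check described above can be sketched as a pure guard. All names and shapes below are hypothetical illustrations, not the server's real types:

```typescript
// Hypothetical shapes for illustration; the real server models carry more fields.
interface ScheduledRetry {
  issueId: string;
  agentId: string; // the agent whose failed run scheduled this retry
  status: "scheduled" | "promoted" | "cancelled";
}

interface Issue {
  id: string;
  assigneeId: string | null;
  status: string;
}

// A scheduled retry is stale once its issue has been reassigned away from
// the agent that scheduled it. Only retries not yet promoted to live runs
// are eligible; an active run is handled by the explicit-cancel path instead.
function isStaleRetry(retry: ScheduledRetry, issue: Issue): boolean {
  return (
    retry.status === "scheduled" &&
    issue.id === retry.issueId &&
    issue.assigneeId !== retry.agentId
  );
}
```

When this guard fires, the PR's cleanup then cancels the retry, clears the issue execution lock, and cancels the associated wakeup request.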

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
  - `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta
0c6961a03e Normalize escaped multiline issue and approval text (#4444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board and agent APIs accept multiline issue, approval,
interaction, and document text
> - Some callers accidentally send literal escaped newline sequences
like `\n` instead of JSON-decoded line breaks
> - That makes comments, descriptions, documents, and approval notes
render as flattened text instead of readable markdown
> - This pull request centralizes multiline text normalization in shared
validators
> - The benefit is newline-preserving API behavior across issue and
approval workflows without route-specific fixes

## What Changed

- Added a shared `multilineTextSchema` helper that normalizes escaped
`\n`, `\r\n`, and `\r` sequences to real line breaks.
- Applied the helper to issue descriptions, issue update comments, issue
comment bodies, suggested task descriptions, interaction summaries,
issue documents, approval comments, and approval decision notes.
- Added shared validator regressions for issue and approval multiline
inputs.
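The normalization the shared helper performs can be sketched as a standalone function. The real `multilineTextSchema` wraps this in the shared validator layer; only the replacement order is shown here:

```typescript
// Normalize literal escaped newline sequences to real line breaks.
// Escaped CRLF is replaced first so its "\r" is not consumed by the
// standalone "\r" rule and turned into two line breaks.
function normalizeMultilineText(input: string): string {
  return input
    .replace(/\\r\\n/g, "\n")
    .replace(/\\r/g, "\n")
    .replace(/\\n/g, "\n");
}
```

A caller that sends `"line1\nline2"` as literal backslash-n text now stores the same content as a JSON-decoded line break would have produced.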

## Verification

- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/approval.test.ts
packages/shared/src/validators/issue.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`

## Risks

- Low risk. This only changes text fields that are explicitly multiline
user/operator content.
- If a caller intentionally wanted literal backslash-n text in these
fields, it will now render as a real line break.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 18:02:45 -05:00
Dotta
4d2534e1bc Merge origin/master into master
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 17:05:11 -05:00
Dotta
5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Dotta
17b860bdfa Fix reassigned issue scheduled retries
Cancel stale scheduled retries when an issue is reassigned so the new assignee can queue immediately instead of waiting behind the old assignee's retry.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 15:23:27 -05:00
Dotta
3df7db0b44 Cancel active runs when issues are cancelled
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 15:21:24 -05:00
Dotta
9a8d219949 [codex] Stabilize tests and local maintenance assets (#4423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs

## What Changed

- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.

## Verification

- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable to source UI behavior;
the included Storybook static output is generated artifact preservation
from the source branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 15:11:42 -05:00
Dotta
25c9e92ca4 Glue icon to link text, add external-link indicator
PAP-2240: the GitHub icon on github.com markdown links could wrap onto
the previous line, separated from the first character of the URL. Pin
the icon and first character inside a white-space:nowrap span so they
never split across a line break. Also add an ExternalLink (lucide) icon
on the right of any target="_blank" link, glued to the last character
with the same no-wrap treatment so the "opens in new tab" cue stays
with the link text.
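The no-wrap glue can be sketched as a small text split; the helper name and segment shapes here are illustrative, not the real markup:

```typescript
// Split link text so the first character can be glued to a leading icon
// and the last character to a trailing icon, each inside a
// white-space: nowrap span. Very short text is kept whole.
function splitForNoWrapGlue(text: string): { head: string; middle: string; tail: string } {
  if (text.length <= 2) return { head: text, middle: "", tail: "" };
  return {
    head: text[0],                // <span style="white-space:nowrap"><Icon/>{head}</span>
    middle: text.slice(1, -1),    // free to wrap anywhere
    tail: text[text.length - 1],  // <span style="white-space:nowrap">{tail}<ExternalLink/></span>
  };
}
```

Rendering wraps `head` with the leading GitHub icon and `tail` with the trailing ExternalLink icon in nowrap spans, so a line break can only fall inside `middle` and the icons never strand on their own line.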

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 14:52:10 -05:00
Dotta
69fa65328a Remove border around inbox list card
The inbox list card's outer border was visually redundant — the list items
are already delimited by their internal dividers. Drop `border border-border`
from the list container so neither light nor dark mode shows the outline.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 14:37:44 -05:00
Dotta
80ae1b55f6 Merge remote-tracking branch 'origin/master'
* origin/master:
  Add sandbox environment support (#4415)
  [codex] Harden create-agent skill governance (#4422)
  [codex] Polish issue composer and long document display (#4420)

# Conflicts:
#	server/src/__tests__/agent-permissions-routes.test.ts
#	server/src/__tests__/issues-goal-context-routes.test.ts
#	server/src/routes/agents.ts
#	server/src/routes/issues.ts
#	ui/src/components/IssueChatThread.test.tsx
2026-04-24 14:29:55 -05:00
Dotta
c6955a607c Normalize escaped newlines in server text inputs 2026-04-24 14:21:49 -05:00
Dotta
4de7648bd9 Remove superseded benchmark harness 2026-04-24 14:20:19 -05:00
Dotta
a279304c30 Remove generated Storybook static output
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 14:19:23 -05:00
Devin Foley
70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
  - confirm sandbox provider execution does not inherit host secrets
implicitly


## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta
641eb44949 [codex] Harden create-agent skill governance (#4422)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring agents is a governance-sensitive workflow because it grants
roles, adapter config, skills, and execution capability
> - The create-agent skill needs explicit templates and review guidance
so hires are auditable and not over-permissioned
> - Skill sync also needs to recognize bundled Paperclip skills
consistently for Codex local agents
> - This pull request expands create-agent role templates, adds a
security-engineer template, and documents capability/secret-handling
review requirements
> - The benefit is safer, more repeatable agent creation with clearer
approval payloads and less permission sprawl

## What Changed

- Expanded `paperclip-create-agent` guidance for template selection,
adjacent-template drafting, and role-specific review bars.
- Added a Security Engineer agent template and collaboration/safety
sections for Coder, QA, and UX Designer templates.
- Hardened draft-review guidance around desired skills, external-system
access, secrets, and confidential advisory handling.
- Updated LLM agent-configuration guidance to point hiring workflows at
the create-agent skill.
- Added tests for bundled skill sync, create-agent skill injection, hire
approval payloads, and LLM route guidance.

## Verification

- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
server/src/__tests__/codex-local-skill-injection.test.ts
server/src/__tests__/codex-local-skill-sync.test.ts
server/src/__tests__/llms-routes.test.ts
server/src/__tests__/paperclip-skill-utils.test.ts --config
server/vitest.config.ts` passed: 5 files, 23 tests.
- `git diff --check public-gh/master..pap-2228-create-agent-governance
-- . ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this primarily changes skills/docs and tests, but
it affects future hiring guidance and approval expectations.
- Reviewers should check whether the new Security Engineer template is
too broad for default company installs.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable; this PR changes
skills, docs, and server tests.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:15:28 -05:00
Dotta
77a72e28c2 [codex] Polish issue composer and long document display (#4420)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue comments and documents are the main working surface where
operators and agents collaborate
> - File drops, markdown editing, and long issue descriptions need to
feel predictable because they sit directly in the task execution loop
> - The composer had edge cases around drag targets, attachment
feedback, image drops, and long markdown content crowding the page
> - This pull request polishes the issue composer, hardens markdown
editor regressions, and adds a fold curtain for long issue
descriptions/documents
> - The benefit is a calmer issue detail surface that handles uploads
and long work products without hiding state or breaking layout

## What Changed

- Scoped issue-composer drag/drop behavior so the composer owns file
drops without turning the whole thread into a competing drop target.
- Added clearer attachment upload feedback for non-image files and
image-drop stability coverage.
- Hardened markdown editor and markdown body handling around HTML-like
tag regressions.
- Added `FoldCurtain` and wired it into issue descriptions and issue
documents so long markdown previews can expand/collapse.
- Added Storybook coverage for the fold curtain state.
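The fold decision driven by that DOM measurement can be sketched as a pure helper; the name and threshold are illustrative assumptions, not taken from the component:

```typescript
// Decide whether the fold curtain should collapse content and show its
// expand control. collapsedMaxPx is the collapsed preview height; the
// ResizeObserver callback re-runs this whenever the measured content
// height changes, so the curtain appears and disappears with the content.
function shouldShowCurtain(
  contentHeightPx: number,
  collapsedMaxPx: number,
  slackPx = 24,
): boolean {
  // A small slack avoids rendering a curtain that would hide only a
  // few pixels of content.
  return contentHeightPx > collapsedMaxPx + slackPx;
}
```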

## Verification

- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts`
passed: 3 files, 75 tests.
- `git diff --check public-gh/master..pap-2228-editor-composer-polish --
. ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this changes user-facing composer/drop behavior
and long markdown display.
- The fold curtain uses DOM measurement and `ResizeObserver`; reviewers
should check browser behavior for very long descriptions and documents.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshots were not newly captured during branch splitting; the
UI states are covered by component tests and a Storybook story.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:12:41 -05:00
Dotta
aa69bd4082 Relax active run payload assertion 2026-04-24 14:01:22 -05:00
Dotta
3e5a1c83e3 Stabilize maintenance branch tests 2026-04-24 13:59:21 -05:00
Dotta
0c813dddd3 Stabilize dependency scheduling teardown 2026-04-24 13:59:21 -05:00
Dotta
7515fa673e Guard fold curtain resize observer 2026-04-24 13:59:21 -05:00
Dotta
8cc676cb84 Stabilize issue route tree control mocks 2026-04-24 13:59:21 -05:00
Dotta
496a7ec858 Fix composer drop overlay test copy
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:43:04 -05:00
Dotta
524de337b4 Broaden agent browser cleanup script
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:37:57 -05:00
Dotta
4e445c155e Add fold curtain for long issue documents
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:37:54 -05:00
Dotta
a62b180587 Harden create-agent skill guidance
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:37:50 -05:00
Dotta
195091afac Polish benchmark scaffold layout
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:37:45 -05:00
Dotta
ef740acbae Stabilize running task chain of thought
Keep live run chain-of-thought sections in the active Working state until the run itself completes, even during gaps between command executions. Add a regression test for a running run whose latest command already has a result.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:07 -05:00
Dotta
e729c0b1fe Prevent mention wakes from taking issue execution lock 2026-04-24 13:32:07 -05:00
Dotta
98f6136a3e Show terminal blockers in blocked issue notice 2026-04-24 13:32:07 -05:00
Dotta
1439708ea0 Cancel blocked queued heartbeat retries 2026-04-24 13:32:07 -05:00
Dotta
d4c70cdd18 Fix stale covered-blocked run detection
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:07 -05:00
Dotta
3e4dd2a988 Pad composer bottom so long text doesn't touch buttons
Add pb-2 inside the floating reply editor's scrollable content so the
last line of a long message has breathing room before the action row.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 13:32:07 -05:00
Dotta
4fb8e337e9 Add live run issue filter
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:07 -05:00
Dotta
2e0111b11e Restore covered blocked cyan status ring
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:07 -05:00
Dotta
528468c5b3 Fix covered blocked status visual 2026-04-24 13:32:07 -05:00
Dotta
6a518ca828 Page persisted run logs incrementally
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
f626996e63 Fix covered blocked status icons 2026-04-24 13:32:06 -05:00
Dotta
0ee91c1651 Virtualize raw transcript rendering
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
8f208110b0 Add covered blocked issue attention state 2026-04-24 13:32:06 -05:00
Dotta
4788e563fd Harden pause-hold wake provenance
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
4ccbd7ec43 Add dashboard live runs page
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
97ea7e50f8 Revert raw transcript timestamp tooltips
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
1fa83cbdd3 Suppress recovery under pause holds
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
fcaa681400 Implement explicit issue follow-up intent
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
855471b6a0 test: cover paused-tree recovery suppression
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
c19197581b Preserve Storybook static build output 2026-04-24 13:32:06 -05:00
Dotta
5d22432abe Lazily compute transcript timestamp tooltips 2026-04-24 13:32:06 -05:00
Dotta
4d2c361cca Add explicit issue follow-up intent 2026-04-24 13:32:06 -05:00
Dotta
c5020b13fc Add Paperclip bench harness 2026-04-24 13:32:06 -05:00
Dotta
6fd5fe6607 PAP-2147: explicit template selection in paperclip-create-agent
Phase 1 of the agent-creator skill upgrade plan.

- SKILL.md: step 4 is now an explicit "choose the instruction source"
  decision (exact template / adjacent template / generic fallback), and
  step 7 references the new pre-submit checklist.
- references/baseline-role-guide.md (new): section-by-section guide for
  drafting a no-template AGENTS.md.
- references/draft-review-checklist.md (new): pre-submit checklist
  covering identity, role clarity, workflow, lenses, output bar,
  collaboration routing, governance fields, least-privilege permissions,
  and done criteria.
- references/agent-instruction-templates.md: added the three-path
  decision flow and separate "apply exact / adjacent / fallback"
  procedures.

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
d72b693fd6 Add raw transcript timestamp tooltips
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
89b94da25b Add markdown HTML security regressions 2026-04-24 13:32:06 -05:00
Dotta
269bae3688 Prevent stale heartbeat start locks 2026-04-24 13:32:06 -05:00
Dotta
4bcd27bc18 Wake assigned issues reopened from done
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
cd2c922af2 Stabilize image drops in issue composer 2026-04-24 13:32:06 -05:00
Dotta
43ce73ecb7 Allow subtree cancel without reason
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
6f34728caa Broaden rich editor HTML tag regression 2026-04-24 13:32:06 -05:00
Dotta
6c7c18a7c5 Keep issue composer as sole drop target 2026-04-24 13:32:06 -05:00
Dotta
356dac58f5 Fix subtree cancel status application
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:32:06 -05:00
Dotta
164c3f4491 Fix rich markdown editor HTML tag crashes 2026-04-24 13:32:06 -05:00
Dotta
f457c9d36f Isolate Claude prompt cache tests from runner env 2026-04-24 13:31:03 -05:00
Dotta
deca3c6842 Stabilize company import issue assertions 2026-04-24 13:31:03 -05:00
Dotta
6eed8376f1 Fix watchdog banner severity tone 2026-04-24 13:31:03 -05:00
Dotta
4533859706 Fix duplicate agent route import 2026-04-24 13:31:03 -05:00
Dotta
46deb875b5 Stabilize retry timestamp execute tests 2026-04-24 13:31:03 -05:00
Dotta
74948dab52 Show issue composer attachment feedback 2026-04-24 13:31:02 -05:00
Dotta
23fef9d430 Allow Codex fast mode for manual models
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:31:02 -05:00
Dotta
1feac07821 Fix issue chat composer dropzone scope 2026-04-24 13:31:02 -05:00
Dotta
d845ff6d9b Fix workspace runtime route authz mocks 2026-04-24 13:31:02 -05:00
Dotta
747ad09977 Stabilize Vitest route suite isolation 2026-04-24 13:31:02 -05:00
Dotta
0abe882151 Add watchdog follow-ups and issue reopen fixes 2026-04-24 13:31:02 -05:00
Dotta
65d410f405 Re-arm watchdog continue decisions 2026-04-24 13:30:47 -05:00
Dotta
fdb2bcec39 PAP-2079 create terminal recovery issues 2026-04-24 13:30:29 -05:00
Dotta
321b90205f Fix watchdog decision authorization binding
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:30:29 -05:00
Dotta
8efd9cef86 PAP-2075 centralize active-run watchdog recovery 2026-04-24 13:30:29 -05:00
Dotta
27e8d75829 Move heartbeat recovery side effects into service 2026-04-24 13:30:29 -05:00
Dotta
6b53b31a00 Centralize recovery classifiers 2026-04-24 13:30:29 -05:00
Dotta
17962ad465 Stabilize live run transcript excerpts 2026-04-24 13:30:29 -05:00
Dotta
ecc6c8f1b6 Recover stranded assigned issues via child issue
When automatic retries for stranded todo/in_progress work are exhausted,
create an explicit `stranded_issue_recovery` child issue assigned to an
invokable manager (reportsTo chain, then CTO/CEO, then the original
assignee). Link it as a first-class blocker on the source, wake the
recovery owner, and surface the next action in a source-issue comment
instead of leaving the work silently blocked.
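The owner-resolution fallback described above can be sketched as follows; the agent shape and role strings are hypothetical, not the server's real model:

```typescript
// Hypothetical agent shape for illustration.
interface AgentRecord {
  id: string;
  role: string;             // e.g. "cto", "ceo", "coder"
  reportsTo: string | null; // manager's agent id
  invokable: boolean;       // can this agent currently be woken to run?
}

// Resolve the owner of a stranded_issue_recovery child issue: walk the
// reportsTo chain from the original assignee, then fall back to an
// invokable CTO or CEO, then to the original assignee itself.
function resolveRecoveryOwner(assigneeId: string, agents: Map<string, AgentRecord>): string {
  const seen = new Set<string>();
  let current = agents.get(assigneeId)?.reportsTo ?? null;
  while (current && !seen.has(current)) { // seen guard breaks reportsTo cycles
    seen.add(current);
    const manager = agents.get(current);
    if (manager?.invokable) return manager.id;
    current = manager?.reportsTo ?? null;
  }
  for (const role of ["cto", "ceo"]) {
    for (const agent of agents.values()) {
      if (agent.role === role && agent.invokable) return agent.id;
    }
  }
  return assigneeId;
}
```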

Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:30:29 -05:00
Dotta
08abf9d060 Implement active-run output watchdog
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-24 13:30:29 -05:00
Dotta
e23a3a1856 Improve run liveness actionability 2026-04-24 13:30:29 -05:00
Dotta
f16f4025fd Gate liveness recovery issue creation 2026-04-24 13:30:29 -05:00
Dotta
8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior
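
The usage-limit retry-window parsing above can be sketched roughly like this. The function name and message format are hypothetical stand-ins; the real classifier lives in the adapter packages:

```typescript
// Hypothetical sketch of usage-limit retry-window parsing; message
// shapes and the helper name are illustrative, not the real adapter code.
function parseRetryAfterMs(message: string): number | null {
  // e.g. "usage limit reached, try again in 3 minutes"
  const match = /try again in (\d+)\s*(second|minute|hour)s?/i.exec(message);
  if (!match) return null; // not a retryable usage-limit message
  const unit = match[2].toLowerCase();
  const perUnitMs =
    unit === "second" ? 1_000 : unit === "minute" ? 60_000 : 3_600_000;
  return Number(match[1]) * perUnitMs;
}
```

A classifier like this lets the heartbeat scheduler distinguish "wait and retry at a known time" from a hard failure.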

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta
4fdbbeced3 [codex] Refine markdown issue reference rendering (#4382)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Task references are a core part of how operators understand issue
relationships across the UI
> - Those references appear both in markdown bodies and in sidebar
relationship panels
> - The rendering had drifted between surfaces, and inline markdown
pills were reading awkwardly inside prose and lists
> - This pull request unifies the underlying issue-reference treatment,
routes issue descriptions through `MarkdownBody`, and switches inline
markdown references to a cleaner text-link presentation
> - The benefit is more consistent issue-reference UX with better
readability in markdown-heavy views

## What Changed

- unified sidebar and markdown issue-reference rendering around the
shared issue-reference components
- routed resting issue descriptions through `MarkdownBody` so
description previews inherit the richer issue-reference treatment
- replaced inline markdown pill chrome with a cleaner inline reference
presentation for prose contexts
- added and updated UI tests for `MarkdownBody` and `InlineEditor`

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx
ui/src/components/InlineEditor.test.tsx`

## Risks

- Moderate UI risk: issue-reference rendering now differs intentionally
between inline markdown and relationship sidebars, so regressions would
show up as styling or hover-preview mismatches

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:39:21 -05:00
Dotta
7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta
35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
512 changed files with 101377 additions and 6771 deletions

View File

@@ -177,8 +177,12 @@ real name or email). To find GitHub usernames:
**Never expose contributor email addresses.** Use `@username` only.
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list. List contributors
in alphabetical order by GitHub username (case-insensitive).
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list.
Exclude Paperclip founders from the list (e.g. `cryppadotta`, `forgottendev`, `devinfoley`, `sockmonster`, `scotttong`)
List contributors in alphabetical order by GitHub username (case-insensitive).
If there are no contributors left after exclusions, then just skip this section and don't mention it.
## Step 6 — Review Before Release

View File

@@ -41,44 +41,7 @@ jobs:
node-version: 24
- name: Validate Dockerfile deps stage
run: |
missing=0
# Extract only the deps stage from the Dockerfile
deps_stage="$(awk '/^FROM .* AS deps$/{found=1; next} found && /^FROM /{exit} found{print}' Dockerfile)"
if [ -z "$deps_stage" ]; then
echo "::error::Could not extract deps stage from Dockerfile (expected 'FROM ... AS deps')"
exit 1
fi
# Derive workspace search roots from pnpm-workspace.yaml (exclude dev-only packages)
search_roots="$(grep '^ *- ' pnpm-workspace.yaml | sed 's/^ *- //' | sed 's/\*$//' | grep -v 'examples' | grep -v 'create-paperclip-plugin' | tr '\n' ' ')"
if [ -z "$search_roots" ]; then
echo "::error::Could not derive workspace roots from pnpm-workspace.yaml"
exit 1
fi
# Check all workspace package.json files are copied in the deps stage
for pkg in $(find $search_roots -maxdepth 2 -name package.json -not -path '*/examples/*' -not -path '*/create-paperclip-plugin/*' -not -path '*/node_modules/*' 2>/dev/null | sort -u); do
dir="$(dirname "$pkg")"
if ! echo "$deps_stage" | grep -q "^COPY ${dir}/package.json"; then
echo "::error::Dockerfile deps stage missing: COPY ${pkg} ${dir}/"
missing=1
fi
done
# Check patches directory is copied if it exists
if [ -d patches ] && ! echo "$deps_stage" | grep -q '^COPY patches/'; then
echo "::error::Dockerfile deps stage missing: COPY patches/ patches/"
missing=1
fi
if [ "$missing" -eq 1 ]; then
echo "Dockerfile deps stage is out of sync. Update it to include the missing files."
exit 1
fi
run: node ./scripts/check-docker-deps-stage.mjs
- name: Validate dependency resolution when manifests change
run: |
@@ -117,6 +80,9 @@ jobs:
- name: Run tests
run: pnpm test:run
- name: Verify release registry test coverage
run: pnpm run test:release-registry
- name: Build
run: pnpm build

.gitignore vendored
View File

@@ -3,6 +3,7 @@ node_modules/
**/node_modules
**/node_modules/
dist/
ui/storybook-static/
.env
*.tsbuildinfo
drizzle/meta/

View File

@@ -123,7 +123,9 @@ pnpm test:release-smoke
Run the browser suites only when your change touches them or when you are explicitly verifying CI/release flows.
Run this full check before claiming done:
For normal issue work, run the smallest relevant verification first. Do not default to repo-wide typecheck/build/test on every heartbeat when a narrower check is enough to prove the change.
Run this full check before claiming repo work done in a PR-ready hand-off, or when the change scope is broad enough that targeted checks are not sufficient:
```sh
pnpm -r typecheck

View File

@@ -1,3 +1,4 @@
# syntax=docker/dockerfile:1.20
FROM node:lts-trixie-slim AS base
ARG USER_UID=1000
ARG USER_GID=1000
@@ -29,6 +30,8 @@ COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-
COPY packages/adapters/opencode-local/package.json packages/adapters/opencode-local/
COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY --parents packages/plugins/sandbox-providers/./*/package.json packages/plugins/sandbox-providers/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY patches/ patches/
RUN pnpm install --frozen-lockfile

View File

@@ -6,7 +6,8 @@
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a>
</p>
<p align="center">
@@ -409,6 +410,7 @@ We welcome contributions. See the [contributing guide](CONTRIBUTING.md) for deta
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC

View File

@@ -6,7 +6,8 @@
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a>
</p>
<p align="center">
@@ -278,6 +279,7 @@ We welcome contributions. See the [contributing guide](https://github.com/paperc
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC

View File

@@ -14,6 +14,7 @@ function makeCompany(overrides: Partial<Company>): Company {
issueCounter: 1,
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
attachmentMaxBytes: 10 * 1024 * 1024,
requireBoardApprovalForNewAgents: false,
feedbackDataSharingEnabled: false,
feedbackDataSharingConsentAt: null,

View File

@@ -1,5 +1,5 @@
import { execFile, spawn } from "node:child_process";
import { mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import net from "node:net";
import os from "node:os";
import path from "node:path";
@@ -104,20 +104,50 @@ function writeTestConfig(configPath: string, tempRoot: string, port: number, con
writeFileSync(configPath, `${JSON.stringify(config, null, 2)}\n`, "utf8");
}
function createServerEnv(configPath: string, port: number, connectionString: string) {
interface TestPaperclipEnv {
configPath: string;
paperclipHome: string;
instanceId: string;
shellHome?: string;
}
function createBasePaperclipEnv(options: TestPaperclipEnv) {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
env.PAPERCLIP_CONFIG = options.configPath;
env.PAPERCLIP_HOME = options.paperclipHome;
env.PAPERCLIP_INSTANCE_ID = options.instanceId;
env.PAPERCLIP_CONTEXT = path.join(options.paperclipHome, "context.json");
env.PAPERCLIP_AUTH_STORE = path.join(options.paperclipHome, "auth.json");
if (options.shellHome) {
env.HOME = options.shellHome;
}
return env;
}
function createServerEnv(
configPath: string,
port: number,
connectionString: string,
options: Omit<TestPaperclipEnv, "configPath">,
) {
const env = createBasePaperclipEnv({
configPath,
...options,
});
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
delete env.SERVE_UI;
delete env.HEARTBEAT_SCHEDULER_ENABLED;
env.PAPERCLIP_CONFIG = configPath;
env.DATABASE_URL = connectionString;
env.HOST = "127.0.0.1";
env.PORT = String(port);
@@ -130,13 +160,8 @@ function createServerEnv(configPath: string, port: number, connectionString: str
return env;
}
function createCliEnv() {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
function createCliEnv(options: TestPaperclipEnv) {
const env = createBasePaperclipEnv(options);
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
@@ -183,14 +208,25 @@ async function api<T>(baseUrl: string, pathname: string, init?: RequestInit): Pr
return text ? JSON.parse(text) as T : (null as T);
}
async function runCliJson<T>(args: string[], opts: { apiBase: string; configPath: string }) {
async function runCliJson<T>(
args: string[],
opts: TestPaperclipEnv & { apiBase?: string; includeConfigArg?: boolean },
) {
const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../../..");
const cliArgs = ["--silent", "paperclipai", ...args];
if (opts.apiBase) {
cliArgs.push("--api-base", opts.apiBase);
}
if (opts.includeConfigArg !== false) {
cliArgs.push("--config", opts.configPath);
}
cliArgs.push("--json");
const result = await execFileAsync(
"pnpm",
["--silent", "paperclipai", ...args, "--api-base", opts.apiBase, "--config", opts.configPath, "--json"],
cliArgs,
{
cwd: repoRoot,
env: createCliEnv(),
env: createCliEnv(opts),
maxBuffer: 10 * 1024 * 1024,
},
);
@@ -235,6 +271,9 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
let configPath = "";
let exportDir = "";
let apiBase = "";
let paperclipHome = "";
let cliShellHome = "";
let paperclipInstanceId = "";
let serverProcess: ServerProcess | null = null;
let tempDb: Awaited<ReturnType<typeof startEmbeddedPostgresTestDatabase>> | null = null;
@@ -242,6 +281,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
tempRoot = mkdtempSync(path.join(os.tmpdir(), "paperclip-company-cli-e2e-"));
configPath = path.join(tempRoot, "config", "config.json");
exportDir = path.join(tempRoot, "exported-company");
paperclipHome = path.join(tempRoot, "paperclip-home");
cliShellHome = path.join(tempRoot, "shell-home");
paperclipInstanceId = "company-cli-e2e";
mkdirSync(paperclipHome, { recursive: true });
mkdirSync(cliShellHome, { recursive: true });
tempDb = await startEmbeddedPostgresTestDatabase("paperclip-company-cli-db-");
@@ -256,7 +300,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
["paperclipai", "run", "--config", configPath],
{
cwd: repoRoot,
env: createServerEnv(configPath, port, tempDb.connectionString),
env: createServerEnv(configPath, port, tempDb.connectionString, {
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
}),
stdio: ["ignore", "pipe", "pipe"],
},
);
@@ -282,6 +330,31 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
it("exports a company package and imports it into new and existing companies", async () => {
expect(serverProcess).not.toBeNull();
const cliContext = await runCliJson<{
contextPath: string;
profileName: string;
profile: { apiBase?: string };
}>(
["context", "set", "--profile", "isolation-check", "--api-base", "https://example.test"],
{
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
includeConfigArg: false,
},
);
const expectedContextPath = path.join(paperclipHome, "context.json");
const leakedContextPath = path.join(cliShellHome, ".paperclip", "context.json");
expect(cliContext.contextPath).toBe(expectedContextPath);
expect(cliContext.profileName).toBe("isolation-check");
expect(cliContext.profile.apiBase).toBe("https://example.test");
expect(existsSync(expectedContextPath)).toBe(true);
expect(existsSync(leakedContextPath)).toBe(false);
rmSync(expectedContextPath, { force: true });
expect(existsSync(expectedContextPath)).toBe(false);
const sourceCompany = await api<{ id: string; name: string; issuePrefix: string }>(apiBase, "/api/companies", {
method: "POST",
headers: { "content-type": "application/json" },
@@ -303,8 +376,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
name: "Export Engineer",
role: "engineer",
adapterType: "claude_local",
adapterConfig: {
promptTemplate: "You verify company portability.",
adapterConfig: {},
instructionsBundle: {
files: {
"AGENTS.md": "You verify company portability.",
},
},
}),
},
@@ -355,7 +431,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"--include",
"company,agents,projects,issues",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(exportResult.ok).toBe(true);
@@ -379,7 +461,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedNew.company.action).toBe("created");
@@ -398,10 +486,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const importedMatchingIssues = importedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(importedAgents.map((agent) => agent.name)).toContain(sourceAgent.name);
expect(importedProjects.map((project) => project.name)).toContain(sourceProject.name);
expect(importedIssues.map((issue) => issue.title)).toContain(sourceIssue.title);
expect(importedMatchingIssues).toHaveLength(1);
const previewExisting = await runCliJson<{
errors: string[];
@@ -426,7 +515,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--dry-run",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(previewExisting.errors).toEqual([]);
@@ -453,7 +548,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedExisting.company.action).toBe("unchanged");
@@ -471,11 +572,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const twiceImportedMatchingIssues = twiceImportedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(twiceImportedAgents).toHaveLength(2);
expect(new Set(twiceImportedAgents.map((agent) => agent.name)).size).toBe(2);
expect(twiceImportedProjects).toHaveLength(2);
expect(twiceImportedIssues).toHaveLength(2);
expect(twiceImportedMatchingIssues).toHaveLength(2);
expect(new Set(twiceImportedMatchingIssues.map((issue) => issue.identifier)).size).toBe(2);
const zipPath = path.join(tempRoot, "exported-company.zip");
const portableFiles: Record<string, string> = {};
@@ -498,7 +601,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedFromZip.company.action).toBe("created");

View File

@@ -160,6 +160,7 @@ describe("renderCompanyImportPreview", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: null,
requireBoardApprovalForNewAgents: false,
@@ -375,6 +376,7 @@ describe("import selection catalog", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: "images/company-logo.png",
requireBoardApprovalForNewAgents: false,

View File

@@ -190,8 +190,9 @@ describe("worktree helpers", () => {
).toEqual(["worktree", "add", "-b", "my-worktree", "/tmp/my-worktree", "origin/main"]);
});
it("rewrites loopback auth URLs to the new port only", () => {
it("rewrites auth URLs only when they already include a port", () => {
expect(rewriteLocalUrlPort("http://127.0.0.1:3100", 3110)).toBe("http://127.0.0.1:3110/");
expect(rewriteLocalUrlPort("http://my-host.ts.net:3100", 3110)).toBe("http://my-host.ts.net:3110/");
expect(rewriteLocalUrlPort("https://paperclip.example", 3110)).toBe("https://paperclip.example");
});

View File

@@ -61,6 +61,7 @@ interface IssueUpdateOptions extends BaseClientOptions {
interface IssueCommentOptions extends BaseClientOptions {
body: string;
reopen?: boolean;
resume?: boolean;
}
interface IssueCheckoutOptions extends BaseClientOptions {
@@ -241,12 +242,14 @@ export function registerIssueCommands(program: Command): void {
.argument("<issueId>", "Issue ID")
.requiredOption("--body <text>", "Comment body")
.option("--reopen", "Reopen if issue is done/cancelled")
.option("--resume", "Request explicit follow-up and wake the assignee when resumable")
.action(async (issueId: string, opts: IssueCommentOptions) => {
try {
const ctx = resolveCommandContext(opts);
const payload = addIssueCommentSchema.parse({
body: opts.body,
reopen: opts.reopen,
resume: opts.resume,
});
const comment = await ctx.api.post<IssueComment>(`/api/issues/${issueId}/comments`, payload);
printOutput(comment, { json: ctx.json });

View File

@@ -75,11 +75,6 @@ function nonEmpty(value: string | null | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function isLoopbackHost(hostname: string): boolean {
const value = hostname.trim().toLowerCase();
return value === "127.0.0.1" || value === "localhost" || value === "::1";
}
export function sanitizeWorktreeInstanceId(rawValue: string): string {
const trimmed = rawValue.trim().toLowerCase();
const normalized = trimmed
@@ -168,7 +163,8 @@ export function rewriteLocalUrlPort(rawUrl: string | undefined, port: number): s
if (!rawUrl) return undefined;
try {
const parsed = new URL(rawUrl);
if (!isLoopbackHost(parsed.hostname)) return rawUrl;
// The URL API normalizes default ports like :80/:443 to "", so treat them as stable URLs.
if (!parsed.port) return rawUrl;
parsed.port = String(port);
return parsed.toString();
} catch {
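
The comment in the diff above points at a real quirk of the WHATWG URL API: scheme-default ports are normalized away at parse time, so `parsed.port` is the empty string for them. A standalone demonstration (not project code):

```typescript
// The WHATWG URL parser drops scheme-default ports, so .port is "" for them.
const explicit = new URL("http://my-host.ts.net:3100");
const defaulted = new URL("https://paperclip.example:443/");
console.log(explicit.port);  // "3100"
console.log(defaulted.port); // "" — 443 is the https default
```

This is why the new guard checks `parsed.port` directly instead of matching loopback hostnames: a URL without an explicit (non-default) port is treated as stable and returned unchanged.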

View File

@@ -59,11 +59,11 @@ cp .env.example .env
# DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
```
Run migrations (once the migration generation issue is fixed) or use `drizzle-kit push`:
Run migrations:
```sh
DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip \
npx drizzle-kit push
pnpm db:migrate
```
Start the server:
@@ -100,37 +100,27 @@ postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:
### Configure
Set `DATABASE_URL` in your `.env`:
For the application runtime, use a direct PostgreSQL connection unless the database client has explicit prepared-statement configuration for your pooling mode:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
For hosted deployments that use a pooled runtime URL, set
`DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for
startup schema checks/migrations and plugin namespace migrations, while the app
continues to use `DATABASE_URL` for runtime queries:
If you later run the app with a pooled runtime URL, set `DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for startup schema checks/migrations and plugin namespace migrations, while the app continues to use `DATABASE_URL` for runtime queries:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
DATABASE_MIGRATION_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
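One way to read the two-variable split, sketched in TypeScript (the helper name is illustrative; only `DATABASE_URL` and `DATABASE_MIGRATION_URL` come from the docs):

```ts
// Runtime queries always use DATABASE_URL; startup schema checks and plugin
// namespace migrations prefer the direct DATABASE_MIGRATION_URL when set.
function resolveDbUrls(env: Record<string, string | undefined>) {
  const runtimeUrl = env.DATABASE_URL;
  const migrationUrl = env.DATABASE_MIGRATION_URL ?? runtimeUrl;
  return { runtimeUrl, migrationUrl };
}

const urls = resolveDbUrls({
  DATABASE_URL: "postgres://app@pooler:6543/postgres",
  DATABASE_MIGRATION_URL: "postgres://app@direct:5432/postgres",
});
console.log(urls.migrationUrl); // the direct 5432 connection
```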
If using connection pooling (port 6543), the `postgres` client must disable prepared statements. Update `packages/db/src/client.ts`:
```ts
import postgres from "postgres";
import { drizzle as drizzlePg } from "drizzle-orm/postgres-js";
import * as schema from "./schema";

export function createDb(url: string) {
  // Transaction pooling cannot hold prepared statements across connections.
  const sql = postgres(url, { prepare: false });
  return drizzlePg(sql, { schema });
}
```
If your hosted database requires transaction-pooling-only connections, use a direct or session-pooled connection for Paperclip until runtime pooling support is documented in this guide. Do not edit database client source files as part of deployment setup.
### Push the schema
```sh
# Use the direct connection (port 5432) for schema changes
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@...5432/postgres \
npx drizzle-kit push
pnpm db:migrate
```
### Free tier limits
@@ -153,6 +143,22 @@ The database mode is controlled by `DATABASE_URL`:
Your Drizzle schema (`packages/db/src/schema/`) stays the same regardless of mode.
## Plugin database namespaces
The plugin runtime tracks plugin-owned database namespaces and migrations in `plugin_database_namespaces` and `plugin_migrations`. Hosted deployments that separate runtime and migration connections should set `DATABASE_MIGRATION_URL`; plugin namespace migration work uses the migration connection when present.
## Backups
Paperclip supports automatic and manual logical database backups. These dumps include
non-system database schemas such as `public`, the Drizzle migration journal, and
plugin-owned database schemas. See `doc/DEVELOPING.md` for the current
`paperclipai db:backup` / `pnpm db:backup` commands and backup retention
configuration.
Database backups do not include non-database instance files such as local-disk
uploads, workspace files, or the local encrypted secrets master key. Back those paths
up separately when you need full instance disaster recovery.
## Secret storage
Paperclip stores secret metadata and versions in:


@@ -43,6 +43,8 @@ This starts:
`pnpm dev` and `pnpm dev:once` are now idempotent for the current repo and instance: if the matching Paperclip dev runner is already alive, Paperclip reports the existing process instead of starting a duplicate.
Issue execution may also use project execution workspace policies and workspace runtime services for per-project worktrees, preview servers, and managed dev commands. Configure those through the project workspace/runtime surfaces rather than starting long-running unmanaged processes when a task needs a reusable service.
## Storybook
The board UI Storybook keeps stories and Storybook config under `ui/storybook/` so component review files stay out of the app source routes.
@@ -113,6 +115,8 @@ pnpm test:release-smoke
These browser suites are intended for targeted local verification and CI, not the default agent/human test command.
For normal issue work, start with the smallest targeted check that proves the change. Reserve repo-wide typecheck/build/test runs for PR-ready handoff or changes broad enough that narrow checks do not cover the risk.
## One-Command Local Run
For a first-time local install, you can bootstrap and run in one command:
@@ -194,6 +198,8 @@ For `codex_local`, Paperclip also manages a per-company Codex home under the ins
If the `codex` CLI is not installed or not on `PATH`, `codex_local` agent runs fail at execution time with a clear adapter error. Quota polling uses a short-lived `codex app-server` subprocess: when `codex` cannot be spawned, that provider reports `ok: false` in aggregated quota results and the API server keeps running (it must not exit on a missing binary).
Local adapters require their corresponding CLI/session setup on the machine running Paperclip. External adapters are installed through the adapter/plugin flow and should not require hardcoded imports in `server/` or `ui/`.
## Worktree-local Instances
When developing from multiple git worktrees, do not point two Paperclip servers at the same embedded PostgreSQL data directory.
@@ -415,7 +421,9 @@ If you set `DATABASE_URL`, the server will use that instead of embedded PostgreS
## Automatic DB Backups
Paperclip can run automatic DB backups on a timer. Defaults:
Paperclip can run automatic logical database backups on a timer. These backups cover
non-system database schemas, including migration history and plugin-owned database
schemas. Defaults:
- enabled
- every 60 minutes
@@ -443,6 +451,10 @@ Environment overrides:
- `PAPERCLIP_DB_BACKUP_RETENTION_DAYS=<days>`
- `PAPERCLIP_DB_BACKUP_DIR=/absolute/or/~/path`
DB backups are not full instance filesystem backups. For full local disaster
recovery, also back up local storage files and the local encrypted secrets key if
those providers are enabled.
## Secrets in Dev
Agent env vars now support secret references. By default, secret values are stored with local encryption and only secret refs are persisted in agent config.


@@ -23,7 +23,7 @@ Paperclip is the command, communication, and control plane for a company of AI a
- **Track work in real time** — see at any moment what every agent is working on
- **Control costs** — token salary budgets per agent, spend tracking, burn rate
- **Align to goals** — agents see how their work serves the bigger mission
- **Store company knowledge** — a shared brain for the organization
- **Preserve work context** — comments, documents, work products, attachments, and company state stay attached to the work
## Architecture
@@ -36,17 +36,20 @@ The central nervous system. Manages:
- Agent registry and org chart
- Task assignment and status
- Budget and token spend tracking
- Company knowledge base
- Issue comments, documents, work products, attachments, and company state
- Goal hierarchy (company → team → agent → task)
- Heartbeat monitoring — know when agents are alive, idle, or stuck
It also enforces execution-control semantics such as single-assignee issues, atomic checkout and execution locks, blockers, recovery issues, and workspace/runtime controls.
### 2. Execution Services (adapters)
Agents run externally and report into the control plane. An agent is just Python code that gets kicked off and does work. Adapters connect different execution environments:
Agents run externally and report into the control plane. Adapters connect different execution environments and define how a heartbeat is invoked, observed, and cancelled:
- **OpenClaw** — initial adapter target
- **Heartbeat loop** — simple custom Python that loops, checks in, does work
- **Others** — any runtime that can call an API
- **Local CLI/session adapters** — built-in adapters for tools such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor
- **HTTP/process-style adapters** — command or webhook/API integrations for custom runtimes
- **OpenClaw gateway** — integration for OpenClaw-style remote agents
- **External adapter plugins** — dynamically loaded adapters installed outside the core app
The control plane doesn't run agents. It orchestrates them. Agents run wherever they run and phone home.


@@ -32,12 +32,14 @@ Then you define who reports to the CEO: a CTO managing programmers, a CMO managi
### Agent Execution
There are two fundamental modes for running an agent's heartbeat:
Paperclip supports several ways to run an agent's heartbeat:
1. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
2. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." (OpenClaw hooks work this way.)
1. **Local CLI/session adapters** — Paperclip starts or resumes local coding-tool sessions such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor, then tracks the run.
2. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
3. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." OpenClaw-style hooks work this way.
4. **External adapter plugins** — Paperclip loads adapter packages through the plugin/adapter flow so self-hosted installs can add runtimes without hardcoding them in core.
We provide sensible defaults — a default agent that shells out to Claude Code or Codex with your configuration, remembers session IDs, runs basic scripts. But you can plug in anything.
Agent runs can use project and execution workspaces, managed runtime services such as preview/dev servers, adapter-specific session state, and HTTP/webhook-style execution. We provide sensible defaults, but the adapter is still the boundary: if a runtime can be invoked, observed, and authorized, Paperclip can coordinate it.
### Task Management
@@ -54,7 +56,7 @@ I am researching the Facebook ads Granola uses (current task)
Tasks have parentage. Every task exists in service of a parent task, all the way up to the company goal. This is what keeps autonomous agents aligned — they can always answer "why am I doing this?"
More detailed task structure TBD.
The current issue model includes stable issue identifiers, parent/sub-issues, blockers, a single assignee, comments, issue documents, attachments and work products, and review/approval handoffs. That structure keeps work inspectable by both the board and agents while still allowing agents to decompose work into smaller tasks.
## Principles
@@ -115,7 +117,7 @@ Paperclip's core identity is a **control plane for autonomous AI companies**,
- Do not make the core product a general chat app. The current product definition is explicitly task/comment-centric and “not a chatbot,” and that boundary is valuable.
- Do not build a complete Jira/GitHub replacement. The repo/docs already position Paperclip as organization orchestration, not focused on pull-request review.
- Do not build enterprise-grade RBAC first. The current V1 spec still treats multi-board governance and fine-grained human permissions as out of scope, so the first multi-user version should be coarse and company-scoped.
- Do not build enterprise-grade RBAC first. Paperclip now has authenticated mode, company memberships, instance roles, and permission grants, but fine-grained enterprise governance should remain secondary to the core company control plane.
- Do not lead with raw bash logs and transcripts. Default view should be human-readable intent/progress, with raw detail beneath.
- Do not force users to understand provider/API-key plumbing unless absolutely necessary. There are active onboarding/auth issues already; friction here is clearly real.
@@ -136,11 +138,14 @@ Paperclip's core identity is a **control plane for autonomous AI companies**,
5. **Output-first**
Work is not done until the user can see the result: file, document, preview link, screenshot, plan, or PR.
6. **Local-first, cloud-ready**
6. **Execution visibility without log worship**
Active runs, recovery issues, productivity review states, blockers, and work products should be first-class surfaces. Raw transcripts are available when needed, but they are not the primary product surface.
7. **Local-first, cloud-ready**
The mental model should not change between local solo use and shared/private or public/cloud deployment.
7. **Safe autonomy**
8. **Safe autonomy**
Auto mode is allowed; hidden token burn is not.
8. **Thin core, rich edges**
9. **Thin core, rich edges**
Put optional chat, knowledge, and special surfaces into plugins/extensions rather than bloating the control plane.


@@ -143,6 +143,13 @@ This keeps the default install path unchanged while allowing explicit installs w
npx paperclipai@canary onboard
```
The release script now verifies two things after a canary publish:
- the `canary` dist-tag resolves to the version that was just published
- every published internal `@paperclipai/*` dependency referenced by that manifest exists on npm
It also treats `latest -> canary` as a failure by default, because npm metadata can otherwise leave the default install path pointing at an unreleased canary dependency graph. Only pass `./scripts/release.sh canary --allow-canary-latest` when that `latest` behavior is explicitly intended.
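A pure-logic sketch of those post-publish checks (function and field names are illustrative, not the release script's; the dist-tag data would come from `npm view <pkg> dist-tags --json`):

```ts
// Given npm dist-tags, report why a canary publish should be treated as failed.
function canaryPublishProblems(
  tags: Record<string, string>,
  justPublished: string,
  allowCanaryLatest = false,
): string[] {
  const problems: string[] = [];
  if (tags.canary !== justPublished) {
    problems.push(`canary resolves to ${tags.canary ?? "nothing"}, expected ${justPublished}`);
  }
  if (!allowCanaryLatest && (tags.latest ?? "").includes("-canary.")) {
    problems.push("latest points at a canary version; pass --allow-canary-latest if intended");
  }
  return problems;
}

console.log(canaryPublishProblems({ canary: "1.2.3-canary.1", latest: "1.2.2" }, "1.2.3-canary.1"));
// []
```

Keeping the network fetch out of the check function leaves the failure logic itself trivially testable.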
### Stable
Stable publishes use the npm dist-tag `latest`.


@@ -63,6 +63,8 @@ It:
- verifies the pushed commit
- computes the canary version for the current UTC date
- publishes under npm dist-tag `canary`
- verifies that `canary` resolves to the just-published version and that published internal dependencies exist on npm
- fails by default if npm leaves `latest` pointing at a canary; use `--allow-canary-latest` only when that state is intentional
- creates a git tag `canary/vYYYY.MDD.P-canary.N`
Users install canaries with:


@@ -1,7 +1,7 @@
# Paperclip V1 Implementation Spec
Status: Implementation contract for first release (V1)
Date: 2026-02-17
Date: 2026-04-28
Audience: Product, engineering, and agent-integration authors
Source inputs: `GOAL.md`, `PRODUCT.md`, `SPEC.md`, `DATABASE.md`, current monorepo code
@@ -37,8 +37,9 @@ These decisions close open questions from `SPEC.md` for V1.
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | No automatic reassignment; work recovery stays manual/explicit |
| Agent adapters | Built-in `process` and `http` adapters |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise create visible recovery issues or require human escalation (see `doc/execution-semantics.md`) |
| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
| Budget period | Monthly UTC calendar window |
| Budget enforcement | Soft alerts + hard limit auto-pause |
@@ -73,7 +74,7 @@ V1 implementation extends this baseline into a company-centric, governance-aware
## 5.2 Out of Scope (V1)
- Plugin framework and third-party extension SDK
- Cloud-grade plugin marketplace/distribution beyond the local/self-hosted plugin runtime
- Revenue/expense accounting beyond model/token costs
- Knowledge base subsystem
- Public marketplace (ClipHub)
@@ -123,6 +124,16 @@ Human auth tables (`users`, `sessions`, and provider-specific auth artifacts) ar
- `name` text not null
- `description` text null
- `status` enum: `active | paused | archived`
- `pause_reason` text null
- `paused_at` timestamptz null
- `issue_prefix` text not null
- `issue_counter` int not null
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- `attachment_max_bytes` int not null
- `require_board_approval_for_new_agents` boolean not null default false
- feedback sharing consent fields
- branding fields such as `brand_color`
Invariant: every business record belongs to exactly one company.
@@ -133,15 +144,21 @@ Invariant: every business record belongs to exactly one company.
- `name` text not null
- `role` text not null
- `title` text null
- `status` enum: `active | paused | idle | running | error | terminated`
- `icon` text null
- `status` enum: `active | paused | idle | running | error | pending_approval | terminated`
- `reports_to` uuid fk `agents.id` null
- `capabilities` text null
- `adapter_type` enum: `process | http`
- `adapter_type` text; built-ins include `process`, `http`, `claude_local`, `codex_local`, `gemini_local`, `opencode_local`, `pi_local`, `cursor`, and `openclaw_gateway`
- `adapter_config` jsonb not null
- `runtime_config` jsonb not null default `{}`
- `default_environment_id` uuid fk `environments.id` null
- `context_mode` enum: `thin | fat` default `thin`
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- pause fields: `pause_reason`, `paused_at`
- `permissions` jsonb not null default `{}`
- `last_heartbeat_at` timestamptz null
- `metadata` jsonb null
Invariants:
@@ -195,6 +212,7 @@ Invariant:
- `id` uuid pk
- `company_id` uuid fk not null
- `project_id` uuid fk `projects.id` null
- `project_workspace_id` uuid fk `project_workspaces.id` null
- `goal_id` uuid fk `goals.id` null
- `parent_id` uuid fk `issues.id` null
- `title` text not null
@@ -202,13 +220,22 @@ Invariant:
- `status` enum: `backlog | todo | in_progress | in_review | done | blocked | cancelled`
- `priority` enum: `critical | high | medium | low`
- `assignee_agent_id` uuid fk `agents.id` null
- `assignee_user_id` text null
- checkout/execution locks: `checkout_run_id`, `execution_run_id`, `execution_agent_name_key`, `execution_locked_at`
- `created_by_agent_id` uuid fk `agents.id` null
- `created_by_user_id` uuid fk `users.id` null
- identifier fields: `issue_number`, `identifier`
- origin fields: `origin_kind`, `origin_id`, `origin_run_id`, `origin_fingerprint`
- `request_depth` int not null default 0
- `billing_code` text null
- `assignee_adapter_overrides` jsonb null
- `execution_policy` jsonb null
- `execution_state` jsonb null
- execution workspace fields: `execution_workspace_id`, `execution_workspace_preference`, `execution_workspace_settings`
- `started_at` timestamptz null
- `completed_at` timestamptz null
- `cancelled_at` timestamptz null
- `hidden_at` timestamptz null
Invariants:
@@ -261,10 +288,10 @@ Invariant: each event must attach to agent and company; rollups are aggregation,
- `id` uuid pk
- `company_id` uuid fk not null
- `type` enum: `hire_agent | approve_ceo_strategy`
- `type` enum: `hire_agent | approve_ceo_strategy | budget_override_required | request_board_approval`
- `requested_by_agent_id` uuid fk `agents.id` null
- `requested_by_user_id` uuid fk `users.id` null
- `status` enum: `pending | approved | rejected | cancelled`
- `status` enum: `pending | revision_requested | approved | rejected | cancelled`
- `payload` jsonb not null
- `decision_note` text null
- `decided_by_user_id` uuid fk `users.id` null
@@ -363,6 +390,15 @@ Operational policy:
- `document_id` uuid fk not null
- `key` text not null (`plan`, `design`, `notes`, etc.)
## 7.16 Current Implementation Addenda
The current implementation includes additional V1-control-plane tables beyond the original February snapshot:
- Issue structure and review: `issue_relations` for blockers, `labels`/`issue_labels`, `issue_thread_interactions`, `issue_approvals`, `issue_execution_decisions`, `issue_work_products`, `issue_inbox_archives`, `issue_read_states`, and issue reference mention indexes.
- Execution and workspace control: `execution_workspaces`, `project_workspaces`, `workspace_runtime_services`, `workspace_operations`, `environments`, `environment_leases`, `agent_task_sessions`, `agent_runtime_state`, `agent_wakeup_requests`, heartbeat events, and watchdog decision tables.
- Plugins and routines: `plugins`, plugin config/state/entities/jobs/logs/webhooks, plugin database namespaces/migrations, plugin company settings, and `routines`.
- Access and operations: company memberships, instance roles, principal permission grants, invites, join requests, board API keys, CLI auth challenges, budget policies/incidents, feedback exports/votes, company skills, sidebar preferences, and company logos.
## 8. State Machines
## 8.1 Agent Status
@@ -395,7 +431,14 @@ Side effects:
- entering `done` sets `completed_at`
- entering `cancelled` sets `cancelled_at`
Detailed ownership, execution, blocker, and crash-recovery semantics are documented in `doc/execution-semantics.md`.
V1 non-terminal liveness rule:
- agent-owned `todo`, `in_progress`, `in_review`, and `blocked` issues must have a live execution path, an explicit waiting path, or an explicit recovery path
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery issue owns the next action
- a blocked chain is covered only when each unresolved leaf issue is live or explicitly waiting
- when Paperclip cannot safely infer the next action, it surfaces the problem through visible blocked/recovery work instead of silently completing or reassigning work
Detailed ownership, execution, blocker, active-run watchdog, crash-recovery, and non-terminal liveness semantics are documented in `doc/execution-semantics.md`.
## 8.3 Approval Status
@@ -556,6 +599,17 @@ Dashboard payload must include:
- `422` semantic rule violation
- `500` server error
## 10.10 Current Implementation API Addenda
The current app also exposes V1-supporting surfaces for:
- issue thread interactions (`suggest_tasks`, `ask_user_questions`, `request_confirmation`)
- issue approvals, issue references/search, labels, read state, inbox/archive state, and work products
- execution workspaces, project workspaces, workspace runtime services, and workspace operations
- routines and scheduled/API/webhook triggers
- plugin installation, configuration, state, jobs, logs, webhooks, and plugin database namespace migration
- company import/export preview/apply, feedback export/vote routes, instance backup/config routes, invites, join requests, memberships, and permission grants
## 11. Heartbeat and Adapter Contract
## 11.1 Adapter Interface
@@ -731,13 +785,14 @@ Required UX behaviors:
- Node 20+
- `DATABASE_URL` optional
- if unset, auto-use PGlite and push schema
- if unset, auto-use embedded PostgreSQL under `~/.paperclip/instances/default/db`
## 15.2 Migrations
- Drizzle migrations are the source of truth
- local/dev startup applies pending migrations automatically where supported
- `pnpm db:migrate` applies pending migrations manually
- no destructive migration in-place for V1 upgrade path
- provide migration script from existing minimal tables to company-scoped schema
## 15.3 Logging and Audit
@@ -792,6 +847,8 @@ A release candidate is blocked unless these pass:
## 18. Delivery Plan
Current implementation note: the milestones below describe the original V1 sequencing. Several systems originally framed as future work have since shipped or advanced materially, including issue documents/interactions, blockers, routines, execution workspaces, import/export portability, authenticated deployment modes, multi-user basics, and the local/self-hosted plugin runtime.
## Milestone 1: Company Core and Auth
- add `companies` and company scoping to existing entities
@@ -844,7 +901,7 @@ V1 is complete only when all criteria are true:
## 20. Post-V1 Backlog (Explicitly Deferred)
- plugin architecture
- cloud-grade plugin marketplace/distribution
- richer workflow-state customization per team
- milestones/labels/dependency graph depth beyond V1 minimum
- realtime transport optimization (SSE/WebSockets)

Binary files not shown (four added images, 174–177 KiB each).

@@ -1,7 +1,7 @@
# Execution Semantics
Status: Current implementation guide
Date: 2026-04-13
Date: 2026-04-26
Audience: Product and engineering
This document explains how Paperclip interprets issue assignment, issue status, execution runs, wakeups, parent/sub-issue structure, and blocker relationships.
@@ -150,11 +150,23 @@ Blocked issues should stay idle while blockers remain unresolved. Paperclip shou
If a parent is truly waiting on a child, model that with blockers. Do not rely on the parent/child relationship alone.
## 7. Consistent Execution Path Rules
## 7. Non-Terminal Issue Liveness Contract
For agent-assigned, non-terminal, actionable issues, Paperclip should not leave work in a state where nobody is working it and nothing will wake it.
For agent-owned, non-terminal issues, Paperclip should never leave work in a state where nobody is responsible for the next move and nothing will wake or surface it.
The relevant execution path depends on status.
This is a visibility contract, not an auto-completion contract. If Paperclip cannot safely infer the next action, it should surface the ambiguity with a blocked state, a visible comment, or an explicit recovery issue. It must not silently mark work done from prose comments or guess that a dependency is complete.
An issue is healthy when the product can answer "what moves this forward next?" without requiring a human to reconstruct intent from the whole thread. An issue is stalled when it is non-terminal but has no live execution path, no explicit waiting path, and no recovery path.
The valid action-path primitives are:
- an active run linked to the issue
- a queued wake or continuation that can be delivered to the responsible agent
- a typed execution-policy participant, such as `executionState.currentParticipant`
- a pending issue-thread interaction or linked approval that is waiting for a specific responder
- a human owner via `assigneeUserId`
- a first-class blocker chain whose unresolved leaf issues are themselves healthy
- an open explicit recovery issue that names the owner and action needed to restore liveness
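The action-path primitives above reduce to a single predicate. A hypothetical sketch (interface and field names are invented for illustration, not the repo's schema):

```ts
// An issue is "healthy" under the liveness contract when at least one
// action path can answer "what moves this forward next?".
interface ActionPaths {
  activeRun: boolean;
  queuedWake: boolean;                   // wake or continuation deliverable to the agent
  typedParticipant: boolean;             // e.g. executionState.currentParticipant
  pendingInteractionOrApproval: boolean; // waiting on a specific responder
  assigneeUserId: string | null;         // human owner
  healthyBlockerLeaves: boolean;         // every unresolved leaf is live or waiting
  openRecoveryIssue: boolean;            // names the owner and the restoring action
}

function hasActionPath(p: ActionPaths): boolean {
  return (
    p.activeRun ||
    p.queuedWake ||
    p.typedParticipant ||
    p.pendingInteractionOrApproval ||
    p.assigneeUserId !== null ||
    p.healthyBlockerLeaves ||
    p.openRecoveryIssue
  );
}
```

A stalled issue is simply one where this predicate is false while the issue remains non-terminal; the per-status rules then narrow which subsets count as healthy.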
### Agent-assigned `todo`
@@ -162,9 +174,11 @@ This is dispatch state: ready to start, not yet actively claimed.
A healthy dispatch state means at least one of these is true:
- the issue already has a queued/running wake path
- the issue is intentionally resting in `todo` after a successful agent heartbeat, not after an interrupted dispatch
- the issue has been explicitly surfaced as stranded
- the issue already has a queued wake path
- the issue is intentionally resting in `todo` after a completed agent heartbeat, with no interrupted dispatch evidence
- the issue has been explicitly surfaced as stranded through a visible blocked/recovery path
An assigned `todo` issue is stalled when dispatch was interrupted, no wake remains queued or running, and no recovery path has been opened.
### Agent-assigned `in_progress`
@@ -174,7 +188,39 @@ A healthy active-work state means at least one of these is true:
- there is an active run for the issue
- there is already a queued continuation wake
- the issue has been explicitly surfaced as stranded
- there is an open explicit recovery issue for the lost execution path
An agent-owned `in_progress` issue is stalled when it has no active run, no queued continuation, and no explicit recovery surface. A still-running but silent process is not automatically stalled; it is handled by the active-run watchdog contract.
### `in_review`
This is review/approval state: execution is paused because the next move belongs to a reviewer, approver, board user, or recovery owner.
A healthy `in_review` issue has at least one valid action path:
- a typed execution-policy participant who can approve or request changes
- a pending issue-thread interaction or linked approval waiting for a named responder
- a human owner via `assigneeUserId`
- an active run or queued wake that is expected to process the review state
- an open explicit recovery issue for an ambiguous review handoff
Agent-assigned `in_review` with no typed participant is only healthy when one of the other paths exists. Assignment to the same agent that produced the handoff is not, by itself, a review path.
An `in_review` issue is stalled when it has no typed participant, no pending interaction or approval, no user owner, no active run, no queued wake, and no explicit recovery issue. Paperclip should surface that state as recovery work rather than silently completing the issue or leaving blocker chains parked indefinitely.
### `blocked`
This is explicit waiting state.
A healthy `blocked` issue has an explicit waiting path:
- first-class blockers exist, and each unresolved leaf has a valid action path under this contract
- the issue is blocked on an explicit recovery issue that itself has a live or waiting path
- the issue is waiting on a pending interaction, linked approval, human owner, or clearly named external owner/action
A blocker chain is covered only when its unresolved leaf is live or explicitly waiting. An intermediate `blocked` issue does not make the chain healthy by itself.
A `blocked` issue is stalled when the unresolved blocker leaf has no active run, queued wake, typed participant, pending interaction or approval, user owner, external owner/action, or recovery issue. In that case the parent should show the first stalled leaf instead of presenting the dependency as calmly covered.
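A hypothetical walk of that rule (types and traversal are illustrative, not the repo's data model): return the first unresolved leaf with no action path, rather than treating an intermediate blocked node as coverage.

```ts
interface BlockerNode {
  live: boolean;            // has a valid action path under the contract
  blockers: BlockerNode[];  // unresolved blockers only
}

// Returns the first stalled leaf, or null when every unresolved leaf is covered.
function firstStalledLeaf(node: BlockerNode): BlockerNode | null {
  if (node.blockers.length === 0) return node.live ? null : node;
  for (const child of node.blockers) {
    const stalled = firstStalledLeaf(child);
    if (stalled) return stalled;
  }
  return null; // every leaf below is live or explicitly waiting
}
```

Note that an intermediate node's own liveness never short-circuits the walk; only leaves decide whether the chain is covered.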
## 8. Crash and Restart Recovery
@@ -218,15 +264,83 @@ This is an active-work continuity recovery.
Startup recovery and periodic recovery are different from normal wakeup delivery.
On startup and on the periodic recovery loop, Paperclip now does three things in sequence:
On startup and on the periodic recovery loop, Paperclip now does four things in sequence:
1. reap orphaned `running` runs
2. resume persisted `queued` runs
3. reconcile stranded assigned work
4. scan silent active runs and create or update explicit watchdog review issues
That last step is what closes the gap where issue state survives a crash but the wake/run path does not.
The stranded-work pass closes the gap where issue state survives a crash but the wake/run path does not. The silent-run scan covers the separate case where a live process exists but has stopped producing observable output.
## 10. What This Does Not Mean
## 10. Silent Active-Run Watchdog
An active run can still be unhealthy even when its process is `running`. Paperclip treats prolonged output silence as a watchdog signal, not as proof that the run is failed.
The recovery service owns this contract:
- classify active-run output silence as `ok`, `suspicious`, `critical`, `snoozed`, or `not_applicable`
- collect bounded evidence from run logs, recent run events, child issues, and blockers
- preserve redaction and truncation before evidence is written to issue descriptions
- create at most one open `stale_active_run_evaluation` issue per run
- honor active snooze decisions before creating more review work
- build the `outputSilence` summary shown by live-run and active-run API responses
Suspicious silence creates a medium-priority review issue for the selected recovery owner. Critical silence raises that review issue to high priority and blocks the source issue on the explicit evaluation task without cancelling the active process.
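As an illustration only — the thresholds below are invented, and the real classification is evidence-driven — the silence classes could be modeled like this:

```ts
type SilenceClass = "ok" | "suspicious" | "critical" | "snoozed" | "not_applicable";

// Hypothetical thresholds; the recovery service derives its own from run evidence.
function classifySilence(
  minutesSilent: number | null,
  snoozedUntil?: Date,
  now = new Date(),
): SilenceClass {
  if (minutesSilent === null) return "not_applicable"; // no observable output stream
  if (snoozedUntil && snoozedUntil > now) return "snoozed";
  if (minutesSilent < 30) return "ok";
  if (minutesSilent < 120) return "suspicious"; // medium-priority review issue
  return "critical"; // high priority; blocks the source issue on the evaluation
}
```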
Watchdog decisions are explicit operator/recovery-owner decisions:
- `snooze` records an operator-chosen future quiet-until time and suppresses scan-created review work during that window
- `continue` records that the current evidence is acceptable, does not cancel or mutate the active run, and sets a 30-minute default re-arm window before the watchdog evaluates the still-silent run again
- `dismissed_false_positive` records why the review was not actionable
Operators should prefer `snooze` for known time-bounded quiet periods. `continue` is only a short acknowledgement of the current evidence; if the run remains silent after the re-arm window, the periodic watchdog scan can create or update review work again.
The board can record watchdog decisions. The assigned owner of the watchdog evaluation issue can also record them. Other agents cannot.
## 11. Auto-Recover vs Explicit Recovery vs Human Escalation
Paperclip uses three different recovery outcomes, depending on how much it can safely infer.
### Auto-Recover
Auto-recovery is allowed when ownership is clear and the control plane only lost execution continuity.
Examples:
- requeue one dispatch wake for an assigned `todo` issue whose latest run failed, timed out, or was cancelled
- requeue one continuation wake for an assigned `in_progress` issue whose live execution path disappeared
- assign an orphan blocker back to its creator when that blocker is already preventing other work
Auto-recovery preserves the existing owner. It does not choose a replacement agent.
### Explicit Recovery Issue
Paperclip creates an explicit recovery issue when the system can identify a problem but cannot safely complete the work itself.
Examples:
- automatic stranded-work retry was already exhausted
- a dependency graph has an invalid/uninvokable owner, unassigned blocker, or invalid review participant
- an active run is silent past the watchdog threshold
The source issue remains visible and blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, or record the reason it is a false positive.
Instance-level issue-graph liveness auto-recovery is disabled by default. When enabled, its lookback window means "dependency paths updated within the last N hours"; older findings remain advisory and are counted as outside the configured lookback instead of creating recovery issues automatically. This is an operator noise control, not the older staleness delay for determining whether a chain is old enough to surface.
### Human Escalation
Human escalation is required when the next safe action depends on board judgment, budget/approval policy, or information unavailable to the control plane.
Examples:
- all candidate recovery owners are paused, terminated, pending approval, or budget-blocked
- the issue is human-owned rather than agent-owned
- the run is intentionally quiet but needs an operator decision before cancellation or continuation
In these cases Paperclip should leave a visible issue/comment trail instead of silently retrying.
## 12. What This Does Not Mean
These semantics do not change V1 into an auto-reassignment system.
@@ -240,9 +354,10 @@ The recovery model is intentionally conservative:
- preserve ownership
- retry once when the control plane lost execution continuity
- create explicit recovery work when the system can identify a bounded recovery owner/action
- escalate visibly when the system cannot safely keep going
## 11. Practical Interpretation
## 13. Practical Interpretation
For a board operator, the intended meaning is:

docker/.env.aws.example Normal file

@@ -0,0 +1,30 @@
# AWS ECS Fargate deployment environment
# Copy to .env.aws and fill in values before deploying
#
# Secrets (DATABASE_URL, BETTER_AUTH_SECRET, ANTHROPIC_API_KEY, OPENAI_API_KEY,
# GITHUB_TOKEN) are injected via AWS Secrets Manager — do NOT set them here.
# Deployment mode
PAPERCLIP_DEPLOYMENT_MODE=authenticated
PAPERCLIP_DEPLOYMENT_EXPOSURE=public
PAPERCLIP_PUBLIC_URL=https://paperclip.example.com
# Server
HOST=0.0.0.0
PORT=3100
NODE_ENV=production
SERVE_UI=true
# Paperclip paths
PAPERCLIP_HOME=/paperclip
PAPERCLIP_INSTANCE_ID=default
PAPERCLIP_CONFIG=/paperclip/instances/default/config.json
# Auto-apply migrations on startup
PAPERCLIP_MIGRATION_AUTO_APPLY=true
# Enable heartbeat scheduler for remote agents
HEARTBEAT_SCHEDULER_ENABLED=true
# Post-deploy hardening (uncomment after first user signs up)
# PAPERCLIP_AUTH_DISABLE_SIGN_UP=true

docker/ecs-task-definition.json Normal file

@@ -0,0 +1,90 @@
{
"family": "paperclip-server",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "2048",
"memory": "4096",
"executionRoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/paperclip-ecs-execution",
"taskRoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/paperclip-ecs-task",
"containerDefinitions": [
{
"name": "paperclip-server",
"image": "<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/paperclip-server:latest",
"essential": true,
"portMappings": [
{
"containerPort": 3100,
"protocol": "tcp"
}
],
"environment": [
{ "name": "NODE_ENV", "value": "production" },
{ "name": "HOST", "value": "0.0.0.0" },
{ "name": "PORT", "value": "3100" },
{ "name": "SERVE_UI", "value": "true" },
{ "name": "PAPERCLIP_HOME", "value": "/paperclip" },
{ "name": "PAPERCLIP_INSTANCE_ID", "value": "default" },
{ "name": "PAPERCLIP_CONFIG", "value": "/paperclip/instances/default/config.json" },
{ "name": "PAPERCLIP_DEPLOYMENT_MODE", "value": "authenticated" },
{ "name": "PAPERCLIP_DEPLOYMENT_EXPOSURE", "value": "public" },
{ "name": "PAPERCLIP_PUBLIC_URL", "value": "https://<DOMAIN>" },
{ "name": "PAPERCLIP_MIGRATION_AUTO_APPLY", "value": "true" },
{ "name": "HEARTBEAT_SCHEDULER_ENABLED", "value": "true" }
],
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/database-url"
},
{
"name": "BETTER_AUTH_SECRET",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/better-auth-secret"
},
{
"name": "ANTHROPIC_API_KEY",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/anthropic-api-key"
},
{
"name": "OPENAI_API_KEY",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/openai-api-key"
},
{
"name": "GITHUB_TOKEN",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/github-token"
}
],
"mountPoints": [
{
"sourceVolume": "paperclip-data",
"containerPath": "/paperclip",
"readOnly": false
}
],
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:3100/api/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/paperclip",
"awslogs-region": "<REGION>",
"awslogs-stream-prefix": "server"
}
}
}
],
"volumes": [
{
"name": "paperclip-data",
"efsVolumeConfiguration": {
"fileSystemId": "<EFS_ID>",
"rootDirectory": "/",
"transitEncryption": "ENABLED"
}
}
]
}

docs/deploy/aws-ecs.md Normal file

@@ -0,0 +1,580 @@
---
title: AWS ECS Fargate
summary: Deploy Paperclip to AWS using ECS Fargate, RDS Postgres, and EFS
---
Deploy Paperclip to AWS with ECS Fargate (compute), RDS Postgres 17 (database), and EFS (persistent storage). This guide uses the AWS CLI and produces a single-task ECS service behind an ALB with HTTPS.
## Prerequisites
- AWS CLI v2 configured with a profile that has admin-level permissions
- Docker installed locally (for building and pushing the image)
- A registered domain with DNS you control (for the TLS certificate)
- The Paperclip repo cloned locally
Set these shell variables for the rest of the guide:
```bash
export AWS_REGION=us-east-1
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export PAPERCLIP_DOMAIN=paperclip.example.com # your domain
export DB_PASSWORD=$(openssl rand -base64 24 | tr -d '/+=' | head -c 32)
export AUTH_SECRET=$(openssl rand -base64 32)
```
## 1. Create ECR Repository
```bash
aws ecr create-repository \
--repository-name paperclip-server \
--image-scanning-configuration scanOnPush=true \
--region $AWS_REGION
```
## 2. Build and Push Docker Image
```bash
cd /path/to/paperclip
# Authenticate Docker to ECR
aws ecr get-login-password --region $AWS_REGION \
| docker login --username AWS --password-stdin \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
# Build
docker build -t paperclip-server .
# Tag and push
docker tag paperclip-server:latest \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
docker push \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
```
## 3. Networking (VPC, Subnets, Security Groups)
Use the default VPC or create a dedicated one. The guide assumes the default VPC, which provides public subnets in at least two AZs.
```bash
# Get default VPC
VPC_ID=$(aws ec2 describe-vpcs \
--filters Name=isDefault,Values=true \
--query 'Vpcs[0].VpcId' --output text)
# Get two public subnets (for ALB)
SUBNET_IDS=$(aws ec2 describe-subnets \
--filters Name=vpc-id,Values=$VPC_ID \
--query 'Subnets[?MapPublicIpOnLaunch==`true`] | [0:2].SubnetId' \
--output text)
SUBNET_1=$(echo $SUBNET_IDS | awk '{print $1}')
SUBNET_2=$(echo $SUBNET_IDS | awk '{print $2}')
```
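The ALB created later needs one subnet per availability zone, so it is worth verifying that subnet discovery actually returned two distinct IDs. The following defensive check is not part of the original steps, just a suggested guard:

```bash
# Guard: the ALB requires two public subnets in distinct availability zones.
require_two_subnets() {
  # Succeeds only when both arguments are non-empty and different.
  [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}
if ! require_two_subnets "$SUBNET_1" "$SUBNET_2"; then
  echo "error: need two distinct public subnets before continuing" >&2
fi
```

If the check fails, create a second public subnet in another AZ (or switch to a dedicated VPC) before proceeding.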
Create security groups:
```bash
# ALB security group — inbound HTTPS
ALB_SG=$(aws ec2 create-security-group \
--group-name paperclip-alb \
--description "Paperclip ALB" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $ALB_SG \
--protocol tcp --port 443 --cidr 0.0.0.0/0
# Also open port 80 so the ALB can accept HTTP and redirect to HTTPS
aws ec2 authorize-security-group-ingress \
--group-id $ALB_SG \
--protocol tcp --port 80 --cidr 0.0.0.0/0
# ECS task security group — inbound from ALB only
ECS_SG=$(aws ec2 create-security-group \
--group-name paperclip-ecs \
--description "Paperclip ECS tasks" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $ECS_SG \
--protocol tcp --port 3100 \
--source-group $ALB_SG
# RDS security group — inbound from ECS only
RDS_SG=$(aws ec2 create-security-group \
--group-name paperclip-rds \
--description "Paperclip RDS" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $RDS_SG \
--protocol tcp --port 5432 \
--source-group $ECS_SG
# EFS security group — inbound NFS from ECS only
EFS_SG=$(aws ec2 create-security-group \
--group-name paperclip-efs \
--description "Paperclip EFS" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $EFS_SG \
--protocol tcp --port 2049 \
--source-group $ECS_SG
```
## 4. Create RDS Postgres Instance
```bash
# Create a DB subnet group that spans our two subnets so RDS can place
# the instance (custom VPCs have no default group; an explicit one is
# harmless in the default VPC too).
aws rds create-db-subnet-group \
--db-subnet-group-name paperclip-db-subnet \
--db-subnet-group-description "Paperclip RDS subnets" \
--subnet-ids $SUBNET_1 $SUBNET_2
aws rds create-db-instance \
--db-instance-identifier paperclip-db \
--db-instance-class db.t4g.micro \
--engine postgres \
--engine-version 17 \
--master-username paperclip \
--master-user-password "$DB_PASSWORD" \
--allocated-storage 20 \
--storage-type gp3 \
--vpc-security-group-ids $RDS_SG \
--db-subnet-group-name paperclip-db-subnet \
--no-publicly-accessible \
--backup-retention-period 7 \
--no-multi-az \
--db-name paperclip \
--region $AWS_REGION
# Wait for it to become available (takes 5-10 min)
aws rds wait db-instance-available \
--db-instance-identifier paperclip-db
# Get the endpoint
RDS_ENDPOINT=$(aws rds describe-db-instances \
--db-instance-identifier paperclip-db \
--query 'DBInstances[0].Endpoint.Address' --output text)
DATABASE_URL="postgresql://paperclip:${DB_PASSWORD}@${RDS_ENDPOINT}:5432/paperclip"
```
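The `DATABASE_URL` embeds the password directly. That only works because the password generated earlier strips the characters (`/`, `+`, `=`) that would otherwise need percent-encoding in a URL. A standalone recheck of that property, runnable locally without any AWS access:

```bash
# Regenerate a password the same way the guide does and confirm it contains
# no URL-delimiter characters that would break the connection string.
DB_PASSWORD=$(openssl rand -base64 24 | tr -d '/+=' | head -c 32)
case "$DB_PASSWORD" in
  *[/+=]*) echo "needs percent-encoding" ;;
  *)       echo "URL-safe" ;;
esac
# prints: URL-safe
```

If you supply your own password instead, percent-encode any reserved characters before building `DATABASE_URL`.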
## 5. Create EFS Filesystem
```bash
EFS_ID=$(aws efs create-file-system \
--performance-mode generalPurpose \
--throughput-mode bursting \
--encrypted \
--tags Key=Name,Value=paperclip-data \
--query 'FileSystemId' --output text)
# Create mount targets in each subnet
for SUBNET in $SUBNET_1 $SUBNET_2; do
aws efs create-mount-target \
--file-system-id $EFS_ID \
--subnet-id $SUBNET \
--security-groups $EFS_SG
done
# Wait for every mount target to become available before using the filesystem
while aws efs describe-mount-targets --file-system-id $EFS_ID \
  --query 'MountTargets[?LifeCycleState!=`available`].MountTargetId' \
  --output text | grep -q .; do
  sleep 5
done
```
## 6. Store Secrets
```bash
aws secretsmanager create-secret \
--name paperclip/database-url \
--secret-string "$DATABASE_URL"
aws secretsmanager create-secret \
--name paperclip/anthropic-api-key \
--secret-string "YOUR_ANTHROPIC_KEY"
aws secretsmanager create-secret \
--name paperclip/better-auth-secret \
--secret-string "$AUTH_SECRET"
aws secretsmanager create-secret \
--name paperclip/openai-api-key \
--secret-string "YOUR_OPENAI_KEY"
aws secretsmanager create-secret \
--name paperclip/github-token \
--secret-string "YOUR_GITHUB_PAT"
```
## 7. IAM Roles
Create the ECS task execution role (pulls images, reads secrets) and the task role (application permissions).
```bash
# Task execution role
aws iam create-role \
--role-name paperclip-ecs-execution \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "ecs-tasks.amazonaws.com"},
"Action": "sts:AssumeRole"
}]
}'
aws iam attach-role-policy \
--role-name paperclip-ecs-execution \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
# Allow reading secrets
aws iam put-role-policy \
--role-name paperclip-ecs-execution \
--policy-name SecretsAccess \
--policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": ["secretsmanager:GetSecretValue"],
"Resource": "arn:aws:secretsmanager:'$AWS_REGION':'$AWS_ACCOUNT_ID':secret:paperclip/*"
}]
}'
# Task role (application — add permissions as needed)
aws iam create-role \
--role-name paperclip-ecs-task \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "ecs-tasks.amazonaws.com"},
"Action": "sts:AssumeRole"
}]
}'
```
## 8. ECS Cluster and Task Definition
```bash
aws ecs create-cluster --cluster-name paperclip
aws logs create-log-group --log-group-name /ecs/paperclip
```
Register the task definition using the template at `docker/ecs-task-definition.json`. Before registering, replace the placeholder values:
```bash
sed -e "s|<ACCOUNT_ID>|$AWS_ACCOUNT_ID|g" \
-e "s|<REGION>|$AWS_REGION|g" \
-e "s|<EFS_ID>|$EFS_ID|g" \
-e "s|<DOMAIN>|$PAPERCLIP_DOMAIN|g" \
docker/ecs-task-definition.json > /tmp/paperclip-task-def.json
aws ecs register-task-definition \
--cli-input-json file:///tmp/paperclip-task-def.json
```
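Registration will happily accept a task definition that still contains literal `<ACCOUNT_ID>`-style tokens, so a quick grep (not part of the original steps) is a cheap safety net before registering:

```bash
# Fail loudly if any <PLACEHOLDER> token survived the sed substitution.
TASK_DEF=/tmp/paperclip-task-def.json
if [ -f "$TASK_DEF" ] && grep -q '<[A-Z_]*>' "$TASK_DEF"; then
  echo "error: unreplaced placeholders remain:" >&2
  grep -o '<[A-Z_]*>' "$TASK_DEF" | sort -u >&2
fi
```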
## 9. ALB and TLS Certificate
Request a certificate (you must validate via DNS):
```bash
CERT_ARN=$(aws acm request-certificate \
--domain-name $PAPERCLIP_DOMAIN \
--validation-method DNS \
--query 'CertificateArn' --output text)
# Get the CNAME record to add to your DNS
aws acm describe-certificate \
--certificate-arn $CERT_ARN \
--query 'Certificate.DomainValidationOptions[0].ResourceRecord'
```
Add the CNAME to your DNS provider, then wait for validation:
```bash
aws acm wait certificate-validated --certificate-arn $CERT_ARN
```
Create the ALB:
```bash
ALB_ARN=$(aws elbv2 create-load-balancer \
--name paperclip-alb \
--subnets $SUBNET_1 $SUBNET_2 \
--security-groups $ALB_SG \
--scheme internet-facing \
--type application \
--query 'LoadBalancers[0].LoadBalancerArn' --output text)
ALB_DNS=$(aws elbv2 describe-load-balancers \
--load-balancer-arns $ALB_ARN \
--query 'LoadBalancers[0].DNSName' --output text)
# Target group
TG_ARN=$(aws elbv2 create-target-group \
--name paperclip-tg \
--protocol HTTP \
--port 3100 \
--vpc-id $VPC_ID \
--target-type ip \
--health-check-path /api/health \
--health-check-interval-seconds 30 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3 \
--query 'TargetGroups[0].TargetGroupArn' --output text)
# HTTPS listener
LISTENER_ARN=$(aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTPS \
--port 443 \
--certificates CertificateArn=$CERT_ARN \
--default-actions Type=forward,TargetGroupArn=$TG_ARN \
--query 'Listeners[0].ListenerArn' --output text)
# HTTP listener — redirect all :80 traffic to :443
HTTP_LISTENER_ARN=$(aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTP \
--port 80 \
--default-actions Type=redirect,RedirectConfig='{Protocol=HTTPS,Port=443,StatusCode=HTTP_301}' \
--query 'Listeners[0].ListenerArn' --output text)
```
Point your DNS to the ALB:
- Create a CNAME or ALIAS record for `$PAPERCLIP_DOMAIN` -> `$ALB_DNS`
## 10. Create ECS Service
```bash
aws ecs create-service \
--cluster paperclip \
--service-name paperclip-server \
--task-definition paperclip-server \
--desired-count 1 \
--launch-type FARGATE \
--deployment-configuration '{
"deploymentCircuitBreaker": {"enable": true, "rollback": true},
"maximumPercent": 200,
"minimumHealthyPercent": 100
}' \
--network-configuration '{
"awsvpcConfiguration": {
"subnets": ["'$SUBNET_1'", "'$SUBNET_2'"],
"securityGroups": ["'$ECS_SG'"],
"assignPublicIp": "ENABLED"
}
}' \
--load-balancers '[{
"targetGroupArn": "'$TG_ARN'",
"containerName": "paperclip-server",
"containerPort": 3100
}]'
```
> **Note:** `assignPublicIp: ENABLED` is required when tasks run in public subnets without a NAT Gateway, so they can reach ECR, Secrets Manager, and other AWS APIs over the internet. For private subnets, set it to `DISABLED` and ensure a NAT Gateway provides outbound access.
## 11. Verify Deployment
```bash
# Watch task come up
aws ecs describe-services \
--cluster paperclip \
--services paperclip-server \
--query 'services[0].{desired:desiredCount,running:runningCount,status:status}'
# Check task health
aws ecs list-tasks --cluster paperclip --service-name paperclip-server
TASK_ARN=$(aws ecs list-tasks --cluster paperclip --service-name paperclip-server --query 'taskArns[0]' --output text)
aws ecs describe-tasks --cluster paperclip --tasks $TASK_ARN \
--query 'tasks[0].{status:lastStatus,health:healthStatus}'
# Check logs
aws logs tail /ecs/paperclip --since 10m --follow
# Hit the health endpoint
curl -sf https://$PAPERCLIP_DOMAIN/api/health
```
**Healthy indicators:**
- ECS task status: `RUNNING`, health: `HEALTHY`
- Logs show `plugin job coordinator started` and `plugin-loader: loadAll complete`
- `/api/health` returns 200
## Post-Deploy Security Hardening
After the first user has signed up (which grants admin role), lock down the instance:
```bash
# Disable public sign-up (prevents unauthorized users from creating accounts)
# Add to the task definition environment section, then redeploy:
# { "name": "PAPERCLIP_AUTH_DISABLE_SIGN_UP", "value": "true" }
# Or update via Secrets Manager / task def override, then force new deployment
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--force-new-deployment
```
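One way to script the environment change is with `jq` against the task definition rendered in step 8 (the file path follows that step; `jq` is an extra local dependency the guide does not otherwise install):

```bash
# Append the sign-up kill switch to the container environment and write a
# hardened copy, ready for `aws ecs register-task-definition`.
SRC=/tmp/paperclip-task-def.json
if [ -f "$SRC" ]; then
  jq '.containerDefinitions[0].environment += [
        {"name": "PAPERCLIP_AUTH_DISABLE_SIGN_UP", "value": "true"}
      ]' "$SRC" > /tmp/paperclip-task-def-hardened.json
fi
```

Register the hardened file as a new revision, point the service at it, then force the new deployment as shown above.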
Use the invite flow (added in v2026.416.0) to grant access to additional users after sign-up is disabled.
## Deploying Updates
Build, push, and force a new deployment:
```bash
# Build and push new image
docker build -t paperclip-server .
docker tag paperclip-server:latest \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
docker push \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
# Roll out
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--force-new-deployment
# Watch the deployment
aws ecs describe-services \
  --cluster paperclip \
  --services paperclip-server \
  --query 'services[0].deployments[*].{status:status,running:runningCount,desired:desiredCount,rollout:rolloutState}'
# Or block until the rollout finishes (or fails and rolls back)
aws ecs wait services-stable --cluster paperclip --services paperclip-server
```
ECS performs a rolling update: starts a new task, waits for it to pass health checks, then drains the old task.
## Rollback
If the new deployment is unhealthy:
```bash
# ECS automatically rolls back if the new task fails health checks
# (circuit breaker is enabled in the service configuration above).
# To force rollback manually:
# 1. Find the previous task definition revision
aws ecs list-task-definitions \
--family-prefix paperclip-server \
--sort DESC \
--query 'taskDefinitionArns[0:3]'
# 2. Update service to the previous revision
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--task-definition paperclip-server:<PREVIOUS_REVISION>
```
## Scaling to Zero (Cost Savings)
Scale down when not in use:
```bash
# Stop
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--desired-count 0
# Start
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--desired-count 1
```
RDS can also be stopped (auto-restarts after 7 days):
```bash
aws rds stop-db-instance --db-instance-identifier paperclip-db
aws rds start-db-instance --db-instance-identifier paperclip-db
```
## Teardown
Remove all resources in reverse order:
```bash
# 1. ECS service and cluster
aws ecs update-service --cluster paperclip --service paperclip-server --desired-count 0
aws ecs delete-service --cluster paperclip --service paperclip-server --force
aws ecs delete-cluster --cluster paperclip
# 2. ALB and ACM cert
aws elbv2 delete-listener --listener-arn $HTTP_LISTENER_ARN
aws elbv2 delete-listener --listener-arn $LISTENER_ARN
aws elbv2 delete-target-group --target-group-arn $TG_ARN
aws elbv2 delete-load-balancer --load-balancer-arn $ALB_ARN
aws acm delete-certificate --certificate-arn $CERT_ARN
# 3. RDS (creates final snapshot)
aws rds delete-db-instance \
--db-instance-identifier paperclip-db \
--final-db-snapshot-identifier paperclip-db-final
aws rds wait db-instance-deleted --db-instance-identifier paperclip-db
aws rds delete-db-subnet-group --db-subnet-group-name paperclip-db-subnet
# 4. EFS (mount targets must be deleted first)
for MT in $(aws efs describe-mount-targets --file-system-id $EFS_ID --query 'MountTargets[*].MountTargetId' --output text); do
aws efs delete-mount-target --mount-target-id $MT
done
# Mount-target deletion is async; poll until none remain before deleting
# the filesystem, otherwise delete-file-system fails with FileSystemInUse.
echo "Waiting for mount targets to delete..."
while aws efs describe-mount-targets \
--file-system-id $EFS_ID \
--query 'MountTargets[0].MountTargetId' --output text 2>/dev/null | grep -q 'fsmt-'; do
sleep 5
done
aws efs delete-file-system --file-system-id $EFS_ID
# 5. Secrets
for s in database-url anthropic-api-key better-auth-secret openai-api-key github-token; do
aws secretsmanager delete-secret --secret-id paperclip/$s --force-delete-without-recovery
done
# 6. Security groups (after all dependents are gone)
for sg in $EFS_SG $RDS_SG $ECS_SG $ALB_SG; do
aws ec2 delete-security-group --group-id $sg
done
# 7. ECR
aws ecr delete-repository --repository-name paperclip-server --force
# 8. IAM roles
aws iam delete-role-policy --role-name paperclip-ecs-execution --policy-name SecretsAccess
aws iam detach-role-policy --role-name paperclip-ecs-execution \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
aws iam delete-role --role-name paperclip-ecs-execution
aws iam delete-role --role-name paperclip-ecs-task
# 9. Log group
aws logs delete-log-group --log-group-name /ecs/paperclip
```
## Cost Reference
| Service | Config | Monthly |
|---------|--------|---------|
| ECS Fargate | 2 vCPU, 4 GB, 24/7 | ~$70 |
| RDS Postgres | db.t4g.micro, 20 GB | ~$15 |
| ALB | 1 LCU average | ~$22 |
| NAT Gateway | 1 AZ (if using private subnets) | ~$35 |
| EFS | 1 GB Standard | ~$0.30 |
| Secrets Manager | 5 secrets | ~$2 |
| CloudWatch Logs | ~1 GB/mo | ~$0.50 |
| ECR | ~1 GB | ~$0.10 |
| **Total (public subnets, no NAT)** | | **~$110/mo** |
| **Total (private subnets + NAT)** | | **~$145/mo** |
Use Fargate Spot and scheduled scaling to 0 during off-hours to reduce to ~$60-85/mo.


@@ -40,7 +40,7 @@ Paperclip supports three deployment configurations, from zero-friction local to
- **Just trying Paperclip?** Use `local_trusted` (the default)
- **Sharing with a team on private network?** Use `authenticated` + `private`
- **Deploying to the cloud?** Use `authenticated` + `public`
- **Deploying to the cloud?** Use `authenticated` + `public` — see [AWS ECS Fargate guide](aws-ecs.md)
Set the mode during onboarding:


@@ -47,7 +47,7 @@ You do **not** need to tell the CEO to engage specific agents. After you approve
- **Breaks goals into concrete tasks** with clear descriptions, priorities, and acceptance criteria
- **Assigns tasks to the right agent** based on role and capabilities (e.g., engineering tasks go to the CTO or engineers, marketing tasks go to the CMO)
- **Creates subtasks** when work needs to be decomposed further
- **Hires new agents** when the team lacks capacity for a goal (subject to your approval)
- **Hires new agents** when the team lacks capacity for a goal, with hire approvals available when enabled in company settings
- **Monitors progress** on each heartbeat, checking task status and unblocking reports
- **Escalates to you** when it encounters something it can't resolve — budget issues, blocked approvals, or strategic ambiguity

Binary image added, not shown (76 KiB)

Binary image added, not shown (74 KiB)

@@ -57,9 +57,9 @@ The CEO is the primary delegator. When you set company goals, the CEO:
1. Creates a strategy and submits it for your approval
2. Breaks approved goals into tasks
3. Assigns tasks to agents based on their role and capabilities
4. Hires new agents when needed (subject to your approval)
4. Hires new agents when needed, with hire approvals available when you enable them
You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions (strategy, hiring) and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle.
You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions such as strategy, can enable hire approvals when you want a gate, and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle.
## Heartbeats


@@ -17,7 +17,7 @@
"typecheck": "pnpm run preflight:workspace-links && pnpm -r typecheck",
"test": "pnpm run test:run",
"test:watch": "pnpm run preflight:workspace-links && vitest",
"test:run": "pnpm run preflight:workspace-links && vitest run",
"test:run": "pnpm run preflight:workspace-links && node scripts/run-vitest-stable.mjs",
"db:generate": "pnpm --filter @paperclipai/db generate",
"db:migrate": "pnpm --filter @paperclipai/db migrate",
"issue-references:backfill": "pnpm run preflight:workspace-links && tsx scripts/backfill-issue-reference-mentions.ts",
@@ -35,13 +35,16 @@
"smoke:openclaw-join": "./scripts/smoke/openclaw-join.sh",
"smoke:openclaw-docker-ui": "./scripts/smoke/openclaw-docker-ui.sh",
"smoke:openclaw-sse-standalone": "./scripts/smoke/openclaw-sse-standalone.sh",
"smoke:terminal-bench-loop-skill": "node scripts/smoke/terminal-bench-loop-skill-smoke.mjs",
"test:release-registry": "node --test scripts/verify-release-registry-state.test.mjs",
"test:e2e": "npx playwright test --config tests/e2e/playwright.config.ts",
"test:e2e:headed": "npx playwright test --config tests/e2e/playwright.config.ts --headed",
"test:e2e:multiuser-authenticated": "npx playwright test --config tests/e2e/playwright-multiuser-authenticated.config.ts",
"evals:smoke": "cd evals/promptfoo && npx promptfoo@0.103.3 eval",
"test:release-smoke": "npx playwright test --config tests/release-smoke/playwright.config.ts",
"test:release-smoke:headed": "npx playwright test --config tests/release-smoke/playwright.config.ts --headed",
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts"
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts",
"perf:issue-chat-long-thread": "node scripts/measure-issue-chat-long-thread.mjs"
},
"devDependencies": {
"@playwright/test": "^1.58.2",


@@ -0,0 +1,128 @@
import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile as execFileCallback } from "node:child_process";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("command managed runtime", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("keeps the runtime overlay out of sandbox workspace sync by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-command-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(path.join(localWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
await writeFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "{\"keep\":true}\n", "utf8");
const calls: Array<{
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}> = [];
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
calls.push({ ...input });
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
const command = input.command === "sh" ? "/bin/sh" : input.command;
const args = [...(input.args ?? [])];
if (input.stdin != null && input.command === "sh" && args[0] === "-lc" && typeof args[1] === "string") {
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "claude",
workspaceLocalDir: localWorkspaceDir,
});
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
await expect(readFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
await mkdir(path.join(remoteWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(remoteWorkspaceDir, "README.md"), "remote workspace\n", "utf8");
await writeFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "{\"remote\":true}\n", "utf8");
await prepared.restoreWorkspace();
await expect(readFile(path.join(localWorkspaceDir, "README.md"), "utf8")).resolves.toBe("remote workspace\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).resolves
.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
});
});


@@ -0,0 +1,182 @@
import path from "node:path";
import {
prepareSandboxManagedRuntime,
type PreparedSandboxManagedRuntime,
type SandboxManagedRuntimeAsset,
type SandboxManagedRuntimeClient,
type SandboxRemoteExecutionSpec,
} from "./sandbox-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
export interface CommandManagedRuntimeRunner {
execute(input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}): Promise<RunProcessResult>;
}
export interface CommandManagedRuntimeSpec {
providerKey?: string | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
paperclipApiUrl?: string | null;
}
export type CommandManagedRuntimeAsset = SandboxManagedRuntimeAsset;
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
function mergeRuntimeExcludes(entries: string[] | undefined): string[] {
return [...new Set([".paperclip-runtime", ...(entries ?? [])])];
}
const REMOTE_WRITE_BASE64_CHUNK_SIZE = 32 * 1024;
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function requireSuccessfulResult(result: RunProcessResult, action: string): void {
if (result.exitCode === 0 && !result.timedOut) return;
const stderr = result.stderr.trim();
const detail = stderr.length > 0 ? `: ${stderr}` : "";
throw new Error(`${action} failed with exit code ${result.exitCode ?? "null"}${detail}`);
}
export function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
timeoutMs: number;
}): SandboxManagedRuntimeClient {
const runShell = async (script: string, opts: { stdin?: string; timeoutMs?: number } = {}) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", script],
cwd: input.remoteCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
requireSuccessfulResult(result, script);
return result;
};
return {
makeDir: async (remotePath) => {
await runShell(`mkdir -p ${shellQuote(remotePath)}`);
},
writeFile: async (remotePath, bytes) => {
const body = toBuffer(bytes).toString("base64");
const remoteDir = path.posix.dirname(remotePath);
const remoteTempPath = `${remotePath}.paperclip-upload.b64`;
await runShell(
`mkdir -p ${shellQuote(remoteDir)} && rm -f ${shellQuote(remoteTempPath)} && : > ${shellQuote(remoteTempPath)}`,
);
for (let offset = 0; offset < body.length; offset += REMOTE_WRITE_BASE64_CHUNK_SIZE) {
const chunk = body.slice(offset, offset + REMOTE_WRITE_BASE64_CHUNK_SIZE);
await runShell(`printf '%s' ${shellQuote(chunk)} >> ${shellQuote(remoteTempPath)}`);
}
await runShell(
`base64 -d < ${shellQuote(remoteTempPath)} > ${shellQuote(remotePath)} && rm -f ${shellQuote(remoteTempPath)}`,
);
},
readFile: async (remotePath) => {
const result = await runShell(`base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64");
},
listFiles: async (remotePath) => {
const result = await runShell(
`if [ -d ${shellQuote(remotePath)} ]; then ` +
`for entry in ${shellQuote(remotePath)}/*; do ` +
`[ -f "$entry" ] || continue; ` +
`basename "$entry"; ` +
`done; ` +
`fi`,
);
return result.stdout
.split(/\r?\n/)
.map((entry) => entry.trim())
.filter((entry) => entry.length > 0)
.sort((left, right) => left.localeCompare(right));
},
remove: async (remotePath) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
cwd: input.remoteCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
},
run: async (command, options) => {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", command],
cwd: input.remoteCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
},
};
}
export async function prepareCommandManagedRuntime(input: {
runner: CommandManagedRuntimeRunner;
spec: CommandManagedRuntimeSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: CommandManagedRuntimeAsset[];
installCommand?: string | null;
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
sandboxId: input.spec.leaseId ?? "managed",
remoteCwd: workspaceRemoteDir,
timeoutMs,
apiKey: null,
paperclipApiUrl: input.spec.paperclipApiUrl ?? null,
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
remoteCwd: workspaceRemoteDir,
timeoutMs,
});
if (input.installCommand?.trim()) {
const result = await input.runner.execute({
command: "sh",
args: ["-lc", input.installCommand.trim()],
cwd: workspaceRemoteDir,
timeoutMs,
});
requireSuccessfulResult(result, input.installCommand.trim());
}
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}

View File

@@ -0,0 +1,292 @@
import { createServer } from "node:http";
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetToRemoteSpec,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
type AdapterSandboxExecutionTarget,
} from "./execution-target.js";
import { runChildProcess } from "./server-utils.js";
describe("sandbox adapter execution targets", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
function createLocalSandboxRunner() {
let counter = 0;
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}) => {
counter += 1;
return runChildProcess(`sandbox-run-${counter}`, input.command, input.args ?? [], {
cwd: input.cwd ?? process.cwd(),
env: input.env ?? {},
stdin: input.stdin,
timeoutSec: Math.max(1, Math.ceil((input.timeoutMs ?? 30_000) / 1000)),
graceSec: 5,
onLog: input.onLog ?? (async () => {}),
onSpawn: input.onSpawn
? async (meta) => input.onSpawn?.({ pid: meta.pid, startedAt: meta.startedAt })
: undefined,
});
},
};
}
it("executes through the provider-neutral runner without a remote spec", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "ok\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
timeoutMs: 30_000,
runner,
};
expect(adapterExecutionTargetToRemoteSpec(target)).toBeNull();
const result = await runAdapterExecutionTargetProcess("run-1", target, "agent-cli", ["--json"], {
cwd: "/local/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
});
expect(result.stdout).toBe("ok\n");
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "agent-cli",
args: ["--json"],
cwd: "/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutMs: 5000,
}));
expect(adapterExecutionTargetSessionIdentity(target)).toEqual({
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
paperclipTransport: "bridge",
});
});
it("runs shell commands through the same runner", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/home/sandbox",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetShellCommand("run-2", target, 'printf %s "$HOME"', {
cwd: "/local/workspace",
env: {},
timeoutSec: 7,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-lc", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
it("starts a localhost Paperclip bridge for sandbox targets in bridge mode", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, { "content-type": "application/json" });
res.end(JSON.stringify({ ok: true }));
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
paperclipTransport: "bridge",
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
});
try {
expect(bridge).not.toBeNull();
expect(bridge?.env.PAPERCLIP_API_URL).toMatch(/^http:\/\/127\.0\.0\.1:\d+$/);
expect(bridge?.env.PAPERCLIP_API_KEY).not.toBe("real-run-jwt");
expect(bridge?.env.PAPERCLIP_API_BRIDGE_MODE).toBe("queue_v1");
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(200);
expect(await response.json()).toEqual({ ok: true });
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
it("fails oversized host responses with a 502 before returning them to the sandbox client", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-limit-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const largeBody = "x".repeat(64);
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, {
"content-type": "application/json",
"content-length": String(Buffer.byteLength(largeBody, "utf8")),
});
res.end(largeBody);
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
paperclipTransport: "bridge",
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge-limit",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
maxBodyBytes: 32,
});
try {
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Bridge response body exceeded the configured size limit of 32 bytes.",
});
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge-limit",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
});

View File

@@ -2,6 +2,7 @@ import { afterEach, describe, expect, it, vi } from "vitest";
import * as ssh from "./ssh.js";
import {
adapterExecutionTargetUsesManagedHome,
resolveAdapterExecutionTargetCwd,
runAdapterExecutionTargetShellCommand,
} from "./execution-target.js";
@@ -159,3 +160,49 @@ describe("runAdapterExecutionTargetShellCommand", () => {
})).toBe(false);
});
});
describe("resolveAdapterExecutionTargetCwd", () => {
const sshTarget = {
kind: "remote" as const,
transport: "ssh" as const,
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
};
it("falls back to the remote cwd when no adapter cwd is configured", () => {
expect(resolveAdapterExecutionTargetCwd(sshTarget, "", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, " ", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, null, "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
});
it("preserves an explicit adapter cwd when one is configured", () => {
expect(
resolveAdapterExecutionTargetCwd(
sshTarget,
"/srv/paperclip/custom-agent-dir",
"/Users/host/repo/server",
),
).toBe("/srv/paperclip/custom-agent-dir");
});
it("keeps the local fallback cwd for local targets", () => {
expect(resolveAdapterExecutionTargetCwd(null, "", "/Users/host/repo/server")).toBe(
"/Users/host/repo/server",
);
});
});

View File

@@ -1,11 +1,23 @@
import path from "node:path";
import type { SshRemoteExecutionSpec } from "./ssh.js";
import {
prepareCommandManagedRuntime,
type CommandManagedRuntimeRunner,
} from "./command-managed-runtime.js";
import {
buildRemoteExecutionSessionIdentity,
prepareRemoteManagedRuntime,
remoteExecutionSessionMatches,
type RemoteManagedRuntimeAsset,
} from "./remote-managed-runtime.js";
import {
createCommandManagedSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES,
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import { parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
import {
ensureCommandResolvable,
@@ -31,9 +43,23 @@ export interface AdapterSshExecutionTarget {
spec: SshRemoteExecutionSpec;
}
export interface AdapterSandboxExecutionTarget {
kind: "remote";
transport: "sandbox";
providerKey?: string | null;
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
paperclipApiUrl?: string | null;
paperclipTransport?: "direct" | "bridge";
timeoutMs?: number | null;
runner?: CommandManagedRuntimeRunner;
}
export type AdapterExecutionTarget =
| AdapterLocalExecutionTarget
| AdapterSshExecutionTarget;
| AdapterSshExecutionTarget
| AdapterSandboxExecutionTarget;
export type AdapterRemoteExecutionSpec = SshRemoteExecutionSpec;
@@ -65,6 +91,11 @@ export interface AdapterExecutionTargetShellOptions {
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
}
export interface AdapterExecutionTargetPaperclipBridgeHandle {
env: Record<string, string>;
stop(): Promise<void>;
}
function parseObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
@@ -79,12 +110,38 @@ function readStringMeta(parsed: Record<string, unknown>, key: string): string |
return readString(parsed[key]);
}
function resolveHostForUrl(rawHost: string): string {
const host = rawHost.trim();
if (!host || host === "0.0.0.0" || host === "::") return "localhost";
if (host.includes(":") && !host.startsWith("[") && !host.endsWith("]")) return `[${host}]`;
return host;
}
function resolveDefaultPaperclipApiUrl(): string {
const runtimeHost = resolveHostForUrl(
process.env.PAPERCLIP_LISTEN_HOST ?? process.env.HOST ?? "localhost",
);
// Fall back to 3100, the default Paperclip dev server port, when the runtime does not provide one.
const runtimePort = process.env.PAPERCLIP_LISTEN_PORT ?? process.env.PORT ?? "3100";
return `http://${runtimeHost}:${runtimePort}`;
}
function resolveSandboxPaperclipTransport(
target: Pick<AdapterSandboxExecutionTarget, "paperclipTransport" | "paperclipApiUrl">,
): "direct" | "bridge" {
if (target.paperclipTransport === "direct" || target.paperclipTransport === "bridge") {
return target.paperclipTransport;
}
return target.paperclipApiUrl ? "direct" : "bridge";
}
function isAdapterExecutionTargetInstance(value: unknown): value is AdapterExecutionTarget {
const parsed = parseObject(value);
if (parsed.kind === "local") return true;
if (parsed.kind !== "remote") return false;
if (parsed.transport === "ssh") return parseSshRemoteExecutionSpec(parseObject(parsed.spec)) !== null;
return false;
if (parsed.transport !== "sandbox") return false;
return readStringMeta(parsed, "remoteCwd") !== null;
}
export function adapterExecutionTargetToRemoteSpec(
@@ -102,10 +159,7 @@ export function adapterExecutionTargetIsRemote(
export function adapterExecutionTargetUsesManagedHome(
target: AdapterExecutionTarget | null | undefined,
): boolean {
// SSH execution targets sync the runtime assets they need into the remote cwd today,
// so neither local nor remote targets provision a separate managed adapter home.
void target;
return false;
return target?.kind === "remote" && target.transport === "sandbox";
}
export function adapterExecutionTargetRemoteCwd(
@@ -115,18 +169,49 @@ export function adapterExecutionTargetRemoteCwd(
return target?.kind === "remote" ? target.remoteCwd : localCwd;
}
export function resolveAdapterExecutionTargetCwd(
target: AdapterExecutionTarget | null | undefined,
configuredCwd: string | null | undefined,
localFallbackCwd: string,
): string {
if (typeof configuredCwd === "string" && configuredCwd.trim().length > 0) {
return configuredCwd;
}
return adapterExecutionTargetRemoteCwd(target, localFallbackCwd);
}
export function adapterExecutionTargetPaperclipApiUrl(
target: AdapterExecutionTarget | null | undefined,
): string | null {
if (target?.kind !== "remote") return null;
return target.paperclipApiUrl ?? target.spec.paperclipApiUrl ?? null;
if (target.transport === "ssh") return target.paperclipApiUrl ?? target.spec.paperclipApiUrl ?? null;
if (resolveSandboxPaperclipTransport(target) === "bridge") return null;
return target.paperclipApiUrl ?? null;
}
export function adapterExecutionTargetUsesPaperclipBridge(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote" &&
target.transport === "sandbox" &&
resolveSandboxPaperclipTransport(target) === "bridge";
}
export function describeAdapterExecutionTarget(
target: AdapterExecutionTarget | null | undefined,
): string {
if (!target || target.kind === "local") return "local environment";
return `SSH environment ${target.spec.username}@${target.spec.host}:${target.spec.port}`;
if (target.transport === "ssh") {
return `SSH environment ${target.spec.username}@${target.spec.host}:${target.spec.port}`;
}
return `sandbox environment${target.providerKey ? ` (${target.providerKey})` : ""}`;
}
function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandManagedRuntimeRunner {
if (target.runner) return target.runner;
throw new Error(
"Sandbox execution target is missing its provider runtime runner. Sandbox commands must execute through the environment runtime.",
);
}
export async function ensureAdapterExecutionTargetCommandResolvable(
@@ -135,6 +220,9 @@ export async function ensureAdapterExecutionTargetCommandResolvable(
cwd: string,
env: NodeJS.ProcessEnv,
) {
if (target?.kind === "remote" && target.transport === "sandbox") {
return;
}
await ensureCommandResolvable(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
@@ -146,6 +234,9 @@ export async function resolveAdapterExecutionTargetCommandForLogs(
cwd: string,
env: NodeJS.ProcessEnv,
): Promise<string> {
if (target?.kind === "remote" && target.transport === "sandbox") {
return `sandbox://${target.providerKey ?? "provider"}/${target.leaseId ?? "lease"}/${target.remoteCwd} :: ${command}`;
}
return await resolveCommandForLogs(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
@@ -158,6 +249,22 @@ export async function runAdapterExecutionTargetProcess(
args: string[],
options: AdapterExecutionTargetProcessOptions,
): Promise<RunProcessResult> {
if (target?.kind === "remote" && target.transport === "sandbox") {
const runner = requireSandboxRunner(target);
return await runner.execute({
command,
args,
cwd: target.remoteCwd,
env: options.env,
stdin: options.stdin,
timeoutMs: options.timeoutSec > 0 ? options.timeoutSec * 1000 : target.timeoutMs ?? undefined,
onLog: options.onLog,
onSpawn: options.onSpawn
? async (meta) => options.onSpawn?.({ ...meta, processGroupId: null })
: undefined,
});
}
return await runChildProcess(runId, command, args, {
cwd: options.cwd,
env: options.env,
@@ -180,57 +287,68 @@ export async function runAdapterExecutionTargetShellCommand(
const onLog = options.onLog ?? (async () => {});
if (target?.kind === "remote") {
const startedAt = new Date().toISOString();
try {
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
timeoutMs: (options.timeoutSec ?? 15) * 1000,
});
if (result.stdout) await onLog("stdout", result.stdout);
if (result.stderr) await onLog("stderr", result.stderr);
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const timedOutError = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
signal?: string | null;
};
const stdout = timedOutError.stdout ?? "";
const stderr = timedOutError.stderr ?? "";
if (typeof timedOutError.code === "number") {
if (target.transport === "ssh") {
try {
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
timeoutMs: (options.timeoutSec ?? 15) * 1000,
});
if (result.stdout) await onLog("stdout", result.stdout);
if (result.stderr) await onLog("stderr", result.stderr);
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const timedOutError = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
signal?: string | null;
};
const stdout = timedOutError.stdout ?? "";
const stderr = timedOutError.stderr ?? "";
if (typeof timedOutError.code === "number") {
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: timedOutError.code,
signal: timedOutError.signal ?? null,
timedOut: false,
stdout,
stderr,
pid: null,
startedAt,
};
}
if (timedOutError.code !== "ETIMEDOUT") {
throw error;
}
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: timedOutError.code,
exitCode: null,
signal: timedOutError.signal ?? null,
timedOut: false,
timedOut: true,
stdout,
stderr,
pid: null,
startedAt,
};
}
if (timedOutError.code !== "ETIMEDOUT") {
throw error;
}
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: null,
signal: timedOutError.signal ?? null,
timedOut: true,
stdout,
stderr,
pid: null,
startedAt,
};
}
return await requireSandboxRunner(target).execute({
command: "sh",
args: ["-lc", command],
cwd: target.remoteCwd,
env: options.env,
timeoutMs: (options.timeoutSec ?? 15) * 1000,
onLog,
});
}
return await runAdapterExecutionTargetProcess(
@@ -277,11 +395,79 @@ export async function ensureAdapterExecutionTargetFile(
);
}
/**
* Ensure a working directory exists (and is a directory) on the execution target.
*
* For local targets this delegates to the local `ensureAbsoluteDirectory` helper
* (Node fs). For remote (SSH/sandbox) targets it shells out and runs
* `mkdir -p` (when `createIfMissing` is set) followed by a `[ -d ]` check so the result reflects
* the directory state inside the environment, not on the Paperclip host.
*
* Throws an Error with a human-readable message on failure.
*/
export async function ensureAdapterExecutionTargetDirectory(
runId: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
options: AdapterExecutionTargetShellOptions & { createIfMissing?: boolean },
): Promise<void> {
const createIfMissing = options.createIfMissing ?? false;
if (!target || target.kind === "local") {
const { ensureAbsoluteDirectory } = await import("./server-utils.js");
await ensureAbsoluteDirectory(cwd, { createIfMissing });
return;
}
// Remote (SSH or sandbox): both expect POSIX absolute paths inside the env.
if (!cwd.startsWith("/")) {
throw new Error(`Working directory must be an absolute POSIX path on the remote target: "${cwd}"`);
}
const quoted = shellQuote(cwd);
const script = createIfMissing
? `mkdir -p ${quoted} && [ -d ${quoted} ]`
: `[ -d ${quoted} ]`;
const result = await runAdapterExecutionTargetShellCommand(runId, target, script, {
cwd: target.kind === "remote" ? target.remoteCwd : cwd,
env: options.env,
timeoutSec: options.timeoutSec ?? 15,
graceSec: options.graceSec ?? 5,
onLog: options.onLog,
});
if (result.timedOut) {
throw new Error(`Timed out checking working directory on remote target: "${cwd}"`);
}
if ((result.exitCode ?? 1) !== 0) {
const detail = (result.stderr || result.stdout || "").trim();
if (createIfMissing) {
throw new Error(
`Could not create working directory "${cwd}" on remote target${detail ? `: ${detail}` : "."}`,
);
}
throw new Error(
`Working directory does not exist on remote target: "${cwd}"${detail ? ` (${detail})` : ""}`,
);
}
}
export function adapterExecutionTargetSessionIdentity(
target: AdapterExecutionTarget | null | undefined,
): Record<string, unknown> | null {
if (!target || target.kind === "local") return null;
return buildRemoteExecutionSessionIdentity(target.spec);
if (target.transport === "ssh") return buildRemoteExecutionSessionIdentity(target.spec);
const paperclipTransport = resolveSandboxPaperclipTransport(target);
return {
transport: "sandbox",
providerKey: target.providerKey ?? null,
environmentId: target.environmentId ?? null,
leaseId: target.leaseId ?? null,
remoteCwd: target.remoteCwd,
paperclipTransport,
...(paperclipTransport === "direct" && target.paperclipApiUrl ? { paperclipApiUrl: target.paperclipApiUrl } : {}),
};
}
export function adapterExecutionTargetSessionMatches(
@@ -291,7 +477,18 @@ export function adapterExecutionTargetSessionMatches(
if (!target || target.kind === "local") {
return Object.keys(parseObject(saved)).length === 0;
}
return remoteExecutionSessionMatches(saved, target.spec);
if (target.transport === "ssh") return remoteExecutionSessionMatches(saved, target.spec);
const current = adapterExecutionTargetSessionIdentity(target);
const parsedSaved = parseObject(saved);
return (
readStringMeta(parsedSaved, "transport") === current?.transport &&
readStringMeta(parsedSaved, "providerKey") === current?.providerKey &&
readStringMeta(parsedSaved, "environmentId") === current?.environmentId &&
readStringMeta(parsedSaved, "leaseId") === current?.leaseId &&
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd &&
readStringMeta(parsedSaved, "paperclipTransport") === (current?.paperclipTransport ?? null) &&
readStringMeta(parsedSaved, "paperclipApiUrl") === (current?.paperclipApiUrl ?? null)
);
}
export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTarget | null {
@@ -320,6 +517,26 @@ export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTar
};
}
if (kind === "remote" && readStringMeta(parsed, "transport") === "sandbox") {
const remoteCwd = readStringMeta(parsed, "remoteCwd");
const paperclipTransport = readStringMeta(parsed, "paperclipTransport");
if (!remoteCwd) return null;
return {
kind: "remote",
transport: "sandbox",
providerKey: readStringMeta(parsed, "providerKey"),
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd,
paperclipApiUrl: readStringMeta(parsed, "paperclipApiUrl"),
paperclipTransport:
paperclipTransport === "direct" || paperclipTransport === "bridge"
? paperclipTransport
: undefined,
timeoutMs: typeof parsed.timeoutMs === "number" ? parsed.timeoutMs : null,
};
}
return null;
}
@@ -376,11 +593,36 @@ export async function prepareAdapterExecutionTargetRuntime(input: {
};
}
const prepared = await prepareRemoteManagedRuntime({
spec: target.spec,
if (target.transport === "ssh") {
const prepared = await prepareRemoteManagedRuntime({
spec: target.spec,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
assets: input.assets,
});
return {
target,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
};
}
const prepared = await prepareCommandManagedRuntime({
runner: requireSandboxRunner(target),
spec: {
providerKey: target.providerKey,
leaseId: target.leaseId,
remoteCwd: target.remoteCwd,
timeoutMs: target.timeoutMs,
paperclipApiUrl: target.paperclipApiUrl,
},
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceExclude: input.workspaceExclude,
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
installCommand: input.installCommand,
});
return {
target,
@@ -397,3 +639,172 @@ export function runtimeAssetDir(
): string {
return prepared.assetDirs[key] ?? path.posix.join(fallbackRemoteCwd, ".paperclip-runtime", key);
}
function buildBridgeResponseHeaders(response: Response): Record<string, string> {
const out: Record<string, string> = {};
for (const key of ["content-type", "etag", "last-modified"]) {
const value = response.headers.get(key);
if (value && value.trim().length > 0) out[key] = value.trim();
}
return out;
}
function buildBridgeForwardUrl(baseUrl: string, request: { path: string; query: string }): URL {
const url = new URL(request.path, baseUrl);
const query = request.query.trim();
url.search = query.startsWith("?") ? query.slice(1) : query;
return url;
}
function bridgeResponseBodyLimitError(maxBodyBytes: number): Error {
return new Error(`Bridge response body exceeded the configured size limit of ${maxBodyBytes} bytes.`);
}
async function readBridgeForwardResponseBody(response: Response, maxBodyBytes: number): Promise<string> {
const rawContentLength = response.headers.get("content-length");
if (rawContentLength) {
const contentLength = Number.parseInt(rawContentLength, 10);
if (Number.isFinite(contentLength) && contentLength > maxBodyBytes) {
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
}
if (!response.body) {
return "";
}
const reader = response.body.getReader();
const chunks: Buffer[] = [];
let totalBytes = 0;
while (true) {
const { done, value } = await reader.read();
if (done) break;
if (!value) continue;
totalBytes += value.byteLength;
if (totalBytes > maxBodyBytes) {
await reader.cancel().catch(() => undefined);
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
chunks.push(Buffer.from(value));
}
return Buffer.concat(chunks, totalBytes).toString("utf8");
}
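The reader above enforces the body cap while streaming, so an oversized host response fails fast instead of buffering fully. A minimal sketch of the same byte-capped read (Node 18+), run against an in-memory `ReadableStream` instead of a fetch `Response`; the helper names are illustrative:

```typescript
// Byte-capped streaming read: accumulate chunks, abort once the running
// total exceeds the limit, and cancel the source so it can release resources.
async function readCapped(stream: ReadableStream<Uint8Array>, maxBodyBytes: number): Promise<string> {
  const reader = stream.getReader();
  const chunks: Buffer[] = [];
  let totalBytes = 0;
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (!value) continue;
    totalBytes += value.byteLength;
    if (totalBytes > maxBodyBytes) {
      await reader.cancel().catch(() => undefined);
      throw new Error(`body exceeded ${maxBodyBytes} bytes`);
    }
    chunks.push(Buffer.from(value));
  }
  return Buffer.concat(chunks, totalBytes).toString("utf8");
}

// Test fixture: a single-chunk in-memory stream.
function streamOf(text: string): ReadableStream<Uint8Array> {
  const bytes = new TextEncoder().encode(text);
  return new ReadableStream({
    start(controller) {
      controller.enqueue(bytes);
      controller.close();
    },
  });
}
```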
export async function startAdapterExecutionTargetPaperclipBridge(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
runtimeRootDir: string | null | undefined;
adapterKey: string;
hostApiToken: string | null | undefined;
hostApiUrl?: string | null;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
maxBodyBytes?: number | null;
}): Promise<AdapterExecutionTargetPaperclipBridgeHandle | null> {
if (!adapterExecutionTargetUsesPaperclipBridge(input.target)) {
return null;
}
if (!input.target || input.target.kind !== "remote" || input.target.transport !== "sandbox") {
return null;
}
const target = input.target;
const onLog = input.onLog ?? (async () => {});
const hostApiToken = input.hostApiToken?.trim() ?? "";
if (hostApiToken.length === 0) {
throw new Error("Sandbox bridge mode requires a host-side Paperclip API token.");
}
const runtimeRootDir =
input.runtimeRootDir?.trim().length
? input.runtimeRootDir.trim()
: path.posix.join(target.remoteCwd, ".paperclip-runtime", input.adapterKey);
const bridgeRuntimeDir = path.posix.join(runtimeRootDir, "paperclip-bridge");
const queueDir = path.posix.join(bridgeRuntimeDir, "queue");
const assetRemoteDir = path.posix.join(bridgeRuntimeDir, "server");
const bridgeToken = createSandboxCallbackBridgeToken();
const maxBodyBytes =
typeof input.maxBodyBytes === "number" && Number.isFinite(input.maxBodyBytes) && input.maxBodyBytes > 0
? Math.trunc(input.maxBodyBytes)
: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES;
const hostApiUrl =
input.hostApiUrl?.trim() ||
process.env.PAPERCLIP_RUNTIME_API_URL?.trim() ||
process.env.PAPERCLIP_API_URL?.trim() ||
resolveDefaultPaperclipApiUrl();
await onLog(
"stdout",
`[paperclip] Starting sandbox callback bridge for ${input.adapterKey} in ${bridgeRuntimeDir}.\n`,
);
const bridgeAsset = await createSandboxCallbackBridgeAsset();
let server: Awaited<ReturnType<typeof startSandboxCallbackBridgeServer>> | null = null;
let worker: Awaited<ReturnType<typeof startSandboxCallbackBridgeWorker>> | null = null;
try {
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
runner: requireSandboxRunner(target),
remoteCwd: target.remoteCwd,
timeoutMs: target.timeoutMs,
});
worker = await startSandboxCallbackBridgeWorker({
client,
queueDir,
maxBodyBytes,
handleRequest: async (request) => {
const headers = new Headers();
for (const [key, value] of Object.entries(request.headers)) {
if (value.trim().length === 0) continue;
headers.set(key, value);
}
headers.set("authorization", `Bearer ${hostApiToken}`);
headers.set("x-paperclip-run-id", input.runId);
const method = request.method.trim().toUpperCase() || "GET";
const response = await fetch(buildBridgeForwardUrl(hostApiUrl, request), {
method,
headers,
...(method === "GET" || method === "HEAD" ? {} : { body: request.body }),
signal: AbortSignal.timeout(30_000),
});
return {
status: response.status,
headers: buildBridgeResponseHeaders(response),
body: await readBridgeForwardResponseBody(response, maxBodyBytes),
};
},
});
server = await startSandboxCallbackBridgeServer({
runner: requireSandboxRunner(target),
remoteCwd: target.remoteCwd,
assetRemoteDir,
queueDir,
bridgeToken,
bridgeAsset,
timeoutMs: target.timeoutMs,
maxBodyBytes,
});
} catch (error) {
await Promise.allSettled([
server?.stop(),
worker?.stop(),
bridgeAsset.cleanup(),
]);
throw error;
}
return {
env: {
PAPERCLIP_API_URL: server.baseUrl,
PAPERCLIP_API_KEY: bridgeToken,
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
},
stop: async () => {
await Promise.allSettled([
server?.stop(),
]);
await Promise.allSettled([
worker?.stop(),
bridgeAsset.cleanup(),
]);
},
};
}


@@ -54,3 +54,15 @@ export {
redactTranscriptEntryPaths,
} from "./log-redaction.js";
export { inferOpenAiCompatibleBiller } from "./billing.js";
// Keep the root adapter-utils entry browser-safe because the UI imports it.
// The sandbox callback bridge stays available via its dedicated subpath export.
export type {
SandboxCallbackBridgeRequest,
SandboxCallbackBridgeResponse,
SandboxCallbackBridgeAsset,
SandboxCallbackBridgeDirectories,
SandboxCallbackBridgeRouteRule,
SandboxCallbackBridgeQueueClient,
SandboxCallbackBridgeWorkerHandle,
StartedSandboxCallbackBridgeServer,
} from "./sandbox-callback-bridge.js";


@@ -0,0 +1,610 @@
import { execFile as execFileCallback } from "node:child_process";
import { mkdir, mkdtemp, readFile, readdir, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import {
createFileSystemSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
sandboxCallbackBridgeDirectories,
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("sandbox callback bridge", () => {
const cleanupDirs: string[] = [];
const cleanupFns: Array<() => Promise<void>> = [];
function createExecRunner() {
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
const command = input.command === "sh" ? "/bin/sh" : input.command;
const args = [...(input.args ?? [])];
if (input.stdin != null && input.command === "sh" && args[0] === "-lc" && typeof args[1] === "string") {
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
}
async function waitForJsonFile(directory: string, timeoutMs = 2_000): Promise<string> {
const deadline = Date.now() + timeoutMs;
while (Date.now() < deadline) {
const entries = await readdir(directory).catch(() => []);
const match = entries.find((entry) => entry.endsWith(".json"));
if (match) return match;
await new Promise((resolve) => setTimeout(resolve, 10));
}
throw new Error(`Timed out waiting for a JSON file in ${directory}.`);
}
afterEach(async () => {
while (cleanupFns.length > 0) {
const cleanup = cleanupFns.pop();
if (!cleanup) continue;
await cleanup().catch(() => undefined);
}
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("round-trips localhost bridge requests over the sandbox queue without forwarding the bridge token", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [
{
key: "bridge",
localDir: bridgeAsset.localDir,
},
],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const bridgeToken = createSandboxCallbackBridgeToken();
const seenRequests: Array<{
method: string;
path: string;
query: string;
headers: Record<string, string>;
body: string;
}> = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async (request) =>
request.path === "/api/agents/me" ? null : `Route not allowed: ${request.method} ${request.path}`,
handleRequest: async (request) => {
seenRequests.push({
method: request.method,
path: request.path,
query: request.query,
headers: request.headers,
body: request.body,
});
return {
status: 200,
headers: {
"content-type": "application/json",
etag: '"bridge-rev-1"',
"last-modified": "Tue, 01 Apr 2025 00:00:00 GMT",
},
body: JSON.stringify({
ok: true,
method: request.method,
path: request.path,
}),
};
},
});
cleanupFns.push(async () => {
await worker.stop();
});
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const okResponse = await fetch(`${bridge.baseUrl}/api/agents/me?view=compact`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
accept: "application/json",
"if-none-match": '"client-cache-key"',
"x-paperclip-run-id": "run-bridge-1",
"x-bridge-debug": "drop-me",
},
});
expect(okResponse.status).toBe(200);
expect(okResponse.headers.get("content-type")).toContain("application/json");
expect(okResponse.headers.get("etag")).toBe('"bridge-rev-1"');
expect(okResponse.headers.get("last-modified")).toBe("Tue, 01 Apr 2025 00:00:00 GMT");
await expect(okResponse.json()).resolves.toMatchObject({
ok: true,
method: "GET",
path: "/api/agents/me",
});
const deniedResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1`, {
method: "PATCH",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "application/json",
},
body: JSON.stringify({ status: "in_progress" }),
});
expect(deniedResponse.status).toBe(403);
await expect(deniedResponse.json()).resolves.toMatchObject({
error: "Route not allowed: PATCH /api/issues/issue-1",
});
const unauthorizedResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: "Bearer wrong-token",
},
});
expect(unauthorizedResponse.status).toBe(401);
await expect(unauthorizedResponse.json()).resolves.toMatchObject({
error: "Invalid bridge token.",
});
expect(seenRequests).toHaveLength(1);
expect(seenRequests[0]).toMatchObject({
method: "GET",
path: "/api/agents/me",
query: "?view=compact",
body: "",
headers: {
accept: "application/json",
"if-none-match": '"client-cache-key"',
},
});
expect(seenRequests[0]?.headers.authorization).toBeUndefined();
expect(seenRequests[0]?.headers["x-paperclip-run-id"]).toBeUndefined();
});
it("denies non-allowlisted requests by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-default-policy-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
let handled = 0;
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
handleRequest: async () => {
handled += 1;
return {
status: 200,
body: "should not happen",
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-1.json"),
`${JSON.stringify({
id: "req-1",
method: "DELETE",
path: "/api/secrets",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
const response = JSON.parse(
await readFile(path.posix.join(directories.responsesDir, "req-1.json"), "utf8"),
) as { status: number; body: string };
expect(handled).toBe(0);
expect(response.status).toBe(403);
expect(JSON.parse(response.body)).toEqual({
error: "Route not allowed: DELETE /api/secrets",
});
});
it("drains already-queued requests on stop", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 25));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await worker.stop({ drainTimeoutMs: 1_000 });
expect(processed).toEqual(["req-a", "req-b"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain("\"req-b\"");
});
it("writes fast 503 responses for queued requests that miss the drain deadline", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-drain-timeout-"));
cleanupDirs.push(rootDir);
const queueDir = path.posix.join(rootDir, "queue");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const processed: string[] = [];
const worker = await startSandboxCallbackBridgeWorker({
client: createFileSystemSandboxCallbackBridgeQueueClient(),
queueDir,
authorizeRequest: async () => null,
handleRequest: async (request) => {
processed.push(request.id);
await new Promise((resolve) => setTimeout(resolve, 100));
return {
status: 200,
body: request.id,
};
},
});
await writeFile(
path.posix.join(directories.requestsDir, "req-a.json"),
`${JSON.stringify({
id: "req-a",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
await writeFile(
path.posix.join(directories.requestsDir, "req-b.json"),
`${JSON.stringify({
id: "req-b",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
for (let attempt = 0; attempt < 50 && processed.length === 0; attempt += 1) {
await new Promise((resolve) => setTimeout(resolve, 5));
}
await worker.stop({ drainTimeoutMs: 10 });
expect(processed).toEqual(["req-a"]);
await expect(readFile(path.posix.join(directories.responsesDir, "req-a.json"), "utf8")).resolves.toContain("\"req-a\"");
await expect(readFile(path.posix.join(directories.responsesDir, "req-b.json"), "utf8")).resolves.toContain(
"Bridge worker stopped before request could be handled.",
);
});
it("rejects non-JSON request bodies and full queues at the bridge server", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-server-guards-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge guard test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
maxQueueDepth: 1,
});
cleanupFns.push(async () => {
await bridge.stop();
});
await writeFile(
path.posix.join(directories.requestsDir, "existing.json"),
`${JSON.stringify({
id: "existing",
method: "GET",
path: "/api/agents/me",
query: "",
headers: {},
body: "",
createdAt: new Date().toISOString(),
})}\n`,
"utf8",
);
const queueFullResponse = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(queueFullResponse.status).toBe(503);
await expect(queueFullResponse.json()).resolves.toEqual({
error: "Bridge request queue is full.",
});
await rm(path.posix.join(directories.requestsDir, "existing.json"), { force: true });
const nonJsonResponse = await fetch(`${bridge.baseUrl}/api/issues/issue-1/comments`, {
method: "POST",
headers: {
authorization: `Bearer ${bridgeToken}`,
"content-type": "text/plain",
},
body: "not json",
});
expect(nonJsonResponse.status).toBe(415);
await expect(nonJsonResponse.json()).resolves.toEqual({
error: "Bridge only accepts JSON request bodies.",
});
});
it("returns a 502 when the host response times out", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-timeout-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge timeout test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 75,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const response = await fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Timed out waiting for host bridge response.",
});
});
it("returns a 502 for malformed host response files", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-malformed-response-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(localWorkspaceDir, { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "bridge malformed response test\n", "utf8");
const runner = createExecRunner();
const bridgeAsset = await createSandboxCallbackBridgeAsset();
cleanupFns.push(bridgeAsset.cleanup);
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "codex",
workspaceLocalDir: localWorkspaceDir,
assets: [{ key: "bridge", localDir: bridgeAsset.localDir }],
});
const queueDir = path.posix.join(prepared.runtimeRootDir, "paperclip-bridge");
const directories = sandboxCallbackBridgeDirectories(queueDir);
const bridgeToken = createSandboxCallbackBridgeToken();
const bridge = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: remoteWorkspaceDir,
assetRemoteDir: prepared.assetDirs.bridge,
queueDir,
bridgeToken,
timeoutMs: 30_000,
pollIntervalMs: 10,
responseTimeoutMs: 1_000,
});
cleanupFns.push(async () => {
await bridge.stop();
});
const responsePromise = fetch(`${bridge.baseUrl}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridgeToken}`,
},
});
const requestFile = await waitForJsonFile(directories.requestsDir);
await writeFile(
path.posix.join(directories.responsesDir, requestFile),
'{"status":200,"headers":{"content-type":"application/json"},"body"',
"utf8",
);
const response = await responsePromise;
expect(response.status).toBe(502);
await expect(response.json()).resolves.toMatchObject({
error: expect.stringMatching(/JSON|Unexpected|Unterminated/i),
});
});
});


@@ -0,0 +1,822 @@
import { randomBytes, randomUUID } from "node:crypto";
import { promises as fs } from "node:fs";
import os from "node:os";
import path from "node:path";
import type { CommandManagedRuntimeRunner } from "./command-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
const DEFAULT_BRIDGE_TOKEN_BYTES = 24;
const DEFAULT_BRIDGE_POLL_INTERVAL_MS = 100;
const DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS = 30_000;
const DEFAULT_BRIDGE_STOP_TIMEOUT_MS = 2_000;
const DEFAULT_BRIDGE_MAX_QUEUE_DEPTH = 64;
const DEFAULT_BRIDGE_MAX_BODY_BYTES = 256 * 1024;
const REMOTE_WRITE_BASE64_CHUNK_SIZE = 32 * 1024;
const SANDBOX_CALLBACK_BRIDGE_ENTRYPOINT = "paperclip-bridge-server.mjs";
export const DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES = DEFAULT_BRIDGE_MAX_BODY_BYTES;
export interface SandboxCallbackBridgeRouteRule {
method: string;
path: RegExp;
}
export const DEFAULT_SANDBOX_CALLBACK_BRIDGE_ROUTE_ALLOWLIST: readonly SandboxCallbackBridgeRouteRule[] = [
{ method: "GET", path: /^\/api\/agents\/me$/ },
{ method: "GET", path: /^\/api\/issues\/[^/]+\/heartbeat-context$/ },
{ method: "GET", path: /^\/api\/issues\/[^/]+\/comments(?:\/[^/]+)?$/ },
{ method: "GET", path: /^\/api\/issues\/[^/]+\/documents(?:\/[^/]+)?$/ },
{ method: "POST", path: /^\/api\/issues\/[^/]+\/checkout$/ },
{ method: "POST", path: /^\/api\/issues\/[^/]+\/comments$/ },
{ method: "POST", path: /^\/api\/issues\/[^/]+\/interactions(?:\/[^/]+)?$/ },
{ method: "PATCH", path: /^\/api\/issues\/[^/]+$/ },
] as const;
export const DEFAULT_SANDBOX_CALLBACK_BRIDGE_HEADER_ALLOWLIST = [
"accept",
"content-type",
"if-match",
"if-none-match",
] as const;
export interface SandboxCallbackBridgeRequest {
id: string;
method: string;
path: string;
query: string;
headers: Record<string, string>;
/**
* UTF-8 body contents. The bridge rejects non-JSON request bodies; binary
* payloads are intentionally out of scope for this queue protocol.
*/
body: string;
createdAt: string;
}
export interface SandboxCallbackBridgeResponse {
id: string;
status: number;
headers: Record<string, string>;
body: string;
completedAt: string;
}
export interface SandboxCallbackBridgeAsset {
localDir: string;
entrypoint: string;
cleanup(): Promise<void>;
}
export interface SandboxCallbackBridgeDirectories {
rootDir: string;
requestsDir: string;
responsesDir: string;
logsDir: string;
readyFile: string;
pidFile: string;
logFile: string;
}
export interface SandboxCallbackBridgeQueueClient {
makeDir(remotePath: string): Promise<void>;
listJsonFiles(remotePath: string): Promise<string[]>;
readTextFile(remotePath: string): Promise<string>;
writeTextFile(remotePath: string, body: string): Promise<void>;
rename(fromPath: string, toPath: string): Promise<void>;
remove(remotePath: string): Promise<void>;
}
export interface SandboxCallbackBridgeWorkerHandle {
stop(options?: { drainTimeoutMs?: number }): Promise<void>;
}
export interface StartedSandboxCallbackBridgeServer {
baseUrl: string;
host: string;
port: number;
pid: number;
directories: SandboxCallbackBridgeDirectories;
stop(): Promise<void>;
}
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
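`shellQuote` uses the standard POSIX trick for embedding a single quote inside a single-quoted string: close the quote, emit a double-quoted `'`, then reopen it. A standalone sketch of the same escaping:

```typescript
// Single-quote escaping: ' becomes '"'"' so the value survives sh -c intact.
function shellQuoteSketch(value: string): string {
  return `'${value.replace(/'/g, `'"'"'`)}'`;
}
```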
function normalizeMethod(value: string | null | undefined): string {
return typeof value === "string" && value.trim().length > 0 ? value.trim().toUpperCase() : "GET";
}
function normalizeTimeoutMs(value: number | null | undefined, fallback: number): number {
return typeof value === "number" && Number.isFinite(value) && value > 0 ? Math.trunc(value) : fallback;
}
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function buildRunnerFailureMessage(action: string, result: RunProcessResult): string {
const stderr = result.stderr.trim();
const stdout = result.stdout.trim();
const detail = stderr || stdout;
if (result.timedOut) {
return `${action} timed out${detail ? `: ${detail}` : ""}`;
}
return `${action} failed with exit code ${result.exitCode ?? "null"}${detail ? `: ${detail}` : ""}`;
}
async function runShell(
runner: CommandManagedRuntimeRunner,
cwd: string,
script: string,
timeoutMs: number,
): Promise<RunProcessResult> {
return await runner.execute({
command: "sh",
args: ["-lc", script],
cwd,
timeoutMs,
});
}
function requireSuccessfulResult(action: string, result: RunProcessResult): RunProcessResult {
if (!result.timedOut && result.exitCode === 0) return result;
throw new Error(buildRunnerFailureMessage(action, result));
}
function base64Chunks(body: string): string[] {
const out: string[] = [];
for (let offset = 0; offset < body.length; offset += REMOTE_WRITE_BASE64_CHUNK_SIZE) {
out.push(body.slice(offset, offset + REMOTE_WRITE_BASE64_CHUNK_SIZE));
}
return out;
}
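The uploader splits base64 payloads into fixed-size slices so each `printf` append stays well under shell argument limits. A sketch of the same chunking with a small size so the split is visible (the real writer uses 32 KiB chunks):

```typescript
// Fixed-size string chunking; the final chunk may be shorter.
function chunksOf(body: string, size: number): string[] {
  const out: string[] = [];
  for (let offset = 0; offset < body.length; offset += size) {
    out.push(body.slice(offset, offset + size));
  }
  return out;
}
```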
export function createSandboxCallbackBridgeToken(bytes = DEFAULT_BRIDGE_TOKEN_BYTES): string {
return randomBytes(bytes).toString("base64url");
}
export function authorizeSandboxCallbackBridgeRequestWithRoutes(
request: Pick<SandboxCallbackBridgeRequest, "method" | "path">,
routes: readonly SandboxCallbackBridgeRouteRule[] = DEFAULT_SANDBOX_CALLBACK_BRIDGE_ROUTE_ALLOWLIST,
): string | null {
const method = normalizeMethod(request.method);
return routes.some((route) => route.method === method && route.path.test(request.path))
? null
: `Route not allowed: ${method} ${request.path}`;
}
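The authorizer returns `null` for an allowed route and a denial message otherwise, matching the `authorizeRequest` contract the worker expects. A self-contained sketch using two of the default allowlist entries:

```typescript
// Method + regex route check; null means allowed, a string is the denial reason.
interface RouteRule {
  method: string;
  path: RegExp;
}

const routes: RouteRule[] = [
  { method: "GET", path: /^\/api\/agents\/me$/ },
  { method: "POST", path: /^\/api\/issues\/[^/]+\/comments$/ },
];

function authorizeSketch(method: string, requestPath: string): string | null {
  const normalized = method.trim().toUpperCase() || "GET";
  return routes.some((route) => route.method === normalized && route.path.test(requestPath))
    ? null
    : `Route not allowed: ${normalized} ${requestPath}`;
}
```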
export function sanitizeSandboxCallbackBridgeHeaders(
headers: Record<string, string>,
allowlist: readonly string[] = DEFAULT_SANDBOX_CALLBACK_BRIDGE_HEADER_ALLOWLIST,
): Record<string, string> {
const allowed = new Set(allowlist.map((header) => header.toLowerCase()));
return Object.fromEntries(
Object.entries(headers).filter(([key]) => allowed.has(key.toLowerCase())),
);
}
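The header filter compares keys case-insensitively but preserves the original casing of surviving entries; anything outside the allowlist, notably `authorization`, is dropped, which is why the in-sandbox bridge token never reaches the host API. A sketch of the same filter:

```typescript
// Case-insensitive allowlist filter over a header record.
function sanitizeSketch(
  headers: Record<string, string>,
  allowlist: readonly string[],
): Record<string, string> {
  const allowed = new Set(allowlist.map((header) => header.toLowerCase()));
  return Object.fromEntries(
    Object.entries(headers).filter(([key]) => allowed.has(key.toLowerCase())),
  );
}
```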
export function sandboxCallbackBridgeDirectories(rootDir: string): SandboxCallbackBridgeDirectories {
return {
rootDir,
requestsDir: path.posix.join(rootDir, "requests"),
responsesDir: path.posix.join(rootDir, "responses"),
logsDir: path.posix.join(rootDir, "logs"),
readyFile: path.posix.join(rootDir, "ready.json"),
pidFile: path.posix.join(rootDir, "server.pid"),
logFile: path.posix.join(rootDir, "logs", "bridge.log"),
};
}
export function buildSandboxCallbackBridgeEnv(input: {
queueDir: string;
bridgeToken: string;
host?: string;
port?: number | null;
pollIntervalMs?: number | null;
responseTimeoutMs?: number | null;
maxQueueDepth?: number | null;
maxBodyBytes?: number | null;
}): Record<string, string> {
return {
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
PAPERCLIP_BRIDGE_QUEUE_DIR: input.queueDir,
PAPERCLIP_BRIDGE_TOKEN: input.bridgeToken,
PAPERCLIP_BRIDGE_HOST: input.host?.trim() || "127.0.0.1",
PAPERCLIP_BRIDGE_PORT: String(input.port && input.port > 0 ? Math.trunc(input.port) : 0),
PAPERCLIP_BRIDGE_POLL_INTERVAL_MS: String(
normalizeTimeoutMs(input.pollIntervalMs, DEFAULT_BRIDGE_POLL_INTERVAL_MS),
),
PAPERCLIP_BRIDGE_RESPONSE_TIMEOUT_MS: String(
normalizeTimeoutMs(input.responseTimeoutMs, DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS),
),
PAPERCLIP_BRIDGE_MAX_QUEUE_DEPTH: String(
normalizeTimeoutMs(input.maxQueueDepth, DEFAULT_BRIDGE_MAX_QUEUE_DEPTH),
),
PAPERCLIP_BRIDGE_MAX_BODY_BYTES: String(
normalizeTimeoutMs(input.maxBodyBytes, DEFAULT_BRIDGE_MAX_BODY_BYTES),
),
};
}
export async function createSandboxCallbackBridgeAsset(): Promise<SandboxCallbackBridgeAsset> {
const localDir = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-bridge-asset-"));
const entrypoint = path.join(localDir, SANDBOX_CALLBACK_BRIDGE_ENTRYPOINT);
await fs.writeFile(entrypoint, getSandboxCallbackBridgeServerSource(), "utf8");
return {
localDir,
entrypoint,
cleanup: async () => {
await fs.rm(localDir, { recursive: true, force: true }).catch(() => undefined);
},
};
}
export function createFileSystemSandboxCallbackBridgeQueueClient(): SandboxCallbackBridgeQueueClient {
return {
makeDir: async (remotePath) => {
await fs.mkdir(remotePath, { recursive: true });
},
listJsonFiles: async (remotePath) => {
const entries = await fs.readdir(remotePath, { withFileTypes: true }).catch(() => []);
return entries
.filter((entry) => entry.isFile() && entry.name.endsWith(".json"))
.map((entry) => entry.name)
.sort((left, right) => left.localeCompare(right));
},
readTextFile: async (remotePath) => await fs.readFile(remotePath, "utf8"),
writeTextFile: async (remotePath, body) => {
await fs.mkdir(path.posix.dirname(remotePath), { recursive: true });
await fs.writeFile(remotePath, body, "utf8");
},
rename: async (fromPath, toPath) => {
await fs.mkdir(path.posix.dirname(toPath), { recursive: true });
await fs.rename(fromPath, toPath);
},
remove: async (remotePath) => {
await fs.rm(remotePath, { recursive: true, force: true }).catch(() => undefined);
},
};
}
export function createCommandManagedSandboxCallbackBridgeQueueClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
timeoutMs?: number | null;
}): SandboxCallbackBridgeQueueClient {
const timeoutMs = normalizeTimeoutMs(input.timeoutMs, DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS);
const runChecked = async (action: string, script: string) =>
requireSuccessfulResult(action, await runShell(input.runner, input.remoteCwd, script, timeoutMs));
return {
makeDir: async (remotePath) => {
await runChecked(`mkdir ${remotePath}`, `mkdir -p ${shellQuote(remotePath)}`);
},
listJsonFiles: async (remotePath) => {
const result = await runShell(
input.runner,
input.remoteCwd,
[
`if [ -d ${shellQuote(remotePath)} ]; then`,
` for file in ${shellQuote(remotePath)}/*.json; do`,
` [ -f "$file" ] || continue`,
" basename \"$file\"",
" done",
"fi",
].join("\n"),
timeoutMs,
);
requireSuccessfulResult(`list ${remotePath}`, result);
return result.stdout
.split(/\r?\n/)
.map((line) => line.trim())
.filter((line) => line.length > 0)
.sort((left, right) => left.localeCompare(right));
},
readTextFile: async (remotePath) => {
const result = await runChecked(`read ${remotePath}`, `base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64").toString("utf8");
},
writeTextFile: async (remotePath, body) => {
const remoteDir = path.posix.dirname(remotePath);
const tempPath = `${remotePath}.paperclip-upload.b64`;
await runChecked(
`prepare upload ${remotePath}`,
`mkdir -p ${shellQuote(remoteDir)} && rm -f ${shellQuote(tempPath)} && : > ${shellQuote(tempPath)}`,
);
const base64Body = toBuffer(Buffer.from(body, "utf8")).toString("base64");
for (const chunk of base64Chunks(base64Body)) {
await runChecked(
`append upload chunk ${remotePath}`,
`printf '%s' ${shellQuote(chunk)} >> ${shellQuote(tempPath)}`,
);
}
await runChecked(
`finalize upload ${remotePath}`,
`base64 -d < ${shellQuote(tempPath)} > ${shellQuote(remotePath)} && rm -f ${shellQuote(tempPath)}`,
);
},
rename: async (fromPath, toPath) => {
await runChecked(
`rename ${fromPath}`,
`mkdir -p ${shellQuote(path.posix.dirname(toPath))} && mv ${shellQuote(fromPath)} ${shellQuote(toPath)}`,
);
},
remove: async (remotePath) => {
await runChecked(`remove ${remotePath}`, `rm -rf ${shellQuote(remotePath)}`);
},
};
}
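The `writeTextFile` implementation above uploads content as base64 appended in fixed-size chunks, so each shell command stays well under argv limits, then decodes the temp file remotely. A standalone sketch of that round trip, with an in-memory string standing in for the remote temp file (the chunk size here is illustrative, not the real `base64Chunks` size):

```typescript
// Illustrative chunk size; the real base64Chunks helper uses a larger value.
const CHUNK_SIZE = 8;

function* base64Chunks(body: string, size = CHUNK_SIZE): Generator<string> {
  for (let i = 0; i < body.length; i += size) {
    yield body.slice(i, i + size);
  }
}

function roundTrip(text: string): string {
  const encoded = Buffer.from(text, "utf8").toString("base64");
  let remoteTemp = ""; // stands in for the `${remotePath}.paperclip-upload.b64` file
  for (const chunk of base64Chunks(encoded)) {
    remoteTemp += chunk; // models: printf '%s' <chunk> >> tempPath
  }
  // models: base64 -d < tempPath > remotePath
  return Buffer.from(remoteTemp, "base64").toString("utf8");
}
```

Because base64 is pure ASCII, chunk boundaries never split a multi-byte character, which is what makes the append-then-decode scheme safe for arbitrary UTF-8 bodies.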
async function writeBridgeResponse(
client: SandboxCallbackBridgeQueueClient,
responsePath: string,
response: SandboxCallbackBridgeResponse,
) {
const tempPath = `${responsePath}.tmp`;
await client.writeTextFile(tempPath, `${JSON.stringify(response)}\n`);
await client.rename(tempPath, responsePath);
}
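`writeBridgeResponse` relies on the write-to-temp-then-rename pattern so a poller watching `responsePath` never observes a half-written JSON payload: the rename is atomic within a single filesystem. A synchronous local sketch of the same pattern (paths are illustrative):

```typescript
import { mkdtempSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import os from "node:os";
import path from "node:path";

function writeAtomicallySync(filePath: string, body: string): void {
  const tempPath = `${filePath}.tmp`;
  writeFileSync(tempPath, body, "utf8"); // pollers never look at the .tmp name
  renameSync(tempPath, filePath);        // file appears fully written, all at once
}

const dir = mkdtempSync(path.join(os.tmpdir(), "bridge-demo-"));
const target = path.join(dir, "response.json");
writeAtomicallySync(target, `${JSON.stringify({ id: "r1", status: 200 })}\n`);
const observed = readFileSync(target, "utf8");
rmSync(dir, { recursive: true, force: true });
```

The same convention appears on the sandbox side of the bridge (request files and the ready file are also staged as `.tmp` and renamed into place).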
export async function startSandboxCallbackBridgeWorker(input: {
client: SandboxCallbackBridgeQueueClient;
queueDir: string;
pollIntervalMs?: number | null;
authorizeRequest?: (request: SandboxCallbackBridgeRequest) => string | null | Promise<string | null>;
handleRequest: (request: SandboxCallbackBridgeRequest) => Promise<{
status: number;
headers?: Record<string, string>;
body?: string;
}>;
maxBodyBytes?: number | null;
}): Promise<SandboxCallbackBridgeWorkerHandle> {
const pollIntervalMs = normalizeTimeoutMs(input.pollIntervalMs, DEFAULT_BRIDGE_POLL_INTERVAL_MS);
const maxBodyBytes = normalizeTimeoutMs(input.maxBodyBytes, DEFAULT_BRIDGE_MAX_BODY_BYTES); // normalizeTimeoutMs is a generic positive-integer normalizer; reused here for a byte limit
const directories = sandboxCallbackBridgeDirectories(input.queueDir);
await input.client.makeDir(directories.rootDir);
await input.client.makeDir(directories.requestsDir);
await input.client.makeDir(directories.responsesDir);
await input.client.makeDir(directories.logsDir);
let stopping = false;
let inFlight = 0;
let settled = false;
let stopDeadline = Number.POSITIVE_INFINITY;
let settleResolve: (() => void) | null = null;
const settledPromise = new Promise<void>((resolve) => {
settleResolve = resolve;
});
const authorizeRequest = input.authorizeRequest ??
((request: SandboxCallbackBridgeRequest) => authorizeSandboxCallbackBridgeRequestWithRoutes(request));
const processRequestFile = async (fileName: string) => {
const requestPath = path.posix.join(directories.requestsDir, fileName);
const responsePath = path.posix.join(directories.responsesDir, fileName);
const raw = await input.client.readTextFile(requestPath);
let request: SandboxCallbackBridgeRequest;
try {
request = JSON.parse(raw) as SandboxCallbackBridgeRequest;
} catch {
const requestId = fileName.replace(/\.json$/i, "") || randomUUID();
await writeBridgeResponse(input.client, responsePath, {
id: requestId,
status: 400,
headers: { "content-type": "application/json" },
body: JSON.stringify({ error: "Invalid bridge request payload." }),
completedAt: new Date().toISOString(),
});
await input.client.remove(requestPath);
return;
}
const denialReason = await authorizeRequest(request);
if (denialReason) {
await writeBridgeResponse(input.client, responsePath, {
id: request.id,
status: 403,
headers: { "content-type": "application/json" },
body: JSON.stringify({ error: denialReason }),
completedAt: new Date().toISOString(),
});
await input.client.remove(requestPath);
return;
}
try {
const result = await input.handleRequest(request);
const responseBody = result.body ?? "";
if (Buffer.byteLength(responseBody, "utf8") > maxBodyBytes) {
throw new Error(`Bridge response body exceeded the configured size limit of ${maxBodyBytes} bytes.`);
}
await writeBridgeResponse(input.client, responsePath, {
id: request.id,
status: result.status,
headers: result.headers ?? {},
body: responseBody,
completedAt: new Date().toISOString(),
});
} catch (error) {
console.warn(
`[paperclip] sandbox callback bridge handler failed for ${request.id}: ${error instanceof Error ? error.message : String(error)}`,
);
await writeBridgeResponse(input.client, responsePath, {
id: request.id,
status: 502,
headers: { "content-type": "application/json" },
body: JSON.stringify({
error: error instanceof Error ? error.message : String(error),
}),
completedAt: new Date().toISOString(),
});
} finally {
await input.client.remove(requestPath);
}
};
const failPendingRequests = async (message: string) => {
const fileNames = await input.client.listJsonFiles(directories.requestsDir).catch(() => []);
for (const fileName of fileNames) {
const requestPath = path.posix.join(directories.requestsDir, fileName);
const responsePath = path.posix.join(directories.responsesDir, fileName);
const requestId = fileName.replace(/\.json$/i, "") || randomUUID();
try {
const raw = await input.client.readTextFile(requestPath);
const parsed = JSON.parse(raw) as Partial<SandboxCallbackBridgeRequest>;
await writeBridgeResponse(input.client, responsePath, {
id: typeof parsed.id === "string" && parsed.id.length > 0 ? parsed.id : requestId,
status: 503,
headers: { "content-type": "application/json" },
body: JSON.stringify({ error: message }),
completedAt: new Date().toISOString(),
});
} catch (error) {
console.warn(
`[paperclip] sandbox callback bridge failed to abort pending request ${requestId}: ${error instanceof Error ? error.message : String(error)}`,
);
} finally {
await input.client.remove(requestPath).catch(() => undefined);
}
}
};
const loop = (async () => {
try {
while (true) {
const fileNames = await input.client.listJsonFiles(directories.requestsDir);
if (fileNames.length === 0) {
if (stopping) {
break;
}
await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
continue;
}
for (const fileName of fileNames) {
if (stopping && Date.now() >= stopDeadline) break;
inFlight += 1;
try {
await processRequestFile(fileName);
} finally {
inFlight -= 1;
}
}
if (stopping && Date.now() >= stopDeadline) {
break;
}
}
} finally {
settled = true;
if (settleResolve) {
settleResolve();
}
}
})();
loop.catch((error) => {
  // Without this handler a thrown listJsonFiles/readTextFile would surface as
  // an unhandled promise rejection; stop() still settles via the finally block.
  console.warn(
    `[paperclip] sandbox callback bridge worker loop failed: ${error instanceof Error ? error.message : String(error)}`,
  );
});
return {
stop: async (options = {}) => {
stopping = true;
const drainMs = normalizeTimeoutMs(options.drainTimeoutMs, DEFAULT_BRIDGE_STOP_TIMEOUT_MS);
stopDeadline = Date.now() + drainMs;
if (!settled) {
await Promise.race([
settledPromise,
new Promise<void>((resolve) => setTimeout(resolve, drainMs)),
]);
}
await failPendingRequests("Bridge worker stopped before request could be handled.");
},
};
}
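The worker above and the in-sandbox HTTP server communicate through a file queue: the server drops `<id>.json` into the requests directory, the worker handles it, writes `<id>.json` into the responses directory (atomically), and only then deletes the request file. A minimal in-memory model of one round trip, with `Map`s standing in for the two directories:

```typescript
type Dir = Map<string, string>;

function submit(requests: Dir, id: string, body: string): void {
  requests.set(`${id}.json`, JSON.stringify({ id, body }));
}

function workOnce(requests: Dir, responses: Dir, handle: (body: string) => string): void {
  for (const [name, raw] of [...requests]) {
    const request = JSON.parse(raw) as { id: string; body: string };
    // The response must exist before the request is removed, mirroring the
    // worker's write-then-remove ordering; a crash in between leaves the
    // request file behind to be retried or failed, never silently dropped.
    responses.set(name, JSON.stringify({ id: request.id, status: 200, body: handle(request.body) }));
    requests.delete(name);
  }
}

const requests: Dir = new Map();
const responses: Dir = new Map();
submit(requests, "r1", "ping");
workOnce(requests, responses, (body) => body.toUpperCase());
```

This is only a sketch of the handshake ordering; the real worker adds authorization, body-size limits, and the 503 drain path shown above.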
export async function startSandboxCallbackBridgeServer(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
assetRemoteDir: string;
queueDir: string;
bridgeToken: string;
bridgeAsset?: SandboxCallbackBridgeAsset | null;
host?: string;
port?: number | null;
pollIntervalMs?: number | null;
responseTimeoutMs?: number | null;
timeoutMs?: number | null;
nodeCommand?: string;
maxQueueDepth?: number | null;
maxBodyBytes?: number | null;
}): Promise<StartedSandboxCallbackBridgeServer> {
const timeoutMs = normalizeTimeoutMs(input.timeoutMs, DEFAULT_BRIDGE_RESPONSE_TIMEOUT_MS);
const directories = sandboxCallbackBridgeDirectories(input.queueDir);
const remoteEntrypoint = path.posix.join(input.assetRemoteDir, SANDBOX_CALLBACK_BRIDGE_ENTRYPOINT);
if (input.bridgeAsset) {
const assetClient = createCommandManagedSandboxCallbackBridgeQueueClient({
runner: input.runner,
remoteCwd: input.remoteCwd,
timeoutMs,
});
await assetClient.makeDir(input.assetRemoteDir);
const entrypointSource = await fs.readFile(input.bridgeAsset.entrypoint, "utf8");
await assetClient.writeTextFile(remoteEntrypoint, entrypointSource);
}
const env = buildSandboxCallbackBridgeEnv({
queueDir: input.queueDir,
bridgeToken: input.bridgeToken,
host: input.host,
port: input.port,
pollIntervalMs: input.pollIntervalMs,
responseTimeoutMs: input.responseTimeoutMs,
maxQueueDepth: input.maxQueueDepth,
maxBodyBytes: input.maxBodyBytes,
});
const nodeCommand = input.nodeCommand?.trim() || "node";
const startResult = await input.runner.execute({
command: "sh",
args: [
"-lc",
[
`mkdir -p ${shellQuote(directories.requestsDir)} ${shellQuote(directories.responsesDir)} ${shellQuote(directories.logsDir)}`,
`rm -f ${shellQuote(directories.readyFile)} ${shellQuote(directories.pidFile)}`,
`nohup env ${Object.entries(env).map(([key, value]) => `${key}=${shellQuote(value)}`).join(" ")} ` +
`${shellQuote(nodeCommand)} ${shellQuote(remoteEntrypoint)} ` +
`>> ${shellQuote(directories.logFile)} 2>&1 < /dev/null &`,
"pid=$!",
`printf '%s\\n' \"$pid\" > ${shellQuote(directories.pidFile)}`,
"printf '{\"pid\":%s}\\n' \"$pid\"",
].join("\n"),
],
cwd: input.remoteCwd,
timeoutMs,
});
requireSuccessfulResult("start sandbox callback bridge", startResult);
const readyResult = await runShell(
input.runner,
input.remoteCwd,
[
"i=0",
`while [ \"$i\" -lt 200 ]; do`,
` if [ -s ${shellQuote(directories.readyFile)} ]; then`,
` cat ${shellQuote(directories.readyFile)}`,
" exit 0",
" fi",
` if [ -s ${shellQuote(directories.logFile)} ] && ! kill -0 \"$(cat ${shellQuote(directories.pidFile)} 2>/dev/null)\" 2>/dev/null; then`,
` cat ${shellQuote(directories.logFile)} >&2`,
" exit 1",
" fi",
" i=$((i + 1))",
" sleep 0.05",
"done",
`echo "Timed out waiting for bridge readiness." >&2`,
`if [ -s ${shellQuote(directories.logFile)} ]; then cat ${shellQuote(directories.logFile)} >&2; fi`,
"exit 1",
].join("\n"),
timeoutMs,
);
requireSuccessfulResult("wait for sandbox callback bridge readiness", readyResult);
let readyData: { host?: string; port?: number; baseUrl?: string; pid?: number };
try {
readyData = JSON.parse(readyResult.stdout.trim()) as { host?: string; port?: number; baseUrl?: string; pid?: number };
} catch (error) {
throw new Error(
`Sandbox callback bridge wrote invalid readiness JSON: ${error instanceof Error ? error.message : String(error)}`,
);
}
const host = typeof readyData.host === "string" && readyData.host.trim().length > 0
? readyData.host.trim()
: "127.0.0.1";
const port = typeof readyData.port === "number" && Number.isFinite(readyData.port) ? readyData.port : 0;
if (!port) {
throw new Error("Sandbox callback bridge did not report a listening port.");
}
const baseUrl =
typeof readyData.baseUrl === "string" && readyData.baseUrl.trim().length > 0
? readyData.baseUrl.trim()
: `http://${host}:${port}`;
return {
baseUrl,
host,
port,
pid: typeof readyData.pid === "number" && Number.isFinite(readyData.pid) ? readyData.pid : 0,
directories,
stop: async () => {
const stopResult = await input.runner.execute({
command: "sh",
args: [
"-lc",
[
`if [ -s ${shellQuote(directories.pidFile)} ]; then`,
` pid="$(cat ${shellQuote(directories.pidFile)})"`,
" kill \"$pid\" 2>/dev/null || true",
" i=0",
" while kill -0 \"$pid\" 2>/dev/null && [ \"$i\" -lt 40 ]; do",
" i=$((i + 1))",
" sleep 0.05",
" done",
"fi",
`rm -f ${shellQuote(directories.pidFile)} ${shellQuote(directories.readyFile)}`,
].join("\n"),
],
cwd: input.remoteCwd,
timeoutMs,
});
if (stopResult.timedOut) {
throw new Error(buildRunnerFailureMessage("stop sandbox callback bridge", stopResult));
}
},
};
}
function getSandboxCallbackBridgeServerSource(): string {
return `import { randomUUID, timingSafeEqual } from "node:crypto";
import { createServer } from "node:http";
import { promises as fs } from "node:fs";
import path from "node:path";
const queueDir = process.env.PAPERCLIP_BRIDGE_QUEUE_DIR;
const bridgeToken = process.env.PAPERCLIP_BRIDGE_TOKEN;
const host = process.env.PAPERCLIP_BRIDGE_HOST || "127.0.0.1";
const port = Number(process.env.PAPERCLIP_BRIDGE_PORT || "0");
const pollIntervalMs = Number(process.env.PAPERCLIP_BRIDGE_POLL_INTERVAL_MS || "100");
const responseTimeoutMs = Number(process.env.PAPERCLIP_BRIDGE_RESPONSE_TIMEOUT_MS || "30000");
const maxQueueDepth = Number(process.env.PAPERCLIP_BRIDGE_MAX_QUEUE_DEPTH || "${DEFAULT_BRIDGE_MAX_QUEUE_DEPTH}");
const maxBodyBytes = Number(process.env.PAPERCLIP_BRIDGE_MAX_BODY_BYTES || "${DEFAULT_BRIDGE_MAX_BODY_BYTES}");
const allowedHeaders = new Set(${JSON.stringify([...DEFAULT_SANDBOX_CALLBACK_BRIDGE_HEADER_ALLOWLIST])});
if (!queueDir || !bridgeToken) {
throw new Error("PAPERCLIP_BRIDGE_QUEUE_DIR and PAPERCLIP_BRIDGE_TOKEN are required.");
}
const requestsDir = path.posix.join(queueDir, "requests");
const responsesDir = path.posix.join(queueDir, "responses");
const logsDir = path.posix.join(queueDir, "logs");
const readyFile = path.posix.join(queueDir, "ready.json");
function sleep(ms) {
return new Promise((resolve) => setTimeout(resolve, ms));
}
function normalizeHeaders(headers) {
const out = {};
for (const [key, value] of Object.entries(headers)) {
if (value == null) continue;
const normalizedKey = key.toLowerCase();
if (!allowedHeaders.has(normalizedKey)) {
continue;
}
out[normalizedKey] = Array.isArray(value) ? value.join(", ") : String(value);
}
return out;
}
async function readBody(req) {
const chunks = [];
let totalBytes = 0;
for await (const chunk of req) {
const nextChunk = Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk);
chunks.push(nextChunk);
totalBytes += nextChunk.byteLength;
if (totalBytes > maxBodyBytes) {
throw new Error("Bridge request body exceeded the configured size limit.");
}
}
return Buffer.concat(chunks).toString("utf8");
}
async function queueDepth() {
const entries = await fs.readdir(requestsDir, { withFileTypes: true }).catch(() => []);
return entries.filter((entry) => entry.isFile() && entry.name.endsWith(".json")).length;
}
function tokensMatch(received) {
const expected = Buffer.from(bridgeToken, "utf8");
const actual = Buffer.from(typeof received === "string" ? received : "", "utf8");
if (expected.length !== actual.length) return false;
return timingSafeEqual(expected, actual);
}
async function waitForResponse(requestId) {
const responsePath = path.posix.join(responsesDir, \`\${requestId}.json\`);
const deadline = Date.now() + responseTimeoutMs;
while (Date.now() < deadline) {
const body = await fs.readFile(responsePath, "utf8").catch(() => null);
if (body != null) {
await fs.rm(responsePath, { force: true }).catch(() => undefined);
return JSON.parse(body);
}
await sleep(pollIntervalMs);
}
throw new Error("Timed out waiting for host bridge response.");
}
const server = createServer(async (req, res) => {
try {
const auth = req.headers.authorization || "";
const receivedToken = auth.startsWith("Bearer ") ? auth.slice("Bearer ".length) : "";
if (!tokensMatch(receivedToken)) {
res.statusCode = 401;
res.setHeader("content-type", "application/json");
res.end(JSON.stringify({ error: "Invalid bridge token." }));
return;
}
if (await queueDepth() >= maxQueueDepth) {
res.statusCode = 503;
res.setHeader("content-type", "application/json");
res.end(JSON.stringify({ error: "Bridge request queue is full." }));
return;
}
const url = new URL(req.url || "/", "http://127.0.0.1");
const contentType = typeof req.headers["content-type"] === "string" ? req.headers["content-type"] : "";
if (req.method && req.method !== "GET" && req.method !== "HEAD" && !/json/i.test(contentType)) {
res.statusCode = 415;
res.setHeader("content-type", "application/json");
res.end(JSON.stringify({ error: "Bridge only accepts JSON request bodies." }));
return;
}
const requestId = randomUUID();
const requestBody = await readBody(req);
const payload = {
id: requestId,
method: req.method || "GET",
path: url.pathname,
query: url.search,
headers: normalizeHeaders(req.headers),
body: requestBody,
createdAt: new Date().toISOString(),
};
const requestPath = path.posix.join(requestsDir, \`\${requestId}.json\`);
const tempPath = \`\${requestPath}.tmp\`;
await fs.writeFile(tempPath, \`\${JSON.stringify(payload)}\\n\`, "utf8");
await fs.rename(tempPath, requestPath);
const response = await waitForResponse(requestId);
res.statusCode = typeof response.status === "number" ? response.status : 200;
for (const [key, value] of Object.entries(response.headers || {})) {
if (typeof value !== "string" || key.toLowerCase() === "content-length") continue;
res.setHeader(key, value);
}
res.end(typeof response.body === "string" ? response.body : "");
} catch (error) {
res.statusCode = 502;
res.setHeader("content-type", "application/json");
res.end(JSON.stringify({ error: error instanceof Error ? error.message : String(error) }));
}
});
async function shutdown() {
server.close(() => {
process.exit(0);
});
}
process.on("SIGINT", () => void shutdown());
process.on("SIGTERM", () => void shutdown());
await fs.mkdir(requestsDir, { recursive: true });
await fs.mkdir(responsesDir, { recursive: true });
await fs.mkdir(logsDir, { recursive: true });
server.listen(port, host, async () => {
const address = server.address();
if (!address || typeof address === "string") {
throw new Error("Bridge server did not expose a TCP address.");
}
const ready = {
pid: process.pid,
host,
port: address.port,
baseUrl: \`http://\${host}:\${address.port}\`,
startedAt: new Date().toISOString(),
};
const tempReadyFile = \`\${readyFile}.tmp\`;
await fs.writeFile(tempReadyFile, JSON.stringify(ready), "utf8");
await fs.rename(tempReadyFile, readyFile);
});`;
}
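On the host side, `startSandboxCallbackBridgeServer` parses the readiness JSON defensively: host and baseUrl fall back to sane defaults, while a missing port is a hard error. A standalone sketch of that normalization (function name is illustrative):

```typescript
function parseReady(raw: string): { host: string; port: number; baseUrl: string } {
  const data = JSON.parse(raw) as { host?: string; port?: number; baseUrl?: string };
  const host = typeof data.host === "string" && data.host.trim().length > 0
    ? data.host.trim()
    : "127.0.0.1"; // same default the bridge itself uses
  const port = typeof data.port === "number" && Number.isFinite(data.port) ? data.port : 0;
  if (!port) {
    throw new Error("Sandbox callback bridge did not report a listening port.");
  }
  const baseUrl = typeof data.baseUrl === "string" && data.baseUrl.trim().length > 0
    ? data.baseUrl.trim()
    : `http://${host}:${port}`;
  return { host, port, baseUrl };
}
```

Treating the port as mandatory matters because the server listens on port 0 by default, so only the ready file can tell the host which ephemeral port the OS actually assigned.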


@@ -0,0 +1,133 @@
import { lstat, mkdir, mkdtemp, readFile, readdir, rm, symlink, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile as execFileCallback } from "node:child_process";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import {
mirrorDirectory,
prepareSandboxManagedRuntime,
type SandboxManagedRuntimeClient,
} from "./sandbox-managed-runtime.js";
const execFile = promisify(execFileCallback);
describe("sandbox managed runtime", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("preserves excluded local workspace artifacts during restore mirroring", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-sandbox-restore-"));
cleanupDirs.push(rootDir);
const sourceDir = path.join(rootDir, "source");
const targetDir = path.join(rootDir, "target");
await mkdir(path.join(sourceDir, "src"), { recursive: true });
await mkdir(path.join(targetDir, ".claude"), { recursive: true });
await mkdir(path.join(targetDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(sourceDir, "src", "app.ts"), "export const value = 2;\n", "utf8");
await writeFile(path.join(targetDir, "stale.txt"), "remove me\n", "utf8");
await writeFile(path.join(targetDir, ".claude", "settings.json"), "{\"keep\":true}\n", "utf8");
await writeFile(path.join(targetDir, ".claude.json"), "{\"keep\":true}\n", "utf8");
await writeFile(path.join(targetDir, ".paperclip-runtime", "state.json"), "{}\n", "utf8");
await mirrorDirectory(sourceDir, targetDir, {
preserveAbsent: [".paperclip-runtime", ".claude", ".claude.json"],
});
await expect(readFile(path.join(targetDir, "src", "app.ts"), "utf8")).resolves.toBe("export const value = 2;\n");
await expect(readFile(path.join(targetDir, ".claude", "settings.json"), "utf8")).resolves.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(targetDir, ".claude.json"), "utf8")).resolves.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(targetDir, ".paperclip-runtime", "state.json"), "utf8")).resolves.toBe("{}\n");
await expect(readFile(path.join(targetDir, "stale.txt"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
});
it("syncs workspace and assets through a provider-neutral sandbox client", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-sandbox-managed-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
const localAssetsDir = path.join(rootDir, "local-assets");
const linkedAssetPath = path.join(rootDir, "linked-skill.md");
await mkdir(path.join(localWorkspaceDir, ".claude"), { recursive: true });
await mkdir(localAssetsDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
await writeFile(path.join(localWorkspaceDir, "._README.md"), "appledouble\n", "utf8");
await writeFile(path.join(localWorkspaceDir, ".claude", "settings.json"), "{\"local\":true}\n", "utf8");
await writeFile(linkedAssetPath, "skill body\n", "utf8");
await symlink(linkedAssetPath, path.join(localAssetsDir, "skill.md"));
const client: SandboxManagedRuntimeClient = {
makeDir: async (remotePath) => {
await mkdir(remotePath, { recursive: true });
},
writeFile: async (remotePath, bytes) => {
await mkdir(path.dirname(remotePath), { recursive: true });
await writeFile(remotePath, Buffer.from(bytes));
},
readFile: async (remotePath) => await readFile(remotePath),
listFiles: async (remotePath) => {
const entries = await readdir(remotePath, { withFileTypes: true }).catch(() => []);
return entries
.filter((entry) => entry.isFile())
.map((entry) => entry.name)
.sort((left, right) => left.localeCompare(right));
},
remove: async (remotePath) => {
await rm(remotePath, { recursive: true, force: true });
},
run: async (command) => {
await execFile("sh", ["-lc", command], {
maxBuffer: 32 * 1024 * 1024,
});
},
};
const prepared = await prepareSandboxManagedRuntime({
spec: {
transport: "sandbox",
provider: "test",
sandboxId: "sandbox-1",
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
apiKey: null,
},
adapterKey: "test-adapter",
client,
workspaceLocalDir: localWorkspaceDir,
workspaceExclude: [".claude"],
preserveAbsentOnRestore: [".claude"],
assets: [{
key: "skills",
localDir: localAssetsDir,
followSymlinks: true,
}],
});
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
await expect(readFile(path.join(remoteWorkspaceDir, "._README.md"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(remoteWorkspaceDir, ".claude", "settings.json"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(prepared.assetDirs.skills, "skill.md"), "utf8")).resolves.toBe("skill body\n");
expect((await lstat(path.join(prepared.assetDirs.skills, "skill.md"))).isFile()).toBe(true);
await writeFile(path.join(remoteWorkspaceDir, "README.md"), "remote workspace\n", "utf8");
await writeFile(path.join(remoteWorkspaceDir, "remote-only.txt"), "sync back\n", "utf8");
await mkdir(path.join(localWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "{}\n", "utf8");
await writeFile(path.join(localWorkspaceDir, "local-stale.txt"), "remove\n", "utf8");
await prepared.restoreWorkspace();
await expect(readFile(path.join(localWorkspaceDir, "README.md"), "utf8")).resolves.toBe("remote workspace\n");
await expect(readFile(path.join(localWorkspaceDir, "remote-only.txt"), "utf8")).resolves.toBe("sync back\n");
await expect(readFile(path.join(localWorkspaceDir, "local-stale.txt"), "utf8")).rejects.toMatchObject({ code: "ENOENT" });
await expect(readFile(path.join(localWorkspaceDir, ".claude", "settings.json"), "utf8")).resolves.toBe("{\"local\":true}\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).resolves.toBe("{}\n");
});
});


@@ -0,0 +1,339 @@
import { execFile as execFileCallback } from "node:child_process";
import { constants as fsConstants, promises as fs } from "node:fs";
import os from "node:os";
import path from "node:path";
import { promisify } from "node:util";
const execFile = promisify(execFileCallback);
export interface SandboxRemoteExecutionSpec {
transport: "sandbox";
provider: string;
sandboxId: string;
remoteCwd: string;
timeoutMs: number;
apiKey: string | null;
paperclipApiUrl?: string | null;
}
export interface SandboxManagedRuntimeAsset {
key: string;
localDir: string;
followSymlinks?: boolean;
exclude?: string[];
}
export interface SandboxManagedRuntimeClient {
makeDir(remotePath: string): Promise<void>;
writeFile(remotePath: string, bytes: ArrayBuffer): Promise<void>;
readFile(remotePath: string): Promise<Buffer | Uint8Array | ArrayBuffer>;
listFiles(remotePath: string): Promise<string[]>;
remove(remotePath: string): Promise<void>;
run(command: string, options: { timeoutMs: number }): Promise<void>;
}
export interface PreparedSandboxManagedRuntime {
spec: SandboxRemoteExecutionSpec;
workspaceLocalDir: string;
workspaceRemoteDir: string;
runtimeRootDir: string;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
function asObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function asString(value: unknown): string {
return typeof value === "string" ? value : "";
}
function asNumber(value: unknown): number {
return typeof value === "number" ? value : Number(value);
}
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
export function parseSandboxRemoteExecutionSpec(value: unknown): SandboxRemoteExecutionSpec | null {
const parsed = asObject(value);
const transport = asString(parsed.transport).trim();
const provider = asString(parsed.provider).trim();
const sandboxId = asString(parsed.sandboxId).trim();
const remoteCwd = asString(parsed.remoteCwd).trim();
const timeoutMs = asNumber(parsed.timeoutMs);
if (
transport !== "sandbox" ||
provider.length === 0 ||
sandboxId.length === 0 ||
remoteCwd.length === 0 ||
!Number.isFinite(timeoutMs) ||
timeoutMs <= 0
) {
return null;
}
return {
transport: "sandbox",
provider,
sandboxId,
remoteCwd,
timeoutMs,
apiKey: asString(parsed.apiKey).trim() || null,
paperclipApiUrl: asString(parsed.paperclipApiUrl).trim() || null,
};
}
export function buildSandboxExecutionSessionIdentity(spec: SandboxRemoteExecutionSpec | null) {
if (!spec) return null;
return {
transport: "sandbox",
provider: spec.provider,
sandboxId: spec.sandboxId,
remoteCwd: spec.remoteCwd,
...(spec.paperclipApiUrl ? { paperclipApiUrl: spec.paperclipApiUrl } : {}),
} as const;
}
export function sandboxExecutionSessionMatches(saved: unknown, current: SandboxRemoteExecutionSpec | null): boolean {
const currentIdentity = buildSandboxExecutionSessionIdentity(current);
if (!currentIdentity) return false;
const parsedSaved = asObject(saved);
return (
asString(parsedSaved.transport) === currentIdentity.transport &&
asString(parsedSaved.provider) === currentIdentity.provider &&
asString(parsedSaved.sandboxId) === currentIdentity.sandboxId &&
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd &&
asString(parsedSaved.paperclipApiUrl) === asString(currentIdentity.paperclipApiUrl)
);
}
async function withTempDir<T>(prefix: string, fn: (dir: string) => Promise<T>): Promise<T> {
const dir = await fs.mkdtemp(path.join(os.tmpdir(), prefix));
try {
return await fn(dir);
} finally {
await fs.rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
}
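`withTempDir` guarantees cleanup even when the callback throws, because the `rm` lives in a `finally` block. A synchronous sketch of the same contract:

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import os from "node:os";
import path from "node:path";

function withTempDirSync<T>(prefix: string, fn: (dir: string) => T): T {
  const dir = mkdtempSync(path.join(os.tmpdir(), prefix));
  try {
    return fn(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true }); // runs even when fn throws
  }
}

let captured = "";
let threw = false;
try {
  withTempDirSync("paperclip-demo-", (dir) => {
    captured = dir;
    throw new Error("boom");
  });
} catch {
  threw = true;
}
```

The error still propagates to the caller; only the cleanup is unconditional, which is why the sync/restore code above can stage tarballs in temp dirs without leak worries.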
async function execTar(args: string[]): Promise<void> {
await execFile("tar", args, {
env: {
...process.env,
COPYFILE_DISABLE: "1",
},
maxBuffer: 32 * 1024 * 1024,
});
}
async function createTarballFromDirectory(input: {
localDir: string;
archivePath: string;
exclude?: string[];
followSymlinks?: boolean;
}): Promise<void> {
const excludeArgs = ["._*", ...(input.exclude ?? [])].flatMap((entry) => ["--exclude", entry]);
await execTar([
"-c",
...(input.followSymlinks ? ["-h"] : []),
"-f",
input.archivePath,
"-C",
input.localDir,
...excludeArgs,
".",
]);
}
async function extractTarballToDirectory(input: {
archivePath: string;
localDir: string;
}): Promise<void> {
await fs.mkdir(input.localDir, { recursive: true });
await execTar(["-xf", input.archivePath, "-C", input.localDir]);
}
async function walkDirectory(root: string, relative = ""): Promise<string[]> {
const current = path.join(root, relative);
const entries = await fs.readdir(current, { withFileTypes: true }).catch(() => []);
const out: string[] = [];
for (const entry of entries) {
const nextRelative = relative ? path.posix.join(relative, entry.name) : entry.name;
out.push(nextRelative);
if (entry.isDirectory()) {
out.push(...(await walkDirectory(root, nextRelative)));
}
}
return out.sort((left, right) => right.length - left.length); // deepest paths first so callers can delete children before their parent directories
}
function isRelativePathOrDescendant(relative: string, candidate: string): boolean {
return relative === candidate || relative.startsWith(`${candidate}/`);
}
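The explicit `/` separator in `isRelativePathOrDescendant` is what keeps a `.claude` preserve entry from also swallowing a sibling named `.claude.json` (which the restore test above preserves via its own entry). The predicate in isolation:

```typescript
// Matches the candidate path itself or anything nested under it.
function isRelativePathOrDescendant(relative: string, candidate: string): boolean {
  return relative === candidate || relative.startsWith(`${candidate}/`);
}
```

A bare `startsWith(candidate)` check would incorrectly treat `.claude.json` as a descendant of `.claude` and preserve it by accident.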
export async function mirrorDirectory(
sourceDir: string,
targetDir: string,
options: { preserveAbsent?: string[] } = {},
): Promise<void> {
await fs.mkdir(targetDir, { recursive: true });
const preserveAbsent = new Set(options.preserveAbsent ?? []);
const shouldPreserveAbsent = (relative: string) =>
[...preserveAbsent].some((candidate) => isRelativePathOrDescendant(relative, candidate));
const sourceEntries = new Set(await walkDirectory(sourceDir));
const targetEntries = await walkDirectory(targetDir);
for (const relative of targetEntries) {
if (shouldPreserveAbsent(relative)) continue;
if (!sourceEntries.has(relative)) {
await fs.rm(path.join(targetDir, relative), { recursive: true, force: true }).catch(() => undefined);
}
}
const copyEntry = async (relative: string) => {
const sourcePath = path.join(sourceDir, relative);
const targetPath = path.join(targetDir, relative);
const stats = await fs.lstat(sourcePath);
if (stats.isDirectory()) {
await fs.mkdir(targetPath, { recursive: true });
return;
}
await fs.mkdir(path.dirname(targetPath), { recursive: true });
await fs.rm(targetPath, { recursive: true, force: true }).catch(() => undefined);
if (stats.isSymbolicLink()) {
const linkTarget = await fs.readlink(sourcePath);
await fs.symlink(linkTarget, targetPath);
return;
}
await fs.copyFile(sourcePath, targetPath, fsConstants.COPYFILE_FICLONE).catch(async () => {
await fs.copyFile(sourcePath, targetPath);
});
await fs.chmod(targetPath, stats.mode);
};
const entries = (await walkDirectory(sourceDir)).sort((left, right) => left.localeCompare(right));
for (const relative of entries) {
await copyEntry(relative);
}
}
function toArrayBuffer(bytes: Buffer): ArrayBuffer {
return Uint8Array.from(bytes).buffer;
}
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function tarExcludeFlags(exclude: string[] | undefined): string {
return ["._*", ...(exclude ?? [])].map((entry) => `--exclude ${shellQuote(entry)}`).join(" ");
}
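`shellQuote` above wraps a value in single quotes and rewrites each embedded single quote as `'"'"'` (close the quote, emit a double-quoted quote, reopen), the standard POSIX-safe escape. In isolation:

```typescript
// POSIX-safe single-quoting: nothing inside single quotes is special except
// the closing quote itself, which is spliced in via a double-quoted segment.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'"'"'`)}'`;
}
```

This is why every path and pattern interpolated into the `sh -lc` scripts in this module goes through `shellQuote` first: spaces, globs, and `$` in filenames all become inert.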
export async function prepareSandboxManagedRuntime(input: {
spec: SandboxRemoteExecutionSpec;
adapterKey: string;
client: SandboxManagedRuntimeClient;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: SandboxManagedRuntimeAsset[];
}): Promise<PreparedSandboxManagedRuntime> {
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeRootDir = path.posix.join(workspaceRemoteDir, ".paperclip-runtime", input.adapterKey);
await withTempDir("paperclip-sandbox-sync-", async (tempDir) => {
const workspaceTarPath = path.join(tempDir, "workspace.tar");
await createTarballFromDirectory({
localDir: input.workspaceLocalDir,
archivePath: workspaceTarPath,
exclude: input.workspaceExclude,
});
const workspaceTarBytes = await fs.readFile(workspaceTarPath);
const remoteWorkspaceTar = path.posix.join(runtimeRootDir, "workspace-upload.tar");
await input.client.makeDir(runtimeRootDir);
await input.client.writeFile(remoteWorkspaceTar, toArrayBuffer(workspaceTarBytes));
const preservedNames = new Set([".paperclip-runtime", ...(input.preserveAbsentOnRestore ?? [])]);
const findPreserveArgs = [...preservedNames].map((entry) => `! -name ${shellQuote(entry)}`).join(" ");
await input.client.run(
`sh -lc ${shellQuote(
`mkdir -p ${shellQuote(workspaceRemoteDir)} && ` +
`find ${shellQuote(workspaceRemoteDir)} -mindepth 1 -maxdepth 1 ${findPreserveArgs} -exec rm -rf -- {} + && ` +
`tar -xf ${shellQuote(remoteWorkspaceTar)} -C ${shellQuote(workspaceRemoteDir)} && ` +
`rm -f ${shellQuote(remoteWorkspaceTar)}`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
for (const asset of input.assets ?? []) {
const assetTarPath = path.join(tempDir, `${asset.key}.tar`);
await createTarballFromDirectory({
localDir: asset.localDir,
archivePath: assetTarPath,
followSymlinks: asset.followSymlinks,
exclude: asset.exclude,
});
const assetTarBytes = await fs.readFile(assetTarPath);
const remoteAssetDir = path.posix.join(runtimeRootDir, asset.key);
const remoteAssetTar = path.posix.join(runtimeRootDir, `${asset.key}-upload.tar`);
await input.client.writeFile(remoteAssetTar, toArrayBuffer(assetTarBytes));
await input.client.run(
`sh -lc ${shellQuote(
`rm -rf ${shellQuote(remoteAssetDir)} && ` +
`mkdir -p ${shellQuote(remoteAssetDir)} && ` +
`tar -xf ${shellQuote(remoteAssetTar)} -C ${shellQuote(remoteAssetDir)} && ` +
`rm -f ${shellQuote(remoteAssetTar)}`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
}
});
const assetDirs = Object.fromEntries(
(input.assets ?? []).map((asset) => [asset.key, path.posix.join(runtimeRootDir, asset.key)]),
);
return {
spec: input.spec,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
runtimeRootDir,
assetDirs,
restoreWorkspace: async () => {
await withTempDir("paperclip-sandbox-restore-", async (tempDir) => {
const remoteWorkspaceTar = path.posix.join(runtimeRootDir, "workspace-download.tar");
await input.client.run(
`sh -lc ${shellQuote(
`mkdir -p ${shellQuote(runtimeRootDir)} && ` +
`tar -cf ${shellQuote(remoteWorkspaceTar)} -C ${shellQuote(workspaceRemoteDir)} ` +
`${tarExcludeFlags(input.workspaceExclude)} .`,
)}`,
{ timeoutMs: input.spec.timeoutMs },
);
const archiveBytes = await input.client.readFile(remoteWorkspaceTar);
await input.client.remove(remoteWorkspaceTar).catch(() => undefined);
const localArchivePath = path.join(tempDir, "workspace.tar");
const extractedDir = path.join(tempDir, "workspace");
await fs.writeFile(localArchivePath, toBuffer(archiveBytes));
await extractTarballToDirectory({
archivePath: localArchivePath,
localDir: extractedDir,
});
await mirrorDirectory(extractedDir, input.workspaceLocalDir, {
preserveAbsent: [".paperclip-runtime", ...(input.preserveAbsentOnRestore ?? [])],
});
});
},
};
}
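The workspace restore above assembles tar `--exclude` flags by shell-quoting each entry. A standalone sketch of that flag builder; the simplified `shellQuote` here is an illustrative stand-in, not the real helper from the codebase:

```typescript
// Illustrative stand-in for the real shellQuote helper (assumption: POSIX
// single-quote escaping); not the actual implementation from the codebase.
function shellQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Mirrors tarExcludeFlags above: "._*" (macOS AppleDouble files) is always
// excluded, followed by any caller-supplied entries.
function tarExcludeFlags(exclude: string[] | undefined): string {
  return ["._*", ...(exclude ?? [])]
    .map((entry) => `--exclude ${shellQuote(entry)}`)
    .join(" ");
}

// tarExcludeFlags(["node_modules"]) → "--exclude '._*' --exclude 'node_modules'"
```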


@@ -1,6 +1,7 @@
import { randomUUID } from "node:crypto";
import { describe, expect, it } from "vitest";
import {
applyPaperclipWorkspaceEnv,
appendWithByteCap,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
renderPaperclipWakePrompt,
@@ -254,6 +255,7 @@ describe("renderPaperclipWakePrompt", () => {
it("keeps the default local-agent prompt action-oriented", () => {
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Start actionable work in this heartbeat");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("do not stop at a plan");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Prefer the smallest verification that proves the change");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Use child issues");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("instead of polling agents, sessions, or processes");
expect(DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE).toContain("Create child issues directly when you know what needs to be done");
@@ -326,6 +328,34 @@ describe("renderPaperclipWakePrompt", () => {
expect(prompt).toContain("PAP-1723 Finish blocker (todo)");
});
it("renders loose review request instructions for execution handoffs", () => {
const prompt = renderPaperclipWakePrompt({
reason: "execution_review_requested",
issue: {
id: "issue-1",
identifier: "PAP-2011",
title: "Review request handoff",
status: "in_review",
},
executionStage: {
wakeRole: "reviewer",
stageId: "stage-1",
stageType: "review",
currentParticipant: { type: "agent", agentId: "agent-1" },
returnAssignee: { type: "agent", agentId: "agent-2" },
reviewRequest: {
instructions: "Please focus on edge cases and leave a short risk summary.",
},
allowedActions: ["approve", "request_changes"],
},
fallbackFetchNeeded: false,
});
expect(prompt).toContain("Review request instructions:");
expect(prompt).toContain("Please focus on edge cases and leave a short risk summary.");
expect(prompt).toContain("You are waking as the active reviewer for this issue.");
});
it("includes continuation and child issue summaries in structured wake context", () => {
const payload = {
reason: "issue_children_completed",
@@ -396,6 +426,50 @@ describe("renderPaperclipWakePrompt", () => {
});
});
describe("applyPaperclipWorkspaceEnv", () => {
it("adds shared workspace env vars including AGENT_HOME", () => {
const env = applyPaperclipWorkspaceEnv(
{},
{
workspaceCwd: "/tmp/workspace",
workspaceSource: "project_primary",
workspaceStrategy: "git_worktree",
workspaceId: "workspace-1",
workspaceRepoUrl: "https://github.com/paperclipai/paperclip.git",
workspaceRepoRef: "main",
workspaceBranch: "feature/test",
workspaceWorktreePath: "/tmp/worktree",
agentHome: "/tmp/agent-home",
},
);
expect(env).toEqual({
PAPERCLIP_WORKSPACE_CWD: "/tmp/workspace",
PAPERCLIP_WORKSPACE_SOURCE: "project_primary",
PAPERCLIP_WORKSPACE_STRATEGY: "git_worktree",
PAPERCLIP_WORKSPACE_ID: "workspace-1",
PAPERCLIP_WORKSPACE_REPO_URL: "https://github.com/paperclipai/paperclip.git",
PAPERCLIP_WORKSPACE_REPO_REF: "main",
PAPERCLIP_WORKSPACE_BRANCH: "feature/test",
PAPERCLIP_WORKSPACE_WORKTREE_PATH: "/tmp/worktree",
AGENT_HOME: "/tmp/agent-home",
});
});
it("skips empty workspace env values", () => {
const env = applyPaperclipWorkspaceEnv(
{},
{
workspaceCwd: "",
workspaceSource: null,
agentHome: "",
},
);
expect(env).toEqual({});
});
});
describe("appendWithByteCap", () => {
it("keeps valid UTF-8 when trimming through multibyte text", () => {
const output = appendWithByteCap("prefix ", "hello — world", 7);


@@ -87,10 +87,12 @@ export const DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE = [
"Execution contract:",
"- Start actionable work in this heartbeat; do not stop at a plan unless the issue asks for planning.",
"- Leave durable progress in comments, documents, or work products with a clear next action.",
"- Prefer the smallest verification that proves the change; do not default to full workspace typecheck/build/test on every heartbeat unless the task scope warrants it.",
"- Use child issues for parallel or long delegated work instead of polling agents, sessions, or processes.",
"- If woken by a human comment on a dependency-blocked issue, respond or triage the comment without treating the blocked deliverable work as unblocked.",
"- Create child issues directly when you know what needs to be done; use issue-thread interactions when the board/user must choose suggested tasks, answer structured questions, or confirm a proposal.",
"- To ask for that input, create an interaction on the current issue with POST /api/issues/{issueId}/interactions using kind suggest_tasks, ask_user_questions, or request_confirmation. Use continuationPolicy wake_assignee when you need to resume after a response; for request_confirmation this resumes only after acceptance.",
"- When you intentionally restart follow-up work on a completed assigned issue, include structured `resume: true` with the POST /api/issues/{issueId}/comments or PATCH /api/issues/{issueId} comment payload. Generic agent comments on closed issues are inert by default.",
"- For plan approval, update the plan document first, then create request_confirmation targeting the latest plan revision with idempotencyKey confirmation:{issueId}:plan:{revisionId}. Wait for acceptance before creating implementation subtasks, and create a fresh confirmation after superseding board/user comments if approval is still needed.",
"- If blocked, mark the issue blocked and name the unblock owner and action.",
"- Respect budget, pause/cancel, approval gates, and company boundaries.",
@@ -282,6 +284,9 @@ type PaperclipWakeExecutionStage = {
stageType: string | null;
currentParticipant: PaperclipWakeExecutionPrincipal | null;
returnAssignee: PaperclipWakeExecutionPrincipal | null;
reviewRequest: {
instructions: string;
} | null;
lastDecisionOutcome: string | null;
allowedActions: string[];
};
@@ -484,11 +489,14 @@ function normalizePaperclipWakeExecutionStage(value: unknown): PaperclipWakeExec
: [];
const currentParticipant = normalizePaperclipWakeExecutionPrincipal(stage.currentParticipant);
const returnAssignee = normalizePaperclipWakeExecutionPrincipal(stage.returnAssignee);
const reviewRequestRaw = parseObject(stage.reviewRequest);
const reviewInstructions = asString(reviewRequestRaw.instructions, "").trim();
const reviewRequest = reviewInstructions ? { instructions: reviewInstructions } : null;
const stageId = asString(stage.stageId, "").trim() || null;
const stageType = asString(stage.stageType, "").trim() || null;
const lastDecisionOutcome = asString(stage.lastDecisionOutcome, "").trim() || null;
-if (!wakeRole && !stageId && !stageType && !currentParticipant && !returnAssignee && !lastDecisionOutcome && allowedActions.length === 0) {
+if (!wakeRole && !stageId && !stageType && !currentParticipant && !returnAssignee && !reviewRequest && !lastDecisionOutcome && allowedActions.length === 0) {
return null;
}
@@ -498,6 +506,7 @@ function normalizePaperclipWakeExecutionStage(value: unknown): PaperclipWakeExec
stageType,
currentParticipant,
returnAssignee,
reviewRequest,
lastDecisionOutcome,
allowedActions,
};
@@ -664,6 +673,13 @@ export function renderPaperclipWakePrompt(
if (executionStage.allowedActions.length > 0) {
lines.push(`- allowed actions: ${executionStage.allowedActions.join(", ")}`);
}
if (executionStage.reviewRequest) {
lines.push(
"",
"Review request instructions:",
executionStage.reviewRequest.instructions,
);
}
lines.push("");
if (executionStage.wakeRole === "reviewer" || executionStage.wakeRole === "approver") {
lines.push(
@@ -819,6 +835,41 @@ export function buildPaperclipEnv(agent: { id: string; companyId: string }): Rec
return vars;
}
export function applyPaperclipWorkspaceEnv(
env: Record<string, string>,
input: {
workspaceCwd?: string | null;
workspaceSource?: string | null;
workspaceStrategy?: string | null;
workspaceId?: string | null;
workspaceRepoUrl?: string | null;
workspaceRepoRef?: string | null;
workspaceBranch?: string | null;
workspaceWorktreePath?: string | null;
agentHome?: string | null;
},
): Record<string, string> {
const mappings = [
["PAPERCLIP_WORKSPACE_CWD", input.workspaceCwd],
["PAPERCLIP_WORKSPACE_SOURCE", input.workspaceSource],
["PAPERCLIP_WORKSPACE_STRATEGY", input.workspaceStrategy],
["PAPERCLIP_WORKSPACE_ID", input.workspaceId],
["PAPERCLIP_WORKSPACE_REPO_URL", input.workspaceRepoUrl],
["PAPERCLIP_WORKSPACE_REPO_REF", input.workspaceRepoRef],
["PAPERCLIP_WORKSPACE_BRANCH", input.workspaceBranch],
["PAPERCLIP_WORKSPACE_WORKTREE_PATH", input.workspaceWorktreePath],
["AGENT_HOME", input.agentHome],
] as const;
for (const [key, value] of mappings) {
if (typeof value === "string" && value.length > 0) {
env[key] = value;
}
}
return env;
}
export function sanitizeInheritedPaperclipEnv(baseEnv: NodeJS.ProcessEnv): NodeJS.ProcessEnv {
const env: NodeJS.ProcessEnv = { ...baseEnv };
for (const key of Object.keys(env)) {
@@ -1021,6 +1072,20 @@ export async function resolvePaperclipSkillsDir(
return null;
}
async function readSkillRequired(skillDir: string): Promise<boolean> {
try {
const content = await fs.readFile(path.join(skillDir, "SKILL.md"), "utf8");
const normalized = content.replace(/\r\n/g, "\n");
if (!normalized.startsWith("---\n")) return true;
const closing = normalized.indexOf("\n---\n", 4);
if (closing < 0) return true;
const frontmatter = normalized.slice(4, closing);
return !/^\s*required\s*:\s*false\s*$/m.test(frontmatter);
} catch {
return true;
}
}
export async function listPaperclipSkillEntries(
moduleDir: string,
additionalCandidates: string[] = [],
@@ -1030,15 +1095,20 @@ export async function listPaperclipSkillEntries(
try {
const entries = await fs.readdir(root, { withFileTypes: true });
-return entries
-.filter((entry) => entry.isDirectory())
-.map((entry) => ({
+const dirs = entries.filter((entry) => entry.isDirectory());
+return Promise.all(dirs.map(async (entry) => {
+const skillDir = path.join(root, entry.name);
+const required = await readSkillRequired(skillDir);
+return {
key: `paperclipai/paperclip/${entry.name}`,
runtimeName: entry.name,
-source: path.join(root, entry.name),
-required: true,
-requiredReason: "Bundled Paperclip skills are always available for local adapters.",
-}));
+source: skillDir,
+required,
+requiredReason: required
+? "Bundled Paperclip skills are always available for local adapters."
+: null,
+};
+}));
} catch {
return [];
}
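The `readSkillRequired` helper above treats a skill as required unless its SKILL.md frontmatter explicitly sets `required: false`. A standalone sketch of that frontmatter check, with the file content passed in directly instead of read from disk:

```typescript
// Mirrors readSkillRequired above, with the SKILL.md content passed in
// directly: a skill stays required unless a well-formed YAML frontmatter
// block explicitly contains `required: false`.
function skillRequiredFromContent(content: string): boolean {
  const normalized = content.replace(/\r\n/g, "\n");
  // No frontmatter fence at the top → required by default.
  if (!normalized.startsWith("---\n")) return true;
  const closing = normalized.indexOf("\n---\n", 4);
  // Unterminated frontmatter → required by default.
  if (closing < 0) return true;
  const frontmatter = normalized.slice(4, closing);
  return !/^\s*required\s*:\s*false\s*$/m.test(frontmatter);
}

// skillRequiredFromContent("---\nrequired: false\n---\n# Skill") → false
// skillRequiredFromContent("# No frontmatter") → true
```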


@@ -476,8 +476,8 @@ async function importGitWorkspaceToSsh(input: {
`if [ ! -d ${shellQuote(path.posix.join(input.remoteDir, ".git"))} ]; then git init ${shellQuote(input.remoteDir)} >/dev/null; fi`,
`git -C ${shellQuote(input.remoteDir)} fetch --force "$tmp_bundle" '${tempRef}:${tempRef}' >/dev/null`,
input.snapshot.branchName
-? `git -C ${shellQuote(input.remoteDir)} checkout -B ${shellQuote(input.snapshot.branchName)} ${shellQuote(input.snapshot.headCommit)} >/dev/null`
-: `git -C ${shellQuote(input.remoteDir)} -c advice.detachedHead=false checkout --detach ${shellQuote(input.snapshot.headCommit)} >/dev/null`
+? `git -C ${shellQuote(input.remoteDir)} checkout --force -B ${shellQuote(input.snapshot.branchName)} ${shellQuote(input.snapshot.headCommit)} >/dev/null`
+: `git -C ${shellQuote(input.remoteDir)} -c advice.detachedHead=false checkout --force --detach ${shellQuote(input.snapshot.headCommit)} >/dev/null`
`git -C ${shellQuote(input.remoteDir)} reset --hard ${shellQuote(input.snapshot.headCommit)} >/dev/null`,
`git -C ${shellQuote(input.remoteDir)} clean -fdx -e .paperclip-runtime >/dev/null`,
].join("\n");


@@ -64,12 +64,16 @@ export interface AdapterRuntimeServiceReport {
healthStatus?: "unknown" | "healthy" | "unhealthy";
}
export type AdapterExecutionErrorFamily = "transient_upstream";
export interface AdapterExecutionResult {
exitCode: number | null;
signal: string | null;
timedOut: boolean;
errorMessage?: string | null;
errorCode?: string | null;
errorFamily?: AdapterExecutionErrorFamily | null;
retryNotBefore?: string | null;
errorMeta?: Record<string, unknown>;
usage?: UsageSummary;
/**
@@ -212,6 +216,20 @@ export interface AdapterEnvironmentTestContext {
companyId: string;
adapterType: string;
config: Record<string, unknown>;
/**
* Optional execution target the adapter should run probes against.
*
* If omitted (or `kind === "local"`), the adapter runs its probes on the
* Paperclip host. For SSH/sandbox targets the adapter should run its
* command/auth probes inside the remote environment so the result reflects
* what an agent run would actually see at execution time.
*/
executionTarget?: AdapterExecutionTarget | null;
/**
* Friendly name of the environment being tested (when `executionTarget` is set).
* Surfaced in check messages so users see which environment the probe ran in.
*/
environmentName?: string | null;
deployment?: {
mode?: "local_trusted" | "authenticated";
exposure?: "private" | "public";
@@ -311,6 +329,13 @@ export interface ServerAdapterModule {
supportsLocalAgentJwt?: boolean;
models?: AdapterModel[];
listModels?: () => Promise<AdapterModel[]>;
/**
* Optional explicit refresh hook for model discovery.
* Use this when the adapter caches discovered models and needs a bypass path
* so the UI can fetch newly released models without waiting for cache expiry
* or a Paperclip code update.
*/
refreshModels?: () => Promise<AdapterModel[]>;
agentConfigurationDoc?: string;
/**
* Optional lifecycle hook when an agent is approved/hired (join-request or hire_agent approval).


@@ -0,0 +1,66 @@
import * as fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import { prepareClaudeConfigSeed } from "./claude-config.js";
describe("prepareClaudeConfigSeed", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
vi.restoreAllMocks();
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await fs.rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
function createEnv(root: string, sourceDir: string): NodeJS.ProcessEnv {
return {
HOME: root,
PAPERCLIP_HOME: path.join(root, "paperclip-home"),
PAPERCLIP_INSTANCE_ID: "test-instance",
CLAUDE_CONFIG_DIR: sourceDir,
};
}
it("reuses the same snapshot path when the seeded files are unchanged", async () => {
const root = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-claude-config-seed-"));
cleanupDirs.push(root);
const sourceDir = path.join(root, "claude-source");
await fs.mkdir(sourceDir, { recursive: true });
await fs.writeFile(path.join(sourceDir, "settings.json"), JSON.stringify({ theme: "light" }), "utf8");
const onLog = vi.fn(async () => {});
const env = createEnv(root, sourceDir);
const first = await prepareClaudeConfigSeed(env, onLog, "company-1");
const second = await prepareClaudeConfigSeed(env, onLog, "company-1");
expect(first).toBe(second);
await expect(fs.readFile(path.join(first, "settings.json"), "utf8"))
.resolves.toBe(JSON.stringify({ theme: "light" }));
});
it("keeps an existing snapshot intact when the seeded files change", async () => {
const root = await fs.mkdtemp(path.join(os.tmpdir(), "paperclip-claude-config-race-"));
cleanupDirs.push(root);
const sourceDir = path.join(root, "claude-source");
await fs.mkdir(sourceDir, { recursive: true });
await fs.writeFile(path.join(sourceDir, "settings.json"), JSON.stringify({ theme: "light" }), "utf8");
const onLog = vi.fn(async () => {});
const env = createEnv(root, sourceDir);
const first = await prepareClaudeConfigSeed(env, onLog, "company-1");
await fs.writeFile(path.join(sourceDir, "settings.json"), JSON.stringify({ theme: "dark" }), "utf8");
const second = await prepareClaudeConfigSeed(env, onLog, "company-1");
expect(second).not.toBe(first);
await expect(fs.readFile(path.join(first, "settings.json"), "utf8"))
.resolves.toBe(JSON.stringify({ theme: "light" }));
await expect(fs.readFile(path.join(second, "settings.json"), "utf8"))
.resolves.toBe(JSON.stringify({ theme: "dark" }));
});
});


@@ -0,0 +1,135 @@
import { createHash } from "node:crypto";
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import type { AdapterExecutionContext } from "@paperclipai/adapter-utils";
const DEFAULT_PAPERCLIP_INSTANCE_ID = "default";
const SEEDED_SHARED_FILES = [
".credentials.json",
"credentials.json",
"settings.json",
"settings.local.json",
"CLAUDE.md",
] as const;
function nonEmpty(value: string | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
async function pathExists(candidate: string): Promise<boolean> {
return fs.access(candidate).then(() => true).catch(() => false);
}
function isAlreadyExistsError(error: unknown): boolean {
if (!error || typeof error !== "object") return false;
const code = "code" in error ? error.code : null;
return code === "EEXIST" || code === "ENOTEMPTY";
}
async function collectSeedFiles(sourceDir: string): Promise<Array<{ name: string; sourcePath: string }>> {
const files: Array<{ name: string; sourcePath: string }> = [];
for (const name of SEEDED_SHARED_FILES) {
const sourcePath = path.join(sourceDir, name);
if (!(await pathExists(sourcePath))) continue;
files.push({ name, sourcePath });
}
return files;
}
async function buildSeedSnapshotKey(files: Array<{ name: string; sourcePath: string }>): Promise<string> {
if (files.length === 0) return "empty";
const hash = createHash("sha256");
for (const file of files) {
hash.update(file.name);
hash.update("\0");
hash.update(await fs.readFile(file.sourcePath));
hash.update("\0");
}
return hash.digest("hex").slice(0, 16);
}
async function materializeSeedSnapshot(input: {
rootDir: string;
snapshotKey: string;
files: Array<{ name: string; sourcePath: string }>;
}): Promise<string> {
const targetDir = path.join(input.rootDir, input.snapshotKey);
if (await pathExists(targetDir)) {
return targetDir;
}
await fs.mkdir(input.rootDir, { recursive: true });
const stagingDir = await fs.mkdtemp(path.join(input.rootDir, ".tmp-"));
try {
for (const file of input.files) {
await fs.copyFile(file.sourcePath, path.join(stagingDir, file.name));
}
try {
await fs.rename(stagingDir, targetDir);
} catch (error) {
if (!isAlreadyExistsError(error)) {
throw error;
}
await fs.rm(stagingDir, { recursive: true, force: true });
}
} catch (error) {
await fs.rm(stagingDir, { recursive: true, force: true }).catch(() => undefined);
throw error;
}
return targetDir;
}
export function resolveSharedClaudeConfigDir(
env: NodeJS.ProcessEnv = process.env,
): string {
const fromEnv = nonEmpty(env.CLAUDE_CONFIG_DIR);
return fromEnv ? path.resolve(fromEnv) : path.join(os.homedir(), ".claude");
}
export function resolveManagedClaudeConfigSeedDir(
env: NodeJS.ProcessEnv,
companyId?: string,
): string {
const paperclipHome = nonEmpty(env.PAPERCLIP_HOME) ?? path.resolve(os.homedir(), ".paperclip");
const instanceId = nonEmpty(env.PAPERCLIP_INSTANCE_ID) ?? DEFAULT_PAPERCLIP_INSTANCE_ID;
return companyId
? path.resolve(paperclipHome, "instances", instanceId, "companies", companyId, "claude-config-seed")
: path.resolve(paperclipHome, "instances", instanceId, "claude-config-seed");
}
export async function prepareClaudeConfigSeed(
env: NodeJS.ProcessEnv,
onLog: AdapterExecutionContext["onLog"],
companyId?: string,
): Promise<string> {
const sourceDir = resolveSharedClaudeConfigDir(env);
const targetRootDir = resolveManagedClaudeConfigSeedDir(env, companyId);
if (path.resolve(sourceDir) === path.resolve(targetRootDir)) {
return targetRootDir;
}
const copiedFiles = await collectSeedFiles(sourceDir);
const snapshotKey = await buildSeedSnapshotKey(copiedFiles);
const targetDir = await materializeSeedSnapshot({
rootDir: targetRootDir,
snapshotKey,
files: copiedFiles,
});
if (copiedFiles.length > 0) {
await onLog(
"stdout",
`[paperclip] Prepared Claude config seed "${targetDir}" from "${sourceDir}" (${copiedFiles.map((file) => file.name).join(", ")}).\n`,
);
} else {
await onLog(
"stdout",
`[paperclip] No local Claude config seed files were found in "${sourceDir}". Remote Claude auth may still require login.\n`,
);
}
return targetDir;
}
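`prepareClaudeConfigSeed` keys each snapshot by a truncated SHA-256 over the seed file names and contents, so an unchanged seed resolves to the same directory and can be reused. A minimal sketch of that keying scheme, hashing in-memory strings instead of files read from disk:

```typescript
import { createHash } from "node:crypto";

// Content-addressed snapshot key, as in buildSeedSnapshotKey above, but over
// in-memory contents. Identical names + contents yield the same 16-hex-char
// key, so an existing snapshot directory can be reused; any change produces
// a new key.
function seedSnapshotKey(files: Array<{ name: string; content: string }>): string {
  if (files.length === 0) return "empty";
  const hash = createHash("sha256");
  for (const file of files) {
    hash.update(file.name);
    hash.update("\0"); // separator keeps name/content boundaries unambiguous
    hash.update(file.content);
    hash.update("\0");
  }
  return hash.digest("hex").slice(0, 16);
}
```

The `\0` separators matter: without them, `{name: "ab", content: "c"}` and `{name: "a", content: "bc"}` would hash to the same key.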


@@ -10,12 +10,15 @@ import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
@@ -24,6 +27,7 @@ import {
asStringArray,
parseObject,
parseJson,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
readPaperclipRuntimeSkillEntries,
joinPromptSections,
@@ -35,13 +39,17 @@ import {
stringifyPaperclipWakePayload,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
} from "@paperclipai/adapter-utils/server-utils";
import { shellQuote } from "@paperclipai/adapter-utils/ssh";
import {
parseClaudeStreamJson,
describeClaudeFailure,
detectClaudeLoginRequired,
extractClaudeRetryNotBefore,
isClaudeMaxTurnsResult,
isClaudeTransientUpstreamError,
isClaudeUnknownSessionError,
} from "./parse.js";
import { prepareClaudeConfigSeed } from "./claude-config.js";
import { resolveClaudeDesiredSkillNames } from "./skills.js";
import { isBedrockModelId } from "./models.js";
import { prepareClaudePromptBundle } from "./prompt-cache.js";
@@ -191,33 +199,17 @@ async function buildClaudeRuntimeConfig(input: ClaudeExecutionInput): Promise<Cl
if (wakePayloadJson) {
env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
}
-if (effectiveWorkspaceCwd) {
-env.PAPERCLIP_WORKSPACE_CWD = effectiveWorkspaceCwd;
-}
-if (workspaceSource) {
-env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
-}
-if (workspaceStrategy) {
-env.PAPERCLIP_WORKSPACE_STRATEGY = workspaceStrategy;
-}
-if (workspaceId) {
-env.PAPERCLIP_WORKSPACE_ID = workspaceId;
-}
-if (workspaceRepoUrl) {
-env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
-}
-if (workspaceRepoRef) {
-env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
-}
-if (workspaceBranch) {
-env.PAPERCLIP_WORKSPACE_BRANCH = workspaceBranch;
-}
-if (workspaceWorktreePath) {
-env.PAPERCLIP_WORKSPACE_WORKTREE_PATH = workspaceWorktreePath;
-}
-if (agentHome) {
-env.AGENT_HOME = agentHome;
-}
+applyPaperclipWorkspaceEnv(env, {
+workspaceCwd: effectiveWorkspaceCwd,
+workspaceSource,
+workspaceStrategy,
+workspaceId,
+workspaceRepoUrl,
+workspaceRepoRef,
+workspaceBranch,
+workspaceWorktreePath,
+agentHome,
+});
if (workspaceHints.length > 0) {
env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
}
@@ -329,6 +321,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const chrome = asBoolean(config.chrome, false);
const maxTurns = asNumber(config.maxTurnsPerRun, 0);
const dangerouslySkipPermissions = asBoolean(config.dangerouslySkipPermissions, true);
const configEnv = parseObject(config.env);
const hasExplicitClaudeConfigDir =
typeof configEnv.CLAUDE_CONFIG_DIR === "string" && configEnv.CLAUDE_CONFIG_DIR.trim().length > 0;
const instructionsFilePath = asString(config.instructionsFilePath, "").trim();
const instructionsFileDir = instructionsFilePath ? `${path.dirname(instructionsFilePath)}/` : "";
const runtimeConfig = await buildClaudeRuntimeConfig({
@@ -347,11 +342,12 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
workspaceRepoUrl,
workspaceRepoRef,
env,
-loggedEnv,
+loggedEnv: initialLoggedEnv,
timeoutSec,
graceSec,
extraArgs,
} = runtimeConfig;
let loggedEnv = initialLoggedEnv;
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
const terminalResultCleanupGraceMs = Math.max(
0,
@@ -392,6 +388,13 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
instructionsContents: combinedInstructionsContents,
onLog,
});
const useManagedRemoteClaudeConfig =
executionTargetIsRemote &&
adapterExecutionTargetUsesManagedHome(executionTarget) &&
!hasExplicitClaudeConfigDir;
const claudeConfigSeedDir = useManagedRemoteClaudeConfig
? await prepareClaudeConfigSeed(process.env, onLog, agent.companyId)
: null;
const preparedExecutionTargetRuntime = executionTargetIsRemote
? await (async () => {
await onLog(
@@ -408,6 +411,13 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
localDir: promptBundle.addDir,
followSymlinks: true,
},
...(claudeConfigSeedDir
? [{
key: "config-seed",
localDir: claudeConfigSeedDir,
followSymlinks: true,
}]
: []),
],
});
})()
@@ -424,6 +434,63 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
? path.posix.join(effectivePromptBundleAddDir, path.basename(promptBundle.instructionsFilePath))
: promptBundle.instructionsFilePath
: undefined;
const remoteClaudeRuntimeRoot = executionTargetIsRemote
? preparedExecutionTargetRuntime?.runtimeRootDir ??
path.posix.join(effectiveExecutionCwd, ".paperclip-runtime", "claude")
: null;
const remoteClaudeConfigSeedDir = claudeConfigSeedDir && remoteClaudeRuntimeRoot
? preparedExecutionTargetRuntime?.assetDirs["config-seed"] ??
path.posix.join(remoteClaudeRuntimeRoot, "config-seed")
: null;
const remoteClaudeConfigDir = useManagedRemoteClaudeConfig && remoteClaudeRuntimeRoot
? path.posix.join(remoteClaudeRuntimeRoot, "config")
: null;
if (remoteClaudeConfigDir && remoteClaudeConfigSeedDir) {
env.CLAUDE_CONFIG_DIR = remoteClaudeConfigDir;
loggedEnv.CLAUDE_CONFIG_DIR = remoteClaudeConfigDir;
await onLog(
"stdout",
`[paperclip] Materializing Claude auth/config into ${remoteClaudeConfigDir}.\n`,
);
await runAdapterExecutionTargetShellCommand(
runId,
executionTarget,
`mkdir -p ${shellQuote(remoteClaudeConfigDir)} && ` +
`if [ -d ${shellQuote(remoteClaudeConfigSeedDir)} ]; then ` +
`cp -R ${shellQuote(`${remoteClaudeConfigSeedDir}/.`)} ${shellQuote(remoteClaudeConfigDir)}/; ` +
`fi`,
{
cwd,
env,
timeoutSec: Math.max(timeoutSec, 15),
graceSec,
onLog,
},
);
}
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: preparedExecutionTargetRuntime?.runtimeRootDir,
adapterKey: "claude",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME", "CLAUDE_CONFIG_DIR"],
resolvedCommand,
});
if (remoteClaudeConfigDir) {
loggedEnv.CLAUDE_CONFIG_DIR = remoteClaudeConfigDir;
}
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
@@ -625,16 +692,48 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}
if (!parsed) {
const fallbackErrorMessage = parseFallbackErrorMessage(proc);
const transientUpstream =
!loginMeta.requiresLogin &&
(proc.exitCode ?? 0) !== 0 &&
isClaudeTransientUpstreamError({
parsed: null,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage: fallbackErrorMessage,
});
const transientRetryNotBefore = transientUpstream
? extractClaudeRetryNotBefore({
parsed: null,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage: fallbackErrorMessage,
})
: null;
const errorCode = loginMeta.requiresLogin
? "claude_auth_required"
: transientUpstream
? "claude_transient_upstream"
: null;
return {
exitCode: proc.exitCode,
signal: proc.signal,
timedOut: false,
-errorMessage: parseFallbackErrorMessage(proc),
-errorCode: loginMeta.requiresLogin ? "claude_auth_required" : null,
+errorMessage: fallbackErrorMessage,
+errorCode,
+errorFamily: transientUpstream ? "transient_upstream" : null,
+retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
errorMeta,
resultJson: {
stdout: proc.stdout,
stderr: proc.stderr,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore
? { retryNotBefore: transientRetryNotBefore.toISOString() }
: {}),
...(transientRetryNotBefore
? { transientRetryNotBefore: transientRetryNotBefore.toISOString() }
: {}),
},
clearSession: Boolean(opts.clearSessionOnMissingSession),
};
@@ -670,16 +769,48 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
} as Record<string, unknown>)
: null;
const clearSessionForMaxTurns = isClaudeMaxTurnsResult(parsed);
const parsedIsError = asBoolean(parsed.is_error, false);
const failed = (proc.exitCode ?? 0) !== 0 || parsedIsError;
const errorMessage = failed
? describeClaudeFailure(parsed) ?? `Claude exited with code ${proc.exitCode ?? -1}`
: null;
const transientUpstream =
failed &&
!loginMeta.requiresLogin &&
isClaudeTransientUpstreamError({
parsed,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage,
});
const transientRetryNotBefore = transientUpstream
? extractClaudeRetryNotBefore({
parsed,
stdout: proc.stdout,
stderr: proc.stderr,
errorMessage,
})
: null;
const resolvedErrorCode = loginMeta.requiresLogin
? "claude_auth_required"
: transientUpstream
? "claude_transient_upstream"
: null;
const mergedResultJson: Record<string, unknown> = {
...parsed,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore ? { retryNotBefore: transientRetryNotBefore.toISOString() } : {}),
...(transientRetryNotBefore ? { transientRetryNotBefore: transientRetryNotBefore.toISOString() } : {}),
};
return {
exitCode: proc.exitCode,
signal: proc.signal,
timedOut: false,
-errorMessage:
-(proc.exitCode ?? 0) === 0
-? null
-: describeClaudeFailure(parsed) ?? `Claude exited with code ${proc.exitCode ?? -1}`,
-errorCode: loginMeta.requiresLogin ? "claude_auth_required" : null,
+errorMessage,
+errorCode: resolvedErrorCode,
+errorFamily: transientUpstream ? "transient_upstream" : null,
+retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
errorMeta,
usage,
sessionId: resolvedSessionId,
@@ -690,7 +821,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
model: parsedStream.model || asString(parsed.model, model),
billingType,
costUsd: parsedStream.costUsd ?? asNumber(parsed.total_cost_usd, 0),
resultJson: parsed,
resultJson: mergedResultJson,
summary: parsedStream.summary || asString(parsed.result, ""),
clearSession: clearSessionForMaxTurns || Boolean(opts.clearSessionOnMissingSession && !resolvedSessionId),
};
@@ -715,6 +846,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return toAdapterResult(initial, { fallbackSessionId: runtimeSessionId || runtime.sessionId });
} finally {
if (paperclipBridge) {
await paperclipBridge.stop();
}
if (restoreRemoteWorkspace) {
await onLog(
"stdout",


@@ -0,0 +1,123 @@
import { describe, expect, it } from "vitest";
import {
extractClaudeRetryNotBefore,
isClaudeTransientUpstreamError,
} from "./parse.js";
describe("isClaudeTransientUpstreamError", () => {
it("classifies the 'out of extra usage' subscription window failure as transient", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "You're out of extra usage · resets 4pm (America/Chicago)",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
result: "You're out of extra usage. Resets at 4pm (America/Chicago).",
},
}),
).toBe(true);
});
it("classifies Anthropic API rate_limit_error and overloaded_error as transient", () => {
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
errors: [{ type: "rate_limit_error", message: "Rate limit reached for requests." }],
},
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
parsed: {
is_error: true,
errors: [{ type: "overloaded_error", message: "Overloaded" }],
},
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
stderr: "HTTP 429: Too Many Requests",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
stderr: "Bedrock ThrottlingException: slow down",
}),
).toBe(true);
});
it("classifies the subscription 5-hour / weekly limit wording", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "Claude usage limit reached — weekly limit reached. Try again in 2 days.",
}),
).toBe(true);
expect(
isClaudeTransientUpstreamError({
errorMessage: "5-hour limit reached.",
}),
).toBe(true);
});
it("does not classify login/auth failures as transient", () => {
expect(
isClaudeTransientUpstreamError({
stderr: "Please log in. Run `claude login` first.",
}),
).toBe(false);
});
it("does not classify max-turns or unknown-session as transient", () => {
expect(
isClaudeTransientUpstreamError({
parsed: { subtype: "error_max_turns", result: "Maximum turns reached." },
}),
).toBe(false);
expect(
isClaudeTransientUpstreamError({
parsed: {
result: "No conversation found with session id abc-123",
errors: [{ message: "No conversation found with session id abc-123" }],
},
}),
).toBe(false);
});
it("does not classify deterministic validation errors as transient", () => {
expect(
isClaudeTransientUpstreamError({
errorMessage: "Invalid request_error: Unknown parameter 'foo'.",
}),
).toBe(false);
});
});
describe("extractClaudeRetryNotBefore", () => {
it("parses the 'resets 4pm' hint in its explicit timezone", () => {
const now = new Date("2026-04-22T15:15:00.000Z");
const extracted = extractClaudeRetryNotBefore(
{ errorMessage: "You're out of extra usage · resets 4pm (America/Chicago)" },
now,
);
expect(extracted?.toISOString()).toBe("2026-04-22T21:00:00.000Z");
});
it("rolls forward past midnight when the reset time has already passed today", () => {
const now = new Date("2026-04-22T23:30:00.000Z");
const extracted = extractClaudeRetryNotBefore(
{ errorMessage: "Usage limit reached. Resets at 3:15 AM (UTC)." },
now,
);
expect(extracted?.toISOString()).toBe("2026-04-23T03:15:00.000Z");
});
it("returns null when no reset hint is present", () => {
expect(
extractClaudeRetryNotBefore({ errorMessage: "Overloaded. Try again later." }, new Date()),
).toBeNull();
});
});
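Read together, these tests pin down a classification order: deterministic failure shapes (max turns, unknown session) and auth failures must short-circuit before the broad transient regex is consulted. A reduced sketch of that precedence — the regexes and names here are illustrative simplifications, not the adapter's actual patterns:

```typescript
// Illustrative classifier skeleton: deterministic failure shapes and auth
// errors are checked before the broad transient regex, so a max-turns or
// login failure is never misclassified as retryable. Patterns are simplified.
const DETERMINISTIC_RE = /error_max_turns|no conversation found with session id/i;
const AUTH_RE = /please\s+log\s+in|login\s+required|unauthorized/i;
const TRANSIENT_RE = /rate[-\s]?limit|\b429\b|overloaded|out\s+of\s+extra\s+usage/i;

function classify(text: string): "deterministic" | "auth" | "transient" | "unknown" {
  if (DETERMINISTIC_RE.test(text)) return "deterministic";
  if (AUTH_RE.test(text)) return "auth";
  if (TRANSIENT_RE.test(text)) return "transient";
  return "unknown";
}

console.log(classify("Please log in. Run `claude login` first.")); // → "auth"
console.log(classify("HTTP 429: Too Many Requests"));              // → "transient"
```

The ordering matters because the transient regex is intentionally broad; anything a dedicated classifier already owns must win before it runs.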


@@ -1,9 +1,19 @@
import type { UsageSummary } from "@paperclipai/adapter-utils";
import { asString, asNumber, parseObject, parseJson } from "@paperclipai/adapter-utils/server-utils";
import {
asString,
asNumber,
parseObject,
parseJson,
} from "@paperclipai/adapter-utils/server-utils";
const CLAUDE_AUTH_REQUIRED_RE = /(?:not\s+logged\s+in|please\s+log\s+in|please\s+run\s+`?claude\s+login`?|login\s+required|requires\s+login|unauthorized|authentication\s+required)/i;
const URL_RE = /(https?:\/\/[^\s'"`<>()[\]{};,!?]+[^\s'"`<>()[\]{};,!.?:]+)/gi;
const CLAUDE_TRANSIENT_UPSTREAM_RE =
/(?:rate[-\s]?limit(?:ed)?|rate_limit_error|too\s+many\s+requests|\b429\b|overloaded(?:_error)?|server\s+overloaded|service\s+unavailable|\b503\b|\b529\b|high\s+demand|try\s+again\s+later|temporarily\s+unavailable|throttl(?:ed|ing)|throttlingexception|servicequotaexceededexception|out\s+of\s+extra\s+usage|extra\s+usage\b|claude\s+usage\s+limit\s+reached|5[-\s]?hour\s+limit\s+reached|weekly\s+limit\s+reached|usage\s+limit\s+reached|usage\s+cap\s+reached)/i;
const CLAUDE_EXTRA_USAGE_RESET_RE =
/(?:out\s+of\s+extra\s+usage|extra\s+usage|usage\s+limit\s+reached|usage\s+cap\s+reached|5[-\s]?hour\s+limit\s+reached|weekly\s+limit\s+reached|claude\s+usage\s+limit\s+reached)[\s\S]{0,80}?\bresets?\s+(?:at\s+)?([^\n()]+?)(?:\s*\(([^)]+)\))?(?:[.!]|\n|$)/i;
export function parseClaudeStreamJson(stdout: string) {
let sessionId: string | null = null;
let model = "";
@@ -177,3 +187,197 @@ export function isClaudeUnknownSessionError(parsed: Record<string, unknown>): bo
/no conversation found with session id|unknown session|session .* not found/i.test(msg),
);
}
function buildClaudeTransientHaystack(input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): string {
const parsed = input.parsed ?? null;
const resultText = parsed ? asString(parsed.result, "") : "";
const parsedErrors = parsed ? extractClaudeErrorMessages(parsed) : [];
return [
input.errorMessage ?? "",
resultText,
...parsedErrors,
input.stdout ?? "",
input.stderr ?? "",
]
.join("\n")
.split(/\r?\n/)
.map((line) => line.trim())
.filter(Boolean)
.join("\n");
}
function readTimeZoneParts(date: Date, timeZone: string) {
const values = new Map(
new Intl.DateTimeFormat("en-US", {
timeZone,
hourCycle: "h23",
year: "numeric",
month: "2-digit",
day: "2-digit",
hour: "2-digit",
minute: "2-digit",
}).formatToParts(date).map((part) => [part.type, part.value]),
);
return {
year: Number.parseInt(values.get("year") ?? "", 10),
month: Number.parseInt(values.get("month") ?? "", 10),
day: Number.parseInt(values.get("day") ?? "", 10),
hour: Number.parseInt(values.get("hour") ?? "", 10),
minute: Number.parseInt(values.get("minute") ?? "", 10),
};
}
function normalizeResetTimeZone(timeZoneHint: string | null | undefined): string | null {
const normalized = timeZoneHint?.trim();
if (!normalized) return null;
if (/^(?:utc|gmt)$/i.test(normalized)) return "UTC";
try {
new Intl.DateTimeFormat("en-US", { timeZone: normalized }).format(new Date(0));
return normalized;
} catch {
return null;
}
}
function dateFromTimeZoneWallClock(input: {
year: number;
month: number;
day: number;
hour: number;
minute: number;
timeZone: string;
}): Date | null {
let candidate = new Date(Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0));
const targetUtc = Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0);
for (let attempt = 0; attempt < 4; attempt += 1) {
const actual = readTimeZoneParts(candidate, input.timeZone);
const actualUtc = Date.UTC(actual.year, actual.month - 1, actual.day, actual.hour, actual.minute, 0, 0);
const offsetMs = targetUtc - actualUtc;
if (offsetMs === 0) break;
candidate = new Date(candidate.getTime() + offsetMs);
}
const verified = readTimeZoneParts(candidate, input.timeZone);
if (
verified.year !== input.year ||
verified.month !== input.month ||
verified.day !== input.day ||
verified.hour !== input.hour ||
verified.minute !== input.minute
) {
return null;
}
return candidate;
}
function nextClockTimeInTimeZone(input: {
now: Date;
hour: number;
minute: number;
timeZoneHint: string;
}): Date | null {
const timeZone = normalizeResetTimeZone(input.timeZoneHint);
if (!timeZone) return null;
const nowParts = readTimeZoneParts(input.now, timeZone);
let retryAt = dateFromTimeZoneWallClock({
year: nowParts.year,
month: nowParts.month,
day: nowParts.day,
hour: input.hour,
minute: input.minute,
timeZone,
});
if (!retryAt) return null;
if (retryAt.getTime() <= input.now.getTime()) {
const nextDay = new Date(Date.UTC(nowParts.year, nowParts.month - 1, nowParts.day + 1, 0, 0, 0, 0));
retryAt = dateFromTimeZoneWallClock({
year: nextDay.getUTCFullYear(),
month: nextDay.getUTCMonth() + 1,
day: nextDay.getUTCDate(),
hour: input.hour,
minute: input.minute,
timeZone,
});
}
return retryAt;
}
function parseClaudeResetClockTime(clockText: string, now: Date, timeZoneHint?: string | null): Date | null {
const normalized = clockText.trim().replace(/\s+/g, " ");
const match = normalized.match(/^(\d{1,2})(?::(\d{2}))?\s*([ap])\.?\s*m\.?/i);
if (!match) return null;
const hour12 = Number.parseInt(match[1] ?? "", 10);
const minute = Number.parseInt(match[2] ?? "0", 10);
if (!Number.isInteger(hour12) || hour12 < 1 || hour12 > 12) return null;
if (!Number.isInteger(minute) || minute < 0 || minute > 59) return null;
let hour24 = hour12 % 12;
if ((match[3] ?? "").toLowerCase() === "p") hour24 += 12;
if (timeZoneHint) {
const explicitRetryAt = nextClockTimeInTimeZone({
now,
hour: hour24,
minute,
timeZoneHint,
});
if (explicitRetryAt) return explicitRetryAt;
}
const retryAt = new Date(now);
retryAt.setHours(hour24, minute, 0, 0);
if (retryAt.getTime() <= now.getTime()) {
retryAt.setDate(retryAt.getDate() + 1);
}
return retryAt;
}
export function extractClaudeRetryNotBefore(
input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
},
now = new Date(),
): Date | null {
const haystack = buildClaudeTransientHaystack(input);
const match = haystack.match(CLAUDE_EXTRA_USAGE_RESET_RE);
if (!match) return null;
return parseClaudeResetClockTime(match[1] ?? "", now, match[2]);
}
export function isClaudeTransientUpstreamError(input: {
parsed?: Record<string, unknown> | null;
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): boolean {
const parsed = input.parsed ?? null;
// Deterministic failures are handled by their own classifiers.
if (parsed && (isClaudeMaxTurnsResult(parsed) || isClaudeUnknownSessionError(parsed))) {
return false;
}
const loginMeta = detectClaudeLoginRequired({
parsed,
stdout: input.stdout ?? "",
stderr: input.stderr ?? "",
});
if (loginMeta.requiresLogin) return false;
const haystack = buildClaudeTransientHaystack(input);
if (!haystack) return false;
return CLAUDE_TRANSIENT_UPSTREAM_RE.test(haystack);
}


@@ -9,11 +9,15 @@ import {
asNumber,
asStringArray,
parseObject,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import path from "node:path";
import { detectClaudeLoginRequired, parseClaudeStreamJson } from "./parse.js";
import { isBedrockModelId } from "./models.js";
@@ -56,10 +60,28 @@ export async function testEnvironment(
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "claude");
const cwd = asString(config.cwd, process.cwd());
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `claude-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "claude_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: true,
});
checks.push({
code: "claude_cwd_valid",
level: "info",
@@ -81,7 +103,7 @@ export async function testEnvironment(
}
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "claude_command_resolvable",
level: "info",
@@ -96,16 +118,21 @@ export async function testEnvironment(
});
}
// When probing a remote target, the Paperclip host's process.env does not
// reflect what the agent will actually see at runtime. Only consider env
// vars from the adapter config in that case; the probe itself will surface
// any auth issues on the remote box.
const considerHostEnv = !targetIsRemote;
const hasBedrock =
env.CLAUDE_CODE_USE_BEDROCK === "1" ||
env.CLAUDE_CODE_USE_BEDROCK === "true" ||
process.env.CLAUDE_CODE_USE_BEDROCK === "1" ||
process.env.CLAUDE_CODE_USE_BEDROCK === "true" ||
(considerHostEnv && process.env.CLAUDE_CODE_USE_BEDROCK === "1") ||
(considerHostEnv && process.env.CLAUDE_CODE_USE_BEDROCK === "true") ||
isNonEmpty(env.ANTHROPIC_BEDROCK_BASE_URL) ||
isNonEmpty(process.env.ANTHROPIC_BEDROCK_BASE_URL);
(considerHostEnv && isNonEmpty(process.env.ANTHROPIC_BEDROCK_BASE_URL));
const configApiKey = env.ANTHROPIC_API_KEY;
const hostApiKey = process.env.ANTHROPIC_API_KEY;
const hostApiKey = considerHostEnv ? process.env.ANTHROPIC_API_KEY : undefined;
if (hasBedrock) {
const source =
env.CLAUDE_CODE_USE_BEDROCK === "1" ||
@@ -130,7 +157,7 @@ export async function testEnvironment(
detail: `Detected in ${source}.`,
hint: "Unset ANTHROPIC_API_KEY if you want subscription-based Claude login behavior.",
});
} else {
} else if (!targetIsRemote) {
checks.push({
code: "claude_subscription_mode_possible",
level: "info",
@@ -172,8 +199,9 @@ export async function testEnvironment(
if (maxTurns > 0) args.push("--max-turns", String(maxTurns));
if (extraArgs.length > 0) args.push(...extraArgs);
const probe = await runChildProcess(
`claude-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{


@@ -66,8 +66,6 @@ export function buildClaudeLocalConfig(v: CreateConfigValues): Record<string, un
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
if (v.model) ac.model = v.model;
if (v.thinkingEffort) ac.effort = v.thinkingEffort;
if (v.chrome) ac.chrome = true;


@@ -4,7 +4,23 @@ export const DEFAULT_CODEX_LOCAL_MODEL = "gpt-5.3-codex";
export const DEFAULT_CODEX_LOCAL_BYPASS_APPROVALS_AND_SANDBOX = true;
export const CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS = ["gpt-5.4"] as const;
function normalizeModelId(model: string | null | undefined): string {
return typeof model === "string" ? model.trim() : "";
}
export function isCodexLocalKnownModel(model: string | null | undefined): boolean {
const normalizedModel = normalizeModelId(model);
if (!normalizedModel) return false;
return models.some((entry) => entry.id === normalizedModel);
}
export function isCodexLocalManualModel(model: string | null | undefined): boolean {
const normalizedModel = normalizeModelId(model);
return Boolean(normalizedModel) && !isCodexLocalKnownModel(normalizedModel);
}
export function isCodexLocalFastModeSupported(model: string | null | undefined): boolean {
if (isCodexLocalManualModel(model)) return true;
const normalizedModel = typeof model === "string" ? model.trim() : "";
return CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.includes(
normalizedModel as (typeof CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS)[number],
@@ -35,7 +51,7 @@ Core fields:
- modelReasoningEffort (string, optional): reasoning effort override (minimal|low|medium|high|xhigh) passed via -c model_reasoning_effort=...
- promptTemplate (string, optional): run prompt template
- search (boolean, optional): run codex with --search
- fastMode (boolean, optional): enable Codex Fast mode; currently supported on GPT-5.4 only and consumes credits faster
- fastMode (boolean, optional): enable Codex Fast mode; supported on GPT-5.4 and passed through for manual model IDs
- dangerouslyBypassApprovalsAndSandbox (boolean, optional): run with bypass flag
- command (string, optional): defaults to "codex"
- extraArgs (string[], optional): additional CLI args
@@ -54,6 +70,6 @@ Notes:
- Paperclip injects desired local skills into the effective CODEX_HOME/skills/ directory at execution time so Codex can discover "$paperclip" and related skills without polluting the project working directory. In managed-home mode (the default) this is ~/.paperclip/instances/<id>/companies/<companyId>/codex-home/skills/; when CODEX_HOME is explicitly overridden in adapter config, that override is used instead.
- Unless explicitly overridden in adapter config, Paperclip runs Codex with a per-company managed CODEX_HOME under the active Paperclip instance and seeds auth/config from the shared Codex home (the CODEX_HOME env var, when set, or ~/.codex).
- Some model/tool combinations reject certain effort levels (for example minimal with web search enabled).
- Fast mode is currently supported on GPT-5.4 only. When enabled, Paperclip applies \`service_tier="fast"\` and \`features.fast_mode=true\`.
- Fast mode is supported on GPT-5.4 and manual model IDs. When enabled for those models, Paperclip applies \`service_tier="fast"\` and \`features.fast_mode=true\`.
- When Paperclip realizes a workspace/runtime for a run, it injects PAPERCLIP_WORKSPACE_* and PAPERCLIP_RUNTIME_* env vars for agent-side tooling.
`;


@@ -26,6 +26,28 @@ describe("buildCodexExecArgs", () => {
]);
});
it("enables Codex fast mode overrides for manual models", () => {
const result = buildCodexExecArgs({
model: "gpt-5.5",
fastMode: true,
});
expect(result.fastModeRequested).toBe(true);
expect(result.fastModeApplied).toBe(true);
expect(result.fastModeIgnoredReason).toBeNull();
expect(result.args).toEqual([
"exec",
"--json",
"--model",
"gpt-5.5",
"-c",
'service_tier="fast"',
"-c",
"features.fast_mode=true",
"-",
]);
});
it("ignores fast mode for unsupported models", () => {
const result = buildCodexExecArgs({
model: "gpt-5.3-codex",
@@ -34,7 +56,9 @@ describe("buildCodexExecArgs", () => {
expect(result.fastModeRequested).toBe(true);
expect(result.fastModeApplied).toBe(false);
expect(result.fastModeIgnoredReason).toContain("currently only supported on gpt-5.4");
expect(result.fastModeIgnoredReason).toContain(
"currently only supported on gpt-5.4 or manually configured model IDs",
);
expect(result.args).toEqual([
"exec",
"--json",


@@ -25,7 +25,7 @@ function asRecord(value: unknown): Record<string, unknown> {
}
function formatFastModeSupportedModels(): string {
return CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.join(", ");
return `${CODEX_LOCAL_FAST_MODE_SUPPORTED_MODELS.join(", ")} or manually configured model IDs`;
}
export function buildCodexExecArgs(


@@ -8,17 +8,20 @@ import {
adapterExecutionTargetRemoteCwd,
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
parseObject,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
@@ -34,6 +37,7 @@ import {
} from "@paperclipai/adapter-utils/server-utils";
import {
parseCodexJsonl,
extractCodexRetryNotBefore,
isCodexTransientUpstreamError,
isCodexUnknownSessionError,
} from "./parse.js";
@@ -367,6 +371,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const restoreRemoteWorkspace = preparedExecutionTargetRuntime
? () => preparedExecutionTargetRuntime.restoreWorkspace()
: null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
const remoteCodexHome = executionTargetIsRemote
? preparedExecutionTargetRuntime?.assetDirs.home ??
path.posix.join(effectiveExecutionCwd, ".paperclip-runtime", "codex", "home")
@@ -420,33 +425,17 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (wakePayloadJson) {
env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
}
if (effectiveWorkspaceCwd) {
env.PAPERCLIP_WORKSPACE_CWD = effectiveWorkspaceCwd;
}
if (workspaceSource) {
env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
}
if (workspaceStrategy) {
env.PAPERCLIP_WORKSPACE_STRATEGY = workspaceStrategy;
}
if (workspaceId) {
env.PAPERCLIP_WORKSPACE_ID = workspaceId;
}
if (workspaceRepoUrl) {
env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
}
if (workspaceRepoRef) {
env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
}
if (workspaceBranch) {
env.PAPERCLIP_WORKSPACE_BRANCH = workspaceBranch;
}
if (workspaceWorktreePath) {
env.PAPERCLIP_WORKSPACE_WORKTREE_PATH = workspaceWorktreePath;
}
if (agentHome) {
env.AGENT_HOME = agentHome;
}
applyPaperclipWorkspaceEnv(env, {
workspaceCwd: effectiveWorkspaceCwd,
workspaceSource,
workspaceStrategy,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
workspaceBranch,
workspaceWorktreePath,
agentHome,
});
if (workspaceHints.length > 0) {
env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
}
@@ -470,6 +459,19 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (!hasExplicitApiKey && authToken) {
env.PAPERCLIP_API_KEY = authToken;
}
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: preparedExecutionTargetRuntime?.runtimeRootDir,
adapterKey: "codex",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
}
}
const effectiveEnv = Object.fromEntries(
Object.entries({ ...process.env, ...env }).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
@@ -725,6 +727,21 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
parsedError ||
stderrLine ||
`Codex exited with code ${attempt.proc.exitCode ?? -1}`;
const transientRetryNotBefore =
(attempt.proc.exitCode ?? 0) !== 0
? extractCodexRetryNotBefore({
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
errorMessage: fallbackErrorMessage,
})
: null;
const transientUpstream =
(attempt.proc.exitCode ?? 0) !== 0 &&
isCodexTransientUpstreamError({
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
errorMessage: fallbackErrorMessage,
});
return {
exitCode: attempt.proc.exitCode,
@@ -735,14 +752,11 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
? null
: fallbackErrorMessage,
errorCode:
(attempt.proc.exitCode ?? 0) !== 0 &&
isCodexTransientUpstreamError({
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
errorMessage: fallbackErrorMessage,
})
transientUpstream
? "codex_transient_upstream"
: null,
errorFamily: transientUpstream ? "transient_upstream" : null,
retryNotBefore: transientRetryNotBefore ? transientRetryNotBefore.toISOString() : null,
usage: attempt.parsed.usage,
sessionId: resolvedSessionId,
sessionParams: resolvedSessionParams,
@@ -755,6 +769,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
resultJson: {
stdout: attempt.proc.stdout,
stderr: attempt.proc.stderr,
...(transientUpstream ? { errorFamily: "transient_upstream" } : {}),
...(transientRetryNotBefore ? { retryNotBefore: transientRetryNotBefore.toISOString() } : {}),
...(transientRetryNotBefore ? { transientRetryNotBefore: transientRetryNotBefore.toISOString() } : {}),
},
summary: attempt.parsed.summary,
clearSession: Boolean((clearSessionOnMissingSession || forceFreshSession) && !resolvedSessionId),
@@ -779,6 +796,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return toResult(initial, false, false);
} finally {
if (paperclipBridge) {
await paperclipBridge.stop();
}
if (restoreRemoteWorkspace) {
await onLog(
"stdout",


@@ -1,5 +1,6 @@
import { describe, expect, it } from "vitest";
import {
extractCodexRetryNotBefore,
isCodexTransientUpstreamError,
isCodexUnknownSessionError,
parseCodexJsonl,
@@ -101,6 +102,25 @@ describe("isCodexTransientUpstreamError", () => {
).toBe(true);
});
it("classifies usage-limit windows as transient and extracts the retry time", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM.";
const now = new Date(2026, 3, 22, 22, 29, 2);
expect(isCodexTransientUpstreamError({ errorMessage })).toBe(true);
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.getTime()).toBe(
new Date(2026, 3, 22, 23, 31, 0, 0).getTime(),
);
});
it("parses explicit timezone hints on usage-limit retry windows", () => {
const errorMessage = "You've hit your usage limit for GPT-5.3-Codex-Spark. Switch to another model now, or try again at 11:31 PM (America/Chicago).";
const now = new Date("2026-04-23T03:29:02.000Z");
expect(extractCodexRetryNotBefore({ errorMessage }, now)?.toISOString()).toBe(
"2026-04-23T04:31:00.000Z",
);
});
it("does not classify deterministic compaction errors as transient", () => {
expect(
isCodexTransientUpstreamError({


@@ -1,8 +1,15 @@
import { asString, asNumber, parseObject, parseJson } from "@paperclipai/adapter-utils/server-utils";
import {
asString,
asNumber,
parseObject,
parseJson,
} from "@paperclipai/adapter-utils/server-utils";
const CODEX_TRANSIENT_UPSTREAM_RE =
/(?:we(?:'|)re\s+currently\s+experiencing\s+high\s+demand|temporary\s+errors|rate[-\s]?limit(?:ed)?|too\s+many\s+requests|\b429\b|server\s+overloaded|service\s+unavailable|try\s+again\s+later)/i;
const CODEX_REMOTE_COMPACTION_RE = /remote\s+compact\s+task/i;
const CODEX_USAGE_LIMIT_RE =
/you(?:'|)ve hit your usage limit for .+\.\s+switch to another model now,\s+or try again at\s+([^.!\n]+)(?:[.!]|\n|$)/i;
export function parseCodexJsonl(stdout: string) {
let sessionId: string | null = null;
@@ -76,12 +83,12 @@ export function isCodexUnknownSessionError(stdout: string, stderr: string): bool
);
}
export function isCodexTransientUpstreamError(input: {
function buildCodexErrorHaystack(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): boolean {
const haystack = [
}): string {
return [
input.errorMessage ?? "",
input.stdout ?? "",
input.stderr ?? "",
@@ -91,9 +98,164 @@ export function isCodexTransientUpstreamError(input: {
.map((line) => line.trim())
.filter(Boolean)
.join("\n");
}
function readTimeZoneParts(date: Date, timeZone: string) {
const values = new Map(
new Intl.DateTimeFormat("en-US", {
timeZone,
hourCycle: "h23",
year: "numeric",
month: "2-digit",
day: "2-digit",
hour: "2-digit",
minute: "2-digit",
}).formatToParts(date).map((part) => [part.type, part.value]),
);
return {
year: Number.parseInt(values.get("year") ?? "", 10),
month: Number.parseInt(values.get("month") ?? "", 10),
day: Number.parseInt(values.get("day") ?? "", 10),
hour: Number.parseInt(values.get("hour") ?? "", 10),
minute: Number.parseInt(values.get("minute") ?? "", 10),
};
}
function normalizeResetTimeZone(timeZoneHint: string | null | undefined): string | null {
const normalized = timeZoneHint?.trim();
if (!normalized) return null;
if (/^(?:utc|gmt)$/i.test(normalized)) return "UTC";
try {
new Intl.DateTimeFormat("en-US", { timeZone: normalized }).format(new Date(0));
return normalized;
} catch {
return null;
}
}
function dateFromTimeZoneWallClock(input: {
year: number;
month: number;
day: number;
hour: number;
minute: number;
timeZone: string;
}): Date | null {
let candidate = new Date(Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0));
const targetUtc = Date.UTC(input.year, input.month - 1, input.day, input.hour, input.minute, 0, 0);
for (let attempt = 0; attempt < 4; attempt += 1) {
const actual = readTimeZoneParts(candidate, input.timeZone);
const actualUtc = Date.UTC(actual.year, actual.month - 1, actual.day, actual.hour, actual.minute, 0, 0);
const offsetMs = targetUtc - actualUtc;
if (offsetMs === 0) break;
candidate = new Date(candidate.getTime() + offsetMs);
}
const verified = readTimeZoneParts(candidate, input.timeZone);
if (
verified.year !== input.year ||
verified.month !== input.month ||
verified.day !== input.day ||
verified.hour !== input.hour ||
verified.minute !== input.minute
) {
return null;
}
return candidate;
}
function nextClockTimeInTimeZone(input: {
now: Date;
hour: number;
minute: number;
timeZoneHint: string;
}): Date | null {
const timeZone = normalizeResetTimeZone(input.timeZoneHint);
if (!timeZone) return null;
const nowParts = readTimeZoneParts(input.now, timeZone);
let retryAt = dateFromTimeZoneWallClock({
year: nowParts.year,
month: nowParts.month,
day: nowParts.day,
hour: input.hour,
minute: input.minute,
timeZone,
});
if (!retryAt) return null;
if (retryAt.getTime() <= input.now.getTime()) {
const nextDay = new Date(Date.UTC(nowParts.year, nowParts.month - 1, nowParts.day + 1, 0, 0, 0, 0));
retryAt = dateFromTimeZoneWallClock({
year: nextDay.getUTCFullYear(),
month: nextDay.getUTCMonth() + 1,
day: nextDay.getUTCDate(),
hour: input.hour,
minute: input.minute,
timeZone,
});
}
return retryAt;
}
function parseLocalClockTime(clockText: string, now: Date): Date | null {
const normalized = clockText.trim();
const match = normalized.match(/^(\d{1,2})(?::(\d{2}))?\s*([ap])\.?\s*m\.?(?:\s*\(([^)]+)\)|\s+([A-Z]{2,5}))?$/i);
if (!match) return null;
const hour12 = Number.parseInt(match[1] ?? "", 10);
const minute = Number.parseInt(match[2] ?? "0", 10);
if (!Number.isInteger(hour12) || hour12 < 1 || hour12 > 12) return null;
if (!Number.isInteger(minute) || minute < 0 || minute > 59) return null;
let hour24 = hour12 % 12;
if ((match[3] ?? "").toLowerCase() === "p") hour24 += 12;
const timeZoneHint = match[4] ?? match[5];
if (timeZoneHint) {
const explicitRetryAt = nextClockTimeInTimeZone({
now,
hour: hour24,
minute,
timeZoneHint,
});
if (explicitRetryAt) return explicitRetryAt;
}
const retryAt = new Date(now);
retryAt.setHours(hour24, minute, 0, 0);
if (retryAt.getTime() <= now.getTime()) {
retryAt.setDate(retryAt.getDate() + 1);
}
return retryAt;
}
export function extractCodexRetryNotBefore(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}, now = new Date()): Date | null {
const haystack = buildCodexErrorHaystack(input);
const usageLimitMatch = haystack.match(CODEX_USAGE_LIMIT_RE);
if (!usageLimitMatch) return null;
return parseLocalClockTime(usageLimitMatch[1] ?? "", now);
}
export function isCodexTransientUpstreamError(input: {
stdout?: string | null;
stderr?: string | null;
errorMessage?: string | null;
}): boolean {
const haystack = buildCodexErrorHaystack(input);
if (extractCodexRetryNotBefore(input) != null) return true;
if (!CODEX_TRANSIENT_UPSTREAM_RE.test(haystack)) return false;
// Keep automatic retries scoped to the observed remote-compaction/high-demand
// failure shape, plus explicit usage-limit windows that tell us when retrying
// becomes safe again; broader 429s may be caused by user or account limits.
return CODEX_REMOTE_COMPACTION_RE.test(haystack) || /high\s+demand|temporary\s+errors/i.test(haystack);
}

View File

@@ -6,11 +6,15 @@ import type {
import {
asString,
parseObject,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import path from "node:path";
import { parseCodexJsonl } from "./parse.js";
import { codexHomeDir, readCodexAuthInfo } from "./quota.js";
@@ -57,10 +61,28 @@ export async function testEnvironment(
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "codex");
const cwd = asString(config.cwd, process.cwd());
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `codex-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "codex_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: true,
});
checks.push({
code: "codex_cwd_valid",
level: "info",
@@ -82,7 +104,7 @@ export async function testEnvironment(
}
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "codex_command_resolvable",
level: "info",
@@ -98,7 +120,7 @@ export async function testEnvironment(
}
const configOpenAiKey = env.OPENAI_API_KEY;
const hostOpenAiKey = process.env.OPENAI_API_KEY;
const hostOpenAiKey = targetIsRemote ? undefined : process.env.OPENAI_API_KEY;
if (isNonEmpty(configOpenAiKey) || isNonEmpty(hostOpenAiKey)) {
const source = isNonEmpty(configOpenAiKey) ? "adapter config env" : "server environment";
checks.push({
@@ -107,7 +129,9 @@ export async function testEnvironment(
message: "OPENAI_API_KEY is set for Codex authentication.",
detail: `Detected in ${source}.`,
});
} else {
} else if (!targetIsRemote) {
// Local-only auth file check. On remote targets, the probe will surface
// any missing-auth errors directly from the remote `codex` invocation.
const codexHome = isNonEmpty(env.CODEX_HOME) ? env.CODEX_HOME : undefined;
const codexAuth = await readCodexAuthInfo(codexHome).catch(() => null);
if (codexAuth) {
@@ -146,12 +170,13 @@ export async function testEnvironment(
code: "codex_fast_mode_unsupported_model",
level: "warn",
message: execArgs.fastModeIgnoredReason,
hint: "Switch the agent model to GPT-5.4 to enable Codex Fast mode.",
hint: "Switch the agent model to GPT-5.4 or enter a manual model ID to enable Codex Fast mode.",
});
}
const probe = await runChildProcess(
`codex-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{

View File

@@ -70,8 +70,6 @@ export function buildCodexLocalConfig(v: CreateConfigValues): Record<string, unk
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
ac.model = v.model || DEFAULT_CODEX_LOCAL_MODEL;
if (v.thinkingEffort) ac.modelReasoningEffort = v.thinkingEffort;
ac.timeoutSec = 0;

View File

@@ -80,4 +80,5 @@ Notes:
- Sessions are resumed with --resume when stored session cwd matches current cwd.
- Paperclip auto-injects local skills into "~/.cursor/skills" when missing, so Cursor can discover "$paperclip" and related skills on local runs.
- Paperclip auto-adds --yolo unless one of --trust/--yolo/-f is already present in extraArgs.
- Remote sandbox runs prepend "~/.local/bin" to PATH and prefer "~/.local/bin/cursor-agent" when the default Cursor entrypoint is requested, so standard E2B-style installs do not need hardcoded absolute command paths.
`;

View File

@@ -10,6 +10,7 @@ import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
@@ -18,12 +19,14 @@ import {
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
asStringArray,
parseObject,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
@@ -40,6 +43,7 @@ import {
} from "@paperclipai/adapter-utils/server-utils";
import { DEFAULT_CURSOR_LOCAL_MODEL } from "../index.js";
import { parseCursorJsonl, isCursorUnknownSessionError } from "./parse.js";
import { prepareCursorSandboxCommand } from "./remote-command.js";
import { normalizeCursorStreamLine } from "../shared/stream.js";
import { hasCursorTrustBypassArg } from "../shared/trust.js";
@@ -198,7 +202,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
config.promptTemplate,
DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE,
);
const command = asString(config.command, "agent");
let command = asString(config.command, "agent");
const model = asString(config.model, DEFAULT_CURSOR_LOCAL_MODEL).trim();
const mode = normalizeMode(asString(config.mode, ""));
@@ -230,7 +234,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const envConfig = parseObject(config.env);
const hasExplicitApiKey =
typeof envConfig.PAPERCLIP_API_KEY === "string" && envConfig.PAPERCLIP_API_KEY.trim().length > 0;
const env: Record<string, string> = { ...buildPaperclipEnv(agent) };
let env: Record<string, string> = { ...buildPaperclipEnv(agent) };
env.PAPERCLIP_RUN_ID = runId;
const wakeTaskId =
(typeof context.taskId === "string" && context.taskId.trim().length > 0 && context.taskId.trim()) ||
@@ -277,24 +281,14 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (wakePayloadJson) {
env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
}
if (effectiveWorkspaceCwd) {
env.PAPERCLIP_WORKSPACE_CWD = effectiveWorkspaceCwd;
}
if (workspaceSource) {
env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
}
if (workspaceId) {
env.PAPERCLIP_WORKSPACE_ID = workspaceId;
}
if (workspaceRepoUrl) {
env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
}
if (workspaceRepoRef) {
env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
}
if (agentHome) {
env.AGENT_HOME = agentHome;
}
applyPaperclipWorkspaceEnv(env, {
workspaceCwd: effectiveWorkspaceCwd,
workspaceSource,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
agentHome,
});
if (workspaceHints.length > 0) {
env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
}
@@ -308,6 +302,22 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (!hasExplicitApiKey && authToken) {
env.PAPERCLIP_API_KEY = authToken;
}
const timeoutSec = asNumber(config.timeoutSec, 0);
const graceSec = asNumber(config.graceSec, 20);
// Probe the sandbox before the managed-home override so we discover
// cursor-agent from the real system HOME (e.g. ~/.local/bin/cursor-agent).
// The managed HOME set later is for runtime isolation, not for finding the CLI.
const sandboxCommand = await prepareCursorSandboxCommand({
runId,
target: executionTarget,
command,
cwd,
env,
timeoutSec,
graceSec,
});
command = sandboxCommand.command;
env = sandboxCommand.env;
const effectiveEnv = Object.fromEntries(
Object.entries({ ...process.env, ...env }).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
@@ -317,14 +327,12 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const runtimeEnv = ensurePathInEnv(effectiveEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
let loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
const timeoutSec = asNumber(config.timeoutSec, 0);
const graceSec = asNumber(config.graceSec, 20);
const extraArgs = (() => {
const fromExtraArgs = asStringArray(config.extraArgs);
if (fromExtraArgs.length > 0) return fromExtraArgs;
@@ -334,6 +342,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let localSkillsDir: string | null = null;
let remoteRuntimeRootDir: string | null = null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
if (executionTargetIsRemote) {
try {
@@ -353,6 +363,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
remoteRuntimeRootDir = preparedExecutionTargetRuntime.runtimeRootDir;
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
@@ -383,6 +394,24 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
throw error;
}
}
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: remoteRuntimeRootDir,
adapterKey: "cursor",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv: ensurePathInEnv({ ...process.env, ...env }),
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
@@ -431,6 +460,12 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
notes.push("Auto-added --yolo to bypass interactive prompts.");
}
notes.push("Prompt is piped to Cursor via stdin.");
if (sandboxCommand.addedPathEntry) {
notes.push(`Remote sandbox runs prepend ${sandboxCommand.addedPathEntry} to PATH.`);
}
if (sandboxCommand.preferredCommandPath) {
notes.push(`Remote sandbox runs prefer ${sandboxCommand.preferredCommandPath} when using the default Cursor entrypoint.`);
}
if (!instructionsFilePath) return notes;
if (instructionsPrefix.length > 0) {
notes.push(
@@ -645,6 +680,9 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}
return toResult(initial);
} finally {
if (paperclipBridge) {
await paperclipBridge.stop();
}
if (restoreRemoteWorkspace) {
await onLog(
"stdout",

View File

@@ -0,0 +1,160 @@
import path from "node:path";
import {
runAdapterExecutionTargetShellCommand,
type AdapterExecutionTarget,
} from "@paperclipai/adapter-utils/execution-target";
import { ensurePathInEnv } from "@paperclipai/adapter-utils/server-utils";
const DEFAULT_CURSOR_COMMAND_BASENAMES = new Set(["agent", "cursor-agent"]);
function commandBasename(command: string): string {
return command.trim().split(/[\\/]/).pop()?.toLowerCase() ?? "";
}
function hasPathSeparator(command: string): boolean {
return command.includes("/") || command.includes("\\");
}
function prependPosixPathEntry(pathValue: string, entry: string): string {
const parts = pathValue.split(":").filter(Boolean);
if (parts.includes(entry)) return pathValue;
const cleaned = parts.join(":");
return cleaned.length > 0 ? `${entry}:${cleaned}` : entry;
}
type SandboxCursorRuntimeInfo = {
remoteSystemHomeDir: string | null;
preferredCommandPath: string | null;
};
function readMarkedValue(lines: string[], marker: string): string | null {
const matchedLine = lines.find((line) => line.startsWith(marker));
if (!matchedLine) return null;
const value = matchedLine.slice(marker.length).trim();
return value.length > 0 ? value : null;
}
async function readSandboxCursorRuntimeInfo(input: {
runId: string;
target: AdapterExecutionTarget;
command: string;
cwd: string;
env: Record<string, string>;
timeoutSec: number;
graceSec: number;
}): Promise<SandboxCursorRuntimeInfo> {
const shouldCheckPreferredCommand = isDefaultCursorCommand(input.command) && !hasPathSeparator(input.command);
const homeMarker = "__PAPERCLIP_CURSOR_HOME__:";
const preferredMarker = "__PAPERCLIP_CURSOR_AGENT__:";
try {
const result = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
[
`printf ${JSON.stringify(`${homeMarker}%s\\n`)} "$HOME"`,
shouldCheckPreferredCommand
? `if [ -x "$HOME/.local/bin/cursor-agent" ]; then printf ${JSON.stringify(`${preferredMarker}%s\\n`)} "$HOME/.local/bin/cursor-agent"; fi`
: "",
].filter(Boolean).join("; "),
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
},
);
if (result.timedOut || (result.exitCode ?? 1) !== 0) {
return {
remoteSystemHomeDir: null,
preferredCommandPath: null,
};
}
const lines = result.stdout.split(/\r?\n/);
return {
remoteSystemHomeDir: readMarkedValue(lines, homeMarker),
preferredCommandPath: readMarkedValue(lines, preferredMarker),
};
} catch {
return {
remoteSystemHomeDir: null,
preferredCommandPath: null,
};
}
}
export function isDefaultCursorCommand(command: string): boolean {
return DEFAULT_CURSOR_COMMAND_BASENAMES.has(commandBasename(command));
}
export type PreparedCursorSandboxCommand = {
command: string;
env: Record<string, string>;
remoteSystemHomeDir: string | null;
addedPathEntry: string | null;
preferredCommandPath: string | null;
};
export async function prepareCursorSandboxCommand(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
command: string;
cwd: string;
env: Record<string, string>;
timeoutSec: number;
graceSec: number;
}): Promise<PreparedCursorSandboxCommand> {
if (input.target?.kind !== "remote" || input.target.transport !== "sandbox") {
return {
command: input.command,
env: input.env,
remoteSystemHomeDir: null,
addedPathEntry: null,
preferredCommandPath: null,
};
}
const runtimeInfo = await readSandboxCursorRuntimeInfo({
runId: input.runId,
target: input.target,
command: input.command,
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
});
const remoteSystemHomeDir = runtimeInfo.remoteSystemHomeDir;
if (!remoteSystemHomeDir) {
return {
command: input.command,
env: input.env,
remoteSystemHomeDir: null,
addedPathEntry: null,
preferredCommandPath: null,
};
}
const remoteLocalBinDir = path.posix.join(remoteSystemHomeDir, ".local", "bin");
const runtimeEnv = ensurePathInEnv(input.env);
const currentPath = runtimeEnv.PATH ?? runtimeEnv.Path ?? "";
const nextPath = prependPosixPathEntry(currentPath, remoteLocalBinDir);
const env = nextPath === currentPath ? input.env : { ...input.env, PATH: nextPath };
if (!runtimeInfo.preferredCommandPath) {
return {
command: input.command,
env,
remoteSystemHomeDir,
addedPathEntry: nextPath === currentPath ? null : remoteLocalBinDir,
preferredCommandPath: null,
};
}
return {
command: runtimeInfo.preferredCommandPath,
env,
remoteSystemHomeDir,
addedPathEntry: nextPath === currentPath ? null : remoteLocalBinDir,
preferredCommandPath: runtimeInfo.preferredCommandPath,
};
}
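The two small helpers above (marker-line parsing of the shell probe's output, and the idempotent PATH prepend) are restated here as a self-contained sketch so their behavior can be checked in isolation:

```typescript
// Restatement for illustration: parse a marker-prefixed line from probe output,
// then prepend the remote ~/.local/bin directory to PATH without duplicating it.
function readMarkedValue(lines: string[], marker: string): string | null {
  const matchedLine = lines.find((line) => line.startsWith(marker));
  if (!matchedLine) return null;
  const value = matchedLine.slice(marker.length).trim();
  return value.length > 0 ? value : null;
}

function prependPosixPathEntry(pathValue: string, entry: string): string {
  const parts = pathValue.split(":").filter(Boolean);
  if (parts.includes(entry)) return pathValue; // idempotent: no duplicate entries
  const cleaned = parts.join(":");
  return cleaned.length > 0 ? `${entry}:${cleaned}` : entry;
}

const probeOutput = ["__PAPERCLIP_CURSOR_HOME__:/home/user"];
const home = readMarkedValue(probeOutput, "__PAPERCLIP_CURSOR_HOME__:");
console.log(home); // "/home/user"
console.log(prependPosixPathEntry("/usr/bin:/bin", `${home}/.local/bin`));
// "/home/user/.local/bin:/usr/bin:/bin"
```

The marker prefix is what lets the probe pack multiple values into one shell invocation and still parse them unambiguously out of mixed stdout.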

View File

@@ -7,16 +7,21 @@ import {
asString,
asStringArray,
parseObject,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { DEFAULT_CURSOR_LOCAL_MODEL } from "../index.js";
import { parseCursorJsonl } from "./parse.js";
import { isDefaultCursorCommand, prepareCursorSandboxCommand } from "./remote-command.js";
import { hasCursorTrustBypassArg } from "../shared/trust.js";
function summarizeStatus(checks: AdapterEnvironmentCheck[]): AdapterEnvironmentTestResult["status"] {
@@ -38,11 +43,6 @@ function firstNonEmptyLine(text: string): string {
);
}
function commandLooksLike(command: string, expected: string): boolean {
const base = path.basename(command).toLowerCase();
return base === expected || base === `${expected}.cmd` || base === `${expected}.exe`;
}
function summarizeProbeDetail(stdout: string, stderr: string, parsedError: string | null): string | null {
const raw = parsedError?.trim() || firstNonEmptyLine(stderr) || firstNonEmptyLine(stdout);
if (!raw) return null;
@@ -94,11 +94,29 @@ export async function testEnvironment(
): Promise<AdapterEnvironmentTestResult> {
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "agent");
const cwd = asString(config.cwd, process.cwd());
let command = asString(config.command, "agent");
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `cursor-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "cursor_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: true,
});
checks.push({
code: "cursor_cwd_valid",
level: "info",
@@ -114,13 +132,24 @@ export async function testEnvironment(
}
const envConfig = parseObject(config.env);
const env: Record<string, string> = {};
let env: Record<string, string> = {};
for (const [key, value] of Object.entries(envConfig)) {
if (typeof value === "string") env[key] = value;
}
const sandboxCommand = await prepareCursorSandboxCommand({
runId,
target,
command,
cwd,
env,
timeoutSec: 45,
graceSec: 5,
});
command = sandboxCommand.command;
env = sandboxCommand.env;
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "cursor_command_resolvable",
level: "info",
@@ -136,7 +165,7 @@ export async function testEnvironment(
}
const configCursorApiKey = env.CURSOR_API_KEY;
const hostCursorApiKey = process.env.CURSOR_API_KEY;
const hostCursorApiKey = targetIsRemote ? undefined : process.env.CURSOR_API_KEY;
if (isNonEmpty(configCursorApiKey) || isNonEmpty(hostCursorApiKey)) {
const source = isNonEmpty(configCursorApiKey) ? "adapter config env" : "server environment";
checks.push({
@@ -145,7 +174,7 @@ export async function testEnvironment(
message: "CURSOR_API_KEY is set for Cursor authentication.",
detail: `Detected in ${source}.`,
});
} else {
} else if (!targetIsRemote) {
const cursorHome = isNonEmpty(env.CURSOR_HOME) ? env.CURSOR_HOME : undefined;
const cursorAuth = await readCursorAuthInfo(cursorHome).catch(() => null);
if (cursorAuth) {
@@ -170,13 +199,13 @@ export async function testEnvironment(
const canRunProbe =
checks.every((check) => check.code !== "cursor_cwd_invalid" && check.code !== "cursor_command_unresolvable");
if (canRunProbe) {
if (!commandLooksLike(command, "agent")) {
if (!isDefaultCursorCommand(command)) {
checks.push({
code: "cursor_hello_probe_skipped_custom_command",
level: "info",
message: "Skipped hello probe because command is not `agent`.",
message: "Skipped hello probe because command is not a default Cursor CLI entrypoint.",
detail: command,
hint: "Use the `agent` CLI command to run the automatic installation and auth probe.",
hint: "Use `agent` or `cursor-agent` to run the automatic installation and auth probe.",
});
} else {
const model = asString(config.model, DEFAULT_CURSOR_LOCAL_MODEL).trim();
@@ -192,8 +221,9 @@ export async function testEnvironment(
if (extraArgs.length > 0) args.push(...extraArgs);
args.push("Respond with hello.");
const probe = await runChildProcess(
`cursor-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{

View File

@@ -61,8 +61,6 @@ export function buildCursorLocalConfig(v: CreateConfigValues): Record<string, un
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
ac.model = v.model || DEFAULT_CURSOR_LOCAL_MODEL;
const mode = normalizeMode(v.thinkingEffort);
if (mode) ac.mode = mode;

View File

@@ -11,6 +11,7 @@ import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
@@ -19,12 +20,14 @@ import {
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asBoolean,
asNumber,
asString,
asStringArray,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
buildInvocationEnvForLogs,
ensureAbsoluteDirectory,
@@ -240,12 +243,14 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (approvalStatus) env.PAPERCLIP_APPROVAL_STATUS = approvalStatus;
if (linkedIssueIds.length > 0) env.PAPERCLIP_LINKED_ISSUE_IDS = linkedIssueIds.join(",");
if (wakePayloadJson) env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
if (effectiveWorkspaceCwd) env.PAPERCLIP_WORKSPACE_CWD = effectiveWorkspaceCwd;
if (workspaceSource) env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
if (workspaceId) env.PAPERCLIP_WORKSPACE_ID = workspaceId;
if (workspaceRepoUrl) env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
applyPaperclipWorkspaceEnv(env, {
workspaceCwd: effectiveWorkspaceCwd,
workspaceSource,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
agentHome,
});
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
@@ -265,7 +270,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const runtimeEnv = ensurePathInEnv(effectiveEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
let loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
@@ -282,6 +287,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let remoteSkillsDir: string | null = null;
let localSkillsDir: string | null = null;
let remoteRuntimeRootDir: string | null = null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
if (executionTargetIsRemote) {
try {
@@ -301,6 +308,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
}],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
remoteRuntimeRootDir = preparedExecutionTargetRuntime.runtimeRootDir;
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
@@ -331,6 +339,24 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
throw error;
}
}
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: remoteRuntimeRootDir,
adapterKey: "gemini",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv: ensurePathInEnv({ ...process.env, ...env }),
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
@@ -580,6 +606,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return toResult(initial);
} finally {
await Promise.all([
paperclipBridge?.stop(),
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);
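The `finally` change above follows a common teardown shape: optional remote resources are stopped together with null-safe calls. A hedged sketch of that pattern (names here are illustrative, not the adapter-utils API):

```typescript
// Illustrative teardown pattern (hypothetical names): optional resources are
// torn down in parallel, and an absent resource is simply a no-op.
type StoppableBridge = { stop(): Promise<void> };

async function runWithTeardown<T>(
  work: () => Promise<T>,
  bridge: StoppableBridge | null,
  restoreWorkspace: (() => Promise<void>) | null,
): Promise<T> {
  try {
    return await work();
  } finally {
    // `?.` yields undefined for absent resources; Promise.all accepts
    // non-promise values, so missing cleanups cost nothing.
    await Promise.all([bridge?.stop(), restoreWorkspace?.()]);
  }
}
```

Because the cleanup lives in `finally`, the bridge stops and the workspace is restored whether `work` resolves or throws.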

View File

@@ -9,12 +9,16 @@ import {
asNumber,
asString,
asStringArray,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
parseObject,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import { DEFAULT_GEMINI_LOCAL_MODEL } from "../index.js";
import { detectGeminiAuthRequired, detectGeminiQuotaExhausted, parseGeminiJsonl } from "./parse.js";
import { firstNonEmptyLine } from "./utils.js";
@@ -48,10 +52,28 @@ export async function testEnvironment(
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "gemini");
const cwd = asString(config.cwd, process.cwd());
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `gemini-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "gemini_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: true });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: true,
});
checks.push({
code: "gemini_cwd_valid",
level: "info",
@@ -73,7 +95,7 @@ export async function testEnvironment(
}
const runtimeEnv = ensurePathInEnv({ ...process.env, ...env });
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "gemini_command_resolvable",
level: "info",
@@ -89,10 +111,10 @@ export async function testEnvironment(
}
const configGeminiApiKey = env.GEMINI_API_KEY;
const hostGeminiApiKey = process.env.GEMINI_API_KEY;
const hostGeminiApiKey = targetIsRemote ? undefined : process.env.GEMINI_API_KEY;
const configGoogleApiKey = env.GOOGLE_API_KEY;
const hostGoogleApiKey = process.env.GOOGLE_API_KEY;
const hasGca = env.GOOGLE_GENAI_USE_GCA === "true" || process.env.GOOGLE_GENAI_USE_GCA === "true";
const hostGoogleApiKey = targetIsRemote ? undefined : process.env.GOOGLE_API_KEY;
const hasGca = env.GOOGLE_GENAI_USE_GCA === "true" || (!targetIsRemote && process.env.GOOGLE_GENAI_USE_GCA === "true");
if (
isNonEmpty(configGeminiApiKey) ||
isNonEmpty(hostGeminiApiKey) ||
@@ -152,8 +174,9 @@ export async function testEnvironment(
}
if (extraArgs.length > 0) args.push(...extraArgs);
const probe = await runChildProcess(
`gemini-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{

View File

@@ -55,8 +55,6 @@ export function buildGeminiLocalConfig(v: CreateConfigValues): Record<string, un
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
ac.model = v.model || DEFAULT_GEMINI_LOCAL_MODEL;
ac.timeoutSec = 0;
ac.graceSec = 15;

View File

@@ -10,6 +10,7 @@ import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
prepareAdapterExecutionTargetRuntime,
@@ -18,12 +19,14 @@ import {
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
asStringArray,
parseObject,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
joinPromptSections,
buildInvocationEnvForLogs,
@@ -199,12 +202,14 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (approvalStatus) env.PAPERCLIP_APPROVAL_STATUS = approvalStatus;
if (linkedIssueIds.length > 0) env.PAPERCLIP_LINKED_ISSUE_IDS = linkedIssueIds.join(",");
if (wakePayloadJson) env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
if (effectiveWorkspaceCwd) env.PAPERCLIP_WORKSPACE_CWD = effectiveWorkspaceCwd;
if (workspaceSource) env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
if (workspaceId) env.PAPERCLIP_WORKSPACE_ID = workspaceId;
if (workspaceRepoUrl) env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
applyPaperclipWorkspaceEnv(env, {
workspaceCwd: effectiveWorkspaceCwd,
workspaceSource,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
agentHome,
});
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
@@ -231,7 +236,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(preparedRuntimeConfig.env, {
let loggedEnv = buildInvocationEnvForLogs(preparedRuntimeConfig.env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
@@ -256,6 +261,8 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
const effectiveExecutionCwd = adapterExecutionTargetRemoteCwd(executionTarget, cwd);
let restoreRemoteWorkspace: (() => Promise<void>) | null = null;
let localSkillsDir: string | null = null;
let remoteRuntimeRootDir: string | null = null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
if (executionTargetIsRemote) {
localSkillsDir = await buildOpenCodeSkillsDir(config);
@@ -282,6 +289,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
],
});
restoreRemoteWorkspace = () => preparedExecutionTargetRuntime.restoreWorkspace();
remoteRuntimeRootDir = preparedExecutionTargetRuntime.runtimeRootDir;
const managedHome = adapterExecutionTargetUsesManagedHome(executionTarget);
if (managedHome && preparedExecutionTargetRuntime.runtimeRootDir) {
preparedRuntimeConfig.env.HOME = preparedExecutionTargetRuntime.runtimeRootDir;
@@ -308,6 +316,28 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
}
}
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: remoteRuntimeRootDir,
adapterKey: "opencode",
hostApiToken: preparedRuntimeConfig.env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(preparedRuntimeConfig.env, paperclipBridge.env);
loggedEnv = buildInvocationEnvForLogs(preparedRuntimeConfig.env, {
runtimeEnv: Object.fromEntries(
Object.entries(ensurePathInEnv({ ...process.env, ...preparedRuntimeConfig.env })).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
),
),
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
@@ -535,6 +565,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return toResult(initial);
} finally {
await Promise.all([
paperclipBridge?.stop(),
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);

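The bridge wiring above rebuilds `loggedEnv` from `process.env` merged with the bridge's variables, narrowing `string | undefined` values with a type-guard filter before handing them to the logger. A minimal standalone sketch of that narrowing pattern (the helper name is illustrative, not from the codebase):

```typescript
// Narrow an env map whose values may be undefined (like process.env)
// down to Record<string, string>; undefined entries are dropped.
function toStringEnv(env: Record<string, string | undefined>): Record<string, string> {
  return Object.fromEntries(
    Object.entries(env).filter(
      (entry): entry is [string, string] => typeof entry[1] === "string",
    ),
  );
}

// Only the defined entries survive the filter.
const narrowed = toStringEnv({ HOME: "/root", UNSET: undefined });
```

The type guard is what lets TypeScript accept the result as `Record<string, string>` without a cast.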

@@ -8,11 +8,15 @@ import {
asString,
asStringArray,
parseObject,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import { discoverOpenCodeModels, ensureOpenCodeModelConfiguredAndAvailable } from "./models.js";
import { parseOpenCodeJsonl } from "./parse.js";
import { prepareOpenCodeRuntimeConfig } from "./runtime-config.js";
@@ -58,10 +62,28 @@ export async function testEnvironment(
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "opencode");
const cwd = asString(config.cwd, process.cwd());
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `opencode-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "opencode_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: false });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: false,
});
checks.push({
code: "opencode_cwd_valid",
level: "info",
@@ -115,7 +137,7 @@ export async function testEnvironment(
});
} else {
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "opencode_command_resolvable",
level: "info",
@@ -137,7 +159,19 @@ export async function testEnvironment(
let modelValidationPassed = false;
const configuredModel = asString(config.model, "").trim();
if (canRunProbe && configuredModel) {
// Model discovery and validation use local child processes against
// OpenCode's `models` subcommand and JSON config; these are not yet
// wired through the execution target. When probing a remote env, skip
// discovery/validation and rely on the remote hello probe to surface
// model/auth issues directly.
if (targetIsRemote && configuredModel) {
checks.push({
code: "opencode_model_validation_skipped_remote",
level: "info",
message: `Skipped local model validation; will be validated by the hello probe inside ${targetLabel}.`,
});
modelValidationPassed = true;
} else if (canRunProbe && configuredModel) {
try {
const discovered = await discoverOpenCodeModels({ command, cwd, env: runtimeEnv });
if (discovered.length > 0) {
@@ -173,7 +207,7 @@ export async function testEnvironment(
});
}
}
} else if (canRunProbe && !configuredModel) {
} else if (!targetIsRemote && canRunProbe && !configuredModel) {
try {
const discovered = await discoverOpenCodeModels({ command, cwd, env: runtimeEnv });
if (discovered.length > 0) {
@@ -207,7 +241,7 @@ export async function testEnvironment(
const modelUnavailable = checks.some((check) => check.code === "opencode_hello_probe_model_unavailable");
if (!configuredModel && !modelUnavailable) {
// No model configured; skip the model requirement if no model-related checks exist
} else if (configuredModel && canRunProbe) {
} else if (!targetIsRemote && configuredModel && canRunProbe) {
try {
await ensureOpenCodeModelConfiguredAndAvailable({
model: configuredModel,
@@ -246,8 +280,9 @@ export async function testEnvironment(
if (extraArgs.length > 0) args.push(...extraArgs);
try {
const probe = await runChildProcess(
`opencode-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{

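Both adapters now follow the same shape for remote targets: record an info-level check and defer model validation to the hello probe that runs inside the environment. A simplified sketch of that branch (the check type and code are abbreviated from the diff):

```typescript
type Check = { code: string; level: "info" | "error"; message: string };

// When the execution target is remote, local model discovery cannot run,
// so an informational check is recorded and validation is deferred to the
// hello probe executed inside the environment.
function modelValidationChecks(targetIsRemote: boolean, configuredModel: string): Check[] {
  const checks: Check[] = [];
  if (targetIsRemote && configuredModel) {
    checks.push({
      code: "model_validation_skipped_remote",
      level: "info",
      message: "Skipped local model validation; deferred to the remote hello probe.",
    });
  }
  return checks;
}
```

Local targets fall through to the existing discovery/validation path, so the remote branch only ever adds the skip notice.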

@@ -54,8 +54,6 @@ export function buildOpenCodeLocalConfig(v: CreateConfigValues): Record<string,
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
if (v.model) ac.model = v.model;
if (v.thinkingEffort) ac.variant = v.thinkingEffort;
ac.dangerouslySkipPermissions = v.dangerouslySkipPermissions;


@@ -10,6 +10,7 @@ import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetSessionMatches,
adapterExecutionTargetUsesManagedHome,
adapterExecutionTargetUsesPaperclipBridge,
describeAdapterExecutionTarget,
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetFile,
@@ -17,12 +18,14 @@ import {
readAdapterExecutionTarget,
resolveAdapterExecutionTargetCommandForLogs,
runAdapterExecutionTargetProcess,
startAdapterExecutionTargetPaperclipBridge,
} from "@paperclipai/adapter-utils/execution-target";
import {
asString,
asNumber,
asStringArray,
parseObject,
applyPaperclipWorkspaceEnv,
buildPaperclipEnv,
joinPromptSections,
buildInvocationEnvForLogs,
@@ -228,12 +231,14 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
if (approvalStatus) env.PAPERCLIP_APPROVAL_STATUS = approvalStatus;
if (linkedIssueIds.length > 0) env.PAPERCLIP_LINKED_ISSUE_IDS = linkedIssueIds.join(",");
if (wakePayloadJson) env.PAPERCLIP_WAKE_PAYLOAD_JSON = wakePayloadJson;
if (workspaceCwd) env.PAPERCLIP_WORKSPACE_CWD = workspaceCwd;
if (workspaceSource) env.PAPERCLIP_WORKSPACE_SOURCE = workspaceSource;
if (workspaceId) env.PAPERCLIP_WORKSPACE_ID = workspaceId;
if (workspaceRepoUrl) env.PAPERCLIP_WORKSPACE_REPO_URL = workspaceRepoUrl;
if (workspaceRepoRef) env.PAPERCLIP_WORKSPACE_REPO_REF = workspaceRepoRef;
if (agentHome) env.AGENT_HOME = agentHome;
applyPaperclipWorkspaceEnv(env, {
workspaceCwd,
workspaceSource,
workspaceId,
workspaceRepoUrl,
workspaceRepoRef,
agentHome,
});
if (workspaceHints.length > 0) env.PAPERCLIP_WORKSPACES_JSON = JSON.stringify(workspaceHints);
const targetPaperclipApiUrl = adapterExecutionTargetPaperclipApiUrl(executionTarget);
if (targetPaperclipApiUrl) env.PAPERCLIP_API_URL = targetPaperclipApiUrl;
@@ -275,7 +280,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
);
await ensureAdapterExecutionTargetCommandResolvable(command, executionTarget, cwd, runtimeEnv);
const resolvedCommand = await resolveAdapterExecutionTargetCommandForLogs(command, executionTarget, cwd, runtimeEnv);
const loggedEnv = buildInvocationEnvForLogs(env, {
let loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv,
includeRuntimeKeys: ["HOME"],
resolvedCommand,
@@ -301,6 +306,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
let remoteRuntimeRootDir: string | null = null;
let localSkillsDir: string | null = null;
let remoteSkillsDir: string | null = null;
let paperclipBridge: Awaited<ReturnType<typeof startAdapterExecutionTargetPaperclipBridge>> = null;
if (executionTargetIsRemote) {
try {
@@ -335,6 +341,28 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
throw error;
}
}
if (executionTargetIsRemote && adapterExecutionTargetUsesPaperclipBridge(executionTarget)) {
paperclipBridge = await startAdapterExecutionTargetPaperclipBridge({
runId,
target: executionTarget,
runtimeRootDir: remoteRuntimeRootDir,
adapterKey: "pi",
hostApiToken: env.PAPERCLIP_API_KEY,
onLog,
});
if (paperclipBridge) {
Object.assign(env, paperclipBridge.env);
loggedEnv = buildInvocationEnvForLogs(env, {
runtimeEnv: Object.fromEntries(
Object.entries(ensurePathInEnv({ ...process.env, ...env })).filter(
(entry): entry is [string, string] => typeof entry[1] === "string",
),
),
includeRuntimeKeys: ["HOME"],
resolvedCommand,
});
}
}
const runtimeSessionParams = parseObject(runtime.sessionParams);
const runtimeSessionId = asString(runtimeSessionParams.sessionId, runtime.sessionId ?? "");
@@ -651,6 +679,7 @@ export async function execute(ctx: AdapterExecutionContext): Promise<AdapterExec
return toResult(initial);
} finally {
await Promise.all([
paperclipBridge?.stop(),
restoreRemoteWorkspace?.(),
localSkillsDir ? fs.rm(path.dirname(localSkillsDir), { recursive: true, force: true }).catch(() => undefined) : Promise.resolve(),
]);


@@ -6,14 +6,18 @@ import type {
import {
asString,
parseObject,
ensureAbsoluteDirectory,
ensureCommandResolvable,
ensurePathInEnv,
runChildProcess,
} from "@paperclipai/adapter-utils/server-utils";
import {
asStringArray,
} from "@paperclipai/adapter-utils/server-utils";
import {
ensureAdapterExecutionTargetCommandResolvable,
ensureAdapterExecutionTargetDirectory,
runAdapterExecutionTargetProcess,
describeAdapterExecutionTarget,
resolveAdapterExecutionTargetCwd,
} from "@paperclipai/adapter-utils/execution-target";
import { discoverPiModelsCached } from "./models.js";
import { parsePiJsonl } from "./parse.js";
@@ -78,10 +82,28 @@ export async function testEnvironment(
const checks: AdapterEnvironmentCheck[] = [];
const config = parseObject(ctx.config);
const command = asString(config.command, "pi");
const cwd = asString(config.cwd, process.cwd());
const target = ctx.executionTarget ?? null;
const targetIsRemote = target?.kind === "remote";
const cwd = resolveAdapterExecutionTargetCwd(target, asString(config.cwd, ""), process.cwd());
const targetLabel = targetIsRemote
? ctx.environmentName ?? describeAdapterExecutionTarget(target)
: null;
const runId = `pi-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`;
if (targetLabel) {
checks.push({
code: "pi_environment_target",
level: "info",
message: `Probing inside environment: ${targetLabel}`,
});
}
try {
await ensureAbsoluteDirectory(cwd, { createIfMissing: false });
await ensureAdapterExecutionTargetDirectory(runId, target, cwd, {
cwd,
env: {},
createIfMissing: false,
});
checks.push({
code: "pi_cwd_valid",
level: "info",
@@ -113,7 +135,7 @@ export async function testEnvironment(
});
} else {
try {
await ensureCommandResolvable(command, cwd, runtimeEnv);
await ensureAdapterExecutionTargetCommandResolvable(command, target, cwd, runtimeEnv);
checks.push({
code: "pi_command_resolvable",
level: "info",
@@ -132,7 +154,10 @@ export async function testEnvironment(
const canRunProbe =
checks.every((check) => check.code !== "pi_cwd_invalid" && check.code !== "pi_command_unresolvable");
if (canRunProbe) {
// Pi model discovery shells out to `pi --list-models` locally; when probing a
// remote target we skip discovery and let the remote hello probe surface
// model/auth issues directly.
if (!targetIsRemote && canRunProbe) {
try {
const discovered = await discoverPiModelsCached({ command, cwd, env: runtimeEnv });
if (discovered.length > 0) {
@@ -166,6 +191,12 @@ export async function testEnvironment(
message: "Pi requires a configured model in provider/model format.",
hint: "Set adapterConfig.model using an ID from `pi --list-models`.",
});
} else if (targetIsRemote) {
checks.push({
code: "pi_model_validation_skipped_remote",
level: "info",
message: `Skipped local model validation; will be validated by the hello probe inside ${targetLabel}.`,
});
} else if (canRunProbe) {
// Verify model is in the list
try {
@@ -218,8 +249,9 @@ export async function testEnvironment(
if (extraArgs.length > 0) args.push(...extraArgs);
try {
const probe = await runChildProcess(
`pi-envtest-${Date.now()}-${Math.random().toString(16).slice(2)}`,
const probe = await runAdapterExecutionTargetProcess(
runId,
target,
command,
args,
{


@@ -47,8 +47,6 @@ export function buildPiLocalConfig(v: CreateConfigValues): Record<string, unknow
const ac: Record<string, unknown> = {};
if (v.cwd) ac.cwd = v.cwd;
if (v.instructionsFilePath) ac.instructionsFilePath = v.instructionsFilePath;
if (v.promptTemplate) ac.promptTemplate = v.promptTemplate;
if (v.bootstrapPrompt) ac.bootstrapPromptTemplate = v.bootstrapPrompt;
if (v.model) ac.model = v.model;
if (v.thinkingEffort) ac.thinking = v.thinkingEffort;


@@ -182,7 +182,93 @@ describeEmbeddedPostgres("runDatabaseBackup", () => {
);
it(
"restores statements incrementally when backup comments precede the first breakpoint",
"backs up and restores non-public database schemas and migration history",
async () => {
const sourceConnectionString = await createTempDatabase();
const restoreConnectionString = await createSiblingDatabase(
sourceConnectionString,
"paperclip_full_logical_restore_target",
);
const backupDir = createTempDir("paperclip-db-full-logical-backup-");
const sourceSql = postgres(sourceConnectionString, { max: 1, onnotice: () => {} });
const restoreSql = postgres(restoreConnectionString, { max: 1, onnotice: () => {} });
try {
await sourceSql.unsafe(`
CREATE SCHEMA IF NOT EXISTS "drizzle";
CREATE TABLE IF NOT EXISTS "drizzle"."__drizzle_migrations" (
"id" serial PRIMARY KEY,
"hash" text NOT NULL,
"created_at" bigint
);
INSERT INTO "drizzle"."__drizzle_migrations" ("hash", "created_at")
VALUES ('paperclip-migration-history', 1770000000000);
`);
await sourceSql.unsafe(`
CREATE TABLE "public"."backup_parent_records" (
"id" uuid PRIMARY KEY,
"name" text NOT NULL
);
INSERT INTO "public"."backup_parent_records" ("id", "name")
VALUES ('11111111-1111-4111-8111-111111111111', 'parent');
`);
await sourceSql.unsafe(`
CREATE SCHEMA "plugin_backup_scope";
CREATE TYPE "plugin_backup_scope"."plugin_status" AS ENUM ('ready', 'done');
CREATE TABLE "plugin_backup_scope"."plugin_rows" (
"id" serial PRIMARY KEY,
"parent_id" uuid NOT NULL REFERENCES "public"."backup_parent_records"("id") ON DELETE CASCADE,
"status" "plugin_backup_scope"."plugin_status" NOT NULL,
"note" text NOT NULL
);
CREATE UNIQUE INDEX "plugin_rows_note_uq" ON "plugin_backup_scope"."plugin_rows" ("note");
INSERT INTO "plugin_backup_scope"."plugin_rows" ("parent_id", "status", "note")
VALUES ('11111111-1111-4111-8111-111111111111', 'ready', 'first');
`);
const result = await runDatabaseBackup({
connectionString: sourceConnectionString,
backupDir,
retention: { dailyDays: 7, weeklyWeeks: 4, monthlyMonths: 1 },
filenamePrefix: "paperclip-full-logical-test",
backupEngine: "javascript",
});
await runDatabaseRestore({
connectionString: restoreConnectionString,
backupFile: result.backupFile,
});
const migrationRows = await restoreSql.unsafe<{ hash: string }[]>(`
SELECT "hash"
FROM "drizzle"."__drizzle_migrations"
WHERE "hash" = 'paperclip-migration-history'
`);
expect(migrationRows).toEqual([{ hash: "paperclip-migration-history" }]);
const pluginRows = await restoreSql.unsafe<{ note: string; status: string; parent_name: string }[]>(`
SELECT r."note", r."status"::text AS "status", p."name" AS "parent_name"
FROM "plugin_backup_scope"."plugin_rows" r
JOIN "public"."backup_parent_records" p ON p."id" = r."parent_id"
`);
expect(pluginRows).toEqual([{ note: "first", status: "ready", parent_name: "parent" }]);
await expect(
restoreSql.unsafe(`
INSERT INTO "plugin_backup_scope"."plugin_rows" ("parent_id", "status", "note")
VALUES ('11111111-1111-4111-8111-111111111111', 'done', 'first')
`),
).rejects.toThrow();
} finally {
await sourceSql.end();
await restoreSql.end();
}
},
60_000,
);
it(
"restores legacy public-only backups without migration history",
async () => {
const restoreConnectionString = await createTempDatabase();
const restoreSql = postgres(restoreConnectionString, { max: 1, onnotice: () => {} });


@@ -19,6 +19,11 @@ export type RunDatabaseBackupOptions = {
retention: BackupRetentionPolicy;
filenamePrefix?: string;
connectTimeoutSeconds?: number;
/**
* @deprecated Migration-journal schemas are included with the normal backup
* scope. This option is kept for compatibility and no longer changes backup
* engine selection.
*/
includeMigrationJournal?: boolean;
excludeTables?: string[];
nullifyColumns?: Record<string, string[]>;
@@ -61,8 +66,6 @@ type ExtensionDefinition = {
schema_name: string;
};
const DRIZZLE_SCHEMA = "drizzle";
const DRIZZLE_MIGRATIONS_TABLE = "__drizzle_migrations";
const DEFAULT_BACKUP_WRITE_BUFFER_BYTES = 1024 * 1024;
const BACKUP_DATA_CURSOR_ROWS = 100;
const BACKUP_CLI_STDERR_BYTES = 64 * 1024;
@@ -229,9 +232,15 @@ function tableKey(schemaName: string, tableName: string): string {
return `${schemaName}.${tableName}`;
}
function nonSystemSchemaPredicate(identifier: string): string {
return `${identifier} NOT IN ('pg_catalog', 'information_schema')
AND ${identifier} NOT LIKE 'pg_toast%'
AND ${identifier} NOT LIKE 'pg_temp_%'
AND ${identifier} NOT LIKE 'pg_toast_temp_%'`;
}
function hasBackupTransforms(opts: RunDatabaseBackupOptions): boolean {
return opts.includeMigrationJournal === true ||
(opts.excludeTables?.length ?? 0) > 0 ||
return (opts.excludeTables?.length ?? 0) > 0 ||
Object.keys(opts.nullifyColumns ?? {}).length > 0;
}
@@ -285,7 +294,6 @@ async function runPgDumpBackup(opts: {
"--if-exists",
"--no-owner",
"--no-privileges",
"--schema=public",
],
{
stdio: ["ignore", "pipe", "pipe"],
@@ -484,7 +492,6 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
const connectTimeout = Math.max(1, Math.trunc(opts.connectTimeoutSeconds ?? 5));
const backupEngine = opts.backupEngine ?? "auto";
const canUsePgDump = !hasBackupTransforms(opts);
const includeMigrationJournal = opts.includeMigrationJournal === true;
const excludedTableNames = normalizeTableNameSet(opts.excludeTables);
const nullifiedColumnsByTable = normalizeNullifyColumnMap(opts.nullifyColumns);
let sql = postgres(opts.connectionString, { max: 1, connect_timeout: connectTimeout });
@@ -552,31 +559,24 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
SELECT table_schema AS schema_name, table_name AS tablename
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
AND (
table_schema = 'public'
OR (${includeMigrationJournal}::boolean AND table_schema = ${DRIZZLE_SCHEMA} AND table_name = ${DRIZZLE_MIGRATIONS_TABLE})
)
AND ${sql.unsafe(nonSystemSchemaPredicate("table_schema"))}
ORDER BY table_schema, table_name
`;
const tables = allTables;
const includedTableNames = new Set(tables.map(({ schema_name, tablename }) => tableKey(schema_name, tablename)));
const includedSchemas = new Set(tables.map(({ schema_name }) => schema_name));
// Get all enums
const enums = await sql<{ typname: string; labels: string[] }[]>`
SELECT t.typname, array_agg(e.enumlabel ORDER BY e.enumsortorder) AS labels
const enums = await sql<{ schema_name: string; typname: string; labels: string[] }[]>`
SELECT n.nspname AS schema_name, t.typname, array_agg(e.enumlabel ORDER BY e.enumsortorder) AS labels
FROM pg_type t
JOIN pg_enum e ON t.oid = e.enumtypid
JOIN pg_namespace n ON t.typnamespace = n.oid
WHERE n.nspname = 'public'
GROUP BY t.typname
ORDER BY t.typname
WHERE ${sql.unsafe(nonSystemSchemaPredicate("n.nspname"))}
GROUP BY n.nspname, t.typname
ORDER BY n.nspname, t.typname
`;
for (const e of enums) {
const labels = e.labels.map((l) => `'${l.replace(/'/g, "''")}'`).join(", ");
emitStatement(`CREATE TYPE "public"."${e.typname}" AS ENUM (${labels});`);
}
if (enums.length > 0) emit("");
for (const e of enums) includedSchemas.add(e.schema_name);
const allSequences = await sql<SequenceDefinition[]>`
SELECT
@@ -598,15 +598,14 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
LEFT JOIN pg_class tbl ON tbl.oid = dep.refobjid
LEFT JOIN pg_namespace tblns ON tblns.oid = tbl.relnamespace
LEFT JOIN pg_attribute attr ON attr.attrelid = tbl.oid AND attr.attnum = dep.refobjsubid
WHERE s.sequence_schema = 'public'
OR (${includeMigrationJournal}::boolean AND s.sequence_schema = ${DRIZZLE_SCHEMA})
WHERE ${sql.unsafe(nonSystemSchemaPredicate("s.sequence_schema"))}
ORDER BY s.sequence_schema, s.sequence_name
`;
const sequences = allSequences.filter(
(seq) => !seq.owner_table || includedTableNames.has(tableKey(seq.owner_schema ?? "public", seq.owner_table)),
);
const schemas = new Set<string>();
const schemas = new Set<string>(includedSchemas);
for (const table of tables) schemas.add(table.schema_name);
for (const seq of sequences) schemas.add(seq.sequence_schema);
const extraSchemas = [...schemas].filter((schemaName) => schemaName !== "public");
@@ -618,6 +617,12 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
emit("");
}
for (const e of enums) {
const labels = e.labels.map((l) => `'${l.replace(/'/g, "''")}'`).join(", ");
emitStatement(`CREATE TYPE ${quoteQualifiedName(e.schema_name, e.typname)} AS ENUM (${labels});`);
}
if (enums.length > 0) emit("");
const extensions = await sql<ExtensionDefinition[]>`
SELECT
e.extname AS extension_name,
@@ -655,6 +660,7 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
const columns = await sql<{
column_name: string;
data_type: string;
udt_schema: string;
udt_name: string;
is_nullable: string;
column_default: string | null;
@@ -662,7 +668,7 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
numeric_precision: number | null;
numeric_scale: number | null;
}[]>`
SELECT column_name, data_type, udt_name, is_nullable, column_default,
SELECT column_name, data_type, udt_schema, udt_name, is_nullable, column_default,
character_maximum_length, numeric_precision, numeric_scale
FROM information_schema.columns
WHERE table_schema = ${schema_name} AND table_name = ${tablename}
@@ -676,9 +682,12 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
for (const col of columns) {
let typeStr: string;
if (col.data_type === "USER-DEFINED") {
typeStr = `"${col.udt_name}"`;
typeStr = quoteQualifiedName(col.udt_schema, col.udt_name);
} else if (col.data_type === "ARRAY") {
typeStr = `${col.udt_name.replace(/^_/, "")}[]`;
const elementType = col.udt_name.replace(/^_/, "");
typeStr = col.udt_schema === "pg_catalog"
? `${elementType}[]`
: `${quoteQualifiedName(col.udt_schema, elementType)}[]`;
} else if (col.data_type === "character varying") {
typeStr = col.character_maximum_length
? `varchar(${col.character_maximum_length})`
@@ -761,10 +770,8 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
JOIN pg_namespace tgtn ON tgtn.oid = tgt.relnamespace
JOIN pg_attribute sa ON sa.attrelid = src.oid AND sa.attnum = ANY(c.conkey)
JOIN pg_attribute ta ON ta.attrelid = tgt.oid AND ta.attnum = ANY(c.confkey)
WHERE c.contype = 'f' AND (
srcn.nspname = 'public'
OR (${includeMigrationJournal}::boolean AND srcn.nspname = ${DRIZZLE_SCHEMA})
)
WHERE c.contype = 'f'
AND ${sql.unsafe(nonSystemSchemaPredicate("srcn.nspname"))}
GROUP BY c.conname, srcn.nspname, src.relname, tgtn.nspname, tgt.relname, c.confupdtype, c.confdeltype
ORDER BY srcn.nspname, src.relname, c.conname
`;
@@ -800,10 +807,8 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
JOIN pg_class t ON t.oid = c.conrelid
JOIN pg_namespace n ON n.oid = t.relnamespace
JOIN pg_attribute a ON a.attrelid = t.oid AND a.attnum = ANY(c.conkey)
WHERE c.contype = 'u' AND (
n.nspname = 'public'
OR (${includeMigrationJournal}::boolean AND n.nspname = ${DRIZZLE_SCHEMA})
)
WHERE c.contype = 'u'
AND ${sql.unsafe(nonSystemSchemaPredicate("n.nspname"))}
GROUP BY c.conname, n.nspname, t.relname
ORDER BY n.nspname, t.relname, c.conname
`;
@@ -822,10 +827,7 @@ export async function runDatabaseBackup(opts: RunDatabaseBackupOptions): Promise
const allIndexes = await sql<{ schema_name: string; tablename: string; indexdef: string }[]>`
SELECT schemaname AS schema_name, tablename, indexdef
FROM pg_indexes
WHERE (
schemaname = 'public'
OR (${includeMigrationJournal}::boolean AND schemaname = ${DRIZZLE_SCHEMA})
)
WHERE ${sql.unsafe(nonSystemSchemaPredicate("schemaname"))}
AND indexname NOT IN (
SELECT conname FROM pg_constraint c
JOIN pg_namespace n ON n.oid = c.connamespace

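`nonSystemSchemaPredicate` above widens the backup scope from `public`-only to every non-system schema. A JS-side mirror of the same rule, for illustration only (the real filter runs as a SQL predicate inside the catalog queries):

```typescript
// Mirrors the SQL predicate: keep user schemas, drop PostgreSQL system,
// TOAST, and temporary schemas.
function isUserSchema(name: string): boolean {
  return (
    name !== "pg_catalog" &&
    name !== "information_schema" &&
    !name.startsWith("pg_toast") && // LIKE 'pg_toast%' also covers pg_toast_temp_*
    !/^pg_temp_/.test(name)
  );
}
```

Under this rule, schemas like `drizzle` and `plugin_backup_scope` are in scope by default, which is why the explicit `includeMigrationJournal` plumbing could be retired.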

@@ -0,0 +1,13 @@
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_liveness_recovery_incident_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'harness_liveness_escalation'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');
--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_liveness_recovery_leaf_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_fingerprint")
WHERE "origin_kind" = 'harness_liveness_escalation'
AND "origin_fingerprint" <> 'default'
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');

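A partial unique index like the one above only constrains rows matching its WHERE clause, so closed or hidden issues never block a new escalation. A sketch of the equivalent row-level predicate (field names mirror the columns; this illustrates the scope, it is not the database check itself):

```typescript
type IssueRow = {
  originKind: string;
  originId: string | null;
  hiddenAt: Date | null;
  status: string;
};

// A row participates in the (company_id, origin_kind, origin_id) uniqueness
// constraint only while it is an active, visible liveness-recovery issue.
function inDedupeScope(row: IssueRow): boolean {
  return (
    row.originKind === "harness_liveness_escalation" &&
    row.originId !== null &&
    row.hiddenAt === null &&
    row.status !== "done" &&
    row.status !== "cancelled"
  );
}
```

Once an issue is done, cancelled, or hidden, it leaves the scope and a fresh incident with the same origin can be created.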

@@ -0,0 +1,70 @@
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_at" timestamp with time zone;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_seq" integer DEFAULT 0 NOT NULL;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_stream" text;
--> statement-breakpoint
ALTER TABLE "heartbeat_runs" ADD COLUMN IF NOT EXISTS "last_output_bytes" bigint;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_runs_company_status_last_output_idx"
ON "heartbeat_runs" USING btree ("company_id","status","last_output_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_runs_company_status_process_started_idx"
ON "heartbeat_runs" USING btree ("company_id","status","process_started_at");
--> statement-breakpoint
CREATE TABLE IF NOT EXISTS "heartbeat_run_watchdog_decisions" (
"id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
"company_id" uuid NOT NULL,
"run_id" uuid NOT NULL,
"evaluation_issue_id" uuid,
"decision" text NOT NULL,
"snoozed_until" timestamp with time zone,
"reason" text,
"created_by_agent_id" uuid,
"created_by_user_id" text,
"created_by_run_id" uuid,
"created_at" timestamp with time zone DEFAULT now() NOT NULL
);
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_company_id_companies_id_fk" FOREIGN KEY ("company_id") REFERENCES "public"."companies"("id") ON DELETE no action ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE cascade ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_evaluation_issue_id_issues_id_fk" FOREIGN KEY ("evaluation_issue_id") REFERENCES "public"."issues"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_created_by_agent_id_agents_id_fk" FOREIGN KEY ("created_by_agent_id") REFERENCES "public"."agents"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
DO $$ BEGIN
ALTER TABLE "heartbeat_run_watchdog_decisions" ADD CONSTRAINT "heartbeat_run_watchdog_decisions_created_by_run_id_heartbeat_runs_id_fk" FOREIGN KEY ("created_by_run_id") REFERENCES "public"."heartbeat_runs"("id") ON DELETE set null ON UPDATE no action;
EXCEPTION
WHEN duplicate_object THEN null;
END $$;
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_run_watchdog_decisions_company_run_created_idx"
ON "heartbeat_run_watchdog_decisions" USING btree ("company_id","run_id","created_at");
--> statement-breakpoint
CREATE INDEX IF NOT EXISTS "heartbeat_run_watchdog_decisions_company_run_snooze_idx"
ON "heartbeat_run_watchdog_decisions" USING btree ("company_id","run_id","snoozed_until");
--> statement-breakpoint
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_stale_run_evaluation_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'stale_active_run_evaluation'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');

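The `snoozed_until` column suggests time-bounded suppression of watchdog re-evaluation for a run. A hedged sketch of that check (semantics inferred from the column name and snooze index, not confirmed by this diff):

```typescript
// Inferred semantics: a decision with a future snoozed_until keeps the
// watchdog from re-evaluating the run until that time has passed.
function isSnoozed(snoozedUntil: Date | null, now: Date): boolean {
  return snoozedUntil !== null && snoozedUntil.getTime() > now.getTime();
}
```

The `(company_id, run_id, snoozed_until)` index above would make "latest unexpired snooze for this run" a cheap lookup under this reading.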

@@ -0,0 +1 @@
ALTER TABLE "companies" ALTER COLUMN "require_board_approval_for_new_agents" SET DEFAULT false;


@@ -0,0 +1,6 @@
CREATE UNIQUE INDEX IF NOT EXISTS "issues_active_stranded_issue_recovery_uq"
ON "issues" USING btree ("company_id","origin_kind","origin_id")
WHERE "origin_kind" = 'stranded_issue_recovery'
AND "origin_id" IS NOT NULL
AND "hidden_at" IS NULL
AND "status" NOT IN ('done', 'cancelled');


@@ -0,0 +1 @@
ALTER TABLE "companies" ADD COLUMN "attachment_max_bytes" integer DEFAULT 10485760 NOT NULL;


@@ -0,0 +1,4 @@
CREATE UNIQUE INDEX "issues_active_productivity_review_uq" ON "issues" USING btree ("company_id","origin_kind","origin_id") WHERE "issues"."origin_kind" = 'issue_productivity_review'
and "issues"."origin_id" is not null
and "issues"."hidden_at" is null
and "issues"."status" not in ('done', 'cancelled');

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -484,6 +484,48 @@
"when": 1776959400000,
"tag": "0068_environment_local_driver_unique",
"breakpoints": true
},
{
"idx": 69,
"version": "7",
"when": 1776780003000,
"tag": "0069_liveness_recovery_dedupe",
"breakpoints": true
},
{
"idx": 70,
"version": "7",
"when": 1776780004000,
"tag": "0070_active_run_output_watchdog",
"breakpoints": true
},
{
"idx": 71,
"version": "7",
"when": 1777131234000,
"tag": "0071_default_hire_approval_off",
"breakpoints": true
},
{
"idx": 72,
"version": "7",
"when": 1777305216238,
"tag": "0072_large_sandman",
"breakpoints": true
},
{
"idx": 73,
"version": "7",
"when": 1777382021347,
"tag": "0073_shiny_salo",
"breakpoints": true
},
{
"idx": 74,
"version": "7",
"when": 1777384535070,
"tag": "0074_striped_genesis",
"breakpoints": true
}
]
}
}


@@ -13,9 +13,12 @@ export const companies = pgTable(
issueCounter: integer("issue_counter").notNull().default(0),
budgetMonthlyCents: integer("budget_monthly_cents").notNull().default(0),
spentMonthlyCents: integer("spent_monthly_cents").notNull().default(0),
+attachmentMaxBytes: integer("attachment_max_bytes")
+  .notNull()
+  .default(10 * 1024 * 1024),
requireBoardApprovalForNewAgents: boolean("require_board_approval_for_new_agents")
  .notNull()
-  .default(true),
+  .default(false),
feedbackDataSharingEnabled: boolean("feedback_data_sharing_enabled")
.notNull()
.default(false),


@@ -0,0 +1,34 @@
import { index, pgTable, text, timestamp, uuid } from "drizzle-orm/pg-core";
import { agents } from "./agents.js";
import { companies } from "./companies.js";
import { heartbeatRuns } from "./heartbeat_runs.js";
import { issues } from "./issues.js";
export const heartbeatRunWatchdogDecisions = pgTable(
"heartbeat_run_watchdog_decisions",
{
id: uuid("id").primaryKey().defaultRandom(),
companyId: uuid("company_id").notNull().references(() => companies.id),
runId: uuid("run_id").notNull().references(() => heartbeatRuns.id, { onDelete: "cascade" }),
evaluationIssueId: uuid("evaluation_issue_id").references(() => issues.id, { onDelete: "set null" }),
decision: text("decision").notNull(),
snoozedUntil: timestamp("snoozed_until", { withTimezone: true }),
reason: text("reason"),
createdByAgentId: uuid("created_by_agent_id").references(() => agents.id, { onDelete: "set null" }),
createdByUserId: text("created_by_user_id"),
createdByRunId: uuid("created_by_run_id").references(() => heartbeatRuns.id, { onDelete: "set null" }),
createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
},
(table) => ({
companyRunCreatedIdx: index("heartbeat_run_watchdog_decisions_company_run_created_idx").on(
table.companyId,
table.runId,
table.createdAt,
),
companyRunSnoozeIdx: index("heartbeat_run_watchdog_decisions_company_run_snooze_idx").on(
table.companyId,
table.runId,
table.snoozedUntil,
),
}),
);


@@ -34,6 +34,10 @@ export const heartbeatRuns = pgTable(
processPid: integer("process_pid"),
processGroupId: integer("process_group_id"),
processStartedAt: timestamp("process_started_at", { withTimezone: true }),
lastOutputAt: timestamp("last_output_at", { withTimezone: true }),
lastOutputSeq: integer("last_output_seq").notNull().default(0),
lastOutputStream: text("last_output_stream"),
lastOutputBytes: bigint("last_output_bytes", { mode: "number" }),
retryOfRunId: uuid("retry_of_run_id").references((): AnyPgColumn => heartbeatRuns.id, {
onDelete: "set null",
}),
@@ -64,5 +68,15 @@ export const heartbeatRuns = pgTable(
table.livenessState,
table.createdAt,
),
companyStatusLastOutputIdx: index("heartbeat_runs_company_status_last_output_idx").on(
table.companyId,
table.status,
table.lastOutputAt,
),
companyStatusProcessStartedIdx: index("heartbeat_runs_company_status_process_started_idx").on(
table.companyId,
table.status,
table.processStartedAt,
),
}),
);
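The new `last_output_*` columns and the two status indexes above give an output watchdog what it needs to find active runs that have gone quiet. A hedged sketch of how such a stall check might read those fields (the helper name and threshold are illustrative, not from this PR):

```typescript
// Hedged sketch (helper name and threshold are illustrative, not from this
// PR): how a watchdog could combine last_output_at and process_started_at
// to flag an active run that has stopped producing output.
function isRunOutputStalled(
  lastOutputAt: Date | null,
  processStartedAt: Date | null,
  now: Date,
  thresholdMs: number,
): boolean {
  // Prefer the last observed output; fall back to process start so a run
  // that never emitted anything can still be flagged.
  const reference = lastOutputAt ?? processStartedAt;
  if (!reference) return false; // process not started yet; nothing to judge
  return now.getTime() - reference.getTime() > thresholdMs;
}
```

The `(company_id, status, last_output_at)` index above lets such a scan stay cheap: it can range over active runs ordered by last output without touching quiescent rows.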


@@ -53,6 +53,7 @@ export { documentRevisions } from "./document_revisions.js";
export { issueDocuments } from "./issue_documents.js";
export { heartbeatRuns } from "./heartbeat_runs.js";
export { heartbeatRunEvents } from "./heartbeat_run_events.js";
export { heartbeatRunWatchdogDecisions } from "./heartbeat_run_watchdog_decisions.js";
export { costEvents } from "./cost_events.js";
export { financeEvents } from "./finance_events.js";
export { approvals } from "./approvals.js";


@@ -91,5 +91,45 @@ export const issues = pgTable(
and ${table.executionRunId} is not null
and ${table.status} in ('backlog', 'todo', 'in_progress', 'in_review', 'blocked')`,
),
activeLivenessRecoveryIncidentIdx: uniqueIndex("issues_active_liveness_recovery_incident_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeLivenessRecoveryLeafIdx: uniqueIndex("issues_active_liveness_recovery_leaf_uq")
.on(table.companyId, table.originKind, table.originFingerprint)
.where(
sql`${table.originKind} = 'harness_liveness_escalation'
and ${table.originFingerprint} <> 'default'
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeStaleRunEvaluationIdx: uniqueIndex("issues_active_stale_run_evaluation_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stale_active_run_evaluation'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeProductivityReviewIdx: uniqueIndex("issues_active_productivity_review_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'issue_productivity_review'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
activeStrandedIssueRecoveryIdx: uniqueIndex("issues_active_stranded_issue_recovery_uq")
.on(table.companyId, table.originKind, table.originId)
.where(
sql`${table.originKind} = 'stranded_issue_recovery'
and ${table.originId} is not null
and ${table.hiddenAt} is null
and ${table.status} not in ('done', 'cancelled')`,
),
}),
);


@@ -33,77 +33,56 @@ export type EmbeddedPostgresTestDatabase = {
let embeddedPostgresSupportPromise: Promise<EmbeddedPostgresTestSupport> | null = null;
const DEFAULT_PAPERCLIP_EMBEDDED_POSTGRES_PORT = 54329;
function getReservedTestPorts(): Set<number> {
const configuredPorts = [
DEFAULT_PAPERCLIP_EMBEDDED_POSTGRES_PORT,
Number.parseInt(process.env.PAPERCLIP_EMBEDDED_POSTGRES_PORT ?? "", 10),
...String(process.env.PAPERCLIP_TEST_POSTGRES_RESERVED_PORTS ?? "")
.split(",")
.map((value) => Number.parseInt(value.trim(), 10)),
];
return new Set(configuredPorts.filter((port) => Number.isInteger(port) && port > 0 && port <= 65535));
}
async function getEmbeddedPostgresCtor(): Promise<EmbeddedPostgresCtor> {
const mod = await import("embedded-postgres");
return mod.default as EmbeddedPostgresCtor;
}
async function getAvailablePort(): Promise<number> {
return await new Promise((resolve, reject) => {
const server = net.createServer();
server.unref();
server.on("error", reject);
server.listen(0, "127.0.0.1", () => {
const address = server.address();
if (!address || typeof address === "string") {
server.close(() => reject(new Error("Failed to allocate test port")));
return;
}
const { port } = address;
server.close((error) => {
if (error) reject(error);
else resolve(port);
const reservedPorts = getReservedTestPorts();
for (let attempt = 0; attempt < 20; attempt += 1) {
const port = await new Promise<number>((resolve, reject) => {
const server = net.createServer();
server.unref();
server.on("error", reject);
server.listen(0, "127.0.0.1", () => {
const address = server.address();
if (!address || typeof address === "string") {
server.close(() => reject(new Error("Failed to allocate test port")));
return;
}
const { port } = address;
server.close((error) => {
if (error) reject(error);
else resolve(port);
});
});
});
});
}
function formatEmbeddedPostgresError(error: unknown): string {
if (error instanceof Error && error.message.length > 0) return error.message;
if (typeof error === "string" && error.length > 0) return error;
return "embedded Postgres startup failed";
}
async function probeEmbeddedPostgresSupport(): Promise<EmbeddedPostgresTestSupport> {
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-embedded-postgres-probe-"));
const port = await getAvailablePort();
const EmbeddedPostgres = await getEmbeddedPostgresCtor();
const instance = new EmbeddedPostgres({
databaseDir: dataDir,
user: "paperclip",
password: "paperclip",
port,
persistent: true,
initdbFlags: ["--encoding=UTF8", "--locale=C", "--lc-messages=C"],
onLog: () => {},
onError: () => {},
});
try {
await instance.initialise();
await instance.start();
return { supported: true };
} catch (error) {
return {
supported: false,
reason: formatEmbeddedPostgresError(error),
};
} finally {
await instance.stop().catch(() => {});
fs.rmSync(dataDir, { recursive: true, force: true });
if (!reservedPorts.has(port)) return port;
}
throw new Error(
`Failed to allocate embedded Postgres test port outside reserved Paperclip ports: ${[
...reservedPorts,
].join(", ")}`,
);
}
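The reworked `getAvailablePort` above retries OS-assigned ephemeral ports until one falls outside the reserved Paperclip set, throwing after a bounded number of attempts. The retry loop can be sketched in isolation with the allocator injected (names here are illustrative, not from the codebase):

```typescript
// Isolated sketch of the retry pattern above (names illustrative): keep
// asking the OS for a free port until one falls outside the reserved set,
// giving up after a bounded number of attempts.
async function allocateOutsideReserved(
  allocate: () => Promise<number>,
  reserved: Set<number>,
  maxAttempts = 20,
): Promise<number> {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const port = await allocate();
    if (!reserved.has(port)) return port;
  }
  throw new Error(
    `Failed to allocate a port outside the reserved set: ${[...reserved].join(", ")}`,
  );
}
```

Injecting the allocator keeps the loop testable without binding real sockets, which is why the sketch separates it from the `net.createServer` plumbing.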
export async function getEmbeddedPostgresTestSupport(): Promise<EmbeddedPostgresTestSupport> {
if (!embeddedPostgresSupportPromise) {
embeddedPostgresSupportPromise = probeEmbeddedPostgresSupport();
}
return await embeddedPostgresSupportPromise;
}
export async function startEmbeddedPostgresTestDatabase(
tempDirPrefix: string,
): Promise<EmbeddedPostgresTestDatabase> {
async function createEmbeddedPostgresTestInstance(tempDirPrefix: string) {
const dataDir = fs.mkdtempSync(path.join(os.tmpdir(), tempDirPrefix));
const port = await getAvailablePort();
const EmbeddedPostgres = await getEmbeddedPostgresCtor();
@@ -118,6 +97,51 @@ export async function startEmbeddedPostgresTestDatabase(
onError: () => {},
});
return { dataDir, port, instance };
}
function cleanupEmbeddedPostgresTestDirs(dataDir: string) {
fs.rmSync(dataDir, { recursive: true, force: true });
}
function formatEmbeddedPostgresError(error: unknown): string {
if (error instanceof Error && error.message.length > 0) return error.message;
if (typeof error === "string" && error.length > 0) return error;
return "embedded Postgres startup failed";
}
async function probeEmbeddedPostgresSupport(): Promise<EmbeddedPostgresTestSupport> {
const { dataDir, instance } = await createEmbeddedPostgresTestInstance(
"paperclip-embedded-postgres-probe-",
);
try {
await instance.initialise();
await instance.start();
return { supported: true };
} catch (error) {
return {
supported: false,
reason: formatEmbeddedPostgresError(error),
};
} finally {
await instance.stop().catch(() => {});
cleanupEmbeddedPostgresTestDirs(dataDir);
}
}
export async function getEmbeddedPostgresTestSupport(): Promise<EmbeddedPostgresTestSupport> {
if (!embeddedPostgresSupportPromise) {
embeddedPostgresSupportPromise = probeEmbeddedPostgresSupport();
}
return await embeddedPostgresSupportPromise;
}
export async function startEmbeddedPostgresTestDatabase(
tempDirPrefix: string,
): Promise<EmbeddedPostgresTestDatabase> {
const { dataDir, port, instance } = await createEmbeddedPostgresTestInstance(tempDirPrefix);
try {
await instance.initialise();
await instance.start();
@@ -131,12 +155,12 @@ export async function startEmbeddedPostgresTestDatabase(
connectionString,
cleanup: async () => {
await instance.stop().catch(() => {});
-fs.rmSync(dataDir, { recursive: true, force: true });
+cleanupEmbeddedPostgresTestDirs(dataDir);
},
};
} catch (error) {
await instance.stop().catch(() => {});
-fs.rmSync(dataDir, { recursive: true, force: true });
+cleanupEmbeddedPostgresTestDirs(dataDir);
throw new Error(
`Failed to start embedded PostgreSQL test database: ${formatEmbeddedPostgresError(error)}`,
);


@@ -450,7 +450,7 @@ export function createToolDefinitions(client: PaperclipApiClient): ToolDefinitio
),
makeTool(
"paperclipUpdateIssue",
-"Patch an issue, optionally including a comment",
+"Patch an issue, optionally including a comment; include resume=true when intentionally requesting follow-up on resumable closed work",
updateIssueToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("PATCH", `/issues/${encodeURIComponent(issueId)}`, { body }),
@@ -475,7 +475,7 @@ export function createToolDefinitions(client: PaperclipApiClient): ToolDefinitio
),
makeTool(
"paperclipAddComment",
-"Add a comment to an issue",
+"Add a comment to an issue; include resume=true when intentionally requesting follow-up on resumable closed work",
addCommentToolSchema,
async ({ issueId, ...body }) =>
client.requestJson("POST", `/issues/${encodeURIComponent(issueId)}/comments`, { body }),


@@ -4,9 +4,9 @@ import fs from "node:fs";
import path from "node:path";
import { fileURLToPath } from "node:url";
-const VALID_TEMPLATES = ["default", "connector", "workspace"] as const;
+const VALID_TEMPLATES = ["default", "connector", "workspace", "environment"] as const;
type PluginTemplate = (typeof VALID_TEMPLATES)[number];
-const VALID_CATEGORIES = new Set(["connector", "workspace", "automation", "ui"] as const);
+const VALID_CATEGORIES = new Set(["connector", "workspace", "automation", "ui", "environment"] as const);
export interface ScaffoldPluginOptions {
pluginName: string;
@@ -15,7 +15,7 @@ export interface ScaffoldPluginOptions {
displayName?: string;
description?: string;
author?: string;
-category?: "connector" | "workspace" | "automation" | "ui";
+category?: "connector" | "workspace" | "automation" | "ui" | "environment";
sdkPath?: string;
}
@@ -138,7 +138,7 @@ export function scaffoldPluginProject(options: ScaffoldPluginOptions): string {
const displayName = options.displayName ?? makeDisplayName(options.pluginName);
const description = options.description ?? "A Paperclip plugin";
const author = options.author ?? "Plugin Author";
-const category = options.category ?? (template === "workspace" ? "workspace" : "connector");
+const category = options.category ?? (template === "workspace" ? "workspace" : template === "environment" ? "environment" : "connector");
const manifestId = packageToManifestId(options.pluginName);
const localSdkPath = path.resolve(options.sdkPath ?? getLocalSdkPackagePath());
const localSharedPath = getLocalSharedPackagePath(localSdkPath);
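The template and category wiring above extends the scaffolder with an `environment` template whose default category is `environment`, falling back to the previous `workspace`/`connector` behavior otherwise. The resolution rule can be sketched as a small standalone helper (the function name is hypothetical, mirroring the nested ternary in `scaffoldPluginProject`):

```typescript
// Hypothetical helper mirroring the default-category ternary in
// scaffoldPluginProject: an explicit category wins, otherwise each
// template maps to its natural default.
type PluginTemplate = "default" | "connector" | "workspace" | "environment";
type PluginCategory = "connector" | "workspace" | "automation" | "ui" | "environment";

function defaultCategory(template: PluginTemplate, override?: PluginCategory): PluginCategory {
  if (override) return override;
  if (template === "workspace") return "workspace";
  if (template === "environment") return "environment";
  return "connector"; // "default" and "connector" templates both scaffold connectors
}
```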
@@ -296,9 +296,231 @@ export default defineConfig({
`,
);
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
if (template === "environment") {
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: ${quote(manifestId)},
apiVersion: 1,
version: "0.1.0",
displayName: ${quote(displayName)},
description: ${quote(description)},
author: ${quote(author)},
categories: [${quote(category)}],
capabilities: [
"environment.drivers.register",
"plugin.state.read",
"plugin.state.write"
],
entrypoints: {
worker: "./dist/worker.js",
ui: "./dist/ui"
},
environmentDrivers: [
{
driverKey: ${quote(manifestId + "-driver")},
displayName: ${quote(displayName + " Driver")}
}
],
ui: {
slots: [
{
type: "dashboardWidget",
id: "health-widget",
displayName: ${quote(`${displayName} Health`)},
exportName: "DashboardWidget"
}
]
}
};
export default manifest;
`,
);
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
import type {
PluginEnvironmentValidateConfigParams,
PluginEnvironmentProbeParams,
PluginEnvironmentAcquireLeaseParams,
PluginEnvironmentResumeLeaseParams,
PluginEnvironmentReleaseLeaseParams,
PluginEnvironmentDestroyLeaseParams,
PluginEnvironmentRealizeWorkspaceParams,
PluginEnvironmentExecuteParams,
} from "@paperclipai/plugin-sdk";
const plugin = definePlugin({
async setup(ctx) {
ctx.data.register("health", async () => {
return { status: "ok", checkedAt: new Date().toISOString() };
});
},
async onHealth() {
return { status: "ok", message: "Environment plugin worker is running" };
},
async onEnvironmentValidateConfig(params: PluginEnvironmentValidateConfigParams) {
if (!params.config || typeof params.config !== "object") {
return { ok: false, errors: ["Config must be a non-null object"] };
}
return { ok: true, normalizedConfig: params.config };
},
async onEnvironmentProbe(_params: PluginEnvironmentProbeParams) {
return { ok: true, summary: "Environment is reachable" };
},
async onEnvironmentAcquireLease(params: PluginEnvironmentAcquireLeaseParams) {
const providerLeaseId = \`lease-\${params.runId}-\${Date.now()}\`;
return {
providerLeaseId,
metadata: { acquiredAt: new Date().toISOString() },
};
},
async onEnvironmentResumeLease(params: PluginEnvironmentResumeLeaseParams) {
return {
providerLeaseId: params.providerLeaseId,
metadata: { ...params.leaseMetadata, resumed: true },
};
},
async onEnvironmentReleaseLease(_params: PluginEnvironmentReleaseLeaseParams) {
// Release provider-side resources here
},
async onEnvironmentDestroyLease(_params: PluginEnvironmentDestroyLeaseParams) {
// Destroy provider-side resources here
},
async onEnvironmentRealizeWorkspace(params: PluginEnvironmentRealizeWorkspaceParams) {
const cwd = params.workspace.remotePath ?? params.workspace.localPath ?? "/tmp/workspace";
return { cwd, metadata: { realized: true } };
},
async onEnvironmentExecute(params: PluginEnvironmentExecuteParams) {
// Replace this with real command execution against your provider
return {
exitCode: 0,
timedOut: false,
stdout: \`Executed: \${params.command}\`,
stderr: "",
};
},
});
export default plugin;
runWorker(plugin, import.meta.url);
`,
);
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
type HealthData = {
status: "ok" | "degraded" | "error";
checkedAt: string;
};
export function DashboardWidget(_props: PluginWidgetProps) {
const { data, loading, error } = usePluginData<HealthData>("health");
if (loading) return <div>Loading environment health...</div>;
if (error) return <div>Plugin error: {error.message}</div>;
return (
<div style={{ display: "grid", gap: "0.5rem" }}>
<strong>${displayName}</strong>
<div>Health: {data?.status ?? "unknown"}</div>
<div>Checked: {data?.checkedAt ?? "never"}</div>
</div>
);
}
`,
);
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
import {
createEnvironmentTestHarness,
createFakeEnvironmentDriver,
assertEnvironmentEventOrder,
assertLeaseLifecycle,
} from "@paperclipai/plugin-sdk/testing";
import manifest from "../src/manifest.js";
import plugin from "../src/worker.js";
const ENV_ID = "env-test-1";
const BASE_PARAMS = {
driverKey: manifest.environmentDrivers![0].driverKey,
companyId: "co-1",
environmentId: ENV_ID,
config: {},
};
describe("environment plugin scaffold", () => {
it("validates config", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const result = await plugin.definition.onEnvironmentValidateConfig!({
driverKey: BASE_PARAMS.driverKey,
config: { host: "test" },
});
expect(result.ok).toBe(true);
});
it("probes the environment", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const result = await plugin.definition.onEnvironmentProbe!(BASE_PARAMS);
expect(result.ok).toBe(true);
});
it("runs a full lease lifecycle through the harness", async () => {
const driver = createFakeEnvironmentDriver({ driverKey: BASE_PARAMS.driverKey });
const harness = createEnvironmentTestHarness({ manifest, environmentDriver: driver });
await plugin.definition.setup(harness.ctx);
const lease = await harness.acquireLease({ ...BASE_PARAMS, runId: "run-1" });
expect(lease.providerLeaseId).toBeTruthy();
await harness.realizeWorkspace({
...BASE_PARAMS,
lease,
workspace: { localPath: "/tmp/test" },
});
await harness.releaseLease({
...BASE_PARAMS,
providerLeaseId: lease.providerLeaseId,
});
assertEnvironmentEventOrder(harness.environmentEvents, [
"acquireLease",
"realizeWorkspace",
"releaseLease",
]);
assertLeaseLifecycle(harness.environmentEvents, ENV_ID);
});
});
`,
);
} else {
writeFile(
path.join(outputDir, "src", "manifest.ts"),
`import type { PaperclipPluginManifestV1 } from "@paperclipai/plugin-sdk";
const manifest: PaperclipPluginManifestV1 = {
id: ${quote(manifestId)},
@@ -331,11 +553,11 @@ const manifest: PaperclipPluginManifestV1 = {
export default manifest;
`,
);
);
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
writeFile(
path.join(outputDir, "src", "worker.ts"),
`import { definePlugin, runWorker } from "@paperclipai/plugin-sdk";
const plugin = definePlugin({
async setup(ctx) {
@@ -363,11 +585,11 @@ const plugin = definePlugin({
export default plugin;
runWorker(plugin, import.meta.url);
`,
);
);
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginAction, usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
writeFile(
path.join(outputDir, "src", "ui", "index.tsx"),
`import { usePluginAction, usePluginData, type PluginWidgetProps } from "@paperclipai/plugin-sdk/ui";
type HealthData = {
status: "ok" | "degraded" | "error";
@@ -391,11 +613,11 @@ export function DashboardWidget(_props: PluginWidgetProps) {
);
}
`,
);
);
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
writeFile(
path.join(outputDir, "tests", "plugin.spec.ts"),
`import { describe, expect, it } from "vitest";
import { createTestHarness } from "@paperclipai/plugin-sdk/testing";
import manifest from "../src/manifest.js";
import plugin from "../src/worker.js";
@@ -416,7 +638,8 @@ describe("plugin scaffold", () => {
});
});
`,
);
);
}
writeFile(
path.join(outputDir, "README.md"),


@@ -5,13 +5,13 @@
"private": true,
"description": "A Paperclip plugin",
"scripts": {
-"prebuild": "node ../../../../scripts/ensure-plugin-build-deps.mjs",
+"prebuild": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps",
"build": "node ./esbuild.config.mjs",
"build:rollup": "rollup -c",
"dev": "node ./esbuild.config.mjs --watch",
"dev:ui": "paperclip-plugin-dev-server --root . --ui-dir dist/ui --port 4177",
"test": "vitest run --config ./vitest.config.ts",
-"typecheck": "pnpm --filter @paperclipai/plugin-sdk build && tsc --noEmit"
+"typecheck": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps && tsc --noEmit"
},
"paperclipPlugin": {
"manifest": "./dist/manifest.js",


@@ -13,10 +13,10 @@
"ui": "./dist/ui/"
},
"scripts": {
-"prebuild": "node ../../../../scripts/ensure-plugin-build-deps.mjs",
+"prebuild": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps",
"build": "tsc && node ./scripts/build-ui.mjs",
"clean": "rm -rf dist",
-"typecheck": "pnpm --filter @paperclipai/plugin-sdk build && tsc --noEmit"
+"typecheck": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps && tsc --noEmit"
},
"dependencies": {
"@codemirror/lang-javascript": "^6.2.2",


@@ -13,10 +13,10 @@
"ui": "./dist/ui/"
},
"scripts": {
-"prebuild": "node ../../../../scripts/ensure-plugin-build-deps.mjs",
+"prebuild": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps",
"build": "tsc",
"clean": "rm -rf dist",
-"typecheck": "pnpm --filter @paperclipai/plugin-sdk build && tsc --noEmit"
+"typecheck": "pnpm --filter @paperclipai/plugin-sdk ensure-build-deps && tsc --noEmit"
},
"dependencies": {
"@paperclipai/plugin-sdk": "workspace:*"

Some files were not shown because too many files have changed in this diff.