Compare commits

...

152 Commits

Author SHA1 Message Date
Dotta
bab5136645 Keep handoff resolution logging non-fatal 2026-05-05 12:16:30 -05:00
Dotta
09ed4e54cb Add operator QoL screenshots 2026-05-05 12:06:21 -05:00
Dotta
783f4d2f28 Address operator QoL PR feedback 2026-05-05 11:59:50 -05:00
Dotta
433326ffcb Improve operator workflow QoL
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-05 11:34:01 -05:00
Dotta
3c73ed26b5 Expand plugin host surface (#5205)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The plugin system is the extension boundary for optional product
capabilities
> - Rich plugins need more than a worker entrypoint: they need scoped
database storage, local project folders, managed agents/routines, host
navigation, and reusable UI components
> - The LLM Wiki work exposed those missing host surfaces while keeping
plugin code outside the core control plane
> - This pull request expands the core plugin host, SDK, server APIs,
and UI bridge so plugins can declare and use those surfaces
> - The benefit is that future plugins can integrate with Paperclip
through documented, validated contracts instead of bespoke server or UI
imports

## What Changed

- Added plugin-managed database namespaces and migration tracking,
including Drizzle schema/migration files and SQL validation for
namespace isolation.
- Added server support for plugin local folders, managed agents, managed
routines, scoped plugin APIs, and plugin operation visibility.
- Expanded shared plugin manifest/types/validators and SDK
host/testing/UI exports for richer plugin surfaces.
- Added reusable UI pieces for file trees, managed routines, resizable
sidebars, route sidebars, and plugin bridge initialization.
- Updated plugin docs and example plugins to use the expanded host and
SDK surface.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
packages/shared/src/validators/plugin.test.ts
server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-local-folders.test.ts
server/src/__tests__/plugin-managed-agents.test.ts
server/src/__tests__/plugin-managed-routines.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
ui/src/api/plugins.test.ts ui/src/components/FileTree.test.tsx
ui/src/components/ResizableSidebarPane.test.tsx
ui/src/pages/PluginPage.test.tsx ui/src/plugins/bridge.test.ts` passed:
11 files, 67 tests.
- Confirmed this PR changes 89 files and does not include
`pnpm-lock.yaml` or `.github/workflows/*`.

## Risks

- Medium: this expands plugin host contracts across db/shared/server/ui
and includes a new core migration (`0076_useful_elektra.sql`).
- The plugin database namespace validator is intentionally restrictive;
plugin authors may need follow-up affordances for SQL patterns that
remain blocked.
- Merge this before the LLM Wiki plugin PR so the plugin can resolve the
new SDK and host APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window size was not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-05 07:42:57 -05:00
Dotta
d6bee62f02 Fix Cloud tenant issue identifier routes (#5196)
## Summary

- Allow Cloud tenant issue identifiers with alphanumeric prefixes, such
as `PC1897-1`, to normalize as issue references.
- Resolve those identifiers through issue detail/update routes, active
run/live run polling, activity, costs, and `issueService.getById`.
- Keep UI issue-link parsing aligned so tenant links normalize back to
`/issues/<IDENTIFIER>`.

## Root Cause

Cloud tenant issue prefixes include digits from the stack-id hash. The
app-side route normalization still accepted only all-letter prefixes, so
`/api/issues/PC1897-1` skipped identifier lookup and fell through as a
non-UUID id.

## Verification

- `pnpm exec vitest run packages/shared/src/issue-references.test.ts
ui/src/lib/issue-reference.test.ts
server/src/__tests__/issue-identifier-routes.test.ts
server/src/__tests__/activity-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/agent-live-run-routes.test.ts
server/src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `git diff --check`

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-04 13:20:58 -05:00
Dotta
edbb670c3b Merge pull request #5154 from paperclipai/pap-3474-docker-timeout
Raise Docker image build timeout
2026-05-03 23:01:46 -05:00
Dotta
fd10404374 Raise Docker image build timeout
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 22:52:33 -05:00
Devin Foley
47920f9c47 Speed up PR CI critical path (#5147)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, so
developer throughput on the control plane repo directly affects how fast
the product can evolve.
> - The PR workflow is part of that throughput surface because every
change waits on it before review and merge.
> - This branch started from measured evidence that the PR critical path
was dominated by work that was either serialized unnecessarily or placed
on the wrong part of the graph.
> - The biggest concrete problems were: the canary dry run living inside
`verify`, the server isolated suites running one-by-one in a single
lane, and duplicate CI work that the PR path was paying for without
increasing coverage proportionally.
> - This pull request restructures the PR workflow so those costs are
reduced without removing the important coverage that was already
protecting release and test quality.
> - Follow-up fixes on the branch hardened the new entrypoints so they
work on clean GitHub runners and so the reduced PR typecheck path stays
self-maintaining as workspace packages evolve.
> - The benefit is materially faster PR wall-clock time while keeping
canary packaging checks, serialized-suite isolation, plugin SDK
consumers, and explicit TypeScript coverage where builds do not already
provide it.

## What Changed

- Moved the PR canary dry run into its own `Canary Dry Run` job so it
still runs on PRs but no longer extends the `verify` critical path.
- Split the custom Vitest runner into `general`, `serialized`, and `all`
modes, and added shard support for the isolated server suites.
- Added `test:run:general` and `test:run:serialized` scripts, then
rewired PR CI to fan the serialized server suites out across a 4-way
matrix.
- Added the required `@paperclipai/plugin-sdk` build preflight before
the new reduced-scope typecheck and test entrypoints so they succeed on
clean CI runners.
- Replaced the hardcoded PR build-gap list with
`scripts/run-typecheck-build-gaps.mjs`, which discovers workspace
packages whose `build` scripts skip TypeScript and runs only their
explicit `typecheck` scripts.
- Removed the redundant `pnpm build` from the PR `e2e` job because the
Playwright onboarding path boots Paperclip from source.

## Verification

- `ruby -e "require 'yaml'; YAML.load_file('.github/workflows/pr.yml');
puts 'workflow ok'"`
- `node scripts/run-vitest-stable.mjs --mode general --dry-run`
- `node scripts/run-vitest-stable.mjs --mode serialized --shard-index 0
--shard-count 4 --dry-run`
- `pnpm run typecheck:build-gaps`
- `pnpm test:run:general`
- `pnpm test:run:serialized -- --shard-index 0 --shard-count 4`
- `pnpm build`
- `pnpm paperclipai onboard --yes --run`
- `curl http://127.0.0.1:3299/api/health`

## Risks

- Branch protection or required-check configuration may need to be
updated for the new standalone `Canary Dry Run` job and the
serialized-suite matrix job names.
- `scripts/run-typecheck-build-gaps.mjs` assumes packages that need
explicit PR-time typechecking are the ones whose `build` scripts omit
`tsc`; if build conventions change, that heuristic needs to stay
aligned.
- Serialized test sharding preserves per-suite isolation, but the first
few CI runs should still be watched for shard-balance or naming
assumptions in downstream tooling.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, using high reasoning
effort with shell, git, and file-edit tool use in a local worktree.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 20:20:14 -07:00
Dotta
e01ffc18d3 Merge pull request #5148 from paperclipai/pap-3474-tenant-identity-deploy
Support Cloud tenant identity bootstrap
2026-05-03 21:57:28 -05:00
Dotta
ae23e02526 Support Cloud tenant identity bootstrap
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-05-03 21:55:52 -05:00
Devin Foley
29401b231b fix(ci): gate new release packages on npm bootstrap (#5146)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
release automation is part of the core operator trust boundary.
> - The affected subsystem is npm/GitHub Actions release publishing for
the public monorepo packages.
> - The concrete failure was that a newly added package reached
`master`, the canary workflow attempted its first publish, and npm
trusted publishing was not yet bootstrapped for that package.
> - That means the problem is not just one broken run; it is a missing
pre-merge guard that lets release-ineligible packages land and only fail
once `publish_canary` runs.
> - This pull request makes release enrollment explicit, validates that
enrollment in CI, and adds a PR-time bootstrap check against npm for
changed release-enabled package manifests.
> - The result is that we keep trusted publishing, avoid teaching CI to
`npm adduser`, and move this class of failure from post-merge canary
time to pre-merge review time.

## What Changed

- Added `scripts/release-package-manifest.json` so release-managed
public packages are explicitly enrolled instead of being inferred from
every non-private workspace package.
- Hardened `scripts/release-package-map.mjs` to validate the manifest
before release workflows rewrite versions or assemble publish payloads.
- Added `scripts/check-release-package-bootstrap.mjs` and wired it into
`.github/workflows/pr.yml` so PRs that change a release-enabled package
manifest fail if that package does not already exist on npm.
- Added release-package manifest coverage tests to
`scripts/release-package-map.test.mjs` and included them in `pnpm run
test:release-registry`.
- Wired manifest validation into `.github/workflows/release.yml` and
documented the first-publish bootstrap policy in `doc/PUBLISHING.md` and
`doc/RELEASE-AUTOMATION-SETUP.md`.

## Verification

- `pnpm run test:release-registry`
- `./scripts/release.sh canary --skip-verify --dry-run`
- Confirmed the committed diff contains no obvious PII/secrets via
targeted pattern scan before pushing.

## Risks

- Low risk overall: this is CI/release-policy code, not product runtime
logic.
- The new PR bootstrap check depends on npm metadata availability, so a
transient npm outage could block a PR that changes a release-enabled
package manifest.
- The manifest introduces a new source of truth that must stay aligned
with public package additions, but that is intentional and now enforced.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter; GPT-5-based
coding agent with tool use, terminal execution, git, and GitHub CLI.
Exact served model ID/context window are not exposed by the local
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 19:31:28 -07:00
Devin Foley
a5430f010d Handle Gemini assistant message events in JSONL parser (#5143)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, including
agents
>   running the Gemini CLI (`gemini-local` adapter)
> - The Gemini CLI emits a JSONL event stream during a run that the
adapter
> parses to extract the assistant's response text, tool results, and
usage
> - Recent versions of the Gemini CLI emit assistant responses as
> `{ "type": "message", "role": "assistant", "content": ... }` events in
>   addition to the previously-handled event shapes
> - The parser was not handling the new event type, so the assistant's
actual
> response text was being silently dropped from parsed output. Callers
ended
>   up with empty assistant messages even when Gemini had successfully
>   responded
> - This PR teaches the parser to recognize `{type: "message", role:
>   "assistant"}` events and extract their content text via the same
>   `collectMessageText` helper used for other message-shaped events
> - The benefit is that Gemini runs surface the assistant's real
response in
> downstream consumers (issue comments, run logs, downstream agent
context)
>   instead of vanishing

## What Changed

- `packages/adapters/gemini-local/src/server/parse.ts`: in
`parseGeminiJsonl(...)`, add a branch for `event.type === "message"`
with
  `role === "assistant"` that calls
  `messages.push(...collectMessageText(event.content))`.
- `packages/adapters/gemini-local/src/server/parse.test.ts`: ~19 lines
of
  coverage for the new branch.

## Verification

- `pnpm --filter @paperclipai/adapter-gemini-local test -- parse`
- Manual QA: run a Gemini agent on an issue, confirm the assistant's
response
appears as the issue comment / run output. Before this fix the comment
was
  empty even when the run completed successfully.

## Risks

- Tightly scoped: 8 lines of production code in one parser branch. No
effect
  on existing event shapes or other adapters.
- If the Gemini CLI changes its event schema again, this branch may need
to be
  revisited — but adding it is strictly additive over current behaviour.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:50 -07:00
Devin Foley
6c090f84a9 Strip inherited host shell env from SSH remote execution (#5142)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on remote SSH hosts receive an env map built from
the host
>   process's env plus per-run additions like `PAPERCLIP_API_KEY`,
>   `PAPERCLIP_RUN_ID`, etc.
> - The env map currently includes inherited host vars by default,
including
> identity-bound ones like `PATH`, `HOME`, `USER`, `NVM_DIR`, `XDG_*` —
> variables whose values are meaningful only on the host they came from
> - Sending the host's `PATH` (containing host-only directories like a
local
> nvm install path) to a remote SSH box overrides the remote's actual
`PATH`
> and breaks command resolution. Same hazard for `HOME` (commands
looking for
> config files end up in a non-existent dir), `USER` (writes go to the
wrong
>   path), etc.
> - This PR adds `sanitizeSshRemoteEnv()` that drops inherited
identity-bound
> vars when their value matches the host process's value. Explicitly-set
> values pass through untouched, so callers that genuinely want to
override
>   remote `PATH` etc. still can — but accidental leakage from
>   `process.env` is filtered.
> - The benefit is that SSH remote execution stops corrupting the remote
> shell's environment with host-shaped paths, so commands resolve
correctly
>   against the remote PATH and config files land in the remote `HOME`

## What Changed

- New `sanitizeSshRemoteEnv(env, inheritedEnv = process.env)` in
`packages/adapter-utils/src/server-utils.ts`. The identity-bound key set
is:
    - `PATH`, `HOME`, `PWD`, `SHELL`, `USER`, `LOGNAME`
    - `NVM_DIR`, `TMPDIR`, `TMP`, `TEMP`
    - `XDG_CONFIG_HOME`, `XDG_CACHE_HOME`, `XDG_DATA_HOME`,
      `XDG_STATE_HOME`, `XDG_RUNTIME_DIR`
For any key in this set, the entry is dropped iff the env value equals
the
  inherited (host process) value. Other keys pass through unchanged.
- `readEnvValueCaseInsensitive(...)` helper handles Windows-style
  case-insensitive env var lookups.
- Wired into `resolveSpawnTarget(...)` for the SSH transport. Sandbox
and local
  paths are unaffected.
- Tests added in `server-utils.test.ts` (~50 lines) covering: matching
keys
filtered, mismatched keys preserved, non-identity keys passed through,
case
  insensitivity.

## Verification

- `pnpm --filter @paperclipai/adapter-utils test -- server-utils`
- Manual QA: run any adapter against an SSH-backed environment, confirm
remote command resolution works (e.g. `node`, `npm`, the adapter's CLI)
and
config files land in the remote user's `HOME`. Compare to the prior
behaviour
by transiently re-introducing the inherited `PATH` and watching commands
  fail with `command not found`.

## Risks

- Behavioural shift: SSH remote execution previously passed inherited
host env
  vars verbatim. Code that relied on that (e.g. a remote command somehow
  expecting the host's `PATH`) will see different behaviour. None of the
  adapter code in this repo has such a dependency.
- Edge case: if a caller explicitly sets `PATH` to the same value as the
host's
`PATH` (literally — same exact string), the sanitizer drops it as a
leak.
  In practice no caller constructs the env this way.
- Windows host: case-insensitive lookup handles `Path` vs `PATH`
correctly.
  Tested.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:36:13 -07:00
Devin Foley
90631b09b3 Let adapters declare runtime command spec for remote provisioning (#5141)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, running
adapter
> commands like `claude`, `codex`, `pi` either locally or on remote
runtimes
>   (SSH hosts, sandboxes, etc.)
> - On a fresh remote runtime — particularly an ephemeral sandbox — the
> adapter's CLI may not be installed yet. Today operators handle this
via
> external configuration (e.g. a project-level `provisionCommand` shell
> script) that has to know about every adapter the operator might want
to use
> - This means every adapter has its own well-known npm package, but
operators
>   end up writing duplicate provision shell scripts that paste together
> `npm install -g @anthropic-ai/claude-code`, `npm install -g
@openai/codex`,
>   etc. — knowledge the adapter itself already has
> - This PR moves that knowledge into the adapter modules: each adapter
declares
> how its runtime command should be detected and (if applicable)
installed
> via `getRuntimeCommandSpec(config)`. The execution path runs the
adapter's
> own install command on remote sandbox targets before launching, so a
fresh
> sandbox bootstraps itself instead of requiring a hand-written
provision script
> - The benefit is fewer footguns for operators provisioning remote
runtimes,
>   and a clean place for new adapters to plug in their install recipe

## What Changed

- New types in `packages/adapter-utils/src/types.ts`:
    - `AdapterRuntimeCommandSpec` describing `command`, optional
      `detectCommand`, and optional `installCommand`
    - Optional `getRuntimeCommandSpec(config)` on `ServerAdapterModule`
- Optional `runtimeCommandSpec` on `AdapterExecutionContext` so adapters
      receive the resolved spec at execute time
- New helper `ensureAdapterExecutionTargetRuntimeCommandInstalled(...)`
in
`packages/adapter-utils/src/execution-target.ts` that runs the install
command
on remote targets when `transport === "sandbox"`. SSH and local targets
are
  no-ops. Throws on timeout or non-zero exit so failures surface early.
- Each of `claude-local`, `codex-local`, `cursor-local`, `gemini-local`,
  `opencode-local`, `pi-local`'s `execute.ts` now reads
`ctx.runtimeCommandSpec?.installCommand` and calls the helper before
launching
  the adapter command.
- `server/src/adapters/registry.ts` declares `getRuntimeCommandSpec` for
each
  adapter:
- claude/codex/gemini/opencode/pi-local: `npm install -g <package>`
recipe via
a shared `buildNpmRuntimeCommandSpec` helper, with a defensive guard
that
only auto-installs when the configured `command` matches the well-known
      fallback (custom binaries are left alone).
- cursor-local: declares `command` only; no auto-install (no public npm
      package), preserving the existing manual setup.
- `server/src/services/heartbeat.ts` resolves the spec via
`adapter.getRuntimeCommandSpec?.(runtimeConfig)` and passes it through
to
  `AdapterExecutionContext`.
- Tests added in `execution-target.test.ts` (~75 lines), e2b
`plugin.test.ts` (~32 lines), and `environment-run-orchestrator.test.ts`
  (~76 lines).

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-run-orchestrator`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: run an adapter (claude/codex/etc.) against a fresh
sandbox-backed
environment that does NOT have the adapter CLI pre-installed. Confirm
the
install runs once at the start of the agent run and the adapter then
launches
successfully. Re-run on the same sandbox; confirm the install command is
  idempotent and the second run starts faster.
- Confirm SSH and local execution paths are unaffected (gated by
  `transport === "sandbox"`).

## Risks

- Behavioural shift on sandbox runs: a new install step now runs at the
start
  of every sandbox agent run for adapters with `installCommand` set. The
install commands are idempotent (`if ! command -v X >/dev/null 2>&1;
then
npm install -g <pkg>; fi`), so this is fast on warm sandboxes. On a cold
  sandbox, the first run takes longer.
- Operators who used the legacy project-level `provisionCommand` to
install
adapter CLIs can drop that part of their script; the adapter handles it
now.
  Existing scripts continue to work — installs are idempotent.
- The cursor-local adapter has no auto-install (no public npm package).
  Behaviour for cursor-local on sandboxes is unchanged.
- New optional surface on `ServerAdapterModule`. Plugins that don't
implement
  `getRuntimeCommandSpec` retain previous behaviour (no auto-install).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:35:36 -07:00
Devin Foley
2dce81fbf6 Add optional bridge proxy request logging via PAPERCLIP_BRIDGE_DEBUG (#5140)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents on remote sandboxes call back into the Paperclip control
plane via a
> callback bridge — the host process running the bridge proxies HTTP
requests
>   from the sandbox to the Paperclip API
> - When something goes wrong end-to-end (sandbox can't reach Paperclip,
requests
> timing out, malformed responses), it's hard to tell whether the bridge
> processed the request, what URL/method it used, and what the upstream
>   responded with
> - There was no built-in way to log bridge proxy traffic without
modifying
>   adapter code or attaching a debugger
> - This PR adds opt-in stdout logging of every bridge proxy request and
response
> (method, path, query, status), gated behind `PAPERCLIP_BRIDGE_DEBUG`
so it
>   stays off by default
> - The benefit is that operators can flip a single env var to get full
visibility
> into bridge traffic when debugging remote runs, without changing code

## What Changed

- `packages/adapter-utils/src/execution-target.ts`:
`startAdapterExecutionTargetPaperclipBridge`'s `handleRequest` now logs
each
  proxied request and response when `PAPERCLIP_BRIDGE_DEBUG` is truthy:
    - `[paperclip] Bridge proxy <METHOD> <path>?<query>` before fetch
- `[paperclip] Bridge proxy response <status> for <METHOD>
<path>?<query>` after
- Logging is no-op when the env var is unset/`"0"`/`"false"`.

## Verification

- Set `PAPERCLIP_BRIDGE_DEBUG=1` in the host process env, run an agent
against
a sandbox-backed environment, confirm the bridge log lines appear in
stdout.
- Unset the env var and confirm no extra log lines appear during normal
runs.

## Risks

- Off-by-default, no observable change for shipping users.
- When enabled, the logging is verbose — every API call from the sandbox
  produces 2 stdout lines. Operators should only enable it during active
  debugging.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable — covered by
exercising the
      flag in dev; the underlying handleRequest behavior is unchanged
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:34:48 -07:00
Devin Foley
0e51fa2b0d Honor reuse-existing preference and assignee default environment in issue runs (#5139)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside execution workspaces (a per-issue cwd + env), and
an issue
> can prefer to reuse an existing workspace or get a fresh one each time
> - The heartbeat service was reading the existing workspace's config to
derive
> environment selection regardless of whether the issue actually wanted
to reuse
> it. So fresh-run issues were inheriting stale config from a workspace
that was
>   about to be discarded
> - Separately, when an issue is assigned to an agent, the issue's
execution
> workspace settings weren't picking up the agent's
`defaultEnvironmentId`,
>   even though the agent's choice is the natural default for that issue
> - This PR makes both selection paths honor the obvious source of
truth:
> workspace config flows only when the issue actually wants
`reuse_existing`,
> and the assignee agent's default environment is applied at assignment
time if
>   nothing else is set on the issue
> - The benefit is that re-running a flaky issue picks up the right
environment
> instead of inheriting the previous run's config, and assigning an
agent to an
>   issue does the obvious thing without operator intervention

## What Changed

- `server/src/services/heartbeat.ts`: introduce
`reusableExecutionWorkspaceConfig`
  that is non-null only when `shouldReuseExisting` is true. Both
  `resolveExecutionWorkspaceEnvironmentId(...)` and
`applyPersistedExecutionWorkspaceConfig(...)` now read from it instead
of
unconditionally consulting `existingExecutionWorkspace?.config`.
Fresh-run
issues no longer inherit stale environment config from an in-flight
workspace
  about to be discarded.
- `server/src/services/issues.ts`: when an issue update sets a new
  `assigneeAgentId` and isolated workspaces are enabled, populate
  `executionWorkspaceSettings.environmentId` from the assignee agent's
  `defaultEnvironmentId` if the issue doesn't have an explicit
  `environmentId` set yet.
- Tests added in `heartbeat-plugin-environment.test.ts` (~216 lines) and
  `issues-service.test.ts` (~85 lines) covering both paths.

## Verification

- `pnpm --filter @paperclipai/server test --
heartbeat-plugin-environment issues-service`
- Manual QA: assign an issue to an agent that has a non-default
`defaultEnvironmentId`, confirm the issue's workspace settings now
include that
environment id without operator intervention. Trigger a rerun on an
issue
whose existing workspace points at a stale environment, confirm the
rerun uses
  the freshly-resolved environment.

## Risks

- Behavioural shift on assignment: previously assigning an agent didn't
propagate the agent's default environment to the issue. Now it does.
Callers
that explicitly want the issue to keep its existing/null environment
must set
`executionWorkspaceSettings.environmentId` themselves; the new logic
only
  fires when no explicit value is set.
- Behavioural shift on rerun: stale workspace config is no longer
applied to
  fresh runs. Operators who relied on this implicit inheritance may see
different environment selection on the first rerun after deploy.
Mitigation:
the explicit isssue settings and project policy are still honored as
before.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 18:33:55 -07:00
Devin Foley
09eceb952a Avoid resuming stale remote sessions (Pi adapter) (#5120)
> **Stacked PR (part 7 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
  - PR #5119
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The Pi adapter persists a session jsonl per agent so subsequent runs
resume
>   conversation context instead of starting cold
> - SSH testing reproduced a real failure: a verification issue reached
terminal
>   `done` and the agent claimed success, but the proof artifact
> `manual-qa/environment-matrix/ssh/pi_local.md` was missing from the
realized
>   SSH workspace on the QA target box
> - Root cause: the saved session header recorded a different cwd than
the new
> execution cwd, but the resume eligibility check only compared
session-params
> cwd via local-style `path.resolve` (which doesn't roundtrip on remote
POSIX
> paths). The stale session got resumed and writes landed in the wrong
cwd
> - This PR tightens resume eligibility for remote targets: it adds
remote-aware
> cwd normalisation, reads the first line of the session jsonl over SSH
(`head
>   -n 1`) to verify the saved header cwd, and only resumes when both
> session-params cwd *and* the on-disk header cwd match the realised
execution
>   cwd. Stale sessions are skipped silently and the run starts cold
> - The benefit is that Pi runs across cwd-changing environments stop
> accidentally resuming each other's sessions, and proof artifacts land
where
>   reviewers expect them

## What Changed

- Added `normalizeExecutionCwd`, `executionCwdsMatch`,
`readSessionHeaderCwd`,
  and `readSavedSessionCwd` helpers in `pi-local/src/server/execute.ts`
- `readSavedSessionCwd` reads the first line of the session jsonl —
locally via
`fs.readFile`, remotely via `runAdapterExecutionTargetShellCommand`
(`head -n 1`)
- Resume eligibility now requires:
  1. Saved session id is non-empty
  2. Execution target shape matches (existing check)
  3. Session-params cwd matches the realised execution cwd
4. Session-header cwd (from the on-disk jsonl) matches the realised
execution cwd
- Stale sessions are skipped silently (run starts cold) instead of
resumed
- `execute.remote.test.ts` extended with: matching header → resume;
mismatched
header → start fresh; missing/unreadable header → start fresh; remote
head
  command failure → start fresh

## Verification

- `pnpm --filter @paperclipai/adapter-pi-local test`
- `pnpm test -- pi-local`
- Manual QA: ran a Pi agent twice in two different remote cwds,
confirmed
the second run did not pick up the first run's session and that
subsequent
  runs in the original cwd still resumed correctly

## Risks

- Adds a `head -n 1` shell call per Pi run on remote targets. Negligible
  latency (single read of session jsonl), bounded by 15s timeout.
- If the `head` call fails for unrelated reasons (transient remote
unreachability), the run will start cold instead of resuming. This is
the
safe default but worth noting — operators may see one extra cold run if
a
  remote glitches mid-session.
- No data is deleted or migrated; stale sessions remain on disk for
manual
  inspection if desired.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:51:38 -07:00
Devin Foley
d22e790bd4 Validate remote model probes on execution target (OpenCode) (#5119)
> **Stacked PR (part 6 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
  - PR #5118
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The OpenCode adapter validates that its configured model exists
before letting
>   a run start so misconfiguration fails fast with a clear error
> - SSH testing reproduced an OpenCode failure where issues stayed
`backlog`,
>   timed out, and produced no comments. The root cause was in
> `packages/adapters/opencode-local/src/server/execute.ts`: the local
model
> guard `ensureOpenCodeModelConfiguredAndAvailable(...)` only ran when
execution
> was *not* remote, so SSH OpenCode bypassed it and failed silently
later
> - Subsequent testing surfaced a related remote-only failure where the
probe
> (when wired up naively) hits `EACCES: permission denied, mkdir
'/var/folders'`
> on the SSH box because of how OpenCode's runtime config picks a
tempdir
> - This PR runs the model probe on the actual execution target —
`opencode
> models` via `runAdapterExecutionTargetProcess` — instead of the local
CLI,
> parses the output with the shared `parseOpenCodeModelsOutput` helper,
and
> reports a concrete error naming the offending model and a sample of
available
>   remote models when the configured model isn't present
> - The benefit is that mismatched OpenCode models surface as a clear
pre-flight
> error referencing the remote target instead of a silent run that never
leaves
>   `backlog`

## What Changed

- Added `ensureRemoteOpenCodeModelConfiguredAndAvailable` in
  `opencode-local/src/server/execute.ts` that runs `opencode models` via
`runAdapterExecutionTargetProcess` and validates the configured model is
in
  the parsed output
- `models.ts` now exports `parseOpenCodeModelsOutput` and
`requireOpenCodeModelId`
  so the remote path can reuse them
- `execute.ts` calls the remote variant when `executionTargetIsRemote`,
otherwise
  the existing local `ensureOpenCodeModelConfiguredAndAvailable`
- Errors include the offending model id and a sample of available remote
models
  so the operator knows exactly what's missing
- `execute.remote.test.ts` extended with cases for: probe timeout, probe
  non-zero exit, empty model list, and missing-model error

## Verification

- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm test -- opencode-local`
- Manual QA: configured an OpenCode agent with a model that exists
locally but
not in the remote sandbox, and confirmed the new error fires before the
run
  starts and references the remote target

## Risks

- New behaviour: remote model validation adds a `~20s timeout` `opencode
models`
call on every remote run start. For most environments this is fast, but
a
network-slow sandbox could see startup latency rise. Timeout is bounded.
- If the remote CLI is missing or misconfigured, the new error replaces
the old
generic startup failure — clearer message, but the failure point shifts
earlier. Monitor for any QA flows that relied on the old failure shape.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:34:09 -07:00
Devin Foley
856c6cb192 Fix remote workspace environment shaping (#5118)
> **Stacked PR (part 5 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
  - PR #5117
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run with a Paperclip-shaped environment
(`PAPERCLIP_WORKSPACE_CWD`,
> worktree path, `PAPERCLIP_WORKSPACES_JSON` hints) so the CLI can
locate the
>   correct project tree
> - SSH testing reproduced a real failure: a Codex SSH run wrote to
> `/tmp/paperclip-env-matrix-...` (the *host* path) instead of the
realized
> remote workspace at `/home/<user>/paperclip-env-matrix-ssh-claude/...`
> because the adapter injected `PAPERCLIP_WORKSPACE_CWD=/tmp/...` into
the
>   remote env
> - Code review on the initial codex-only fix asked to roll the same
approach
> into every other SSH-capable adapter (claude, acpx, cursor, opencode,
gemini,
>   pi) via a shared helper rather than duplicating per-adapter
> - This PR adds `shapePaperclipWorkspaceEnvForExecution` in
adapter-utils that,
> when the execution target is remote: replaces local cwd with the
realized
> execution cwd, nulls out worktree path (which has no remote meaning),
and
> rewrites/strips `cwd` entries in workspace hints based on what was
actually
>   synced. Every adapter calls it before invoking the remote runner
> - The benefit is that remote runs see the realized remote workspace,
host-local
> paths stop leaking into remote env, and the rule is unit-tested in one
place

## What Changed

- Added `shapePaperclipWorkspaceEnvForExecution` to
  `packages/adapter-utils/src/server-utils.ts` with full unit coverage
  (`server-utils.test.ts`)
- Each of acpx-local, claude-local, codex-local, cursor-local,
gemini-local,
opencode-local, pi-local now calls the new shaper before issuing the
remote
  command and feeds the shaped values into `applyPaperclipWorkspaceEnv`
- Per-adapter `execute.remote.test.ts` files extended to cover the new
shaping
  behaviour: localhost paths replaced with remote cwd, foreign-cwd hints
  stripped, worktree path nulled out for remote targets
- `acpx-local/src/server/execute.test.ts` extended with shaping coverage

## Verification

- `pnpm test -- server-utils execute.remote`
- `pnpm --filter @paperclipai/adapter-acpx-local test`
- Manual QA reproducing the original failure:
  1. Provision an E2B sandbox environment for the Paperclip QA company
2. Assign an issue to a remote-targeted claude-local agent and confirm
the
run starts in the correct remote cwd (no `/Users/...` path leakage in
the
     run logs)
  3. Repeat for opencode-local and pi-local

## Risks

- Behavioural shift: hints whose `cwd` doesn't match the workspace cwd
are now
stripped on remote targets. If any adapter relied on a leaked local hint
cwd,
it will see a missing `cwd` instead. Reviewed all current callers — none
do.
- Adds a small per-run cost (path resolve + string normalisation) on
every remote
  execution. Negligible.
- Worktree path is now nulled out on remote (it has no meaning there).
Adapters
  that previously read the value defensively will continue to work.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:17:52 -07:00
Devin Foley
bb7d040894 Switch OpenCode to explicit static/local-aware model selection (#5117)
> **Stacked PR (part 4 of 7).** Depends on:
  - PR #5114
  - PR #5115
  - PR #5116
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - When creating an OpenCode-local agent, Paperclip currently validates
> `adapterConfig.model` against the *Paperclip host's* `opencode models`
output
> - SSH testing surfaced that this blocks creating an OpenCode agent for
an SSH
> environment: the model that exists on the SSH target isn't visible to
the
> host, so creation fails with "OpenCode requires `adapterConfig.model`
in
> provider/model format" even when the operator picked a real remote
model
> - The initial direction was environment-aware model discovery; the
final
> decision was to keep OpenCode on the same explicit-model pattern as
other
> adapters (default + curated list + manual override) and stop blocking
>   creation on host-side discovery
> - This PR does both: the adapter-models endpoint now accepts
`environmentId` and
> probes against the target environment, and the create-time hard gate
is
> replaced by `requireOpenCodeModelId` which validates `provider/model`
*format*
> without requiring host-local discovery. Test/run-time still surfaces
real
>   auth/availability problems
> - The benefit is that operators can create OpenCode agents for remote
> environments without out-of-band setup, and the model picker in the UI
>   reflects the actually-targeted environment

## What Changed

- Added `requireOpenCodeModelId(input)` in
`opencode-local/src/server/models.ts`,
  exported it from the adapter index
- `ensureOpenCodeModelConfiguredAndAvailable` now delegates the format
check to
  `requireOpenCodeModelId`
- `agentsApi.adapterModels(companyId, adapterType, { environmentId })`
now accepts
  an environment ID and passes it as a query parameter
- `queryKeys.agents.adapterModels` now keys on `(companyId, adapterType,
environmentId)`
- `server/src/routes/agents.ts` reads and validates the new query
parameter,
  forwarding it to the adapter's model probe
- `AgentConfigForm.tsx` and `OnboardingWizard.tsx` build the model query
key from
the currently selected default environment ID and disable autodetect for
  `opencode_local` (model selection is explicit)
- `NewAgent.tsx` simplified — no longer special-cases OpenCode
autodetect
- `company-portability.ts` no longer needs OpenCode-specific autodetect
handling
- Tests added/updated:
  `adapter-model-refresh-routes.test.ts`, `adapter-models.test.ts`,
`agent-permissions-routes.test.ts`,
`opencode-local/src/server/models.test.ts`

## Verification

- `pnpm --filter @paperclipai/server test -- adapter-models
adapter-model-refresh agent-permissions`
- `pnpm --filter @paperclipai/adapter-opencode-local test`
- `pnpm --filter @paperclipai/ui test -- AgentConfigForm
OnboardingWizard NewAgent`
- Manual QA in browser:
1. Boot Paperclip on Tailscale-bound port (so it's reachable from
another
machine), create an OpenCode-local agent, switch the default environment
between two installed sandboxes, and confirm the model list refreshes
     per-environment
  2. Submit with a malformed `provider/model` string and verify the new
     `requireOpenCodeModelId` error surfaces
- Before/after screenshots attached for `AgentConfigForm` model picker

## Risks

- Behavioural shift: switching default environment now triggers a model
refetch.
Should be cheap but introduces a new UI loading state for OpenCode
users.
- Removing dynamic autodetect for OpenCode: if any user configured an
agent
without specifying `model` and relied on autodetect populating it, that
agent
will now fail at submit time. Mitigation: validation error is explicit
and
  actionable.
- New query string parameter on `/api/companies/:id/adapter-models` —
older
clients that omit it still work (parameter is optional and defaults to
null).

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 13:01:34 -07:00
Devin Foley
076067865f Migrate SSH environment callback to bridge (#5116)
> **Stacked PR (part 3 of 7).** Depends on:
  - PR #5114
  - PR #5115
> Diff against `master` includes commits from earlier PRs in the stack —
the new commit in this PR is the topmost one.

## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents executing on a remote SSH-backed environment need a way to
call back into
>   the Paperclip control plane (run events, log streaming, signals)
> - When the SSH host can't reach the Paperclip host (NAT, firewalls, or
simply not
> on the same network), the run silently fails or hangs — a recurring
class of
>   failure during SSH testing
> - In sandboxed environments we already solved this with a callback
bridge that
> tunnels back through the existing connection; SSH was the odd one out
> - This PR migrates SSH execution to use the same callback bridge, so
every
> adapter's remote run uses one consistent reverse-channel. Per-adapter
SSH glue
> is deleted in favour of a shared `CommandManagedRuntimeRunner` built
from the
>   SSH spec
> - The benefit is fewer SSH-specific failure modes, a smaller code
surface, and
>   one place to evolve the callback contract going forward

## What Changed

- Added `createSshCommandManagedRuntimeRunner` in
`packages/adapter-utils/src/ssh.ts` that adapts an SSH spec into a
generic
  command-managed-runtime runner (with cwd, env, and timeout handling)
- Removed `paperclipApiUrl` from `SshRemoteExecutionSpec`; the bridge
URL now flows
  through the shared runner
- Reworked `execution-target.ts` to use the SSH runner alongside sandbox
runners
  via a unified `CommandManagedRuntimeRunner` interface
- Simplified `remote-managed-runtime.ts` and
`sandbox-managed-runtime.ts` to consume
  the shared runner abstraction
- Deleted per-adapter SSH callback wiring from claude-local,
codex-local,
  cursor-local, gemini-local, opencode-local, pi-local execute.ts files
- Removed `environment-runtime-driver-contract.test.ts` (the contract is
now
  enforced by `environment-execution-target.test.ts`)
- Added/updated `execute.remote.test.ts` cases for each adapter to cover
the SSH
  runner path

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm test -- execute.remote` (covers all six local adapters' SSH
paths)
- Manual QA: ran a claude-local agent against an SSH-backed environment,
confirmed
the agent successfully called back to `/api/agent-callback/*` endpoints
during
  the run

## Risks

- Refactor touches all six local adapters. If any adapter had subtle
SSH-specific
behaviour that wasn't captured in tests, it could regress. Mitigation:
each
  adapter's `execute.remote.test.ts` was extended.
- `paperclipApiUrl` removal from `SshRemoteExecutionSpec` is a breaking
type change
for any internal consumer. Verified no external plugins consume this
type.
- The new `CommandManagedRuntimeRunner` shape is a public surface in
`@paperclipai/adapter-utils`; downstream plugins implementing custom
runners may
  need updates, but no such plugins exist in this repo.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:43:52 -07:00
Devin Foley
a7b45938b7 Let sandbox providers declare shell defaults (#5114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents execute in sandboxed remote environments served by pluggable
sandbox
>   providers (E2B today, more later)
> - Today every sandbox command runs under `sh -lc` regardless of what
the
>   provider's container actually ships
> - That misses bash-only shell init on E2B (which ships bash) and
prevents
> future providers from declaring a different default — there's no way
for a
>   provider to say "I have bash, use it"
> - This PR adds a `shellCommand` field to sandbox execution targets so
providers
> can declare their preferred shell ("bash" for E2B), threads it through
the
> sandbox-managed-runtime client, callback bridge, and execution-target
shell
>   helper, and validates the value at the lease-metadata boundary
> - The benefit is that sandbox commands run under the right shell on
the right
> provider, and adding new sandbox providers only needs to declare a
shell
>   preference

## What Changed

- Added `packages/adapter-utils/src/sandbox-shell.ts` exporting
`preferredShellForSandbox(shellCommand)` (returns `"bash"` if input is
`"bash"`,
  else `"sh"`)
- Added `shellCommand?: "bash" | "sh" | null` to
`AdapterSandboxExecutionTarget`
  and `CommandManagedRuntimeSpec`; threaded it through
`runAdapterExecutionTargetShellCommand`,
`prepareAdapterExecutionTargetRuntime`,
  and `startAdapterExecutionTargetPaperclipBridge`
- `createCommandManagedRuntimeClient`, `prepareCommandManagedRuntime`,
and
`createCommandManagedSandboxCallbackBridgeQueueClient` now take an
optional
  `shellCommand` and use `preferredShellForSandbox` to pick the shell
- `startSandboxCallbackBridgeServer` accepts a `shellCommand` for its
server
  startup, readiness probe, and stop hook
- E2B sandbox plugin declares `shellCommand: "bash"` in `leaseMetadata`
- `resolveEnvironmentExecutionTarget` reads `shellCommand` from lease
metadata
  (validating against `"bash" | "sh" | null`)
- `environment-runtime.ts` adds `"shellCommand"` to
`INTERNAL_PLUGIN_SANDBOX_CONFIG_KEYS`
so the field round-trips through internal plugin config without leaking
to
  external plugin metadata
- Updated tests in `command-managed-runtime.test.ts`,
  `execution-target-sandbox.test.ts`, `sandbox-callback-bridge.test.ts`,
  `environment-execution-target.test.ts`

## Verification

- `pnpm --filter @paperclipai/adapter-utils test`
- `pnpm --filter @paperclipai/server test --
environment-execution-target`
- `pnpm --filter @paperclipai/sandbox-providers-e2b test`
- Manual QA: boot a Paperclip instance, create an E2B-backed
environment, run a
claude_local agent against it, and confirm the run completes (verifies
bash
  shell semantics flow through the callback bridge end-to-end)

## Risks

- E2B sandbox commands now run under `bash -lc` instead of `sh -lc`.
Bash is a
strict superset for the commands we issue (no busybox-only flags in our
shell
scripts), so risk is low. The shellCommand field is opt-in via lease
metadata —
  providers that don't declare it stay on `sh`.
- New optional field on `CommandManagedRuntimeSpec` and
`AdapterSandboxExecutionTarget`.
  Consumers ignoring the field retain previous behaviour (sh).
- Lease metadata now carries an additional field. Existing leases
without
`shellCommand` resolve to `null` and fall back to sh — backwards
compatible.

## Model Used

- OpenAI GPT-5.4 (reasoning effort: high) via Codex CLI
- Provider: OpenAI
- Used to author the code changes in this PR

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots — N/A (no UI changes)
- [ ] I have updated relevant documentation to reflect my changes — N/A
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-05-03 12:19:35 -07:00
Dotta
15eac43b43 [codex] Retry max-turn exhausted heartbeats (#5096)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies, and
heartbeat execution is the control-plane loop that keeps assigned work
moving.
> - Max-turn exhaustion is a recoverable local-adapter stop condition
for Claude and Gemini agents when a run needs another heartbeat to
continue safely.
> - The previous behavior could leave max-turn continuation details hard
to inspect, and duplicate/stale continuation wakes could keep running
after issue state changed.
> - The adapter layer also needed to avoid trusting arbitrary
stdout/stderr text as scheduler control metadata.
> - This pull request adds bounded max-turn continuation scheduling,
visible retry state, structured stop metadata handling, and
stale/duplicate continuation guards.
> - The benefit is safer automatic continuation after max-turn stops,
clearer operator visibility, and fewer duplicate or stale agent runs.

## What Changed

- Replaces closed PR #4952, whose head repository was deleted.
- Rebases the recovered max-turn continuation branch onto current
`paperclipai/paperclip:master`.
- Adds max-turn continuation scheduling and retry-state plumbing for
heartbeat runs.
- Adds stale/duplicate continuation suppression when issue status,
ownership, or execution locks change.
- Normalizes Claude/Gemini max-turn detection around structured stop
metadata instead of unstructured stdout/stderr text.
- Surfaces max-turn continuation settings and retry visibility in the
board UI.
- Adds focused server, adapter, and UI tests for max-turn stop metadata,
retry scheduling, stale queued-run invalidation, adapter
parsing/execution, run ledger display, and agent config patching.

## Verification

- `pnpm install --no-frozen-lockfile` to refresh local dependencies
after rebasing onto current `master`.
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/claude-local-adapter.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/gemini-local-adapter.test.ts
server/src/__tests__/gemini-local-execute.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-stale-queue-invalidation.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts ui/src/lib/runRetryState.test.ts
--testTimeout=20000`
- `pnpm --filter @paperclipai/adapter-claude-local typecheck && pnpm
--filter @paperclipai/adapter-gemini-local typecheck && pnpm --filter
@paperclipai/server typecheck && pnpm --filter @paperclipai/ui
typecheck`
- UI screenshot note: the UI changes are limited to config/ledger state
rendering rather than layout changes; component/unit coverage above
verifies the rendered behavior.

## Risks

- Medium behavior risk: heartbeat retry gating now suppresses max-turn
continuations when issue state or execution locks drift, so any callers
that relied on stale continuations running will now see cancellation
instead.
- Low adapter risk: Claude/Gemini unstructured text no longer triggers
max-turn scheduler metadata, so only structured stop signals and Gemini
exit code 53 are trusted.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent, GPT-5-class model, tool-enabled local
repository editing and command execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: state/default rendering only; covered by
component/unit tests)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no user-facing command or docs contract changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 11:30:48 -05:00
Dotta
57229d0f24 [codex] Add issue monitor liveness controls (#4988)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies where work
must stay observable, governable, and recoverable.
> - The task/heartbeat subsystem owns agent execution continuity, issue
state transitions, and visible recovery behavior.
> - Waiting on an external service is not the same as being blocked when
the assignee still owns a future check.
> - The gap was that agents had no first-class one-shot monitor state
for external-service waits, so recovery could look stalled or require ad
hoc comments.
> - This pull request adds bounded issue monitors that can wake the
owner, clear exhausted waits, and produce explicit recovery behavior.
> - It also surfaces monitor status in the board UI and documents when
to use monitors versus `blocked`.
> - The benefit is clearer liveness semantics for asynchronous waits
without weakening single-assignee task ownership.

## What Changed

- Added issue monitor fields, shared types, validators, constants, and
an idempotent `0075` migration for scheduled monitor state.
- Added server-side monitor scheduling, dispatch, recovery bounds,
activity logging, and external-ref redaction.
- Added board/agent route coverage for monitor permissions and child
monitor scheduling.
- Added issue detail/property UI for monitor state, a monitor activity
card, and Storybook stories for review surfaces.
- Documented monitor semantics and recovery policy behavior in
`doc/execution-semantics.md`.
- Addressed Greptile review feedback by preserving monitor state in
skipped-stage builders and making board monitor saves send `scheduledBy:
"board"`.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issue-execution-policy-routes.test.ts
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/lib/activity-format.test.ts`
- First run passed 5 files and failed to collect 2 server suites because
the worktree was missing the optional `acpx/runtime` dependency.
- After `pnpm install --frozen-lockfile`, reran the 2 failed suites
successfully.
- `pnpm exec vitest run
server/src/__tests__/issue-monitor-scheduler.test.ts
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server typecheck
&& pnpm --filter @paperclipai/ui typecheck`
- `pnpm exec vitest run
server/src/__tests__/issue-execution-policy.test.ts
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/server typecheck && pnpm --filter
@paperclipai/ui typecheck`
- `pnpm exec vitest run
ui/src/components/IssueMonitorActivityCard.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- Storybook screenshot captured from
`http://127.0.0.1:6006/iframe.html?viewMode=story&id=product-issue-monitor-surfaces--monitor-surfaces`
with Playwright.

## Screenshots

![Issue monitor Storybook
surfaces](https://raw.githubusercontent.com/paperclipai/paperclip/PAP-2945-when-a-task-is-waiting-for-an-_external-service_-what-state-should-it-be-in-and-what-recovery-method-could-it-h/docs/pr-screenshots/pap-2945/monitor-surfaces.png)

## Risks

- Medium: this changes heartbeat recovery behavior for scheduled
external-service waits, so regressions could affect wake timing or
recovery issue creation.
- Migration risk is reduced by using `IF NOT EXISTS` for the new issue
monitor columns and index.
- External monitor references are treated as secret-adjacent and are
intentionally omitted from visible activity/wake payloads.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent with repository tool use and terminal
execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or Storybook review surfaces
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-03 08:58:53 -05:00
Dotta
76f09c8eb6 [PAP-3180] Move workspace switcher into sidebar (#4981)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The board UI needs a clear persistent way to move between company
workspaces.
> - The previous layout kept company switching in a separate left rail,
which made the sidebar feel split between workspace selection and
navigation.
> - The workspace switcher belongs in the sidebar header so navigation
and workspace context stay together.
> - This pull request removes the separate company rail from the layout
and turns the sidebar company menu into the primary workspace switcher.
> - The benefit is a cleaner sidebar structure that keeps workspace
identity, switching, company actions, and navigation in one place.

## What Changed

- Removed the standalone `CompanyRail` from the main layout.
- Added the company/workspace switcher to the default, company settings,
and instance settings sidebars.
- Expanded `SidebarCompanyMenu` to list active workspaces, indicate the
current workspace, navigate out of instance settings when switching, and
expose add-company onboarding.
- Updated focused component tests for the new workspace-switcher
behavior.

## Verification

- `pnpm --filter @paperclipai/ui exec vitest run
src/components/SidebarCompanyMenu.test.tsx
src/components/CompanySettingsSidebar.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `git diff --check`
- Visual smoke attempted against the managed dev server at
`http://127.0.0.1:57385`; a fresh browser context reached the
authenticated sign-in screen, so I could not capture an authenticated
sidebar screenshot from this heartbeat.

## Risks

- Low-to-medium UI risk: this changes the primary sidebar structure and
workspace-switching entry point.
- The instance-settings switch behavior now routes back to the selected
company dashboard when a workspace is selected.
- No migrations, API contracts, or lockfile changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled, medium reasoning mode.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-02 08:13:53 -05:00
Dotta
685ee84e4a [codex] Document terminal bench dispatch config (#4961)
## Thinking Path

> - Paperclip agents rely on skills for repeatable operating procedures
> - The Terminal-Bench loop skill needs to preserve enough dispatch
configuration to reproduce real heartbeat behavior
> - A bare benchmark command can create unassigned work with no
heartbeat-enabled agent, which is a harness setup failure rather than
product evidence
> - The Paperclip heartbeat skill also needs to keep escalation biased
toward agent-owned follow-through
> - This pull request documents dispatch runner config requirements and
strengthens the agent follow-through rule
> - The benefit is fewer misleading benchmark loops and clearer agent
operating guidance

## What Changed

- Documented `PAPERCLIP_HARBOR_RUNNER_CONFIG` / runner dispatch config
as required Terminal-Bench loop input.
- Updated the Terminal-Bench loop smoke check to require the dispatch
config mention.
- Added stronger Paperclip skill guidance to avoid asking humans for
work an agent can perform.

## Verification

- `pnpm smoke:terminal-bench-loop-skill`

## Risks

- Low risk: documentation and smoke expectation changes only. The
stricter smoke assertion is intentional so future edits do not drop the
dispatch config requirement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 12:00:47 -05:00
Dotta
d7719423e9 [codex] Harden non-system database backup schemas (#4960)
## Thinking Path

> - Paperclip is a control plane whose database is the durable audit and
work record
> - Database backup needs to include operator/plugin schemas while
excluding PostgreSQL-owned internals
> - PostgreSQL reserves the `pg_` schema prefix for system schemas,
including temp and toast variants
> - A single escaped `pg_` prefix predicate is less brittle than
enumerating individual `pg_toast` and `pg_temp` forms
> - This pull request tightens non-system schema discovery for logical
backups without changing the normal user/plugin schema path

## What Changed

- Replaced narrow `pg_toast` and `pg_temp` schema exclusions with an
escaped `pg_` reserved-prefix exclusion.
- Kept `information_schema` excluded from logical backup metadata
discovery.
- Addressed Greptile feedback by removing redundant no-op additions from
the prior iteration.

## Verification

- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- PR checks on the latest pushed head: policy, verify, e2e, Greptile
Review, and Snyk

## Risks

- Low risk: PostgreSQL reserves `pg_` schema names for system use, so
this should only exclude database-owned internals that should not be
restored from Paperclip logical backups.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected - check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:59:53 -05:00
Dotta
fe401b7fa9 [codex] Polish inbox nested issue UI (#4959)
## Thinking Path

> - Paperclip orchestrates AI agents through issue lists and
issue-thread interactions
> - The inbox must preserve nested issue visibility and keyboard
navigation as work decomposes into deeper sub-issues
> - Some UI polish issues made nested rows harder to scan and pending
question cancellation less covered
> - The issue list also had a small test indentation regression around
load-more behavior
> - This pull request tightens nested inbox rendering and related
issue-thread/list polish
> - The benefit is a more reliable operator inbox for multi-level work
trees

## What Changed

- Included nested grandchild issues in inbox keyboard navigation and
recursive row rendering.
- Sort parent rows by descendant activity so active subtrees remain
visible.
- Removed extra inbox card background styling in favor of the page
surface.
- Added regression coverage for pending question cancellation.
- Cleaned up the issue-list load-more test indentation.

## Verification

- `pnpm exec vitest run ui/src/lib/inbox.test.ts
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: nested inbox rendering and keyboard navigation are
user-visible. The changes are localized to inbox grouping/rendering
helpers and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:53 -05:00
Dotta
2d72292ad6 [codex] Add workspace routine run tab (#4958)
## Thinking Path

> - Paperclip orchestrates AI agents through reusable execution
workspaces and routines
> - Operators need a fast way to run workspace-aware routines against a
specific execution workspace
> - The existing workspace detail surface showed configuration, runtime
logs, and linked issues, but not routines that depend on workspace
variables
> - Routine runs also needed to prefill the selected execution workspace
so branch variables resolve correctly
> - This pull request adds a workspace routines tab and prefilled
routine-run dialog support
> - The benefit is a tighter workflow for rerunning reviews, smoke
checks, and other workspace-specific routines

## What Changed

- Added an execution workspace `Routines` tab and company-prefixed
routes.
- Listed routines that declare or reference workspace-specific
variables.
- Added `Run now` support that preselects the current execution
workspace in `RoutineRunVariablesDialog`.
- Centralized reusable execution workspace ordering/deduplication for
issue creation and workspace cards.
- Added focused UI helper and dialog regression tests.

## Verification

- `pnpm exec vitest run ui/src/lib/reusable-execution-workspaces.test.ts
ui/src/lib/workspace-routines.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx
ui/src/lib/company-routes.test.ts`
- Screenshots were not captured in this PR split; the visible flow is
covered by focused component/helper tests and should get browser QA in
the follow-up issue.

## Risks

- Medium risk: this adds a new workspace detail tab and routine-run
path. It is isolated to workspace-scoped routines and uses existing
routine run APIs.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:58:15 -05:00
Dotta
570a4206da [codex] Recover productive terminal continuations (#4956)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-scoped heartbeat runs
> - Recovery logic decides whether in-progress work still has a live
path after a terminal run
> - A productive terminal continuation can still leave an issue stranded
when no active run or wake remains
> - Treating that state as healthy leaves work stuck despite evidence
that more action is needed
> - This pull request re-enqueues recovery for productive terminal
continuations that left no live path
> - The benefit is fewer silently stranded in-progress issues after
agents make partial progress

## What Changed

- Reclassified successful-but-productive terminal continuations as
recoverable when no live path remains.
- Enqueue a follow-up recovery wake with the original run id and
continuation metadata.
- Added regression tests covering productive terminal continuation
recovery and advanced liveness handoff.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/run-continuations.test.ts`

## Risks

- Medium risk: recovery may schedule one more follow-up where Paperclip
previously considered the work observed. The existing uniqueness,
budget, and escalation checks still constrain retry loops.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 11:57:23 -05:00
Dotta
3cd26a78fc [codex] Surface live run comment context (#4957)
## Thinking Path

> - Paperclip orchestrates AI agents through issue comments and
heartbeat runs
> - The board UI needs to distinguish a comment that triggered a live
run from comments queued after that run started
> - The run payload already stores comment context, but active-run API
responses did not expose the ids the UI needs
> - Without those ids, the triggering comment can flash as queued while
the agent is already responding to it
> - This pull request exposes live-run comment context and teaches the
optimistic comment helper to ignore the trigger comment
> - The benefit is clearer issue-chat state during comment-triggered
agent interruptions

## What Changed

- Added `contextCommentId` and `contextWakeCommentId` to active/live run
payloads.
- Threaded those ids through server routes, heartbeat summaries, UI API
types, and issue detail rendering.
- Updated optimistic comment classification to avoid marking the
triggering comment as queued.
- Added server and UI regression coverage.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low-to-medium risk: adds optional fields to existing run payloads.
Existing consumers should ignore unknown fields, and UI handling is
null-safe.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:44:11 -05:00
Dotta
e8275318ba [codex] Raise agent heartbeat concurrency default (#4954)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agent heartbeat settings control how much parallel work one employee
can run
> - The previous default of 5 concurrent runs was too restrictive for
active local agent teams
> - The shared default, heartbeat clamp, docs, and route/import/UI
expectations need to agree
> - This pull request raises the default heartbeat concurrency to 20
while keeping explicit headroom up to 50 for power users
> - The benefit is higher throughput for agent teams without each new
agent needing manual runtime config edits

## What Changed

- Raised `AGENT_DEFAULT_MAX_CONCURRENT_RUNS` from 5 to 20.
- Raised the heartbeat service max clamp from 10 to 50, keeping the new
default below the ceiling.
- Updated V1 implementation docs and tests that assert default
imported/exported runtime config.
- Updated the new-agent UI runtime config test to assert the shared
default constant instead of duplicating the numeric value.

## Verification

- `pnpm exec vitest run
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`

## Risks

- Medium risk: new agents can consume more local execution capacity by
default. The heartbeat scheduler still respects configured max
concurrency and budget/pause controls, and operators can lower or raise
the per-agent cap within the `1..50` clamp.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use and local command
execution. Exact context window was not exposed in the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:42:56 -05:00
Dotta
e273d621fc [PAP-3154] Stop padding /live-runs by default (#4963)
## Summary
- Fix [PAP-3154](/PAP/issues/PAP-3154): the Sidebar's "Dashboard NN
live" badge showed a constant 50 in every company because `GET
/api/companies/:companyId/live-runs` was padding its response with up to
50 recent (non-live) heartbeat runs whenever the caller did not pass
`minCount`.
- Regression introduced by
[#4875](https://github.com/paperclipai/paperclip/pull/4875) (commit
`6445bef9`), which capped both `minCount` and `limit` at 50 with a
fallback of 50 for omitted values. The cap is correct for `limit` (real
unboundedness guard); for `minCount` it conflates "no padding" with "pad
to the cap".
- Default `minCount` to 0 so callers asking for "live runs" only get
actually-live runs unless they explicitly request padding
(`ActiveAgentsPanel` is the only caller that does). Keep `limit` capped
at 50 by default.

## Test plan
- [x] `pnpm exec vitest run
server/src/__tests__/agent-live-run-routes.test.ts` — 7/7 pass,
including new tests for the no-pad default and explicit padding.
- [x] `pnpm exec vitest run ui/src/components/Sidebar.test.tsx
ui/src/components/ActiveAgentsPanel.test.tsx
ui/src/api/heartbeats.test.ts` — 6/6 pass.
- [ ] Verify in dev: with ~8 truly-live runs in a company, the sidebar
Dashboard badge shows the real count (not 50).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 10:33:13 -05:00
Dotta
42a299fb9d [codex] Bound productivity review recovery loops (#4948)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat/productivity review subsystem detects when assigned
work is likely stuck or churning.
> - Productivity reviews are useful, but repeated reconciliation can
create noisy refresh comments or repeated review issues around the same
source issue.
> - That makes manager follow-up harder because the signal can get
buried under duplicate review activity.
> - This pull request bounds productivity review refreshes and creation
loops while preserving the existing escalation path.
> - The benefit is a quieter recovery loop that still surfaces stuck or
high-churn work for manager attention.

## What Changed

- Added refresh throttling for open productivity review issues,
including a one-hour default interval and a maximum of three refresh
comments per open review.
- Added a rolling 24-hour creation cap so completed/closed reviews
cannot immediately recreate review issues indefinitely for the same
source issue.
- Excluded cancelled productivity reviews from the creation cap so
manager cancellations do not silently suppress future legitimate
reviews.
- Preserved productivity review timestamps in deterministic test paths
and added targeted coverage for immediate refresh suppression, refresh
caps, creation caps, and cancelled-review exclusion.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- `pnpm exec vitest run
server/src/__tests__/productivity-review-service.test.ts`
- Greptile Review: 5/5 on commit
`bcf25832d0ffae25890b2ee7eed112d1c2d114fe` with review threads resolved.
- GitHub PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`Greptile Review`, and `security/snyk (cryppadotta)`.
- Verified the branch is rebased onto `public-gh/master` with no
conflicts.
- Verified the diff does not include `pnpm-lock.yaml`, database schema
changes, or migrations.

## Risks

- Low-to-medium risk: this changes automation cadence for productivity
reviews. A truly stuck issue may receive fewer repeated refresh
comments, but the original review issue remains open and assigned for
manager action.
- No migration risk: this is server logic and tests only.

> Checked [`ROADMAP.md`](ROADMAP.md) for overlapping planned core work;
this is a targeted recovery-loop fix and does not add a new roadmap
feature.

## Model Used

- OpenAI Codex coding agent, GPT-5 model family, tool-using software
engineering mode. Exact context window is not exposed in this runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable; server-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable; no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-05-01 08:32:04 -05:00
Devin Foley
d2dd759caa plugins: make e2b template default explicit (#4901)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Remote execution environments are part of that control plane,
including sandbox-provider plugins like E2B
> - The E2B provider already normalizes config and runtime behavior
around a `base` template default
> - But the manifest still presented `template` as required, which
forces redundant operator input and makes the UI contract stricter than
runtime behavior
> - That mismatch showed up while building a repeatable QA workflow for
sandbox testing
> - This pull request makes the manifest and validation contract line up
with the existing `base` default
> - The benefit is a simpler and more accurate E2B environment setup
experience

## What Changed

- Removed the E2B manifest's `required: ["template"]` requirement so the
config schema matches runtime behavior
- Clarified the manifest description to say the template defaults to
`base` when omitted
- Added a focused unit test proving that validation normalizes a missing
template to `base`

## Verification

- Ran the focused E2B plugin test for the new behavior:
- `cd packages/plugins/sandbox-providers/e2b && pnpm test --
--testNamePattern "defaults a missing template to base"`

## Risks

- Low risk. This only loosens the schema to match the plugin's existing
runtime normalization and adds a test for that path.
- The broader E2B plugin suite currently has unrelated existing failures
outside this change; this PR does not modify those paths.

## Model Used

- OpenAI Codex, GPT-5 Codex via Codex CLI agent tooling, large-context
coding workflow with terminal tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 22:43:24 -07:00
Devin Foley
b02e67cea5 fix(ci): diff PR workflow paths from merge base (#4903)
## Thinking Path

> - Paperclip’s PR workflow is part of the control-plane safety surface
because it decides whether a branch is allowed to merge.
> - This issue started in that workflow: the lockfile and manifest
policy checks were diffing `base.sha..head.sha`, which incorrectly
treated unrelated `master` commits as if they belonged to the PR branch.
> - The right fix there is to diff from the PR merge base
(`base...head`) so policy checks only evaluate files introduced by the
branch itself.
> - Once that workflow fix was in place, `/checkpr` exposed a second
blocker on the PR merge ref: `verify` was failing in newer `master`-side
tests that were not part of the original branch diff.
> - The actionable repeated failure came from the ACPX local adapter
test suite, where a test hard-coded the managed Codex home under
`instances/default` even though the stable Vitest runner sets a
non-default `PAPERCLIP_INSTANCE_ID`.
> - This pull request now includes both the original CI diff-scope fix
and the targeted ACPX test fix so the PR’s actual checks align with
current base-branch execution.
> - The benefit is that the original false-positive lockfile failure is
removed, and the merge-ref verify path is hardened against the
instance-id isolation used in CI.

## What Changed

- Updated `.github/workflows/pr.yml` so the lockfile policy and manifest
policy steps diff `pull_request.base.sha...pull_request.head.sha` from
the merge base instead of using a two-dot base/head diff.
- Added an inline workflow comment explaining why the three-dot diff is
required for PR-scoped file detection.
- Updated `packages/adapters/acpx-local/src/server/execute.test.ts` so
the managed Codex home assertion uses a test-specific
`PAPERCLIP_INSTANCE_ID` instead of hard-coding `default`.
- Restored `PAPERCLIP_INSTANCE_ID` after that ACPX test finishes so the
test remains isolated and does not leak process env changes.

## Verification

- Reproduced the original false positive locally by comparing PR heads
`#4901` and `#4902` with the old `base..head` logic; both incorrectly
included `pnpm-lock.yaml` from unrelated `master` commits.
- Verified the new `base...head` logic reduces those PRs to only their
actual changed files and excludes `pnpm-lock.yaml`.
- Verified a real manifest-changing PR (`#4893`) still reports
`package.json` changes under the new logic.
- Ran `pnpm -r typecheck` successfully.
- Ran `pnpm vitest run
packages/adapters/acpx-local/src/server/execute.test.ts` successfully
after the ACPX test fix.
- Ran `pnpm vitest run packages/db/src/backup-lib.test.ts` successfully
against the merge-ref-related DB failure path observed during
`/checkpr`.
- Pushed commit `9520a976` and allowed PR `#4903` checks to rerun on the
updated branch.

## Risks

- Low risk: the workflow change only affects how PR policy checks
determine the changed file set.
- Low risk: the ACPX change is test-only and aligns the test with the
instance-isolation behavior already used by
`scripts/run-vitest-stable.mjs` in CI.
- The remaining operational risk is limited to other unrelated
merge-ref-only failures that were not reproduced in the targeted local
verification above.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5-codex`, via the Codex local adapter in Paperclip.
- Tool-using coding model with shell execution, git, GitHub CLI, and
repository inspection in a local worktree.
- Context included the current repo, the Paperclip task thread, PR check
output, and the isolated execution workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-30 21:22:40 -07:00
github-actions[bot]
6a7cca95ef chore(lockfile): refresh pnpm-lock.yaml (#4899)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-30 20:00:07 -05:00
Dotta
4272c1604d Add ACPX local adapter runtime (#4893)
## Thinking Path

> - Paperclip orchestrates AI-agent companies through a control plane
that can start, supervise, and recover agent runs.
> - Local adapters are the bridge between Paperclip issues and concrete
agent runtimes such as Claude, Codex, and other ACP-compatible tools.
> - The roadmap calls out broader “bring your own agent” and claw-style
agent support, and ACPX gives Paperclip one path to normalize multiple
ACP agents behind a single adapter.
> - The branch needed to become one reviewable PR against current
`paperclipai/paperclip:master`, without carrying stale base conflicts or
generated lockfile churn.
> - This pull request adds an experimental built-in `acpx_local`
adapter, integrates it through the server/CLI/UI adapter surfaces, and
adds regression coverage for runtime execution, skill sync, stream
parsing, diagnostics, and log redaction.
> - The benefit is that Paperclip can run Claude/Codex/custom ACP agents
through ACPX while keeping operator configuration, skills, logging, and
transcript rendering inside the existing adapter model.

## What Changed

- Added `@paperclipai/adapter-acpx-local` with server execution, config
schema, ACPX session handling, CLI formatting, UI config helpers, and
stdout parsing.
- Registered `acpx_local` across CLI, server, shared constants, UI
adapter metadata, adapter capabilities, and agent creation/editing
surfaces.
- Added ACPX runtime execution support with persistent sessions,
local-agent JWT environment handling, skill snapshots, runtime skill
materialization, and isolation/security regressions.
- Added ACPX adapter diagnostics and marked the adapter experimental in
the UI.
- Added command/env secret redaction for resolved command metadata in
adapter-utils, server event storage, and the Agent Detail invocation UI.
- Added Storybook coverage for ACPX config, transcript rendering, and
skill states, plus PR screenshots under `docs/pr-screenshots/pap-2944/`.
- Rebased the branch onto current `public-gh/master`; `pnpm-lock.yaml`
is intentionally not included and there are no migration/schema changes.

## Verification

- `pnpm exec vitest run
packages/adapters/acpx-local/src/server/execute.test.ts
packages/adapters/acpx-local/src/server/test.test.ts
packages/adapters/acpx-local/src/cli/format-event.test.ts
packages/adapters/acpx-local/src/ui/parse-stdout.test.ts
packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/redaction.test.ts
server/src/__tests__/acpx-local-execute.test.ts
server/src/__tests__/acpx-local-skill-sync.test.ts
server/src/__tests__/acpx-local-adapter-environment.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/agent-skills-routes.test.ts
ui/src/adapters/metadata.test.ts` — 12 files, 87 tests passed.
- `pnpm --filter @paperclipai/adapter-acpx-local typecheck` — passed.
- `pnpm --filter @paperclipai/server typecheck` — passed.
- `pnpm --filter @paperclipai/ui typecheck` — passed.
- Confirmed PR diff does not include `pnpm-lock.yaml`, database schema
files, or migrations.

Screenshots:

![ACPX Claude skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-light.png?raw=true)
![ACPX Claude skills
dark](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-claude-dark.png?raw=true)
![ACPX custom skills
light](https://github.com/cryppadotta/paperclip-1/blob/PAP-2944-acpx-make-a-claude_local-adapter-that-uses-acpx-instead/docs/pr-screenshots/pap-2944/skills-custom-light.png?raw=true)

## Risks

- Medium risk: this introduces a new built-in adapter package and
touches runtime execution, adapter registration, agent config, skills,
and transcript rendering.
- ACPX and ACP agent behavior can vary by installed tool versions; the
adapter is marked experimental to set operator expectations.
- `pnpm-lock.yaml` is excluded per repository PR policy, so dependency
lock refresh must be handled by the repo’s automation or maintainers.
- No database migration risk: no schema or migration files changed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with repository tool use,
shell execution, git operations, and local verification. Exact hosted
context window was not exposed in this environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 19:57:05 -05:00
Dotta
ad5432fece [codex] Harden issue recovery reliability (#4875)
## Thinking Path

> - Paperclip is the control plane for autonomous agent companies, so
non-terminal issue state must always have a clear live, waiting, or
recovery owner.
> - This change stays inside the server reliability and liveness
subsystem for assigned issue recovery, blocker attention, and live-run
polling.
> - Closed PR #4860 mixed this reliability work with separate
mutation-boundary policy changes, which made review and merge risk too
broad.
> - [PAP-2981](/PAP/issues/PAP-2981) asked for a replacement PR
containing only the remaining reliability slice and explicitly excluding
user-assignment and execution-policy restrictions.
> - Follow-up review also split `advanced` run-liveness continuation
behavior out of this PR so it can be reviewed separately.
> - The implementation hardens repeated recovery escalation, expands
blocker-attention coverage for explicit waiting and recovery paths, and
caps company live-run polling defaults.
> - The benefit is a smaller reliability PR that improves liveness
behavior without changing agent/user mutation authorization boundaries
or `advanced` continuation semantics.

## What Changed

- Avoid repeated liveness escalation updates when the source issue is
already blocked by the same open escalation.
- Treat open liveness escalation recovery issues, their source issues,
and their leaf blockers as covered waiting paths in blocker attention.
- Cap default company live-run polling at 50 rows for both `minCount`
and `limit`, including explicit zero values, to avoid unbounded
responses.
- Preserve the existing behavior where succeeded `advanced` runs are
considered productive/healthy for stranded-work recovery and are not
actionable bounded run-liveness continuations.
- Added focused server coverage for recovery dedupe, blocker attention,
liveness escalation, run continuations, and live-run polling.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/issue-blocker-attention.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/agent-live-run-routes.test.ts`
- Result: 5 files passed, 63 tests passed.
- `pnpm --filter @paperclipai/server typecheck`
- Result: passed.
- No UI changes; screenshots are not applicable.

## Risks

- Recovery and blocker-attention classification changes can affect which
blocked chains are shown as covered versus needing attention.
- Live-run polling now treats omitted, invalid, or non-positive `limit`
/ `minCount` values as the capped default of 50.
- `advanced` run-liveness continuation behavior is intentionally
excluded from this PR and split for separate review.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 16:44:28 -05:00
Dotta
a3de1d764d Add cheap model profiles for local adapters (#4881)
## Thinking Path

> - Paperclip is a control plane for autonomous AI companies, where
adapters are the boundary between the board, agents, and execution
runtimes.
> - Local adapters currently expose a primary runtime configuration, but
operators often need a cheaper model lane for routine or low-risk work.
> - That cheap lane has to stay adapter-owned: runtime profile settings
should not mutate the primary adapter config or bypass existing
auth/secret mediation.
> - Issue creation also needs an ergonomic way to request primary,
cheap, or custom model behavior for a selected assignee.
> - This pull request adds a first-class `cheap` model profile contract
across adapter capabilities, heartbeat config resolution, agent
configuration, and issue creation.
> - The benefit is cheaper task execution can be configured and
requested explicitly while preserving adapter boundaries, secret
handling, and audit visibility.

## What Changed

- Added adapter model-profile capability metadata and a `cheap` profile
contract for supported local adapters.
- Applied `runtimeConfig.modelProfiles.cheap.adapterConfig` during
heartbeat config resolution, including requested/applied/fallback run
metadata.
- Added agent configuration UI for cheap model profile settings without
writing those settings into primary `adapterConfig`.
- Added New Issue assignee model lane controls for Primary / Cheap /
Custom and request payload handling.
- Added run ledger profile badges and Storybook stories for the new
cheap-lane UI states.
- Added tests for validators, heartbeat model profile application,
permission/secret mediation, UI payload helpers, and run ledger
rendering.
- Added committed UI verification screenshots under
`docs/pr-screenshots/pap-2837/`.
- Addressed Greptile review feedback around cheap-profile defaults,
shared profile types, and fallback test data.

## Verification

Local:

- `pnpm exec vitest run packages/shared/src/validators/issue.test.ts
server/src/__tests__/adapter-registry.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/heartbeat-model-profile.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/lib/agent-config-patch.test.ts
ui/src/lib/issue-assignee-overrides.test.ts
ui/src/lib/new-agent-runtime-config.test.ts` — passed, 8 files / 103
tests.
- `pnpm exec vitest run ui/src/lib/new-agent-runtime-config.test.ts
ui/src/components/IssueRunLedger.test.tsx` — passed after
Greptile/rebase follow-up, 2 files / 17 tests.
- `pnpm --filter @paperclipai/ui typecheck` — passed after
Greptile/rebase follow-up.
- `pnpm -r typecheck` — passed.
- `pnpm build` — passed.
- `pnpm test:run` — did not complete successfully in this local
worktree: it stopped in pre-existing `@paperclipai/adapter-utils`
sandbox/SSH fixture suites outside this PR diff. Failures were 5s local
timeouts plus `git init -b` unsupported by this machine's Git 2.21.0.
The branch-specific targeted suites above passed.
- Branch was fetched/rebased onto `public-gh/master`; `git rev-list
--left-right --count public-gh/master...HEAD` reports `0 9`.

Remote PR checks on latest head
`e30bf399146451c86cee98ed528d51d33fa5af5a`:

- `policy` — passed.
- `verify` — passed.
- `e2e` — passed.
- `Greptile Review` — passed, confidence score 5/5; Greptile review
threads resolved.
- `security/snyk (cryppadotta)` — passed.

Screenshots:

- [New issue cheap lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-cheap-desktop.png)
- [New issue custom lane
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-custom-desktop.png)
- [New issue unsupported adapter
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/newissue-unsupported-desktop.png)
- [Run ledger model profile badges
desktop](https://github.com/paperclipai/paperclip/blob/PAP-2837-plan-cheap-model-for-adapters-that-can-support-it/docs/pr-screenshots/pap-2837/runledger-profile-badges-desktop.png)
- Mobile variants are also in `docs/pr-screenshots/pap-2837/`.

## Risks

- Medium: heartbeat config mediation now merges runtime model profiles
into adapter configs, so adapter secret normalization and host-command
restrictions must keep covering nested config paths.
- Medium: the UI adds another issue creation choice; unsupported
adapters must keep hiding the cheap lane and preserve primary behavior.
- Low migration risk: no database migration is included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex coding agent using GPT-5-class reasoning with repo tool use
and command execution. Exact served model/context window was not exposed
by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-30 15:32:04 -05:00
Dotta
1fe1067361 Polish board settings and skills workflow (#4863)
## Thinking Path

> - Paperclip's board UI and bundled skills are the operator layer for
configuring agents, routines, issue workflows, and local troubleshooting
loops.
> - The prior rollup mixed this operator polish with database backups,
backend reliability, thread scale, and cost/workflow primitives.
> - This pull request isolates the remaining board QoL, settings,
issue-detail integration, adapter config cleanup, and skills smoke
tooling.
> - It includes some integration-level overlap with the thread and
workflow slices so this branch can run from `origin/master` while still
preserving the full original work.
> - Preferred merge order is the narrower primitives first, then this
integration PR last.
> - The benefit is that reviewers can inspect the user-facing
board/settings/skills layer separately from backend infrastructure
changes.

## What Changed

- Added board/settings polish for agents, routines, company settings,
project workspace detail, and issue detail controls.
- Added agent/routine UI regression tests and New Issue dialog coverage.
- Integrated issue-detail activity/cost/interaction surfaces and leaf
work pause/resume controls.
- Cleaned bundled adapter UI config defaults and onboarding copy.
- Added terminal-bench loop and work-stoppage diagnosis skills plus a
smoke test script.
- Updated attachment type handling and Paperclip skill/API guidance.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/pages/Agents.test.tsx
ui/src/pages/Routines.test.tsx ui/src/components/NewIssueDialog.test.tsx
ui/src/pages/IssueDetail.test.tsx
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts`
- Result: 7 test files passed, 54 tests passed.
- `pnpm run smoke:terminal-bench-loop-skill`
- Result: JSON output included `"ok": true` and `"cleanup": true`.
- UI screenshots not included because verification is focused
component/page coverage for the changed board surfaces.

## Risks

- This is the integration-heavy PR in the split and intentionally
overlaps some component/API primitives with the issue-thread and
workflow PRs so it can run from `origin/master`.
- Preferred merge order: #4859, #4860, #4861, #4862, then this PR last.
If earlier branches merge first, this PR may need a straightforward
conflict refresh in shared UI files.
- The terminal-bench smoke script creates temporary mock issues and
relies on cleanup; the verified run returned `cleanup: true`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 15:28:11 -05:00
Dotta
c4269bab59 Add workflow interaction cancellation and issue cost summaries (#4862)
## Thinking Path

> - Paperclip coordinates work through issue-thread interactions, run
history, and cost telemetry.
> - Operators need workflow prompts to be cancellable and costs to be
visible at the issue level.
> - The earlier rollup mixed this workflow/cost work with database
backups, reliability recovery, thread scaling, and settings polish.
> - This pull request isolates the interaction and cost surfaces into a
reviewable slice.
> - The backend now supports cancelling pending question interactions
and summarizing issue-tree costs.
> - The UI component layer can render cancelled questions and interleave
activity with run ledger rows.

## What Changed

- Added `cancelled` as an issue-thread interaction status and result
shape for question interactions.
- Added the board-only `POST
/issues/:id/interactions/:interactionId/cancel` route and service
implementation.
- Added issue-tree cost summary support in the cost service and
`/issues/:id/cost-summary` API route.
- Extended shared cost exports and UI API/query keys for issue cost
summaries.
- Updated `IssueThreadInteractionCard` and `IssueRunLedger` components
for cancelled questions, issue cost surfaces, and activity/run
interleaving.
- Added focused server and component regression coverage.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/costs-service.test.ts
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- Result: 4 test files passed, 45 tests passed.
- UI screenshots not included because this PR updates reusable
components and API surfaces without wiring a new page-level layout.

## Risks

- Adds a new interaction terminal status; clients that switch
exhaustively on interaction status may need to handle `cancelled`.
- Issue-tree cost summaries use recursive issue traversal and should be
watched on unusually large issue trees.
- Page-level issue detail wiring is intentionally left to the board
QoL/issue-detail branch to keep this PR narrow.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:57:25 -05:00
Dotta
87f19cd9a6 Improve issue thread scale and markdown polish (#4861)
## Thinking Path

> - Paperclip's board UI is the operator surface for supervising
AI-agent companies.
> - Issue threads are where operators read progress, respond to agents,
inspect markdown, and jump through long histories.
> - Large threads and rich markdown had become difficult to navigate and
expensive to render.
> - The previous rollup mixed these UI scale fixes with unrelated
backend recovery, costs, backups, and settings changes.
> - This pull request isolates the issue-thread scale and markdown
polish work.
> - The benefit is a reviewable UI slice that can merge independently of
the backend reliability, database backup, workflow, and board QoL PRs.

## What Changed

- Virtualized long issue chat threads and stabilized
anchor/jump-to-latest behavior for large histories.
- Added incremental issue-list row loading and tests for
scroll-triggered pagination behavior.
- Hardened markdown body rendering and markdown editor behavior around
HTML tags, image drops, code-copy UI, and escaped newline handling.
- Added a long-thread measurement harness at
`scripts/measure-issue-chat-long-thread.mjs` plus
`perf:issue-chat-long-thread`.
- Added focused UI/lib regression coverage for thread rendering,
markdown, optimistic comments, and message building.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssuesList.test.tsx
ui/src/components/MarkdownBody.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/lib/issue-chat-messages.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`
- Result: 6 test files passed, 170 tests passed.
- UI screenshots not included because this PR is covered by targeted
component tests and does not introduce a new page layout.

## Risks

- Virtualization changes can affect scroll anchoring in edge cases on
very long threads.
- Markdown/editor hardening changes are intentionally defensive, but
malformed content may render differently than before.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.5, code execution and GitHub CLI tool use, medium
reasoning effort.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 13:18:01 -05:00
Dotta
cd606563f6 Expand database backups to non-system schemas (#4859)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - Reliable backups are part of operating that control plane safely.
> - The previous backup path was public-schema oriented and did not
clearly cover plugin-owned schemas or migration history.
> - Paperclip now has plugin database namespaces and Drizzle migration
state that must survive backup/restore.
> - This pull request expands logical database backups to non-system
schemas and documents the backup boundary.
> - The benefit is safer restore behavior for core and plugin-owned
database state without implying full filesystem disaster recovery.

## What Changed

- Include non-system database schemas in JavaScript and pg_dump backup
paths.
- Preserve enum, table, sequence, index, constraint, migration, and
plugin-schema objects across backup/restore.
- Add restore coverage for plugin-owned schemas and Drizzle migration
history.
- Clarify docs that DB backups are logical database backups, not full
instance filesystem backups.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts`
- Result: 1 test file passed, 4 tests passed.
- Confirmed this PR does not include `pnpm-lock.yaml` or
`.github/workflows/*` changes.

## Risks

- Medium: backup generation touches schema discovery and restore
ordering, so unusual database objects may need additional coverage
later.
- No migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool use enabled, medium reasoning
effort. Exact hosted context-window details are not exposed in this
runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: no UI changes are included in this PR, so screenshots are not
applicable.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-30 12:54:35 -05:00
Devin Foley
c0ce35d1fb Improve E2B plugin configuration UX and fix execution timeouts (#4802)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - E2B is a sandbox provider plugin that runs agent code in isolated
cloud environments
> - Operators configure E2B through the plugin settings page
> - But the E2B API key configuration was unclear — the settings field
description didn't explain that pasted keys are auto-saved as company
secrets, and the fallback to the host `E2B_API_KEY` variable wasn't
documented
> - Additionally, long-running E2B sandbox commands were timing out
because the plugin environment RPC driver used a fixed timeout, and
environment commands competed for the single foreground command slot
> - This PR clarifies the E2B configuration UX, fixes RPC timeouts for
plugin environment execution, and runs E2B environment commands in
background mode to avoid blocking the foreground slot
> - The benefit is clearer E2B setup for operators and more reliable
sandbox command execution

## What Changed

- Updated E2B plugin manifest and settings UI to clarify API key
configuration — field description now explains that pasted keys are
saved as company secrets and documents the `E2B_API_KEY` host fallback
- Added test coverage for the plugin settings page rendering
- Fixed `plugin-environment-driver.ts` to pass the configured timeout
through to RPC calls instead of using a hardcoded default
- Updated `environment-runtime.ts` to propagate timeout from the
environment lease to the plugin driver
- Changed E2B sandbox command execution to use background handles so
long-running agent commands don't block the foreground slot needed by
the callback bridge

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to plugin settings, verify E2B API key field shows
the updated description text
- Manual: run an E2B-backed agent task with a long-running command,
verify it completes without RPC timeout

## Risks

- Low risk. Configuration UX change is cosmetic. The timeout fix passes
an existing value through instead of dropping it. Background command
execution is a behavioral change but only affects E2B sandbox commands —
the foreground slot is still available for bridge health checks.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 17:12:30 -07:00
Devin Foley
a4ac6ff133 Add sandbox callback bridge for remote environment API access (#4801)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, which are
isolated from the host network
> - Sandboxed agents need to call back to the Paperclip API to report
progress, post comments, and update issue status
> - But sandbox environments cannot reach the Paperclip server directly
because they run in isolated network namespaces
> - This PR adds a callback bridge that proxies API requests from the
sandbox to the Paperclip server, running as a local HTTP server on the
host that forwards authenticated requests
> - The bridge is started automatically when an adapter launches a
sandbox execution, and torn down when the run completes
> - The benefit is sandboxed agents can interact with the Paperclip API
without requiring network-level access to the host, enabling E2B and
similar providers to work end-to-end

## What Changed

- Added `sandbox-callback-bridge.ts` in `packages/adapter-utils/` — a
lightweight HTTP bridge server that accepts requests from sandbox
environments and proxies them to the Paperclip API with authentication
- Added request validation and security policy: the bridge only forwards
requests to the configured API URL, validates content types, enforces
size limits, and rejects non-API paths
- Wired the bridge into all remote adapter execute paths (claude, codex,
cursor, gemini, pi) — the bridge starts before the agent process and the
bridge URL is passed via environment variables
- Updated `environment-execution-target.ts` to prefer the explicit API
URL from environment lease metadata for sandbox callback routing
- Fixed Claude sandbox runtime setup to work with the bridge
configuration
- Added comprehensive test coverage for bridge request handling, policy
enforcement, and sandbox execution integration
- Fixed browser bundling — the bridge module is excluded from the
frontend bundle via the adapter-utils index export

## Verification

- `pnpm test` — all existing and new tests pass, including bridge unit
tests and sandbox execution integration tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run an agent task, verify the
agent can post comments and update issue status through the bridge

## Risks

- Medium. This is a new network-facing component (HTTP server on
localhost). The security policy restricts forwarding to the configured
API URL only and validates all requests, but any proxy introduces attack
surface. The bridge binds to localhost only and is scoped to the
lifetime of a single agent run.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:34 -07:00
Devin Foley
4cf612a92d Fix runtime state race, workspace sync, plugin startup, and orphaned leases (#4804)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments that are leased, and the server
manages runtime state, workspace configuration, and plugin lifecycle
> - Several edge cases caused failures during concurrent operations: a
race condition in runtime state insertion could produce duplicate-key
errors, reused workspaces didn't sync their configuration when the
parent issue was updated, sandbox provider plugins could be queried
before registration completed, and orphaned environment leases from
failed runs were never released
> - This PR fixes these four runtime/environment issues
> - The benefit is more reliable concurrent agent execution and proper
resource cleanup

## What Changed

- `services/heartbeat.ts`: Fixed a race condition where concurrent
runtime state inserts could fail with a duplicate-key error by using an
upsert pattern
- `services/issues.ts`: Sync reused workspace configuration when an
issue is updated, so the workspace reflects the latest issue state
- `services/environment-runtime.ts`: Fixed a startup race where sandbox
provider plugins could be queried before registration completed, by
awaiting plugin readiness before resolving environment drivers
- `services/heartbeat.ts`: Release environment leases for orphaned runs
that lost their process without cleanup

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
runtime state upsert and process recovery lease cleanup
- `pnpm typecheck` — clean
- Manual: trigger concurrent agent runs to verify no duplicate-key
failures; verify orphaned leases are released after process loss

## Risks

- Low risk. The runtime state upsert changes insert-to-upsert behavior,
which could mask a legitimate duplicate if two different runs produce
the same key — but this is prevented by the run ID being part of the
key. The plugin startup await is bounded by the existing registration
timeout.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:37:10 -07:00
Devin Foley
f9cf1d2f6a Add cursor sandbox support and fix SSH workspace sync (#4803)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run inside sandboxed environments like E2B, or on remote
hosts via SSH
> - The cursor adapter needs to resolve `cursor-agent` inside sandbox
environments where it's installed in `~/.local/bin`
> - But when using the default `agent` command on a sandbox target, the
adapter didn't know to look in `~/.local/bin/cursor-agent`, causing
"command not found" failures
> - Additionally, repeated SSH runs failed because `git checkout` during
workspace sync conflicted with leftover `.paperclip-runtime` files from
previous runs
> - This PR adds sandbox-aware command resolution for cursor and fixes
the SSH workspace sync conflict
> - The benefit is cursor works in E2B sandboxes out of the box, and
repeated SSH runs don't fail on workspace sync

## What Changed

- `cursor-local`: Added `prepareCursorSandboxCommand` — on sandbox
targets, reads the remote `$HOME`, prepends `~/.local/bin` to PATH, and
prefers `~/.local/bin/cursor-agent` when the default command is
requested; tightened the sandbox command probe to validate the binary
exists before launching; preserves explicit custom command overrides
- `adapter-utils/ssh.ts`: Added `--force` to git checkout in SSH
workspace sync to handle `.paperclip-runtime` untracked file conflicts
from previous runs

## Verification

- `pnpm test` — all existing and new tests pass, including cursor
sandbox probe, sandbox execution, and custom command override tests
- `pnpm typecheck` — clean
- Manual: configure an E2B environment, run a cursor-local task, verify
it resolves cursor-agent from the sandbox install path

## Risks

- Low-medium. The `--force` flag on git checkout could discard
uncommitted changes in the remote workspace, but the workspace is
managed by Paperclip and should not contain user edits.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 16:12:06 -07:00
Devin Foley
a0f5cbffd7 Harden release flow with registry verification and dist-tag checks (#4800)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Paperclip is distributed as npm packages, including plugins like
`plugin-e2b`
> - The release process publishes canary and stable builds via npm
dist-tags
> - But there was no automated verification that published packages
actually landed with the correct dist-tags, and broken canary publishes
could silently ship to users
> - This PR adds a registry verification script that checks published
packages match their expected dist-tags, and wires it into PR CI so
regressions are caught before merge
> - The benefit is release integrity is verified automatically, and
broken dist-tag states are caught early

## What Changed

- Added `scripts/verify-release-registry-state.mjs` — verifies that
published npm packages have correct dist-tag assignments and detects
orphaned or mispointed tags
- Added `scripts/verify-release-registry-state.test.mjs` — test coverage
for the verification logic
- Updated `scripts/release.sh` to include canary dist-tag safety checks
before publishing
- Updated `.github/workflows/pr.yml` to run registry verification as a
CI step
- Updated `doc/PUBLISHING.md` and `doc/RELEASING.md` with the new
verification workflow

## Verification

- `pnpm test` — all tests pass including new verification script tests
- `node scripts/verify-release-registry-state.mjs` — runs against the
live npm registry and reports current state
- CI: the new PR workflow step runs on every PR push

## Risks

- Low risk. This is additive CI and tooling — no runtime code changes.
The registry verification is read-only (queries npm, does not publish).
The release script changes add safety checks that abort before
publishing if state is unexpected.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:20 -07:00
Devin Foley
367d4cab72 Fix SSH callback URL selection for LAN and private networks (#4799)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents can run on remote hosts via SSH environments
> - When a remote agent needs to call back to the Paperclip API, it
needs a reachable URL
> - But the runtime API URL candidate builder did not account for
private network topologies where the server is only reachable via LAN or
VPN addresses
> - Agents on SSH hosts were failing to connect because the callback URL
pointed to localhost or an unreachable address
> - This PR fixes callback URL selection to honor `PAPERCLIP_API_URL`,
prefer LAN-reachable candidates, filter unreachable link-local
addresses, and include interface hosts in onboarding invite URLs
> - The benefit is SSH-based agents can reliably reach the Paperclip API
on private networks without manual URL configuration

## What Changed

- `runtime-api.ts`: Added `PAPERCLIP_API_URL` as a first-priority
candidate in `buildRuntimeApiCandidateUrls`; extracted
`collectReachableInterfaceHosts` to enumerate non-loopback,
non-link-local network interface IPs with IPv4 preference
- `server/src/index.ts`: Export `PAPERCLIP_API_URL` from the server
environment so it is available to callback candidate resolution
- `server/src/routes/access.ts`: Include LAN interface hosts in
onboarding invite connection candidates
- `server/src/config.ts`: Attempted auto-allowing LAN interface hosts,
then reverted to the per-instance allowlist approach (both commits
included for history clarity)

## Verification

- `pnpm test` — all existing and new tests pass, including new tests for
LAN candidate ordering and link-local filtering
- `pnpm typecheck` — clean
- Manual: start a Paperclip server on a machine with a LAN IP, create an
SSH environment pointing to another host on the same LAN, verify the
agent's callback URL uses the LAN IP rather than localhost

## Risks

- Low-medium. The candidate list now includes more addresses (all
non-loopback LAN interfaces). These are candidates for the agent to try,
not an allowlist — the server's allowed hostnames still gate which
origins are accepted. Ordering change (LAN preferred over loopback)
could affect existing setups where localhost was intentionally
preferred.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:17 -07:00
Devin Foley
9b99d30330 Add dedicated environment settings page and test-in-environment (#4798)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents run inside environments (local, SSH, E2B sandbox)
> - Operators need to configure and manage these environments
> - But environment settings were buried inside the general company
settings page, making them hard to find
> - Additionally, when testing an agent from the configuration form, the
test always ran locally regardless of which environment was selected
> - This PR moves environments into a dedicated top-level company
settings section and wires the "Test Environment" button to run inside
the selected environment
> - The benefit is operators can find and manage environments more
easily, and the test button now validates the actual environment the
agent will use

## What Changed

- Added a dedicated `CompanyEnvironments` settings page with its own
route and sidebar entry
- Updated `CompanySettingsSidebar` and `CompanySettingsNav` to include
the new environments section
- Modified the agent test route (`POST /agents/:id/test`) to accept an
optional `environmentId` parameter
- Updated all adapter `test.ts` handlers to resolve and use the
specified execution target environment
- Added `resolveTestExecutionTarget` to `execution-target.ts` for remote
environment test resolution with cwd fallback
- Moved the "Test Environment" button and its feedback display into the
`NewAgent` page footer for better UX flow

## Verification

- `pnpm test` — all existing and new tests pass
- `pnpm typecheck` — clean
- Manual: navigate to Company Settings, confirm "Environments" appears
as a top-level section
- Manual: configure an agent with a non-local environment, click "Test
Environment", confirm the test runs inside that environment

## Risks

- Low risk. UI-only routing change for the settings page. The
test-in-environment change adds an optional parameter with a local
fallback, so existing behavior is preserved when no environment is
specified.

## Model Used

Codex GPT 5.4 high via Paperclip.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-29 15:56:13 -07:00
Dotta
3494e84a29 Add v2026.428.0 release changelog (#4665)
## Summary

- Adds `releases/v2026.428.0.md` covering the diff between `v2026.427.0`
and `origin/master` (seven merged PRs).
- Generated via `.agents/skills/release-changelog/SKILL.md`.
- Flags the additive migrations `0071_default_hire_approval_off` and
`0072_large_sandman` plus the new-companies hire-approval default flip
in the upgrade guide.

## Notes

- Highlight: pause/resume actions in the sidebar agents panel
([#4616](https://github.com/paperclipai/paperclip/pull/4616)).
- Improvements: assigned-todo recovery dispatch, recovery issue
hardening, hire-approval opt-in default, inline selector keyboard
handling.
- Fixes: manual routine inbox visibility, stale company skill refresh
rejection, stale stored company-selection cleanup.

## Test plan

- [ ] Reviewer confirms section coverage and PR-to-bullet attribution.
- [ ] Confirm the file lands at \`releases/v2026.428.0.md\`.
- [ ] Confirm no canary suffix in title or filename.

Source issue: [PAP-2599](/PAP/issues/PAP-2599)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:40:20 -05:00
Dotta
6b7f6ce4b8 [codex] Split PR #4692 UI/QoL updates (#4701)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane.
> - The affected surface is the board UI for issue threads, issue lists,
routines, dialogs, navigation, and issue review indicators.
> - Closed PR #4692 bundled backend, schema, docs, workflow, and UI/QoL
work into one oversized change set.
> - Greptile could not keep reviewing that broad PR because it exceeded
the 100-file review limit and mixed unrelated concerns.
> - This pull request extracts the UI/QoL slice into a fresh branch
under the review limit while leaving workflow and lockfile churn out.
> - The benefit is a focused review path for the board UI performance
and workflow improvements without reopening the oversized PR.

## What Changed

- Added long issue-thread virtualization, scroll-container binding,
anchor preservation, latest-comment jump targeting, and related
regression/perf fixtures.
- Improved issue list scalability with scroll-based loading, server
offset parameters, and pagination-focused UI tests.
- Reduced new issue dialog typing churn and split dialog action
subscriptions so broad layout/nav surfaces avoid unnecessary renders.
- Added routine variables help and routine description mention options
for users, agents, and projects.
- Added productivity review badge/link UI and fixed the badge to use
Paperclip's company-prefixed router link.
- Kept the split PR below Greptile's review limit and excluded
`.github/workflows/pr.yml` and `pnpm-lock.yaml`.

## Verification

- `pnpm install --no-frozen-lockfile` in the clean worktree to install
`@tanstack/react-virtual` locally without committing lockfile churn.
- `pnpm --filter @paperclipai/ui exec vitest run --config
vitest.config.ts src/components/IssueChatThread.test.tsx
src/components/IssuesList.test.tsx
src/components/NewIssueDialog.test.tsx src/pages/Routines.test.tsx
src/pages/Issues.test.tsx` passed: 5 files, 83 tests.
- `pnpm --filter @paperclipai/ui typecheck` passed.
- `git diff --check origin/master..HEAD` passed.
- Split-scope checks: 53 changed files; no `.github/workflows/pr.yml`;
no `pnpm-lock.yaml`.
- Screenshots were not captured in this heartbeat; the changes are
primarily virtualization, routing, pagination, and editor behavior
covered by focused regression tests.

## Risks

- Moderate UI risk because issue-thread virtualization changes scroll
behavior on long conversations; regression tests cover anchor jumps,
latest-comment targeting, row metadata, and short-thread fallback.
- Moderate integration risk because the issue-list offset parameter and
productivity review field depend on matching API behavior.
- Dependency risk: the UI package adds `@tanstack/react-virtual` while
repository policy keeps `pnpm-lock.yaml` out of PRs, so CI must resolve
dependency changes through the repo's normal lockfile policy.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled local repository and
GitHub workflow. Exact runtime context window was not exposed by the
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 17:18:58 -05:00
Dotta
1991ec9d6f [codex] Split backend control-plane QoL slice (#4700)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
backend task ownership, recovery, review visibility, and company-scoped
limits need to stay enforceable without UI-only coupling.
> - Closed PR #4692 bundled those backend changes with UI workflow,
docs, skills, workflow, and lockfile churn.
> - PAP-2694 asks for a clean backend/control-plane slice from that
closed branch.
> - This branch starts from current `master` and mines only the `cli`,
`packages/db`, `packages/shared`, and `server` contracts/tests needed
for the backend behavior.
> - It explicitly excludes UI workflow/performance work,
`.github/workflows/pr.yml`, `pnpm-lock.yaml`, docs, skills,
package-script, adapter UI build-config, and perf fixture script
changes; the only UI files are fixture/test updates required by the
tightened shared `Company` contract.
> - The benefit is a smaller reviewable PR that preserves the
control-plane fixes while staying under Greptile s 100-file review
limit.

## What Changed

- Added company-scoped attachment-size limits through DB
schema/migrations, shared company portability contracts, CLI
import/export coverage, and server attachment upload enforcement.
- Added productivity review service/API behavior for no-comment streak,
long-active, and high-churn review issues, including request-depth
clamping and issue summary exposure.
- Hardened issue ownership and recovery/control-plane paths: peer-agent
mutation denial, issue tree pause/resume behavior, stranded recovery
origins, and related activity/test coverage.
- Preserved related backend contract updates for routine timestamp
variables and managed agent instruction bundles because they live in
shared/server contracts from the source branch.
- Addressed Greptile feedback by making `Company.attachmentMaxBytes`
non-optional, simplifying review request-depth clamping, fixing the
migration final newline, and enforcing the process-level attachment cap
as the final ceiling for uploads.
- Added minimal company fixtures needed for repo-wide typecheck/build
and kept the PR to 66 changed files with forbidden/non-slice paths
excluded.

## Verification

- `pnpm install --frozen-lockfile`
- `git diff --check origin/master..HEAD`
- `git diff --name-only origin/master..HEAD | wc -l` -> 66 files
- `git diff --name-only origin/master..HEAD -- .github/workflows/pr.yml
pnpm-lock.yaml package.json doc skills .agents scripts
packages/adapters` -> no output
- `pnpm exec vitest run --config vitest.config.ts
packages/shared/src/validators/issue.test.ts
packages/shared/src/routine-variables.test.ts
packages/shared/src/adapter-types.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
cli/src/__tests__/company.test.ts
server/src/__tests__/productivity-review-service.test.ts
server/src/__tests__/issue-tree-control-service.test.ts
server/src/__tests__/issue-tree-control-routes.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/issue-attachment-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/issues-service.test.ts` -> 12 files, 147 tests
passed
- `pnpm exec vitest run --config vitest.config.ts
cli/src/__tests__/company-delete.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts
server/src/__tests__/productivity-review-service.test.ts` -> 3 files, 18
tests passed
- `pnpm exec vitest run --config vitest.config.ts
server/src/__tests__/issue-attachment-routes.test.ts` -> 1 file, 6 tests
passed
- `pnpm --filter @paperclipai/db typecheck && pnpm --filter
@paperclipai/shared typecheck && pnpm --filter @paperclipai/server
typecheck && pnpm --filter paperclipai typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck && pnpm --filter
@paperclipai/ui build`

## Risks

- Includes migrations `0073_shiny_salo.sql` and
`0074_striped_genesis.sql`; merge ordering matters if another PR adds
migrations first.
- This is intentionally backend-only apart from fixture/test updates
forced by shared type correctness; UI affordances from PR #4692 are not
present here and should land in separate UI slices.
- The worktree install emitted plugin SDK bin-link warnings for unbuilt
plugin packages, but the targeted tests and package typechecks completed
successfully.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected; check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:46:45 -05:00
Dotta
d9f540c331 [codex] Refresh docs and agent skills (#4693)
## Thinking Path

> - Paperclip orchestrates AI agents through a company-scoped control
plane
> - Contributors and agents need docs and skills that match the current
V1 behavior
> - The source branch included documentation updates alongside
implementation work
> - Keeping docs and skill guidance separate makes the implementation PR
easier to review
> - This pull request refreshes the V1 docs and agent-operating guidance
without changing runtime behavior
> - The benefit is current contributor guidance that can merge
independently from code changes

## What Changed

- Refreshed V1 product, goal, implementation, database, and development
documentation.
- Updated the Paperclip heartbeat skill guidance and create-agent skill
references.
- Added the Paperclip plan-to-task conversion skill.
- Updated release changelog skill guidance.

## Verification

- `git diff --check public-gh/master..HEAD` passed in the PR worktree
after the Greptile fix.
- Greptile Review passed on head `673317ed` with zero unresolved review
threads.
- GitHub PR checks passed on head `673317ed`: `policy`, `verify`, `e2e`,
and `security/snyk (cryppadotta)`.

## Risks

- Low runtime risk because this branch only changes docs and skill
guidance.
- Documentation may need follow-up wording adjustments if reviewers want
a different framing for V1 behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow. Exact runtime context window was not exposed by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-28 16:12:03 -05:00
Dotta
d0bdbe11a9 Stabilize inline selector keyboard handling (#4617)
## Thinking Path

> - Paperclip's board UI relies on compact selectors for frequent issue
and agent edits.
> - Inline selectors often live inside larger keyboard-aware surfaces
such as composers and popovers.
> - Arrow, enter, tab, and escape keys handled by the selector should
not leak to parent document shortcuts.
> - Stale company selection should also stay hidden until the company
list confirms it is valid.
> - This pull request tightens inline selector keyboard handling and
adds regression coverage for stale company bootstrap behavior.
> - The benefit is fewer accidental parent interactions and safer
company-scoped UI initialization.

## What Changed

- Added a stable empty `recentOptionIds` default so selector filtering
does not get a new array every render.
- Mirrored highlighted option state into a ref so Enter/Tab commits the
current highlighted option reliably after keyboard navigation.
- Stopped propagation for selector-owned navigation/commit/escape keys.
- Added jsdom regressions for inline selector keyboard handling and
CompanyProvider stale selection behavior.

## Verification

- `pnpm exec vitest run ui/src/components/InlineEntitySelector.test.tsx
ui/src/context/CompanyContext.test.tsx`
- Targeted selector and CompanyProvider tests pass cleanly without React
`act(...)` warnings.
- Screenshots not attached: this is keyboard/state behavior covered by
component tests.

## Risks

- Low risk: changes are scoped to inline selector key handling and
tests. The main behavior shift is intentionally preventing handled
selector keys from reaching parent listeners.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:04:35 -05:00
Dotta
43b0f2ae58 Add pause and resume actions to sidebar agents (#4616)
## Thinking Path

> - Paperclip operators need fast control over the agents running their
company.
> - The sidebar is the persistent place operators scan agent state while
navigating the board UI.
> - Agent pause and resume already exist as control-plane actions, but
sidebar users had to navigate away to use them.
> - This pull request adds a compact per-agent action menu in the
sidebar.
> - It keeps edit, pause, and resume close to the visible agent list
while preserving existing navigation behavior.
> - The benefit is faster operator intervention when an agent needs to
be paused or restarted.

## What Changed

- Refactored sidebar agent rows into a small item component with a
hover/focus action menu.
- Added edit, pause, and resume actions using existing agent API calls
and cache invalidation keys.
- Added success/error toasts for pause and resume mutations.
- Tracked pause/resume pending state per agent so one active mutation
does not disable every sidebar row.
- Disabled direct sidebar resume for budget-paused agents and labeled
that state clearly.
- Added jsdom coverage for active-agent pause, paused-agent resume,
per-agent pending state, and budget-paused resume protection.
- Added visual review artifacts for the default row and opened action
menu:
  - [Sidebar row](docs/pr-screenshots/pr-4616/sidebar-agent-row.png)
- [Sidebar action
menu](docs/pr-screenshots/pr-4616/sidebar-agent-actions.png)

## Verification

- `pnpm exec vitest run ui/src/components/SidebarAgents.test.tsx`
- `pnpm --filter @paperclipai/server prepare:ui-dist`
- Browser screenshot pass against a temporary local trusted instance at
`http://127.0.0.1:3102` using Playwright.

## Risks

- Low risk: UI-only addition using existing agent pause/resume
endpoints. The main risk is layout crowding for very narrow sidebars,
mitigated by the icon-only trigger and existing truncation.
- Budget-paused agents now require a non-sidebar path to resume, which
is intentional to avoid accidentally restarting agents stopped by budget
enforcement.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell/browser access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:54 -05:00
Dotta
f88f538e6d Keep manual routine runs visible in the runner inbox (#4615)
## Thinking Path

> - Paperclip coordinates recurring agent work through scheduled and
manual routines.
> - Manual routine runs are board-initiated work and should stay visible
to the human who kicked them off.
> - Routine execution issues are agent-assigned, so they can be filtered
away from a board user's inbox unless the user is recorded as touching
the work.
> - Coalesced or skipped active routine runs have the same visibility
problem because they reuse an existing live issue.
> - This pull request carries the manual runner actor into routine
dispatch and touches the linked issue for that user's inbox.
> - The benefit is that manually triggered routine work stays
discoverable by the operator who started it.

## What Changed

- Passed the board or agent actor from the routine run route into the
routine service.
- Recorded manual board runners as `createdByUserId` on fresh routine
execution issues.
- Touched coalesced or skipped active routine issues for the manual
runner by updating read state and clearing that user's inbox archive.
- Added route and service regressions for manual routine run actor
propagation and inbox visibility.

## Verification

- `pnpm exec vitest run server/src/__tests__/routines-routes.test.ts
server/src/__tests__/routines-service.test.ts`

## Risks

- Low risk: the change is scoped to manual routine runs and only updates
issue attribution/read-state metadata for the initiating actor.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 20:03:24 -05:00
Dotta
68c37660f0 Dispatch assigned todo work during recovery sweeps (#4614)
## Thinking Path

> - Paperclip orchestrates AI agents for autonomous companies.
> - Agent assignments must reliably turn into heartbeat work without
board operators manually nudging stuck tasks.
> - The stranded-assignment recovery sweep already handles failed or
lost runs.
> - But assigned `todo` issues with no prior run could sit idle because
there was nothing to retry or recover.
> - This pull request dispatches those never-started assigned todos as
normal assignment wakes.
> - The benefit is that recovery fixes missed initial dispatches without
creating unnecessary recovery issues.

## What Changed

- Added an initial assigned-todo dispatch path to the recovery service
when an assigned `todo` issue has no heartbeat run yet.
- Reused invocation budget hard-stop checks before dispatching or
requeueing recovery work.
- Counted `assignmentDispatched` in startup/scheduled recovery logs.
- Added heartbeat recovery regressions for first dispatch, duplicate
queued wake prevention, budget-blocked skips, and paused-agent skips.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- Low to medium risk: this changes liveness recovery behavior for
assigned `todo` issues, but it stays on the existing assignment wake
path and skips paused or budget-blocked agents.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local
repository and shell access, Paperclip heartbeat context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 20:02:44 -05:00
Dotta
7a9b3a6037 [codex] Harden recovery issue handling (#4600)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The control plane must recover stranded agent work without creating
new operational loops
> - Stranded recovery issues can themselves fail, and exposing raw retry
errors in comments can leak sensitive adapter details
> - New local companies also should not force a hire-approval gate
unless operators enable that policy
> - This pull request hardens recovery issue handling, redacts retry
failure details in issue copy, preserves `maxConcurrentRuns: 1`, and
flips new-hire approval to an opt-in default
> - The benefit is safer automatic recovery and smoother default company
setup without hidden migration conflicts

## What Changed

- Added migration `0071_default_hire_approval_off` and updated company
schema/import/export/docs so hire approvals default off and serialize
only when enabled.
- Added migration `0072_large_sandman` with a partial unique index
preventing duplicate active stranded recovery issues for the same source
issue.
- Blocked failed `stranded_issue_recovery` issues in place instead of
creating nested recovery issues.
- Redacted latest retry failure details from recovery issue comments
while still linking reviewers to run evidence.
- Allowed `maxConcurrentRuns: 1` to be honored by heartbeat concurrency
normalization.
- Added focused regression coverage for recovery recursion, redaction,
migration ordering, and concurrency behavior.

## Verification

- `pnpm --filter @paperclipai/db run check:migrations`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/recovery-classifiers.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-portability.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-process-recovery.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
--pool=forks --poolOptions.forks.isolate=true` exits 0, but this host
skipped the embedded Postgres tests with the existing init guard.

## Risks

- Migration risk is low but this PR intentionally owns both new
migrations to avoid separate PR migration-journal conflicts.
- Recovery comments now require operators to inspect linked run evidence
for details instead of reading raw errors inline.
- The hire approval default changes behavior for newly created/imported
companies only; existing persisted company settings are not changed
except by the SQL default for future rows.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 15:02:47 -05:00
Dotta
6ccf80bcf2 [codex] Reject stale company skill refreshes (#4601)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the reusable agent capability layer
> - Skill inventory refresh work can outlive the company it was
requested for
> - Without an explicit company existence check, stale refreshes can
continue into bundled/local skill cleanup for deleted or missing
companies
> - This pull request makes company-skill listing fail fast when the
company no longer exists
> - The benefit is clearer API behavior and less stale background work
against missing company scope

## What Changed

- Added a company existence check before `companySkillService.list()`
refreshes bundled and local-path skill state.
- Added regression coverage asserting missing companies return `404
Company not found`.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-service.test.ts --pool=forks
--poolOptions.forks.isolate=true` exits 0, but this host skipped the
embedded Postgres tests with the existing init guard.

## Risks

- Low risk. Existing callers for valid companies are unchanged.
- Missing-company callers now receive an explicit 404 instead of
continuing refresh work.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-27 13:19:38 -05:00
Dotta
d95968a9f8 [codex] Ignore stale stored company selections (#4602)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI is the operator’s control surface for selecting the
active company
> - A company id stored in localStorage can become stale across resets,
imports, or deleted companies
> - Exposing that stale id before companies load can briefly put
downstream UI in an invalid company scope
> - This pull request defers selected-company exposure until the loaded
company list validates the stored id
> - The benefit is a cleaner company-selection bootstrap path and fewer
transient invalid API requests

## What Changed

- Initialized `CompanyProvider` selection as `null` until companies
finish loading.
- Reused a stored company id only when it exists in the loaded
selectable company list.
- Cleared storage and selected state when no companies are available.
- Added jsdom regression coverage for stale stored ids before and after
company loading.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/context/CompanyContext.test.tsx`

## Risks

- Low risk. The change only affects selection bootstrap and keeps valid
stored selections intact.
- There may be a slightly longer initial `null` selected-company state
while the company list is loading.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled terminal/GitHub
workflow, reasoning mode active. Context window not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 13:18:21 -05:00
Dotta
15c0ce3722 Add Twitter/X link to READMEs (PAP-2475) (#4593)
## Summary

- Adds [@papercliping on X](https://x.com/papercliping) next to the
existing Discord link in the top-of-file nav and the bottom
**Community** list of both `README.md` and `cli/README.md`.
- The `cli/README.md` change keeps the npm-published readme consistent
with the GitHub one.

Resolves PAP-2475.

## Test plan

- [ ] Render `README.md` on GitHub and confirm the new Twitter link in
the header strip and the Community section.
- [ ] Render `cli/README.md` (preview on GitHub or via npm) and confirm
the same.
- [ ] Click both Twitter links and verify they land on
`https://x.com/papercliping`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:35:33 -05:00
Dotta
ecd92af001 release: v2026.427.0 notes (#4590)
## Summary

- Adds `releases/v2026.427.0.md` covering the 77 PRs since `v2026.416.0`
- Highlights: multi-user access + invites, structured issue-thread
interactions, run liveness continuations, sub-issues as a workflow
checklist, issue subtree pause/cancel/restore, first-class issue
references
- Beta Features: Environments + pluggable sandbox providers (incl.
`@paperclipai/plugin-e2b`)
- Notes 14 additive migrations (`0057`–`0070`) in the Upgrade Guide; no
breaking changes flagged

Source issue: [PAP-2476](/PAP/issues/PAP-2476)

## Test plan
- [ ] Spot-check that each linked PR actually landed in the v2026.427.0
range
- [ ] Confirm migration list matches `server/src/database/migrations`
- [ ] Verify rendered markdown looks right on GitHub

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 09:16:20 -05:00
Dotta
215b6cd161 [codex] Add security role route coverage (#4589)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Agent creation accepts roles that become part of the agent contract
and telemetry.
> - The shared role list already includes the security role.
> - Direct agent creation should preserve that role through route
handling and analytics metadata.
> - This pull request adds route coverage for creating a security-role
agent and asserting telemetry receives the same role.
> - The benefit is regression coverage for security agents without
changing the production route behavior.

## What Changed

- Added a server route test that creates an agent with `role:
"security"`.
- Asserted the create payload and telemetry metadata preserve `security`
as the agent role.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-skills-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`

## Risks

- Low risk; test-only coverage.
- No runtime behavior, schema, or API contract changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:49:59 -05:00
Dotta
53396f272a [codex] Fix sub-issue progress summary styling (#4588)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The issue list and issue detail surfaces summarize child/sub-issue
progress for operators.
> - Those summaries need to be compact and visually consistent because
they appear in dense lists.
> - The progress strip is most useful when there are multiple sub-issues
to compare, so the summary intentionally stays hidden for a single
sub-issue.
> - This pull request tightens the sub-issue progress summary styling
and updates the related tests.
> - The benefit is a cleaner, more scannable task list without changing
task ownership, status, or workflow behavior.

## What Changed

- Adjusted sub-issue progress summary copy/styling in the issue list and
detail summary helpers.
- Intentionally render the progress summary only for two or more child
issues; a single child issue still appears in the normal sub-issue list
without a redundant progress strip.
- Updated the UI tests that assert the rendered summary behavior.
- Clarified the two-plus-child threshold in code with a named constant.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssuesList.test.tsx
ui/src/lib/issue-detail-subissues.test.ts`

## Screenshots

![Before/after comparison of sub-issue progress summary
styling](cd26b5bd63/pap-2449-subissue-progress-before-after.svg)

## Risks

- Low risk; this is a small UI presentation change with focused test
coverage.
- The intentional threshold change means parents with exactly one child
no longer show the aggregate progress strip, avoiding redundant summary
chrome while keeping the child visible in the list.
- No schema or API behavior changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:48:26 -05:00
Dotta
fda296ee4f [codex] Add configurable liveness auto-recovery controls (#4587)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat liveness recovery decides when stalled issue trees need
manager-visible follow-up.
> - Automatic recovery issue creation is useful, but operators need
instance-level controls for how aggressive it is.
> - Without controls, recovery behavior is harder to tune for local
development, production operations, and noisy edge cases.
> - This pull request adds configurable liveness auto-recovery settings
across shared contracts, API routes, services, and the instance
experimental settings UI.
> - The benefit is that operators can keep liveness findings advisory or
enable bounded recovery automation with explicit intervals and lookback
windows.

## What Changed

- Added shared types and validators for liveness auto-recovery settings.
- Extended instance settings routes and services to persist and validate
the new controls.
- Wired heartbeat/recovery services to honor enablement, minimum
interval, and lookback settings.
- Added UI controls for liveness recovery under instance experimental
settings.
- Covered the new server behavior with instance settings and liveness
escalation tests.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts
server/src/__tests__/instance-settings-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Moderate behavioral risk because recovery automation timing changes
when enabled; defaults keep existing advisory behavior unless the
setting is turned on.
- No database migration in this PR; settings are stored through the
existing instance settings path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, `gpt-5`, coding model with tool use and local command
execution; context window not exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:46:44 -05:00
Neeraj Kumar Singh B
f0f9460d1d docs: AWS ECS Fargate deployment runbook (#3897)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies and ships
a
>   "local-first, cloud-ready" deployment model
> - The deploy docs currently cover local/Docker but not a production
> cloud target, so teams asking "how do I put this behind a real domain"
>   have no canonical path
> - We already support Docker images, RDS-compatible Postgres, and an
EFS
>   storage profile, so AWS ECS Fargate is a natural fit
> - Without a runbook, each team reinvents VPC, security groups, TLS,
and
>   secrets wiring and usually gets at least one step wrong
> - This pull request adds `docs/deploy/aws-ecs.md`, an ECS
task-definition
> template, and an `.env.aws.example`, cross-linked from the deploy
overview
> - The benefit is a single, reproducible ~$110/mo path to a production
>   deployment, plus a full teardown for throwaway environments

## What Changed

- New `docs/deploy/aws-ecs.md` — an 11-step ECS Fargate runbook covering
ECR,
  VPC, RDS, EFS, Secrets Manager, IAM, ALB, and ECS service with the
  deployment circuit breaker enabled
- New `docker/ecs-task-definition.json` — Fargate-ready task definition
with
  `<ACCOUNT_ID>`, `<REGION>`, `<EFS_ID>`, `<DOMAIN>` placeholder tokens
- New `docker/.env.aws.example` — documents every non-secret env var the
  ECS deployment needs
- `docs/deploy/overview.md` — one-line cross-reference to the new guide
- Greptile feedback addressed in follow-up commits:
  - `containerName` in the service-create call now matches
    `paperclip-server` in the task definition
  - HTTP :80 listener added that 301-redirects to :443
  - Dedicated RDS DB subnet group created before `create-db-instance`
  - EFS teardown polls on mount-target deletion instead of `sleep 30`

## Verification

- Walked every step of the runbook against the task definition to
confirm
  variable names (`$ALB_SG`, `$ECS_SG`, `$RDS_SG`, `$EFS_SG`, `$TG_ARN`,
`$LISTENER_ARN`, `$HTTP_LISTENER_ARN`, `$EFS_ID`, `$RDS_ENDPOINT`, etc.)
are
  defined before they are referenced
- Confirmed the `containerName` in Step 10 (`paperclip-server`) matches
  `docker/ecs-task-definition.json` line 11
- Confirmed the `sed` placeholder substitution in Step 8 matches the
tokens
  in the task definition template
- Teardown order was checked in reverse-dependency order: ECS service →
  listeners → target group → ALB → RDS (waits for deletion) → DB subnet
  group → EFS mount targets (polled) → EFS → secrets → SGs → ECR → IAM →
  log group

## Risks

- **Low risk for the repo.** Docs-only change plus two template files
under
`docker/`; no runtime code paths are touched and nothing is imported by
  the build.
- **Risk for users who follow the runbook:** AWS bills accrue
immediately
  once RDS/ALB/EFS exist. The runbook calls this out and includes a full
  teardown procedure. Placeholder tokens (`<ACCOUNT_ID>`, `<REGION>`,
`<EFS_ID>`, `<DOMAIN>`) are documented so nothing is silently
hard-coded.

## Model Used

- Claude (Anthropic), model `claude-opus-4-6`, ~200K context window,
extended thinking mode on, used with tool access (file edit, shell) via
Claude Code. The Greptile follow-up commits were authored the same way.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass — N/A for docs/config
templates; validated by reading
- [x] I have added or updated tests where applicable — N/A for docs
- [x] If this change affects the UI, I have included before/after
screenshots — N/A, no UI
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-27 08:41:47 -05:00
Dotta
1d8c7a09b8 [codex] Add security role route regression (#4586)
## Thinking Path

> - Paperclip orchestrates AI agents through company-scoped
control-plane workflows.
> - Agent creation is one of the core board/operator surfaces for
defining who works in a company.
> - The shared taxonomy now includes a first-class `security` agent
role.
> - Direct agent creation must preserve that role through default
instruction materialization and telemetry.
> - A prior replacement PR covered this path, but Greptile identified
that the route-test mock could let a future patch object shadow the
regression.
> - This pull request reopens the narrow regression coverage from
current `master` with the mock ordering fixed.
> - The benefit is a focused guardrail that keeps `security` role
creation observable without expanding the production diff.

## What Changed

- Added a direct agent creation route regression test for `role:
"security"`.
- Verified telemetry receives `agentRole: "security"` after the default
instruction materialization update path.
- Ordered the regression mock as `...patch` before `role: "security"` so
future patch fields cannot shadow the asserted role.

## Verification

- `pnpm install --frozen-lockfile` to link dependencies in the fresh
worktree; it completed with existing plugin SDK bin warnings.
- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
packages/shared/src/adapter-types.test.ts`

## Risks

- Low risk. This is test-only coverage and does not change runtime
behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 based coding agent, tool-enabled with local shell
and repository editing capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
test-only regression)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-27 08:11:52 -05:00
Devin Foley
d2cbe2cb23 Prefer pushing feature branches to a user fork in paperclip-dev skill (#4572)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The `paperclip-dev` skill is the canonical reference agents read
before doing development work on the Paperclip repo itself
> - Today the skill assumes feature branches get pushed to `origin` (=
`paperclipai/paperclip`), which clutters the upstream branch list when
contributors actually have personal forks
> - This is the standard open-source contribution pattern (push to fork,
PR upstream) and the skill should reflect it
> - This pull request adds a "Forks — Prefer Pushing to a User Fork"
section that teaches agents to detect a fork remote, push there, and
only fall back to `origin` when no fork is configured
> - The benefit is cleaner upstream branch hygiene and behavior that
matches typical contributor workflows without any code/runtime change

## What Changed

- Added a new **Forks — Prefer Pushing to a User Fork** section to
`skills/paperclip-dev/SKILL.md` covering:
- How to detect a user fork via `git remote -v` (treat any
non-`paperclipai` GitHub remote as the fork)
  - How to push to the fork (`git push -u <fork-remote> HEAD`)
- How to create the PR from the fork (`gh pr create --repo
paperclipai/paperclip --head <fork-owner>:<branch>`)
- The no-fork fallback (push to `origin`, do not auto-create a fork —
ask first)
  - Keeping the fork's `master` in sync
- Added a reinforcing entry to the **Common Mistakes** table linking
back to the new section

## Verification

- Docs-only change to a single markdown skill file. Reviewer can confirm
by reading the diff in `skills/paperclip-dev/SKILL.md`:
- New `## Forks — Prefer Pushing to a User Fork` section sits between
`## Worktrees` and `## Pull Requests`
  - New row appended to the `## Common Mistakes` table
- No tests, no build, no runtime behavior affected.

## Risks

- Low risk. Documentation-only edit. The instructions are advisory —
they only change agent behavior on future runs that read the skill.

## Model Used

- Provider: Anthropic (Claude)
- Model ID: `claude-opus-4-7` (Claude Opus 4)
- Capabilities: tool use (file read/edit, shell, git, gh CLI), extended
reasoning
- Context: invoked via Claude Code / Paperclip heartbeat for issue
PAPA-139

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass (N/A — docs-only change; no
test surface)
- [x] I have added or updated tests where applicable (N/A)
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — no UI change)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 22:19:07 -07:00
Dotta
82e257c7ba Cancel stale queued heartbeats when issue graph changes (PAP-2314) (#4534)
Co-Authored-By: Paperclip <noreply@paperclip.ing>
2026-04-26 21:17:38 -05:00
Devin Foley
868d08903e test: isolate CLI company import e2e state (#4560)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and its
CLI import/export path is part of how operators move company state
safely between environments.
> - The `paperclipai company import/export` e2e test is supposed to
validate that portability flow inside a hermetic harness, not against a
developer's live Paperclip home.
> - This regression showed nested CLI subprocesses could silently fall
back to ambient `PAPERCLIP_*` state and mutate a real local instance by
creating extra companies such as `CLI-1-Roundtrip-Test`.
> - The first job was to pin the test subprocesses to isolated config,
home, instance, auth, and context paths, and to add a regression
assertion that proves the nested CLI writes stay inside the test-owned
state.
> - Once the PR was up, CI and Greptile exposed two follow-on issues
that were blocking merge: plugin SDK typecheck bootstrap was racing
across packages in fresh CI, and the new lock helper needed one more fix
to release its lock on failure.
> - This pull request therefore ends up doing two tightly related
things: fixing the original CLI isolation leak, and hardening the
supporting typecheck/bootstrap path enough for the fix to verify cleanly
in CI.
> - The benefit is that the portability e2e test is now actually
isolated, and the PR verification path is stable enough to catch
regressions instead of introducing its own nondeterministic failures.

## What Changed

- Hardened `cli/src/__tests__/company-import-export-e2e.test.ts` so
nested CLI subprocesses re-seed isolated `PAPERCLIP_CONFIG`,
`PAPERCLIP_HOME`, `PAPERCLIP_INSTANCE_ID`, `PAPERCLIP_CONTEXT`,
`PAPERCLIP_AUTH_STORE`, and throwaway `HOME` values instead of falling
back to ambient machine state.
- Added a regression assertion around `paperclipai context set --json`,
then cleared the temporary `context.json` so the isolation check and the
later export/import flow stay independent.
- Passed the same isolated `HOME` into the server subprocess so both
sides of the e2e harness are symmetric.
- Introduced locking in `scripts/ensure-plugin-build-deps.mjs` and
switched the server/plugin example `typecheck` scripts to use that
helper instead of launching concurrent raw `@paperclipai/plugin-sdk`
builds.
- Fixed the helper failure path so it releases the lock before exiting
non-zero, which prevents stale-lock timeouts during parallel typecheck
runs.

## Verification

- `pnpm vitest run cli/src/__tests__/company-import-export-e2e.test.ts
--project paperclipai`
- `pnpm --filter paperclipai typecheck`
- `pnpm -r typecheck`
- PR checks now pass on the current head, including `policy`, `verify`,
`e2e`, `security/snyk`, and `Greptile Review`.

## Risks

- Low risk. The product-facing behavior change is scoped to test harness
code in the CLI e2e suite.
- The CI stabilization changes only affect bootstrap/typecheck helper
paths for the server and plugin/example packages, but they do touch
shared verification plumbing; the main risk is changing how fresh build
artifacts are prepared in local/CI typecheck runs.

## Model Used

- Anthropic Claude via Paperclip `claude_local`, model
`claude-opus-4-7`, high-effort local coding agent, used for the initial
implementation and first peer-reviewed verification.
- OpenAI Codex via Paperclip `codex_local`, model `gpt-5.4`, high
reasoning-effort local coding agent with tool use, used for CI triage,
Greptile follow-up fixes, verification, and PR maintenance.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 19:10:01 -07:00
Devin Foley
1d9f7a5149 Fix flaky heartbeat recovery teardown CI failure (#4559)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The linked CI job is in the server test/recovery path, where
heartbeat runs and issue cleanup need to leave the control plane in a
consistent state even when retries fail.
> - In this case the failure was not runtime product behavior but test
teardown behavior inside `heartbeat-process-recovery.test.ts`.
> - The failing GitHub Actions job showed a foreign-key race on
`company_skills_company_id_companies_id_fk` while the test tried to
delete the parent company record.
> - The surrounding teardown code already uses bounded retry cleanup for
other dependent tables (`issues`, `heartbeatRuns`, and `agents`) because
this test file intentionally exercises asynchronous recovery flows.
> - This pull request applies that same retry pattern to the final
`db.delete(companies)` step, re-clearing `companySkills` before each
retry.
> - The benefit is a targeted fix for the CI flake without changing
runtime behavior or expanding the scope beyond the failing teardown
path.

## What Changed

- Wrapped the final `db.delete(companies)` call in
`server/src/__tests__/heartbeat-process-recovery.test.ts` with the same
5-attempt retry pattern already used elsewhere in that teardown.
- Re-cleared `companySkills` before each company-delete retry so
late-arriving FK-dependent rows do not mask the real test result.
- Verified the fix against the originally failing
`heartbeat-process-recovery` test file and the broader `pnpm test:run`
command under CI-like env conditions.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts`
- Re-ran `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts` multiple times
locally; the previously failing teardown stayed green.
- `env -u PAPERCLIP_API_URL -u PAPERCLIP_RUNTIME_API_URL -u
PAPERCLIP_RUN_ID -u PAPERCLIP_TASK_ID -u PAPERCLIP_AGENT_ID -u
PAPERCLIP_COMPANY_ID -u PAPERCLIP_API_KEY -u PAPERCLIP_WAKE_REASON -u
PAPERCLIP_WAKE_COMMENT_ID -u PAPERCLIP_WAKE_PAYLOAD_JSON -u
PAPERCLIP_APPROVAL_ID -u PAPERCLIP_APPROVAL_STATUS pnpm test:run`

## Risks

- Low risk. The change is test-only and scoped to teardown retry
behavior in a single server test file.
- If the underlying async cleanup behavior changes again, this test
could still become flaky in a different way, but this PR addresses the
specific FK race seen in the linked CI job.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI `gpt-5.4` via Paperclip `codex_local`, high reasoning mode,
with tool use for shell, git, HTTP API calls, and patch application.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:30:20 -07:00
Devin Foley
8145141c55 Fix external issue URL rewriting in markdown (#4558)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Issue and comment rendering is part of the board UI where humans
supervise and inspect agent work.
> - External Paperclip issue URLs can appear in comments as references
to other runs, review threads, or remote test environments.
> - Those links must preserve their full destination, including origin,
port, and `#comment-...` fragments, or the operator is taken to the
wrong place.
> - The bug here was that absolute `http(s)` issue URLs were being
normalized into internal `/issues/...` routes in the markdown path.
> - This pull request stops rewriting absolute URLs while keeping
internal issue-reference behavior for relative paths and identifiers.
> - The benefit is that authored external links now navigate exactly
where the operator expects, especially for remote test and
comment-deep-link workflows.

## What Changed

- Stopped `ui/src/lib/issue-reference.ts` from treating absolute
`http(s)` URLs as internal issue paths.
- Added defense-in-depth in `ui/src/lib/mention-chips.ts` so absolute
`http(s)` URLs are never reclassified as issue mention chips.
- Updated `ui/src/lib/issue-reference.test.ts` to cover absolute
Paperclip URLs with preserved origin, port, and comment hash.
- Updated `ui/src/components/MarkdownBody.test.tsx` to assert the
reported URL renders as an external link, not an internal `/issues/...`
href.

## Verification

- `pnpm exec vitest run ui/src/lib/issue-reference.test.ts
ui/src/components/MarkdownBody.test.tsx`
- Expected result: `2` files passed, `37` tests passed.
- Manual spot-check from the issue report path: a URL like
`http://remote.example.test:3103/PAPA/issues/PAPA-115#comment-...`
should remain an external link with its full destination preserved.

## Risks

- Low risk. The change narrows when Paperclip rewrites URLs, so the main
risk is if some existing workflow depended on absolute `http(s)`
Paperclip URLs being converted into internal issue links. The added
regression coverage is aimed at preventing that from regressing
silently.

## Model Used

- OpenAI Codex local agent via Paperclip `codex_local`
- Backing model family: GPT-5-based Codex runtime
- Exact backend model ID/version: not exposed by this adapter/runtime
surface
- Context window: not exposed by this adapter/runtime surface
- Capabilities used: tool use, shell command execution, code editing,
git operations, and local test execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [ ] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 17:19:23 -07:00
Devin Foley
54ab0d24cd Fix disappearing issue comments (#4557)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue detail
pages are a primary surface for understanding agent work and human
feedback.
> - The relevant subsystem here is the issue comments/chat experience
across the React issue detail page and the server comment pagination
API.
> - Long issue threads were only surfacing the newest page of comments
at first render, which hid earlier human and agent messages behind extra
pagination.
> - The first UI fix exposed that the descending cursor path on the
server could also fail for older-page fetches, leaving the chat tab
stuck on an infinite "Loading earlier comments..." state.
> - This needed to be addressed in both layers so the chat tab can
surface earlier conversation history without manual recovery and without
server errors.
> - This pull request auto-loads earlier comment pages in the issue
detail chat view and fixes the descending cursor predicate used by issue
comment pagination.
> - The benefit is that long-running issues like `PAPA-103` now show the
missing conversation history near the top of the chat surface instead of
hiding it or failing to load it.

## What Changed

- Auto-load earlier issue comment pages in the issue detail chat tab
until the thread reaches a 150-comment cap or there are no older
comments left.
- Add UI-side guard logic and regression coverage for optimistic issue
comment pagination so the autoload behavior stops cleanly.
- Replace the raw SQL descending cursor predicate in
`issueService.listComments` with typed Drizzle comparisons for the
`(createdAt, id)` anchor tuple.
- Add a server regression test that paginates earlier comments in
descending order from an anchor comment.
- Smoke-test the exact previously failing seeded `PAPA-103` cursor path
on the isolated dev instance used for review.

## Verification

- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/issues-service.test.ts`
- `pnpm --filter @paperclipai/server typecheck`
- Manual smoke against seeded `PAPA-103` data on the isolated dev
server:
- `GET /api/issues/PAPA-103/comments?order=desc&limit=50` returns `200`
- `GET
/api/issues/PAPA-103/comments?after=765d3609-edc6-4d11-a8fe-d466affbe85d&order=desc&limit=50`
now returns `200` with 50 comments instead of `500`

## Risks

- Moderate UI/perf risk on very large threads because the chat tab now
prefetches multiple earlier pages on mount; the cap is intentionally
limited to 150 comments to bound that work.
- Low API risk because the server fix only changes the cursor predicate
construction for anchor-based comment pagination, but any mistake there
would affect older-comment paging order.

> I checked `ROADMAP.md` before opening this PR and this bug fix does
not duplicate planned core work.

## Model Used

- OpenAI Codex coding agent in the Paperclip local adapter environment.
The exact backend model ID and context window were not exposed
in-session. Tool-assisted workflow included shell execution, git/GitHub
CLI, local test execution, and targeted code edits.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 16:23:53 -07:00
Devin Foley
b2496c8067 fix(auth): trust allowed hostname port variants on detected listen port (#4554)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, so
authenticated board access has to be predictable across local and
worktree deployments.
> - This change sits in the authenticated-mode server startup and Better
Auth origin-trust wiring.
> - The original auth branch fixed one real gap by adding port-qualified
trusted origins for allowed hostnames on non-default ports.
> - Review of that branch found a second-order bug: trusted origins were
still derived from the configured port before startup detected the
actual listen port.
> - In isolated worktrees, that meant a common `3100 -> 3101` port shift
could still leave Better Auth trusting the stale origin.
> - This pull request keeps the original allowed-hostname port-variant
fix, then moves trust derivation onto the resolved listen port and adds
regression coverage around startup wiring.
> - The benefit is that authenticated sessions keep working on allowed
private hostnames even when Paperclip has to auto-shift to a different
local port.

## What Changed

- Added `:port` trusted-origin variants for authenticated-mode
`allowedHostnames` when Paperclip runs on non-default ports.
- Changed authenticated startup so `listenPort` is detected before
Better Auth initialization, and explicit auth base URLs are rewritten
before auth startup.
- Updated `deriveAuthTrustedOrigins()` to accept the resolved listen
port so Better Auth trusts the actual browser origin instead of the
stale configured port.
- Added focused regression coverage in
`server/src/__tests__/better-auth.test.ts` and
`server/src/__tests__/server-startup-feedback-export.test.ts`.

## Verification

- `pnpm exec vitest run server/src/__tests__/better-auth.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts`
- Reviewer re-check: reviewed commits `380f5b9f` and `092bb34c` after
the follow-up fix landed and found no remaining issues.

## Risks

- Low risk: this only affects authenticated-mode origin derivation and
startup ordering around detected listen ports.
- Main behavioral shift: startup no longer mutates `config.port` to the
selected port; it now carries `requestedListenPort` separately and uses
`listenPort` where runtime behavior needs the resolved value.
- If another path was implicitly relying on `config.port` being
overwritten during startup, that path would need follow-up, though the
current startup/test coverage did not reveal one.

> I checked `ROADMAP.md` and did not find an overlapping planned core
work item for this auth trusted-origin port handling fix.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agents for implementation and
review. Exact backend model ID/context window were not surfaced in this
run context; work was performed through the Codex local adapter with
tool use, code execution, and review passes.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 15:40:39 -07:00
Devin Foley
08af830430 Tighten publicBaseUrl port rewriting (#4553)
## Thinking Path

> - Paperclip is a control plane for autonomous agent companies, so its
local and authenticated deployment behavior has to stay predictable
under port rebinding and worktree isolation.
> - This change sits in the server/worktree configuration path that
derives runtime URLs and auth origins from `auth.publicBaseUrl`.
> - The original hostname-port rewrite change fixed one real gap for
private/tailnet host:port worktree setups, but it widened the rewrite
rule too far.
> - Rewriting every explicit `auth.publicBaseUrl` can corrupt public or
reverse-proxy URLs by turning a stable origin like
`https://paperclip.example` into a local listen-port URL.
> - Paperclip's auth and trusted-origin handling depend on that URL
staying semantically correct, so this had to be narrowed before merge.
> - This pull request tightens the rewrite rule to explicit-port URLs
only and adds regression coverage across the CLI helper, worktree config
persistence, and server startup path.
> - The benefit is that private host:port worktree flows still work,
while public/default-port URLs remain stable and safe.

## What Changed

- Tightened `rewriteLocalUrlPort` in `cli/src/commands/worktree-lib.ts`,
`server/src/worktree-config.ts`, and `server/src/index.ts` so it only
rewrites URLs that already include an explicit port.
- Removed the old loopback-only hostname gate from the CLI/worktree
helpers and replaced it with the more precise `parsed.port` guard.
- Updated CLI helper coverage to assert that explicit-port non-loopback
URLs still rewrite while no-port public URLs stay unchanged.
- Expanded `server/src/__tests__/worktree-config.test.ts` to cover
explicit-port rewrite and no-port stability for both persisted worktree
config and in-memory runtime port selection.
- Added startup-path coverage in
`server/src/__tests__/server-startup-feedback-export.test.ts` for
`detect-port` rebinding with both explicit-port and no-port
`auth.publicBaseUrl` values.

## Verification

- `pnpm --filter @paperclipai/plugin-sdk build`
- `npx vitest run
server/src/__tests__/server-startup-feedback-export.test.ts`
- `npx vitest run cli/src/__tests__/worktree.test.ts
server/src/__tests__/worktree-config.test.ts`
- All of the above were run locally in this issue worktree and passed.

## Risks

- Low risk. The behavior change is deliberately narrower than the
reviewed broad-host rewrite and is guarded by regression coverage for
both the explicit-port and no-port cases.
- The main remaining risk is behavioral only if another code path starts
depending on port rewriting for URLs that never declared a port, which
would be a separate bug.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex local agent using `gpt-5.4` with high reasoning effort,
tool use, shell execution, and file editing.
- Anthropic Claude local agent using `claude-opus-4-6` for follow-up
code review approval on the implementation issue.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 14:29:22 -07:00
Devin Foley
d47ffa87f0 Fix CEO AGENT_HOME paths and centralize workspace env propagation (#4551)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The local adapter layer is responsible for turning Paperclip runtime
context into the environment seen by the child agent process.
> - The CEO onboarding bundle tells the agent where to read and write
its persistent memory and fact files.
> - That bundle was using `./memory/...` and `./life/...`, which only
works when the process cwd happens to equal the agent home directory.
> - At the same time, six local adapters each duplicated the same
workspace-env propagation logic, including `AGENT_HOME`, which makes
this contract easy to drift.
> - This pull request fixes the CEO instructions to use
`$AGENT_HOME/...` and centralizes workspace-env propagation in one
shared helper with shared tests.
> - The benefit is a real bug fix for agent memory paths plus a single
tested contract that makes future built-in adapter work less likely to
forget `AGENT_HOME`.

## What Changed

- Updated `server/src/onboarding-assets/ceo/HEARTBEAT.md` to use
`$AGENT_HOME/memory/...` and `$AGENT_HOME/life/...` instead of
cwd-relative `./memory/...` and `./life/...`.
- Added `applyPaperclipWorkspaceEnv(...)` in
`packages/adapter-utils/src/server-utils.ts` to centralize
`PAPERCLIP_WORKSPACE_*` and `AGENT_HOME` propagation.
- Added shared helper coverage in
`packages/adapter-utils/src/server-utils.test.ts` for both populated and
skip-empty cases.
- Switched the built-in local adapters (`claude_local`, `codex_local`,
`cursor_local`, `gemini_local`, `opencode_local`, `pi_local`) over to
the shared helper instead of inline env assignment blocks.

## Verification

- `pnpm install`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/codex-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- Result: 7 test files passed, 31 tests passed, 0 failures.

## Risks

- Low risk.
- The only behavioral surface is the shared env propagation refactor
across six adapters; if the helper diverged from prior semantics, an
adapter could miss a workspace env var.
- The shared helper test plus the affected adapter execute tests reduce
that risk, and the helper preserves the prior "set only non-empty
strings" behavior.

## Model Used

- OpenAI Codex via Paperclip `codex_local` agent runtime; tool-assisted
coding workflow with shell execution, file patching, git operations, and
API interaction. The exact backend model identifier and context window
are not surfaced by this local runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 13:57:35 -07:00
Devin Foley
d1484551ee Add open-source hygiene note to paperclip-dev skill (#4541)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The `paperclip-dev` skill is part of the contributor and agent
workflow layer that tells developers how to work in this repository
safely.
> - That skill already references the public upstream `origin`, but it
did not explicitly say that pushes there must be treated as publishable
open-source output.
> - Without that reminder, contributors are more likely to leak secrets,
PII, private logs, machine-local config, or noisy throwaway git history
into the public repo.
> - This pull request adds a prominent `OPEN SOURCE HYGIENE` callout
near the top of the skill, before the git workflow guidance.
> - The benefit is clearer safety guidance for contributors and less
accidental disclosure or branch/commit noise on the upstream project.

## What Changed

- Added an `OPEN SOURCE HYGIENE` callout near the top of
`skills/paperclip-dev/SKILL.md`.
- Explicitly warned that anything pushed to `origin` must be
publishable.
- Called out avoiding secrets, API keys, PII, private logs,
machine-local config, and noisy throwaway branches or checkpoint
commits.

## Verification

- N/a

## Risks

- Low risk. This is a docs-only change in a skill file; the main risk is
wording tone or placement, not runtime behavior.

## Model Used

- OpenAI Codex via the `codex_local` Paperclip adapter, GPT-5-based
coding agent runtime. Exact backend serving model ID is not exposed in
this heartbeat environment. Tool use, shell execution, and patch
application were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 12:14:49 -07:00
Devin Foley
91333ec86f feat: add paperclip-dev skill with optional bundled skill support (#3854)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents working on the Paperclip codebase itself need guidance on dev
workflows: server lifecycle, worktrees, builds, database ops,
diagnostics
> - There was no bundled skill covering these workflows — agents had to
figure it out from scratch each time
> - Additionally, not every skill should be force-installed on every
agent — a dev-focused skill should be opt-in
> - This PR adds a `paperclip-dev` skill with `required: false`
frontmatter so it ships with Paperclip but isn't auto-installed
> - The skill's PR section references canonical files
(`.github/PULL_REQUEST_TEMPLATE.md`, `CONTRIBUTING.md`) instead of
duplicating their content, with gated instructions that force agents to
read those files before creating any PR
> - The benefit is that developers (human or agent) can opt in to
structured dev guidance without polluting the default agent skill set or
creating drift between duplicated docs

## What Changed

- Added `skills/paperclip-dev/SKILL.md` covering server management,
worktree lifecycle, builds, database ops, diagnostics, agent operations,
and common mistakes
- The Pull Requests section uses gated, reference-based instructions —
agents MUST read `.github/PULL_REQUEST_TEMPLATE.md` and
`CONTRIBUTING.md` before running `gh pr create`, with a brief checklist
of required section names (no content duplication)
- Updated `packages/adapter-utils/src/server-utils.ts` to respect
`required: false` frontmatter — optional skills are bundled but not
auto-installed on agents
- Added test in `server/src/__tests__/paperclip-skill-utils.test.ts`
verifying that optional skills are excluded from the default install set

## Verification

```bash
# Run tests
pnpm test

# Manual verification: create a fresh worktree without seeding
npx paperclipai worktree:make test-optional-skill --no-seed
cd ~/paperclip-test-optional-skill
eval "$(npx paperclipai worktree env)"
npx paperclipai run

# Verify paperclip-dev appears in company skill library but is NOT auto-assigned
# Call listPaperclipSkillEntries() — paperclip-dev should show required: false
# Call resolvePaperclipDesiredSkillNames() — paperclip-dev should NOT be in the default set

# Cleanup
npx paperclipai worktree:cleanup test-optional-skill
```

## Risks

- Low risk. The `required` field defaults to `true` when absent, so all
existing skills behave identically. Only the new `paperclip-dev` skill
sets `required: false`.

## Model Used

Claude Opus 4.6 (`claude-opus-4-6`) via Claude Code, with tool use and
extended context.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 11:06:13 -07:00
Dotta
c036bbfa98 Add first-class security agent role to taxonomy (#4532)
## Thinking Path

> - Paperclip is the control plane for AI-agent companies, so agent
metadata is part of the platform's governance and audit surface.
> - The shared agent taxonomy in `@paperclipai/shared` is the source of
truth for allowed agent roles and their UI labels.
> - The current taxonomy lacks a `security` role, which causes Security
Engineer hires to collapse into `engineer`.
> - That breaks separation-of-duties evidence in telemetry and weakens
role-level audit fidelity even though it does not directly change
permissions.
> - This pull request adds `security` as a first-class shared role and
covers the prior rejection path with a regression test.
> - The benefit is that Security Engineer agents can now be persisted
and rendered under the correct role without schema or permission churn.

## What Changed

- Added `security` to `AGENT_ROLES` in
`packages/shared/src/constants.ts`.
- Added the `Security` display label to `AGENT_ROLE_LABELS` so existing
UI consumers render the new role automatically.
- Added a shared validator regression test proving `createAgentSchema`
accepts `role: "security"` and that the label stays stable.

## Verification

- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/shared exec vitest run
src/adapter-types.test.ts`

## Risks

- Low risk. This is a shared enum expansion with no database migration
and no permission-model change.
- Residual risk: this PR does not backfill existing agents already
persisted as `engineer`; it only fixes new validations and labels going
forward.

> I checked `ROADMAP.md`/`doc` for overlap and did not find an existing
planned item covering this taxonomy fix.

## Model Used

- OpenAI GPT-5.4 via the Codex local adapter, with tool use and local
code execution enabled. The runtime did not surface a separate
context-window identifier in agent metadata.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-26 07:52:05 -05:00
Dotta
df425fde96 Present ordered sub-issues as a workflow checklist (#4523)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators use issue detail pages and child issue lists to understand
multi-step execution plans.
> - Ordered sub-issues currently read like a flat table, so dependency
chains and current next steps are harder to scan.
> - The branch work adds a workflow-oriented presentation for child
issues without changing the single-assignee task model.
> - This pull request makes ordered sub-issues read more like a progress
checklist while preserving normal issue list controls.
> - The benefit is that operators can see completed steps, active work,
blocked follow-ups, and dependency order at a glance.

## What Changed

- Added workflow sorting utilities and tests for dependency-aware child
issue ordering.
- Added sub-issue progress summary, checklist numbering, current-step
affordances, blocker context, and done-state de-emphasis in the issue
list UI.
- Wired issue detail sub-issue panels to use the workflow sort/progress
checklist presentation.
- Updated issue service behavior/tests for child issue ordering inputs
used by the UI.
- Added a Storybook visual review fixture and screenshot helper for the
sub-issue workflow checklist surface.

## Verification

- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/issues-service.test.ts
ui/src/components/IssueRow.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/pages/IssueDetail.test.tsx
ui/src/lib/issue-detail-subissues.test.ts
ui/src/lib/workflow-sort.test.ts`
- Result: 6 test files passed, 55 tests passed, 34 embedded Postgres
issue-service tests skipped because `@embedded-postgres/darwin-x64` is
unavailable on this host.
- Visual review: generated Storybook screenshots from the existing local
Storybook server on port 6006 with `node
scripts/screenshot-subissues.mjs /tmp/pap-2189-subissues-screens
http://localhost:6006`.
- Screenshot artifacts:
- Desktop dark: ![Desktop
dark](doc/assets/pap-2189/desktop-1440x900-dark.png)
- Desktop light: ![Desktop
light](doc/assets/pap-2189/desktop-1440x900-light.png)
- Mobile dark: ![Mobile
dark](doc/assets/pap-2189/mobile-390x844-dark.png)
- Mobile light: ![Mobile
light](doc/assets/pap-2189/mobile-390x844-light.png)
- Local Storybook note: starting a second Storybook process selected
port 6008 because 6006 was occupied, then Vite failed with an esbuild
host/binary version mismatch (`0.25.12` host vs `0.27.3` binary). The
already-running Storybook server on 6006 served the fixture successfully
for screenshots.

## Risks

- Medium UI risk: the issue list now has additional sub-issue-specific
visual states, so dense lists should be checked for spacing and
scanability.
- Low ordering risk: workflow sorting is covered by focused unit tests,
but unusual dependency topologies may still need reviewer attention.
- No migration risk: this PR does not add database migrations or touch
`pnpm-lock.yaml`.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled shell/git/GitHub
workflow. Context window is runtime-provided and not exposed in this
environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-26 07:36:49 -05:00
Devin Foley
40782f703d Fix release packaging for standalone public packages (#4494)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies, and the
sandbox-provider work just moved E2B into a standalone publishable
plugin package.
> - That plugin is intentionally excluded from the root pnpm workspace
so it can model third-party install behavior without forcing lockfile
churn in the main repo.
> - The merged architecture change exposed a follow-up release problem:
the canary publish workflow tried to publish `@paperclipai/plugin-e2b`,
but the tarball had no `dist/` payload because standalone public
packages were not being built in the release path.
> - That means the release pipeline needed a packaging fix in core
release tooling, not another architectural change in the sandbox
provider itself.
> - This pull request adds a generic release step for public packages
that live outside the pnpm workspace, instead of hardcoding E2B-specific
behavior into the release script.
> - The benefit is that standalone publishable packages can be built and
packed correctly during release, including future sandbox-provider
plugins that follow the same pattern.

## What Changed

- Added `scripts/build-standalone-public-packages.mjs` to discover
public packages outside the pnpm workspace, run a clean package-local
install, and build them before publish.
- Updated `scripts/release.sh` to invoke that helper immediately after
the normal workspace build step.
- Kept the behavior generic by driving off the existing public package
map and pnpm workspace patterns rather than special-casing
`@paperclipai/plugin-e2b`.

## Verification

- `rm -rf packages/plugins/sandbox-providers/e2b/dist`
- `node ./scripts/build-standalone-public-packages.mjs`
- `cd packages/plugins/sandbox-providers/e2b && npm pack --dry-run`
- Confirm the tarball now includes the rebuilt `dist/` files instead of
only `README.md` / `package.json`

## Risks

- Low risk: this only changes the release build path for public packages
outside the pnpm workspace.
- The helper performs a clean package-local install for each standalone
public package, so release time may increase slightly as more such
packages are added.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 12:16:23 -07:00
Devin Foley
4ef969f084 Add E2B sandbox provider plugin (#4452)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Sandbox environments are part of that execution layer, and the
recent core refactor moved provider-specific behavior to a generic
plugin seam
> - This pull request adds a dedicated `@paperclipai/plugin-e2b` package
so E2B can live entirely outside core host code
> - Because the feature is still unreleased, the plugin should model
third-party packaging directly instead of carrying extra
backward-compatibility complexity in core or the workspace lockfile
> - This branch therefore makes the E2B provider a standalone
publishable package, documents the package-local dev flow, and keeps the
publish manifest/runtime dependency story correct
> - The benefit is that E2B becomes a true plugin reference
implementation that can be installed by package name without reopening
core Paperclip code

## What Changed

- Added `packages/plugins/paperclip-plugin-e2b` as the E2B sandbox
provider plugin package
- Implemented config validation, lease acquire/resume/release/destroy
handlers, workspace realization, and command execution for E2B sandboxes
- Excluded the E2B plugin package from the root workspace so the repo no
longer needs `pnpm-lock.yaml` churn for its third-party dependency graph
- Added package-local development/install support plus a prepack
manifest generator so the published tarball still declares
`@paperclipai/plugin-sdk` and `e2b` runtime dependencies
- Addressed review feedback by fixing sandbox cleanup on acquire
failures, rejecting blank templates, normalizing fractional `timeoutMs`,
and always passing the configured template name to the E2B SDK
- Updated focused Vitest coverage for config normalization, validation,
acquire cleanup, command execution, and lease release behavior
- Updated the Dockerfile deps stage to copy the E2B package manifest so
the policy check stays in sync

## Verification

- `cd packages/plugins/paperclip-plugin-e2b && pnpm install
--ignore-workspace --no-lockfile`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm build`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
test`
- `cd packages/plugins/paperclip-plugin-e2b && pnpm --ignore-workspace
typecheck`
- `cd packages/plugins/paperclip-plugin-e2b && npm pack --dry-run`

## Risks

- The package now relies on a prepack manifest rewrite so the
publish-time dependency list stays correct while the repo-local dev
manifest stays workspace-light
- The current repo snapshot is still unreleased, so the generated
publish manifest points at the repo SDK version until the normal release
flow rewrites versions before publish
- Real-world E2B environments may still expose edge cases around
lifecycle timing or sandbox metadata beyond the mocked unit coverage

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, GitHub CLI, and local
build/test inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-25 11:01:11 -07:00
Devin Foley
5bd0f578fd Generalize sandbox provider core for plugin-only providers (#4449)
## Thinking Path

> - Paperclip is a control plane, so optional execution providers should
sit at the plugin edge instead of hardcoding provider-specific behavior
into core shared/server/ui layers.
> - Sandbox environments are already first-class, and the fake provider
proves the built-in path; the remaining gap was that real providers
still leaked provider-specific config and runtime assumptions into core.
> - That coupling showed up in config normalization, secret persistence,
capabilities reporting, lease reconstruction, and the board UI form
fields.
> - As long as core knew about those provider-shaped details, shipping a
provider as a pure third-party plugin meant every new provider would
still require host changes.
> - This pull request generalizes the sandbox provider seam around
schema-driven plugin metadata and generic secret-ref handling.
> - The runtime and UI now consume provider metadata generically, so
core only special-cases the built-in fake provider while third-party
providers can live entirely in plugins.

## What Changed

- Added generic sandbox-provider capability metadata so plugin-backed
providers can expose `configSchema` through shared environment support
and the environments capabilities API.
- Reworked sandbox config normalization/persistence/runtime resolution
to handle schema-declared secret-ref fields generically, storing them as
Paperclip secrets and resolving them for probe/execute/release flows.
- Generalized plugin sandbox runtime handling so provider validation,
reusable-lease matching, lease reconstruction, and plugin worker calls
all operate on provider-agnostic config instead of provider-shaped
branches.
- Replaced hardcoded sandbox provider form fields in Company Settings
with schema-driven rendering and blocked agent environment selection
from the built-in fake provider.
- Added regression coverage for the generic seam across shared support
helpers plus environment config, probe, routes, runtime, and
sandbox-provider runtime tests.

## Verification

- `pnpm vitest --run packages/shared/src/environment-support.test.ts
server/src/__tests__/environment-config.test.ts
server/src/__tests__/environment-probe.test.ts
server/src/__tests__/environment-routes.test.ts
server/src/__tests__/environment-runtime.test.ts
server/src/__tests__/sandbox-provider-runtime.test.ts`
- `pnpm -r typecheck`

## Risks

- Plugin sandbox providers now depend more heavily on accurate
`configSchema` declarations; incorrect schemas can misclassify
secret-bearing fields or omit required config.
- Reusable lease matching is now metadata-driven for plugin-backed
providers, so providers that fail to persist stable metadata may
reprovision instead of resuming an existing lease.
- The UI form is now fully schema-driven for plugin-backed sandbox
providers; provider manifests without good defaults or descriptions may
produce a rougher operator experience.

## Model Used

- OpenAI Codex via `codex_local`
- Model ID: `gpt-5.4`
- Reasoning effort: `high`
- Context window observed in runtime session metadata: `258400` tokens
- Capabilities used: terminal tool execution, git, and local code/test
inspection

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 18:03:41 -07:00
Dotta
deba60ebb2 Stabilize serialized server route tests (#4448)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The server route suite is a core confidence layer for auth, issue
context, and workspace runtime behavior
> - Some route tests were doing extra module/server isolation work that
made local runs slower and more fragile
> - The stable Vitest runner also needs to pass server-relative exclude
paths to avoid accidentally re-including serialized suites
> - This pull request tightens route test isolation and runner
serialization behavior
> - The benefit is more reliable targeted and stable-route test
execution without product behavior changes

## What Changed

- Updated `run-vitest-stable.mjs` to exclude serialized server tests
using server-relative paths.
- Forced the server Vitest config to use a single worker in addition to
isolated forks.
- Simplified agent permission route tests to create per-request test
servers without shared server lifecycle state.
- Stabilized issue goal context route mocks by using static mocked
services and a sequential suite.
- Re-registered workspace runtime route mocks before cache-busted route
imports.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/workspace-runtime-routes-authz.test.ts --pool=forks
--poolOptions.forks.isolate=true`
- `node --check scripts/run-vitest-stable.mjs`

## Risks

- Low risk. This is test infrastructure only.
- The stable runner path fix changes which tests are excluded from the
non-serialized server batch, matching the server project root that
Vitest applies internally.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:27:00 -05:00
Dotta
f68e9caa9a Polish markdown external link wrapping (#4447)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board UI renders agent comments, PR links, issue links, and
operational markdown throughout issue threads
> - Long GitHub and external links can wrap awkwardly, leaving icons
orphaned from the text they describe
> - Small inbox visual polish also helps repeated board scanning without
changing behavior
> - This pull request glues markdown link icons to adjacent link
characters and removes a redundant inbox list border
> - The benefit is cleaner, more stable markdown and inbox rendering for
day-to-day operator review

## What Changed

- Added an external-link indicator for external markdown links.
- Kept the GitHub icon attached to the first link character so it does
not wrap onto a separate line.
- Kept the external-link icon attached to the final link character so it
does not wrap away from the URL/text.
- Added markdown rendering regressions for GitHub and external link icon
wrapping.
- Removed the extra border around the inbox list card.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Low risk. The markdown change is limited to link child rendering and
preserves existing href/target/rel behavior.
- Visual-only inbox polish.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:26:13 -05:00
Dotta
73fbdf36db Gate stale-run watchdog decisions by board access (#4446)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The run ledger surfaces stale-run watchdog evaluation issues and
recovery actions
> - Viewer-level board users should be able to inspect status without
getting controls that the server will reject
> - The UI also needs enough board-access context to know when to hide
those decision actions
> - This pull request exposes board memberships in the current board
access snapshot and gates watchdog action controls for known viewer
contexts
> - The benefit is clearer least-privilege UI behavior around recovery
controls

## What Changed

- Included memberships in `/api/cli-auth/me` so the board UI can
distinguish active viewer memberships from operator/admin access.
- Added the stale-run evaluation issue assignee to output silence
summaries.
- Hid stale-run watchdog decision buttons for known non-owner viewer
contexts.
- Surfaced watchdog decision failures through toast and inline error
text.
- Threaded `companyId` through the issue activity run ledger so access
checks are company-scoped.
- Added IssueRunLedger coverage for non-owner viewers.

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`

## Risks

- Medium-low risk. This is a UI gating change backed by existing server
authorization.
- Local implicit and instance-admin board contexts continue to show
watchdog decision controls.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:25:23 -05:00
Dotta
6916e30f8e Cancel stale retries when issue ownership changes (#4445)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue execution is guarded by run locks and bounded retry scheduling
> - A failed run can schedule a retry, but the issue may be reassigned
before that retry becomes due
> - The old assignee's scheduled retry should not continue to hold or
reclaim execution for the issue
> - This pull request cancels stale scheduled retries when ownership
changes and cancels live work when an issue is explicitly cancelled
> - The benefit is cleaner issue handoff semantics and fewer stranded or
incorrect execution locks

## What Changed

- Cancel scheduled retry runs when their issue has been reassigned
before the retry is promoted.
- Clear stale issue execution locks and cancel the associated wakeup
request when a stale retry is cancelled.
- Avoid deferring a new assignee behind a previous assignee's scheduled
retry.
- Cancel an active run when an issue status is explicitly changed to
`cancelled`, while leaving `done` transitions alone.
- Added route and heartbeat regressions for reassignment and
cancellation behavior.

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts --pool=forks
--poolOptions.forks.isolate=true`
  - `issue-comment-reopen-routes.test.ts`: 28 passed.
- `heartbeat-retry-scheduling.test.ts`: skipped by the existing embedded
Postgres host guard (`Postgres init script exited with code null`).
- `pnpm --filter @paperclipai/server typecheck`

## Risks

- Medium risk because this changes heartbeat retry lifecycle behavior.
- The cancellation path is scoped to scheduled retries whose issue
assignee no longer matches the retrying agent, and logs a lifecycle
event for auditability.
- No migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 19:24:13 -05:00
Dotta
0c6961a03e Normalize escaped multiline issue and approval text (#4444)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board and agent APIs accept multiline issue, approval,
interaction, and document text
> - Some callers accidentally send literal escaped newline sequences
like `\n` instead of JSON-decoded line breaks
> - That makes comments, descriptions, documents, and approval notes
render as flattened text instead of readable markdown
> - This pull request centralizes multiline text normalization in shared
validators
> - The benefit is newline-preserving API behavior across issue and
approval workflows without route-specific fixes

## What Changed

- Added a shared `multilineTextSchema` helper that normalizes escaped
`\n`, `\r\n`, and `\r` sequences to real line breaks.
- Applied the helper to issue descriptions, issue update comments, issue
comment bodies, suggested task descriptions, interaction summaries,
issue documents, approval comments, and approval decision notes.
- Added shared validator regressions for issue and approval multiline
inputs.

## Verification

- `pnpm exec vitest run --project @paperclipai/shared
packages/shared/src/validators/approval.test.ts
packages/shared/src/validators/issue.test.ts`
- `pnpm --filter @paperclipai/shared typecheck`

## Risks

- Low risk. This only changes text fields that are explicitly multiline
user/operator content.
- If a caller intentionally wanted literal backslash-n text in these
fields, it will now render as a real line break.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent, tool-enabled with
shell/GitHub/Paperclip API access. Context window was not reported by
the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 18:02:45 -05:00
Dotta
5a0c1979cf [codex] Add runtime lifecycle recovery and live issue visibility (#4419) 2026-04-24 15:50:32 -05:00
Dotta
9a8d219949 [codex] Stabilize tests and local maintenance assets (#4423)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - A fast-moving control plane needs stable local tests and repeatable
local maintenance tools so contributors can safely split and review work
> - Several route suites needed stronger isolation, Codex manual model
selection needed a faster-mode option, and local browser cleanup missed
Playwright's headless shell binary
> - Storybook static output also needed to be preserved as a generated
review artifact from the working branch
> - This pull request groups the test/local-dev maintenance pieces so
they can be reviewed separately from product runtime changes
> - The benefit is more predictable contributor verification and cleaner
local maintenance without mixing these changes into feature PRs

## What Changed

- Added stable Vitest runner support and serialized route/authz test
isolation.
- Fixed workspace runtime authz route mocks and stabilized
Claude/company-import related assertions.
- Allowed Codex fast mode for manually selected models.
- Broadened the agent browser cleanup script to detect
`chrome-headless-shell` as well as Chrome for Testing.
- Preserved generated Storybook static output from the source branch.

## Verification

- `pnpm exec vitest run
src/__tests__/workspace-runtime-routes-authz.test.ts
src/__tests__/claude-local-execute.test.ts --config vitest.config.ts`
from `server/` passed: 2 files, 19 tests.
- `pnpm exec vitest run src/server/codex-args.test.ts --config
vitest.config.ts` from `packages/adapters/codex-local/` passed: 1 file,
3 tests.
- `bash -n scripts/kill-agent-browsers.sh &&
scripts/kill-agent-browsers.sh --dry` passed; dry-run detected
`chrome-headless-shell` processes without killing them.
- `test -f ui/storybook-static/index.html && test -f
ui/storybook-static/assets/forms-editors.stories-Dry7qwx2.js` passed.
- `git diff --check public-gh/master..pap-2228-test-local-maintenance --
. ':(exclude)ui/storybook-static'` passed.
- `pnpm exec vitest run
cli/src/__tests__/company-import-export-e2e.test.ts --config
cli/vitest.config.ts` did not complete in the isolated split worktree
because `paperclipai run` exited during build prep with `TS2688: Cannot
find type definition file for 'react'`; this appears to be caused by the
worktree dependency symlink setup, not the code under test.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: the stable Vitest runner changes how route/authz tests
are scheduled.
- Generated `ui/storybook-static` files are large and contain minified
third-party output; `git diff --check` reports whitespace inside those
generated assets, so reviewers may choose to drop or regenerate that
artifact before merge.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable to source UI behavior;
the included Storybook static output is generated artifact preservation
from the source branch.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 15:11:42 -05:00
Devin Foley
70679a3321 Add sandbox environment support (#4415)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The environment/runtime layer decides where agent work executes and
how the control plane reaches those runtimes.
> - Today Paperclip can run locally and over SSH, but sandboxed
execution needs a first-class environment model instead of one-off
adapter behavior.
> - We also want sandbox providers to be pluggable so the core does not
hardcode every provider implementation.
> - This branch adds the Sandbox environment path, the provider
contract, and a deterministic fake provider plugin.
> - That required synchronized changes across shared contracts, plugin
SDK surfaces, server runtime orchestration, and the UI
environment/workspace flows.
> - The result is that sandbox execution becomes a core control-plane
capability while keeping provider implementations extensible and
testable.

## What Changed

- Added sandbox runtime support to the environment execution path,
including runtime URL discovery, sandbox execution targeting,
orchestration, and heartbeat integration.
- Added plugin-provider support for sandbox environments so providers
can be supplied via plugins instead of hardcoded server logic.
- Added the fake sandbox provider plugin with deterministic behavior
suitable for local and automated testing.
- Updated shared types, validators, plugin protocol definitions, and SDK
helpers to carry sandbox provider and workspace-runtime contracts across
package boundaries.
- Updated server routes and services so companies can create sandbox
environments, select them for work, and execute work through the sandbox
runtime path.
- Updated the UI environment and workspace surfaces to expose sandbox
environment configuration and selection.
- Added test coverage for sandbox runtime behavior, provider seams,
environment route guards, orchestration, and the fake provider plugin.

## Verification

- Ran locally before the final fixture-only scrub:
  - `pnpm -r typecheck`
  - `pnpm test:run`
  - `pnpm build`
- Ran locally after the final scrub amend:
  - `pnpm vitest run server/src/__tests__/runtime-api.test.ts`
- Reviewer spot checks:
  - create a sandbox environment backed by the fake provider plugin
  - run work through that environment
- confirm sandbox provider execution does not inherit host secrets
implicitly

## Risks

- This touches shared contracts, plugin SDK plumbing, server runtime
orchestration, and UI environment/workspace flows, so regressions would
likely show up as cross-layer mismatches rather than isolated type
errors.
- Runtime URL discovery and sandbox callback selection are sensitive to
host/bind configuration; if that logic is wrong, sandbox-backed
callbacks may fail even when execution succeeds.
- The fake provider plugin is intentionally deterministic and
test-oriented; future providers may expose capability gaps that this
branch does not yet cover.

## Model Used

- OpenAI Codex coding agent on a GPT-5-class backend in the
Paperclip/Codex harness. Exact backend model ID is not exposed
in-session. Tool-assisted workflow with shell execution, file editing,
git history inspection, and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 12:15:53 -07:00
Dotta
641eb44949 [codex] Harden create-agent skill governance (#4422)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring agents is a governance-sensitive workflow because it grants
roles, adapter config, skills, and execution capability
> - The create-agent skill needs explicit templates and review guidance
so hires are auditable and not over-permissioned
> - Skill sync also needs to recognize bundled Paperclip skills
consistently for Codex local agents
> - This pull request expands create-agent role templates, adds a
security-engineer template, and documents capability/secret-handling
review requirements
> - The benefit is safer, more repeatable agent creation with clearer
approval payloads and less permission sprawl

## What Changed

- Expanded `paperclip-create-agent` guidance for template selection,
adjacent-template drafting, and role-specific review bars.
- Added a Security Engineer agent template and collaboration/safety
sections for Coder, QA, and UX Designer templates.
- Hardened draft-review guidance around desired skills, external-system
access, secrets, and confidential advisory handling.
- Updated LLM agent-configuration guidance to point hiring workflows at
the create-agent skill.
- Added tests for bundled skill sync, create-agent skill injection, hire
approval payloads, and LLM route guidance.

## Verification

- `pnpm exec vitest run server/src/__tests__/agent-skills-routes.test.ts
server/src/__tests__/codex-local-skill-injection.test.ts
server/src/__tests__/codex-local-skill-sync.test.ts
server/src/__tests__/llms-routes.test.ts
server/src/__tests__/paperclip-skill-utils.test.ts --config
server/vitest.config.ts` passed: 5 files, 23 tests.
- `git diff --check public-gh/master..pap-2228-create-agent-governance
-- . ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this primarily changes skills/docs and tests, but
it affects future hiring guidance and approval expectations.
- Reviewers should check whether the new Security Engineer template is
too broad for default company installs.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshot checklist item is not applicable; this PR changes
skills, docs, and server tests.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:15:28 -05:00
Dotta
77a72e28c2 [codex] Polish issue composer and long document display (#4420)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue comments and documents are the main working surface where
operators and agents collaborate
> - File drops, markdown editing, and long issue descriptions need to
feel predictable because they sit directly in the task execution loop
> - The composer had edge cases around drag targets, attachment
feedback, image drops, and long markdown content crowding the page
> - This pull request polishes the issue composer, hardens markdown
editor regressions, and adds a fold curtain for long issue
descriptions/documents
> - The benefit is a calmer issue detail surface that handles uploads
and long work products without hiding state or breaking layout

## What Changed

- Scoped issue-composer drag/drop behavior so the composer owns file
drops without turning the whole thread into a competing drop target.
- Added clearer attachment upload feedback for non-image files and
image-drop stability coverage.
- Hardened markdown editor and markdown body handling around HTML-like
tag regressions.
- Added `FoldCurtain` and wired it into issue descriptions and issue
documents so long markdown previews can expand/collapse.
- Added Storybook coverage for the fold curtain state.

## Verification

- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownEditor.test.tsx
ui/src/components/MarkdownBody.test.tsx --config ui/vitest.config.ts`
passed: 3 files, 75 tests.
- `git diff --check public-gh/master..pap-2228-editor-composer-polish --
. ':(exclude)ui/storybook-static'` passed.
- Confirmed this PR does not include `pnpm-lock.yaml`.

## Risks

- Low-to-medium risk: this changes user-facing composer/drop behavior
and long markdown display.
- The fold curtain uses DOM measurement and `ResizeObserver`; reviewers
should check browser behavior for very long descriptions and documents.
- No database migrations.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, with shell, git, Paperclip
API, and GitHub CLI tool use in the local Paperclip workspace.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note: screenshots were not newly captured during branch splitting; the
UI states are covered by component tests and a Storybook story.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 14:12:41 -05:00
Dotta
8f1cd0474f [codex] Improve transient recovery and Codex model refresh (#4383)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapter execution and retry classification decide whether agent work
pauses, retries, or recovers automatically
> - Transient provider failures need to be classified precisely so
Paperclip does not convert retryable upstream conditions into false hard
failures
> - At the same time, operators need an up-to-date model list for
Codex-backed agents and prompts should nudge agents toward targeted
verification instead of repo-wide sweeps
> - This pull request tightens transient recovery classification for
Claude and Codex, updates the agent prompt guidance, and adds Codex
model refresh support end-to-end
> - The benefit is better automatic retry behavior plus fresher
operator-facing model configuration

## What Changed

- added Codex usage-limit retry-window parsing and Claude extra-usage
transient classification
- normalized the heartbeat transient-recovery contract across adapter
executions and heartbeat scheduling
- documented that deferred comment wakes only reopen completed issues
for human/comment-reopen interactions, while system follow-ups leave
closed work closed
- updated adapter-utils prompt guidance to prefer targeted verification
- added Codex model refresh support in the server route, registry,
shared types, and agent config form
- added adapter/server tests covering the new parsing, retry scheduling,
and model-refresh behavior

## Verification

- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-claude-local
packages/adapters/claude-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-codex-local
packages/adapters/codex-local/src/server/parse.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/adapter-model-refresh-routes.test.ts
server/src/__tests__/adapter-models.test.ts
server/src/__tests__/claude-local-execute.test.ts
server/src/__tests__/codex-local-execute.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-retry-scheduling.test.ts`

## Risks

- Moderate behavior risk: retry classification affects whether runs
auto-recover or block, so mistakes here could either suppress needed
retries or over-retry real failures
- Low workflow risk: deferred comment wake reopening is intentionally
scoped to human/comment-reopen interactions so system follow-ups do not
revive completed issues unexpectedly

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:40:40 -05:00
Dotta
4fdbbeced3 [codex] Refine markdown issue reference rendering (#4382)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Task references are a core part of how operators understand issue
relationships across the UI
> - Those references appear both in markdown bodies and in sidebar
relationship panels
> - The rendering had drifted between surfaces, and inline markdown
pills were reading awkwardly inside prose and lists
> - This pull request unifies the underlying issue-reference treatment,
routes issue descriptions through `MarkdownBody`, and switches inline
markdown references to a cleaner text-link presentation
> - The benefit is more consistent issue-reference UX with better
readability in markdown-heavy views

## What Changed

- unified sidebar and markdown issue-reference rendering around the
shared issue-reference components
- routed resting issue descriptions through `MarkdownBody` so
description previews inherit the richer issue-reference treatment
- replaced inline markdown pill chrome with a cleaner inline reference
presentation for prose contexts
- added and updated UI tests for `MarkdownBody` and `InlineEditor`

## Verification

- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/MarkdownBody.test.tsx
ui/src/components/InlineEditor.test.tsx`

## Risks

- Moderate UI risk: issue-reference rendering now differs intentionally
between inline markdown and relationship sidebars, so regressions would
show up as styling or hover-preview mismatches

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 09:39:21 -05:00
Dotta
7ad225a198 [codex] Improve issue thread review flow (#4381)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Issue detail is where operators coordinate review, approvals, and
follow-up work with active runs
> - That thread UI needs to surface blockers, descendants, review
handoffs, and reply ergonomics clearly enough for humans to guide agent
work
> - Several small gaps in the issue-thread flow were making review and
navigation clunkier than necessary
> - This pull request improves the reply composer, descendant/blocker
presentation, interaction folding, and review-request handoff plumbing
together as one cohesive issue-thread workflow slice
> - The benefit is a cleaner operator review loop without changing the
broader task model

## What Changed

- restored and refined the floating reply composer behavior in the issue
thread
- folded expired confirmation interactions and improved post-submit
thread scrolling behavior
- surfaced descendant issue context and inline blocker/paused-assignee
notices on the issue detail view
- tightened large-board first paint behavior in `IssuesList`
- added loose review-request handoffs through the issue
execution-policy/update path and covered them with tests

## Verification

- `pnpm vitest run ui/src/pages/IssueDetail.test.tsx`
- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-execution-policy.test.ts`
- `pnpm exec vitest run --project @paperclipai/ui
ui/src/components/IssueChatThread.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/IssuesList.test.tsx ui/src/lib/issue-tree.test.ts
ui/src/api/issues.test.ts`
- `pnpm exec vitest run --project @paperclipai/adapter-utils
packages/adapter-utils/src/server-utils.test.ts`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-comment-reopen-routes.test.ts -t "coerces
executor handoff patches into workflow-controlled review wakes|wakes the
return assignee with execution_changes_requested"`
- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/issue-execution-policy.test.ts
server/src/__tests__/issues-service.test.ts`

## Visual Evidence

- UI layout changes are covered by the focused issue-thread component
and issue-detail tests listed above. Browser screenshots were not
attachable from this automated greploop environment, so reviewers should
use the running preview for final visual confirmation.

## Risks

- Moderate UI-flow risk: these changes touch the issue detail experience
in multiple spots, so regressions would most likely show up as
thread-layout quirks or incorrect review-handoff behavior

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots or documented the visual verification path
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-24 08:02:45 -05:00
Dotta
35a9dc37b0 [codex] Speed up company skill detail loading (#4380)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Company skills are part of the control plane for distributing
reusable capabilities
> - Board flows that inspect company skill detail should stay responsive
because they are operator-facing control-plane reads
> - The existing detail path was doing broader work than needed for the
specific detail screen
> - This pull request narrows that company-skill detail loading path and
adds a regression test around it
> - The benefit is faster company skill detail reads without changing
the external API contract

## What Changed

- tightened the company-skill detail loading path in
`server/src/services/company-skills.ts`
- added `server/src/__tests__/company-skills-detail.test.ts` to verify
the detail route only pulls the required data

## Verification

- `pnpm exec vitest run --project @paperclipai/server
server/src/__tests__/company-skills-detail.test.ts`

## Risks

- Low risk: this only changes the company-skill detail query path, but
any missed assumption in the detail consumer would surface when loading
that screen

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex GPT-5-based coding agent with tool use and code execution
in the Codex CLI environment

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-24 07:37:13 -05:00
Devin Foley
e4995bbb1c Add SSH environment support (#4358)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The environments subsystem already models execution environments,
but before this branch there was no end-to-end SSH-backed runtime path
for agents to actually run work against a remote box
> - That meant agents could be configured around environment concepts
without a reliable way to execute adapter sessions remotely, sync
workspace state, and preserve run context across supported adapters
> - We also need environment selection to participate in normal
Paperclip control-plane behavior: agent defaults, project/issue
selection, route validation, and environment probing
> - Because this capability is still experimental, the UI surface should
be easy to hide and easy to remove later without undoing the underlying
implementation
> - This pull request adds SSH environment execution support across the
runtime, adapters, routes, schema, and tests, then puts the visible
environment-management UI behind an experimental flag
> - The benefit is that we can validate real SSH-backed agent execution
now while keeping the user-facing controls safely gated until the
feature is ready to come out of experimentation

## What Changed

- Added SSH-backed execution target support in the shared adapter
runtime, including remote workspace preparation, skill/runtime asset
sync, remote session handling, and workspace restore behavior after
runs.
- Added SSH execution coverage for supported local adapters, plus remote
execution tests across Claude, Codex, Cursor, Gemini, OpenCode, and Pi.
- Added environment selection and environment-management backend support
needed for SSH execution, including route/service work, validation,
probing, and agent default environment persistence.
- Added CLI support for SSH environment lab verification and updated
related docs/tests.
- Added the `enableEnvironments` experimental flag and gated the
environment UI behind it on company settings, agent configuration, and
project configuration surfaces.

## Verification

- `pnpm exec vitest run
packages/adapters/claude-local/src/server/execute.remote.test.ts
packages/adapters/cursor-local/src/server/execute.remote.test.ts
packages/adapters/gemini-local/src/server/execute.remote.test.ts
packages/adapters/opencode-local/src/server/execute.remote.test.ts
packages/adapters/pi-local/src/server/execute.remote.test.ts`
- `pnpm exec vitest run server/src/__tests__/environment-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/instance-settings-routes.test.ts`
- `pnpm exec vitest run ui/src/lib/new-agent-hire-payload.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- `pnpm -r typecheck`
- `pnpm build`
- Manual verification on a branch-local dev server:
  - enabled the experimental flag
  - created an SSH environment
  - created a Linux Claude agent using that environment
- confirmed a run executed on the Linux box and synced workspace changes
back

## Risks

- Medium: this touches runtime execution flow across multiple adapters,
so regressions would likely show up in remote session setup, workspace
sync, or environment selection precedence.
- The UI flag reduces exposure, but the underlying runtime and route
changes are still substantial and rely on migration correctness.
- The change set is broad across adapters, control-plane services,
migrations, and UI gating, so review should pay close attention to
environment-selection precedence and remote workspace lifecycle
behavior.

## Model Used

- OpenAI Codex via Paperclip's local Codex adapter, GPT-5-class coding
model with tool use and code execution in the local repo workspace. The
local adapter does not surface a more specific public model version
string in this branch workflow.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 19:15:22 -07:00
Dotta
f98c348e2b [codex] Add issue subtree pause, cancel, and restore controls (#4332)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - This branch extends the issue control-plane so board operators can
pause, cancel, and later restore whole issue subtrees while keeping
descendant execution and wake behavior coherent.
> - That required new hold state in the database, shared contracts,
server routes/services, and issue detail UI controls so subtree actions
are durable and auditable instead of ad hoc.
> - While this branch was in flight, `master` advanced with new
environment lifecycle work, including a new `0065_environments`
migration.
> - Before opening the PR, this branch had to be rebased onto
`paperclipai/paperclip:master` without losing the existing
subtree-control work or leaving conflicting migration numbering behind.
> - This pull request rebases the subtree pause/cancel/restore feature
cleanly onto current `master`, renumbers the hold migration to
`0066_issue_tree_holds`, and preserves the full branch diff in a single
PR.
> - The benefit is that reviewers get one clean, mergeable PR for the
subtree-control feature instead of stale branch history with migration
conflicts.

## What Changed

- Added durable issue subtree hold data structures, shared
API/types/validators, server routes/services, and UI flows for subtree
pause, cancel, and restore operations.
- Added server and UI coverage for subtree previewing, hold
creation/release, dependency-aware scheduling under holds, and issue
detail subtree controls.
- Rebased the branch onto current `paperclipai/paperclip:master` and
renumbered the branch migration from `0065_issue_tree_holds` to
`0066_issue_tree_holds` so it no longer conflicts with upstream
`0065_environments`.
- Added a small follow-up commit that makes restore requests return `200
OK` explicitly while keeping pause/cancel hold creation at `201
Created`, and updated the route test to match that contract.

## Verification

- `pnpm --filter @paperclipai/db typecheck`
- `pnpm --filter @paperclipai/shared typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `cd server && pnpm exec vitest run
src/__tests__/issue-tree-control-routes.test.ts
src/__tests__/issue-tree-control-service.test.ts
src/__tests__/issue-tree-control-service-unit.test.ts
src/__tests__/heartbeat-dependency-scheduling.test.ts`
- `cd ui && pnpm exec vitest run src/components/IssueChatThread.test.tsx
src/pages/IssueDetail.test.tsx`

## Risks

- This is a broad cross-layer change touching DB/schema, shared
contracts, server orchestration, and UI; regressions are most likely
around subtree status restoration or wake suppression/resume edge cases.
- The migration was renumbered during PR prep to avoid the new upstream
`0065_environments` conflict. Reviewers should confirm the final
`0066_issue_tree_holds` ordering is the only hold-related migration that
lands.
- The issue-tree restore endpoint now responds with `200` instead of
relying on implicit behavior, which is semantically better for a restore
operation but still changes an API detail that clients or tests could
have assumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent in the Paperclip Codex runtime (GPT-5-class
tool-using coding model; exact deployment ID/context window is not
exposed inside this session).

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-23 14:51:46 -05:00
Russell Dempsey
854fa81757 fix(pi-local): prepend installed skill bin/ dirs to child PATH (#4331)
## Thinking Path

> - Paperclip orchestrates AI agents; each agent runs under an adapter
that spawns a model CLI as a child process.
> - The pi-local adapter (`packages/adapters/pi-local`) spawns `pi` and
inherits the child's shell environment — including `PATH`, which
determines what the child's bash tool can execute by name.
> - Paperclip skills ship executable helpers under `<skill>/bin/` (e.g.
`paperclip-get-issue`) and Reviewer/QA-style `AGENTS.md` files invoke
them by name via the agent's bash tool.
> - Pi-local builds its runtime env with `ensurePathInEnv({
...process.env, ...env })` only — it never adds the installed skills'
`bin/` dirs to PATH. The pi CLI's `--skill` arg loads each skill's
SKILL.md but does not augment PATH.
> - Consequence: every bash invocation of a skill helper fails with
`exit 127: command not found`. The agent then spends its heartbeat
guessing (re-reading SKILL.md, trying `find`, inventing command paths)
and either times out or gives up.
> - This PR prepends each injected skill's `bin/` directory to the child
PATH immediately before runtimeEnv is constructed.
> - The benefit: pi_local agents whose AGENTS.md uses any `paperclip-*`
skill helper can actually run those helpers.

## What Changed

- `packages/adapters/pi-local/src/server/execute.ts`: compute
`skillBinDirs` from the already-resolved `piSkillEntries`, dedupe
against the existing PATH, prepend them to whichever of `PATH` / `Path`
the merged env uses, then build `runtimeEnv`. No new helpers, no
adapter-utils changes.

## Verification

Manual repro before the fix:

1. Create a pi_local agent wired to a paperclip skill (e.g.
paperclip-control).
2. Wake the agent on an in_review issue with an AGENTS.md that starts
with `paperclip-get-issue "$PAPERCLIP_TASK_ID"`.
3. Session file: `{ "role": "toolResult", "isError": true, "content": [{
"text": "/bin/bash: paperclip-get-issue: command not found\n\nCommand
exited with code 127" }] }`.

After the fix: same wake; `paperclip-get-issue` resolves and returns the
issue JSON; agent proceeds.

Local commands:

```
pnpm --filter @paperclipai/adapter-pi-local typecheck   # clean
pnpm --filter @paperclipai/adapter-pi-local build       # clean
pnpm --filter @paperclipai/server exec vitest run \
  src/__tests__/pi-local-execute.test.ts \
  src/__tests__/pi-local-adapter-environment.test.ts \
  src/__tests__/pi-local-skill-sync.test.ts
# 5/5 passing
```

No new tests: the existing `pi-local-skill-sync.test.ts` covers skill
symlink injection (upstream of the PATH step), and
`pi-local-execute.test.ts` covers the spawn path; this change only
augments env on the same spawn path.

## Risks

Low. Pure PATH augmentation on the child env. Edge cases:

- Zero skills installed → no PATH change (guarded by
`skillBinDirs.length > 0`).
- Duplicate bin dirs already on PATH → deduped; no pollution on re-runs.
- Windows `Path` casing → falls back correctly when merged env uses
`Path` instead of `PATH`.
- Skill dir without `bin/` subdir → joined path simply won't resolve;
harmless.

No behavioral change for pi_local agents that don't use skill-provided
commands.

## Model Used

- Claude, `claude-opus-4-7` (1M context), extended thinking enabled,
tool use enabled. Walked pi-local/cursor-local/claude-local and
adapter-utils to isolate the gap, wrote the inlined fix, and ran
typecheck/build/test locally.

## Checklist

- [x] Thinking path from project context to this change
- [x] Model used specified
- [x] Checked ROADMAP.md — no overlap
- [x] Tests run locally, passing
- [x] Tests added — new case in
`server/src/__tests__/pi-local-execute.test.ts`; verified it fails when
the fix is reverted
- [ ] UI screenshots — N/A (backend adapter change)
- [x] Docs updated — N/A (internal adapter, no user-facing docs)
- [x] Risks documented
- [x] Will address reviewer comments before merge
2026-04-23 10:15:10 -05:00
Dotta
fe14de504c [codex] Document README architecture systems (#4250)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - The public README is the first place many operators and contributors
learn what the product already includes.
> - The existing README explained the product promise but did not give a
compact, concrete tour of the major systems behind it.
> - This made Paperclip easier to underestimate as a wrapper around
agents instead of a full control plane with identity, work, execution,
governance, budgets, plugins, and portability.
> - This pull request adds an under-the-hood README section that names
those systems and shows how adapters connect into the server.
> - Greptile caught consistency gaps between the diagram and prose, so
the final version aligns the system labels and adapter examples across
both surfaces.
> - The benefit is a clearer first-read model of Paperclip's
architecture and shipped capabilities without changing runtime behavior.

## What Changed

- Added a `What's Under the Hood` section to `README.md`.
- Added an ASCII architecture diagram for the Paperclip server and
external agent adapters.
- Added a systems table covering identity, org charts, tasks, heartbeat
execution, workspaces, governance, budgets, routines, plugins,
secrets/storage, activity/events, and company portability.
- Addressed Greptile feedback by aligning diagram labels with table rows
and grouping adapter examples consistently.

## Verification

- `git diff --check public-gh/master...HEAD`
- Attempted `pnpm exec prettier --check README.md`, but this checkout
does not expose a `prettier` binary through `pnpm exec`.
- Greptile review rerun passed after addressing its two comments; review
threads are resolved.
- Remote PR checks passed on the latest head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, and `Greptile Review`.
- Not run locally: Vitest/build suites, because this is a README-only
documentation change and the PR's remote `verify` job ran typecheck,
tests, build, and release canary dry run.

## Risks

- Low runtime risk: documentation-only change.
- The main risk is wording drift if the README overstates or
underspecifies evolving product capabilities; the section was aligned
against the current product/spec docs and roadmap.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex / GPT-5 coding agent in a Paperclip heartbeat, with shell
and GitHub CLI tool use. Exact runtime model identifier and context
window were not exposed by the adapter.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-23 09:48:19 -05:00
Michel Tomas
3d15798c22 fix(adapters/routes): apply resolveExternalAdapterRegistration on hot-install (#4324)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The external adapter plugin system (#2218) lets adapters ship as npm
modules loaded via `server/src/adapters/plugin-loader.ts`; since #4296
merged, each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) and have it preserved through the init-time
load via the new `resolveExternalAdapterRegistration` helper
> - #4296 fixed the init-time IIFE path at
`server/src/adapters/registry.ts:363-369` but noted that the hot-install
path at `server/src/routes/adapters.ts:174
registerWithSessionManagement` still unconditionally overwrites
module-provided `sessionManagement` during `POST /api/adapters/install`
> - Practical impact today: an external adapter installed via the API
needs a Paperclip restart before its declared `sessionManagement` takes
effect — the IIFE runs on next boot and preserves it, but until then the
hot-install overwrite wins
> - This PR closes that parity gap: `registerWithSessionManagement`
delegates to the same `resolveExternalAdapterRegistration` helper
introduced by #4296, unifying both load paths behind one resolver
> - The benefit is consistent behaviour between cold-start and
hot-install: no "install then restart" ritual; declared
`sessionManagement` on an external module is honoured the moment `POST
/api/adapters/install` returns 201

## What Changed

- `server/src/routes/adapters.ts`: `registerWithSessionManagement`
delegates to the exported `resolveExternalAdapterRegistration` helper
(added in #4296). Honours module-provided `sessionManagement` first,
falls back to host registry lookup, defaults `undefined`. Updated the
section comment to document the parity-with-IIFE intent.
- `server/src/routes/adapters.ts`: dropped the now-unused
`getAdapterSessionManagement` import.
- `server/src/adapters/registry.ts`: updated the JSDoc on
`resolveExternalAdapterRegistration` — previously said "Exported for
unit tests; runtime callers use the IIFE below", now says the helper is
used by both the init-time IIFE and the hot-install path in
`routes/adapters.ts`. Addresses Greptile C1.
- `server/src/__tests__/adapter-routes.test.ts`: new integration test —
installs a mocked external adapter module carrying a non-trivial
`sessionManagement` declaration and asserts
`findServerAdapter(type).sessionManagement` preserves it after `POST
/api/adapters/install` returns 201.
- `server/src/__tests__/adapter-routes.test.ts`: added
`findServerAdapter` to the shared test-scope variable set so the new
test can inspect post-install registry state.

## Verification

Targeted test runs from a clean tree on
`fix/external-session-management-hot-install` (rebased onto current
`upstream/master` now that #4296 has merged):

- `pnpm test server/src/__tests__/adapter-routes.test.ts` — 6 passed
(new test + 5 pre-existing)
- `pnpm test server/src/__tests__/adapter-registry.test.ts` — 15 passed
(ensures the IIFE path from #4296 continues to behave correctly)
- `pnpm -w run test` full workspace suite — 1923 passed / 1 skipped
(unrelated skip)

End-to-end smoke on file:
[`@superbiche/cline-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/cline-paperclip-adapter)
and
[`@superbiche/qwen-paperclip-adapter@0.1.1`](https://www.npmjs.com/package/@superbiche/qwen-paperclip-adapter),
both public on npm, both declare `sessionManagement`. With this PR in
place, the "restart after install" step disappears — the declared
compaction policy is active immediately after the install response.

## Risks

- Low risk. The change replaces an inline mutation with a call to a
helper that already has dedicated unit coverage (#4296 added three tests
for `resolveExternalAdapterRegistration` covering module-provided,
registry-fallback, and undefined paths). Behaviour is a strict superset
of the prior path — externals that did not declare `sessionManagement`
continue to get the hardcoded-registry lookup; externals that did
declare it now have those values preserved instead of overwritten.
- No migration impact. The stored plugin records
(`~/.paperclip/adapter-plugins.json`) are unchanged. Existing
hot-installed adapters behave correctly before and after.
- No behavioural change for builtin adapters; they hit
`registerServerAdapter` directly and never flow through
`registerWithSessionManagement`.

## Model Used

- Provider and model: Claude (Anthropic) via Claude Code
- Model ID: `claude-opus-4-7` (1M context)
- Reasoning mode: standard (no extended thinking on this PR)
- Tool use: yes — file edits, subprocess invocations for
builds/tests/git via the Claude Code harness

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A — server-only change)
- [x] I have updated relevant documentation to reflect my changes (the
JSDoc on `resolveExternalAdapterRegistration` and the section comment
above `registerWithSessionManagement` now document the parity-with-IIFE
intent)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 09:45:24 -05:00
Michel Tomas
24232078fd fix(adapters/registry): honor module-provided sessionManagement for external adapters (#4296)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Adapters are how paperclip hands work off to specific agent
runtimes; since #2218, external adapter packages can ship as npm modules
loaded via `server/src/adapters/plugin-loader.ts`
> - Each `ServerAdapterModule` can declare `sessionManagement`
(`supportsSessionResume`, `nativeContextManagement`,
`defaultSessionCompaction`) — but the init-time load at
`registry.ts:363-369` hard-overwrote it with a hardcoded-registry lookup
that has no entries for external types, so modules could not actually
set these fields
> - The hot-install path at `routes/adapters.ts:179` →
`registerServerAdapter` preserves module-provided `sessionManagement`,
so externals worked after `POST /api/adapters/install` — *until the next
server restart*, when the init-time IIFE wiped it back to `undefined`
> - #2218 explicitly deferred this: *"Adapter execution model, heartbeat
protocol, and session management are untouched."* This PR is the natural
follow-up for session management on the plugin-loader path
> - This PR aligns init-time registration with the hot-install path:
honor module-provided `sessionManagement` first, fall back to the
hardcoded registry when absent (so externals overriding a built-in type
still inherit its policy). Extracted as a testable helper with three
unit tests
> - The benefit is external adapters can declare session-resume
capabilities consistently across cold-start and hot-install, without
requiring upstream additions to the hardcoded registry for each new
plugin

## What Changed

- `server/src/adapters/registry.ts`: extracted the merge logic into a
new exported helper `resolveExternalAdapterRegistration()` — honors
module-provided `sessionManagement` first, falls back to
`getAdapterSessionManagement(type)`, else `undefined`. The init-time
IIFE calls the helper instead of inlining an overwrite.
- `server/src/adapters/registry.ts`: updated the section comment (lines
331–340) to reflect the new semantics and cross-reference the
hot-install path's behavior.
- `server/src/__tests__/adapter-registry.test.ts`: new
`describe("resolveExternalAdapterRegistration")` block with three tests
— module-provided value preserved, registry fallback when module omits,
`undefined` when neither provides.

## Verification

Targeted test run from a clean tree on
`fix/external-session-management`:

```
cd server && pnpm exec vitest run src/__tests__/adapter-registry.test.ts
# 1 test file, 15 tests passed, 0 failed (12 pre-existing + 3 new)
```

Full server suite via the independent review pass noted under Model
Used: **1,156 tests passed, 0 failed**.

Typecheck note: `pnpm --filter @paperclipai/server exec tsc --noEmit`
surfaces two errors in `src/services/plugin-host-services.ts:1510`
(`createInteraction` + implicit-any). Verified by `git stash` + re-run
on clean `upstream/master` — they reproduce without this PR's changes.
Pre-existing, out of scope.

## Risks

- **Low behavioral risk.** Strictly additive: externals that do NOT
provide `sessionManagement` continue to receive exactly the same value
as before (registry lookup → `undefined` for pure externals, or the
builtin's entry for externals overriding a built-in type). Only a new
capability is unlocked; no existing behavior changes for existing
adapters.
- **No breaking change.** `ServerAdapterModule.sessionManagement` was
already optional at the type level. Externals that never set it see no
difference on either path.
- **Consistency verified.** Init-time IIFE now matches the post-`POST
/api/adapters/install` behavior — a server restart no longer regresses
the field.

## Note

This is part of a broader effort to close the parity gap between
external and built-in adapters. Once externals reach 1:1 capability
coverage with internals, new-adapter contributions can increasingly be
steered toward the external-plugin path instead of the core product — a
trajectory CONTRIBUTING.md already encourages ("*If the idea fits as an
extension, prefer building it with the plugin system*").

## Model Used

- **Provider**: Anthropic
- **Model**: Claude Opus 4.7
- **Exact model ID**: `claude-opus-4-7` (1M-context variant:
`claude-opus-4-7[1m]`)
- **Context window**: 1,000,000 tokens
- **Harness**: Claude Code (Anthropic's official CLI), orchestrated by
@superbiche as human-in-the-loop. Full file-editing, shell, and `gh`
tool use, plus parallel research subagents for fact-finding against
paperclip internals (plugin-loader contract, sessionCodec reachability,
UI parser surface, Cline CLI JSON schema).
- **Independent local review**: Gemini 3.1 Pro (Google) performed a
separate verification pass on the committed branch — confirmed the
approach & necessity, ran the full workspace build, and executed the
complete server test suite (1,156 tests, all passing). Not used for
authoring; second-opinion pass only.
- **Authoring split**: @superbiche identified the gap (while mapping the
external-adapter surface for a downstream adapter build) and shaped the
plan — categorising the surface into `works / acceptable /
needs-upstream` buckets, directing the surgical-diff approach on a fresh
branch from `upstream/master`, and calling the framing ("alignment bug
between init-time IIFE and hot-install path" rather than "missing
capability"). Opus 4.7 executed the fact-finding, the diff, the tests,
and drafted this PR body — all under direct review.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work (convention-aligned bug fix on the external-adapter
plugin path introduced by #2218)
- [x] I have run tests locally and they pass (15/15 in the touched file;
1,156/1,156 full server suite via the independent Gemini 3.1 Pro review)
- [x] I have added tests where applicable (3 new for the extracted
helper)
- [x] If this change affects the UI, I have included before/after
screenshots (no UI touched)
- [x] I have updated relevant documentation to reflect my changes
(in-file comment reflects new semantics)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-23 07:39:43 -05:00
Devin Foley
13551b2bac Add local environment lifecycle (#4297)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Every heartbeat run needs a concrete place where the agent's adapter
process executes.
> - Today that execution location is implicitly the local machine, which
makes it hard to track, audit, and manage as a first-class runtime
concern.
> - The first step is to represent the current local execution path
explicitly without changing how users experience agent runs.
> - This pull request adds core Environment and Environment Lease
records, then routes existing local heartbeat execution through a
default `Local` environment.
> - The benefit is that local runs remain behavior-preserving while the
system now has durable environment identity, lease lifecycle tracking,
and activity records for execution placement.

## What Changed

- Added `environments` and `environment_leases` database tables, schema
exports, and migration `0065_environments.sql`.
- Added shared environment constants, TypeScript types, and validators
for environment drivers, statuses, lease policies, lease statuses, and
cleanup states.
- Added `environmentService` for listing, reading, creating, updating,
and ensuring company-scoped environments.
- Added environment lease lifecycle operations for acquire, metadata
update, single-lease release, and run-wide release.
- Updated heartbeat execution to lazily ensure a company-scoped default
`Local` environment before adapter execution.
- Updated heartbeat execution to acquire an ephemeral local environment
lease, write `paperclipEnvironment` into the run context snapshot, and
release active leases during run finalization.
- Added activity log events for environment lease acquisition and
release.
- Added tests for environment service behavior and the local heartbeat
environment lifecycle.
- Added a CI-follow-up heartbeat guard so deferred issue comment wakes
are promoted before automatic missing-comment retries, with focused
batching test coverage.

## Verification

Local verification run for this branch:

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm exec vitest run server/src/__tests__/environment-service.test.ts
server/src/__tests__/heartbeat-local-environment.test.ts --pool=forks`

Additional reviewer/CI verification:

- Confirm `pnpm-lock.yaml` is not modified.
- Confirm `pnpm test:run` passes in CI.
- Confirm `PAPERCLIP_E2E_SKIP_LLM=true pnpm run test:e2e` passes in CI.
- Confirm a local heartbeat run creates one active `Local` environment
when needed, records one lease for the run, releases the lease when the
run finishes, and includes `paperclipEnvironment` in the run context
snapshot.

Screenshots: not applicable; this PR has no UI changes.

## Risks

- Migration risk: introduces two new tables and a new migration journal
entry. Review should verify company scoping, indexes, foreign keys, and
enum defaults are correct.
- Lifecycle risk: heartbeat finalization now releases environment leases
in addition to existing runtime cleanup. A finalization bug could leave
stale active leases or mark a failed run's lease incorrectly.
- Behavior-preservation risk: local adapter execution should remain
unchanged apart from environment bookkeeping. Review should pay
attention to the heartbeat path around context snapshot updates and
final cleanup ordering.
- Activity volume risk: each heartbeat run now logs lease acquisition
and release events, increasing activity log volume by two records per
run.

## Model Used

OpenAI GPT-5.4 via Codex CLI. Capabilities used: repository inspection,
TypeScript implementation review, local test/build execution, and
PR-description drafting.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no UI changes)
- [x] I have updated relevant documentation to reflect my changes (N/A:
no user-facing docs or commands changed)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 20:07:41 -07:00
Dotta
b69b563aa8 [codex] Fix stale issue execution run locks (#4258)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies, so issue
checkout and execution ownership are core safety contracts.
> - The affected subsystem is the issue service and route layer that
gates agent writes by `checkoutRunId` and `executionRunId`.
> - PAP-1982 exposed a stale-lock failure mode where a terminal
heartbeat run could leave `executionRunId` pinned after checkout
ownership had moved or been cleared.
> - That stale execution lock could reject legitimate
PATCH/comment/release requests from the rightful assignee after a
harness restart.
> - This pull request centralizes terminal-run cleanup, applies it
before ownership-gated writes, and adds a board-only recovery endpoint
for operator intervention.
> - The benefit is that crashed or terminal runs no longer strand issues
behind stale execution locks, while live execution locks still block
conflicting writes.

## What Changed

- Added `issueService.clearExecutionRunIfTerminal()` to atomically lock
the issue/run rows and clear terminal or missing execution-run locks.
- Reused stale execution-lock cleanup from checkout,
`assertCheckoutOwner()`, and `release()`.
- Allowed the same assigned agent/current run to adopt an unowned
`in_progress` checkout after stale execution-lock cleanup.
- Updated release to clear `executionRunId`, `executionAgentNameKey`,
and `executionLockedAt`.
- Added board-only `POST /api/issues/:id/admin/force-release` with
company access checks, optional `clearAssignee=true`, and
`issue.admin_force_release` audit logging.
- Added embedded Postgres service tests and route integration tests for
stale-lock recovery, release behavior, and admin force-release
authorization/audit behavior.
- Documented the new force-release API in `doc/SPEC-implementation.md`.

## Verification

- `pnpm vitest run server/src/__tests__/issues-service.test.ts
server/src/__tests__/issue-stale-execution-lock-routes.test.ts` passed.
- `pnpm vitest run
server/src/__tests__/issue-stale-execution-lock-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/issue-comment-reopen-routes.test.ts
server/src/__tests__/issue-telemetry-routes.test.ts` passed.
- `pnpm -r typecheck` passed.
- `pnpm build` passed.
- `git diff --check` passed.
- `pnpm lint` could not run because this repo has no `lint` command.
- Full `pnpm test:run` completed with 4 failures in existing route
suites: `approval-routes-idempotency.test.ts` (2),
`issue-comment-reopen-routes.test.ts` (1), and
`issue-telemetry-routes.test.ts` (1). Those same files pass when run
isolated and when run together with the new stale-lock route test, so
this appears to be a whole-suite ordering/mock-isolation issue outside
this patch path.

## Risks

- Medium: this changes ownership-gated write behavior. The new adoption
path is limited to the current run, the current assignee, `in_progress`
issues, and rows with no checkout owner after terminal-lock cleanup.
- Low: the admin force-release endpoint is board-only and
company-scoped, but misuse can intentionally clear a live lock. It
writes an audit event with prior lock IDs.
- No schema or migration changes.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5 coding agent (`gpt-5`), agentic coding with
terminal/tool use and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-22 10:43:38 -05:00
Dotta
a957394420 [codex] Add structured issue-thread interactions (#4244)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators supervise that work through issues, comments, approvals,
and the board UI.
> - Some agent proposals need structured board/user decisions, not
hidden markdown conventions or heavyweight governed approvals.
> - Issue-thread interactions already provide a natural thread-native
surface for proposed tasks and questions.
> - This pull request extends that surface with request confirmations,
richer interaction cards, and agent/plugin/MCP helpers.
> - The benefit is that plan approvals and yes/no decisions become
explicit, auditable, and resumable without losing the single-issue
workflow.

## What Changed

- Added persisted issue-thread interactions for suggested tasks,
structured questions, and request confirmations.
- Added board UI cards for interaction review, selection, question
answers, and accept/reject confirmation flows.
- Added MCP and plugin SDK helpers for creating interaction cards from
agents/plugins.
- Updated agent wake instructions, onboarding assets, Paperclip skill
docs, and public docs to prefer structured confirmations for
issue-scoped decisions.
- Rebased the branch onto `public-gh/master` and renumbered branch
migrations to `0063` and `0064`; the idempotency migration uses `ADD
COLUMN IF NOT EXISTS` for old branch users.

## Verification

- `git diff --check public-gh/master..HEAD`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
packages/mcp-server/src/tools.test.ts
packages/shared/src/issue-thread-interactions.test.ts
ui/src/lib/issue-thread-interactions.test.ts
ui/src/lib/issue-chat-messages.test.ts
ui/src/components/IssueThreadInteractionCard.test.tsx
ui/src/components/IssueChatThread.test.tsx
server/src/__tests__/issue-thread-interaction-routes.test.ts
server/src/__tests__/issue-thread-interactions-service.test.ts
server/src/services/issue-thread-interactions.test.ts` -> 9 files / 79
tests passed
- `pnpm -r typecheck` -> passed, including `packages/db` migration
numbering check

## Risks

- Medium: this adds a new issue-thread interaction model across
db/shared/server/ui/plugin surfaces.
- Migration risk is reduced by placing this branch after current master
migrations (`0063`, `0064`) and making the idempotency column add
idempotent for users who applied the old branch numbering.
- UI interaction behavior is covered by component tests, but this PR
does not include browser screenshots.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-class coding agent runtime. Exact model ID and
context window are not exposed in this Paperclip run; tool use and local
shell/code execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 20:15:11 -05:00
Dotta
014aa0eb2d [codex] Clear stale queued comment targets (#4234)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators interact with agent work through issue threads and queued
comments.
> - When the selected comment target becomes stale, the composer can
keep pointing at an invalid target after thread state changes.
> - That makes follow-up comments easier to misroute and harder to
reason about.
> - This pull request clears stale queued comment targets and covers the
behavior with tests.
> - The benefit is more predictable issue-thread commenting during live
agent work.

## What Changed

- Clears queued comment targets when they no longer match the current
issue thread state.
- Adjusts issue detail comment-target handling to avoid stale target
reuse.
- Adds regression tests for optimistic issue comment target behavior.

## Verification

- `pnpm exec vitest run ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low risk; scoped to comment-target state handling in the issue UI.
- No migrations.

> Checked `ROADMAP.md`; this is a focused UI reliability fix, not a new
roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 16:50:26 -05:00
Dotta
bcbbb41a4b [codex] Harden heartbeat runtime cleanup (#4233)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime is the control-plane path that turns issue
assignments into agent runs and recovers after process exits.
> - Several edge cases could leave high-volume reads unbounded, stale
runtime services visible, blocked dependency wakes too eager, or
terminal adapter processes still around after output finished.
> - These problems make operator views noisy and make long-running agent
work less predictable.
> - This pull request tightens the runtime/read paths and adds focused
regression coverage.
> - The benefit is safer heartbeat execution and cleaner runtime state
without changing the public task model.

## What Changed

- Bounded high-volume issue/log reads in runtime code paths.
- Hardened heartbeat handling for blocked dependency wakes and terminal
run cleanup.
- Added adapter process cleanup coverage for terminal output cases.
- Added workspace runtime control tests for stale command matching and
stopped services.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
ui/src/components/WorkspaceRuntimeControls.test.tsx`

## Risks

- Medium risk because heartbeat cleanup and runtime filtering affect
active agent execution paths.
- No migrations.

> Checked `ROADMAP.md`; this is runtime hardening and bug-fix work, not
a new roadmap-level feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-enabled repository
editing and local test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 16:48:47 -05:00
Dotta
73ef40e7be [codex] Sandbox dynamic adapter UI parsers (#4225)
## Thinking Path

> - Paperclip is a control plane for AI-agent companies.
> - External adapters can provide UI parser code that the board loads
dynamically for run transcript rendering.
> - Running adapter-provided parser code directly in the board page
gives that parser access to same-origin browser state.
> - This PR narrows that surface by evaluating dynamically loaded
external adapter UI parser code in a dedicated browser Web Worker with a
constrained postMessage protocol.
> - The worker here is a frontend isolation boundary for adapter UI
parser JavaScript; it is not Paperclip's server plugin-worker system and
it is not a server-side job runner.

## What Changed

- Runs dynamically loaded external adapter UI parsers inside a dedicated
Web Worker instead of importing/evaluating them directly in the board
page.
- Adds a narrow postMessage protocol for parser initialization and line
parsing.
- Caches completed async parse results and notifies the adapter registry
so transcript recomputation can synchronously drain the final parsed
line.
- Disables common worker network, persistence, child worker, Blob/object
URL, and WebRTC escape APIs inside the parser worker bootstrap.
- Handles worker error messages after initialization and drains pending
callbacks on worker termination or mid-session worker error.
- Adds focused regression coverage for the parser worker lockdown and
unused protocol removal.

## Verification

- `pnpm exec vitest run --config ui/vitest.config.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm exec tsc --noEmit --target es2021 --moduleResolution bundler
--module esnext --jsx react-jsx --lib dom,es2021 --skipLibCheck
ui/src/adapters/dynamic-loader.ts
ui/src/adapters/sandboxed-parser-worker.ts
ui/src/adapters/sandboxed-parser-worker.test.ts`
- `pnpm --filter @paperclipai/ui typecheck` was attempted; it reached
existing unrelated failures in HeartbeatRun test/storybook fixtures and
missing Storybook type resolution, with no adapter-module errors
surfaced.
- PR #4225 checks on current head `34c9da00`: `policy`, `e2e`, `verify`,
`security/snyk`, and `Greptile Review` are all `SUCCESS`.
- Greptile Review on current head `34c9da00` reached 5/5.

## Risks

- Medium risk: parser execution is now asynchronous through a worker
while the existing parser interface is synchronous, so transcript
updates should be watched with external adapters.
- Some adapter parser bundles may rely on direct ESM `export` syntax or
browser APIs that are no longer available inside the worker lockdown.
- The worker lockdown is a hardening layer around external parser code,
not a complete browser security sandbox for arbitrary untrusted
applications.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 13:42:44 -05:00
Dotta
a26e1288b6 [codex] Polish issue board workflows (#4224)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Human operators supervise that work through issue lists, issue
detail, comments, inbox groups, markdown references, and
profile/activity surfaces
> - The branch had many small UI fixes that improve the operator loop
but do not need to ship with backend runtime migrations
> - These changes belong together as board workflow polish because they
affect scanning, navigation, issue context, comment state, and markdown
clarity
> - This pull request groups the UI-only slice so it can merge
independently from runtime/backend changes
> - The benefit is a clearer board experience with better issue context,
steadier optimistic updates, and more predictable keyboard navigation

## What Changed

- Improves issue properties, sub-issue actions, blocker chips, and issue
list/detail refresh behavior.
- Adds blocker context above the issue composer and stabilizes
queued/interrupted comment UI state.
- Improves markdown issue/GitHub link rendering and opens external
markdown links in a new tab.
- Adds inbox group keyboard navigation and fold/unfold support.
- Polishes activity/avatar/profile/settings/workspace presentation
details.

## Verification

- `pnpm exec vitest run ui/src/components/IssueProperties.test.tsx
ui/src/components/IssueChatThread.test.tsx
ui/src/components/MarkdownBody.test.tsx ui/src/lib/inbox.test.ts
ui/src/lib/optimistic-issue-comments.test.ts`

## Risks

- Low to medium risk: changes are UI-focused but cover high-traffic
issue and inbox surfaces.
- This branch intentionally does not include the backend runtime changes
from the companion PR; where UI calls newer API filters, unsupported
servers should continue to fail visibly through existing API error
handling.
- Visual screenshots were not captured in this heartbeat; targeted
component/helper tests cover the changed behavior.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:25:34 -05:00
Dotta
09d0678840 [codex] Harden heartbeat scheduling and runtime controls (#4223)
## Thinking Path

> - Paperclip orchestrates AI agents through issue checkout, heartbeat
runs, routines, and auditable control-plane state
> - The runtime path has to recover from lost local processes, transient
adapter failures, blocked dependencies, and routine coalescing without
stranding work
> - The existing branch carried several reliability fixes across
heartbeat scheduling, issue runtime controls, routine dispatch, and
operator-facing run state
> - These changes belong together because they share backend contracts,
migrations, and runtime status semantics
> - This pull request groups the control-plane/runtime slice so it can
merge independently from board UI polish and adapter sandbox work
> - The benefit is safer heartbeat recovery, clearer runtime controls,
and more predictable recurring execution behavior

## What Changed

- Adds bounded heartbeat retry scheduling, scheduled retry state, and
Codex transient failure recovery handling.
- Tightens heartbeat process recovery, blocker wake behavior, issue
comment wake handling, routine dispatch coalescing, and
activity/dashboard bounds.
- Adds runtime-control MCP tools and Paperclip skill docs for issue
workspace runtime management.
- Adds migrations `0061_lively_thor_girl.sql` and
`0062_routine_run_dispatch_fingerprint.sql`.
- Surfaces retry state in run ledger/agent UI and keeps related shared
types synchronized.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-retry-scheduling.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/routines-service.test.ts`
- `pnpm exec vitest run src/tools.test.ts` from `packages/mcp-server`

## Risks

- Medium risk: this touches heartbeat recovery and routine dispatch,
which are central execution paths.
- Migration order matters if split branches land out of order: merge
this PR before branches that assume the new runtime/routine fields.
- Runtime retry behavior should be watched in CI and in local operator
smoke tests because it changes how transient failures are resumed.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent runtime, shell/git tool use
enabled. Exact hosted model build and context window are not exposed in
this Paperclip heartbeat environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 12:24:11 -05:00
Dotta
ab9051b595 Add first-class issue references (#4214)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Operators and agents coordinate through company-scoped issues,
comments, documents, and task relationships.
> - Issue text can mention other tickets, but those references were
previously plain markdown/text without durable relationship data.
> - That made it harder to understand related work, surface backlinks,
and keep cross-ticket context visible in the board.
> - This pull request adds first-class issue reference extraction,
storage, API responses, and UI surfaces.
> - The benefit is that issue references become queryable, navigable,
and visible without relying on ad hoc text scanning.

## What Changed

- Added shared issue-reference parsing utilities and exported
reference-related types/constants.
- Added an `issue_reference_mentions` table, idempotent migration DDL,
schema exports, and database documentation.
- Added server-side issue reference services, route integration,
activity summaries, and a backfill command for existing issue content.
- Added UI reference pills, related-work panels, markdown/editor mention
handling, and issue detail/property rendering updates.
- Added focused shared, server, and UI tests for parsing, persistence,
display, and related-work behavior.
- Rebased `PAP-735-first-class-task-references` cleanly onto
`public-gh/master`; no `pnpm-lock.yaml` changes are included.

## Verification

- `pnpm -r typecheck`
- `pnpm test:run packages/shared/src/issue-references.test.ts
server/src/__tests__/issue-references-service.test.ts
ui/src/components/IssueRelatedWorkPanel.test.tsx
ui/src/components/IssueProperties.test.tsx
ui/src/components/MarkdownBody.test.tsx`

## Risks

- Medium risk because this adds a new issue-reference persistence path
that touches shared parsing, database schema, server routes, and UI
rendering.
- Migration risk is mitigated by `CREATE TABLE IF NOT EXISTS`, guarded
foreign-key creation, and `CREATE INDEX IF NOT EXISTS` statements so
users who have applied an older local version of the numbered migration
can re-run safely.
- UI risk is limited by focused component coverage, but reviewers should
still manually inspect issue detail pages containing ticket references
before merge.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent, tool-using shell workflow with
repository inspection, git rebase/push, typecheck, and focused Vitest
verification.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-21 10:02:52 -05:00
Dotta
1954eb3048 [codex] Detect issue graph liveness deadlocks (#4209)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat harness is responsible for waking agents, reconciling
issue state, and keeping execution moving.
> - Some dependency graphs can become live-locks when a blocked issue
depends on an unassigned, cancelled, or otherwise uninvokable issue.
> - Review and approval stages can also stall when the recorded
participant can no longer be resolved.
> - This pull request adds issue graph liveness classification plus
heartbeat reconciliation that creates durable escalation work for those
cases.
> - The benefit is that harness-level deadlocks become visible,
assigned, logged, and recoverable instead of silently leaving task
sequences blocked.

## What Changed

- Added an issue graph liveness classifier for blocked dependency and
invalid review participant states.
- Added heartbeat reconciliation that creates one stable escalation
issue per liveness incident, links it as a blocker, comments on the
affected issue, wakes the recommended owner, and logs activity.
- Wired startup and periodic server reconciliation for issue graph
liveness incidents.
- Added focused tests for classifier behavior, heartbeat escalation
creation/deduplication, and queued dependency wake promotion.
- Fixed queued issue wakes so a coalesced wake re-runs queue selection,
allowing dependency-unblocked work to start immediately.

## Verification

- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issue-liveness.test.ts
server/src/__tests__/heartbeat-issue-liveness-escalation.test.ts`
- Passed locally: `server/src/__tests__/issue-liveness.test.ts` (5
tests)
- Skipped locally: embedded Postgres suites because optional package
`@embedded-postgres/darwin-x64` is not installed on this host
- `pnpm --filter @paperclipai/server typecheck`
- `git diff --check`
- Greptile review loop: ran 3 times as requested; the final
Greptile-reviewed head `0a864eab` had 0 comments and all Greptile
threads were resolved. Later commits are CI/test-stability fixes after
the requested max Greptile pass count.
- GitHub PR checks on head `87493ed4`: `policy`, `verify`, `e2e`, and
`security/snyk (cryppadotta)` all passed.

## Risks

- Moderate operational risk: the reconciler creates escalation issues
automatically, so incorrect classification could create noise. Stable
incident keys and deduplication limit repeated escalation.
- Low schema risk: this uses existing issue, relation, comment, wake,
and activity log tables with no migration.
- No UI screenshots included because this change is server-side harness
behavior only.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact runtime model ID and
context window were not exposed in this session. Used tool execution for
git, tests, typecheck, Greptile review handling, and GitHub CLI
operations.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-21 09:11:12 -05:00
Robin van Duiven
8d0c3d2fe6 fix(hermes): inject agent JWT into Hermes adapter env to fix identity attribution (#3608)
## Thinking Path

> - Paperclip orchestrates AI agents and records their actions through
auditable issue comments and API writes.
> - The local adapter registry is responsible for adapting each agent
runtime to Paperclip's server-side execution context.
> - The Hermes local adapter delegated directly to
`hermes-paperclip-adapter`, whose current execution context type
predates the server `authToken` field.
> - Without explicitly passing the run-scoped agent token and run id
into Hermes, Hermes could inherit a server or board-user
`PAPERCLIP_API_KEY` and lack a usable `PAPERCLIP_RUN_ID` for mutating
API calls.
> - That made Paperclip writes from Hermes agents risk appearing under
the wrong identity or without the correct run-scoped attribution.
> - This pull request wraps the Hermes execution call so Hermes receives
the agent run JWT as `PAPERCLIP_API_KEY` and the current execution id as
`PAPERCLIP_RUN_ID` while preserving explicit adapter configuration where
appropriate.
> - Follow-up review fixes preserve Hermes' built-in prompt when no
custom prompt template exists and document the intentional type cast.
> - The benefit is reliable agent attribution for the covered local
Hermes path without clobbering Hermes' default heartbeat/task
instructions.

## What Changed

- Wrapped `hermesLocalAdapter.execute` so `ctx.authToken` is injected
into `adapterConfig.env.PAPERCLIP_API_KEY` when no explicit Paperclip
API key is already configured.
- Injected `ctx.runId` into `adapterConfig.env.PAPERCLIP_RUN_ID` so the
auth guard's `X-Paperclip-Run-Id: $PAPERCLIP_RUN_ID` instruction
resolves to the current run id.
- Added a Paperclip API auth guard to existing custom Hermes
`promptTemplate` values without creating a replacement prompt when no
custom template exists.
- Documented the intentional `as unknown as` cast needed until
`hermes-paperclip-adapter` ships an `AdapterExecutionContext` type that
includes `authToken`.
- Added registry tests for JWT injection, run-id injection, explicit key
preservation, default prompt preservation, and the no-`authToken`
early-return path.

## Verification

- [x] `pnpm --filter "./server" exec vitest run adapter-registry` - 8
tests passed.
- [x] `pnpm --filter "./server" typecheck` - passed.
- [x] Trigger a Hermes agent heartbeat and verify Paperclip writes
appear under the agent identity rather than a shared board-user
identity, with the correct run id on mutating requests.

## Risks

- Low migration risk: this changes only the Hermes local adapter wrapper
and tests.
- Existing explicit `adapterConfig.env.PAPERCLIP_API_KEY` values are
preserved to avoid breaking intentionally configured agents.
- `PAPERCLIP_RUN_ID` is set from `ctx.runId` for each execution so
mutating API calls use the current run id instead of a stale or literal
placeholder value.
- Prompt behavior is intentionally conservative: the auth guard is only
prepended when a custom prompt template already exists, so Hermes'
built-in default prompt remains intact for unconfigured agents.
- Remaining operational risk: the identity and run-id behavior should
still be verified with a live Hermes heartbeat before relying on it in
production.

## Model Used

- OpenAI Codex, GPT-5 family coding agent, tool use enabled for local
shell, GitHub CLI, and test execution.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (not applicable: backend-only change)
- [x] I have updated relevant documentation to reflect my changes (not
applicable: no product docs changed; PR description updated)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Dotta <bippadotta@protonmail.com>
2026-04-21 07:18:11 -05:00
Dotta
1266954a4e [codex] Make heartbeat scheduling blocker-aware (#4157)
## Thinking Path

> - Paperclip orchestrates AI agents through issue-driven heartbeats,
checkouts, and wake scheduling.
> - This change sits in the server heartbeat and issue services that
decide which queued runs are allowed to start.
> - Before this branch, queued heartbeats could be selected even when
their issue still had unresolved blocker relationships.
> - That let blocked descendant work compete with actually-ready work
and risked auto-checking out issues that were not dependency-ready.
> - This pull request teaches the scheduler and checkout path to consult
issue dependency readiness before claiming queued runs.
> - It also exposes dependency readiness in the agent inbox so agents
can see which assigned issues are still blocked.
> - The result is that heartbeat execution follows the DAG of blocked
dependencies instead of waking work out of order.

## What Changed

- Added `IssueDependencyReadiness` helpers to `issueService`, including
unresolved blocker lookup for single issues and bulk issue lists.
- Prevented issue checkout and `in_progress` transitions when unresolved
blockers still exist.
- Made heartbeat queued-run claiming and prioritization dependency-aware
so ready work starts before blocked descendants.
- Included dependency readiness fields in `/api/agents/me/inbox-lite`
for agent heartbeat selection.
- Added regression coverage for dependency-aware heartbeat promotion and
issue-service participation filtering.

## Verification

- `pnpm run preflight:workspace-links`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-dependency-scheduling.test.ts
server/src/__tests__/issues-service.test.ts`
- On this host, the Vitest command passed, but the embedded-Postgres
portions of those files were skipped because
`@embedded-postgres/darwin-x64` is not installed.

## Risks

- Scheduler ordering now prefers dependency-ready runs, so any hidden
assumptions about strict FIFO ordering could surface in edge cases.
- The new guardrails reject checkout or `in_progress` transitions for
blocked issues; callers depending on the old permissive behavior would
now get `422` errors.
- Local verification did not execute the embedded-Postgres integration
paths on this macOS host because the platform binary package was
missing.

> I checked `ROADMAP.md`; this is a targeted execution/scheduling fix
and does not duplicate planned roadmap feature work.

## Model Used

- OpenAI Codex via the Paperclip `codex_local` adapter in this
workspace. Exact backend model ID is not surfaced in the runtime here;
tool-enabled coding agent with terminal execution and repository editing
capabilities.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 16:03:57 -05:00
Hiuri Noronha
1bf2424377 fix: honor Hermes local command override (#3503)
## Summary

This fixes the Hermes local adapter so that a configured command
override is respected during both environment tests and execution.

## Problem

The Hermes adapter expects `adapterConfig.hermesCommand`, but the
generic local command path in the UI was storing
`adapterConfig.command`.

As a result, changing the command in the UI did not reliably affect
runtime behavior. In real use, the adapter could still fall back to the
default `hermes` binary.

This showed up clearly in setups where Hermes is launched through a
wrapper command rather than installed directly on the host.

## What changed

- switched the Hermes local UI adapter to the Hermes-specific config
builder
- updated the configuration form to read and write `hermesCommand` for
`hermes_local`
- preserved the override correctly in the test-environment path
- added server-side normalization from legacy `command` to
`hermesCommand`

## Compatibility

The server-side normalization keeps older saved agent configs working,
including configs that still store the value under `command`.

## Validation

Validated against a Docker-based Hermes workflow using a local wrapper
exposed through a symlinked command:

- `Command = hermes-docker`
- environment test respects the override
- runs no longer fall back to `hermes`

Typecheck also passed for both UI and server.

Co-authored-by: NoronhaH <NoronhaH@users.noreply.github.com>
2026-04-20 15:55:08 -05:00
LeonSGP
51f127f47b fix(hermes): stop advertising unsupported instructions bundles (#3908)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Local adapter capability flags decide which configuration surfaces
the UI and server expose for each adapter.
> - `hermes_local` currently advertises managed instructions bundle
support, so Paperclip exposes the AGENTS.md bundle flow for Hermes
agents.
> - The bundled `hermes-paperclip-adapter` only consumes
`promptTemplate` at runtime and does not read `instructionsFilePath`, so
that advertised bundle path silently does nothing.
> - Issue #3833 reports exactly that mismatch: users configure AGENTS.md
instructions, but Hermes only receives the built-in heartbeat prompt.
> - This pull request stops advertising managed instructions bundles for
`hermes_local` until the adapter actually consumes bundle files at
runtime.

## What Changed

- Changed the built-in `hermes_local` server adapter registration to
report `supportsInstructionsBundle: false`.
- Updated the UI's synchronous built-in capability fallback so Hermes no
longer shows the managed instructions bundle affordance on first render.
- Added regression coverage in
`server/src/__tests__/adapter-routes.test.ts` to assert that
`hermes_local` still reports skills + local JWT support, but not
instructions bundle support.

## Verification

- `git diff --check`
- `node --experimental-strip-types --input-type=module -e "import {
findActiveServerAdapter } from './server/src/adapters/index.ts'; const
adapter = findActiveServerAdapter('hermes_local');
console.log(JSON.stringify({ type: adapter?.type,
supportsInstructionsBundle: adapter?.supportsInstructionsBundle,
supportsLocalAgentJwt: adapter?.supportsLocalAgentJwt, supportsSkills:
Boolean(adapter?.listSkills || adapter?.syncSkills) }));"`
- Observed
`{"type":"hermes_local","supportsInstructionsBundle":false,"supportsLocalAgentJwt":true,"supportsSkills":true}`
- Added adapter-routes regression assertions for the Hermes capability
contract; CI should validate the full route path in a clean workspace.

## Risks

- Low risk: this only changes the advertised capability surface for
`hermes_local`.
- Behavior change: Hermes agents will no longer show the broken managed
instructions bundle UI until the underlying adapter actually supports
`instructionsFilePath`.
- Existing Hermes skill sync and local JWT behavior are unchanged.

## Model Used

- OpenAI Codex, GPT-5.4 class coding agent, medium reasoning,
terminal/git/gh tool use.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 15:54:14 -05:00
github-actions[bot]
b94f1a1565 chore(lockfile): refresh pnpm-lock.yaml (#4139)
Auto-generated lockfile refresh after dependencies changed on master.
This PR only updates pnpm-lock.yaml.

Co-authored-by: lockfile-bot <lockfile-bot@users.noreply.github.com>
2026-04-20 12:15:59 -05:00
Dotta
2de893f624 [codex] add comprehensive UI Storybook coverage (#4132)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The board UI is the main operator surface, so its component and
workflow coverage needs to stay reviewable as the product grows.
> - This branch adds Storybook as a dedicated UI reference surface for
core Paperclip screens and interaction patterns.
> - That work spans Storybook infrastructure, app-level provider wiring,
and a large fixture set that can render real control-plane states
without a live backend.
> - The branch also expands coverage across agents, budgets, issues,
chat, dialogs, navigation, projects, and data visualization so future UI
changes have a concrete visual baseline.
> - This pull request packages that Storybook work on top of the latest
`master`, excludes the lockfile from the final diff per repo policy, and
fixes one fixture contract drift caught during verification.
> - The benefit is a single reviewable PR that adds broad UI
documentation and regression-surfacing coverage without losing the
existing branch work.

## What Changed

- Added Storybook 10 wiring for the UI package, including root scripts,
UI package scripts, Storybook config, preview wrappers, Tailwind
entrypoints, and setup docs.
- Added a large fixture-backed data source for Storybook so complex
board states can render without a live server.
- Added story suites covering foundations, status language,
control-plane surfaces, overview, UX labs, agent management, budget and
finance, forms and editors, issue management, navigation and layout,
chat and comments, data visualization, dialogs and modals, and
projects/goals/workspaces.
- Adjusted several UI components for Storybook parity so dialogs, menus,
keyboard shortcuts, budget markers, markdown editing, and related
surfaces render correctly in isolation.
- Rebasing work for PR assembly: replayed the branch onto current
`master`, removed `pnpm-lock.yaml` from the final PR diff, and aligned
the dashboard fixture with the current `DashboardSummary.runActivity`
API contract.

## Verification

- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/ui build-storybook`
- Manual diff audit after rebase: verified the PR no longer includes
`pnpm-lock.yaml` and now cleanly targets current `master`.
- Before/after UI note: before this branch there was no dedicated
Storybook surface for these Paperclip views; after this branch the local
Storybook build includes the new overview and domain story suites in
`ui/storybook-static`.

## Risks

- Large static fixture files can drift from shared types as dashboard
and UI contracts evolve; this PR already needed one fixture correction
for `runActivity`.
- Storybook bundle output includes some large chunks, so future growth
may need chunking work if build performance becomes an issue.
- Several component tweaks were made for isolated rendering parity, so
reviewers should spot-check key board surfaces against the live app
behavior.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Paperclip harness; exact
serving model ID is not exposed in-runtime to the agent.
- Tool-assisted workflow with terminal execution, git operations, local
typecheck/build verification, and GitHub CLI PR creation.
- Context window/reasoning mode not surfaced by the harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 12:13:23 -05:00
Dotta
7a329fb8bb Harden API route authorization boundaries (#4122)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The REST API is the control-plane boundary for companies, agents,
plugins, adapters, costs, invites, and issue mutations.
> - Several routes still relied on broad board or company access checks
without consistently enforcing the narrower actor, company, and
active-checkout boundaries those operations require.
> - That can allow agents or non-admin users to mutate sensitive
resources outside the intended governance path.
> - This pull request hardens the route authorization layer and adds
regression coverage for the audited API surfaces.
> - The benefit is tighter multi-company isolation, safer plugin and
adapter administration, and stronger enforcement of active issue
ownership.

## What Changed

- Added route-level authorization checks for budgets, plugin
administration/scoped routes, adapter management, company import/export,
direct agent creation, invite test resolution, and issue mutation/write
surfaces.
- Enforced active checkout ownership for agent-authenticated issue
mutations, while preserving explicit management overrides for permitted
managers.
- Restricted sensitive adapter and plugin management operations to
instance-admin or properly scoped actors.
- Tightened company portability and invite probing routes so agents
cannot cross company boundaries.
- Updated access constants and the Company Access UI copy for the new
active-checkout management grant.
- Added focused regression tests covering cross-company denial, agent
self-mutation denial, admin-only operations, and active checkout
ownership.
- Rebased the branch onto `public-gh/master` and fixed validation
fallout from the rebase: heartbeat-context route ordering and a company
import/export e2e fixture that now opts out of direct-hire approval
before using direct agent creation.
- Updated onboarding and signoff e2e setup to create seed agents through
`/agent-hires` plus board approval, so they remain compatible with the
approval-gated new-agent default.
- Addressed Greptile feedback by removing a duplicate company export API
alias, avoiding N+1 reporting-chain lookups in active-checkout override
checks, allowing agent mutations on unassigned `in_progress` issues, and
blocking NAT64 invite-probe targets.

## Verification

- `pnpm exec vitest run
server/src/__tests__/issues-goal-context-routes.test.ts
cli/src/__tests__/company-import-export-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/adapter-routes-authz.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
server/src/__tests__/company-portability-routes.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/invite-test-resolution-route.test.ts
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts
server/src/__tests__/agent-adapter-validation-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/issue-agent-mutation-ownership-routes.test.ts`
- `pnpm exec vitest run
server/src/__tests__/invite-test-resolution-route.test.ts`
- `pnpm -r typecheck`
- `pnpm --filter server typecheck`
- `pnpm --filter ui typecheck`
- `pnpm build`
- `pnpm test:e2e -- tests/e2e/onboarding.spec.ts
tests/e2e/signoff-policy.spec.ts`
- `pnpm test:e2e -- tests/e2e/signoff-policy.spec.ts`
- `pnpm test:run` was also run. It failed under default full-suite
parallelism with two order-dependent failures in
`plugin-routes-authz.test.ts` and `routines-e2e.test.ts`; both files
passed when rerun directly together with `pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/routines-e2e.test.ts`.

## Risks

- Medium risk: this changes authorization behavior across multiple
sensitive API surfaces, so callers that depended on broad board/company
access may now receive `403` or `409` until they use the correct
governance path.
- Direct agent creation now respects the company-level board-approval
requirement; integrations that need pending hires should use
`/api/companies/:companyId/agent-hires`.
- Active in-progress issue mutations now require checkout ownership or
an explicit management override, which may reveal workflow assumptions
in older automation.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

OpenAI Codex, GPT-5 coding agent, tool-using workflow with local shell,
Git, GitHub CLI, and repository tests.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:56:48 -05:00
Dotta
549ef11c14 [codex] Respect manual workspace runtime controls (#4125)
## Thinking Path

> - Paperclip orchestrates AI agents inside execution and project
workspaces
> - Workspace runtime services can be controlled manually by operators
and reused by agent runs
> - Manual start/stop state was not preserved consistently across
workspace policies and routine launches
> - Routine launches also needed branch/workspace variables to default
from the selected workspace context
> - This pull request makes runtime policy state explicit, preserves
manual control, and auto-fills routine branch variables from workspace
data
> - The benefit is less surprising workspace service behavior and fewer
manual inputs when running workspace-scoped routines

## What Changed

- Added runtime-state handling for manual workspace control across
execution and project workspace validators, routes, and services.
- Updated heartbeat/runtime startup behavior so manually stopped
services are respected.
- Auto-filled routine workspace branch variables from available
workspace context.
- Added focused server and UI tests for workspace runtime and routine
variable behavior.
- Removed muted gray background styling from workspace pages and cards
for a cleaner workspace UI.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/routines-service.test.ts
server/src/__tests__/workspace-runtime.test.ts
ui/src/components/RoutineRunVariablesDialog.test.tsx`
- Result: 55 tests passed, 21 skipped. The embedded Postgres routines
tests skipped on this host with the existing PGlite/Postgres init
warning; workspace-runtime and UI tests passed.

## Risks

- Medium risk: this touches runtime service start/stop policy and
heartbeat launch behavior.
- The focused tests cover manual runtime state, routine variables, and
workspace runtime reuse paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/service verification
is sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:39:37 -05:00
Dotta
c7c1ca0c78 [codex] Clean up terminal-result adapter process groups (#4129)
## Thinking Path

> - Paperclip runs local adapter processes for agents and streams their
output into heartbeat runs
> - Some adapters can emit a terminal result before all descendant
processes have exited
> - If those descendants keep running, a heartbeat can appear complete
while the process group remains alive
> - Claude local runs need a bounded cleanup path after terminal JSON
output is observed and the child exits
> - This pull request adds terminal-result cleanup support to adapter
process utilities and wires it into the Claude local adapter
> - The benefit is fewer stranded adapter process groups after
successful terminal results

## What Changed

- Added terminal-result cleanup options to `runChildProcess`.
- Tracked child exit plus terminal output before signaling lingering
process groups.
- Added Claude local adapter configuration for terminal result cleanup
grace time.
- Added process cleanup tests covering terminal-output cleanup and noisy
non-terminal runs.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts`
- Result: 9 tests passed.

## Risks

- Medium risk: this changes adapter child-process cleanup behavior.
- The cleanup only arms after terminal result detection and child exit,
and it is covered by process-group tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why it is not applicable
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:38:57 -05:00
Dotta
56b3120971 [codex] Improve mobile org chart navigation (#4127)
## Thinking Path

> - Paperclip models companies as teams of human and AI operators
> - The org chart is the primary visual map of that company structure
> - Mobile users need to pan and inspect the chart without awkward
gestures or layout jumps
> - The roadmap also needed to reflect that the multiple-human-users
work is complete
> - This pull request improves mobile org chart gestures and updates the
roadmap references
> - The benefit is a smoother company navigation experience and docs
that match shipped multi-user support

## What Changed

- Added one-finger mobile pan handling for the org chart.
- Expanded org chart test coverage for touch gesture behavior.
- Updated README, ROADMAP, and CLI README references to mark
multiple-human-users work as complete.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run ui/src/pages/OrgChart.test.tsx`
- Result: 4 tests passed.

## Risks

- Low-medium risk: org chart pointer/touch handling changed, but the
behavior is scoped to the org chart page and covered by targeted tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted interaction tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:35:33 -05:00
Dotta
4357a3f352 [codex] Harden dashboard run activity charts (#4126)
## Thinking Path

> - Paperclip gives operators a live view of agent work across
dashboards, transcripts, and run activity charts
> - Those views consume live run updates and aggregate run activity from
backend dashboard data
> - Missing or partial run data could make charts brittle, and live
transcript updates were heavier than needed
> - Operators need dashboard data to stay stable even when recent run
payloads are incomplete
> - This pull request hardens dashboard run aggregation, guards chart
rendering, and lightens live run update handling
> - The benefit is a more reliable dashboard during active agent
execution

## What Changed

- Added dashboard run activity types and backend aggregation coverage.
- Guarded activity chart rendering when run data is missing or partial.
- Reduced live transcript update churn in active agent and run chat
surfaces.
- Fixed issue chat avatar alignment in the thread renderer.
- Added focused dashboard, activity chart, and live transcript tests.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm exec vitest run server/src/__tests__/dashboard-service.test.ts
ui/src/components/ActivityCharts.test.tsx
ui/src/components/transcript/useLiveRunTranscripts.test.tsx`
- Result: 8 tests passed, 1 skipped. The embedded Postgres dashboard
service test skipped on this host with the existing PGlite/Postgres init
warning; UI chart and transcript tests passed.

## Risks

- Medium-low risk: aggregation semantics changed, but the UI remains
guarded around incomplete data.
- The dashboard service test is host-skipped here, so CI should confirm
the embedded database path.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component tests are sufficient
here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:34:21 -05:00
Dotta
0f4e4b4c10 [codex] Split reusable agent hiring templates (#4124)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Hiring new agents depends on clear, reusable operating instructions
> - The create-agent skill had one large template reference that mixed
multiple roles together
> - That made it harder to reuse, review, and adapt role-specific
instructions during governed hires
> - This pull request splits the reusable agent instruction templates
into focused role files and polishes the agent instructions pane layout
> - The benefit is faster, clearer agent hiring without bloating the
main skill document

## What Changed

- Split coder, QA, and UX designer reusable instructions into dedicated
reference files.
- Kept the index reference concise and pointed it at the role-specific
files.
- Updated the create-agent skill to describe the separated template
structure.
- Polished the agent detail instructions/package file tree layout so the
longer template references remain readable.

## Verification

- `pnpm install --frozen-lockfile --ignore-scripts`
- `pnpm --filter @paperclipai/ui typecheck`
- UI screenshot rationale: no screenshots attached because the visible
change is limited to the Agent detail instructions file-tree layout
(`wrapLabels` plus the side-by-side breakpoint). There is no new user
flow or state transition to demonstrate; reviewers can verify visually
by opening an agent's Instructions tab and resizing across the
single-column and side-by-side breakpoints to confirm long file names
wrap instead of truncating or overflowing.

## Risks

- Low risk: this is documentation and UI layout only.
- Main risk is stale links in the skill references; the new files are
committed in the referenced paths.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex coding agent based on GPT-5, tool-enabled local shell and
GitHub workflow, exact runtime context window not exposed in this
session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots, or documented why targeted component/type verification is
sufficient here
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 10:33:19 -05:00
Aron Prins
73eb23734f docs: use structured agent mentions in paperclip skill (#4103)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Agents coordinate work through tasks and comments, and @-mentions
are part of the wakeup path for cross-agent handoffs and review requests
> - The current repo skill still instructs machine-authored comments to
use raw `@AgentName` text as the default mention format
> - But the current backend mention parsing is still unreliable for
multi-word display names, so agents following that guidance can silently
fail to wake the intended target
> - This pull request updates the Paperclip skill and API reference to
prefer structured `agent://` markdown mentions for machine-authored
comments
> - The benefit is a low-risk documentation workaround that steers
agents onto the mention format the server already resolves reliably
while broader runtime fixes are reviewed upstream

## What Changed

- Updated `skills/paperclip/SKILL.md` to stop recommending raw
`@AgentName` mentions for machine-authored comments
- Updated `skills/paperclip/references/api-reference.md` with a concrete
workflow: resolve the target via `GET
/api/companies/{companyId}/agents`, then emit `[@Display
Name](agent://<agent-id>)`
- Added explicit guidance that raw `@AgentName` text is fallback-only
and unreliable for names containing spaces
- Cross-referenced the current upstream mention-bug context so reviewers
can connect this docs workaround to the open parser/runtime fixes
  Related issue/PR refs: #448, #459, #558, #669, #722, #1412, #2249

## Verification

- `pnpm -r typecheck`
- `pnpm build`
- `pnpm test:run` currently fails on upstream `master` in existing tests
unrelated to this docs-only change:
- `src/__tests__/worktree.test.ts` — `seeds authenticated users into
minimally cloned worktree instances` timed out after 20000ms
- `src/__tests__/onboard.test.ts` — `keeps tailnet quickstart on
loopback until tailscale is available` expected `127.0.0.1` but got
`100.125.202.3`
- Confirmed the git diff is limited to:
  - `skills/paperclip/SKILL.md`
  - `skills/paperclip/references/api-reference.md`

## Risks

- Low risk. This is a docs/skill-only change and does not alter runtime
behavior.
- It is a mitigation, not a full fix: it helps agent-authored comments
that follow the Paperclip skill, but it does not fix manually typed raw
mentions or other code paths that still emit plain `@Name` text.
- If upstream chooses a different long-term mention format, this
guidance may need to be revised once the runtime-side fix lands.

## Model Used

- OpenAI Codex desktop agent on a GPT-5-class model. Exact deployed
model ID and context window are not exposed by the local harness. Tool
use enabled, including shell execution, git, and GitHub CLI.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-20 07:38:04 -07:00
Dotta
9c6f551595 [codex] Add plugin orchestration host APIs (#4114)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The plugin system is the extension path for optional capabilities
that should not require core product changes for every integration.
> - Plugins need scoped host APIs for issue orchestration, documents,
wakeups, summaries, activity attribution, and isolated database state.
> - Without those host APIs, richer plugins either cannot coordinate
Paperclip work safely or need privileged core-side special cases.
> - This pull request adds the plugin orchestration host surface, scoped
route dispatch, a database namespace layer, and a smoke plugin that
exercises the contract.
> - The benefit is a broader plugin API that remains company-scoped,
auditable, and covered by tests.

## What Changed

- Added plugin orchestration host APIs for issue creation, document
access, wakeups, summaries, plugin-origin activity, and scoped API route
dispatch.
- Added plugin database namespace tables, schema exports, migration
checks, and idempotent replay coverage under migration
`0059_plugin_database_namespaces`.
- Added shared plugin route/API types and validators used by server and
SDK boundaries.
- Expanded plugin SDK types, protocol helpers, worker RPC host behavior,
and testing utilities for orchestration flows.
- Added the `plugin-orchestration-smoke-example` package to exercise
scoped routes, restricted database namespaces, issue orchestration,
documents, wakeups, summaries, and UI status surfaces.
- Kept the new orchestration smoke fixture out of the root pnpm
workspace importer so this PR preserves the repository policy of not
committing `pnpm-lock.yaml`.
- Updated plugin docs and database docs for the new orchestration and
database namespace surfaces.
- Rebased the branch onto `public-gh/master`, resolved conflicts, and
removed `pnpm-lock.yaml` from the final PR diff.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm --filter @paperclipai/db typecheck`
- `pnpm exec vitest run packages/db/src/client.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-database.test.ts
server/src/__tests__/plugin-orchestration-apis.test.ts
server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/plugin-scoped-api-routes.test.ts
server/src/__tests__/plugin-sdk-orchestration-contract.test.ts`
- From `packages/plugins/examples/plugin-orchestration-smoke-example`:
`pnpm exec vitest run --config ./vitest.config.ts`
- `pnpm --dir
packages/plugins/examples/plugin-orchestration-smoke-example run
typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- PR CI on latest head `293fc67c`: `policy`, `verify`, `e2e`, and
`security/snyk` all passed.

## Risks

- Medium risk: this expands plugin host authority, so route auth,
company scoping, and plugin-origin activity attribution need careful
review.
- Medium risk: database namespace migration behavior must remain
idempotent for environments that may have seen earlier branch versions.
- Medium risk: the orchestration smoke fixture is intentionally excluded
from the root workspace importer to avoid a `pnpm-lock.yaml` PR diff;
direct fixture verification remains listed above.
- Low operational risk from the PR setup itself: the branch is rebased
onto current `master`, the migration is ordered after upstream
`0057`/`0058`, and `pnpm-lock.yaml` is not in the final diff.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

Roadmap checked: this work aligns with the completed Plugin system
milestone and extends the plugin surface rather than duplicating an
unrelated planned core feature.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in a tool-enabled CLI
environment. Exact hosted model build and context-window size are not
exposed by the runtime; reasoning/tool use were enabled for repository
inspection, editing, testing, git operations, and PR creation.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots (N/A: no core UI screen change; example plugin UI contract
is covered by tests)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 08:52:51 -05:00
Dotta
16b2b84d84 [codex] Improve agent runtime recovery and governance (#4086)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - The heartbeat runtime, agent import path, and agent configuration
defaults determine whether work is dispatched safely and predictably.
> - Several accumulated fixes all touched agent execution recovery, wake
routing, import behavior, and runtime concurrency defaults.
> - Those changes need to land together so the heartbeat service and
agent creation defaults stay internally consistent.
> - This pull request groups the runtime/governance changes from the
split branch into one standalone branch.
> - The benefit is safer recovery for stranded runs, bounded high-volume
reads, imported-agent approval correctness, skill-template support, and
a clearer default concurrency policy.

## What Changed

- Fixed stranded continuation recovery so successful automatic retries
are requeued instead of incorrectly blocking the issue.
- Bounded high-volume issue/log reads across issue, heartbeat, agent,
project, and workspace paths.
- Fixed imported-agent approval and instruction-path permission
handling.
- Quarantined seeded worktree execution state during worktree
provisioning.
- Queued approval follow-up wakes and hardened SQL_ASCII heartbeat
output handling.
- Added reusable agent instruction templates for hiring flows.
- Set the default max concurrent agent runs to five and updated related
UI/tests/docs.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/company-portability.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts
server/src/__tests__/heartbeat-comment-wake-batching.test.ts
server/src/__tests__/heartbeat-list.test.ts
server/src/__tests__/issues-service.test.ts
server/src/__tests__/agent-permissions-routes.test.ts
packages/adapter-utils/src/server-utils.test.ts
ui/src/lib/new-agent-runtime-config.test.ts`
- Split integration check: merged this branch first, followed by the
other [PAP-1614](/PAP/issues/PAP-1614) branches, with no merge
conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches heartbeat recovery, queueing, and issue list
bounds in central runtime paths.
- Imported-agent and concurrency default behavior changes may affect
existing automation that assumes one-at-a-time default runs.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:19:48 -05:00
Dotta
057fee4836 [codex] Polish issue and operator workflow UI (#4090)
## Thinking Path

> - Paperclip operators spend much of their time in issues, inboxes,
selectors, and rich comment threads.
> - Small interaction problems in those surfaces slow down supervision
of AI-agent work.
> - The branch included related operator quality-of-life fixes for issue
layout, inbox actions, recent selectors, mobile inputs, and chat
rendering stability.
> - These changes are UI-focused and can land independently from
workspace navigation and access-profile work.
> - This pull request groups the operator QoL fixes into one standalone
branch.
> - The benefit is a more stable and efficient board workflow for issue
triage and task editing.

## What Changed

- Widened issue detail content and added a desktop inbox archive action.
- Fixed mobile text-field zoom by keeping touch input font sizes at
16px.
- Prioritized recent picker selections for assignees/projects in issue
and routine flows.
- Showed actionable approvals in the Mine inbox model.
- Fixed issue chat renderer state crashes and hardened tests.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run ui/src/components/IssueChatThread.test.tsx
ui/src/lib/inbox.test.ts ui/src/lib/recent-selections.test.ts`
- Split integration check: merged last after the other
[PAP-1614](/PAP/issues/PAP-1614) branches with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Low to medium risk: mostly UI state, layout, and selection-priority
behavior.
- Visual layout and mobile zoom behavior may need browser/device QA
beyond component tests.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:16:41 -05:00
Dotta
fee514efcb [codex] Improve workspace navigation and runtime UI (#4089)
## Thinking Path

> - Paperclip agents do real work in project and execution workspaces.
> - Operators need workspace state to be visible, navigable, and
copyable without digging through raw run logs.
> - The branch included related workspace cards, navigation, runtime
controls, stale-service handling, and issue-property visibility.
> - These changes share the workspace UI and runtime-control surfaces
and can stand alone from unrelated access/profile work.
> - This pull request groups the workspace experience changes into one
standalone branch.
> - The benefit is a clearer workspace overview, better metadata copy
flows, and more accurate runtime service controls.

## What Changed

- Polished project workspace summary cards and made workspace metadata
copyable.
- Added a workspace navigation overview and extracted reusable project
workspace content.
- Squared and polished the execution workspace configuration page.
- Fixed stale workspace command matching and hid stopped stale services
in runtime controls.
- Showed live workspace service context in issue properties.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run
ui/src/components/ProjectWorkspaceSummaryCard.test.tsx
ui/src/lib/project-workspaces-tab.test.ts
ui/src/components/Sidebar.test.tsx
ui/src/components/WorkspaceRuntimeControls.test.tsx
ui/src/components/IssueProperties.test.tsx`
- `pnpm exec vitest run packages/shared/src/workspace-commands.test.ts
--config /dev/null` because the root Vitest project config does not
currently include `packages/shared` tests.
- Split integration check: merged after runtime/governance,
dev-infra/backups, and access/profiles with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches workspace navigation, runtime controls, and issue
property rendering.
- Visual layout changes may need browser QA, especially around smaller
screens and dense workspace metadata.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:14:32 -05:00
Dotta
d8b63a18e7 [codex] Add access cleanup and user profile page (#4088)
## Thinking Path

> - Paperclip is moving from a solo local operator model toward teams
supervising AI-agent companies.
> - Human access management and human-visible profile surfaces are part
of that multiple-user path.
> - The branch included related access cleanup, archived-member removal,
permission protection, and a user profile page.
> - These changes share company membership, user attribution, and
access-service behavior.
> - This pull request groups those human access/profile changes into one
standalone branch.
> - The benefit is safer member removal behavior and a first profile
surface for user work, activity, and cost attribution.

## What Changed

- Added archived company member removal support across shared contracts,
server routes/services, and UI.
- Protected company member removal with stricter permission checks and
tests.
- Added company user profile API, shared types, route wiring, client
API, route, and UI page.
- Simplified the user profile page visual design to a neutral
typography-led layout.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run server/src/__tests__/access-service.test.ts
server/src/__tests__/user-profile-routes.test.ts
ui/src/pages/CompanyAccess.test.tsx --hookTimeout=30000`
- `pnpm exec vitest run server/src/__tests__/user-profile-routes.test.ts
--testTimeout=30000 --hookTimeout=30000` after an initial local
embedded-Postgres hook timeout in the combined run.
- Split integration check: merged after runtime/governance and
dev-infra/backups with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: changes member removal permissions and adds a new user
profile route with cross-table stats.
- The profile page is a new UI surface and may need visual follow-up in
browser QA.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:10:20 -05:00
Dotta
e89d3f7e11 [codex] Add backup endpoint and dev runtime hardening (#4087)
## Thinking Path

> - Paperclip is a local-first control plane for AI-agent companies.
> - Operators need predictable local dev behavior, recoverable instance
data, and scripts that do not churn the running app.
> - Several accumulated changes improve backup streaming, dev-server
health, static UI caching/logging, diagnostic-file ignores, and instance
isolation.
> - These are operational improvements that can land independently from
product UI work.
> - This pull request groups the dev-infra and backup changes from the
split branch into one standalone branch.
> - The benefit is safer local operation, easier manual backups, less
noisy dev output, and less cross-instance auth leakage.

## What Changed

- Added a manual instance database backup endpoint and route tests.
- Streamed backup/restore handling to avoid materializing large payloads
at once.
- Reduced dev static UI log/cache churn and ignored Node diagnostic
report captures.
- Added guarded dev auto-restart health polling coverage.
- Preserved worktree config during provisioning and scoped auth cookies
by instance.
- Added a Discord daily digest helper script and environment
documentation.
- Hardened adapter-route and startup feedback export tests around the
changed infrastructure.

## Verification

- `pnpm install --frozen-lockfile`
- `pnpm exec vitest run packages/db/src/backup-lib.test.ts
server/src/__tests__/instance-database-backups-routes.test.ts
server/src/__tests__/server-startup-feedback-export.test.ts
server/src/__tests__/adapter-routes.test.ts
server/src/__tests__/dev-runner-paths.test.ts
server/src/__tests__/health-dev-server-token.test.ts
server/src/__tests__/http-log-policy.test.ts
server/src/__tests__/vite-html-renderer.test.ts
server/src/__tests__/workspace-runtime.test.ts
server/src/__tests__/better-auth.test.ts`
- Split integration check: merged after the runtime/governance branch
and before UI branches with no merge conflicts.
- Confirmed this branch does not include `pnpm-lock.yaml`.

## Risks

- Medium risk: touches server startup, backup streaming, auth cookie
naming, dev health checks, and worktree provisioning.
- Backup endpoint behavior depends on existing board/admin access
controls and database backup helpers.
- No database migrations are included.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4 tool-enabled coding model, agentic
code-editing/runtime with local shell and GitHub CLI access; exact
context window and reasoning mode are not exposed by the Paperclip
harness.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:08:55 -05:00
Dotta
236d11d36f [codex] Add run liveness continuations (#4083)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies.
> - Heartbeat runs are the control-plane record of each agent execution
window.
> - Long-running local agents can exhaust context or stop while still
holding useful next-step state.
> - Operators need that stop reason, next action, and continuation path
to be durable and visible.
> - This pull request adds run liveness metadata, continuation
summaries, and UI surfaces for issue run ledgers.
> - The benefit is that interrupted or long-running work can resume with
clearer context instead of losing the agent's last useful handoff.

## What Changed

- Added heartbeat-run liveness fields, continuation attempt tracking,
and an idempotent `0058` migration.
- Added server services and tests for run liveness, continuation
summaries, stop metadata, and activity backfill.
- Wired local and HTTP adapters to surface continuation/liveness context
through shared adapter utilities.
- Added shared constants, validators, and heartbeat types for liveness
continuation state.
- Added issue-detail UI surfaces for continuation handoffs and the run
ledger, with component tests.
- Updated agent runtime docs, heartbeat protocol docs, prompt guidance,
onboarding assets, and skills instructions to explain continuation
behavior.
- Addressed Greptile feedback by scoping document evidence by run,
excluding system continuation-summary documents from liveness evidence,
importing shared liveness types, surfacing hidden ledger run counts,
documenting bounded retry behavior, and moving run-ledger liveness
backfill off the request path.

## Verification

- `pnpm exec vitest run packages/adapter-utils/src/server-utils.test.ts
server/src/__tests__/run-continuations.test.ts
server/src/__tests__/run-liveness.test.ts
server/src/__tests__/activity-service.test.ts
server/src/__tests__/documents-service.test.ts
server/src/__tests__/issue-continuation-summary.test.ts
server/src/services/heartbeat-stop-metadata.test.ts
ui/src/components/IssueRunLedger.test.tsx
ui/src/components/IssueContinuationHandoff.test.tsx
ui/src/components/IssueDocumentsSection.test.tsx`
- `pnpm --filter @paperclipai/db build`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm --filter @paperclipai/server typecheck`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/run-continuations.test.ts
ui/src/components/IssueRunLedger.test.tsx`
- `pnpm exec vitest run
server/src/__tests__/heartbeat-process-recovery.test.ts -t "treats a
plan document update"`
- `pnpm exec vitest run server/src/__tests__/activity-service.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts -t "activity
service|treats a plan document update"`
- Remote PR checks on head `e53b1a1d`: `verify`, `e2e`, `policy`, and
Snyk all passed.
- Confirmed `public-gh/master` is an ancestor of this branch after
fetching `public-gh master`.
- Confirmed `pnpm-lock.yaml` is not included in the branch diff.
- Confirmed migration `0058_wealthy_starbolt.sql` is ordered after
`0057` and uses `IF NOT EXISTS` guards for repeat application.
- Greptile inline review threads are resolved.

## Risks

- Medium risk: this touches heartbeat execution, liveness recovery,
activity rendering, issue routes, shared contracts, docs, and UI.
- Migration risk is mitigated by additive columns/indexes and idempotent
guards.
- Run-ledger liveness backfill is now asynchronous, so the first ledger
response can briefly show historical missing liveness until the
background backfill completes.
- UI screenshot coverage is not included in this packaging pass;
validation is currently through focused component tests.

> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and
discuss it in `#dev` before opening the PR. Feature PRs that overlap
with planned core work may need to be redirected — check the roadmap
first. See `CONTRIBUTING.md`.

## Model Used

- OpenAI Codex, GPT-5.4, local tool-use coding agent with terminal, git,
GitHub connector, GitHub CLI, and Paperclip API access.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Screenshot note: no before/after screenshots were captured in this PR
packaging pass; the UI changes are covered by focused component tests
listed above.

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-20 06:01:49 -05:00
Dotta
b9a80dcf22 feat: implement multi-user access and invite flows (#3784)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies.
> - V1 needs to stay local-first while also supporting shared,
authenticated deployments.
> - Human operators need real identities, company membership, invite
flows, profile surfaces, and company-scoped access controls.
> - Agents and operators also need the existing issue, inbox, workspace,
approval, and plugin flows to keep working under those authenticated
boundaries.
> - This branch accumulated the multi-user implementation, follow-up QA
fixes, workspace/runtime refinements, invite UX improvements,
release-branch conflict resolution, and review hardening.
> - This pull request consolidates that branch onto the current `master`
branch as a single reviewable PR.
> - The benefit is a complete multi-user implementation path with tests
and docs carried forward without dropping existing branch work.

## What Changed

- Added authenticated human-user access surfaces: auth/session routes,
company user directory, profile settings, company access/member
management, join requests, and invite management.
- Added invite creation, invite landing, onboarding, logo/branding,
invite grants, deduped join requests, and authenticated multi-user E2E
coverage.
- Tightened company-scoped and instance-admin authorization across
board, plugin, adapter, access, issue, and workspace routes.
- Added profile-image URL validation hardening, avatar preservation on
name-only profile updates, and join-request uniqueness migration cleanup
for pending human requests.
- Added an atomic member role/status/grants update path so Company
Access saves no longer leave partially updated permissions.
- Improved issue chat, inbox, assignee identity rendering,
sidebar/account/company navigation, workspace routing, and execution
workspace reuse behavior for multi-user operation.
- Added and updated server/UI tests covering auth, invites, membership,
issue workspace inheritance, plugin authz, inbox/chat behavior, and
multi-user flows.
- Merged current `public-gh/master` into this branch, resolved all
conflicts, and verified no `pnpm-lock.yaml` change is included in this
PR diff.

## Verification

- `pnpm exec vitest run server/src/__tests__/issues-service.test.ts
ui/src/components/IssueChatThread.test.tsx ui/src/pages/Inbox.test.tsx`
- `pnpm run preflight:workspace-links && pnpm exec vitest run
server/src/__tests__/plugin-routes-authz.test.ts`
- `pnpm exec vitest run server/src/__tests__/plugin-routes-authz.test.ts
server/src/__tests__/workspace-runtime-service-authz.test.ts
server/src/__tests__/access-validators.test.ts`
- `pnpm exec vitest run
server/src/__tests__/authz-company-access.test.ts
server/src/__tests__/routines-routes.test.ts
server/src/__tests__/sidebar-preferences-routes.test.ts
server/src/__tests__/approval-routes-idempotency.test.ts
server/src/__tests__/openclaw-invite-prompt-route.test.ts
server/src/__tests__/agent-cross-tenant-authz-routes.test.ts
server/src/__tests__/routines-e2e.test.ts`
- `pnpm exec vitest run server/src/__tests__/auth-routes.test.ts
ui/src/pages/CompanyAccess.test.tsx`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/db typecheck && pnpm --filter @paperclipai/server
typecheck`
- `pnpm --filter @paperclipai/shared typecheck && pnpm --filter
@paperclipai/server typecheck`
- `pnpm --filter @paperclipai/ui typecheck`
- `pnpm db:generate`
- `npx playwright test --config tests/e2e/playwright.config.ts --list`
- Confirmed branch has no uncommitted changes and is `0` commits behind
`public-gh/master` before PR creation.
- Confirmed no `pnpm-lock.yaml` change is staged or present in the PR
diff.

## Risks

- High review surface area: this PR contains the accumulated multi-user
branch plus follow-up fixes, so reviewers should focus especially on
company-boundary enforcement and authenticated-vs-local deployment
behavior.
- UI behavior changed across invites, inbox, issue chat, access
settings, and sidebar navigation; no browser screenshots are included in
this branch-consolidation PR.
- Plugin install, upgrade, and lifecycle/config mutations now require
instance-admin access, which is intentional but may change expectations
for non-admin board users.
- A join-request dedupe migration rejects duplicate pending human
requests before creating unique indexes; deployments with unusual
historical duplicates should review the migration behavior.
- Company member role/status/grant saves now use a new combined
endpoint; older separate endpoints remain for compatibility.
- Full production build was not run locally in this heartbeat; CI should
cover the full matrix.

## Model Used

- OpenAI Codex coding agent, GPT-5-based model, CLI/tool-use
environment. Exact deployed model identifier and context window were not
exposed by the runtime.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Note on screenshots: this is a branch-consolidation PR for an
already-developed multi-user branch, and no browser screenshots were
captured during this heartbeat.

---------

Co-authored-by: dotta <dotta@example.com>
Co-authored-by: Paperclip <noreply@paperclip.ing>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 09:44:19 -05:00
Roman Barinov
e93e418cbf fix: add ssh client and jq to production image (#3826)
## Thinking Path

> - Paperclip is the control plane that runs long-lived AI-agent work in
production.
> - The production container image is the runtime boundary for agent
tools and shell access.
> - In our deployment, Paperclip agents now need a native SSH client and
`jq` available inside the final runtime container.
> - Installing those tools only via ai-rig entrypoint hacks is brittle
and drifts from the image source of truth.
> - This pull request updates the production Docker image itself so the
required binaries are present whenever the image is built.
> - The change is intentionally scoped to the final production stage so
build/deps stages do not gain extra packages unnecessarily.
> - The benefit is a cleaner, reproducible runtime image with fewer
deploy-specific workarounds.

## What Changed

- Added `openssh-client` to the production Docker image stage.
- Added `jq` to the production Docker image stage.
- Kept the package install in the final `production` stage instead of
the shared base stage to minimize scope.

## Verification

- Reviewed the final Dockerfile diff to confirm the packages are
installed in the `production` stage only.
- Attempted local image build with:
  - `docker build --target production -t paperclip:ssh-jq-test .`
- Local build could not be completed in this environment because the
local Docker daemon was unavailable:
- `Cannot connect to the Docker daemon at
unix:///Users/roman/.docker/run/docker.sock. Is the docker daemon
running?`

## Risks

- Low risk: image footprint increases slightly because two Debian
packages are added.
- `openssh-client` expands runtime capability, so this is appropriate
only because the deployed Paperclip runtime explicitly needs SSH access.

## Model Used

- OpenAI Codex / `gpt-5.4`
- Tool-using agent workflow via Hermes
- Context from local repository inspection, git, and shell tooling

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [ ] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-16 17:11:55 -05:00
Dotta
407e76c1db [codex] Fix Docker gh installation (#3844)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and the
Docker image is the no-local-Node path for running that control plane.
> - The deploy workflow builds and pushes that image from the repository
`Dockerfile`.
> - The current image setup adds GitHub CLI through GitHub's external
apt repository and verifies a mutable keyring URL with a pinned SHA256.
> - GitHub rotated the CLI Linux package signing key, so that pinned
keyring checksum now fails before Buildx can publish the image.
> - Paperclip already has a repo-local precedent in
`docker/untrusted-review/Dockerfile`: install Debian trixie's packaged
`gh` directly from the base distribution.
> - This pull request removes the external GitHub CLI apt
keyring/repository path from the production image and installs `gh` with
the rest of the Debian packages.
> - The benefit is a simpler Docker build that no longer fails when
GitHub rotates the apt keyring file.

## What Changed

- Updated the main `Dockerfile` base stage to install `gh` from Debian
trixie's package repositories.
- Removed the mutable GitHub CLI apt keyring download, pinned checksum
verification, extra apt source, second `apt-get update`, and separate
`gh` install step.

## Verification

- `git diff --check`
- `./scripts/docker-build-test.sh` skipped because Docker is installed
but the daemon is not running on this machine.
- Confirmed `https://packages.debian.org/trixie/gh` returns HTTP 200,
matching the base image distribution package source.

## Risks

- Debian's `gh` package can lag the latest upstream GitHub CLI release.
This is acceptable for the current image contract, which requires `gh`
availability but does not document a latest-upstream version guarantee.
- A full image build still needs to run in CI because the local Docker
daemon is unavailable in this environment.

## Model Used

- OpenAI Codex, GPT-5-based coding agent. Exact backend model ID was not
exposed in this runtime; tool use and shell execution were enabled.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-16 17:10:42 -05:00
Devin Foley
e458145583 docs: add public roadmap and update contribution policy for feature PRs (#3835)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - As the project grows, more contributors want to build features —
which is great
> - Without a public roadmap or clear contribution guidance,
contributors spend time on PRs that overlap with planned core work
> - This creates frustration on both sides when those PRs can't be
merged
> - This PR publishes a roadmap, updates the contribution guide with a
clear path for feature proposals, and reinforces the workflow in the PR
template
> - The benefit is that contributors know exactly how to propose
features and where to focus for the highest-impact contributions

## What Changed

- Added `ROADMAP.md` with expanded descriptions of all shipped and
planned milestones, plus guidance on coordinating feature contributions
- Added "Feature Contributions" section to `CONTRIBUTING.md` explaining
how to propose features (check roadmap → discuss in #dev → consider the
plugin system)
- Updated `.github/PULL_REQUEST_TEMPLATE.md` with a callout linking to
the roadmap and a new checklist item to check for overlap with planned
work, while preserving the newer required `Model Used` section from
`master`
- Added `Memory / Knowledge` to the README roadmap preview and linked
the preview to the full `ROADMAP.md`

## Verification

- Open `ROADMAP.md` on GitHub and confirm it renders correctly with all
milestone sections
- Read the new "Feature Contributions" section in `CONTRIBUTING.md` and
verify all links resolve
- Open a new PR and confirm the template shows the roadmap callout and
the new checklist item
- Verify README links to `ROADMAP.md` and the roadmap preview includes
"Memory / Knowledge"

## Risks

- Docs-only change — no runtime or behavioral impact
- Contribution policy changes were written to be constructive and to
offer clear alternative paths (plugins, coordination via #dev, reference
implementations as feedback)

## Model Used

- OpenAI Codex local agent (GPT-5-based coding model; exact runtime
model ID is not exposed in this environment)
- Tool use enabled for shell, git, GitHub CLI, and patch application
- Used to rebase the branch, resolve merge conflicts, update the PR
metadata, and verify the repo state

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have checked ROADMAP.md and confirmed this PR does not duplicate
planned core work
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable (N/A — docs only)
- [ ] If this change affects the UI, I have included before/after
screenshots (N/A — no UI changes)
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-16 13:04:50 -07:00
Dewaldt Huysamen
f701c3e78c feat(claude-local): add Opus 4.7 to adapter model dropdown (#3828)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Each adapter advertises a model list that powers the agent config UI
dropdown
> - The `claude_local` adapter's dropdown is sourced from the hard-coded
`models` array in `packages/adapters/claude-local/src/index.ts`
> - Anthropic recently released Opus 4.7, the newest current-generation
Opus model
> - Without a list entry, users cannot discover or select Opus 4.7 from
the dropdown (they can still type it manually, since the field is
creatable, but discoverability is poor)
> - This pull request adds `claude-opus-4-7` to the `claude_local` model
list so new agents can be configured with the latest model by default
> - The benefit is out-of-the-box access to the newest Opus model,
consistent with how every other current-generation Claude model is
already listed

## What Changed

- Added `{ id: "claude-opus-4-7", label: "Claude Opus 4.7" }` as the
**first** entry of the `models` array in
`packages/adapters/claude-local/src/index.ts`. Newest-first ordering
matches the convention already used for 4.6.

## Verification

- `pnpm --filter @paperclipai/adapter-claude-local typecheck` → passes.
- `pnpm --filter @paperclipai/server exec vitest run
src/__tests__/adapter-models.test.ts
src/__tests__/claude-local-adapter.test.ts` → 12/12 passing (both
directly-related files).
- No existing test pins the `claude_local` models array (see
`server/src/__tests__/adapter-models.test.ts`), so appending a new entry
is non-breaking.
- Manual check of UI consumer: `AgentConfigForm.tsx` fetches the list
via `agentsApi.adapterModels()` and renders it in a creatable popover —
no hard-coded expectations anywhere in the UI layer.
- Screenshots: single new option appears at the top of the Claude Code
(local) model dropdown; existing options unchanged.

## Risks

- Low risk. Purely additive: one new entry in a list consumed by a UI
dropdown. No behavior change for existing agents, no schema change, no
migration, no env var.
- `BEDROCK_MODELS` in
`packages/adapters/claude-local/src/server/models.ts` is intentionally
**not** touched — the exact region-qualified Bedrock id for Opus 4.7 is
not yet confirmed, and shipping a guessed id could produce a broken
option for Bedrock users. Tracked as a follow-up on the linked issue.

## Model Used

- None — human-authored.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable (no tests needed:
existing suite already covers the list-consumer paths)
- [x] If this change affects the UI, I have included before/after
screenshots (dropdown gains one new top entry; all other entries
unchanged)
- [x] I have updated relevant documentation to reflect my changes (no
doc update needed: `docs/adapters/claude-local.md` uses
`claude-opus-4-6` only as an example, still valid)
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Closes #3827
2026-04-16 13:18:30 -05:00
akhater
1afb6be961 fix(heartbeat): add hermes_local to SESSIONED_LOCAL_ADAPTERS (#3561)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The heartbeat service monitors agent health via PID liveness checks
for local adapters
> - `SESSIONED_LOCAL_ADAPTERS` in `heartbeat.ts` controls which adapters
get PID tracking and retry-on-lost behavior
> - `hermes_local` (the Hermes Agent adapter) was missing from this set
> - Without it, the orphan reaper immediately marks all Hermes runs as
`process_lost` instead of retrying
> - This PR adds the one-line registration so `hermes_local` gets the
same treatment as `claude_local`, `codex_local`, `cursor`, and
`gemini_local`
> - The benefit is Hermes agent runs complete normally instead of being
killed after ~5 minutes

## What Changed

- Added `"hermes_local"` to the `SESSIONED_LOCAL_ADAPTERS` set in
`server/src/services/heartbeat.ts`

## Verification

- Trigger a Hermes agent run via the wakeup API
- Confirm `heartbeat_runs.status` transitions to `succeeded` (not
`process_lost`)
- Tested end-to-end on a production Paperclip instance with Hermes agent
running heartbeat cycles for 48+ hours

## Risks

Low risk. Additive one-line change — adds a string to an existing set.
No behavioral change for other adapters. Consistent with
`BUILTIN_ADAPTER_TYPES` which already includes `hermes_local`.

## Model Used

- Provider: Anthropic
- Model: Claude Opus 4.6 (claude-opus-4-6)
- Context window: 1M tokens
- Capabilities: Tool use, code execution

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

Co-authored-by: Antoine Khater <akhater@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 07:35:02 -05:00
Dotta
b8725c52ef release: v2026.416.0 notes (#3782)
## Thinking Path

> - Paperclip is the control plane for autonomous AI companies, and
stable releases need a clear changelog artifact for operators upgrading
between versions.
> - The release-note workflow in this repo stores one stable changelog
file per release under `releases/`.
> - `v2026.410.0` and `v2026.413.0` were intermediate drafts for the
same release window, while the next stable release is `v2026.416.0`.
> - Keeping superseded draft release notes around would make the stable
release history noisy and misleading.
> - This pull request consolidates the intended content into
`releases/v2026.416.0.md` and removes the older
`releases/v2026.410.0.md` and `releases/v2026.413.0.md` files.
> - The benefit is a single canonical stable release note for
`v2026.416.0` with no duplicate release artifacts.

## What Changed

- Added `releases/v2026.416.0.md` as the canonical stable changelog for
the April 16, 2026 release.
- Removed the superseded `releases/v2026.410.0.md` and
`releases/v2026.413.0.md` draft release-note files.
- Kept the final release-note ordering and content as edited in the
working tree before commit.

## Verification

- Reviewed the git diff to confirm the PR only changes release-note
artifacts in `releases/`.
- Confirmed the branch is based on `public-gh/master` and contains a
single release-note commit.
- Did not run tests because this is a docs-only changelog update.

## Risks

- Low risk. The change is limited to release-note markdown files.
- The main risk is editorial: if any release item was meant to stay in a
separate changelog file, it now exists only in `v2026.416.0.md`.

## Model Used

- OpenAI GPT-5 Codex, model `gpt-5.4`, medium reasoning, tool use and
code execution in the Codex CLI environment.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [ ] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 21:40:35 -05:00
Dotta
5f45712846 Sync/master post pap1497 followups 2026 04 15 (#3779)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The board depends on issue, inbox, cost, and company-skill surfaces
to stay accurate and fast while agents are actively working
> - The PAP-1497 follow-up branch exposed a few rough edges in those
surfaces: stale active-run state on completed issues, missing creator
filters, oversized issue payload scans, and placeholder issue-route
parsing
> - Those gaps make the control plane harder to trust because operators
can see misleading run state, miss the right subset of work, or pay
extra query/render cost on large issue records
> - This pull request tightens those follow-ups across server and UI
code, and adds regression coverage for the affected paths
> - The benefit is a more reliable issue workflow, safer high-volume
cost aggregation, and clearer board/operator navigation

## What Changed

- Added the `v2026.415.0` release changelog entry.
- Fixed stale issue-run presentation after completion and reused the
shared issue-path parser so literal route placeholders no longer become
issue links.
- Added creator filters to the Issues page and Inbox, including
persisted filter-state normalization and regression coverage.
- Bounded issue detail/list project-mention scans and trimmed large
issue-list payload fields to keep issue reads lighter.
- Hardened company-skill list projection and cost/finance aggregation so
large markdown blobs and large summed values do not leak into list
responses or overflow 32-bit casts.
- Added targeted server/UI regression tests for company skills,
costs/finance, issue mention scanning, creator filters, inbox
normalization, and issue reference parsing.

## Verification

- `pnpm exec vitest run
server/src/__tests__/company-skills-service.test.ts
server/src/__tests__/costs-service.test.ts
server/src/__tests__/issues-goal-context-routes.test.ts
server/src/__tests__/issues-service.test.ts ui/src/lib/inbox.test.ts
ui/src/lib/issue-filters.test.ts ui/src/lib/issue-reference.test.ts`
- `gh pr checks 3779`
Current pass set on the PR head: `policy`, `verify`, `e2e`,
`security/snyk (cryppadotta)`, `Greptile Review`

## Risks

- Creator filter options are derived from the currently loaded
issue/agent data, so very sparse result sets may not surface every
historical creator until they appear in the active dataset.
- Cost/finance aggregate casts now use `double precision`; that removes
the current overflow risk, but future schema changes should keep
large-value aggregation behavior under review.
- Issue detail mention scanning now skips comment-body scans on the
detail route, so any consumer that relied on comment-only project
mentions there would need to fetch them separately.

## Model Used

- OpenAI Codex, GPT-5-based coding agent with terminal tool use and
local code execution in the Paperclip workspace. Exact internal model
ID/context-window exposure is not surfaced in this session.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 21:13:56 -05:00
Dotta
d4c3899ca4 [codex] improve issue and routine UI responsiveness (#3744)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Operators rely on issue, inbox, and routine views to understand what
the company is doing in real time
> - Those views need to stay fast and readable even when issue lists,
markdown comments, and run metadata get large
> - The current branch had a coherent set of UI and live-update
improvements spread across issue search, issue detail rendering, routine
affordances, and workspace lookups
> - This pull request groups those board-facing changes into one
standalone branch that can merge independently of the heartbeat/runtime
work
> - The benefit is a faster, clearer issue and routine workflow without
changing the underlying task model

## What Changed

- Show routine execution issues by default and rename the filter to
`Hide routine runs` so the default state no longer looks like an active
filter.
- Show the routine name in the run dialog and tighten the issue
properties pane with a workspace link, copy-on-click behavior, and an
inline parent arrow.
- Reduce issue detail rerenders, keep queued issue chat mounted, improve
issues page search responsiveness, and speed up issues first paint.
- Add inbox "other search results", refresh visible issue runs after
status updates, and optimize workspace lookups through summary-mode
execution workspace queries.
- Improve markdown wrapping and scrolling behavior for long strings and
self-comment code blocks.
- Relax the markdown sanitizer assertion so the test still validates
safety after the new wrap-friendly inline styles.

## Verification

- `pnpm vitest run ui/src/components/IssuesList.test.tsx
ui/src/lib/inbox.test.ts ui/src/pages/Issues.test.tsx
ui/src/context/BreadcrumbContext.test.tsx
ui/src/context/LiveUpdatesProvider.test.ts
ui/src/components/MarkdownBody.test.tsx
ui/src/api/execution-workspaces.test.ts
server/src/__tests__/execution-workspaces-routes.test.ts`

## Risks

- This touches several issue-facing UI surfaces at once, so regressions
would most likely show up as stale rendering, search result mismatches,
or small markdown presentation differences.
- The workspace lookup optimization depends on the summary-mode route
shape staying aligned between server and UI.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
2026-04-15 15:54:05 -05:00
Jannes Stubbemann
7463479fc8 fix: disable HTTP caching on run log endpoints (#3724)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Every run emits a streaming log that the web UI polls so humans can
watch what the agent is doing
> - Log responses go out without explicit cache directives, so Express
adds an ETag
> - If the first poll lands before any bytes have been written, the
browser caches the empty / partial snapshot and keeps getting `304 Not
Modified` on every subsequent poll
> - The transcript pane then stays stuck on "Waiting for transcript…"
even after the log has plenty of content
> - This pull request sets `Cache-Control: no-cache, no-store` on both
run-log endpoints so the conditional-request path is defeated

## What Changed

- `server/src/routes/agents.ts` — `GET /heartbeat-runs/:runId/log` now
sets `Cache-Control: no-cache, no-store` on the response.
- Same change applied to `GET /workspace-operations/:operationId/log`
(same structure, same bug).

## Verification

- Reproduction: start a long-running agent, watch the transcript pane.
Before the fix, open devtools and observe `304 Not Modified` on each
poll after the initial 200 with an empty body; the UI never updates.
After the fix, each poll is a 200 with fresh bytes.
- Existing tests pass.

## Risks

Low. Cache headers only affect whether the browser revalidates; the
response body is unchanged. No API surface change.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:53:25 -05:00
Dotta
3fa5d25de1 [codex] harden heartbeat run summaries and recovery context (#3742)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Heartbeat runs are the control-plane record of what agents did, why
they woke up, and what operators should see next
> - Run lists, stranded issue comments, and live log polling all depend
on compact but accurate heartbeat summaries
> - The current branch had a focused backend slice that improves how run
result JSON is summarized, how stale process recovery comments are
written, and how live log polling resolves the active run
> - This pull request isolates that heartbeat/runtime reliability work
from the unrelated UI and dev-tooling changes
> - The benefit is more reliable issue context and cheaper run lookups
without dragging unrelated board UI changes into the same review

## What Changed

- Include the latest run failure in stranded issue comments during
orphaned process recovery.
- Bound heartbeat `result_json` payloads for list responses while
preserving the raw stored payloads.
- Narrow heartbeat log endpoint lookups so issue polling resolves the
relevant active run with less unnecessary scanning.
- Add focused tests for heartbeat list summaries, live run polling,
orphaned process recovery, and the run context/result summary helpers.

## Verification

- `pnpm vitest run
server/src/__tests__/heartbeat-context-summary.test.ts
server/src/__tests__/heartbeat-list.test.ts
server/src/__tests__/agent-live-run-routes.test.ts
server/src/__tests__/heartbeat-process-recovery.test.ts`

## Risks

- The main risk is accidentally hiding a field that some client still
expects from summarized `result_json`, or over-constraining the live log
lookup path for edge-case run routing.
- Recovery comments now surface the latest failure more aggressively, so
wording changes may affect downstream expectations if anyone parses
those comments too strictly.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 09:48:39 -05:00
Dotta
c1a02497b0 [codex] fix worktree dev dependency ergonomics (#3743)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Local development needs to work cleanly across linked git worktrees
because Paperclip itself leans on worktree-based engineering workflows
> - Dev-mode asset routing, Vite watch behavior, and workspace package
links are part of that day-to-day control-plane ergonomics
> - The current branch had a small but coherent set of
worktree/dev-tooling fixes that are independent from both the issue UI
changes and the heartbeat runtime changes
> - This pull request isolates those environment fixes into a standalone
branch that can merge without carrying unrelated product work
> - The benefit is a smoother multi-worktree developer loop with fewer
stale links and less noisy dev watching

## What Changed

- Serve dev public assets before the HTML shell and add a routing test
that locks that behavior in.
- Ignore UI test files in the Vite dev watch helper so the dev server
does less unnecessary work.
- Update `ensure-workspace-package-links.ts` to relink stale workspace
dependencies whenever a workspace `node_modules` directory exists,
instead of only inside linked-worktree detection paths.

## Verification

- `pnpm vitest run server/src/__tests__/app-vite-dev-routing.test.ts
ui/src/lib/vite-watch.test.ts`
- `node cli/node_modules/tsx/dist/cli.mjs
scripts/ensure-workspace-package-links.ts`

## Risks

- The asset routing change is low risk but sits near app shell behavior,
so a regression would show up as broken static assets in dev mode.
- The workspace-link repair now runs in more cases, so the main risk is
doing unexpected relinks when a checkout has intentionally unusual
workspace symlink state.

## Model Used

- OpenAI Codex, GPT-5-based coding agent in the Codex CLI environment.
Exact backend model deployment ID was not exposed in-session.
Tool-assisted editing and shell execution were used.

## Checklist

- [x] I have included a thinking path that traces from project context
to this change
- [x] I have specified the model used (with version and capability
details)
- [x] I have run tests locally and they pass
- [x] I have added or updated tests where applicable
- [x] If this change affects the UI, I have included before/after
screenshots
- [x] I have updated relevant documentation to reflect my changes
- [x] I have considered and documented any risks above
- [x] I will address all Greptile and reviewer comments before
requesting merge
2026-04-15 09:47:29 -05:00
Jannes Stubbemann
390502736c chore(ui): drop console.* and legal comments in production builds (#3728)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - The web UI is a single-page app built with Vite and shipped as a
static bundle to every deployment
> - Production bundles carry `console.log` / `console.debug` calls from
dev code and `/*! … */` legal-comment banners from third-party packages
> - The console calls leak internals to anyone opening devtools and
waste bytes per call site; the legal banners accumulate throughout the
bundle
> - Both problems affect every self-hoster, since they all ship the same
UI bundle
> - This pull request configures esbuild (via `vite.config.ts`) to strip
`console` and `debugger` statements and drop inline legal comments from
production builds only

## What Changed

- `ui/vite.config.ts`:
  - Switch to the functional `defineConfig(({ mode }) => …)` form.
- Add `build.minify: "esbuild"` (explicit — it's the existing default).
- Add `esbuild.drop: ["console", "debugger"]` and
`esbuild.legalComments: "none"`, gated on `mode === "production"` so
`vite dev` is unaffected.

## Verification

- `pnpm --filter @paperclipai/ui build` then grep the
`ui/dist/assets/*.js` bundle for `console.log` — no occurrences.
- `pnpm --filter @paperclipai/ui dev` — `console.log` calls in source
still reach the browser console.
- Bundle size: small reduction (varies with project but measurable on a
fresh build).

## Risks

Low. No API surface change. Production code should not depend on
`console.*` for side effects; any call that did is now a dead call,
which is the same behavior most minifiers apply.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:46:12 -05:00
Jannes Stubbemann
0d87fd9a11 fix: proper cache headers for static assets and SPA fallback (#3734)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Every deployment serves the same Vite-built UI bundle from the same
express app
> - Vite emits JS/CSS under `/assets/<name>.<hash>.<ext>` — the hash
rolls whenever the content rolls, so these files are inherently
immutable
> - `index.html` references specific hashed filenames, so it has the
opposite lifecycle: whenever we deploy, the file changes but the URL
doesn't
> - Today the static middleware sends neither with cache headers, and
the SPA fallback serves `index.html` for any unmatched route — including
paths under `/assets/` that no longer exist after a deploy
> - That combination produces the familiar "blank screen after deploy" +
`Failed to load module script: Expected a JavaScript MIME type but
received 'text/html'` bug
> - This pull request caches hashed assets immutably, forces
`index.html` to `no-cache` everywhere it gets served, and returns 404
for missing `/assets/*` paths

## What Changed

- `server/src/app.ts`:
- Serve `/assets/*` with `Cache-Control: public, max-age=31536000,
immutable`.
- Serve the remaining static files (favicon, manifest, robots.txt) with
a 1-hour cache, but override to `no-cache` specifically for `index.html`
via the `setHeaders` hook — because `express.static` serves it directly
for `/` and `/index.html`.
- The SPA fallback (`app.get(/.*/, …)`) sets `Cache-Control: no-cache`
on its `index.html` response.
- The fallback returns 404 for paths under `/assets/` so browsers don't
cache the HTML shell as a JavaScript module.

## Verification

- `curl -i http://localhost:3100/assets/index-abc123.js` →
`cache-control: public, max-age=31536000, immutable`.
- `curl -i http://localhost:3100/` → `cache-control: no-cache`.
- `curl -i http://localhost:3100/assets/missing.js` → `404`.
- `curl -i http://localhost:3100/some/spa/route` → `200` HTML with
`cache-control: no-cache`.

## Risks

Low. Asset URLs and HTML content are unchanged; only response headers
and the 404 behavior for missing asset paths change. No API surface
affected.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:45:22 -05:00
Jannes Stubbemann
6059c665d5 fix(a11y): remove maximum-scale and user-scalable=no from viewport (#3726)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans watch and oversee those agents through a web UI
> - Accessibility matters for anyone who cannot read small text
comfortably — they rely on browser zoom
> - The app shell's viewport meta tag includes `maximum-scale=1.0,
user-scalable=no`
> - Those tokens disable pinch-zoom and are a WCAG 2.1 SC 1.4.4 (Resize
Text) failure
> - The original motivation — suppressing iOS Safari's auto-zoom on
focused inputs — is actually a font-size issue, not a viewport issue,
and modern Safari only auto-zooms when input font-size is below 16px
> - This pull request drops the two tokens, restoring pinch-zoom while
leaving the real fix (inputs at ≥16px) to CSS

## What Changed

- `ui/index.html` — remove `maximum-scale=1.0, user-scalable=no` from
the viewport meta tag. Keep `width=device-width, initial-scale=1.0,
viewport-fit=cover`.

## Verification

- Manual on iOS and Chrome mobile: pinch-to-zoom now works across the
app.
- Manual on desktop: Ctrl+/- zoom already worked via
`initial-scale=1.0`; unchanged.

## Risks

Low. Users who were relying on auto-zoom-suppression for text inputs
will notice nothing (modern Safari only auto-zooms below 16px). No API
surface change.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:43:45 -05:00
Jannes Stubbemann
f460f744ef fix: trust PAPERCLIP_PUBLIC_URL in board mutation guard (#3731)
## Thinking Path

> - Paperclip orchestrates AI agents for zero-human companies
> - Humans interact with the system through a web UI that authenticates
a session and then issues mutations against the board
> - A CSRF-style guard (`boardMutationGuard`) protects those mutations
by requiring the request origin match a trusted set built from the
`Host` / `X-Forwarded-Host` header
> - Behind certain reverse proxies, neither header matches the public
URL — TLS terminates at the edge and the inbound `Host` carries an
internal service name (cluster-local hostname, IP, or an Ingress backend
reference)
> - Mutations from legitimate browser sessions then fail with `403 Board
mutation requires trusted browser origin`
> - `PAPERCLIP_PUBLIC_URL` is already the canonical "what operators told
us the public URL is" value — it's used by better-auth and `config.ts`
> - This pull request adds it to the trusted-origin set when set, so
browsers reaching the legit public URL aren't blocked

## What Changed

- `server/src/middleware/board-mutation-guard.ts` — parse
`PAPERCLIP_PUBLIC_URL` and add its origin to the trusted set in
`trustedOriginsForRequest`. Additive only.

## Verification

- `PAPERCLIP_PUBLIC_URL=https://example.com pnpm start` then issue a
mutation from a browser pointed at `https://example.com`: 200, as
before. From an unrecognized origin: 403, as before.
- Without `PAPERCLIP_PUBLIC_URL` set: behavior is unchanged.

## Risks

Low. Additive only. The default dev origins and the
`Host`/`X-Forwarded-Host`-derived origins continue to be trusted; this
just adds the operator-configured public URL on top.

## Model Used

Claude Opus 4.6 (1M context), extended thinking mode.

## Checklist

- [x] Thinking path traces from project context to this change
- [x] Model used specified
- [x] Tests run locally and pass
- [x] CI green
- [x] Greptile review addressed
2026-04-15 09:42:55 -05:00
1004 changed files with 299732 additions and 8927 deletions

View File

@@ -154,6 +154,14 @@ Each AGENTS.md body should include not just what the agent does, but how they fi
This turns a collection of agents into an organization that actually works together. Without workflow context, agents operate in isolation — they do their job but don't know what happens before or after them.
Add a concise execution contract to every generated working agent:
- Start actionable work in the same heartbeat and do not stop at a plan unless planning was requested.
- Leave durable progress in comments, documents, or work products with the next action.
- Use child issues for long or parallel delegated work instead of polling agents, sessions, or processes.
- Mark blocked work with the unblock owner and action.
- Respect budget, pause/cancel, approval gates, and company boundaries.
### Step 5: Confirm Output Location
Ask the user where to write the package. Common options:

View File

@@ -105,6 +105,13 @@ Your responsibilities:
- Implement features and fix bugs
- Write tests and documentation
- Participate in code reviews
Execution contract:
- Start actionable implementation work in the same heartbeat; do not stop at a plan unless planning was requested.
- Leave durable progress with a clear next action.
- Use child issues for long or parallel delegated work instead of polling agents, sessions, or processes.
- Mark blocked work with the unblock owner and action.
```
## teams/engineering/TEAM.md

View File

@@ -548,7 +548,7 @@ Import from `@paperclipai/adapter-utils/server-utils`:
### Prompt Templates
- Support `promptTemplate` for every run
- Use `renderTemplate()` with the standard variable set
- Default prompt: `"You are agent {{agent.id}} ({{agent.name}}). Continue your Paperclip work."`
- Default prompt should use `DEFAULT_PAPERCLIP_AGENT_PROMPT_TEMPLATE` from `@paperclipai/adapter-utils/server-utils` so local adapters share Paperclip's execution contract: act in the same heartbeat, avoid planning-only exits unless requested, leave durable progress and a next action, use child issues instead of polling, mark blockers with owner/action, and respect governance boundaries.
### Error Handling
- Differentiate timeout vs process error vs parse failure

View File

@@ -177,8 +177,12 @@ real name or email). To find GitHub usernames:
**Never expose contributor email addresses.** Use `@username` only.
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list. List contributors
in alphabetical order by GitHub username (case-insensitive).
Exclude bot accounts (e.g. `lockfile-bot`, `dependabot`) from the list.
Exclude Paperclip founders from the list (e.g. `cryppadotta`, `forgottendev`, `devinfoley`, `sockmonster`, `scotttong`)
List contributors in alphabetical order by GitHub username (case-insensitive).
If there are no contributors left after exclusions, then just skip this section and don't mention it.
## Step 6 — Review Before Release

View File

@@ -2,3 +2,6 @@ DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
PORT=3100
SERVE_UI=false
BETTER_AUTH_SECRET=paperclip-dev-secret
# Discord webhook for daily merge digest (scripts/discord-daily-digest.sh)
# DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...

View File

@@ -38,6 +38,8 @@
-
> For core feature work, check [`ROADMAP.md`](ROADMAP.md) first and discuss it in `#dev` before opening the PR. Feature PRs that overlap with planned core work may need to be redirected — check the roadmap first. See `CONTRIBUTING.md`.
## Model Used
<!--
@@ -57,6 +59,7 @@
- [ ] I have included a thinking path that traces from project context to this change
- [ ] I have specified the model used (with version and capability details)
- [ ] I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work
- [ ] I have run tests locally and they pass
- [ ] I have added or updated tests where applicable
- [ ] If this change affects the UI, I have included before/after screenshots

View File

@@ -14,7 +14,7 @@ permissions:
jobs:
build-and-push:
runs-on: ubuntu-latest
timeout-minutes: 30
timeout-minutes: 60
concurrency:
group: docker-${{ github.ref }}
cancel-in-progress: true

View File

@@ -23,7 +23,9 @@ jobs:
- name: Block manual lockfile edits
if: github.head_ref != 'chore/refresh-lockfile'
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
# Diff the PR branch against its merge base so recent base-branch commits
# do not masquerade as changes made by the PR itself.
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
if printf '%s\n' "$changed" | grep -qx 'pnpm-lock.yaml'; then
echo "Do not commit pnpm-lock.yaml in pull requests. CI owns lockfile updates."
exit 1
@@ -41,48 +43,20 @@ jobs:
node-version: 24
- name: Validate Dockerfile deps stage
run: node ./scripts/check-docker-deps-stage.mjs
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Verify release package bootstrap for changed manifests
run: |
missing=0
# Extract only the deps stage from the Dockerfile
deps_stage="$(awk '/^FROM .* AS deps$/{found=1; next} found && /^FROM /{exit} found{print}' Dockerfile)"
if [ -z "$deps_stage" ]; then
echo "::error::Could not extract deps stage from Dockerfile (expected 'FROM ... AS deps')"
exit 1
fi
# Derive workspace search roots from pnpm-workspace.yaml (exclude dev-only packages)
search_roots="$(grep '^ *- ' pnpm-workspace.yaml | sed 's/^ *- //' | sed 's/\*$//' | grep -v 'examples' | grep -v 'create-paperclip-plugin' | tr '\n' ' ')"
if [ -z "$search_roots" ]; then
echo "::error::Could not derive workspace roots from pnpm-workspace.yaml"
exit 1
fi
# Check all workspace package.json files are copied in the deps stage
for pkg in $(find $search_roots -maxdepth 2 -name package.json -not -path '*/examples/*' -not -path '*/create-paperclip-plugin/*' -not -path '*/node_modules/*' 2>/dev/null | sort -u); do
dir="$(dirname "$pkg")"
if ! echo "$deps_stage" | grep -q "^COPY ${dir}/package.json"; then
echo "::error::Dockerfile deps stage missing: COPY ${pkg} ${dir}/"
missing=1
fi
done
# Check patches directory is copied if it exists
if [ -d patches ] && ! echo "$deps_stage" | grep -q '^COPY patches/'; then
echo "::error::Dockerfile deps stage missing: COPY patches/ patches/"
missing=1
fi
if [ "$missing" -eq 1 ]; then
echo "Dockerfile deps stage is out of sync. Update it to include the missing files."
exit 1
fi
mapfile -t changed_paths < <(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")
PAPERCLIP_RELEASE_BOOTSTRAP_BASE_SHA="${{ github.event.pull_request.base.sha }}" \
node ./scripts/check-release-package-bootstrap.mjs "${changed_paths[@]}"
- name: Validate dependency resolution when manifests change
run: |
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}")"
changed="$(git diff --name-only "${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }}")"
manifest_pattern='(^|/)package\.json$|^pnpm-workspace\.yaml$|^\.npmrc$|^pnpmfile\.(cjs|js|mjs)$'
if printf '%s\n' "$changed" | grep -Eq "$manifest_pattern"; then
pnpm install --lockfile-only --ignore-scripts --no-frozen-lockfile
@@ -111,16 +85,88 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Typecheck
run: pnpm -r typecheck
- name: Typecheck workspaces whose build scripts skip TypeScript
run: pnpm run typecheck:build-gaps
- name: Run tests
run: pnpm test:run
- name: Run general test suites
run: pnpm test:run:general
- name: Verify release registry test coverage
run: pnpm run test:release-registry
- name: Build
run: pnpm build
- name: Release canary dry run
verify_serialized_server:
name: Verify serialized server suites (${{ matrix.shard_label }})
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
strategy:
fail-fast: false
matrix:
include:
- shard_index: 0
shard_count: 4
shard_label: 1/4
- shard_index: 1
shard_count: 4
shard_label: 2/4
- shard_index: 2
shard_count: 4
shard_label: 3/4
- shard_index: 3
shard_count: 4
shard_label: 4/4
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Run serialized server test shard
run: pnpm test:run:serialized -- --shard-index ${{ matrix.shard_index }} --shard-count ${{ matrix.shard_count }}
canary_dry_run:
name: Canary Dry Run
needs: [policy]
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 9.15.4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 24
cache: pnpm
- name: Install dependencies
run: pnpm install --frozen-lockfile
# `release.sh` always executes its Step 2/7 workspace build, even when
# `--skip-verify` bypasses the initial verification gate.
- name: Release canary dry run via release.sh internal build
run: |
git checkout -B master HEAD
git checkout -- pnpm-lock.yaml
@@ -149,9 +195,6 @@ jobs:
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build
run: pnpm build
- name: Install Playwright
run: npx playwright install --with-deps chromium

View File

@@ -50,6 +50,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -89,6 +92,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -139,6 +145,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile
@@ -177,6 +186,9 @@ jobs:
node-version: 24
cache: pnpm
- name: Validate release package manifest
run: node ./scripts/release-package-map.mjs check
- name: Install dependencies
run: pnpm install --no-frozen-lockfile

5
.gitignore vendored
View File

@@ -1,5 +1,9 @@
node_modules
node_modules/
**/node_modules
**/node_modules/
dist/
ui/storybook-static/
.env
*.tsbuildinfo
drizzle/meta/
@@ -32,6 +36,7 @@ server/src/**/*.d.ts
server/src/**/*.d.ts.map
tmp/
feedback-export-*
diagnostics/
# Editor / tool temp files
*.tmp

View File

@@ -123,7 +123,9 @@ pnpm test:release-smoke
Run the browser suites only when your change touches them or when you are explicitly verifying CI/release flows.
Run this full check before claiming done:
For normal issue work, run the smallest relevant verification first. Do not default to repo-wide typecheck/build/test on every heartbeat when a narrower check is enough to prove the change.
Run this full check before claiming repo work done in a PR-ready hand-off, or when the change scope is broad enough that targeted checks are not sufficient:
```sh
pnpm -r typecheck

View File

@@ -51,6 +51,21 @@ All tests must pass before a PR can be merged. Run them locally first and verify
We use [Greptile](https://greptile.com) for automated code review. Your PR must achieve a **5/5 Greptile score** with **all Greptile comments addressed** before it can be merged. If Greptile leaves comments, fix or respond to each one and request a re-review.
## Feature Contributions
We actively manage the core Paperclip feature roadmap.
Uncoordinated feature PRs against the core product may be closed, even when the implementation is thoughtful and high quality. That is about roadmap ownership, product coherence, and long-term maintenance commitment, not a judgment about the effort.
If you want to contribute a feature:
- Check [ROADMAP.md](ROADMAP.md) first
- Start the discussion in Discord -> `#dev` before writing code
- If the idea fits as an extension, prefer building it with the [plugin system](doc/plugins/PLUGIN_SPEC.md)
- If you want to show a possible direction, reference implementations are welcome as feedback, but they generally will not be merged directly into core
Bugs, docs improvements, and small targeted improvements are still the easiest path to getting merged, and we really do appreciate them.
## General Rules (both paths)
- Write clear commit messages

View File

@@ -1,16 +1,9 @@
# syntax=docker/dockerfile:1.20
FROM node:lts-trixie-slim AS base
ARG USER_UID=1000
ARG USER_GID=1000
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates gosu curl git wget ripgrep python3 \
&& mkdir -p -m 755 /etc/apt/keyrings \
&& wget -nv -O/etc/apt/keyrings/githubcli-archive-keyring.gpg https://cli.github.com/packages/githubcli-archive-keyring.gpg \
&& echo "20e0125d6f6e077a9ad46f03371bc26d90b04939fb95170f5a1905099cc6bcc0 /etc/apt/keyrings/githubcli-archive-keyring.gpg" | sha256sum -c - \
&& chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
&& mkdir -p -m 755 /etc/apt/sources.list.d \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" > /etc/apt/sources.list.d/github-cli.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends gh \
&& apt-get install -y --no-install-recommends ca-certificates gosu curl gh git wget ripgrep python3 \
&& rm -rf /var/lib/apt/lists/* \
&& corepack enable
@@ -29,6 +22,7 @@ COPY packages/shared/package.json packages/shared/
COPY packages/db/package.json packages/db/
COPY packages/adapter-utils/package.json packages/adapter-utils/
COPY packages/mcp-server/package.json packages/mcp-server/
COPY packages/adapters/acpx-local/package.json packages/adapters/acpx-local/
COPY packages/adapters/claude-local/package.json packages/adapters/claude-local/
COPY packages/adapters/codex-local/package.json packages/adapters/codex-local/
COPY packages/adapters/cursor-local/package.json packages/adapters/cursor-local/
@@ -37,6 +31,8 @@ COPY packages/adapters/openclaw-gateway/package.json packages/adapters/openclaw-
COPY packages/adapters/opencode-local/package.json packages/adapters/opencode-local/
COPY packages/adapters/pi-local/package.json packages/adapters/pi-local/
COPY packages/plugins/sdk/package.json packages/plugins/sdk/
COPY --parents packages/plugins/sandbox-providers/./*/package.json packages/plugins/sandbox-providers/
COPY packages/plugins/paperclip-plugin-fake-sandbox/package.json packages/plugins/paperclip-plugin-fake-sandbox/
COPY patches/ patches/
RUN pnpm install --frozen-lockfile
@@ -56,6 +52,9 @@ ARG USER_GID=1000
WORKDIR /app
COPY --chown=node:node --from=build /app /app
RUN npm install --global --omit=dev @anthropic-ai/claude-code@latest @openai/codex@latest opencode-ai \
&& apt-get update \
&& apt-get install -y --no-install-recommends openssh-client jq \
&& rm -rf /var/lib/apt/lists/* \
&& mkdir -p /paperclip \
&& chown node:node /paperclip

119
README.md
View File

@@ -6,7 +6,8 @@
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a>
</p>
<p align="center">
@@ -156,6 +157,115 @@ Paperclip handles the hard orchestration details correctly.
<br/>
## What's Under the Hood
Paperclip is a full control plane, not a wrapper. Before you build any of this yourself, know that it already exists:
```
┌──────────────────────────────────────────────────────────────┐
│ PAPERCLIP SERVER │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │Identity & │ │ Work & │ │ Heartbeat │ │Governance │ │
│ │ Access │ │ Tasks │ │ Execution │ │& Approvals│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Org Chart │ │Workspaces │ │ Plugins │ │ Budget │ │
│ │ & Agents │ │ & Runtime │ │ │ │ & Costs │ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐ │
│ │ Routines │ │ Secrets & │ │ Activity │ │ Company │ │
│ │& Schedules│ │ Storage │ │ & Events │ │Portability│ │
│ └───────────┘ └───────────┘ └───────────┘ └───────────┘ │
└──────────────────────────────────────────────────────────────┘
▲ ▲ ▲ ▲
┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐
│ Claude │ │ Codex │ │ CLI │ │ HTTP/web │
│ Code │ │ │ │ agents │ │ bots │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
```
### The Systems
<table>
<tr>
<td width="50%">
**Identity & Access** — Two deployment modes (trusted local or authenticated), board users, agent API keys, short-lived run JWTs, company memberships, invite flows, and OpenClaw onboarding. Every mutating request is traced to an actor.
</td>
<td width="50%">
**Org Chart & Agents** — Agents have roles, titles, reporting lines, permissions, and budgets. Adapter examples match the diagram: Claude Code, Codex, CLI agents such as Cursor/Gemini/bash, HTTP/webhook bots such as OpenClaw, and external adapter plugins. If it can receive a heartbeat, it's hired.
</td>
</tr>
<tr>
<td>
**Work & Task System** — Issues carry company/project/goal/parent links, atomic checkout with execution locks, first-class blocker dependencies, comments, documents, attachments, work products, labels, and inbox state. No double-work, no lost context.
</td>
<td>
**Heartbeat Execution** — DB-backed wakeup queue with coalescing, budget checks, workspace resolution, secret injection, skill loading, and adapter invocation. Runs produce structured logs, cost events, session state, and audit trails. Recovery handles orphaned runs automatically.
</td>
</tr>
<tr>
<td>
**Workspaces & Runtime** — Project workspaces, isolated execution workspaces (git worktrees, operator branches), and runtime services (dev servers, preview URLs). Agents work in the right directory with the right context every time.
</td>
<td>
**Governance & Approvals** — Board approval workflows, execution policies with review/approval stages, decision tracking, budget hard-stops, agent pause/resume/terminate, and full audit logging. You're the board — nothing ships without your sign-off.
</td>
</tr>
<tr>
<td>
**Budget & Cost Control** — Token and cost tracking by company, agent, project, goal, issue, provider, and model. Scoped budget policies with warning thresholds and hard stops. Overspend pauses agents and cancels queued work automatically.
</td>
<td>
**Routines & Schedules** — Recurring tasks with cron, webhook, and API triggers. Concurrency and catch-up policies. Each routine execution creates a tracked issue and wakes the assigned agent — no manual kick-offs needed.
</td>
</tr>
<tr>
<td>
**Plugins** — Instance-wide plugin system with out-of-process workers, capability-gated host services, job scheduling, tool exposure, and UI contributions. Extend Paperclip without forking it.
</td>
<td>
**Secrets & Storage** — Instance and company secrets, encrypted local storage, provider-backed object storage, attachments, and work products. Sensitive values stay out of prompts unless a scoped run explicitly needs them.
</td>
</tr>
<tr>
<td>
**Activity & Events** — Mutating actions, heartbeat state changes, cost events, approvals, comments, and work products are recorded as durable activity so operators can audit what happened and why.
</td>
<td>
**Company Portability** — Export and import entire organizations — agents, skills, projects, routines, and issues — with secret scrubbing and collision handling. One deployment, many companies, complete data isolation.
</td>
</tr>
</table>
<br/>
## What Paperclip is not
| | |
@@ -256,10 +366,10 @@ See [doc/DEVELOPING.md](doc/DEVELOPING.md) for the full development guide.
- ✅ Scheduled Routines
- ✅ Better Budgeting
- ✅ Agent Reviews and Approvals
- Multiple Human Users
- Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Artifacts & Work Products
- ⚪ Memory & Knowledge
- ⚪ Memory / Knowledge
- ⚪ Enforced Outcomes
- ⚪ MAXIMIZER MODE
- ⚪ Deep Planning
@@ -270,6 +380,8 @@ See [doc/DEVELOPING.md](doc/DEVELOPING.md) for the full development guide.
- ⚪ Cloud deployments
- ⚪ Desktop App
This is the short roadmap preview. See the full roadmap in [ROADMAP.md](ROADMAP.md).
<br/>
## Community & Plugins
@@ -298,6 +410,7 @@ We welcome contributions. See the [contributing guide](CONTRIBUTING.md) for deta
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC

97
ROADMAP.md Normal file
View File

@@ -0,0 +1,97 @@
# Roadmap
This document expands the roadmap preview in `README.md`.
Paperclip is still moving quickly. The list below is directional, not promised, and priorities may shift as we learn from users and from operating real AI companies with the product.
We value community involvement and want to make sure contributor energy goes toward areas where it can land.
We may accept contributions in the areas below, but if you want to work on roadmap-level core features, please coordinate with us first in Discord (`#dev`) before writing code. Bugs, docs, polish, and tightly scoped improvements are still the easiest contributions to merge.
If you want to extend Paperclip today, the best path is often the [plugin system](doc/plugins/PLUGIN_SPEC.md). Community reference implementations are also useful feedback even when they are not merged directly into core.
## Milestones
### ✅ Plugin system
Paperclip should keep a thin core and rich edges. Plugins are the path for optional capabilities like knowledge bases, custom tracing, queues, doc editors, and other product-specific surfaces that do not need to live in the control plane itself.
### ✅ Get OpenClaw / claw-style agent employees
Paperclip should be able to hire and manage real claw-style agent workers, not just a narrow built-in runtime. This is part of the larger "bring your own agent" story and keeps the control plane useful across different agent ecosystems.
### ✅ companies.sh - import and export entire organizations
Reusable companies matter. Import/export is the foundation for moving org structures, agent definitions, and reusable company setups between environments and eventually for broader company-template distribution.
### ✅ Easy AGENTS.md configurations
Agent setup should feel repo-native and legible. Simple `AGENTS.md`-style configuration lowers the barrier to getting an agent team running and makes it easier for contributors to understand how a company is wired together.
### ✅ Skills Manager
Agents need a practical way to discover, install, and use skills without every setup becoming bespoke. The skills layer is part of making Paperclip companies more reusable and easier to operate.
### ✅ Scheduled Routines
Recurring work should be native. Routine tasks like reports, reviews, and other periodic work need first-class scheduling so the company keeps operating even when no human is manually kicking work off.
### ✅ Better Budgeting
Budgets are a core control-plane feature, not an afterthought. Better budgeting means clearer spend visibility, safer hard stops, and better operator control over how autonomy turns into real cost.
### ✅ Agent Reviews and Approvals
Paperclip should support explicit review and approval stages as first-class workflow steps, not just ad hoc comments. That means reviewer routing, approval gates, change requests, and durable audit trails that fit the same task model as the rest of the control plane.
### ✅ Multiple Human Users
Paperclip needs a clearer path from solo operator to real human teams. That means shared board access, safer collaboration, and a better model for several humans supervising the same autonomous company.
### ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
We want agents to run in more remote and sandboxed environments while preserving the same Paperclip control-plane model. This makes the system safer, more flexible, and more useful outside a single trusted local machine.
### ⚪ Artifacts & Work Products
Paperclip should make outputs first-class. That means generated artifacts, previews, deployable outputs, and the handoff from "agent did work" to "here is the result" should become more visible and easier to operate.
### ⚪ Memory / Knowledge
We want a stronger memory and knowledge surface for companies, agents, and projects. That includes durable memory, better recall of prior decisions and context, and a clearer path for knowledge-style capabilities without turning Paperclip into a generic chat app.
### ⚪ Enforced Outcomes
Paperclip should get stricter about what counts as finished work. Tasks, approvals, and execution flows should resolve to clear outcomes like merged code, published artifacts, shipped docs, or explicit decisions instead of stopping at vague status updates.
### ⚪ MAXIMIZER MODE
This is the direction for higher-autonomy execution: more aggressive delegation, deeper follow-through, and stronger operating loops with clear budgets, visibility, and governance. The point is not hidden autonomy; the point is more output per human supervisor.
### ⚪ Deep Planning
Some work needs more than a task description before execution starts. Deeper planning means stronger issue documents, revisionable plans, and clearer review loops for strategy-heavy work before agents begin execution.
### ⚪ Work Queues
Paperclip should support queue-style work streams for repeatable inputs like support, triage, review, and backlog intake. That would make it easier to route work continuously without turning every system into a one-off workflow.
### ⚪ Self-Organization
As companies grow, agents should be able to propose useful structural changes such as role adjustments, delegation changes, and new recurring routines. The goal is adaptive organizations that still stay within governance and approval boundaries.
### ⚪ Automatic Organizational Learning
Paperclip should get better at turning completed work into reusable organizational knowledge. That includes capturing playbooks, recurring fixes, and decision patterns so future work starts from what the company has already learned.
### ⚪ CEO Chat
We want a lighter-weight way to talk to leadership agents, but those conversations should still resolve to real work objects like plans, issues, approvals, or decisions. This should improve interaction without changing the core task-and-comments model.
### ⚪ Cloud deployments
Local-first remains important, but Paperclip also needs a cleaner shared deployment story. Teams should be able to run the same product in hosted or semi-hosted environments without changing the mental model.
### ⚪ Desktop App
A desktop app can make Paperclip feel more accessible and persistent for day-to-day operators. The goal is easier access, better local ergonomics, and a smoother default experience for users who want the control plane always close at hand.

View File

@@ -6,13 +6,14 @@
<a href="#quickstart"><strong>Quickstart</strong></a> &middot;
<a href="https://paperclip.ing/docs"><strong>Docs</strong></a> &middot;
<a href="https://github.com/paperclipai/paperclip"><strong>GitHub</strong></a> &middot;
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a>
<a href="https://discord.gg/m4HZY7xNG3"><strong>Discord</strong></a> &middot;
<a href="https://x.com/papercliping"><strong>Twitter</strong></a>
</p>
<p align="center">
<a href="https://github.com/paperclipai/paperclip/blob/master/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT License" /></a>
<a href="https://github.com/paperclipai/paperclip/stargazers"><img src="https://img.shields.io/github/stars/paperclipai/paperclip?style=flat" alt="Stars" /></a>
<a href="https://discord.gg/m4HZY7xNG3"><img src="https://img.shields.io/badge/discord-join%20chat-5865F2?logo=discord&logoColor=white" alt="Discord" /></a>
<a href="https://discord.gg/m4HZY7xNG3"><img src="https://img.shields.io/discord/000000000?label=discord" alt="Discord" /></a>
</p>
<br/>
@@ -258,7 +259,7 @@ See [doc/DEVELOPING.md](https://github.com/paperclipai/paperclip/blob/master/doc
- ⚪ Artifacts & Deployments
- ⚪ CEO Chat
- ⚪ MAXIMIZER MODE
- Multiple Human Users
- Multiple Human Users
- ⚪ Cloud / Sandbox agents (e.g. Cursor / e2b agents)
- ⚪ Cloud deployments
- ⚪ Desktop App
@@ -278,6 +279,7 @@ We welcome contributions. See the [contributing guide](https://github.com/paperc
## Community
- [Discord](https://discord.gg/m4HZY7xNG3) — Join the community
- [Twitter / X](https://x.com/papercliping) — Follow updates and announcements
- [GitHub Issues](https://github.com/paperclipai/paperclip/issues) — bugs and feature requests
- [GitHub Discussions](https://github.com/paperclipai/paperclip/discussions) — ideas and RFC

View File

@@ -37,6 +37,7 @@
},
"dependencies": {
"@clack/prompts": "^0.10.0",
"@paperclipai/adapter-acpx-local": "workspace:*",
"@paperclipai/adapter-claude-local": "workspace:*",
"@paperclipai/adapter-codex-local": "workspace:*",
"@paperclipai/adapter-cursor-local": "workspace:*",

View File

@@ -14,6 +14,7 @@ function makeCompany(overrides: Partial<Company>): Company {
issueCounter: 1,
budgetMonthlyCents: 0,
spentMonthlyCents: 0,
attachmentMaxBytes: 10 * 1024 * 1024,
requireBoardApprovalForNewAgents: false,
feedbackDataSharingEnabled: false,
feedbackDataSharingConsentAt: null,

View File

@@ -1,5 +1,5 @@
import { execFile, spawn } from "node:child_process";
import { mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { existsSync, mkdirSync, mkdtempSync, readFileSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import net from "node:net";
import os from "node:os";
import path from "node:path";
@@ -104,20 +104,50 @@ function writeTestConfig(configPath: string, tempRoot: string, port: number, con
writeFileSync(configPath, `${JSON.stringify(config, null, 2)}\n`, "utf8");
}
function createServerEnv(configPath: string, port: number, connectionString: string) {
interface TestPaperclipEnv {
configPath: string;
paperclipHome: string;
instanceId: string;
shellHome?: string;
}
function createBasePaperclipEnv(options: TestPaperclipEnv) {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
env.PAPERCLIP_CONFIG = options.configPath;
env.PAPERCLIP_HOME = options.paperclipHome;
env.PAPERCLIP_INSTANCE_ID = options.instanceId;
env.PAPERCLIP_CONTEXT = path.join(options.paperclipHome, "context.json");
env.PAPERCLIP_AUTH_STORE = path.join(options.paperclipHome, "auth.json");
if (options.shellHome) {
env.HOME = options.shellHome;
}
return env;
}
function createServerEnv(
configPath: string,
port: number,
connectionString: string,
options: Omit<TestPaperclipEnv, "configPath">,
) {
const env = createBasePaperclipEnv({
configPath,
...options,
});
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
delete env.SERVE_UI;
delete env.HEARTBEAT_SCHEDULER_ENABLED;
env.PAPERCLIP_CONFIG = configPath;
env.DATABASE_URL = connectionString;
env.HOST = "127.0.0.1";
env.PORT = String(port);
@@ -130,13 +160,8 @@ function createServerEnv(configPath: string, port: number, connectionString: str
return env;
}
function createCliEnv() {
const env = { ...process.env };
for (const key of Object.keys(env)) {
if (key.startsWith("PAPERCLIP_")) {
delete env[key];
}
}
function createCliEnv(options: TestPaperclipEnv) {
const env = createBasePaperclipEnv(options);
delete env.DATABASE_URL;
delete env.PORT;
delete env.HOST;
@@ -183,14 +208,25 @@ async function api<T>(baseUrl: string, pathname: string, init?: RequestInit): Pr
return text ? JSON.parse(text) as T : (null as T);
}
async function runCliJson<T>(args: string[], opts: { apiBase: string; configPath: string }) {
async function runCliJson<T>(
args: string[],
opts: TestPaperclipEnv & { apiBase?: string; includeConfigArg?: boolean },
) {
const repoRoot = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../../..");
const cliArgs = ["--silent", "paperclipai", ...args];
if (opts.apiBase) {
cliArgs.push("--api-base", opts.apiBase);
}
if (opts.includeConfigArg !== false) {
cliArgs.push("--config", opts.configPath);
}
cliArgs.push("--json");
const result = await execFileAsync(
"pnpm",
["--silent", "paperclipai", ...args, "--api-base", opts.apiBase, "--config", opts.configPath, "--json"],
cliArgs,
{
cwd: repoRoot,
env: createCliEnv(),
env: createCliEnv(opts),
maxBuffer: 10 * 1024 * 1024,
},
);
@@ -235,6 +271,9 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
let configPath = "";
let exportDir = "";
let apiBase = "";
let paperclipHome = "";
let cliShellHome = "";
let paperclipInstanceId = "";
let serverProcess: ServerProcess | null = null;
let tempDb: Awaited<ReturnType<typeof startEmbeddedPostgresTestDatabase>> | null = null;
@@ -242,6 +281,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
tempRoot = mkdtempSync(path.join(os.tmpdir(), "paperclip-company-cli-e2e-"));
configPath = path.join(tempRoot, "config", "config.json");
exportDir = path.join(tempRoot, "exported-company");
paperclipHome = path.join(tempRoot, "paperclip-home");
cliShellHome = path.join(tempRoot, "shell-home");
paperclipInstanceId = "company-cli-e2e";
mkdirSync(paperclipHome, { recursive: true });
mkdirSync(cliShellHome, { recursive: true });
tempDb = await startEmbeddedPostgresTestDatabase("paperclip-company-cli-db-");
@@ -256,7 +300,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
["paperclipai", "run", "--config", configPath],
{
cwd: repoRoot,
env: createServerEnv(configPath, port, tempDb.connectionString),
env: createServerEnv(configPath, port, tempDb.connectionString, {
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
}),
stdio: ["ignore", "pipe", "pipe"],
},
);
@@ -282,11 +330,41 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
it("exports a company package and imports it into new and existing companies", async () => {
expect(serverProcess).not.toBeNull();
const cliContext = await runCliJson<{
contextPath: string;
profileName: string;
profile: { apiBase?: string };
}>(
["context", "set", "--profile", "isolation-check", "--api-base", "https://example.test"],
{
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
includeConfigArg: false,
},
);
const expectedContextPath = path.join(paperclipHome, "context.json");
const leakedContextPath = path.join(cliShellHome, ".paperclip", "context.json");
expect(cliContext.contextPath).toBe(expectedContextPath);
expect(cliContext.profileName).toBe("isolation-check");
expect(cliContext.profile.apiBase).toBe("https://example.test");
expect(existsSync(expectedContextPath)).toBe(true);
expect(existsSync(leakedContextPath)).toBe(false);
rmSync(expectedContextPath, { force: true });
expect(existsSync(expectedContextPath)).toBe(false);
const sourceCompany = await api<{ id: string; name: string; issuePrefix: string }>(apiBase, "/api/companies", {
method: "POST",
headers: { "content-type": "application/json" },
body: JSON.stringify({ name: `CLI Export Source ${Date.now()}` }),
});
await api(apiBase, `/api/companies/${sourceCompany.id}`, {
method: "PATCH",
headers: { "content-type": "application/json" },
body: JSON.stringify({ requireBoardApprovalForNewAgents: false }),
});
const sourceAgent = await api<{ id: string; name: string }>(
apiBase,
@@ -298,8 +376,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
name: "Export Engineer",
role: "engineer",
adapterType: "claude_local",
adapterConfig: {
promptTemplate: "You verify company portability.",
adapterConfig: {},
instructionsBundle: {
files: {
"AGENTS.md": "You verify company portability.",
},
},
}),
},
@@ -350,7 +431,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"--include",
"company,agents,projects,issues",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(exportResult.ok).toBe(true);
@@ -374,7 +461,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedNew.company.action).toBe("created");
@@ -393,10 +486,11 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const importedMatchingIssues = importedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(importedAgents.map((agent) => agent.name)).toContain(sourceAgent.name);
expect(importedProjects.map((project) => project.name)).toContain(sourceProject.name);
expect(importedIssues.map((issue) => issue.title)).toContain(sourceIssue.title);
expect(importedMatchingIssues).toHaveLength(1);
const previewExisting = await runCliJson<{
errors: string[];
@@ -421,7 +515,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--dry-run",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(previewExisting.errors).toEqual([]);
@@ -448,7 +548,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"rename",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedExisting.company.action).toBe("unchanged");
@@ -466,11 +572,13 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
apiBase,
`/api/companies/${importedNew.company.id}/issues`,
);
const twiceImportedMatchingIssues = twiceImportedIssues.filter((issue) => issue.title === sourceIssue.title);
expect(twiceImportedAgents).toHaveLength(2);
expect(new Set(twiceImportedAgents.map((agent) => agent.name)).size).toBe(2);
expect(twiceImportedProjects).toHaveLength(2);
expect(twiceImportedIssues).toHaveLength(2);
expect(twiceImportedMatchingIssues).toHaveLength(2);
expect(new Set(twiceImportedMatchingIssues.map((issue) => issue.identifier)).size).toBe(2);
const zipPath = path.join(tempRoot, "exported-company.zip");
const portableFiles: Record<string, string> = {};
@@ -493,10 +601,16 @@ describeEmbeddedPostgres("paperclipai company import/export e2e", () => {
"company,agents,projects,issues",
"--yes",
],
{ apiBase, configPath },
{
apiBase,
configPath,
paperclipHome,
instanceId: paperclipInstanceId,
shellHome: cliShellHome,
},
);
expect(importedFromZip.company.action).toBe("created");
expect(importedFromZip.agents.some((agent) => agent.action === "created")).toBe(true);
}, 60_000);
}, 90_000);
});

View File

@@ -160,6 +160,7 @@ describe("renderCompanyImportPreview", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: null,
requireBoardApprovalForNewAgents: false,
@@ -375,6 +376,7 @@ describe("import selection catalog", () => {
path: "COMPANY.md",
name: "Source Co",
description: null,
attachmentMaxBytes: null,
brandColor: null,
logoPath: "images/company-logo.png",
requireBoardApprovalForNewAgents: false,

View File

@@ -0,0 +1,24 @@
import path from "node:path";
import { describe, expect, it } from "vitest";
import { collectEnvLabDoctorStatus, resolveEnvLabSshStatePath } from "../commands/env-lab.js";
describe("env-lab command", () => {
it("resolves the default SSH fixture state path under the instance root", () => {
const statePath = resolveEnvLabSshStatePath("fixture-test");
expect(statePath).toContain(
path.join("instances", "fixture-test", "env-lab", "ssh-fixture", "state.json"),
);
});
it("reports doctor status for an instance without a running fixture", async () => {
const status = await collectEnvLabDoctorStatus({ instance: "fixture-test-missing" });
expect(status.statePath).toContain(
path.join("instances", "fixture-test-missing", "env-lab", "ssh-fixture", "state.json"),
);
expect(typeof status.ssh.supported).toBe("boolean");
expect(status.ssh.running).toBe(false);
expect(status.ssh.environment).toBeNull();
});
});

View File

@@ -3,11 +3,15 @@ import os from "node:os";
import path from "node:path";
import { execFileSync } from "node:child_process";
import { randomUUID } from "node:crypto";
import { eq } from "drizzle-orm";
import { afterEach, describe, expect, it, vi } from "vitest";
import {
agents,
authUsers,
companies,
createDb,
issueComments,
issues,
projects,
routines,
routineTriggers,
@@ -16,6 +20,7 @@ import {
copyGitHooksToWorktreeGitDir,
copySeededSecretsKey,
pauseSeededScheduledRoutines,
quarantineSeededWorktreeExecutionState,
readSourceAttachmentBody,
rebindWorkspaceCwd,
resolveSourceConfigPath,
@@ -47,6 +52,7 @@ import {
const ORIGINAL_CWD = process.cwd();
const ORIGINAL_ENV = { ...process.env };
const embeddedPostgresSupport = await getEmbeddedPostgresTestSupport();
const itEmbeddedPostgres = embeddedPostgresSupport.supported ? it : it.skip;
const describeEmbeddedPostgres = embeddedPostgresSupport.supported ? describe : describe.skip;
if (!embeddedPostgresSupport.supported) {
@@ -184,8 +190,9 @@ describe("worktree helpers", () => {
).toEqual(["worktree", "add", "-b", "my-worktree", "/tmp/my-worktree", "origin/main"]);
});
it("rewrites loopback auth URLs to the new port only", () => {
it("rewrites auth URLs only when they already include a port", () => {
expect(rewriteLocalUrlPort("http://127.0.0.1:3100", 3110)).toBe("http://127.0.0.1:3110/");
expect(rewriteLocalUrlPort("http://my-host.ts.net:3100", 3110)).toBe("http://my-host.ts.net:3110/");
expect(rewriteLocalUrlPort("https://paperclip.example", 3110)).toBe("https://paperclip.example");
});
@@ -280,6 +287,138 @@ describe("worktree helpers", () => {
expect(full.nullifyColumns).toEqual({});
});
itEmbeddedPostgres("quarantines copied live execution state in seeded worktree databases", async () => {
const tempDb = await startEmbeddedPostgresTestDatabase("paperclip-worktree-quarantine-");
const db = createDb(tempDb.connectionString);
const companyId = randomUUID();
const agentId = randomUUID();
const idleAgentId = randomUUID();
const inProgressIssueId = randomUUID();
const todoIssueId = randomUUID();
const reviewIssueId = randomUUID();
const userIssueId = randomUUID();
try {
await db.insert(companies).values({
id: companyId,
name: "Paperclip",
issuePrefix: "WTQ",
requireBoardApprovalForNewAgents: false,
});
await db.insert(agents).values([
{
id: agentId,
companyId,
name: "CodexCoder",
role: "engineer",
status: "running",
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: {
heartbeat: { enabled: true, intervalSec: 60 },
wakeOnDemand: true,
},
permissions: {},
},
{
id: idleAgentId,
companyId,
name: "Reviewer",
role: "reviewer",
status: "idle",
adapterType: "codex_local",
adapterConfig: {},
runtimeConfig: { heartbeat: { enabled: false, intervalSec: 300 } },
permissions: {},
},
]);
await db.insert(issues).values([
{
id: inProgressIssueId,
companyId,
title: "Copied in-flight issue",
status: "in_progress",
priority: "medium",
assigneeAgentId: agentId,
issueNumber: 1,
identifier: "WTQ-1",
executionAgentNameKey: "codexcoder",
executionLockedAt: new Date("2026-04-18T00:00:00.000Z"),
},
{
id: todoIssueId,
companyId,
title: "Copied assigned todo issue",
status: "todo",
priority: "medium",
assigneeAgentId: agentId,
issueNumber: 2,
identifier: "WTQ-2",
},
{
id: reviewIssueId,
companyId,
title: "Copied assigned review issue",
status: "in_review",
priority: "medium",
assigneeAgentId: idleAgentId,
issueNumber: 3,
identifier: "WTQ-3",
},
{
id: userIssueId,
companyId,
title: "Copied user issue",
status: "todo",
priority: "medium",
assigneeUserId: "user-1",
issueNumber: 4,
identifier: "WTQ-4",
},
]);
await expect(quarantineSeededWorktreeExecutionState(tempDb.connectionString)).resolves.toEqual({
disabledTimerHeartbeats: 1,
resetRunningAgents: 1,
quarantinedInProgressIssues: 1,
unassignedTodoIssues: 1,
unassignedReviewIssues: 1,
});
const [quarantinedAgent] = await db.select().from(agents).where(eq(agents.id, agentId));
expect(quarantinedAgent?.status).toBe("idle");
expect(quarantinedAgent?.runtimeConfig).toMatchObject({
heartbeat: { enabled: false, intervalSec: 60 },
wakeOnDemand: true,
});
const [inProgressIssue] = await db.select().from(issues).where(eq(issues.id, inProgressIssueId));
expect(inProgressIssue?.status).toBe("blocked");
expect(inProgressIssue?.assigneeAgentId).toBeNull();
expect(inProgressIssue?.executionAgentNameKey).toBeNull();
expect(inProgressIssue?.executionLockedAt).toBeNull();
const [todoIssue] = await db.select().from(issues).where(eq(issues.id, todoIssueId));
expect(todoIssue?.status).toBe("todo");
expect(todoIssue?.assigneeAgentId).toBeNull();
const [reviewIssue] = await db.select().from(issues).where(eq(issues.id, reviewIssueId));
expect(reviewIssue?.status).toBe("in_review");
expect(reviewIssue?.assigneeAgentId).toBeNull();
const [userIssue] = await db.select().from(issues).where(eq(issues.id, userIssueId));
expect(userIssue?.status).toBe("todo");
expect(userIssue?.assigneeUserId).toBe("user-1");
const comments = await db.select().from(issueComments).where(eq(issueComments.issueId, inProgressIssueId));
expect(comments).toHaveLength(1);
expect(comments[0]?.body).toContain("Quarantined during worktree seed");
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
await tempDb.cleanup();
}
}, 20_000);
it("copies the source local_encrypted secrets key into the seeded worktree instance", () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-secrets-"));
const originalInlineMasterKey = process.env.PAPERCLIP_SECRETS_MASTER_KEY;
@@ -373,6 +512,97 @@ describe("worktree helpers", () => {
}
});
itEmbeddedPostgres(
"seeds authenticated users into minimally cloned worktree instances",
async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-auth-seed-"));
const worktreeRoot = path.join(tempRoot, "PAP-999-auth-seed");
const sourceHome = path.join(tempRoot, "source-home");
const sourceConfigDir = path.join(sourceHome, "instances", "source");
const sourceConfigPath = path.join(sourceConfigDir, "config.json");
const sourceEnvPath = path.join(sourceConfigDir, ".env");
const sourceKeyPath = path.join(sourceConfigDir, "secrets", "master.key");
const worktreeHome = path.join(tempRoot, ".paperclip-worktrees");
const originalCwd = process.cwd();
const sourceDb = await startEmbeddedPostgresTestDatabase("paperclip-worktree-auth-source-");
try {
const sourceDbClient = createDb(sourceDb.connectionString);
await sourceDbClient.insert(authUsers).values({
id: "user-existing",
email: "existing@paperclip.ing",
name: "Existing User",
emailVerified: true,
createdAt: new Date(),
updatedAt: new Date(),
});
fs.mkdirSync(path.dirname(sourceKeyPath), { recursive: true });
fs.mkdirSync(worktreeRoot, { recursive: true });
const sourceConfig = buildSourceConfig();
sourceConfig.database = {
mode: "postgres",
embeddedPostgresDataDir: path.join(sourceConfigDir, "db"),
embeddedPostgresPort: 54329,
backup: {
enabled: true,
intervalMinutes: 60,
retentionDays: 30,
dir: path.join(sourceConfigDir, "backups"),
},
connectionString: sourceDb.connectionString,
};
sourceConfig.logging.logDir = path.join(sourceConfigDir, "logs");
sourceConfig.storage.localDisk.baseDir = path.join(sourceConfigDir, "storage");
sourceConfig.secrets.localEncrypted.keyFilePath = sourceKeyPath;
fs.writeFileSync(sourceConfigPath, JSON.stringify(sourceConfig, null, 2) + "\n", "utf8");
fs.writeFileSync(sourceEnvPath, "", "utf8");
fs.writeFileSync(sourceKeyPath, "source-master-key", "utf8");
process.chdir(worktreeRoot);
await worktreeInitCommand({
name: "PAP-999-auth-seed",
home: worktreeHome,
fromConfig: sourceConfigPath,
force: true,
});
const targetConfig = JSON.parse(
fs.readFileSync(path.join(worktreeRoot, ".paperclip", "config.json"), "utf8"),
) as PaperclipConfig;
const { default: EmbeddedPostgres } = await import("embedded-postgres");
const targetPg = new EmbeddedPostgres({
databaseDir: targetConfig.database.embeddedPostgresDataDir,
user: "paperclip",
password: "paperclip",
port: targetConfig.database.embeddedPostgresPort,
persistent: true,
initdbFlags: ["--encoding=UTF8", "--locale=C", "--lc-messages=C"],
onLog: () => {},
onError: () => {},
});
await targetPg.start();
try {
const targetDb = createDb(
`postgres://paperclip:paperclip@127.0.0.1:${targetConfig.database.embeddedPostgresPort}/paperclip`,
);
const seededUsers = await targetDb.select().from(authUsers);
expect(seededUsers.some((row) => row.email === "existing@paperclip.ing")).toBe(true);
} finally {
await targetPg.stop();
}
} finally {
process.chdir(originalCwd);
await sourceDb.cleanup();
fs.rmSync(tempRoot, { recursive: true, force: true });
}
},
30000,
);
it("avoids ports already claimed by sibling worktree instance configs", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-claimed-ports-"));
const repoRoot = path.join(tempRoot, "repo");
@@ -652,7 +882,7 @@ describe("worktree helpers", () => {
}
fs.rmSync(tempRoot, { recursive: true, force: true });
}
}, 20_000);
}, 30_000);
it("restores the current worktree config and instance data if reseed fails", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-reseed-rollback-"));
@@ -809,7 +1039,7 @@ describe("worktree helpers", () => {
execFileSync("git", ["worktree", "remove", "--force", worktreePath], { cwd: repoRoot, stdio: "ignore" });
fs.rmSync(tempRoot, { recursive: true, force: true });
}
});
}, 15_000);
it("creates and initializes a worktree from the top-level worktree:make command", async () => {
const tempRoot = fs.mkdtempSync(path.join(os.tmpdir(), "paperclip-worktree-make-"));

View File

@@ -1,4 +1,5 @@
import type { CLIAdapterModule } from "@paperclipai/adapter-utils";
import { printAcpxStreamEvent } from "@paperclipai/adapter-acpx-local/cli";
import { printClaudeStreamEvent } from "@paperclipai/adapter-claude-local/cli";
import { printCodexStreamEvent } from "@paperclipai/adapter-codex-local/cli";
import { printCursorStreamEvent } from "@paperclipai/adapter-cursor-local/cli";
@@ -14,6 +15,11 @@ const claudeLocalCLIAdapter: CLIAdapterModule = {
formatStdoutEvent: printClaudeStreamEvent,
};
const acpxLocalCLIAdapter: CLIAdapterModule = {
type: "acpx_local",
formatStdoutEvent: printAcpxStreamEvent,
};
const codexLocalCLIAdapter: CLIAdapterModule = {
type: "codex_local",
formatStdoutEvent: printCodexStreamEvent,
@@ -46,6 +52,7 @@ const openclawGatewayCLIAdapter: CLIAdapterModule = {
const adaptersByType = new Map<string, CLIAdapterModule>(
[
acpxLocalCLIAdapter,
claudeLocalCLIAdapter,
codexLocalCLIAdapter,
openCodeLocalCLIAdapter,

View File

@@ -61,6 +61,7 @@ interface IssueUpdateOptions extends BaseClientOptions {
interface IssueCommentOptions extends BaseClientOptions {
body: string;
reopen?: boolean;
resume?: boolean;
}
interface IssueCheckoutOptions extends BaseClientOptions {
@@ -241,12 +242,14 @@ export function registerIssueCommands(program: Command): void {
.argument("<issueId>", "Issue ID")
.requiredOption("--body <text>", "Comment body")
.option("--reopen", "Reopen if issue is done/cancelled")
.option("--resume", "Request explicit follow-up and wake the assignee when resumable")
.action(async (issueId: string, opts: IssueCommentOptions) => {
try {
const ctx = resolveCommandContext(opts);
const payload = addIssueCommentSchema.parse({
body: opts.body,
reopen: opts.reopen,
resume: opts.resume,
});
const comment = await ctx.api.post<IssueComment>(`/api/issues/${issueId}/comments`, payload);
printOutput(comment, { json: ctx.json });

174
cli/src/commands/env-lab.ts Normal file
View File

@@ -0,0 +1,174 @@
import path from "node:path";
import type { Command } from "commander";
import * as p from "@clack/prompts";
import pc from "picocolors";
import {
buildSshEnvLabFixtureConfig,
getSshEnvLabSupport,
readSshEnvLabFixtureStatus,
startSshEnvLabFixture,
stopSshEnvLabFixture,
} from "@paperclipai/adapter-utils/ssh";
import { resolvePaperclipInstanceId, resolvePaperclipInstanceRoot } from "../config/home.js";
export function resolveEnvLabSshStatePath(instanceId?: string): string {
const resolvedInstanceId = resolvePaperclipInstanceId(instanceId);
return path.resolve(
resolvePaperclipInstanceRoot(resolvedInstanceId),
"env-lab",
"ssh-fixture",
"state.json",
);
}
function printJson(value: unknown) {
process.stdout.write(`${JSON.stringify(value, null, 2)}\n`);
}
function summarizeFixture(state: {
host: string;
port: number;
username: string;
workspaceDir: string;
sshdLogPath: string;
}) {
p.log.message(`Host: ${pc.cyan(state.host)}:${pc.cyan(String(state.port))}`);
p.log.message(`User: ${pc.cyan(state.username)}`);
p.log.message(`Workspace: ${pc.cyan(state.workspaceDir)}`);
p.log.message(`Log: ${pc.dim(state.sshdLogPath)}`);
}
export async function collectEnvLabDoctorStatus(opts: { instance?: string }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const [sshSupport, sshStatus] = await Promise.all([
getSshEnvLabSupport(),
readSshEnvLabFixtureStatus(statePath),
]);
const environment = sshStatus.state ? await buildSshEnvLabFixtureConfig(sshStatus.state) : null;
return {
statePath,
ssh: {
supported: sshSupport.supported,
reason: sshSupport.reason,
running: sshStatus.running,
state: sshStatus.state,
environment,
},
};
}
export async function envLabUpCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const state = await startSshEnvLabFixture({ statePath });
const environment = await buildSshEnvLabFixtureConfig(state);
if (opts.json) {
printJson({ state, environment });
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabStatusCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const status = await readSshEnvLabFixtureStatus(statePath);
const environment = status.state ? await buildSshEnvLabFixtureConfig(status.state) : null;
if (opts.json) {
printJson({ ...status, environment, statePath });
return;
}
if (!status.state || !status.running) {
p.log.info(`SSH env-lab fixture is not running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.state);
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDownCommand(opts: { instance?: string; json?: boolean }) {
const statePath = resolveEnvLabSshStatePath(opts.instance);
const stopped = await stopSshEnvLabFixture(statePath);
if (opts.json) {
printJson({ stopped, statePath });
return;
}
if (!stopped) {
p.log.info(`No SSH env-lab fixture was running (${pc.dim(statePath)}).`);
return;
}
p.log.success("SSH env-lab fixture stopped.");
p.log.message(`State: ${pc.dim(statePath)}`);
}
export async function envLabDoctorCommand(opts: { instance?: string; json?: boolean }) {
const status = await collectEnvLabDoctorStatus(opts);
if (opts.json) {
printJson(status);
return;
}
if (status.ssh.supported) {
p.log.success("SSH fixture prerequisites are installed.");
} else {
p.log.warn(`SSH fixture prerequisites are incomplete: ${status.ssh.reason ?? "unknown reason"}`);
}
if (status.ssh.state && status.ssh.running) {
p.log.success("SSH env-lab fixture is running.");
summarizeFixture(status.ssh.state);
p.log.message(`Private key: ${pc.dim(status.ssh.state.clientPrivateKeyPath)}`);
p.log.message(`Known hosts: ${pc.dim(status.ssh.state.knownHostsPath)}`);
} else if (status.ssh.state) {
p.log.warn("SSH env-lab fixture state exists, but the process is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
} else {
p.log.info("SSH env-lab fixture is not running.");
p.log.message(`State: ${pc.dim(status.statePath)}`);
}
p.log.message(`Cleanup: ${pc.dim("pnpm paperclipai env-lab down")}`);
}
export function registerEnvLabCommands(program: Command) {
const envLab = program.command("env-lab").description("Deterministic local environment fixtures");
envLab
.command("up")
.description("Start the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabUpCommand);
envLab
.command("status")
.description("Show the current SSH env-lab fixture state")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable fixture details")
.action(envLabStatusCommand);
envLab
.command("down")
.description("Stop the default SSH env-lab fixture")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable stop details")
.action(envLabDownCommand);
envLab
.command("doctor")
.description("Check SSH fixture prerequisites and current status")
.option("-i, --instance <id>", "Paperclip instance id (default: current/default)")
.option("--json", "Print machine-readable diagnostic details")
.action(envLabDoctorCommand);
}

View File

@@ -75,11 +75,6 @@ function nonEmpty(value: string | null | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function isLoopbackHost(hostname: string): boolean {
const value = hostname.trim().toLowerCase();
return value === "127.0.0.1" || value === "localhost" || value === "::1";
}
export function sanitizeWorktreeInstanceId(rawValue: string): string {
const trimmed = rawValue.trim().toLowerCase();
const normalized = trimmed
@@ -168,7 +163,8 @@ export function rewriteLocalUrlPort(rawUrl: string | undefined, port: number): s
if (!rawUrl) return undefined;
try {
const parsed = new URL(rawUrl);
if (!isLoopbackHost(parsed.hostname)) return rawUrl;
// The URL API normalizes default ports like :80/:443 to "", so treat them as stable URLs.
if (!parsed.port) return rawUrl;
parsed.port = String(port);
return parsed.toString();
} catch {

View File

@@ -93,6 +93,7 @@ type WorktreeInitOptions = {
dbPort?: number;
seed?: boolean;
seedMode?: string;
preserveLiveWork?: boolean;
force?: boolean;
};
@@ -126,6 +127,7 @@ type WorktreeReseedOptions = {
fromDataDir?: string;
fromInstance?: string;
seedMode?: string;
preserveLiveWork?: boolean;
yes?: boolean;
allowLiveTarget?: boolean;
};
@@ -137,6 +139,7 @@ type WorktreeRepairOptions = {
fromDataDir?: string;
fromInstance?: string;
seedMode?: string;
preserveLiveWork?: boolean;
noSeed?: boolean;
allowLiveTarget?: boolean;
};
@@ -179,6 +182,8 @@ type CopiedGitHooksResult = {
type SeedWorktreeDatabaseResult = {
backupSummary: string;
pausedScheduledRoutines: number;
executionQuarantine: SeededWorktreeExecutionQuarantineSummary;
reboundWorkspaces: Array<{
name: string;
fromCwd: string;
@@ -186,6 +191,14 @@ type SeedWorktreeDatabaseResult = {
}>;
};
export type SeededWorktreeExecutionQuarantineSummary = {
disabledTimerHeartbeats: number;
resetRunningAgents: number;
quarantinedInProgressIssues: number;
unassignedTodoIssues: number;
unassignedReviewIssues: number;
};
function nonEmpty(value: string | null | undefined): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
@@ -198,6 +211,18 @@ function isCurrentSourceConfigPath(sourceConfigPath: string): boolean {
return path.resolve(currentConfigPath) === path.resolve(sourceConfigPath);
}
function formatSeededWorktreeExecutionQuarantineSummary(
summary: SeededWorktreeExecutionQuarantineSummary,
): string {
return [
`disabled timer heartbeats: ${summary.disabledTimerHeartbeats}`,
`reset running agents: ${summary.resetRunningAgents}`,
`quarantined in-progress issues: ${summary.quarantinedInProgressIssues}`,
`unassigned todo issues: ${summary.unassignedTodoIssues}`,
`unassigned review issues: ${summary.unassignedReviewIssues}`,
].join(", ");
}
const WORKTREE_NAME_PREFIX = "paperclip-";
function resolveWorktreeMakeName(name: string): string {
@@ -1119,6 +1144,133 @@ export async function pauseSeededScheduledRoutines(connectionString: string): Pr
}
}
const EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY: SeededWorktreeExecutionQuarantineSummary = {
disabledTimerHeartbeats: 0,
resetRunningAgents: 0,
quarantinedInProgressIssues: 0,
unassignedTodoIssues: 0,
unassignedReviewIssues: 0,
};
function isRecord(value: unknown): value is Record<string, unknown> {
return Boolean(value) && typeof value === "object" && !Array.isArray(value);
}
function isEnabledValue(value: unknown): boolean {
return value === true || value === "true" || value === 1 || value === "1";
}
function normalizeWorktreeRuntimeConfig(runtimeConfig: unknown): {
runtimeConfig: Record<string, unknown>;
disabledTimerHeartbeat: boolean;
changed: boolean;
} {
const nextRuntimeConfig = isRecord(runtimeConfig) ? { ...runtimeConfig } : {};
const heartbeat = isRecord(nextRuntimeConfig.heartbeat) ? { ...nextRuntimeConfig.heartbeat } : null;
if (!heartbeat) {
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat: false, changed: false };
}
const disabledTimerHeartbeat = isEnabledValue(heartbeat.enabled);
if (heartbeat.enabled !== false) {
heartbeat.enabled = false;
nextRuntimeConfig.heartbeat = heartbeat;
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat, changed: true };
}
return { runtimeConfig: nextRuntimeConfig, disabledTimerHeartbeat: false, changed: false };
}
export async function quarantineSeededWorktreeExecutionState(
connectionString: string,
): Promise<SeededWorktreeExecutionQuarantineSummary> {
const db = createDb(connectionString);
const summary = { ...EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY };
try {
await db.transaction(async (tx) => {
const seededAgents = await tx
.select({
id: agents.id,
status: agents.status,
runtimeConfig: agents.runtimeConfig,
})
.from(agents);
for (const agent of seededAgents) {
const normalized = normalizeWorktreeRuntimeConfig(agent.runtimeConfig);
const nextStatus = agent.status === "running" ? "idle" : agent.status;
if (normalized.disabledTimerHeartbeat) {
summary.disabledTimerHeartbeats += 1;
}
if (agent.status === "running") {
summary.resetRunningAgents += 1;
}
if (normalized.changed || nextStatus !== agent.status) {
await tx
.update(agents)
.set({
runtimeConfig: normalized.runtimeConfig,
status: nextStatus,
updatedAt: new Date(),
})
.where(eq(agents.id, agent.id));
}
}
const affectedIssues = await tx
.select({
id: issues.id,
companyId: issues.companyId,
status: issues.status,
})
.from(issues)
.where(
and(
sql`${issues.assigneeAgentId} is not null`,
sql`${issues.assigneeUserId} is null`,
inArray(issues.status, ["todo", "in_progress", "in_review"]),
),
);
for (const issue of affectedIssues) {
const nextStatus = issue.status === "in_progress" ? "blocked" : issue.status;
await tx
.update(issues)
.set({
status: nextStatus,
assigneeAgentId: null,
checkoutRunId: null,
executionRunId: null,
executionAgentNameKey: null,
executionLockedAt: null,
executionWorkspaceId: null,
updatedAt: new Date(),
})
.where(eq(issues.id, issue.id));
if (issue.status === "in_progress") {
summary.quarantinedInProgressIssues += 1;
await tx.insert(issueComments).values({
companyId: issue.companyId,
issueId: issue.id,
body:
"Quarantined during worktree seed so copied in-flight work does not auto-run in this isolated instance. " +
"Reassign or unblock here only if you intentionally want the worktree instance to own this task.",
});
} else if (issue.status === "todo") {
summary.unassignedTodoIssues += 1;
} else if (issue.status === "in_review") {
summary.unassignedReviewIssues += 1;
}
}
});
return summary;
} finally {
await db.$client?.end?.({ timeout: 5 }).catch(() => undefined);
}
}
async function seedWorktreeDatabase(input: {
sourceConfigPath: string;
sourceConfig: PaperclipConfig;
@@ -1126,6 +1278,7 @@ async function seedWorktreeDatabase(input: {
targetPaths: WorktreeLocalPaths;
instanceId: string;
seedMode: WorktreeSeedMode;
preserveLiveWork?: boolean;
}): Promise<SeedWorktreeDatabaseResult> {
const seedPlan = resolveWorktreeSeedPlan(input.seedMode);
const sourceEnvFile = resolvePaperclipEnvFile(input.sourceConfigPath);
@@ -1158,6 +1311,7 @@ async function seedWorktreeDatabase(input: {
backupDir: path.resolve(input.targetPaths.backupDir, "seed"),
retention: { dailyDays: 7, weeklyWeeks: 4, monthlyMonths: 1 },
filenamePrefix: `${input.instanceId}-seed`,
backupEngine: "javascript",
includeMigrationJournal: true,
excludeTables: seedPlan.excludedTables,
nullifyColumns: seedPlan.nullifyColumns,
@@ -1176,7 +1330,10 @@ async function seedWorktreeDatabase(input: {
backupFile: backup.backupFile,
});
await applyPendingMigrations(targetConnectionString);
await pauseSeededScheduledRoutines(targetConnectionString);
const executionQuarantine = input.preserveLiveWork
? { ...EMPTY_SEEDED_WORKTREE_EXECUTION_QUARANTINE_SUMMARY }
: await quarantineSeededWorktreeExecutionState(targetConnectionString);
const pausedScheduledRoutines = await pauseSeededScheduledRoutines(targetConnectionString);
const reboundWorkspaces = await rebindSeededProjectWorkspaces({
targetConnectionString,
currentCwd: input.targetPaths.cwd,
@@ -1184,6 +1341,8 @@ async function seedWorktreeDatabase(input: {
return {
backupSummary: formatDatabaseBackupResult(backup),
pausedScheduledRoutines,
executionQuarantine,
reboundWorkspaces,
};
} finally {
@@ -1262,6 +1421,8 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
const copiedGitHooks = copyGitHooksToWorktreeGitDir(cwd);
let seedSummary: string | null = null;
let seedExecutionQuarantineSummary: SeededWorktreeExecutionQuarantineSummary | null = null;
let pausedScheduledRoutineCount: number | null = null;
let reboundWorkspaceSummary: SeedWorktreeDatabaseResult["reboundWorkspaces"] = [];
if (opts.seed !== false) {
if (!sourceConfig) {
@@ -1279,8 +1440,11 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
targetPaths: paths,
instanceId,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
});
seedSummary = seeded.backupSummary;
seedExecutionQuarantineSummary = seeded.executionQuarantine;
pausedScheduledRoutineCount = seeded.pausedScheduledRoutines;
reboundWorkspaceSummary = seeded.reboundWorkspaces;
spinner.stop(`Seeded isolated worktree database (${seedMode}).`);
} catch (error) {
@@ -1303,6 +1467,16 @@ async function runWorktreeInit(opts: WorktreeInitOptions): Promise<void> {
if (seedSummary) {
p.log.message(pc.dim(`Seed mode: ${seedMode}`));
p.log.message(pc.dim(`Seed snapshot: ${seedSummary}`));
if (opts.preserveLiveWork) {
p.log.warning("Preserved copied live work; this worktree instance may auto-run source-instance assignments.");
} else if (seedExecutionQuarantineSummary) {
p.log.message(
pc.dim(`Seed execution quarantine: ${formatSeededWorktreeExecutionQuarantineSummary(seedExecutionQuarantineSummary)}`),
);
}
if (pausedScheduledRoutineCount != null) {
p.log.message(pc.dim(`Paused scheduled routines: ${pausedScheduledRoutineCount}`));
}
for (const rebound of reboundWorkspaceSummary) {
p.log.message(
pc.dim(`Rebound workspace ${rebound.name}: ${rebound.fromCwd} -> ${rebound.toCwd}`),
@@ -2947,11 +3121,20 @@ async function runWorktreeReseed(opts: WorktreeReseedOptions): Promise<void> {
targetPaths,
instanceId: targetPaths.instanceId,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
});
spinner.stop(`Reseeded ${targetEndpoint.label} (${seedMode}).`);
p.log.message(pc.dim(`Source: ${source.configPath}`));
p.log.message(pc.dim(`Target: ${targetEndpoint.configPath}`));
p.log.message(pc.dim(`Seed snapshot: ${seeded.backupSummary}`));
if (opts.preserveLiveWork) {
p.log.warning("Preserved copied live work; this worktree instance may auto-run source-instance assignments.");
} else {
p.log.message(
pc.dim(`Seed execution quarantine: ${formatSeededWorktreeExecutionQuarantineSummary(seeded.executionQuarantine)}`),
);
}
p.log.message(pc.dim(`Paused scheduled routines: ${seeded.pausedScheduledRoutines}`));
for (const rebound of seeded.reboundWorkspaces) {
p.log.message(
pc.dim(`Rebound workspace ${rebound.name}: ${rebound.fromCwd} -> ${rebound.toCwd}`),
@@ -3015,6 +3198,7 @@ export async function worktreeRepairCommand(opts: WorktreeRepairOptions): Promis
fromConfig: source.configPath,
to: target.rootPath,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
yes: true,
allowLiveTarget: opts.allowLiveTarget,
});
@@ -3047,6 +3231,7 @@ export async function worktreeRepairCommand(opts: WorktreeRepairOptions): Promis
fromInstance: opts.fromInstance,
seed: opts.noSeed ? false : true,
seedMode,
preserveLiveWork: opts.preserveLiveWork,
force: true,
});
} finally {
@@ -3070,6 +3255,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--server-port <port>", "Preferred server port", (value) => Number(value))
.option("--db-port <port>", "Preferred embedded Postgres port", (value) => Number(value))
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Skip database seeding from the source instance")
.option("--force", "Replace existing repo-local config and isolated instance data", false)
.action(worktreeMakeCommand);
@@ -3086,6 +3272,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--server-port <port>", "Preferred server port", (value) => Number(value))
.option("--db-port <port>", "Preferred embedded Postgres port", (value) => Number(value))
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Skip database seeding from the source instance")
.option("--force", "Replace existing repo-local config and isolated instance data", false)
.action(worktreeInitCommand);
@@ -3125,6 +3312,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--from-data-dir <path>", "Source PAPERCLIP_HOME used when deriving the source config")
.option("--from-instance <id>", "Source instance id when deriving the source config")
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: full)", "full")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--yes", "Skip the destructive confirmation prompt", false)
.option("--allow-live-target", "Override the guard that requires the target worktree DB to be stopped first", false)
.action(worktreeReseedCommand);
@@ -3138,6 +3326,7 @@ export function registerWorktreeCommands(program: Command): void {
.option("--from-data-dir <path>", "Source PAPERCLIP_HOME used when deriving the source config")
.option("--from-instance <id>", "Source instance id when deriving the source config (default: default)")
.option("--seed-mode <mode>", "Seed profile: minimal or full (default: minimal)", "minimal")
.option("--preserve-live-work", "Do not quarantine copied agent timers or assigned open issues in the seeded worktree", false)
.option("--no-seed", "Repair metadata only and skip reseeding when bootstrapping a missing worktree config", false)
.option("--allow-live-target", "Override the guard that requires the target worktree DB to be stopped first", false)
.action(worktreeRepairCommand);

View File

@@ -8,6 +8,7 @@ import { heartbeatRun } from "./commands/heartbeat-run.js";
import { runCommand } from "./commands/run.js";
import { bootstrapCeoInvite } from "./commands/auth-bootstrap-ceo.js";
import { dbBackupCommand } from "./commands/db-backup.js";
import { registerEnvLabCommands } from "./commands/env-lab.js";
import { registerContextCommands } from "./commands/client/context.js";
import { registerCompanyCommands } from "./commands/client/company.js";
import { registerIssueCommands } from "./commands/client/issue.js";
@@ -147,6 +148,7 @@ registerDashboardCommands(program);
registerRoutineCommands(program);
registerFeedbackCommands(program);
registerWorktreeCommands(program);
registerEnvLabCommands(program);
registerPluginCommands(program);
const auth = program.command("auth").description("Authentication and bootstrap utilities");

View File

@@ -2,7 +2,7 @@
Paperclip CLI now supports both:
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`)
- instance setup/diagnostics (`onboard`, `doctor`, `configure`, `env`, `allowed-hostname`, `env-lab`)
- control-plane client operations (issues, approvals, agents, activity, dashboard)
## Base Usage
@@ -45,6 +45,15 @@ Allow an authenticated/private hostname (for example custom Tailscale DNS):
pnpm paperclipai allowed-hostname dotta-macbook-pro
```
Bring up the default local SSH fixture for environment testing:
```sh
pnpm paperclipai env-lab up
pnpm paperclipai env-lab doctor
pnpm paperclipai env-lab status --json
pnpm paperclipai env-lab down
```
All client commands support:
- `--data-dir <path>`

View File

@@ -27,6 +27,18 @@ pnpm db:migrate
When `DATABASE_URL` is unset, this command targets the current embedded PostgreSQL instance for your active Paperclip config/instance.
Issue reference mentions follow the normal migration path: the schema migration creates the tracking table, but it does not backfill historical issue titles, descriptions, comments, or documents automatically.
To backfill existing content manually after migrating, run:
```sh
pnpm issue-references:backfill
# optional: limit to one company
pnpm issue-references:backfill -- --company <company-id>
```
Future issue, comment, and document writes sync references automatically without running the backfill command.
This mode is ideal for local development and one-command installs.
Docker note: the Docker quickstart image also uses embedded PostgreSQL by default. Persist `/paperclip` to keep DB state across container restarts (see `doc/DOCKER.md`).
@@ -47,11 +59,11 @@ cp .env.example .env
# DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip
```
Run migrations (once the migration generation issue is fixed) or use `drizzle-kit push`:
Run migrations:
```sh
DATABASE_URL=postgres://paperclip:paperclip@localhost:5432/paperclip \
npx drizzle-kit push
pnpm db:migrate
```
Start the server:
@@ -88,27 +100,27 @@ postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:
### Configure
Set `DATABASE_URL` in your `.env`:
For the application runtime, use a direct PostgreSQL connection unless the database client has explicit prepared-statement configuration for your pooling mode:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
If you later run the app with a pooled runtime URL, set `DATABASE_MIGRATION_URL` to the direct connection URL. Paperclip uses it for startup schema checks/migrations and plugin namespace migrations, while the app continues to use `DATABASE_URL` for runtime queries:
```sh
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:6543/postgres
DATABASE_MIGRATION_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@aws-0-[REGION].pooler.supabase.com:5432/postgres
```
If using connection pooling (port 6543), the `postgres` client must disable prepared statements. Update `packages/db/src/client.ts`:
```ts
export function createDb(url: string) {
const sql = postgres(url, { prepare: false });
return drizzlePg(sql, { schema });
}
```
If your hosted database requires transaction-pooling-only connections, use a direct or session-pooled connection for Paperclip until runtime pooling support is documented in this guide. Do not edit database client source files as part of deployment setup.
### Push the schema
```sh
# Use the direct connection (port 5432) for schema changes
DATABASE_URL=postgres://postgres.[PROJECT-REF]:[PASSWORD]@...5432/postgres \
npx drizzle-kit push
pnpm db:migrate
```
### Free tier limits
@@ -131,6 +143,22 @@ The database mode is controlled by `DATABASE_URL`:
Your Drizzle schema (`packages/db/src/schema/`) stays the same regardless of mode.
## Plugin database namespaces
The plugin runtime tracks plugin-owned database namespaces and migrations in `plugin_database_namespaces` and `plugin_migrations`. Hosted deployments that separate runtime and migration connections should set `DATABASE_MIGRATION_URL`; plugin namespace migration work uses the migration connection when present.
## Backups
Paperclip supports automatic and manual logical database backups. These dumps include
non-system database schemas such as `public`, the Drizzle migration journal, and
plugin-owned database schemas. See `doc/DEVELOPING.md` for the current
`paperclipai db:backup` / `pnpm db:backup` commands and backup retention
configuration.
Database backups do not include non-database instance files such as local-disk
uploads, workspace files, or the local encrypted secrets master key. Back those paths
up separately when you need full instance disaster recovery.
## Secret storage
Paperclip stores secret metadata and versions in:

View File

@@ -142,3 +142,4 @@ This prevents lockout when a user migrates from long-running local trusted usage
- implementation plan: `doc/plans/deployment-auth-mode-consolidation.md`
- V1 contract: `doc/SPEC-implementation.md`
- operator workflows: `doc/DEVELOPING.md` and `doc/CLI.md`
- invite/join state map: `doc/spec/invite-flow.md`

View File

@@ -43,6 +43,19 @@ This starts:
`pnpm dev` and `pnpm dev:once` are now idempotent for the current repo and instance: if the matching Paperclip dev runner is already alive, Paperclip reports the existing process instead of starting a duplicate.
Issue execution may also use project execution workspace policies and workspace runtime services for per-project worktrees, preview servers, and managed dev commands. Configure those through the project workspace/runtime surfaces rather than starting long-running unmanaged processes when a task needs a reusable service.
## Storybook
The board UI Storybook keeps stories and Storybook config under `ui/storybook/` so component review files stay out of the app source routes.
```sh
pnpm storybook
pnpm build-storybook
```
These run the `@paperclipai/ui` Storybook on port `6006` and build the static output to `ui/storybook-static/`.
Inspect or stop the current repo's managed dev runner:
```sh
@@ -102,6 +115,8 @@ pnpm test:release-smoke
These browser suites are intended for targeted local verification and CI, not the default agent/human test command.
For normal issue work, start with the smallest targeted check that proves the change. Reserve repo-wide typecheck/build/test runs for PR-ready handoff or changes broad enough that narrow checks do not cover the risk.
## One-Command Local Run
For a first-time local install, you can bootstrap and run in one command:
@@ -183,6 +198,8 @@ For `codex_local`, Paperclip also manages a per-company Codex home under the ins
If the `codex` CLI is not installed or not on `PATH`, `codex_local` agent runs fail at execution time with a clear adapter error. Quota polling uses a short-lived `codex app-server` subprocess: when `codex` cannot be spawned, that provider reports `ok: false` in aggregated quota results and the API server keeps running (it must not exit on a missing binary).
Local adapters require their corresponding CLI/session setup on the machine running Paperclip. External adapters are installed through the adapter/plugin flow and should not require hardcoded imports in `server/` or `ui/`.
## Worktree-local Instances
When developing from multiple git worktrees, do not point two Paperclip servers at the same embedded PostgreSQL data directory.
@@ -209,6 +226,8 @@ Seed modes:
- `full` makes a full logical clone of the source instance
- `--no-seed` creates an empty isolated instance
Seeded worktree instances quarantine copied live execution by default for both `minimal` and `full` seeds. During restore, Paperclip disables copied agent timer heartbeats, resets copied `running` agents to `idle`, blocks and unassigns copied agent-owned `in_progress` issues, and unassigns copied agent-owned `todo`/`in_review` issues. This keeps a freshly booted worktree from starting agents for work already owned by the source instance. Pass `--preserve-live-work` only when you intentionally want the isolated worktree to resume copied assignments.
After `worktree init`, both the server and the CLI auto-load the repo-local `.paperclip/.env` when run inside that worktree, so normal commands like `pnpm dev`, `paperclipai doctor`, and `paperclipai db:backup` stay scoped to the worktree instance.
`pnpm dev` now fails fast in a linked git worktree when `.paperclip/.env` is missing, instead of silently booting against the default instance/port. If that happens, run `paperclipai worktree init` in the worktree first.
@@ -222,6 +241,8 @@ That repo-local env also sets:
- `PAPERCLIP_WORKTREE_COLOR=<hex-color>`
The server/UI use those values for worktree-specific branding such as the top banner and dynamically colored favicon.
Authenticated worktree servers also use the `PAPERCLIP_INSTANCE_ID` value to scope Better Auth cookie names.
Browser cookies are shared by host rather than port, so this prevents logging into one `127.0.0.1:<port>` worktree from replacing another worktree server's session cookie.
Print shell exports explicitly when needed:
@@ -400,7 +421,9 @@ If you set `DATABASE_URL`, the server will use that instead of embedded PostgreS
## Automatic DB Backups
Paperclip can run automatic DB backups on a timer. Defaults:
Paperclip can run automatic logical database backups on a timer. These backups cover
non-system database schemas, including migration history and plugin-owned database
schemas. Defaults:
- enabled
- every 60 minutes
@@ -428,6 +451,10 @@ Environment overrides:
- `PAPERCLIP_DB_BACKUP_RETENTION_DAYS=<days>`
- `PAPERCLIP_DB_BACKUP_DIR=/absolute/or/~/path`
DB backups are not full instance filesystem backups. For full local disaster
recovery, also back up local storage files and the local encrypted secrets key if
those providers are enabled.
## Secrets in Dev
Agent env vars now support secret references. By default, secret values are stored with local encryption and only secret refs are persisted in agent config.

View File

@@ -23,7 +23,7 @@ Paperclip is the command, communication, and control plane for a company of AI a
- **Track work in real time** — see at any moment what every agent is working on
- **Control costs** — token salary budgets per agent, spend tracking, burn rate
- **Align to goals** — agents see how their work serves the bigger mission
- **Store company knowledge** — a shared brain for the organization
- **Preserve work context** — comments, documents, work products, attachments, and company state stay attached to the work
## Architecture
@@ -36,17 +36,20 @@ The central nervous system. Manages:
- Agent registry and org chart
- Task assignment and status
- Budget and token spend tracking
- Company knowledge base
- Issue comments, documents, work products, attachments, and company state
- Goal hierarchy (company → team → agent → task)
- Heartbeat monitoring — know when agents are alive, idle, or stuck
It also enforces execution-control semantics such as single-assignee issues, atomic checkout and execution locks, blockers, recovery issues, and workspace/runtime controls.
### 2. Execution Services (adapters)
Agents run externally and report into the control plane. An agent is just Python code that gets kicked off and does work. Adapters connect different execution environments:
Agents run externally and report into the control plane. Adapters connect different execution environments and define how a heartbeat is invoked, observed, and cancelled:
- **OpenClaw** — initial adapter target
- **Heartbeat loop** — simple custom Python that loops, checks in, does work
- **Others** — any runtime that can call an API
- **Local CLI/session adapters** — built-in adapters for tools such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor
- **HTTP/process-style adapters** — command or webhook/API integrations for custom runtimes
- **OpenClaw gateway** — integration for OpenClaw-style remote agents
- **External adapter plugins** — dynamically loaded adapters installed outside the core app
The control plane doesn't run agents. It orchestrates them. Agents run wherever they run and phone home.

View File

@@ -32,12 +32,14 @@ Then you define who reports to the CEO: a CTO managing programmers, a CMO managi
### Agent Execution
There are two fundamental modes for running an agent's heartbeat:
Paperclip supports several ways to run an agent's heartbeat:
1. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
2. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." (OpenClaw hooks work this way.)
1. **Local CLI/session adapters** — Paperclip starts or resumes local coding-tool sessions such as Claude Code, Codex, Gemini, OpenCode, Pi, and Cursor, then tracks the run.
2. **Run a command** — Paperclip kicks off a process (shell command, Python script, etc.) and tracks it. The heartbeat is "execute this and monitor it."
3. **Fire and forget a request** — Paperclip sends a webhook/API call to an externally running agent. The heartbeat is "notify this agent to wake up." OpenClaw-style hooks work this way.
4. **External adapter plugins** — Paperclip loads adapter packages through the plugin/adapter flow so self-hosted installs can add runtimes without hardcoding them in core.
We provide sensible defaults — a default agent that shells out to Claude Code or Codex with your configuration, remembers session IDs, runs basic scripts. But you can plug in anything.
Agent runs can use project and execution workspaces, managed runtime services such as preview/dev servers, adapter-specific session state, and HTTP/webhook-style execution. We provide sensible defaults, but the adapter is still the boundary: if a runtime can be invoked, observed, and authorized, Paperclip can coordinate it.
### Task Management
@@ -54,7 +56,7 @@ I am researching the Facebook ads Granola uses (current task)
Tasks have parentage. Every task exists in service of a parent task, all the way up to the company goal. This is what keeps autonomous agents aligned — they can always answer "why am I doing this?"
More detailed task structure TBD.
The current issue model includes stable issue identifiers, parent/sub-issues, blockers, a single assignee, comments, issue documents, attachments and work products, and review/approval handoffs. That structure keeps work inspectable by both the board and agents while still allowing agents to decompose work into smaller tasks.
## Principles
@@ -115,7 +117,7 @@ Paperclips core identity is a **control plane for autonomous AI companies**,
- Do not make the core product a general chat app. The current product definition is explicitly task/comment-centric and “not a chatbot,” and that boundary is valuable.
- Do not build a complete Jira/GitHub replacement. The repo/docs already position Paperclip as organization orchestration, not focused on pull-request review.
- Do not build enterprise-grade RBAC first. The current V1 spec still treats multi-board governance and fine-grained human permissions as out of scope, so the first multi-user version should be coarse and company-scoped.
- Do not build enterprise-grade RBAC first. Paperclip now has authenticated mode, company memberships, instance roles, and permission grants, but fine-grained enterprise governance should remain secondary to the core company control plane.
- Do not lead with raw bash logs and transcripts. Default view should be human-readable intent/progress, with raw detail beneath.
- Do not force users to understand provider/API-key plumbing unless absolutely necessary. There are active onboarding/auth issues already; friction here is clearly real.
@@ -136,11 +138,14 @@ Paperclips core identity is a **control plane for autonomous AI companies**,
5. **Output-first**
Work is not done until the user can see the result: file, document, preview link, screenshot, plan, or PR.
6. **Local-first, cloud-ready**
6. **Execution visibility without log worship**
Active runs, recovery issues, productivity review states, blockers, and work products should be first-class surfaces. Raw transcripts are available when needed, but they are not the primary product surface.
7. **Local-first, cloud-ready**
The mental model should not change between local solo use and shared/private or public/cloud deployment.
7. **Safe autonomy**
8. **Safe autonomy**
Auto mode is allowed; hidden token burn is not.
8. **Thin core, rich edges**
9. **Thin core, rich edges**
Put optional chat, knowledge, and special surfaces into plugins/extensions rather than bloating the control plane.

View File

@@ -115,38 +115,6 @@ If the first real publish returns npm `E404`, check npm-side prerequisites befor
- The initial publish must include `--access public` for a public scoped package.
- npm also requires either account 2FA for publishing or a granular token that is allowed to bypass 2FA.
### Manual first publish for `@paperclipai/mcp-server`
If you need to publish only the MCP server package once by hand, use:
- `@paperclipai/mcp-server`
Recommended flow from the repo root:
```bash
# optional sanity check: this 404s until the first publish exists
npm view @paperclipai/mcp-server version
# make sure the build output is fresh
pnpm --filter @paperclipai/mcp-server build
# confirm your local npm auth before the real publish
npm whoami
# safe preview of the exact publish payload
cd packages/mcp-server
pnpm publish --dry-run --no-git-checks --access public
# real publish
pnpm publish --no-git-checks --access public
```
Notes:
- Publish from `packages/mcp-server/`, not the repo root.
- If `npm view @paperclipai/mcp-server version` already returns the same version that is in [`packages/mcp-server/package.json`](../packages/mcp-server/package.json), do not republish. Bump the version or use the normal repo-wide release flow in [`scripts/release.sh`](../scripts/release.sh).
- The same npm-side prerequisites apply as above: valid npm auth, permission to publish to the `@paperclipai` scope, `--access public`, and the required publish auth/2FA policy.
## Version formats
Paperclip uses calendar versions:
@@ -175,6 +143,13 @@ This keeps the default install path unchanged while allowing explicit installs w
npx paperclipai@canary onboard
```
The release script now verifies two things after a canary publish:
- the `canary` dist-tag resolves to the version that was just published
- every published internal `@paperclipai/*` dependency referenced by that manifest exists on npm
It also treats `latest -> canary` as a failure by default, because npm metadata can otherwise leave the default install path pointing at an unreleased canary dependency graph. Only pass `./scripts/release.sh canary --allow-canary-latest` when that `latest` behavior is explicitly intended.
### Stable
Stable publishes use the npm dist-tag `latest`.
@@ -201,6 +176,58 @@ That means:
See [doc/RELEASE-AUTOMATION-SETUP.md](RELEASE-AUTOMATION-SETUP.md) for the GitHub/npm setup steps.
## Release enrollment for new public packages
Paperclip does not auto-publish every non-private workspace package anymore.
CI publishing is controlled by [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json).
When you add a new public package:
1. add it to the manifest and decide whether CI should publish it immediately
2. if CI should publish it, bootstrap the package on npm before merge
3. if CI should not publish it yet, keep `"publishFromCi": false`
4. only enable `"publishFromCi": true` after npm trusted publishing is configured for that package
PR CI now checks changed release-enabled package manifests against npm. That catches a missing first-publish bootstrap before the change reaches `master`.
### One-time bootstrap sequence for a new package
The first publish of a brand-new package still needs one human maintainer with npm write access.
After that, trusted publishing can take over.
Example for `@paperclipai/adapter-acpx-local` from the repo root:
```bash
# safe preview
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local
# one-time first publish from an authenticated maintainer machine
pnpm run release:bootstrap-package -- @paperclipai/adapter-acpx-local --publish --otp 123456
```
The helper script:
- checks that the package does not already exist on npm
- builds the target package unless `--skip-build` is passed
- runs `npm pack --dry-run` in the package directory
- only runs the real `npm publish --access public` when `--publish --otp <code>` is provided
For the real `--publish` step, the maintainer machine must already be authenticated to npm.
If `npm whoami` returns `401`, first run `npm logout --registry=https://registry.npmjs.org/` to clear any stale local auth, then run `npm login` or `npm adduser` locally as an npm org member, and finally rerun the helper.
That local human auth is fine for the one-time bootstrap publish; we just do not want the same auth model inside CI.
The helper now requires `--otp <code>` up front for `--publish`, so it fails before the real publish attempt if the one-time password is missing.
After that first publish succeeds:
1. open `https://www.npmjs.com/package/@paperclipai/adapter-acpx-local`
2. go to `Settings``Trusted publishing`
3. add repository `paperclipai/paperclip`
4. set workflow filename to `release.yml`
5. optionally go to `Settings``Publishing access` and enable `Require two-factor authentication and disallow tokens`
6. keep `publishFromCi: true` in [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
Once those steps are done, future canary and stable publishes for that package are automated through GitHub OIDC. The manual step is only the first package creation on npm.
## Rollback model
Rollback does not unpublish anything.

View File

@@ -67,6 +67,27 @@ Why:
- the single `release.yml` workflow handles both canary and stable publishing
- GitHub environments `npm-canary` and `npm-stable` still enforce different approval rules on the GitHub side
### 2.2.1. Newly added public packages need a bootstrap phase
Trusted publishing is configured on the npm package itself, not at the repo scope.
That means a brand-new public package must not be auto-enrolled into CI publishing until its npm package exists and its trusted publisher has been configured.
Repo policy:
1. add every non-private package to [`scripts/release-package-manifest.json`](../scripts/release-package-manifest.json)
2. set `"publishFromCi": true` only when CI is expected to publish that package
3. if the package is not ready for CI publishing yet, keep `"publishFromCi": false`
4. complete the package bootstrap before merging any PR that changes a release-enabled new package
Bootstrap sequence for a new package:
1. publish the package once from a trusted maintainer machine using normal npm auth
2. open that package on npm and add the `paperclipai/paperclip` trusted publisher for `.github/workflows/release.yml`
3. rerun or dry-run the release flow as needed to confirm CI publishing now works
4. only then enable `"publishFromCi": true`
PR CI enforces this by checking changed release-enabled package manifests against npm. That keeps `master` canary publishing healthy while preserving the no-long-lived-token model for normal CI releases.
### 2.3. Verify trusted publishing before removing old auth
After the workflows are live:

View File

@@ -63,6 +63,8 @@ It:
- verifies the pushed commit
- computes the canary version for the current UTC date
- publishes under npm dist-tag `canary`
- verifies that `canary` resolves to the just-published version and that published internal dependencies exist on npm
- fails by default if npm leaves `latest` pointing at a canary; use `--allow-canary-latest` only when that state is intentional
- creates a git tag `canary/vYYYY.MDD.P-canary.N`
Users install canaries with:

View File

@@ -1,7 +1,7 @@
# Paperclip V1 Implementation Spec
Status: Implementation contract for first release (V1)
Date: 2026-02-17
Date: 2026-04-28
Audience: Product, engineering, and agent-integration authors
Source inputs: `GOAL.md`, `PRODUCT.md`, `SPEC.md`, `DATABASE.md`, current monorepo code
@@ -37,8 +37,9 @@ These decisions close open questions from `SPEC.md` for V1.
| Visibility | Full visibility to board and all agents in same company |
| Communication | Tasks + comments only (no separate chat system) |
| Task ownership | Single assignee; atomic checkout required for `in_progress` transition |
| Recovery | No automatic reassignment; work recovery stays manual/explicit |
| Agent adapters | Built-in `process` and `http` adapters |
| Recovery | Liveness/watchdog recovery preserves explicit ownership: retry lost execution continuity where safe, otherwise create visible recovery issues or require human escalation (see `doc/execution-semantics.md`) |
| Agent adapters | Built-in `process`, `http`, local CLI/session adapters, and OpenClaw gateway support; external adapters can also be loaded through the adapter plugin flow |
| Plugin framework | Local/self-hosted early plugin runtime is in scope; cloud marketplace and packaged public distribution remain out of scope |
| Auth | Mode-dependent human auth (`local_trusted` implicit board in current code; authenticated mode uses sessions), API keys for agents |
| Budget period | Monthly UTC calendar window |
| Budget enforcement | Soft alerts + hard limit auto-pause |
@@ -73,7 +74,7 @@ V1 implementation extends this baseline into a company-centric, governance-aware
## 5.2 Out of Scope (V1)
- Plugin framework and third-party extension SDK
- Cloud-grade plugin marketplace/distribution beyond the local/self-hosted plugin runtime
- Revenue/expense accounting beyond model/token costs
- Knowledge base subsystem
- Public marketplace (ClipHub)
@@ -123,6 +124,16 @@ Human auth tables (`users`, `sessions`, and provider-specific auth artifacts) ar
- `name` text not null
- `description` text null
- `status` enum: `active | paused | archived`
- `pause_reason` text null
- `paused_at` timestamptz null
- `issue_prefix` text not null
- `issue_counter` int not null
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- `attachment_max_bytes` int not null
- `require_board_approval_for_new_agents` boolean not null default false
- feedback sharing consent fields
- branding fields such as `brand_color`
Invariant: every business record belongs to exactly one company.
@@ -133,15 +144,21 @@ Invariant: every business record belongs to exactly one company.
- `name` text not null
- `role` text not null
- `title` text null
- `status` enum: `active | paused | idle | running | error | terminated`
- `icon` text null
- `status` enum: `active | paused | idle | running | error | pending_approval | terminated`
- `reports_to` uuid fk `agents.id` null
- `capabilities` text null
- `adapter_type` enum: `process | http`
- `adapter_type` text; built-ins include `process`, `http`, `claude_local`, `codex_local`, `gemini_local`, `opencode_local`, `pi_local`, `cursor`, and `openclaw_gateway`
- `adapter_config` jsonb not null
- `runtime_config` jsonb not null default `{}`; may include Paperclip runtime policy such as `modelProfiles.cheap.adapterConfig` for an optional low-cost model lane that does not change the primary adapter config
- `default_environment_id` uuid fk `environments.id` null
- `context_mode` enum: `thin | fat` default `thin`
- `budget_monthly_cents` int not null default 0
- `spent_monthly_cents` int not null default 0
- pause fields: `pause_reason`, `paused_at`
- `permissions` jsonb not null default `{}`
- `last_heartbeat_at` timestamptz null
- `metadata` jsonb null
Invariants:
@@ -195,6 +212,7 @@ Invariant:
- `id` uuid pk
- `company_id` uuid fk not null
- `project_id` uuid fk `projects.id` null
- `project_workspace_id` uuid fk `project_workspaces.id` null
- `goal_id` uuid fk `goals.id` null
- `parent_id` uuid fk `issues.id` null
- `title` text not null
@@ -202,13 +220,22 @@ Invariant:
- `status` enum: `backlog | todo | in_progress | in_review | done | blocked | cancelled`
- `priority` enum: `critical | high | medium | low`
- `assignee_agent_id` uuid fk `agents.id` null
- `assignee_user_id` text null
- checkout/execution locks: `checkout_run_id`, `execution_run_id`, `execution_agent_name_key`, `execution_locked_at`
- `created_by_agent_id` uuid fk `agents.id` null
- `created_by_user_id` uuid fk `users.id` null
- identifier fields: `issue_number`, `identifier`
- origin fields: `origin_kind`, `origin_id`, `origin_run_id`, `origin_fingerprint`
- `request_depth` int not null default 0
- `billing_code` text null
- `assignee_adapter_overrides` jsonb null
- `execution_policy` jsonb null
- `execution_state` jsonb null
- execution workspace fields: `execution_workspace_id`, `execution_workspace_preference`, `execution_workspace_settings`
- `started_at` timestamptz null
- `completed_at` timestamptz null
- `cancelled_at` timestamptz null
- `hidden_at` timestamptz null
Invariants:
@@ -261,10 +288,10 @@ Invariant: each event must attach to agent and company; rollups are aggregation,
- `id` uuid pk
- `company_id` uuid fk not null
- `type` enum: `hire_agent | approve_ceo_strategy`
- `type` enum: `hire_agent | approve_ceo_strategy | budget_override_required | request_board_approval`
- `requested_by_agent_id` uuid fk `agents.id` null
- `requested_by_user_id` uuid fk `users.id` null
- `status` enum: `pending | approved | rejected | cancelled`
- `status` enum: `pending | revision_requested | approved | rejected | cancelled`
- `payload` jsonb not null
- `decision_note` text null
- `decided_by_user_id` uuid fk `users.id` null
@@ -363,6 +390,15 @@ Operational policy:
- `document_id` uuid fk not null
- `key` text not null (`plan`, `design`, `notes`, etc.)
## 7.16 Current Implementation Addenda
The current implementation includes additional V1-control-plane tables beyond the original February snapshot:
- Issue structure and review: `issue_relations` for blockers, `labels`/`issue_labels`, `issue_thread_interactions`, `issue_approvals`, `issue_execution_decisions`, `issue_work_products`, `issue_inbox_archives`, `issue_read_states`, and issue reference mention indexes.
- Execution and workspace control: `execution_workspaces`, `project_workspaces`, `workspace_runtime_services`, `workspace_operations`, `environments`, `environment_leases`, `agent_task_sessions`, `agent_runtime_state`, `agent_wakeup_requests`, heartbeat events, and watchdog decision tables.
- Plugins and routines: `plugins`, plugin config/state/entities/jobs/logs/webhooks, plugin database namespaces/migrations, plugin company settings, and `routines`.
- Access and operations: company memberships, instance roles, principal permission grants, invites, join requests, board API keys, CLI auth challenges, budget policies/incidents, feedback exports/votes, company skills, sidebar preferences, and company logos.
## 8. State Machines
## 8.1 Agent Status
@@ -395,7 +431,14 @@ Side effects:
- entering `done` sets `completed_at`
- entering `cancelled` sets `cancelled_at`
Detailed ownership, execution, blocker, and crash-recovery semantics are documented in `doc/execution-semantics.md`.
V1 non-terminal liveness rule:
- agent-owned `todo`, `in_progress`, `in_review`, and `blocked` issues must have a live execution path, an explicit waiting path, or an explicit recovery path
- `in_review` is healthy only when a typed execution participant, pending issue-thread interaction or approval, user owner, active run, queued wake, or explicit recovery issue owns the next action
- a blocked chain is covered only when each unresolved leaf issue is live or explicitly waiting
- when Paperclip cannot safely infer the next action, it surfaces the problem through visible blocked/recovery work instead of silently completing or reassigning work
Detailed ownership, execution, blocker, active-run watchdog, crash-recovery, and non-terminal liveness semantics are documented in `doc/execution-semantics.md`.
## 8.3 Approval Status
@@ -484,6 +527,7 @@ All endpoints are under `/api` and return JSON.
- `DELETE /issues/:issueId/documents/:key`
- `POST /issues/:issueId/checkout`
- `POST /issues/:issueId/release`
- `POST /issues/:issueId/admin/force-release` (board-only lock recovery)
- `POST /issues/:issueId/comments`
- `GET /issues/:issueId/comments`
- `POST /companies/:companyId/issues/:issueId/attachments` (multipart upload)
@@ -508,6 +552,8 @@ Server behavior:
2. if updated row count is 0, return `409` with current owner/status
3. successful checkout sets `assignee_agent_id`, `status = in_progress`, and `started_at`
`POST /issues/:issueId/admin/force-release` is an operator recovery endpoint for stale harness locks. It requires board access to the issue company, clears checkout and execution run lock fields, and may clear the agent assignee when `clearAssignee=true` is passed. The route must write an `issue.admin_force_release` activity log entry containing the previous checkout and execution run IDs.
## 10.5 Projects
- `GET /companies/:companyId/projects`
@@ -553,6 +599,17 @@ Dashboard payload must include:
- `422` semantic rule violation
- `500` server error
## 10.10 Current Implementation API Addenda
The current app also exposes V1-supporting surfaces for:
- issue thread interactions (`suggest_tasks`, `ask_user_questions`, `request_confirmation`)
- issue approvals, issue references/search, labels, read state, inbox/archive state, and work products
- execution workspaces, project workspaces, workspace runtime services, and workspace operations
- routines and scheduled/API/webhook triggers
- plugin installation, configuration, state, jobs, logs, webhooks, and plugin database namespace migration
- company import/export preview/apply, feedback export/vote routes, instance backup/config routes, invites, join requests, memberships, and permission grants
## 11. Heartbeat and Adapter Contract
## 11.1 Adapter Interface
@@ -619,7 +676,7 @@ Per-agent schedule fields in `adapter_config`:
- `enabled` boolean
- `intervalSec` integer (minimum 30)
- `maxConcurrentRuns` fixed at `1` for V1
- `maxConcurrentRuns` integer; new agents default to `20`; scheduler clamps configured values to `1..50`
Scheduler must skip invocation when:
@@ -728,13 +785,14 @@ Required UX behaviors:
- Node 20+
- `DATABASE_URL` optional
- if unset, auto-use PGlite and push schema
- if unset, auto-use embedded PostgreSQL under `~/.paperclip/instances/default/db`
## 15.2 Migrations
- Drizzle migrations are source of truth
- local/dev startup applies pending migrations automatically where supported
- `pnpm db:migrate` applies pending migrations manually
- no destructive migration in-place for V1 upgrade path
- provide migration script from existing minimal tables to company-scoped schema
## 15.3 Logging and Audit
@@ -789,6 +847,8 @@ A release candidate is blocked unless these pass:
## 18. Delivery Plan
Current implementation note: the milestones below describe the original V1 sequencing. Several systems originally framed as future work have since shipped or advanced materially, including issue documents/interactions, blockers, routines, execution workspaces, import/export portability, authenticated deployment modes, multi-user basics, and the local/self-hosted plugin runtime.
## Milestone 1: Company Core and Auth
- add `companies` and company scoping to existing entities
@@ -841,7 +901,7 @@ V1 is complete only when all criteria are true:
## 20. Post-V1 Backlog (Explicitly Deferred)
- plugin architecture
- cloud-grade plugin marketplace/distribution
- richer workflow-state customization per team
- milestones/labels/dependency graph depth beyond V1 minimum
- realtime transport optimization (SSE/WebSockets)

Binary file not shown.

After

Width:  |  Height:  |  Size: 174 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 174 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 177 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 177 KiB

View File

@@ -1,7 +1,7 @@
# Execution Semantics
Status: Current implementation guide
Date: 2026-04-13
Date: 2026-04-26
Audience: Product and engineering
This document explains how Paperclip interprets issue assignment, issue status, execution runs, wakeups, parent/sub-issue structure, and blocker relationships.
@@ -67,13 +67,15 @@ This is the right state for:
- waiting on another issue
- waiting on a human decision
- waiting on an external dependency or system
- waiting on an external dependency or system when Paperclip does not own a scheduled re-check
- work that automatic recovery could not safely continue
### `in_review`
Execution work is paused because the next move belongs to a reviewer or approver, not the current executor.
An external review service can also be a valid review path when the issue keeps an agent assignee and has an active one-shot monitor that will wake that assignee to check the service later.
### `done`
The work is complete and terminal.
@@ -146,13 +148,28 @@ Use it for:
- explicit waiting relationships
- automatic wakeups when all blockers resolve
Blocked issues should stay idle while blockers remain unresolved. Paperclip should not create a queued heartbeat run for that issue until the final blocker is done and the `issue_blockers_resolved` wake can start real work.
If a parent is truly waiting on a child, model that with blockers. Do not rely on the parent/child relationship alone.
## 7. Consistent Execution Path Rules
## 7. Non-Terminal Issue Liveness Contract
For agent-assigned, non-terminal, actionable issues, Paperclip should not leave work in a state where nobody is working it and nothing will wake it.
For agent-owned, non-terminal issues, Paperclip should never leave work in a state where nobody is responsible for the next move and nothing will wake or surface it.
The relevant execution path depends on status.
This is a visibility contract, not an auto-completion contract. If Paperclip cannot safely infer the next action, it should surface the ambiguity with a blocked state, a visible comment, or an explicit recovery issue. It must not silently mark work done from prose comments or guess that a dependency is complete.
An issue is healthy when the product can answer "what moves this forward next?" without requiring a human to reconstruct intent from the whole thread. An issue is stalled when it is non-terminal but has no live execution path, no explicit waiting path, and no recovery path.
The valid action-path primitives are:
- an active run linked to the issue
- a queued wake or continuation that can be delivered to the responsible agent
- a typed execution-policy participant, such as `executionState.currentParticipant`
- a pending issue-thread interaction or linked approval that is waiting for a specific responder
- a one-shot issue monitor (`executionPolicy.monitor.nextCheckAt`) that will wake the assignee for a future check
- a human owner via `assigneeUserId`
- a first-class blocker chain whose unresolved leaf issues are themselves healthy
- an open explicit recovery issue that names the owner and action needed to restore liveness
### Agent-assigned `todo`
@@ -160,9 +177,11 @@ This is dispatch state: ready to start, not yet actively claimed.
A healthy dispatch state means at least one of these is true:
- the issue already has a queued/running wake path
- the issue is intentionally resting in `todo` after a successful agent heartbeat, not after an interrupted dispatch
- the issue has been explicitly surfaced as stranded
- the issue already has a queued wake path
- the issue is intentionally resting in `todo` after a completed agent heartbeat, with no interrupted dispatch evidence
- the issue has been explicitly surfaced as stranded through a visible blocked/recovery path
An assigned `todo` issue is stalled when dispatch was interrupted, no wake remains queued or running, and no recovery path has been opened.
### Agent-assigned `in_progress`
@@ -172,7 +191,63 @@ A healthy active-work state means at least one of these is true:
- there is an active run for the issue
- there is already a queued continuation wake
- the issue has been explicitly surfaced as stranded
- there is an active one-shot monitor that will wake the assignee for a future check
- there is an open explicit recovery issue for the lost execution path
An agent-owned `in_progress` issue is stalled when it has no active run, no queued continuation, and no explicit recovery surface. A still-running but silent process is not automatically stalled; it is handled by the active-run watchdog contract.
### `in_review`
This is review/approval state: execution is paused because the next move belongs to a reviewer, approver, board user, or recovery owner.
A healthy `in_review` issue has at least one valid action path:
- a typed execution-policy participant who can approve or request changes
- a pending issue-thread interaction or linked approval waiting for a named responder
- a human owner via `assigneeUserId`
- an active run or queued wake that is expected to process the review state
- an active one-shot monitor for an external service or async review loop that the assignee owns
- an open explicit recovery issue for an ambiguous review handoff
Agent-assigned `in_review` with no typed participant is only healthy when one of the other paths exists. Assignment to the same agent that produced the handoff is not, by itself, a review path.
An `in_review` issue is stalled when it has no typed participant, no pending interaction or approval, no user owner, no active monitor, no active run, no queued wake, and no explicit recovery issue. Paperclip should surface that state as recovery work rather than silently completing the issue or leaving blocker chains parked indefinitely.
### Issue monitors
An issue monitor is a one-shot deferred action path for agent-owned issues in `in_progress` or `in_review`.
Use a monitor when the current assignee owns a future check against an async system or external service. Examples include Greptile review loops, GitHub checks, Vercel deployments, or provider jobs where the agent should come back later and decide what happens next.
Monitor policy lives under `executionPolicy.monitor` and includes:
- `nextCheckAt`: when Paperclip should wake the assignee
- `notes`: non-secret instructions for what the assignee should check
- `serviceName`: optional non-secret external-service context
- `externalRef`: optional external-service reference input; Paperclip treats it as secret-adjacent, redacts it before persistence/visibility, and omits it from activity and wake payloads
- `timeoutAt`, `maxAttempts`, and `recoveryPolicy`: optional recovery hints for bounded waits
Monitors are not recurring intervals. When a monitor fires, Paperclip clears the scheduled monitor and queues an `issue_monitor_due` wake for the assignee. If the external service is still pending, the assignee must explicitly re-arm the monitor with a new `nextCheckAt`. If the issue moves to `done`, `cancelled`, an invalid status, or a human/unassigned owner, the monitor is cleared.
Because `serviceName` and `notes` remain visible in issue activity and wake context, operators should keep them short and non-secret. Put enough context for the assignee to know what to inspect, but do not include signed URLs, bearer tokens, customer secrets, tenant-private identifiers, or provider links with embedded credentials.
Monitor bounds are enforced. Paperclip rejects attempts to re-arm a monitor whose `timeoutAt` or `maxAttempts` is already exhausted. When a scheduled monitor reaches an exhausted bound at trigger time, Paperclip clears it and follows `recoveryPolicy`: `wake_owner` queues a bounded recovery wake for the assignee, `create_recovery_issue` opens visible recovery work, and `escalate_to_board` records a board-visible escalation comment/activity.
Use `blocked` instead of a monitor when no Paperclip assignee owns a responsible polling path. In that case, name the external owner/action or create first-class recovery/blocker work.
### `blocked`
This is explicit waiting state.
A healthy `blocked` issue has an explicit waiting path:
- first-class blockers exist, and each unresolved leaf has a valid action path under this contract
- the issue is blocked on an explicit recovery issue that itself has a live or waiting path
- the issue is waiting on a pending interaction, linked approval, human owner, or clearly named external owner/action
A blocker chain is covered only when its unresolved leaf is live or explicitly waiting. An intermediate `blocked` issue does not make the chain healthy by itself.
A `blocked` issue is stalled when the unresolved blocker leaf has no active run, queued wake, typed participant, pending interaction or approval, user owner, external owner/action, or recovery issue. In that case the parent should show the first stalled leaf instead of presenting the dependency as calmly covered.
## 8. Crash and Restart Recovery
@@ -216,15 +291,83 @@ This is an active-work continuity recovery.
Startup recovery and periodic recovery are different from normal wakeup delivery.
On startup and on the periodic recovery loop, Paperclip now does three things in sequence:
On startup and on the periodic recovery loop, Paperclip now does four things in sequence:
1. reap orphaned `running` runs
2. resume persisted `queued` runs
3. reconcile stranded assigned work
4. scan silent active runs and create or update explicit watchdog review issues
That last step is what closes the gap where issue state survives a crash but the wake/run path does not.
The stranded-work pass closes the gap where issue state survives a crash but the wake/run path does not. The silent-run scan covers the separate case where a live process exists but has stopped producing observable output.
## 10. What This Does Not Mean
## 10. Silent Active-Run Watchdog
An active run can still be unhealthy even when its process is `running`. Paperclip treats prolonged output silence as a watchdog signal, not as proof that the run is failed.
The recovery service owns this contract:
- classify active-run output silence as `ok`, `suspicious`, `critical`, `snoozed`, or `not_applicable`
- collect bounded evidence from run logs, recent run events, child issues, and blockers
- preserve redaction and truncation before evidence is written to issue descriptions
- create at most one open `stale_active_run_evaluation` issue per run
- honor active snooze decisions before creating more review work
- build the `outputSilence` summary shown by live-run and active-run API responses
Suspicious silence creates a medium-priority review issue for the selected recovery owner. Critical silence raises that review issue to high priority and blocks the source issue on the explicit evaluation task without cancelling the active process.
Watchdog decisions are explicit operator/recovery-owner decisions:
- `snooze` records an operator-chosen future quiet-until time and suppresses scan-created review work during that window
- `continue` records that the current evidence is acceptable, does not cancel or mutate the active run, and sets a 30-minute default re-arm window before the watchdog evaluates the still-silent run again
- `dismissed_false_positive` records why the review was not actionable
Operators should prefer `snooze` for known time-bounded quiet periods. `continue` is only a short acknowledgement of the current evidence; if the run remains silent after the re-arm window, the periodic watchdog scan can create or update review work again.
The board can record watchdog decisions. The assigned owner of the watchdog evaluation issue can also record them. Other agents cannot.
## 11. Auto-Recover vs Explicit Recovery vs Human Escalation
Paperclip uses three different recovery outcomes, depending on how much it can safely infer.
### Auto-Recover
Auto-recovery is allowed when ownership is clear and the control plane only lost execution continuity.
Examples:
- requeue one dispatch wake for an assigned `todo` issue whose latest run failed, timed out, or was cancelled
- requeue one continuation wake for an assigned `in_progress` issue whose live execution path disappeared
- assign an orphan blocker back to its creator when that blocker is already preventing other work
Auto-recovery preserves the existing owner. It does not choose a replacement agent.
### Explicit Recovery Issue
Paperclip creates an explicit recovery issue when the system can identify a problem but cannot safely complete the work itself.
Examples:
- automatic stranded-work retry was already exhausted
- a dependency graph has an invalid/uninvokable owner, unassigned blocker, or invalid review participant
- an active run is silent past the watchdog threshold
The source issue remains visible and blocked on the recovery issue when blocking is necessary for correctness. The recovery owner must restore a live path, resolve the source issue manually, or record the reason it is a false positive.
Instance-level issue-graph liveness auto-recovery is disabled by default. When enabled, its lookback window means "dependency paths updated within the last N hours"; older findings remain advisory and are counted as outside the configured lookback instead of creating recovery issues automatically. This is an operator noise control, not the older staleness delay for determining whether a chain is old enough to surface.
### Human Escalation
Human escalation is required when the next safe action depends on board judgment, budget/approval policy, or information unavailable to the control plane.
Examples:
- all candidate recovery owners are paused, terminated, pending approval, or budget-blocked
- the issue is human-owned rather than agent-owned
- the run is intentionally quiet but needs an operator decision before cancellation or continuation
In these cases Paperclip should leave a visible issue/comment trail instead of silently retrying.
## 12. What This Does Not Mean
These semantics do not change V1 into an auto-reassignment system.
@@ -238,9 +381,10 @@ The recovery model is intentionally conservative:
- preserve ownership
- retry once when the control plane lost execution continuity
- create explicit recovery work when the system can identify a bounded recovery owner/action
- escalate visibly when the system cannot safely keep going
## 11. Practical Interpretation
## 13. Practical Interpretation
For a board operator, the intended meaning is:

View File

@@ -10,7 +10,12 @@ It is intentionally narrower than [PLUGIN_SPEC.md](./PLUGIN_SPEC.md). The spec i
- Plugin UI runs as same-origin JavaScript inside the main Paperclip app.
- Worker-side host APIs are capability-gated.
- Plugin UI is not sandboxed by manifest capabilities.
- There is no host-provided shared React component kit for plugins yet.
- Plugin database migrations are restricted to a host-derived plugin namespace.
- Plugin-owned JSON API routes must be declared in the manifest and are mounted
only under `/api/plugins/:pluginId/api/*`.
- The host provides a small shared React component kit through
`@paperclipai/plugin-sdk/ui`; use it for common Paperclip controls before
building custom versions.
- `ctx.assets` is not supported in the current runtime.
## Scaffold a plugin
@@ -77,10 +82,12 @@ Worker:
- secrets
- activity
- state
- database namespace via `ctx.db`
- scoped JSON API routes declared with `apiRoutes`
- entities
- projects and project workspaces
- companies
- issues and comments
- issues, comments, namespaced `plugin:<pluginKey>` origins, blocker relations, checkout assertions, assignment wakeups, and orchestration summaries
- agents and agent sessions
- goals
- data/actions
@@ -89,6 +96,55 @@ Worker:
- metrics
- logger
### Plugin database declarations
First-party or otherwise trusted orchestration plugins can declare:
```ts
database: {
migrationsDir: "migrations",
coreReadTables: ["issues"],
}
```
Required capabilities are `database.namespace.migrate` and
`database.namespace.read`; add `database.namespace.write` for runtime mutations.
The host derives `ctx.db.namespace`, runs SQL files in filename order before the
worker starts, records checksums in `plugin_migrations`, and rejects changed
already-applied migrations.
Migration SQL may create or alter objects only inside `ctx.db.namespace`. It may
reference whitelisted `public` core tables for foreign keys or read-only views,
but may not mutate/alter/drop/truncate public tables, create extensions,
triggers, untrusted languages, or runtime multi-statement SQL. Runtime
`ctx.db.query()` is restricted to `SELECT`; runtime `ctx.db.execute()` is
restricted to namespace-local `INSERT`, `UPDATE`, and `DELETE`.
### Scoped plugin API routes
Plugins can expose JSON-only routes under their own namespace:
```ts
apiRoutes: [
{
routeKey: "initialize",
method: "POST",
path: "/issues/:issueId/smoke",
auth: "board-or-agent",
capability: "api.routes.register",
checkoutPolicy: "required-for-agent-in-progress",
companyResolution: { from: "issue", param: "issueId" },
},
]
```
The host resolves the plugin, checks that it is ready, enforces
`api.routes.register`, matches the declared method/path, resolves company access,
and applies checkout policy before dispatching to the worker's `onApiRequest`
handler. The worker receives sanitized headers, route params, query, parsed JSON
body, actor context, and company id. Do not use plugin routes to claim core
paths; they always remain under `/api/plugins/:pluginId/api/*`.
UI:
- `usePluginData`
@@ -114,6 +170,187 @@ Mount surfaces currently wired in the host include:
- `commentAnnotation`
- `commentContextMenuItem`
## Shared host components
Use shared components from `@paperclipai/plugin-sdk/ui` when the plugin needs a
Paperclip-native control. The host owns the implementation, so plugins inherit
the board's current styling, ordering, recent selections, and dark-mode behavior
without importing `ui/src` internals.
Currently exposed components include:
- `MarkdownBlock` and `MarkdownEditor` for rendered and editable markdown.
- `FileTree` for serializable file and directory trees.
- `IssuesList` for a native company-scoped issue table.
- `AssigneePicker` for the same agent/user selector used in the new issue pane.
Use the controlled `value` format `agent:<id>`, `user:<id>`, or `""`.
- `ProjectPicker` for the same project selector used in the new issue pane.
Use the controlled project id value, or `""` for no project.
- `ManagedRoutinesList` for plugin-owned routine settings pages.
```tsx
import { AssigneePicker, ProjectPicker } from "@paperclipai/plugin-sdk/ui";
export function PluginAssignmentControls({ companyId }: { companyId: string }) {
const [assignee, setAssignee] = useState("");
const [projectId, setProjectId] = useState("");
return (
<>
<AssigneePicker
companyId={companyId}
value={assignee}
onChange={(value) => setAssignee(value)}
/>
<ProjectPicker
companyId={companyId}
value={projectId}
onChange={setProjectId}
/>
</>
);
}
```
## File and path UI
Plugin UI often needs to render a file tree, accept a folder path, or browse a
project workspace. There are three different surfaces for that, and they map to
different trust and data-flow boundaries. Pick the surface that matches the
data the plugin actually has.
### When to use the shared `FileTree`
Use `FileTree` from `@paperclipai/plugin-sdk/ui` whenever the plugin only needs
to render a serializable file/directory list and react to selection or
expand/collapse. The host owns the implementation, so plugin UI inherits the
board's icons, indent, focus ring, and dark-mode styling without importing host
internals.
```tsx
import {
FileTree,
type FileTreeNode,
} from "@paperclipai/plugin-sdk/ui";
const nodes: FileTreeNode[] = [
{ name: "AGENTS.md", path: "AGENTS.md", kind: "file", children: [] },
{
name: "wiki",
path: "wiki",
kind: "dir",
children: [
{ name: "index.md", path: "wiki/index.md", kind: "file", children: [] },
],
},
];
export function WikiTree() {
const [expanded, setExpanded] = useState<Set<string>>(() => new Set(["wiki"]));
const [selected, setSelected] = useState<string | null>(null);
return (
<FileTree
nodes={nodes}
selectedFile={selected}
expandedPaths={expanded}
onSelectFile={(path) => setSelected(path)}
onToggleDir={(path) =>
setExpanded((current) => {
const next = new Set(current);
next.has(path) ? next.delete(path) : next.add(path);
return next;
})
}
/>
);
}
```
Good fits:
- LLM Wiki page navigation in `packages/plugins/plugin-llm-wiki` builds a
`FileTreeNode[]` from worker query results and renders it through `FileTree`.
- The example `plugin-file-browser-example` lazily fetches a directory's
children through a `loadFileList` action when `onToggleDir` fires, then
merges the children into the local tree state — letting the shared component
handle rendering and selection.
Boundary rules:
- Keep the prop surface serializable (`nodes`, `expandedPaths`, `checkedPaths`,
`fileBadges`, `fileTones`). Do not pass arbitrary render functions across the
plugin/host boundary in v1; the supported escape hatches are
`fileBadges` (status pill keyed by path) and `fileTones` (row tone keyed by
path).
- Do not import the host's `FileTree.tsx` or any `ui/src/*` module. The SDK
declaration is the only supported import path for plugin UI.
- The shared `FileTree` is for rendering and selection. Plugin-specific editors,
ingest flows, query forms, and lint runs stay inside the plugin and do not
belong as `FileTree` props.
### When to declare `localFolders`
When the plugin needs operator-configured filesystem roots — typically for
trusted local plugins like wiki tooling — declare `localFolders[]` on the
manifest and add the `local.folders` capability. The host renders a settings
surface for the operator to set the absolute path, validates the path
server-side (containment, symlinks, required files/directories), and exposes
`ctx.localFolders.readText()` and `ctx.localFolders.writeTextAtomic()` in the
worker.
```ts
export const manifest = {
capabilities: ["local.folders"],
localFolders: [
{
folderKey: "content-root",
displayName: "Content root",
access: "readWrite",
requiredDirectories: ["sources", "pages"],
requiredFiles: ["schema.md"],
},
],
};
```
Use this when:
- The data lives outside any project workspace.
- Reads and writes need company-scoped configuration.
- The operator picks the path once in plugin settings and the worker resolves
files relative to that root.
Do not use `localFolders` to grant the UI direct browser-side access to the
filesystem — there is no such capability. The browser still goes through the
worker via `getData` / `performAction`, and the worker only exposes paths it
chose to expose.
### When to keep worker-mediated project workspace browsing
When the data lives inside an existing project workspace, keep the browsing
flow worker-mediated:
- The worker uses `ctx.projects.listWorkspaces()` to resolve the workspace
path, then reads its filesystem with normal Node APIs.
- The plugin UI calls a `getData` handler for the root listing and an action
for lazy children, then renders them through `FileTree`.
- The worker is the only side that touches the disk. The browser receives a
serializable tree and never sees raw absolute paths it can replay.
The example `plugin-file-browser-example` is the reference for this pattern:
the worker registers `fileList` (data) and `loadFileList` (action) over the
same handler, and the UI uses the action for on-toggle directory loading so the
shared `FileTree` stays the rendering surface.
### Mixing surfaces
A single plugin can use more than one of these. The LLM Wiki uses
`localFolders` for its content root, then renders the resulting page list
through `FileTree`. The file browser example uses `ctx.projects.listWorkspaces`
to pick a workspace and renders its on-disk tree through `FileTree` with lazy
loading. Pick the boundary per data source, not per plugin.
## Company routes
Plugins may declare a `page` slot with `routePath` to own a company route like:

View File

@@ -27,7 +27,10 @@ Current limitations to keep in mind:
- Published npm packages are the intended install artifact for deployed plugins.
- The repo example plugins under `packages/plugins/examples/` are development conveniences. They work from a source checkout and should not be assumed to exist in a generic published build unless they are explicitly shipped with that build.
- Dynamic plugin install is not yet cloud-ready for horizontally scaled or ephemeral deployments. There is no shared artifact store, install coordination, or cross-node distribution layer yet.
- The current runtime does not yet ship a real host-provided plugin UI component kit, and it does not support plugin asset uploads/reads. Treat those as future-scope ideas in this spec, not current implementation promises.
- The current runtime ships a small host-provided plugin UI component kit through `@paperclipai/plugin-sdk/ui`, but does not support plugin asset uploads/reads yet. Treat plugin asset APIs as future-scope ideas, not current implementation promises.
- Scoped plugin API routes are JSON-only and must be declared in `apiRoutes`.
They mount under `/api/plugins/:pluginId/api/*`; plugins cannot shadow core
API routes.
In practice, that means the current implementation is a good fit for local development and self-hosted persistent deployments, but not yet for multi-instance cloud plugin distribution.
@@ -624,7 +627,46 @@ Required SDK clients:
Plugins that need filesystem, git, terminal, or process operations handle those directly using standard Node APIs or libraries. The host provides project workspace metadata through `ctx.projects` so plugins can resolve workspace paths, but the host does not proxy low-level OS operations.
## 14.1 Example SDK Shape
## 14.1 Issue Orchestration APIs
Trusted orchestration plugins can create and update Paperclip issues through `ctx.issues` instead of importing server internals. The public issue contract includes parent/project/goal links, board or agent assignees, blocker IDs, labels, billing code, request depth, execution workspace inheritance, and plugin origin metadata.
Origin rules:
- Built-in core issues keep built-in origins such as `manual` and `routine_execution`.
- Plugin-managed issues use `plugin:<pluginKey>` or a sub-kind such as `plugin:<pluginKey>:feature`.
- The host derives the default plugin origin from the installed plugin key and rejects attempts to set `plugin:<otherPluginKey>` origins.
- `originId` is plugin-defined and should be stable for idempotent generated work.
Relation and read helpers:
- `ctx.issues.relations.get(issueId, companyId)`
- `ctx.issues.relations.setBlockedBy(issueId, blockerIssueIds, companyId)`
- `ctx.issues.relations.addBlockers(issueId, blockerIssueIds, companyId)`
- `ctx.issues.relations.removeBlockers(issueId, blockerIssueIds, companyId)`
- `ctx.issues.getSubtree(issueId, companyId, options)`
- `ctx.issues.summaries.getOrchestration({ issueId, companyId, includeSubtree, billingCode })`
Governance helpers:
- `ctx.issues.assertCheckoutOwner({ issueId, companyId, actorAgentId, actorRunId })` lets plugin actions preserve agent-run checkout ownership.
- `ctx.issues.requestWakeup(issueId, companyId, options)` requests assignment wakeups through host heartbeat semantics, including terminal-status, blocker, assignee, and budget hard-stop checks.
- `ctx.issues.requestWakeups(issueIds, companyId, options)` applies the same host-owned wakeup semantics to a batch and may use an idempotency key prefix for stable coordinator retries.
Plugin-originated issue, relation, document, comment, and wakeup mutations must write activity entries with `actorType: "plugin"` and details fields for `sourcePluginId`, `sourcePluginKey`, `initiatingActorType`, `initiatingActorId`, and `initiatingRunId` when a user or agent run initiated the plugin work.
Scoped API routes:
- `apiRoutes[]` declares `routeKey`, `method`, plugin-local `path`, `auth`,
`capability`, optional checkout policy, and company resolution.
- The host enforces auth, company access, `api.routes.register`, route matching,
and checkout policy before worker dispatch.
- The worker implements `onApiRequest(input)` and returns a JSON response shape
`{ status?, headers?, body? }`.
- Only safe request headers are forwarded; auth/cookie headers are never passed
to the worker.
## 14.2 Example SDK Shape
```ts
/** Top-level helper for defining a plugin with type checking */
@@ -696,16 +738,24 @@ The host enforces capabilities in the SDK layer and refuses calls outside the gr
- `project.workspaces.read`
- `issues.read`
- `issue.comments.read`
- `issue.documents.read`
- `issue.relations.read`
- `issue.subtree.read`
- `agents.read`
- `goals.read`
- `activity.read`
- `costs.read`
- `issues.orchestration.read`
### Data Write
- `issues.create`
- `issues.update`
- `issue.comments.create`
- `issue.documents.write`
- `issue.relations.write`
- `issues.checkout`
- `issues.wakeup`
- `assets.write`
- `assets.read`
- `activity.log.write`
@@ -772,6 +822,13 @@ Minimum event set:
- `issue.created`
- `issue.updated`
- `issue.comment.created`
- `issue.document.created`
- `issue.document.updated`
- `issue.document.deleted`
- `issue.relations.updated`
- `issue.checked_out`
- `issue.released`
- `issue.assignment_wakeup_requested`
- `agent.created`
- `agent.updated`
- `agent.status_changed`
@@ -781,6 +838,8 @@ Minimum event set:
- `agent.run.cancelled`
- `approval.created`
- `approval.decided`
- `budget.incident.opened`
- `budget.incident.resolved`
- `cost_event.created`
- `activity.logged`
@@ -917,13 +976,23 @@ export function DashboardWidget({ context }: PluginWidgetProps) {
The SDK includes a `ui` subpath export that plugin frontends import. This subpath provides:
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`
- **Bridge hooks**: `usePluginData(key, params)`, `usePluginAction(key)`, `useHostContext()`, `useHostNavigation()`
- **Design tokens**: colors, spacing, typography, shadows matching the host theme
- **Shared components**: `MetricCard`, `StatusBadge`, `DataTable`, `LogView`, `ActionBar`, `Spinner`, etc.
- **Type definitions**: `PluginPageProps`, `PluginWidgetProps`, `PluginDetailTabProps`
Plugins are encouraged but not required to use the shared components. A plugin may render entirely custom UI as long as it communicates through the bridge.
`useHostNavigation()` is the supported way for plugin UI to navigate to
Paperclip-internal pages. It exposes `resolveHref(to)`, `navigate(to,
options?)`, and `linkProps(to, options?)`. Plugin links should prefer
`linkProps()` so anchors keep real `href` values for copy-link, modifier-click,
middle-click, and open-in-new-tab behavior while plain left-clicks route through
the host SPA router. The host resolves company-scoped paths against the active
company prefix without double-prefixing already-prefixed paths. Plugin UI should
not use raw same-origin `href`s or `window.location.assign()` for internal
Paperclip navigation because those can force a full document reload.
### 19.0.2 Bundle Isolation
Plugin UI bundles are loaded as standard ES modules, not iframed. This gives plugins full rendering performance and access to the host's design tokens.
@@ -1003,6 +1072,11 @@ The host SDK ships shared components that plugins can import to quickly build UI
| `LogView` | Scrollable log output with timestamps | Webhook deliveries, job output, process logs |
| `JsonTree` | Collapsible JSON tree for debugging | Raw API responses, plugin state inspection |
| `Spinner` | Loading indicator | Data fetch states |
| `FileTree` | Host-styled file/directory tree | Wiki pages, workspace files, import previews |
| `IssuesList` | Host issue list | Plugin pages that need a native issue view |
| `AssigneePicker` | Host assignee picker for agents and board users | Creating issues, assigning routines, filtering work |
| `ProjectPicker` | Host project picker | Creating issues, scoping dashboards, filtering work |
| `ManagedRoutinesList` | Host routine list | Plugin settings pages that manage routines |
Plugins may also use entirely custom components. The shared components exist to reduce boilerplate and keep visual consistency, not to limit what plugins can render.
@@ -1238,6 +1312,8 @@ Plugin-originated mutations should write:
- `actor_type = plugin`
- `actor_id = <plugin-id>`
- details include `sourcePluginId` and `sourcePluginKey`
- details include `initiatingActorType`, `initiatingActorId`, and `initiatingRunId` when a user or agent run triggered the plugin work
## 21.5 Plugin Migrations

View File

@@ -114,14 +114,14 @@ If the connection drops, the UI reconnects automatically.
1. Enable timer wakeups (for example every 300s)
2. Keep assignment wakeups on
3. Use a focused prompt template
3. Use a focused prompt template that tells agents to act in the same heartbeat, leave durable progress, and mark blocked work with an owner/action
4. Watch run logs and adjust prompt/config over time
## 7.2 Event-driven loop (less constant polling)
1. Disable timer or set a long interval
2. Keep wake-on-assignment enabled
3. Use on-demand wakeups for manual nudges
3. Use child issues, comments, and on-demand wakeups for handoffs instead of loops that poll agents, sessions, or processes
## 7.3 Safety-first loop

299
doc/spec/invite-flow.md Normal file
View File

@@ -0,0 +1,299 @@
# Invite Flow State Map
Status: Current implementation map
Date: 2026-04-13
This document maps the current invite creation and acceptance states implemented in:
- `ui/src/pages/CompanyInvites.tsx`
- `ui/src/pages/CompanySettings.tsx`
- `ui/src/pages/InviteLanding.tsx`
- `server/src/routes/access.ts`
- `server/src/lib/join-request-dedupe.ts`
## State Legend
- Invite state: `active`, `revoked`, `accepted`, `expired`
- Join request status: `pending_approval`, `approved`, `rejected`
- Claim secret state for agent joins: `available`, `consumed`, `expired`
- Invite type: `company_join` or `bootstrap_ceo`
- Join type: `human`, `agent`, or `both`
## Entity Lifecycle
```mermaid
flowchart TD
Board[Board user on invite screen]
HumanInvite[Create human company invite]
OpenClawInvite[Generate OpenClaw invite prompt]
Active[Invite state: active]
Revoked[Invite state: revoked]
Expired[Invite state: expired]
Accepted[Invite state: accepted]
BootstrapDone[Bootstrap accepted<br/>no join request]
HumanReuse{Matching human join request<br/>already exists for same user/email?}
HumanPending[Join request<br/>pending_approval]
HumanApproved[Join request<br/>approved]
HumanRejected[Join request<br/>rejected]
AgentPending[Agent join request<br/>pending_approval<br/>+ optional claim secret]
AgentApproved[Agent join request<br/>approved]
AgentRejected[Agent join request<br/>rejected]
ClaimAvailable[Claim secret available]
ClaimConsumed[Claim secret consumed]
ClaimExpired[Claim secret expired]
OpenClawReplay[Special replay path:<br/>accepted invite can be POSTed again<br/>for openclaw_gateway only]
Board --> HumanInvite --> Active
Board --> OpenClawInvite --> Active
Active --> Revoked: revoke
Active --> Expired: expiresAt passes
Active --> BootstrapDone: bootstrap_ceo accept
BootstrapDone --> Accepted
Active --> HumanReuse: human accept
HumanReuse --> HumanPending: reuse existing pending request
HumanReuse --> HumanApproved: reuse existing approved request
HumanReuse --> HumanPending: no reusable request<br/>create new request
HumanPending --> HumanApproved: board approves
HumanPending --> HumanRejected: board rejects
HumanPending --> Accepted
HumanApproved --> Accepted
Active --> AgentPending: agent accept
AgentPending --> Accepted
AgentPending --> AgentApproved: board approves
AgentPending --> AgentRejected: board rejects
AgentApproved --> ClaimAvailable: createdAgentId + claimSecretHash
ClaimAvailable --> ClaimConsumed: POST claim-api-key succeeds
ClaimAvailable --> ClaimExpired: secret expires
Accepted --> OpenClawReplay
OpenClawReplay --> AgentPending
OpenClawReplay --> AgentApproved
```
## Board-Side Screen States
```mermaid
stateDiagram-v2
[*] --> CompanySelection
CompanySelection --> NoCompany: no company selected
CompanySelection --> LoadingHistory: selectedCompanyId present
LoadingHistory --> HistoryError: listInvites failed
LoadingHistory --> Ready: listInvites succeeded
state Ready {
[*] --> EmptyHistory
EmptyHistory --> PopulatedHistory: invites exist
PopulatedHistory --> LoadingMore: View more
LoadingMore --> PopulatedHistory: next page loaded
PopulatedHistory --> RevokePending: Revoke active invite
RevokePending --> PopulatedHistory: revoke succeeded
RevokePending --> PopulatedHistory: revoke failed
EmptyHistory --> CreatePending: Create invite
PopulatedHistory --> CreatePending: Create invite
CreatePending --> LatestInviteVisible: create succeeded
CreatePending --> Ready: create failed
LatestInviteVisible --> CopiedToast: clipboard copy succeeded
LatestInviteVisible --> Ready: navigate away or refresh
}
CompanySelection --> OpenClawPromptReady: Company settings prompt generator
OpenClawPromptReady --> OpenClawPromptPending: Generate OpenClaw Invite Prompt
OpenClawPromptPending --> OpenClawSnippetVisible: prompt generated
OpenClawPromptPending --> OpenClawPromptReady: generation failed
```
## Invite Landing Screen States
```mermaid
stateDiagram-v2
[*] --> TokenGate
TokenGate --> InvalidToken: token missing
TokenGate --> Loading: token present
Loading --> InviteUnavailable: invite fetch failed or invite not returned
Loading --> CheckingAccess: signed-in session and invite.companyId
Loading --> InviteResolved: invite loaded without membership check
Loading --> AcceptedInviteSummary: invite already consumed<br/>but linked join request still exists
CheckingAccess --> RedirectToBoard: current user already belongs to company
CheckingAccess --> InviteResolved: membership check finished and no join-request summary state is active
CheckingAccess --> AcceptedInviteSummary: membership check finished and invite has joinRequestStatus
state InviteResolved {
[*] --> Branch
Branch --> AgentForm: company_join + allowedJoinTypes=agent
Branch --> InlineAuth: authenticated mode + no session + join is not agent-only
Branch --> AcceptReady: bootstrap invite or human-ready session/local_trusted
InlineAuth --> InlineAuth: toggle sign-up/sign-in
InlineAuth --> InlineAuth: auth validation or auth error message
InlineAuth --> RedirectToBoard: auth succeeded and company membership already exists
InlineAuth --> AcceptPending: auth succeeded and invite still needs acceptance
AgentForm --> AcceptPending: submit request
AgentForm --> AgentForm: validation or accept error
AcceptReady --> AcceptPending: Accept invite
AcceptReady --> AcceptReady: accept error
}
AcceptPending --> BootstrapComplete: bootstrapAccepted=true
AcceptPending --> RedirectToBoard: join status=approved
AcceptPending --> PendingApprovalResult: join status=pending_approval
AcceptPending --> RejectedResult: join status=rejected
state AcceptedInviteSummary {
[*] --> SummaryBranch
SummaryBranch --> PendingApprovalReload: joinRequestStatus=pending_approval
SummaryBranch --> OpeningCompany: joinRequestStatus=approved<br/>and human invite user is now a member
SummaryBranch --> RejectedReload: joinRequestStatus=rejected
SummaryBranch --> ConsumedReload: approved agent invite or other consumed state
}
PendingApprovalResult --> PendingApprovalReload: reload after submit
RejectedResult --> RejectedReload: reload after board rejects
RedirectToBoard --> OpeningCompany: brief pre-navigation render when approved membership is detected
OpeningCompany --> RedirectToBoard: navigate to board
```
## Sequence Diagrams
### Human Invite Creation And First Acceptance
```mermaid
sequenceDiagram
autonumber
actor Board as Board user
participant Settings as Company Invites UI
participant API as Access routes
participant Invites as invites table
actor Invitee as Invite recipient
participant Landing as Invite landing UI
participant Auth as Auth session
participant Join as join_requests table
Board->>Settings: Choose role and click Create invite
Settings->>API: POST /api/companies/:companyId/invites
API->>Invites: Insert active invite
API-->>Settings: inviteUrl + metadata
Invitee->>Landing: Open invite URL
Landing->>API: GET /api/invites/:token
API->>Invites: Load active invite
API-->>Landing: Invite summary
alt Authenticated mode and no session
Landing->>Auth: Sign up or sign in
Auth-->>Landing: Session established
end
Landing->>API: POST /api/invites/:token/accept (requestType=human)
API->>Join: Look for reusable human join request
alt Reusable pending or approved request exists
API->>Invites: Mark invite accepted
API-->>Landing: Existing join request status
else No reusable request exists
API->>Invites: Mark invite accepted
API->>Join: Insert pending_approval join request
API-->>Landing: New pending_approval join request
end
```
### Human Approval And Reload Path
```mermaid
sequenceDiagram
autonumber
actor Invitee as Invite recipient
participant Landing as Invite landing UI
participant API as Access routes
participant Join as join_requests table
actor Approver as Company admin
participant Queue as Access queue UI
participant Membership as company_memberships + grants
Invitee->>Landing: Reload consumed invite URL
Landing->>API: GET /api/invites/:token
API->>Join: Load join request by inviteId
API-->>Landing: joinRequestStatus + joinRequestType
alt joinRequestStatus = pending_approval
Landing-->>Invitee: Show waiting-for-approval panel
Approver->>Queue: Review request in Company Settings -> Access
Queue->>API: POST /companies/:companyId/join-requests/:requestId/approve
API->>Membership: Ensure membership and grants
API->>Join: Mark join request approved
Invitee->>Landing: Refresh after approval
Landing->>API: GET /api/invites/:token
API->>Join: Reload approved join request
API-->>Landing: approved status
Landing-->>Invitee: Opening company and redirect
else joinRequestStatus = rejected
Landing-->>Invitee: Show rejected error panel
else joinRequestStatus = approved but membership missing
Landing-->>Invitee: Fall through to consumed/unavailable state
end
```
### Agent Invite Approval, Claim, And Replay
```mermaid
sequenceDiagram
autonumber
actor Board as Board user
participant Settings as Company Settings UI
participant API as Access routes
participant Invites as invites table
actor Gateway as OpenClaw gateway agent
participant Join as join_requests table
actor Approver as Company admin
participant Agents as agents table
participant Keys as agent_api_keys table
Board->>Settings: Generate OpenClaw invite prompt
Settings->>API: POST /api/companies/:companyId/openclaw-invite-prompt
API->>Invites: Insert active agent invite
API-->>Settings: Prompt text + invite token
Gateway->>API: POST /api/invites/:token/accept (agent, openclaw_gateway)
API->>Invites: Mark invite accepted
API->>Join: Insert pending_approval join request + claimSecretHash
API-->>Gateway: requestId + claimSecret + claimApiKeyPath
Approver->>API: POST /companies/:companyId/join-requests/:requestId/approve
API->>Agents: Create agent + membership + grants
API->>Join: Mark request approved and store createdAgentId
Gateway->>API: POST /api/join-requests/:requestId/claim-api-key (claimSecret)
API->>Keys: Create initial API key
API->>Join: Mark claim secret consumed
API-->>Gateway: Plaintext Paperclip API key
opt Replay accepted invite for updated gateway defaults
Gateway->>API: POST /api/invites/:token/accept again
API->>Join: Reuse existing approved or pending request
API->>Agents: Update approved agent adapter config when applicable
API-->>Gateway: Updated join request payload
end
```
## Notes
- `GET /api/invites/:token` treats `revoked` and `expired` invites as unavailable. Accepted invites remain resolvable when they already have a linked join request, and the summary now includes `joinRequestStatus` plus `joinRequestType`.
- Human acceptance consumes the invite immediately and then either creates a new join request or reuses an existing `pending_approval` or `approved` human join request for the same user/email.
- The landing page has two layers of post-accept UI:
- immediate mutation-result UI from `POST /api/invites/:token/accept`
- reload-time summary UI from `GET /api/invites/:token` once the invite has already been consumed
- Reload behavior for accepted company invites is now status-sensitive:
- `pending_approval` re-renders the waiting-for-approval panel
- `rejected` renders the "This join request was not approved." error panel
- `approved` only becomes a success path for human invites after membership is visible to the current session; otherwise the page falls through to the generic consumed/unavailable state
- `GET /api/invites/:token/logo` still rejects accepted invites, so accepted-invite reload states may fall back to the generated company icon even though the summary payload still carries `companyLogoUrl`.
- The only accepted-invite replay path in the current implementation is `POST /api/invites/:token/accept` for `agent` requests with `adapterType=openclaw_gateway`, and only when the existing join request is still `pending_approval` or already `approved`.
- `bootstrap_ceo` invites are one-time and do not create join requests.

30
docker/.env.aws.example Normal file
View File

@@ -0,0 +1,30 @@
# AWS ECS Fargate deployment environment
# Copy to .env.aws and fill in values before deploying
#
# Secrets (DATABASE_URL, BETTER_AUTH_SECRET, ANTHROPIC_API_KEY, OPENAI_API_KEY,
# GITHUB_TOKEN) are injected via AWS Secrets Manager — do NOT set them here.
# Deployment mode
PAPERCLIP_DEPLOYMENT_MODE=authenticated
PAPERCLIP_DEPLOYMENT_EXPOSURE=public
PAPERCLIP_PUBLIC_URL=https://paperclip.example.com
# Server
HOST=0.0.0.0
PORT=3100
NODE_ENV=production
SERVE_UI=true
# Paperclip paths
PAPERCLIP_HOME=/paperclip
PAPERCLIP_INSTANCE_ID=default
PAPERCLIP_CONFIG=/paperclip/instances/default/config.json
# Auto-apply migrations on startup
PAPERCLIP_MIGRATION_AUTO_APPLY=true
# Enable heartbeat scheduler for remote agents
HEARTBEAT_SCHEDULER_ENABLED=true
# Post-deploy hardening (uncomment after first user signs up)
# PAPERCLIP_AUTH_DISABLE_SIGN_UP=true

View File

@@ -0,0 +1,90 @@
{
"family": "paperclip-server",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "2048",
"memory": "4096",
"executionRoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/paperclip-ecs-execution",
"taskRoleArn": "arn:aws:iam::<ACCOUNT_ID>:role/paperclip-ecs-task",
"containerDefinitions": [
{
"name": "paperclip-server",
"image": "<ACCOUNT_ID>.dkr.ecr.<REGION>.amazonaws.com/paperclip-server:latest",
"essential": true,
"portMappings": [
{
"containerPort": 3100,
"protocol": "tcp"
}
],
"environment": [
{ "name": "NODE_ENV", "value": "production" },
{ "name": "HOST", "value": "0.0.0.0" },
{ "name": "PORT", "value": "3100" },
{ "name": "SERVE_UI", "value": "true" },
{ "name": "PAPERCLIP_HOME", "value": "/paperclip" },
{ "name": "PAPERCLIP_INSTANCE_ID", "value": "default" },
{ "name": "PAPERCLIP_CONFIG", "value": "/paperclip/instances/default/config.json" },
{ "name": "PAPERCLIP_DEPLOYMENT_MODE", "value": "authenticated" },
{ "name": "PAPERCLIP_DEPLOYMENT_EXPOSURE", "value": "public" },
{ "name": "PAPERCLIP_PUBLIC_URL", "value": "https://<DOMAIN>" },
{ "name": "PAPERCLIP_MIGRATION_AUTO_APPLY", "value": "true" },
{ "name": "HEARTBEAT_SCHEDULER_ENABLED", "value": "true" }
],
"secrets": [
{
"name": "DATABASE_URL",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/database-url"
},
{
"name": "BETTER_AUTH_SECRET",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/better-auth-secret"
},
{
"name": "ANTHROPIC_API_KEY",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/anthropic-api-key"
},
{
"name": "OPENAI_API_KEY",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/openai-api-key"
},
{
"name": "GITHUB_TOKEN",
"valueFrom": "arn:aws:secretsmanager:<REGION>:<ACCOUNT_ID>:secret:paperclip/github-token"
}
],
"mountPoints": [
{
"sourceVolume": "paperclip-data",
"containerPath": "/paperclip",
"readOnly": false
}
],
"healthCheck": {
"command": ["CMD-SHELL", "curl -f http://localhost:3100/api/health || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/paperclip",
"awslogs-region": "<REGION>",
"awslogs-stream-prefix": "server"
}
}
}
],
"volumes": [
{
"name": "paperclip-data",
"efsVolumeConfiguration": {
"fileSystemId": "<EFS_ID>",
"rootDirectory": "/",
"transitEncryption": "ENABLED"
}
}
]
}

View File

@@ -124,14 +124,14 @@ If the connection drops, the UI reconnects automatically.
1. Enable timer wakeups (for example every 300s)
2. Keep assignment wakeups on
3. Use a focused prompt template
3. Use a focused prompt template that tells agents to act in the same heartbeat, leave durable progress, and mark blocked work with an owner/action
4. Watch run logs and adjust prompt/config over time
## 7.2 Event-driven loop (less constant polling)
1. Disable timer or set a long interval
2. Keep wake-on-assignment enabled
3. Use on-demand wakeups for manual nudges
3. Use child issues, comments, and on-demand wakeups for handoffs instead of loops that poll agents, sessions, or processes
## 7.3 Safety-first loop

View File

@@ -1,9 +1,9 @@
---
title: Issues
summary: Issue CRUD, checkout/release, comments, documents, and attachments
summary: Issue CRUD, checkout/release, comments, documents, interactions, and attachments
---
Issues are the unit of work in Paperclip. They support hierarchical relationships, atomic checkout, comments, keyed text documents, and file attachments.
Issues are the unit of work in Paperclip. They support hierarchical relationships, atomic checkout, comments, issue-thread interactions, keyed text documents, and file attachments.
## List Issues
@@ -121,6 +121,65 @@ POST /api/issues/{issueId}/comments
@-mentions (`@AgentName`) in comments trigger heartbeats for the mentioned agent.
## Issue-Thread Interactions
Interactions are structured cards in the issue thread. Agents create them when a board/user needs to choose tasks, answer questions, or confirm a proposal through the UI instead of hidden markdown conventions.
### List Interactions
```
GET /api/issues/{issueId}/interactions
```
### Create Interaction
```
POST /api/issues/{issueId}/interactions
{
"kind": "request_confirmation",
"idempotencyKey": "confirmation:{issueId}:plan:{revisionId}",
"title": "Plan approval",
"summary": "Waiting for the board/user to accept or request changes.",
"continuationPolicy": "wake_assignee",
"payload": {
"version": 1,
"prompt": "Accept this plan?",
"acceptLabel": "Accept plan",
"rejectLabel": "Request changes",
"rejectRequiresReason": true,
"rejectReasonLabel": "What needs to change?",
"detailsMarkdown": "Review the latest plan document before accepting.",
"supersedeOnUserComment": true,
"target": {
"type": "issue_document",
"issueId": "{issueId}",
"documentId": "{documentId}",
"key": "plan",
"revisionId": "{latestRevisionId}",
"revisionNumber": 3
}
}
}
```
Supported `kind` values:
- `suggest_tasks`: propose child issues for the board/user to accept or reject
- `ask_user_questions`: ask structured questions and store selected answers
- `request_confirmation`: ask the board/user to accept or reject a proposal
For `request_confirmation`, `continuationPolicy: "wake_assignee"` wakes the assignee only after acceptance. Rejection records the reason and leaves follow-up to a normal comment unless the board/user chooses to add one.
### Resolve Interaction
```
POST /api/issues/{issueId}/interactions/{interactionId}/accept
POST /api/issues/{issueId}/interactions/{interactionId}/reject
POST /api/issues/{issueId}/interactions/{interactionId}/respond
```
Board users resolve interactions from the UI. Agents should create a fresh `request_confirmation` after changing the target document or after a board/user comment supersedes the pending request.
## Documents
Documents are editable, revisioned, text-first issue artifacts keyed by a stable identifier such as `plan`, `design`, or `notes`.

580
docs/deploy/aws-ecs.md Normal file
View File

@@ -0,0 +1,580 @@
---
title: AWS ECS Fargate
summary: Deploy Paperclip to AWS using ECS Fargate, RDS Postgres, and EFS
---
Deploy Paperclip to AWS with ECS Fargate (compute), RDS Postgres 17 (database), and EFS (persistent storage). This guide uses the AWS CLI and produces a single-task ECS service behind an ALB with HTTPS.
## Prerequisites
- AWS CLI v2 configured with a profile that has admin-level permissions
- Docker installed locally (for building and pushing the image)
- A registered domain with DNS you control (for the TLS certificate)
- The Paperclip repo cloned locally
Set these shell variables for the rest of the guide:
```bash
export AWS_REGION=us-east-1
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export PAPERCLIP_DOMAIN=paperclip.example.com # your domain
export DB_PASSWORD=$(openssl rand -base64 24 | tr -d '/+=' | head -c 32)
export AUTH_SECRET=$(openssl rand -base64 32)
```
## 1. Create ECR Repository
```bash
aws ecr create-repository \
--repository-name paperclip-server \
--image-scanning-configuration scanOnPush=true \
--region $AWS_REGION
```
## 2. Build and Push Docker Image
```bash
cd /path/to/paperclip
# Authenticate Docker to ECR
aws ecr get-login-password --region $AWS_REGION \
| docker login --username AWS --password-stdin \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com
# Build
docker build -t paperclip-server .
# Tag and push
docker tag paperclip-server:latest \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
docker push \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
```
## 3. Networking (VPC, Subnets, Security Groups)
Use the default VPC or create a dedicated one. The guide assumes the default VPC with public and private subnets in two AZs.
```bash
# Get default VPC
VPC_ID=$(aws ec2 describe-vpcs \
--filters Name=isDefault,Values=true \
--query 'Vpcs[0].VpcId' --output text)
# Get two public subnets (for ALB)
SUBNET_IDS=$(aws ec2 describe-subnets \
--filters Name=vpc-id,Values=$VPC_ID \
--query 'Subnets[?MapPublicIpOnLaunch==`true`] | [0:2].SubnetId' \
--output text)
SUBNET_1=$(echo $SUBNET_IDS | awk '{print $1}')
SUBNET_2=$(echo $SUBNET_IDS | awk '{print $2}')
```
Create security groups:
```bash
# ALB security group — inbound HTTPS
ALB_SG=$(aws ec2 create-security-group \
--group-name paperclip-alb \
--description "Paperclip ALB" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $ALB_SG \
--protocol tcp --port 443 --cidr 0.0.0.0/0
# Also open port 80 so the ALB can accept HTTP and redirect to HTTPS
aws ec2 authorize-security-group-ingress \
--group-id $ALB_SG \
--protocol tcp --port 80 --cidr 0.0.0.0/0
# ECS task security group — inbound from ALB only
ECS_SG=$(aws ec2 create-security-group \
--group-name paperclip-ecs \
--description "Paperclip ECS tasks" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $ECS_SG \
--protocol tcp --port 3100 \
--source-group $ALB_SG
# RDS security group — inbound from ECS only
RDS_SG=$(aws ec2 create-security-group \
--group-name paperclip-rds \
--description "Paperclip RDS" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $RDS_SG \
--protocol tcp --port 5432 \
--source-group $ECS_SG
# EFS security group — inbound NFS from ECS only
EFS_SG=$(aws ec2 create-security-group \
--group-name paperclip-efs \
--description "Paperclip EFS" \
--vpc-id $VPC_ID \
--query 'GroupId' --output text)
aws ec2 authorize-security-group-ingress \
--group-id $EFS_SG \
--protocol tcp --port 2049 \
--source-group $ECS_SG
```
## 4. Create RDS Postgres Instance
```bash
# Custom VPCs don't come with a default DB subnet group — create one
# that spans our two subnets so RDS can place the instance.
aws rds create-db-subnet-group \
--db-subnet-group-name paperclip-db-subnet \
--db-subnet-group-description "Paperclip RDS subnets" \
--subnet-ids $SUBNET_1 $SUBNET_2
aws rds create-db-instance \
--db-instance-identifier paperclip-db \
--db-instance-class db.t4g.micro \
--engine postgres \
--engine-version 17 \
--master-username paperclip \
--master-user-password "$DB_PASSWORD" \
--allocated-storage 20 \
--storage-type gp3 \
--vpc-security-group-ids $RDS_SG \
--db-subnet-group-name paperclip-db-subnet \
--no-publicly-accessible \
--backup-retention-period 7 \
--no-multi-az \
--db-name paperclip \
--region $AWS_REGION
# Wait for it to become available (takes 5-10 min)
aws rds wait db-instance-available \
--db-instance-identifier paperclip-db
# Get the endpoint
RDS_ENDPOINT=$(aws rds describe-db-instances \
--db-instance-identifier paperclip-db \
--query 'DBInstances[0].Endpoint.Address' --output text)
DATABASE_URL="postgresql://paperclip:${DB_PASSWORD}@${RDS_ENDPOINT}:5432/paperclip"
```
## 5. Create EFS Filesystem
```bash
EFS_ID=$(aws efs create-file-system \
--performance-mode generalPurpose \
--throughput-mode bursting \
--encrypted \
--tags Key=Name,Value=paperclip-data \
--query 'FileSystemId' --output text)
# Create mount targets in each subnet
for SUBNET in $SUBNET_1 $SUBNET_2; do
aws efs create-mount-target \
--file-system-id $EFS_ID \
--subnet-id $SUBNET \
--security-groups $EFS_SG
done
# Wait for mount targets
aws efs describe-mount-targets --file-system-id $EFS_ID
```
## 6. Store Secrets
```bash
aws secretsmanager create-secret \
--name paperclip/database-url \
--secret-string "$DATABASE_URL"
aws secretsmanager create-secret \
--name paperclip/anthropic-api-key \
--secret-string "YOUR_ANTHROPIC_KEY"
aws secretsmanager create-secret \
--name paperclip/better-auth-secret \
--secret-string "$AUTH_SECRET"
aws secretsmanager create-secret \
--name paperclip/openai-api-key \
--secret-string "YOUR_OPENAI_KEY"
aws secretsmanager create-secret \
--name paperclip/github-token \
--secret-string "YOUR_GITHUB_PAT"
```
## 7. IAM Roles
Create the ECS task execution role (pulls images, reads secrets) and the task role (application permissions).
```bash
# Task execution role
aws iam create-role \
--role-name paperclip-ecs-execution \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "ecs-tasks.amazonaws.com"},
"Action": "sts:AssumeRole"
}]
}'
aws iam attach-role-policy \
--role-name paperclip-ecs-execution \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
# Allow reading secrets
aws iam put-role-policy \
--role-name paperclip-ecs-execution \
--policy-name SecretsAccess \
--policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Action": ["secretsmanager:GetSecretValue"],
"Resource": "arn:aws:secretsmanager:'$AWS_REGION':'$AWS_ACCOUNT_ID':secret:paperclip/*"
}]
}'
# Task role (application — add permissions as needed)
aws iam create-role \
--role-name paperclip-ecs-task \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "ecs-tasks.amazonaws.com"},
"Action": "sts:AssumeRole"
}]
}'
```
## 8. ECS Cluster and Task Definition
```bash
aws ecs create-cluster --cluster-name paperclip
aws logs create-log-group --log-group-name /ecs/paperclip
```
Register the task definition using the template at `docker/ecs-task-definition.json`. Before registering, replace the placeholder values:
```bash
sed -e "s|<ACCOUNT_ID>|$AWS_ACCOUNT_ID|g" \
-e "s|<REGION>|$AWS_REGION|g" \
-e "s|<EFS_ID>|$EFS_ID|g" \
-e "s|<DOMAIN>|$PAPERCLIP_DOMAIN|g" \
docker/ecs-task-definition.json > /tmp/paperclip-task-def.json
aws ecs register-task-definition \
--cli-input-json file:///tmp/paperclip-task-def.json
```
## 9. ALB and TLS Certificate
Request a certificate (you must validate via DNS):
```bash
CERT_ARN=$(aws acm request-certificate \
--domain-name $PAPERCLIP_DOMAIN \
--validation-method DNS \
--query 'CertificateArn' --output text)
# Get the CNAME record to add to your DNS
aws acm describe-certificate \
--certificate-arn $CERT_ARN \
--query 'Certificate.DomainValidationOptions[0].ResourceRecord'
```
Add the CNAME to your DNS provider, then wait for validation:
```bash
aws acm wait certificate-validated --certificate-arn $CERT_ARN
```
Create the ALB:
```bash
ALB_ARN=$(aws elbv2 create-load-balancer \
--name paperclip-alb \
--subnets $SUBNET_1 $SUBNET_2 \
--security-groups $ALB_SG \
--scheme internet-facing \
--type application \
--query 'LoadBalancers[0].LoadBalancerArn' --output text)
ALB_DNS=$(aws elbv2 describe-load-balancers \
--load-balancer-arns $ALB_ARN \
--query 'LoadBalancers[0].DNSName' --output text)
# Target group
TG_ARN=$(aws elbv2 create-target-group \
--name paperclip-tg \
--protocol HTTP \
--port 3100 \
--vpc-id $VPC_ID \
--target-type ip \
--health-check-path /api/health \
--health-check-interval-seconds 30 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3 \
--query 'TargetGroups[0].TargetGroupArn' --output text)
# HTTPS listener
LISTENER_ARN=$(aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTPS \
--port 443 \
--certificates CertificateArn=$CERT_ARN \
--default-actions Type=forward,TargetGroupArn=$TG_ARN \
--query 'Listeners[0].ListenerArn' --output text)
# HTTP listener — redirect all :80 traffic to :443
HTTP_LISTENER_ARN=$(aws elbv2 create-listener \
--load-balancer-arn $ALB_ARN \
--protocol HTTP \
--port 80 \
--default-actions Type=redirect,RedirectConfig='{Protocol=HTTPS,Port=443,StatusCode=HTTP_301}' \
--query 'Listeners[0].ListenerArn' --output text)
```
Point your DNS to the ALB:
- Create a CNAME or ALIAS record for `$PAPERCLIP_DOMAIN` -> `$ALB_DNS`
## 10. Create ECS Service
```bash
aws ecs create-service \
--cluster paperclip \
--service-name paperclip-server \
--task-definition paperclip-server \
--desired-count 1 \
--launch-type FARGATE \
--deployment-configuration '{
"deploymentCircuitBreaker": {"enable": true, "rollback": true},
"maximumPercent": 200,
"minimumHealthyPercent": 100
}' \
--network-configuration '{
"awsvpcConfiguration": {
"subnets": ["'$SUBNET_1'", "'$SUBNET_2'"],
"securityGroups": ["'$ECS_SG'"],
"assignPublicIp": "ENABLED"
}
}' \
--load-balancers '[{
"targetGroupArn": "'$TG_ARN'",
"containerName": "paperclip-server",
"containerPort": 3100
}]'
```
> **Note:** `assignPublicIp: ENABLED` is needed if using public subnets without a NAT Gateway. For private subnets, set to `DISABLED` and ensure a NAT Gateway is configured for outbound internet access.
## 11. Verify Deployment
```bash
# Watch task come up
aws ecs describe-services \
--cluster paperclip \
--services paperclip-server \
--query 'services[0].{desired:desiredCount,running:runningCount,status:status}'
# Check task health
aws ecs list-tasks --cluster paperclip --service-name paperclip-server
TASK_ARN=$(aws ecs list-tasks --cluster paperclip --service-name paperclip-server --query 'taskArns[0]' --output text)
aws ecs describe-tasks --cluster paperclip --tasks $TASK_ARN \
--query 'tasks[0].{status:lastStatus,health:healthStatus}'
# Check logs
aws logs tail /ecs/paperclip --since 10m --follow
# Hit the health endpoint
curl -sf https://$PAPERCLIP_DOMAIN/api/health
```
**Healthy indicators:**
- ECS task status: `RUNNING`, health: `HEALTHY`
- Logs show `plugin job coordinator started` and `plugin-loader: loadAll complete`
- `/api/health` returns 200
## Post-Deploy Security Hardening
After the first user has signed up (which grants admin role), lock down the instance:
```bash
# Disable public sign-up (prevents unauthorized users from creating accounts)
# Add to the task definition environment section, then redeploy:
# { "name": "PAPERCLIP_AUTH_DISABLE_SIGN_UP", "value": "true" }
# Or update via Secrets Manager / task def override, then force new deployment
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--force-new-deployment
```
Use the invite flow (added in v2026.416.0) to grant access to additional users after sign-up is disabled.
## Deploying Updates
Build, push, and force a new deployment:
```bash
# Build and push new image
docker build -t paperclip-server .
docker tag paperclip-server:latest \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
docker push \
$AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com/paperclip-server:latest
# Roll out
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--force-new-deployment
# Watch the deployment
aws ecs describe-services \
--cluster paperclip \
--services paperclip-server \
--query 'services[0].deployments[*].{status:status,running:runningCount,desired:desiredCount,rollout:rolloutState}'
```
ECS performs a rolling update: starts a new task, waits for it to pass health checks, then drains the old task.
## Rollback
If the new deployment is unhealthy:
```bash
# ECS automatically rolls back if the new task fails health checks
# (circuit breaker is enabled in the service configuration above).
# To force rollback manually:
# 1. Find the previous task definition revision
aws ecs list-task-definitions \
--family-prefix paperclip-server \
--sort DESC \
--query 'taskDefinitionArns[0:3]'
# 2. Update service to the previous revision
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--task-definition paperclip-server:<PREVIOUS_REVISION>
```
## Scaling to Zero (Cost Savings)
Scale down when not in use:
```bash
# Stop
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--desired-count 0
# Start
aws ecs update-service \
--cluster paperclip \
--service paperclip-server \
--desired-count 1
```
RDS can also be stopped (auto-restarts after 7 days):
```bash
aws rds stop-db-instance --db-instance-identifier paperclip-db
aws rds start-db-instance --db-instance-identifier paperclip-db
```
## Teardown
Remove all resources in reverse order:
```bash
# 1. ECS service and cluster
aws ecs update-service --cluster paperclip --service paperclip-server --desired-count 0
aws ecs delete-service --cluster paperclip --service paperclip-server --force
aws ecs delete-cluster --cluster paperclip
# 2. ALB and ACM cert
aws elbv2 delete-listener --listener-arn $HTTP_LISTENER_ARN
aws elbv2 delete-listener --listener-arn $LISTENER_ARN
aws elbv2 delete-target-group --target-group-arn $TG_ARN
aws elbv2 delete-load-balancer --load-balancer-arn $ALB_ARN
aws acm delete-certificate --certificate-arn $CERT_ARN
# 3. RDS (creates final snapshot)
aws rds delete-db-instance \
--db-instance-identifier paperclip-db \
--final-db-snapshot-identifier paperclip-db-final
aws rds wait db-instance-deleted --db-instance-identifier paperclip-db
aws rds delete-db-subnet-group --db-subnet-group-name paperclip-db-subnet
# 4. EFS (mount targets must be deleted first)
for MT in $(aws efs describe-mount-targets --file-system-id $EFS_ID --query 'MountTargets[*].MountTargetId' --output text); do
aws efs delete-mount-target --mount-target-id $MT
done
# Mount-target deletion is async; poll until none remain before deleting
# the filesystem, otherwise delete-file-system fails with FileSystemInUse.
echo "Waiting for mount targets to delete..."
while aws efs describe-mount-targets \
--file-system-id $EFS_ID \
--query 'MountTargets[0].MountTargetId' --output text 2>/dev/null | grep -q 'fsmt-'; do
sleep 5
done
aws efs delete-file-system --file-system-id $EFS_ID
# 5. Secrets
for s in database-url anthropic-api-key better-auth-secret openai-api-key github-token; do
aws secretsmanager delete-secret --secret-id paperclip/$s --force-delete-without-recovery
done
# 6. Security groups (after all dependents are gone)
for sg in $EFS_SG $RDS_SG $ECS_SG $ALB_SG; do
aws ec2 delete-security-group --group-id $sg
done
# 7. ECR
aws ecr delete-repository --repository-name paperclip-server --force
# 8. IAM roles
aws iam delete-role-policy --role-name paperclip-ecs-execution --policy-name SecretsAccess
aws iam detach-role-policy --role-name paperclip-ecs-execution \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
aws iam delete-role --role-name paperclip-ecs-execution
aws iam delete-role --role-name paperclip-ecs-task
# 9. Log group
aws logs delete-log-group --log-group-name /ecs/paperclip
```
## Cost Reference
| Service | Config | Monthly |
|---------|--------|---------|
| ECS Fargate | 2 vCPU, 4 GB, 24/7 | ~$70 |
| RDS Postgres | db.t4g.micro, 20 GB | ~$15 |
| ALB | 1 LCU average | ~$22 |
| NAT Gateway | 1 AZ (if using private subnets) | ~$35 |
| EFS | 1 GB Standard | ~$0.30 |
| Secrets Manager | 5 secrets | ~$2 |
| CloudWatch Logs | ~1 GB/mo | ~$0.50 |
| ECR | ~1 GB | ~$0.10 |
| **Total (public subnets, no NAT)** | | **~$110/mo** |
| **Total (private subnets + NAT)** | | **~$145/mo** |
Use Fargate Spot and scheduled scaling to 0 during off-hours to reduce to ~$60-85/mo.

View File

@@ -40,7 +40,7 @@ Paperclip supports three deployment configurations, from zero-friction local to
- **Just trying Paperclip?** Use `local_trusted` (the default)
- **Sharing with a team on private network?** Use `authenticated` + `private`
- **Deploying to the cloud?** Use `authenticated` + `public`
- **Deploying to the cloud?** Use `authenticated` + `public` — see [AWS ECS Fargate guide](aws-ecs.md)
Set the mode during onboarding:

View File

@@ -48,6 +48,8 @@
"guides/board-operator/managing-tasks",
"guides/board-operator/execution-workspaces-and-runtime-services",
"guides/board-operator/delegation",
"guides/board-operator/execution-workspaces-and-runtime-services",
"guides/board-operator/delegation",
"guides/board-operator/approvals",
"guides/board-operator/costs-and-budgets",
"guides/board-operator/activity-log",

View File

@@ -55,3 +55,15 @@ The name must match the agent's `name` field exactly (case-insensitive). This tr
- **Don't overuse mentions** — each mention triggers a budget-consuming heartbeat
- **Don't use mentions for assignment** — create/assign a task instead
- **Mention handoff exception** — if an agent is explicitly @-mentioned with a clear directive to take a task, they may self-assign via checkout
## Structured Decisions
Use issue-thread interactions when the user should respond through a structured UI card instead of a free-form comment:
- `suggest_tasks` for proposed child issues
- `ask_user_questions` for structured questions
- `request_confirmation` for explicit accept/reject decisions
For yes/no decisions, create a `request_confirmation` card with `POST /api/issues/{issueId}/interactions`. Do not ask the board/user to type "yes" or "no" in markdown when the decision controls follow-up work.
Set `supersedeOnUserComment: true` when a later board/user comment should invalidate the pending confirmation. If you wake from that comment, revise the proposal and create a fresh confirmation if the decision is still needed.

View File

@@ -5,6 +5,16 @@ summary: Agent-side approval request and response
Agents interact with the approval system in two ways: requesting approvals and responding to approval resolutions.
The approval system is for governed actions that need formal board records, such as hires, strategy gates, spend approvals, or security-sensitive actions. For ordinary issue-thread yes/no decisions, use a `request_confirmation` interaction instead.
Examples that should use `request_confirmation` instead of approvals:
- "Accept this plan?"
- "Proceed with this issue breakdown?"
- "Use option A or reject and request changes?"
Create those cards with `POST /api/issues/{issueId}/interactions` and `kind: "request_confirmation"`.
## Requesting a Hire
Managers and CEOs can request to hire new agents:
@@ -37,6 +47,16 @@ POST /api/companies/{companyId}/approvals
}
```
## Plan Approval Cards
For normal issue implementation plans, use the issue-thread confirmation surface:
1. Update the `plan` issue document.
2. Create `request_confirmation` bound to the latest `plan` revision.
3. Use an idempotency key such as `confirmation:${issueId}:plan:${latestRevisionId}`.
4. Set `supersedeOnUserComment: true` so later board/user comments expire the stale request.
5. Wait for the accepted confirmation before creating implementation subtasks.
## Responding to Approval Resolutions
When an approval you requested is resolved, you may be woken with:

View File

@@ -66,7 +66,11 @@ Read ancestors to understand why this task exists. If woken by a specific commen
### Step 7: Do the Work
Use your tools and capabilities to complete the task.
Use your tools and capabilities to complete the task. If the issue is actionable, take a concrete action in the same heartbeat. Do not stop at a plan unless the issue asked for planning.
Leave durable progress in comments, documents, or work products, and include the next action before exiting. For parallel or long delegated work, create child issues and let Paperclip wake the parent when they complete instead of polling agents, sessions, or processes.
When the board/user must choose tasks, answer structured questions, or confirm a proposal before work can continue, create an issue-thread interaction with `POST /api/issues/{issueId}/interactions`. Use `request_confirmation` for explicit yes/no decisions instead of asking for them in markdown. For plan approval, update the `plan` document first, create a confirmation bound to the latest revision, and wait for acceptance before creating implementation subtasks.
### Step 8: Update Status
@@ -102,6 +106,23 @@ Always set `parentId` and `goalId` on subtasks.
- **Always checkout** before working — never PATCH to `in_progress` manually
- **Never retry a 409** — the task belongs to someone else
- **Always comment** on in-progress work before exiting a heartbeat
- **Start actionable work** in the same heartbeat; planning-only exits are for planning tasks
- **Leave a clear next action** in durable issue context
- **Use child issues instead of polling** for long or parallel delegated work
- **Use `request_confirmation`** for issue-scoped yes/no decisions and plan approval cards
- **Always set parentId** on subtasks
- **Never cancel cross-team tasks** — reassign to your manager
- **Escalate when stuck** — use your chain of command
## Run Liveness
Paperclip records run liveness as metadata on heartbeat runs. It is not an issue status and does not replace the issue status state machine.
- Issue status remains authoritative for workflow: `todo`, `in_progress`, `blocked`, `in_review`, `done`, and related states.
- Run liveness describes the latest run outcome: for example `completed`, `advanced`, `plan_only`, `empty_response`, `blocked`, `failed`, or `needs_followup`.
- Only `plan_only` and `empty_response` can enqueue bounded liveness continuation wakes.
- Continuations re-wake the same assigned agent on the same issue when the issue is still active and budget/execution policy allow it.
- `continuationAttempt` counts semantic liveness continuations for a source run chain. It is separate from process recovery, queued wake delivery, adapter session resume, and other operational retries.
- Liveness continuation wake prompts include the attempt, source run, liveness state, liveness reason, and the instruction for the next heartbeat.
- Continuations do not mark the issue `blocked` or `done`. If automatic continuations are exhausted, Paperclip leaves an audit comment so a human or manager can clarify, block, or assign follow-up work.
- Workspace provisioning alone is not treated as concrete task progress. Durable progress should appear as tool/action events, issue comments, document or work-product revisions, activity log entries, commits, or tests.

View File

@@ -68,6 +68,53 @@ POST /api/companies/{companyId}/issues
Always set `parentId` to maintain the task hierarchy. Set `goalId` when applicable.
## Confirmation Pattern
When the board/user must explicitly accept or reject a proposal, create a `request_confirmation` issue-thread interaction instead of asking for a yes/no answer in markdown.
```
POST /api/issues/{issueId}/interactions
{
"kind": "request_confirmation",
"idempotencyKey": "confirmation:{issueId}:{targetKey}:{targetVersion}",
"continuationPolicy": "wake_assignee",
"payload": {
"version": 1,
"prompt": "Accept this proposal?",
"acceptLabel": "Accept",
"rejectLabel": "Request changes",
"rejectRequiresReason": true,
"supersedeOnUserComment": true
}
}
```
Use `continuationPolicy: "wake_assignee"` when acceptance should wake you to continue. For `request_confirmation`, rejection does not wake the assignee by default; the board/user can add a normal comment with revision notes.
## Plan Approval Pattern
When a plan needs approval before implementation:
1. Create or update the issue document with key `plan`.
2. Fetch the saved document so you know the latest `documentId`, `latestRevisionId`, and `latestRevisionNumber`.
3. Create a `request_confirmation` targeting that exact `plan` revision.
4. Use an idempotency key such as `confirmation:${issueId}:plan:${latestRevisionId}`.
5. Wait for acceptance before creating implementation subtasks.
6. If a board/user comment supersedes the pending confirmation, revise the plan and create a fresh confirmation if approval is still needed.
Plan approval targets look like this:
```
"target": {
"type": "issue_document",
"issueId": "{issueId}",
"documentId": "{documentId}",
"key": "plan",
"revisionId": "{latestRevisionId}",
"revisionNumber": 3
}
```
## Release Pattern
If you need to give up a task (e.g. you realize it should go to someone else):

View File

@@ -47,7 +47,7 @@ You do **not** need to tell the CEO to engage specific agents. After you approve
- **Breaks goals into concrete tasks** with clear descriptions, priorities, and acceptance criteria
- **Assigns tasks to the right agent** based on role and capabilities (e.g., engineering tasks go to the CTO or engineers, marketing tasks go to the CMO)
- **Creates subtasks** when work needs to be decomposed further
- **Hires new agents** when the team lacks capacity for a goal (subject to your approval)
- **Hires new agents** when the team lacks capacity for a goal, with hire approvals available when enabled in company settings
- **Monitors progress** on each heartbeat, checking task status and unblocking reports
- **Escalates to you** when it encounters something it can't resolve — budget issues, blocked approvals, or strategic ambiguity

Binary file not shown.

After

Width:  |  Height:  |  Size: 182 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 108 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 191 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 121 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 183 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 105 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 188 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 106 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 335 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 151 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 88 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 87 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 41 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 40 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 36 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 36 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 180 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 76 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 74 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 701 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 316 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 694 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 546 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 701 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 316 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 694 KiB

View File

@@ -57,9 +57,9 @@ The CEO is the primary delegator. When you set company goals, the CEO:
1. Creates a strategy and submits it for your approval
2. Breaks approved goals into tasks
3. Assigns tasks to agents based on their role and capabilities
4. Hires new agents when needed (subject to your approval)
4. Hires new agents when needed, with hire approvals available when you enable them
You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions (strategy, hiring) and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle.
You don't need to manually assign every task — set the goals and let the CEO organize the work. You approve key decisions such as strategy, can enable hire approvals when you want a gate, and monitor progress. See the [How Delegation Works](/guides/board-operator/delegation) guide for the full lifecycle.
## Heartbeats

View File

@@ -20,6 +20,13 @@ The Heartbeat Procedure:
8. Update status: PATCH /api/issues/{issueId} with status and comment
9. Delegate if needed: POST /api/companies/{companyId}/issues
Execution Contract:
- If the issue is actionable, start concrete work in this heartbeat. Do not stop at a plan unless the issue asks for planning.
- Leave durable progress in comments, documents, or work products, with a clear next action.
- Use child issues for parallel or long delegated work instead of polling agents, sessions, or processes.
- If blocked, PATCH the issue to blocked and name the unblock owner and action.
- Respect budget, pause/cancel, approval gates, and company boundaries.
Critical Rules:
- Always checkout before working. Never PATCH to in_progress manually.
- Never retry a 409. The task belongs to someone else.

View File

@@ -11,13 +11,19 @@
"dev:stop": "pnpm --filter @paperclipai/server exec tsx ../scripts/dev-service.ts stop",
"dev:server": "pnpm --filter @paperclipai/server dev",
"dev:ui": "pnpm --filter @paperclipai/ui dev",
"storybook": "pnpm --filter @paperclipai/ui storybook",
"build-storybook": "pnpm --filter @paperclipai/ui build-storybook",
"build": "pnpm run preflight:workspace-links && pnpm -r build",
"typecheck": "pnpm run preflight:workspace-links && pnpm -r typecheck",
"typecheck:build-gaps": "pnpm run preflight:workspace-links && node scripts/run-typecheck-build-gaps.mjs",
"test": "pnpm run test:run",
"test:watch": "pnpm run preflight:workspace-links && vitest",
"test:run": "pnpm run preflight:workspace-links && vitest run",
"test:run": "pnpm run preflight:workspace-links && node scripts/run-vitest-stable.mjs",
"test:run:general": "pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && node scripts/run-vitest-stable.mjs --mode general",
"test:run:serialized": "pnpm run preflight:workspace-links && pnpm --filter @paperclipai/plugin-sdk build && node scripts/run-vitest-stable.mjs --mode serialized",
"db:generate": "pnpm --filter @paperclipai/db generate",
"db:migrate": "pnpm --filter @paperclipai/db migrate",
"issue-references:backfill": "pnpm run preflight:workspace-links && tsx scripts/backfill-issue-reference-mentions.ts",
"secrets:migrate-inline-env": "tsx scripts/migrate-inline-env-secrets.ts",
"db:backup": "./scripts/backup-db.sh",
"paperclipai": "node cli/node_modules/tsx/dist/cli.mjs cli/src/index.ts",
@@ -27,17 +33,22 @@
"release:stable": "./scripts/release.sh stable",
"release:github": "./scripts/create-github-release.sh",
"release:rollback": "./scripts/rollback-latest.sh",
"release:bootstrap-package": "node scripts/bootstrap-npm-package.mjs",
"check:tokens": "node scripts/check-forbidden-tokens.mjs",
"docs:dev": "cd docs && npx mintlify dev",
"smoke:openclaw-join": "./scripts/smoke/openclaw-join.sh",
"smoke:openclaw-docker-ui": "./scripts/smoke/openclaw-docker-ui.sh",
"smoke:openclaw-sse-standalone": "./scripts/smoke/openclaw-sse-standalone.sh",
"smoke:terminal-bench-loop-skill": "node scripts/smoke/terminal-bench-loop-skill-smoke.mjs",
"test:release-registry": "node --test scripts/verify-release-registry-state.test.mjs scripts/release-package-map.test.mjs scripts/check-release-package-bootstrap.test.mjs",
"test:e2e": "npx playwright test --config tests/e2e/playwright.config.ts",
"test:e2e:headed": "npx playwright test --config tests/e2e/playwright.config.ts --headed",
"test:e2e:multiuser-authenticated": "npx playwright test --config tests/e2e/playwright-multiuser-authenticated.config.ts",
"evals:smoke": "cd evals/promptfoo && npx promptfoo@0.103.3 eval",
"test:release-smoke": "npx playwright test --config tests/release-smoke/playwright.config.ts",
"test:release-smoke:headed": "npx playwright test --config tests/release-smoke/playwright.config.ts --headed",
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts"
"metrics:paperclip-commits": "tsx scripts/paperclip-commit-metrics.ts",
"perf:issue-chat-long-thread": "node scripts/measure-issue-chat-long-thread.mjs"
},
"devDependencies": {
"@playwright/test": "^1.58.2",

View File

@@ -0,0 +1,134 @@
import { mkdir, mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { execFile as execFileCallback } from "node:child_process";
import { promisify } from "node:util";
import { afterEach, describe, expect, it } from "vitest";
import { prepareCommandManagedRuntime } from "./command-managed-runtime.js";
import type { RunProcessResult } from "./server-utils.js";
const execFile = promisify(execFileCallback);
describe("command managed runtime", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
it("keeps the runtime overlay out of sandbox workspace sync by default", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-command-runtime-"));
cleanupDirs.push(rootDir);
const localWorkspaceDir = path.join(rootDir, "local-workspace");
const remoteWorkspaceDir = path.join(rootDir, "remote-workspace");
await mkdir(path.join(localWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await mkdir(remoteWorkspaceDir, { recursive: true });
await writeFile(path.join(localWorkspaceDir, "README.md"), "local workspace\n", "utf8");
await writeFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "{\"keep\":true}\n", "utf8");
const calls: Array<{
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}> = [];
const runner = {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
}): Promise<RunProcessResult> => {
calls.push({ ...input });
const startedAt = new Date().toISOString();
const env = {
...process.env,
...input.env,
};
const command =
input.command === "sh" ? "/bin/sh" : input.command === "bash" ? "/bin/bash" : input.command;
const args = [...(input.args ?? [])];
if (
input.stdin != null &&
(input.command === "sh" || input.command === "bash") &&
args[0] === "-lc" &&
typeof args[1] === "string"
) {
env.PAPERCLIP_TEST_STDIN = input.stdin;
args[1] = `printf '%s' \"$PAPERCLIP_TEST_STDIN\" | (${args[1]})`;
}
try {
const result = await execFile(command, args, {
cwd: input.cwd,
env,
maxBuffer: 32 * 1024 * 1024,
timeout: input.timeoutMs,
});
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const err = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
code?: string | number | null;
signal?: NodeJS.Signals | null;
killed?: boolean;
};
return {
exitCode: typeof err.code === "number" ? err.code : null,
signal: err.signal ?? null,
timedOut: Boolean(err.killed && input.timeoutMs),
stdout: err.stdout ?? "",
stderr: err.stderr ?? "",
pid: null,
startedAt,
};
}
},
};
const prepared = await prepareCommandManagedRuntime({
runner,
spec: {
remoteCwd: remoteWorkspaceDir,
timeoutMs: 30_000,
},
adapterKey: "claude",
workspaceLocalDir: localWorkspaceDir,
});
await expect(readFile(path.join(remoteWorkspaceDir, "README.md"), "utf8")).resolves.toBe("local workspace\n");
await expect(readFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
await mkdir(path.join(remoteWorkspaceDir, ".paperclip-runtime"), { recursive: true });
await writeFile(path.join(remoteWorkspaceDir, "README.md"), "remote workspace\n", "utf8");
await writeFile(path.join(remoteWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "{\"remote\":true}\n", "utf8");
await prepared.restoreWorkspace();
await expect(readFile(path.join(localWorkspaceDir, "README.md"), "utf8")).resolves.toBe("remote workspace\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "state.json"), "utf8")).resolves
.toBe("{\"keep\":true}\n");
await expect(readFile(path.join(localWorkspaceDir, ".paperclip-runtime", "remote-state.json"), "utf8")).rejects
.toMatchObject({ code: "ENOENT" });
expect(calls.every((call) => call.stdin == null)).toBe(true);
});
});

View File

@@ -0,0 +1,186 @@
import path from "node:path";
import {
prepareSandboxManagedRuntime,
type PreparedSandboxManagedRuntime,
type SandboxManagedRuntimeAsset,
type SandboxManagedRuntimeClient,
type SandboxRemoteExecutionSpec,
} from "./sandbox-managed-runtime.js";
import { preferredShellForSandbox } from "./sandbox-shell.js";
import type { RunProcessResult } from "./server-utils.js";
export interface CommandManagedRuntimeRunner {
execute(input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}): Promise<RunProcessResult>;
}
export interface CommandManagedRuntimeSpec {
providerKey?: string | null;
shellCommand?: "bash" | "sh" | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
}
export type CommandManagedRuntimeAsset = SandboxManagedRuntimeAsset;
function shellQuote(value: string) {
return `'${value.replace(/'/g, `'"'"'`)}'`;
}
function mergeRuntimeExcludes(entries: string[] | undefined): string[] {
return [...new Set([".paperclip-runtime", ...(entries ?? [])])];
}
const REMOTE_WRITE_BASE64_CHUNK_SIZE = 32 * 1024;
function toBuffer(bytes: Buffer | Uint8Array | ArrayBuffer): Buffer {
if (Buffer.isBuffer(bytes)) return bytes;
if (bytes instanceof ArrayBuffer) return Buffer.from(bytes);
return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength);
}
function requireSuccessfulResult(result: RunProcessResult, action: string): void {
if (result.exitCode === 0 && !result.timedOut) return;
const stderr = result.stderr.trim();
const detail = stderr.length > 0 ? `: ${stderr}` : "";
throw new Error(`${action} failed with exit code ${result.exitCode ?? "null"}${detail}`);
}
export function createCommandManagedRuntimeClient(input: {
runner: CommandManagedRuntimeRunner;
remoteCwd: string;
timeoutMs: number;
shellCommand?: "bash" | "sh" | null;
}): SandboxManagedRuntimeClient {
const shellCommand = preferredShellForSandbox(input.shellCommand);
const runShell = async (script: string, opts: { stdin?: string; timeoutMs?: number } = {}) => {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", script],
cwd: input.remoteCwd,
stdin: opts.stdin,
timeoutMs: opts.timeoutMs ?? input.timeoutMs,
});
requireSuccessfulResult(result, script);
return result;
};
return {
makeDir: async (remotePath) => {
await runShell(`mkdir -p ${shellQuote(remotePath)}`);
},
writeFile: async (remotePath, bytes) => {
const body = toBuffer(bytes).toString("base64");
const remoteDir = path.posix.dirname(remotePath);
const remoteTempPath = `${remotePath}.paperclip-upload.b64`;
await runShell(
`mkdir -p ${shellQuote(remoteDir)} && rm -f ${shellQuote(remoteTempPath)} && : > ${shellQuote(remoteTempPath)}`,
);
for (let offset = 0; offset < body.length; offset += REMOTE_WRITE_BASE64_CHUNK_SIZE) {
const chunk = body.slice(offset, offset + REMOTE_WRITE_BASE64_CHUNK_SIZE);
await runShell(`printf '%s' ${shellQuote(chunk)} >> ${shellQuote(remoteTempPath)}`);
}
await runShell(
`base64 -d < ${shellQuote(remoteTempPath)} > ${shellQuote(remotePath)} && rm -f ${shellQuote(remoteTempPath)}`,
);
},
readFile: async (remotePath) => {
const result = await runShell(`base64 < ${shellQuote(remotePath)}`);
return Buffer.from(result.stdout.replace(/\s+/g, ""), "base64");
},
listFiles: async (remotePath) => {
const result = await runShell(
`if [ -d ${shellQuote(remotePath)} ]; then ` +
`for entry in ${shellQuote(remotePath)}/*; do ` +
`[ -f "$entry" ] || continue; ` +
`basename "$entry"; ` +
`done; ` +
`fi`,
);
return result.stdout
.split(/\r?\n/)
.map((entry) => entry.trim())
.filter((entry) => entry.length > 0)
.sort((left, right) => left.localeCompare(right));
},
remove: async (remotePath) => {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", `rm -rf ${shellQuote(remotePath)}`],
cwd: input.remoteCwd,
timeoutMs: input.timeoutMs,
});
requireSuccessfulResult(result, `remove ${remotePath}`);
},
run: async (command, options) => {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", command],
cwd: input.remoteCwd,
timeoutMs: options.timeoutMs,
});
requireSuccessfulResult(result, command);
},
};
}
export async function prepareCommandManagedRuntime(input: {
runner: CommandManagedRuntimeRunner;
spec: CommandManagedRuntimeSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: CommandManagedRuntimeAsset[];
installCommand?: string | null;
}): Promise<PreparedSandboxManagedRuntime> {
const timeoutMs = input.spec.timeoutMs && input.spec.timeoutMs > 0 ? input.spec.timeoutMs : 300_000;
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeSpec: SandboxRemoteExecutionSpec = {
transport: "sandbox",
provider: input.spec.providerKey ?? "sandbox",
sandboxId: input.spec.leaseId ?? "managed",
remoteCwd: workspaceRemoteDir,
timeoutMs,
apiKey: null,
};
const client = createCommandManagedRuntimeClient({
runner: input.runner,
remoteCwd: workspaceRemoteDir,
timeoutMs,
shellCommand: input.spec.shellCommand,
});
const shellCommand = preferredShellForSandbox(input.spec.shellCommand);
if (input.installCommand?.trim()) {
const result = await input.runner.execute({
command: shellCommand,
args: ["-lc", input.installCommand.trim()],
cwd: workspaceRemoteDir,
timeoutMs,
});
requireSuccessfulResult(result, input.installCommand.trim());
}
return await prepareSandboxManagedRuntime({
spec: runtimeSpec,
client,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
workspaceExclude: mergeRuntimeExcludes(input.workspaceExclude),
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
});
}

View File

@@ -0,0 +1,21 @@
export const REDACTED_COMMAND_TEXT_VALUE = "***REDACTED***";
const COMMAND_CLI_SECRET_OPTION_RE =
/(\B-{1,2}(?:api[-_]?key|(?:access[-_]?|auth[-_]?)?token|token|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)(?:\s+|=)(["']?))[^\s"'`]+(\2)/gi;
const COMMAND_ENV_SECRET_ASSIGNMENT_RE =
/(\b[A-Za-z0-9_]*(?:TOKEN|KEY|SECRET|PASSWORD|PASSWD|AUTHORIZATION|JWT)[A-Za-z0-9_]*\s*=\s*)[^\s"'`]+/gi;
const COMMAND_AUTHORIZATION_BEARER_RE = /(\bAuthorization\s*:\s*Bearer\s+)[^\s"'`]+/gi;
const COMMAND_OPENAI_KEY_RE = /\bsk-[A-Za-z0-9_-]{12,}\b/g;
const COMMAND_GITHUB_TOKEN_RE = /\bgh[pousr]_[A-Za-z0-9_]{20,}\b/g;
const COMMAND_JWT_RE =
/\b[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}(?:\.[A-Za-z0-9_-]{8,})?\b/g;
export function redactCommandText(command: string, redactedValue = REDACTED_COMMAND_TEXT_VALUE): string {
return command
.replace(COMMAND_AUTHORIZATION_BEARER_RE, `$1${redactedValue}`)
.replace(COMMAND_CLI_SECRET_OPTION_RE, `$1${redactedValue}$3`)
.replace(COMMAND_ENV_SECRET_ASSIGNMENT_RE, `$1${redactedValue}`)
.replace(COMMAND_OPENAI_KEY_RE, redactedValue)
.replace(COMMAND_GITHUB_TOKEN_RE, redactedValue)
.replace(COMMAND_JWT_RE, redactedValue);
}

View File

@@ -0,0 +1,353 @@
import { createServer } from "node:http";
import { mkdir, mkdtemp, rm } from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { afterEach, describe, expect, it, vi } from "vitest";
import {
adapterExecutionTargetSessionIdentity,
adapterExecutionTargetToRemoteSpec,
adapterExecutionTargetUsesPaperclipBridge,
runAdapterExecutionTargetProcess,
runAdapterExecutionTargetShellCommand,
startAdapterExecutionTargetPaperclipBridge,
type AdapterSandboxExecutionTarget,
} from "./execution-target.js";
import { runChildProcess } from "./server-utils.js";
describe("sandbox adapter execution targets", () => {
const cleanupDirs: string[] = [];
afterEach(async () => {
while (cleanupDirs.length > 0) {
const dir = cleanupDirs.pop();
if (!dir) continue;
await rm(dir, { recursive: true, force: true }).catch(() => undefined);
}
});
function createLocalSandboxRunner() {
let counter = 0;
return {
execute: async (input: {
command: string;
args?: string[];
cwd?: string;
env?: Record<string, string>;
stdin?: string;
timeoutMs?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; startedAt: string }) => Promise<void>;
}) => {
counter += 1;
const command = input.command === "bash" ? "/bin/bash" : input.command;
return runChildProcess(`sandbox-run-${counter}`, command, input.args ?? [], {
cwd: input.cwd ?? process.cwd(),
env: input.env ?? {},
stdin: input.stdin,
timeoutSec: Math.max(1, Math.ceil((input.timeoutMs ?? 30_000) / 1000)),
graceSec: 5,
onLog: input.onLog ?? (async () => {}),
onSpawn: input.onSpawn
? async (meta) => input.onSpawn?.({ pid: meta.pid, startedAt: meta.startedAt })
: undefined,
});
},
};
}
it("executes through the provider-neutral runner without a remote spec", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "ok\n",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
timeoutMs: 30_000,
runner,
};
expect(adapterExecutionTargetToRemoteSpec(target)).toBeNull();
const result = await runAdapterExecutionTargetProcess("run-1", target, "agent-cli", ["--json"], {
cwd: "/local/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutSec: 5,
graceSec: 1,
onLog: async () => {},
});
expect(result.stdout).toBe("ok\n");
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "agent-cli",
args: ["--json"],
cwd: "/workspace",
env: { TOKEN: "token" },
stdin: "prompt",
timeoutMs: 5000,
}));
expect(adapterExecutionTargetSessionIdentity(target)).toEqual({
transport: "sandbox",
providerKey: "acme-sandbox",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd: "/workspace",
});
});
it("runs shell commands through the same runner", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/home/sandbox",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetShellCommand("run-2", target, 'printf %s "$HOME"', {
cwd: "/local/workspace",
env: {},
timeoutSec: 7,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-lc", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
it("treats SSH targets as bridge-only", () => {
const target = {
kind: "remote" as const,
transport: "ssh" as const,
remoteCwd: "/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "paperclip",
remoteWorkspacePath: "/workspace",
remoteCwd: "/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
};
expect(adapterExecutionTargetUsesPaperclipBridge(target)).toBe(true);
expect(adapterExecutionTargetSessionIdentity(target)).toEqual({
transport: "ssh",
host: "ssh.example.test",
port: 22,
username: "paperclip",
remoteCwd: "/workspace",
});
});
it("uses the provider-declared shell for sandbox helper commands", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "/home/sandbox",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "custom-provider",
shellCommand: "bash",
remoteCwd: "/workspace",
runner,
};
await runAdapterExecutionTargetShellCommand("run-2b", target, 'printf %s "$HOME"', {
cwd: "/local/workspace",
env: {},
timeoutSec: 7,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "bash",
args: ["-lc", 'printf %s "$HOME"'],
cwd: "/workspace",
timeoutMs: 7000,
}));
});
it("starts a localhost Paperclip bridge for sandbox targets in bridge mode", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, { "content-type": "application/json" });
res.end(JSON.stringify({ ok: true }));
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
});
try {
expect(bridge).not.toBeNull();
expect(bridge?.env.PAPERCLIP_API_URL).toMatch(/^http:\/\/127\.0\.0\.1:\d+$/);
expect(bridge?.env.PAPERCLIP_API_KEY).not.toBe("real-run-jwt");
expect(bridge?.env.PAPERCLIP_API_BRIDGE_MODE).toBe("queue_v1");
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(200);
expect(await response.json()).toEqual({ ok: true });
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
it("fails oversized host responses with a 502 before returning them to the sandbox client", async () => {
const rootDir = await mkdtemp(path.join(os.tmpdir(), "paperclip-execution-target-bridge-limit-"));
cleanupDirs.push(rootDir);
const remoteCwd = path.join(rootDir, "workspace");
const runtimeRootDir = path.join(remoteCwd, ".paperclip-runtime", "codex");
await mkdir(runtimeRootDir, { recursive: true });
const requests: Array<{ method: string; url: string; auth: string | null; runId: string | null }> = [];
const largeBody = "x".repeat(64);
const apiServer = createServer((req, res) => {
requests.push({
method: req.method ?? "GET",
url: req.url ?? "/",
auth: req.headers.authorization ?? null,
runId: typeof req.headers["x-paperclip-run-id"] === "string" ? req.headers["x-paperclip-run-id"] : null,
});
res.writeHead(200, {
"content-type": "application/json",
"content-length": String(Buffer.byteLength(largeBody, "utf8")),
});
res.end(largeBody);
});
await new Promise<void>((resolve, reject) => {
apiServer.once("error", reject);
apiServer.listen(0, "127.0.0.1", () => resolve());
});
const address = apiServer.address();
if (!address || typeof address === "string") {
throw new Error("Expected the bridge test API server to listen on a TCP port.");
}
const target: AdapterSandboxExecutionTarget = {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
environmentId: "env-1",
leaseId: "lease-1",
remoteCwd,
runner: createLocalSandboxRunner(),
timeoutMs: 30_000,
};
const bridge = await startAdapterExecutionTargetPaperclipBridge({
runId: "run-bridge-limit",
target,
runtimeRootDir,
adapterKey: "codex",
hostApiToken: "real-run-jwt",
hostApiUrl: `http://127.0.0.1:${address.port}`,
maxBodyBytes: 32,
});
try {
const response = await fetch(`${bridge!.env.PAPERCLIP_API_URL}/api/agents/me`, {
headers: {
authorization: `Bearer ${bridge!.env.PAPERCLIP_API_KEY}`,
accept: "application/json",
},
});
expect(response.status).toBe(502);
await expect(response.json()).resolves.toEqual({
error: "Bridge response body exceeded the configured size limit of 32 bytes.",
});
expect(requests).toEqual([{
method: "GET",
url: "/api/agents/me",
auth: "Bearer real-run-jwt",
runId: "run-bridge-limit",
}]);
} finally {
await bridge?.stop();
await new Promise<void>((resolve) => apiServer.close(() => resolve()));
}
});
});

View File

@@ -0,0 +1,283 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import * as ssh from "./ssh.js";
import {
adapterExecutionTargetUsesManagedHome,
ensureAdapterExecutionTargetRuntimeCommandInstalled,
resolveAdapterExecutionTargetCwd,
runAdapterExecutionTargetShellCommand,
} from "./execution-target.js";
describe("runAdapterExecutionTargetShellCommand", () => {
afterEach(() => {
vi.restoreAllMocks();
});
it("quotes remote shell commands with the shared SSH quoting helper", async () => {
const runSshCommandSpy = vi.spyOn(ssh, "runSshCommand").mockResolvedValue({
stdout: "",
stderr: "",
});
await runAdapterExecutionTargetShellCommand(
"run-1",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
`printf '%s\\n' "$HOME" && echo "it's ok"`,
{
cwd: "/tmp/local",
env: {},
},
);
expect(runSshCommandSpy).toHaveBeenCalledWith(
expect.objectContaining({
host: "ssh.example.test",
username: "ssh-user",
}),
`sh -lc ${ssh.shellQuote(`printf '%s\\n' "$HOME" && echo "it's ok"`)}`,
expect.any(Object),
);
});
it("returns a timedOut result when the SSH shell command times out", async () => {
vi.spyOn(ssh, "runSshCommand").mockRejectedValue(Object.assign(new Error("timed out"), {
code: "ETIMEDOUT",
stdout: "partial stdout",
stderr: "partial stderr",
signal: "SIGTERM",
}));
const onLog = vi.fn(async () => {});
const result = await runAdapterExecutionTargetShellCommand(
"run-2",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"sleep 10",
{
cwd: "/tmp/local",
env: {},
onLog,
},
);
expect(result).toMatchObject({
exitCode: null,
signal: "SIGTERM",
timedOut: true,
stdout: "partial stdout",
stderr: "partial stderr",
});
expect(onLog).toHaveBeenCalledWith("stdout", "partial stdout");
expect(onLog).toHaveBeenCalledWith("stderr", "partial stderr");
});
it("returns the SSH process exit code for non-zero remote command failures", async () => {
vi.spyOn(ssh, "runSshCommand").mockRejectedValue(Object.assign(new Error("non-zero exit"), {
code: 17,
stdout: "partial stdout",
stderr: "partial stderr",
signal: null,
}));
const onLog = vi.fn(async () => {});
const result = await runAdapterExecutionTargetShellCommand(
"run-3",
{
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
"false",
{
cwd: "/tmp/local",
env: {},
onLog,
},
);
expect(result).toMatchObject({
exitCode: 17,
signal: null,
timedOut: false,
stdout: "partial stdout",
stderr: "partial stderr",
});
expect(onLog).toHaveBeenCalledWith("stdout", "partial stdout");
expect(onLog).toHaveBeenCalledWith("stderr", "partial stderr");
});
it("keeps managed homes disabled for both local and SSH targets", () => {
expect(adapterExecutionTargetUsesManagedHome(null)).toBe(false);
expect(adapterExecutionTargetUsesManagedHome({
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
})).toBe(false);
});
});
describe("ensureAdapterExecutionTargetRuntimeCommandInstalled", () => {
afterEach(() => {
vi.restoreAllMocks();
});
it("runs install commands for sandbox targets", async () => {
const runner = {
execute: vi.fn(async () => ({
exitCode: 0,
signal: null,
timedOut: false,
stdout: "",
stderr: "",
pid: null,
startedAt: new Date().toISOString(),
})),
};
await ensureAdapterExecutionTargetRuntimeCommandInstalled({
runId: "run-install",
target: {
kind: "remote",
transport: "sandbox",
providerKey: "e2b",
remoteCwd: "/remote/workspace",
runner,
},
installCommand: "npm install -g @google/gemini-cli",
cwd: "/local/workspace",
env: { PATH: "/usr/bin" },
timeoutSec: 30,
});
expect(runner.execute).toHaveBeenCalledWith(expect.objectContaining({
command: "sh",
args: ["-lc", "npm install -g @google/gemini-cli"],
cwd: "/remote/workspace",
env: { PATH: "/usr/bin" },
timeoutMs: 30_000,
}));
});
it("skips install commands for SSH targets", async () => {
const runSshCommandSpy = vi.spyOn(ssh, "runSshCommand").mockResolvedValue({
stdout: "",
stderr: "",
});
await ensureAdapterExecutionTargetRuntimeCommandInstalled({
runId: "run-skip",
target: {
kind: "remote",
transport: "ssh",
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
},
installCommand: "npm install -g @google/gemini-cli",
cwd: "/tmp/local",
env: {},
});
expect(runSshCommandSpy).not.toHaveBeenCalled();
});
});
describe("resolveAdapterExecutionTargetCwd", () => {
const sshTarget = {
kind: "remote" as const,
transport: "ssh" as const,
remoteCwd: "/srv/paperclip/workspace",
spec: {
host: "ssh.example.test",
port: 22,
username: "ssh-user",
remoteCwd: "/srv/paperclip/workspace",
remoteWorkspacePath: "/srv/paperclip/workspace",
privateKey: null,
knownHosts: null,
strictHostKeyChecking: true,
},
};
it("falls back to the remote cwd when no adapter cwd is configured", () => {
expect(resolveAdapterExecutionTargetCwd(sshTarget, "", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, " ", "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
expect(resolveAdapterExecutionTargetCwd(sshTarget, null, "/Users/host/repo/server")).toBe(
"/srv/paperclip/workspace",
);
});
it("preserves an explicit adapter cwd when one is configured", () => {
expect(
resolveAdapterExecutionTargetCwd(
sshTarget,
"/srv/paperclip/custom-agent-dir",
"/Users/host/repo/server",
),
).toBe("/srv/paperclip/custom-agent-dir");
});
it("keeps the local fallback cwd for local targets", () => {
expect(resolveAdapterExecutionTargetCwd(null, "", "/Users/host/repo/server")).toBe(
"/Users/host/repo/server",
);
});
});

View File

@@ -0,0 +1,886 @@
import path from "node:path";
import type { SshRemoteExecutionSpec } from "./ssh.js";
import {
prepareCommandManagedRuntime,
type CommandManagedRuntimeRunner,
} from "./command-managed-runtime.js";
import {
buildRemoteExecutionSessionIdentity,
prepareRemoteManagedRuntime,
remoteExecutionSessionMatches,
type RemoteManagedRuntimeAsset,
} from "./remote-managed-runtime.js";
import {
createCommandManagedSandboxCallbackBridgeQueueClient,
createSandboxCallbackBridgeAsset,
createSandboxCallbackBridgeToken,
DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES,
startSandboxCallbackBridgeServer,
startSandboxCallbackBridgeWorker,
} from "./sandbox-callback-bridge.js";
import { createSshCommandManagedRuntimeRunner, parseSshRemoteExecutionSpec, runSshCommand, shellQuote } from "./ssh.js";
import {
ensureCommandResolvable,
resolveCommandForLogs,
runChildProcess,
type RunProcessResult,
type TerminalResultCleanupOptions,
} from "./server-utils.js";
import { preferredShellForSandbox } from "./sandbox-shell.js";
export interface AdapterLocalExecutionTarget {
kind: "local";
environmentId?: string | null;
leaseId?: string | null;
}
export interface AdapterSshExecutionTarget {
kind: "remote";
transport: "ssh";
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
spec: SshRemoteExecutionSpec;
}
export interface AdapterSandboxExecutionTarget {
kind: "remote";
transport: "sandbox";
providerKey?: string | null;
shellCommand?: "bash" | "sh" | null;
environmentId?: string | null;
leaseId?: string | null;
remoteCwd: string;
timeoutMs?: number | null;
runner?: CommandManagedRuntimeRunner;
}
export type AdapterExecutionTarget =
| AdapterLocalExecutionTarget
| AdapterSshExecutionTarget
| AdapterSandboxExecutionTarget;
export type AdapterRemoteExecutionSpec = SshRemoteExecutionSpec;
export type AdapterManagedRuntimeAsset = RemoteManagedRuntimeAsset;
export interface PreparedAdapterExecutionTargetRuntime {
target: AdapterExecutionTarget;
runtimeRootDir: string | null;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
export interface AdapterExecutionTargetProcessOptions {
cwd: string;
env: Record<string, string>;
stdin?: string;
timeoutSec: number;
graceSec: number;
onLog: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
onSpawn?: (meta: { pid: number; processGroupId: number | null; startedAt: string }) => Promise<void>;
terminalResultCleanup?: TerminalResultCleanupOptions;
}
export interface AdapterExecutionTargetShellOptions {
cwd: string;
env: Record<string, string>;
timeoutSec?: number;
graceSec?: number;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
}
export interface AdapterExecutionTargetPaperclipBridgeHandle {
env: Record<string, string>;
stop(): Promise<void>;
}
function parseObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function readString(value: unknown): string | null {
return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
}
function readStringMeta(parsed: Record<string, unknown>, key: string): string | null {
return readString(parsed[key]);
}
function resolveHostForUrl(rawHost: string): string {
const host = rawHost.trim();
if (!host || host === "0.0.0.0" || host === "::") return "localhost";
if (host.includes(":") && !host.startsWith("[") && !host.endsWith("]")) return `[${host}]`;
return host;
}
function resolveDefaultPaperclipApiUrl(): string {
const runtimeHost = resolveHostForUrl(
process.env.PAPERCLIP_LISTEN_HOST ?? process.env.HOST ?? "localhost",
);
// 3100 matches the default Paperclip dev server port when the runtime does not provide one.
const runtimePort = process.env.PAPERCLIP_LISTEN_PORT ?? process.env.PORT ?? "3100";
return `http://${runtimeHost}:${runtimePort}`;
}
function isBridgeDebugEnabled(env: NodeJS.ProcessEnv): boolean {
const value = env.PAPERCLIP_BRIDGE_DEBUG?.trim().toLowerCase();
return value === "1" || value === "true" || value === "yes";
}
function isAdapterExecutionTargetInstance(value: unknown): value is AdapterExecutionTarget {
const parsed = parseObject(value);
if (parsed.kind === "local") return true;
if (parsed.kind !== "remote") return false;
if (parsed.transport === "ssh") return parseSshRemoteExecutionSpec(parseObject(parsed.spec)) !== null;
if (parsed.transport !== "sandbox") return false;
return readStringMeta(parsed, "remoteCwd") !== null;
}
export function adapterExecutionTargetToRemoteSpec(
target: AdapterExecutionTarget | null | undefined,
): AdapterRemoteExecutionSpec | null {
return target?.kind === "remote" && target.transport === "ssh" ? target.spec : null;
}
export function adapterExecutionTargetIsRemote(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote";
}
export function adapterExecutionTargetUsesManagedHome(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote" && target.transport === "sandbox";
}
export function adapterExecutionTargetRemoteCwd(
target: AdapterExecutionTarget | null | undefined,
localCwd: string,
): string {
return target?.kind === "remote" ? target.remoteCwd : localCwd;
}
export function resolveAdapterExecutionTargetCwd(
target: AdapterExecutionTarget | null | undefined,
configuredCwd: string | null | undefined,
localFallbackCwd: string,
): string {
if (typeof configuredCwd === "string" && configuredCwd.trim().length > 0) {
return configuredCwd;
}
return adapterExecutionTargetRemoteCwd(target, localFallbackCwd);
}
export function adapterExecutionTargetUsesPaperclipBridge(
target: AdapterExecutionTarget | null | undefined,
): boolean {
return target?.kind === "remote";
}
export function describeAdapterExecutionTarget(
target: AdapterExecutionTarget | null | undefined,
): string {
if (!target || target.kind === "local") return "local environment";
if (target.transport === "ssh") {
return `SSH environment ${target.spec.username}@${target.spec.host}:${target.spec.port}`;
}
return `sandbox environment${target.providerKey ? ` (${target.providerKey})` : ""}`;
}
function requireSandboxRunner(target: AdapterSandboxExecutionTarget): CommandManagedRuntimeRunner {
if (target.runner) return target.runner;
throw new Error(
"Sandbox execution target is missing its provider runtime runner. Sandbox commands must execute through the environment runtime.",
);
}
function preferredSandboxShell(target: AdapterSandboxExecutionTarget): "bash" | "sh" {
return preferredShellForSandbox(target.shellCommand);
}
type AdapterCommandCapableExecutionTarget = AdapterSshExecutionTarget | AdapterSandboxExecutionTarget;
function adapterExecutionTargetCommandRunner(target: AdapterCommandCapableExecutionTarget): CommandManagedRuntimeRunner {
if (target.transport === "ssh") {
return createSshCommandManagedRuntimeRunner({
spec: target.spec,
defaultCwd: target.remoteCwd,
maxBufferBytes: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES * 4,
});
}
return requireSandboxRunner(target);
}
function adapterExecutionTargetShellCommand(target: AdapterCommandCapableExecutionTarget): "bash" | "sh" {
return target.transport === "ssh" ? "sh" : preferredSandboxShell(target);
}
function adapterExecutionTargetTimeoutMs(
target: AdapterCommandCapableExecutionTarget,
): number | null | undefined {
return target.transport === "sandbox" ? target.timeoutMs : undefined;
}
export async function ensureAdapterExecutionTargetCommandResolvable(
command: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
env: NodeJS.ProcessEnv,
) {
if (target?.kind === "remote" && target.transport === "sandbox") {
return;
}
await ensureCommandResolvable(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function resolveAdapterExecutionTargetCommandForLogs(
command: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
env: NodeJS.ProcessEnv,
): Promise<string> {
if (target?.kind === "remote" && target.transport === "sandbox") {
return `sandbox://${target.providerKey ?? "provider"}/${target.leaseId ?? "lease"}/${target.remoteCwd} :: ${command}`;
}
return await resolveCommandForLogs(command, cwd, env, {
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function runAdapterExecutionTargetProcess(
runId: string,
target: AdapterExecutionTarget | null | undefined,
command: string,
args: string[],
options: AdapterExecutionTargetProcessOptions,
): Promise<RunProcessResult> {
if (target?.kind === "remote" && target.transport === "sandbox") {
const runner = requireSandboxRunner(target);
return await runner.execute({
command,
args,
cwd: target.remoteCwd,
env: options.env,
stdin: options.stdin,
timeoutMs: options.timeoutSec > 0 ? options.timeoutSec * 1000 : target.timeoutMs ?? undefined,
onLog: options.onLog,
onSpawn: options.onSpawn
? async (meta) => options.onSpawn?.({ ...meta, processGroupId: null })
: undefined,
});
}
return await runChildProcess(runId, command, args, {
cwd: options.cwd,
env: options.env,
stdin: options.stdin,
timeoutSec: options.timeoutSec,
graceSec: options.graceSec,
onLog: options.onLog,
onSpawn: options.onSpawn,
terminalResultCleanup: options.terminalResultCleanup,
remoteExecution: adapterExecutionTargetToRemoteSpec(target),
});
}
export async function runAdapterExecutionTargetShellCommand(
runId: string,
target: AdapterExecutionTarget | null | undefined,
command: string,
options: AdapterExecutionTargetShellOptions,
): Promise<RunProcessResult> {
const onLog = options.onLog ?? (async () => {});
if (target?.kind === "remote") {
const startedAt = new Date().toISOString();
if (target.transport === "ssh") {
try {
const result = await runSshCommand(target.spec, `sh -lc ${shellQuote(command)}`, {
timeoutMs: (options.timeoutSec ?? 15) * 1000,
});
if (result.stdout) await onLog("stdout", result.stdout);
if (result.stderr) await onLog("stderr", result.stderr);
return {
exitCode: 0,
signal: null,
timedOut: false,
stdout: result.stdout,
stderr: result.stderr,
pid: null,
startedAt,
};
} catch (error) {
const timedOutError = error as NodeJS.ErrnoException & {
stdout?: string;
stderr?: string;
signal?: string | null;
};
const stdout = timedOutError.stdout ?? "";
const stderr = timedOutError.stderr ?? "";
if (typeof timedOutError.code === "number") {
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: timedOutError.code,
signal: timedOutError.signal ?? null,
timedOut: false,
stdout,
stderr,
pid: null,
startedAt,
};
}
if (timedOutError.code !== "ETIMEDOUT") {
throw error;
}
if (stdout) await onLog("stdout", stdout);
if (stderr) await onLog("stderr", stderr);
return {
exitCode: null,
signal: timedOutError.signal ?? null,
timedOut: true,
stdout,
stderr,
pid: null,
startedAt,
};
}
}
const shellCommand = preferredSandboxShell(target);
return await requireSandboxRunner(target).execute({
command: shellCommand,
args: ["-lc", command],
cwd: target.remoteCwd,
env: options.env,
timeoutMs: (options.timeoutSec ?? 15) * 1000,
onLog,
});
}
return await runAdapterExecutionTargetProcess(
runId,
target,
"sh",
["-lc", command],
{
cwd: options.cwd,
env: options.env,
timeoutSec: options.timeoutSec ?? 15,
graceSec: options.graceSec ?? 5,
onLog,
},
);
}
export async function readAdapterExecutionTargetHomeDir(
runId: string,
target: AdapterExecutionTarget | null | undefined,
options: AdapterExecutionTargetShellOptions,
): Promise<string | null> {
const result = await runAdapterExecutionTargetShellCommand(
runId,
target,
'printf %s "$HOME"',
options,
);
const homeDir = result.stdout.trim();
return homeDir.length > 0 ? homeDir : null;
}
export async function ensureAdapterExecutionTargetRuntimeCommandInstalled(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
installCommand?: string | null;
detectCommand?: string | null;
cwd: string;
env: Record<string, string>;
timeoutSec?: number;
graceSec?: number;
onLog?: AdapterExecutionTargetShellOptions["onLog"];
}): Promise<void> {
const installCommand = input.installCommand?.trim();
if (!installCommand || input.target?.kind !== "remote" || input.target.transport !== "sandbox") {
return;
}
const detectCommand = input.detectCommand?.trim();
if (detectCommand) {
const probe = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
`command -v ${shellQuote(detectCommand)} >/dev/null 2>&1`,
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
},
);
if (!probe.timedOut && probe.exitCode === 0) {
return;
}
}
const result = await runAdapterExecutionTargetShellCommand(
input.runId,
input.target,
installCommand,
{
cwd: input.cwd,
env: input.env,
timeoutSec: input.timeoutSec,
graceSec: input.graceSec,
onLog: input.onLog,
},
);
if (result.timedOut) {
throw new Error(`Timed out while installing the adapter runtime command via: ${installCommand}`);
}
if ((result.exitCode ?? 0) !== 0) {
throw new Error(`Failed to install the adapter runtime command via: ${installCommand}`);
}
}
export async function ensureAdapterExecutionTargetFile(
runId: string,
target: AdapterExecutionTarget | null | undefined,
filePath: string,
options: AdapterExecutionTargetShellOptions,
): Promise<void> {
await runAdapterExecutionTargetShellCommand(
runId,
target,
`mkdir -p ${shellQuote(path.posix.dirname(filePath))} && : > ${shellQuote(filePath)}`,
options,
);
}
/**
* Ensure a working directory exists (and is a directory) on the execution target.
*
* For local targets this delegates to the local `ensureAbsoluteDirectory` helper
* (Node fs). For remote (SSH/sandbox) targets it shells out and runs
* `mkdir -p` (when allowed) followed by a `[ -d ]` check so the result reflects
* the directory state inside the environment, not on the Paperclip host.
*
* Throws an Error with a human-readable message on failure.
*/
export async function ensureAdapterExecutionTargetDirectory(
runId: string,
target: AdapterExecutionTarget | null | undefined,
cwd: string,
options: AdapterExecutionTargetShellOptions & { createIfMissing?: boolean },
): Promise<void> {
const createIfMissing = options.createIfMissing ?? false;
if (!target || target.kind === "local") {
const { ensureAbsoluteDirectory } = await import("./server-utils.js");
await ensureAbsoluteDirectory(cwd, { createIfMissing });
return;
}
// Remote (SSH or sandbox): both expect POSIX absolute paths inside the env.
if (!cwd.startsWith("/")) {
throw new Error(`Working directory must be an absolute POSIX path on the remote target: "${cwd}"`);
}
const quoted = shellQuote(cwd);
const script = createIfMissing
? `mkdir -p ${quoted} && [ -d ${quoted} ]`
: `[ -d ${quoted} ]`;
const result = await runAdapterExecutionTargetShellCommand(runId, target, script, {
cwd: target.kind === "remote" ? target.remoteCwd : cwd,
env: options.env,
timeoutSec: options.timeoutSec ?? 15,
graceSec: options.graceSec ?? 5,
onLog: options.onLog,
});
if (result.timedOut) {
throw new Error(`Timed out checking working directory on remote target: "${cwd}"`);
}
if ((result.exitCode ?? 1) !== 0) {
const detail = (result.stderr || result.stdout || "").trim();
if (createIfMissing) {
throw new Error(
`Could not create working directory "${cwd}" on remote target${detail ? `: ${detail}` : "."}`,
);
}
throw new Error(
`Working directory does not exist on remote target: "${cwd}"${detail ? ` (${detail})` : ""}`,
);
}
}
export function adapterExecutionTargetSessionIdentity(
target: AdapterExecutionTarget | null | undefined,
): Record<string, unknown> | null {
if (!target || target.kind === "local") return null;
if (target.transport === "ssh") return buildRemoteExecutionSessionIdentity(target.spec);
return {
transport: "sandbox",
providerKey: target.providerKey ?? null,
environmentId: target.environmentId ?? null,
leaseId: target.leaseId ?? null,
remoteCwd: target.remoteCwd,
};
}
export function adapterExecutionTargetSessionMatches(
saved: unknown,
target: AdapterExecutionTarget | null | undefined,
): boolean {
if (!target || target.kind === "local") {
return Object.keys(parseObject(saved)).length === 0;
}
if (target.transport === "ssh") return remoteExecutionSessionMatches(saved, target.spec);
const current = adapterExecutionTargetSessionIdentity(target);
const parsedSaved = parseObject(saved);
return (
readStringMeta(parsedSaved, "transport") === current?.transport &&
readStringMeta(parsedSaved, "providerKey") === current?.providerKey &&
readStringMeta(parsedSaved, "environmentId") === current?.environmentId &&
readStringMeta(parsedSaved, "leaseId") === current?.leaseId &&
readStringMeta(parsedSaved, "remoteCwd") === current?.remoteCwd
);
}
export function parseAdapterExecutionTarget(value: unknown): AdapterExecutionTarget | null {
const parsed = parseObject(value);
const kind = readStringMeta(parsed, "kind");
if (kind === "local") {
return {
kind: "local",
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
};
}
if (kind === "remote" && readStringMeta(parsed, "transport") === "ssh") {
const spec = parseSshRemoteExecutionSpec(parseObject(parsed.spec));
if (!spec) return null;
return {
kind: "remote",
transport: "ssh",
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd: spec.remoteCwd,
spec,
};
}
if (kind === "remote" && readStringMeta(parsed, "transport") === "sandbox") {
const remoteCwd = readStringMeta(parsed, "remoteCwd");
if (!remoteCwd) return null;
return {
kind: "remote",
transport: "sandbox",
providerKey: readStringMeta(parsed, "providerKey"),
environmentId: readStringMeta(parsed, "environmentId"),
leaseId: readStringMeta(parsed, "leaseId"),
remoteCwd,
timeoutMs: typeof parsed.timeoutMs === "number" ? parsed.timeoutMs : null,
};
}
return null;
}
export function adapterExecutionTargetFromRemoteExecution(
remoteExecution: unknown,
metadata: Pick<AdapterLocalExecutionTarget, "environmentId" | "leaseId"> = {},
): AdapterExecutionTarget | null {
const parsed = parseObject(remoteExecution);
const ssh = parseSshRemoteExecutionSpec(parsed);
if (ssh) {
return {
kind: "remote",
transport: "ssh",
environmentId: metadata.environmentId ?? null,
leaseId: metadata.leaseId ?? null,
remoteCwd: ssh.remoteCwd,
spec: ssh,
};
}
return null;
}
export function readAdapterExecutionTarget(input: {
executionTarget?: unknown;
legacyRemoteExecution?: unknown;
}): AdapterExecutionTarget | null {
if (isAdapterExecutionTargetInstance(input.executionTarget)) {
return input.executionTarget;
}
return (
parseAdapterExecutionTarget(input.executionTarget) ??
adapterExecutionTargetFromRemoteExecution(input.legacyRemoteExecution)
);
}
export async function prepareAdapterExecutionTargetRuntime(input: {
target: AdapterExecutionTarget | null | undefined;
adapterKey: string;
workspaceLocalDir: string;
workspaceExclude?: string[];
preserveAbsentOnRestore?: string[];
assets?: AdapterManagedRuntimeAsset[];
installCommand?: string | null;
}): Promise<PreparedAdapterExecutionTargetRuntime> {
const target = input.target ?? { kind: "local" as const };
if (target.kind === "local") {
return {
target,
runtimeRootDir: null,
assetDirs: {},
restoreWorkspace: async () => {},
};
}
if (target.transport === "ssh") {
const prepared = await prepareRemoteManagedRuntime({
spec: target.spec,
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
assets: input.assets,
});
return {
target,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
};
}
const prepared = await prepareCommandManagedRuntime({
runner: requireSandboxRunner(target),
spec: {
providerKey: target.providerKey,
shellCommand: target.shellCommand,
leaseId: target.leaseId,
remoteCwd: target.remoteCwd,
timeoutMs: target.timeoutMs,
},
adapterKey: input.adapterKey,
workspaceLocalDir: input.workspaceLocalDir,
workspaceExclude: input.workspaceExclude,
preserveAbsentOnRestore: input.preserveAbsentOnRestore,
assets: input.assets,
installCommand: input.installCommand,
});
return {
target,
runtimeRootDir: prepared.runtimeRootDir,
assetDirs: prepared.assetDirs,
restoreWorkspace: prepared.restoreWorkspace,
};
}
export function runtimeAssetDir(
prepared: Pick<PreparedAdapterExecutionTargetRuntime, "assetDirs">,
key: string,
fallbackRemoteCwd: string,
): string {
return prepared.assetDirs[key] ?? path.posix.join(fallbackRemoteCwd, ".paperclip-runtime", key);
}
function buildBridgeResponseHeaders(response: Response): Record<string, string> {
const out: Record<string, string> = {};
for (const key of ["content-type", "etag", "last-modified"]) {
const value = response.headers.get(key);
if (value && value.trim().length > 0) out[key] = value.trim();
}
return out;
}
function buildBridgeForwardUrl(baseUrl: string, request: { path: string; query: string }): URL {
const url = new URL(request.path, baseUrl);
const query = request.query.trim();
url.search = query.startsWith("?") ? query.slice(1) : query;
return url;
}
function bridgeResponseBodyLimitError(maxBodyBytes: number): Error {
return new Error(`Bridge response body exceeded the configured size limit of ${maxBodyBytes} bytes.`);
}
async function readBridgeForwardResponseBody(response: Response, maxBodyBytes: number): Promise<string> {
const rawContentLength = response.headers.get("content-length");
if (rawContentLength) {
const contentLength = Number.parseInt(rawContentLength, 10);
if (Number.isFinite(contentLength) && contentLength > maxBodyBytes) {
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
}
if (!response.body) {
return "";
}
const reader = response.body.getReader();
const chunks: Buffer[] = [];
let totalBytes = 0;
while (true) {
const { done, value } = await reader.read();
if (done) break;
if (!value) continue;
totalBytes += value.byteLength;
if (totalBytes > maxBodyBytes) {
await reader.cancel().catch(() => undefined);
throw bridgeResponseBodyLimitError(maxBodyBytes);
}
chunks.push(Buffer.from(value));
}
return Buffer.concat(chunks, totalBytes).toString("utf8");
}
export async function startAdapterExecutionTargetPaperclipBridge(input: {
runId: string;
target: AdapterExecutionTarget | null | undefined;
runtimeRootDir: string | null | undefined;
adapterKey: string;
hostApiToken: string | null | undefined;
hostApiUrl?: string | null;
onLog?: (stream: "stdout" | "stderr", chunk: string) => Promise<void>;
maxBodyBytes?: number | null;
}): Promise<AdapterExecutionTargetPaperclipBridgeHandle | null> {
if (!adapterExecutionTargetUsesPaperclipBridge(input.target)) {
return null;
}
if (!input.target || input.target.kind !== "remote") {
return null;
}
const target = input.target;
const onLog = input.onLog ?? (async () => {});
const hostApiToken = input.hostApiToken?.trim() ?? "";
if (hostApiToken.length === 0) {
throw new Error("Sandbox bridge mode requires a host-side Paperclip API token.");
}
const runtimeRootDir =
input.runtimeRootDir?.trim().length
? input.runtimeRootDir.trim()
: path.posix.join(target.remoteCwd, ".paperclip-runtime", input.adapterKey);
const bridgeRuntimeDir = path.posix.join(runtimeRootDir, "paperclip-bridge");
const queueDir = path.posix.join(bridgeRuntimeDir, "queue");
const assetRemoteDir = path.posix.join(bridgeRuntimeDir, "server");
const bridgeToken = createSandboxCallbackBridgeToken();
const maxBodyBytes =
typeof input.maxBodyBytes === "number" && Number.isFinite(input.maxBodyBytes) && input.maxBodyBytes > 0
? Math.trunc(input.maxBodyBytes)
: DEFAULT_SANDBOX_CALLBACK_BRIDGE_MAX_BODY_BYTES;
const hostApiUrl =
input.hostApiUrl?.trim() ||
process.env.PAPERCLIP_RUNTIME_API_URL?.trim() ||
process.env.PAPERCLIP_API_URL?.trim() ||
resolveDefaultPaperclipApiUrl();
const shellCommand = adapterExecutionTargetShellCommand(target);
const runner = adapterExecutionTargetCommandRunner(target);
await onLog(
"stdout",
`[paperclip] Starting sandbox callback bridge for ${input.adapterKey} in ${bridgeRuntimeDir}.\n`,
);
const bridgeAsset = await createSandboxCallbackBridgeAsset();
let server: Awaited<ReturnType<typeof startSandboxCallbackBridgeServer>> | null = null;
let worker: Awaited<ReturnType<typeof startSandboxCallbackBridgeWorker>> | null = null;
try {
const client = createCommandManagedSandboxCallbackBridgeQueueClient({
runner,
remoteCwd: target.remoteCwd,
timeoutMs: adapterExecutionTargetTimeoutMs(target),
shellCommand,
});
// PAPERCLIP_BRIDGE_DEBUG opts into verbose stdout logs of every bridge
// proxy request/response. The query string is logged verbatim, so callers
// who pass auth tokens or other sensitive values as query parameters
// should be aware those values appear in the host process's stdout when
// this flag is enabled. Only intended for active debugging in trusted
// environments.
const bridgeDebugEnabled = isBridgeDebugEnabled(process.env);
worker = await startSandboxCallbackBridgeWorker({
client,
queueDir,
maxBodyBytes,
handleRequest: async (request) => {
const method = request.method.trim().toUpperCase() || "GET";
if (bridgeDebugEnabled) {
await onLog(
"stdout",
`[paperclip] Bridge proxy ${method} ${request.path}${request.query ? `?${request.query}` : ""}\n`,
);
}
const headers = new Headers();
for (const [key, value] of Object.entries(request.headers)) {
if (value.trim().length === 0) continue;
headers.set(key, value);
}
headers.set("authorization", `Bearer ${hostApiToken}`);
headers.set("x-paperclip-run-id", input.runId);
const response = await fetch(buildBridgeForwardUrl(hostApiUrl, request), {
method,
headers,
...(method === "GET" || method === "HEAD" ? {} : { body: request.body }),
signal: AbortSignal.timeout(30_000),
});
if (bridgeDebugEnabled) {
await onLog(
"stdout",
`[paperclip] Bridge proxy response ${response.status} for ${method} ${request.path}${request.query ? `?${request.query}` : ""}\n`,
);
}
return {
status: response.status,
headers: buildBridgeResponseHeaders(response),
body: await readBridgeForwardResponseBody(response, maxBodyBytes),
};
},
});
server = await startSandboxCallbackBridgeServer({
runner,
remoteCwd: target.remoteCwd,
assetRemoteDir,
queueDir,
bridgeToken,
bridgeAsset,
timeoutMs: adapterExecutionTargetTimeoutMs(target),
maxBodyBytes,
shellCommand,
});
} catch (error) {
await Promise.allSettled([
server?.stop(),
worker?.stop(),
bridgeAsset.cleanup(),
]);
throw error;
}
return {
env: {
PAPERCLIP_API_URL: server.baseUrl,
PAPERCLIP_API_KEY: bridgeToken,
PAPERCLIP_API_BRIDGE_MODE: "queue_v1",
},
stop: async () => {
await Promise.allSettled([
server?.stop(),
]);
await Promise.allSettled([
worker?.stop(),
bridgeAsset.cleanup(),
]);
},
};
}

View File

@@ -20,11 +20,14 @@ export type {
AdapterSkillContext,
AdapterSessionCodec,
AdapterModel,
AdapterModelProfileKey,
AdapterModelProfileDefinition,
HireApprovedPayload,
HireApprovedHookResult,
ConfigFieldOption,
ConfigFieldSchema,
AdapterConfigSchema,
AdapterRuntimeCommandSpec,
ServerAdapterModule,
QuotaWindow,
ProviderQuotaResult,
@@ -53,4 +56,20 @@ export {
redactHomePathUserSegmentsInValue,
redactTranscriptEntryPaths,
} from "./log-redaction.js";
export {
REDACTED_COMMAND_TEXT_VALUE,
redactCommandText,
} from "./command-redaction.js";
export { inferOpenAiCompatibleBiller } from "./billing.js";
// Keep the root adapter-utils entry browser-safe because the UI imports it.
// The sandbox callback bridge stays available via its dedicated subpath export.
export type {
SandboxCallbackBridgeRequest,
SandboxCallbackBridgeResponse,
SandboxCallbackBridgeAsset,
SandboxCallbackBridgeDirectories,
SandboxCallbackBridgeRouteRule,
SandboxCallbackBridgeQueueClient,
SandboxCallbackBridgeWorkerHandle,
StartedSandboxCallbackBridgeServer,
} from "./sandbox-callback-bridge.js";

View File

@@ -0,0 +1,116 @@
import path from "node:path";
import {
type SshRemoteExecutionSpec,
prepareWorkspaceForSshExecution,
restoreWorkspaceFromSshExecution,
syncDirectoryToSsh,
} from "./ssh.js";
export interface RemoteManagedRuntimeAsset {
key: string;
localDir: string;
followSymlinks?: boolean;
exclude?: string[];
}
export interface PreparedRemoteManagedRuntime {
spec: SshRemoteExecutionSpec;
workspaceLocalDir: string;
workspaceRemoteDir: string;
runtimeRootDir: string;
assetDirs: Record<string, string>;
restoreWorkspace(): Promise<void>;
}
function asObject(value: unknown): Record<string, unknown> {
return value && typeof value === "object" && !Array.isArray(value)
? (value as Record<string, unknown>)
: {};
}
function asString(value: unknown): string {
return typeof value === "string" ? value : "";
}
function asNumber(value: unknown): number {
return typeof value === "number" ? value : Number(value);
}
export function buildRemoteExecutionSessionIdentity(spec: SshRemoteExecutionSpec | null) {
if (!spec) return null;
return {
transport: "ssh",
host: spec.host,
port: spec.port,
username: spec.username,
remoteCwd: spec.remoteCwd,
} as const;
}
export function remoteExecutionSessionMatches(saved: unknown, current: SshRemoteExecutionSpec | null): boolean {
const currentIdentity = buildRemoteExecutionSessionIdentity(current);
if (!currentIdentity) return false;
const parsedSaved = asObject(saved);
return (
asString(parsedSaved.transport) === currentIdentity.transport &&
asString(parsedSaved.host) === currentIdentity.host &&
asNumber(parsedSaved.port) === currentIdentity.port &&
asString(parsedSaved.username) === currentIdentity.username &&
asString(parsedSaved.remoteCwd) === currentIdentity.remoteCwd
);
}
export async function prepareRemoteManagedRuntime(input: {
spec: SshRemoteExecutionSpec;
adapterKey: string;
workspaceLocalDir: string;
workspaceRemoteDir?: string;
assets?: RemoteManagedRuntimeAsset[];
}): Promise<PreparedRemoteManagedRuntime> {
const workspaceRemoteDir = input.workspaceRemoteDir ?? input.spec.remoteCwd;
const runtimeRootDir = path.posix.join(workspaceRemoteDir, ".paperclip-runtime", input.adapterKey);
await prepareWorkspaceForSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
const assetDirs: Record<string, string> = {};
try {
for (const asset of input.assets ?? []) {
const remoteDir = path.posix.join(runtimeRootDir, asset.key);
assetDirs[asset.key] = remoteDir;
await syncDirectoryToSsh({
spec: input.spec,
localDir: asset.localDir,
remoteDir,
followSymlinks: asset.followSymlinks,
exclude: asset.exclude,
});
}
} catch (error) {
await restoreWorkspaceFromSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
throw error;
}
return {
spec: input.spec,
workspaceLocalDir: input.workspaceLocalDir,
workspaceRemoteDir,
runtimeRootDir,
assetDirs,
restoreWorkspace: async () => {
await restoreWorkspaceFromSshExecution({
spec: input.spec,
localDir: input.workspaceLocalDir,
remoteDir: workspaceRemoteDir,
});
},
};
}

Some files were not shown because too many files have changed in this diff Show More