
Agent security

Status: V1.x in flight. Layers 1–3 are wired for Desktop; layer 4 lands with the agent shell.run capability.

GRIDA-SEC-004 is the trust boundary between the URL-loaded renderer and the long-lived agent server. The boundary is enforced by five independent layers. The design rule: if any single layer is compromised, the remaining four must still defend the user.

For the abstract defense-in-depth model the agent system specifies, see agent/foundations.md / watchdog vs permission rules vs sandbox and agent/tools.md / defense in depth. This page is the desktop-specific landing.

The five layers

| # | Layer | Where | What it stops |
|---|-------|-------|---------------|
| 1 | Path-scoped window.grida | Preload (desktop/src/preload.ts); exposed iff location.pathname is /desktop or starts with /desktop/; later navigation is blocked by preload history guards and desktop/src/window.ts. | XSS on any other grida.co page cannot reach the agent server; window.grida is undefined there. |
| 2 | CSP-strict /desktop/* routes | editor/proxy.ts and the desktop route group. | Third-party scripts and inline-script injection on privileged desktop pages. |
| 3 | Agent server HTTP perimeter | AgentHost HTTP routes (packages/grida-ai-agent/src/http/). | Cross-origin / cross-process callers without the per-spawn password, the right Referer, or the right Origin. |
| 4 | OS-level outer sandbox (srt) | agent-sandbox-wrap. Wraps the AgentHost process tree. | A compromised agent server reading SSH keys, writing shell rc files, calling arbitrary hosts. |
| 5 | Secrets discipline | auth.json at chmod 0o600; preload holds the agent server password in closure; never on window. | A non-Grida process on the same machine reading the token file; a renderer script enumerating credentials. |

Layer 1 — Path-scoped window.grida

The preload exposes the bridge only when the initial document URL is under /desktop/*. contextBridge has no revocation API, so desktop/src/window.ts blocks navigation from an exposed desktop page to non-desktop paths. Cross-link: renderer-bridge / capability boundary.
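The path test described above can be sketched as a small predicate. This is an illustration only: isDesktopPath and exposeBridgeIfAllowed are hypothetical names, not the functions in desktop/src/preload.ts, and the expose callback stands in for whatever call actually attaches the bridge.

```typescript
// Hypothetical helper: name and signature are illustrative, not taken from preload.ts.
function isDesktopPath(pathname: string): boolean {
  // Exact match or a true sub-path; "/desktopfoo" must not qualify.
  return pathname === "/desktop" || pathname.startsWith("/desktop/");
}

// Fail closed: expose() runs only on a positive match.
// `expose` is an assumed callback standing in for the real bridge attachment.
function exposeBridgeIfAllowed(pathname: string, expose: () => void): boolean {
  if (!isDesktopPath(pathname)) return false;
  expose();
  return true;
}
```

Matching on "/desktop/" with the trailing slash, rather than on the bare prefix "/desktop", is what keeps a sibling route such as /desktopfoo outside the boundary.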

What this catches. A future bug that introduces an <iframe> on grida.co/blog/... to load the same Electron renderer. The bridge stays detached; the iframe sees no agent server API.

What this does not catch. A bug inside /desktop/* that runs attacker-controlled JS. Layers 2 and 3 catch that.

Layer 2 — CSP-strict /desktop/* routes

editor/proxy.ts attaches a nonce-based CSP to /desktop/* responses. The desktop route group does not run third-party analytics or marketing scripts.
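A minimal sketch of what attaching a nonce-based CSP to desktop responses might look like. The directive set, the nonce format check, and the names buildDesktopCsp and withDesktopCsp are assumptions for illustration; the actual policy lives in editor/proxy.ts and may differ.

```typescript
// Hypothetical helper; the directive list is an assumption, not the real policy.
function buildDesktopCsp(nonce: string): string {
  // Reject nonces that are too short or contain characters outside base64/base64url.
  if (!/^[A-Za-z0-9+\/_=-]{16,}$/.test(nonce)) {
    throw new Error("nonce must be at least 16 base64/base64url characters");
  }
  return [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`, // no 'unsafe-inline', no third-party hosts
    "object-src 'none'",
    "base-uri 'none'",
  ].join("; ");
}

// Adds the header only for /desktop and /desktop/*; other paths pass through unchanged.
function withDesktopCsp(
  pathname: string,
  nonce: string,
  headers: Record<string, string>,
): Record<string, string> {
  const isDesktop = pathname === "/desktop" || pathname.startsWith("/desktop/");
  return isDesktop
    ? { ...headers, "Content-Security-Policy": buildDesktopCsp(nonce) }
    : headers;
}
```

The nonce is taken as a parameter here so the sketch stays free of runtime-specific crypto calls; a real proxy would generate a fresh random value per response.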

What this catches. A future desktop page that accidentally depends on an inline script or third-party script include.

What this does not catch. A trusted framework script compromised before it reaches the page. Layer 3 catches requests that do not carry the host-held credentials.

Layer 3 — Agent server HTTP perimeter

AgentHost binds on 127.0.0.1:<random>. Every request must:

  • Carry the per-spawn Basic Auth password (regenerated each agent host start; never persisted).
  • Have a Referer header whose path is under the host-declared desktop route root.
  • Have an Origin on the host-declared allowlist.
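The three checks above can be sketched as a single guard that every route handler runs first. This is a sketch under stated assumptions: PerimeterConfig, checkPerimeter, and the expectedAuthorization field are hypothetical, header names are assumed to be lower-cased already, and the real guards live under packages/grida-ai-agent/src/http/.

```typescript
// Hypothetical config shape; field names are illustrative.
interface PerimeterConfig {
  expectedAuthorization: string; // full "Basic ..." header value for this spawn
  desktopRouteRoot: string; // host-declared root, e.g. "/desktop"
  allowedOrigins: readonly string[]; // host-declared allowlist
}

type Verdict = { ok: true } | { ok: false; status: 401 | 403; reason: string };

// Compares every character instead of returning at the first mismatch.
function equalNoEarlyExit(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

function checkPerimeter(
  headers: Record<string, string | undefined>,
  cfg: PerimeterConfig,
): Verdict {
  // 1. Per-spawn Basic Auth password.
  if (!equalNoEarlyExit(headers["authorization"] ?? "", cfg.expectedAuthorization)) {
    return { ok: false, status: 401, reason: "bad credentials" };
  }
  // 2. Origin must be on the allowlist; a missing Origin is rejected.
  const origin = headers["origin"];
  if (origin === undefined || !cfg.allowedOrigins.includes(origin)) {
    return { ok: false, status: 403, reason: "origin not allowlisted" };
  }
  // 3. Referer path must sit under the desktop route root.
  let refererPath: string;
  try {
    refererPath = new URL(headers["referer"] ?? "").pathname;
  } catch {
    return { ok: false, status: 403, reason: "missing or malformed referer" };
  }
  const root = cfg.desktopRouteRoot;
  if (refererPath !== root && !refererPath.startsWith(root + "/")) {
    return { ok: false, status: 403, reason: "referer outside desktop root" };
  }
  return { ok: true };
}
```

All three conditions must pass; there is no branch that skips a check for an "internal" caller.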

What this catches. A non-Grida process on the same machine discovering the loopback port and trying to call the API.

What this does not catch. An attacker who already controls the renderer and can reach window.grida; that attacker is already inside layer 1. Layer 4 limits the worst filesystem/network blast radius if layer 1 fails.

Layer 4 — OS-level outer sandbox

AgentHost runs inside srt (agent-sandbox-wrap) with the package-owned outer-wrap intent: network limited to BYOK provider hosts from the package metadata, secret-path read/write denies, and host-supplied write roots for the sidecar's persisted state and platform temp paths. Loopback binding is allowed so the HTTP server can bind 127.0.0.1:<random>.
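To make the outer-wrap intent concrete, here is a rough sketch of a policy object with the properties described above. The shape is hypothetical: OuterWrapPolicy, its field names, and sketchOuterWrapPolicy do not mirror srt's real configuration schema or the output of buildAgentHostSandboxPolicy(), and the secret-path list is only a sample.

```typescript
// Hypothetical shape; field names are illustrative, not srt's config schema.
interface OuterWrapPolicy {
  network: { allowedHosts: string[]; allowLoopbackBind: boolean };
  filesystem: { denyRead: string[]; denyWrite: string[]; allowWrite: string[] };
}

interface OuterWrapInput {
  providerHosts: string[]; // BYOK provider hosts from package metadata
  home: string; // user's home directory
  stateDir: string; // sidecar persisted state (host-supplied)
  tempDirs: string[]; // platform temp paths (host-supplied)
}

function sketchOuterWrapPolicy(input: OuterWrapInput): OuterWrapPolicy {
  // Sample secret paths only; a real mandatory deny set would be longer.
  const secretPaths = [".ssh", ".aws", ".zshrc", ".bashrc"].map(
    (name) => `${input.home}/${name}`,
  );
  return {
    network: {
      allowedHosts: [...input.providerHosts], // nothing beyond provider hosts
      allowLoopbackBind: true, // the HTTP server must bind 127.0.0.1:<random>
    },
    filesystem: {
      denyRead: secretPaths,
      denyWrite: secretPaths,
      allowWrite: [input.stateDir, ...input.tempDirs],
    },
  };
}
```

The point of the shape is that write roots and network hosts are enumerated by the host, while the secret-path denies are fixed by the package and not influenced by input other than the home directory.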

What this catches. A compromised agent server that wants to read ~/.ssh, write ~/.zshrc, or exfiltrate to a random host.

What this does not catch. A compromised agent that uses an allowlisted host (e.g. the provider) for exfiltration. The RFC's watchdog is the answer to that — host-configured policy, not srt.

Layer 4b — Agent shell execution

The run_command agent tool spawns child processes through the shell runner (packages/grida-ai-agent/src/shell/runner.ts) with shell: false, behind three gates: a hardcoded command allowlist (permissions.ts), a cwd-must-be-inside-an-opened-workspace check, and an in-process secret-dir containment check. (The full per-command fs/net sub-policy that would constrain each spawned child is deferred; today srt confines the whole sidecar, not each child.)

Secret-dir containment — the srt / in-process split. There are two classes of secret on disk, owned by two different gates:

  • HOME secrets (~/.ssh, ~/.aws, shell rc files) are denied for the entire tree by the srt deny_read policy. The host has no legitimate read there, so a kernel-level deny is safe.
  • The agent host's own secret dir — its userData, where BYOK auth.json, workspaces.json, recent.json, and the sessions db live — is not in srt deny_read. srt confines the whole sidecar including the host process, and the host process must read auth.json for provider calls. Denying it at the kernel level would break host auth. Instead the shell child is kept out of it in-process: validateShellRequest rejects any command arg that resolves (after realpath of the nearest existing ancestor, mirroring the cwd discipline so a symlink can't bypass it) inside that protected root, threaded down from the runtime.

This is the responsibility-and-reconciliation rule for secret reads: srt owns HOME secrets at the kernel; the in-process runner owns the host's own userData. Neither covers the other.
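The three in-process gates (allowlist, cwd inside a workspace, secret-dir containment) can be sketched as one pure function. Everything here is an assumption for illustration: sketchValidateShellRequest is not the real validateShellRequest, the canonicalize callback stands in for "realpath of the nearest existing ancestor", paths are POSIX-style, and each argument is treated as a whole path (so a flag that embeds a path, such as --flag=/path, is not covered by this sketch).

```typescript
interface ShellRequest { command: string; args: string[]; cwd: string }

interface ShellGateContext {
  allowlist: ReadonlySet<string>;
  workspaceRoots: string[]; // assumed already canonical
  protectedRoot: string; // host userData, assumed already canonical
  // Injected so the sketch stays pure; assumed to resolve symlinks.
  canonicalize: (absPath: string) => string;
}

function isInside(root: string, candidate: string): boolean {
  return candidate === root || candidate.startsWith(root.endsWith("/") ? root : root + "/");
}

// Minimal POSIX-style resolve: joins a relative arg onto cwd and folds "." and "..".
function resolvePosix(cwd: string, arg: string): string {
  const joined = arg.startsWith("/") ? arg : `${cwd}/${arg}`;
  const out: string[] = [];
  for (const part of joined.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Returns null when the request passes all gates, otherwise a rejection reason.
function sketchValidateShellRequest(req: ShellRequest, ctx: ShellGateContext): string | null {
  if (!ctx.allowlist.has(req.command)) return "command not allowlisted";
  const cwd = ctx.canonicalize(resolvePosix("/", req.cwd));
  if (!ctx.workspaceRoots.some((root) => isInside(root, cwd))) {
    return "cwd outside opened workspaces";
  }
  for (const arg of req.args) {
    // Canonicalize after resolving so "../" hops and symlinks are both accounted for.
    const resolved = ctx.canonicalize(resolvePosix(cwd, arg));
    if (isInside(ctx.protectedRoot, resolved)) return "argument resolves inside protected root";
  }
  return null;
}
```

Canonicalizing both the cwd and each argument before the containment test is what prevents a symlink inside the workspace from pointing the child at the protected root.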

git is an accepted limitation. git is allowlisted because it is the single most useful dev command, but even with shell: false it is an arbitrary-code-execution / arbitrary-file-read vector: git -c core.pager=… / -c core.sshCommand=…, --upload-pack, and apply/clone run attacker-chosen programs, and --git-dir, apply, and a .git/config credential read reach arbitrary files. This collapses the no-shell / allowlist guarantee. The risk is accepted for V1.x pending the srt per-command sub-policy.

Layer 5 — Secrets discipline

  • auth.json at chmod 0o600 (agent-storage-layout).
  • The agent server password is generated per-spawn, passed to the sidecar over stdin, fetched by preload through guarded IPC, held in the preload closure, and never placed on argv, env, disk, or window.
  • Logs never include tokens, message content, tool args, or model output (sessions / security boundary in bedrock carries the same rule).
  • secrets.get is not exposed in the bridge. The renderer may check, set, or delete keys, but key material never crosses back from the agent host.
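The closure and no-get rules above can be illustrated with a small sketch. The names are hypothetical (createBridge, Send, RendererBridge, and the "/ping" route are not from the codebase), the injected send function stands in for the real transport, and the Map stands in for storage that actually lives in the agent host.

```typescript
// Hypothetical transport: receives the password from the closure, returns an HTTP status.
type Send = (route: string, password: string) => number;

interface RendererBridge {
  ping(): number;
  secrets: {
    has(key: string): boolean;
    set(key: string, value: string): void;
    delete(key: string): void;
    // Deliberately no get(): key material never crosses back to the renderer.
  };
}

function createBridge(password: string, send: Send, store: Map<string, string>): RendererBridge {
  // `password` and `store` are captured by the closures below;
  // neither is a property of the returned object.
  return {
    ping: () => send("/ping", password),
    secrets: {
      has: (key) => store.has(key),
      set: (key, value) => { store.set(key, value); },
      delete: (key) => { store.delete(key); },
    },
  };
}
```

A script that enumerates the returned object finds only functions; the password is reachable solely by code running inside the closure, which is the property the bullet list relies on.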

What this catches. A renderer script enumerating window looking for credentials; an unauthorized user on the same machine reading the token file.

What this does not catch. Root on the box. The RFC is explicit that an OS sandbox is not a defense against an attacker with system privileges.

Reviewer checklist

Before merging a PR that touches /desktop/* UI, the preload, the agent server's HTTP layer, buildAgentHostSandboxPolicy(), or any source file under packages/grida-ai-agent/:

  1. Layer 1 intact? Preload path-scoping still fails closed. Window navigation still blocks exposed desktop pages from leaving /desktop/*.
  2. Layer 2 intact? /desktop/* routes keep the nonce CSP and do not add third-party scripts.
  3. Layer 3 intact? Every new route handler runs the auth + Referer + Origin guards. No "internal" bypass.
  4. Layer 4 intact? buildAgentHostSandboxPolicy() paths/hosts cover what the new code reads / writes / fetches; mandatory deny set unmodified.
  5. Layer 5 intact? No new secrets placed on window; auth.json write path still uses 0o600; logs still elide content.

Anything that weakens a layer needs an explicit GRIDA-SEC-004 review note in the PR, not a silent regression.

See also