29 documents are tagged "agent-system"

ACP Integration

The Agent Client Protocol (https://agentclientprotocol.com/) as the default outward wire of any conforming agent system. The mapping between this guide's internal shapes and ACP's wire vocabulary, the method-by-method correspondence, where the protocol and the guide diverge, and how the adapter handles each seam.

Agent System (WG)

A guide for implementing an LLM-driven agent system. Implementation-agnostic, normative, and meant to play the same role for agent runtimes that LSP plays for language tooling or that ACP plays for editor ↔ agent integration. Names the invariants every implementor MUST honor and the policies each implementor picks.

AI SDK (reference substrate)

The guide pins AI SDK v6 as the chunk-shape substrate every conforming implementation speaks internally. This page captures the implementor notes that live outside AI SDK's own docs — the token-usage normalization rule, where the SDK's tool-loop helper fits, and the things the RFC adds on top.

Binary file handling

Glossary and reference for how an agent handles binary attachments the model cannot natively read. Three resolution paths (provider-native multimodal, skill-per-format, shell-based conversion), a format matrix, the scratch-space pattern for archive extraction, and the boundary between protocol and implementor.

Compositor

User intent representation. The multipart user-message shape, file and directory references vs attachments, inline commands, mentions, editor context, attachment handling, and the lowering rules that turn what the user composes into what the model sees.

Cost Optimization

Doctrine for running frontier models as agent models without burning money on waste. The agent-loop cost model (every token is re-billed every step), the ledger of measured leaks — missing prompt caching, tool-result echo, unbounded replay, window/threshold/tier decided separately — and the quality-first ordering that spends effort on quality-neutral fixes before quality-tradeoff ones.
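
As a rough illustration of the "every token is re-billed every step" cost model, the sketch below totals the input tokens billed over a loop in which each step resends the whole context. The function name and all numbers are hypothetical, and it assumes no prompt caching and a fixed amount of context growth per step; it is not taken from the Cost Optimization page itself.

```python
def billed_input_tokens(base: int, growth_per_step: int, steps: int) -> int:
    """Total input tokens billed when every step resends the full context.

    Assumptions (illustrative only): no prompt caching, and the context
    grows by a constant number of tokens after each step.
    """
    total = 0
    context = base
    for _ in range(steps):
        total += context              # the whole context is billed again on this step
        context += growth_per_step    # tool calls and results are appended for the next step
    return total


# Hypothetical run: 10k-token starting context, 2k tokens added per step, 20 steps.
total = billed_input_tokens(base=10_000, growth_per_step=2_000, steps=20)
last_step_context = 10_000 + 2_000 * 19
print(total, last_step_context)  # 580000 48000
```

Under these assumptions the loop is billed for 580,000 input tokens even though the context never exceeds 48,000, which is why the page treats missing prompt caching and unbounded replay as leaks.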

Debugging

The developer experience contract. A canonical inspection format every implementor exposes, the export paths a session can be read through, what the inspection tool exposes, replay semantics, and the DX checklist a conforming implementation passes.

Execution authority

This document decides where an agent host applies confinement. The boundary is

FAQ

Question-and-answer index over the Agent System guide. Doubles as an entry point (read the question, jump to the page) and as a conformance test (if a Q cannot be answered from the RFC, the RFC owes a clarification). Answers are normative and derived from the linked page; they do not invent policy beyond what the guide says.

Foundations

The bedrock the rest of the guide rests on. AI SDK v6 as the streaming substrate, directory-rooted execution, the locked fundamental tool set summary, sandbox placement, the watchdog, the case for web search as a special fundamental tool, and the cross-cutting invariants every implementation MUST honor.

Lifecycle Events

The session-lifecycle event channel — the small, multi-subscriber surface through which the core announces turn-started / turn-finished / approval-requested moments to consumers that are not the chat renderer. Why an event surface and not consumer-specific wiring, the event vocabulary and its fields, volatility and ordering semantics, the projection over the host wire, the notification consumer policy (focus gating, click-to-attend), and the boundary against a user-facing hooks system.

Local Daemon

The agent server as a long-lived, discoverable local process — one server, many clients. The discovery contract (registration record, persistent credential, atomic publish, single-daemon convergence), the authenticated probe and protocol gate, connect-or-spawn, the browser exception, and the production shape.
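
One common way to get an atomic publish for a registration record is to write a temporary file and rename it over the target, so a client reading the record never sees a half-written file. The sketch below assumes that approach; the function name, the file name, and the record fields (`pid`, `port`, `protocol_version`) are hypothetical and are not the guide's actual discovery contract.

```python
import json
import os
import tempfile


def publish_registration(path: str, record: dict) -> None:
    """Write `record` to `path` so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".registration-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f)
            f.flush()
            os.fsync(f.fileno())      # push the bytes to disk before publishing
        os.replace(tmp_path, path)    # atomic swap of the temp file into place
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


# Hypothetical usage with made-up field names.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "daemon.json")
publish_registration(demo_path, {"pid": 4242, "port": 7777, "protocol_version": 1})
```

A second daemon that loses the race simply overwrites or re-reads a complete record; how the guide resolves that race (single-daemon convergence) is defined on the Local Daemon page, not here.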

MCP and Connectors

How user-plugged MCP servers and other external connectors compose with the locked tool set. The bulk problem, tool_search, OAuth, dynamic refresh, and trust policy for untrusted MCP servers.

Persistency

The storage layer. The three-table session schema, the save-on-chunk policy, the JSON-column discipline, the id strategy, and the event-log opt-in. SQLite is the default; the schema ports to any engine that supports JSON columns and indexed string keys.

Plan Mode

Plan mode as a host-owned operating regime, not a model state. The plan/build pair, the four-invariant transition contract (mode is injected context; the agent proposes but never effects a switch; a transition is a human-gated re-injection; the plan is a reviewed artifact), the symmetry between entering and exiting, who may initiate a transition, and the read-only harness with its single carve-out.

Runtime Environments

The three runtime environments an agent can be hosted in — web (limited capabilities), cloud sandbox (ephemeral container/VM), and computer (the user's machine). How the locked tool set degrades and which capabilities each environment provides.

Sandbox Runtime (srt) — reference confinement primitive

srt is the reference macOS and Linux confinement primitive for the computer environment, with explicit limits around host routing and concurrent authority domains.

Scratch

The per-session, system-managed, ephemeral filesystem area where an agent does working I/O and where produced files land by default — distinct from the durable workspace. The scratch contract (host-owned and per-session; ephemeral with durability only by promotion; the default output sink; reachable without per-operation approval yet inside sandbox containment; not the workspace), its lifecycle, and what hosts are free to vary.

Session Lifecycle

How a session is born, grows, survives interruption, is compacted, rewound, or forked, and how it switches models per turn. The loop semantics, the chunk stream, the abort path, the run-state machine, the permission-scope layering, and the session-status back-channel.

Skills and Project Instructions

Two layers of knowledge an agent reaches for beyond its tools. Skills (lazy, advertise-then-load, agent picks when relevant) and project instructions (eager, unconditional, the floor every session stands on).

Subagents

How an agent delegates to a child running the same loop. Agent modes, the task tool, blocking vs background, recursion, permission inheritance (deny rules always win), inspectability, awareness, specialized subagents (title / summary / compaction), and plan/build mode as an opinionated pattern hosts may layer.
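
The "deny rules always win" part of permission inheritance can be sketched as a merge in which any matching deny, from parent or child, overrides every allow. The `Rule` shape, the `"*"` wildcard, and the `"ask"` fallback are assumptions made for this illustration, not the guide's actual rule format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    effect: str  # "allow" or "deny"
    tool: str    # a tool name, or "*" for any tool (hypothetical wildcard)


def decide(parent_rules: list[Rule], child_rules: list[Rule], tool: str) -> str:
    """Evaluate a child's tool call against inherited and local rules."""
    matching = [r for r in (*parent_rules, *child_rules) if r.tool in (tool, "*")]
    # A single matching deny from either layer wins over any number of allows.
    if any(r.effect == "deny" for r in matching):
        return "deny"
    if any(r.effect == "allow" for r in matching):
        return "allow"
    return "ask"  # assumed fallback when no rule matches


# The parent denies shell access; the child cannot grant it back to itself.
parent = [Rule("deny", "bash"), Rule("allow", "read")]
child = [Rule("allow", "bash")]
print(decide(parent, child, "bash"))  # deny
print(decide(parent, child, "read"))  # allow
```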

Tool Design

The design discipline for shaping an agent tool before it is written — the doctrine behind the tool contract. A tool is a contract authored for a consumer that cannot be renegotiated with and cannot be migrated. Minimal surface, host config off the arguments, grounded and honest knobs, auto-resolved inputs, context-frugal and clearly-failing results, and when to reach for a tool versus a connector versus a skill.

Tools

The tool contract. The locked fundamental set, what every tool must self-describe, the result envelope, truncation, and how permissions are evaluated at the tool-call boundary.

Triggers

Anything that fires a turn besides a human typing in the compositor. Scheduled wakeups, external webhooks (CI / GitHub / generic), programmatic API calls, MCP-pushed events, and agent self-scheduled wakeups. Trigger envelope shape, queue semantics, interactive-vs-hosted execution, agent self-scheduling pattern, lifecycle bounds, and the boundary with background subagents.

Turn Authority

The host states what happened; the client renders it. The turn-lifecycle wire vocabulary must carry the identity of the message the core actually fired and explicit started/finished/aborted transitions, so a client never infers which queued item became a real turn from its own optimistic mirror. The authority direction, the lifecycle contract, why reconstruction forks across consumers, and the migration from a state-only status channel.

Turn Queue

The single point where competing demands to start a turn on one session are serialized, ordered, and drained. The ingestion model, the queued_at data shape, the run-state machine that drains the queue, the single-flight / FIFO / no-preemption invariants, the drop rules, restart behavior, and the core-vs-surface boundary that keeps the queue authoritative in the core.
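
A minimal sketch of the single-flight / FIFO / no-preemption invariants, assuming one queue per session. The class and field names are hypothetical; `queued_at` is used here as a plain arrival timestamp, which may differ from the data shape the Turn Queue page defines, and drop rules and restart behavior are left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedTurn:
    message_id: str
    queued_at: float  # arrival time; assumed to be a timestamp for this sketch


class TurnQueue:
    def __init__(self) -> None:
        self._pending: deque[QueuedTurn] = deque()
        self.running: Optional[QueuedTurn] = None

    def enqueue(self, turn: QueuedTurn) -> Optional[QueuedTurn]:
        """Add a turn; returns the turn that starts now, if any."""
        self._pending.append(turn)    # FIFO: new demands always go to the back
        return self._drain()

    def finish(self) -> Optional[QueuedTurn]:
        """Mark the running turn done; returns the next turn started, if any."""
        self.running = None
        return self._drain()

    def _drain(self) -> Optional[QueuedTurn]:
        # Single-flight and no preemption: start a turn only when nothing is running.
        if self.running is None and self._pending:
            self.running = self._pending.popleft()
            return self.running
        return None


queue = TurnQueue()
first = queue.enqueue(QueuedTurn("msg-a", queued_at=1.0))   # starts immediately
second = queue.enqueue(QueuedTurn("msg-b", queued_at=2.0))  # waits behind msg-a
```

Keeping the start decision inside `_drain` means a surface can only request a turn; which queued item actually becomes the running turn is decided in one place.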

UX Patterns

UX patterns that ride on top of the compositor and push back into the protocol. Queued sends, sidecar chat as ephemeral fork, and memory as a built-on-top layer. The compositor itself, file refs, attachments, mentions, commands, editor context, and the user-view-vs-model-view lowering rules live in compositor.md.

Visual perception

The read/view modality split. Why a text read and a visual view are separate tools, the perception-tool contract, the input matrix (bitmaps now, rendered sources later), how a tool result becomes a provider image block, and the retention policy that keeps perceived pixels from re-filling context every turn.

Visual perception lowering (AI SDK)

Why a tool-result image must be hoisted into a user-message image part on the OpenAI-compatible wire, and how to do it as an AI SDK consumer. The Chat Completions tool role is text-only, so the SDK stringifies a tool-output media block to base64 text the model cannot decode. The fix is a prepareStep transform that re-attaches the image as a user message — the SDK-specific realization of the neutral vision RFC's stage-and-reattach strategy.