.canvas Directory Contract
Status: Draft V1 · App-agnostic format spec · Last updated 2026-06-16
Engineering and WG material for Grida builders (not end-user product guides).
The Agent Client Protocol (https://agentclientprotocol.com/) as the default outward wire of any conforming agent system. The mapping between this guide's internal shapes and ACP's wire vocabulary; method-by-method correspondence; where the protocol and the guide diverge and how the adapter handles each seam.
Provider profile for consuming Codex from an external agent system. Defines what Codex provides, how a host should adapt it into an ACP-consuming runtime, and the boundaries around tools, sessions, filesystem authority, and image generation.
Decision/RFD on whether Grida should host an agent-provider class — driving an external ACP agent (Claude Code / Codex) the user already pays for, with Grida as ACP consumer and an MCP server of its own tools. The forever-cost ledger, the narrow delta over BYOK, and a reversible path.
How Grida Desktop adapts AgentHost sandbox policy to srt and supervises the AgentSidecar process.
Desktop binding of GRIDA-SEC-004 for AgentSidecar, the renderer bridge, HTTP perimeter, sandbox, and secrets discipline.
Desktop userData files owned by AgentSidecar, including auth.json, recent files, workspaces, and SQLite session storage.
A guide for implementing an LLM-driven agent system. Implementation-agnostic, normative, and meant to play the same role for agent runtimes that LSP plays for language tooling or that ACP plays for editor ↔ agent integration. Names the invariants every implementor MUST honor and the policies each implementor picks.
Working group hub for AI-related docs. Two layers — the implementation-agnostic agent system RFC, and the Grida-specific bindings (canvas tools, image tools, host tool surface) that ride on top of it.
| feature id | status | description | PRs |
The guide pins AI SDK v6 as the chunk-shape substrate every conforming implementation speaks internally. This page captures the implementor notes that live outside AI SDK's own docs — the token-usage normalization rule, where the SDK's tool-loop helper fits, and the things the RFC adds on top.
Exposed contract of Grida's local agent system, the package boundary, and the tests that pin it during refactor.
Motivation
Authoring features are a collection of authoring-time behaviors in the editor: operations that rewrite node parameters to achieve an intended visual result, rather than relying on runtime transforms or renderer-specific tricks.
Living document. Every known issue in the billing surface (subscriptions,
Working group documents for the Grida billing surface (subscriptions,
Glossary and reference for how an agent handles binary attachments the model cannot natively read. Three resolution paths (provider-native multimodal, skill-per-format, shell-based conversion), a format matrix, the scratch-space pattern for archive extraction, and the boundary between protocol and implementor.
Grida's specialized subagents (titler, compactor, planner) — concrete bindings of the RFC's specialized-subagent pattern with the Grida-specific tier, model, sentinel, and cost discipline.
Canvas-specific AI toolset for editing Grida design documents — scene-graph operations, specialized inserts, canvas exec/lint/format, and resource lookup.
Key terms and concepts from the Chromium compositor (cc/) and display
The Chromium compositor (cc/) is responsible for taking painted content
Research findings from reading the Chromium source code (chromium/cc/,
The scheduler orchestrates the frame production pipeline. It receives
The damage tracking system computes the region of each render surface that
Research document describing the mechanisms Chromium/Blink uses to mark
How Chromium optimizes visual effects (blur, shadow, opacity, blend modes,
How Chromium maintains smooth interaction (scroll, pinch-zoom) even when
Memory Budget
How Chromium stores and accesses per-node data across its rendering
How Chromium records, stores, and replays paint operations. The recording
Complete mechanism for how Chromium handles pinch-to-zoom: the immediate
The compositor stores layer properties (transform, effect, clip, scroll) in
A render surface is an offscreen GPU texture that a subtree is composited
Source-level analysis of how Chromium handles rendering resolution during
How Blink's SVG implementation is split across third_party/blink/renderer/.
Two mechanisms that create cross-tree references in SVG:
How SVG elements appear in Blink's accessibility (AX) tree. SVG is a
Every animatable SVG attribute is exposed to JavaScript as an
How Blink animates SVG. Two engines coexist: SMIL (`, `,
Three deployment modes for SVG content, with differing pipelines:
How Blink resolves and applies the clip-path property — both the SVG
How Blink tracks transforms and coordinate spaces from the outer `` down
How Blink resolves a point in SVG space to an element. Different from HTML
A snapshot of which Blink rendering systems SVG participates in fully, which
How fill="url(#id)" and stroke="url(#id)" resolve into Skia shaders. Paint
How ` data becomes an SkPath`, and how SVG stroke properties
How Chromium handles the SVG `<pattern>` element. The pattern element is a
The end-to-end pipeline for rendering SVG inside Blink. Emphasis on where the
How Blink (Chromium's rendering engine) renders SVG. These documents describe
`, , , ` — the four non-paint-server
SVG text is the most intricate part of SVG rendering because it combines:
How `` is laid out and painted in Blink, with side-by-side notes on
Tiling Model
Deep dive into Chromium's tiling implementation from source (cc/).
FRD for clipboard support in the SVG editor: the payload is a standalone SVG document, not a private format. Specifies the two extraction operations (standalone payload vs in-document clone), the five kinds of context a lifted subtree leaves behind and the policy for each, command and history semantics, placement, transport ownership, and the paste-is-load trust model.
User intent representation. The multipart user-message shape, file references vs attachments, inline commands, mentions, editor context, attachment handling, and the lowering rules that turn what the user composes into what the model sees.
A working group draft describing the Grida ID Model (for CRDT) feature for the core engine.
CSS → Grida IR property mapping table and TODO tracker.
Chromium Blink's single source of truth for CSS property metadata, used as a reference for browser-grade CSS cascade implementation
Renderer-agnostic model for attaching glyphs (arrowheads, markers, ticks) to 2D paths using the path's local frame.
The developer experience contract. A canonical inspection format every implementor exposes, the export paths a session can be read through, what the inspection tool exposes, replay semantics, and the DX checklist a conforming implementation passes.
Grida Desktop is one host implementation of the agent RFC. These docs are delta-only — every fact here depends on the Electron + macOS/Linux/Windows host shape.
RFD for the open problem behind #775: NodeId is parse-ephemeral, so there is no reference that survives a load() — let alone an external rewrite of the file. Frames the gap, scopes the candidate identity contracts (positional path, id attribute, semantic anchor), and sets the promotion gate before any public API lands.
Editor features are a collection of Grida editor-specific behaviors and capabilities. These documents focus on practical implementation details over mathematical abstractions, emphasizing real-world UX and editor workflows.
Proposal for a typed per-node element IR that replaces tag-switch intent dispatch in @grida/svg-editor with capability-gated records, centralising round-trip invariants.
| feature id | status | description |
Question-and-answer index over the Agent System guide. Doubles as an entry point (read the question, jump to the page) and as a conformance test (if a Q cannot be answered from the RFC, the RFC owes a clarification). Answers are normative and derived from the linked page; they do not invent policy beyond what the guide says.
Scope: Fonts & Images (WASM / Embedded)
How Chromium/Blink and resvg/usvg implement `` —
How Blink (and resvg/usvg) decide which rectangle to repeat and
Implementation details behind importing .fig files into Grida.
| feature id | status | description |
| feature id | status | description |
Tracking docs for the Grida IR schema and how external formats map into it.
The bedrock the rest of the guide rests on. AI SDK v6 as the streaming substrate, directory-rooted execution, the locked fundamental tool set summary, sandbox placement, the watchdog, the case for web search as a special fundamental tool, and the cross-cutting invariants every implementation MUST honor.
How the locked fundamental-tool RFC lands in Grida. Naming map (RFC id → Grida id), backend adapter table, per-tool deviations Grida ships, and where each tool lives in the monorepo.
Grida-specific tool surfaces and bindings of the agent RFC. Fundamentals as Grida ships them, canvas tools (scene-graph search, specialized inserts, exec/lint/format, resource lookup), and image-generation tools.
| feature id | status | description | PRs |
The Grida IR is the in-memory scene graph used by all Grida rendering, layout, and editing pipelines. It is the single representation that CSS, HTML, SVG, and .grida files all target.
Investigation, bugs, and architectural lessons from a v1 hit-test implementation in @grida/svg-editor — input to the v2 hit-test architecture.
How HTML elements map to Grida IR nodes. For CSS property mapping, see css.md.
Renders HTML+CSS to a Skia Picture for opaque embedding on the canvas
The structure-and-semantics study that informs the Skia-backed SVG
| feature id | status | description |
This document proposes the philosophical basis for image manipulation tools that enable AI agents to generate, enhance, and transform visual content within the design canvas.
Hypothesis
Isolation Mode restricts what the renderer draws and hit-tests to a specific
| feature id | status | description | PRs |
Overview
Universal positioning, dimensions, layout management with anchors, flex and grid.
The session-lifecycle event channel — the small, multi-subscriber surface through which the core announces turn-started / turn-finished / approval-requested moments to consumers that are not the chat renderer. Why an event surface and not consumer-specific wiring, the event vocabulary and its fields, volatility and ordering semantics, the projection over the host wire, the notification consumer policy (focus gating, click-to-attend), and the boundary against a user-facing hooks system.
The agent server as a long-lived, discoverable local process — one server, many clients. The discovery contract (registration record, persistent credential, atomic publish, single-daemon convergence), the authenticated probe and protocol gate, connect-or-spawn, the browser exception, and the production shape.
A catalog of node and subtree properties where a zoom-aware Level-of-Detail
The MarkdownNode renders GFM (GitHub Flavored Markdown) directly to a Skia Picture using pulldown-cmark's event stream and Skia's textlayout::Paragraph API. No HTML/CSS pipeline is involved — the markdown source is walked and drawn in a single pass.
| feature id | status | description |
How user-plugged MCP servers and other external connectors compose with the locked tool set. The bulk problem, tool_search, OAuth, dynamic refresh, and trust policy for untrusted MCP servers.
| feature id | status | description | PRs |
| feature id | status | description | PRs |
Procedural fractal Perlin noise effects with SVG filter semantics
| feature id | status | description | PRs |
Core / Modeling
The storage layer. The three-table session schema, the save-on-chunk policy, the JSON-column discipline, the id strategy, and the event-log opt-in. SQLite is the default; the schema ports to any engine that supports JSON columns and indexed string keys.
Short record of the BYOK-only AgentHost contract decisions after removing the hosted grida-cloud provider.
Working group documents for Grida platform and infrastructure topics.
Defined term — the minimal partition of editable SVG elements such that every editing intent admits the same set of legal solutions within a class.
Why Grida Desktop runs a long-lived Node agent sidecar alongside Electron main, what it owns, and where the boundary sits. The three-process model and the AgentHost god class.
RFD for editing the non-path SVG shapes (rect, circle, ellipse, line, polyline, polygon) as vector geometry: native writeback while the tag can express the edit, promotion to <path> when it cannot — the timing, target, conic representation, and round-trip invariants that keep the conversion honest.
Per-side stroke widths for rectangular shapes (CSS border equivalent)
Reference sheet for estimating GPU render cost of 2D scene operations
Why Grida Desktop URL-loads grida.co/desktop/* instead of bundling editor source — the "one editor codebase, two hosts" doctrine and the path-scoped window.grida bridge.
Specification for rendering untrusted SVG documents inertly in the SVG editor: hardening is a projection choice at the rendering surface, never a mutation of the document model. Names the execution-vector inventory the projection must neutralize, the surface obligations that constrain the strategy, the inert-projection requirements, the named costs, and the residual risks left to the host.
A summary of optimization techniques for achieving high-performance
Fonts & Images (WASM / Embedded)
Purpose
The three runtime environments an agent can be hosted in — web (limited capabilities), cloud sandbox (ephemeral container/VM), and computer (the user's machine). How the locked tool set degrades and which capabilities each environment provides.
srt is named here as the reference sandbox implementation for the computer environment — the only mature, ready-to-go option matching the capability surface this guide describes. The protocol does not lock to it; implementors MAY substitute any equivalent.
| feature id | status | description | PRs |
This document describes the selection behavior for pointer interactions and
See also: the UX-narrative sibling spec at
How a session is born, grows, survives interruption, is compacted, rewound, or forked, and how it switches models per turn. The loop semantics, the chunk stream, the abort path, the run-state machine, the permission-scope layering, and the session-status back-channel.
Two layers of knowledge an agent reaches for beyond its tools. Skills (lazy, advertise-then-load, agent picks when relevant) and project instructions (eager, unconditional, the floor every session stands on).
How node-level opacity interacts with fill and stroke paint when they
How an agent delegates to a child running the same loop. Agent modes, the task tool, blocking vs background, recursion, permission inheritance (deny rules always win), inspectability, awareness, specialized subagents (title / summary / compaction), and plan/build mode as an opinionated pattern hosts may layer.
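The "deny rules always win" part of permission inheritance can be sketched as follows. This is only an illustration under stated assumptions: the `Rule` shape, the `resolvePermission` name, and the fallback to `"ask"` when no rule matches are all hypothetical and are not taken from the guide.

```typescript
// Hypothetical shapes for illustration; the guide's real rule format may differ.
type Decision = "allow" | "deny" | "ask";

interface Rule {
  tool: string;
  decision: Decision;
}

// Combine the parent's rules with the child's own rules for one tool call.
// A deny from either side wins over any allow, so a child can never
// grant itself something the parent refused.
function resolvePermission(
  parentRules: Rule[],
  childRules: Rule[],
  tool: string,
): Decision {
  const matching = [...parentRules, ...childRules].filter(
    (rule) => rule.tool === tool,
  );
  if (matching.some((rule) => rule.decision === "deny")) return "deny";
  if (matching.some((rule) => rule.decision === "allow")) return "allow";
  // Assumption: with no matching rule, fall back to asking the user.
  return "ask";
}
```

For example, a parent rule that denies `shell` still yields `"deny"` when the child carries its own `allow` rule for `shell`.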
Design note for the SVG editor's in-document subtree-clone operation: the second of the clipboard FRD's two extraction operations. Specifies the no-closure/no-shell verdict, verbatim-id collision semantics, placement and paint order, who moves during a clone-drag, the mid-drag modifier toggle, the one-undo-step history bracket, and the repeating-offset duplicate (⌘D remembers the translate delta).
Index of design notes for the @grida/svg-editor TypeScript SDK — element IR proposal, hit-test architecture, transform pipeline critique, Policy Class glossary.
Spec-grounded reference for the SVG element surface a graphical editor IR must expose: per-element geometry, presentation hooks, local frames, and round-trip hazards.
SVG → Grida IR property mapping and TODO tracker.
Pattern fills for SVG shapes — tiling a subtree as a repeating paint
A cross-engine comparison of how SVG rendering is factored across three
This document describes the testing methodology and tools used to evaluate SVG rendering accuracy in Grida Canvas.
Status: Active — describes the current import strategy.
Reference for SVG transform-attribute syntax, viewport / viewBox, and use-instance coordinate frames. Feeds an IR redesign that must decide whether to refuse or normalize rotation and pivot.
Current-state inventory of what @grida/svg-editor's public commands do today on each SVG element type — input to the IR redesign.
Why creating text in an SVG editor is click-to-place rather than drag-to-size, and why an empty text element is treated as a deletion.
Motivation
Motivation
The tool contract. The locked fundamental set, what every tool must self-describe, the result envelope, truncation, and how permissions are evaluated at the tool-call boundary.
Context dump on the transform pipeline in @grida/svg-editor — what's done, what's broken, what's load-bearing for the IR redesign.
A canvas-level organizational primitive for grouping design elements without participating in layout.
Anything that fires a turn besides a human typing in the compositor. Scheduled wakeups, external webhooks (CI / GitHub / generic), programmatic API calls, MCP-pushed events, and agent self-scheduled wakeups. Trigger envelope shape, queue semantics, interactive-vs-hosted execution, agent self-scheduling pattern, lifecycle bounds, and the boundary with background subagents.
The host states what happened; the client renders it. The turn-lifecycle wire vocabulary must carry the identity of the message the core actually fired and explicit started/finished/aborted transitions, so a client never infers which queued item became a real turn from its own optimistic mirror. The authority direction, the lifecycle contract, why reconstruction forks across consumers, and the migration from a state-only status channel.
The single point where competing demands to start a turn on one session are serialized, ordered, and drained. The ingestion model, the queued_at data shape, the run-state machine that drains the queue, the single-flight / FIFO / no-preemption invariants, the drop rules, restart behavior, and the core-vs-surface boundary that keeps the queue authoritative in the core.
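The single-flight, FIFO, and no-preemption invariants can be pictured with a minimal sketch. Only the `queued_at` field name comes from the summary above; `QueuedTurn`, `TurnQueue`, `startNext`, and `finish` are hypothetical names, and the drop rules and restart behavior are left out.

```typescript
// Hypothetical shapes for illustration; only `queued_at` is named by the page.
interface QueuedTurn {
  id: string;
  queued_at: number; // enqueue time in milliseconds (assumed unit)
  prompt: string;
}

class TurnQueue {
  private items: QueuedTurn[] = [];
  private running: QueuedTurn | null = null;

  // Ingestion only appends; it never interrupts the turn in flight.
  enqueue(turn: QueuedTurn): void {
    this.items.push(turn);
  }

  // Single-flight: returns null while a turn is running or the queue is empty.
  // FIFO: otherwise hands out the oldest queued turn.
  startNext(): QueuedTurn | null {
    if (this.running !== null) return null;
    const next = this.items.shift() ?? null;
    this.running = next;
    return next;
  }

  // Mark the running turn as finished so the next one may start.
  finish(id: string): void {
    if (this.running !== null && this.running.id === id) {
      this.running = null;
    }
  }

  get pending(): number {
    return this.items.length;
  }
}
```

Keeping `startNext` as the only way to begin a turn is what makes the queue, rather than any client, the authority on which request became a real turn.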
A working group draft describing the Unicode Coverage Tracker (UCT) feature for the core engine.
Summary
Survey of how the web platform and peer editors render untrusted SVG without executing author script: the script-execution vector inventory, allowlist sanitization (DOMPurify, tldraw), the secure static image mode, iframe sandboxing, parse-into-model editors (Figma, Penpot, Excalidraw), and what a host CSP does and does not neutralize.
What resvg's usvg IR normalises away and what an editor IR must refuse — comparative read informing the @grida/svg-editor IR proposal.
UX patterns that ride on top of the compositor and push back into the protocol. Queued sends, sidecar chat as ephemeral fork, and memory as a built-on-top layer. The compositor itself, file refs, attachments, mentions, commands, editor context, and the user-view-vs-model-view lowering rules live in compositor.md.
UX Surface documents give practical UX specifications for the Grida editor's surface interactions, selection, targeting, and related user interface behaviors.
see PR #408
Why Real WASM Benchmarking Matters
Welcome to the Grida Working Group documentation!