An implementation plan for a Roblox Studio AI copilot with a Next.js backend and an LLM tool‑calling orchestrator, refined to mirror Cline’s step‑by‑step tool usage, context management, and diff‑first editing.
- Studio Plugin (Luau): docked chat + proposals, reads active editor state, previews diffs, applies edits inside ChangeHistoryService. Provider configuration is read from the backend
.env. - Next.js Backend (TypeScript):
/api/chat,/api/stream,/api/proposals/:id/apply,/api/assets/search, plus an orchestrator that exposes tools to the LLM across OpenRouter, Gemini, AWS Bedrock, and NVIDIA providers. - LLM Tool‑Calling: one‑tool‑per‑message, approval‑first workflow (like Cline). The model proposes proposals (edits/object ops/asset ops); only the plugin performs writes after user approval. Supports provider selection among OpenRouter, Gemini, AWS Bedrock (Claude / Titan / Llama variants), and NVIDIA (NIM / OpenAI-compatible endpoints). Streaming includes clear indicators:
orchestrator.start …,tool.parsed <name>,tool.valid <name>,tool.result <name>,proposals.mapped <name> count=N,context.request <reason>, anderror.*. The console prints[orch] provider.raw …for every raw reply.- Ask mode UI: a single transcript shows assistant text bubbles and compact tool chips. Completion proposals are echoed as assistant text; in Auto mode proposals auto-apply and render as chips only.
Recent prompt and orchestrator changes aim to make geometry-first scene building smoother:
- The system prompt now nudges the model to plan before acting, list current instances, and iterate with
create_instance/set_propertiesuntil visible results exist. Examples are illustrative only; they do not enforce specific content. The agent chooses relevant patterns (e.g., house) and must not introduce unrelated models (e.g., "Farm") unless explicitly requested. - Geometry intent is detected heuristically, but we no longer hard-block scripting; the prompt + example policy keeps the agent in direct manipulation mode unless the user explicitly requests reusable Luau.
- Raw provider payloads are logged to the terminal via
[orch] provider.raw …, which helps spot malformed XML/JSON immediately during runs. - Tool results are now re‑fed to the model as
usermessages (notsystem), so the provider “remembers” context tool outputs (e.g., list_children) and stops claiming the workspace is empty. Paths: apps/web/lib/orchestrator/index.ts:1531, 1553. - XML props tolerance: the orchestrator accepts
<props><Name>…</Name></props>(nested tags) in addition to strict JSON. These are converted into an object before schema validation. Path: apps/web/lib/orchestrator/index.ts:498. - Fenced JSON tolerance: ```json blocks are unwrapped before parsing so edits/arrays parse reliably. Path: apps/web/lib/orchestrator/index.ts:430–460.
- Auto‑create missing parents: when the model asks to create an Instance under
Workspace.<Name>and that parent path doesn’t exist yet in the snapshot, the orchestrator automatically prepends acreate_instance(Model)for the missing ancestor(s) undergame.Workspacebefore the requested child. This enables prompts like “Create WallNorth under Workspace.Base …” in a single step, without manual selection. Path: apps/web/lib/orchestrator/index.ts around thecreate_instancemapping.- Scope: only under Workspace; creates a chain of missing Models (e.g., Workspace.Base.Structures → creates Base, then Structures, then your child).
- Selection defaults still apply when no
parentPathis provided.
- Optional text‑before‑tool: you may enable
VECTOR_ALLOW_TEXT_BEFORE_TOOL=1to surface brief assistant prose before the single tool tag; setVECTOR_ENFORCE_TOOL_AT_END=1to warn on trailing prose after the tag.
Planning updates (detailed, tool‑oriented)
- The planner now expects a detailed, tool‑specific step list when work is non‑trivial. Steps should name the tool, the exact target (class/path/name), and the intention (e.g., position/size or script action).
- Typical plans are 8–15 steps, but there is no hard cap — the agent may use as many steps as needed to be clear and verifiable.
- Prefer assets first: plan search_assets → insert_asset for props/models, then arrange via set_properties; fall back to create_instance when catalog search is unavailable or insufficient.
- A Luau step is still required before completion (unless user opts out, e.g., “geometry only”).
- Tooltip helper hoisted; nil call removed. The auto‑toggle tooltip no longer throws on hover. File: plugin/src/main.server.lua around 1316–1396.
- Proposal card guards: accessing
.Visibleis now guarded for potentially missing controls in the completion branch to avoid nil errors in renderProposals.
GET /api/assets/searchis wired and defaults to Roblox catalog unlessCATALOG_API_URLis set to a proxy. Results are normalized to{ id, name, creator, type, thumbnailUrl }. Paths: apps/web/app/api/assets/search/route.ts, apps/web/lib/catalog/search.ts.- The plugin renders a result picker with Insert buttons that call
InsertService:LoadAsset(assetId)and report back via/api/proposals/:id/apply. - Orchestrator bias: prefers
search_assets→insert_assetover manualcreate_instancefor non‑primitive content; will pivot to manual creation whenCATALOG_DISABLE_SEARCH=1, no results, or insert errors are reported. - Fallback signals:
- Server apply route records a system hint encouraging manual creation on asset failures.
- Plugin auto‑continue appends a follow‑up message to build manually if insert failed (includes last error).
Use a detailed <start_plan> with one tool per step, naming the action and target. Example (military base outline):
<start_plan>
<steps>[
"Create Model 'MilitaryBase' under game.Workspace",
"Search assets query='watch tower' tags=['model'] limit=6",
"Insert asset <ID_FROM_RESULTS> under game.Workspace.MilitaryBase",
"Search assets query='barracks' tags=['model'] limit=6",
"Insert asset <ID_FROM_RESULTS> under game.Workspace.MilitaryBase",
"Search assets query='fence' tags=['model'] limit=6",
"Insert asset <ID_FROM_RESULTS> under game.Workspace.MilitaryBase",
"Set properties (Anchored, CFrame) to arrange towers, barracks, fence perimeter",
"Open or create Script 'BaseBuilder' in game.ServerScriptService",
"Show diff to add idempotent Luau that rebuilds the base"
]</steps>
</start_plan>
Notes
- Not capped: include as many steps as needed to place/arrange everything clearly.
- After inserts, use
set_propertiesto anchor and position visible parts so progress is obvious in Workspace. - If search/insert fails: continue with
create_instanceand place primitives, then add Luau before completing (unless opted out).
Use a short, strict, tool‑centric system prompt. Keep it deterministic and proposal‑first.
You are a Roblox Studio copilot. You read the active script, inspect selection,
and propose safe, undoable changes. You must use exactly ONE tool per message.
Wait for each tool result before the next step.
# Tool call format (XML-like):
<tool_name>
<param1>…</param1>
<param2>…</param2>
</tool_name>
# Available tools (see JSON schemas below)
- get_active_script()
- list_selection()
- list_open_documents(maxCount?)
- open_or_create_script(path, parentPath?, name?)
- start_plan(steps[])
- update_plan(completedStep?, nextStep?, notes?)
- list_children(parentPath, depth?, maxNodes?, classWhitelist?)
- get_properties(path, keys?, includeAllAttributes?, maxBytes?)
- show_diff(path, edits[])
- apply_edit(path, edits[])
- create_instance(className,parentPath,props)
- set_properties(path, props)
- search_assets(query,tags?,limit?)
- insert_asset(assetId,parentPath)
Parameter encoding
- For objects/arrays, the inner text MUST be strict JSON (double quotes; no trailing commas). Never wrap JSON in quotes. No code fences.
- Right: `<props>{"Name":"Part","Anchored":true}</props>`
- Wrong: `<props>"{\"Name\":\"Part\"}"</props>`
Rules:
- Never modify code or instances without an approved proposal.
- Keep context small; ask for more only when needed.
- Prefer minimal diffs and property deltas.
- Reference Roblox paths by canonical `GetFullName()`.
- Names should be alphanumeric/underscore only; avoid dots/slashes when creating new Instances.
- Existing special characters can be referenced with bracket segments: `game.Workspace["My.Part"]["Wall [A]"]`.
- Selection defaults: when a single instance is selected in Studio, infer missing `path`/`parentPath` for common tools (rename/set/delete/create/insert) using that selection; otherwise default to `game.Workspace`.
- Edit constraints: keep edits sorted and non‑overlapping; cap size (≤20 edits, ≤2000 inserted characters).
- Discovery defaults: when unsure where to act, use `list_open_documents`, `list_children`, `get_properties`, `list_code_definition_names`, and `search_files` before proposing edits.
- Each chat request now includes a bounded `scene` snapshot of `Workspace` (path/class/Name + basic props) so the orchestrator has a baseline for `list_children` / `get_properties`.
- Planning is mandatory: emit `<start_plan>` with an ordered step list before any action; use `<update_plan>` to mark progress/notes.
- Completion gate: every build that creates new geometry must also emit Luau code (Script/ModuleScript or repo `.lua/.luau`) so the structure can be rebuilt; the orchestrator blocks `<complete>` until that code exists.
### Embedded planning and examples
#### Examples policy (non-binding)
- Examples in the prompt are guidance only and must not be treated as commands.
- The agent should pick patterns that match the current goal or proceed without examples.
- Never introduce unrelated names/content from examples (e.g., do not create "FarmBuilder" when asked to build a house).
To keep `index.ts` concise, planner guidance and reference examples now live in:
- `apps/web/lib/orchestrator/prompts/examples.ts`
- `PLANNER_GUIDE`: high-level planning heuristics for decomposing user goals
- `QUALITY_CHECK_GUIDE`: checklist expectations that gate completion
- `EXAMPLE_HOUSE_SMALL`: container + floor + four walls + roof sequence
- `EXAMPLE_VEHICLE_CART`: chassis + wheel template
- `EXAMPLES_POLICY`: examples are illustrative and non-binding; never introduce unrelated content from examples
`index.ts` concatenates these strings onto `SYSTEM_PROMPT` at build time, so the model sees them, while we keep the source organized.
### Ask-mode text and transcript
- The system supports `message(text, phase)` where phase is `start|update|final` for streaming-style text.
- The orchestrator emits `assistant.start`, `assistant.update`, and `assistant.final` stream lines. `final_message(text)` remains a shorthand that maps to a completion and emits `assistant.final`.
- The plugin renders this as a normal assistant text bubble in the single transcript alongside compact tool chips.
Roblox typed values (wrappers)
- Use `__t` wrappers for structured types in `props`:
- Vector3 `{ "__t":"Vector3", "x":0, "y":1, "z":0 }`
- Vector2 `{ "__t":"Vector2", "x":0, "y":0 }`
- Color3 `{ "__t":"Color3", "r":1, "g":0.5, "b":0.25 }`
- UDim `{ "__t":"UDim", "scale":0, "offset":16 }`
- UDim2 `{ "__t":"UDim2", "x":{ "scale":0, "offset":0 }, "y":{ "scale":0, "offset":0 } }`
- CFrame `{ "__t":"CFrame", "comps": [x,y,z, r00,r01,r02, r10,r11,r12, r20,r21,r22] }`
- EnumItem `{ "__t":"EnumItem", "enum":"Enum.Material", "name":"Plastic" }`
- BrickColor `{ "__t":"BrickColor", "name":"Bright red" }`
- Instance ref `{ "__t":"Instance", "path":"game.ReplicatedStorage.Template" }`
- Attributes: prefix keys with `@`, e.g., `{ "@Health": 100 }`.
Optional: Plan vs Act modes. In Plan mode, gather info and outline a plan. In Act mode, execute tools one step at a time.
These are the server‑registered tools exposed to the LLM. Keep schemas strict and types narrow.
{
"tools": [
{
"name": "get_active_script",
"description": "Return the current active script path and editor text.",
"parameters": {"type": "object", "properties": {}, "required": []}
},
{
"name": "list_selection",
"description": "List current Selection with class/name/full path/properties.",
"parameters": {"type": "object", "properties": {}, "required": []}
},
{
"name": "list_open_documents",
"description": "Return a small list of open ScriptDocuments (path, optional ranges).",
"parameters": {"type": "object", "properties": {"maxCount": {"type": "integer", "minimum": 1, "maximum": 100}}, "required": []}
},
{
"name": "show_diff",
"description": "Preview text edits as a diff without applying.",
"parameters": {
"type": "object",
"properties": {
"path": {"type": "string"},
"edits": {
"type": "array",
"items": {
"type": "object",
"properties": {
"start": {"type": "object", "properties": {"line": {"type": "integer"}, "character": {"type": "integer"}}, "required": ["line", "character"]},
"end": {"type": "object", "properties": {"line": {"type": "integer"}, "character": {"type": "integer"}}, "required": ["line", "character"]},
"text": {"type": "string"}
},
"required": ["start", "end", "text"]
}
}
},
"required": ["path", "edits"]
}
},
{
"name": "apply_edit",
"description": "Apply text edits to a script (plugin will do UpdateSourceAsync under ChangeHistory).",
"parameters": {"$ref": "#/tools/3/parameters"}
},
{
"name": "create_instance",
"description": "Create an Instance and parent it.",
"parameters": {
"type": "object",
"properties": {
"className": {"type": "string"},
"parentPath": {"type": "string"},
"props": {"type": "object"}
},
"required": ["className", "parentPath"]
}
},
{
"name": "set_properties",
"description": "Set properties on an existing instance by path.",
"parameters": {
"type": "object",
"properties": {"path": {"type": "string"}, "props": {"type": "object"}},
"required": ["path", "props"]
}
},
{
"name": "rename_instance",
"description": "Rename an Instance by its full path.",
"parameters": {
"type": "object",
"properties": {"path": {"type": "string"}, "newName": {"type": "string"}},
"required": ["path", "newName"]
}
},
{
"name": "delete_instance",
"description": "Delete an Instance by its full path.",
"parameters": {
"type": "object",
"properties": {"path": {"type": "string"}},
"required": ["path"]
}
},
{
"name": "search_assets",
"description": "Search the Roblox catalog for assets (server-side).",
"parameters": {
"type": "object",
"properties": {"query": {"type": "string"}, "tags": {"type": "array", "items": {"type": "string"}}, "limit": {"type": "integer", "minimum": 1, "maximum": 50}},
"required": ["query"]
}
},
{
"name": "insert_asset",
"description": "Insert a catalog asset by ID under a parent path.",
"parameters": {
"type": "object",
"properties": {"assetId": {"type": "integer"}, "parentPath": {"type": "string"}},
"required": ["assetId"]
}
}
]
}show_diff and apply_edit share the same shape; show_diff never writes.
Plugin capture (Luau)
local Selection = game:GetService("Selection")
local ScriptEditorService = game:GetService("ScriptEditorService")
local StudioService = game:GetService("StudioService")
local function getActiveScriptContext()
local active = StudioService.ActiveScript
if not active then return nil end
return { path = active:GetFullName(), text = ScriptEditorService:GetEditorSource(active) }
end
local function getSelectionContext()
local out = {}
for _, inst in ipairs(Selection:Get()) do
table.insert(out, { className = inst.ClassName, name = inst.Name, path = inst:GetFullName() })
end
return out
endBackend receive
// app/api/chat/route.ts
export const runtime = 'nodejs'
import { z } from 'zod'
import { runLLM } from '@/lib/orchestrator'
const ChatSchema = z.object({
projectId: z.string(),
message: z.string(),
context: z.object({
activeScript: z.object({ path: z.string(), text: z.string() }).nullable().optional(),
selection: z.array(z.object({ className: z.string(), path: z.string() })).optional(),
openDocs: z.array(z.object({ path: z.string() })).optional(),
scene: z.object({
nodes: z.array(z.object({
path: z.string(),
className: z.string(),
name: z.string(),
parentPath: z.string().optional(),
props: z.record(z.unknown()).optional(),
})).optional(),
}).optional(),
}),
autoApply: z.boolean().optional()
})
export async function POST(req: Request) {
const input = ChatSchema.parse(await req.json())
const { proposals, taskState } = await runLLM(input) // proposals + TaskState snapshot
return Response.json({ proposals, taskState })
}Auto‑compaction (orchestrator)
- Track tokens (
tokensIn + tokensOut + cacheWrites + cacheReads) andcontextRequests. - If near the model window, summarise & truncate older turns (prefer keeping the latest + a summarized history). Store telemetry for debugging.
End‑to‑end
-
LLM returns proposals (e.g.,
type: 'edit' | 'object_op' | 'asset_op'). -
Plugin shows diffs/ops → Approve or Reject.
-
On approve: plugin wraps writes in
ChangeHistoryService:TryBeginRecording/FinishRecordingand performs:- Text edits:
ScriptEditorService:UpdateSourceAsyncwith a transform function. - Object ops:
Instance.new, set properties.
- Text edits:
-
Plugin reports the result back to
/api/proposals/:id/applyfor auditing.
Apply edit (Luau)
local ScriptEditorService = game:GetService("ScriptEditorService")
local ChangeHistoryService = game:GetService("ChangeHistoryService")
local function applyEdit(scriptInstance, transformFn)
if not ChangeHistoryService:TryBeginRecording("AI Edit", "AI Edit") then return false end
local ok, err = pcall(function()
ScriptEditorService:UpdateSourceAsync(scriptInstance, function(old)
return transformFn(old)
end)
end)
ChangeHistoryService:FinishRecording("AI Edit")
return ok, err
endUX details (Cline‑style)
- Always render a diff preview and require approval before saving.
- When rejected, reset the preview state (no write).
- One approved apply = one meaningful undo step.
Server (Next.js)
GET /api/assets/search?query=…&tags=…&limit=…— Catalog search (server‑side). Also accepts legacyqas alias forquery. Normalize results to{ id, name, creator, type, thumbnailUrl }.- Env:
CATALOG_API_URL(required for real results). If unset, returns stubbed results during local dev. - Env:
CATALOG_API_KEY(optional) — sent asAuthorization: Bearer <key>when present.
- Env:
- Optionally cache by
queryand tags.
Plugin (Luau)
local InsertService = game:GetService("InsertService")
local function insertAsset(assetId, parent)
local container = InsertService:LoadAsset(assetId) -- returns a Model
local model = container
local childModel = container and container:FindFirstChildOfClass("Model")
if childModel then model = childModel end
model.Parent = parent or workspace
return model
endLLM flow
search_assets(query, tags?)→ assistant proposes a short pick list.- On user choice, assistant emits
insert_asset(assetId,parentPath)→ plugin inserts → assistant follows withset_properties/apply_editto wire scripts.
Permissions
- The plugin calls your backend domain only (user approves once via the built‑in HTTP permission prompt). The backend talks to Roblox’s web APIs.
- Endpoint:
POST /api/assets/generate3dnow proxies to Meshy’s text-to-3D API. - Auth: uses
Authorization: Bearer <key>from the request header (sent by the Studio plugin) orMESHY_API_KEYenv on the backend. - Config:
MESHY_API_URL(defaulthttps://api.meshy.ai/openapi/v2/text-to-3d),MESHY_TIMEOUT_MS. - Plugin: Settings adds “Meshy API Key (3D Generation)” and sends it as a bearer header when enqueuing jobs.
- Response:
{ jobId, provider: 'meshy' }(job status/polling can be added next).
Note: For local dev, set MESHY_API_KEY in vector/apps/web/.env.local. Later we can add secure server-side storage for multi-user deployments.
- Plugin: default to short polling or long‑polling
/api/stream?cursor=…. If SSE is available and reliable in your environment, the plugin can use it; otherwise keep long‑poll. The stream store periodically cleans idle workflows to control memory. - Web dashboard: use SSE for token‑level streaming.
- Limit concurrent plugin HTTP requests (≤2) to avoid Studio’s in‑flight cap.
- Script writes only inside ChangeHistoryService recordings.
- ScriptEditorService only for reading/writing code (never touch
Script.Sourcedirectly). - Two prompts the first time: HTTP domain(s) and Script Modification.
- Dry‑run mode: preview diffs, no writes.
- Audit: store prompts, tool calls, proposals, final diffs, and outcomes.
/lib/orchestrator/
index.ts // provider-agnostic run()
providers/openai.ts // function-calling adapter
providers/anthropic.ts// tool-use adapter
context.ts // token tracking + summarization
proposals.ts // build proposals from tool calls
tools.ts // registry + zod schemas
Key responsibilities
- Load system prompt + tool registry.
- Maintain conversation context; summarise before overflow.
- Normalize tool calls → proposals (edit/object/asset).
- Persist proposals; expose to plugin.
- Chat: messages, partial streaming rows, status (generating/applying/failed).
- Context chips: Active Script, Selection, Open Docs (click to attach as tags into the next prompt).
- Proposals: diff/ops list with Apply, Apply & open file, Reject.
flowchart TD
subgraph Studio Plugin (Luau)
A[User asks] --> B[Collect context\nGetEditorSource + Selection]
B --> C[/api/chat]
J[Stream/Long-poll /api/stream] --> K[Render partial tokens]
C -->|proposals| D[Diff/Operations UI]
D --> E{Approve?}
E -- Yes: edits --> F[TryBeginRecording\nUpdateSourceAsync\nFinishRecording]
E -- Yes: assets --> G[InsertService:LoadAsset]
F --> H[Report success]
G --> H
E -- No --> I[Reject]
end
subgraph Backend (Next.js)
C --> L[LLM Orchestrator]
L --> M[System prompt + Tool registry]
M --> N[Model tool-call]
N --> O[Proposal Builder]
O --> C
P[/api/assets/search/] --> Q[Catalog APIs]
Q --> P --> L
H --> R[Mark proposal applied]
end
subgraph Context Manager
L --> S[Token accounting]
S --> T{Near window?}
T -- yes --> U[Summarize + truncate]
U --> L
end
M0 — Spike (1–2 days)
- Dock widget + toolbar; read active script + selection; POST
/api/chat; echo back.
M1 — Edits (1–2 weeks)
- Implement
show_diff+apply_editend‑to‑end with preview & single undo step. - Error surfaces for unsaved edits / Team Create conflicts.
M2 — Instances & properties (1–2 weeks)
create_instance,set_propertieswith selection chips → apply.
M3 — Asset Store (1 week)
/api/assets/search+ plugin picker +insert_asset+ post‑configure script stubs.
M4 — Orchestrator & Providers (1–2 weeks)
- Tool adapters for OpenAI/Anthropic; deterministic JSON parsing; auto‑compaction.
M5 — (Optional) Indexing
- Snapshot route + luau‑lsp diagnostics + embeddings for retrieval.
- Localhost first (
http://127.0.0.1:3000) to trigger plugin HTTP permission. - Concurrency guard (≤2 in‑flight requests) in the plugin.
- Undo/redo: every approved apply == one undo step.
- Audit logs: verify tool calls/proposals/diffs persisted.
Next.js tool registry (zod)
// lib/tools.ts
import { z } from 'zod'
export const EditRange = z.object({ line: z.number(), character: z.number() })
export const Edit = z.object({ start: EditRange, end: EditRange, text: z.string() })
export const Tools = {
get_active_script: z.object({}),
list_selection: z.object({}),
list_open_documents: z.object({ maxCount: z.number().min(1).max(100).optional() }),
show_diff: z.object({ path: z.string(), edits: z.array(Edit) }),
apply_edit: z.object({ path: z.string(), edits: z.array(Edit) }),
create_instance: z.object({ className: z.string(), parentPath: z.string(), props: z.record(z.any()).optional() }),
set_properties: z.object({ path: z.string(), props: z.record(z.any()) }),
search_assets: z.object({ query: z.string(), tags: z.array(z.string()).optional(), limit: z.number().min(1).max(50).optional() }),
insert_asset: z.object({ assetId: z.number(), parentPath: z.string().optional() })
}Plugin request helper (Luau)
local HttpService = game:GetService("HttpService")
local function postJson(url, body)
return HttpService:RequestAsync({
Url = url,
Method = "POST",
Headers = { ["Content-Type"] = "application/json" },
Body = HttpService:JSONEncode(body)
})
end— end —
Below shows exactly where each part runs and who is allowed to edit the scene/code. The backend is the brain (planning, LLM tool orchestration, asset search/generation); the plugin is the hands (reads context, previews diffs, applies edits with undo/redo).
flowchart LR
subgraph User_Studio["Roblox Studio (user)"]
U[Developer types in chat] --> PUI[Plugin Chat UI]
end
subgraph Plugin["Studio Plugin (Luau)"]
PUI --> CTX[Collect Context<br/>GetEditorSource • Selection • Open Docs]
CTX --> REQ[HTTPS POST /api/chat]
DIFFS[Diffs / Ops Preview] --> APPROVE{Approve?}
APPROVE -- Yes: edits --> APPLY[ChangeHistoryService<br/>UpdateSourceAsync]
APPROVE -- Yes: assets --> INSERT[InsertService:LoadAsset → set_properties]
APPLY --> ACK[POST /api/proposals/:id/apply]
INSERT --> ACK
APPROVE -- No --> REJECT[Reject]
end
subgraph Backend["Next.js Orchestrator (Vercel)"]
ROUTES[/api/chat • /api/stream* • /api/assets/search • /api/proposals/:id/apply/]
ORCH[Tool‑Calling Orchestrator]
PROVIDERS[OpenRouter / OpenAI / Anthropic]
STORE[(DB • Cache • Audit Logs)]
GEN3D[/api/assets/generate3d]
GPU[GPU 3D Asset Gen Service]
end
subgraph Roblox_APIs["Roblox APIs"]
CATALOG[Catalog/Marketplace Search]
OPENCLOUD[Open Cloud Upload → assetId]
end
REQ --> ROUTES
ROUTES --> ORCH --> PROVIDERS
ORCH --> STORE
ORCH -->|proposals JSON| DIFFS
GEN3D --> GPU --> OPENCLOUD --> ORCH
ROUTES --> CATALOG
%% Notes: * Studio plugins cannot use SSE/WebSockets. Use short/long‑polling.
Key points
- Only the plugin can edit Roblox code/scene. The backend never writes to the place; it returns proposals (diffs, instance/property ops).
- First call to your domain (e.g. Vercel) triggers Studio’s HTTP domain permission prompt; first write triggers Script Modification permission.
- Streaming inside Studio: use long‑polling (
/api/stream). A separate SSE route (/api/stream/sse) is available for dashboards. - Asset flows: search via backend → propose →
InsertService:LoadAsset(assetId)in plugin → optionalset_properties. - 3D generation: backend calls GPU service → uploads via Open Cloud → returns
assetId→ plugin inserts.
Minimal tool call snippet (for clarity)
You are a Roblox Studio copilot. You read the active script, inspect selection,
and propose safe, undoable changes. You must use exactly ONE tool per message.
Wait for each tool result before the next step.
# Tool call format (XML‑like):
<tool_name>
<param1>…</param1>
<param2>…</param2>
</tool_name>
One‑tool‑per‑message. Always preview before apply. All writes are inside ChangeHistoryService.
You are a Roblox Studio copilot. You read the active script, inspect selection,
and propose safe, undoable changes. You must use exactly ONE tool per message.
Wait for each tool result before the next step.
# Tool call format (XML‑like):
<tool_name>
<param1>…</param1>
<param2>…</param2>
</tool_name>
| Tool | Purpose | Params | Returns | Runs where |
|---|---|---|---|---|
get_active_script() |
Read active editor tab content reliably | — | { path, text, isDirty } |
Plugin (Luau) |
list_open_documents(maxCount?) |
Enumerate open ScriptDocuments |
maxCount? |
[{ path, isDirty }] |
Plugin |
open_or_create_script(path,parentPath?,name?) |
Guarantee a Script exists and return its Source | path or parentPath+name |
{ path, text, created } |
Backend → proposal + plugin |
start_plan(steps[]) |
Declare ordered steps before acting | steps[] |
{ steps } |
Orchestrator |
update_plan(completedStep?,nextStep?,notes?) |
Track plan progress / notes | completedStep?, nextStep?, notes? |
{ completedStep?, nextStep?, notes? } |
Orchestrator |
list_selection() |
Current Selection metadata | — | [{ className, name, path, props }] |
Plugin |
list_children(path) |
Explore hierarchy under a node | path |
[{ className, name, path }] |
Plugin |
get_properties(path, keys?) |
Read specific props for an Instance | path, keys? |
{ key:value } |
Plugin |
get_properties and list_children now appear in the backend tool registry, so the provider can call them just like any other context helper. The Studio plugin still executes them, but the orchestrator serves responses from the cached TaskState scene graph (hydrated after each apply/object proposal) so retries stay deterministic and sandboxes remain read-only.
-
get_properties(path, keys?, includeAllAttributes?, maxBytes?)- Purpose: Read selected properties and/or attributes of an Instance at
path. - Params:
path(string, required) — canonicalGetFullName()pathkeys(string[], optional) — property names; entries beginning with@fetch attributesincludeAllAttributes(boolean, optional) — return every attribute when truemaxBytes(number, optional) — server clamp; defaults to 32768 bytes
- Returns: object map of requested fields with JSON-safe typed wrappers
- Constraints: respects
maxBytes; unknown properties are skipped instead of raising - Example:
<get_properties> <path>game.Workspace.Tree</path> <keys>["Position","Size","Anchored","@Health"]</keys> </get_properties>
- Purpose: Read selected properties and/or attributes of an Instance at
-
list_children(parentPath, depth?, maxNodes?, classWhitelist?)- Purpose: Explore a subtree under
parentPath, returning class/name/path triples. - Params mirror the schema;
classWhitelistexpects an object like{ "Part": true }. - Returns:
[{ path, name, className, children?[] }] - Constraints: respects depth/maxNodes caps; falls back to the cached scene graph when Studio hasn’t returned new data yet.
- Example:
<list_children> <parentPath>game.Workspace.House</parentPath> <depth>2</depth> <classWhitelist>{"Part":true,"Model":true}</classWhitelist> </list_children>
- Purpose: Explore a subtree under
| Tool | Purpose | Params | Returns | Runs where |
|---|---|---|---|---|
show_diff(path, edits[]) |
Render diff preview (no write) | path, edits[] line/char ranges |
{ previewId } |
Plugin |
apply_edit(path, edits[]) |
Apply text edits with undo | path, edits[] |
{ ok, beforeHash, afterHash } |
Plugin |
create_instance(className, parentPath, props) |
Create & parent | className, parentPath, props |
{ path } |
Plugin |
set_properties(path, props) |
Mutate props | path, {key:value} |
{ ok } |
Plugin |
rename_instance(path, newName) |
Rename | path, newName |
{ path } |
Plugin |
delete_instance(path) |
Delete safely | path |
{ ok } |
Plugin |
| Tool | Purpose | Params | Returns | Runs where |
|---|---|---|---|---|
search_assets(query, tags?, limit?) |
Search Roblox Catalog/Marketplace | query, tags?, limit? |
[{ assetId, name, type, thumbnailUrl }] |
Backend (Next.js) |
insert_asset(assetId, parentPath) |
Insert by id | assetId, parentPath |
{ insertedPaths[] } |
Plugin |
generate_asset_3d(prompt, tags?, style?, budget?) |
GPU 3D gen → Open Cloud upload | prompt, tags?, style?, budget? |
{ assetId, name, polyCount, thumbnailUrl } |
Backend + GPU |
| Tool | Purpose | Params | Returns | Runs where |
|---|---|---|---|---|
analyze_luau(snapshotId?, files?) |
Run luau-lsp analyze |
snapshotId? or files[] |
{ diagnostics[] } |
Backend |
run_tests(testPattern?) |
Run TestEZ in runner container | testPattern? |
{ passed, failed, reportUrl } |
Backend |
snapshot_project() |
Gzip editor state for analysis | — | { snapshotId, size } |
Plugin → Backend |
Mapping to Cline:
read_file→get_active_script;write_to_file/replace_in_file→apply_edit;list_files/search_files→list_children+list_selection;execute_command/browser_action→ backend jobs (analyze_luau,search_assets,generate_asset_3d).
The backend returns Proposals. The plugin formats them into the chat + a side panel (diff or op cards) and asks for Approve/Reject. After execution, it writes a compact status to the chat.
edit—{ path, diff:{mode:"rangeEDITS", edits:[{start:{line,character}, end:{line,character}, text}]}, notes? }object_op—ops:[ { op:"create_instance"|"set_properties"|"rename_instance"|"delete_instance", ... } ]asset_op—search:{ query,tags,limit }orinsert:{ assetId,parentPath }orgenerate3d:{ prompt,.. }
-
toolResult(proposalId, summary)→ message block with ✅ and a compact table of changes -
toolError(proposalId, error)→ 🔴 error with actionable hint -
toolDenied(proposalId, reason)→ 🚫 user rejected -
Specialized:
fileEditWithUserChanges(path, userEditsDiff)(if user tweaks the diff before save)formatAssetsList(results[])(grid of asset cards with thumbnails)
flowchart TB
subgraph ResponseTypes[Response Types]
A[Successful Tool Result] --> B[formatResponse.toolResult()]
C[Tool Error] --> D[formatResponse.toolError()]
E[User Denied Tool] --> F[formatResponse.toolDenied()]
G[Specialized Formats] --> H[formatResponse.fileEditWithUserChanges()]
G --> I[formatResponse.formatAssetsList()]
end
B --> J[User Message Content]
D --> J
F --> J
H --> J
I --> J
Chat examples
- After previewing an edit: “Proposed 2 line changes to
Workspace.Button.Script(replacing Touched with ProximityPrompt).” → [Approve] [Open Diff] - After apply: “✅ Applied edit to
Workspace.Button.Script. 1 undo step created.” - Asset search: grid of up to 12 items with thumbnail, name,
Insertbuttons.
-- get_active_script()
local StudioService = game:GetService("StudioService")
local ScriptEditorService = game:GetService("ScriptEditorService")
return function()
local s = StudioService.ActiveScript
if not s then return { path = nil, text = nil, isDirty = false } end
local ok, src = pcall(function() return ScriptEditorService:GetEditorSource(s) end)
return { path = s:GetFullName(), text = ok and src or "", isDirty = true }
end-- apply_edit(path, edits[]) (single waypoint)
local ChangeHistoryService = game:GetService("ChangeHistoryService")
local ScriptEditorService = game:GetService("ScriptEditorService")
return function(scriptInstance, edits)
if not ChangeHistoryService:TryBeginRecording("AI Edit", "AI Edit") then
return { ok = false, error = "Cannot start recording" }
end
local ok, err = pcall(function()
ScriptEditorService:UpdateSourceAsync(scriptInstance, function(old)
-- assume edits already merged server‑side to one final text
return edits.__finalText or old
end)
end)
ChangeHistoryService:FinishRecording("AI Edit")
return { ok = ok, error = err }
end-- insert_asset(assetId, parentPath)
local InsertService = game:GetService("InsertService")
return function(assetId, parent)
local model = InsertService:LoadAsset(assetId)
model.Parent = parent
return { insertedPaths = { model:GetFullName() } }
endNote: Production code should resolve
path → Instance, validate classes/props, and merge multiple text edits to final source beforeUpdateSourceAsync.
// app/api/chat/route.ts
export const runtime = 'nodejs'
import { z } from 'zod'
import { runLLM } from '@/lib/orchestrator'
const Chat = z.object({
projectId: z.string(),
message: z.string(),
context: z.object({
activeScript: z.object({ path: z.string(), text: z.string() }).nullable(),
selection: z.array(z.object({ className: z.string(), path: z.string() })).optional(),
}),
autoApply: z.boolean().optional()
})
export async function POST(req: Request) {
const input = Chat.parse(await req.json())
const { proposals, taskState } = await runLLM(input) // proposals + TaskState metadata
return Response.json({ proposals, taskState })
}// lib/tools/schemas.ts (JSON Schema-ish)
export const tools = {
apply_edit: {
params: {
path: 'string',
edits: [{ start: { line: 'int', character: 'int' }, end: { line: 'int', character: 'int' }, text: 'string' }],
},
},
create_instance: { params: { className: 'string', parentPath: 'string', props: 'record' } },
set_properties: { params: { path: 'string', props: 'record' } },
rename_instance: { params: { path: 'string', newName: 'string' } },
delete_instance: { params: { path: 'string' } },
search_assets: { params: { query: 'string', tags: ['string?'], limit: 'int?' } },
insert_asset: { params: { assetId: 'number', parentPath: 'string' } },
}// app/api/assets/search/route.ts
export async function GET(req: Request) {
const { searchParams } = new URL(req.url)
const query = searchParams.get('q') || ''
const limit = Number(searchParams.get('limit') || '8')
// Call Roblox catalog via your server integration
const results = await searchRobloxCatalog(query, limit)
return Response.json({ results })
}// app/api/assets/generate3d/route.ts (GPU job)
export async function POST(req: Request) {
const { prompt } = await req.json()
const jobId = await enqueue3DJob(prompt)
return Response.json({ jobId })
}You are Vector, a Roblox Studio copilot.
Operating mode
- Proposal-first: never write to code or instances directly. Always propose edits/ops; the plugin applies them only after user approval.
- One tool per message: emit exactly one tool call per assistant turn; wait for the tool result before the next step.
- Keep context small: request only the minimum context via tools (get_active_script, list_selection, list_open_documents, `list_code_definition_names`, `search_files`) when needed; prefer `@file/@folder/@url` mentions for explicit attachments.
Tool call format (XML-like; one per message)
<tool_name>
<param1>...</param1>
<param2>...</param2>
</tool_name>
Editing rules
- Scripts: propose edit with diff.mode="rangeEDITS" using zero-based {line,character} positions.
- Preview before apply: first request show_diff(path, edits[]); after user approval, request apply_edit(path, edits[]).
- Instances: propose object_op with minimal operations (create_instance, set_properties, rename_instance, delete_instance). Use canonical Instance:GetFullName() paths.
- Assets: search_assets -> present shortlist; on user choice -> insert_asset -> optionally set_properties.
Safety
- All writes occur in the plugin inside ChangeHistoryService recordings.
- Use ScriptEditorService APIs to read/write source (never touch Script.Source directly).
- Prefer minimal diffs and property deltas. Never perform broad rewrites without explicit instruction.
Good behavior
- Ask for missing context only when required.
- Reference paths and names precisely; avoid ambiguity.
- If a proposal could be large, stage it into multiple small, reviewable proposals.
Output expectations
- Return concise, actionable proposals:
- edit: { path, diff:{ mode:"rangeEDITS", edits:[...] }, notes? }
- object_op: { ops:[ { op:"...", ... } ], notes? }
- asset_op: { search:{...} | insert:{...}, notes? }
Plan vs Act (optional)
- Plan: outline a short plan and which tool you will call next.
- Act: issue one tool call, wait for result, then continue.
flowchart LR
subgraph TaskCoordination
Task --> ContextManager
Task --> ApiHandler
ApiHandler --> parseAssistantMessage
parseAssistantMessage --> ToolUseObjects
ToolUseObjects --> ToolExecutor
ToolExecutor --> UpdateUserMessageContent
UpdateUserMessageContent --> MessageStateHandler
end
subgraph State
TaskState --> didRejectTool
TaskState --> didAlreadyUseTool
TaskState --> consecutiveMistakeCount
end
Behavior
- Plan Mode (optional): multi‑step plan cards → user toggles to Act Mode
- Act Mode: one tool → preview → Approve/Reject → apply → audit
vector/
apps/
web/ # Next.js (Vercel)
app/api/chat/route.ts
app/api/assets/search/route.ts
app/api/assets/generate3d/route.ts
app/api/proposals/[id]/apply/route.ts
lib/orchestrator/
lib/tools/schemas.ts
dashboard/ # optional web UI with SSE
packages/
shared/ # zod schemas, types
orchestrator/ # provider adapters (OpenRouter/OpenAI/Anthropic/local)
diff/ # text edit merging utilities
plugin/ # Roblox Studio plugin (Luau)
src/
ui/
net/
tools/ # Luau handlers for tools above
services/
main.server.lua # toolbar + dock widget bootstrap
rojo.json
selene.toml
stylua.toml
infra/ # GPU jobs, queues, Open Cloud uploader
README.md
Vector.md
Style/linting
- Luau: Selene + Stylua
- TS: ESLint + Prettier
- First network call to your domain prompts HTTP permission.
- First write prompts Script Modification permission.
- All writes inside ChangeHistoryService for undo/redo.
- Asset insertion only from approved assetIds; surface thumbnails & creators.
- No Roblox web endpoints from plugin; all via your backend.
- The model emits a scaffold plan → proposals in stages (map → vehicles → roles → GUIs → gameplay scripts).
- Every stage is previewed and requires approval; nothing overwrites silently.
- Optional: long jobs (3D gen, analysis) return jobIds; plugin polls
/api/jobs/:id.
- Docked chat UI, context capture, HTTP permission flow — implemented
/api/chatround-trip producingeditandobject_opproposals — implemented- Diff preview + one-click Apply (single undo step) — full unified diff renderer implemented; shows context, additions/deletions, multiple hunks, with Open/Close Diff toggle
- Asset search + insert flow — implemented (server stub + plugin Browse + Insert)
- Audit log of proposals and outcomes — implemented with file-backed JSON durability and apply acknowledgements
Recent updates
- 2026-07: Hardened provider selection — deterministic chooser between OpenRouter, Gemini, AWS Bedrock, and NVIDIA (
VECTOR_DEFAULT_PROVIDER, override precedence, env fallbacks). Added Bedrock adapter (InvokeModel) & NVIDIA adapter (OpenAI-compatible). Gemini client safety checks (empty/safety finishes fail fast), sanitized catalog logging, and prompt guidance for manual scene creation when catalog search stubs out. - 2026-07: Shipped checkpoint snapshots/restore, diff3-powered multi-file apply + conflict hunks, and the TaskState-driven progress UI (per-message auto checkpoints, manual Snapshot/Restore buttons in the plugin, run badges, token telemetry, and streamed conflict previews).
- 2026-06: Composer UI rebuilt to match reference mockup (attachment chips, inline auto toggle, model selector, quick menu) and auto mode now auto-inserts catalog assets.
- 2026-06: Model override pipeline added (
modelOverridefrom plugin → API → orchestrator) with Gemini 2.5 Flash option in the Studio UI. - 2026-06: Implementation plan refreshed into Today/Next/This Week checklist (linting, atomic persistence, TaskState, auto-approval, checkpoints, diff upgrades, streaming, context, resilience, plugin UX, code tools).
- 2026-06: ESLint config checked in;
npm run lintnow useseslint --max-warnings=0and runs automatically before builds. - 2026-06: JSON persistence now journaled —
persist.tsreplays unapplied entries on boot and writes data via atomic temp-rename. - 2026-06: TaskState snapshots stored in
taskStates.json;/api/chatreturns{ proposals, taskState }with history, tool runs, and streaming status. - 2026-06: Mention attachments & code discovery — prompts can use
@file/@folder/@url/@problems, new read-only tools (list_code_definition_names,search_files) expose project structure, OpenRouter retries use exponential backoff, and missing-context tools trigger automaticCONTEXT_REQUESTretries. - Added zod-based validation of provider tool-call arguments before mapping to proposals
- Added Plan/Act scaffold: executes context tools locally and performs a second provider call for the next actionable tool
- Switched proposals store to file-backed JSON durability (apps/web/data/proposals.json)
- Catalog provider interface via CATALOG_API_URL — no stub fallback; explicit 502 errors on failure
- Server-side diff preview shaping (unified diff) added to edit proposals; plugin renders preview snippet
- Provider errors now surfaced (no fallback when provider is configured via Settings); errors bubble to plugin UI
Next focus (July 2026)
- Conversation summarisation & guardrails — compact long histories, add safety prompts, and expose conversation caps so the agent stays within token limits.
- Analysis/CI jobs and durable persistence — promote JSON stores to a durable DB layer, add
analyze_luau/run_testsworkflow steps, and surface job status in the progress panel. - UX polish — advanced chat shortcuts (Ask presets, slash commands), web dashboard for browsing proposals/history, and asset thumbnail proxying to complement the new progress HUD.
Decisions captured
- AI name: Vector (assistant name in UI and comments)
- LLM providers: OpenRouter (default), Gemini, AWS Bedrock, NVIDIA.
VECTOR_DEFAULT_PROVIDER, plugin overrides, and env keys decide which client runs each request; unavailable creds automatically skip a provider. - Package manager: npm
- Hosting: Local dev at http://127.0.0.1:3000 for now. We will move to Vercel (or similar) later.
- Domain permissions: local-only for now; will add hosted domain later when deploying.
- Plugin install workflow: Rojo during development → export .rbxm for small tester group → publish to Marketplace for public release.
- System prompt: refined Vector prompt (one-tool-per-message, proposal-first, minimal diffs).
Working now
- Web (Next.js + npm)
- Local app scaffold (Next 14) with npm scripts; landing page at /.
- API routes wired: /api/chat persists proposals to a file-backed store for auditing, returns
{ proposals, taskState }, and acceptsautoApplyto mirror the Studio Auto toggle; /api/proposals/[id]/apply records an audit event; /api/proposals (GET) lists stored proposals; /api/assets/search calls a Catalog provider (falls back to stub data ifCATALOG_API_URLunset); /api/assets/generate3d returns a stub jobId. - Provider tool-call parsing (OpenRouter) behind flags or per-request Settings: model outputs exactly one XML-like tool call which is parsed and mapped to proposals. Arguments validated with zod; when provider is configured, errors are surfaced (no fallback).
- Multi-turn Plan/Act: executes context tools locally (get_active_script, list_selection, list_open_documents,
list_code_definition_names,search_files), ingests@file/@folder/@url/@problemsmentions, auto-compacts history, and continues up toVECTOR_MAX_TURNS(default 4) until an actionable tool is emitted. When required context is missing, the orchestrator emits a singleCONTEXT_REQUEST …prompt before retrying. - Provider adapters present: openrouter.ts (call path), openai.ts (stub).
- Timeouts & retries: provider calls honor
OPENROUTER_TIMEOUT_MS(default 30000ms) and retry with backoff (OPENROUTER_MAX_RETRIES,OPENROUTER_RETRY_DELAY_MS,OPENROUTER_RETRY_MAX_MS); catalog fetch honorsCATALOG_TIMEOUT_MS(default 15000ms). - Streaming: shared event bus drives
/api/stream(long-poll) and/api/stream/sse(Server-Sent Events) so dashboards can mirror Cursor-style live updates. - Multi-file apply flow: diff3 merges run on
/api/proposals/[id]/apply, conflicts return structured hunks for the plugin, and automatic/manual checkpoints keep TaskState + workspace recoverable. npm run buildpasses;npm run lintrunseslint --max-warnings=0via the committed.eslintrc.jsonand is wired asprebuild.
- Plugin (Vector)
- Dock UI with input + Send. Sends context (active script + selection) to /api/chat and renders proposals.
- Approve/Reject per proposal; applies:
- edit proposals that insert text via rangeEDITS (merged, UpdateSourceAsync with undo)
- object_op rename_instance (ChangeHistoryService wrapped)
- Provider settings panel removed (env-driven). Backend reads provider creds from
.env.local. - Reports apply results back to /api/proposals/:id/apply for auditing.
- Diff preview + conflict UI: renders unified diffs by default and switches to a three-column conflict hunk viewer when diff3 detects overlaps.
- Sidebar UX (Cursor-like with mockup parity): attachment chips row, “Write, @ for context, / for commands” placeholder, ∞/Agent pill that doubles as the Auto toggle, model selector chip (server default vs
gemini-2.5-flash), image button exposing a Retry/Next quick menu, status panel with streaming updates, and proposals list below. - Progress HUD: TaskState-driven progress bar, run badges, checkpoint metadata, and Snapshot/Restore buttons wired to
/api/checkpoints; streaming telemetry lines update the UI in real time. - Auto mode applies edit/object ops and now auto-inserts catalog assets when proposals arrive with
meta.autoApproved === true; others remain for manual review, preventing risky unattended actions.
- Configure provider credentials in
vector/apps/web/.env.localinstead of Studio settings. Add only those you need: OpenRouter:OPENROUTER_API_KEY=OPENROUTER_MODEL=moonshotai/kimi-k2:freeVECTOR_USE_OPENROUTER=0Gemini:GEMINI_API_KEY=GEMINI_MODEL=gemini-2.5-flashGEMINI_API_BASE_URL=(optional)GEMINI_TIMEOUT_MS=30000AWS Bedrock:BEDROCK_REGION=us-east-1BEDROCK_MODEL_ID=anthropic.claude-3-sonnet-20240229-v1:0AWS_ACCESS_KEY_ID=(omit if using instance/role creds)AWS_SECRET_ACCESS_KEY=AWS_SESSION_TOKEN=(optional) NVIDIA:NVIDIA_API_KEY=NVIDIA_MODEL=llama-3.1-70b-instructNVIDIA_API_BASE_URL=(optional OpenAI-compatible base) Shared:VECTOR_DEFAULT_PROVIDER=openrouter(orgemini|bedrock|nvidia)OPENROUTER_MAX_RETRIES=3OPENROUTER_RETRY_DELAY_MS=1000OPENROUTER_RETRY_MAX_MS=10000VECTOR_WORKSPACE_ROOT=VECTOR_PROBLEMS_FILE=
- Transport & Security:
- The plugin does not send provider credentials; the backend reads
process.env. - The plugin calls your backend at
http://127.0.0.1:3000by default in dev. .gitignoreexcludes env files undervector/apps/weband the localdata/folder.
- The plugin does not send provider credentials; the backend reads
- Decision: fully dispatch through
src/tools/*.lua. The main plugin only parses proposals and routes to tool modules. - Flow: parse proposal → route to tool → tool wraps in
ChangeHistoryService→ one undo step per action. - Tools used:
create_instance,set_properties,rename_instance,delete_instance,apply_edit.
- A (Loop): sequential calls verified; 3×3 grid completes without manual fixes
- Status: Implemented. Orchestrator recognizes “grid 3x3” and returns 10 proposals (1 model + 9 parts) that apply sequentially.
- B (Dispatcher): every mutation creates one undo step; unknown tools error gracefully
- Status: Implemented. Plugin dispatches object ops via tool modules (
create_instance,set_properties,rename_instance,delete_instance), each ChangeHistory-wrapped. Unknown ops surface clear errors.
- Status: Implemented. Plugin dispatches object ops via tool modules (
- C (Edit safety): beforeHash conflicts block apply; re‑preview flow tested
- Status: Implemented. Server includes
safety.beforeHash(sha1); plugin validates and blocks on mismatch with a “re-open diff to refresh” hint.
- Status: Implemented. Server includes
- D (Streaming): long‑poll UI shows progress; at most one poller active
- Status: Implemented. Single long‑poller per workflowId appends status lines to the UI.
- E (Catalog): real results + thumbs; Insert works and reports paths
- Status: Partial. Real results require
CATALOG_API_URL; plugin renders thumbnails when present and reports inserted paths.
- Status: Partial. Real results require
- F (Persistence): workflows/audit survive redeploy; filtered listings exist
- Status: Implemented. File-backed proposals/workflows persist; new routes:
/api/proposals?projectId=…,/api/workflows,/api/workflows/:id.
- Status: Implemented. File-backed proposals/workflows persist; new routes:
- G (Templates): farming demo produces 15–20 atomic steps; user approves each
- Status: Implemented. “farming” returns ~16 proposals (Farm model, base, soil tiles) for step-by-step approval.
- H (Recovery/Resume): at least Retry + Ask flows implemented; resume works after restart
- Status: Partial. Resume via
workflowIdis supported; UI Retry/Ask buttons not yet implemented.
- Status: Partial. Resume via
- I (Tests/Metrics): scenario suite passes; latency and failure metrics visible
- Status: Not yet. Suggest adding basic latency/error metrics and a smoke test suite.
This is the final, intended behavior after all phases are implemented. The loop is single‑threaded and sequential: one provider call at a time, one tool per message, one approval per write.
- Kickoff — Create a new Workflow record with
status=planningwhen the user asks for a task (e.g., “make a 10×10 farming plot”). - Plan Next Tiny Step — Orchestrator calls the LLM once and asks for one atomic tool action (not many). Inputs include: user goal, Workflow ledger (steps/results so far), Working Set (facts & fingerprints), and available caches.
- Proposal — Backend returns a proposal:
edit,object_op, orasset_op, each minimal and undoable. - Approval Gate — Plugin renders a diff/op card and waits for Approve. No writes without explicit approval.
- Execute Safely — Plugin runs the single tool wrapped in
ChangeHistoryServiceto ensure one undo step. For script edits, verifybeforeHashto avoid stale diffs. - Acknowledge & Log — Plugin posts
/api/proposals/:id/applywith success/error, touched paths, and new fingerprints (e.g., afterHash). - Update Memory — Backend appends a Step to the Workflow ledger, updates caches keyed by fingerprints, emits a streaming status (“Step 3/20 applied: Created Workspace/FarmBase”).
- Loop — Orchestrator calls the LLM again with the updated Working Set and asks for the next atomic step. Repeat until
status=completedor the user pauses.
Invariants
- One provider call in flight.
- One tool per LLM message.
- One approval = one undo step.
- Every write is auditable and reversible.
Provide constraints + proofs, not dumps of the entire place:
- Goal recap (≤2 sentences).
- Recent steps (last 3–6): short lines like “(✓) create_instance Part →
Workspace/FarmBase(v=12)”. - Known world facts (<30 items): key paths, existence, hashes/versions. Example:
ReplicatedStorage/CropSystemexists (hash91f…), last edited at Step 7.Workspace/FarmBaseexists (versionv12), size[80,1,80].
- Cache facts (with proofs): “File tree snapshot
S123is current (placeHashPH1); no changes since.” - Outstanding subgoal: “Create 10×10 soil tiles under
Workspace/FarmBase.” - Rules: “Use one tool per message. Do not re‑read unchanged resources; rely on the cache facts unless a fingerprint changed.”
The next turn “knows what happened” because the backend maintains two layers of memory and exposes them to the model in a compact way.
For each approved step, persist:
- Attempted action: tool name + normalized args.
- Result: success/error, brief details.
- Resources touched: canonical paths.
- Fingerprints: e.g., script
beforeHash/afterHash, instanceversionor key‑props hash. - New facts: discoveries (“Found
ReplicatedStorage/CropSystem”).
This ledger powers audit, recovery, and resume.
Cache expensive read‑only tool results against fingerprints:
- Key shape:
toolName|argsNormalized|fingerprint(s)- e.g.,
list_children|Workspace|placeHash=PH1 - e.g.,
get_active_script|path=ReplicatedStorage/CropSystem|hash=91f…
- e.g.,
- On next turn: If fingerprints match, serve the cached result and tell the model you did: “Used cached
list_children(Workspace)from snapshot S123; unchanged.” - Invalidation: If any dependent resource changes (new hash/version) or placeHash changes, invalidate related keys. Optional “stale‑while‑revalidate” with TTLs.
Where to store: Start with Postgres (durable). Add Redis/KV later if you need lower latency or cross‑instance streaming.
- Scripts: full‑text hash (the same
beforeHash/afterHashused for edit safety). - Instances: light hash or version counter of
{ClassName, Name, select properties}. - Place snapshot (optional):
placeHashthat changes when any tracked resource changes (good broad invalidator).
You can ship without Redis. Use it later if scale demands it.
- Postgres (must‑have): workflows, steps, proposals, audit, tool_cache.
- Long‑poll streaming: in‑memory per process is fine on a single instance.
- Rate limiting: simple per‑user counters in Postgres are OK to start.
Add when any of the following appear:
- Multiple backend instances need shared long‑poll streams (pub/sub).
- Distributed locks so only one worker advances a workflow at a time.
- Low‑latency caches for read‑only tools or catalog results.
- Strong rate limiting and idempotency keys.
- Background job queue (3D generation, analysis).
Decision matrix
- Single instance, few users → Postgres only.
- Autoscale + live streaming → add Redis (channels:
workflow:{id}:stream). - Exactly‑once execution → add Redis locking (
SETNXwith TTL). - Heavy catalog/search reuse → Redis cache with TTL + fallback to Postgres.
- Idempotency: proposals carry unique
proposalId; plugin ignores duplicate apply attempts. - De‑dupe: orchestrator refuses to generate an identical proposal unless a relevant fingerprint changed.
- Edit safety:
beforeHashmismatch blocks apply and asks for re‑preview. - Sequentiality: at most one provider call in flight per workflow; at most one mutating tool executing in the plugin.
- Summarization: context compaction happens automatically when near token budget; only summaries and the last few turns are sent to the model.
- Normalize args (paths canonicalized via
GetFullName()). - Include exact fingerprint fields that affect the result.
- Invalidate on: script
afterHashchange; instanceversionbump;placeHashchange. - TTL: 10–60 minutes for structure reads; shorter for code reads if editors are active.
- On cache hit: include a short “cache fact” line in the next Working Set.
- State rules: one tool per message; minimal diffs; do not re‑read unchanged.
- Provide only recent steps + cache facts with fingerprints.
- Ask for the smallest next action and explicit acceptance criteria.
- If validation failed last turn, paste the exact error into the new system context.
- Workflow (goal, status, timestamps).
- Steps (tool, args, result, resources touched, fingerprints).
- Proposals (rendered to user, pending/approved/rejected).
- Apply acks (success/failure + metadata).
- Tool cache entries (key, value, fingerprints, TTL).
- Audit events (who/when/what).
User: “Make me a farming game with a 10×10 plot and basic crop logic.”
Loop:
- Plan next step → Proposal:
create_instance(Part)→Workspace/FarmBase. - Approve → Apply → Step logged with new path + version.
- Plan next step → Proposal: generate soil grid (batched
create_instanceops). - Approve → Apply → Paths + versions recorded; placeHash updated.
- Plan next step → Proposal:
apply_editaddReplicatedStorage/CropSystem(ModuleScript).beforeHashrecorded. - Approve → Apply →
afterHashrecorded; cache invalidation for script reads. - Repeat until inventory, save/load, and prompts are in place → status=completed.
The model never re‑reads the file tree unless fingerprints changed; it relies on cache facts furnished in the Working Set each turn.
This appendix documents how Vector maintains chat memory and conversation state so the AI always knows what happened last, across long sessions and restarts. It mirrors Cline’s approach (dual histories, persistence, truncation, auto‑compact summarization, and a persistent Memory Bank).
- Preserve coherence in a sequential, single‑threaded loop (one tool per message, one request at a time).
- Survive long tasks by staying within the model’s context window via intelligent summarization.
- Persist state across restarts/updates and support resume.
- Give the LLM proof‑level facts (paths, hashes/versions) instead of repeating costly reads.
We keep two synchronized tracks per workflow:
- API Conversation History — the exact raw messages sent to/received from the provider (system + user/assistant turns), suitable for re‑prompting.
- Vector Messages — the enriched timeline shown in the UI (proposals, tool results, apply acks, summaries).
Per‑message metadata we persist:
workflowId,ulid/id,tsrole(system|user|assistant|tool|summary) andcontent(ordeltafor streams)proposalId|stepIdlink (if applicable)- Cost & size metrics for the last API request:
tokensIn,tokensOut,cacheWrites,cacheReads,totalCost conversationHistoryIndex→ index pointing to the corresponding API message at the time the enriched message was created (Cline‑style)conversationHistoryDeletedRange→ the window slice removed so we can reconstruct context state even after truncation
Why two tracks? We need (a) exact provider history for reproducibility and (b) richer, user‑facing events for the UI and audit. They drift by design but stay cross‑linked via indices.
Use Postgres tables (names illustrative):
api_messages(id, workflowId, idx, role, content, tokensIn, tokensOut, cacheWrites, cacheReads, createdAt)vector_messages(id, workflowId, ulid, ts, role, content, proposalId, stepId, conversationHistoryIndex, conversationHistoryDeletedRange, costJson)workflows,workflow_steps,proposals,audit(already defined earlier)- Optional:
summaries(id, workflowId, coversFromIdx, coversToIdx, tokens, text)for summarized blocks
Atomicity: write the vector message and update the API history pointer in the same transaction when possible.
Before each provider call we compute an updated prompt bundle:
- Keep: system prompt, tools list, last N user/assistant turns (e.g., 6), last K tool results (e.g., 3).
- If close to budget (use the previous call’s
tokensIn+tokensOut+cache*as a heuristic or model‑specific limit):- Run context optimizations (drop redundant “thinking” text, collapse verbose tool echoes).
- If still large, perform a Summarization step that produces a Summary Block covering an older span of conversation.
- Record/update
conversationHistoryDeletedRangeto reflect which indices were replaced by the summary.
Summary Block expectations: 800–1200 tokens; include (a) user goal, (b) accepted proposals, (c) tool results and fingerprints (paths + hashes/versions), (d) open blockers/next targets. Emit as a role: "summary" message and store in summaries and vector_messages.
- Triggered when crossing a percentage of the model’s window or when switching to a smaller‑context model.
- Produces a comprehensive, technical summary that replaces older verbatim history and preserves crucial facts (file paths, diffs applied, asset IDs, hashes, decisions).
- The UI shows a “Context summarized” event; costs counted like any other provider call.
A durable, human‑readable knowledge file (or DB row) that outlives any one conversation. Suggested file: apps/web/data/memory/<projectId>/MemoryBank.md (or a memory_bank table). Sections:
- Project Charter: scope, success criteria, constraints.
- Conventions: naming, style, directory layout (e.g., where ModuleScripts live).
- Key Entities & Paths: canonical
GetFullName()paths for important instances/scripts. - Decisions: ADR‑style bullets (“We use ProximityPrompt over Touched for X”).
- Focus/TODO Chain: the persistent task list for long‑horizon work.
Synchronization rules:
- After each Summarization or when a Decision is logged, update the Memory Bank.
- The Working Set sent to the model includes a short extract of Memory Bank items relevant to the current subtask.
- Cache key:
toolName|normalizedArgs|fingerprints. Examples:list_children|Workspace|placeHash=PH1get_properties|Workspace/FarmBase|version=v12get_active_script|ReplicatedStorage/CropSystem|hash=91f…
- On cache hit: don’t call the plugin; embed a cache fact in the Working Set (“Using cached result from S123; unchanged since PH1”).
- Invalidation: bump or delete entries when any dependent fingerprint changes (script
afterHash, instanceversion, orplaceHash). TTL recommended (10–60 min). - Storage: Postgres to start; add Redis/KV when scaling or multi‑instance streaming is required.
On restart or user “Resume,” reconstruct state by:
- Loading the Workflow and last Step statuses.
- Restoring API Conversation History up to the last request (including any Summary Blocks).
- Recreating the Working Set: last N turns + cache facts + Memory Bank extract.
- Continuing the loop from the next pending/approved step or planning a new step if none pending.
- Sequentiality: at most one provider call in flight per workflow; no parallel tool execution.
- One tool per message: reject multi‑tool emissions; instruct the model to retry with a single tool.
- Idempotency: proposal IDs are unique; duplicate applies are ignored in the plugin.
- Edit safety:
beforeHashmismatch blocks apply and prompts Re‑preview. - De‑duplication: skip generating an identical proposal unless a relevant fingerprint changed.
Each provider call includes:
- Goal recap + Current subgoal.
- Recent steps (3–6) in ultra‑compact lines.
- Known world facts (paths + hashes/versions).
- Cache facts (which reads are already fresh).
- Rules (single tool, minimal diffs, no redundant reads).
- Ask: “Propose the next single tool to advance the subgoal.”
Result: The model doesn’t waste calls re‑reading the codebase and always “knows what it did last,” because the orchestrator supplies a verifiable memory of actions and state.
- Web Orchestrator:
vector/apps/web/lib/orchestrator/index.tsmaps tool calls → proposals and streams status via/api/stream. - Tool Schemas (web):
vector/apps/web/lib/tools/schemas.tsdefines these tools:get_active_script,list_selection,list_open_documents,show_diff,apply_edit,create_instance,set_properties,rename_instance,delete_instance,search_assets,insert_asset,generate_asset_3d,list_code_definition_names,search_files. - Plugin Tools (Luau):
vector/plugin/src/tools/implements:get_active_script,list_selection,list_open_documents(placeholder),apply_edit,create_instance,set_properties,rename_instance,delete_instance,search_assets,insert_asset, plus extra read toolsget_properties,list_children. - API Routes (web):
POST /api/chat→ orchestrator → proposals persisted (lib/store/proposals.ts) and workflow steps (lib/store/workflows.ts).GET /api/stream→ long‑poll streaming of orchestrator logs.POST /api/proposals/:id/apply→ marks proposal applied (plugin reports applies here).GET /api/assets/search→ catalog proxy with stub fallback.POST /api/assets/generate3d→ stub job enqueue (returnsjobId).
Alignment notes
- Match: All “action” tools in schemas have corresponding plugin handlers, except
show_diff(web‑only preview tool, mapped to proposals). - Context reads:
get_propertiesandlist_childrenflow through the tool schema; the orchestrator answers from the cached scene graph so repeated calls stay deterministic. - Web‑only:
show_diffexists in schemas/orchestrator for diff preview; no plugin tool required because plugin renders diffs from proposals.
Duplicates (expected)
- Edit merge/diff: implemented in both TS (
lib/diff/rangeEdits.ts) and Luau (plugin/src/main.server.lua) because each runtime needs it. Not a wiring bug.
apps/web/lib/orchestrator/providers/openai.ts: placeholder adapter; not used. Use OpenRouter by setting env in the backend andVECTOR_USE_OPENROUTER=1.apps/web/app/api/assets/generate3d/route.ts: stub; returns ajobIdonly. No GPU backend yet.apps/web/lib/catalog/search.ts: real provider requiresCATALOG_API_URL(optionalCATALOG_API_KEY). If unset, serves stubbed results.plugin/src/tools/list_open_documents.lua: placeholder due to limited Studio APIs; returns only the active script.plugin/src/tools/get_properties.lua,plugin/src/tools/list_children.lua: run in Studio and feed the orchestrator’s scene snapshot; both are available to the LLM through the tool registry.
Optional next steps
- Expand scene snapshot streaming to include lightweight prop diffs so repeated
get_propertiesqueries stay fresh without extra fetches. - Implement real 3D generation backend or hide
generate_asset_3duntil ready.
- Prereqs: Node 18+, Roblox Studio, Rojo (optional for syncing).
- Backend (Next.js):
cd vector/apps/web && npm installnpm run dev(listens onhttp://127.0.0.1:3000)- Optional env:
OPENROUTER_API_KEY(if using provider)OPENROUTER_MODEL(default:moonshotai/kimi-k2:free)VECTOR_USE_OPENROUTER=1(forces provider use without plugin‑supplied key)CATALOG_API_URL(real catalog search) andCATALOG_API_KEY(optional)
- Plugin (Studio):
- Load the plugin source at
vector/plugin/src(via Rojo or manual install). - No Studio settings required. Start the backend (
npm run dev) and configure.env.local. - Use the chat dock: send a prompt with an active script open. With no provider, fallbacks will propose a safe edit (insert comment), simple rename, or instruct the provider to create the asset manually when catalog lookup is unavailable.
- Load the plugin source at
- Quick endpoint checks:
GET /api/assets/search?query=test&limit=3→ returns 3 stub items if no catalog provider configured.POST /api/assets/generate3dwith{"prompt":"cube"}→ returns ajobId.
- Orchestrator (
[orch]+ stream):- Start:
orchestrator.start provider=… mode=… - Provider calls:
provider.response turn=X chars=Y(and console[orch] provider.ok/error …) - Tool flow:
tool.parsed NAME,tool.valid NAME,tool.result NAME,proposals.mapped NAME count=N,context.request REASON.
- Start:
- Fallbacks:
fallback.edit,fallback.object,fallback.asset manual_required,fallback.asset disabled. - Validation:
search_assets.disabledemitted when the catalog proxy is missing so the model pivots to manual instance creation. - Chat route (
[chat]): request summary + proposals stored. - Assets routes (
[assets.search],[assets.generate3d]): query, counts, timings. - Catalog provider (
[catalog]): stub vs provider, timings, errors. - Proposals apply (
[proposals.apply]): id, workflow id, payload keys.
All of the above log to the Next.js terminal by default. The same high‑signal events also stream to the plugin via /api/stream so you can see the live turn‑by‑turn state.
- Shared (wired):
get_active_script,list_selection,list_open_documents(placeholder),list_children,get_properties,list_code_definition_names,search_files,apply_edit,create_instance,set_properties,rename_instance,delete_instance,search_assets,insert_asset. - Web‑only:
show_diff(proposal preview),generate_asset_3d(stubbed API; plugin calls endpoint).
No hard duplicates in a single runtime; mirrored implementations across TS/Luau are expected.