Merged
8 changes: 8 additions & 0 deletions docs/.vitepress/config.ts
@@ -33,8 +33,10 @@ export default defineConfig({
text: 'Introduction',
items: [
{ text: 'What is GemStack?', link: '/guide/' },
{ text: 'When to Use GemStack', link: '/guide/when-to-use' },
{ text: 'Installation', link: '/guide/installation' },
{ text: 'Your First Agent', link: '/guide/first-agent' },
{ text: 'Build a Multi-Agent App', link: '/guide/tutorial' },
],
},
{
@@ -48,6 +50,12 @@ export default defineConfig({
{ text: 'mcp', link: '/packages/mcp' },
],
},
{
text: 'Project',
items: [
{ text: 'Contributing & Graduation', link: '/guide/contributing' },
],
},
],

'/packages/': [
37 changes: 37 additions & 0 deletions docs/guide/contributing.md
@@ -0,0 +1,37 @@
# Contributing & Graduation

GemStack is shared, community-governed infrastructure built in the open with the [Vike](https://vike.dev) team. This page explains how the project grows and how to get involved.

## The graduation model

GemStack does not grow by bulk-moving a framework's package set in. Packages join one at a time, by **graduating**: a package earns a place under the `@gemstack/` scope when it proves framework-agnostic value, not when it is merely useful to one framework.

In practice a package graduates when it is:

- **Framework-agnostic.** It runs in any `fetch`-capable JS runtime and does not depend on a specific web framework, ORM, or UI library. Anything framework-specific stays in that framework's own binding.
- **Neutral about infrastructure.** Persistence, caching, and storage are expressed as contracts the caller implements, with in-memory defaults for getting started. The package does not bundle a database or a queue.
- **Well-tested and documented.** It ships a real test suite and a guide here.
- **Composable.** It works on its own and composes cleanly with the rest of the family through the shared primitives (one `toolDefinition()` shape, one `Agent` base, one provider config).
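
As a rough illustration of the "neutral about infrastructure" criterion, here is a minimal sketch of a contract with an in-memory default. The names `KeyValueStore`, `InMemoryKeyValueStore`, and `rememberLastTopic` are hypothetical and chosen for this example; they are not GemStack's actual contracts.

```typescript
// Hypothetical contract: the package defines the shape, the caller supplies the storage.
interface KeyValueStore {
  get(key: string): Promise<string | undefined>
  set(key: string, value: string): Promise<void>
  delete(key: string): Promise<void>
}

// In-memory default for getting started; nothing survives a process restart.
class InMemoryKeyValueStore implements KeyValueStore {
  private readonly entries = new Map<string, string>()

  async get(key: string): Promise<string | undefined> {
    return this.entries.get(key)
  }

  async set(key: string, value: string): Promise<void> {
    this.entries.set(key, value)
  }

  async delete(key: string): Promise<void> {
    this.entries.delete(key)
  }
}

// Package code depends only on the contract, so any implementation can be swapped in.
async function rememberLastTopic(store: KeyValueStore, userId: string, topic: string): Promise<void> {
  await store.set(`last-topic:${userId}`, topic)
}
```

A caller with a real database or cache would pass its own `KeyValueStore` implementation to the same function; the package itself never imports a driver.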

The AI engine is the worked example: it was spun out of Rudder's `@rudderjs/ai`, decoupled from every framework binding, and re-versioned as [`@gemstack/ai-sdk`](/packages/ai-sdk/). The framework-specific pieces (an ORM-backed store set, a service provider, CLI scaffolders) stayed behind in the Rudder binding, which now re-exports the engine.

## Bindings vs the engine

A recurring shape in GemStack is **engine + binding**. The engine is the framework-agnostic core that lives here. A binding is a thin, framework-specific package that re-exports the engine and wires it into one framework's conventions (its container, config, ORM, and CLI).

If you maintain a framework, the path is: depend on the GemStack engine, implement its neutral contracts against your framework's infrastructure, and ship that as your own binding. Your users keep importing from your package; the engine stays shared.
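
The binding path can be sketched as an adapter. Everything below is an assumption made for illustration: `ConversationStore` stands in for an engine contract, `MessagesTable` stands in for a framework's ORM, and `OrmConversationStore` is the piece a binding would ship.

```typescript
// Hypothetical engine-side contract; in practice it would be imported from the engine package.
interface ConversationStore {
  append(conversationId: string, text: string): Promise<void>
  history(conversationId: string): Promise<string[]>
}

// Stand-in for a framework's own ORM table (an assumption for this sketch).
interface MessageRow {
  conversationId: string
  text: string
}

class MessagesTable {
  private readonly rows: MessageRow[] = []

  insert(row: MessageRow): void {
    this.rows.push(row)
  }

  whereConversation(conversationId: string): MessageRow[] {
    return this.rows.filter((row) => row.conversationId === conversationId)
  }
}

// The binding: implements the neutral contract on top of the framework's infrastructure.
class OrmConversationStore implements ConversationStore {
  private readonly table: MessagesTable

  constructor(table: MessagesTable) {
    this.table = table
  }

  async append(conversationId: string, text: string): Promise<void> {
    this.table.insert({ conversationId, text })
  }

  async history(conversationId: string): Promise<string[]> {
    return this.table.whereConversation(conversationId).map((row) => row.text)
  }
}
```

The engine only ever sees `ConversationStore`, so the ORM dependency stays inside the binding package.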

## Ways to contribute

- **File issues.** Bugs, missing capabilities, and rough edges in any package. Reproductions and a clear "expected vs actual" make these actionable fast.
- **Improve the docs.** These guides live in the repo under `docs/`. Fixes and clarifications are welcome; run `pnpm --filter @gemstack/docs docs:build` before opening a PR to catch dead links.
- **Propose a graduation.** If you have a framework-agnostic package that fits the bar above, open an issue describing what it does and why it belongs in GemStack rather than in a single framework.
- **Build a binding.** Wire the engine into your framework and let us link it from here.

The repository, issues, and discussions live at [github.com/gemstack-land/gemstack](https://github.com/gemstack-land/gemstack).

## Next

- [What is GemStack?](/guide/) - the family and the design principles.
- [When to Use GemStack](/guide/when-to-use) - where it fits.
- [Packages overview](/packages/) - every package and how they compose.
3 changes: 3 additions & 0 deletions docs/guide/first-agent.md
@@ -78,10 +78,13 @@ The agent decides when to call the tool, validates the arguments against `inputS

| You want to… | Read |
|---|---|
| Compose tools, skills, and a multi-agent supervisor into one app | [Build a Multi-Agent App](/guide/tutorial) |
| Understand the agent loop, sub-agents, multi-step runs | [Agents](/packages/ai-sdk/agents) |
| Go deeper on tools, scoped tools, client tools, approval gates | [Tools](/packages/ai-sdk/tools) |
| Add more model providers | [Providers](/packages/ai-sdk/providers) |
| Stream tokens and tool progress to a UI | [Streaming](/packages/ai-sdk/streaming) |
| Get typed objects back instead of text | [Structured Output](/packages/ai-sdk/structured-output) |
| Persist conversations and give the agent memory | [Memory & Persistence](/packages/ai-sdk/memory) |
| Retrieval-augmented generation over your documents | [Vector Stores & RAG](/packages/ai-sdk/rag) |
| Test agents without hitting a real model | [Testing & Evals](/packages/ai-sdk/testing) |
| See the whole family and how the pieces fit | [Packages overview](/packages/) |
2 changes: 1 addition & 1 deletion docs/guide/index.md
@@ -33,7 +33,7 @@ mcp standalone MCP server framework agent-agnostic, not

- **Framework-agnostic core.** Every package runs in any `fetch`-capable JS runtime - Node, the browser, Electron, React Native. The agent runtime has zero static `node:*` imports in its main entry, and its only required runtime dependency is `zod`.
- **Neutral contracts, not bundled infrastructure.** Persistence (conversation history, user memory, budgets, suspended runs, generated-file storage) is defined as interfaces you implement against your own database, cache, or object store. In-memory defaults ship for getting started.
-- **One way to do a thing.** A single `tool()` shape, a single `Agent` base, a single provider config object - shared across the whole family.
+- **One way to do a thing.** A single `toolDefinition()` shape, a single `Agent` base, a single provider config object - shared across the whole family.
- **Graduated, not dumped.** GemStack grows by promoting packages that earn framework-agnostic standing, with the API settling toward `1.0` in the open.

## Where these came from
4 changes: 2 additions & 2 deletions docs/guide/installation.md
@@ -6,7 +6,7 @@ The agent runtime lives in [`@gemstack/ai-sdk`](/packages/ai-sdk/). Install it p
pnpm add @gemstack/ai-sdk

pnpm add @anthropic-ai/sdk # Anthropic (Claude)
-pnpm add openai # OpenAI (also OpenRouter / Mistral / DeepSeek / Groq / xAI / Ollama)
+pnpm add openai # OpenAI (also Azure / OpenRouter / Mistral / DeepSeek / Groq / xAI / Ollama)
pnpm add @google/genai # Google (Gemini)
pnpm add cohere-ai # Cohere (reranking + embeddings)
pnpm add @aws-sdk/client-bedrock-runtime # AWS Bedrock
@@ -24,7 +24,7 @@ import { AiRegistry, AnthropicProvider, OpenAIProvider, OllamaProvider } from '@

AiRegistry.register(new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY! }))
AiRegistry.register(new OpenAIProvider({ apiKey: process.env.OPENAI_API_KEY! }))
-AiRegistry.register(new OllamaProvider({ baseUrl: 'http://localhost:11434' }))
+AiRegistry.register(new OllamaProvider({ baseUrl: 'http://localhost:11434/v1' }))

AiRegistry.setDefault('anthropic/claude-sonnet-4-6')
```
181 changes: 181 additions & 0 deletions docs/guide/tutorial.md
@@ -0,0 +1,181 @@
# Build a Multi-Agent App

[Your First Agent](/guide/first-agent) ended with a single agent answering one prompt. Real work rarely fits one prompt: a research question fans out into several lines of inquiry that each want their own tools, a shared house style, and someone to plan the work and stitch the findings back together.

This tutorial builds that app, a small research assistant, by composing three GemStack packages:

- [`@gemstack/ai-sdk`](/packages/ai-sdk/agents) for tools and the agent loop,
- [`@gemstack/ai-skills`](/packages/ai-skills) to load a portable `SKILL.md` skill onto a worker,
- [`@gemstack/ai-autopilot`](/packages/ai-autopilot) to plan a task into subtasks, dispatch them to workers, and synthesize the result.

By the end you will have a `Supervisor` that breaks a research question into subtasks, runs each on a skill-equipped worker agent, and combines the answers. We finish with a short note on exposing the whole thing over MCP.

If you have not registered a provider yet, do that first (see [Installation](/guide/installation)).

## Register a provider

Every example assumes a default provider registered once at startup:

```ts
import { AiRegistry, AnthropicProvider } from '@gemstack/ai-sdk'

AiRegistry.register(new AnthropicProvider({ apiKey: process.env.ANTHROPIC_API_KEY! }))
AiRegistry.setDefault('anthropic/claude-sonnet-4-6')
```

With a default model set, agents do not need to declare one.

## Step 1: two tools the worker can call

A research worker needs to reach the web. We give it two tools with `toolDefinition(...)`: one to search, one to fetch a page. Each declares its input with Zod and attaches a `.server()` handler that the agent calls (see [Tools](/packages/ai-sdk/tools)). Swap the stubbed bodies for a real search API and HTTP client.

```ts
import { toolDefinition } from '@gemstack/ai-sdk'
import { z } from 'zod'

export const searchWeb = toolDefinition({
  name: 'search_web',
  description: 'Search the web and return the top matching result snippets',
  inputSchema: z.object({
    query: z.string().describe('The search query'),
    limit: z.number().int().min(1).max(10).default(5),
  }),
}).server(async ({ query, limit }) => {
  // `search` is a placeholder, not a real import: call your search provider here.
  return await search(query, limit) // -> [{ title, url, snippet }, ...]
})

export const fetchPage = toolDefinition({
  name: 'fetch_page',
  description: 'Fetch a URL and return its readable text content',
  inputSchema: z.object({ url: z.string().url() }),
}).server(async ({ url }) => {
  const res = await fetch(url)
  // Surface HTTP errors instead of handing an error page to the model.
  if (!res.ok) throw new Error(`fetch_page failed: ${res.status} ${res.statusText}`)
  return await res.text()
})
```

The agent decides when to call each tool, validates the arguments against `inputSchema` before your handler runs, and feeds the result back to the model on the next step.
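
To make the validate-then-call order concrete, here is a simplified sketch of a single tool-call step. It is not the ai-sdk implementation: `SketchTool`, `runToolCall`, and the hand-written `parse` function are hypothetical stand-ins, with `parse` playing the role a Zod schema plays in the real tools.

```typescript
// Hypothetical, simplified tool shape; the real tools declare inputs with Zod.
interface SketchTool<TInput, TOutput> {
  name: string
  parse(raw: unknown): TInput // throws on invalid input, like a schema parse
  handler(input: TInput): TOutput
}

interface ToolCallResult<TOutput> {
  tool: string
  output?: TOutput
  error?: string
}

const fetchPageSketch: SketchTool<{ url: string }, string> = {
  name: 'fetch_page',
  parse(raw) {
    if (typeof raw !== 'object' || raw === null) throw new Error('expected an object')
    const url = (raw as { url?: unknown }).url
    if (typeof url !== 'string' || !/^https?:\/\//.test(url)) {
      throw new Error('url must be an http(s) URL')
    }
    return { url }
  },
  // Stubbed body: returns a marker string instead of doing a real fetch.
  handler: ({ url }) => `contents of ${url}`,
}

// One tool-call step: validate the model's arguments, and only then run the handler.
function runToolCall<TInput, TOutput>(
  tool: SketchTool<TInput, TOutput>,
  rawArgs: unknown,
): ToolCallResult<TOutput> {
  let input: TInput
  try {
    input = tool.parse(rawArgs)
  } catch (err) {
    // Invalid arguments never reach the handler; the error goes back to the model.
    return { tool: tool.name, error: err instanceof Error ? err.message : String(err) }
  }
  return { tool: tool.name, output: tool.handler(input) }
}
```

Either branch produces a result the loop can hand back to the model on the next step, which is why a malformed call can be corrected rather than crashing the run.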

## Step 2: a skill for house style

Every worker should cite its sources the same way, and that convention should travel with the agent rather than being copy-pasted into each system prompt. That is exactly what a skill is: a portable folder of instructions (and optionally tools and resources) you compose onto an agent on demand.

Create `skills/citations/SKILL.md`. The YAML frontmatter is the manifest; the markdown body becomes extra system-prompt text:

```markdown
---
name: citations
description: Cite every claim with a source URL and never invent sources
trigger: answering a research question that draws on web sources
---

# Citations

When you state a fact drawn from a source, cite it inline with the page URL in
parentheses, like (https://example.com/article). Only cite pages you actually
fetched with `fetch_page`. If you could not verify a claim, say so plainly
instead of guessing. End your answer with a "Sources" list of the URLs you used.
```

This skill is instructions-only, so there is no build step to worry about. (A skill that ships tools co-locates them in a `tools.ts` that the loader imports from its compiled output; see the [compiled-output caveat](/packages/ai-skills) when you go that far.)
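
The manifest/body split described above can be pictured with a small parser. This is only a sketch under stated assumptions: `splitSkillFile` is a hypothetical helper that handles flat `key: value` frontmatter, and it says nothing about how `@gemstack/ai-skills` actually parses a skill (a real loader would use a YAML library).

```typescript
interface ParsedSkill {
  manifest: Record<string, string> // the frontmatter fields
  body: string                     // the markdown that becomes system-prompt text
}

// Illustrative parser for flat `key: value` frontmatter only.
function splitSkillFile(source: string): ParsedSkill {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source)
  if (match === null) throw new Error('SKILL.md must start with a --- frontmatter block')

  const manifest: Record<string, string> = {}
  for (const line of (match[1] ?? '').split(/\r?\n/)) {
    const colon = line.indexOf(':')
    if (colon === -1) continue // skip blank or malformed lines
    manifest[line.slice(0, colon).trim()] = line.slice(colon + 1).trim()
  }

  return { manifest, body: (match[2] ?? '').trim() }
}

const exampleSkill = splitSkillFile(
  '---\nname: citations\ndescription: Cite every claim with a source URL\n---\n\n# Citations\n\nCite inline.',
)
```

Here `exampleSkill.manifest.name` is `'citations'` and `exampleSkill.body` starts at the `# Citations` heading.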

Load it once at module init, since loading is async and the agent hooks are synchronous:

```ts
import { loadSkill } from '@gemstack/ai-skills'

const citations = await loadSkill('./skills/citations')
```

## Step 3: the worker agent

The worker is a `SkillfulAgent`. You declare your own identity in `baseInstructions()` and your own tools in `baseTools()`; the skills listed in `skills()` are merged in, with your own declarations winning on any name collision. Because research is multi-step (search, fetch, read, repeat), we give it a stop condition with `stepCountIs(...)`.

```ts
import { SkillfulAgent } from '@gemstack/ai-skills'
import { stepCountIs } from '@gemstack/ai-sdk'

class ResearchWorker extends SkillfulAgent {
  baseInstructions() {
    return 'You research a focused question using the web tools, then answer concisely.'
  }
  skills() { return [citations] } // adds the citation house style
  baseTools() { return [searchWeb, fetchPage] }
  stopWhen() { return stepCountIs(6) } // up to 6 tool-calling rounds
}
```

Override the `base*` hooks, not `instructions()` / `tools()`: those are sealed on `SkillfulAgent` and do the merge for you. Overriding them directly would drop the skill composition.
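
The merge rule (your own declarations win on a name collision) can be sketched as follows. `mergeTools` and `mergeInstructions` are hypothetical helpers written for this illustration, and the ordering of instructions (base text first, skill bodies after) is an assumption rather than documented `SkillfulAgent` behavior.

```typescript
interface NamedTool {
  name: string
  description: string
}

// Merge skill tools with the agent's own; on a name collision the agent's own tool wins.
function mergeTools(own: NamedTool[], fromSkills: NamedTool[]): NamedTool[] {
  const byName = new Map<string, NamedTool>()
  for (const tool of fromSkills) byName.set(tool.name, tool)
  for (const tool of own) byName.set(tool.name, tool) // overwrites a skill tool of the same name
  return Array.from(byName.values())
}

// Assumed ordering: the agent's identity first, then each skill body appended.
function mergeInstructions(base: string, skillBodies: string[]): string {
  return [base, ...skillBodies].join('\n\n')
}

const mergedTools = mergeTools(
  [{ name: 'search_web', description: 'own search tool' }],
  [
    { name: 'search_web', description: 'skill search tool' },
    { name: 'format_citation', description: 'skill citation tool' },
  ],
)
```

`mergedTools` ends up with two entries, and the `search_web` entry is the agent's own. Overriding a sealed hook directly would bypass exactly this step.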

You can run this worker on its own to sanity-check it before wiring up the supervisor:

```ts
const probe = await new ResearchWorker().prompt(
  'What problem did the original Transformer paper set out to solve?',
)
console.log(probe.text) // answer, with a Sources list, thanks to the skill
```

## Step 4: plan, dispatch, synthesize

Now the orchestration. A `Supervisor` takes three stages: a `plan` that decomposes the task into subtasks, the `workers` that run them, and a `synthesize` that combines the results. The planner and synthesizer are themselves ai-sdk agents, adapted with `agentPlanner(...)` and `agentSynthesizer(...)`.

```ts
import { Supervisor, agentPlanner, agentSynthesizer } from '@gemstack/ai-autopilot'
import { agent } from '@gemstack/ai-sdk'

const planner = agent(
  'You break a research question into a few independent sub-questions that can be researched in parallel.',
)

const editor = agent(
  'You combine several researched answers into one coherent, well-cited brief. Preserve every source URL.',
)

const supervisor = new Supervisor({
  plan: agentPlanner(planner), // LLM decomposition into subtasks
  workers: new ResearchWorker(), // every subtask runs on this worker
  synthesize: agentSynthesizer(editor), // LLM synthesis of the results
  concurrency: 3, // up to 3 workers in flight at once
  maxSubtasks: 5, // hard cap; a longer plan is trimmed
  budget: { maxTotalTokens: 200_000 }, // stop dispatching past this spend
  onEvent: (e) => console.log(e.type), // 'plan', 'dispatch-start', ...
})
```

`workers` here is a single agent, so each subtask runs on a fresh `ResearchWorker` prompt. When you want different subtasks handled by different specialists, pass a `Record<string, Agent>` instead and let the planner set each `subtask.worker` to route between them.
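
Routing by `subtask.worker` can be pictured as a keyed lookup. The sketch below is an assumption-laden illustration, not the `Supervisor` internals: `SketchSubtask`, `routeSubtask`, and the `fallback` parameter (used when the planner leaves `worker` unset) are all hypothetical.

```typescript
interface SketchSubtask {
  prompt: string
  worker?: string // key chosen by the planner, if any
}

// Pick the specialist for a subtask from a keyed pool of workers.
function routeSubtask<W>(workers: Record<string, W>, subtask: SketchSubtask, fallback: string): W {
  const key = subtask.worker ?? fallback
  const worker = workers[key]
  if (worker === undefined) throw new Error(`No worker registered under "${key}"`)
  return worker
}

// Plain objects stand in for agents here.
const specialists = {
  web: { label: 'web researcher' },
  papers: { label: 'paper reader' },
}

const routedToPapers = routeSubtask(specialists, { prompt: 'Summarize the paper', worker: 'papers' }, 'web')
const routedToDefault = routeSubtask(specialists, { prompt: 'Find recent coverage' }, 'web')
```

Failing loudly on an unknown key is a design choice of this sketch; it keeps a planner typo from silently sending work to the wrong specialist.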

## Step 5: run it

```ts
const run = await supervisor.run(
  'How did the Transformer architecture change machine translation, and what came after it?',
)

console.log(run.text) // the synthesized, cited brief
console.log(run.plan) // the subtasks that were executed
console.log(run.results) // one result per subtask: { text, ok, error?, usage }
console.log(run.usage) // aggregate token usage across dispatched subtasks
console.log(run.stoppedEarly) // true if a guardrail trimmed or halted the work
```

`run()` resolves to a `SupervisorRun`. A few properties worth leaning on:

- **`run.results`** is one entry per dispatched subtask, in plan order. A worker that throws becomes an `ok: false` result; its siblings still run, so one failed line of inquiry does not sink the whole report.
- **`run.usage`** aggregates token usage across the dispatched workers. (Planning and synthesis spend is not counted: those contracts return data, not usage.)
- **`run.stoppedEarly`** tells you a guardrail (the `maxSubtasks` cap or the token `budget`) cut the work short, so you can flag a partial answer.
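
The dispatch behavior listed above (results in plan order, failures isolated, a cap that trims the plan) can be sketched with a small bounded-concurrency runner. This is a hypothetical illustration, not the `Supervisor` implementation: `dispatchAll` and its option names are invented here, and the token budget is left out to keep the sketch short.

```typescript
interface SubtaskResult {
  text: string
  ok: boolean
  error?: string
}

interface DispatchOutcome {
  results: SubtaskResult[]
  stoppedEarly: boolean
}

async function dispatchAll(
  subtasks: string[],
  runWorker: (subtask: string) => Promise<string>,
  options: { concurrency: number; maxSubtasks: number },
): Promise<DispatchOutcome> {
  // Hard cap: a longer plan is trimmed, and the run is flagged as cut short.
  const planned = subtasks.slice(0, options.maxSubtasks)
  const queue = planned.map((prompt, index) => ({ prompt, index }))
  const results: SubtaskResult[] = []

  // Each lane pulls the next job until the queue is empty.
  async function lane(): Promise<void> {
    for (let job = queue.shift(); job !== undefined; job = queue.shift()) {
      try {
        results[job.index] = { text: await runWorker(job.prompt), ok: true }
      } catch (err) {
        // A failing worker becomes an ok: false result; its siblings keep running.
        const message = err instanceof Error ? err.message : String(err)
        results[job.index] = { text: '', ok: false, error: message }
      }
    }
  }

  const laneCount = Math.max(1, Math.min(options.concurrency, planned.length))
  await Promise.all(Array.from({ length: laneCount }, () => lane()))
  return { results, stoppedEarly: planned.length < subtasks.length }
}
```

Writing each result to `results[job.index]` keeps plan order even when workers finish out of order, which is what lets a caller line results up against the plan.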

That is the whole app: tools give a worker hands, a skill gives it a house style, and the supervisor plans the work, fans it out, and reassembles it.

## Optional: expose it over MCP

Once the supervisor works, you can publish it as a Model Context Protocol server so other agents and MCP-aware clients can call it as a tool. Wrap the run in a server tool and serve it with [`@gemstack/ai-mcp`](/packages/ai-mcp); the worker's own tools stay internal, and callers see one `research` capability. The package guide covers the server surface and transport options.

## See also

- [Tools](/packages/ai-sdk/tools) - `toolDefinition(...).server(...)`, streaming, approval, and scoped tools.
- [Running agents](/packages/ai-sdk/agents) - the agent loop, stop conditions, sub-agents, and suspend/resume.
- [`@gemstack/ai-skills`](/packages/ai-skills) - authoring, loading, and composing `SKILL.md` skills.
- [`@gemstack/ai-autopilot`](/packages/ai-autopilot) - the `Supervisor` topology and its guardrails.
- [`@gemstack/ai-mcp`](/packages/ai-mcp) - expose agents and tools over the Model Context Protocol.
36 changes: 36 additions & 0 deletions docs/guide/when-to-use.md
@@ -0,0 +1,36 @@
# When to Use GemStack

GemStack is a set of standalone, framework-agnostic packages for building AI applications in Node. This page is about fit: what GemStack is good at, where it deliberately stops, and how it differs from the tools you might reach for instead.

## Reach for GemStack when

- **You want a provider-agnostic agent runtime, not a provider SDK.** Define an agent once and swap Anthropic, OpenAI, Google, Ollama, and others by changing one model string. The tool loop, streaming, structured output, middleware, and a test fake come with it.
- **You are building on the server, in any stack.** GemStack is UI-agnostic and framework-agnostic. It runs in any `fetch`-capable runtime and ships no React/Vue/Svelte coupling, so it drops into an existing Express, Hono, Fastify, Nitro, or Rudder app without taking it over.
- **You need production concerns as first-class APIs.** Conversation persistence, cross-conversation user memory, token/cost budgets, prompt caching, sub-agent streaming with mid-run suspend, and an eval harness are part of the runtime, behind neutral contracts you implement against your own infrastructure.
- **You care about testing.** A full fake (`AiFake`) lets you assert on prompts, tool calls, and every modality without hitting a real model or spending a token.
- **You are working with MCP from either side.** Bridge an agent to remote MCP servers, expose an agent as a server, or author a standalone MCP server, all with first-party packages.
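
Swapping providers by model string, as in `'anthropic/claude-sonnet-4-6'`, can be pictured as a registry lookup on the part before the first slash. The sketch below is hypothetical: `SketchRegistry` and `SketchProvider` are illustration-only names and do not describe how `AiRegistry` is implemented.

```typescript
interface SketchProvider {
  name: string
  complete(model: string, prompt: string): string
}

class SketchRegistry {
  private readonly providers = new Map<string, SketchProvider>()

  register(provider: SketchProvider): void {
    this.providers.set(provider.name, provider)
  }

  // Split "provider/model" on the first slash and look the provider up by name.
  resolve(modelString: string): { provider: SketchProvider; model: string } {
    const slash = modelString.indexOf('/')
    if (slash <= 0) throw new Error(`Expected "provider/model", got "${modelString}"`)
    const name = modelString.slice(0, slash)
    const provider = this.providers.get(name)
    if (provider === undefined) throw new Error(`No provider registered as "${name}"`)
    return { provider, model: modelString.slice(slash + 1) }
  }
}

// Stub providers that echo their inputs instead of calling a real API.
const sketchRegistry = new SketchRegistry()
sketchRegistry.register({ name: 'anthropic', complete: (model, prompt) => `[anthropic:${model}] ${prompt}` })
sketchRegistry.register({ name: 'ollama', complete: (model, prompt) => `[ollama:${model}] ${prompt}` })

const resolved = sketchRegistry.resolve('anthropic/claude-sonnet-4-6')
const reply = resolved.provider.complete(resolved.model, 'hello')
```

Because the agent only holds the string, moving to another provider means changing `'anthropic/...'` to, say, `'ollama/...'` with no other code changes.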

## Look elsewhere when

- **You want a batteries-included frontend chat UI.** GemStack is a server runtime. It speaks the Vercel AI protocol (`toVercelResponse()`), so a frontend chat library can consume its stream, but it ships no `useChat`-style hooks of its own.
- **You only ever call one provider and want its native SDK.** If you are committed to a single vendor and want every bleeding-edge feature the day it ships, that vendor's own SDK will always be a release ahead of any abstraction.
- **You want a hosted platform.** GemStack is libraries you run, not a managed service with a dashboard and billing.

## How it differs from the usual suspects

| | GemStack | A provider SDK (e.g. one vendor's client) | A heavyweight agent framework |
|---|---|---|---|
| **Scope** | Agent runtime + skills + orchestration + MCP, as separate packages | One provider's API surface | Large, opinionated, many abstractions |
| **Providers** | Many, swap by model string | One | Many, via adapters |
| **Coupling** | Framework-agnostic, server-side, `zod` is the only hard dependency | None, but vendor-locked | Often heavy dependency graph |
| **Persistence** | Neutral contracts you implement (BYO database / cache / store) | None | Often bundled and opinionated |
| **Adopt incrementally** | Yes, take one package | N/A | Usually all-or-nothing |

The point is not that GemStack does the most. It is that each package does one thing, stays framework-agnostic, and composes with the others, so you can adopt a single piece without buying into a platform.

## Next

- [Installation](/guide/installation) - get the runtime and a provider running.
- [Your First Agent](/guide/first-agent) - the smallest end-to-end example.
- [Build a Multi-Agent App](/guide/tutorial) - compose tools, skills, and a supervisor.
- [Packages overview](/packages/) - the whole family and how the pieces fit.