Arkor

The TypeScript framework for fine-tuning open-weight LLMs

Ship custom open-weight models the same way you ship your TypeScript app. Type-safe configs, a local Studio (web UI) to start and watch runs, and managed GPUs.

Quickstart  ·  Why Arkor  ·  Docs

日本語

Warning

Arkor is alpha. APIs change without notice. We're shipping in public, and feedback shapes what lands next.

Quickstart

```sh
pnpm create arkor my-arkor-app
cd my-arkor-app
pnpm dev
```

No signup required: arkor dev opens Studio, a local web UI at http://localhost:4000. On first launch it provisions a throwaway anonymous workspace so you can fire off a real training run right away.

Run arkor login --oauth later if you want to claim your work under an account.

Pick a template

The scaffolder asks which template you want. All three pair the same small open-weight base (unsloth/gemma-4-E4B-it) with a curated public dataset on HuggingFace, so the first run finishes in minutes.

| Template | Task | Example | Dataset | Est. training |
| --- | --- | --- | --- | --- |
| `triage` | Support triage | `"Can't log in"` → `{category: "auth", urgency: "high", summary: "...", nextAction: "..."}` | `arkorlab/triage-demo` | ~7 min |
| `translate` | Translation | `"パスワードを忘れました"` → `{translation: "I forgot my password", detectedLanguage: "ja"}` | `arkorlab/translate-demo` | ~7 min |
| `redaction` | PII redaction | `"Email john@x.com"` → `{redactedText: "Email [REDACTED]", redactedCount: 1, tags: ["EMAIL"]}` | `arkorlab/redaction-demo` | ~12 min |

Skip the prompt with pnpm create arkor my-arkor-app --template triage.

Why Arkor

Custom open-weight models are a real option today because of years of work in the Python ML ecosystem and the people and companies who built it out. Arkor stands on that foundation.

What we wanted, and didn't find, was a path that fits how TypeScript and Node developers already work: a workflow where fine-tuning, evaluation, and serving live in the same codebase as the product, with the same editor, types, and review flow.

Type-safe TypeScript configs instead of separate config files, and a local Studio for the dev loop.

The phrase we keep coming back to: ship the model the same way you ship the product. If that sounds right, you're the audience.

What works today

  • Fine-tune an open-weight LLM from one file. createTrainer({ model, dataset, lora, ... }) runs LoRA training on the base model you point it at.
  • Three curated templates that run end-to-end. triage, translate, and redaction pair the same Gemma 4 base with a public HuggingFace dataset and finish in minutes.
  • React to training in code, not in a dashboard. Lifecycle callbacks (onStarted, onLog, onCheckpoint, onCompleted, onFailed) fire as the run streams from the cloud, fully typed.
  • Sanity-check the model before the run finishes. Inside onCheckpoint, call infer({ messages }) against the model as it's being trained.
  • Watch the run in a local Studio. arkor dev opens a UI with a jobs list, live loss chart, log tail, and a Playground for chatting with your fine-tuned models.
  • Try it without an account. arkor dev boots straight into a fresh anonymous workspace. Run arkor login --oauth to start the Arkor Cloud OAuth (PKCE) flow and attach the work to your account.
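A checkpoint-time sanity check can be an ordinary function. A minimal sketch (helper name and shape are hypothetical, not part of Arkor's API) that validates a checkpoint's raw text parses into the expected triage shape:

```typescript
// Hypothetical helper for use inside an onCheckpoint callback:
// parse the model's raw text and verify it looks like a triage object.
type TriageLike = { category: string; urgency: string };

function parseTriage(raw: string): TriageLike | null {
  try {
    const obj = JSON.parse(raw);
    if (typeof obj.category === "string" && typeof obj.urgency === "string") {
      return obj as TriageLike;
    }
    return null;
  } catch {
    return null; // model emitted something that isn't valid JSON
  }
}

// Sketch of use inside onCheckpoint:
//   const res = await infer({ messages: [...] });
//   const parsed = parseTriage(await res.text());
//   if (!parsed) console.warn("checkpoint output not yet well-formed");
```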

What's coming next

Framework API

  • Generate synthetic training data from a small seed set.
  • Distillation-focused templates that pair compatible teacher and student models.
  • Templates aimed at small, on-device models (WebGPU, mobile).

SDK and CLI

  • Train on a local GPU. Today every run goes to Arkor's managed GPUs.
  • Bring your own dataset from a JSONL file. Today, any HuggingFace name and any blob URL (with optional auth token) already work.
  • More base models beyond Gemma 4.
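The dataset options that already work today can be sketched as a discriminated union. This is a hypothetical reconstruction from the bullet above (the `"url"` discriminant and field names are assumptions; the actual type lives in the SDK):

```typescript
// Hypothetical sketch of the dataset configs the README describes:
// a HuggingFace dataset by name, or a blob URL with an optional auth token.
type DatasetConfig =
  | { type: "huggingface"; name: string }
  | { type: "url"; url: string; authToken?: string };

const hf: DatasetConfig = { type: "huggingface", name: "arkorlab/triage-demo" };

const blob: DatasetConfig = {
  type: "url",
  url: "https://example.com/data.jsonl",
  authToken: "token-placeholder", // optional; omit for public blobs
};
```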

Studio

  • Download a trained model as a file and run it on your own machine or deploy target. Today, runs stay on Arkor's managed inference.
  • Surface the dry-run option in the UI for fast smoke tests.

Other

  • Self-host the training backend. Today we host it.
  • A real documentation site. Sources live in docs/; the published site is docs.arkor.ai.

A taste of the API

```ts
// src/arkor/trainer.ts
import { createTrainer } from "arkor";

export const trainer = createTrainer({
  name: "support-bot-v1",
  model: "unsloth/gemma-4-E4B-it",
  dataset: { type: "huggingface", name: "arkorlab/triage-demo" },
  lora: { r: 16, alpha: 16 },
  maxSteps: 100,
  callbacks: {
    onLog: ({ step, loss }) => console.log(`step=${step} loss=${loss}`),
    onCheckpoint: async ({ step, infer }) => {
      const res = await infer({ messages: [{ role: "user", content: "Hello!" }] });
      console.log(`ckpt @ ${step}:`, await res.text());
    },
  },
});
```

```ts
// src/arkor/index.ts  ← discovered by `arkor dev` / `arkor build`
import { createArkor } from "arkor";
import { trainer } from "./trainer";

export const arkor = createArkor({ trainer });
```

src/arkor/index.ts is the file the CLI and Studio look for. Your trainer lives in a sibling file and is registered through createArkor.

What's in a project

```
my-arkor-app/
├── src/arkor/
│   ├── index.ts        # createArkor({ trainer })  ← discovered by the CLI / Studio
│   └── trainer.ts      # createTrainer({ ... })
├── arkor.config.ts
├── .arkor/             # state + build artifacts (gitignored)
└── package.json        # dev / build / start
```

CLI

| Command | Purpose |
| --- | --- |
| `arkor init` | Scaffold a new project in the current directory |
| `arkor login` / `arkor logout` / `arkor whoami` | Arkor Cloud OAuth (PKCE) / anonymous tokens |
| `arkor dev` | Launch the local Studio web UI |
| `arkor build` | Bundle `src/arkor/index.ts` to `.arkor/build/index.mjs` |
| `arkor start` | Run the build artifact (auto-builds when missing) |

pnpm dev resolves to arkor dev in scaffolded projects, so most workflows live behind that one command.

Architecture

arkor dev boots a Hono server on 127.0.0.1:4000 that serves a Vite + React SPA from the same origin.

The Studio server rejects non-loopback Host headers before serving HTML, and the SPA talks to your code via per-launch CSRF-token-gated /api/* routes; your code talks to the Arkor training backend over authenticated HTTPS.

Training runs on managed GPUs; checkpoints stream back as SSE events that fire your `callbacks.*` handlers in-process.
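Arkor handles this wiring for you, but as an illustration of the transport, a minimal SSE frame parser (not Arkor code; the event names the backend actually sends are not shown here) looks like:

```typescript
// Minimal illustration of parsing Server-Sent Events frames like the ones
// a training backend might stream back. Frames are separated by a blank
// line; each frame carries "event:" and "data:" fields.
type SseEvent = { event: string; data: string };

function parseSse(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of chunk.split("\n\n")) {
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```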

Repository

| Package | What it is |
| --- | --- |
| `arkor` | SDK + CLI + bundled local Studio |
| `create-arkor` | `pnpm create arkor` scaffolder |
| `docs` | Mintlify source for docs.arkor.ai (`pnpm --filter @arkor/docs docs:dev`) |

Requires Node.js 22.6+. (For contributing to this repository, please use Node.js 24, ideally the latest release.)

Works with pnpm / npm / yarn / bun.

We're shipping in public

Arkor is alpha, and the core idea (TypeScript-native fine-tuning for product engineers) is something we want to design with the people who'd use it. If that's you:

  • File an issue with the model + dataset + workflow you wish worked. We read everything.
  • Star the repo if you want updates as we move toward 0.1.
  • Join Discord for live discussion and early-access pings.

See CONTRIBUTING.md for development setup.

License

MIT