ThakeeNathees/orca


Important

Architectural Update: The core workflow engine is currently being re-engineered entirely in Go. After evaluating system constraints and scalability requirements, the dependency on LangGraph was deprecated in favor of a bespoke, scratch-built orchestration layer.

Please review the wip/core-workflow-refactor branch for the new custom implementation.

Orca

A DSL for AI agent orchestration.
Language for expressing agentic systems as declarative programs

Why Orca · Quick Start · How It Works · Contributing

Status: Work in Progress

Why Orca?

Frameworks like LangGraph, AutoGen, and CrewAI provide the primitives for building AI agent systems, but they require you to express agent configurations, tool bindings, and execution graphs in imperative Python code. A simple two-agent pipeline can easily span 100+ lines — importing modules, defining state schemas, instantiating models, writing node functions, wiring a graph, and compiling it.

Orca is a domain-specific language that lets you declare what your agent system looks like, and a compiler handles the rest. Think Terraform for AI agents — an HCL-inspired block syntax where you describe what exists, not how to wire it.

Orca

model claude {
  provider    = "anthropic"
  model_name  = "claude-opus-4.6"
  temperature = 0.3
}

agent researcher {
  model   = claude
  tools   = [builtins.web_search]
  persona = "You're a tech trends researcher"
}

agent writer {
  model   = claude
  persona = "You're a professional writer"
}

cron daily {
  schedule = "0 9 * * 1-5"
}

workflow search_and_write {
  daily -> researcher -> writer
}

Python

from langchain_anthropic import ChatAnthropic
from langchain_community.tools import TavilySearchResults
from langchain_core.messages import SystemMessage
from langgraph.graph import StateGraph, MessagesState

claude = ChatAnthropic(
    model="claude-sonnet-4-20250514",
    temperature=0.3,
)

search_tool = TavilySearchResults()
claude_with_tools = claude.bind_tools([search_tool])

def researcher(state: MessagesState):
    # Prepend the system prompt to the conversation so far.
    system = "Research topics thoroughly."
    messages = [SystemMessage(content=system)] + state["messages"]
    return {"messages": [claude_with_tools.invoke(messages)]}

def writer(state: MessagesState):
    # The writer has no tools, so it calls the plain model.
    system = "Write reports from research."
    messages = [SystemMessage(content=system)] + state["messages"]
    return {"messages": [claude.invoke(messages)]}

graph = StateGraph(MessagesState)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_edge("__start__", "researcher")
graph.add_edge("researcher", "writer")
graph.add_edge("writer", "__end__")
app = graph.compile()

# Cron trigger? You're on your own.
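
The Orca example's cron block uses the schedule "0 9 * * 1-5", which means 09:00 on Monday through Friday. As a rough illustration of what a hand-written trigger would have to evaluate, here is a minimal standard-library sketch of matching such an expression against a timestamp. The helper names (cron_matches, _field_matches) are hypothetical and are not part of Orca or LangGraph. The sketch handles only "*", single values, ranges, and comma lists, and it simply requires every field to match, which is simpler than the rules real cron implementations apply.

```python
from datetime import datetime


def _field_matches(field: str, value: int) -> bool:
    """Return True if one cron field ("*", "9", "1-5", "1,3") accepts value."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a five-field cron expression against a datetime (no step syntax)."""
    minute, hour, day, month, weekday = expr.split()
    # Cron counts weekdays from Sunday = 0; isoweekday() gives Mon = 1 .. Sun = 7.
    cron_weekday = when.isoweekday() % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )
```

A scheduler loop would call cron_matches once per minute and start the workflow whenever it returns True; production code would more likely rely on an existing scheduler than on a hand-rolled matcher like this one.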

Design Principles

Declarative over imperative. You describe the components of an agent system and their relationships. The compiler handles the code generation. No state schemas, no graph wiring, no boilerplate.

Convention over configuration. Sensible defaults for everything. A model block with just a provider should work. Only override what you need to customize.

Composability. Agents, tools, models, and workflows are independent blocks that compose freely. Build complex systems by combining simple, self-contained pieces.

Highly orthogonal syntax. The basic construct of Orca is the declarative block, whose parameters are key-value assignments that all follow the same predictable syntax.

Language/Framework independent. Orca is not a wrapper around LangGraph. It's a language with its own compiler, type system, and semantic analysis. The current backend targets LangGraph, but the architecture is designed for multiple backends (CrewAI, AutoGen, and others).

Quick Start

git clone https://github.com/ThakeeNathees/orca.git
cd orca/orca
make build

Create a file called main.orca:

model gemini {
  provider    = "google"
  model_name  = "gemini-2.5-flash"
}

agent math_expr_generator {
  model   = gemini
  persona = "Generate a simple math expression"
}

agent math_expr_solver {
  model         = gemini
  persona       = "Solve the given math expression"
  thinking      = true
  output_schema = number
}

workflow flow {
  math_expr_generator -> math_expr_solver
}

Run it:

orca run

This will automatically build your .orca files, generate the build/ directory, and run the resulting Python/LangGraph code.
(You can also use orca build to just generate the code without running.)
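
To make the workflow line more concrete: a chain such as math_expr_generator -> math_expr_solver can be read as a list of directed edges, much like the add_edge calls in the hand-written LangGraph comparison earlier. The sketch below only illustrates that reading. It is not the Orca compiler's implementation, and chain_to_edges is a hypothetical helper name.

```python
def chain_to_edges(chain: str) -> list[tuple[str, str]]:
    """Split "a -> b -> c" into [("a", "b"), ("b", "c")]."""
    nodes = [name.strip() for name in chain.split("->")]
    if any(not name for name in nodes):
        raise ValueError(f"empty node name in workflow chain: {chain!r}")
    # Pair each node with its successor to get the directed edges.
    return list(zip(nodes, nodes[1:]))


edges = chain_to_edges("math_expr_generator -> math_expr_solver")
```

Each resulting pair corresponds to one edge a graph-based backend would need to create.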

Key Features

  • Domain-agnostic DSL — the entire language is defined by schemas in bootstrap.orca; customize it to your domain by redefining block types and fields
  • Declarative syntax — models, agents, tools, workflows, and schemas in clean, readable HCL-like syntax
  • Type-safe by default — schemas and block references validated at compile time, not runtime
  • Constant folding — expressions that reduce to known values are evaluated at compile time and reused in generated code; where list or map access is folded, mistakes like out-of-range indices or missing keys surface as compile errors instead of runtime failures
  • Multi-agent workflows — agent chains, fan-out patterns, conditional routing, and complex orchestration graphs
  • Built-in triggers — cron schedules and HTTP webhooks for automated or event-driven workflows
  • Custom schemas — define strongly-typed, nested schemas with field descriptions and annotations
  • First-class functions — lambdas with type inference, closures, and higher-order functions
  • LangGraph backend — generates production-ready Python code using the battle-tested LangGraph framework
  • IDE support — language server with go-to-definition, hover hints, autocomplete, and diagnostics
  • Readable generated code — Python output is clean and debuggable, with source mapping back to .orca files
  • Token-efficient output — TODO: leverage TOON (Token-Oriented Object Notation) and Caveman for compact, efficient token representations in generated code and LLM prompts
  • Extensible — generated code is meant to be customized and extended for your use case
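
The constant-folding feature listed above can be pictured with a small sketch: when both the collection and the index are known at compile time, the access is evaluated immediately, so an out-of-range index or missing key becomes a compile error. This is an assumption-laden illustration in Python, not Orca's actual compiler code. The Const, Ref, and Index node types and the CompileError class are hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


class CompileError(Exception):
    """Raised when a folded expression is provably invalid."""


@dataclass(frozen=True)
class Const:
    value: object  # a literal: number, string, list, or dict


@dataclass(frozen=True)
class Ref:
    name: str  # a value that is only known at runtime


@dataclass(frozen=True)
class Index:
    target: "Expr"
    key: "Expr"


Expr = Union[Const, Ref, Index]


def fold(expr: Expr) -> Expr:
    """Reduce an expression to a Const where every operand is already known."""
    if isinstance(expr, Index):
        target, key = fold(expr.target), fold(expr.key)
        if isinstance(target, Const) and isinstance(key, Const):
            try:
                return Const(target.value[key.value])
            except (IndexError, KeyError, TypeError) as err:
                # Both sides are known, so the failure is certain: report it now.
                raise CompileError(f"invalid access {key.value!r}: {err!r}") from err
        return Index(target, key)  # not fully known: leave it for runtime
    return expr  # Const and Ref are already in simplest form
```

For example, fold(Index(Const([10, 20, 30]), Const(1))) returns Const(20), while the same access with Const(5) raises CompileError instead of failing when the generated program runs.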

Orca Studio

Orca Studio is an in-browser companion to the text-based language: a Next.js app that lays out Orca concepts — models, agents, tools, tasks, workflows, and the edges between them — on an interactive canvas.

Try it online: Orca Studio (GitHub Pages).

To run Studio locally:

cd studio
pnpm install
pnpm run dev

Editor Support

Orca ships with a VS Code extension that provides syntax highlighting, autocomplete, and go-to-definition for .orca files.

To install the extension locally without a marketplace install, symlink the extension directory into your editor's extensions folder from the repository root. Because it is a link, the editor picks up the latest changes from your checkout. For VS Code:

ln -s "$(pwd)/editor/vscode" ~/.vscode/extensions/orca-lang

For Cursor:

ln -s "$(pwd)/editor/vscode" ~/.cursor/extensions/orca-lang

After creating the symlink for your editor of choice, restart the editor to activate the extension.

Contributing

Test-driven development is enforced throughout the project: every new feature or change starts with a failing test case. See CLAUDE.md for development conventions and the detailed project structure.

Papers

Orca is accompanied by two research papers exploring declarative agent orchestration and intent compilation. Both are works in progress and live in the paper/ directory.

To build a paper’s PDF (LaTeX required), run make build from that paper’s folder. The PDF is written to out/main.pdf.

cd paper/agents-as-code && make build
cd paper/compiling-intent && make build

paper/agents-as-code presents Orca as a domain-specific language for AI agent orchestration. It describes the HCL-inspired block syntax, the four-stage compiler pipeline (lexer, Pratt parser, semantic analyzer, code generator), and the type system that enables static checking and editor integration via the Language Server Protocol. The compiler catches misconfigurations (undefined references, type mismatches, missing fields) at compile time, before any LLM is invoked.

paper/compiling-intent argues that the path to robust multi-agent systems is not smarter prompting but smarter engineering. It applies classical compiler design principles (parsing, semantic analysis, optimization, and code generation) to the problem of transforming natural language intent into executable agent graphs. The agentic compiler, itself written in Orca, takes natural language descriptions and generates valid .orca source files, which the Orca compiler then compiles into Python/LangGraph code. The result is a two-stage pipeline that isolates LLM non-determinism from deterministic compilation.
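
The compile-time checks mentioned above (undefined references, missing fields) can be sketched as a single pass over the declared blocks. This is a simplified illustration under stated assumptions, not the Orca semantic analyzer: the Block type, the analyze function, and the REQUIRED_FIELDS rule that an agent must name a model are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical rule table: fields each block kind must define.
REQUIRED_FIELDS = {"agent": {"model"}}


@dataclass
class Block:
    kind: str   # e.g. "model" or "agent"
    name: str
    refs: dict[str, str] = field(default_factory=dict)  # field name -> referenced block


def analyze(blocks: list[Block]) -> list[str]:
    """Return diagnostics for missing fields and undefined references."""
    defined = {block.name for block in blocks}
    diagnostics = []
    for block in blocks:
        missing = REQUIRED_FIELDS.get(block.kind, set()) - set(block.refs)
        for required in sorted(missing):
            diagnostics.append(f"{block.kind} {block.name}: missing field '{required}'")
        for field_name, target in block.refs.items():
            if target not in defined:
                diagnostics.append(
                    f"{block.kind} {block.name}: undefined reference '{target}' in '{field_name}'"
                )
    return diagnostics


diagnostics = analyze([
    Block("model", "claude"),
    Block("agent", "writer", {"model": "claude"}),
    Block("agent", "researcher", {"model": "gpt"}),
    Block("agent", "editor"),
])
```

Because the pass needs only the parsed declarations, every diagnostic is available before any model call is made, which is the property the paper description emphasizes.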

License

MIT