Skip to content

Latest commit

 

History

History
540 lines (401 loc) · 13.6 KB

File metadata and controls

540 lines (401 loc) · 13.6 KB

Capa Language Reference

Full specification of the syntax and semantics of the Capa language (current version). For a guided introduction, see tutorial.md. For the built-in APIs, see stdlib.md.


1. Lexical structure

1.1. Encoding

UTF-8 is required. Identifiers may contain any Unicode letter, digits, and _, but must start with a letter or _.

1.2. Comments

// Line comment (runs to the end of the line)

There are no block comments.

1.3. Indentation

Capa is indentation-sensitive, à la Python. Implicit INDENT/DEDENT/NEWLINE tokens are produced by the lexer:

  • Leading whitespace on a line defines its indentation level
  • Increase → INDENT
  • Decrease → DEDENT
  • End of line → NEWLINE
  • Inside (, [, {, NEWLINE is suppressed (implicit line continuation)

1.4. Implicit continuation by leading dot

For multi-line method chaining, a line beginning with . is treated as a continuation of the previous line:

let r = xs
    .filter(...)
    .map(...)
    .fold(...)

1.5. Keywords

fun let var if then elif else match while for in
break continue return import const type trait impl
true false and or not consume

1.6. Literals

Type Examples
Integer 42, -7, 0, 1_000_000
Float 3.14, 2.0, 1e10
String "hello", "a\nb", "x = ${x}"
Char 'a', '\n'
Bool true, false
List [1, 2, 3], []
Tuple (1, "a"), (x,), ()

1.7. Interpolated strings

${expr} inside a string literal is parsed as a Capa expression:

let n = 7
"value = ${n * 2}"  // "value = 14"
"len = ${xs.length()}"

$$ is the literal-$ escape. Nested string literals inside interpolation are not supported.


2. Type system

2.1. Primitive types

Int, Float, String, Bool, Char, Unit. See stdlib.md for details.

2.2. Compound types

Construct Syntax
List List<T>
Tuple (T1, T2, ..., Tn)
Function Fun(T1, T2) -> Ret
Map Map<K, V>
Set Set<T>

2.3. User-defined types

Structs:

type Person { name: String, age: Int }

Sum types (nominal variants):

type Shape =
    Circle(Float)
    Rectangle(Float, Float)
    Square(Float)

Variants may have zero or more payloads. Variants without a payload (type X = A) are constants, used without ().

2.4. Generics

Functions and types can take type parameters delimited by <>:

fun first<T>(xs: List<T>) -> Option<T>
    return xs.first()

type Pair<A, B> { first: A, second: B }

Local inference: the caller rarely needs to supply explicit args. first<Int>([1,2,3]) is equivalent to first([1,2,3]).

2.5. Cross-statement inference

let xs = [] produces List<TyVar>. The first use pins the type parameter:

let xs = []
xs.push(42)        // OK, infers List<Int>
xs.push("oops")    // error: expects Int, got String

TyVar sharing propagates through aliases (let ys = xs) and into calls to typed functions (process(xs) where process: List<Int> -> ...).

2.6. Compatibility

compatible(expected, actual) is structural with exceptions:

  • TyUnknown (an untyped expression) is compatible with any type
  • TyVar (inference placeholder) is compatible with any type

3. Statements

3.1. Bindings

let name = "Ana"               // immutable, type inferred
let age: Int = 30              // immutable, explicit type
var counter = 0                // mutable
counter = counter + 1          // assignment (only for var)

Pattern matching in bindings:

let (a, b) = pair()            // tuple destructuring
let Person { name, age } = p   // struct destructuring

3.2. Control flow

// if-statement
if cond
    body1
elif cond2
    body2
else
    body3

// while
while cond
    body

// for
for x in iter
    body

// match (statement)
match scrutinee
    pat1 -> body1
    pat2 -> body2

// match (expression, multi-line)
let r = match scrutinee
    pat1 -> expr1
    pat2 -> expr2

// match (expression, inline single-line)
let r = match scrutinee { pat1 -> expr1, pat2 -> expr2 }

// break / continue (only inside loops)
break
continue

// return
return                  // returns ()
return expr             // returns a value

3.3. Expressions as statements

Any expression can be a statement (value discarded):

stdio.println("hello")      // call with side effect
xs.push(42)                 // mutation
1 + 2                       // value discarded (valid but useless)

4. Expressions

4.1. Operators

In decreasing precedence:

Operator Description
() [] . Call, index, field access
not - Unary
* / % Multiplicative
+ - Additive
< <= > >= == != Comparison
and Short-circuit conjunction
or Short-circuit disjunction
? Try (Err propagation)

4.2. if as an expression

let cat = if cond then e1 else e2

The then keyword is the discriminator: without it, if is a statement. Only the ternary form is an expression; block-form if/elif/else always produces ().

When the branches need intermediate let bindings, use the block-as-expression form of match instead (see §4.3):

let watchlist = match opts.watchlist_path.is_empty()
    true  -> default_watchlist()
    false ->
        let loaded = load_watchlist(read_fs, opts.watchlist_path)?
        log.info("loaded ${loaded.length()} prefixes")
        loaded

4.3. match as an expression

match is the same production whether used as a statement or as an expression, the value is consumed in expression position and discarded in statement position. Two surface forms exist:

// Multi-line (indented arms, expression OR block body)
let r = match scrutinee
    pat1 -> expr1
    pat2 -> expr2

// Inline (single-line, comma-separated, expression body only)
let r = match scrutinee { pat1 -> expr1, pat2 -> expr2 }

Both forms accept guards and or-patterns. All arms must produce compatible types.

A block-body arm (multiple statements under a pattern -> line) produces a value when its final statement is a bare expression (block-as-expression, à la Rust):

let n = match key
    "fast" -> 1
    "slow" ->
        let base = compute()
        base * 2

If the block's final statement is not an expression (it ends in a let, var, assignment, return, etc.), the arm produces (); in that case, mixing it with a value-producing arm is a type error and the match must be used in statement position.

The inline form's { ... } opens immediately after the scrutinee. This collides syntactically with the struct-literal heuristic, to force a struct literal as the scrutinee, wrap it in parentheses:

match (Point { x: 1.0, y: 2.0 })
    Point { x, y } -> stdio.println("${x}, ${y}")

4.4. Lambdas

fun (x: Int) -> Int => x * 2                    // single-expression
fun (x: Int) -> Int =>                          // block body
    let y = x * 2
    return y + 1
fun () -> Int => 42                             // no params
fun (a: Int, b: Int) -> Int => a + b            // multiple params

Lambdas capture the lexical environment. If a single-line lambda contains a nested match, the transpiler automatically promotes it to a nested function.

4.5. The ? operator

Propagates Err in functions that return Result:

fun read_two(fs: Fs) -> Result<(String, String), IoError>
    let a = fs.read("a")?      // if Err, returns immediately
    let b = fs.read("b")?
    return Ok((a, b))

5. Pattern matching

5.1. Available patterns

Pattern Syntax Matches
Wildcard _ Any value
Identifier x Binds to x
Literal 42, "x", true Equality
Variant without payload None Singleton variant
Variant with payload Some(x), Ok(v) Match + bind
Struct Person { name, age } Match + bind fields
Tuple (a, b), (x, _, z) Tuple of the same arity
Or-pattern a | b | c Any alternative

5.2. Or-patterns with bindings

Each alternative can bind variables, provided all of them bind the same set of names with compatible types:

match op
    Add(n) | Sub(n) | Mul(n) -> n   // n is Int in all

5.3. Guards

match n
    x if x > 0 -> "positive"
    x if x < 0 -> "negative"
    _ -> "zero"

5.4. Exhaustiveness

The checker requires full coverage:

  • Sum types: every variant, or a catch-all _
  • Bool: both true and false, or a catch-all
  • Or-patterns count each alternative toward the count
type Color = Red | Green | Blue

match c
    Red -> "r"
    Green -> "g"
    // error: missing variant Blue

5.5. Type-parameter substitution

match m.get(k) where m: Map<String, Int> infers Some(n) with n: Int, not n: T. The owner's type parameters are substituted by the scrutinee's type arguments.


6. Capabilities

6.1. What they are

Capabilities are primitive types representing access to system resources (Stdio, Fs, Env, Clock, Random, Unsafe). They are only accessible via function parameters, there are no global instances.

6.2. The capability discipline (three layers)

Structural: capabilities cannot appear in struct fields, variant payloads, function return types, constants, let/var bindings, generic args, or tuples. They only flow through parameters. (Exception: a struct that impls a user-defined capability may hold built-in caps as fields - the "cap-bearing struct" relaxation.)

Flow:

  • No aliasing: the same capability cannot occupy two argument slots in a single call
  • Mandatory use: capability parameters must be used (or prefixed with _ to silence the warning)

Linear: the consume keyword indicates ownership transfer:

fun close(consume f: File)
    // f cannot be used after this call

"Consumed" variables are tracked across fork/merge in if/elif/ else and match. In loops, the analysis uses dry-run + redo to discover consumes in the first iteration.

6.3. Capability in the signature

fun main(stdio: Stdio, fs: Fs)            // multiple
fun pure(x: Int) -> Int                   // no capabilities (pure)
fun with_consume(consume cap: MyCap)      // ownership transfer

7. Imports

import util                     // sibling: ./util.capa
import sinks.csv_sink           // nested: ./sinks/csv_sink.capa
import capa_log.log             // package dep: <vendor_or_path>/capa_log/log.capa
import util as U                // alias the module name

Only items marked pub in the target module are visible to the importer. After import util, every pub name from util.capa is reachable directly (greet(...)), or by qualified call (util.greet(...)). With import util as U, qualified calls take the alias: U.greet(...).

7.1. Module resolution order

When the loader resolves import x.y, it tries each of the following search paths in order, and uses the first hit:

  1. The directory of the importing .capa file (sibling and nested-subdir imports work without any setup).
  2. Every directory in the CAPA_PATH environment variable.
  3. ./vendor/ when capa.toml declares at least one git dependency (populated by capa install).
  4. The parent of every path = "..." entry in capa.toml.
  5. ./libraries/ - conventional fallback for projects that vendor by hand.
  6. The directory of the root file passed to capa --run.

Each entry is deduplicated; a missing directory is silently skipped. See packages.md for the package manager's role in resolution.

7.2. Visibility

  • pub fun, pub type, pub const, pub capability: visible to importers.
  • Unprefixed declarations: module-private. An importer who tries to call them gets unresolved name 'foo'.

For Python interop, use the typed builtins py_import(unsafe, name) and py_invoke(unsafe, callable, args), both require the Unsafe capability. See stdlib.md.


8. The main program

The entry point is a function called main that may take one or more capabilities as parameters. The capabilities are instantiated by the runtime at boot:

fun main(stdio: Stdio, fs: Fs, env: Env)
    let argv = env.args()
    stdio.println("received ${argv.length()} arguments")

If main returns Result<(), E>, an Err causes a non-zero exit code.


9. Differences from Python

Capa transpiles to Python 3.10+, but the semantics differ:

Capa Python
Capabilities required for I/O Globals such as print, open
Types checked at compile time Duck typing
Exhaustive match checked match at runtime, no exhaustiveness
Or-patterns with consistent bindings Or-patterns without bindings
let x: List<Int> = [] valid Python equivalent has no checks
Mutation only with var or consume Everything mutable

10. Known limitations

  • String literals do not support multi-line (use \n for line breaks).
  • Nested string literals inside interpolation ("x ${"inner"} y") are not supported; bind the inner value to a let first.
  • Errors inside interpolation report positions starting from the file start; the offset has not yet been wired to the position inside the ${ ... } expression.
  • The module system is intentionally small: import, optional as alias, pub visibility. No re-exports, no star imports, no transitive dependency resolution at the language level (the package manager handles transitive fetch via capa.toml and capa install; see packages.md).
  • No asynchronous IO operations.
  • if/match in block-body lambdas needs => before the indented block.