Teddy

A data manipulation and analysis library for Zig, inspired by pandas and polars. Teddy provides type-safe, null-aware operations on structured tabular data with full control over memory allocation.

Features

Typed Series — Generic Series(T) supporting all integer types, f32, f64, bool, and String
DataFrame — Multi-column tabular structure backed by typed series
Null support — Validity bitmap (Apache Arrow model); null-aware aggregations, fill, and drop
Type casting — Three-tier system: castSafe (comptime-verified), cast (strict runtime), castLossy (failures → null)
Aggregations — sum, min, max, mean, stdDev — all skip nulls
GroupBy — Group by any column; aggregate with count, sum, mean, min, max, stdDev
Joins — Inner, left, right, and outer joins on a key column
Sorting & filtering — sort, filter, select, head, tail, slice, unique, valueCounts
I/O — CSV, JSON, and Parquet read/write
Memory safety — Explicit allocator everywhere; clear ownership contracts

Quick Start

const std = @import("std");
const df_mod = @import("dataframe");
const Dataframe = df_mod.Dataframe;
const Series = df_mod.Series;

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    // Build a DataFrame
    var df = try Dataframe.init(allocator);
    defer df.deinit();

    var age = try Series(i32).init(allocator);
    try age.rename("age");
    try age.append(25);
    try age.append(30);
    try age.appendNull(); // missing value
    try age.append(22);
    try df.addSeries(age.toBoxedSeries());

    // Filter rows where age > 24 (nulls are excluded automatically)
    var filtered = try df.filter("age", i32, .gt, 24);
    defer filtered.deinit();

    // Fill nulls before aggregation
    var filled = try df.fillNull("age", i32, 0);
    defer filled.deinit();

    // Cast a column to a different type
    var as_float = try df.cast("age", f64);
    defer as_float.deinit();

    // Summary statistics
    var stats = try df.describe();
    defer stats.deinit();
    try stats.print();
}

Null Handling

Nulls are stored as a validity bitmap alongside values (no memory overhead on null-free columns). The API follows pandas/polars conventions: aggregations skip nulls, and null-aware operations return ?T at the API surface.

try s.appendNull();              // mark a slot as missing
s.isNull(i)                     // check a slot
s.nullCount()                   // count missing values
try s.fillNull(0)               // replace nulls with a scalar
try s.fillNullForward()         // carry last valid value forward (LOCF)
try s.fillNullBackward()        // carry next valid value backward
try s.dropNulls()               // new series with null rows removed

try df.dropNulls("col")         // drop rows where col is null
try df.dropNullsAny()           // drop rows where any column is null
try df.fillNull("col", i32, 0)  // fill nulls in one column

Type Casting

Three tiers matching Zig's explicit-cast philosophy:

// castSafe — comptime guarantee; compile error if the types aren't safely widening
var wide = try s.castSafe(i64);     // i32 → i64: always safe
// var bad = try s.castSafe(i8);    // compile error: may overflow

// cast — strict runtime; errors on overflow, fractional float→int, parse failure
var exact = try s.cast(i32);        // 3.0 → 3 ok; 3.5 → error.LossyCast

// castLossy — permissive; failures become null
var best = try s.castLossy(i8);     // 300 → null, 1 → 1

All three variants are available on Series(T), BoxedSeries, and Dataframe.

GroupBy

var gb = try df.groupBy("department");
defer gb.deinit();

var counts = try gb.count();
defer counts.deinit();

var totals = try gb.sum("salary");
defer totals.deinit();

I/O

// CSV
var df = try Dataframe.Reader.csv(allocator, "data.csv", .{});
defer df.deinit();
const csv_str = try df.toCsvString(.{});
defer allocator.free(csv_str);

// JSON
const json_str = try df.toJsonString(.rows);
defer allocator.free(json_str);

// Parquet
var df2 = try Dataframe.Reader.parquet(allocator, "data.parquet", .{});
defer df2.deinit();

Building

zig build          # compile
zig build test     # run all tests (~350 tests)
zig build run      # run the example

Requires Zig 0.14 or later (tested on 0.16.0-dev).

Project Structure

src/
  dataframe/
    dataframe.zig      — DataFrame: columns, filter, sort, join, groupby, I/O
    series.zig         — Series(T): typed column with null support and casting
    boxed_series.zig   — Type-erased union over all Series(T) variants
    boxed_groupby.zig  — Type-erased GroupBy dispatch
    group.zig          — GroupBy(T) aggregation engine
    join.zig           — Inner / left / right / outer join
    strings.zig        — Managed String type used for string columns
    csv_reader.zig     — CSV parser
    csv_writer.zig     — CSV serializer
    json_reader.zig    — JSON parser (rows and columns format)
    json_writer.zig    — JSON serializer
    reader.zig         — Unified Reader builder
    writer.zig         — Unified Writer builder
  parquet/
    parquet.zig        — Parquet reader/writer

Roadmap

See plan.md for the prioritized feature roadmap comparing teddy to pandas and polars.

License

TBD

Name		Name	Last commit message	Last commit date
Latest commit History 68 Commits
.vscode		.vscode
data		data
docs		docs
src		src
src_py		src_py
.gitattributes		.gitattributes
.gitignore		.gitignore
README.md		README.md
build.zig		build.zig
build.zig.zon		build.zig.zon
nullable_design.md		nullable_design.md
plan.md		plan.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Teddy

Features

Quick Start

Null Handling

Type Casting

GroupBy

I/O

Building

Project Structure

Roadmap

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Teddy

Features

Quick Start

Null Handling

Type Casting

GroupBy

I/O

Building

Project Structure

Roadmap

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages