Merged
26 commits
a129759
docs(spec): lazy table API with cjson-compatible decode/encode
May 16, 2026
66529fd
docs(plan): lazy table cjson-compat implementation plan
May 16, 2026
ae8152e
feat(ffi): add qjd_cursor_bytes returning original byte span
membphis May 16, 2026
8352464
refactor(ffi): factor scalar_byte_range helper + test polish
membphis May 16, 2026
8d0fd7a
feat(ffi): add qjd_cursor_object_entry_at for object iteration
membphis May 16, 2026
719a145
feat(lua): skeleton for quickdecode.table + sentinel bridge
membphis May 16, 2026
cd127a9
feat(lua): LazyObject __index for scalar fields
membphis May 16, 2026
7e0e225
chore(lua): comment+error-handling polish in table.decode
membphis May 16, 2026
1c48c1d
feat(lua): wrap nested containers as Lazy proxies
membphis May 16, 2026
f93c39d
feat(lua): LazyArray __index for integer keys
membphis May 16, 2026
78cd75c
test(cursor): regression for walk_children trailing-scalar visit
membphis May 16, 2026
f770752
feat(lua): __len for LazyObject and LazyArray
membphis May 16, 2026
6bb1af8
feat(lua): __pairs/qd.pairs for LazyObject + factor decode_cursor
membphis May 16, 2026
8b2daaf
feat(lua): __ipairs/qd.ipairs for LazyArray
membphis May 16, 2026
1d9a10f
feat(lua): __newindex materializes affected level only
membphis May 16, 2026
32dceaa
feat(lua): qd.materialize for deep conversion to plain tables
membphis May 16, 2026
aeb4f5f
feat(lua): qd.encode proxy fast path (original substring)
membphis May 16, 2026
3fdd278
feat(lua): qd.encode scalars (string/number/bool/null)
membphis May 16, 2026
35c9a58
feat(lua): qd.encode for real tables + mixed lazy/materialized
membphis May 16, 2026
6e31a9a
feat(lua): re-export lazy table API from top-level quickdecode
membphis May 16, 2026
5f6e68b
test(lua): cjson round-trip equivalence + sentinel coverage
membphis May 16, 2026
a62e30e
bench: add qd.decode/qd.encode rows
membphis May 16, 2026
3f0e2e7
docs: add lazy table API usage + roadmap deferred items
membphis May 16, 2026
b740b64
fix(lua): propagate nested mutations through qd.encode + preserve cac…
membphis May 16, 2026
dd12a40
fix(lua): qd.len + LJ52-aware __len tests for vanilla LuaJIT
membphis May 16, 2026
8fd99f9
chore: drop docs/superpowers internal specs + plans
membphis May 16, 2026
2 changes: 0 additions & 2 deletions CLAUDE.md
@@ -6,8 +6,6 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

Rust JSON decoder (`cdylib` + `rlib`) exposed to LuaJIT via FFI. Optimized for parse-once / extract-a-few-fields / discard. The competitive edge over `lua-cjson` comes from **never building a Lua table** — Phase 1 records only structural offsets, Phase 2 lazily decodes the fields the caller actually asks for. Crate name in `Cargo.toml` is `lua-quick-decode`; the compiled artifact is `libquickdecode.so`.

Authoritative design doc: `docs/superpowers/specs/2026-05-15-rust-quick-json-decode-design.md`.

## Common commands

The `Makefile` is the canonical entry point; `make help` lists targets.
51 changes: 49 additions & 2 deletions README.md
@@ -2,8 +2,6 @@

Rust-implemented fast JSON decoder exposed to LuaJIT via FFI. Optimized for the common case where a large JSON is parsed once and only a small number of fields are extracted before the document is discarded.

Design document: `docs/superpowers/specs/2026-05-15-rust-quick-json-decode-design.md`.

## Status

Initial implementation complete: scalar + AVX2/PCLMUL structural scanner, root-path and cursor APIs, escape-decoded strings, integer/float/bool/typeof/len, FFI panic barrier, and a LuaJIT wrapper. Rust unit/integration tests and Lua busted tests run in CI. The benchmark harness compares against lua-cjson but tuning is pending — see `Roadmap / Deferred` below.
@@ -38,6 +36,41 @@ local model = body:get_str("model")
local temp = body:get_f64("temperature")
```

### Lazy table API (`qd.decode` / `qd.encode`)

For callers migrating from `cjson`, an alternative API returns a table-shaped
lazy view. Reads, iteration, and length all work like a `cjson.decode`'d
table; writes materialize the affected level into a plain Lua table.

```lua
local qd = require("quickdecode")
local cjson = require("cjson") -- optional; provides null / empty_array sentinels

local t = qd.decode(json_str)

print(t.model)
for _, m in qd.ipairs(t.messages) do
  print(m.role, m.content)
end

t.extra = "x"

local s = qd.encode(t) -- drop-in replacement for cjson.encode
```
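
The "writes materialize the affected level" behavior can be pictured with a toy model. This is only a sketch under assumptions: `lazy` and `backing` are hypothetical names, a plain Lua table stands in for the parsed JSON bytes, and the library's actual `__newindex` mechanism is not shown in this diff. The first write copies that level's fields into the table and drops its metatable, while nested proxies stay lazy.

```lua
-- Toy lazy proxy (hypothetical, not the quickdecode implementation).
-- Reads are served from `backing`; the first write materializes this level.
local function lazy(backing)
  local mt = {}
  mt.__index = function(_, k) return backing[k] end
  mt.__newindex = function(t, k, v)
    -- Copy this level's fields into the table itself...
    for bk, bv in pairs(backing) do rawset(t, bk, bv) end
    -- ...then drop the metatable so it is a plain table from now on.
    setmetatable(t, nil)
    rawset(t, k, v)
  end
  return setmetatable({}, mt)
end

local inner = lazy({ role = "user" })
local outer = lazy({ model = "m1", first = inner })

outer.extra = "x" -- materializes `outer` only

print(getmetatable(outer) == nil)  -- outer is now a plain table
print(getmetatable(inner) ~= nil)  -- inner is still a lazy proxy
print(outer.model, outer.extra, outer.first.role)
```

In this model the nested proxy is copied by reference, so untouched subtrees pay no conversion cost.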

`qd.encode` works on lazy proxies (re-emitting unmodified subtrees as the
original JSON bytes), real Lua tables (matching `cjson.encode` output), and
mixed trees. Callers cannot pass a lazy proxy directly to `cjson.encode`
(cjson bypasses metamethods in C); use `qd.encode` instead, or call
`qd.materialize(t)` to get a plain Lua table that any third-party encoder
can handle.
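
Why a metamethod-bypassing encoder sees nothing can be reproduced without the library. A minimal sketch, assuming a proxy whose own table is empty and whose reads go through `__index` (`make_proxy` is a hypothetical stand-in, not a quickdecode function): raw access, which ignores metamethods, finds an empty table.

```lua
-- Hypothetical stand-in for a lazy proxy: the table itself holds nothing,
-- and every read is served by __index from hidden backing storage.
local function make_proxy(backing)
  return setmetatable({}, {
    __index = function(_, k) return backing[k] end,
  })
end

local proxy = make_proxy({ model = "m1", temperature = 0.5 })

-- Normal indexing goes through __index and sees the data.
local via_index = proxy.model

-- Raw access skips metamethods, so the proxy looks like an empty table.
local via_raw = rawget(proxy, "model")
local first_raw_key = next(proxy)

print(via_index, via_raw, first_raw_key)
```

An encoder that walks the table with raw iteration would therefore see an empty table, which is why the text above points to `qd.encode` or `qd.materialize`.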

**LuaJIT compat-52 caveat.** `pairs(t)`, `ipairs(t)`, and `#t` on a lazy
proxy rely on `__pairs` / `__ipairs` / `__len`, which LuaJIT only invokes when
built with `LUAJIT_ENABLE_LUA52COMPAT` (OpenResty's default). On a stock LuaJIT
build (Lua 5.1 semantics), use the explicit `qd.pairs(t)`, `qd.ipairs(t)`, and
`qd.len(t)` helpers, which work on both builds.
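
Whether a given interpreter honors these metamethods can be probed at runtime. The following is a sketch only: `honors_pairs` and `honors_len` are hypothetical helper names, not part of the quickdecode API, and they use throwaway probe tables rather than a real lazy proxy.

```lua
-- Probe: does this interpreter call __pairs when pairs() is used?
local function honors_pairs()
  local called = false
  local probe = setmetatable({}, {
    __pairs = function(t)
      called = true
      return next, t, nil
    end,
  })
  for _ in pairs(probe) do end
  return called
end

-- Probe: does the # operator call __len on a table?
local function honors_len()
  local probe = setmetatable({}, { __len = function() return 1 end })
  return #probe == 1
end

print("__pairs honored:", honors_pairs())
print("__len honored:", honors_len())
```

Code that must run on both builds can skip the probe entirely and always call the `qd.*` helpers; the probe is mainly useful for a startup check or for test suites that branch on the build.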

## Testing — Lua

Requires LuaJIT + busted + lua-cjson installed system-wide.
@@ -76,3 +109,17 @@ Items intentionally pushed out of the first implementation. Each will be picked
- **`cargo fmt --check` not enforced** — `make lint` runs clippy only. The codebase uses intentional manual column alignment in struct definitions and compact single-line literals that default rustfmt would reflow. Skip rather than reformat until a project-wide style decision is made.
- **`validate_brackets` fusion into scan emit loop** — surfaced by profiling: on structurally-dense workloads `validate_brackets` is 65% of parse time (second linear pass over emitted indices). Folding bracket pairing into the scan emit loop via an inline depth stack eliminates that pass. No effect on the current string-heavy bench (0.3% there); a win for config / JSONL / table-shape JSON.
- **`memchr2` cross-chunk jump for very long string interiors** — the AVX2 in-string fast probe (issue #5) drops per-chunk cost from ~25 to ~10 ops but still pays ALU work for every 64-byte chunk in a string. A `memchr2(b'"', b'\\')` jump can approach memory bandwidth on multi-MB single-string payloads. Deferred until a workload that benefits clearly emerges; needs careful `bs_carry` reasoning across the jump.
- **Stateful O(N) iterator FFI** — current `qd.pairs` and the `__newindex`
materialization path walk the object cursor from the start on every step,
giving O(N²) total cost for full enumeration. Acceptable for the "read a
few keys" use case the library is optimized for; full-iteration workloads
(e.g. encoding a deeply-keyed object that has been materialized) would
benefit from a `qjd_iter_init` / `qjd_iter_next` pair that holds position
state across calls.
- **Lazy-table read overhead vs path API** — `qd.decode + t.field x3` lands
~30–40% behind `qd.parse:get_str` on small-to-medium payloads, converging
to parity at multi-MB sizes. The gap is structural (per-access `__index`
  metamethod dispatch + transient cdata allocation for nested wraps). Closing
  it is worth attempting if a workload-driven need surfaces; as measured, the
  lazy API is still 14× faster than `cjson.decode` at 100 KB, so it remains
  the right default for migrating callers.
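
The cost difference behind the "Stateful O(N) iterator FFI" item can be shown with a toy model. This is a sketch under assumptions: a Lua array stands in for the object cursor, `entry_at`, `iter_init`, and `iter_next` are illustrative Lua functions (the signatures of the proposed `qjd_iter_init` / `qjd_iter_next` are not specified here), and `counter.steps` counts cursor steps.

```lua
-- Restart-from-the-beginning access: reaching entry i costs i steps.
local function entry_at(entries, i, counter)
  local e
  for step = 1, i do
    counter.steps = counter.steps + 1
    e = entries[step]
  end
  return e
end

-- Stateful iterator: position survives across calls, one step per call.
local function iter_init(entries)
  return { entries = entries, pos = 0 }
end

local function iter_next(it, counter)
  it.pos = it.pos + 1
  counter.steps = counter.steps + 1
  return it.entries[it.pos]
end

local entries = {}
for i = 1, 100 do entries[i] = { key = "k" .. i, value = i } end

local restart = { steps = 0 }
for i = 1, #entries do entry_at(entries, i, restart) end

local stateful = { steps = 0 }
local it = iter_init(entries)
while iter_next(it, stateful) do end

print(restart.steps, stateful.steps) -- 5050  101
```

For 100 entries the restart strategy takes 1 + 2 + ... + 100 = 5050 steps, while the stateful iterator takes 101 (100 entries plus the final call that returns `nil`).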
28 changes: 28 additions & 0 deletions benches/lua_bench.lua
@@ -147,6 +147,18 @@ for _, s in ipairs(scenarios) do
local _ = d:get_str("messages[0].role")
end)
end

bench("qd.decode + t.field x3", s.iters, function()
  local t = qd.decode(s.payload)
  local _ = t.model
  local _ = t.temperature
  local _ = t.messages and t.messages[1] and t.messages[1].role
end)

bench("qd.decode + qd.encode (unmodified)", s.iters, function()
  local t = qd.decode(s.payload)
  local _ = qd.encode(t)
end)
end

-- Interleaved scenario: cycle through several payloads of different sizes
@@ -207,4 +219,20 @@ do
local _ = d:get_str("messages[0].role")
end)
end

next_p = make_cycler(interleaved)
bench("qd.decode + t.field x3", 400, function()
  local p = next_p()
  local t = qd.decode(p)
  local _ = t.model
  local _ = t.temperature
  local _ = t.messages and t.messages[1] and t.messages[1].role
end)

next_p = make_cycler(interleaved)
bench("qd.decode + qd.encode (unmodified)", 400, function()
  local p = next_p()
  local t = qd.decode(p)
  local _ = qd.encode(t)
end)
end