Roadmap

What's coming next, in roughly the order it'll land.

Shipping next

nn2 ARM / Apple Silicon (PR #3 — @navado)

In review now. Adds first-class build paths for:

Apple Silicon (M1–M4) — NEON + Accelerate (AMX) + SME on M4+
Apple Neural Engine via Core ML bridge
AArch64 NEON (Linux/Android, Raspberry Pi)
Cross-platform tools/bench.c for unified benchmarking
AVFoundation live-camera bench on macOS (make bench-camera)

Status: the M2 path passes 51/51 tests. The AArch64 NEON foundation is done. The i.MX 8/93/95 NPU lib (Ethos-U65 / VxDelegate / XNNPACK) and the ESP32-P4 ESP-IDF component remain drafts until they can be tested on physical hardware.

Active liveness in the browser

The demo already has a basic "turn your head / blink twice" challenge. Next:

Multi-step challenges ("turn left, then up, then blink")
Randomized sequence per session to defeat video-replay
Server-attested challenge tokens (HMAC, time-bounded)
iBeta-style attack dataset for evaluation
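
A minimal sketch of how a randomized, server-attested challenge could fit together, assuming a server-side secret and a JSON payload. The step names, the 30-second lifetime, and the function names `issue_challenge` / `verify_challenge` are hypothetical, not the demo's actual API.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import List, Optional

# Hypothetical step names and token lifetime; the real demo may differ.
STEPS = ["turn_left", "turn_right", "look_up", "blink"]
TOKEN_TTL_SECONDS = 30


def issue_challenge(server_key: bytes, n_steps: int = 3,
                    now: Optional[float] = None) -> str:
    """Pick a random step sequence and sign it together with an expiry time."""
    rng = secrets.SystemRandom()
    issued = time.time() if now is None else now
    payload = {
        "steps": [rng.choice(STEPS) for _ in range(n_steps)],
        "exp": int(issued) + TOKEN_TTL_SECONDS,
        "nonce": secrets.token_hex(8),  # makes every token unique
    }
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode())
    tag = hmac.new(server_key, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + tag


def verify_challenge(server_key: bytes, token: str,
                     now: Optional[float] = None) -> Optional[List[str]]:
    """Return the signed step sequence, or None if the token is forged or expired."""
    body, sep, tag = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(server_key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the two tags.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    current = time.time() if now is None else now
    if current > payload["exp"]:
        return None
    return payload["steps"]
```

The server would call `verify_challenge` when the client reports the completed sequence, so a replayed video recorded against an older token fails either the expiry check or the sequence comparison.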

Per-customer encrypted weight bundles

Currently the demo ships a single AES-256 key obfuscated in JS. The production-ready path:

Per-customer key wrapping (KEYWRAP under a master key)
Domain-binding via HMAC
Time-bounded keys with rotation endpoint
Audit log on each decrypt attempt

See Encrypted Weights for the API shape.
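
To illustrate the domain-binding and time-bounding ideas only, here is a sketch that derives a per-customer key from a master key with HMAC-SHA256. It swaps in plain HMAC derivation for the key-wrapping step so that it runs on the standard library alone, and it does not cover the rotation endpoint or the audit log. The 24-hour rotation period and the name `derive_bundle_key` are assumptions.

```python
import hashlib
import hmac

ROTATION_PERIOD_SECONDS = 24 * 3600  # assumed rotation interval


def derive_bundle_key(master_key: bytes, customer_id: str,
                      domain: str, now: float) -> bytes:
    """Derive a 32-byte key bound to one customer, one domain, one rotation epoch."""
    epoch = int(now // ROTATION_PERIOD_SECONDS)
    # NUL-separated fields so that ("ab", "c") and ("a", "bc") cannot collide.
    info = "\x00".join([customer_id, domain.lower(), str(epoch)]).encode()
    return hmac.new(master_key, info, hashlib.sha256).digest()
```

A key derived this way stops matching as soon as the domain changes or the epoch rolls over, which is the property the list above asks for.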

Bigger items on the horizon

ESP32-P4 + i.MX 95 hardware port

Pending: real hardware in hand. Once we have boards:

Quantize the 4 recognition variants to INT8 for the NPU
Wire the MiniFASNet ensemble into the i.MX SDK
Stream MIPI-CSI directly into the engine, no RAM round-trip

Expected: full pipeline (decode → detect → recognize) on a single $10 ESP32-P4 chip.
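
For readers unfamiliar with the INT8 step, a minimal sketch of symmetric per-tensor quantization follows. This is the textbook scheme, not necessarily what the NPU toolchains use (they may prefer per-channel scales or calibrated activation ranges), and the function names are illustrative.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map [-max|w|, +max|w|] onto [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from INT8 values and the shared scale."""
    return [v * scale for v in q]
```

The round-trip error per weight is at most half the scale, so tensors with a few large outliers lose the most precision.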

Smile + age + emotion + glasses heads

The 187 KB smile classifier is a template. Adding more tiny classification heads (binary or few-class) is straightforward:

age — bucketed 0/18/30/45/60+
emotion — happy/sad/angry/surprised/neutral (5 classes via softmax)
glasses — yes/no
mask — yes/no
eyewear — sunglasses/clear/none

Each ~50–200 K params, ~200 KB ONNX, ~10 min training on scraped data.
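
As an illustration of what these heads produce, the sketch below turns five emotion logits into a label with softmax and maps a raw age onto the buckets listed above. It assumes "0/18/30/45/60+" denotes bucket lower bounds; the label strings and function names are hypothetical.

```python
import bisect
import math
from typing import List, Tuple

EMOTIONS = ["happy", "sad", "angry", "surprised", "neutral"]
AGE_EDGES = [18, 30, 45, 60]  # assumed lower bounds of buckets 2..5
AGE_LABELS = ["0-17", "18-29", "30-44", "45-59", "60+"]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_emotion(logits: List[float]) -> Tuple[str, float]:
    """Return the most probable emotion label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]


def age_bucket(age: int) -> str:
    """Map an age in years to its bucket label, e.g. when preparing training labels."""
    return AGE_LABELS[bisect.bisect_right(AGE_EDGES, age)]
```

The binary heads (glasses, mask) would reduce to the same pattern with two logits, or a single sigmoid output.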

rPPG pulse — phase 2

The current 5-second forehead DFT gets ~80% accuracy on a still face. Improvements queued:

Adaptive ROI tracking (forehead moves as the user does)
Bandpass via Butterworth IIR instead of narrow-band scan
HRV (heart-rate variability) metrics over a 30-second window
Pulse waveform visualization (sparkline in HUD)
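
For context, the narrow-band scan that the Butterworth item would replace can be sketched roughly as below. It assumes the input is one mean-intensity sample per frame from the forehead ROI and that candidate rates span 42 to 180 bpm in 1 bpm steps; both are illustrative choices, and the demo's actual code may differ.

```python
import math
from typing import Sequence


def estimate_bpm(samples: Sequence[float], fs: float,
                 lo_bpm: int = 42, hi_bpm: int = 180) -> int:
    """Scan candidate heart rates and return the one with the largest DFT power."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]  # remove the DC component
    best_bpm, best_power = lo_bpm, -1.0
    for bpm in range(lo_bpm, hi_bpm + 1):
        # Angular frequency of this candidate, in radians per sample.
        w = 2.0 * math.pi * (bpm / 60.0) / fs
        re = sum(x * math.cos(w * n) for n, x in enumerate(centered))
        im = sum(x * math.sin(w * n) for n, x in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm
```

With a 5-second window the main lobe of each candidate is about 12 bpm wide, which is one reason motion and noise easily pull the peak away from the true rate.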

WebGPU backend

onnxruntime-web has a WebGPU execution provider (EP). On supported devices (Chrome 121+ desktop, Safari 18) inference latency drops by another 3–5×. We just need to:

Add an executionProviders: ['webgpu', 'wasm'] fallback chain
Verify all ops have WebGPU implementations
Bench across recent browsers

nn2 → WASM port

Compile the C engine to WASM with SIMD128 paths instead of AVX-512. Skip the AArch64 NEON kernel rewrites; emit a portable WASM SIMD microkernel directly.

Expected: 1.3–2× faster than onnxruntime-web in browsers that don't have WebGPU.

Multi-face persistent tracking

Several people in the frame, each with their own colored bbox + ID + cumulative match score. Useful for kiosk / surveillance demos.

Long-term

Bitstream-domain detection — pull motion-vector and DCT residual info from H.264 / NXV without a full decode, and route ML inference to changed regions only. Target: a 10× speedup at the pipeline level (some of the prior research is in nn2/README.md)
iBeta-certified passive PAD — collect real attack samples, train a competing model, certify
Event-camera support — Prophesee GENX320 / Sony IMX636 native pipeline. Bypasses the frame-based ML problem entirely

How to influence the roadmap

Open an issue describing your use case
Send a PR — small + focused is best (see @navado's #3 for tone)
Email bauratynov@gmail.com for paid custom work / priority items
