zenanalyze

Streaming image content analyzer for adaptive codec pipelines. One pass over a zenpixels::PixelSlice extracts the numeric features that decision trees, selectors, and per-image encoder configurators consume — variance, edge density, chroma sharpness, palette population, DCT energy, alpha statistics, and (behind opt-in cargo features) classifier-style content-class likelihoods plus source-direct HDR / wide-gamut / bit-depth signals that codecs use to detect when a descriptor over-promises and the actual pixel content is encodable in something smaller.

[dependencies]
zenanalyze = "0.1"
# Opt in to source-direct HDR / wide-gamut / descriptor-gap signals:
# zenanalyze = { version = "0.1", features = ["experimental"] }
# Opt in to classifier-style composite scores (text / screen / natural /
# line-art likelihoods). The raw signals these consume are stable; the
# composite coefficients are calibration-driven and may drift in 0.1.x:
# zenanalyze = { version = "0.1", features = ["composites"] }

Cargo features

feature	what it gates	stability
(default)	Stable raw signals: variance, edge density, chroma sharpness, DCT energy, alpha, palette, distinct-color bins, etc.	Numeric drift in 0.1.x bounded by the threshold contract; signatures frozen
`experimental`	Research-stage signals: PatchFraction, AqMapMean/Std, NoiseFloorY/UV, GradientFraction, source-direct HDR / wide-gamut / bit-depth	Metric definition or scale may change; opt in only if you re-validate per patch
`composites`	Classifier-style scores: TextLikelihood, ScreenContentLikelihood, NaturalLikelihood, LineArtScore	Hand-tuned weighted combinators; the combination coefficients drift as the corpus calibration matures. The raw signals they consume are stable.

Why

Modern adaptive codecs (zenjpeg, zenwebp, zenpng, zenavif, zenjxl) all want the same handful of cheap content features to pick a quality knob — is this a photograph or a screenshot? does it have alpha? is it palette-friendly? is the HDR flag stale? does this Rec.2020 file actually use the wider gamut? — before they run their expensive encoder. Re-deriving that from scratch in each codec means three copies of the same Tier 1 SIMD scan, three slightly different threshold contracts, and three independent oracle retrains every time the math moves.

zenanalyze is the shared single-pass scanner: codecs ask for the feature set they care about, the orchestrator unions the requests, and one walk over the image returns every signal. Tiers gate themselves out when their outputs aren't needed.

Quick start

use zenanalyze::{
    analyze_features,
    feature::{AnalysisFeature, AnalysisQuery, FeatureSet},
};

const JPEG_FEATURES: FeatureSet = FeatureSet::new()
    .with(AnalysisFeature::Variance)
    .with(AnalysisFeature::EdgeDensity)
    .with(AnalysisFeature::HighFreqEnergyRatio);

const WEBP_FEATURES: FeatureSet = FeatureSet::new()
    .with(AnalysisFeature::Variance)
    .with(AnalysisFeature::AlphaPresent)
    .with(AnalysisFeature::AlphaUsedFraction);

let needed = JPEG_FEATURES.union(WEBP_FEATURES);
let results = analyze_features(slice, &AnalysisQuery::new(needed))?;

let variance = results.get_f32(AnalysisFeature::Variance);
let alpha    = results.get(AnalysisFeature::AlphaPresent)
                      .and_then(|v| v.as_bool());

AnalysisFeature is #[non_exhaustive] with stable u16 discriminants — retired ids stay reserved, new ids are sequential, the wire format never breaks. FeatureSet has full const fn set math (union, intersect, difference) so per-codec presets compose at compile time. AnalysisQuery is intentionally opaque: sampling budgets are crate invariants, not per-call knobs.

For a packed RGB8 buffer the convenience entry skips the PixelSlice ceremony:

// Panicking — for known-good inputs (freshly decoded buffers).
let r = zenanalyze::analyze_features_rgb8(&rgb_bytes, w, h, &q);

// Fallible — for untrusted input. Returns AnalyzeError::InvalidInput on
// length / stride mismatch and AnalyzeError::OutOfMemory on (future)
// fallible-allocation paths.
let r = zenanalyze::try_analyze_features_rgb8(&rgb_bytes, w, h, &q)?;

Codecs read the source PixelDescriptor directly off the result for encode-side metadata decisions:

let descriptor = results.source_descriptor();
match descriptor.primaries {
    zenpixels::ColorPrimaries::Bt709     => /* sRGB-class */,
    zenpixels::ColorPrimaries::DisplayP3 => /* P3 wide-gamut */,
    zenpixels::ColorPrimaries::Bt2020    => /* Rec.2020 / HDR */,
    zenpixels::ColorPrimaries::AdobeRgb  => /* AdobeRGB */,
    _ => /* unknown / future */,
}

FeatureSet::iter() walks the contained features in AnalysisFeature::id() order — convenient for sidecars, harnesses, and Python fitters that need to enumerate the surface without hand-listing variants.

What it computes

Default features (every codec gets these):

Feature	Type	Description
`Variance`	f32	Luma variance on the BT.601 [0, 255] scale.
`EdgeDensity`	f32	Fraction of sampled interior pixels with `\|∇L\| > 20`.
`ChromaComplexity`	f32	`√(Var(Cb) + Var(Cr))` over sampled pixels.
`CbSharpness` / `CrSharpness`	f32	Mean per-axis chroma gradient.
`Uniformity`	f32	Fraction of 8×8 blocks with luma variance < 25.
`FlatColorBlockRatio`	f32	Fraction of 8×8 blocks with R/G/B ranges all ≤ 4.
`DistinctColorBins`	u32	Distinct 5-bit-per-channel RGB bins observed.
`CbSharpness` / `CrSharpness` (Horiz/Vert/Peak)	f32	Per-channel per-axis chroma sharpness.
`HighFreqEnergyRatio`	f32	DCT AC energy ratio over sampled 8×8 luma blocks.
`LumaHistogramEntropy`	f32	Shannon entropy of a 32-bin luma histogram (bits).
`AlphaPresent` / `AlphaUsedFraction` / `AlphaBimodalScore`	bool / f32 / f32	Straight-alpha statistics.

Behind the composites cargo feature (calibration-driven, may drift in 0.1.x):

Feature	Type	Description
`TextLikelihood` / `ScreenContentLikelihood` / `NaturalLikelihood`	f32	Soft `[0, 1]` content-class scores. Combine raw signals (variance, edge density, distinct-color bins, etc.) via hand-tuned weights.
`LineArtScore`	f32	Soft `[0, 1]` rendered-line-art score from luma histogram bimodality + entropy.

Behind the experimental cargo feature, organised by what they drive:

Codec-orchestrator gap-fillers

Feature	Drives
`GrayscaleScore`	zenjpeg `ColorMode::Grayscale`, AVIF `Yuv400`, png/jxl gray paths (~30–40% smaller for B&W)
`AqMapMean` / `AqMapStd`	zenjpeg hybrid trellis λ, webp segments + sns_strength, avif vaq
`NoiseFloorY` / `NoiseFloorUV`	zenjpeg `pre_blur`, jxl `noise/denoise`, webp `sns_strength`, zenrav1e `film_grain`
`LineArtScore` (behind `composites`)	webp `Preset::Drawing`, jxl `splines` / `patches`, png palette preference
`GradientFraction`	jxl `with_force_strategy` (DCT16 / DCT32 selection), zenrav1e deblock strength
`SkinToneFraction`	photo-vs-other dispatch (one-direction signal, AUC 0.80) — webp `Preset::Photo`, jxl perceptual presets, jpeg chroma-aware quant
`EdgeSlopeStdev`	screen-vs-photo dispatch (AUC 0.84, second only to `PatchFraction`) — webp `Preset::Drawing` vs `Photo`, jxl modular vs VarDCT

Source-direct HDR / wide-gamut / bit-depth tier

These read source samples without going through RowConverter, since RowConverter doesn't tonemap — a 4000-nit PQ source and a 100-nit-clipped SDR source would otherwise produce byte-identical RGB8 streams.

Feature	Drives
`PeakLuminanceNits` / `P99LuminanceNits`	AVIF `clli`, JXL `intensity_target`, HDR encoder peak
`HdrHeadroomStops` / `HdrPixelFraction`	HDR vs SDR encode-mode selection
`WideGamutPeak` / `WideGamutFraction`	"Linear value > 1.0" detection
`GamutCoverageSrgb` / `GamutCoverageP3`	Descriptor-gap signal — if a Rec.2020 source's pixels all live in the sRGB sub-gamut, encode at sRGB primaries and save the wide-gamut metadata + encoder modes
`EffectiveBitDepth`	AVIF / JXL `bit_depth`, png `near_lossless_bits` (catches u8-promoted u16)
`HdrPresent`	Composite "transfer claims HDR AND pixels are actually bright" — catches stale HDR flags

Research-stage

Colourfulness, LaplacianVariance, VarianceSpread, PaletteDensity, DctCompressibilityY, DctCompressibilityUV, PatchFraction, PaletteLog2Size, PaletteFitsIn256. Numeric scale or definition may change in 0.1.x patches.

Descriptor-gap detection

The analyzer's job is to spot the gap between what the descriptor promises and what the data actually carries, so encoders don't bloat files paying for capacity the source doesn't need.

Gap	Signal
RGB declared, content is grayscale	`GrayscaleScore ≥ 0.99`
Wider primaries declared, content fits sRGB	`GamutCoverageSrgb ≥ 0.99`
Rec.2020 declared, content fits Display P3	`GamutCoverageP3 ≥ 0.99`
HDR transfer declared, content is SDR	`HdrPresent == false`
u16 declared, content is u8-promoted	`EffectiveBitDepth == 8`
RGBA declared, alpha is constant 1.0	`AlphaUsedFraction == 0`
Standard 8×8 transforms, content is smooth	`GradientFraction ≥ 0.5` (use larger DCTs)
HDR flag set, peak is dim	`PeakLuminanceNits < 200 && HdrPresent` (mis-tagged HDR)

Each one is a place where a codec orchestrator can downcast metadata + encoder modes before encoding, saving real bytes on real corpora.

Wide gamut, HDR, and bit depth

analyze_features accepts every layout zenpixels-convert::RowConverter can ingest — RGB8 / RGBA8 / BGRA8, RGB16 / RGBA16, RGB-F32 / RGBA-F32 (linear, sRGB, PQ, HLG), grayscale variants, all primaries (sRGB / Display P3 / Rec.2020 / AdobeRGB). One entry, no opt-in step. The principle: per-image codec decisions don't usually break on a few LSBs of luma drift, they break on the analyzer refusing to run.

u8-promotion invariance is locked by tests. An RGB8 image promoted to RGB16 via the standard u8 * 257 doubling, or to RGBF32 via u8 / 255.0, produces bit-identical features to the original RGB8 source. Codecs that upgrade from u8 to wider formats internally don't see different analyzer answers. (Verified by garb's exact-identity narrowing (u16 * 255 + 32768) >> 16 for u16 = u8 * 257.)

Wide gamut adapts the values, not the API. RGB8 with Display P3 / Rec.2020 / AdobeRGB primaries passes through the zero-copy Native row path with its bytes intact. The standard tiers pick the right luma matrix per source primaries: BT.601 weights for sRGB / BT.709 (preserving the trained-threshold baseline — coefficient's existing thresholds were calibrated against this matrix on sRGB content), BT.2020 weights for Rec.2020 sources, the Y row of each primary set's RGB→XYZ matrix for Display P3 / AdobeRGB. Fixed-point integer-luma scales are normalised to the same sum-220 libwebp baseline so a pure-white pixel hits the same histogram bin regardless of source primaries — what differs is the per-channel weight that lands it there. No conversion, no clipping, just the right matrix. See src/luma.rs.

HDR f32 / linear inputs. Standard tiers see what an SDR display would show — RowConverter clips out-of-[0, 1] linear values, applies the sRGB OETF, and narrows to u8. That's the legitimate input for SDR-calibrated thresholds; tonemapping a 4 000-nit highlight into a visible mid-tone before measuring "high-frequency-energy ratio" would just lie about what's there. The above-clip signal lives in tier_depth, which reads the source samples directly via PixelSlice::row (bypassing RowConverter entirely) and decodes through the descriptor's transfer function — sRGB / BT.709 / Gamma 2.2 / Linear / PQ / HLG — to linear nits. Two views of the same source: the SDR-display view for trained thresholds, the source-direct view for HDR / wide-gamut signal.

The tier_depth reference convention is stable across 0.1.x:

Transfer	Linear 1.0 maps to	Convention
`Srgb` / `Bt709` / `Gamma22` / `Linear`	80 nits	sRGB display reference (IEC 61966-2-1)
`Pq`	10 000 nits	SMPTE ST 2084 absolute
`Hlg`	1 000 nits	nominal HLG broadcast

The standard tiers' threshold contract is calibrated on display-space RGB8 bytes; the depth tier surfaces the additional metadata-gap and HDR signals that the RGB8 narrowing destroys.

Errors

#[non_exhaustive]
pub enum AnalyzeError {
    Convert(String),                                  // RowConverter setup failed
    InvalidInput(String),                             // user-supplied bad layout / length
    OutOfMemory { bytes_requested: Option<usize> },   // future fallible-alloc path
    Internal(String),                                 // unexpected
}

Production code handling untrusted images should pattern-match on InvalidInput / OutOfMemory explicitly. Today every internal allocation is infallible (so OutOfMemory is reserved, never returned by current builds); the variant is part of the public surface so a future minor that flips internals to Vec::try_reserve doesn't break anyone's match.

How it's organised

Five passes, each gated by what the requested FeatureSet actually needs:

Pass	Iterates over	Reads	Cost (4 MP)	Drives
Tier 1	Stripe-sampled rows	RGB8	~1 ms	luma stats, edges, chroma, uniformity, grayscale
Tier 2	3-row sliding window	RGB8	~2 ms	per-axis Cb/Cr sharpness
Tier 3	Sampled 8×8 DCT blocks	RGB8	~3 ms	DCT energy, entropy, AQ map, noise floor, line-art, gradient, patch fraction
Palette	Full image	RGB8	~1 ms	distinct color bins
Alpha	Stride-sampled rows	Source bytes	~0.3 ms	alpha presence / used / bimodal
`tier_depth` (experimental)	Stride-sampled rows	Source bytes	~0.5 ms HDR, ~0 SDR-fast-path	HDR / wide-gamut / bit-depth / gamut-coverage

Tier 1/2/3 + Palette read RGB8 via RowStream, which has three internal paths:

Native (zero-copy) — RGB8-byte-layout-compatible inputs. Sub-slice straight from the source.
StripAlpha8 (zero RowConverter, scratch-only) — RGBA8 / BGRA8 / Rgbx8 / Bgrx8. Tight strip-and-maybe-swap into the row scratch. Skips the RowConverter alloc + plan + per-row CPU work.
Convert — everything else (16-bit, f32, grayscale, CMYK, …) goes through RowConverter row-by-row.

Alpha and tier_depth always read source bytes directly, never through RowStream — a load-bearing detail for HDR (RowConverter doesn't tonemap; its narrowing clips PQ / HLG into sRGB-display).

Performance

Release build, AVX2, no target-cpu=native, full FeatureSet::SUPPORTED:

Input	4 MP	RowStream path
RGB8 / Rgbx8 with sRGB / wide-gamut primaries	9.5 ms	`Native` (zero-copy slice subindex)
RGBA8	10.9 ms	`StripAlpha8` (garb SIMD strip)
BGRA8	12.0 ms	`StripAlpha8` (garb SIMD strip + swap)
RGB16	24.7 ms	`Convert` (zenpixels-convert RowConverter)
RGBA16	28.6 ms	`Convert` (zenpixels-convert handles strip + narrow)

The RGBA8 strip uses garb::bytes::rgba_to_rgb / bgra_to_rgb (SIMD-dispatched via archmage::incant!) — measured 7× faster than the previous in-tree scalar strip on a 2048-px row, dropping the RGBA8 overhead vs RGB8 baseline from +5.8 ms to +1.4 ms. The 16-bit input paths cost more because RowConverter does transfer-function- aware narrowing — that's the correct tool for genuinely heterogeneous input.

Per-call working-set memory is ~265 KB across ~7 allocations (largest single chunk is the Tier 1 stripe scratch at 9 × width × 3 = 108 KB at 4 K). All allocations are infallible today; the OutOfMemory variant exists so a future minor can flip them without API breakage.

Empirical operating thresholds

Picked on a 219-image labeled corpus from coefficient (benchmarks/classifier-eval/labels.tsv, spanning cid22-train/val, clic2025-1024, gb82, gb82-sc, imageflow, kadid10k, qoi-benchmark). 174 photo, 36 screen, 9 illustration, 44 marked synthetic. F1 / AUC are for binary screen-vs-photo classification.

For codec-orchestrator dispatch ("is this a screen or a photo?"):

signal	threshold	F1	AUC	notes
`line_art_score > 0`	any nonzero	0.978	0.750	near-deterministic — line art ⇒ screen-like
`natural_likelihood >= 0.06`	photo detection	0.924	0.814	high precision photo classifier
`patch_fraction >= 0.27`	screen detection	0.769	0.880	strongest single screen discriminator
`edge_slope_stdev >= 35`	screen detection	—	0.844	second-strongest screen discriminator — photos cluster 15–32, screens 32–58
`screen_content_likelihood >= 0.60`	screen detection	0.750	0.831	derived from flat blocks + palette + chroma
`flat_color_block_ratio >= 0.53`	screen detection	0.750	0.838	raw — same F1 as the derived `_likelihood`
`skin_tone_fraction >= 0.05`	photo detection	0.824	0.799	one-direction (presence ⇒ photo); pigmentation-invariant Chai-Ngan YCbCr
`text_likelihood >= 0.30`	text detection	0.682	0.774	weaker but real
`grayscale_score >= 0.99`	grayscale dispatch	—	—	encoder gap-filler, near-binary on real grayscale

Note: the three *_likelihood features empirically saturate at ~0.70 (not 1.0) on real content, because each is a weighted sum of clamped sub-components that don't simultaneously max on real images. Don't threshold them at >= 0.8 — nothing will fire. Operating points are in the 0.3–0.6 band. The exact corpus maxes are:

text_likelihood max 0.71
screen_content_likelihood max 0.70
natural_likelihood max 0.69

For descriptor-gap detection the thresholds are content-physical (see the "Descriptor-gap detection" table above): GrayscaleScore >= 0.99, GamutCoverageSrgb >= 0.99, etc. Those are spec-driven, not corpus-fit.

The full per-class distributions, ROC-AUC ranking for every feature, Spearman redundancy matrix, and the recalibration findings that were considered and rejected are recorded in docs/calibration-corpus-2026-04-27.md. That file is the pre-0.1.0-ship empirical baseline; subsequent 0.1.x patches that drift numerics should compare against it.

Threshold contract

Numeric thresholds and normalisation scales drift during 0.1.x. Downstream consumers that compile-in fitted models (oracle decision trees, content selectors) must pin to a specific zenanalyze patch version and re-validate when they bump it.

There is no 0.2.x. Every change in 0.1.x is additive — new variants on #[non_exhaustive] enums, new parallel functions, never a signature change to a shipped item. See CLAUDE.md.

Test surface

130+ tests covering math invariants on synthetic inputs (solid colours, horizontal bands, uniform luma distribution, palette-locked images, two-tone line drawings, smooth gradients, pure noise), the full 16-arm dispatch matrix, every supported pixel format (3 channel-types × 6 transfers × 4 primaries × 2 alpha = 144 sanity-matrix combinations), tier sizes from 1×1 to 4096×4096, deterministic-input bit-equality (catches accumulator non-determinism), u8-promotion bit-equality across u16 / f32 sources, HDR-survival (PQ ~1000-nit content preserved end-to-end where standard tiers would have clipped to SDR), gamut-coverage projections (saturated Rec.2020 green correctly fails sRGB coverage), and AnalyzeError Display / source coverage. Math locks use absolute tolerances chosen to clear ULP-level f32 noise from SIMD tree reductions but catch any genuine architecture divergence.

Note on coverage tooling: the SIMD kernels in tier1.rs, palette.rs, tier2_chroma.rs, and tier3.rs use #[magetypes(... v4, v3, neon, wasm128, scalar)] to generate one source-level monomorphisation per architecture tier. At runtime archmage's incant! dispatches to whichever the CPU supports; the other variants stay compiled but unreachable. Line- coverage tools count each variant separately, so the raw percentage on these files looks ≈30 % on x86_64. Real coverage of executable code paths (counted on the dispatched variant only) is ≥95 % across every module.

Companion crates in this repo

The repository hosts three sibling pieces that the codecs in imazen/zenjpeg, imazen/zenwebp, etc. compose with zenanalyze:

Path	Identity	Status
`zenpredict/`	Rust runtime — zero-copy MLP loader (ZNPR v2 binary format), forward pass, masked argmin, typed metadata, OOD bounds, two-shot rescue policy. Used by codec pickers (`zenjpeg`/`zenwebp`/`zenavif`/`zenjxl`) and by `zensim` V0.4 perceptual scoring	Crate, `0.1.x`
`zenpicker/`	Codec-family meta-picker — given features + target quality + an allowed-family mask, picks `{jpeg, webp, jxl, avif, png, gif}`; per-codec pickers then resolve the family into a concrete encoder config. Wraps `zenpredict::Predictor`	Crate, `0.1.x`
`zentrain/`	Python training pipeline — pareto sweep harness, teacher fit, distill, ablation, holdout probes, safety reports, bake to ZNPR v2 (via `tools/bake_picker.py` → `zenpredict-bake`). Produces both meta-picker and per-codec bakes	Tooling, in-repo

The runtime crates (zenpredict, zenpicker) and the trainer (zentrain) version independently; the binary format (ZNPR v2) is the contract between them. See MIGRATION.md for the path from the previous (unpublished) zenpicker Rust shell to the current layout.

Cross-codec defaults + data discipline: read zentrain/PRINCIPLES.md before adopting or re-baking a picker for any codec (zenjpeg / zenwebp / zenavif / zenjxl / zenpng / zengif / zenpicker / zensim). It's the source of truth for what's invariant — corpus shape, argmin objectives, time- budget patterns, multi-metric bakes, OOD / reach gates, validation gates that block release.

License

AGPL-3.0-only OR LicenseRef-Imazen-Commercial. Commercial licensing available from imazen — contact lilith@imazen.io.

Name		Name	Last commit message	Last commit date
Latest commit History 116 Commits
.claude		.claude
.github/workflows		.github/workflows
benchmarks		benchmarks
docs		docs
examples		examples
src		src
tools		tools
zenpicker		zenpicker
zenpredict		zenpredict
zentrain		zentrain
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTEXT-HANDOFF.md		CONTEXT-HANDOFF.md
Cargo.toml		Cargo.toml
LICENSE-AGPL3		LICENSE-AGPL3
LICENSE-COMMERCIAL		LICENSE-COMMERCIAL
MIGRATION.md		MIGRATION.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

zenanalyze

Cargo features

Why

Quick start

What it computes

Codec-orchestrator gap-fillers

Source-direct HDR / wide-gamut / bit-depth tier

Research-stage

Descriptor-gap detection

Wide gamut, HDR, and bit depth

Errors

How it's organised

Performance

Empirical operating thresholds

Threshold contract

Test surface

Companion crates in this repo

License

About

Licenses found

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

zenanalyze

Cargo features

Why

Quick start

What it computes

Codec-orchestrator gap-fillers

Source-direct HDR / wide-gamut / bit-depth tier

Research-stage

Descriptor-gap detection

Wide gamut, HDR, and bit depth

Errors

How it's organised

Performance

Empirical operating thresholds

Threshold contract

Test surface

Companion crates in this repo

License

About

Resources

License

Licenses found

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages