Add universal visualization tool for processed datasets by stepankonev · Pull Request #27 · stepankonev/StandardE2E

stepankonev · 2026-06-24T12:31:07Z

Summary

A dataset-agnostic tool that turns a processed-output folder (.npz frames + index.parquet) into per-scene MP4s, auto-detecting whichever modalities each frame carries — no per-dataset code. To make the output self-describing, the converter now also writes a dataset_info.yaml, and the BEV adapters record their grid params in metadata.

What's new

standard_e2e/visualization/

render.py — per-frame compositor: a camera mosaic (a per-direction surround grid, or a single stitched panorama as the pano adapter emits) + a single co-registered BEV panel in meters: the raster BEVs (hd_map_bev / lidar_bev / detections_3d_bev, color-composited by channel), lidar_pc, vector detections_3d boxes, and past_states / future_states / preference_trajectory trajectories, plus ego. Camera-less datasets render BEV-only.
visualize_processed.py — CLI. Scene selection: --scene-id (repeatable) xor --num-scenes N (default: first scene); --max-frames, --out. --fps defaults to the rate inferred from frame timestamps so playback is real-time (e.g. ~2 Hz for nuScenes keyframes, ~10 Hz for KITScenes); low rates are encoded by duplicating frames up to a player-friendly ~10 fps without changing the real-time duration.
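The fps inference described above can be sketched roughly as follows. This is not the PR's `_infer_fps`: the name `infer_fps`, the assumption that timestamps are in seconds, and the fallback and clamp bounds are all hypothetical, chosen only to illustrate the median-interval idea.

```python
from statistics import median


def infer_fps(timestamps_s, fallback=10.0, min_fps=1.0, max_fps=30.0):
    """Estimate a playback rate from frame timestamps (assumed to be seconds).

    fallback / min_fps / max_fps are illustrative values, not the PR's.
    """
    ts = sorted(timestamps_s)  # tolerate frames listed out of order
    # Keep positive gaps only: repeated timestamps carry no rate information.
    gaps = [b - a for a, b in zip(ts, ts[1:]) if b > a]
    if not gaps:
        return fallback  # zero or one usable frame: nothing to infer
    fps = 1.0 / median(gaps)  # the median is robust to a few dropped frames
    return min(max(fps, min_fps), max_fps)  # clamp implausible rates
```

With keyframes spaced 0.5 s apart this yields 2.0, and a synthetic 0.1 ms step is clamped to the upper bound instead of producing a 10 kHz video.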

Self-describing output (benefits every dataset)

The converter writes dataset_info.yaml (dataset, split, each adapter's spec) next to index.parquet; AbstractAdapter.spec exposes name + metadata.
HDMapBEVAdapter / Detections3DBEVAdapter / LidarBEVAdapter metadata now carries the grid (min/max x/y, pixels_per_meter) and channel order under f"{modality}_grid" / f"{modality}_channels", so .npz aux_data and the yaml are self-describing — the BEV panel renders correctly for any grid config without hard-coding it.
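To illustrate why recording the grid makes the BEV panel config-independent, here is a minimal sketch of mapping metric coordinates to raster pixels from grid metadata. The key names (`min_x`, `max_x`, `min_y`, `max_y`, `pixels_per_meter`) and the axis convention (x to the right, y up, row 0 at `max_y`) are assumptions for the example; the adapters' actual layout may differ.

```python
def raster_size(grid):
    """Return (width, height) in pixels for a metric grid description."""
    ppm = grid["pixels_per_meter"]
    width = round((grid["max_x"] - grid["min_x"]) * ppm)
    height = round((grid["max_y"] - grid["min_y"]) * ppm)
    return width, height


def meters_to_pixels(points_xy, grid):
    """Map (x, y) points in meters to (col, row) pixel coordinates."""
    ppm = grid["pixels_per_meter"]
    return [
        # Image rows grow downward, so y is flipped: max_y lands on row 0.
        ((x - grid["min_x"]) * ppm, (grid["max_y"] - y) * ppm)
        for x, y in points_xy
    ]


# Hypothetical grid: 100 m x 100 m around ego at 2 pixels per meter.
example_grid = {
    "min_x": -50.0, "max_x": 50.0,
    "min_y": -50.0, "max_y": 50.0,
    "pixels_per_meter": 2.0,
}
```

With a mapping like this, vector layers (boxes, trajectories, ego) and raster layers can share one panel regardless of the grid extent a dataset was converted with.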

Usage

python -m standard_e2e.visualization.visualize_processed \
    /data/out/kitscenes_multimodal/val --num-scenes 2 --out /tmp/viz
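For reference, the scene-selection flags listed under "What's new" could be wired with `argparse` along these lines. This is a sketch, not the PR's parser: the positional name `processed_dir`, the `--out` default, and the helper `select_scenes` are hypothetical; only the flag names come from the description.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="visualize_processed")
    parser.add_argument("processed_dir", help="folder with .npz frames + index.parquet")
    # --scene-id and --num-scenes are mutually exclusive (the "xor" above).
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--scene-id", action="append", default=None,
                       help="scene to render; may be repeated")
    group.add_argument("--num-scenes", type=int, default=None,
                       help="render the first N scenes")
    parser.add_argument("--max-frames", type=int, default=None)
    parser.add_argument("--fps", type=float, default=None,
                        help="override the rate inferred from timestamps")
    parser.add_argument("--out", default="viz")  # assumed default
    return parser


def select_scenes(all_scene_ids, args):
    """Resolve the requested scenes; with no flag, fall back to the first one."""
    if args.scene_id:
        wanted = set(args.scene_id)
        return [s for s in all_scene_ids if s in wanted]
    count = args.num_scenes if args.num_scenes is not None else 1
    return list(all_scene_ids[:count])
```

Passing both `--scene-id` and `--num-scenes` makes `argparse` exit with a usage error, which matches the xor semantics without extra validation code.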

Verification

Rendered videos across 8 datasets from one tool + one shared config (cameras 5→11; per-direction grids and stitched panoramas; both raster BEVs where shipped; vector boxes; real + SfM lidar; trajectories): nuScenes, KITScenes Multimodal, AV2 Sensor, AV2 Lidar (camera-less → BEV-only), Waymo Perception, waymo_e2e, TruckDrive, WayveScenes. Inferred fps matched each capture rate (nuScenes 2 Hz, the 10 Hz datasets 10).

Tests: dataset_info.yaml emission, BEV grid metadata, renderer modality auto-detection across combinations (dict + pano cameras, rasters, point clouds, vector detections, trajectories, near-empty frame), scene selection, _infer_fps, and the CLI end-to-end. Full gate green (pytest / black / isort / flake8 / mypy).

Notes / out of scope

pano vs per-camera layout is auto-handled; legacy .npz files (no grid metadata) degrade gracefully: vectors render, rasters are skipped.
Surfaced separately, not fixed here: the WayveScenes processor emits ~10 kHz timestamps (a 0.1 ms synthetic step). The visualizer reads and clamps them; worth a follow-up in that processor.
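The graceful degradation for legacy .npz files in the first note can be pictured as a per-frame layer plan. The sketch below assumes a frame is a mapping from modality name to array and that grid metadata lives in a separate dict under f"{modality}_grid"; the function name `plan_bev_layers` and this data layout are hypothetical, not taken from render.py.

```python
RASTER_MODALITIES = ("hd_map_bev", "lidar_bev", "detections_3d_bev")
VECTOR_MODALITIES = (
    "lidar_pc",
    "detections_3d",
    "past_states",
    "future_states",
    "preference_trajectory",
)


def plan_bev_layers(frame, metadata):
    """List the BEV layers that can be drawn for one frame."""
    layers = []
    for name in RASTER_MODALITIES:
        # A raster cannot be placed in meters without its grid, so skip it.
        if name in frame and f"{name}_grid" in metadata:
            layers.append(name)
    # Vector modalities are already metric and need no grid metadata.
    layers.extend(name for name in VECTOR_MODALITIES if name in frame)
    return layers
```

A legacy frame carrying `lidar_bev` and `detections_3d` but no grid metadata would therefore plan only the `detections_3d` layer.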

A dataset-agnostic tool that renders processed output (.npz + index.parquet) to per-scene MP4s, auto-detecting whichever modalities each frame carries: camera mosaics (a per-direction surround grid, or a stitched panorama) and a single co-registered BEV panel (hd_map_bev / lidar_bev / detections_3d_bev rasters, lidar_pc, vector detections_3d boxes, past/future/preference trajectories, ego).

- standard_e2e/visualization/: render.py (per-frame compositor) + visualize_processed.py (CLI). Scene selection: --scene-id (repeatable) xor --num-scenes N (default: first scene); plus --max-frames / --fps / --out.
- Converter writes dataset_info.yaml (dataset, split, adapter specs) next to index.parquet; AbstractAdapter.spec exposes name + metadata.
- BEV adapters' metadata now carries the grid (min/max x/y, pixels_per_meter) and channel order under f"{modality}_grid"/f"{modality}_channels", so the .npz aux_data and the yaml are self-describing -- the BEV panel renders correctly for any grid config without hard-coding it.
- Tests: dataset_info.yaml emission, BEV grid metadata, renderer modality auto-detection across combinations, scene selection, and the CLI end-to-end.

- --fps now defaults to the rate inferred from each scene's frame timestamps (median inter-frame interval) so videos play at the data's real-world speed (e.g. ~2 Hz for nuScenes keyframes, ~10 Hz for KITScenes); --fps still overrides.
- Low data rates are encoded by duplicating frames up to a player-friendly ~10 fps without changing the real-time duration -- many players render sub-~10 fps mp4 as static/broken (nuScenes was unplayable at 2 fps).
- Tests for _infer_fps (median / ordering / fallback / clamp).
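The frame-duplication step can be worked through with a little arithmetic. The sketch below is an assumption about how such a plan might be computed, not the PR's code; `encode_plan` and the 10 fps floor parameter are hypothetical names.

```python
import math


def encode_plan(data_fps, min_encode_fps=10.0):
    """Return (repeat, encode_fps): write each frame `repeat` times.

    Encoding at data_fps * repeat keeps the wall-clock duration unchanged,
    because n frames * repeat / (data_fps * repeat) == n / data_fps.
    """
    if data_fps <= 0:
        raise ValueError("data_fps must be positive")
    # Smallest whole repeat count that lifts the encoded rate to the floor.
    repeat = max(1, math.ceil(min_encode_fps / data_fps))
    return repeat, data_fps * repeat
```

For 2 Hz nuScenes keyframes this gives a repeat of 5 at 10 fps, so a 40-frame scene still plays for 20 seconds; 10 Hz data is left untouched with a repeat of 1.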

codecov · 2026-06-24T12:37:57Z

Codecov Report

❌ Patch coverage is 90.11628% with 34 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
standard_e2e/visualization/visualize_processed.py	85.93%	18 Missing ⚠️
standard_e2e/visualization/render.py	93.61%	12 Missing ⚠️
standard_e2e/caching/adapters/lidar_adapter.py	66.66%	2 Missing ⚠️
standard_e2e/caching/source_dataset_converter.py	88.88%	1 Missing ⚠️
standard_e2e/caching/source_dataset_processor.py	66.66%	1 Missing ⚠️


stepankonev added 2 commits June 22, 2026 11:29

stepankonev wants to merge 2 commits into main from add-visualization-tool
