Add KITScenes Multimodal dataset support #26

Merged
stepankonev merged 1 commit into main from add-kitscenes-multimodal on Jun 22, 2026

@stepankonev

Owner

Summary

Adds support for the KITScenes Multimodal dataset (KIT / MRT, arXiv:2606.02956) — a large European urban dataset (~1000 scenes @ 10 Hz) whose headline annotation is a dense, georeferenced Lanelet2 HD map.

New package standard_e2e/caching/src_datasets/kitscenes/ (_kitscenes_io, _kitscenes_geometry, _kitscenes_map, _kitscenes_splits, processor, converter), registered in the process_source_dataset dispatch, plus configs/kitscenes.yaml, scripts/{extract,prepare_dataset}_kitscenes.sh, README + docs/datasets.rst rows/footnotes, a pyproj dependency, and a test suite.

Ingested modalities (→ StandardFrameData):

  • cameras: the 6 surround camera_ring_* views → canonical CameraDirection members (pinhole K, T_ego_from_camera extrinsics).
  • lidar: lidar_top xyz, de-discretized from int32 (× discretization_resolution), invalid-return sentinels dropped, in the ego (base_frame) frame.
  • hd_map: maps/map.osm (Lanelet2) parsed directly → unified MapElementType taxonomy (lanelet road/emergency/bicycle → LANE_CENTER; crosswalk → CROSSWALK; line_thin/line_thick/bike_marking → LANE_BOUNDARY; curbstone/road_border → ROAD_EDGE; stop_line → STOP_LINE; traffic_light* → TRAFFIC_LIGHT), ROI-cropped per frame and rasterized by HDMapBEVAdapter.
  • ego trajectory: poses.txt → T_maplocal_from_ego; past/future via FuturePastStatesFromMatricesAggregator.
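
The Lanelet2-to-MapElementType mapping in the hd_map bullet can be read as a lookup table. A minimal sketch, assuming the tag strings appear in map.osm exactly as written above and that traffic_light* elements can be matched by prefix; the `MapElementType` stand-in, `classify`, and the two dict names are hypothetical and not the PR's actual code:

```python
from enum import Enum
from typing import Optional


class MapElementType(Enum):
    """Local stand-in for the unified taxonomy (the real enum lives in StandardE2E)."""

    LANE_CENTER = "lane_center"
    CROSSWALK = "crosswalk"
    LANE_BOUNDARY = "lane_boundary"
    ROAD_EDGE = "road_edge"
    STOP_LINE = "stop_line"
    TRAFFIC_LIGHT = "traffic_light"


# Lanelet kinds, copied from the PR description.
LANELET_TO_TYPE = {
    "road": MapElementType.LANE_CENTER,
    "emergency": MapElementType.LANE_CENTER,
    "bicycle": MapElementType.LANE_CENTER,
    "crosswalk": MapElementType.CROSSWALK,
}

# Way kinds, copied from the PR description.
WAY_TO_TYPE = {
    "line_thin": MapElementType.LANE_BOUNDARY,
    "line_thick": MapElementType.LANE_BOUNDARY,
    "bike_marking": MapElementType.LANE_BOUNDARY,
    "curbstone": MapElementType.ROAD_EDGE,
    "road_border": MapElementType.ROAD_EDGE,
    "stop_line": MapElementType.STOP_LINE,
}


def classify(is_lanelet: bool, kind: str) -> Optional[MapElementType]:
    """Return the unified type for an element, or None if it is not emitted."""
    if is_lanelet:
        return LANELET_TO_TYPE.get(kind)
    # "traffic_light*" in the description is treated as a prefix match (assumption).
    if kind.startswith("traffic_light"):
        return MapElementType.TRAFFIC_LIGHT
    return WAY_TO_TYPE.get(kind)
```

Kinds that the Limitations section lists as not emitted (for example pole or traffic_sign) fall through to `None` in this sketch.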

One frame = one 10 Hz synced snapshot; one segment = one scene (UUID dir).
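
The lidar bullet above amounts to a mask, a scale, and a rigid transform. A rough sketch with NumPy, under stated assumptions: the function name `decode_lidar_xyz`, the argument layout, and the `invalid_value` sentinel (defaulting here to the int32 minimum) are illustrative only, since the actual sentinel encoding is not stated in this PR.

```python
import numpy as np

# Hypothetical sentinel for invalid returns; the real value is dataset-specific.
DEFAULT_INVALID = np.iinfo(np.int32).min


def decode_lidar_xyz(
    raw_xyz: np.ndarray,
    resolution: float,
    T_ego_from_lidar: np.ndarray,
    invalid_value: int = DEFAULT_INVALID,
) -> np.ndarray:
    """Convert discretized int32 points (N, 3) to float32 xyz in the ego frame."""
    raw_xyz = np.asarray(raw_xyz, dtype=np.int32)
    # Drop any point that has the sentinel in one of its coordinates.
    valid = ~np.any(raw_xyz == invalid_value, axis=1)
    # De-discretize: integer counts times the resolution gives metres.
    xyz_sensor = raw_xyz[valid].astype(np.float64) * resolution
    # Apply the static sensor-to-ego extrinsic (4x4 homogeneous matrix).
    rotation = T_ego_from_lidar[:3, :3]
    translation = T_ego_from_lidar[:3, 3]
    return (xyz_sensor @ rotation.T + translation).astype(np.float32)
```

Only the static extrinsic is applied, which matches the "No per-point LiDAR deskew" limitation listed further down.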

Georeferencing

poses.txt is already in the Lanelet2 map-local frame (UTM zone 32N minus the maps/origin.json anchor), verified by the ego trajectory lying on the map node cloud (mean ~1.4 m to nearest node). The map therefore needs no GNSS reconciliation: OSM nodes are projected with pyproj and the poses are placed directly. There is no lanelet2 runtime dependency, because lanelet2 pins numpy<2 (as does the nuScenes devkit).
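
The projection step described above could look roughly like the following. This is a sketch, not the PR's code: `osm_nodes_to_map_local` and `utm32n_projector` are hypothetical names, and the anchor is assumed to be available as a UTM (easting, northing) pair, since the layout of maps/origin.json is not shown here. The pyproj import is kept inside its helper so the parsing function can be exercised with any projector.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Tuple

# A projector maps (lon, lat) in degrees to (easting, northing) in metres.
Projector = Callable[[float, float], Tuple[float, float]]


def utm32n_projector() -> Projector:
    """WGS84 lon/lat -> UTM zone 32N (EPSG:32632) via pyproj."""
    from pyproj import Transformer  # imported lazily; only needed for real data

    transformer = Transformer.from_crs("EPSG:4326", "EPSG:32632", always_xy=True)
    return transformer.transform


def osm_nodes_to_map_local(
    osm_xml: str,
    origin_en: Tuple[float, float],
    project: Projector,
) -> Dict[int, Tuple[float, float]]:
    """Parse OSM <node> elements and return {node_id: (x, y)} in map-local metres."""
    root = ET.fromstring(osm_xml)
    nodes: Dict[int, Tuple[float, float]] = {}
    for node in root.iter("node"):
        easting, northing = project(float(node.get("lon")), float(node.get("lat")))
        # Map-local frame = UTM coordinates minus the anchor.
        nodes[int(node.get("id"))] = (easting - origin_en[0], northing - origin_en[1])
    return nodes
```

With real data one would pass `utm32n_projector()` as `project`; the zone is hard-coded, in line with the UTM 32N assumption listed under Limitations.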

Verification

  • Pre-commit gate green: pytest / mypy / black / isort / flake8.
  • Tests: geometry / IO / map units + hermetic synthetic-scene build + optional real-data checks (gated on KITSCENES_ROOT).
  • End-to-end conversion on the public sample (KITScenes-Multimodal-Sample, 1 val scene, 220 frames): cameras + lidar_pc + hd_map_bev + aggregated past/future all produced; map verified visually co-registered with the LiDAR across all frames.

Limitations

  • No 3D object detections. KITScenes ships no object boxes/tracks (the HD map is its annotation product), so detections_3d is always empty.
  • Cameras: ring 6 only. The three "base" cameras (the high-res front-center and the rectified stereo pair, used mainly for depth / novel-view benchmarks) are not ingested — they have no canonical CameraDirection slot and adding members was deliberately deferred. No enum changes in this PR.
  • LiDAR: lidar_top only. The other 6 LiDARs (4 automotive + 2 corner) are not ingested; the merged 7-sensor cloud is left as future work.
  • Radar / GNSS-INS not ingested. The 3 imaging radars, the dual GNSS/INS streams and the processed/ model outputs (ground-seg, panoptic) have no StandardE2E target yet.
  • Map scope. Lane-graph connectivity (successor / predecessor / neighbours) is not reconstructed (the BEV rasterizer doesn't use it). pole, traffic_sign, arrow, symbol, pedestrian_marking/zebra_marking ways, and virtual standalone ways are not emitted (no clean unified target; the crossing is captured by the crosswalk lanelet).
  • No per-point LiDAR deskew. Points are placed by the static sensor→ego extrinsic only (the devkit additionally ego-motion-deskews per-point); reflectivity/ring/timestamp columns are dropped (StandardE2E lidar is xyz-only).
  • UTM zone 32N assumed for the map projection (correct for all current KITScenes cities — Karlsruhe / Frankfurt / Sindelfingen).
  • Splits are path-based (the HF data/<split>/<uuid> layout); a flat layout (e.g. the sample) processes every scene with --split as a passthrough output label, rather than vendoring the ~1000 official-split UUIDs.
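
The path-based split behaviour in the last bullet can be illustrated with a small helper. This is a sketch under assumptions: `select_scenes` is a hypothetical name, and the rule "a `data/` directory means the HF layout, anything else is flat" is one plausible way to tell the two layouts apart, not necessarily what the PR does.

```python
from pathlib import Path
from typing import List, Tuple


def select_scenes(root: Path, split: str) -> List[Tuple[Path, str]]:
    """Return (scene_dir, split_label) pairs for the requested split."""
    data_dir = root / "data"
    if data_dir.is_dir():
        # HF layout: data/<split>/<uuid>; only scenes under the requested split.
        split_dir = data_dir / split
        scene_dirs = sorted(p for p in split_dir.iterdir() if p.is_dir()) if split_dir.is_dir() else []
    else:
        # Flat layout (e.g. the sample): every scene, with split as a passthrough label.
        scene_dirs = sorted(p for p in root.iterdir() if p.is_dir())
    return [(p, split) for p in scene_dirs]
```

In the flat case the label is whatever was passed as `split`, which is why no list of official-split UUIDs is needed.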

These mirror the scoping of the VoD / TruckDrive integrations (radar / extra sensors flagged as "no target yet") and are documented in the module docstrings, the README/docs/datasets.rst footnotes, and the config comments.

Notes

  • Does not bump the package version (separate release step).

@stepankonev stepankonev force-pushed the add-kitscenes-multimodal branch from 819f00d to 0753ff0 Compare June 21, 2026 16:38
@codecov

codecov Bot commented Jun 21, 2026

Adds the kitscenes_multimodal dataset (KIT/MRT, arXiv:2606.02956): six surround
ring cameras, the lidar_top point cloud (xyz, ego frame), the ego past/future
trajectory and the Lanelet2 HD map (parsed directly to the unified
MapElementType taxonomy; no lanelet2 runtime dependency). KITScenes ships no 3D
boxes, so detections are empty.

The dataset key is `kitscenes_multimodal` (not the generic `kitscenes`) so the
planned KITScenes-LongTail variant can be added as a sibling, mirroring the
waymo_e2e/waymo_perception and av2_sensor/av2_lidar variant-naming convention.

- New package standard_e2e/caching/src_datasets/kitscenes_multimodal/ (io,
  geometry, map, splits, processor, converter)
- Register in process_source_dataset dispatch; configs/kitscenes_multimodal.yaml;
  extract (shared) + prepare scripts; README + docs/datasets.rst rows/footnotes
- Declare pyproj (used for the HD map's UTM projection)
- Tests: geometry/io/map units + hermetic synthetic-scene build + optional
  real-data checks gated on KITSCENES_ROOT

poses.txt is already in the Lanelet2 map-local frame (UTM 32N minus the
maps/origin.json anchor), verified by the ego trajectory lying on the map node
cloud, so the map needs no GNSS reconciliation.
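
The "trajectory lies on the node cloud" check mentioned above can be expressed as a mean nearest-node distance. A brute-force sketch in plain Python, assuming both inputs are 2D points already in the map-local frame; `mean_nearest_node_distance` is a hypothetical helper, and a spatial index (for example a KD-tree) would be the more likely choice for a full-size map:

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_nearest_node_distance(ego_xy: Sequence[Point], node_xy: Sequence[Point]) -> float:
    """Mean distance (metres) from each ego position to its nearest map node."""
    if not ego_xy or not node_xy:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for pose in ego_xy:
        # Brute force: O(len(ego_xy) * len(node_xy)).
        total += min(math.dist(pose, node) for node in node_xy)
    return total / len(ego_xy)
```

A small value (the PR description reports about 1.4 m) suggests the poses and the map share a frame; a frame mismatch would typically show up as a far larger distance.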
@stepankonev stepankonev force-pushed the add-kitscenes-multimodal branch from 0753ff0 to 7a53464 Compare June 22, 2026 08:07
@stepankonev stepankonev merged commit 148f281 into main Jun 22, 2026
3 checks passed