splat ortho #2

@nozdrenkov

Create an orthomosaic of a 3D Gaussian splatting model

Motivation

We have finally reached the point where we can create geometrically correct, photorealistic 3D Gaussian splatting models of coral reefs. While it's great and super fun to fly around corals, many downstream analytics tools (e.g. TagLab, CoralNet, ReefCloud, CoralSCOP, Data Mermaid) work only with 2D images or orthomosaics. Just as you primarily use the 2D view rather than the 3D view on Google Maps, it would be super beneficial to have 2D top-down views of the corals. We can already create orthos from the point cloud itself: https://3d.wildflow.ai/o/ootbm-t0-pc. Now it would be super cool to do the same from splats.

Requirements

  • The interface could vary depending on the implementation, but the general idea is:
from wildflow import splat

splat.ortho(
    input_model_path,           # .ply file that can be opened by SuperSplat
    output_ortho_path,          # output file with the orthomosaic
    max_side_px=20000,          # one way of defining resolution: longest side of the output, in pixels
    resolution_px=None,         # alternatively, specify meters per pixel, e.g. 0.0004
    format='tiff'               # tiff is the best, png also works
)
  • P0: the ortho must be geometrically correct, with no perspective distortion
  • P0: SuperSplat-like render quality
  • P0: support for massive models, e.g. 10-30M splats
  • P0: works on (Windows 10+ with CUDA) or (Google Colab) or (Ubuntu with CUDA)
  • P0: runtime of up to roughly 30 min per 20M splats on an RTX 4090 equivalent
  • P1: the ability to define resolution is cool but not critical; if it simplifies things, we could start with a fixed resolution, e.g. always 30k pixels max
  • P1: black background by default
  • P2: works on Metal, e.g. a MacBook Air M2
  • P2: for tiff, prefer NZP compression and BigTIFF format
  • P2: support both tiff and png, but one is a good starting point
  • P3: ideally as portable as possible, e.g. it would be nice to avoid PyTorch and CUDA; maybe WebGPU, or even Mojo?
  • P3: ability to set any background the user wants
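
To make the relationship between the two resolution options concrete, here is a rough sketch of how the output raster size could be derived from the model footprint. It assumes the footprint is the axis-aligned XY bounding box of the splats, in meters. The names `ortho_size`, `bounds_min`, `bounds_max` and `meters_per_px` are hypothetical and not part of any existing wildflow API.

```python
import math


def ortho_size(bounds_min, bounds_max, max_side_px=20000, meters_per_px=None):
    """Return (width_px, height_px, meters_per_px) for a top-down ortho.

    bounds_min / bounds_max: (x, y) corners of the footprint in meters
    (assumption: axis-aligned bounding box of the splat centres).
    """
    width_m = bounds_max[0] - bounds_min[0]
    height_m = bounds_max[1] - bounds_min[1]
    if width_m <= 0 or height_m <= 0:
        raise ValueError("footprint must have positive width and height")

    if meters_per_px is None:
        # No explicit ground resolution: fit the longest side into max_side_px.
        meters_per_px = max(width_m, height_m) / max_side_px

    # Small epsilon so float round-off does not add a spurious extra pixel.
    eps = 1e-9
    width_px = max(1, math.ceil(width_m / meters_per_px - eps))
    height_px = max(1, math.ceil(height_m / meters_per_px - eps))
    return width_px, height_px, meters_per_px
```

For example, an 8 m by 4 m footprint with `max_side_px=20000` works out to 0.0004 m per pixel, which is the example value in the interface above.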

more info:

Implementation ideas

1. Tortho-Gaussian (current winner)

A great paper on the topic recently came out, and there's even some code:

We might need camera poses in COLMAP format; you can find them here: https://huggingface.co/datasets/wildflow/sweet-corals/tree/main/_indonesia_tabuhan_p1_20250210/colmap. Ideally, though, only the splats would be needed as input.

We might need to convert the splats from .ply into another format. You can use this CLI for that: https://github.com/playcanvas/splat-transform

2. SuperSplat in headless browser (some progress)

SuperSplat (https://superspl.at/editor) is amazing at rendering splats, and it supports ortho mode (e.g. when you click the (z) axis button). I thought it might not be the worst idea to use it to render everything. I created a quick Python script that runs SuperSplat in a headless browser and saves a PNG:

The output for small patches (say 2x2m) is really impressive. My plan was to split the whole .ply into 1x1m chunks, render each one, and stitch the patches together. Unfortunately, the rendered patches don't align perfectly; it feels like there's still a bit of perspective in them. Sadly, this hacky approach didn't work out for me in the end.
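
For reference, if each patch were rendered with a true orthographic camera, stitching would reduce to pasting tiles at whole-pixel offsets. Here is a minimal sketch of a pixel-aligned tiling, assuming a top-down view with X to the right and Y up in world space. `tile_grid` and `tile_bounds_m` are hypothetical helpers, not part of SuperSplat or any existing tool.

```python
def tile_grid(width_px, height_px, tile_px):
    """Yield (col_off, row_off, w, h) pixel windows covering the full ortho."""
    for row_off in range(0, height_px, tile_px):
        for col_off in range(0, width_px, tile_px):
            w = min(tile_px, width_px - col_off)   # last column may be narrower
            h = min(tile_px, height_px - row_off)  # last row may be shorter
            yield col_off, row_off, w, h


def tile_bounds_m(window, x_min, y_max, meters_per_px):
    """Convert a pixel window to world bounds (x_min, y_min, x_max, y_max).

    Assumption: pixel (0, 0) is the top-left corner at world (x_min, y_max),
    and image rows grow downwards while world Y grows upwards.
    """
    col_off, row_off, w, h = window
    left = x_min + col_off * meters_per_px
    top = y_max - row_off * meters_per_px
    return left, top - h * meters_per_px, left + w * meters_per_px, top
```

Because every tile edge falls on a pixel boundary of the final image, neighbouring tiles share exact world coordinates along their seams, so any remaining misalignment would point at the projection rather than at the stitching.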

3. Reverse engineer and adapt

There are a lot of amazing renderers that work in the browser and have an MIT licence:

We could effectively create our own renderer that uses an orthographic projection instead of a perspective one.
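
To illustrate why the orthographic case is simpler: the top-down mapping from world to pixels is affine, so a Gaussian's 2D footprint is exactly J Σ Jᵀ with one constant J for every splat, rather than a per-splat local approximation as in perspective splatting. This is a minimal sketch and not taken from any of the renderers mentioned; it assumes Z is up, the camera looks straight down, and `project_splat_ortho` and its parameters are hypothetical names.

```python
def project_splat_ortho(mean, cov, x_min, y_max, meters_per_px):
    """Project one 3D Gaussian onto a top-down pixel grid.

    mean: (x, y, z) in meters.
    cov:  3x3 world-space covariance as nested lists.
    Returns ((u, v), cov2d, height), where height (z) can be used to sort
    splats for compositing (assumption: larger z is closer to the camera).
    """
    s = 1.0 / meters_per_px  # pixels per meter
    x, y, z = mean

    # Affine world -> pixel map; image rows grow downwards, hence the Y flip.
    u = (x - x_min) * s
    v = (y_max - y) * s

    # J = s * [[1, 0, 0], [0, -1, 0]], so cov2d = J @ cov @ J^T reduces to
    # the XY block of cov, scaled by s^2, with the off-diagonal sign flipped.
    cov2d = [
        [s * s * cov[0][0], -s * s * cov[0][1]],
        [-s * s * cov[1][0], s * s * cov[1][1]],
    ]
    return (u, v), cov2d, z
```

A full renderer would still need to evaluate each 2D Gaussian over the pixels it covers and alpha-blend in height order; the sketch only covers the projection step that differs from the perspective case.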

Conclusion

At this stage any approach is good, as long as we get a high-res photorealistic ortho in the end. If you have any questions, feel free to ask here or in our Discord (https://wildflow.ai/discord). Thank you so much for the help!
