🗺️ MR.ScaleMaster

Scale-Consistent Collaborative Mapping from Crowd-Sourced Monocular Videos

 


*Teaser figure: MR.ScaleMaster overview*

Each robot independently runs a scale-ambiguous monocular front-end (MASt3R-SLAM, VGGT-SLAM 2.0, Pi3, or LoGeR). MR.ScaleMaster fuses the per-robot trajectories and point clouds into a single, metrically consistent global map using Sim(3) loop-closure constraints optimized with g2o.

```mermaid
graph LR
    R1[Robot 1] --> LC[Heterogeneous Front-end]
    R2[Robot 2] --> LC
    RN[Robot N] --> LC
    LC --> G2O[Sim3 Graph Optimization]
    G2O --> MAP[Consistent Global Map]
```


📋 Requirements

| Component | Specification |
| --- | --- |
| OS | Ubuntu 22.04 / 24.04 |
| GPU | CUDA-capable (tested on RTX 5090) |
| Python | 3.11+ |


⚙️ Installation

```bash
git clone git@github.com:team-aprl/MR.ScaleMaster.git
cd MR.ScaleMaster
```

💡 Tip: scripts/setup.bash (environment + build) and scripts/download_checkpoint.sh (checkpoints) are independent — we recommend running them in parallel in two separate terminals to save time.

🖥️ Terminal 1 — Environment & Build

```bash
./scripts/setup.bash
```

🖥️ Terminal 2 — Checkpoint Download

```bash
./scripts/download_checkpoint.sh
```

What scripts/setup.bash does automatically

  • ✅ Installs uv (if missing)
  • ✅ Creates a Python 3.11 virtual environment (.venv/)
  • ✅ Detects your CUDA version and installs the matching PyTorch
  • ✅ Clones and installs MASt3R-SLAM (into ../MASt3R-SLAM/)
  • ✅ Downloads model checkpoints (~3.0 GB)
  • ✅ Installs all Python dependencies
  • ✅ Builds the C++ Sim(3) optimizer


🚀 Usage

```bash
# Run
./scripts/run.sh examples/Exp1     --fps 2.0 --config config/Exp1.yaml
./scripts/run.sh examples/Exp2     --fps 2.0 --config config/Exp2.yaml
./scripts/run.sh examples/kitti_00 --fps 1.0 --config config/KITTI.yaml   # if you downloaded the KITTI datasets
```

💡 scripts/run.sh activates the virtual environment and sets PYTHONPATH automatically — no manual source or export needed.

Arguments

| Argument | Default | Description |
| --- | --- | --- |
| `data_root` | `./examples/Exp1` | Path to dataset folder |
| `--fps` | `1.0` | Playback rate (keyframes per second per robot) |
| `--config` | (none) | Path to the experiment config YAML (see `config/`) |

🖼️ A GUI will open. Click ▶ Start to begin loading. ■ Stop pauses and ▶ Start resumes from where it left off.



🎬 Try Your Own Data!

No calibration required. Just bring your videos.

```mermaid
graph LR
    V["📹 video.mp4"] -->|"÷ N"| R1["robot_01"]
    V -->|"÷ N"| R2["robot_02"]
    V -->|"÷ N"| RN["robot_N"]
    R1 --> L1["LoGeR"]
    R2 --> L2["LoGeR"]
    RN --> LN["LoGeR"]
    L1 -->|"pose + pcd"| M["MR.ScaleMaster"]
    L2 -->|"pose + pcd"| M
    LN -->|"pose + pcd"| M
    M --> O["🗺️ Global Map"]
```

Step 1 — Install LoGeR (one-time, optional front-end)

```bash
./scripts/install_loger.sh
```

Step 2 — Run the pipeline

```bash
# Usage:
./scripts/do_collaborative_mapping.sh <input_video> <num_robots> [options]

# Example:
./scripts/do_collaborative_mapping.sh your_videos/your_video.mp4 4
```

Open config/default.yaml and adjust a few parameters if needed (frontend, image resolution, etc.). The script automatically splits videos into keyframes, runs LoGeR per video, and fuses everything with MR.ScaleMaster.

💡 Tip: For longer videos, increase subsample in scripts/do_collaborative_mapping.sh (default: 2) to speed up processing.
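The splitting step can be sketched as contiguous per-robot segments that keep every `subsample`-th frame. A dependency-free sketch with hypothetical names, not the script's actual implementation:

```python
def split_frames(num_frames: int, num_robots: int, subsample: int = 2) -> list[list[int]]:
    """Partition frame indices into contiguous per-robot segments,
    keeping every `subsample`-th frame inside each segment."""
    per_robot = num_frames // num_robots
    segments = []
    for r in range(num_robots):
        start = r * per_robot
        # Let the last robot absorb any remainder frames.
        end = num_frames if r == num_robots - 1 else start + per_robot
        segments.append(list(range(start, end, subsample)))
    return segments
```

For a 100-frame video split 4 ways with the default `subsample` of 2, each robot receives 13 frame indices (0, 2, ..., 24 for the first).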



📦 Example Data

Exp1 and Exp2 are included in the repository. Additional KITTI sequences are available on 🤗 HuggingFace.

```bash
source .venv/mrscalemaster/bin/activate

for seq in kitti_00 kitti_02 kitti_05 kitti_07 kitti_08; do
  hf download --repo-type dataset hyoseokju/examples ${seq}.tar.gz --local-dir examples/
  tar -xzf examples/${seq}.tar.gz -C examples/
  rm examples/${seq}.tar.gz
done
```
📁 Dataset Format (click to expand)
```
examples/
└── kitti_00/
    ├── robot_01/
    │   ├── kf_000000/
    │   │   ├── image.png
    │   │   ├── pose_4x4.npy          # Sim(3) pose (4×4, scale encoded in rotation)
    │   │   ├── pointcloud_local.npz  # keys: xyz (N,3), colors (N,3) uint8
    │   │   └── pose_tum.txt          # timestamp tx ty tz qx qy qz qw
    │   └── kf_000001/ ...
    ├── robot_02/ ...
    └── robot_03/ ...
```

Arbitrary folder names (e.g., go2, hand_held_01) are also supported — sorted alphabetically and assigned robot IDs automatically.

🎯 Bring your own data: Prepare your data in the format above and MR.ScaleMaster will work out of the box.
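Since `pose_4x4.npy` folds the scale into its rotation block, the scale can be recovered from that block's determinant: det(sR) = s³ because det(R) = 1. A minimal, dependency-free sketch (in practice you would load the matrix with numpy; the helper names are ours, not part of the codebase):

```python
Mat3 = list[list[float]]

def det3(m: Mat3) -> float:
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def decompose_sim3(T: list[list[float]]) -> tuple[float, Mat3, list[float]]:
    """Split a 4x4 Sim(3) matrix into (scale, rotation, translation).

    The upper-left 3x3 block is s * R with det(s * R) = s^3,
    so s is the cube root of the block's determinant.
    """
    sR = [row[:3] for row in T[:3]]
    s = det3(sR) ** (1.0 / 3.0)
    R = [[v / s for v in row] for row in sR]
    t = [row[3] for row in T[:3]]
    return s, R, t
```

For an identity rotation scaled by 2 with translation (1, 2, 3), `decompose_sim3` recovers s ≈ 2.0 and the unscaled rotation.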



🔧 Configuration

Each experiment has its own config file under config/ (Exp1.yaml, Exp2.yaml, KITTI.yaml).

📝 Full configuration structure (click to expand)
```yaml
device: "cuda:0"

paths:
  mast3r_config:      "./MASt3R-SLAM/config/base.yaml"
  model_weights:      "./MASt3R-SLAM/checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth"
  retriever_weights:  "./MASt3R-SLAM/checkpoints/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth"
  loop_vis_save_dir:  "./MASt3R-SLAM/retrieval_test/loop_closures"

image:
  height: 384
  width:  512

# noise values are 1-sigma:  information = 1 / sigma^2
# rotation unit: degrees,  translation unit: meters
graph:
  odometry:
    t_noise: [0.1, 0.1, 0.1]   # translation sigma (m)
    r_noise: [5.0, 5.0, 5.0]   # rotation sigma (deg)
    s_noise: 0.05              # scale sigma
  loop:
    t_noise: [1.0, 1.0, 1.0]
    r_noise: [25.0, 25.0, 25.0]
    s_noise: 1.0
  loop_inhibit_window:    5    # frames to suppress repeated loop edges between the same pair
  same_robot_min_seq_gap: 90   # minimum seq-index gap for same-robot loop closure

matching:
  retrieval_k:          3      # top-k candidates from image retrieval
  retrieval_min_thresh: 0.025  # minimum retrieval score to consider a candidate
  quality_threshold:    1.5    # Qk threshold for valid 3-D correspondences
  match_frac_min:       0.1    # minimum fraction of valid matches to attempt Sim3

point_cloud:
  local_voxel_size:  0.3   # voxel size for local map downsampling (m)
  pre_slice_rate:    10    # stride applied before voxel downsampling
  global_voxel_size: 0.3   # voxel size for global map rebuild after optimization (m)

anchor:
  scale_min: 0.5   # reject child anchor if scale < this
  scale_max: 4.0   # reject child anchor if scale > this

optimization:
  stage1_iters: 10
  stage2_iters: 50
  verbose:      0

vis:
  point_radii:      0.15   # rerun point cloud radius
  trajectory_radii: 0.15   # rerun trajectory line radius
```

⚠️ Update paths.mast3r_config, paths.model_weights, and paths.retriever_weights to match your MASt3R-SLAM installation path.
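As the config comments note, noise entries are 1-sigma values, and each edge's information weight is 1/σ². A small sketch of that conversion (the helper name is ours, not from the codebase):

```python
def information_from_sigma(sigmas: list[float]) -> list[float]:
    """Convert 1-sigma noise values into diagonal information weights (1 / sigma^2)."""
    return [1.0 / (s * s) for s in sigmas]

# From the config above: odometry translation sigma 0.1 m -> weight ~100 per axis,
# while loop translation sigma 1.0 m -> weight 1, so a single odometry edge is
# trusted far more than a single loop-closure edge.
odom_t_info = information_from_sigma([0.1, 0.1, 0.1])
loop_t_info = information_from_sigma([1.0, 1.0, 1.0])
```

The same rule explains the anchor of the scale terms: a tight odometry scale sigma of 0.05 yields a much larger weight than the loose loop scale sigma of 1.0.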



📂 Project Structure

🗂️ View directory tree (click to expand)
```
MR.ScaleMaster/
├── main.py                            # Entry point
├── requirements.txt
│
├── scripts/
│   ├── run.sh                         # Launch script (activates venv, sets PYTHONPATH)
│   ├── setup.bash                     # One-time environment & build setup
│   ├── download_checkpoint.sh         # Model checkpoint downloader
│   ├── install_loger.sh               # LoGeR front-end setup (optional)
│   ├── do_collaborative_mapping.sh    # Try Your Own Data! pipeline
│   └── demo_viser_for_mrscalemaster.py
│
├── config/
│   ├── Exp1.yaml                      # Config for Exp1
│   ├── Exp2.yaml                      # Config for Exp2
│   └── KITTI.yaml                     # Config for KITTI sequences
│
├── cores/                             # Main Python package
│   ├── dataloader.py                  # Dataset scanner & Qt data loader thread
│   ├── slam_backend.py                # MASt3R inference + Sim(3) loop detection
│   ├── optimizer_worker.py            # g2o optimization thread
│   ├── inference_thread.py            # Per-robot inference & map update
│   ├── retrieval.py                   # Image retrieval for loop candidates
│   ├── visualizer.py                  # Rerun-based 3D visualization
│   ├── config.py                      # Colors, timing utilities
│   ├── math_utils.py
│   ├── io_utils.py
│   └── gui/
│       ├── monitor_window.py          # Main Qt window
│       └── robot_panel.py             # Per-robot status panel
│
└── cpp/                               # C++ Sim(3) optimizer (pybind11)
    ├── build.sh
    ├── CMakeLists.txt
    ├── cmake/
    ├── src/
    ├── include/
    └── thirdparty/
        ├── g2o/                       # git submodule
        └── cnpy/                      # git submodule
```


🛠️ Building C++ Extensions Manually

```bash
cd cpp
./build.sh           # Release build
./build.sh debug     # Debug build
./build.sh clean     # Remove build/
./build.sh rebuild   # Clean + rebuild
```

📦 The compiled .so is placed directly in cores/ so Python can import it as import g2o_multirobot as g2o.



📚 Citation

If you use this work, please cite:

```bibtex
@article{ju2026mrscalemaster,
  title   = {{MR.ScaleMaster}: Scale-Consistent Collaborative Mapping from Crowd-Sourced Monocular Videos},
  author  = {Ju, Hyoseok and Kim, Giseop},
  journal = {arXiv preprint arXiv:2604.11372},
  year    = {2026}
}
```
📖 BibTeX entries for underlying front-ends (click to expand)
```bibtex
@inproceedings{mast3rslam,
  title     = {MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors},
  author    = {Murai, Riku and others},
  booktitle = {CVPR},
  year      = {2025}
}

@article{maggio2026vggtslam2,
  title   = {VGGT-SLAM 2.0: Real-time Dense Feed-forward Scene Reconstruction},
  author  = {Maggio, Dominic and Carlone, Luca},
  journal = {arXiv preprint arXiv:2601.19887},
  year    = {2026}
}

@article{wang2025pi3,
  title   = {$\pi^3$: Permutation-Equivariant Visual Geometry Learning},
  author  = {Wang, Yifan and Zhou, Jianjun and Zhu, Haoyi and others},
  journal = {arXiv preprint arXiv:2507.13347},
  year    = {2025}
}

@article{zhang2026loger,
  title   = {LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory},
  author  = {Zhang, Junyi and Herrmann, Charles and Hur, Junhwa and Sun, Chen and Yang, Ming-Hsuan and Cole, Forrester and Darrell, Trevor and Sun, Deqing},
  journal = {arXiv preprint arXiv:2603.03269},
  year    = {2026}
}
```
