[WIP] Incorrect scale / alignment of depth or world points when using SparkFastVGGT in VGGT-SLAM2.0 by stepeos · Pull Request #39 · MIT-SPARK/VGGT-SLAM

stepeos · 2026-04-12T22:21:35Z

Hi!

I’ve been working on integrating the FastVGGT model as a VGGT backend for VGGT-SLAM2.0.

Setup / Context

To achieve this, I merged the MIT Spark VGGT fork with the FastVGGT model into a combined repository:

https://github.com/stepeos/SparkFastVGGT.git

This combined model supports two modes:

Without compute_similarity (MIT Spark VGGT path):
- Uses FastVGGT token merging in attention layers
- Significantly more memory efficient
With compute_similarity enabled (original VGGT behavior):
- Token merging is disabled, since it likely interferes with similarity computation

Both modes work as expected in isolation.

Problem

I encounter issues when using SparkFastVGGT as the backend in VGGT-SLAM2.0.

Specifically, I observe that the depth/scale of the reconstructed point cloud appears incorrect.

I suspect the issue may originate in one of the following stages:

- set_point_cloud
- get_points_in_world_frame
- or an intermediate transformation step

(I should add that I also tried using world_points with enable_points enabled and reprojecting the points into the camera frame.)

It looks like there may be an additional transformation applied (possibly SL(4) on top of SE(3)) that affects the dense per-frame point cloud in world coordinates.
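For context on the SL(4) versus SE(3) distinction: a general 4x4 homography acts on homogeneous points and needs a perspective divide, and it is only defined up to scale, so it is usually normalized to unit determinant. The following is a minimal sketch of those two operations in NumPy. The function names `apply_homography` and `normalize_to_sl4` are hypothetical and are not taken from VGGT-SLAM; the positive-determinant requirement is an assumption of this sketch.

```python
import numpy as np


def apply_homography(H, points):
    """Apply a 4x4 homography to (N, 3) points, including the perspective divide."""
    pts = np.asarray(points, dtype=np.float64)
    homog = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)  # (N, 4)
    mapped = homog @ np.asarray(H, dtype=np.float64).T             # (N, 4)
    return mapped[:, :3] / mapped[:, 3:4]


def normalize_to_sl4(H):
    """Rescale a 4x4 matrix to determinant 1 (assumes det(H) > 0)."""
    H = np.asarray(H, dtype=np.float64)
    det = np.linalg.det(H)
    if det <= 0:
        raise ValueError("expected a matrix with positive determinant")
    # Scaling a 4x4 matrix by c multiplies its determinant by c**4.
    return H / det ** 0.25
```

Because homogeneous coordinates are scale invariant, `apply_homography(normalize_to_sl4(H), p)` gives the same points as `apply_homography(H, p)`. Any metric scale change therefore has to come from the upper-left 3x3 block or the last row of the matrix, not from the overall normalization.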

Observations

What makes this particularly confusing is that everything seems consistent in isolation:

In demo_viser.py from SparkFastVGGT:
- The world_points match the points obtained from unproject_depth_map_to_point_map
- These results also match the original VGGT implementation

So the point cloud generation itself appears correct.
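A consistency check of this kind can be sketched as follows. This is a minimal, assumed helper (the name `compare_point_maps` is hypothetical and not part of either repository) that reports the largest coordinate difference between two point maps and the median ratio of their distances from the origin. A ratio far from 1 would hint at a scale mismatch rather than a pure misalignment.

```python
import numpy as np


def compare_point_maps(points_a, points_b, mask=None):
    """Compare two point maps of identical shape (..., 3).

    Returns (max_abs_diff, scale_ratio), where scale_ratio is the median of
    |a| / |b| over points whose norm in b is not close to zero.
    """
    a = np.asarray(points_a, dtype=np.float64).reshape(-1, 3)
    b = np.asarray(points_b, dtype=np.float64).reshape(-1, 3)
    if mask is not None:
        keep = np.asarray(mask, dtype=bool).reshape(-1)
        a, b = a[keep], b[keep]
    max_abs_diff = float(np.abs(a - b).max())
    norm_a = np.linalg.norm(a, axis=1)
    norm_b = np.linalg.norm(b, axis=1)
    valid = norm_b > 1e-9  # skip points at the origin to avoid division by zero
    scale_ratio = float(np.median(norm_a[valid] / norm_b[valid]))
    return max_abs_diff, scale_ratio
```

Running the same check on the points right before and right after each SLAM-side transformation is one way to narrow down where the two pipelines start to diverge.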

However, when integrated into VGGT-SLAM2.0, the scale / alignment becomes incorrect.

The same goes for the poses. After optimization, the scale of the SL(4) transforms causes the submaps to drift apart, as can be seen in the screenshot below.

Expected Behavior

The reconstructed point cloud should match the correct world-scale geometry as seen in:

- demo_viser.py
- the original VGGT implementation

Actual Behavior

When running VGGT-SLAM2.0 with the SparkFastVGGT backend, the reconstruction shows incorrect scale and misalignment.

Example (office loop):

SparkFastVGGT in VGGT-SLAM2.0:

Question

Could you point me in the right direction for tracking down the incorrect transformation?

Any pointers on where an unintended scale or transform might be introduced would be greatly appreciated.
I think it would be a huge win to get the SLAM-only part working with FastVGGT as the backend.

stepeos · 2026-05-06T17:36:26Z

@Dominic101 any advice on where to look would be greatly appreciated :)

Dominic101 · 2026-05-07T12:49:59Z

Hi @stepeos, thanks for your interest in VGGT-SLAM. The problem is almost certainly that you need to modify FastVGGT's version of unproject_depth_map_to_point_map to return points that are defined wrt each camera instead of wrt the first camera. You can see how I changed VGGT's function in VGGT_SPARK here https://github.com/MIT-SPARK/VGGT_SPARK/blob/6e6e16107b88e8e76c751826af10d4295d87ecd2/vggt/utils/geometry.py#L15. This change is needed because the homography matrices we compute assume points are defined wrt each camera.
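The difference between the two conventions can be illustrated with a minimal NumPy sketch. It is not the code from VGGT_SPARK or FastVGGT; the function names are hypothetical, and it assumes a pinhole intrinsic matrix K and a 3x4 world-to-camera extrinsic [R | t] where the world frame coincides with the first camera.

```python
import numpy as np


def depth_to_camera_points(depth, K):
    """Unproject an (H, W) depth map into (H, W, 3) points in that camera's own frame."""
    height, width = depth.shape
    u, v = np.meshgrid(np.arange(width), np.arange(height))  # pixel column, row
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)


def camera_to_first_camera_points(points_cam, extrinsic):
    """Map camera-frame points into the world (first-camera) frame.

    Assumes extrinsic is a 3x4 world-to-camera matrix [R | t], so that
    X_cam = R @ X_world + t and therefore X_world = R.T @ (X_cam - t).
    """
    R, t = extrinsic[:, :3], extrinsic[:, 3]
    # For row vectors, (X_cam - t) @ R equals (R.T @ (X_cam - t)).T
    return (points_cam - t) @ R
```

Under these assumptions, the per-camera variant corresponds to returning the output of `depth_to_camera_points` directly, while the first-camera variant additionally applies `camera_to_first_camera_points`.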

By the way: if you get a chance to include a timing comparison showing the speed-up of VGGT-SLAM with FastVGGT, that would be awesome.

stepeos · 2026-05-07T12:54:03Z

Wow, I completely missed that. Thank you so much for taking the time to help me out here, I really appreciate it! I will definitely do that, but I can already say the speed-up is significant, since each inference batch can hold more images (because of the lower VRAM usage) while inference itself is also faster. I will include a speed comparison script for the office loop.

add token merge

37b2e75

stepeos changed the title ~~Incorrect scale / alignment of depth or world points when using SparkFastVGGT in VGGT-SLAM2.0~~ [WIP] Incorrect scale / alignment of depth or world points when using SparkFastVGGT in VGGT-SLAM2.0 Apr 12, 2026

remove merge error

8d6e5f8

stepeos wants to merge 2 commits into MIT-SPARK:main from stepeos:main
