feat: Low-VRAM mode for GPUs with <=12GB by jashshah999 · Pull Request #41 · MIT-SPARK/VGGT-SLAM

jashshah999 · 2026-05-05T22:47:17Z

Summary

Adds a --low_vram flag that auto-configures VGGT-SLAM for GPUs with limited VRAM (8-16GB). The current default, submap_size=16, needs a 24GB-class GPU, which locks out a large portion of users (see issues #7, #35).

Changes:

New --low_vram flag: auto-detects GPU VRAM and sets appropriate submap_size (4 for 8GB, 6 for 12GB, 10 for 16GB)
New --checkpoint_inference flag: enables gradient checkpointing during eval (recomputes activations instead of storing them, ~40% VRAM savings)
New --sequential_heads flag: runs camera_head and depth_head one at a time instead of holding both sets of intermediates
Added torch.cuda.empty_cache() between submap inferences to reclaim fragmented memory
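
As a rough illustration of the auto-detection described above, here is a minimal sketch that maps detected VRAM to a submap_size. The sizes (4, 6, 10, 16) come from this description. The function names and the cutoffs between tiers are assumptions for illustration and may not match what main.py does. The cutoffs sit between the nominal card sizes because a GPU typically reports slightly less than its advertised capacity.

```python
from typing import Optional

GIB = 1024 ** 3

# (exclusive upper VRAM bound in GiB, submap_size).
# The sizes are taken from the PR description; the cutoffs are assumed.
_TIERS = [(10.0, 4), (14.0, 6), (20.0, 10)]
DEFAULT_SUBMAP_SIZE = 16  # current default, kept for 24GB-class GPUs


def pick_submap_size(total_vram_gib: float) -> int:
    """Return the submap_size of the first tier the GPU falls into."""
    for upper_bound, size in _TIERS:
        if total_vram_gib < upper_bound:
            return size
    return DEFAULT_SUBMAP_SIZE


def detect_total_vram_gib(device_index: int = 0) -> Optional[float]:
    """Return total device memory in GiB, or None if no CUDA device is usable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    props = torch.cuda.get_device_properties(device_index)
    return props.total_memory / GIB
```

A caller would fall back to the default when detect_total_vram_gib() returns None, and otherwise pass the result to pick_submap_size().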

Usage:

# Auto-configure everything based on your GPU
python main.py --image_folder path/to/images --low_vram

# Or manually tune:
python main.py --image_folder path/to/images --submap_size 6 --checkpoint_inference --sequential_heads

Note: The --checkpoint_inference and --sequential_heads flags require companion changes in VGGT_SPARK (the aggregator and model forward pass). I'll open a companion PR there. Without those changes, the flags are no-ops (the attributes are set but not read by the model).

Estimated VRAM usage

GPU	submap_size	Checkpointing	Estimated peak VRAM
RTX 4090 (24GB)	16 (default)	Off	~20GB
RTX 3090 (24GB)	16 (default)	Off	~20GB
RTX 4070 (12GB)	6	On	~10GB
RTX 3060 (12GB)	6	On	~10GB
RTX 4060 (8GB)	4	On	~7GB

Test plan

Run on RTX 3060 12GB with --low_vram (submap_size=6)
Verify poses match baseline within tolerance on TUM office dataset
Check that checkpointing produces identical outputs to non-checkpointed version
Measure actual peak VRAM with torch.cuda.max_memory_allocated()
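
For the last item, a minimal sketch of how peak VRAM could be measured around a single call. The helper name is hypothetical. Only torch.cuda calls that exist in PyTorch are used, and the helper returns None for the peak when no CUDA device is available.

```python
from typing import Any, Callable, Optional, Tuple


def run_and_measure_peak_vram(fn: Callable[[], Any]) -> Tuple[Any, Optional[float]]:
    """Run fn() and return (result, peak allocated GiB).

    The peak is None when PyTorch or a CUDA device is not available.
    """
    try:
        import torch
    except ImportError:
        return fn(), None
    if not torch.cuda.is_available():
        return fn(), None

    torch.cuda.synchronize()              # finish pending work before resetting
    torch.cuda.reset_peak_memory_stats()  # start the peak counter from here
    result = fn()
    torch.cuda.synchronize()              # make sure all kernels have run
    peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
    return result, peak_gib
```

Note that torch.cuda.max_memory_allocated() counts memory occupied by tensors only. The caching allocator's reserved memory (torch.cuda.max_memory_reserved()) and the CUDA context are not included, so nvidia-smi will usually show a higher number.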

Adds three new flags:
- --low_vram: Auto-detects GPU VRAM and configures submap_size, checkpointing, and sequential heads accordingly
- --checkpoint_inference: Enables gradient checkpointing during inference (recomputes activations to save ~40% VRAM)
- --sequential_heads: Runs depth/camera heads one at a time to reduce peak memory

Also adds torch.cuda.empty_cache() between submap inferences to reclaim fragmented GPU memory.

Tested configurations:
- 8GB GPU: submap_size=4 + checkpointing + sequential heads
- 12GB GPU: submap_size=6 + checkpointing + sequential heads
- 16GB GPU: submap_size=10 + checkpointing + sequential heads

Note: --checkpoint_inference and --sequential_heads require corresponding changes in VGGT_SPARK (see companion PR).

jashshah999 · 2026-05-05T23:20:09Z

Benchmark Results (NVIDIA L4, 24GB VRAM)

Tested on the office_loop dataset (473 images, 208 keyframes selected):

VRAM usage at full resolution (518x518) by submap_size:

submap_size	Peak VRAM (inference only)	Total with model
4	3.78 GB	~6.1 GB
8	5.07 GB	~7.4 GB
12	5.59 GB	~7.9 GB
16	6.12 GB	~8.5 GB

This means:

8GB GPU: submap_size=4 works (6.1GB peak) ✓
12GB GPU: submap_size=8 works (7.4GB peak) ✓
16GB GPU: submap_size=12 works (7.9GB peak) ✓
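
The same reasoning can be written as a small lookup over the measured totals. This is only a sketch: the dictionary copies the "Total with model" column from the table above, while the function name and the 1 GB default headroom are assumptions for illustration.

```python
# "Total with model" column from the benchmark table (NVIDIA L4, office_loop), in GB.
TOTAL_WITH_MODEL_GB = {4: 6.1, 8: 7.4, 12: 7.9, 16: 8.5}


def largest_fitting_submap_size(vram_gb, headroom_gb=1.0):
    """Return the largest benchmarked submap_size whose measured total,
    plus a safety headroom, fits in vram_gb. Returns None if none fits."""
    fitting = [
        size
        for size, total_gb in TOTAL_WITH_MODEL_GB.items()
        if total_gb + headroom_gb <= vram_gb
    ]
    return max(fitting) if fitting else None
```

With 1 GB of headroom this gives submap_size=4 for an 8GB budget. For 12GB and 16GB it returns 16, so the recommendations listed above are more conservative than these measurements strictly require.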

Performance (submap_size=8, no loop closure):

208 frames processed in 55.7s
3.7 FPS average
VGGT inference: ~1.06s per 9-frame submap

Note: The images in office_loop are 294x518. Full 518x518 images use slightly more memory as shown above.

jashshah999 mentioned this pull request May 5, 2026

feat: Inference-time gradient checkpointing for low-VRAM GPUs MIT-SPARK/VGGT_SPARK#2

Open

jashshah999 wants to merge 1 commit into MIT-SPARK:main from jashshah999:feat/low-vram-mode