Hi, thanks for releasing this great project!
I tried to run lingbot-map strictly following the README instructions, but encountered an issue related to FlashInfer on a Blackwell GPU (RTX 5090). I’d like to report both a reproducibility issue and a compatibility issue.
### 1. Installation issue (FlashInfer)
The README suggests installing FlashInfer with:
```bash
pip install flashinfer-python -i https://flashinfer.ai/whl/cu128/torch2.9/
```
However, this command does not work:
```text
ERROR: Could not find a version that satisfies the requirement flashinfer-python
ERROR: No matching distribution found for flashinfer-python
```
It seems that this index URL no longer provides the package.
Instead, I had to install it via:
```bash
pip install flashinfer-python
```
So the installation instructions in the README appear to be outdated.
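For reference, here is the quick post-install sanity check I used to confirm which version actually resolved from the default PyPI index (`installed_version` is my own helper, not part of this project; it only reads package metadata and needs no GPU):

```python
# Post-install sanity check: report the resolved flashinfer-python version
# without importing the package (so no CUDA/GPU is touched).
from importlib.metadata import PackageNotFoundError, version


def installed_version(pkg: str):
    """Return the installed version string for pkg, or None if absent."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None


print(installed_version("flashinfer-python"))
```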
### 2. Runtime issue
Environment:
- GPU: RTX 5090
- CUDA: 12.8
- PyTorch: 2.9
- Python: 3.10
Running the demo:
```bash
python demo.py \
  --model_path ./model/lingbot-map-long.pt \
  --image_folder ../../datasets/oxford/data/observatory-quarter/2024-03-13-observatory-quarter-01/cam0/data/ \
  --mask_sky
```
The program fails during streaming inference with the following error:
```text
Loading 5746 images...
Loading images: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5746/5746 [00:16<00:00, 341.74it/s]
Preprocessed images to 518x392 using canonical crop mode
Failed to get device capability: SM 12.x requires CUDA >= 12.9.
Failed to get device capability: SM 12.x requires CUDA >= 12.9.
torchtitan not available for ulysses cp
Building model...
pretrained_path:
Failed to load pretrained weights: [Errno 2] No such file or directory: ''
Loading checkpoint: ./model/lingbot-map-long.pt
Missing keys: 62
Checkpoint loaded.
Total load time: 61.6s
Casting aggregator to torch.bfloat16 (heads kept in fp32)
Input: 5746 frames, shape (5746, 3, 392, 518)
Mode: streaming
GPU mem after load: alloc=16.95 GB, reserved=16.97 GB
Auto-selected --keyframe_interval=18 (num_frames=5746 > 320).
Keyframe streaming enabled: interval=18 (after the first 8 scale frames).
Running streaming inference (dtype=torch.bfloat16)...
Streaming inference: 0%|▏ | 8/5746 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/demo.py", line 522, in <module>
    main()
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/demo.py", line 466, in main
    predictions = model.inference_streaming(
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/models/gct_stream.py", line 390, in inference_streaming
    frame_output = self.forward(
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/models/gct_base.py", line 322, in forward
    aggregated_tokens_list, patch_start_idx = self._aggregate_features(
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/models/gct_stream.py", line 225, in _aggregate_features
    aggregated_tokens_list, patch_start_idx = self.aggregator(
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/aggregator/base.py", line 589, in forward
    tokens, global_idx, global_intermediates = self._process_global_attention(
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/aggregator/stream.py", line 409, in _process_global_attention
    return self._process_causal_stream(
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/aggregator/stream.py", line 509, in _process_causal_stream
    tokens = self.global_blocks[global_idx](
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/layers/block.py", line 264, in forward
    attn_x = manager.compute_attention(global_idx, q_nhd)
  File "/mnt/crucial/slam_benchmark/baselines/lingbot-map/lingbot_map/layers/flashinfer_cache.py", line 381, in compute_attention
    self.prefill_wrapper.plan(
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/prefill.py", line 1999, in plan
    self._cached_module = get_batch_prefill_module(
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/prefill.py", line 454, in get_batch_prefill_module
    module = gen_batch_prefill_module(backend, *args).build_and_load()
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/jit/attention/modules.py", line 1058, in gen_batch_prefill_module
    return gen_customize_batch_prefill_module(
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/jit/attention/modules.py", line 1618, in gen_customize_batch_prefill_module
    return gen_jit_spec(uri, source_paths)
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/jit/core.py", line 415, in gen_jit_spec
    check_cuda_arch()
  File "/home/wangyiyu/miniconda3/envs/lingbot-map/lib/python3.10/site-packages/flashinfer/jit/core.py", line 108, in check_cuda_arch
    raise RuntimeError("FlashInfer requires GPUs with sm75 or higher")
RuntimeError: FlashInfer requires GPUs with sm75 or higher
[W423 00:55:43.765978640 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
```
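For what it's worth, the error message looks like a fallback rather than a real capability failure: SM 12.0 easily clears the numeric sm75 threshold, so the RuntimeError is presumably raised because the capability query itself fails (consistent with the earlier "SM 12.x requires CUDA >= 12.9" messages), leaving FlashInfer's JIT with no detected arch at all. A small sketch of that reasoning (`meets_sm75` is my own illustrative helper, not FlashInfer code):

```python
def meets_sm75(major: int, minor: int) -> bool:
    """Illustrative restatement of the sm75 gate: capability as major*10 + minor."""
    return major * 10 + minor >= 75


# Blackwell (SM 12.0) comfortably clears the sm75 bar, so the hardware itself
# is not the problem; the gate must be seeing no capability at all.
print(meets_sm75(12, 0))  # True
print(meets_sm75(7, 0))   # False: pre-Turing parts would be rejected legitimately
```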
Do you have a recommended setup for:
- Blackwell GPUs (SM 12.x)?
- CUDA ≥ 12.9?
- Or a specific FlashInfer version that works?
This issue affects reproducibility on newer GPU platforms.
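In case it helps with triage: one workaround I am experimenting with (an assumption on my side, namely that FlashInfer's JIT honors `TORCH_CUDA_ARCH_LIST` instead of querying the device when that variable is set) is pinning the arch list explicitly before running the demo:

```shell
# Untested idea: pin the arch list so the JIT does not depend on the
# device-capability query that fails under CUDA 12.8 on SM 12.x.
export TORCH_CUDA_ARCH_LIST="12.0"
python demo.py --model_path ./model/lingbot-map-long.pt --image_folder <same as above> --mask_sky
```

I have not confirmed this builds correct kernels, so guidance from the maintainers would still be appreciated.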
Thanks again for the great work!