This tutorial guides you through setting up and using the Matterport3D Simulator for Vision-and-Language Navigation (VLN) tasks. It is adapted from Matterport3DSimulator with several modifications.
- Docker (Installation Guide)
- NVIDIA Docker (Installation Guide)
- Git
git clone https://github.com/MuzK01/VLN-Tutorial.git
cd VLN-Tutorial
# Download a single test scene
python scripts/download_mp3d.py -o data --id mJXqzFtmKg4
# Download the complete dataset (1.3TB)
python scripts/download_mp3d.py -o data
bash scripts/download.sh  # downloads the R2R dataset to the R2R_benchmark/data/ directory
You can set up the environment in two ways: using Docker (recommended) or building from source.
# Option 1: Build from Dockerfile
docker build -t mattersim:v1 .
# Option 2: Pull pre-built image file
# [Link to be added]
Verify the image is available:
docker images | grep mattersim
docker run -it --gpus all \
--privileged \
--shm-size=32g \
--network=host \
--device=/dev/video* \
-e DISPLAY=$DISPLAY \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /absolute/path/to/VLN-Tutorial:/projects/VLN-Tutorial \
mattersim:v1
# Set the default python
ln -sf /usr/bin/python3 /usr/bin/python
Docker Arguments Explained:
- --gpus all: Enable GPU support
- --privileged: Grant extended privileges
- --shm-size=32g: Set shared memory size
- --network=host: Use host network
- --device=/dev/video*: Mount video devices
- -e DISPLAY=$DISPLAY: Enable GUI applications
- -v /tmp/.X11-unix:/tmp/.X11-unix: Mount X11 socket
- -v /absolute/path/to/VLN-Tutorial:/projects/VLN-Tutorial: Mount project directory
- Ubuntu ≥ 14.04
- NVIDIA drivers with CUDA
- C++11 compatible compiler
- CMake ≥ 3.10
- OpenCV ≥ 2.4
- OpenGL
- GLM
- NumPy
sudo apt-get install libjsoncpp-dev libepoxy-dev libglm-dev libosmesa6 libosmesa6-dev libglew-dev
Note that the Matterport3DSimulator code has been modified to build on newer Linux versions, specifically the Ubuntu 20.04 + CUDA 11.2 Docker image, so it differs from the original code. It may still work on other versions.
ln -sf /usr/bin/python3 /usr/bin/python
cd /projects/VLN-Tutorial/Matterport3DSimulator
mkdir build && cd build
which python #check python is available
cmake -DEGL_RENDERING=ON ..
make
cd ../../
- Default GPU rendering (OpenGL):
cmake ..
- Off-screen GPU rendering (EGL):
cmake -DEGL_RENDERING=ON ..
- Off-screen CPU rendering (OSMesa):
cmake -DOSMESA_RENDERING=ON ..
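The three rendering configurations differ only in the flag passed to cmake. As a rough sketch, a small Python helper could select the command from a mode name. The helper is hypothetical and not part of the repository; the names `RENDERING_FLAGS` and `cmake_command` are illustrative assumptions, while the flags themselves are the ones listed above.

```python
# Hypothetical helper (not part of the repository): maps a rendering mode
# name to the cmake arguments listed in this tutorial.
RENDERING_FLAGS = {
    "opengl": [],                         # default GPU rendering
    "egl": ["-DEGL_RENDERING=ON"],        # off-screen GPU rendering
    "osmesa": ["-DOSMESA_RENDERING=ON"],  # off-screen CPU rendering
}


def cmake_command(mode, source_dir=".."):
    """Return the cmake argument list for the given rendering mode."""
    try:
        flags = RENDERING_FLAGS[mode.lower()]
    except KeyError:
        raise ValueError(f"unknown rendering mode: {mode!r}") from None
    return ["cmake", *flags, source_dir]


print(" ".join(cmake_command("egl")))  # cmake -DEGL_RENDERING=ON ..
```

The returned list can be passed to `subprocess.run(..., cwd=build_dir)` if you prefer to script the build rather than type the command.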
# Temporary setup
export PYTHONPATH=/projects/VLN-Tutorial/Matterport3DSimulator/build:$PYTHONPATH
# Permanent setup (add to ~/.bashrc)
echo "export PYTHONPATH=/projects/VLN-Tutorial/Matterport3DSimulator/build:\$PYTHONPATH" >> ~/.bashrc
source ~/.bashrc
echo "export EGL_PLATFORM=device" >> ~/.bashrc
source ~/.bashrc
python -c "import MatterSim; print('Import MatterSim successfully')"
python seq2seq/eval.py
- Download the preprocessed image features
In our initial work using this simulator, we discretized heading and elevation into 30 degree increments, and precomputed image features for each view. Now that the simulator is much faster, this is no longer necessary, but for completeness we include the details of this setting below.
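To make the 30 degree discretization concrete, the sketch below enumerates the resulting view directions. The text states only the increment, so two things here are assumptions: that headings cover the full 360 degrees, and that elevation takes the three values -30, 0 and +30 degrees. The function name `discretized_views` is likewise illustrative.

```python
STEP_DEG = 30                       # increment stated in the text
NUM_HEADINGS = 360 // STEP_DEG      # assumption: headings span the full circle
ELEVATIONS_DEG = (-30, 0, 30)       # assumption: three elevation levels


def discretized_views():
    """Return (heading_deg, elevation_deg) for every discretized view."""
    return [
        (heading_index * STEP_DEG, elevation)
        for elevation in ELEVATIONS_DEG
        for heading_index in range(NUM_HEADINGS)
    ]


views = discretized_views()
print(len(views))   # 36
print(views[:3])    # [(0, -30), (30, -30), (60, -30)]
```

Under these assumptions each viewpoint yields 36 views, and one feature vector would be precomputed per view.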
We generate image features using Caffe. To replicate our approach, first download and save some Caffe ResNet-152 weights into the models directory. We experiment with weights pretrained on ImageNet, and also weights finetuned on the Places365 dataset. The script scripts/precompute_features.py can then be used to precompute ResNet-152 features. Features are saved in tsv format in the img_features directory.
Alternatively, skip the generation and just download and extract our tsv files into the img_features directory:
- Run the training script
python seq2seq/train.py
- How the Simulator is initialized: Located in R2R_benchmark/env.py (R2RBatch.__init__)
- How Scene Navigation Graphs are defined: Located in Matterport3DSimulator/connectivity/ and R2R_benchmark/utils.py (load_nav_graphs)
- How the episode data is loaded: Located in R2R_benchmark/utils.py (load_datasets)
- Observation Space: Defined in R2R_benchmark/env.py (R2RBatch._get_obs)
- How Teaching Actions are generated: Implemented in R2R_benchmark/env.py (R2RBatch._shortest_path_action)
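To give a feel for how a teaching action can be derived from a navigation graph, here is a toy sketch. It is not the code in `R2RBatch._shortest_path_action`, whose details are not reproduced here. It assumes an undirected, unweighted graph stored as a plain dictionary and uses breadth-first search, whereas the real connectivity graphs may well use distance-weighted edges. All names and the example graph are hypothetical.

```python
from collections import deque


def next_step_on_shortest_path(graph, current, goal):
    """Return the neighbour of `current` on a fewest-hops path to `goal`.

    Returns `current` when already at the goal and None when the goal is
    unreachable. `graph` maps each node to a list of its neighbours and is
    assumed to be undirected.
    """
    if current == goal:
        return current
    # Search outward from the goal so that toward_goal[n] is the node that
    # follows n on a shortest path to the goal.
    toward_goal = {goal: None}
    queue = deque([goal])
    while queue:
        node = queue.popleft()
        if node == current:
            return toward_goal[node]
        for neighbour in graph.get(node, ()):
            if neighbour not in toward_goal:
                toward_goal[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical five-viewpoint scene: A-B-C-D is three hops, A-E-D is two.
toy_graph = {
    "A": ["B", "E"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["A", "D"],
}
print(next_step_on_shortest_path(toy_graph, "A", "D"))  # E
```

A teacher-forcing loop would call such a function at every step and use the returned viewpoint as the supervised target, stopping once it returns the current viewpoint.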
If importing MatterSim fails, check the following:
- Python Version Mismatch
  - Ensure the Python version used for building matches the runtime Python version
  - Verify with:
python --version  # Should match the version used during build
  - Rebuild if necessary using:
cmake -DPYTHON_EXECUTABLE=$(which python) -DEGL_RENDERING=ON ..
make
- PYTHONPATH Configuration
  - Verify the build directory is in PYTHONPATH:
echo $PYTHONPATH  # Should include /projects/VLN-Tutorial/Matterport3DSimulator/build
  - If missing, add it:
export PYTHONPATH=/projects/VLN-Tutorial/Matterport3DSimulator/build:$PYTHONPATH
- Check Build Output
  - Verify the .so file exists:
ls /projects/VLN-Tutorial/Matterport3DSimulator/build/MatterSim*.so
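The checks above can also be run in one go from Python. The following is a minimal sketch using only the standard library; the function name `diagnose_mattersim` and the report keys are hypothetical, and the build path is the one used in the PYTHONPATH setup earlier in this tutorial, so adjust it if your mount point differs.

```python
import glob
import os
import sys


def diagnose_mattersim(build_dir):
    """Collect the facts needed for the three checks listed above."""
    build_dir = os.path.abspath(build_dir)
    entries = os.environ.get("PYTHONPATH", "").split(os.pathsep)
    on_path = any(os.path.abspath(p) == build_dir for p in entries if p)
    modules = sorted(glob.glob(os.path.join(build_dir, "MatterSim*.so")))
    return {
        # Compare this with the Python version used when running cmake.
        "python_version": "%d.%d" % sys.version_info[:2],
        "build_dir_on_pythonpath": on_path,
        "compiled_modules": modules,
    }


# Assumed build location; taken from the PYTHONPATH setup in this tutorial.
report = diagnose_mattersim("/projects/VLN-Tutorial/Matterport3DSimulator/build")
for key, value in report.items():
    print(f"{key}: {value}")
```

An empty `compiled_modules` list suggests the build did not finish, while `build_dir_on_pythonpath: False` points to the PYTHONPATH step.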
When compiling MatterSim, you may see deprecation warnings related to PyThread. These warnings are safe to ignore:
warning: 'int PyThread_set_key_value(int, void*)' is deprecated [-Wdeprecated-declarations]
PyThread_set_key_value(key, tstate);
This warning occurs because pybind11 uses a deprecated Python API function; it does not affect the functionality of the simulator.
# Modify line 51 in seq2seq/env.py to point to the correct path
connectivity_dir = '/projects/VLN-Tutorial/Matterport3DSimulator/connectivity'
5.4 RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 3, 2) but found runtime version (8, 2, 1)
# Unset LD_LIBRARY_PATH
unset LD_LIBRARY_PATH
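If you would rather not change the variable for the whole shell session, one option is to remove it only for the process you launch. The sketch below does this with the standard library; the helper names `env_without` and `run_without_ld_library_path` are hypothetical, and whether dropping the variable resolves the mismatch on your system is an assumption carried over from the fix above.

```python
import os
import subprocess


def env_without(var, base=None):
    """Return a copy of the environment with `var` removed."""
    env = dict(os.environ if base is None else base)
    env.pop(var, None)
    return env


def run_without_ld_library_path(argv):
    """Run `argv` as a child process with LD_LIBRARY_PATH unset."""
    return subprocess.run(argv, env=env_without("LD_LIBRARY_PATH"), check=True)
```

For example, `run_without_ld_library_path(["python", "seq2seq/eval.py"])` runs the evaluation script with the variable removed while leaving your current shell untouched.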