We recommend symlinking the dataset root to data/ under the project root directory. If your folder structure differs, update the data_root path in the corresponding config files.
Data preparation scripts are located in tools/data_prepare/. Remember to unzip it first.
We recommend downloading datasets from OpenDataLab for faster download speeds. Alternatively, refer to tools/data_prepare/<dataset_name>/README.md for official download links.
| Dataset | Classes | Image Size | Annotation Format | Config |
|---|---|---|---|---|
| DIOR | 20 | 800x800 | VOC-style txt | configs/_base_/datasets/dior.py |
| DOTA v1.0 | 15 | 1024x1024 (after split) | DOTA txt | configs/_base_/datasets/dota.py |
| DOTA v1.5 | 16 | 1024x1024 (after split) | DOTA txt | configs/_base_/datasets/dotav15.py |
| STAR | 48 | 1024x1024 (after split) | DOTA txt | configs/_base_/datasets/star.py |
| FAIR1M | - | 1024x1024 (after split) | DOTA txt | configs/_base_/datasets/fair.py |
| HRSC | 1 | - | VOC-style xml | configs/_base_/datasets/hrsc.py |
| SSDD | 1 | - | DOTA txt | configs/_base_/datasets/ssdd.py |
| HRSID | 1 | - | COCO json | configs/_base_/datasets/hrsid.py |
| SRSDD | 6 | - | DOTA txt | configs/_base_/datasets/srsdd.py |
| RSDD | 2 | - | VOC-style xml | configs/_base_/datasets/rsdd.py |
Point2RBox-v3
├── data
│ ├── dior
│ │ ├── JPEGImages-trainval/ # Train + val images
│ │ ├── JPEGImages-test/ # Test images
│ │ ├── Annotations/
│ │ │ ├── Oriented Bounding Boxes/ # Rotated box annotations
│ │ │ └── Horizontal Bounding Boxes/ # Horizontal box annotations
│ │ └── ImageSets/
│ │   └── Main/
│ │     ├── train.txt
│ │     ├── val.txt
│ │     └── test.txt
Set data_root to data/dior/.
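As a quick check that the DIOR layout is in place, the split files can be read back in a few lines. This is a minimal sketch, not code from the repository: the helper names are hypothetical, and it assumes each split file lists one image ID per line, as is usual for VOC-style datasets.

```python
from pathlib import Path


def read_split_ids(data_root, split):
    """Return the image IDs listed in ImageSets/Main/<split>.txt.

    Assumption: one image ID per line (VOC-style); blank lines are skipped.
    """
    split_file = Path(data_root) / "ImageSets" / "Main" / f"{split}.txt"
    with open(split_file, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def image_dir_for(data_root, split):
    """Map a split name to the image folder shown in the tree above."""
    folder = "JPEGImages-test" if split == "test" else "JPEGImages-trainval"
    return Path(data_root) / folder
```

For example, `len(read_split_ids("data/dior", "train"))` gives the number of training images listed, which can be compared against the files found under `image_dir_for("data/dior", "train")`.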
Point2RBox-v3
├── data
│ ├── DOTA
│ │ ├── train/
│ │ ├── val/
│ │ └── test/
Raw DOTA images can be very large (up to several thousand pixels per side), so they must be split into patches before training.
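To give a sense of how many patches one large image produces, the arithmetic of a sliding-window split can be sketched as follows. The single-scale setting in this guide uses 1024x1024 patches with a 200-pixel overlap, so the stride is 824 pixels. The function name and the choice to shift the last window back to the image border are assumptions made for illustration; img_split.py may handle borders differently.

```python
def window_starts(length, patch=1024, overlap=200):
    """Start offsets of sliding windows along one image axis.

    Illustrative only: the last window is shifted back so that it ends
    exactly at the image border (an assumed border policy).
    """
    if length <= patch:
        return [0]
    stride = patch - overlap
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        starts.append(length - patch)
    return starts


def patch_count(width, height, patch=1024, overlap=200):
    """Number of patches produced for one image under the same policy."""
    return len(window_starts(width, patch, overlap)) * len(
        window_starts(height, patch, overlap)
    )
```

Under these assumptions a 4000x4000 image yields offsets 0, 824, 1648, 2472 and 2976 along each axis, so 25 patches in total.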
Single-scale split (1024x1024 patches with 200-pixel overlap):
# Split trainval
python tools/data_prepare/dota/split/img_split.py --base-json \
tools/data_prepare/dota/split/split_configs/ss_trainval.json
# Split test
python tools/data_prepare/dota/split/img_split.py --base-json \
tools/data_prepare/dota/split/split_configs/ss_test.json
Multi-scale split (if multi-scale training is needed):
python tools/data_prepare/dota/split/img_split.py --base-json \
tools/data_prepare/dota/split/split_configs/ms_trainval.json
python tools/data_prepare/dota/split/img_split.py --base-json \
tools/data_prepare/dota/split/split_configs/ms_test.json
Note: Before splitting, update the img_dirs and ann_dirs fields in the JSON config files to your actual raw data paths.
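Editing the img_dirs and ann_dirs fields by hand works fine, but when several config files need the same change a small script may be less error-prone. The sketch below is a hypothetical helper, not part of the repository; it assumes both fields are top-level keys of the JSON object that hold lists of paths, and it leaves all other keys untouched.

```python
import json
from pathlib import Path


def set_split_dirs(config_path, img_dirs, ann_dirs=None):
    """Rewrite img_dirs and ann_dirs in a split config, keeping other keys.

    Assumption: both fields are top-level keys holding lists of paths.
    Pass ann_dirs=None when no annotations are available.
    """
    config_path = Path(config_path)
    cfg = json.loads(config_path.read_text(encoding="utf-8"))
    cfg["img_dirs"] = list(img_dirs)
    cfg["ann_dirs"] = None if ann_dirs is None else list(ann_dirs)
    config_path.write_text(json.dumps(cfg, indent=2) + "\n", encoding="utf-8")
    return cfg


# Usage (placeholder paths, replace with your own raw data locations):
# set_split_dirs(
#     "tools/data_prepare/dota/split/split_configs/ss_trainval.json",
#     img_dirs=["/path/to/raw/train/images"],
#     ann_dirs=["/path/to/raw/train/annotations"],
# )
```

Note that rewriting the file this way normalizes its indentation, so the diff against the original config may be larger than the two changed fields.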
Point2RBox-v3
├── data
│ ├── split_ss_dota/
│ │ ├── trainval/
│ │ │ ├── images/ # Split train+val images
│ │ │ └── annfiles/ # Annotation files (one txt per image)
│ │ └── test/
│ │   ├── images/ # Split test images
│ │   └── annfiles/
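Since the split output should contain one txt annotation file per image, a short consistency check can catch an interrupted or misconfigured split early. This is a minimal sketch with a hypothetical helper name; it assumes an annotation file shares its file stem with the image it describes.

```python
from pathlib import Path


def unpaired_images(split_dir):
    """Return image stems under images/ with no matching txt in annfiles/.

    Assumption: image and annotation files share the same file stem.
    """
    split_dir = Path(split_dir)
    image_stems = {
        p.stem for p in (split_dir / "images").iterdir() if p.is_file()
    }
    ann_stems = {p.stem for p in (split_dir / "annfiles").glob("*.txt")}
    return sorted(image_stems - ann_stems)
```

An empty list from `unpaired_images("data/split_ss_dota/trainval")` suggests every split image has an annotation file.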
If you need COCO-format annotations:
python tools/data_prepare/dota/dota2coco.py \
data/split_ss_dota/trainval/ \
data/split_ss_dota/trainval.json
python tools/data_prepare/dota/dota2coco.py \
data/split_ss_dota/test/ \
data/split_ss_dota/test.json
Set data_root to data/split_ss_dota/.
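If the COCO-format files were generated, a brief summary can confirm the conversion produced something sensible. The sketch below assumes the output follows the standard COCO detection layout, with top-level images, annotations and categories lists; whether dota2coco.py writes all three for the test split has not been verified here, so missing keys are treated as empty. The function name is hypothetical.

```python
import json
from collections import Counter


def summarize_coco(json_path):
    """Count images, annotations and per-class instances in a COCO file.

    Assumption: standard COCO keys; absent keys are treated as empty lists.
    """
    with open(json_path, encoding="utf-8") as f:
        coco = json.load(f)
    annotations = coco.get("annotations", [])
    names = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_class = Counter(
        names.get(a["category_id"], a["category_id"]) for a in annotations
    )
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(annotations),
        "per_class": dict(per_class),
    }
```

Calling `summarize_coco("data/split_ss_dota/trainval.json")` returns a small dictionary that can be printed or compared between runs.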
DOTA v1.5 is similar to DOTA v1.0 but adds a container-crane class (16 classes in total). The download and splitting procedures are the same; just use the DOTA v1.5 data.
Set data_root to data/split_ss_dotav15/.
STAR also requires image splitting. The procedure is similar to DOTA's; refer to the DOTA splitting scripts for guidance.
Point2RBox-v3
├── data
│ ├── split_ss_star/
│ │ ├── train/
│ │ │ ├── images/
│ │ │ └── annfiles/
│ │ ├── val/
│ │ │ ├── images/
│ │ │ └── annfiles/
│ │ └── test/
│ │   └── images/
Set data_root to data/split_ss_star/.
FAIR1M requires image splitting:
# Split trainval
python tools/data_prepare/fair/split/img_split.py --base-json \
tools/data_prepare/fair/split/split_configs/ss_trainval.json
# Split test
python tools/data_prepare/fair/split/img_split.py --base-json \
tools/data_prepare/fair/split/split_configs/ss_test.json
Set data_root to data/split_ss_fair/.
Point2RBox-v3
├── data
│ ├── hrsc/
│ │ ├── FullDataSet/
│ │ │ ├── AllImages/ # All images
│ │ │ ├── Annotations/ # XML annotations
│ │ │ ├── LandMask/
│ │ │ └── Segmentations/
│ │ └── ImageSets/ # Data split files
Set data_root to data/hrsc/.
Point2RBox-v3
├── data
│ ├── ssdd/
│ │ ├── train/
│ │ └── test/
│ │   ├── all/ # All test images
│ │   ├── inshore/ # Inshore scenes
│ │   └── offshore/ # Offshore scenes
Set data_root to data/ssdd/.
Point2RBox-v3
├── data
│ ├── HRSID_JPG/
│ │ ├── JPEGImages/ # All images
│ │ └── annotations/ # COCO-format JSON annotations
Set data_root to data/HRSID_JPG/.
Point2RBox-v3
├── data
│ ├── srsdd/
│ │ ├── train/
│ │ └── test/
Set data_root to data/srsdd/.
Point2RBox-v3
├── data
│ ├── rsdd/
│ │ ├── Annotations/ # XML annotations
│ │ ├── ImageSets/ # Data split files
│ │ ├── JPEGImages/ # Training images
│ │ └── JPEGValidation/ # Validation images
Set data_root to data/rsdd/.
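With the data_root values from the sections above collected in one place, a short script can report which datasets are not yet in place. This is a sketch under the assumption that all paths are relative to the project root; the dictionary and function names are hypothetical. Symlinked directories count as present, since `is_dir()` follows symlinks.

```python
from pathlib import Path

# data_root values as listed in this guide, relative to the project root.
DATA_ROOTS = {
    "DIOR": "data/dior/",
    "DOTA v1.0": "data/split_ss_dota/",
    "DOTA v1.5": "data/split_ss_dotav15/",
    "STAR": "data/split_ss_star/",
    "FAIR1M": "data/split_ss_fair/",
    "HRSC": "data/hrsc/",
    "SSDD": "data/ssdd/",
    "HRSID": "data/HRSID_JPG/",
    "SRSDD": "data/srsdd/",
    "RSDD": "data/rsdd/",
}


def missing_datasets(project_root="."):
    """Names of datasets whose data_root directory does not exist."""
    root = Path(project_root)
    return [
        name for name, rel in DATA_ROOTS.items() if not (root / rel).is_dir()
    ]
```

Running `missing_datasets()` from the project root lists only the datasets that still need to be downloaded, split, or linked.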
If your datasets are stored elsewhere (e.g., shared storage), symlinks are recommended:
# Create data directory under project root
mkdir -p data
# Create symlinks (modify paths accordingly)
ln -s /path/to/your/dior data/dior
ln -s /path/to/your/split_ss_dota data/split_ss_dota
ln -s /path/to/your/split_ss_star data/split_ss_star
ln -s /path/to/your/hrsc data/hrsc
# ... similarly for other datasets
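The same links can be created from Python, which may be convenient inside a setup script. The helper below is a hypothetical sketch rather than a tool shipped with the project; it refuses to link a source directory that does not exist and leaves any existing entry under data/ untouched.

```python
from pathlib import Path


def link_dataset(source, name, project_root="."):
    """Create data/<name> as a symlink to source, if not already present."""
    source = Path(source).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"dataset directory not found: {source}")
    data_dir = Path(project_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    link = data_dir / name
    if link.is_symlink() or link.exists():
        return link  # leave any existing entry untouched
    link.symlink_to(source, target_is_directory=True)
    return link


# Usage (placeholder path, modify accordingly):
# link_dataset("/path/to/your/dior", "dior")
```

Resolving the source to an absolute path first avoids dangling links when the script is run from a different working directory.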