Generates synthetic datasets for training and evaluating vision models on object reordering tasks. Each sample contains multiple objects that must be rearranged by swapping specific positions.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-2 |
| Task | Reorder Objects |
| Category | Transformation |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~2 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-2_reorder_objects_data-generator.git
cd G-2_reorder_objects_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
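
The flags above map naturally onto an `argparse` parser. The following is a minimal sketch of how such a command line could be declared, not the actual contents of `examples/generate.py`; the function name `build_parser` and the help strings are assumptions, while the flag names, the required status of `--num-samples`, and the `data/questions` default come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate reorder-objects samples")
    # Documented as required.
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    # Documented default output directory.
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Optional seed; None means non-reproducible generation (assumption).
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    # Boolean switch: present means images only.
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
    print(args.num_samples, args.output, args.seed, args.no_videos)
    # prints: 50 data/questions 42 False
```

Note that `argparse` converts dashes to underscores, so `--num-samples` is read as `args.num_samples` and `--no-videos` as `args.no_videos`.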
The scene contains multiple objects arranged in a horizontal line. Keep all other objects unchanged. Swap the positions of the 4th and 5th objects from the left using shortest paths.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Objects in original positions | Two objects swap positions | Objects after swap completed |
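
The example prompt follows a fixed pattern in which only the two ordinal positions vary. As an illustration, a prompt like this could be assembled from two 1-based positions as sketched below. The helper names `ordinal` and `build_prompt` are hypothetical, and the sketch assumes the template wording is identical for every sample, which the source does not confirm.

```python
def ordinal(n: int) -> str:
    """Return the English ordinal for a positive integer, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def build_prompt(pos_a: int, pos_b: int) -> str:
    """Build a swap instruction for two 1-based positions counted from the left.

    Hypothetical helper: the wording is copied from the example prompt.
    """
    first, second = sorted((pos_a, pos_b))
    return (
        "The scene contains multiple objects arranged in a horizontal line. "
        "Keep all other objects unchanged. "
        f"Swap the positions of the {ordinal(first)} and {ordinal(second)} "
        "objects from the left using shortest paths."
    )


if __name__ == "__main__":
    print(build_prompt(4, 5))
```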
Swap the positions of two specified objects in a horizontal arrangement while keeping all other objects in their original positions.
- Object count: 3-6 objects per scene
- Shapes: Circle, square, triangle, hexagon, cylinder
- Colors: Red, blue, green, purple, yellow, orange
- Layout: Objects arranged horizontally at image center
- Background: Pure white
- Goal: Exchange positions of two specified objects using shortest paths
- Objects identified by position from left (1st, 2nd, 3rd, etc.)
- Two randomly selected objects swap positions simultaneously
- All other objects remain stationary
- Smooth linear motion along shortest paths
- Multiple shape and color combinations for variety
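
The rules above (pick two positions, swap them, move both objects linearly while the rest stay put) can be sketched in a few lines. This is an illustrative sketch, not the generator's implementation: the function names are hypothetical, the frame count assumes exactly 16 fps × 2 s = 32 frames, and motion is modelled as interpolation of the x-coordinate only, since the source does not say how the two moving objects avoid overlapping mid-swap.

```python
import random

FPS = 16          # from the task specification
DURATION_S = 2    # approximate duration from the task specification


def pick_swap(num_objects: int, rng: random.Random) -> tuple:
    """Choose two distinct 0-based positions to swap, returned in ascending order."""
    i, j = rng.sample(range(num_objects), 2)
    return (min(i, j), max(i, j))


def swap_order(objects: list, i: int, j: int) -> list:
    """Return a copy of the object list with positions i and j exchanged."""
    result = list(objects)
    result[i], result[j] = result[j], result[i]
    return result


def interpolate_x(x_start: float, x_end: float, num_frames: int) -> list:
    """Linearly interpolate an x-coordinate across num_frames frames (inclusive ends)."""
    if num_frames < 2:
        raise ValueError("need at least two frames")
    step = (x_end - x_start) / (num_frames - 1)
    return [x_start + step * k for k in range(num_frames)]


if __name__ == "__main__":
    rng = random.Random(42)  # seeded for reproducibility
    objects = ["red circle", "blue square", "green triangle", "purple hexagon"]
    i, j = pick_swap(len(objects), rng)
    final = swap_order(objects, i, j)
    num_frames = FPS * DURATION_S  # 32 frames under the stated assumption
    # The object at slot i travels to slot j's x-coordinate, and vice versa.
    xs = [200.0 + 200.0 * k for k in range(len(objects))]  # hypothetical slot centers
    path_i = interpolate_x(xs[i], xs[j], num_frames)
    path_j = interpolate_x(xs[j], xs[i], num_frames)
    print(objects, "->", final, f"({num_frames} frames)")
```

Because both paths use the same interpolation with the endpoints reversed, the two objects move simultaneously and the other objects never change position.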
data/questions/reorder_objects_task/reorder_objects_00000000/
├── first_frame.png # Objects in initial arrangement
├── final_frame.png # Objects after swap
├── prompt.txt # Swap instruction with specific positions
├── ground_truth.mp4 # Animation of the swap process
└── question_metadata.json # Task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~2 seconds
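
When consuming the dataset, it can help to confirm that each sample directory contains the files listed above before loading it. Below is a minimal sketch using only the standard library. The helper `missing_files` is hypothetical, and it assumes that `ground_truth.mp4` is simply absent when samples were generated with `--no-videos`.

```python
from pathlib import Path

# File names taken from the documented output layout.
REQUIRED_FILES = (
    "first_frame.png",
    "final_frame.png",
    "prompt.txt",
    "question_metadata.json",
)
VIDEO_FILE = "ground_truth.mp4"  # assumed absent when --no-videos is used


def missing_files(sample_dir, expect_video: bool = True) -> list:
    """Return the expected file names that are not present in sample_dir."""
    sample_dir = Path(sample_dir)
    expected = list(REQUIRED_FILES)
    if expect_video:
        expected.append(VIDEO_FILE)
    return [name for name in expected if not (sample_dir / name).is_file()]


if __name__ == "__main__":
    sample = Path("data/questions/reorder_objects_task/reorder_objects_00000000")
    gaps = missing_files(sample)
    print("complete" if not gaps else f"missing: {gaps}")
```

This checks presence only; verifying the 1024×1024 resolution or the frame rate would need an imaging or video library and is left out here.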
Tags: logic, reordering, spatial-reasoning, object-manipulation, position-tracking


