Generates synthetic datasets for training and evaluating vision models on object identification tasks. Every scene contains several objects, and the ones matching the given criteria must be identified and marked.
Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.
| Property | Value |
|---|---|
| Task ID | G-4 |
| Task | Identify Objects |
| Category | Perception |
| Resolution | 1024×1024 px |
| FPS | 16 fps |
| Duration | ~3 seconds |
| Output | PNG images + MP4 video |
# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-4_identify_objects_data-generator.git
cd G-4_identify_objects_data-generator
# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
# Generate 50 samples
python examples/generate.py --num-samples 50
# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset
# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42
# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

| Argument | Description |
|---|---|
| `--num-samples` | Number of tasks to generate (required) |
| `--output` | Output directory (default: `data/questions`) |
| `--seed` | Random seed for reproducibility |
| `--no-videos` | Skip video generation (images only) |
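
The argument table above can be mirrored with a small `argparse` parser. This is a minimal sketch, not the repository's actual `examples/generate.py`: the flag names and the `data/questions` default come from the table, while the function names `build_parser` and `parse_args` and the `int` types are assumptions made for illustration.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate identify-objects samples")
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], as a real CLI would.
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["--num-samples", "50", "--seed", "42", "--no-videos"])
    print(args.num_samples, args.output, args.seed, args.no_videos)
```

Note that `argparse` converts the dashes in `--num-samples` and `--no-videos` to underscores, so the values are read as `args.num_samples` and `args.no_videos`.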
The scene contains multiple objects of different shapes and colors arranged randomly. Keep all objects unchanged in their shape, color, size, and position. Identify all triangles and mark them by adding a thick blue outline around each one.
| Initial Frame | Animation | Final Frame |
|---|---|---|
| Objects with no markings | Blue outlines appear on targets | All matching objects marked |
Identify all objects matching specific criteria (shape + color combination) and mark them with a colored outline.
- Object count: 3-8 objects per scene
- Shape types: Circle, square, triangle, rectangle, pentagon, hexagon
- Colors: Red, blue, green, yellow, orange
- Layout: Random non-overlapping positions
- Task criteria: Identify objects by both shape AND color
- Marking method: Add thick colored outline around matching objects
- Background: Pure white
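
A random non-overlapping layout like the one listed above could be produced by rejection sampling. This is a sketch under stated assumptions, not the generator's actual placement code: the object count range, shape names, color names, and canvas size come from this README, while the fixed bounding radius of 60 px, the 10 px margin, and the name `sample_scene` are hypothetical.

```python
import math
import random
from typing import Dict, List

CANVAS = 1024  # px, from the resolution stated in this README
SHAPES = ["circle", "square", "triangle", "rectangle", "pentagon", "hexagon"]
COLORS = ["red", "blue", "green", "yellow", "orange"]


def sample_scene(rng: random.Random, radius: float = 60.0, margin: float = 10.0,
                 max_tries: int = 10_000) -> List[Dict]:
    """Place 3-8 objects so their bounding circles never overlap.

    radius and margin are illustrative values, not taken from the repository.
    """
    count = rng.randint(3, 8)
    min_dist = 2 * radius + margin  # minimum allowed center-to-center distance
    objects: List[Dict] = []
    tries = 0
    while len(objects) < count:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all objects without overlap")
        # Keep the whole bounding circle inside the canvas.
        x = rng.uniform(radius + margin, CANVAS - radius - margin)
        y = rng.uniform(radius + margin, CANVAS - radius - margin)
        # Reject the candidate if it is too close to any object already placed.
        if all(math.hypot(x - o["x"], y - o["y"]) >= min_dist for o in objects):
            objects.append({
                "shape": rng.choice(SHAPES),
                "color": rng.choice(COLORS),
                "x": x,
                "y": y,
                "radius": radius,
            })
    return objects


if __name__ == "__main__":
    for obj in sample_scene(random.Random(42)):
        print(obj["color"], obj["shape"], round(obj["x"]), round(obj["y"]))
```

Passing an explicit `random.Random` instance, rather than using the module-level functions, keeps a scene reproducible for a given seed.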
- Tests multi-attribute pattern matching (shape + color)
- Requires identifying ALL matching objects (not just one)
- Objects remain in place, only outlines are added
- Shape-color combinations vary across samples
- Outline color is always different from target object color
- Sequential animation showing each identification
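
The matching logic described in the points above can be sketched as follows. The names `choose_task` and `find_matches` are hypothetical, and drawing the outline color from the same five-color palette as the objects is an assumption; the README only states that the outline color always differs from the target color.

```python
import random
from typing import Dict, List, Tuple

COLORS = ["red", "blue", "green", "yellow", "orange"]


def find_matches(objects: List[Dict], shape: str, color: str) -> List[int]:
    """Return indices of ALL objects matching both shape and color."""
    return [i for i, o in enumerate(objects)
            if o["shape"] == shape and o["color"] == color]


def choose_task(objects: List[Dict], rng: random.Random) -> Tuple[str, str, str]:
    """Pick a (shape, color) target present in the scene and an outline color.

    Assumption: the outline color comes from the same palette as the objects,
    excluding the target color so the marking stays visible.
    """
    target = rng.choice(objects)  # guarantees at least one match exists
    outline = rng.choice([c for c in COLORS if c != target["color"]])
    return target["shape"], target["color"], outline


if __name__ == "__main__":
    scene = [
        {"shape": "circle", "color": "yellow"},
        {"shape": "triangle", "color": "red"},
        {"shape": "circle", "color": "yellow"},
        {"shape": "circle", "color": "blue"},
    ]
    shape, color, outline = choose_task(scene, random.Random(0))
    print(shape, color, outline, find_matches(scene, shape, color))
```

Choosing the target from objects already in the scene avoids generating a task with no valid answer.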
data/questions/identify_objects_task/identify_objects_00000000/
├── first_frame.png # Scene with unmarked objects
├── final_frame.png # Scene with identified objects marked
├── prompt.txt # Identification criteria (e.g., "yellow circles")
├── ground_truth.mp4 # Animation of marking process
└── question_metadata.json # Task metadata
File specifications:
- Images: 1024×1024 PNG format
- Video: MP4 format, 16 fps
- Duration: ~3 seconds
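
A generated sample can be sanity-checked against the layout and image size listed above using only the standard library. This is a minimal sketch: the file names and the 1024×1024 size come from this README, while `check_sample`, `png_size`, and the assumption that `ground_truth.mp4` is absent when `--no-videos` is used are illustrative. The image size is read from the PNG `IHDR` chunk, so no imaging library is needed.

```python
import struct
from pathlib import Path
from typing import List, Tuple

EXPECTED_FILES = ("first_frame.png", "final_frame.png", "prompt.txt",
                  "ground_truth.mp4", "question_metadata.json")
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path: Path) -> Tuple[int, int]:
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    with open(path, "rb") as fh:
        header = fh.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path.name} is not a valid PNG")
    return struct.unpack(">II", header[16:24])  # big-endian width, height


def check_sample(sample_dir: Path, require_video: bool = True,
                 expected_size: Tuple[int, int] = (1024, 1024)) -> List[str]:
    """Return a list of problems found in one sample directory (empty if none)."""
    problems: List[str] = []
    for name in EXPECTED_FILES:
        # Assumption: samples generated with --no-videos have no MP4 file.
        if name == "ground_truth.mp4" and not require_video:
            continue
        if not (sample_dir / name).is_file():
            problems.append(f"missing {name}")
    for name in ("first_frame.png", "final_frame.png"):
        path = sample_dir / name
        if not path.is_file():
            continue
        try:
            size = png_size(path)
        except ValueError as exc:
            problems.append(str(exc))
            continue
        if size != expected_size:
            problems.append(f"{name} is {size[0]}x{size[1]}, expected "
                            f"{expected_size[0]}x{expected_size[1]}")
    return problems


if __name__ == "__main__":
    sample = Path("data/questions/identify_objects_task/identify_objects_00000000")
    print(check_sample(sample) or "sample looks complete")
```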
pattern-matching object-identification multi-attribute visual-search logic


