VBVR-DataFactory/G-4_identify_objects_data-generator


G-4: Identify Objects Data Generator

Generates synthetic datasets for training and evaluating vision models on object identification tasks. Each sample shows a scene with multiple objects; the objects that match the given criteria must be identified and marked.

Each sample pairs a task (first frame + prompt describing what needs to happen) with its ground truth solution (final frame showing the result + video demonstrating how to achieve it). This structure enables both model evaluation and training.


📌 Basic Information

| Property   | Value                  |
|------------|------------------------|
| Task ID    | G-4                    |
| Task       | Identify Objects       |
| Category   | Perception             |
| Resolution | 1024×1024 px           |
| FPS        | 16 fps                 |
| Duration   | ~3 seconds             |
| Output     | PNG images + MP4 video |

🚀 Usage

Installation

# 1. Clone the repository
git clone https://github.com/VBVR-DataFactory/G-4_identify_objects_data-generator.git
cd G-4_identify_objects_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

Generate Data

# Generate 50 samples
python examples/generate.py --num-samples 50

# Custom output directory
python examples/generate.py --num-samples 100 --output data/my_dataset

# Reproducible generation with seed
python examples/generate.py --num-samples 50 --seed 42

# Without videos (faster)
python examples/generate.py --num-samples 50 --no-videos

Command-Line Options

| Argument      | Description                                |
|---------------|--------------------------------------------|
| --num-samples | Number of tasks to generate (required)     |
| --output      | Output directory (default: data/questions) |
| --seed        | Random seed for reproducibility            |
| --no-videos   | Skip video generation (images only)        |
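
The options above map naturally onto an `argparse` parser. The following is a minimal sketch of such a parser, not the actual contents of `examples/generate.py`; the function name `build_parser`, the integer type for `--seed`, and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate identify-objects samples")
    # --num-samples is documented as required.
    parser.add_argument("--num-samples", type=int, required=True,
                        help="Number of tasks to generate")
    # Default taken from the options table.
    parser.add_argument("--output", default="data/questions",
                        help="Output directory")
    # Assumption: the seed is an integer and is unset by default.
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility")
    # A boolean flag: present means videos are skipped.
    parser.add_argument("--no-videos", action="store_true",
                        help="Skip video generation (images only)")
    return parser


args = build_parser().parse_args(["--num-samples", "50", "--seed", "42"])
print(args.num_samples, args.output, args.seed, args.no_videos)
# prints: 50 data/questions 42 False
```

Note that `argparse` converts the dashes in `--num-samples` and `--no-videos` to underscores, so the parsed values are read as `args.num_samples` and `args.no_videos`.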

📖 Task Example

Prompt

The scene contains multiple objects of different shapes and colors arranged randomly. Keep all objects unchanged in their shape, color, size, and position. Identify all triangles and mark them by adding a thick blue outline around each one.

Visual

| Initial Frame            | Animation                       | Final Frame        |
|--------------------------|---------------------------------|--------------------|
| Objects with no markings | Blue outlines appear on targets | All targets marked |

📖 Task Description

Objective

Identify all objects matching specific criteria (shape + color combination) and mark them with a colored outline.

Task Setup

  • Object count: 3-8 objects per scene
  • Shape types: Circle, square, triangle, rectangle, pentagon, hexagon
  • Colors: Red, blue, green, yellow, orange
  • Layout: Random non-overlapping positions
  • Task criteria: Identify objects by both shape AND color
  • Marking method: Add thick colored outline around matching objects
  • Background: Pure white
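
One common way to obtain random non-overlapping positions is rejection sampling: draw a candidate position and keep it only if it is far enough from every object already placed. The sketch below illustrates that idea with the object count, shapes, colors, and canvas size listed above. It is not the generator's actual code; `SceneObject`, `sample_scene`, `RADIUS`, and `MARGIN` are hypothetical names and values chosen for the example.

```python
import random
from dataclasses import dataclass

SHAPES = ["circle", "square", "triangle", "rectangle", "pentagon", "hexagon"]
COLORS = ["red", "blue", "green", "yellow", "orange"]
CANVAS = 1024  # px, from the Basic Information table
RADIUS = 60    # assumed bounding radius per object (not from the README)
MARGIN = 10    # assumed minimum gap between objects (not from the README)


@dataclass
class SceneObject:
    shape: str
    color: str
    x: int
    y: int


def sample_scene(rng: random.Random, max_tries: int = 10_000) -> list[SceneObject]:
    """Place 3-8 objects at random positions whose bounding circles do not overlap."""
    count = rng.randint(3, 8)  # inclusive on both ends
    min_dist_sq = (2 * RADIUS + MARGIN) ** 2
    objects: list[SceneObject] = []
    tries = 0
    while len(objects) < count:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place objects without overlap")
        # Keep the whole bounding circle inside the canvas.
        x = rng.randint(RADIUS, CANVAS - RADIUS)
        y = rng.randint(RADIUS, CANVAS - RADIUS)
        # Reject the candidate if it is too close to any placed object.
        if all((x - o.x) ** 2 + (y - o.y) ** 2 >= min_dist_sq for o in objects):
            objects.append(SceneObject(rng.choice(SHAPES), rng.choice(COLORS), x, y))
    return objects


scene = sample_scene(random.Random(42))
print(len(scene), "objects placed")
```

Passing an explicit `random.Random` instance keeps the sampling reproducible for a given seed, which matches the intent of the `--seed` option.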

Key Features

  • Tests multi-attribute pattern matching (shape + color)
  • Requires identifying ALL matching objects (not just one)
  • Objects remain in place; only outlines are added
  • Target shape-color combinations vary across samples
  • Outline color is always different from target object color
  • Sequential animation showing each identification
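
Two of the features above, matching on both attributes and choosing an outline color that differs from the target color, can be sketched in a few lines. This is an illustration under assumptions rather than the generator's implementation: the helper names `find_targets` and `pick_outline_color` are hypothetical, and the outline palette is assumed to be the same as the object palette.

```python
import random

COLORS = ["red", "blue", "green", "yellow", "orange"]


def find_targets(objects, shape, color):
    """Return the indices of ALL objects matching both shape and color."""
    return [i for i, (s, c) in enumerate(objects) if s == shape and c == color]


def pick_outline_color(target_color, rng):
    """Pick an outline color that is guaranteed to differ from the target color."""
    # Assumption: outlines are drawn from the same palette as the objects.
    return rng.choice([c for c in COLORS if c != target_color])


objects = [
    ("circle", "yellow"),
    ("triangle", "red"),
    ("circle", "yellow"),
    ("circle", "blue"),  # right shape, wrong color: not a target
]
targets = find_targets(objects, "circle", "yellow")
outline = pick_outline_color("yellow", random.Random(0))
print(targets)  # prints: [0, 2]
```

In a sequential animation, the indices in `targets` could then be marked one at a time, in whatever order the generator chooses.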

📦 Data Format

data/questions/identify_objects_task/identify_objects_00000000/
├── first_frame.png      # Scene with unmarked objects
├── final_frame.png      # Scene with identified objects marked
├── prompt.txt           # Identification criteria (e.g., "yellow circles")
├── ground_truth.mp4     # Animation of marking process
└── question_metadata.json # Task metadata

File specifications:

  • Images: 1024×1024 PNG format
  • Video: MP4 format, 16 fps
  • Duration: ~3 seconds
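
A generated sample can be sanity-checked against this layout using only the standard library. The sketch below confirms that the expected files exist and reads each PNG's dimensions from its IHDR header. The helper names `png_size` and `check_sample` are hypothetical, and the idea that `ground_truth.mp4` is absent when `--no-videos` is used is an assumption.

```python
import struct
from pathlib import Path

REQUIRED = ["first_frame.png", "final_frame.png", "prompt.txt", "question_metadata.json"]
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
EXPECTED_SIZE = (1024, 1024)  # from the file specifications


def png_size(header: bytes) -> tuple[int, int]:
    """Read (width, height) from the first 24 bytes of a PNG file."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR", then width and height.
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def check_sample(sample_dir: Path, expect_video: bool = True) -> list[str]:
    """Return a list of problems found in one sample directory (empty if none)."""
    problems = []
    names = REQUIRED + (["ground_truth.mp4"] if expect_video else [])
    for name in names:
        if not (sample_dir / name).is_file():
            problems.append(f"missing {name}")
    for name in ("first_frame.png", "final_frame.png"):
        path = sample_dir / name
        if not path.is_file():
            continue  # already reported as missing
        with path.open("rb") as f:
            header = f.read(24)
        try:
            size = png_size(header)
        except ValueError:
            problems.append(f"{name} is not a valid PNG")
            continue
        if size != EXPECTED_SIZE:
            problems.append(f"{name} is {size[0]}x{size[1]}, expected 1024x1024")
    return problems


sample = Path("data/questions/identify_objects_task/identify_objects_00000000")
for problem in check_sample(sample):
    print(problem)
```

Only the first 24 bytes of each image are read, so the check stays fast even on large datasets. Verifying the video's frame rate and duration would need a video library and is left out here.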

🏷️ Tags

pattern-matching object-identification multi-attribute visual-search logic

