112 changes: 112 additions & 0 deletions O-4_shape_matching_data-generator/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual environments
venv/
env/
ENV/
env.bak/
venv.bak/

# IDEs
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Testing
.pytest_cache/
.coverage
htmlcov/
.tox/
.hypothesis/

# Generated data (don't commit actual generated datasets)
data/questions/*/
data/outputs/*/
data/evaluations/*/

# Keep directory structure but not contents
!data/questions/.gitkeep
!data/outputs/.gitkeep

# Logs
*.log
logs/

# Temporary files
tmp/
temp/
*.tmp

# Jupyter
.ipynb_checkpoints/
*.ipynb

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Distribution / packaging
.Python
*.manifest
*.spec

# Environments
.env
.env.local
.venv

# PyCharm
.idea/

# VS Code
.vscode/

# macOS
.DS_Store
.AppleDouble
.LSOverride

# Thumbnails
._*

# Files that might appear in the root of a volume
.DocumentRevisions-V100
.fseventsd
.Spotlight-V100
.TemporaryItems
.Trashes
.VolumeIcon.icns
.com.apple.timemachine.donotpresent

# Windows
Thumbs.db
ehthumbs.db
Desktop.ini
$RECYCLE.BIN/
21 changes: 21 additions & 0 deletions O-4_shape_matching_data-generator/LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 VM Dataset Team

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
64 changes: 64 additions & 0 deletions O-4_shape_matching_data-generator/PUSH_INSTRUCTIONS.md
@@ -0,0 +1,64 @@
# Steps to Push to the GitHub Repository

## Current Status
✅ Git repository initialized
✅ All files committed to the local main branch
✅ Commit ID: fddfd20

## Push Steps

### Option 1: Push to the vm-dataset organization (recommended)

1. Open in a browser: https://github.com/organizations/vm-dataset/repositories/new

2. Fill in the repository details:
- Repository name: `O-4_shape_matching_data-generator`
- Description: `Shape matching task data generator for visual reasoning dataset`
- Visibility: Public
- ⚠️ Do not check "Add a README file"
- ⚠️ Do not check "Add .gitignore"
- ⚠️ Do not check "Choose a license"

3. Click "Create repository"

4. Run the following commands in a terminal:
```bash
cd /workspaces/template-data-generator/O-4_shape_matching_data-generator
git remote add origin https://github.com/vm-dataset/O-4_shape_matching_data-generator.git
git branch -M main
git push -u origin main
```

### Option 2: Push to a personal account (temporary workaround)

1. Open in a browser: https://github.com/new

2. Fill in the repository details:
- Repository name: `O-4_shape_matching_data-generator`
- Description: `Shape matching task data generator for visual reasoning dataset`
- Visibility: Public
- ⚠️ Do not check any of the initialization options

3. Run in a terminal:
```bash
cd /workspaces/template-data-generator/O-4_shape_matching_data-generator
git remote add origin https://github.com/jyizheng/O-4_shape_matching_data-generator.git
git branch -M main
git push -u origin main
```

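4. Afterwards, the repository can be moved to the vm-dataset organization with the Transfer feature

## Verification

After a successful push, visit the repository URL and confirm that:
- All files have been uploaded
- README.md renders correctly
- The file structure is complete

## Project Information

- Domain: shape_matching
- Task ID format: shape_matching_XXXX
- Contains 16 files, 1244 lines of code
- Conforms to the G-1 template and the rules.txt specification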
139 changes: 139 additions & 0 deletions O-4_shape_matching_data-generator/README.md
@@ -0,0 +1,139 @@
# O-4 Shape Matching Data Generator 🔷

A data generator for synthetic "Shape Matching" reasoning tasks. Each generated task shows colored shapes that must be moved into their corresponding dark outlines.

---

## 🚀 Quick Start

```bash
# 1. Clone the repository
git clone https://github.com/vm-dataset/O-4_shape_matching_data-generator.git
cd O-4_shape_matching_data-generator

# 2. Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -e .

# 4. Generate tasks
python examples/generate.py --num-samples 50
```

---

## 📁 Structure

```
O-4_shape_matching_data-generator/
├── core/ # Standard utilities
│ ├── base_generator.py # Abstract base class
│ ├── schemas.py # Pydantic models
│ ├── image_utils.py # Image helpers
│ ├── video_utils.py # Video generation
│ └── output_writer.py # File output
├── src/ # Shape matching task logic
│ ├── generator.py # Shape matching generator
│ ├── prompts.py # Task prompt templates
│ └── config.py # Task configuration
├── examples/
│ └── generate.py # Entry point
└── data/questions/ # Generated output
```

---

## 📦 Output Format

This generator produces:

```
data/questions/shape_matching_task/{task_id}/
├── first_frame.png # Initial state with shapes scattered (REQUIRED)
├── final_frame.png # Final state with shapes matching outlines (REQUIRED)
├── prompt.txt # Instructions (REQUIRED)
└── ground_truth.mp4 # Solution video (OPTIONAL)
```
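
As a quick sanity check on the layout above, a small script can confirm that each task folder holds the three required files. This is a minimal sketch and not part of the generator: the function name `missing_required_files` is hypothetical, and the file names and root path are simply taken from the tree above.

```python
from pathlib import Path

# File names taken from the output tree above; ground_truth.mp4 is optional.
REQUIRED_FILES = ("first_frame.png", "final_frame.png", "prompt.txt")


def missing_required_files(task_dir):
    """Return the required file names that are absent from task_dir."""
    task_dir = Path(task_dir)
    return [name for name in REQUIRED_FILES if not (task_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed root, matching the path shown in the output tree.
    root = Path("data/questions/shape_matching_task")
    task_dirs = sorted(p for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
    for task_dir in task_dirs:
        missing = missing_required_files(task_dir)
        if missing:
            print(f"{task_dir.name}: missing {', '.join(missing)}")
```

Run from the repository root after generation; it prints nothing when every task folder is complete.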

---

## 🎯 Task Description

This generator creates **shape matching tasks** with the following characteristics:

1. **Initial Frame**: A scene containing:
- Colored shapes scattered on the left side (circle, square, triangle, star)
- Dark outline targets on the right side (matching the shapes)
- White background with a dividing line

2. **Animation Process**: Each colored shape moves from its starting position to its matching outline

3. **Final Frame**: All colored shapes are aligned with their corresponding outlines

4. **Task Requirements**:
- Move each colorful shape into its corresponding dark outline
- Shapes must match their target outlines exactly

### Task Specifications

- **Domain**: `shape_matching`
- **Image size**: 800×400 pixels
- **Background**: Pure white with a dividing line
- **FPS**: 30 frames per second
- **Shapes**: circle / square / triangle / star (up to 4 shapes)
- **Animation**: hold 1s at start → linear move 2s → hold 1s at end
- **Target outlines**: Always visible on the right side
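
The animation timing above can be sketched as a per-frame position list for one shape. This is an illustration under stated assumptions, not the code in `core/video_utils.py`: the function name `shape_positions` is hypothetical, and it assumes the move phase includes both endpoints and that no extra frames are added.

```python
FPS = 30            # frames per second, from the specification above
HOLD_START_S = 1.0  # hold at start
MOVE_S = 2.0        # linear move
HOLD_END_S = 1.0    # hold at end


def shape_positions(start, end):
    """Return one (x, y) position per frame for a single shape."""
    hold_start = int(FPS * HOLD_START_S)
    move = int(FPS * MOVE_S)
    hold_end = int(FPS * HOLD_END_S)

    frames = [start] * hold_start
    for i in range(move):
        # t runs from 0 to 1 across the move phase (assumption: endpoints included)
        t = i / (move - 1)
        frames.append((start[0] + t * (end[0] - start[0]),
                       start[1] + t * (end[1] - start[1])))
    frames += [end] * hold_end
    return frames


# Hypothetical coordinates: a shape on the left moving to an outline on the right.
frames = shape_positions((100, 200), (600, 200))
print(len(frames))
```

Under these assumptions each clip is 30 × (1 + 2 + 1) = 120 frames, or 4 seconds.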

### Prompt Format

The prompt provides clear instructions for the task:

```
Move each colorful shape into its corresponding dark outline.
```

---

## 🎨 Customization

### Basic Usage

```bash
# Generate 100 samples
python examples/generate.py --num-samples 100

# Custom output directory
python examples/generate.py --num-samples 50 --output data/my_shapes

# Set random seed for reproducibility
python examples/generate.py --num-samples 50 --seed 42

# Disable video generation
python examples/generate.py --num-samples 50 --no-videos
```

### Configuration

Modify [src/config.py](src/config.py) to customize:

- `domain`: Task domain name (default: `shape_matching`)
- `image_size`: Image dimensions (default: `(800, 400)`)
- `num_shapes`: Number of shapes in task (default: `4`, max: `4`)
- `shape_size`: Size of shapes in pixels (default: `35`)
- `video_fps`: Video frame rate (default: `30`)
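
To make the defaults above concrete, here is a hypothetical stand-in for the settings object. The actual class in `src/config.py` is not shown here and may differ (for example, it may be a Pydantic model); the class name `ShapeMatchingConfig` and the lower bound of one shape are assumptions, while the field names and defaults come from the list above.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_SHAPES = 4  # circle, square, triangle, star


@dataclass(frozen=True)
class ShapeMatchingConfig:
    """Hypothetical mirror of the documented settings, not the real class."""
    domain: str = "shape_matching"
    image_size: Tuple[int, int] = (800, 400)
    num_shapes: int = 4
    shape_size: int = 35
    video_fps: int = 30

    def __post_init__(self):
        # Assumption: at least one shape; the documented maximum is 4.
        if not 1 <= self.num_shapes <= MAX_SHAPES:
            raise ValueError(f"num_shapes must be between 1 and {MAX_SHAPES}")


# Override a single field and keep the other defaults.
config = ShapeMatchingConfig(num_shapes=3)
print(config)
```

Validating `num_shapes` at construction time surfaces an out-of-range value before any images are rendered.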

---

## 📄 License

MIT License - see LICENSE file for details.

---

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
21 changes: 21 additions & 0 deletions O-4_shape_matching_data-generator/core/__init__.py
@@ -0,0 +1,21 @@
"""
Core utilities for template-data-generator.

DO NOT MODIFY - This is framework code.
Customize files in src/ for your task.
"""

from .base_generator import BaseGenerator, GenerationConfig
from .schemas import TaskPair
from .image_utils import ImageRenderer
from .output_writer import OutputWriter
from .video_utils import VideoGenerator

__all__ = [
"BaseGenerator",
"GenerationConfig",
"TaskPair",
"ImageRenderer",
"OutputWriter",
"VideoGenerator",
]