Quick Start Guide

Get your first dataset ready in about 10 minutes! This guide walks you through installing FineFoundry and collecting your first batch of training data.

What You'll Need

A computer running Windows, macOS, or Linux
Python 3.10 or newer (download it from python.org if you don't have it)
An internet connection for downloading and collecting data

Optional (for sharing your work online):

A free Hugging Face account
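
To confirm the Python requirement before installing, a short check like the one below can help. This is a minimal sketch and not part of FineFoundry; the name python_is_new_enough is made up for illustration.

```python
import sys

# Minimum version stated in this guide.
MIN_VERSION = (3, 10)


def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_new_enough():
        print(f"Python {current} meets the 3.10+ requirement.")
    else:
        print(f"Python {current} is too old; install 3.10 or newer.")
```

Save it to any file name and run it with the same python command you plan to use for the install, since a machine can have several Python versions side by side.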

Step 1: Download and Install

Option A: The Easy Way (Recommended)

Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and run these commands:

git clone https://github.com/SourceBox-LLC/FineFoundry.git FineFoundry-Core
cd FineFoundry-Core
pip install uv

Then start the app:

On Mac/Linux:

chmod +x run_finefoundry.sh
./run_finefoundry.sh

On Windows:

uv run src/main.py

Option B: Traditional Installation

If Option A doesn't work, try this instead:

git clone https://github.com/SourceBox-LLC/FineFoundry.git FineFoundry-Core
cd FineFoundry-Core
python -m venv venv

Activate your virtual environment:

Mac/Linux: source venv/bin/activate
Windows (PowerShell): .\venv\Scripts\Activate.ps1
Windows (Command Prompt): venv\Scripts\activate.bat

Then install and run:

pip install -e .
python src/main.py

Step 2: Tour the App

When FineFoundry opens, you'll see a window with tabs across the top:

Data Sources — Where you collect data from websites or documents
Publish — Prepare and share your datasets
Training — Teach AI models with your data
Inference — Test your trained models by chatting with them
Merge Datasets — Combine multiple data collections
Analysis — Check your data quality
Settings — Set up accounts and preferences

Step 3: Collect Your First Data

Let's grab some data to work with:

Click the "Data Sources" tab
Choose a source — For this example, select "4chan"
Pick some boards — Click a few board chips like b, pol, or x
Set your limits:
- Max Threads: 50
- Max Pairs: 500
- Delay: 0.5
- Min Length: 10
Click "Start"

Watch the progress bar and logs as data flows in. This usually takes 1-3 minutes.
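
To give a feel for how the four limits in this step might interact, here is a rough sketch of a collection loop. It is not FineFoundry's actual code. It assumes that Delay is a pause in seconds between threads and that Min Length is a minimum character count for each side of a pair; CollectionLimits and collect_pairs are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (input text, output text)


@dataclass
class CollectionLimits:
    # Defaults mirror the values suggested in this guide.
    max_threads: int = 50
    max_pairs: int = 500
    delay: float = 0.5      # assumed: seconds to wait between threads
    min_length: int = 10    # assumed: minimum characters per side


def collect_pairs(
    threads: Iterable[List[Pair]],
    limits: CollectionLimits,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Pair]:
    """Gather pairs thread by thread until a limit is reached."""
    collected: List[Pair] = []
    for index, thread in enumerate(threads):
        # Stop once either the thread cap or the pair cap is hit.
        if index >= limits.max_threads or len(collected) >= limits.max_pairs:
            break
        # Pause between threads to avoid hammering the source.
        if index > 0:
            sleep(limits.delay)
        for prompt, reply in thread:
            # Skip pairs where either side is too short.
            if len(prompt) < limits.min_length or len(reply) < limits.min_length:
                continue
            collected.append((prompt, reply))
            if len(collected) >= limits.max_pairs:
                break
    return collected
```

Under these assumptions, lowering Max Pairs ends a run sooner, while raising Delay makes it slower but gentler on the site being collected from.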

Step 4: Preview Your Data

When the collection finishes:

Click "Preview Dataset"
You'll see a two-column view showing conversation pairs

Each row shows an "input" (like a question or prompt) and an "output" (the response). This is what the AI will learn from!

Your data is automatically saved, so you won't lose it if you close the app.
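
For readers who think in code, the input/output pairs from the preview can be pictured as a list of small records. The sketch below assumes each pair is a dictionary with "input" and "output" keys and prints a plain-text two-column view; the on-disk format FineFoundry uses is not described in this guide, and format_preview is a hypothetical helper.

```python
from typing import Dict, List


def format_preview(pairs: List[Dict[str, str]], width: int = 24) -> str:
    """Render pairs as a two-column text table, clipping long cells."""

    def clip(text: str) -> str:
        # Collapse whitespace so each pair stays on one row.
        text = " ".join(text.split())
        return text if len(text) <= width else text[: width - 3] + "..."

    header = f"{'input':<{width}} | {'output':<{width}}"
    rows = [
        f"{clip(pair['input']):<{width}} | {clip(pair['output']):<{width}}"
        for pair in pairs
    ]
    return "\n".join([header, "-" * len(header)] + rows)


if __name__ == "__main__":
    sample = [
        {"input": "What is uv?", "output": "A Python package manager."},
        {"input": "Does it work on Linux?", "output": "Yes."},
    ]
    print(format_preview(sample))
```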

What's Next?

You've just collected your first dataset! Here's what you can do now:

Want to train a model?

Go to the Training Tab Guide to learn how to teach an AI using your data.

Want to share your dataset?

Go to the Publish Tab to upload it to Hugging Face.

Want more data?

Try different sources (Reddit, Stack Exchange)
Collect from multiple boards
Generate synthetic data from your own documents

Want to combine datasets?

Use the Merge Datasets Tab to mix data from different collections.

Having Problems?

App won't start? Make sure Python 3.10+ is installed (check with python --version)
No data collected? Check your internet connection and try different boards
Other issues? See the Troubleshooting Guide

Still stuck? Ask for help in GitHub Discussions.

Next: Data Sources Tab | Back to Documentation Index
