A powerful web application that downloads YouTube channel videos, extracts transcripts with timestamps, analyzes speech patterns and debate techniques, and creates an AI-powered training bot to help you learn effective communication and argumentation skills.
- Download all videos from any YouTube channel
- Extract video metadata (title, duration, views, etc.)
- Support for processing 1000+ videos
- Automatic transcript extraction with timestamps
- Support for both manual and auto-generated captions
- Whisper-powered fallback generates subtitles when no captions exist (requires OpenAI API key)
- Resume-friendly: previously processed videos are skipped automatically (use `--force` to regenerate)
- Multiple export formats:
- JSON: Full transcript data with timestamps
- TXT: Plain text with formatted timestamps
- SRT: Standard subtitle format
- VTT: Web video text tracks
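As a sketch of what the SRT export boils down to, the helpers below format one transcript segment; the function names are illustrative, not the project's actual exporter API. VTT uses the same cue layout but a period instead of a comma in timestamps.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def srt_block(index: int, start: float, end: float, text: str) -> str:
    """One numbered SRT cue: index, timing line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
```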
- Logical Fallacy Detection: Identify common logical fallacies
- Ad hominem attacks
- Straw man arguments
- Appeal to authority
- Slippery slope
- False dichotomy
- And more...
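Detectors like this are typically cue-based: they scan for surface phrases that often signal a fallacy. The sketch below shows the general idea with a few illustrative regex cues; the phrase lists and function name are hypothetical, not the project's actual patterns, and real detection needs context a regex can't see.

```python
import re

# Surface cues only; these phrase lists are illustrative examples.
FALLACY_CUES = {
    "ad hominem": [r"\byou('re| are) (an? )?(idiot|liar|fraud)\b"],
    "appeal to authority": [r"\bexperts (say|agree)\b", r"\bscientists say\b"],
    "slippery slope": [r"\bnext thing you know\b", r"\bwhere does it end\b"],
}

def detect_fallacy_cues(text: str) -> list[str]:
    """Return the fallacy labels whose cue patterns match the text."""
    lowered = text.lower()
    return [label for label, patterns in FALLACY_CUES.items()
            if any(re.search(p, lowered) for p in patterns)]
```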
- Rhetorical Device Analysis:
- Rhetorical questions
- Repetition patterns
- Analogies and metaphors
- Contrast and emphasis
- Rule of three
- Speaking Style Metrics:
- Formality score
- Assertiveness level
- Emotional language usage
- Question frequency
- Average sentence length
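Two of these metrics are simple enough to sketch directly. The function below is illustrative only, assuming sentences are split on terminal punctuation; the project's actual analyzer may differ.

```python
import re

def style_metrics(text: str) -> dict:
    """Compute question frequency and average sentence length (in words)."""
    # Split on runs of terminal punctuation, dropping empty fragments.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    questions = text.count("?")
    words = sum(len(s.split()) for s in sentences)
    n = len(sentences)
    return {
        "sentence_count": n,
        "question_frequency": questions / n if n else 0.0,
        "avg_sentence_length": words / n if n else 0.0,
    }
```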
- Key Phrase Extraction: Most common phrases and language patterns
- Chat with an AI that emulates the YouTuber's speaking style
- Three training modes:
- Practice Mode: Engage in debates with the AI
- Analyze Mode: Get feedback on your arguments
- Learn Mode: Learn specific techniques and strategies
- Personalized based on channel analysis
- View example responses from the actual YouTuber
- Track conversation history
- Python 3.8 or higher
- pip package manager
- ffmpeg installed and on your PATH (recommended; without it audio stays in source format but Whisper still works)
- Clone or download this project
- Create a virtual environment (recommended):

  ```bash
  cd youtube-debate-trainer
  python -m venv venv
  # On macOS/Linux:
  source venv/bin/activate
  # On Windows:
  venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Configure environment variables:

  ```bash
  cp .env.example .env
  ```

  Edit the `.env` file and add your API keys:

  ```bash
  OPENAI_API_KEY=your_openai_api_key_here
  # OR
  ANTHROPIC_API_KEY=your_anthropic_api_key_here
  ```
Note: You need at least ONE API key (OpenAI or Anthropic) for AI trainer features.
- Set `ENABLE_WHISPER_FALLBACK=true` (the default) in `.env` to auto-generate subtitles when a video has no captions
- Requires `OPENAI_API_KEY` and `ffmpeg` installed on your system
- Optional overrides: `WHISPER_MODEL=whisper-1` and `WHISPER_LANGUAGE=en`
- If you drop a standalone `ffmpeg` binary into `youtube-debate-trainer/bin`, the app automatically adds it to `PATH` so you don't need a system-wide install
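The local-`bin` lookup presumably amounts to prepending that directory to `PATH` before ffmpeg is invoked. A minimal sketch, with a hypothetical function name:

```python
import os
from pathlib import Path

def ensure_local_ffmpeg_on_path(project_root: str) -> bool:
    """Prepend <project_root>/bin to PATH if it contains an ffmpeg binary.

    Returns True when the local bin directory was added.
    """
    bin_dir = Path(project_root) / "bin"
    if any((bin_dir / name).exists() for name in ("ffmpeg", "ffmpeg.exe")):
        os.environ["PATH"] = str(bin_dir) + os.pathsep + os.environ.get("PATH", "")
        return True
    return False
```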
To get API keys:
- OpenAI: https://platform.openai.com/api-keys
- Anthropic: https://console.anthropic.com/
- Verify setup (recommended):

  ```bash
  python check_setup.py
  ```

  This preflight script checks dependencies, ffmpeg, environment variables, and Whisper fallback settings. Fix any reported issues before continuing. If you only get a warning about missing ffmpeg, Whisper can still run (it will just use the original audio container).
- Run the application:

  ```bash
  python app.py
  ```

- Open in browser: navigate to `http://localhost:5000`
- Go to the home page
- Enter the YouTube channel URL (e.g., `https://youtube.com/@channelname`)
- Give it a name (e.g., `debater_john`)
- Set maximum videos to process (start with 10-50 for testing)
- Select export formats
- Click "Start Processing"
- Re-running the same channel skips videos you've already processed; add `--force` (CLI) or `"force": true` (API) if you need to rebuild transcripts
- Web/API calls run the same preflight checks as `python check_setup.py`, so you'll get actionable errors if something is missing
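The resume logic likely reduces to an existence check on the per-video transcript file. A sketch, with hypothetical names and assuming transcripts are stored as `<video_id>.json`:

```python
from pathlib import Path

def needs_processing(video_id: str, transcripts_dir: str, force: bool = False) -> bool:
    """Return True when a video still needs a transcript.

    A video is skipped when <transcripts_dir>/<video_id>.json already
    exists, unless force=True requests a rebuild.
    """
    if force:
        return True
    return not (Path(transcripts_dir) / f"{video_id}.json").exists()
```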
The app will:
- Download channel information
- Extract transcripts from all videos
- Analyze speech patterns and techniques
- Save everything in the `data/` directory
Processing time: Approximately 2-5 seconds per video
- Go to the "AI Trainer" page
- Select your processed channel
- Click "Initialize Trainer"
- Choose a training mode:
  - Practice: Debate with the AI
  - Analyze: Get feedback on your arguments
  - Learn: Learn techniques
- Start chatting!
- Scroll to the "Single Video Mode" card on the home page
- Paste any YouTube video URL and click Get Transcript (Free)
- Preview the transcript inline or download in TXT/MD/SRT/VTT/JSON/CSV
- The same functionality is available via `POST /api/transcribe` with body `{ "url": "<youtube url>" }`
- Process a channel of a skilled debater
- Use "Learn Mode" to understand their techniques
- Practice arguments in "Practice Mode"
- Type your argument in "Analyze Mode"
- Get feedback on logical fallacies, rhetorical effectiveness
- Improve your reasoning
- Study the speaking patterns of charismatic speakers
- See their formality, assertiveness, and language patterns
- Practice emulating their style
```
youtube-debate-trainer/
├── app.py                        # Flask web application
├── config.py                     # Configuration settings
├── requirements.txt              # Python dependencies
├── .env                          # Environment variables (create from .env.example)
│
├── app/                          # Core modules
│   ├── youtube_downloader.py     # YouTube video & metadata downloader
│   ├── transcript_extractor.py   # Transcript extraction & export
│   ├── speech_analyzer.py        # Speech pattern & fallacy analysis
│   └── ai_trainer.py             # AI chatbot trainer
│
├── templates/                    # HTML templates
│   ├── base.html                 # Base template
│   ├── index.html                # Home page
│   └── chat.html                 # AI trainer chat interface
│
└── data/                         # Generated data (created automatically)
    ├── videos/                   # Downloaded videos (if enabled)
    ├── audio/                    # Cached audio for Whisper fallback
    ├── transcripts/              # Extracted transcripts
    └── exports/                  # Analysis results & exports
```
- `POST /api/process-channel` - Start processing a channel
- `GET /api/job-status/<job_id>` - Check processing status
- `GET /api/channels` - List all processed channels
- `GET /api/channel/<channel_name>` - Get channel details
- `POST /api/trainer/init/<channel_name>` - Initialize trainer
- `POST /api/trainer/chat/<trainer_id>` - Chat with AI
- `POST /api/trainer/reset/<trainer_id>` - Reset conversation
- `GET /api/trainer/examples/<trainer_id>` - Get example responses
- `GET /api/export/<channel_name>/<format>` - Export data
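A typical client loop polls the job-status endpoint until processing finishes. The sketch below assumes the status JSON carries a `status` field with terminal values `completed`/`failed` (an assumption; check the actual response shape). The fetch function is injected, so you can pass in a real HTTP call against `GET /api/job-status/<job_id>` or a stub for testing:

```python
import time

def wait_for_job(job_id, fetch_status, poll_seconds=2.0, max_polls=1000):
    """Poll the job-status endpoint until the job reaches a terminal state.

    fetch_status is any callable taking a job id and returning the decoded
    JSON status dict; injecting it keeps this helper easy to test.
    """
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish")
```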
Edit the `.env` file to customize:

```bash
# Maximum videos to process per channel
MAX_VIDEOS_PER_CHANNEL=1000

# Download actual video files (requires storage space)
DOWNLOAD_VIDEO_FILES=false

# AI model to use
DEFAULT_AI_MODEL=gpt-4-turbo-preview
# or for Anthropic: claude-3-5-sonnet-20241022

# Whisper fallback controls
ENABLE_WHISPER_FALLBACK=true
WHISPER_MODEL=whisper-1
WHISPER_LANGUAGE=en
```
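For reference, loading these variables with the documented defaults might look like the following. This is a sketch, not the project's actual `config.py`:

```python
import os

def load_settings(env=os.environ):
    """Read the settings above from the environment, falling back to the
    documented defaults when a variable is unset."""
    truthy = {"1", "true", "yes", "on"}
    return {
        "max_videos_per_channel": int(env.get("MAX_VIDEOS_PER_CHANNEL", "1000")),
        "download_video_files": env.get("DOWNLOAD_VIDEO_FILES", "false").lower() in truthy,
        "default_ai_model": env.get("DEFAULT_AI_MODEL", "gpt-4-turbo-preview"),
        "enable_whisper_fallback": env.get("ENABLE_WHISPER_FALLBACK", "true").lower() in truthy,
        "whisper_model": env.get("WHISPER_MODEL", "whisper-1"),
        "whisper_language": env.get("WHISPER_LANGUAGE", "en"),
    }
```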
- Start Small: Process 10-20 videos first to test, then scale up
- Video Selection: The more videos you process, the better the AI understands the speaking style
- API Costs:
  - OpenAI GPT-4: ~$0.01-0.03 per conversation
  - Anthropic Claude: ~$0.015 per conversation
  - Transcript extraction is FREE (no API needed)
- Storage: Each transcript is ~10-50KB, so 100 videos ≈ 1-5MB
- Best Channels to Analyze:
- Debate channels
- Philosophy discussions
- Educational content creators
- Public speakers
- Podcast hosts
- Some videos don't have captions enabled
- Check if the video has manual or auto-generated captions on YouTube
- Verify API key is set correctly in `.env`
- Check API key has credits/quota
- Check console for error messages
- Normal: 2-5 seconds per video
- Depends on video length and transcript availability
- Run in background and check progress
- Missing dependencies: reinstall with `pip install -r requirements.txt`
- The clip is age or region restricted: provide cookies to `yt-dlp`, log in, or remove that video from the batch. The progress card surfaces this warning immediately and moves on to the next video automatically.
- ffmpeg not found: Install via Homebrew (`brew install ffmpeg`), apt (`sudo apt install ffmpeg`), or download binaries and ensure the executable is on your PATH.
- Whisper fallback misconfigured: Provide `OPENAI_API_KEY` in `.env` or set `ENABLE_WHISPER_FALLBACK=false` if you don't want automatic subtitles.
- No AI API keys: Add either `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` before using the AI trainer or Whisper fallback features.
Create a Python script:

```python
from app.youtube_downloader import YouTubeDownloader
from app.transcript_extractor import TranscriptExtractor
from app.speech_analyzer import SpeechAnalyzer
from app.ai_trainer import AITrainer

# Download channel metadata
downloader = YouTubeDownloader()
videos = downloader.get_channel_videos('CHANNEL_URL', max_videos=50)
downloader.save_video_metadata(videos, 'channel_name')

# Extract transcripts
extractor = TranscriptExtractor()
results = extractor.process_channel_transcripts(videos, ['json', 'txt'])

# Analyze speech patterns
analyzer = SpeechAnalyzer()
analysis = analyzer.analyze_channel('channel_name')

# Chat with the trained AI
trainer = AITrainer('channel_name')
response = trainer.chat("What's your view on free speech?", mode='practice')
print(response)
```

- This tool is for educational purposes only
- Respect copyright and fair use
- Get permission before training on private content
- Use responsibly for learning and self-improvement
- Don't use to impersonate or mislead others
This project is provided as-is for educational purposes.
Built with:
- Flask - Web framework
- yt-dlp - YouTube downloader
- youtube-transcript-api - Transcript extraction
- OpenAI / Anthropic - AI capabilities
For issues, questions, or feature requests, please create an issue in the repository.
Happy learning and debating! 🎯