YouTube Debate Trainer

A powerful web application that downloads YouTube channel videos, extracts transcripts with timestamps, analyzes speech patterns and debate techniques, and creates an AI-powered training bot to help you learn effective communication and argumentation skills.

Features

1. YouTube Channel Processing

  • Download all videos from any YouTube channel
  • Extract video metadata (title, duration, views, etc.)
  • Support for processing 1000+ videos

2. Transcript Extraction

  • Automatic transcript extraction with timestamps
  • Support for both manual and auto-generated captions
  • Whisper-powered fallback generates subtitles when no captions exist (requires OpenAI API key)
  • Resume-friendly: previously processed videos are skipped automatically (use --force to regenerate)
  • Multiple export formats:
    • JSON: Full transcript data with timestamps
    • TXT: Plain text with formatted timestamps
    • SRT: Standard subtitle format
    • VTT: Web video text tracks
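To make the export formats concrete, here is a minimal sketch of how timestamped segments map onto SRT. The segment shape (`start`/`duration`/`text` keys) is an assumption for illustration; the app's actual exporter may represent transcripts differently.

```python
# Hypothetical segment shape; the app's internal transcript format may differ.
segments = [
    {"start": 0.0, "duration": 4.2, "text": "Welcome to the channel."},
    {"start": 4.2, "duration": 3.1, "text": "Today we debate free speech."},
]

def to_timestamp(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def to_srt(segments):
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for i, seg in enumerate(segments, 1):
        start = to_timestamp(seg["start"])
        end = to_timestamp(seg["start"] + seg["duration"])
        blocks.append(f"{i}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"
```

VTT differs mainly in its `WEBVTT` header and dot-separated milliseconds, so the same segment data drives both exports.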

3. Speech Pattern Analysis

  • Logical Fallacy Detection: Identify common logical fallacies

    • Ad hominem attacks
    • Straw man arguments
    • Appeal to authority
    • Slippery slope
    • False dichotomy
    • And more...
  • Rhetorical Device Analysis:

    • Rhetorical questions
    • Repetition patterns
    • Analogies and metaphors
    • Contrast and emphasis
    • Rule of three
  • Speaking Style Metrics:

    • Formality score
    • Assertiveness level
    • Emotional language usage
    • Question frequency
    • Average sentence length
  • Key Phrase Extraction: Most common phrases and language patterns
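The speaking-style metrics above are straightforward to compute from raw transcript text. This is a rough sketch of the idea, not the app's actual `speech_analyzer.py` implementation, which may use more sophisticated NLP.

```python
import re

def style_metrics(text):
    """Rough speaking-style metrics: sentence count, average sentence
    length in words, and question frequency (illustrative only)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = text.split()
    questions = sum(1 for s in sentences if s.endswith("?"))
    return {
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "question_frequency": questions / max(len(sentences), 1),
    }
```

For example, `style_metrics("Is that fair? I think so. We must act now!")` reports three sentences with one question, giving a question frequency of one third.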

4. AI Training Bot

  • Chat with an AI that emulates the YouTuber's speaking style

  • Three training modes:

    • Practice Mode: Engage in debates with the AI
    • Analyze Mode: Get feedback on your arguments
    • Learn Mode: Learn specific techniques and strategies
  • Personalized based on channel analysis

  • View example responses from the actual YouTuber

  • Track conversation history

Installation

Prerequisites

  • Python 3.8 or higher
  • pip package manager
  • ffmpeg installed and on your PATH (recommended; without it audio stays in source format but Whisper still works)

Setup

  1. Clone or download this project

  2. Create a virtual environment (recommended):

cd youtube-debate-trainer
python -m venv venv

# On macOS/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Configure environment variables:
cp .env.example .env

Edit .env file and add your API keys:

OPENAI_API_KEY=your_openai_api_key_here
# OR
ANTHROPIC_API_KEY=your_anthropic_api_key_here

Note: You need at least ONE API key (OpenAI or Anthropic) for AI trainer features.

Whisper Auto-Subtitles

  • Set ENABLE_WHISPER_FALLBACK=true (default) in .env to auto-generate subtitles when a video has no captions
  • Requires OPENAI_API_KEY and ffmpeg installed on your system
  • Optional overrides:
    • WHISPER_MODEL=whisper-1
    • WHISPER_LANGUAGE=en
  • If you drop a standalone ffmpeg binary into youtube-debate-trainer/bin, the app automatically adds it to PATH so you don't need a system-wide install
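Prepending a project-local bin/ directory to PATH is a simple trick. This sketch is purely illustrative of the mechanism; the app's actual implementation may differ.

```python
import os
from pathlib import Path

# Hypothetical illustration: prepend a project-local bin/ directory so a
# dropped-in ffmpeg binary is found before any system-wide install.
local_bin = Path("youtube-debate-trainer") / "bin"
os.environ["PATH"] = str(local_bin) + os.pathsep + os.environ.get("PATH", "")
```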

To get API keys: create one in the OpenAI platform dashboard or the Anthropic Console, whichever provider you plan to use.

  5. Verify setup (recommended):
python check_setup.py

This preflight script checks dependencies, ffmpeg, environment variables, and Whisper fallback settings. Fix any reported issues before continuing. If you only get a warning about missing ffmpeg, Whisper can still run (it will just use the original audio container).
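The kinds of checks the preflight script performs can be sketched as follows. This is an illustrative stand-in, not the real check_setup.py.

```python
import os
import shutil

def preflight():
    """Illustrative preflight checks: ffmpeg presence, Whisper fallback
    configuration, and AI API key availability."""
    issues = []
    if shutil.which("ffmpeg") is None:
        issues.append("warning: ffmpeg not found "
                      "(Whisper will use the original audio container)")
    if (os.getenv("ENABLE_WHISPER_FALLBACK", "true").lower() == "true"
            and not os.getenv("OPENAI_API_KEY")):
        issues.append("error: Whisper fallback enabled but OPENAI_API_KEY is not set")
    if not (os.getenv("OPENAI_API_KEY") or os.getenv("ANTHROPIC_API_KEY")):
        issues.append("error: no AI API key set (trainer features unavailable)")
    return issues
```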

  6. Run the application:
python app.py
  7. Open in browser: Navigate to http://localhost:5000

Usage Guide

Step 1: Process a YouTube Channel

  1. Go to the home page
  2. Enter the YouTube channel URL (e.g., https://youtube.com/@channelname)
  3. Give it a name (e.g., debater_john)
  4. Set maximum videos to process (start with 10-50 for testing)
  5. Select export formats
  6. Click "Start Processing"
    • Re-running the same channel skips videos you've already processed; add --force (CLI) or "force": true (API) if you need to rebuild transcripts
    • Web/API calls run the same preflight checks as python check_setup.py, so you'll get actionable errors if something is missing

The app will:

  • Download channel information
  • Extract transcripts from all videos
  • Analyze speech patterns and techniques
  • Save everything in the data/ directory

Processing time: Approximately 2-5 seconds per video

Step 2: Train with AI

  1. Go to "AI Trainer" page

  2. Select your processed channel

  3. Click "Initialize Trainer"

  4. Choose a training mode:

    • Practice: Debate with the AI
    • Analyze: Get feedback on your arguments
    • Learn: Learn techniques
  5. Start chatting!

Step 3: Single Video Mode (Transcribr clone)

  1. Scroll to the "Single Video Mode" card on the home page
  2. Paste any YouTube video URL and click Get Transcript (Free)
  3. Preview the transcript inline or download in TXT/MD/SRT/VTT/JSON/CSV
  4. The same functionality is available via POST /api/transcribe with body { "url": "<youtube url>" }
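Calling the endpoint from Python might look like this. The `url` field comes from the request body described above; the snippet only builds the request, with the actual call left commented out so it can be enabled once the app is running.

```python
import json
import urllib.request

# Build a POST to the single-video transcription endpoint.
payload = json.dumps({"url": "https://www.youtube.com/watch?v=VIDEO_ID"}).encode()
req = urllib.request.Request(
    "http://localhost:5000/api/transcribe",
    data=payload,
    headers={"Content-Type": "application/json"},
)

# with urllib.request.urlopen(req) as resp:  # uncomment with the app running
#     print(resp.read().decode())
```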

Example Use Cases

1. Learn Debate Techniques

  • Process a channel of a skilled debater
  • Use "Learn Mode" to understand their techniques
  • Practice arguments in "Practice Mode"

2. Analyze Your Arguments

  • Type your argument in "Analyze Mode"
  • Get feedback on logical fallacies, rhetorical effectiveness
  • Improve your reasoning

3. Improve Communication Style

  • Study the speaking patterns of charismatic speakers
  • See their formality, assertiveness, and language patterns
  • Practice emulating their style

Project Structure

youtube-debate-trainer/
├── app.py                      # Flask web application
├── config.py                   # Configuration settings
├── requirements.txt            # Python dependencies
├── .env                        # Environment variables (create from .env.example)
│
├── app/                        # Core modules
│   ├── youtube_downloader.py  # YouTube video & metadata downloader
│   ├── transcript_extractor.py # Transcript extraction & export
│   ├── speech_analyzer.py     # Speech pattern & fallacy analysis
│   └── ai_trainer.py          # AI chatbot trainer
│
├── templates/                  # HTML templates
│   ├── base.html              # Base template
│   ├── index.html             # Home page
│   └── chat.html              # AI trainer chat interface
│
└── data/                       # Generated data (created automatically)
    ├── videos/                # Downloaded videos (if enabled)
    ├── audio/                 # Cached audio for Whisper fallback
    ├── transcripts/           # Extracted transcripts
    └── exports/               # Analysis results & exports

API Endpoints

Channel Processing

  • POST /api/process-channel - Start processing a channel
  • GET /api/job-status/<job_id> - Check processing status
  • GET /api/channels - List all processed channels
  • GET /api/channel/<channel_name> - Get channel details

AI Trainer

  • POST /api/trainer/init/<channel_name> - Initialize trainer
  • POST /api/trainer/chat/<trainer_id> - Chat with AI
  • POST /api/trainer/reset/<trainer_id> - Reset conversation
  • GET /api/trainer/examples/<trainer_id> - Get example responses

Export

  • GET /api/export/<channel_name>/<format> - Export data

Configuration Options

Edit .env file to customize:

# Maximum videos to process per channel
MAX_VIDEOS_PER_CHANNEL=1000

# Download actual video files (requires storage space)
DOWNLOAD_VIDEO_FILES=false

# AI model to use
DEFAULT_AI_MODEL=gpt-4-turbo-preview
# or for Anthropic: claude-3-5-sonnet-20241022

# Whisper fallback controls
ENABLE_WHISPER_FALLBACK=true
WHISPER_MODEL=whisper-1
WHISPER_LANGUAGE=en
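Reading these settings in code might look like the sketch below. The environment variable names come from the .env example above; the Python variable names and defaults are illustrative, and config.py may differ.

```python
import os

# Fall back to the documented defaults when a variable is unset.
os.environ.setdefault("MAX_VIDEOS_PER_CHANNEL", "1000")
os.environ.setdefault("ENABLE_WHISPER_FALLBACK", "true")

max_videos = int(os.getenv("MAX_VIDEOS_PER_CHANNEL", "1000"))
whisper_enabled = os.getenv("ENABLE_WHISPER_FALLBACK", "true").lower() == "true"
whisper_model = os.getenv("WHISPER_MODEL", "whisper-1")
```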

Tips & Best Practices

  1. Start Small: Process 10-20 videos first to test, then scale up

  2. Video Selection: The more videos you process, the better the AI understands the speaking style

  3. API Costs:

    • OpenAI GPT-4: ~$0.01-0.03 per conversation
    • Anthropic Claude: ~$0.015 per conversation
    • Transcript extraction is FREE (no API needed)
  4. Storage: Each transcript is ~10-50KB. 100 videos ≈ 1-5MB

  5. Best Channels to Analyze:

    • Debate channels
    • Philosophy discussions
    • Educational content creators
    • Public speakers
    • Podcast hosts

Troubleshooting

No transcripts extracted

  • Some videos don't have captions enabled
  • Check if the video has manual or auto-generated captions on YouTube

AI not working

  • Verify API key is set correctly in .env
  • Check API key has credits/quota
  • Check console for error messages

Slow processing

  • Normal: 2-5 seconds per video
  • Depends on video length and transcript availability
  • Run in background and check progress

Module not found errors

pip install -r requirements.txt

"The downloaded file is empty"

  • The video is age- or region-restricted. Provide cookies to yt-dlp, log in, or remove that video from the batch.
  • The progress card surfaces this warning immediately, and processing moves on to the next video automatically.

Preflight errors

  • ffmpeg not found: Install via Homebrew (brew install ffmpeg), apt (sudo apt install ffmpeg), or download binaries and ensure the executable is on your PATH.
  • Whisper fallback misconfigured: Provide OPENAI_API_KEY in .env or set ENABLE_WHISPER_FALLBACK=false if you don't want automatic subtitles.
  • No AI API keys: Add either OPENAI_API_KEY or ANTHROPIC_API_KEY before using the AI trainer or Whisper fallback features.

Advanced Usage

CLI Mode (without web interface)

Create a Python script:

from app.youtube_downloader import YouTubeDownloader
from app.transcript_extractor import TranscriptExtractor
from app.speech_analyzer import SpeechAnalyzer
from app.ai_trainer import AITrainer

# Download channel
downloader = YouTubeDownloader()
videos = downloader.get_channel_videos('CHANNEL_URL', max_videos=50)
downloader.save_video_metadata(videos, 'channel_name')

# Extract transcripts
extractor = TranscriptExtractor()
results = extractor.process_channel_transcripts(videos, ['json', 'txt'])

# Analyze
analyzer = SpeechAnalyzer()
analysis = analyzer.analyze_channel('channel_name')

# Train
trainer = AITrainer('channel_name')
response = trainer.chat("What's your view on free speech?", mode='practice')
print(response)

Privacy & Ethics

  • This tool is for educational purposes only
  • Respect copyright and fair use
  • Get permission before training on private content
  • Use responsibly for learning and self-improvement
  • Don't use to impersonate or mislead others

License

This project is provided as-is for educational purposes.

Credits

Built with:

  • Flask - Web framework
  • yt-dlp - YouTube downloader
  • youtube-transcript-api - Transcript extraction
  • OpenAI / Anthropic - AI capabilities

Support

For issues, questions, or feature requests, please create an issue in the repository.


Happy learning and debating! 🎯
