A robust Python-based Telegram bot built with pyTelegramBotAPI that echoes incoming messages as JSON to the console and automatically stores received media (photos and videos).
- Media Handling: Automatically downloads and stores photos and videos sent to the bot.
- Flexible Storage Modes: Choose between `local` disk storage, `s3` (S3/MinIO) cloud storage, or `both` simultaneously.
- Deduplication: Computes a SHA256 hash per file to prevent storing duplicate content for the same user.
- Cross-Platform Paths: Uses Unix-style paths internally (database & S3) and auto-converts to OS-native paths for local disk operations.
- Robust User Registration: Finds or creates users by Telegram ID and username, with built-in protection against race conditions.
- Database Backend: Tracks users and file metadata (including `created_at` timestamps) using SQLAlchemy with SQLite or PostgreSQL.
- Auto-Migrations: Applies database schema updates automatically via Alembic on startup.
- Centralized Logging: Configurable log levels (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`) for all modules.
- Strict Validation: Validates configuration at startup and fails fast with descriptive error messages.
- Maintenance Tooling: A dedicated `maintenance.py` script handles background operations (S3 sync, deduplication sweep) without touching the main bot.
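The deduplication and path-handling features listed above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the helper names (`sha256_of`, `is_duplicate`, `to_native_path`) and the in-memory `seen` set that stands in for the database lookup are assumptions.

```python
import hashlib
from pathlib import Path, PurePosixPath


def sha256_of(data: bytes) -> str:
    """Return the hex SHA256 digest of a file's content."""
    return hashlib.sha256(data).hexdigest()


def is_duplicate(seen: set, user_id: int, data: bytes) -> bool:
    """Report whether this user has already stored identical content.

    `seen` is a hypothetical in-memory stand-in for the database lookup;
    it holds (user_id, digest) pairs and is updated in place.
    """
    key = (user_id, sha256_of(data))
    if key in seen:
        return True
    seen.add(key)
    return False


def to_native_path(stored_path: str) -> Path:
    """Convert a Unix-style stored path into an OS-native Path for disk access."""
    return Path(*PurePosixPath(stored_path).parts)
```

Keying the check on the pair of user ID and digest means two different users can each keep their own copy of the same file, while a repeat upload from one user is detected.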
| File / Directory | Purpose |
|---|---|
| `echo_bot.py` | Main bot entry point: handles polling, media processing, and deduplication. |
| `database.py` | SQLAlchemy models (`User`, `File`) and all database operations. |
| `storage.py` | S3/MinIO storage abstraction (`S3StorageManager`). |
| `maintenance.py` | Standalone maintenance script (S3-to-DB sync, deduplication sweep). |
| `start.sh` | Docker entrypoint: runs Alembic migrations, then starts the bot. |
| `alembic/` | Database migration scripts. |
| `files/` | Local storage directory (auto-created at runtime). |

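The race-safe user registration handled in `database.py` typically relies on a unique constraint plus a retry on conflict. The sketch below shows that pattern with the standard-library `sqlite3` module instead of SQLAlchemy so that it runs on its own; the table layout and the `find_or_create_user` name are assumptions, not the project's actual schema or API.

```python
import sqlite3


def find_or_create_user(conn, telegram_id, username):
    """Return the row id for telegram_id, inserting the user if needed."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO users (telegram_id, username) VALUES (?, ?)",
                (telegram_id, username),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # The UNIQUE constraint fired: the user already exists, possibly
        # inserted by a concurrent writer. Fall back to a lookup.
        row = conn.execute(
            "SELECT id FROM users WHERE telegram_id = ?", (telegram_id,)
        ).fetchone()
        return row[0]


# Hypothetical schema for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "id INTEGER PRIMARY KEY, "
    "telegram_id INTEGER UNIQUE NOT NULL, "
    "username TEXT)"
)
first = find_or_create_user(conn, 42, "alice")
second = find_or_create_user(conn, 42, "alice")
```

Letting the database enforce uniqueness avoids the window between a "does this user exist?" query and the insert, which is where two simultaneous messages from a new user could otherwise create two rows.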
1. Clone the repository:

       git clone <repo-url>
       cd python-telebot-echo

2. Create and activate a virtual environment:

       python -m venv venv
       .\venv\Scripts\activate     # Windows
       source venv/bin/activate    # Linux/macOS

3. Install dependencies:

       pip install -r requirements.txt

Configure the bot using command-line arguments or environment variables. CLI arguments take precedence.
| Argument | Environment Variable | Required | Description |
|---|---|---|---|
| `--token` | `TELEGRAM_BOT_TOKEN` | Yes | Your Telegram Bot Token. |
| `--database-url` | `DATABASE_URL` | No | DB connection string. Defaults to `sqlite:///./bot_database.db`. |
| `--storage-mode` | `STORAGE_MODE` | No | `local`, `s3`, or `both` (default: `local`). |
| `--log-level` | N/A | No | Log verbosity: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL` (default: `INFO`). |

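One common way to give CLI arguments precedence over environment variables is to use the environment value as the `argparse` default. The sketch below follows the options in the table above; the `load_config` function and its structure are assumptions and may differ from how `echo_bot.py` actually parses its settings.

```python
import argparse
import os

STORAGE_MODES = ("local", "s3", "both")
LOG_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def load_config(argv, env=None):
    """Parse settings; a CLI flag overrides its environment variable."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Echo bot settings (sketch)")
    # Environment values act as defaults, so an explicit flag always wins.
    parser.add_argument("--token", default=env.get("TELEGRAM_BOT_TOKEN"))
    parser.add_argument(
        "--database-url",
        default=env.get("DATABASE_URL", "sqlite:///./bot_database.db"),
    )
    parser.add_argument("--storage-mode", default=env.get("STORAGE_MODE", "local"))
    parser.add_argument("--log-level", default="INFO")
    args = parser.parse_args(argv)

    # Fail fast with a descriptive message (parser.error exits with status 2).
    if not args.token:
        parser.error("missing bot token: pass --token or set TELEGRAM_BOT_TOKEN")
    if args.storage_mode not in STORAGE_MODES:
        parser.error(f"--storage-mode must be one of {', '.join(STORAGE_MODES)}")
    if args.log_level not in LOG_LEVELS:
        parser.error(f"--log-level must be one of {', '.join(LOG_LEVELS)}")
    return args
```

The values are validated after parsing rather than with `choices=`, because `argparse` does not check defaults against `choices`, and here the defaults may come from the environment.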
Required when --storage-mode is s3 or both, and for maintenance.py --sync-s3.
| Argument | Environment Variable | Required | Description |
|---|---|---|---|
| `--bucket-endpoint` | `BUCKET_ENDPOINT` | No | S3/MinIO custom endpoint URL (omit for AWS). |
| `--bucket-access-key` | `BUCKET_ACCESS_KEY` | Yes* | S3 Access Key. |
| `--bucket-secret-key` | `BUCKET_SECRET_KEY` | Yes* | S3 Secret Key. |
| `--bucket-name` | `BUCKET_NAME` | Yes* | Target bucket name. |
| `--bucket-region` | `BUCKET_REGION` | No | AWS region (default: `us-east-1`). |

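The conditional requirement marked `Yes*` above could be enforced at startup along these lines. This is a sketch under assumptions: the `BucketSettings` dataclass and `validate_bucket_settings` function are hypothetical names, not part of the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BucketSettings:
    """Hypothetical container for the S3/MinIO options in the table."""
    name: Optional[str] = None
    access_key: Optional[str] = None
    secret_key: Optional[str] = None
    endpoint: Optional[str] = None  # omit for AWS
    region: str = "us-east-1"


def validate_bucket_settings(storage_mode: str, settings: BucketSettings) -> None:
    """Raise ValueError listing every missing option when S3 is in use."""
    if storage_mode not in ("s3", "both"):
        return  # local-only mode needs no bucket settings
    required = {
        "--bucket-name": settings.name,
        "--bucket-access-key": settings.access_key,
        "--bucket-secret-key": settings.secret_key,
    }
    missing = [flag for flag, value in required.items() if not value]
    if missing:
        raise ValueError(
            f"storage mode '{storage_mode}' requires: " + ", ".join(missing)
        )
```

Collecting all missing options before raising gives one descriptive error instead of making the operator fix them one restart at a time.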
Apply migrations first, then start the bot:

    alembic upgrade head
    python echo_bot.py --token <YOUR_TOKEN>

With S3 storage enabled:

    python echo_bot.py --token <YOUR_TOKEN> \
      --storage-mode s3 \
      --bucket-name my-bucket \
      --bucket-access-key <KEY> \
      --bucket-secret-key <SECRET> \
      --bucket-endpoint http://localhost:9000

The provided `Dockerfile` and `start.sh` handle migrations and startup automatically:

    docker build -t my-echo-bot .
    docker run \
      -e TELEGRAM_BOT_TOKEN=<YOUR_TOKEN> \
      -e STORAGE_MODE=s3 \
      -e BUCKET_NAME=my-bucket \
      -e BUCKET_ACCESS_KEY=<KEY> \
      -e BUCKET_SECRET_KEY=<SECRET> \
      my-echo-bot

`maintenance.py` is a standalone utility for administrative tasks. It connects to the same database and S3 bucket as the bot but exits after completing its task; it never starts the polling loop.

Use `--sync-s3` to register files already present in S3 that are not yet in the database, and to sweep for hash duplicates:

    python maintenance.py --sync-s3 \
      --database-url <DB_URL> \
      --bucket-name <BUCKET> \
      --bucket-access-key <KEY> \
      --bucket-secret-key <SECRET> \
      --bucket-endpoint http://localhost:9000

What it does:
- Lists all objects in the S3 bucket.
- For each object, downloads the content and computes its SHA256 hash.
- Skips the file if it is already correctly registered in the database (same S3 URL).
- Deletes the file from S3 if its content is a duplicate of another already registered file.
- Registers the file in the database if it is new.
- Prints a summary of processed / added / deleted / skipped counts.
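The steps above can be sketched with in-memory stand-ins. In this simplified illustration, `bucket` (object key to content) plays the role of S3 and `registered` (object key to SHA256 digest) plays the role of the database; both names, the `sync_bucket` function, and the omission of per-user handling are assumptions rather than the script's actual implementation.

```python
import hashlib


def sync_bucket(bucket: dict, registered: dict) -> dict:
    """Reconcile bucket contents with the registry; both are updated in place."""
    counts = {"processed": 0, "added": 0, "deleted": 0, "skipped": 0}
    known_hashes = set(registered.values())
    # sorted() takes a snapshot of the keys, so deleting while looping is safe.
    for key in sorted(bucket):
        counts["processed"] += 1
        digest = hashlib.sha256(bucket[key]).hexdigest()
        if registered.get(key) == digest:
            # Already correctly registered under the same key.
            counts["skipped"] += 1
        elif digest in known_hashes:
            # Same content is registered elsewhere: remove the duplicate object.
            del bucket[key]
            counts["deleted"] += 1
        else:
            # New content: register it.
            registered[key] = digest
            known_hashes.add(digest)
            counts["added"] += 1
    return counts
```

Checking for an existing registration before checking for a duplicate hash matters: otherwise a file that is already registered would match its own hash and be deleted.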
Manage the schema with Alembic:

    # Apply all pending migrations
    alembic upgrade head

    # Auto-generate a new migration from model changes
    # (PowerShell syntax; on Linux/macOS use: export DATABASE_URL="sqlite:///./bot_database.db")
    $env:DATABASE_URL = "sqlite:///./bot_database.db"
    python -m alembic revision --autogenerate -m "description"

License: MIT