🛰️ A newsroom-grade Telegram userbot for real-time news monitoring, duplicate suppression, breaking alert routing, smart digests, and private evidence-first queries.
TeleUserBot turns a noisy pile of Telegram channels into a cleaner, sharper, more useful intelligence feed.
It runs on your personal Telegram account with Telethon, listens to channels and folder feeds in real time, filters weak or repeated posts, pushes urgent developments fast, rolls everything else into polished digests, and lets you ask private questions against recent coverage.
If you want something that feels closer to a private monitoring desk than a repost bot, this is what the project is built for.
Most Telegram monitoring setups break in the same places:
- the same update gets reposted everywhere
- weak signals get dressed up as breaking news
- raw channel dumps are hard to read at scale
- searching recent coverage inside Telegram is painful
- media-only posts often carry important text that never gets surfaced
TeleUserBot fixes that by combining:
- real-time Telegram intake
- multi-layer duplicate suppression
- severity-aware routing
- breaking-story continuity
- hourly and daily digest generation
- OCR translation for media-only posts
- private query mode with Telegram-first evidence search
- optional trusted web fallback when Telegram evidence is thin
- Telegram HTML output with optional premium emoji rendering
- Listen from a shared Telegram folder invite via FOLDER_INVITE_LINK
- Add manual sources through EXTRA_SOURCES
- Run as a real userbot, not only a bot-token listener
- Deliver output to a user destination or through the Telegram Bot API
- Text fingerprinting and hybrid duplicate scoring
- Media signature checks for reposted images and albums
- Visual media hashing for same-image or recompressed media
- SQLite-backed memory so duplicate defense survives restarts
- High-severity posts can go out immediately
- Medium and low priority updates can be queued for digest
- Breaking follow-ups can stay attached to the same evolving story
- Optional opinionated breaking style via BREAKING_STYLE_MODE
- Hourly digest mode
- Daily 24-hour digest mode
- Configurable queue windows and size limits
- Optional pin rotation for latest digest posts
- Ask questions in Saved Messages
- Or use a private chat with your own bot
- Search recent Telegram evidence first
- Fall back to trusted web coverage only when configured and necessary
- Image OCR for posts without captions
- First-frame video OCR for media-only videos
- Translation only when non-English text is detected
- No invented visual descriptions, no fake captions
- Telegram HTML formatting
- Optional premium emoji mapping
- Reply-thread continuity when source posts are part of a thread
- Delivery tuned for feed readability instead of channel spam
TeleUserBot/
├── main.py
├── config.py
├── auth.py
├── ai_filter.py
├── breaking_story.py
├── db.py
├── news_signals.py
├── news_taxonomy.py
├── severity_classifier.py
├── utils.py
├── web_server.py
├── tests/
├── install-all.ps1
├── install-all-ubuntu.sh
├── .env.example
└── README.md
Runtime state lives outside the repo in:
~/.tg_userbot/
That directory stores runtime metadata such as:
- SQLite state
- auth payloads and caches
- logs
- delivery and pipeline metadata
- TeleUserBot connects to your Telegram account.
- It resolves sources from your shared folder and extra channels.
- Incoming posts pass through duplicate, OCR, and severity logic.
- High-signal updates can be delivered instantly.
- Everything else is organized into digest workflows and searchable history.
- Python 3.11+
- Latest available Python 3 release is preferred
- Telegram api_id and api_hash from https://my.telegram.org
- A Telegram account for Telethon login
- Optional bot token for Bot API delivery or bot-PM query mode
- Optional OCR system packages if you want image/video text extraction
git clone https://github.com/therayyanawaz/TeleUserBot.git
cd TeleUserBot
python3 -m venv .venv
source .venv/bin/activate
Windows PowerShell:
py -3 -m venv .venv
.\.venv\Scripts\Activate.ps1
The examples above intentionally use the default Python 3 launcher behavior so your fork can pick up the newest installed Python 3 version, while still expecting 3.11 or newer.
python -m pip install --upgrade pip setuptools wheel
pip install -r requirements.txt
pip install -r requirements.optional.txt
Optional extras enable heavier features like OCR helpers and sentence-transformers support if you explicitly choose to use them.
cp .env.example .env
Windows PowerShell:
Copy-Item .env.example .env
python main.py
If you want the faster path:
.\install-all.ps1
This script:
- selects the newest installed Python 3.11+ interpreter, or installs the newest available Python 3 package if needed
- creates .venv
- installs requirements.txt and requirements.optional.txt
- installs FFmpeg
- installs Tesseract OCR
- warms the sentence-transformers cache
bash install-all-ubuntu.sh
This script:
- selects the newest installed Python 3.11+ interpreter, or installs the newest available python3.x package when needed
- installs FFmpeg and Tesseract
- installs multilingual OCR language packs
- creates .venv
- installs all Python dependencies
- warms the sentence-transformers cache
A lean starter .env looks like this:
TELEGRAM_API_ID=123456
TELEGRAM_API_HASH="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
FOLDER_INVITE_LINK="https://t.me/addlist/xxxxxxxxxx"
# EXTRA_SOURCES=["@channel1","https://t.me/+privateInviteHash"]
# Choose one destination mode
DESTINATION="@your_private_channel_or_chat"
# OR
# BOT_DESTINATION_TOKEN="123456:ABCDEF..."
# BOT_DESTINATION_CHAT_ID="7777826640"
Important behavior:
- If both DESTINATION and bot-destination values are set, bot destination mode wins
- FOLDER_INVITE_LINK is optional if you prefer EXTRA_SOURCES
- BOT_DESTINATION_CHAT_ID should usually be your delivery chat, not your query PM
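The precedence rule above can be sketched in a few lines of Python. This is an illustration only, not the project's config code: `pick_destination_mode` is a hypothetical helper, and it assumes "bot-destination values" means both BOT_DESTINATION_TOKEN and BOT_DESTINATION_CHAT_ID are set.

```python
from typing import Mapping


def pick_destination_mode(env: Mapping[str, str]) -> str:
    """Return "bot" or "user" following the documented precedence.

    Hypothetical helper: the real project may validate these values differently.
    """
    # Assumption: bot mode needs both the token and the chat id.
    has_bot = bool(env.get("BOT_DESTINATION_TOKEN")) and bool(
        env.get("BOT_DESTINATION_CHAT_ID")
    )
    if has_bot:
        # Bot destination wins even when DESTINATION is also set.
        return "bot"
    if env.get("DESTINATION"):
        return "user"
    raise ValueError("no delivery destination configured")


if __name__ == "__main__":
    both = {
        "DESTINATION": "@your_private_channel_or_chat",
        "BOT_DESTINATION_TOKEN": "123456:ABCDEF...",
        "BOT_DESTINATION_CHAT_ID": "7777826640",
    }
    print(pick_destination_mode(both))
```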
This project uses a Codex-style OAuth flow, not a plain API key setup.
Recommended for Replit, servers, and long-running deployments:
OPENAI_AUTH_ENV_ONLY=true
TG_USERBOT_AUTH_JSON_B64="..."
Bootstrap auth into environment form:
python auth.py bootstrap-env
Or write auth values into .env:
python auth.py setup-env
OPENAI_AUTH_ENV_ONLY=false
Then log in with browser OAuth:
python auth.py login
Local interactive startup can repair missing or stale auth automatically and continue startup in the same process.
python auth.py login
python auth.py login --env-file .env
python auth.py status
python auth.py logout
python auth.py logout --env-file .env
Each incoming Telegram post is evaluated and can be:
- skipped as a duplicate
- routed as a fast breaking alert
- added to digest
- attached to an existing story thread
Instead of forwarding every post as-is, TeleUserBot can publish:
- hourly digests
- daily digests
Digest mode is designed for people who want signal density without raw-feed chaos.
Ask natural-language questions such as:
- latest tehran news
- what happened in last 24 hours
- recent beirut updates
- who died recently in iran
The assistant checks recent Telegram evidence first and only uses trusted web fallback when configured and when Telegram results are too weak.
Recommended baseline:
DIGEST_MODE=true
DIGEST_INTERVAL_MINUTES=60
DIGEST_DAILY_TIMES=["00:00"]
DIGEST_DAILY_WINDOW_HOURS=24
DIGEST_MAX_POSTS=80
DIGEST_QUEUE_CLEAR_INTERVAL_MINUTES=0
OUTPUT_LANGUAGE="English"
Optional digest pin rotation:
DIGEST_PIN_HOURLY=false
DIGEST_PIN_DAILY=false
When enabled:
- the newest digest of that type is pinned
- the previous pinned digest of that type is unpinned
- pin failure does not block digest delivery
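The pin rotation rules above can be sketched as follows. This is a minimal illustration, not the project's scheduler: `rotate_pin` and its `pin` / `unpin` callbacks are hypothetical stand-ins for whatever Telegram calls the bot actually makes, and the sketch assumes the digest has already been delivered before it runs.

```python
from typing import Callable, Optional


def rotate_pin(
    new_message_id: int,
    previous_pinned_id: Optional[int],
    pin: Callable[[int], None],
    unpin: Callable[[int], None],
) -> Optional[int]:
    """Pin the newest digest, unpin the previous one, and never raise.

    Returns the id that should now be remembered as the pinned digest.
    """
    try:
        pin(new_message_id)
    except Exception:
        # Pin failure must not block delivery; keep the old pin state.
        return previous_pinned_id
    if previous_pinned_id is not None and previous_pinned_id != new_message_id:
        try:
            unpin(previous_pinned_id)
        except Exception:
            # A failed unpin is tolerated as well in this sketch.
            pass
    return new_message_id
```

Keeping one remembered id per digest type (hourly and daily) would let the two rotations run independently.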
Duplicate defense runs in layers.
- normalized text fingerprinting
- hybrid similarity scoring
- recent duplicate memory
- same-image detection
- recompressed-media detection
- album signature tracking
- persistent dedupe memory in SQLite
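As a rough illustration of the text layers above, the sketch below combines a normalized fingerprint with a similarity ratio. It is not the project's implementation: the function names, the normalization rules, the 0.9 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for brevity.

```python
import hashlib
import re
import unicodedata
from difflib import SequenceMatcher
from typing import Iterable


def normalize_text(text: str) -> str:
    """Lowercase, drop URLs and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.split())


def fingerprint(text: str) -> str:
    """Stable hash of the normalized text, suitable as a lookup key."""
    return hashlib.sha256(normalize_text(text).encode("utf-8")).hexdigest()


def is_duplicate(candidate: str, recent: Iterable[str], threshold: float = 0.9) -> bool:
    """Exact match on normalized text first, then a fuzzy similarity check."""
    norm = normalize_text(candidate)
    for earlier in recent:
        earlier_norm = normalize_text(earlier)
        if norm == earlier_norm:
            return True
        # Assumed similarity measure; the real scoring may be very different.
        if SequenceMatcher(None, norm, earlier_norm).ratio() >= threshold:
            return True
    return False
```

Storing the fingerprints in a SQLite table rather than an in-memory list is what would let this kind of check survive restarts.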
When follow-up posts arrive as replies in the source channel, the bot can preserve that relationship in the destination feed.
High-level flow:
- high → immediate alert
- medium/low → digest queue
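The flow above amounts to a small routing decision. The sketch below shows one way to express it; the `Severity` enum and the `route` function are hypothetical names for illustration and do not reflect the actual contents of severity_classifier.py.

```python
from enum import Enum


class Severity(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def route(severity: Severity) -> str:
    """Map a severity level to a delivery path, as described in the flow."""
    if severity is Severity.HIGH:
        return "immediate_alert"
    # Medium and low priority posts wait for the next digest.
    return "digest_queue"


if __name__ == "__main__":
    for level in Severity:
        print(level.value, "->", route(level))
```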
Breaking tone can be tuned with:
BREAKING_STYLE_MODE=unhinged
Modes:
- unhinged gives harder-hitting breaking formatting and adds context only when the story linkage is strong enough
- classic restores a more restrained layout
OCR behavior is intentionally conservative.
- captioned media keeps the original Telegram caption
- image-only posts get a caption only if OCR finds meaningful non-English text and translation succeeds
- video-only posts use first-frame OCR
- English OCR text is ignored
- failed OCR adds nothing
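These rules can be read as a single decision function. The sketch below is a hedged illustration: `choose_caption` is a hypothetical helper, OCR, language detection, and translation are assumed to have happened elsewhere and arrive as plain arguments, and the 12-character minimum simply mirrors MEDIA_TEXT_OCR_MIN_CHARS from the example config.

```python
from typing import Optional


def choose_caption(
    original_caption: Optional[str],
    ocr_text: Optional[str],
    ocr_language: Optional[str],
    translation: Optional[str],
    min_chars: int = 12,
) -> Optional[str]:
    """Return the caption to publish, or None when nothing should be added."""
    if original_caption:
        # Captioned media keeps the original Telegram caption.
        return original_caption
    if not ocr_text or len(ocr_text.strip()) < min_chars:
        # Failed or too-short OCR adds nothing.
        return None
    if ocr_language == "en":
        # English OCR text is ignored.
        return None
    # A caption appears only when translation succeeded.
    return translation or None
```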
Example config:
MEDIA_TEXT_OCR_ENABLED=true
MEDIA_TEXT_OCR_VIDEO_ENABLED=true
MEDIA_TEXT_OCR_MIN_CHARS=12
MEDIA_TEXT_OCR_MAX_CHARS=1600
MEDIA_TEXT_OCR_VIDEO_MAX_MB=25
MEDIA_TEXT_OCR_LANGS=eng+ara+fas+urd+rus
If you want OCR on Linux:
sudo apt-get update
sudo apt-get install -y tesseract-ocr ffmpeg
sudo apt-get install -y tesseract-ocr-ara tesseract-ocr-fas tesseract-ocr-urd tesseract-ocr-rus
Allowed contexts:
- Saved Messages
- private chat with your own bot account
Not allowed:
- groups
- channels
- arbitrary private chats with other users
This restriction is intentional and keeps the query workflow private and predictable.
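A check along these lines could enforce the restriction. This is a sketch under assumptions: `is_query_allowed` and its parameters are hypothetical, and it relies on the fact that Saved Messages is the private chat whose peer is your own user id.

```python
from typing import Optional


def is_query_allowed(
    chat_type: str,
    peer_id: int,
    my_user_id: int,
    my_bot_id: Optional[int] = None,
) -> bool:
    """Allow queries only in Saved Messages or a PM with your own bot."""
    if chat_type != "private":
        # Groups and channels are never valid query contexts.
        return False
    if peer_id == my_user_id:
        # Saved Messages: the chat with yourself.
        return True
    # Any other private chat must be with your own bot account.
    return my_bot_id is not None and peer_id == my_bot_id
```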
When Telegram evidence is not strong enough, the bot can search trusted news sites:
QUERY_WEB_FALLBACK_ENABLED=true
QUERY_WEB_MIN_TELEGRAM_RESULTS=3
QUERY_WEB_MAX_RESULTS=12
QUERY_WEB_MAX_HOURS_BACK=24
QUERY_WEB_REQUIRE_RECENT=true
QUERY_WEB_REQUIRE_MIN_SOURCES=2
QUERY_WEB_ALLOWED_DOMAINS=["reuters.com","apnews.com","bbc.com","aljazeera.com","cnn.com","nytimes.com","washingtonpost.com","bloomberg.com","ft.com","theguardian.com","dw.com","france24.com","aa.com.tr","npr.org"]
Notes:
- Telegram evidence stays the primary source
- web fallback is used only when needed
- higher-risk questions are handled more conservatively
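The sketch below illustrates how the fallback gate and the domain allowlist might work together. Both helpers are hypothetical, and the reading of QUERY_WEB_MIN_TELEGRAM_RESULTS as "fall back when Telegram returns fewer results than this" is an assumption rather than documented behavior.

```python
from typing import Iterable
from urllib.parse import urlparse


def should_use_web_fallback(
    enabled: bool, telegram_results: int, min_telegram_results: int = 3
) -> bool:
    """Fall back to the web only when enabled and Telegram evidence is thin."""
    return enabled and telegram_results < min_telegram_results


def is_allowed_domain(url: str, allowed_domains: Iterable[str]) -> bool:
    """Accept a URL only if its host is an allowed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(
        host == domain or host.endswith("." + domain) for domain in allowed_domains
    )
```

Matching on the dot-prefixed suffix rather than a plain substring keeps lookalike hosts such as notreuters.com out of the allowlist.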
TeleUserBot can deliver with:
- Telegram HTML formatting
- optional premium emoji support
- source-aware reply continuity
- digest-first readability
Useful rendering flags from .env.example:
ENABLE_HTML_FORMATTING=true
ENABLE_PREMIUM_EMOJI=true
PREMIUM_EMOJI_MAP_FILE="nezami_emoji_map.json"
For Replit or uptime-monitored deployments:
ENABLE_WEB_SERVER=true
WEB_SERVER_HOST="0.0.0.0"
WEB_SERVER_PORT=8080
HOLD_ON_STARTUP_ERROR=true
OPENAI_AUTH_ENV_ONLY=true
TG_USERBOT_AUTH_JSON_B64="..."
Health endpoints:
- /
- /health
- /status
Suggested hosted commands:
- install: pip install -r requirements.txt
- optional extras: pip install -r requirements.optional.txt
- run: python main.py
Single entrypoint:
python main.py
Startup flow:
- validates config
- ensures only one instance is active
- initializes runtime DB and caches
- repairs auth inline when interactive mode detects stale or missing auth
- connects your Telegram session
- resolves sources
- starts feed, digest, query, and optional web server pipelines
The repo includes test coverage for major pipeline pieces.
Run:
pytest
Or install dev requirements first:
pip install -r requirements.dev.txt
pytest
Default digest status command:
/digest_status
This reports queue state, scheduler status, and runtime health details.
Not a problem if you intentionally run without Hugging Face support.
Install optional extras only if you want that backend:
pip install -r requirements.optional.txt
Another process is probably using the same Telethon session or SQLite DB.
Fix:
- stop duplicate processes
- keep only one active instance
Use env-only auth:
OPENAI_AUTH_ENV_ONLY=true
TG_USERBOT_AUTH_JSON_B64="..."
Use full E.164 format with country code.
Example:
+15551234567
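If you want to sanity-check a number before attempting login, a simple pattern check is usually enough. The sketch below only validates the E.164 shape (a plus sign, then up to 15 digits with no leading zero); `looks_like_e164` is a hypothetical helper, not part of the project, and it does not verify that the number actually exists.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def looks_like_e164(phone: str) -> bool:
    """Return True when the string has the E.164 shape, ignoring outer spaces."""
    return E164_PATTERN.fullmatch(phone.strip()) is not None


if __name__ == "__main__":
    for candidate in ("+15551234567", "15551234567", "+1 555 123 4567"):
        print(candidate, looks_like_e164(candidate))
```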
Check:
- OCR is enabled
- Tesseract is installed
- language packs are installed
- the media actually contains readable non-English text
Keep query mode limited to:
- Saved Messages
- your own bot PM
The bot retries transient delivery errors once. If failures continue, check:
- network quality
- VPS stability
- proxy or VPN path
- oversized media uploads
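The retry-once behavior described above can be sketched like this. It is a simplified, synchronous illustration: `send_with_one_retry` is a hypothetical helper, the choice of which exceptions count as transient is an assumption, and the real bot most likely does this inside async Telethon code.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def send_with_one_retry(
    send: Callable[[], T],
    delay_seconds: float = 1.0,
    transient: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call send(); on a transient error wait briefly and try exactly once more."""
    try:
        return send()
    except transient:
        time.sleep(delay_seconds)
        # A second failure propagates to the caller so it can be logged.
        return send()
```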
Never commit:
- .env
- userbot.session*
- ~/.tg_userbot/* secrets
- exported auth payloads
- private token dumps
If anything sensitive leaks, rotate it immediately.
git pull
source .venv/bin/activate
pip install -r requirements.txt --upgrade
pip install -r requirements.optional.txt --upgrade
python main.py
Best results usually come from separating roles:
- one chat for feed delivery
- one private bot PM or Saved Messages for queries
Mixing both into a single high-volume chat works, but the experience becomes noisier and less controlled.
This project operates on a real Telegram account and may process content from many sources. Use it responsibly, follow Telegram rules, respect local laws, and handle monitored content with care.
TeleUserBot is for operators who want Telegram monitoring to feel sharper, calmer, and more intelligent:
- fewer duplicates
- better urgency control
- cleaner digests
- stronger private search
- more useful media handling
If your current setup feels like chaos in a trench coat, this is the upgrade. ✨