Summary
Add a reference Telegram bot example (or library package) that demonstrates how to connect Engine[S] to a Telegram chat, similar to how the debug console bridges the agent loop to a web UI.
Motivation
Telegram's Bot API maps naturally to ChatInterface — text messages, photo uploads, and reply-based conversations all have direct equivalents.
A working Telegram integration would serve as both a reference implementation and a ready-to-use adapter for teams building Telegram-based agents.
Currently the SDK has debug/ (web console) and examples/todo-agent (CLI), but no reference for a real messaging platform.
Findings (ordered by severity)
Chat/session routing must be explicit. ChannelChat is single-channel; a Telegram bot must multiplex per chat ID into isolated ReplyCh streams. If the reference uses a single ChannelChat, concurrent chats will interleave and corrupt agent context. The design must choose: one bot per chat vs multi-chat manager.
Blocking WaitForReply doesn't map 1:1 to Telegram updates. The SDK expects blocking waits; Telegram delivers updates via webhook or long polling. The integration needs to define how updates are buffered and drained per chat to satisfy WaitForReply semantics, or enforce "one active Run() per chat."
Voice support depends on #1 (feat: voice message input (audio → tool calls)); the boundary must be defined now. Unless the Telegram adapter either owns transcription (Path A from #1) or forwards raw audio into Reply.AudioData (Path B), voice support will be incomplete. The reference should pick one path or show both behind a build flag.
"Minimal dependencies" conflicts with realistic Telegram media handling. Sticking to net/http means implementing download URLs, file size checks, and MIME handling manually. A small, widely-used Telegram library (e.g. telebot) may reduce complexity. The tradeoff should be explicit.
Open questions
Single-chat reference or reusable multi-chat adapter? A single-chat bot is simpler for a reference example, but the issue implies a general adapter. These are different designs.
Engine.Run() per message or long-lived session? Per-message loses conversational memory unless messages are persisted externally. Long-lived needs a per-chat goroutine and cancellation strategy.
Suggested scope
Concurrency model
Start with a per-chat session manager: a map of chatID → (Engine, ChannelChat, state) with one goroutine per active conversation. Each chat gets its own ChannelChat with its own ReplyCh. Telegram updates are routed by chat ID to the correct channel. Idle sessions are reaped after a timeout.
Voice handling
Start with Path A (STT preprocessing) — transcribe via Whisper or Google STT before entering the agent loop. This works with all LLM providers. Path B (native audio to Gemini) can be added later when #1 lands the AudioData fields in the SDK.
Package structure
A telegram/ package (or examples/telegram-bot/) implementing ChatInterface per chat.
Support for: text send/receive, image attachments (photos → Reply.ImageData), voice messages (transcribed → text).
A runnable example showing Engine.Run() connected to a Telegram bot token.
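For the image attachment path, a net/http-only approach has to build the Bot API file URLs, enforce a size limit, and check the MIME type itself. The sketch below shows one possible shape. The helper names (getFileURL, downloadURL, checkMedia, fetchMedia) are hypothetical; the URL layout and the 20 MB download limit follow the Telegram Bot API getFile documentation, and the file_path value is assumed to come from a prior getFile response. How the bytes end up in Reply.ImageData is not shown.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// maxFileBytes mirrors the Bot API limit for files downloaded via getFile.
const maxFileBytes = 20 << 20

// getFileURL builds the getFile request URL. base is normally
// "https://api.telegram.org"; it is a parameter so tests can use a local server.
func getFileURL(base, token, fileID string) string {
	return base + "/bot" + token + "/getFile?file_id=" + url.QueryEscape(fileID)
}

// downloadURL builds the download URL from the file_path returned by getFile.
func downloadURL(base, token, filePath string) string {
	return base + "/file/bot" + token + "/" + filePath
}

// checkMedia enforces the size limit and sniffs the MIME type from content,
// returning an error unless the sniffed type is in the allowed list.
func checkMedia(data []byte, allowed ...string) (string, error) {
	if len(data) > maxFileBytes {
		return "", fmt.Errorf("file exceeds %d bytes", maxFileBytes)
	}
	mime := http.DetectContentType(data)
	for _, a := range allowed {
		if mime == a {
			return mime, nil
		}
	}
	return "", fmt.Errorf("unsupported media type %q", mime)
}

// fetchMedia downloads fileURL and validates the body with checkMedia.
func fetchMedia(c *http.Client, fileURL string, allowed ...string) ([]byte, string, error) {
	resp, err := c.Get(fileURL)
	if err != nil {
		return nil, "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, "", fmt.Errorf("download failed: %s", resp.Status)
	}
	// Read one byte past the limit so an oversized body is detected.
	data, err := io.ReadAll(io.LimitReader(resp.Body, maxFileBytes+1))
	if err != nil {
		return nil, "", err
	}
	mime, err := checkMedia(data, allowed...)
	if err != nil {
		return nil, "", err
	}
	return data, mime, nil
}
```

This is roughly the amount of manual work that the "minimal dependencies" option implies for a single media type; a Telegram client library would hide most of it at the cost of an extra dependency.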
Integration tests
Send text → verify Reply.Text
Send photo → verify Reply.ImageData populated
Send voice → verify transcription arrives as Reply.Text
Concurrent chats → verify isolation (no message interleaving)