Skip to content

Discord chat#333

Merged
ptone merged 26 commits into
GoogleCloudPlatform:mainfrom
ptone:discord-chat
Jun 7, 2026
Merged

Discord chat#333
ptone merged 26 commits into
GoogleCloudPlatform:mainfrom
ptone:discord-chat

Conversation

@ptone
Copy link
Copy Markdown
Member

@ptone ptone commented Jun 6, 2026

Add discord chat support

Scion and others added 24 commits June 3, 2026 14:53
… routing

Implements the scion-plugin-discord self-managed broker plugin following
the same architecture as scion-telegram. This phase covers:

- Plugin entrypoint with go-plugin handshake (cmd/scion-plugin-discord/)
- DiscordBroker struct implementing MessageBrokerPluginInterface with
  two-phase Configure(), Subscribe/Unsubscribe, Publish, Close
- Discord Gateway WebSocket connection via discordgo with event dispatch
  for messages, interactions, guild lifecycle
- SQLite persistent store with 7 tables: channel_links, user_mappings,
  conversation_context, project_agents, pending_ask_users,
  callback_lookups, notification_prefs
- Three-tier @-mention routing: bot mention → default agent,
  @agentname → specific agent, @ALL → broadcast
- Slash command registration (/scion with 12 subcommands) with
  Discord autocomplete for agent names
- Hub identity linking registration flow (/scion register)
- Hub API client with HMAC authentication
- Per-channel rate-limited send queue
- Basic outbound message formatting (plain text, Phase 2 adds embeds)
- Component interaction dispatch for setup flow buttons
- 57 unit tests covering store CRUD and mention routing
Without a RequestSubscription call in Configure() phase 2, the Hub never
calls Subscribe() on the plugin, so startGateway() is never triggered and
the Discord WebSocket connection is never opened.

Also fix topic prefix: scion.project.* → scion.grove.* in
subscribeForProject/unsubscribeForProject to match the current message
routing format.
Configure() returns before the go-plugin host wires up callbacks, so
RequestSubscription() called during Configure() always fails. Use a
goroutine that retries every 500ms (up to 10s) until hostCallbacks is
set, then requests the bootstrap scion.grove.> subscription that
triggers Subscribe() → startGateway() → Discord WebSocket connection.
Mirror the Telegram link service pattern for Discord:
- DiscordLinkService with code registration, verification, status polling
- POST /api/v1/discord/link (broker-authenticated, registers linking code)
- POST /api/v1/discord/link/verify (user-authenticated, confirms code)
- GET /api/v1/discord/link/status (broker-authenticated, polls confirmation)
- Update scion-discord plugin to use /api/v1/discord/link endpoints
- Create profile-discord.ts page component following the Telegram pattern
- Auto-populates code from ?code= URL param, shows Discord username
- POSTs to /api/v1/discord/link/verify with just {code}
- Register route at /profile/discord in the router
- Fix hubLink URL in Discord register.go to use /profile/discord
Set Channel='discord' and ThreadID on inbound messages for proper
Hub routing. Add ChannelID to GetInfo() so FanOutEventBus routes
channel-targeted messages correctly. Add outbound channel filtering
and thread-based routing so replies land in the originating Discord
channel/thread.
Remove the unnecessary GuildMembers privileged gateway intent which
requires verification for bots in 100+ servers. Merge two consecutive
RLock/RUnlock pairs in handleInteractionCreate into a single acquisition
that reads commands, callbacks, and registration together.
The Publish() method was calling session.ChannelMessageSend() directly,
bypassing the rate-limited SendQueue initialized during Configure().
Route outbound messages through sendQueue.Send() to enforce per-channel
rate limiting and prevent Discord API 429 errors. Falls back to direct
session send if the queue is not initialized.
…d cache TTL

Add "unlink" and "settings" to ephemeralCommands so admin-only responses
are only visible to the invoking user. Respond immediately for "help"
instead of deferring (avoids unnecessary "Bot is thinking..." indicator).
Make agent cache TTL configurable via CommandHandler instead of
hardcoding 5 minutes.
Add Phase 2 embed rendering functions to format.go:
- RenderStateChangeEmbed: colored sidebar embeds for state change messages
- RenderInputNeeded: embed + button components for input-needed messages
- FormatWithEmbed: length-aware formatting (plain text vs embed vs split)
- SplitLongMessage: splits long text at newline boundaries
- activityColor: maps activity/status strings to Discord embed colors
Implement Phase 2 items 1-2: channel webhook management and outbound
webhook routing so agent messages appear with distinct names and avatars.

- WebhookManager (webhooks.go): lazily creates one "Scion Agent Relay"
  webhook per channel, caches in memory with RWMutex, auto-recreates
  on 404/Unknown Webhook errors, sends via WebhookExecute with
  per-agent username and RoboHash avatar URL
- Publish() routing (broker.go): agent messages (TypeAssistantReply,
  TypeInstruction) from "agent:" senders go through webhook; state
  changes and input-needed keep bot API identity; falls back to bot
  API if webhook send fails
- formatWebhookMessage (broker.go): strips agent name header (webhook
  username shows it), keeps [URGENT]/[Broadcast] prefix tags and
  agent-to-agent recipient indicator
- CallbackHandler: fix constructor to accept deliverInbound function,
  add ask-user and notification callback handlers
- Modal support (modals.go): OpenAskUserModal and HandleModalSubmit
  for free-text ask-user responses
Tests cover:
- activityColor: all status-to-color mappings and unknown fallback
- RenderStateChangeEmbed: basic, summary, no-activity, truncation
- RenderInputNeeded: with choices, multiple rows, without choices,
  invalid JSON fallback, empty choices fallback
- FormatWithEmbed: short/medium/long messages at each threshold
- SplitLongMessage: newline splitting, no-newline, content preservation
- Existing format function smoke tests
…n toggle

Implement Phase 2 items 5, 6, 10 from discord-chat design doc:
- Add ask:opt/reply/dismiss and notif:on/off callback prefixes to Dispatch
- Create modals.go with OpenAskUserModal and HandleModalSubmit
- Wire modal submit dispatch in broker.go handleInteractionCreate
- Skip auto-acknowledge for ask:reply so modal can be first response
Set, show, or clear the default agent for a channel. When a message
is sent without @mentioning a specific agent, the default agent
handles it. Includes autocomplete with a "none" option to clear.
…ttons

Settings subcommand:
- Add ShowStateChanges field to ChannelLink with graceful schema migration
- Implement /scion settings with toggle buttons for observe mode and state
  change notifications per channel
- Wire settings callbacks to flip ShowAgentToAgent and ShowStateChanges
- Filter agent-to-agent messages and state changes in Publish() based on
  per-channel settings

Default subcommand refactor:
- Remove agent parameter from /scion default command
- Show ephemeral button panel with one button per agent plus None
- Add callback handler for default:set and default:none button clicks
Add PostgreSQL as an alternative storage backend for the Discord broker
plugin, enabling shared-database HA deployments where the Hub uses
Postgres.

- Hub (server_foreground.go): Inject database_driver and database_url
  into broker plugin config when the Hub is using a non-SQLite driver
- Discord plugin: Add store_postgres.go implementing the full Store
  interface with discord_-prefixed tables, TIMESTAMPTZ columns, and
  native boolean types via pgx driver
- Discord broker.go: Select PostgreSQL or SQLite store based on
  injected database config, falling back to SQLite when absent
- go.mod: Add github.com/jackc/pgx/v5 dependency
@google-cla
Copy link
Copy Markdown

google-cla Bot commented Jun 6, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements a Discord message broker plugin for the Scion hub, enabling bidirectional communication with rich embeds, interactive components, rate-limited send queues, and per-agent webhook identities. Key feedback highlights a critical deadlock risk in the SendQueue when handling overflows under a mutex lock, a potential nil pointer dereference in Publish(), unsafe byte-based string slicing during modal response truncation, and silent migration failures in the SQLite store. Additionally, replacing the custom errorAs helper with the standard library's errors.As is recommended for better robustness.

Comment on lines +120 to +130
select {
case cq.ch <- req:
default:
dropped := <-cq.ch
if dropped.result != nil {
dropped.result <- &sendResult{err: errors.New("dropped: send queue overflow")}
}
sq.log.Warn("Send queue overflow, dropped oldest message",
"channel_id", channelID, "queue_size", sq.maxSize)
cq.ch <- req
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

Critical Concurrency Bug: Potential Deadlock

When the queue is full, the default branch of the select statement is executed. Inside this branch, a blocking read dropped := <-cq.ch is performed while holding the mutex sq.mu.

If the worker goroutine concurrently drains the channel (for example, if maxSize is 1 and the worker reads the only item), the channel becomes empty. Consequently, <-cq.ch will block indefinitely while holding sq.mu. This deadlocks the entire SendQueue and halts all message sending across all channels.

Solution

Use a non-blocking select with a default case to read from cq.ch. If the channel was drained in the meantime, we simply proceed without dropping anything.

Suggested change
select {
case cq.ch <- req:
default:
dropped := <-cq.ch
if dropped.result != nil {
dropped.result <- &sendResult{err: errors.New("dropped: send queue overflow")}
}
sq.log.Warn("Send queue overflow, dropped oldest message",
"channel_id", channelID, "queue_size", sq.maxSize)
cq.ch <- req
}
select {
case dropped := <-cq.ch:
if dropped.result != nil {
dropped.result <- &sendResult{err: errors.New("dropped: send queue overflow")}
}
sq.log.Warn("Send queue overflow, dropped oldest message",
"channel_id", channelID, "queue_size", sq.maxSize)
default:
// The channel was drained by the worker in the meantime; no need to drop.
}
cq.ch <- req

Comment on lines +380 to +382
if session == nil {
return fmt.Errorf("discord broker not configured")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Potential Nil Pointer Dereference

Publish does not check if msg is nil at the beginning of the function. If msg is nil, it will panic with a nil pointer dereference at line 454 (msg.Sender) and other places.

Since the broker runs as a plugin subprocess, a panic will crash the plugin, causing the hub to lose connection to the Discord broker.

Solution

Add a nil check for msg at the top of Publish().

Suggested change
if session == nil {
return fmt.Errorf("discord broker not configured")
}
if session == nil {
return fmt.Errorf("discord broker not configured")
}
if msg == nil {
return fmt.Errorf("message is nil")
}

Comment on lines +3 to +13
import (
"fmt"
"log/slog"
"net/http"
"net/url"
"strconv"
"strings"
"sync"

"github.com/bwmarrin/discordgo"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Import "errors"

Import the standard library "errors" package to use errors.As instead of the custom errorAs helper.

Suggested change
import (
"fmt"
"log/slog"
"net/http"
"net/url"
"strconv"
"strings"
"sync"
"github.com/bwmarrin/discordgo"
)
import (
"errors"
"fmt"
"log/slog"
"net/http"
"net/url"
"strconv"
"strings"
"sync"
"github.com/bwmarrin/discordgo"
)

Comment on lines +168 to +177
var restErr *discordgo.RESTError
if ok := errorAs(err, &restErr); ok {
if restErr.Response != nil && restErr.Response.StatusCode == http.StatusNotFound {
return true
}
// Discord error code 10015 = Unknown Webhook.
if restErr.Message != nil && restErr.Message.Code == 10015 {
return true
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use errors.As

Use the standard library errors.As instead of the custom errorAs helper.

Suggested change
var restErr *discordgo.RESTError
if ok := errorAs(err, &restErr); ok {
if restErr.Response != nil && restErr.Response.StatusCode == http.StatusNotFound {
return true
}
// Discord error code 10015 = Unknown Webhook.
if restErr.Message != nil && restErr.Message.Code == 10015 {
return true
}
}
var restErr *discordgo.RESTError
if errors.As(err, &restErr) {
if restErr.Response != nil && restErr.Response.StatusCode == http.StatusNotFound {
return true
}
// Discord error code 10015 = Unknown Webhook.
if restErr.Message != nil && restErr.Message.Code == 10015 {
return true
}
}

Comment on lines +184 to +199
// errorAs is a thin wrapper around errors.As that works with the generic
// target type. This avoids an import of "errors" just for As.
func errorAs[T any](err error, target *T) bool {
for err != nil {
if t, ok := err.(T); ok {
*target = t
return true
}
if u, ok := err.(interface{ Unwrap() error }); ok {
err = u.Unwrap()
} else {
return false
}
}
return false
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use Standard Library errors.As

The custom errorAs function is implemented to avoid importing "errors". However, "errors" is already imported in other files, and the custom implementation does not support custom As(any) bool implementations (which is standard in Go's errors.As). This makes it less robust and prone to maintainability issues.

Solution

Import "errors" and use errors.As directly, then delete this custom helper.

Comment on lines +207 to +210
var restErr *discordgo.RESTError
if ok := errorAs(err, &restErr); ok {
return restErr.Response != nil && restErr.Response.StatusCode == statusCode
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Use errors.As

Use the standard library errors.As instead of the custom errorAs helper.

Suggested change
var restErr *discordgo.RESTError
if ok := errorAs(err, &restErr); ok {
return restErr.Response != nil && restErr.Response.StatusCode == statusCode
}
var restErr *discordgo.RESTError
if errors.As(err, &restErr) {
return restErr.Response != nil && restErr.Response.StatusCode == statusCode
}

Comment on lines +3 to +11
import (
"context"
"database/sql"
"encoding/json"
"fmt"
"time"

_ "modernc.org/sqlite"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Import "strings"

Import the standard library "strings" package to check for expected "duplicate column" errors during database migrations.

Suggested change
import (
"context"
"database/sql"
"encoding/json"
"fmt"
"time"
_ "modernc.org/sqlite"
)
import (
"context"
"database/sql"
"encoding/json"
"fmt"
"strings"
"time"
_ "modernc.org/sqlite"
)

Comment on lines +244 to +251
func (s *sqliteStore) migrateSchema() {
migrations := []string{
`ALTER TABLE channel_links ADD COLUMN show_state_changes INTEGER NOT NULL DEFAULT 1`,
}
for _, m := range migrations {
s.db.Exec(m)
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Silent Failure of Database Migrations

s.db.Exec(m) is called without checking the returned error. While ignoring "duplicate column" errors is expected on subsequent startups, ignoring all other errors (e.g., database locked, disk full, permission denied) can lead to silent failures of database migrations and extremely hard-to-diagnose production issues.

Solution

Check the error and log a warning or return it if it's not a "duplicate column" error.

Suggested change
func (s *sqliteStore) migrateSchema() {
migrations := []string{
`ALTER TABLE channel_links ADD COLUMN show_state_changes INTEGER NOT NULL DEFAULT 1`,
}
for _, m := range migrations {
s.db.Exec(m)
}
}
func (s *sqliteStore) migrateSchema() {
migrations := []string{
`ALTER TABLE channel_links ADD COLUMN show_state_changes INTEGER NOT NULL DEFAULT 1`,
}
for _, m := range migrations {
if _, err := s.db.Exec(m); err != nil {
// Ignore "duplicate column" errors as they are expected on subsequent startups.
if !strings.Contains(err.Error(), "duplicate column name") {
// Log or handle other unexpected errors.
fmt.Printf("Warning: failed to run migration %q: %v\n", m, err)
}
}
}
}

Comment on lines +148 to +151
truncated := responseText
if len(truncated) > 100 {
truncated = truncated[:97] + "..."
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Unsafe String Slicing by Bytes

truncated = truncated[:97] + "..." slices the user-provided responseText by bytes. If the text contains multi-byte UTF-8 characters (like emojis or non-English text), this can slice in the middle of a UTF-8 rune, producing an invalid UTF-8 string.

Solution

Convert the string to a slice of runes first to ensure safe truncation.

Suggested change
truncated := responseText
if len(truncated) > 100 {
truncated = truncated[:97] + "..."
}
truncated := responseText
runes := []rune(responseText)
if len(runes) > 100 {
truncated = string(runes[:97]) + "..."
}

ptone added 2 commits June 6, 2026 22:22
…roker

- Fix potential deadlock in SendQueue.enqueue by using non-blocking
  select when draining full channel under mutex
- Add nil message check at top of Publish() to prevent panic
- Replace custom errorAs helper with standard errors.As in webhooks.go
- Handle migration errors properly in store.go instead of silently
  ignoring all errors
- Use rune-based truncation in modals.go to avoid splitting multi-byte
  UTF-8 characters
@ptone ptone merged commit 003d2a0 into GoogleCloudPlatform:main Jun 7, 2026
4 of 5 checks passed
@scion-gteam scion-gteam Bot deleted the discord-chat branch June 7, 2026 02:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant