Enhance StreamingTranscriptionConsumer and conversation job handling #282
Conversation
- Removed cumulative audio offset tracking from StreamingTranscriptionConsumer, as Deepgram provides cumulative timestamps directly.
- Updated the store_final_result method to use Deepgram's cumulative timestamps without adjustments.
- Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing.
- Improved error handling to signal completion even on errors, preventing conversation jobs from hanging.
- Enhanced logging for better visibility of transcription completion and error states.
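The completion-signaling half of this description can be sketched as follows. The key format `transcription:complete:{session_id}` and the 5-minute TTL come from the description above; the helper name and the `FakeRedis` stub (used so the sketch runs without a Redis server) are illustrative assumptions, not the project's actual code.

```python
import asyncio

class FakeRedis:
    """Tiny in-memory stand-in for an async Redis client, used here only
    so the sketch is runnable without a Redis server (illustration only)."""
    def __init__(self):
        self.store = {}

    async def set(self, key, value, ex=None):
        # A real client would honor the TTL (ex); this stub just records it.
        self.store[key] = (value, ex)

    async def get(self, key):
        entry = self.store.get(key)
        return entry[0] if entry else None

async def signal_transcription_complete(redis_client, session_id, status="1"):
    # Key format and 5-minute TTL follow the behavior described above.
    key = f"transcription:complete:{session_id}"
    await redis_client.set(key, status, ex=300)
    return key

redis = FakeRedis()
key = asyncio.run(signal_transcription_complete(redis, "abc123"))
print(key)  # transcription:complete:abc123
```

On the error path the consumer would call the same helper with `status="error"`, so the waiting job can distinguish failure from success without blocking.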
📝 Walkthrough

The changes introduce Redis-based completion signaling between the transcription streaming consumer and conversation job processing. The transcription service now marks completion via Redis with a 5-minute TTL, while the conversation job waits for this signal before proceeding. Timestamp offset calculations are removed from transcription results since Deepgram already provides cumulative timestamps.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Consumer as Transcription Consumer
    participant Redis as Redis
    participant JobWorker as Conversation Job Worker
    JobWorker->>JobWorker: Validate meaningful speech
    JobWorker->>Redis: Poll transcription:complete:{session_id}
    Note over JobWorker: Wait up to 30 seconds
    Consumer->>Consumer: Process final transcription
    Consumer->>Redis: SET transcription:complete:{session_id} = "1"<br/>(TTL: 5 min)
    Redis-->>JobWorker: Signal received
    JobWorker->>JobWorker: Proceed to post-processing
    alt Timeout
        JobWorker->>JobWorker: Log timeout warning<br/>Continue anyway
    end
    alt Error state
        Consumer->>Redis: SET transcription:complete:{session_id} = "error"
        Redis-->>JobWorker: Error signal
        JobWorker->>JobWorker: Log warning, continue
    end
```
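The job worker's wait step in the diagram can be sketched as a simple poll loop. The key name, the timeout budget, and the "log a warning and continue anyway" timeout behavior come from the diagram; the function name, poll interval, and the in-memory `FakeRedis` stub are hypothetical, chosen only to make the sketch self-contained and runnable.

```python
import asyncio

class FakeRedis:
    """In-memory stand-in for an async Redis client (illustration only)."""
    def __init__(self):
        self.store = {}

    async def set(self, key, value, ex=None):
        self.store[key] = value  # TTL (ex) ignored in this stub

    async def get(self, key):
        return self.store.get(key)

async def wait_for_transcription(redis_client, session_id,
                                 timeout=30.0, poll_interval=0.05):
    """Poll the completion key until it appears or the timeout elapses.

    Returns "1" on success, "error" if the consumer signaled failure,
    or None on timeout (the caller logs a warning and continues anyway).
    """
    key = f"transcription:complete:{session_id}"
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while loop.time() < deadline:
        value = await redis_client.get(key)
        if value is not None:
            return value
        await asyncio.sleep(poll_interval)
    return None

async def demo():
    redis = FakeRedis()

    # Simulate the consumer finishing shortly after the worker starts waiting.
    async def consumer():
        await asyncio.sleep(0.1)
        await redis.set("transcription:complete:sess-1", "1", ex=300)

    waiter = asyncio.create_task(
        wait_for_transcription(redis, "sess-1", timeout=2.0)
    )
    await consumer()
    return await waiter

print(asyncio.run(demo()))  # 1
```

A production version could use Redis keyspace notifications or pub/sub instead of polling, but a bounded poll matches the "wait up to 30 seconds" semantics shown above.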
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
@coderabbitai review
✅ Actions performed: Review triggered.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py (1)
264-304: Verify provider timestamp capabilities in registry configuration.

The code assumes all configured streaming providers return cumulative timestamps from stream start (as Deepgram does). However, Parakeet ASR streaming returns per-chunk timestamps by default and requires explicit offset merging to produce cumulative timestamps. Since `RegistryStreamingTranscriptionProvider` supports any provider via config, and the architecture claims provider-agnosticism, downstream word timing will silently fail if a provider with per-chunk semantics (e.g., Parakeet) is configured as the streaming provider. Additionally, the hardcoded provider name `b"deepgram-stream"` on line 283 contradicts this provider-agnostic design.

Recommended fixes:

- Add provider-specific offset handling in `store_final_result()` based on the actual provider from the registry
- Document timestamp semantics in config.yml for each streaming provider, or gate cumulative-timestamp processing by a per-provider capability flag
- Replace the hardcoded provider name with dynamic detection from the configured streaming provider
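One way to realize the capability-flag suggestion above is sketched below. The provider names, the `cumulative_timestamps` flag, and the registry shape are all hypothetical assumptions for illustration, not the project's actual configuration or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProviderCaps:
    name: str
    cumulative_timestamps: bool  # True: timestamps measured from stream start

# Hypothetical registry entries; a real implementation would derive these
# from the project's provider configuration.
REGISTRY = {
    "deepgram-stream": ProviderCaps("deepgram-stream", True),
    "parakeet-stream": ProviderCaps("parakeet-stream", False),
}

def normalize_word_times(provider_name, words, chunk_offset):
    """Return (start, end) pairs made cumulative relative to stream start.

    `words` holds per-word times that are either already cumulative
    (e.g. Deepgram) or relative to the current chunk (e.g. Parakeet),
    depending on the provider's capability flag.
    """
    caps = REGISTRY[provider_name]
    if caps.cumulative_timestamps:
        return list(words)  # already stream-relative: no adjustment needed
    # Per-chunk providers: add the running offset of this chunk back in.
    return [(start + chunk_offset, end + chunk_offset) for start, end in words]

print(normalize_word_times("parakeet-stream", [(0.0, 0.4)], 12.5))  # [(12.5, 12.9)]
```

Gating the offset math on an explicit per-provider flag keeps `store_final_result()` provider-agnostic while making each provider's timestamp semantics visible in one place.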
🤖 Fix all issues with AI agents
In
`@backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py`:
- Around line 175-181: The bare "except Exception: pass" in the block that sets
completion_key (f"transcription:complete:{session_id}") hides Redis failures;
change it to catch the exception into a variable (e.g., except Exception as
redis_error:) and log it (e.g., logger.warning or logger.debug) with context
like "Failed to set error flag in Redis" including redis_error, while keeping
the best-effort behavior (do not re-raise) and referencing self.redis_client,
completion_key, session_id, and logger so observability is preserved.
```python
# Still signal completion even on error so conversation job doesn't hang
try:
    completion_key = f"transcription:complete:{session_id}"
    await self.redis_client.set(completion_key, "error", ex=300)
    logger.warning(f"⚠️ Set error completion signal for {session_id}")
except Exception:
    pass  # Best effort
```
🧩 Analysis chain
🏁 Script executed:
```shell
# First, find and examine the file
find . -type f -name "streaming_consumer.py" | grep -E "transcription|advanced_omi_backend"
```

Repository: SimpleOpenSoftware/chronicle
Length of output: 161
🏁 Script executed:
```shell
# Check the file exists and read around the specified lines
if [ -f "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" ]; then
    echo "=== File found, showing lines 170-185 ==="
    sed -n '170,185p' "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | cat -n
else
    echo "File not found, searching for similar paths..."
    find . -name "streaming_consumer.py" -type f
fi
```

Repository: SimpleOpenSoftware/chronicle
Length of output: 1089
🏁 Script executed:
```shell
# Get file size first to see if it's manageable
wc -l "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" 2>/dev/null || echo "File path needs adjustment"
```

Repository: SimpleOpenSoftware/chronicle
Length of output: 163
🏁 Script executed:
```shell
# Search for error handling patterns (except blocks) in the file
rg "except\s+Exception" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" -A 2 -B 1 || echo "Pattern not found with that path"
```

Repository: SimpleOpenSoftware/chronicle
Length of output: 1821
🏁 Script executed:
```shell
# Also check how logging is used in the file
rg "logger\.(debug|warning|error|info)" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | head -20
```

Repository: SimpleOpenSoftware/chronicle
Length of output: 1785
Don't swallow errors when setting the completion signal.

The bare `except Exception: pass` hides failures that are useful for debugging. Similar error handlers in the module log Redis failures (e.g., `logger.warning(f"Failed to set error flag in Redis: {redis_error}")`), so logging at least at debug/warning level here maintains the "best effort" behavior while preserving observability.
🔧 Suggested fix

```diff
 try:
     completion_key = f"transcription:complete:{session_id}"
     await self.redis_client.set(completion_key, "error", ex=300)
     logger.warning(f"⚠️ Set error completion signal for {session_id}")
-except Exception:
-    pass  # Best effort
+except Exception as signal_err:
+    logger.debug(
+        f"Failed to set completion signal for {session_id}: {signal_err}",
+        exc_info=True
+    )  # Best effort
```

🧰 Tools
🪛 Ruff (0.14.14)
180-181: try-except-pass detected, consider logging the exception
(S110)
180-180: Do not catch blind exception: Exception
(BLE001)
| Metric | Count |
|---|---|
| ✅ Passed | 87 |
| ❌ Failed | 8 |
| 📊 Total | 95 |
📊 View Reports
GitHub Pages (Live Reports):
Download Artifacts:
- robot-test-reports-html-no-api - HTML reports
- robot-test-results-xml-no-api - XML output