
Conversation

@AnkushMalaker (Collaborator) commented Jan 29, 2026

  • Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly.
  • Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments.
  • Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing.
  • Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging.
  • Enhanced logging for better visibility of transcription completion and error states.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced transcription workflow reliability with improved synchronization and completion signaling between transcription and conversation processing.
    • Added timeout and error handling to prevent delays in the processing pipeline.
  • Refactor

    • Simplified timestamp handling to use native streaming timestamps directly for improved accuracy.


coderabbitai bot (Contributor) commented Jan 29, 2026

📝 Walkthrough

The changes introduce Redis-based completion signaling between the transcription streaming consumer and conversation job processing. The transcription service now marks completion via Redis with a 5-minute TTL, while the conversation job waits for this signal before proceeding. Timestamp offset calculations are removed from transcription results since Deepgram already provides cumulative timestamps.
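As a rough illustration of the producer side of that hand-off, here is a minimal sketch assuming a redis.asyncio client; the helper name, signature, and client wiring are assumptions for illustration, not the actual StreamingTranscriptionConsumer code:

import redis.asyncio as redis

COMPLETION_TTL_SECONDS = 300  # 5-minute TTL, per the walkthrough


async def signal_transcription_complete(
    client: redis.Redis, session_id: str, error: bool = False
) -> None:
    """Mark a transcription session finished so the waiting conversation job can proceed."""
    completion_key = f"transcription:complete:{session_id}"
    # "1" signals success; "error" tells the waiting job to log a warning and continue.
    await client.set(completion_key, "error" if error else "1", ex=COMPLETION_TTL_SECONDS)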

Changes

  • Transcription streaming simplification
    backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py
    Removed per-session audio offset initialization and offset-based timestamp adjustments from start_session_stream and store_final_result. Added Redis completion signaling in end_session_stream that sets transcription:complete:{session_id} with a 5-minute TTL, including an error state. Simplified to use Deepgram's cumulative timestamps directly, with no offset computation.

  • Conversation job synchronization
    backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
    Added a wait-and-check step after meaningful speech validation that polls the Redis key transcription:complete:{session_id} for up to 30 seconds (sketched below). An "error" value is logged as a warning and processing continues; success is logged on completion; a timeout warning is logged if the key is not set within the window.
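A minimal sketch of that wait-and-check step on the conversation job side, again assuming a redis.asyncio client; the helper name and the 0.5-second polling interval are illustrative, and only the key format, the 30-second window, and the "error" handling come from the summary above:

import asyncio
import logging

import redis.asyncio as redis

logger = logging.getLogger(__name__)


async def wait_for_transcription_complete(
    client: redis.Redis, session_id: str, timeout_seconds: float = 30.0
) -> None:
    """Poll the completion key before post-processing; continue either way so the job never hangs."""
    completion_key = f"transcription:complete:{session_id}"
    poll_interval = 0.5  # illustrative cadence
    waited = 0.0
    while waited < timeout_seconds:
        value = await client.get(completion_key)
        if value is not None:
            if value in (b"error", "error"):  # bytes unless decode_responses=True
                logger.warning(f"Transcription for {session_id} signaled an error; continuing anyway")
            else:
                logger.info(f"Transcription complete for {session_id}")
            return
        await asyncio.sleep(poll_interval)
        waited += poll_interval
    logger.warning(f"Timed out after {timeout_seconds}s waiting on transcription for {session_id}; continuing")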

Sequence Diagram(s)

sequenceDiagram
    participant Consumer as Transcription Consumer
    participant Redis as Redis
    participant JobWorker as Conversation Job Worker
    
    JobWorker->>JobWorker: Validate meaningful speech
    JobWorker->>Redis: Poll transcription:complete:{session_id}
    Note over JobWorker: Wait up to 30 seconds
    
    Consumer->>Consumer: Process final transcription
    Consumer->>Redis: SET transcription:complete:{session_id} = "1"<br/>(TTL: 5 min)
    
    Redis-->>JobWorker: Signal received
    JobWorker->>JobWorker: Proceed to post-processing
    
    alt Timeout
        JobWorker->>JobWorker: Log timeout warning<br/>Continue anyway
    end
    
    alt Error state
        Consumer->>Redis: SET transcription:complete:{session_id} = "error"
        Redis-->>JobWorker: Error signal
        JobWorker->>JobWorker: Log warning, continue
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Enhance StreamingTranscriptionConsumer and conversation job handling' is directly related to the main changes, which involve refactoring StreamingTranscriptionConsumer (removing offset tracking, adding Redis completion signals) and enhancing conversation job handling (adding wait-and-check for transcription completion).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



@AnkushMalaker (Collaborator, Author)

@coderabbitai review

coderabbitai bot (Contributor) commented Jan 29, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py (1)

264-304: Verify provider timestamp capabilities in registry configuration.

The code assumes all configured streaming providers return cumulative timestamps from stream start (as Deepgram does). However, Parakeet ASR streaming returns per-chunk timestamps by default and requires explicit offset merging to produce cumulative timestamps. Since RegistryStreamingTranscriptionProvider supports any provider via config, and the architecture claims provider-agnosticism, downstream word timing will silently fail if a provider with per-chunk semantics (e.g., Parakeet) is configured as the streaming provider. Additionally, the hardcoded provider name b"deepgram-stream" on line 283 contradicts this provider-agnostic design.

Recommended fixes:

  • Add provider-specific offset handling in store_final_result() based on the actual provider from the registry
  • Document timestamp semantics in config.yml for each streaming provider, or gate cumulative-timestamp processing by a per-provider capability flag (see the sketch after this list)
  • Replace hardcoded provider name with dynamic detection from the configured streaming provider
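A sketch of what the capability-flag approach could look like; every name below is hypothetical and not taken from the repository. The point is only that store_final_result would branch on a per-provider flag instead of assuming Deepgram semantics:

from dataclasses import dataclass


@dataclass
class StreamingProviderCaps:
    name: str
    cumulative_timestamps: bool  # True when word timestamps are relative to stream start (e.g. Deepgram)


# Hypothetical registry entries; real provider names and config keys may differ.
PROVIDER_CAPS = {
    "deepgram-stream": StreamingProviderCaps("deepgram-stream", cumulative_timestamps=True),
    "parakeet-stream": StreamingProviderCaps("parakeet-stream", cumulative_timestamps=False),
}


def adjust_word_timestamps(words: list[dict], provider: str, session_offset: float) -> list[dict]:
    """Apply the session offset only for providers that emit per-chunk timestamps."""
    caps = PROVIDER_CAPS.get(provider)
    if caps and caps.cumulative_timestamps:
        return words  # already cumulative; store as-is
    # Per-chunk (or unknown) providers: shift word times into the session timeline.
    return [
        {**w, "start": w["start"] + session_offset, "end": w["end"] + session_offset}
        for w in words
    ]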

Comment on lines +175 to +181

            # Still signal completion even on error so conversation job doesn't hang
            try:
                completion_key = f"transcription:complete:{session_id}"
                await self.redis_client.set(completion_key, "error", ex=300)
                logger.warning(f"⚠️ Set error completion signal for {session_id}")
            except Exception:
                pass  # Best effort

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find and examine the file
find . -type f -name "streaming_consumer.py" | grep -E "transcription|advanced_omi_backend"

Repository: SimpleOpenSoftware/chronicle

Length of output: 161


🏁 Script executed:

# Check the file exists and read around the specified lines
if [ -f "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" ]; then
  echo "=== File found, showing lines 170-185 ==="
  sed -n '170,185p' "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | cat -n
else
  echo "File not found, searching for similar paths..."
  find . -name "streaming_consumer.py" -type f
fi

Repository: SimpleOpenSoftware/chronicle

Length of output: 1089


🏁 Script executed:

# Get file size first to see if it's manageable
wc -l "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" 2>/dev/null || echo "File path needs adjustment"

Repository: SimpleOpenSoftware/chronicle

Length of output: 163


🏁 Script executed:

# Search for error handling patterns (except blocks) in the file
rg "except\s+Exception" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" -A 2 -B 1 || echo "Pattern not found with that path"

Repository: SimpleOpenSoftware/chronicle

Length of output: 1821


🏁 Script executed:

# Also check how logging is used in the file
rg "logger\.(debug|warning|error|info)" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | head -20

Repository: SimpleOpenSoftware/chronicle

Length of output: 1785


Don't swallow errors when setting the completion signal.

The bare except Exception: pass hides failures that are useful for debugging. Similar error handlers in the module log Redis failures (e.g., logger.warning(f"Failed to set error flag in Redis: {redis_error}")), so logging at least a debug/warn level here maintains the "best effort" behavior while preserving observability.

🔧 Suggested fix
             try:
                 completion_key = f"transcription:complete:{session_id}"
                 await self.redis_client.set(completion_key, "error", ex=300)
                 logger.warning(f"⚠️ Set error completion signal for {session_id}")
-            except Exception:
-                pass  # Best effort
+            except Exception as signal_err:
+                logger.debug(
+                    f"Failed to set completion signal for {session_id}: {signal_err}",
+                    exc_info=True
+                )  # Best effort
🧰 Tools
🪛 Ruff (0.14.14)

180-181: try-except-pass detected, consider logging the exception

(S110)


180-180: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In
`@backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py`
around lines 175-181: the bare "except Exception: pass" in the block that sets
completion_key (f"transcription:complete:{session_id}") hides Redis failures;
change it to catch the exception into a variable (e.g., except Exception as
redis_error:) and log it (e.g., logger.warning or logger.debug) with context
like "Failed to set error flag in Redis" including redis_error, while keeping
the best-effort behavior (do not re-raise) and referencing self.redis_client,
completion_key, session_id, and logger so observability is preserved.

@github-actions

⚠️ Robot Framework Test Results (No API Keys)

Status: ❌ Some tests failed

ℹ️ Note: This run excludes tests requiring external API keys (Deepgram, OpenAI).
Tests tagged with requires-api-keys will run on dev/main branches.

Metric Count
✅ Passed 87
❌ Failed 8
📊 Total 95

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

@AnkushMalaker merged commit e2c331b into dev Feb 1, 2026
3 of 4 checks passed