
Conversation

@AnkushMalaker (Collaborator) commented Jan 29, 2026

  • Removed cumulative audio offset tracking from StreamingTranscriptionConsumer as Deepgram provides cumulative timestamps directly.
  • Updated store_final_result method to utilize Deepgram's cumulative timestamps without adjustments.
  • Implemented completion signaling for transcription sessions in Redis, ensuring conversation jobs wait for all results before processing.
  • Improved error handling to signal completion even in case of errors, preventing conversation jobs from hanging.
  • Enhanced logging for better visibility of transcription completion and error states.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced transcription workflow reliability with improved synchronization and completion signaling between transcription and conversation processing.
    • Added timeout and error handling to prevent delays in the processing pipeline.
  • Refactor

    • Simplified timestamp handling to use native streaming timestamps directly for improved accuracy.


coderabbitai bot (Contributor) commented Jan 29, 2026

📝 Walkthrough

The changes introduce Redis-based completion signaling between the transcription streaming consumer and conversation job processing. The transcription service now marks completion via Redis with a 5-minute TTL, while the conversation job waits for this signal before proceeding. Timestamp offset calculations are removed from transcription results since Deepgram already provides cumulative timestamps.
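As a rough illustration of the producer side of that hand-off, here is a minimal sketch assuming a redis.asyncio client; the helper name, signature, and client wiring are assumptions for illustration, not the actual StreamingTranscriptionConsumer code:

import redis.asyncio as redis

COMPLETION_TTL_SECONDS = 300  # 5-minute TTL, per the walkthrough


async def signal_transcription_complete(
    client: redis.Redis, session_id: str, error: bool = False
) -> None:
    """Mark a transcription session finished so the waiting conversation job can proceed."""
    completion_key = f"transcription:complete:{session_id}"
    # "1" signals success; "error" tells the waiting job to log a warning and continue.
    await client.set(completion_key, "error" if error else "1", ex=COMPLETION_TTL_SECONDS)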

Changes

  • Transcription streaming simplification
    backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py
    Removed per-session audio offset initialization and offset-based timestamp adjustments from start_session_stream and store_final_result. Added Redis completion signaling in end_session_stream that sets transcription:complete:{session_id} with a 5-minute TTL, including an error state. Simplified to use Deepgram's cumulative timestamps directly, with no offset computation.

  • Conversation job synchronization
    backends/advanced/src/advanced_omi_backend/workers/conversation_jobs.py
    Added a wait-and-check step after meaningful speech validation that polls the Redis key transcription:complete:{session_id} for up to 30 seconds (sketched below). An "error" value is logged as a warning and processing continues; success is logged on completion; a timeout warning is logged if the key is not set within the window.
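A minimal sketch of that wait-and-check step on the conversation job side, again assuming a redis.asyncio client; the helper name and the 0.5-second polling interval are illustrative, and only the key format, the 30-second window, and the "error" handling come from the summary above:

import asyncio
import logging

import redis.asyncio as redis

logger = logging.getLogger(__name__)


async def wait_for_transcription_complete(
    client: redis.Redis, session_id: str, timeout_seconds: float = 30.0
) -> None:
    """Poll the completion key before post-processing; continue either way so the job never hangs."""
    completion_key = f"transcription:complete:{session_id}"
    poll_interval = 0.5  # illustrative cadence
    waited = 0.0
    while waited < timeout_seconds:
        value = await client.get(completion_key)
        if value is not None:
            if value in (b"error", "error"):  # bytes unless decode_responses=True
                logger.warning(f"Transcription for {session_id} signaled an error; continuing anyway")
            else:
                logger.info(f"Transcription complete for {session_id}")
            return
        await asyncio.sleep(poll_interval)
        waited += poll_interval
    logger.warning(f"Timed out after {timeout_seconds}s waiting on transcription for {session_id}; continuing")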

Sequence Diagram(s)

sequenceDiagram
    participant Consumer as Transcription Consumer
    participant Redis as Redis
    participant JobWorker as Conversation Job Worker
    
    JobWorker->>JobWorker: Validate meaningful speech
    JobWorker->>Redis: Poll transcription:complete:{session_id}
    Note over JobWorker: Wait up to 30 seconds
    
    Consumer->>Consumer: Process final transcription
    Consumer->>Redis: SET transcription:complete:{session_id} = "1"<br/>(TTL: 5 min)
    
    Redis-->>JobWorker: Signal received
    JobWorker->>JobWorker: Proceed to post-processing
    
    alt Timeout
        JobWorker->>JobWorker: Log timeout warning<br/>Continue anyway
    end
    
    alt Error state
        Consumer->>Redis: SET transcription:complete:{session_id} = "error"
        Redis-->>JobWorker: Error signal
        JobWorker->>JobWorker: Log warning, continue
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Enhance StreamingTranscriptionConsumer and conversation job handling' is directly related to the main changes, which involve refactoring StreamingTranscriptionConsumer (removing offset tracking, adding Redis completion signals) and enhancing conversation job handling (adding wait-and-check for transcription completion).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



@AnkushMalaker (Collaborator, Author)

@coderabbitai review

coderabbitai bot (Contributor) commented Jan 29, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py (1)

264-304: Verify provider timestamp capabilities in registry configuration.

The code assumes all configured streaming providers return cumulative timestamps from stream start (as Deepgram does). However, Parakeet ASR streaming returns per-chunk timestamps by default and requires explicit offset merging to produce cumulative timestamps. Since RegistryStreamingTranscriptionProvider supports any provider via config, and the architecture claims provider-agnosticism, downstream word timing will silently fail if a provider with per-chunk semantics (e.g., Parakeet) is configured as the streaming provider. Additionally, the hardcoded provider name b"deepgram-stream" on line 283 contradicts this provider-agnostic design.

Recommended fixes:

  • Add provider-specific offset handling in store_final_result() based on the actual provider from the registry
  • Document timestamp semantics in config.yml for each streaming provider, or gate cumulative-timestamp processing by a per-provider capability flag (see the sketch after this list)
  • Replace hardcoded provider name with dynamic detection from the configured streaming provider
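A sketch of what the capability-flag approach could look like; every name below is hypothetical and not taken from the repository. The point is only that store_final_result would branch on a per-provider flag instead of assuming Deepgram semantics:

from dataclasses import dataclass


@dataclass
class StreamingProviderCaps:
    name: str
    cumulative_timestamps: bool  # True when word timestamps are relative to stream start (e.g. Deepgram)


# Hypothetical registry entries; real provider names and config keys may differ.
PROVIDER_CAPS = {
    "deepgram-stream": StreamingProviderCaps("deepgram-stream", cumulative_timestamps=True),
    "parakeet-stream": StreamingProviderCaps("parakeet-stream", cumulative_timestamps=False),
}


def adjust_word_timestamps(words: list[dict], provider: str, session_offset: float) -> list[dict]:
    """Apply the session offset only for providers that emit per-chunk timestamps."""
    caps = PROVIDER_CAPS.get(provider)
    if caps and caps.cumulative_timestamps:
        return words  # already cumulative; store as-is
    # Per-chunk (or unknown) providers: shift word times into the session timeline.
    return [
        {**w, "start": w["start"] + session_offset, "end": w["end"] + session_offset}
        for w in words
    ]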

Comment on lines +175 to +181

            # Still signal completion even on error so conversation job doesn't hang
            try:
                completion_key = f"transcription:complete:{session_id}"
                await self.redis_client.set(completion_key, "error", ex=300)
                logger.warning(f"⚠️ Set error completion signal for {session_id}")
            except Exception:
                pass  # Best effort

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find and examine the file
find . -type f -name "streaming_consumer.py" | grep -E "transcription|advanced_omi_backend"

Repository: SimpleOpenSoftware/chronicle

Length of output: 161


🏁 Script executed:

# Check the file exists and read around the specified lines
if [ -f "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" ]; then
  echo "=== File found, showing lines 170-185 ==="
  sed -n '170,185p' "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | cat -n
else
  echo "File not found, searching for similar paths..."
  find . -name "streaming_consumer.py" -type f
fi

Repository: SimpleOpenSoftware/chronicle

Length of output: 1089


🏁 Script executed:

# Get file size first to see if it's manageable
wc -l "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" 2>/dev/null || echo "File path needs adjustment"

Repository: SimpleOpenSoftware/chronicle

Length of output: 163


🏁 Script executed:

# Search for error handling patterns (except blocks) in the file
rg "except\s+Exception" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" -A 2 -B 1 || echo "Pattern not found with that path"

Repository: SimpleOpenSoftware/chronicle

Length of output: 1821


🏁 Script executed:

# Also check how logging is used in the file
rg "logger\.(debug|warning|error|info)" "backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py" | head -20

Repository: SimpleOpenSoftware/chronicle

Length of output: 1785


Don't swallow errors when setting the completion signal.

The bare except Exception: pass hides failures that are useful for debugging. Similar error handlers in the module log Redis failures (e.g., logger.warning(f"Failed to set error flag in Redis: {redis_error}")), so logging at least a debug/warn level here maintains the "best effort" behavior while preserving observability.

🔧 Suggested fix
             try:
                 completion_key = f"transcription:complete:{session_id}"
                 await self.redis_client.set(completion_key, "error", ex=300)
                 logger.warning(f"⚠️ Set error completion signal for {session_id}")
-            except Exception:
-                pass  # Best effort
+            except Exception as signal_err:
+                logger.debug(
+                    f"Failed to set completion signal for {session_id}: {signal_err}",
+                    exc_info=True
+                )  # Best effort
🧰 Tools
🪛 Ruff (0.14.14)

180-181: try-except-pass detected, consider logging the exception

(S110)


180-180: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In
`@backends/advanced/src/advanced_omi_backend/services/transcription/streaming_consumer.py`
around lines 175-181: the bare "except Exception: pass" in the block that sets
completion_key (f"transcription:complete:{session_id}") hides Redis failures;
change it to catch the exception into a variable (e.g., except Exception as
redis_error:) and log it (e.g., logger.warning or logger.debug) with context
like "Failed to set error flag in Redis" including redis_error, while keeping
the best-effort behavior (do not re-raise) and referencing self.redis_client,
completion_key, session_id, and logger so observability is preserved.

@github-actions

⚠️ Robot Framework Test Results (No API Keys)

Status: ❌ Some tests failed

ℹ️ Note: This run excludes tests requiring external API keys (Deepgram, OpenAI).
Tests tagged with requires-api-keys will run on dev/main branches.

Metric Count
✅ Passed 87
❌ Failed 8
📊 Total 95

📊 View Reports

GitHub Pages (Live Reports):

Download Artifacts:


View full workflow run

@AnkushMalaker merged commit e2c331b into dev Feb 1, 2026
3 of 4 checks passed