
feat: send native speech activity events #29

Merged
santhoshvai merged 16 commits into master from speech-activity
Apr 14, 2026

Conversation

@santhoshvai
Member

@santhoshvai santhoshvai commented Apr 14, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added AudioDeviceModule API for controlling audio playback, recording, voice processing, microphone muting, and ducking settings.
    • Introduced speech activity detection with event notifications when speech starts or ends.
    • Added audio device event listeners for monitoring audio activity and processing states.
  • Chores

    • Updated version to 137.1.4-alpha.5.

hiroshihorie and others added 11 commits April 13, 2026 13:56
Our StreamWebRTC framework exposes RTCAudioEngineState via engineState but
does not provide RTCAudioEngineAvailability or -setEngineAvailability:.
Remove the corresponding native bridge methods and their TS wrappers so
the example app compiles. Also drop the unused AudioDeviceModuleObserver
import and trim AudioEngineAvailability from the public exports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Apr 14, 2026

Warning

Rate limit exceeded

@santhoshvai has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 26 minutes and 1 second before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 26 minutes and 1 second.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f6c7d0c3-8a5f-4c70-8dec-433a09bef623

📥 Commits

Reviewing files that changed from the base of the PR and between dcdccf9 and d4bd4e0.

📒 Files selected for processing (4)
  • examples/GumTestApp/ios/GumTestApp.xcodeproj/project.pbxproj
  • ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.m
  • src/AudioDeviceModule.ts
  • src/AudioDeviceModuleEvents.ts
📝 Walkthrough

Walkthrough

This change introduces speech activity detection on Android and adds comprehensive iOS audio device module controls. It includes a new Java speech detector monitoring microphone loudness, Android integration via WebRTCModule callbacks, iOS native method wrappers for audio device operations, and TypeScript bindings to expose these capabilities to JavaScript while emitting speech activity events across both platforms.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Android Speech Activity Detection**<br/>android/src/main/java/com/oney/WebRTCModule/SpeechActivityDetector.java, android/src/main/java/com/oney/WebRTCModule/WebRTCModule.java | Introduces SpeechActivityDetector class that monitors microphone audio buffer loudness (PCM16, dBFS-based) to track speech activity with configurable thresholds. Integrates detector into WebRTCModule audio recording lifecycle, emitting audioDeviceModuleSpeechActivity events on state changes. |
| **iOS Audio Device Module**<br/>ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.h, ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.m | Adds category on WebRTCModule exporting 25+ React Native bridge methods for audio device control: playout/recording lifecycle, microphone mute, voice processing config, mute modes, audio ducking, and recording preparation mode. Methods provide both async promise-based and sync blocking variants. |
| **TypeScript Audio Device API**<br/>src/AudioDeviceModule.ts, src/AudioDeviceModuleEvents.ts, src/index.ts | New AudioDeviceModule class and AudioEngineMuteMode enum expose audio device control to JavaScript, wrapping native methods with platform guard (iOS/macOS only for most operations). Updates event emitter to support Android speech activity events. Exports new API surface in main index. |
| **iOS Framework Configuration**<br/>ios/RCTWebRTC/WebRTCModule.h, ios/RCTWebRTC/WebRTCModule.m | Adds documentation comments clarifying that FrameCryption is unsupported and related properties remain empty to preserve native initialization compatibility. |
| **Example App Configuration**<br/>examples/GumTestApp/ios/GumTestApp-tvOS/Info.plist, examples/GumTestApp/ios/GumTestApp/Info.plist, examples/GumTestApp/ios/GumTestApp.xcodeproj/project.pbxproj | Disables RCTNewArchEnabled flag in both tvOS and iOS example apps; updates build setting array format for OTHER_LDFLAGS in Debug/Release configurations. |
| **Project Files**<br/>.gitignore, package.json | Adds ignore patterns for .xcode.env.local and PLAN.md; bumps version from 137.1.3 to 137.1.4-alpha.5. |
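
To make the "PCM16, dBFS-based" loudness measure in the summary concrete, here is a minimal TypeScript sketch of how an RMS level in dBFS can be computed from 16-bit PCM samples. It is an illustration only, not the Java code from this PR; the function name `pcm16ToDbfs` and the constant `PCM16_FULL_SCALE` are hypothetical.

```typescript
// Hypothetical helper, not taken from SpeechActivityDetector.java.
// Full scale for signed 16-bit PCM: samples lie in [-32768, 32767].
const PCM16_FULL_SCALE = 32768;

// Returns the RMS loudness of the buffer in dBFS (0 dBFS = full scale).
// Silence and empty buffers are reported as -Infinity.
function pcm16ToDbfs(samples: Int16Array): number {
    if (samples.length === 0) {
        return -Infinity;
    }
    let sumSquares = 0;
    for (let i = 0; i < samples.length; i++) {
        const normalized = samples[i] / PCM16_FULL_SCALE;
        sumSquares += normalized * normalized;
    }
    const rms = Math.sqrt(sumSquares / samples.length);
    return rms === 0 ? -Infinity : 20 * Math.log10(rms);
}
```

A detector would then compare this value against a configurable threshold; the actual threshold values used in the PR are not shown on this page.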

Sequence Diagram(s)

sequenceDiagram
    participant MIC as Microphone
    participant AD as Audio Device<br/>(Native)
    participant SAD as SpeechActivityDetector<br/>(Android)
    participant WRM as WebRTCModule<br/>(Bridge)
    participant JS as JavaScript<br/>(React Native)

    rect rgba(100, 150, 200, 0.5)
    Note over MIC,JS: Recording Session Start
    JS->>WRM: start recording
    WRM->>AD: onWebRtcAudioRecordStart()
    WRM->>SAD: reset()
    end

    rect rgba(150, 200, 100, 0.5)
    Note over MIC,JS: Audio Buffer Processing
    MIC->>AD: capture audio chunk
    AD->>WRM: audio buffer callback
    WRM->>SAD: processBuffer(buffer, bytesRead)
    SAD->>SAD: compute RMS & dBFS
    alt Loudness > Threshold
        SAD->>SAD: update timing windows
        alt Duration > START_CONFIRM_MS
            SAD->>WRM: onSpeechStarted()
            WRM->>JS: emit audioDeviceModuleSpeechActivity<br/>event: "started"
        end
    else Loudness ≤ Threshold
        alt Speaking && Silence > TIMEOUT_MS
            SAD->>WRM: onSpeechEnded()
            WRM->>JS: emit audioDeviceModuleSpeechActivity<br/>event: "ended"
        end
    end
    end

    rect rgba(200, 150, 100, 0.5)
    Note over MIC,JS: Recording Session Stop
    JS->>WRM: stop recording
    WRM->>AD: onWebRtcAudioRecordStop()
    WRM->>SAD: onRecordStop()
    alt Still Speaking
        SAD->>WRM: onSpeechEnded()
        WRM->>JS: emit audioDeviceModuleSpeechActivity<br/>event: "ended"
    end
    SAD->>SAD: reset()
    end
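
The flow in the diagram can be approximated by a small state machine. The sketch below is written in TypeScript for illustration and is not the Android implementation: the class name `SpeechActivitySketch`, the option names, and the choice to restart the confirmation window on any quiet buffer are all assumptions.

```typescript
type SpeechActivityKind = 'started' | 'ended';

// Hypothetical option names; the diagram only names START_CONFIRM_MS and TIMEOUT_MS.
interface SketchOptions {
    thresholdDbfs: number;
    startConfirmMs: number;
    silenceTimeoutMs: number;
}

class SpeechActivitySketch {
    private options: SketchOptions;
    private emit: (kind: SpeechActivityKind) => void;
    private speaking = false;
    private loudSince: number | null = null;
    private lastLoudAt: number | null = null;

    constructor(options: SketchOptions, emit: (kind: SpeechActivityKind) => void) {
        this.options = options;
        this.emit = emit;
    }

    // Feed one loudness measurement (in dBFS) taken at time nowMs.
    process(dbfs: number, nowMs: number): void {
        if (dbfs > this.options.thresholdDbfs) {
            if (this.loudSince === null) {
                this.loudSince = nowMs;
            }
            this.lastLoudAt = nowMs;
            if (!this.speaking && nowMs - this.loudSince > this.options.startConfirmMs) {
                this.speaking = true;
                this.emit('started');
            }
            return;
        }
        // Quiet buffer: restart the confirmation window (an assumption of this sketch).
        this.loudSince = null;
        if (
            this.speaking &&
            this.lastLoudAt !== null &&
            nowMs - this.lastLoudAt > this.options.silenceTimeoutMs
        ) {
            this.speaking = false;
            this.emit('ended');
        }
    }

    // Mirrors the "Recording Session Stop" block: close an open speech segment, then reset.
    stop(): void {
        if (this.speaking) {
            this.emit('ended');
        }
        this.reset();
    }

    reset(): void {
        this.speaking = false;
        this.loudSince = null;
        this.lastLoudAt = null;
    }
}
```

Passing timestamps in explicitly keeps the sketch deterministic and easy to test; a real detector would more likely derive time from the buffer size or a monotonic clock.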

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hark! The mic doth speak its truth,
Android detects each word's bright gleam,
iOS bridges calls with grace,
Events flow swift through React's domain—
Speech awakens, silence falls,
One alpha leap toward audio's rebirth! 🎤✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.08% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat: send native speech activity events' accurately describes the main feature: adding native speech activity detection and event emission across Android and iOS platforms. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/AudioDeviceModuleEvents.ts (1)

1-1: ⚠️ Potential issue | 🟡 Minor

Remove unused Platform import to fix CI failure.

The pipeline is failing because Platform is imported but no longer used after removing the Android OS guard on line 43.

🔧 Proposed fix
-import { NativeEventEmitter, NativeModules, Platform } from 'react-native';
+import { NativeEventEmitter, NativeModules } from 'react-native';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/AudioDeviceModuleEvents.ts` at line 1, Remove the unused Platform import
from the module's import list to fix the CI error: update the import statement
that currently reads with NativeEventEmitter, NativeModules, Platform to only
import NativeEventEmitter and NativeModules (i.e., remove the symbol Platform)
in src/AudioDeviceModuleEvents.ts so there are no unused imports reported by the
linter/CI.
🧹 Nitpick comments (2)
src/AudioDeviceModule.ts (1)

20-25: Extract the availability check once.

The same platform guard is copied into every method. A tiny helper that owns the error message and returns the native module would keep this surface consistent and make future additions easier to maintain.

♻️ Possible shape
+const getAudioDeviceModule = () => {
+    if (Platform.OS === 'android') {
+        throw new Error('AudioDeviceModule is only available on iOS/macOS');
+    }
+    if (!WebRTCModule) {
+        throw new Error('WebRTCModule is not available');
+    }
+    return WebRTCModule;
+};
+
 export class AudioDeviceModule {
     /**
      * Start audio playback
      */
     static async startPlayout(): Promise<void> {
-        if (Platform.OS === 'android') {
-            throw new Error('AudioDeviceModule is only available on iOS/macOS');
-        }
-
-        return WebRTCModule.audioDeviceModuleStartPlayout();
+        return getAudioDeviceModule().audioDeviceModuleStartPlayout();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/AudioDeviceModule.ts` around lines 20 - 25, Extract the repeated
Platform.OS check into a single private helper on AudioDeviceModule (e.g., a
method like getNativeAudioModuleOrThrow or ensureAudioDeviceAvailable) that
performs the iOS/macOS guard and throws the standardized error, then update
startPlayout to call that helper and return the native call via the helper
(e.g., helper().audioDeviceModuleStartPlayout()). Replace the same pattern in
any other AudioDeviceModule static methods to use this helper so the error
message and native module retrieval are centralized.
ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.m (1)

18-84: Factor the status-code promise boilerplate into one helper.

All promise-based exports repeat the same NSInteger → resolve/reject mapping. Pulling that into a small helper would remove a lot of copy/paste and keep error formatting aligned as more ADM operations get added.

♻️ Helper sketch
+static inline void ResolveADMResult(NSInteger result,
+                                    NSString *code,
+                                    NSString *action,
+                                    RCTPromiseResolveBlock resolve,
+                                    RCTPromiseRejectBlock reject) {
+    if (result == 0) {
+        resolve(nil);
+        return;
+    }
+
+    reject(code, [NSString stringWithFormat:@"Failed to %@: %ld", action, (long)result], nil);
+}
+
 RCT_EXPORT_METHOD(audioDeviceModuleStartPlayout
                   : (RCTPromiseResolveBlock)resolve rejecter
                   : (RCTPromiseRejectBlock)reject) {
-    NSInteger result = [RAW_ADM startPlayout];
-    if (result == 0) {
-        resolve(nil);
-    } else {
-        reject(@"playout_error", [NSString stringWithFormat:@"Failed to start playout: %ld", (long)result], nil);
-    }
+    ResolveADMResult([RAW_ADM startPlayout], @"playout_error", @"start playout", resolve, reject);
 }

Also applies to: 88-118, 158-168, 196-208

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.m` around lines 18 - 84,
Create a small helper in WebRTCModule+RTCAudioDeviceModule.m (e.g. -
(void)handleADMResult:(NSInteger)result operation:(NSString*)op
resolve:(RCTPromiseResolveBlock)resolve reject:(RCTPromiseRejectBlock)reject)
that centralizes the NSInteger → resolve(nil) / reject(code, message, nil) logic
and consistent error formatting, then replace the repeated blocks in
audioDeviceModuleStartPlayout, audioDeviceModuleStopPlayout,
audioDeviceModuleStartRecording, audioDeviceModuleStopRecording,
audioDeviceModuleStartLocalRecording, audioDeviceModuleStopLocalRecording (and
the other similar blocks mentioned) to call
handleADMResult:operation:resolve:reject: passing the RAW_ADM return value and a
short operation string like @"startPlayout" so all callers use the same message
format and behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/GumTestApp/ios/GumTestApp.xcodeproj/project.pbxproj`:
- Around line 762-765: Remove the extraneous whitespace-only linker flag token
(" ") from the OTHER_LDFLAGS array so only "$(inherited)" remains; locate the
OTHER_LDFLAGS entries (the arrays containing "$(inherited)" and the literal " ")
and delete the " " element in each occurrence (e.g., the entries around the
symbols OTHER_LDFLAGS and the quoted " " token) to avoid passing an
empty/invalid ld argument.

In `@examples/GumTestApp/ios/GumTestApp/Info.plist`:
- Line 47: Update the iOS test app setup to either document or fix why React
Native New Architecture is disabled: locate the RCTNewArchEnabled key in the
Info.plist files for the iOS and tvOS test apps (currently set to false) and
either (A) add a short note in the repository README or PR describing the exact
incompatibility, affected components (e.g., TurboModule/Fabric integration), and
rollback criteria, or (B) enable the flag to true and resolve the underlying
incompatibility in native integration code so it matches the Android setting
(newArchEnabled=true) — mention RCTNewArchEnabled, the two Info.plist entries,
and the Android newArchEnabled setting to make the change discoverable.

In `@src/AudioDeviceModule.ts`:
- Around line 249-256: In setDuckingLevel, validate the incoming level before
calling the bridge: in AudioDeviceModule.setDuckingLevel check Platform.OS as
now, then ensure level is a number, Number.isInteger(level) is true, and level
is between 0 and 100 inclusive; if not, throw a descriptive
RangeError/TypeError; only after passing validation call
WebRTCModule.audioDeviceModuleSetDuckingLevel(level). This prevents non-integer
or out-of-range values from reaching the native method.
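
A minimal sketch of the validation described for setDuckingLevel, assuming the 0 to 100 integer range stated in the comment. The helper name `validateDuckingLevel` is hypothetical, and the choice of TypeError for non-numbers versus RangeError for non-integer or out-of-range values is one reasonable reading of "RangeError/TypeError", not something the review prescribes.

```typescript
// Hypothetical helper; intended to run before the call to
// WebRTCModule.audioDeviceModuleSetDuckingLevel(level).
function validateDuckingLevel(level: unknown): number {
    if (typeof level !== 'number') {
        throw new TypeError(`Ducking level must be a number, got ${typeof level}`);
    }
    // Number.isInteger also rejects NaN and Infinity.
    if (!Number.isInteger(level)) {
        throw new RangeError(`Ducking level must be an integer, got ${level}`);
    }
    if (level < 0 || level > 100) {
        throw new RangeError(`Ducking level must be between 0 and 100, got ${level}`);
    }
    return level;
}
```

Returning the validated value lets the caller pass it straight to the bridge in one expression.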


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d12795ed-ebfa-4733-accb-00f0db46b6ae

📥 Commits

Reviewing files that changed from the base of the PR and between c40fee0 and dcdccf9.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (14)
  • .gitignore
  • android/src/main/java/com/oney/WebRTCModule/SpeechActivityDetector.java
  • android/src/main/java/com/oney/WebRTCModule/WebRTCModule.java
  • examples/GumTestApp/ios/GumTestApp-tvOS/Info.plist
  • examples/GumTestApp/ios/GumTestApp.xcodeproj/project.pbxproj
  • examples/GumTestApp/ios/GumTestApp/Info.plist
  • ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.h
  • ios/RCTWebRTC/WebRTCModule+RTCAudioDeviceModule.m
  • ios/RCTWebRTC/WebRTCModule.h
  • ios/RCTWebRTC/WebRTCModule.m
  • package.json
  • src/AudioDeviceModule.ts
  • src/AudioDeviceModuleEvents.ts
  • src/index.ts

Comment thread examples/GumTestApp/ios/GumTestApp.xcodeproj/project.pbxproj
Comment thread examples/GumTestApp/ios/GumTestApp/Info.plist Outdated
Comment thread src/AudioDeviceModule.ts
@santhoshvai santhoshvai merged commit 13da032 into master Apr 14, 2026
4 checks passed
@santhoshvai santhoshvai deleted the speech-activity branch April 14, 2026 14:21
santhoshvai pushed a commit that referenced this pull request Apr 16, 2026
* pc: don't throw when applyConstraints on audio tracks

* lint
santhoshvai added a commit to GetStream/stream-video-js that referenced this pull request Apr 28, 2026
### 💡 Overview

Old mechanism - polling through rtc stats

Now - speech events pushed from native side

Depends on GetStream/react-native-webrtc#29


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Speech detection now uses native WebRTC event subscriptions instead of
the previous detector implementation.
* **Removed**
* The useSpeechDetection hook has been removed and is no longer
exported.
* **Docs**
  * Hook entry for speech detection removed from platform-specific docs.
* **Chores**
  * WebRTC dependency updated to a newer release.
* **Tests**
* Related unit tests updated/removed to match the new event-based
approach.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
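
As a rough illustration of the move from polling to pushed events, the TypeScript sketch below subscribes to the audioDeviceModuleSpeechActivity event named in this PR and maps it to a boolean. The payload shape `{ event: 'started' | 'ended' }` is inferred from the sequence diagram and is an assumption, as are the names `watchSpeaking`, `SpeechEventSource`, and `SpeechSubscription`. The source interface is modeled on the `addListener(...)` / `remove()` shape of React Native's NativeEventEmitter so the sketch stays self-contained.

```typescript
// Event name taken from the PR summary.
const SPEECH_ACTIVITY_EVENT = 'audioDeviceModuleSpeechActivity';

// Assumed payload shape, based on the "started"/"ended" labels in the diagram.
interface SpeechActivityPayload {
    event: 'started' | 'ended';
}

// Hypothetical minimal interfaces standing in for an event emitter.
interface SpeechSubscription {
    remove(): void;
}

interface SpeechEventSource {
    addListener(
        eventName: string,
        listener: (payload: SpeechActivityPayload) => void,
    ): SpeechSubscription;
}

// Subscribes to speech activity and reports it as a boolean.
// Returns a function that removes the subscription.
function watchSpeaking(
    source: SpeechEventSource,
    onChange: (isSpeaking: boolean) => void,
): () => void {
    const subscription = source.addListener(SPEECH_ACTIVITY_EVENT, (payload) => {
        onChange(payload.event === 'started');
    });
    return () => subscription.remove();
}
```

Compared with polling RTC stats, a consumer only does work when the native side reports a state change, and the returned cleanup function fits naturally into an effect or teardown hook.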

2 participants