Potential Realtime TTS Ending Delay #569

@Kaister300

Description

In the realtime TTS module, convert_realtime does not check whether the isFinal message has been received. As a result, there is a delay of a couple hundred milliseconds between the client receiving the last message from the websocket and the websocket actually being terminated. If this is intentional, that is fine, but to get a more accurate client-side reading of how long the entire response takes to generate, I propose the following change:

# src/elevenlabs/realtime_tts.py
def convert_realtime(...):
    # Line 138-ish
    ...
    if "audio" in data and data["audio"]:
        yield base64.b64decode(data["audio"])  # type: ignore
    if "isFinal" in data and data["isFinal"]:  # Exit the generator once the final message is received
        break
    ...
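
To illustrate the effect of the proposed check without a live connection, here is a minimal, self-contained sketch. It is not the library's code: `iter_audio_chunks` is a hypothetical helper, and the list of JSON strings stands in for websocket frames, assuming each frame is a JSON object with an optional base64 "audio" field and an optional "isFinal" flag as in the snippet above.

```python
import base64
import json
from typing import Iterable, Iterator


def iter_audio_chunks(messages: Iterable[str]) -> Iterator[bytes]:
    """Yield decoded audio from JSON messages, stopping at the first isFinal.

    Hypothetical helper for illustration; not part of the elevenlabs package.
    """
    for raw in messages:
        data = json.loads(raw)
        # Yield audio only when the field is present and non-empty
        if data.get("audio"):
            yield base64.b64decode(data["audio"])
        # Stop reading as soon as the server signals the final message
        if data.get("isFinal"):
            break


def encode(payload: bytes) -> str:
    """Base64-encode bytes into the string form used in the simulated frames."""
    return base64.b64encode(payload).decode("ascii")


# Simulated frames (assumed shape); the last one should never be consumed
simulated = [
    json.dumps({"audio": encode(b"chunk-1")}),
    json.dumps({"audio": encode(b"chunk-2")}),
    json.dumps({"audio": None, "isFinal": True}),
    json.dumps({"audio": encode(b"never-read")}),
]

chunks = list(iter_audio_chunks(simulated))
```

With the check in place, the generator returns right after the isFinal frame instead of waiting for the connection to close, so the caller can stop its timer at that point.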

Code example

No response

Additional context

No response

Metadata

Labels

bug (Something isn't working), confirmed (Bug has been verified and accepted)
