OmniVoice HTTP TTS Server #6
- New omnivoice-tts-server binary built when OV_WEBSERVER=ON
- POST /v1/audio/speech endpoint (OpenAI TTS API compatible)
- OpenAPI docs at /v1/api-docs
- CLI client in examples/client.sh with a companion examples/server.sh

Implements async TTS generation via cpp-httplib with JSON request handling. Supports WAV output formats (16/24/32-bit) and language/style instructions.
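The 16/24/32-bit WAV output mentioned above can be illustrated with a minimal C++ sketch. This is not the code from tools/omnivoice-tts-server.cpp: encode_wav_pcm, put_le, and put_tag are hypothetical names, and the sketch assumes mono output and integer PCM (format tag 1) for all three bit depths, which the server may handle differently (32-bit WAV is often written as IEEE float instead).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Append the low `bytes` bytes of `value` in little-endian order.
static void put_le(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    assert(bytes >= 1 && bytes <= 4);
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xFFu));
    }
}

// Append a four-character chunk tag such as "RIFF".
static void put_tag(std::vector<uint8_t>& out, const char* tag) {
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<uint8_t>(tag[i]));
}

// Hypothetical helper (not the PR's code): converts mono float samples in
// [-1, 1] to an in-memory integer PCM WAV file.
std::vector<uint8_t> encode_wav_pcm(const std::vector<float>& samples,
                                    uint32_t sample_rate, int bits_per_sample) {
    if (bits_per_sample != 16 && bits_per_sample != 24 && bits_per_sample != 32) {
        throw std::invalid_argument("bits_per_sample must be 16, 24, or 32");
    }
    const uint32_t channels = 1;  // assumption: mono output
    const uint32_t bytes_per_sample = static_cast<uint32_t>(bits_per_sample) / 8;
    const uint32_t data_size =
        static_cast<uint32_t>(samples.size()) * bytes_per_sample;

    std::vector<uint8_t> out;
    out.reserve(44 + data_size);
    put_tag(out, "RIFF");
    put_le(out, 36 + data_size, 4);                             // RIFF chunk size
    put_tag(out, "WAVE");
    put_tag(out, "fmt ");
    put_le(out, 16, 4);                                         // fmt chunk size
    put_le(out, 1, 2);                                          // format tag 1 = integer PCM
    put_le(out, channels, 2);
    put_le(out, sample_rate, 4);
    put_le(out, sample_rate * channels * bytes_per_sample, 4);  // byte rate
    put_le(out, channels * bytes_per_sample, 2);                // block align
    put_le(out, static_cast<uint32_t>(bits_per_sample), 2);
    put_tag(out, "data");
    put_le(out, data_size, 4);

    // Largest positive sample value: 2^(bits - 1) - 1.
    const double max_val = std::ldexp(1.0, bits_per_sample - 1) - 1.0;
    for (float s : samples) {
        // Clamp to [-1, 1] so out-of-range input cannot overflow the integer type.
        const double clamped =
            std::max(-1.0, std::min(1.0, static_cast<double>(s)));
        const int32_t v = static_cast<int32_t>(std::lround(clamped * max_val));
        // Two's complement: the low bytes_per_sample bytes carry the sample.
        put_le(out, static_cast<uint32_t>(v), static_cast<int>(bytes_per_sample));
    }
    return out;
}
```

The header is always 44 bytes, so three samples at 16 bits produce a 50-byte buffer. The sample rate is passed in by the caller because the sketch makes no assumption about the rate the model produces.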
The idea of compatibility with OpenAI is good, and I want to keep it. However, I'm also working on qwentts.cpp (https://github.com/ServeurpersoCom/qwentts.cpp) and I haven't yet decided whether to merge the two projects to avoid duplication.
Not sure if you have decided which direction you want to take this project in regard to a potential merge with qwentts. But as it is right now it works pretty darn well, and I'd personally love an OpenAI-compatible, ideally cloning and streaming capable HTTP server in order to integrate this project with various frontends.
Tested the workflow; it does not break anything in the core application. This should be merged so that streaming tests can be generated very quickly.
What
Adds an HTTP REST API (omnivoice-tts-server) for text-to-speech generation, exposing an OpenAI-compatible endpoint at /v1/audio/speech. The server runs as a standalone binary built from the tools/omnivoice-tts-server.cpp source.

New files:
- tools/omnivoice-tts-server.cpp: HTTP server implementation using cpp-httplib
- examples/client.sh: CLI client to call the API with jq-based JSON construction
- examples/server.sh: Helper script to launch the server with model paths

Modified:
- CMakeLists.txt: Added OV_WEBSERVER option and new build target

Why
Provides a networked interface to OmniVoice's TTS pipeline, enabling:
The server supports:
- --lang and --instruct CLI flags
- (--chunk-duration)
- /v1/api-docs

How to Review
- CMakeLists.txt: See the conditional build block for omnivoice-tts-server. Note the dependency on httplib and nlohmann_json.
- tools/omnivoice-tts-server.cpp: Start with main() to understand CLI argument parsing, then:
  - generate_audio_task(): Worker thread that calls ov_synthesize() and encodes WAV
  - /v1/audio/speech: Request validation, JSON parsing, async worker dispatch
  - /v1/api-docs: OpenAPI spec generation (static, no dynamic routing)
- examples/client.sh: Shows how to call the API from bash using jq to construct JSON payloads. Optionally, environment variables can be used to configure the client:
  - (http://127.0.0.1:1234)
  - (output.wav)
  - (300 seconds)
- examples/server.sh: Demonstrates server launch with model paths and default configuration. Command line arguments are forwarded to the omnivoice-tts-server executable.

What's intentionally left out:
- (generate_audio_task())

Testing
- OV_WEBSERVER=ON and verify omnivoice-tts-server binary is created
- examples/server.sh and confirm server starts on port 1234
- /v1/api-docs and verify OpenAPI spec is returned as JSON
- examples/client.sh with a test prompt and verify WAV output
- input field in JSON payload, invalid JSON, unsupported format

Deployment Notes
- OV_WEBSERVER=ON in CMake
- httplib (C++17 header-only), nlohmann_json (header-only)
- (127.0.0.1:1234 by default)
- --model and --codec CLI args (no env var fallback)
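For reviewers who want a mental model of the request validation step in the /v1/audio/speech handler and the error cases listed under Testing, the following is a rough sketch and not the PR's implementation. It assumes the JSON body has already been parsed, so the invalid-JSON case is out of scope here. SpeechRequest, Validation, and validate_speech_request are hypothetical names; response_format is the field name used by the OpenAI speech API and is assumed to carry over; bits_per_sample is a hypothetical field standing in for however the server selects the 16/24/32-bit output; the 400 status code is also an assumption.

```cpp
#include <string>

// Hypothetical, already-parsed view of a POST /v1/audio/speech body.
struct SpeechRequest {
    bool has_input;               // whether the payload carried an "input" field
    std::string input;            // text to synthesize
    std::string response_format;  // assumed field name, e.g. "wav"
    int bits_per_sample;          // hypothetical field: 16, 24, or 32
};

// Outcome of validation: the HTTP status the handler would send back.
struct Validation {
    int status;           // 200 when the request can be dispatched
    std::string message;  // empty on success, otherwise a short reason
};

// Checks the request before any work is handed to the synthesis worker.
Validation validate_speech_request(const SpeechRequest& req) {
    if (!req.has_input || req.input.empty()) {
        return {400, "missing or empty 'input' field"};
    }
    if (req.response_format != "wav") {
        return {400, "unsupported format: " + req.response_format};
    }
    if (req.bits_per_sample != 16 && req.bits_per_sample != 24 &&
        req.bits_per_sample != 32) {
        return {400, "unsupported bit depth"};
    }
    return {200, ""};
}
```

Rejecting bad requests before dispatch keeps the worker thread free for requests that can actually be synthesized, which is why the checks run first in this sketch.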