Conversation
There was a problem hiding this comment.
Pull request overview
Adds initial integration of a new “Kokoro” MediaPipe calculator/servable to enable Kokoro-based audio generation, wiring it through graph side-packets, logging, build rules, and container dependencies.
Changes:
- Introduces
KokoroCalculator(+ proto/options) andKokoroServableimplementation. - Wires Kokoro into MediaPipe graph initialization and executor side-packets.
- Adds Kokoro logger and installs
espeak-ngdependencies in Dockerfiles.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/mediapipe_internal/mediapipegraphexecutor.hpp | Adds Kokoro side-packet tag and extends constructor to accept KokoroServableMap. |
| src/mediapipe_internal/mediapipegraphexecutor.cpp | Stores Kokoro map in sidePacketMaps and defines the new side-packet tag string. |
| src/mediapipe_internal/mediapipegraphdefinition.hpp | Adds Kokoro includes/types and extends GraphSidePackets to carry Kokoro servables. |
| src/mediapipe_internal/mediapipegraphdefinition.cpp | Adds Kokoro node detection and Kokoro servable instantiation during graph initialization. |
| src/logging.hpp | Declares kokoro_calculator_logger. |
| src/logging.cpp | Defines/registers/configures kokoro_calculator_logger. |
| src/audio/kokoro/kokoro_servable.hpp | New Kokoro servable wrapper (OpenVINO model + vocab loading + eSpeak init). |
| src/audio/kokoro/kokoro_calculator.proto | New MediaPipe calculator options for Kokoro. |
| src/audio/kokoro/kokoro_calculator.cc | New MediaPipe calculator implementation performing phonemization, tokenization, inference, and WAV output. |
| src/audio/kokoro/BUILD | Bazel targets for Kokoro servable + calculator + proto. |
| src/audio/audio_utils.hpp | Adds prepareAudioOutputKokoro declaration. |
| src/audio/audio_utils.cpp | Implements prepareAudioOutputKokoro (24 kHz WAV writing). |
| src/BUILD | Links kokoro_calculator into the main build when MediaPipe is enabled. |
| Dockerfile.ubuntu | Installs espeak-ng / dev libs (and adds data package in GPU-driver install step). |
| Dockerfile.redhat | Installs espeak-ng / devel packages (and installs runtime again later). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | ||
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); | ||
| kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable))); |
There was a problem hiding this comment.
KokoroServable construction can throw (e.g., OpenVINO read_model/compile_model exceptions). Unlike the TTS branch, this path doesn’t catch exceptions, which can terminate the server during graph load. Wrap KokoroServable creation in try/catch (ov::Exception/std::runtime_error) and return a config/initialization error with a clear log message.
| #include <sstream> | ||
| #include <string> | ||
| #include <unordered_map> | ||
| #include <vector> |
There was a problem hiding this comment.
This header uses std::filesystem::path and std::max but doesn’t include the required standard headers ( and ). This can fail to compile depending on transitive includes—please add the missing includes explicitly.
| #include <vector> | |
| #include <vector> | |
| #include <filesystem> | |
| #include <algorithm> |
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | ||
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); | ||
| kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable))); | ||
| kokoroServablesCleaningGuard.disableCleaning(); |
There was a problem hiding this comment.
Kokoro support adds a new calculator + graph node initialization path, but there are no unit tests covering Kokoro graph validation/initialization or request processing (there are similar tests for TTS/STT). Adding at least config validation tests (missing options/name, duplicate name, invalid options type) would help prevent regressions.
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | |
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); | |
| kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable))); | |
| kokoroServablesCleaningGuard.disableCleaning(); | |
| auto& calculatorOptions = config.node(i).node_options(0); | |
| if (!calculatorOptions.UnpackTo(&nodeOptions)) { | |
| SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options"); | |
| return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID; | |
| } | |
| try { | |
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); | |
| kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable))); | |
| kokoroServablesCleaningGuard.disableCleaning(); | |
| } catch (const std::runtime_error& e) { | |
| SPDLOG_LOGGER_ERROR(modelmanager_logger, "Kokoro node name: {} initialization failed: {}. ", nodeName, e.what()); | |
| return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID; | |
| } |
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | ||
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); |
There was a problem hiding this comment.
KokoroCalculatorOptions defines plugin_config, but it isn’t used when constructing KokoroServable (only models_path/target_device are passed). Either wire plugin_config through (and parse it into ov::AnyMap like other servables do) or remove the option to avoid a misleading/unused config field.
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | |
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath()); | |
| auto& calculatorOptions = config.node(i).node_options(0); | |
| if (!calculatorOptions.UnpackTo(&nodeOptions)) { | |
| SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options"); | |
| return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID; | |
| } | |
| std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>( | |
| nodeOptions.models_path(), | |
| nodeOptions.target_device(), | |
| nodeOptions.plugin_config(), | |
| mgconfig.getBasePath()); |
| drwav_uninit(&wav); | ||
| timer.stop(OUTPUT_PREPARATION); | ||
| auto outputPreparationTime = (timer.elapsed<std::chrono::microseconds>(OUTPUT_PREPARATION)) / 1000; | ||
| SPDLOG_LOGGER_DEBUG(t2s_calculator_logger, "Output preparation time: {} ms", outputPreparationTime); |
There was a problem hiding this comment.
prepareAudioOutputKokoro logs using t2s_calculator_logger, which makes Kokoro timing logs appear under the wrong component. Use kokoro_calculator_logger (or pass a logger in) so logs are attributable to the correct calculator.
| SPDLOG_LOGGER_DEBUG(t2s_calculator_logger, "Output preparation time: {} ms", outputPreparationTime); | |
| SPDLOG_LOGGER_DEBUG(kokoro_calculator_logger, "Output preparation time: {} ms", outputPreparationTime); |
| // Voice embedding | ||
| std::vector<float> voice = { | ||
| -0.2296, 0.1835, -0.0069, -0.1240, -0.2505, 0.0112, -0.0759, -0.1650, | ||
| -0.2665, -0.1965, 0.0242, -0.1667, 0.3524, 0.2140, 0.3069, -0.3377, | ||
| -0.0878, -0.0477, 0.0813, -0.2135, -0.2340, -0.1971, 0.0200, 0.0145, | ||
| 0.0016, 0.2596, -0.2665, 0.1434, 0.0503, 0.0867, 0.1905, -0.1281, | ||
| 0.0658, -0.0639, -0.0920, 0.2444, -0.1506, -0.2197, 0.1385, 0.2133, | ||
| -0.0755, -0.0188, -0.0142, 0.2301, -0.0776, -0.0748, 0.0172, 0.0430, | ||
| -0.1009, 0.1519, 0.1137, 0.0641, 0.2264, 0.1911, -0.0205, 0.2578, | ||
| 0.2210, -0.0784, -0.0235, -0.0547, 0.2191, -0.1623, -0.2416, 0.0076, | ||
| 0.0574, 0.2186, 0.0080, 0.0473, 0.0972, 0.0286, 0.1324, 0.0686, | ||
| 0.2652, -0.2237, -0.0980, -0.1693, -0.1866, 0.2273, 0.2008, -0.0683, | ||
| 0.0957, 0.0623, -0.1891, 0.1620, 0.1811, -0.0516, -0.0800, -0.1416, | ||
| -0.2374, -0.1892, 0.1726, -0.0690, -0.0300, 0.0467, -0.2811, -0.1603, | ||
| 0.0342, -0.1054, -0.0604, -0.0475, -0.0908, -0.1286, 0.1105, -0.1186, | ||
| 0.0582, 0.1887, 0.0345, 0.2081, 0.1404, -0.2532, 0.0026, 0.0402, | ||
| 0.0812, -0.0512, 0.0128, 0.0084, -0.0970, -0.0362, 0.0036, -0.0720, | ||
| -0.0850, 0.0221, -0.1037, 0.0569, 0.0187, -0.0649, -0.0288, -0.1795, | ||
| 0.0045, 0.2535, 0.6751, 0.1578, -0.0966, 0.1516, 0.2109, 0.2033, | ||
| -0.2155, -0.1783, 0.0836, -0.1050, 0.0676, -0.0237, 0.0387, -0.2564, | ||
| 0.1891, 0.1305, -0.3239, -0.1312, 0.2723, 0.0745, 0.1335, 0.0302, | ||
| 0.0172, 0.2207, 0.0215, -0.0379, -0.1954, 0.4944, 0.2905, -0.0306, | ||
| 0.2858, 0.2341, 0.0545, 0.4626, 0.2947, 0.3802, 0.2820, 0.1557, | ||
| 0.1743, -0.1410, 0.0986, 0.4751, -0.2146, 0.3530, -0.2357, -0.5626, | ||
| -0.0617, 0.2190, 0.0992, -0.2365, 0.3726, 0.2092, 0.1660, 0.1928, | ||
| 0.5731, -0.1734, -0.0816, -0.3191, -0.1871, -0.2217, -0.0112, 0.1261, | ||
| 0.1601, 0.3835, 0.0451, -0.1927, -0.1116, 0.2204, -0.0379, -0.0094, | ||
| -0.0455, -0.4831, -0.3345, -0.2119, 0.4803, 0.1214, 0.1723, 0.2605, | ||
| 0.0051, -0.2587, 0.0511, -0.1318, 0.0227, -0.0645, 0.2573, -0.0205, | ||
| 0.0665, -0.3562, -0.6070, 0.4191, 0.0351, 0.2033, -0.5508, -0.1415, | ||
| -0.1249, -0.0986, -0.1120, -0.1187, 0.0600, 0.1974, 0.5017, -0.0247, | ||
| -0.2986, 0.3983, -0.1159, -0.4275, -0.0164, -0.3783, 0.0717, 0.1478, | ||
| -0.1144, 0.2292, 0.2741, 0.4309, -0.1611, 0.0755, -0.0981, 0.4584, | ||
| -0.2061, -0.0787, -0.1779, 0.2275, -0.1742, -0.2230, -0.1739, 0.0646 | ||
| }; |
There was a problem hiding this comment.
The voice embedding vector is constructed on every request, which adds significant per-request allocation/copy overhead. Consider making it a static const (or loading it once into KokoroServable / node options) and reusing it across Process() calls.
| #include <algorithm> | ||
| #include <cstdint> | ||
| #include <fstream> | ||
| #include <mutex> | ||
| #include <string> | ||
| #include <unordered_map> | ||
| #include <vector> |
There was a problem hiding this comment.
std::isspace is used but isn’t included. Relying on transitive includes is fragile; add #include to ensure this compiles consistently across toolchains.
| const RerankServableMap& rerankServableMap, | ||
| const SttServableMap& sttServableMap, | ||
| const TtsServableMap& ttsServableMap, | ||
| const KokoroServableMap& kokoroServableMap, |
There was a problem hiding this comment.
The MediapipeGraphExecutor constructor signature now requires a KokoroServableMap, but there are existing call sites (e.g., in src/test/mediapipeflow_test.cpp and src/test/pythonnode_test.cpp) that still call the old overload and will fail to compile. Update those call sites to pass an empty kokoro map, or add a backward-compatible overload/default to avoid breaking existing users.
| const KokoroServableMap& kokoroServableMap, | |
| const KokoroServableMap& kokoroServableMap = KokoroServableMap(), |
| return StatusCode::LLM_NODE_NAME_ALREADY_EXISTS; | ||
| } | ||
| mediapipe::KokoroCalculatorOptions nodeOptions; | ||
| config.node(i).node_options(0).UnpackTo(&nodeOptions); |
There was a problem hiding this comment.
Kokoro node options are unpacked without checking the return value. If the node_options contains the wrong type (or can’t be unpacked), nodeOptions will be left default-initialized and the graph may crash or misconfigure silently. Please check UnpackTo() and return MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID (as done for STT/TTS) on failure.
| config.node(i).node_options(0).UnpackTo(&nodeOptions); | |
| auto& calculatorOptions = config.node(i).node_options(0); | |
| if (!calculatorOptions.UnpackTo(&nodeOptions)) { | |
| SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options"); | |
| return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID; | |
| } |
🛠 Summary
JIRA/Issue if applicable.
Describe the changes.
🧪 Checklist
``