Skip to content

Comments

Initiali suppor of Kokoro model#3997

Open
michalkulakowski wants to merge 1 commit intomainfrom
mkulakow/kokoro_poc
Open

Initiali suppor of Kokoro model#3997
michalkulakowski wants to merge 1 commit intomainfrom
mkulakow/kokoro_poc

Conversation

@michalkulakowski
Copy link
Collaborator

🛠 Summary

JIRA/Issue if applicable.
Describe the changes.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

Copilot AI review requested due to automatic review settings February 19, 2026 16:05
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds initial integration of a new “Kokoro” MediaPipe calculator/servable to enable Kokoro-based audio generation, wiring it through graph side-packets, logging, build rules, and container dependencies.

Changes:

  • Introduces KokoroCalculator (+ proto/options) and KokoroServable implementation.
  • Wires Kokoro into MediaPipe graph initialization and executor side-packets.
  • Adds Kokoro logger and installs espeak-ng dependencies in Dockerfiles.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/mediapipe_internal/mediapipegraphexecutor.hpp Adds Kokoro side-packet tag and extends constructor to accept KokoroServableMap.
src/mediapipe_internal/mediapipegraphexecutor.cpp Stores Kokoro map in sidePacketMaps and defines the new side-packet tag string.
src/mediapipe_internal/mediapipegraphdefinition.hpp Adds Kokoro includes/types and extends GraphSidePackets to carry Kokoro servables.
src/mediapipe_internal/mediapipegraphdefinition.cpp Adds Kokoro node detection and Kokoro servable instantiation during graph initialization.
src/logging.hpp Declares kokoro_calculator_logger.
src/logging.cpp Defines/registers/configures kokoro_calculator_logger.
src/audio/kokoro/kokoro_servable.hpp New Kokoro servable wrapper (OpenVINO model + vocab loading + eSpeak init).
src/audio/kokoro/kokoro_calculator.proto New MediaPipe calculator options for Kokoro.
src/audio/kokoro/kokoro_calculator.cc New MediaPipe calculator implementation performing phonemization, tokenization, inference, and WAV output.
src/audio/kokoro/BUILD Bazel targets for Kokoro servable + calculator + proto.
src/audio/audio_utils.hpp Adds prepareAudioOutputKokoro declaration.
src/audio/audio_utils.cpp Implements prepareAudioOutputKokoro (24 kHz WAV writing).
src/BUILD Links kokoro_calculator into the main build when MediaPipe is enabled.
Dockerfile.ubuntu Installs espeak-ng / dev libs (and adds data package in GPU-driver install step).
Dockerfile.redhat Installs espeak-ng / devel packages (and installs runtime again later).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +646 to +648
config.node(i).node_options(0).UnpackTo(&nodeOptions);
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable)));
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KokoroServable construction can throw (e.g., OpenVINO read_model/compile_model exceptions). Unlike the TTS branch, this path doesn’t catch exceptions, which can terminate the server during graph load. Wrap KokoroServable creation in try/catch (ov::Exception/std::runtime_error) and return a config/initialization error with a clear log message.

Copilot uses AI. Check for mistakes.
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header uses std::filesystem::path and std::max but doesn’t include the required standard headers ( and ). This can fail to compile depending on transitive includes—please add the missing includes explicitly.

Suggested change
#include <vector>
#include <vector>
#include <filesystem>
#include <algorithm>

Copilot uses AI. Check for mistakes.
Comment on lines +646 to +649
config.node(i).node_options(0).UnpackTo(&nodeOptions);
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable)));
kokoroServablesCleaningGuard.disableCleaning();
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kokoro support adds a new calculator + graph node initialization path, but there are no unit tests covering Kokoro graph validation/initialization or request processing (there are similar tests for TTS/STT). Adding at least config validation tests (missing options/name, duplicate name, invalid options type) would help prevent regressions.

Suggested change
config.node(i).node_options(0).UnpackTo(&nodeOptions);
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable)));
kokoroServablesCleaningGuard.disableCleaning();
auto& calculatorOptions = config.node(i).node_options(0);
if (!calculatorOptions.UnpackTo(&nodeOptions)) {
SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options");
return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID;
}
try {
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
kokoroServableMap.insert(std::pair<std::string, std::shared_ptr<KokoroServable>>(nodeName, std::move(servable)));
kokoroServablesCleaningGuard.disableCleaning();
} catch (const std::runtime_error& e) {
SPDLOG_LOGGER_ERROR(modelmanager_logger, "Kokoro node name: {} initialization failed: {}. ", nodeName, e.what());
return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID;
}

Copilot uses AI. Check for mistakes.
Comment on lines +646 to +647
config.node(i).node_options(0).UnpackTo(&nodeOptions);
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

KokoroCalculatorOptions defines plugin_config, but it isn’t used when constructing KokoroServable (only models_path/target_device are passed). Either wire plugin_config through (and parse it into ov::AnyMap like other servables do) or remove the option to avoid a misleading/unused config field.

Suggested change
config.node(i).node_options(0).UnpackTo(&nodeOptions);
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(nodeOptions.models_path(), nodeOptions.target_device(), mgconfig.getBasePath());
auto& calculatorOptions = config.node(i).node_options(0);
if (!calculatorOptions.UnpackTo(&nodeOptions)) {
SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options");
return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID;
}
std::shared_ptr<KokoroServable> servable = std::make_shared<KokoroServable>(
nodeOptions.models_path(),
nodeOptions.target_device(),
nodeOptions.plugin_config(),
mgconfig.getBasePath());

Copilot uses AI. Check for mistakes.
drwav_uninit(&wav);
timer.stop(OUTPUT_PREPARATION);
auto outputPreparationTime = (timer.elapsed<std::chrono::microseconds>(OUTPUT_PREPARATION)) / 1000;
SPDLOG_LOGGER_DEBUG(t2s_calculator_logger, "Output preparation time: {} ms", outputPreparationTime);
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepareAudioOutputKokoro logs using t2s_calculator_logger, which makes Kokoro timing logs appear under the wrong component. Use kokoro_calculator_logger (or pass a logger in) so logs are attributable to the correct calculator.

Suggested change
SPDLOG_LOGGER_DEBUG(t2s_calculator_logger, "Output preparation time: {} ms", outputPreparationTime);
SPDLOG_LOGGER_DEBUG(kokoro_calculator_logger, "Output preparation time: {} ms", outputPreparationTime);

Copilot uses AI. Check for mistakes.
Comment on lines +232 to +266
// Voice embedding
std::vector<float> voice = {
-0.2296, 0.1835, -0.0069, -0.1240, -0.2505, 0.0112, -0.0759, -0.1650,
-0.2665, -0.1965, 0.0242, -0.1667, 0.3524, 0.2140, 0.3069, -0.3377,
-0.0878, -0.0477, 0.0813, -0.2135, -0.2340, -0.1971, 0.0200, 0.0145,
0.0016, 0.2596, -0.2665, 0.1434, 0.0503, 0.0867, 0.1905, -0.1281,
0.0658, -0.0639, -0.0920, 0.2444, -0.1506, -0.2197, 0.1385, 0.2133,
-0.0755, -0.0188, -0.0142, 0.2301, -0.0776, -0.0748, 0.0172, 0.0430,
-0.1009, 0.1519, 0.1137, 0.0641, 0.2264, 0.1911, -0.0205, 0.2578,
0.2210, -0.0784, -0.0235, -0.0547, 0.2191, -0.1623, -0.2416, 0.0076,
0.0574, 0.2186, 0.0080, 0.0473, 0.0972, 0.0286, 0.1324, 0.0686,
0.2652, -0.2237, -0.0980, -0.1693, -0.1866, 0.2273, 0.2008, -0.0683,
0.0957, 0.0623, -0.1891, 0.1620, 0.1811, -0.0516, -0.0800, -0.1416,
-0.2374, -0.1892, 0.1726, -0.0690, -0.0300, 0.0467, -0.2811, -0.1603,
0.0342, -0.1054, -0.0604, -0.0475, -0.0908, -0.1286, 0.1105, -0.1186,
0.0582, 0.1887, 0.0345, 0.2081, 0.1404, -0.2532, 0.0026, 0.0402,
0.0812, -0.0512, 0.0128, 0.0084, -0.0970, -0.0362, 0.0036, -0.0720,
-0.0850, 0.0221, -0.1037, 0.0569, 0.0187, -0.0649, -0.0288, -0.1795,
0.0045, 0.2535, 0.6751, 0.1578, -0.0966, 0.1516, 0.2109, 0.2033,
-0.2155, -0.1783, 0.0836, -0.1050, 0.0676, -0.0237, 0.0387, -0.2564,
0.1891, 0.1305, -0.3239, -0.1312, 0.2723, 0.0745, 0.1335, 0.0302,
0.0172, 0.2207, 0.0215, -0.0379, -0.1954, 0.4944, 0.2905, -0.0306,
0.2858, 0.2341, 0.0545, 0.4626, 0.2947, 0.3802, 0.2820, 0.1557,
0.1743, -0.1410, 0.0986, 0.4751, -0.2146, 0.3530, -0.2357, -0.5626,
-0.0617, 0.2190, 0.0992, -0.2365, 0.3726, 0.2092, 0.1660, 0.1928,
0.5731, -0.1734, -0.0816, -0.3191, -0.1871, -0.2217, -0.0112, 0.1261,
0.1601, 0.3835, 0.0451, -0.1927, -0.1116, 0.2204, -0.0379, -0.0094,
-0.0455, -0.4831, -0.3345, -0.2119, 0.4803, 0.1214, 0.1723, 0.2605,
0.0051, -0.2587, 0.0511, -0.1318, 0.0227, -0.0645, 0.2573, -0.0205,
0.0665, -0.3562, -0.6070, 0.4191, 0.0351, 0.2033, -0.5508, -0.1415,
-0.1249, -0.0986, -0.1120, -0.1187, 0.0600, 0.1974, 0.5017, -0.0247,
-0.2986, 0.3983, -0.1159, -0.4275, -0.0164, -0.3783, 0.0717, 0.1478,
-0.1144, 0.2292, 0.2741, 0.4309, -0.1611, 0.0755, -0.0981, 0.4584,
-0.2061, -0.0787, -0.1779, 0.2275, -0.1742, -0.2230, -0.1739, 0.0646
};
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The voice embedding vector is constructed on every request, which adds significant per-request allocation/copy overhead. Consider making it a static const (or loading it once into KokoroServable / node options) and reusing it across Process() calls.

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +22
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::isspace is used but isn’t included. Relying on transitive includes is fragile; add #include to ensure this compiles consistently across toolchains.

Copilot uses AI. Check for mistakes.
const RerankServableMap& rerankServableMap,
const SttServableMap& sttServableMap,
const TtsServableMap& ttsServableMap,
const KokoroServableMap& kokoroServableMap,
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MediapipeGraphExecutor constructor signature now requires a KokoroServableMap, but there are existing call sites (e.g., in src/test/mediapipeflow_test.cpp and src/test/pythonnode_test.cpp) that still call the old overload and will fail to compile. Update those call sites to pass an empty kokoro map, or add a backward-compatible overload/default to avoid breaking existing users.

Suggested change
const KokoroServableMap& kokoroServableMap,
const KokoroServableMap& kokoroServableMap = KokoroServableMap(),

Copilot uses AI. Check for mistakes.
return StatusCode::LLM_NODE_NAME_ALREADY_EXISTS;
}
mediapipe::KokoroCalculatorOptions nodeOptions;
config.node(i).node_options(0).UnpackTo(&nodeOptions);
Copy link

Copilot AI Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kokoro node options are unpacked without checking the return value. If the node_options contains the wrong type (or can’t be unpacked), nodeOptions will be left default-initialized and the graph may crash or misconfigure silently. Please check UnpackTo() and return MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID (as done for STT/TTS) on failure.

Suggested change
config.node(i).node_options(0).UnpackTo(&nodeOptions);
auto& calculatorOptions = config.node(i).node_options(0);
if (!calculatorOptions.UnpackTo(&nodeOptions)) {
SPDLOG_LOGGER_ERROR(modelmanager_logger, "Failed to unpack calculator options");
return StatusCode::MEDIAPIPE_GRAPH_CONFIG_FILE_INVALID;
}

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant