aximo is a CPU-first STT microservice for Russian and English built as a Rust Cargo workspace. It exposes:
- POST /v1/transcriptions for short audio
- GET /v1/realtime for realtime WebSocket streaming
- GET /openapi.json for the OpenAPI schema
- GET /docs/ for Swagger UI
The workspace crates are:
- crates/aximo: HTTP and WebSocket service binary
- crates/aximo-core: scheduler and shared STT domain types
- crates/aximo-inference: transcribe-rs adapters for local CPU models
- crates/aximo-audio: audio helpers
Architecture and protocol details live in:
Models are runtime artifacts and must live outside git. The service expects a model root directory configured via config/aximo.example.toml.
Compatible model bundles for the current transcribe-rs integration:
- Parakeet int8 ONNX bundle: blob.handy.computer/parakeet-v3-int8.tar.gz
- Parakeet int8 ONNX bundle on Hugging Face: istupakov/parakeet-tdt-0.6b-v3-onnx
- GigaAM v3 ONNX bundle on Hugging Face: istupakov/gigaam-v3-onnx
Example layout:
/var/lib/aximo/models/
├── parakeet-tdt-0.6b-v3-int8/
└── giga-am-v3/
Download the default Parakeet model bundle:
just setup-models
or directly:
./scripts/fetch-models.sh
After the model is downloaded to ./var/models:
docker compose up --build
This uses docker-compose.yml, mounts ./var/models into the container, and serves the API on http://127.0.0.1:8080.
For local non-Docker usage, use config/aximo.local.toml, which points to ./var/models:
AXIMO_CONFIG=config/aximo.local.toml cargo run -p aximo
For containerized usage, config/aximo.example.toml remains the default and expects models at /var/lib/aximo/models.
Short transcription currently accepts:
- audio/wav
- audio/pcm
- application/octet-stream
audio/pcm and application/octet-stream are interpreted as raw pcm_s16le, 16 kHz, mono audio.
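Clients that hold audio as floating-point samples (for example, from the Web Audio API) have to produce this raw format themselves. The following is a minimal sketch, not part of aximo: floatToPcm16le and pcmDurationMs are hypothetical helper names, and the input is assumed to be already resampled to 16 kHz mono with values in the range [-1, 1].

```javascript
// Hypothetical helpers, not part of aximo. Input is assumed to be
// 16 kHz mono samples in the range [-1, 1].
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2; // pcm_s16le: signed 16-bit, little-endian

// Converts float samples to raw pcm_s16le bytes.
function floatToPcm16le(samples) {
  const out = new Uint8Array(samples.length * BYTES_PER_SAMPLE);
  const view = new DataView(out.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp, then scale so that -1 maps to -32768 and 1 maps to 32767.
    const s = Math.max(-1, Math.min(1, samples[i]));
    const v = Math.round(s < 0 ? s * 32768 : s * 32767);
    view.setInt16(i * BYTES_PER_SAMPLE, v, true); // true = little-endian
  }
  return out;
}

// Duration in milliseconds of a raw pcm_s16le, 16 kHz, mono buffer.
function pcmDurationMs(byteLength) {
  return (byteLength / BYTES_PER_SAMPLE / SAMPLE_RATE_HZ) * 1000;
}
```

The resulting bytes can be sent as the request body with content-type audio/pcm. One second of audio in this format is 32000 bytes.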
curl -X POST http://127.0.0.1:8080/v1/transcriptions \
  -H 'content-type: audio/wav' \
  --data-binary @sample.wav
Example response:
{
  "text": "hello world",
  "segments": [],
  "detected_language": "en",
  "engine": "fake",
  "duration_ms": 0,
  "processing_ms": 0
}
Realtime uses WebSocket and raw pcm_s16le, 16 kHz, mono binary chunks.
const ws = new WebSocket("ws://127.0.0.1:8080/v1/realtime");
ws.binaryType = "arraybuffer";
// Log every message the server sends back.
ws.addEventListener("message", (event) => {
  console.log("server:", event.data);
});
ws.addEventListener("open", () => {
  // Control messages are JSON text frames.
  ws.send(JSON.stringify({ event: "start" }));
  // Audio is sent as binary frames: here, four 16-bit little-endian samples (0, 1, 2, 3).
  const pcmChunk = new Uint8Array([0, 0, 1, 0, 2, 0, 3, 0]);
  ws.send(pcmChunk);
  ws.send(JSON.stringify({ event: "stop" }));
});
Expected server events:
- session_started
- partial
- final
- error
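A client usually wants to route each of these events to its own handler. The sketch below is one way to do that. It assumes that server messages are JSON text frames carrying an event field, mirroring the client-side control messages. This README does not specify the payload, so the whole parsed message is passed through untouched. dispatchServerEvent is a hypothetical helper name, not part of aximo.

```javascript
// Assumption: server messages are JSON text frames with an `event` field,
// mirroring the client-side { event: "start" } control messages.
const KNOWN_EVENTS = ["session_started", "partial", "final", "error"];

// Hypothetical helper: parses one frame and calls the matching handler.
// Returns the event name, or null if the frame is not a recognized event.
function dispatchServerEvent(data, handlers) {
  if (typeof data !== "string") return null; // ignore binary frames
  let message;
  try {
    message = JSON.parse(data);
  } catch {
    return null; // not valid JSON
  }
  if (message === null || typeof message !== "object") return null;
  if (!KNOWN_EVENTS.includes(message.event)) return null;
  const handler = handlers[message.event];
  if (typeof handler === "function") handler(message);
  return message.event;
}
```

It can be called from the message listener with event.data as the first argument and an object such as { partial: onPartial, final: onFinal } as the second. Unknown or malformed frames are ignored rather than thrown, so a protocol extension would not break the client.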
After the service starts:
- Swagger UI: http://127.0.0.1:8080/docs/
- OpenAPI JSON: http://127.0.0.1:8080/openapi.json
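When the service runs on a different host or port, all four endpoint URLs can be derived from a single base URL. The helper below is a small sketch of that. aximoEndpoints is a hypothetical name, not part of aximo, and the paths are the ones listed in this README.

```javascript
// Hypothetical helper, not part of aximo: derives the endpoint URLs
// listed in this README from a base URL.
function aximoEndpoints(baseUrl) {
  const http = new URL(baseUrl);
  const ws = new URL(baseUrl);
  // WebSocket URLs use ws:/wss: in place of http:/https:.
  ws.protocol = http.protocol === "https:" ? "wss:" : "ws:";
  return {
    transcriptions: new URL("/v1/transcriptions", http).href,
    realtime: new URL("/v1/realtime", ws).href,
    openapi: new URL("/openapi.json", http).href,
    docs: new URL("/docs/", http).href,
  };
}
```

For the default Docker setup the base URL is http://127.0.0.1:8080, which gives ws://127.0.0.1:8080/v1/realtime as the realtime endpoint.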
Common checks:
just fmt
just lint
just test
just coverage
just setup-models
The publishable library crates are:
- aximo-core
- aximo-audio
- aximo-inference
The aximo service crate is intentionally marked publish = false.
Use just package-libs for the local pre-publish check of aximo-core and aximo-audio. aximo-inference must be dry-run published only after aximo-core is already available in the crates.io index.
Release workflow notes are documented in docs/publishing.md.