Parakeet.js

Browser speech-to-text for NVIDIA Parakeet ONNX models.

parakeet.js runs fully in the browser with onnxruntime-web. It can use WebGPU for the encoder and WASM for the decoder, so apps can transcribe audio without sending it to a server.

Install

npm i parakeet.js

Quick Start

import { fromHub } from 'parakeet.js';

const model = await fromHub('parakeet-tdt-0.6b-v3', {
  backend: 'webgpu',
  encoderQuant: 'fp32',
  decoderQuant: 'int8',
});

const result = await model.transcribe(pcm, 16000, {
  returnTimestamps: true,
  returnConfidences: true,
});

console.log(result.utterance_text);

pcm is mono Float32Array audio. The sample rate should be 16000. In a browser app, decode files with the Web Audio API or your existing audio pipeline before calling transcribe.

For a complete React example, see examples/demo.

What It Supports

Client-side transcription in the browser.
WebGPU and WASM execution through ONNX Runtime Web.
Hugging Face model loading with IndexedDB caching.
Local or self-hosted model files via explicit URLs.
Timestamp and confidence output when requested.
Long-audio chunking with sentence-aware merge behavior.
Stateful streaming helpers for live transcription apps.

Loading Models

The easiest path is fromHub:

import { fromHub } from 'parakeet.js';

const model = await fromHub('parakeet-tdt-0.6b-v3', {
  backend: 'webgpu',
  encoderQuant: 'fp32',
  decoderQuant: 'int8',
  preprocessorBackend: 'js',
});

Use fromUrls when you host the files yourself:

import { fromUrls } from 'parakeet.js';

const model = await fromUrls({
  encoderUrl: '/models/encoder-model.onnx',
  decoderUrl: '/models/decoder_joint-model.int8.onnx',
  tokenizerUrl: '/models/vocab.txt',
  backend: 'webgpu',
  preprocessorBackend: 'js',
});

If your ONNX model uses external data, pass the matching .data URL too:

const model = await fromUrls({
  encoderUrl: '/models/encoder-model.onnx',
  encoderDataUrl: '/models/encoder-model.onnx.data',
  decoderUrl: '/models/decoder_joint-model.int8.onnx',
  tokenizerUrl: '/models/vocab.txt',
  backend: 'webgpu',
});

Backends

backend: 'webgpu' is the recommended browser mode. It runs the encoder on WebGPU and the decoder on WASM.

Available backend values:

webgpu
webgpu-hybrid (kept for compatibility; same behavior as webgpu)
webgpu-strict
wasm

Quantization options are fp32, fp16, and int8, depending on which files exist in the model repo. In WebGPU modes, int8 encoder requests are upgraded to fp32 because the encoder path does not support int8 there.

preprocessorBackend: 'js' is the default and usually the best choice. Use preprocessorBackend: 'onnx' only when you specifically want the ONNX preprocessor file.

Transcribing

Short audio:

const result = await model.transcribe(pcm, 16000);
console.log(result.utterance_text);

Timestamps and confidences:

const result = await model.transcribe(pcm, 16000, {
  returnTimestamps: true,
  returnConfidences: true,
});

console.log(result.words);

Long audio:

const result = await model.transcribeLongAudio(pcm, 16000, {
  returnTimestamps: true,
});

console.log(result.text);
console.log(result.chunks);

Streaming:

const transcriber = model.createStreamingTranscriber();
const partial = await transcriber.pushAudioChunk(chunkPcm, 16000);

For the full option and result types, use the published API docs or the TypeScript declarations in types/.

Demo Apps

examples/demo: current development demo. Use it to test local source and npm-package behavior.
compat-tests/demo-v*: older demo snapshots. These help catch breaking changes against previous app code.
Keet: real-time reference app built on parakeet.js.

Common demo commands:

cd examples/demo
npm install
npm run dev:local  # use local repo source
npm run dev        # use package dependency

Development

npm install
npm test
npm run verify:frame-copy
npm run docs:api

The project keeps behavior checks in tests/ and browser/manual checks in examples/demo and compat-tests/.

Notes

WebGPU requires a browser/runtime with WebGPU enabled.
Multithreaded WASM needs cross-origin isolation (COOP/COEP) in deployed apps.
Hugging Face model files are cached in IndexedDB for faster reloads.
If browser model cache becomes stale, the hub loader validates cached blobs and redownloads when needed.

License

MIT

Credits

Thanks to istupakov/onnx-asr for the reference implementation and model tooling foundations.

Name		Name	Last commit message	Last commit date
Latest commit History 275 Commits
.github		.github
.jules		.jules
compat-tests		compat-tests
examples/demo		examples/demo
metrics		metrics
reports		reports
src		src
tests		tests
types		types
.coderabbit.yaml		.coderabbit.yaml
.gitattributes		.gitattributes
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
typedoc.json		typedoc.json
vitest.config.js		vitest.config.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Parakeet.js

Install

Quick Start

What It Supports

Loading Models

Backends

Transcribing

Demo Apps

Development

Notes

License

Credits

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Parakeet.js

Install

Quick Start

What It Supports

Loading Models

Backends

Transcribing

Demo Apps

Development

Notes

License

Credits

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages