Python Wake Word / Keyword Detection — API Reference

Overview

This library provides real-time wake word (keyword) detection for Python applications. It supports:

Internal audio mode — the library captures microphone audio for you.
External audio mode — you capture audio yourself and feed frames to the library.
File-based detection — run detection on .wav files.
Voice Activity Detection (VAD) — standalone speech probability estimation.
Noise detection — ambient noise level classification.

Dependencies

keyword_detection — the DaVoice keyword detection library.
asyncio — standard library for async programming.
threading — standard library for thread management.
For external audio mode: pyaudio and numpy.

Quick Start

import asyncio
import threading
from keyword_detection import KeywordDetection

def detection_callback(params):
    print(f"Detected: {params['phrase']}  scores: {params['threshold_scores']}")

async def main():
    models = [
        {
            "model_path": "models/your_wake_word.onnx",
            "callback_function": detection_callback,
            "threshold": 0.9,
            "buffer_cnt": 4,
            "wait_time": 50
        }
    ]

    kw = KeywordDetection(keyword_models=models)

    with open("licensekey.txt") as f:
        kw.set_keyword_detection_license(f.read().strip())

    thread = threading.Thread(
        target=kw.start_keyword_detection,
        kwargs={"enable_vad": False, "buffer_ms": 100}
    )
    thread.start()
    thread.join()

if __name__ == "__main__":
    asyncio.run(main())

Class: `KeywordDetection`

Constructor

KeywordDetection(keyword_models=keyword_detection_models)

Creates a keyword detection instance.

Parameter: keyword_models — a list of model configuration dictionaries. Each dictionary accepts:

Key	Type	Required	Description
`model_path`	`str`	Yes	Path to the `.onnx` model file
`callback_function`	`callable`	Yes	Function invoked when the wake word is detected
`threshold`	`float`	Yes	Detection sensitivity (`0.0` – `1.0`). Higher = fewer false positives
`buffer_cnt`	`int`	Yes	Number of sub-models that run prediction on each audio buffer. Higher = fewer false positives
`wait_time`	`int`	No	Wait time in milliseconds between inferences (default varies)

You can supply multiple model dictionaries to detect several wake words simultaneously.

Example:

keyword_detection_models = [
    {
        "model_path": "models/hey_assistant.onnx",
        "callback_function": detection_callback,
        "threshold": 0.9,
        "buffer_cnt": 4,
        "wait_time": 50
    },
    {
        "model_path": "models/ok_assistant.onnx",
        "callback_function": detection_callback,
        "threshold": 0.8,
        "buffer_cnt": 3,
        "wait_time": 50
    }
]

keyword_model = KeywordDetection(keyword_models=keyword_detection_models)
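
Because every dictionary repeats the same keys, a small factory can catch typos and out-of-range values before the detector is constructed. The sketch below is a hypothetical helper (`make_model_config` is not part of the library). It checks only what the table above states, plus the assumption that `buffer_cnt` must be at least 1, and it says nothing about how the library itself validates its input.

```python
def make_model_config(model_path, callback, threshold=0.9, buffer_cnt=4, wait_time=None):
    """Build one model configuration dictionary (hypothetical helper, not a library API)."""
    if not model_path.endswith(".onnx"):
        raise ValueError(f"expected an .onnx model path, got {model_path!r}")
    if not callable(callback):
        raise TypeError("callback must be callable")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if buffer_cnt < 1:  # assumption: at least one sub-model is needed
        raise ValueError("buffer_cnt must be a positive integer")

    config = {
        "model_path": model_path,
        "callback_function": callback,
        "threshold": threshold,
        "buffer_cnt": buffer_cnt,
    }
    # wait_time is optional, so it is only included when the caller sets it.
    if wait_time is not None:
        config["wait_time"] = wait_time
    return config
```

A list built from several `make_model_config(...)` calls can then be passed as `keyword_models`.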

Properties

Property	Type	Description
`keyword_models_names`	`list[str]`	List of loaded model names (derived from model paths)
`is_listening`	`bool`	`True` when wake word detection is active and accepting audio
`is_listening_vad_stand_alone`	`bool`	`True` when standalone VAD is active and accepting audio

License

`set_keyword_detection_license(license_key)`

Sets the license key required to use the library.

Parameter	Type	Description
`license_key`	`str`	Your DaVoice license key

with open("licensekey.txt", "r") as file:
    license_key = file.read().strip()

keyword_model.set_keyword_detection_license(license_key)

Contact info@davoice.io to obtain a license key.
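
If the key should not live next to the source code, it can be read from an environment variable first and from the file only as a fallback. This is a minimal sketch: both the `load_license_key` helper and the `DAVOICE_LICENSE_KEY` variable name are hypothetical, and the library only ever sees the final string.

```python
import os
from pathlib import Path

def load_license_key(path="licensekey.txt", env_var="DAVOICE_LICENSE_KEY"):
    """Return the license key from an environment variable or a file.

    The helper and the default variable name are illustrative assumptions,
    not part of the keyword_detection package.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        file_path = Path(path)
        if not file_path.is_file():
            raise FileNotFoundError(f"license file not found: {file_path}")
        key = file_path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError("license key is empty")
    return key
```

The result can be passed straight to `keyword_model.set_keyword_detection_license(load_license_key())`.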

Callbacks

Detection Callback

The callback function you provide in callback_function receives a single params dictionary:

Key	Type	Description
`phrase`	`str`	The detected wake word / phrase
`threshold_scores`	`list[float]`	Array of detection confidence scores
`version`	`str` (optional)	Model version identifier

def detection_callback(params):
    phrase = params["phrase"]
    threshold_scores = params["threshold_scores"]
    version = params.get("version", "N/A")
    print(f"Detected: {phrase}  scores={threshold_scores}  version={version}")
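
This reference does not state which thread invokes the callback. Assuming it runs on the detection thread, it is safer to keep the callback short and hand the event to another thread. The sketch below does that with the standard `queue.Queue`; the `detections` queue and the `next_detection` helper are illustrative names, not library features.

```python
import queue

# Thread-safe hand-off point between the detection thread and the consumer.
detections = queue.Queue()

def detection_callback(params):
    """Copy the fields of interest and hand them to another thread."""
    detections.put({
        "phrase": params["phrase"],
        "scores": list(params["threshold_scores"]),
        "version": params.get("version", "N/A"),
    })

def next_detection(timeout=None):
    """Block until a detection arrives; return None if the timeout expires."""
    try:
        return detections.get(timeout=timeout)
    except queue.Empty:
        return None
```

The application's main loop can then call `next_detection(timeout=0.1)` and react to events without doing any work inside the callback itself.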

`set_secondary_callback(keyword_model_name, callback, secondary_threshold)`

Registers a secondary callback that fires when inference scores exceed secondary_threshold but remain below the primary detection threshold. This is useful for logging near-miss detections and collecting audio samples to improve your model.

Parameter	Type	Description
`keyword_model_name`	`str`	Model name (from `keyword_models_names`)
`callback`	`callable`	Callback function (same signature as detection callback)
`secondary_threshold`	`float`	Score threshold to trigger this callback

def lower_threshold_callback(params):
    print(f"Near-detection: {params['phrase']}  scores: {params['threshold_scores']}")

for name in keyword_model.keyword_models_names:
    keyword_model.set_secondary_callback(
        keyword_model_name=name,
        callback=lower_threshold_callback,
        secondary_threshold=0.7  # must sit below the model's primary threshold
    )
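
For collecting near-misses rather than just printing them, a callback factory can append one record per event to a list you own. This is a sketch under the assumption that the secondary callback receives the same `params` dictionary as the detection callback, as the table above states; `make_near_miss_recorder` is a hypothetical name.

```python
import time

def make_near_miss_recorder(records):
    """Return a callback that appends one entry per near-miss to `records`."""
    def on_near_miss(params):
        scores = params["threshold_scores"]
        records.append({
            "time": time.time(),                        # wall-clock timestamp of the event
            "phrase": params["phrase"],
            "max_score": max(scores) if scores else 0.0,
        })
    return on_near_miss
```

Passing `callback=make_near_miss_recorder(near_misses)` to `set_secondary_callback` fills `near_misses` over time, and the list can later be written to disk or inspected to decide whether the primary threshold is too strict.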

Detection Modes

Mode 1: Internal Audio (Built-in Microphone Capture)

The simplest approach — the library handles microphone capture internally.

`start_keyword_detection(enable_vad=False, buffer_ms=100)`

Starts keyword detection using the system microphone. This call blocks, so it should be run in a separate thread.

Parameter	Type	Default	Description
`enable_vad`	`bool`	`False`	Enable built-in Voice Activity Detection
`buffer_ms`	`int`	`100`	Audio buffer size in milliseconds

Full example (see example/example.py for Linux/macOS, example_windows/example.py for Windows):

import asyncio
import threading
from keyword_detection import KeywordDetection

def detection_callback(params):
    phrase = params["phrase"]
    scores = [s for s in params["threshold_scores"] if s != 0]
    print(f"Detected: {phrase}  scores={scores}")

async def main():
    models = [
        {
            "model_path": "models/your_wake_word.onnx",
            "callback_function": detection_callback,
            "threshold": 0.9,
            "buffer_cnt": 4,
            "wait_time": 50
        }
    ]

    keyword_model = KeywordDetection(keyword_models=models)

    with open("licensekey.txt") as f:
        keyword_model.set_keyword_detection_license(f.read().strip())

    thread = threading.Thread(
        target=keyword_model.start_keyword_detection,
        kwargs={"enable_vad": False, "buffer_ms": 100}
    )
    thread.start()
    thread.join()

if __name__ == "__main__":
    asyncio.run(main())
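
The example above joins the detection thread immediately, so `main()` itself blocks for as long as detection runs. When the main thread has other work to do, the blocking call can instead run on a daemon thread. The helper below is a generic sketch (`start_in_background` is a hypothetical name) that works for any blocking callable.

```python
import threading

def start_in_background(blocking_call, **kwargs):
    """Run a blocking call on a daemon thread and return the thread.

    A daemon thread does not keep the interpreter alive, so the program
    can exit even if the blocking call never returns.
    """
    thread = threading.Thread(target=blocking_call, kwargs=kwargs, daemon=True)
    thread.start()
    return thread
```

With a configured detector, this would be called as `start_in_background(keyword_model.start_keyword_detection, enable_vad=False, buffer_ms=100)`. The trade-off is that a daemon thread is stopped abruptly at interpreter exit, so this suits cases where no cleanup is needed.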

Mode 2: External Audio (You Provide Audio Frames)

Use this mode when you need full control over audio capture, for example when reading from a custom source, a network stream, or a shared microphone. Audio frames must be 16-bit PCM, mono, sampled at 16 kHz, and passed as numpy.int16 arrays.
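
Many audio sources deliver floating-point samples rather than 16-bit PCM. The sketch below converts mono float audio in the range [-1.0, 1.0] to int16 and splits it into equal frames. It assumes the audio is already sampled at 16 kHz (no resampling is done) and that fixed 1280-sample frames are wanted, as in this document's examples; `float_to_int16_frames` is a hypothetical helper.

```python
import numpy as np

FRAME_SAMPLES = 1280  # 80 ms at 16 kHz, the frame size used in this document's examples

def float_to_int16_frames(samples, frame_samples=FRAME_SAMPLES):
    """Convert mono float audio in [-1.0, 1.0] to a list of int16 frames.

    Values outside the range are clipped, and the last frame is zero-padded
    so that every frame has the same length.
    """
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim != 1:
        raise ValueError("expected mono audio (a 1-D array)")
    pcm = (np.clip(samples, -1.0, 1.0) * 32767.0).astype(np.int16)
    remainder = len(pcm) % frame_samples
    if remainder:
        padding = np.zeros(frame_samples - remainder, dtype=np.int16)
        pcm = np.concatenate([pcm, padding])
    return [pcm[i:i + frame_samples] for i in range(0, len(pcm), frame_samples)]
```

Each returned frame has the dtype and shape that `feed_audio_frame()` is documented to accept.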

`start_keyword_detection_external_audio(enable_vad=False, buffer_ms=100)`

Initializes the wake word detection engine for external audio. Non-blocking — after calling this, feed audio frames via feed_audio_frame().

Parameter	Type	Default	Description
`enable_vad`	`bool`	`False`	Enable built-in Voice Activity Detection
`buffer_ms`	`int`	`100`	Audio buffer size in milliseconds

`start_vad_external_audio()`

Initializes standalone Voice Activity Detection for external audio. After calling this, feed audio frames via feed_audio_frame_vad().

`feed_audio_frame(audio_frame)`

Feeds a single audio frame for wake word detection.

Parameter	Type	Description
`audio_frame`	`numpy.ndarray` (int16)	A single audio frame (e.g., 1280 samples at 16 kHz = 80 ms)
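
The relationship between frame length and duration is plain arithmetic: duration in milliseconds equals 1000 × samples / sample rate. The two helpers below (hypothetical names) make the conversion explicit for the 16 kHz rate this mode requires.

```python
SAMPLE_RATE = 16000  # Hz, the rate required for external audio frames

def frame_duration_ms(num_samples, sample_rate=SAMPLE_RATE):
    """Duration in milliseconds of a frame with `num_samples` samples."""
    return 1000.0 * num_samples / sample_rate

def samples_for_duration(duration_ms, sample_rate=SAMPLE_RATE):
    """Number of samples that cover `duration_ms` milliseconds."""
    return int(round(sample_rate * duration_ms / 1000.0))

print(frame_duration_ms(1280))    # 80.0
print(samples_for_duration(100))  # 1600
```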

Only feed frames when keyword_model.is_listening is True.

`feed_audio_frame_vad(audio_frame)`

Feeds a single audio frame for standalone VAD.

Parameter	Type	Description
`audio_frame`	`numpy.ndarray` (int16)	A single audio frame

Returns: float — speech probability between 0.0 and 1.0.

Only feed frames when keyword_model.is_listening_vad_stand_alone is True.
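
A raw per-frame probability tends to flicker around any single cutoff. One common way to turn it into stable "speech started" and "speech stopped" events is hysteresis with a short hangover. The class below is a sketch of that idea. `SpeechGate` and its default thresholds are illustrative choices, not library settings, and they would need tuning against real audio.

```python
class SpeechGate:
    """Turn per-frame speech probabilities into start/stop events.

    Speech starts when the probability reaches `on_threshold` and ends only
    after `hangover_frames` consecutive frames below `off_threshold`.
    """

    def __init__(self, on_threshold=0.6, off_threshold=0.3, hangover_frames=5):
        self.on_threshold = on_threshold
        self.off_threshold = off_threshold
        self.hangover_frames = hangover_frames
        self.speaking = False
        self._quiet_frames = 0

    def update(self, speech_prob):
        """Return "start", "stop", or None for the given probability."""
        if not self.speaking:
            if speech_prob >= self.on_threshold:
                self.speaking = True
                self._quiet_frames = 0
                return "start"
            return None
        if speech_prob < self.off_threshold:
            self._quiet_frames += 1
            if self._quiet_frames >= self.hangover_frames:
                self.speaking = False
                return "stop"
        else:
            # Any frame at or above the off threshold resets the hangover count.
            self._quiet_frames = 0
        return None
```

In a capture loop, the value returned by `feed_audio_frame_vad(audio_frame)` would be passed to `gate.update(...)` once per frame.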

`feed_audio_frame_noise_detection(audio_frame, low_noise_margin_db, high_noise_margin_db)`

Feeds a single audio frame for ambient noise level detection. This method can be called at any time (no initialization step required).

Parameter	Type	Description
`audio_frame`	`numpy.ndarray` (int16)	A single audio frame
`low_noise_margin_db`	`int`	Lower dBFS margin for silence/noise boundary
`high_noise_margin_db`	`int`	Upper dBFS margin for noise/loud boundary

Returns: (dbfs, sound_type) — a tuple of the dBFS value and a string classification (e.g., 'silence').
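
For orientation, dBFS expresses a level relative to digital full scale, so 0 dBFS is the loudest possible signal and quieter audio is negative. The sketch below computes the conventional RMS level of a 16-bit frame. It is not the library's implementation: the exact formula the library uses, and how it applies the two margins, are not documented here.

```python
import math

INT16_FULL_SCALE = 32768.0

def rms_dbfs(samples):
    """Return the RMS level of 16-bit PCM samples in dBFS.

    An empty or all-zero frame returns negative infinity.
    """
    if len(samples) == 0:
        return float("-inf")
    # int() avoids overflow when the samples are numpy.int16 scalars.
    mean_square = sum(int(s) * int(s) for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20.0 * math.log10(math.sqrt(mean_square) / INT16_FULL_SCALE)
```

A frame whose samples all sit at half of full scale comes out at roughly -6 dBFS, which is a quick sanity check when comparing against the `dbfs` value the library returns.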

Full External Audio Example

See example/example_external_audio.py for Linux/macOS, example_windows/example_external_audio.py for Windows.

import asyncio
import threading
import pyaudio
import numpy as np
from keyword_detection import KeywordDetection

def detection_callback(params):
    phrase = params["phrase"]
    scores = [s for s in params["threshold_scores"] if s != 0]
    print(f"Detected: {phrase}  scores={scores}")

def mic_dispatcher_thread(keyword_model):
    """Reads microphone audio and dispatches frames to all detection engines."""
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 16000
    CHUNK = 1280

    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT, channels=CHANNELS, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)
    try:
        while True:
            data = stream.read(CHUNK, exception_on_overflow=False)
            audio_frame = np.frombuffer(data, dtype=np.int16)

            # Wake word detection
            if keyword_model.is_listening:
                keyword_model.feed_audio_frame(audio_frame)

            # Standalone VAD
            if keyword_model.is_listening_vad_stand_alone:
                speech_prob = keyword_model.feed_audio_frame_vad(audio_frame)
                if speech_prob > 0.2:
                    print(f"Speech probability: {speech_prob * 100:.1f}%")

            # Noise detection
            dbfs, sound_type = keyword_model.feed_audio_frame_noise_detection(
                audio_frame, low_noise_margin_db=20, high_noise_margin_db=40
            )
            if sound_type != "silence":
                print(f"dBFS={dbfs}  sound={sound_type}")
    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()

async def main():
    models = [
        {
            "model_path": "models/your_wake_word.onnx",
            "callback_function": detection_callback,
            "threshold": 0.9,
            "buffer_cnt": 4,
            "wait_time": 50
        }
    ]

    keyword_model = KeywordDetection(keyword_models=models)

    with open("licensekey.txt") as f:
        keyword_model.set_keyword_detection_license(f.read().strip())

    # Initialize external audio detection (non-blocking)
    keyword_model.start_keyword_detection_external_audio(enable_vad=False, buffer_ms=100)

    # Initialize standalone VAD (non-blocking)
    keyword_model.start_vad_external_audio()

    # Start feeding audio in a thread
    thread = threading.Thread(target=mic_dispatcher_thread, args=(keyword_model,))
    thread.start()
    thread.join()

if __name__ == "__main__":
    asyncio.run(main())
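
The dispatcher in the example above loops forever, so the program can only be stopped by killing it. A `threading.Event` gives the loop a clean exit. The sketch below separates the loop from the audio source so that it can be exercised without a microphone. `dispatch_frames`, `start_dispatcher`, and the convention that `read_frame` returns `None` at end of input are all assumptions made for illustration.

```python
import threading

def dispatch_frames(keyword_model, read_frame, stop_event):
    """Feed frames to the detector until `stop_event` is set.

    `read_frame` is any zero-argument callable returning one int16 frame,
    or None when the source is exhausted (a hypothetical convention).
    Returns the number of frames that were fed.
    """
    fed = 0
    while not stop_event.is_set():
        frame = read_frame()
        if frame is None:
            break
        if keyword_model.is_listening:
            keyword_model.feed_audio_frame(frame)
            fed += 1
    return fed

def start_dispatcher(keyword_model, read_frame):
    """Start the loop on its own thread; return (thread, stop_event)."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=dispatch_frames,
        args=(keyword_model, read_frame, stop_event),
    )
    thread.start()
    return thread, stop_event
```

With PyAudio, `read_frame` could wrap `stream.read(...)` and `np.frombuffer(...)` as in the example above. Calling `stop_event.set()` followed by `thread.join()` then lets the stream be closed in an orderly way.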

Mode 3: File-Based Detection

`start_keyword_detection_from_file(file_path)`

Runs wake word detection on a .wav file (useful for testing and benchmarking).

Parameter	Type	Description
`file_path`	`str`	Path to a `.wav` audio file

Returns: dict — detection results keyed by model name:

{
    "model_name": {
        "detections": 2,   # number of detections in the file
        ...
    }
}

Example:

output = keyword_model.start_keyword_detection_from_file("test_audio.wav")
for model_name, result in output.items():
    print(f"{model_name}: {result.get('detections', 0)} detections")
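
For benchmarking, the same call can be repeated over a set of recordings and the counts summed per model. The helper below is a sketch that relies only on the return shape documented above (a dict keyed by model name with a `detections` count); `count_detections` is a hypothetical name.

```python
def count_detections(keyword_model, wav_paths):
    """Run file-based detection on each path and total detections per model."""
    totals = {}
    for path in wav_paths:
        output = keyword_model.start_keyword_detection_from_file(path)
        for model_name, result in output.items():
            totals[model_name] = totals.get(model_name, 0) + result.get("detections", 0)
    return totals
```

Running it once over recordings that contain the wake word and once over recordings that do not gives a rough picture of missed detections and false positives for a given `threshold`.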

API Summary

Method	Mode	Blocking	Description
`KeywordDetection(keyword_models=...)`	—	—	Constructor
`set_keyword_detection_license(key)`	—	—	Set license key
`set_secondary_callback(name, cb, threshold)`	—	—	Register near-detection callback
`start_keyword_detection(enable_vad, buffer_ms)`	Internal	Yes	Start detection with built-in mic capture
`start_keyword_detection_external_audio(enable_vad, buffer_ms)`	External	No	Initialize detection for external audio
`start_vad_external_audio()`	External	No	Initialize standalone VAD
`feed_audio_frame(frame)`	External	No	Feed audio for wake word detection
`feed_audio_frame_vad(frame)`	External	No	Feed audio for VAD; returns speech probability
`feed_audio_frame_noise_detection(frame, low_db, high_db)`	External	No	Feed audio for noise detection; returns `(dBFS, type)`
`start_keyword_detection_from_file(path)`	File	Yes	Run detection on a `.wav` file

Property	Type	Description
`keyword_models_names`	`list[str]`	Loaded model names
`is_listening`	`bool`	Wake word detection is active
`is_listening_vad_stand_alone`	`bool`	Standalone VAD is active

Examples

File	Platform	Mode
example/example.py	Linux / macOS	Internal audio
example/example_external_audio.py	Linux / macOS	External audio
example_windows/example.py	Windows	Internal audio
example_windows/example_external_audio.py	Windows	External audio

Contact

For questions, custom wake word models, or additional platform support, contact info@davoice.io.
