Add a wakeword recognition module based on OpenWakeWord #199
reisbauer03 wants to merge 7 commits into dnhkng:main from
Conversation
📝 Walkthrough

Adds OpenWakeWord ONNX wake-word support: new config field and model artifacts, CLI registry entries, engine wiring to pass a wake-word model path, SpeechListener gating via ONNX recognizer (with ASR/text fallback), and a new ONNX-based wake-word implementation plus license.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Audio as Audio Buffer
    participant SL as SpeechListener
    participant WWR as OpenWakeWord Model
    participant ASR as ASR Engine
    participant Cmd as Command Handler
    Audio->>SL: _process_detected_audio(samples)
    SL->>SL: concatenate samples -> audio
    alt wake_word_model_path configured
        SL->>WWR: predict_multi_sample(audio)
        WWR-->>SL: detected:boolean
        alt not detected
            SL->>SL: return early
        else detected
            SL->>ASR: asr(audio)
            ASR-->>SL: transcript
            SL->>Cmd: handle transcript
        end
    else no model configured
        SL->>ASR: asr(audio)
        ASR-->>SL: transcript
        SL->>SL: _wakeword_detected(transcript)?
        alt contains wake word
            SL->>Cmd: handle transcript
        else
            SL->>SL: return early
        end
    end
    SL->>SL: reset() (finally)
```
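The gating flow in the diagram can be sketched roughly as follows. All names here (`process_detected_audio`, `recognizer`, `asr`, `handle`) are illustrative stand-ins for what the diagram shows, not the actual SpeechListener API:

```python
def process_detected_audio(audio, recognizer, asr, handle, wake_word=None):
    """Gate handling of an utterance on wake-word detection.

    If an ONNX recognizer is configured, it decides whether ASR runs at all;
    otherwise ASR runs first and the transcript text is checked for the wake word.
    """
    if recognizer is not None:  # wake_word_model_path configured
        if not recognizer.predict_multi_sample(audio):
            return None  # no wake word in the audio: drop it without running ASR
        transcript = asr(audio)
        handle(transcript)
        return transcript
    # Fallback: run ASR first, then look for the wake word in the text.
    transcript = asr(audio)
    if wake_word is not None and wake_word.lower() not in transcript.lower():
        return None
    handle(transcript)
    return transcript
```

In the fallback branch the ASR cost is paid on every utterance; the ONNX path avoids that by rejecting audio before transcription.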
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🧹 Nitpick comments (6)
src/glados/openwakeword/utils.py (2)
55-58: Use exception chaining for better debugging context. When re-raising an exception with a different message, use `raise ... from err` to preserve the original traceback.

🧹 Suggested fix

```diff
     try:
         import onnxruntime as ort
-    except ImportError:
-        raise ValueError("Tried to import onnxruntime, but it was not found. Please install it using `pip install onnxruntime`")
+    except ImportError as err:
+        raise ValueError("Tried to import onnxruntime, but it was not found. Please install it using `pip install onnxruntime`") from err
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/utils.py` around lines 55 - 58, The ImportError handler in the import block for onnxruntime should preserve the original traceback; update the except clause in utils.py to capture the exception (e.g., "except ImportError as err") and re-raise the ValueError using exception chaining ("raise ValueError(...) from err") so the original ImportError context is preserved when raising the new error.
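The effect of `raise ... from err` is easy to see in isolation. The snippet below is a generic sketch (the `require` helper is hypothetical, not part of the codebase) showing that the original `ImportError` survives as `__cause__` of the new exception:

```python
import importlib

def require(module_name: str):
    """Import a module, converting ImportError into a chained ValueError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # "from err" keeps the original ImportError reachable via __cause__,
        # so the full import traceback survives for debugging.
        raise ValueError(f"{module_name} is required but was not found") from err

try:
    require("definitely_not_installed_xyz")
except ValueError as exc:
    print(type(exc.__cause__).__name__)  # the original ImportError is preserved
```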
42-42: Consider using an explicit `T | None` type annotation. The `onnx_providers` parameter uses an implicit `Optional` (defaulting to `None`). While this works correctly (lines 69-70 handle the `None` case), modern Python style prefers explicit union types.

🧹 Suggested fix

```diff
-    onnx_providers: list[str] = None
+    onnx_providers: list[str] | None = None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/utils.py` at line 42, The parameter onnx_providers currently uses an implicit Optional by being annotated as "list[str] = None"; update its type annotation to an explicit union such as "list[str] | None" (or "Optional[list[str]]" and import Optional from typing for <3.10 compatibility) on the function/constructor that declares onnx_providers so the signature clearly reflects None is allowed while keeping the existing None-handling logic intact.

src/glados/openwakeword/model.py (2)
113-114: Use exception chaining for better debugging context.

🧹 Suggested fix

```diff
-    except ImportError:
-        raise ValueError("Tried to import onnxruntime, but it was not found. Please install it using `pip install onnxruntime`")
+    except ImportError as err:
+        raise ValueError("Tried to import onnxruntime, but it was not found. Please install it using `pip install onnxruntime`") from err
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/model.py` around lines 113 - 114, The except ImportError block currently raises a ValueError without chaining the original ImportError; change the handler to capture the ImportError (e.g., except ImportError as e) and re-raise the ValueError using exception chaining (raise ValueError("Tried to import onnxruntime...") from e) so the original traceback is preserved for debugging in the module-level import in src/glados/openwakeword/model.py.
59-60: Consider replacing mutable default arguments with `None`. Mutable default arguments (`[]`, `{}`) are a known Python anti-pattern. While this code handles them correctly by not mutating the defaults, the pattern can lead to subtle bugs if the code is modified later.

🧹 Suggested fix for `__init__`

```diff
     def __init__(
         self,
         wakeword_models: List[str],
-        class_mapping_dicts: List[dict] = [],
-        custom_verifier_models: dict = {},
+        class_mapping_dicts: List[dict] | None = None,
+        custom_verifier_models: dict | None = None,
         custom_verifier_threshold: float = 0.1,
-        onnx_providers: list[str] = None,
+        onnx_providers: list[str] | None = None,
         **kwargs
     ):
+        if class_mapping_dicts is None:
+            class_mapping_dicts = []
+        if custom_verifier_models is None:
+            custom_verifier_models = {}
```

🧹 Suggested fix for `predict`

```diff
-    def predict(self, x: np.ndarray, patience: dict = {},
-                threshold: dict = {}, debounce_time: float = 0.0, timing: bool = False):
+    def predict(self, x: np.ndarray, patience: dict | None = None,
+                threshold: dict | None = None, debounce_time: float = 0.0, timing: bool = False):
         ...
+        if patience is None:
+            patience = {}
+        if threshold is None:
+            threshold = {}
```

Note: this requires updating checks like `if patience != {}` to check both conditions.

Also applies to: 176-177
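The pitfall this comment guards against is easy to demonstrate in isolation — a mutable default is created once at function definition time and shared by every call:

```python
def broken(history=[]):   # the default list is created once and shared
    history.append("hit")
    return len(history)

def fixed(history=None):  # None sentinel: a fresh list on every call
    if history is None:
        history = []
    history.append("hit")
    return len(history)

print(broken(), broken())  # 1 2  -- state leaks across calls
print(fixed(), fixed())    # 1 1
```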
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/model.py` around lines 59 - 60, The constructor for the model uses mutable default arguments (class_mapping_dicts: List[dict] = [] and custom_verifier_models: dict = {}) which is an anti-pattern; change those defaults to None in __init__ and inside __init__ set self.class_mapping_dicts = class_mapping_dicts or [] and self.custom_verifier_models = custom_verifier_models or {} to preserve current behavior. Update any methods that check those parameters (e.g., predict and any checks that currently use equality like `if patience != {}`) to instead check for None or use truthiness (e.g., `if custom_verifier_models is not None` or `if patience is not None`) so logic remains correct after the default change; update references to class_mapping_dicts and custom_verifier_models accordingly.

src/glados/cli.py (1)
133-143: Update the docstring to include the new OpenWakeWord models. The docstring lists specific models but doesn't include the newly added `melspec.onnx` and `embedding.onnx` models. Based on learnings, this function should have comprehensive documentation including the full model list.

📝 Suggested docstring update

```diff
 async def download_models() -> int:
     """
     Main async controller for downloading all the specified models:
     - ASR model: nemo-parakeet_tdt_ctc_110m.onnx
     - VAD model: silero_vad_16k_op15.onnx
     - TTS model: glados.onnx
     - Phonemizer model: phomenizer_en.onnx
+    - OpenWakeWord models: melspec.onnx, embedding.onnx

     Returns:
         int: Exit code (0 for success, 1 for failure)
     """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/cli.py` around lines 133 - 143, Update the download_models() docstring to reflect all models being downloaded: keep the existing ASR (nemo-parakeet_tdt_ctc_110m.onnx), VAD (silero_vad_16k_op15.onnx), TTS (glados.onnx), and phonemizer (phomenizer_en.onnx) entries and add the new OpenWakeWord models melspec.onnx and embedding.onnx so the docstring lists the complete set of artifacts the function downloads.

src/glados/openwakeword/__init__.py (1)
30-31: Remove extraneous f-string prefixes. These f-strings contain no placeholders.

🧹 Proposed fix

```diff
-    DEFAULT_MELSPEC_MODEL_PATH = resource_path(f"models/openwakeword/melspec.onnx")  # Model path to melspectogram model
-    DEFAULT_EMBEDDING_MODEL_PATH = resource_path(f"models/openwakeword/embedding.onnx")  # Model path to embedding model
+    DEFAULT_MELSPEC_MODEL_PATH = resource_path("models/openwakeword/melspec.onnx")  # Model path to melspectogram model
+    DEFAULT_EMBEDDING_MODEL_PATH = resource_path("models/openwakeword/embedding.onnx")  # Model path to embedding model
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 30 - 31, The constants DEFAULT_MELSPEC_MODEL_PATH and DEFAULT_EMBEDDING_MODEL_PATH use unnecessary f-strings around static literals when calling resource_path; remove the extraneous f-string prefixes so both calls use plain string literals (e.g., call resource_path("models/openwakeword/melspec.onnx") and resource_path("models/openwakeword/embedding.onnx")) to avoid misleading f-string usage while keeping the same resource_path invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 84-106: The bug is that the local variable threshold is
overwritten with a dict and then used in a numeric comparison, causing a
TypeError; fix by keeping the scalar threshold separate and build a dict only
for the call to self._model.predict (e.g., create threshold_dict =
{self.model_name: threshold} or keep threshold_scalar and assign threshold_dict
for the predict call), similarly do the same for patience if needed, then call
self._model.predict(..., threshold=threshold_dict) and compare the returned
array to the original scalar (threshold_scalar) when evaluating > threshold;
ensure you reference self.model_name, self._model.predict, and the local
threshold variable names consistently.
In `@src/glados/openwakeword/model.py`:
- Around line 140-143: The code currently uses pickle.load to deserialize
user-provided custom verifier model files (the custom_verifier_models load block
in model.py), which is unsafe; replace this direct pickle.load with a safer
approach: either (a) require and use a safe, explicit serialization format for
verifier models (e.g., JSON/YAML for simple configs or a validated model format)
or (b) if binary model files are required, perform strict validation before
loading (ensure the path is in an allowed directory, check file
signatures/whitelists, and load inside a sandboxed/isolated process) and log a
clear security warning whenever a custom verifier model is loaded. Update the
code that assigns self.custom_verifier_models[mdl_name] (and related docs) to
perform the validation and emit the warning via the existing logger instead of
calling pickle.load directly.
---
Nitpick comments:
In `@src/glados/cli.py`:
- Around line 133-143: Update the download_models() docstring to reflect all
models being downloaded: keep the existing ASR
(nemo-parakeet_tdt_ctc_110m.onnx), VAD (silero_vad_16k_op15.onnx), TTS
(glados.onnx), and phonemizer (phonemizer_en.onnx) entries and add the new
OpenWakeWord models melspec.onnx and embedding.onnx so the docstring lists the
complete set of artifacts the function downloads.
In `@src/glados/openwakeword/__init__.py`:
- Around line 30-31: The constants DEFAULT_MELSPEC_MODEL_PATH and
DEFAULT_EMBEDDING_MODEL_PATH use unnecessary f-strings around static literals
when calling resource_path; remove the extraneous f-string prefixes so both
calls use plain string literals (e.g., call
resource_path("models/openwakeword/melspec.onnx") and
resource_path("models/openwakeword/embedding.onnx")) to avoid misleading
f-string usage while keeping the same resource_path invocation.
In `@src/glados/openwakeword/model.py`:
- Around line 113-114: The except ImportError block currently raises a
ValueError without chaining the original ImportError; change the handler to
capture the ImportError (e.g., except ImportError as e) and re-raise the
ValueError using exception chaining (raise ValueError("Tried to import
onnxruntime...") from e) so the original traceback is preserved for debugging in
the module-level import in src/glados/openwakeword/model.py.
- Around line 59-60: The constructor for the model uses mutable default
arguments (class_mapping_dicts: List[dict] = [] and custom_verifier_models: dict
= {}) which is an anti-pattern; change those defaults to None in __init__ and
inside __init__ set self.class_mapping_dicts = class_mapping_dicts or [] and
self.custom_verifier_models = custom_verifier_models or {} to preserve current
behavior. Update any methods that check those parameters (e.g., predict and any
checks that currently use equality like `if patience != {}`) to instead check
for None or use truthiness (e.g., `if custom_verifier_models is not None` or `if
patience is not None`) so logic remains correct after the default change; update
references to class_mapping_dicts and custom_verifier_models accordingly.
In `@src/glados/openwakeword/utils.py`:
- Around line 55-58: The ImportError handler in the import block for onnxruntime
should preserve the original traceback; update the except clause in utils.py to
capture the exception (e.g., "except ImportError as err") and re-raise the
ValueError using exception chaining ("raise ValueError(...) from err") so the
original ImportError context is preserved when raising the new error.
- Line 42: The parameter onnx_providers currently uses an implicit Optional by
being annotated as "list[str] = None"; update its type annotation to an explicit
union such as "list[str] | None" (or "Optional[list[str]]" and import Optional
from typing for <3.10 compatibility) on the function/constructor that declares
onnx_providers so the signature clearly reflects None is allowed while keeping
the existing None-handling logic intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ddd6d56a-e1cd-4de8-9160-0ae62584959f
📒 Files selected for processing (8)
configs/glados_config.yaml
src/glados/cli.py
src/glados/core/engine.py
src/glados/core/speech_listener.py
src/glados/openwakeword/LICENSE
src/glados/openwakeword/__init__.py
src/glados/openwakeword/model.py
src/glados/openwakeword/utils.py
```python
# Load custom verifier models
if isinstance(custom_verifier_models, dict):
    if custom_verifier_models.get(mdl_name, False):
        self.custom_verifier_models[mdl_name] = pickle.load(open(custom_verifier_models[mdl_name], 'rb'))
```
Security: pickle.load can execute arbitrary code.
Loading pickle files from user-provided paths is a security risk. Pickle deserialization can execute arbitrary code, so malicious pickle files could compromise the system.
Since this is for custom verifier models (an optional advanced feature) and the paths are user-configured, the risk is somewhat mitigated. However, consider adding a warning in documentation or logging when custom verifier models are loaded.
🧰 Tools
🪛 Ruff (0.15.9)
[error] 143-143: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue
(S301)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/glados/openwakeword/model.py` around lines 140 - 143, The code currently
uses pickle.load to deserialize user-provided custom verifier model files (the
custom_verifier_models load block in model.py), which is unsafe; replace this
direct pickle.load with a safer approach: either (a) require and use a safe,
explicit serialization format for verifier models (e.g., JSON/YAML for simple
configs or a validated model format) or (b) if binary model files are required,
perform strict validation before loading (ensure the path is in an allowed
directory, check file signatures/whitelists, and load inside a
sandboxed/isolated process) and log a clear security warning whenever a custom
verifier model is loaded. Update the code that assigns
self.custom_verifier_models[mdl_name] (and related docs) to perform the
validation and emit the warning via the existing logger instead of calling
pickle.load directly.
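As an illustration of the allow-listing mitigation the prompt sketches, the stdlib supports overriding `Unpickler.find_class` to reject any global lookup that is not explicitly permitted. Everything below uses only documented `pickle` APIs; the class names are hypothetical and not part of the PR:

```python
import io
import pickle

class SafeUnpickler(pickle.Unpickler):
    """Reject any global lookup that is not explicitly allow-listed."""
    ALLOWED: set[tuple[str, str]] = set()  # plain data needs no globals at all

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def safe_loads(data: bytes):
    return SafeUnpickler(io.BytesIO(data)).load()

# Plain data (no globals involved) round-trips fine.
print(safe_loads(pickle.dumps({"threshold": 0.5})))

# A payload that smuggles in a callable via __reduce__ is rejected at load time.
class Evil:
    def __reduce__(self):
        import os
        return (os.system, ("echo pwned",))

try:
    safe_loads(pickle.dumps(Evil()))
except pickle.UnpicklingError as err:
    print("blocked:", err)
```

This does not make arbitrary model files safe to load, but it turns "any pickle executes code" into "only allow-listed globals resolve".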
This is a) not my code, but from the openWakeWord project and b) the functionality isn't exposed in the public API / user configuration. Won't fix.
@reisbauer03, understood — since this is vendored code from the openWakeWord project and the custom_verifier_models path is not exposed through the public API or user configuration, the practical risk is minimal. Acknowledged, no action needed.
…dict() because of duplicate names
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/glados/openwakeword/__init__.py (2)
96-101: Remove unreachable `threshold is None` branch. After line 84, `threshold` is always a float, so this conditional can be simplified to a single dict assignment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 96 - 101, Remove the dead conditional that creates an empty dict when threshold is None and always build threshold_param mapping directly; replace the if/else around threshold with a single assignment that sets threshold_param = { self.model_name: threshold } (update references to threshold_param and ensure this change is applied in the initializer/function where threshold and self.model_name are used).
128-132: Avoid repeated array slicing in chunk loop. `x = x[self.MIN_SAMPLES:]` reallocates each iteration and scales poorly on long audio. Iterate by index to keep this linear and allocation-light.

♻️ Proposed refactor

```diff
-        prediction = 0
-        while len(x) >= self.MIN_SAMPLES:
-            current = x[:self.MIN_SAMPLES]
-            x = x[self.MIN_SAMPLES:]
-            prediction = max(self._model.predict(current)[self.model_name], prediction)
+        prediction = 0.0
+        for start in range(0, len(x) - self.MIN_SAMPLES + 1, self.MIN_SAMPLES):
+            current = x[start:start + self.MIN_SAMPLES]
+            prediction = max(self._model.predict(current)[self.model_name], prediction)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 128 - 132, The loop repeatedly slices x causing repeated reallocations; change it to iterate by index (e.g., use a start pointer or a for-range stepping by self.MIN_SAMPLES) so you call self._model.predict on x[start:start+self.MIN_SAMPLES] without reassigning x each iteration, updating prediction with max(self._model.predict(...)[self.model_name], prediction); this keeps memory linear and avoids creating new arrays each loop while preserving the same logic in the block using prediction, self.MIN_SAMPLES, self._model.predict and self.model_name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 86-87: The float-to-int16 conversion uses x = (x *
32767).astype(np.int16) which can overflow if samples fall outside [-1.0,1.0];
before scaling, clamp the array with np.clip(x, -1.0, 1.0) to avoid wraparound
(replace occurrences of the conversion in this module — the lines where x is
scaled/converted to int16, and the other occurrence around line 126 — with
x_clipped = np.clip(x, -1.0, 1.0) and then scale: x = (x_clipped *
32767).astype(np.int16)).
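The clamp-before-convert fix described in that prompt can be checked in isolation. Without `np.clip`, a sample of 1.5 scales to 49150.5 and wraps to a negative int16; with the clamp it saturates (`f32_to_i16` is an illustrative name, not the PR's function):

```python
import numpy as np

def f32_to_i16(x: np.ndarray) -> np.ndarray:
    """Convert float audio in [-1.0, 1.0] to int16, clamping first so
    out-of-range samples saturate instead of wrapping around."""
    return (np.clip(x, -1.0, 1.0) * 32767).astype(np.int16)

samples = np.array([0.0, 0.5, 1.5, -2.0], dtype=np.float32)
print(f32_to_i16(samples))  # out-of-range values saturate at +/-32767
```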
---
Nitpick comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 96-101: Remove the dead conditional that creates an empty dict
when threshold is None and always build threshold_param mapping directly;
replace the if/else around threshold with a single assignment that sets
threshold_param = { self.model_name: threshold } (update references to
threshold_param and ensure this change is applied in the initializer/function
where threshold and self.model_name are used).
- Around line 128-132: The loop repeatedly slices x causing repeated
reallocations; change it to iterate by index (e.g., use a start pointer or a
for-range stepping by self.MIN_SAMPLES) so you call self._model.predict on
x[start:start+self.MIN_SAMPLES] without reassigning x each iteration, updating
prediction with max(self._model.predict(...)[self.model_name], prediction); this
keeps memory linear and avoids creating new arrays each loop while preserving
the same logic in the block using prediction, self.MIN_SAMPLES,
self._model.predict and self.model_name.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5a3896bb-60ae-4aa5-8e07-b0b4193da5e8
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Line 115: The comparison uses strict greater-than which excludes scores equal
to the threshold; update the comparisons to inclusive (>=) so a prediction score
exactly equal to threshold counts as a hit. Replace occurrences of
"self._model.predict(...)[self.model_name] > threshold" with
"self._model.predict(...)[self.model_name] >= threshold" (the lines using
self._model.predict, self.model_name and threshold/threshold_param), and make
the same change at the second occurrence referenced (around the other occurrence
noted).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c532a268-51cf-41a0-bf30-650457ec02cb
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/glados/openwakeword/__init__.py (1)
137-141: Process the trailing partial frame instead of dropping it. Lines 137/140 currently ignore `len(x) % MIN_SAMPLES` samples, which can miss detections near the end of the clip. Consider zero-padding and evaluating the remainder chunk too.

💡 Suggested approach

```diff
         count_samples = len(x) // self.MIN_SAMPLES
         prediction = 0
         for i in range(count_samples):
             current = x[i * self.MIN_SAMPLES: (i + 1) * self.MIN_SAMPLES]
             prediction = max(self._model.predict(current)[self.model_name], prediction)
+
+        remainder_start = count_samples * self.MIN_SAMPLES
+        if remainder_start < len(x):
+            tail = x[remainder_start:]
+            padded = np.pad(tail, (0, self.MIN_SAMPLES - len(tail)), mode="constant")
+            prediction = max(self._model.predict(padded)[self.model_name], prediction)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 137 - 141, The loop drops the trailing partial frame when len(x) % self.MIN_SAMPLES != 0 causing missed detections; modify the logic in the block that computes count_samples/prediction (using x, self.MIN_SAMPLES, self._model.predict, self.model_name) to include the remainder by creating a final chunk of length self.MIN_SAMPLES padded with zeros (or appropriate padding) from the tail of x, call self._model.predict on that padded chunk and incorporate its score via the same max(...) logic so the last partial frame is evaluated instead of ignored.
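A standalone sketch of the chunk-and-pad scoring that comment describes; `MIN_SAMPLES` and the `predict` callback here are placeholders, not the real model interface:

```python
import numpy as np

MIN_SAMPLES = 4  # placeholder chunk size; the real recognizer uses a model-specific value

def chunk_scores(x: np.ndarray, predict) -> float:
    """Score every full MIN_SAMPLES chunk, then zero-pad and score the
    trailing remainder so no samples are silently dropped."""
    best = 0.0
    count = len(x) // MIN_SAMPLES
    for i in range(count):
        best = max(best, predict(x[i * MIN_SAMPLES:(i + 1) * MIN_SAMPLES]))
    remainder = len(x) - count * MIN_SAMPLES
    if remainder:
        tail = np.pad(x[count * MIN_SAMPLES:], (0, MIN_SAMPLES - remainder))
        best = max(best, predict(tail))
    return best

# With a stand-in scorer, the peak in the trailing partial chunk is not lost:
x = np.array([0.1, 0.2, 0.3, 0.4, 0.9], dtype=np.float32)
print(chunk_scores(x, lambda chunk: float(chunk.max())))
```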
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 96-97: The threshold assignment accepts any float and must be
validated: after computing threshold = self.DEFAULT_THRESHOLD if threshold is
None else float(threshold) (and in the other identical assignment later), check
for NaN (math.isnan) and ensure 0.0 <= threshold <= 1.0; if the value is out of
range or NaN, raise a ValueError with a clear message (or clamp if preferred)
before continuing to prediction/gating logic; reference DEFAULT_THRESHOLD and
the local threshold variable in the predict/is_wake_word methods where the
assignment occurs.
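The NaN-and-range validation described above could look like the following sketch (`DEFAULT_THRESHOLD` and the function name are illustrative, not the PR's code):

```python
import math

DEFAULT_THRESHOLD = 0.5  # illustrative default; the real class defines its own

def validate_threshold(threshold=None) -> float:
    """Normalize a user-supplied threshold to a float in (0.0, 1.0]."""
    if threshold is None:
        return DEFAULT_THRESHOLD
    try:
        threshold = float(threshold)
    except (TypeError, ValueError) as err:
        raise ValueError("threshold must be a number") from err
    if math.isnan(threshold) or not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must satisfy 0.0 < threshold <= 1.0")
    return threshold
```

Converting to `float` first also covers the later review finding that a non-numeric input would otherwise raise a bare `TypeError`.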
---
Nitpick comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 137-141: The loop drops the trailing partial frame when len(x) %
self.MIN_SAMPLES != 0 causing missed detections; modify the logic in the block
that computes count_samples/prediction (using x, self.MIN_SAMPLES,
self._model.predict, self.model_name) to include the remainder by creating a
final chunk of length self.MIN_SAMPLES padded with zeros (or appropriate
padding) from the tail of x, call self._model.predict on that padded chunk and
incorporate its score via the same max(...) logic so the last partial frame is
evaluated instead of ignored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6fbe4038-4668-41b3-af7a-f73f790e20b0
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/glados/openwakeword/__init__.py (1)
124-133: Validate `patience` and `debounce_time` inputs explicitly. Negative or zero `patience` and negative `debounce_time` currently pass through silently, which can lead to confusing gating behavior instead of a clear configuration error.

🛠️ Suggested guardrails

```diff
         if patience is None:
             patience = {}
         else:
+            if not isinstance(patience, int) or patience <= 0:
+                raise ValueError("patience must be a positive integer")
             patience = {
                 self.model_name: patience,
             }
         if debounce_time is None:
             debounce_time = 0.0
+        elif debounce_time < 0:
+            raise ValueError("debounce_time must be >= 0")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 124 - 133, The constructor currently allows non-positive patience and negative debounce_time; add explicit validation in the __init__ around the patience and debounce_time handling: after normalizing patience into either {} or {self.model_name: patience}, ensure any provided patience value is numeric and > 0 (raise ValueError if <= 0 or non-numeric), and for debounce_time ensure it is numeric and >= 0 (raise ValueError if negative); reference the variables patience, debounce_time and self.model_name in the error messages so callers can see which parameter failed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 163-169: predict_multi_sample currently computes count_samples =
len(x) // self.MIN_SAMPLES and then uses a default prediction of 0 so an empty
input with threshold == 0.0 can incorrectly return True; modify
predict_multi_sample to explicitly return False when count_samples == 0 (i.e.,
not enough full chunks) before running the loop, keeping the rest of the logic
that slices chunks, calls self._model.predict(current)[self.model_name], and
compares the final prediction to threshold.
---
Nitpick comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 124-133: The constructor currently allows non-positive patience
and negative debounce_time; add explicit validation in the __init__ around the
patience and debounce_time handling: after normalizing patience into either {}
or {self.model_name: patience}, ensure any provided patience value is numeric
and > 0 (raise ValueError if <= 0 or non-numeric), and for debounce_time ensure
it is numeric and >= 0 (raise ValueError if negative); reference the variables
patience, debounce_time and self.model_name in the error messages so callers can
see which parameter failed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5a6d7c7e-e4bf-46a8-8ae3-32a29503738e
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
@coderabbitai resume

✅ Actions performed: Reviews resumed.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/glados/openwakeword/__init__.py (1)
44-52: Consider simplifying provider selection logic. The current flow modifies `providers` in-place and then immediately reassigns it. A cleaner approach checks membership directly without mutation (the exclusion loop becomes unnecessary, since the final assignment never keeps the excluded providers anyway):

♻️ Suggested simplification

```diff
-        providers = ort.get_available_providers()
-        for excluded in ["TensorrtExecutionProvider", "CoreMLExecutionProvider"]:
-            if excluded in providers:
-                providers.remove(excluded)
-
-        if "CUDAExecutionProvider" in providers:
+        available = ort.get_available_providers()
+        if "CUDAExecutionProvider" in available:
             providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
         else:
             providers = ["CPUExecutionProvider"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/glados/openwakeword/__init__.py` around lines 44 - 52, The code mutates providers from ort.get_available_providers() then reassigns it immediately; instead, avoid the in-place removal and directly base selection on membership: call ort.get_available_providers() (keep that symbol), check if "CUDAExecutionProvider" is in that returned list and set providers = ["CUDAExecutionProvider","CPUExecutionProvider"] else set providers = ["CPUExecutionProvider"]; remove the for loop that iterates over the excluded list and any in-place .remove() calls so provider selection is clearer and simpler.
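Extracted into a pure function, the selection above reduces to a single membership test (`pick_providers` is an illustrative name, not part of the codebase):

```python
def pick_providers(available: list[str]) -> list[str]:
    """Prefer CUDA (with CPU fallback) when available, otherwise CPU only."""
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]

# In the real module this would be fed ort.get_available_providers().
print(pick_providers(["TensorrtExecutionProvider", "CPUExecutionProvider"]))
```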
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 96-98: The threshold validation currently enforces 0.0 < threshold
<= 1.0 but raises a generic "threshold must be between 0.0 and 1.0" message
which is misleading; update the check or the message in the validation block
that inspects the threshold variable (in __init__.py) so they agree: either
allow 0.0 by changing the condition to 0.0 <= threshold <= 1.0, or keep the
strict lower bound and change the ValueError text to something precise like
"threshold must be > 0.0 and <= 1.0"; modify the ValueError raised in that
function accordingly so the condition and error message are consistent.
---
Nitpick comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 44-52: The code mutates providers from
ort.get_available_providers() then reassigns it immediately; instead, avoid the
in-place removal and directly base selection on membership: call
ort.get_available_providers() (keep that symbol), check if
"CUDAExecutionProvider" is in that returned list and set providers =
["CUDAExecutionProvider","CPUExecutionProvider"] else set providers =
["CPUExecutionProvider"]; remove the for loop that iterates over the excluded
list and any in-place .remove() calls so provider selection is clearer and
simpler.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8e544104-6459-44e9-b20c-dcc0a70080aa
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/glados/openwakeword/__init__.py`:
- Around line 156-167: predict_multi_sample() currently calls self.reset() at
the start which reinitializes the predictor and causes zeroed outputs for the
first frames on every call; remove the per-call reset in predict_multi_sample()
so the model warm-up only happens once (e.g., in __init__ or an explicit reset
path) and leave the rest of the method (threshold validation via
_validate_threshold, conversion via _np_f32_to_i16, chunking by MIN_SAMPLES and
calling _model.predict(...)[self.model_name]) unchanged; if a forced
reinitialization is needed, provide a separate public reset method or a boolean
parameter to predict_multi_sample to opt into resetting rather than invoking
self.reset() unconditionally.
- Around line 94-98: The threshold check currently compares the raw `threshold`
which can raise TypeError for non-numeric inputs; normalize `threshold` to a
float at the top of the method (before the 0.0 < threshold <= 1.0 check) and
handle conversion failures by raising ValueError with the same message. Locate
the method using cls.DEFAULT_THRESHOLD in __init__.py (the threshold handling
block) and change the logic to convert `threshold` to float first, then perform
the bounds check and return the float or raise ValueError("threshold must be 0.0
< threshold <= 1.0") on conversion or bounds failure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c85afecf-4bb2-477d-8cfe-0b2696288f19
📒 Files selected for processing (1)
src/glados/openwakeword/__init__.py
This PR adds a new wake word recognition module based on OpenWakeWord, because I'm definitely unable to pronounce "GLaDOS" in a way the ASR recognizes.
`onnxruntime-gpu` and always adds a dependency on `tflite`, so I took `model.py` and `utils.py` from their code and integrated them into this project. The original license is preserved; original authorship is clearly visible in the files from the OpenWakeWord project. The Apache 2.0 license is compatible with the MIT license used for GLaDOS.
`wake_word_model_path`: if this is set to an OpenWakeWord `.onnx` model, recognized text is only passed to the LLM if it contains the wake word. This configuration option takes precedence over the old `wake_word` option (if set).
`glados download`, I suspect you will want to change the URLs.
Summary by CodeRabbit
New Features
Chores