feat: add typos spell checker CI workflow and fix typo (fixes #1266) by shivamtiwari3 · Pull Request #1319 · google/magika

shivamtiwari3 · 2026-03-11T19:32:21Z

Summary

Adds automated spell checking via typos (crate-ci/typos) on every push and pull request, with a .typos.toml configuration that excludes generated/binary files and allowlists domain-specific terms to keep the false-positive rate near zero.

Root Cause

There was no CI gate for spelling errors, so typos accumulated undetected between manual passes (the last manual fix being PR #1240). Without automation, any contributor could inadvertently introduce new misspellings with no warning.

Solution

.typos.toml — Configures the checker:

Excludes *.json, *.lock, tests_data/, assets/models/ to avoid scanning generated and binary files
Allowlists six domain-specific terms found during initial triage: hve (ONNX internals), afterall (test variable), hashi (HashiCorp vendor name), harc (LHarc archive format), cpy/CPY (COBOL file extensions), licens (stem of "LICENSEs", the plural of LICENSE in comments)

.github/workflows/typos.yml — Runs on push to main and all PRs, using crate-ci/typos pinned to its SHA for supply-chain security (same pattern as the existing actions/checkout pin).

Bonus fix: Corrected a real typo found during triage — indentify → identify in python/tests/test_magika_python_module.py:277.

typos --config .typos.toml --format brief now exits 0 on the full repository.

Testing

Ran typos --config .typos.toml --format brief locally — zero errors
All 33 existing Python tests pass: uv run pytest tests -m "not slow"
ruff check and ruff format pass clean

Checklist

Workflow triggers on push and pull_request
Action pinned to a commit SHA (supply-chain hygiene)
False positives excluded via .typos.toml (not inline ignores)
One real typo fixed as part of triage
All existing tests pass
Read CONTRIBUTING.md

…oogle#1266) Root cause: the codebase had no automated spell checking, allowing typos to accumulate silently. PR google#1240 fixed many manually but there was no CI gate to prevent regressions. Fix: - Add .typos.toml with exclusions for generated/binary files and domain-specific terms (HashiCorp, LHarc, COBOL .cpy extension) that are intentional and would otherwise produce false positives. - Add .github/workflows/typos.yml that runs crate-ci/typos on every push and pull request with the action pinned to a specific SHA. - Fix typo found during triage: "indentify" -> "identify" in python/tests/test_magika_python_module.py.

gemini-code-assist

Code Review

This pull request introduces a CI workflow for spell checking using typos and fixes a typo in a test file. The configuration for the spell checker in .typos.toml is well-structured with clear exclusions and an allowlist for domain-specific terms. The typo fix is correct. I have one suggestion regarding the use of a print statement in the test file.

gemini-code-assist · 2026-03-11T19:33:46Z

python/tests/test_magika_python_module.py


    for ws_num in ws_nums:
-        print(f"Calling indentify_bytes with {ws_num} whitespaces")
+        print(f"Calling identify_bytes with {ws_num} whitespaces")


While fixing the typo is good, using print() in automated tests is generally discouraged as it can clutter test output. If this output is useful for debugging, consider using the logging module, which integrates well with pytest's output capturing. If it was for temporary debugging, it should be removed.

print() in tests clutters captured output; no information is lost since pytest's test name already identifies the failing scenario.

shivamtiwari3 · 2026-03-15T07:58:44Z

Applied — removed the print() call from the test loop. The test name and pytest's parametrized output already provide enough context if the assertion fails.

shivamtiwari3 requested review from invernizzi and reyammer as code owners March 11, 2026 19:32

gemini-code-assist bot reviewed Mar 11, 2026

View reviewed changes

test: remove print() from test per review suggestion

2d81dd6

print() in tests clutters captured output; no information is lost since pytest's test name already identifies the failing scenario.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add typos spell checker CI workflow and fix typo (fixes #1266)#1319

feat: add typos spell checker CI workflow and fix typo (fixes #1266)#1319
shivamtiwari3 wants to merge 2 commits intogoogle:mainfrom
shivamtiwari3:feat/add-typos-spell-checker-workflow

shivamtiwari3 commented Mar 11, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Mar 11, 2026

Uh oh!

shivamtiwari3 commented Mar 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shivamtiwari3 commented Mar 11, 2026

Summary

Root Cause

Solution

Testing

Checklist

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

shivamtiwari3 commented Mar 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant