Add TOML config, text sanitization, custom endpoints #1
Martin-Atrin wants to merge 3 commits into evoleinik:main from
- Add Config struct with TOML deserialization and serde defaults
- Replace hardcoded Groq URLs/models with configurable endpoints
- Support any OpenAI-compatible transcription/chat API (vLLM, etc.)
- Handle both plain-text and JSON transcription responses
- Add configurable hotkey (fn/option/control/shift/command)
- Add Settings... menu item to open config.toml from menu bar
- Make api_key optional so app launches without config
- Create default config.toml template on first launch
- Fix event tap lifetime (tap+source must outlive NSApp.run())
- Make permission check non-blocking
- Add app icon (gen-icon.py + AppIcon.icns)
- Comprehensive README with permissions troubleshooting guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Text sanitization now runs by default (always_polish = true), polish modifier skips it for raw output
- Separate polish_api_key field for using different providers for Whisper vs sanitizer (falls back to api_key when empty)
- Reworked system prompt for small models: short, direct, with /no_think for Qwen3 to keep latency under 200ms
- Custom polish_prompt config field for domain-specific replacement dictionaries (misheard term → correct term)
- Language hint passed to Whisper when configured
- CLAUDE.md with agent-facing setup guide, architecture, and code map
- README expanded with full text sanitization docs, local inference setup (llama.cpp / MLX), and mixed-provider config examples

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Cap sanitizer output tokens proportionally to input length (floor 64, ceiling 1024) to prevent small model hallucination runaway
- Add stderr debug logging for audio duration, WAV size, Whisper response, and sanitizer output to aid troubleshooting
- Fix UTF-8 safe string truncation in log output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
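For readers skimming the third commit, here is a minimal sketch of what a proportional token cap and UTF-8 safe truncation could look like. It is not taken from the PR diff: the function names, the roughly 4 characters per token estimate, and the 2x headroom are assumptions for illustration. Only the floor of 64 and the ceiling of 1024 come from the commit message.

```rust
/// Cap sanitizer output tokens in proportion to the input length.
/// Floor 64 and ceiling 1024 are from the commit message; the
/// chars-per-token ratio and the 2x headroom are assumed values.
fn max_output_tokens(input: &str) -> u32 {
    // Rough estimate: about 4 characters per token (assumption).
    let approx_input_tokens = (input.chars().count() / 4) as u32;
    approx_input_tokens.saturating_mul(2).clamp(64, 1024)
}

/// Truncate `s` to at most `max_bytes` bytes without splitting a
/// multi-byte UTF-8 character. Slicing at an arbitrary byte index
/// would panic when the index falls inside a character.
fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

A short dictation always gets the floor, and a very long one is clamped at the ceiling, which bounds how far a small model can run on.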
evoleinik left a comment
Code Review
Must Fix
1. API keys stored world-readable — std::fs::write() creates config.toml with default 0644 permissions. Should chmod 0600 after creation since the file holds API keys:
```rust
use std::os::unix::fs::PermissionsExt;
std::fs::set_permissions(&toml_path, std::fs::Permissions::from_mode(0o600))?;
```
2. Default TOML template duplicated — The exact same template string appears in both load_config() and open_settings(). Extract to a const or function to avoid them drifting apart.
3. /no_think hardcoded in default prompts — This is Qwen3-specific. Users switching to GPT-4, Claude, or Llama will have /no_think literally in their system prompt, which models may echo back or misinterpret. Remove from hardcoded defaults; users who need it can add it to their custom polish_prompt.
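Items 1 and 2 could be addressed together. The sketch below is one possible shape, not the PR's code: `DEFAULT_CONFIG_TEMPLATE` and `write_default_config` are hypothetical names, and the template body is a placeholder. It is a variant of the `set_permissions` suggestion above: the mode is set when the file is created, via `OpenOptionsExt::mode`, so there is no window in which the file is world-readable.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;

/// Single source of truth for the first-launch template, shared by
/// every caller. The contents here are a placeholder, not the real template.
const DEFAULT_CONFIG_TEMPLATE: &str = "# fnkey config\napi_key = \"\"\n";

/// Create the config file with owner-only permissions if it is missing.
/// Returns Ok(true) if the file was created, Ok(false) if it already existed.
fn write_default_config(path: &Path) -> io::Result<bool> {
    let result = OpenOptions::new()
        .write(true)
        .create_new(true) // never overwrite an existing config
        .mode(0o600) // permissions applied at creation time
        .open(path);
    match result {
        Ok(mut file) => {
            file.write_all(DEFAULT_CONFIG_TEMPLATE.as_bytes())?;
            Ok(true)
        }
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}
```

This only protects newly created files. A config already written with 0644 would still need the `set_permissions` call shown above.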
Should Fix
4. Fn auto-detection fallback removed — The current code has a 5-second fallback from Fn to Option for keyboards that don't emit Fn events (common with external keyboards). This PR removes it silently. That's a breaking change for existing users. Consider keeping the fallback when hotkey = "fn" (the default).
5. Debug logging may leak PII — eprintln!("[fnkey] whisper text: {}", text) logs full transcriptions to stderr → Console.app/system logs. Could contain sensitive dictated content. Consider truncating or making debug logging opt-in via a debug = true config flag.
6. Silent truncation — If the sanitizer hits the max_tokens cap, the user gets half a sentence pasted silently. Should check finish_reason == "length" in the API response and fall back to raw text when truncated.
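For item 6, the fallback could be as small as the sketch below. The `Choice` struct and `text_to_paste` are hypothetical names for illustration; the real code would read `finish_reason` from the parsed chat completion response. The empty-output check is an extra assumption beyond what the review asks for.

```rust
/// Minimal view of one chat completion choice: only the two fields
/// this check needs (hypothetical struct, not the PR's type).
struct Choice<'a> {
    content: &'a str,
    finish_reason: Option<&'a str>,
}

/// Pick the text to paste. Use the polished text unless the sanitizer
/// was cut off by the token cap (finish_reason == "length"), returned
/// nothing, or failed entirely; in those cases fall back to the raw transcript.
fn text_to_paste<'a>(raw: &'a str, choice: Option<&Choice<'a>>) -> &'a str {
    match choice {
        Some(c) if c.finish_reason != Some("length") && !c.content.trim().is_empty() => {
            c.content
        }
        _ => raw,
    }
}
```

With this shape, a truncated completion never reaches the clipboard, and the user gets the unpolished but complete transcript instead.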
Minor
7. Config struct verbosity — The 10 separate default_*() functions could be replaced with #[serde(default)] on the struct + a single Default impl.
8. PR scope — 10+ features, +925/-160 lines in one PR. This would be easier to review and safer to merge as 3-4 smaller PRs (config migration, custom endpoints, sanitizer enhancements, cosmetic).
What's Good
- TOML config with backward compat (legacy `api_key` file, env var fallback) is well designed
- Custom endpoints are a real need for local/self-hosted setups
- JSON response fallback for non-compliant Whisper servers is practical
- Settings menu item with ObjC delegate is properly implemented
- The `always_polish` inversion logic is correct
- README expansion is thorough
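As described in the commit message and test plan, the `always_polish` inversion reduces to an exclusive OR between the config flag and the modifier state. A minimal sketch, with `should_polish` and `modifier_held` as hypothetical names:

```rust
/// Decide whether to run the sanitizer for one dictation.
/// With always_polish = true the modifier skips polishing (raw output);
/// with always_polish = false the modifier requests it.
fn should_polish(always_polish: bool, modifier_held: bool) -> bool {
    // `!=` on booleans is logical XOR.
    always_polish != modifier_held
}
```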
🤖 Generated with Claude Code
Product Feedback
Thanks for the PR — there's some genuinely useful work here. Wanted to share some thoughts on the product direction before we go further.
Love these — clear wins
Concerns about scope
fnkey's identity is "hold Fn, speak, paste." One thing, done well. This PR pulls it toward a configurable platform, and I want to be careful about that.
5 hotkey options — The Fn key is the product (it's called "fnkey"). The existing Fn-with-Option-fallback covers the real use case (external keyboards that don't emit Fn). Do we need control/shift/command? Each adds testing surface and edge cases (like the polish modifier collision when hotkey=control).
Separate
Debug logging always on — Logs full transcriptions to stderr/Console.app. Should be behind a
Suggested path forward
Would you be open to splitting this up? Something like:
That way we can land the structural wins quickly and iterate on the behavioral changes.
🤖 Generated with Claude Code
Summary
- `~/.config/fnkey/config.toml` with auto-generated template on first launch, Settings menu item to open it
- `api_key` for Whisper, `polish_api_key` for sanitizer, supporting mixed providers (e.g. Groq STT + local sanitizer)
- `polish_prompt` for domain-specific term correction via replacement dictionaries

Test plan
- `config.toml` template is auto-created on first launch
- `api_key` and default endpoints, verify transcription works
- `api_key` and `polish_api_key`, verify both endpoints authenticate correctly
- `polish_prompt`: verify domain-specific term corrections
- `always_polish = true`: verify hotkey gives polished text, hotkey+modifier gives raw
- `always_polish = false`: verify inverse behavior
- `language = "de"` etc., verify Whisper respects it

🤖 Generated with Claude Code