Add iOS app with keyboard extension #3

Open
croshank wants to merge 1 commit into smallest-inc:main from croshank:ios-keyboard-extension

Conversation

@croshank

iOS app + keyboard extension that brings MiniFlow dictation to any iOS app.

The macOS app routes audio through a Python backend to Smallest AI. On iOS there's no local server, so the app talks to the Waves WebSocket API directly from Swift: same 16 kHz PCM streaming, same is_final/is_last protocol, just no middleman.

How it works

The main app runs AVAudioEngine in the background. The keyboard extension sends commands (start/stop/cancel) via files in the App Group shared container. When the user taps record, the app opens a WebSocket to wss://api.smallest.ai/waves/v1/pulse/get_text, streams PCM chunks from the mic, and writes the transcript back to the shared container. The keyboard reads it and inserts at the cursor.
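
As a rough sketch of the transcript side of that flow, the snippet below accumulates partial and final segments from incoming JSON messages. Only `is_final` and `is_last` are named in this PR; the `transcript` field name, the `TranscriptMessage` and `TranscriptAccumulator` types, and the join-with-a-space rule are assumptions for illustration, not the actual SmallestAIClient code.

```swift
import Foundation

// Hypothetical message shape: `is_final` and `is_last` come from the PR text,
// the `transcript` field name is an assumption.
struct TranscriptMessage: Decodable {
    let transcript: String?
    let isFinal: Bool?
    let isLast: Bool?

    enum CodingKeys: String, CodingKey {
        case transcript
        case isFinal = "is_final"
        case isLast = "is_last"
    }
}

// Collects finalized segments and tracks the segment still being revised.
struct TranscriptAccumulator {
    private(set) var finalized: [String] = []
    private(set) var partial: String = ""
    private(set) var isComplete = false

    // Text for a live preview: finalized segments plus the in-flight partial.
    var displayText: String {
        var parts = finalized
        if !partial.isEmpty { parts.append(partial) }
        return parts.joined(separator: " ")
    }

    mutating func apply(_ message: TranscriptMessage) {
        let text = (message.transcript ?? "")
            .trimmingCharacters(in: .whitespacesAndNewlines)
        if message.isFinal == true {
            // A final segment replaces whatever partial text was showing.
            if !text.isEmpty { finalized.append(text) }
            partial = ""
        } else {
            partial = text
        }
        if message.isLast == true { isComplete = true }
    }

    mutating func apply(json: Data) throws {
        apply(try JSONDecoder().decode(TranscriptMessage.self, from: json))
    }
}
```

In a setup like this, the recorder would write `displayText` to the shared container on every message and treat `isComplete` as the signal to close the socket.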

IPC is file-based rather than UserDefaults-based because UserDefaults is unreliable between app and extension on device (the kCFPreferencesAnyUser issue).
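
For readers unfamiliar with the pattern, here is a minimal sketch of file-based command passing. The `FileMailbox` type, the `command.json` file name, and the envelope format are hypothetical; the real FlowSessionManager may lay out its files differently. The directory is passed in so the sketch is not tied to the App Group container.

```swift
import Foundation

// Commands named in the PR description.
enum FlowCommand: String, Codable {
    case start, stop, cancel
}

// Hypothetical on-disk envelope; the id lets a reader skip a command it already handled.
struct CommandEnvelope: Codable, Equatable {
    let command: FlowCommand
    let id: UUID
}

struct FileMailbox {
    // In the app this would be the App Group container URL; here it is any directory.
    let directory: URL

    private var commandURL: URL {
        directory.appendingPathComponent("command.json")
    }

    // Atomic write: the data goes to a temporary file that is then moved into place,
    // so a reader should not observe a half-written command.
    @discardableResult
    func send(_ command: FlowCommand) throws -> CommandEnvelope {
        let envelope = CommandEnvelope(command: command, id: UUID())
        try JSONEncoder().encode(envelope).write(to: commandURL, options: .atomic)
        return envelope
    }

    // Returns nil when no command has been written yet.
    func readLatest() throws -> CommandEnvelope? {
        guard FileManager.default.fileExists(atPath: commandURL.path) else { return nil }
        return try JSONDecoder().decode(CommandEnvelope.self,
                                        from: Data(contentsOf: commandURL))
    }
}
```

The keyboard would call `send(.start)` and the app would poll or watch for changes, comparing `id` against the last command it processed.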

What's in the PR

  • SmallestAIClient.swift — actor wrapping URLSessionWebSocketTask, handles the Waves protocol
  • FlowBackgroundRecorder.swift — AVAudioEngine capture, PCM conversion, streams to SmallestAIClient
  • FlowSessionManager.swift — reads/writes small files in the App Group container for IPC
  • KeyboardViewController.swift — UIInputViewController hosting SwiftUI, state machine for the recording flow
  • VoiceInputView.swift — keyboard UI
  • ContentView.swift — main app: API key entry, session control, setup instructions
  • KeychainHelper.swift — thin Security.framework wrapper for storing the API key
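
To illustrate the "PCM conversion" step listed for FlowBackgroundRecorder.swift, here is a small sketch that turns normalized Float32 samples into 16-bit little-endian PCM bytes. The 16-bit sample width and the `pcm16LittleEndian` helper are assumptions; the PR only states 16 kHz mono PCM, and resampling from the hardware rate is not shown.

```swift
import Foundation

// Converts normalized Float32 samples (-1.0...1.0) to 16-bit little-endian PCM bytes.
// Assumes the samples are already mono at the target sample rate.
func pcm16LittleEndian(from samples: [Float]) -> Data {
    var data = Data(capacity: samples.count * MemoryLayout<Int16>.size)
    for sample in samples {
        // Treat NaN as silence and clamp out-of-range values to avoid overflow.
        let safe: Float = sample.isNaN ? 0 : max(-1.0, min(1.0, sample))
        let value = Int16((safe * Float(Int16.max)).rounded())
        withUnsafeBytes(of: value.littleEndian) { data.append(contentsOf: $0) }
    }
    return data
}
```

Under the 16-bit assumption, one second of 16 kHz mono audio is 16,000 samples × 2 bytes = 32,000 bytes, which gives a feel for chunk sizes on the socket.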

Setup

After cloning, add App Group group.com.smallestai.MiniFlow to both targets in Signing & Capabilities (Xcode doesn't persist this across machines). Then build, enter your API key, enable the keyboard in iOS Settings, and go.

No macOS files were modified.

Adds a complete iOS app (MiniFlow_iOS) with a custom keyboard extension
that enables voice-to-text dictation in any app, powered by Smallest AI's
Waves real-time streaming STT API.

Architecture:
- Main app runs AVAudioEngine in the background, streams 16kHz mono PCM
  audio directly to Smallest AI via WebSocket (no Python engine needed)
- Keyboard extension communicates with the main app via App Group shared
  container (file-based IPC for reliability)
- Keyboard sends start/stop/cancel commands, main app writes transcripts
- Live partial transcripts stream to keyboard during recording
- Final transcript inserted at cursor with smart spacing
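
One way the "smart spacing" step could work is sketched below. The rules and the `textToInsert` helper are illustrative assumptions, not the PR's exact behavior; in the keyboard, the context string would presumably come from `UITextDocumentProxy.documentContextBeforeInput`.

```swift
import Foundation

// Returns the string to insert, given the text before the cursor.
// Assumed rules: no leading space at the start of the document, after whitespace,
// or when the transcript itself begins with punctuation.
func textToInsert(_ transcript: String, after context: String?) -> String {
    let text = transcript.trimmingCharacters(in: .whitespacesAndNewlines)
    guard let first = text.first else { return "" }           // nothing to insert
    guard let previous = context?.last else { return text }   // start of document
    if previous.isWhitespace { return text }                  // already separated
    if ".,!?;:".contains(first) { return text }               // attach punctuation directly
    return " " + text
}
```

For example, `textToInsert("world", after: "hello")` yields `" world"`, while the same call with `"hello "` as context yields `"world"`.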

Key files:
- SmallestAIClient.swift: WebSocket client for Smallest AI Waves API
- FlowBackgroundRecorder.swift: Background audio capture + streaming
- FlowSessionManager.swift: File-based IPC between app and keyboard
- KeychainHelper.swift: Secure API key storage
- KeyboardViewController.swift: Keyboard extension with state machine
- VoiceInputView.swift: Keyboard UI with record button and controls
- ContentView.swift: Main app settings (API key, session, permissions)

Features:
- Hold-to-record or tap-to-toggle in keyboard
- Real-time partial transcript display during recording
- Auto-scrolling live text preview
- Smart text insertion (space-aware, punctuation-aware)
- Undo support (20-second window)
- Dark mode support
- URL scheme (miniflow://startflow) for keyboard-to-app launch
- Heartbeat-based session liveness detection
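
A heartbeat-based liveness check can be as small as the sketch below. The `Heartbeat` type and the `maxAge` threshold are hypothetical; the PR does not state the interval it uses or how the timestamp is stored.

```swift
import Foundation

// Hypothetical liveness check: the app periodically records a timestamp, and the
// keyboard treats the session as alive only if that timestamp is recent enough.
struct Heartbeat {
    let maxAge: TimeInterval   // illustrative threshold, not a value from the PR

    func isAlive(lastBeat: Date?, now: Date = Date()) -> Bool {
        guard let lastBeat = lastBeat else { return false }   // never beat: not alive
        return now.timeIntervalSince(lastBeat) <= maxAge
    }
}
```

If the check fails, the keyboard could fall back to launching the app through the miniflow://startflow URL scheme mentioned above.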
@entelligence-ai-pr-reviews

entelligence-ai-pr-reviews bot commented Mar 30, 2026

EntelligenceAI PR Summary

Introduces the full MiniFlow iOS application — a voice transcription tool using a custom keyboard extension and background audio recording service connected to Smallest AI's STT WebSocket API.

  • Main App: SwiftUI entry point (MiniFlow_iOSApp) with deep-link handling (miniflow://), ContentView for session/API key/microphone management, and FlowBackgroundRecorder for end-to-end audio capture and streaming
  • STT Client: SmallestAIClient Swift actor for real-time WebSocket-based speech-to-text with ping keepalive and transcript joining
  • IPC: FlowSessionManager implements file-based inter-process communication over shared App Group (group.com.smallestai.MiniFlow) replacing UserDefaults-based cross-process signaling
  • Keyboard Extension: MiniFlowKeyboard with KeyboardViewController, KeyboardViewModel state machine, VoiceInputView SwiftUI UI (record button, waveform animation, status pill), and text insertion/undo via UITextDocumentProxy
  • Security: KeychainHelper for secure API key storage; App Group entitlements on both main app and extension targets
  • Project Setup: Full Xcode project (project.pbxproj) targeting iOS 26.0, four build targets, shared schemes, asset catalogs, Info.plist files, and test stubs

Confidence Score: 3/5 - Review Recommended

Likely safe but review recommended — this PR introduces a substantial new iOS application including FlowBackgroundRecorder, SmallestAIClient, and a custom keyboard extension, which represents a significant surface area for a first review pass. While the automated review found no flagged issues across all 22 changed files, the complexity of the WebSocket-based STT streaming in SmallestAIClient, the audio session lifecycle management in FlowBackgroundRecorder, and the cross-process communication between the keyboard extension and main app warrant human eyes before merging. The PR achieves a meaningful milestone — end-to-end voice transcription via a custom keyboard — but microphone permissions, background audio entitlements, API key storage, and WebSocket error recovery paths are all areas where subtle bugs may not surface in automated analysis.

Key Findings:

  • FlowBackgroundRecorder manages audio session lifecycle and background recording state — improper AVAudioSession category handling or failure to handle interruptions (phone calls, Siri) could cause silent recording failures or crashes in production that automated tooling would not catch.
  • SmallestAIClient is a Swift actor handling real-time WebSocket streaming; WebSocket reconnection logic, partial message reassembly, and error propagation paths under poor network conditions are non-trivial and deserve manual review even if no static issues were flagged.
  • The keyboard extension communicates with the main app (likely via shared App Group container or URL schemes given the miniflow:// deep-link), and any mismatch in entitlements or group identifiers between targets would cause silent failures only visible at runtime on a real device.
  • API key management — how SmallestAIClient receives and stores the Smallest AI API key — should be verified to use Keychain rather than UserDefaults or plaintext storage, a security concern that static analysis typically does not surface.

Files requiring special attention

  • FlowBackgroundRecorder.swift
  • SmallestAIClient.swift
  • KeyboardViewController.swift
  • MiniFlow_iOSApp.swift
  • ContentView.swift
