A context-aware voice-to-text Chrome extension with LLM-powered rewriting profiles. Speak into any text input on the web, have your words transcribed, rewritten, and injected — all through your own OpenRouter API key.
CocoParrot lets you hold a key, speak, and have your voice transcribed using Whisper via OpenRouter. Before injecting the text into the focused input field, it can optionally pass your words through an LLM — not to answer questions or act as an agent, but to rewrite: fix grammar, adjust tone, translate, match a professional register, or clean up casual speech into something polished. You review the result in a small corner popup before anything gets written.
┌─────────────┐    ┌──────────────────┐    ┌─────────────────┐    ┌──────────────────┐
│   Audio     │    │  Transcription   │    │   LLM Rewrite   │    │  Text Injection  │
│   Capture   │───▶│    (Whisper)     │───▶│   (Optional)    │───▶│   (Strategies)   │
└─────────────┘    └──────────────────┘    └─────────────────┘    └──────────────────┘
       │                    │                       │                      │
       │                    │                       │                      │
 Microphone         OpenRouter API          Profile-aware          Context-aware
 MediaRecorder      Streaming SSE           System prompts         Element targeting
 16kHz audio        Chunked upload          Page context           Multiple strategies
The extension uses a custom event bus (BusService) for state management and cross-component communication:
// State phases
type Phase =
  | 'idle'                  // No activity
  | 'pending-recording'     // Debounce period before recording starts
  | 'recording'             // Audio capture active
  | 'transcribing'          // Sending audio to Whisper API
  | 'processing'            // LLM rewriting in progress
  | 'awaiting-approval'     // User review before injection
  | 'editing-transcription' // Editing raw transcription
  | 'editing-llm';          // Editing LLM output

Each service is a singleton module exporting pure functions:
- RecorderService — Microphone capture with MediaRecorder API, handles permissions, MIME type detection, and base64 encoding
- TranscriptionService — Chunked streaming upload to OpenRouter Whisper API with retry logic and abort controllers
- LlmService — Streaming SSE parsing for LLM responses with provider selection and reasoning mode support
- ProfileService — Profile matching via rule-based system (hostname, URL patterns, DOM selectors, page title)
- InserterService — Strategy pattern for text injection with 4 configurable strategies
- BusService — Event-driven state management with watchers, notifications, and phase transitions
- StorageService — Chrome extension storage abstraction with @wxt-dev/storage
- KeyboardService — Hotkey handling with configurable keys and hold detection
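The streaming SSE parsing attributed to LlmService above can be illustrated with a small incremental parser. This is a minimal sketch, not the extension's code: it assumes OpenAI-style streaming chunks (`data: {json}` lines carrying `choices[0].delta.content`, ended by `data: [DONE]`), and the names `SseState`, `createSseState`, and `parseSseChunk` are hypothetical.

```typescript
// Hypothetical helper: incremental SSE parsing for OpenAI-style streaming chunks.
interface SseState {
  buffer: string; // Unprocessed tail of the stream (may hold a partial line)
  done: boolean;  // Set once the "[DONE]" sentinel has been seen
}

function createSseState(): SseState {
  return { buffer: '', done: false };
}

// Feed one decoded network chunk; returns the text deltas completed by it.
function parseSseChunk(state: SseState, chunk: string): string[] {
  const deltas: string[] = [];
  state.buffer += chunk;

  // Only complete lines are parsed; the last (possibly partial) piece is kept.
  const lines = state.buffer.split('\n');
  state.buffer = lines.pop() ?? '';

  for (const rawLine of lines) {
    const line = rawLine.trim();
    // Skip blank separators and SSE comments (lines starting with ":").
    if (line === '' || line.startsWith(':')) continue;
    if (!line.startsWith('data:')) continue;

    const data = line.slice('data:'.length).trim();
    if (data === '[DONE]') {
      state.done = true;
      break;
    }

    try {
      const parsed = JSON.parse(data) as {
        choices?: { delta?: { content?: string } }[];
      };
      const content = parsed.choices?.[0]?.delta?.content;
      if (typeof content === 'string' && content.length > 0) deltas.push(content);
    } catch {
      // Malformed JSON lines are ignored in this sketch.
    }
  }
  return deltas;
}
```

Keeping the partial line in `buffer` matters because network chunks can split a `data:` line at any byte, so a parser that assumes one event per chunk would drop or corrupt text.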
Profiles are the core abstraction. Each profile contains:
interface Profile {
  id: string;
  name: string;
  description?: string;
  rules: ProfileRule[];                  // When to activate
  contextQueries: ProfileContextQuery[]; // What page data to extract
  systemPrompt: string;                  // LLM instructions
  inputElementCssSelector?: string;      // Target input element
  overridePreferences?: DeepPartial<SettingsPreferences>; // Per-profile overrides
}

Rule Types:
- hostname — Exact domain match
- url-contains — Substring match
- url-prefix — URL prefix match
- url-regex — Regular expression match
- contains-element — CSS selector presence
- page-title-contains — Page title match
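To make the URL-based rule types concrete, here is a minimal sketch of how such rules could be evaluated. The README names the rule types but not the fields of `ProfileRule`, so the `{ type, value }` shape and the function names `matchesUrlRule` and `findFirstMatch` are assumptions. The DOM-dependent types (contains-element, page-title-contains) are left out so the example runs outside a browser.

```typescript
// Assumed shape: the README lists rule types but not the ProfileRule fields.
type UrlRuleType = 'hostname' | 'url-contains' | 'url-prefix' | 'url-regex';

interface UrlRule {
  type: UrlRuleType;
  value: string;
}

// Returns true when the rule matches the given page URL.
function matchesUrlRule(rule: UrlRule, pageUrl: string): boolean {
  switch (rule.type) {
    case 'hostname':
      // Exact domain match; URL hostnames are already lowercased by the parser.
      return new URL(pageUrl).hostname === rule.value.toLowerCase();
    case 'url-contains':
      return pageUrl.includes(rule.value);
    case 'url-prefix':
      return pageUrl.startsWith(rule.value);
    case 'url-regex':
      try {
        return new RegExp(rule.value).test(pageUrl);
      } catch {
        return false; // An invalid pattern never matches.
      }
  }
}

// First matching rule wins; undefined means no rule applies.
function findFirstMatch(rules: UrlRule[], pageUrl: string): UrlRule | undefined {
  return rules.find((rule) => matchesUrlRule(rule, pageUrl));
}
```

Treating an invalid regular expression as a non-match keeps one bad user-entered pattern from breaking matching for every other profile.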
Context Queries:
- css-selector — Extract text from CSS selector
- css-selector-all — Extract from multiple elements
- xpath — XPath expression
- meta-tag — Meta tag content
- current-url — Current page URL
- page-title — Page title
- whole-page-content — Full page content via Mozilla Readability + Turndown (HTML→Markdown)
The extension supports 4 text injection strategies:
- insert-text — Direct text insertion via document.execCommand
- human-simulation — Character-by-character typing with configurable:
  - Base delay and jitter
  - Typo generation (QWERTY neighbor-based)
  - Backspace correction
  - Newline handling (line-break, enter, shift-enter, ignore, space)
  - React compatibility mode
- clipboard-paste — Clipboard API-based paste
- direct-value — Direct value setting for controlled inputs
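The human-simulation options (base delay, jitter, QWERTY-neighbor typos, backspace correction) can be illustrated by planning keystrokes ahead of time. This is a sketch under stated assumptions, not the extension's implementation: `planKeystrokes`, `TypingOptions`, and the partial adjacency map are hypothetical, and the random source is injected so the plan is reproducible.

```typescript
// Hypothetical sketch; names and option fields are assumptions, not the extension's API.
type Rng = () => number; // Returns a float in [0, 1)

// Partial QWERTY adjacency map (letters only, for brevity).
const QWERTY_NEIGHBORS: Record<string, string> = {
  q: 'wa', w: 'qes', e: 'wrd', r: 'etf', t: 'ryg', y: 'tuh', u: 'yij',
  i: 'uok', o: 'ipl', p: 'ol', a: 'qsz', s: 'awdx', d: 'sefc', f: 'drgv',
  g: 'fthb', h: 'gyjn', j: 'hukm', k: 'jil', l: 'kop', z: 'asx', x: 'zsdc',
  c: 'xdfv', v: 'cfgb', b: 'vghn', n: 'bhjm', m: 'njk',
};

interface Keystroke {
  key: string;     // A single character or the literal 'Backspace'
  delayMs: number; // Wait before sending this key
}

interface TypingOptions {
  baseDelayMs: number; // Mean delay between keys
  jitterMs: number;    // Maximum deviation from the base delay
  typoRate: number;    // Probability in [0, 1] of mistyping a letter
}

function planKeystrokes(text: string, opts: TypingOptions, rng: Rng): Keystroke[] {
  // Delay is uniformly distributed in [base - jitter, base + jitter], floored at 0.
  const delay = () =>
    Math.max(0, Math.round(opts.baseDelayMs + (rng() * 2 - 1) * opts.jitterMs));
  const plan: Keystroke[] = [];

  for (const char of text) {
    const neighbors = QWERTY_NEIGHBORS[char.toLowerCase()];
    if (neighbors && rng() < opts.typoRate) {
      // Type an adjacent key first, then correct it with Backspace.
      const wrong = neighbors.charAt(Math.floor(rng() * neighbors.length));
      plan.push({ key: wrong, delayMs: delay() });
      plan.push({ key: 'Backspace', delayMs: delay() });
    }
    plan.push({ key: char, delayMs: delay() });
  }
  return plan;
}
```

Because every typo is followed by a Backspace and then the intended character, replaying the plan always yields the original text. In production `Math.random` could be passed as the `rng` argument.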
The profile service can extract context from web pages:
// Example: Extract job posting details
const contextQueries = [
  { type: 'css-selector', value: '.job-title', name: 'Job Title' },
  { type: 'css-selector', value: '.company-name', name: 'Company' },
  { type: 'whole-page-content', value: '', name: 'Full Posting' },
];

The whole-page-content query uses @mozilla/readability for content extraction and turndown for HTML→Markdown conversion.
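The README does not describe how extracted context reaches the model. One plausible approach is to append each named value to the profile's system prompt, as sketched below. The `ExtractedContext` shape, the `buildSystemPrompt` function, the section headings, and the 4000-character cap are all assumptions made for illustration.

```typescript
// Assumed result shape: one named string per context query.
interface ExtractedContext {
  name: string;
  value: string;
}

// Appends non-empty context values to the profile's system prompt.
function buildSystemPrompt(
  systemPrompt: string,
  contexts: ExtractedContext[],
  maxCharsPerContext = 4000, // Arbitrary cap to keep prompts bounded
): string {
  const sections = contexts
    .map((c) => ({ name: c.name, value: c.value.trim() }))
    .filter((c) => c.value.length > 0) // Queries that matched nothing are dropped
    .map((c) => `## ${c.name}\n${c.value.slice(0, maxCharsPerContext)}`);

  if (sections.length === 0) return systemPrompt;
  return `${systemPrompt}\n\n# Page context\n${sections.join('\n\n')}`;
}
```

Capping each value is worth considering for whole-page-content in particular, since a full page converted to Markdown can easily exceed a model's useful context.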
| Category | Technology | Purpose |
|---|---|---|
| Framework | WXT | Chrome Extension (Manifest V3) |
| UI | Preact + TypeScript | Lightweight React alternative |
| Validation | Valibot | Schema validation (not Zod) |
| Storage | @wxt-dev/storage | Chrome extension storage |
| Messaging | @webext-core/messaging | Cross-context communication |
| Content | Turndown + Readability | HTML→Markdown conversion |
| Icons | lucide-preact | Icon library |
| Package Manager | Bun | Fast JavaScript runtime |
| Build | Vite + vite-imagetools | Build tooling |
src/
├── entrypoints/
│   ├── background.ts             # Service worker (badge, messaging relay)
│   ├── content.ts                # Content script (orchestrates all services)
│   ├── overlay.css               # Display overlay styles
│   ├── globals.css               # Global styles (CSS variables, dark mode)
│   ├── popup/                    # Extension popup UI
│   │   ├── popup.tsx
│   │   ├── popup.css
│   │   └── pages/                # Main, Profiles, Settings, Onboarding
│   └── dashboard/                # Full-page management dashboard
│       ├── dashboard.tsx
│       ├── dashboard.css
│       └── pages/                # Profiles, Rules, History, Statistics, Settings
├── services/                     # Core business logic
│   ├── recorder-service.ts       # Microphone recording
│   ├── transcription-service.ts  # OpenRouter Whisper calls
│   ├── llm-service.ts            # OpenRouter LLM calls
│   ├── inserter-service.ts       # Text injection orchestrator
│   ├── profile-service.ts        # Profile CRUD and matching
│   ├── keyboard-service.ts       # Hotkey handling
│   ├── bus-service.ts            # Event bus + state management
│   ├── ui-service.ts             # UI state management
│   ├── storage-service.ts        # Storage abstraction
│   ├── auth-service.ts           # Authentication flow
│   └── inserter/                 # Inserter strategies
│       ├── insert-text.ts
│       ├── human-simulation.ts
│       ├── clipboard-paste.ts
│       └── direct-value.ts
├── components/                   # Feature components
│   ├── Display.tsx               # Transcription/LLM output overlay
│   ├── Overlay.tsx               # Recording visualization
│   ├── Sidebar.tsx               # Dashboard navigation
│   ├── RuleList.tsx              # Profile rule editor
│   ├── ContextQueryList.tsx      # Context query editor
│   └── ...                       # 20+ components
├── ui/                           # Reusable UI primitives
│   ├── Button.tsx
│   ├── Input.tsx
│   ├── Select.tsx
│   ├── TextArea.tsx
│   └── ...                       # 11 UI components
├── schemas/                      # Valibot schemas
│   ├── auth.ts
│   ├── error.ts
│   ├── llm.ts
│   └── transcription.ts
├── hooks/                        # Custom Preact hooks
│   ├── useBus.ts                 # Event bus integration
│   ├── useCurrentProfile.ts      # Profile matching
│   ├── useStorage.ts             # Storage access
│   └── useExitAnimation.ts       # Page transitions
├── contexts/                     # Preact contexts
│   └── router.ts                 # Client-side routing
├── shared/                       # Shared types
│   └── messages.ts               # Cross-context messaging protocol
├── utils/                        # Utility functions
│   ├── base64-encoder.ts         # Audio encoding
│   ├── logger.ts                 # Dev-only logging
│   ├── profile-utils.ts          # Profile helpers
│   └── shallow-equal.ts          # State comparison
├── storage.ts                    # Storage types and defaults
├── constants.ts                  # App constants, icon maps
├── errors.ts                     # Custom error classes
└── type.ts                       # Shared utility types
- Hold-to-record with configurable hotkey (default: backtick)
- Real-time recording visualization
- Automatic MIME type detection (OGG, WebM, MP4)
- Chunked audio streaming for low latency
- Whisper integration via OpenRouter
- Streaming upload with progress feedback
- Automatic retry on failure
- Language selection per profile
- Profile-aware system prompts
- Page context injection (CSS selectors, XPath, whole page)
- Streaming SSE responses
- Reasoning mode support
- Provider selection
- 4 injection strategies
- Smart element targeting (profile selector → active element → document.activeElement → body search)
- React compatibility mode
- Human-like typing simulation with configurable typos
- Rule-based activation (hostname, URL patterns, DOM selectors)
- Context extraction from page elements
- Per-profile model preferences
- Import/export profiles as JSON
- Onboarding flow with API key setup
- Full-page management interface
- Profile editor with visual rule builder
- History log (planned)
- Statistics tracking (planned)
- Provider management
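The automatic MIME type detection listed among the recording features above could work by probing a list of candidates in order. The sketch below shows that idea with the support check injected as a function, so it runs outside a browser. The candidate order, the codec strings, and the name `pickMimeType` are assumptions; the README only says OGG, WebM, and MP4 are detected.

```typescript
// Candidate order is an assumption; the README only lists OGG, WebM and MP4.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  'audio/ogg;codecs=opus',
  'audio/webm;codecs=opus',
  'audio/webm',
  'audio/mp4',
];

// Returns the first supported type, or undefined to fall back to the browser default.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  return candidates.find((type) => isSupported(type));
}

// In a browser this would be called roughly as:
//   const mimeType = pickMimeType((t) => MediaRecorder.isTypeSupported(t));
//   const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);
```

Injecting the predicate instead of calling `MediaRecorder.isTypeSupported` directly keeps the selection logic testable without a browser.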
- Bun package manager
- Chrome browser
- OpenRouter API key
# Install dependencies
bun install
# Start development server
bun dev

- Load the extension from .output/chrome-mv3-dev
- Navigate to http://localhost:8000/?title=<feature-name>
- Use browser DevTools to inspect:
  - Console logs (filtered by source)
  - Storage state
  - Network requests
- TypeScript strict mode throughout
- Preact (not React) — imports from preact and preact/hooks
- Valibot for schema validation (not Zod)
- CSS files per component (not CSS-in-JS)
- Singleton services with static instance getter
- Event-driven architecture via BusService
- Path alias @/ maps to src/
- Custom error classes in src/errors.ts
Event Bus:
// Emit events
await bus.emit('transcription:done', { text });
// Listen to events
bus.on('transcription:done', async ({ text }) => {
  // Handle event
});
// Watch state changes
bus.watch((state) => {
  console.log('State:', state.phase);
});

Service Singleton:
class MyService {
  // Optional because the instance is created lazily on first access
  private static _instance?: MyService;

  static get instance(): MyService {
    if (!this._instance) this._instance = new MyService();
    return this._instance;
  }
}

Profile Matching:
// Find matching profile for current page
const { profile, rule } = ProfileService.instance.findMatchingProfile();
// Extract context from page
const contexts = await ProfileService.instance.grabContextFromWebpage();

Why Preact:
- 3KB gzipped vs 40KB for React
- Perfect for Chrome extensions where bundle size matters
- Same API as React, easy migration

Why Valibot:
- Tree-shakable (only includes what you use)
- Better TypeScript inference
- Smaller bundle size

Why a custom event bus:
- No external dependencies
- Built-in async support
- Natural fit for Chrome extension's message passing
- Phase-based state machine pattern

Why singleton services:
- Chrome extension content scripts run in page context
- Services need to maintain state across events
- Simple dependency management
- Easy to test and mock
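To show why a dependency-free bus with async handlers and phase watchers is small enough to write by hand, here is a minimal sketch of the emit / on / watch pattern used earlier. It is not the extension's BusService: the class name `TinyBus`, the `setPhase` method, the reduced `Phase` union, and the single-event `BusEvents` map are all assumptions made for illustration.

```typescript
// Hypothetical minimal bus; the event name mirrors the README, everything else is assumed.
type Phase = 'idle' | 'recording' | 'transcribing' | 'processing' | 'awaiting-approval';

interface BusEvents {
  'transcription:done': { text: string };
}

interface BusState {
  phase: Phase;
}

type Handler<T> = (payload: T) => void | Promise<void>;
type AnyHandler = (payload: unknown) => void | Promise<void>;
type Watcher = (state: BusState) => void;

class TinyBus {
  private handlers = new Map<string, AnyHandler[]>();
  private watchers: Watcher[] = [];
  private state: BusState = { phase: 'idle' };

  on<K extends keyof BusEvents>(event: K, handler: Handler<BusEvents[K]>): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler as AnyHandler);
    this.handlers.set(event, list);
  }

  // Handlers run in registration order; async handlers are awaited one by one.
  async emit<K extends keyof BusEvents>(event: K, payload: BusEvents[K]): Promise<void> {
    for (const handler of this.handlers.get(event) ?? []) {
      await handler(payload);
    }
  }

  watch(watcher: Watcher): void {
    this.watchers.push(watcher);
  }

  // Replaces the state and notifies watchers only when the phase actually changes.
  setPhase(phase: Phase): void {
    if (phase === this.state.phase) return;
    this.state = { phase };
    for (const watcher of this.watchers) watcher(this.state);
  }
}
```

Typing `on` and `emit` against a single `BusEvents` map gives compile-time checking of event names and payloads, which is one reason a hand-rolled bus can be preferable to an untyped emitter.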
- Chrome 88+ (Manifest V3)
- Firefox (experimental, via wxt -b firefox)
- No data stored server-side — all processing happens via your OpenRouter API key
- Microphone permission — only used during recording, released immediately after
- Local storage only — profiles, settings, and history stored in Chrome extension storage
- No analytics — no tracking, no telemetry
Proprietary — see LICENSE
oxcl — GitHub