IAN'S AI THOUGHTSTREAM
Tag

#dictation

2 posts

2026·05·28 17:40 / 1 MIN

Ghost Pepper Wins for Dictation

I was wrong about Aqua Voice being the ceiling for fast dictation. Ghost Pepper is fantastic, and my Aqua subscription is cancelled. Ghost Pepper is free, MIT-licensed, 100% local (WhisperKit plus a small Qwen model for cleanup), and astoundingly fast on Apple Silicon.

The measure that matters is developer-speak. Saying "tilde slash dev" should produce ~/dev. Saying "eich mack or jay double-you tee" should produce "HMAC or JWT". Ghost Pepper gets both right, every time.
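
For anyone who wants to turn that test into something repeatable, here is a minimal sketch of a scoring harness. The two cases are the ones from this post; `score` and `fake_transcribe` are made-up names for illustration, and the fake transcriber only stands in for feeding real recorded audio to whichever app is being tested.

```python
from typing import Callable, Dict

# Spoken phrase -> expected text, taken from the examples in this post.
CASES: Dict[str, str] = {
    "tilde slash dev": "~/dev",
    "eich mack or jay double-you tee": "HMAC or JWT",
}

def score(transcribe: Callable[[str], str], cases: Dict[str, str] = CASES) -> float:
    """Return the fraction of cases the transcriber gets exactly right."""
    hits = sum(
        1 for spoken, expected in cases.items()
        if transcribe(spoken).strip() == expected
    )
    return hits / len(cases)

# Hypothetical stand-in: it knows one phrase and echoes everything else.
def fake_transcribe(spoken: str) -> str:
    return {"tilde slash dev": "~/dev"}.get(spoken, spoken)

if __name__ == "__main__":
    print(f"developer-speak score: {score(fake_transcribe):.0%}")
```

Exact string match is deliberately strict: "~/dev" with a stray space or a capitalized "Dev" is a miss, which is how it feels in a terminal.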

Ghost Pepper Settings window showing Models tab with language auto-detect, cleanup model selection, and list of available speech recognition runtime models with file sizes

Key bindings

The defaults ship as hold-Control to talk, but my muscle memory is from Aqua: right Option as push-to-talk. Reusing those keys worked fine. Aqua's double-tap-to-go-hands-free mode is the one feature I miss, and Ghost Pepper doesn't have it yet, so Shift+RightOpt is standing in. On my Keychron K2 the M1 macro key handles it nicely. Might take a swing at adding the double-tap toggle upstream.
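
The double-tap toggle itself is a small piece of logic. Below is a rough sketch of the timing part only, not Ghost Pepper's code and not a patch: the class name, the 0.4 second window, and the idea of passing timestamps in by hand are all assumptions made for illustration. A real implementation would hook this up to the app's key event handling.

```python
from typing import Optional

class DoubleTapToggle:
    """Flip hands-free mode when a key is tapped twice in quick succession."""

    def __init__(self, window: float = 0.4) -> None:
        self.window = window                    # max seconds between taps (assumed value)
        self.last_tap: Optional[float] = None   # time of the previous unpaired tap
        self.hands_free = False

    def on_tap(self, now: float) -> bool:
        """Record a tap at time `now` (seconds); return the hands-free state."""
        if self.last_tap is not None and now - self.last_tap <= self.window:
            self.hands_free = not self.hands_free
            self.last_tap = None                # consume the pair; a third tap starts fresh
        else:
            self.last_tap = now
        return self.hands_free

if __name__ == "__main__":
    toggle = DoubleTapToggle()
    for t in (0.0, 0.2, 5.0, 5.1):
        print(t, toggle.on_tap(t))
```

Clearing `last_tap` after a successful pair means a quick triple tap toggles once rather than twice, which seems like the less surprising behavior.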

The cleanup model is a little too honest

Aqua quietly filtered out coughs, keyboard noise, and other non-speech. Ghost Pepper does not. [keyboard clacking] and [snorts] have both shown up in my output, courtesy of Whisper's annotation habit leaking through the cleanup pass. Guess I'll have to be a little more civilized at the desk.
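
Short of being more civilized, a post-processing filter could catch these. This is a minimal sketch, assuming the annotations always look like the two above: lowercase words in square brackets, standing apart from the surrounding text. `strip_annotations` is a hypothetical helper, not something Ghost Pepper provides.

```python
import re

# A bracketed run of lowercase words that does not directly follow another
# character, so "[snorts]" matches but "items[i]" is left alone.
NON_SPEECH = re.compile(r"(?<!\S)\[[a-z][a-z ]*\]")

def strip_annotations(text: str) -> str:
    """Remove Whisper-style non-speech tags and tidy the leftover spaces."""
    cleaned = NON_SPEECH.sub("", text)
    return re.sub(r" {2,}", " ", cleaned).strip()

if __name__ == "__main__":
    print(strip_annotations("Ship it [keyboard clacking] on Friday [snorts]"))
```

The lookbehind is there because dictated code is full of legitimate square brackets. A standalone "[foo]" spoken on purpose would still be eaten, so this trades a rare false positive for a quieter transcript.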

2026·05·27 17:24 / 1 MIN

Aqua Voice vs Ghost Pepper

Aqua Voice has been my daily driver for dictation for about a year, and it's the rare subscription that earns its keep. Eight dollars a month, fast, and genuinely accurate. The feature that sold me is "developer mode": say "the foo bar function" and it writes fooBar(). Say "tilde slash dev slash foo" and it writes ~/dev/foo. Built-in macOS and iOS dictation feels embarrassing by comparison.

AQUA app interface showing Dictionary feature with custom word entries like CodeRabbit, IP, and auth listed with remove options
Aqua typing assistant dashboard showing user "Ian" with 68,188 total words typed, 19 hours saved, and Level 6 Great Lake achievement status

68,188 words through it so far. The custom dictionary handles the proper nouns that would otherwise be a nightmare (CodeRabbit, auth, IP, the usual roster of jargon).

The one thing I don't love

Audio leaves my machine. How long is it kept? Where is it stored? The product keeps a history, and I don't want a history. Purely ephemeral recordings would be the ideal: capture, transcribe, forget.

A local-first contender

Ghost Pepper just landed on my radar. 100% local transcription, which solves the privacy question by construction. I haven't tried it yet, but it's next on the list.

The barrier to building this kind of tool is lower than it's ever been. Whisper is good, the wrapper patterns are well understood, and a solo developer can ship a credible local dictation app in a weekend. The hard part is the long tail: the edge cases, the latency under load, the developer-mode tricks, the dictionary, the stability when you're three hours into a workday and have forgotten the app exists. That long tail is what $8/month buys you. We'll see if Ghost Pepper closes the gap.