Vordi

Voice to text that remembers.

Live · Founder
vordi.site · GitHub

What it is

A free, open-source macOS voice-typing app, and my answer to Wispr Flow and Superwhisper. Hold fn and speak into any app: a Dynamic Notch streams your speech through Groq, OpenAI, or a local Whisper model, and the transcript is cleaned and rewritten before the text is injected. Pure Swift and SwiftUI, no Electron. It also keeps a private, on-device memory of everything you have said. I have dictated 142,000 words through it.

System design

Local-first by default. Speech is routed to the right engine per language: Groq for fast English, OpenAI for Hinglish and multilingual speech, or a fully local Whisper model. An LLM cleanup and rewrite pass follows. A private, on-device memory built on SQLite FTS5 plus local embeddings powers a small RAG layer, so you can ask questions over everything you have ever dictated and get answers with cited sources. The app is pure Swift and SwiftUI, with no Electron. Cloud keys are yours to bring, and local LLMs through LM Studio or Ollama are a first-class fallback rather than an afterthought.
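
The per-language routing described above can be sketched as a small decision function. This is an illustrative sketch in Python, not Vordi's Swift code: the `Settings` fields, the engine labels, the language tags, and the fallback order when a key is missing are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical engine labels; the app's real identifiers are not given in the text.
GROQ = "groq"
OPENAI = "openai"
LOCAL_WHISPER = "local-whisper"


@dataclass(frozen=True)
class Settings:
    language: str                 # assumed tags, e.g. "en", "hi-en" (Hinglish), "de"
    offline_only: bool = False    # assumed user toggle: never leave the device
    has_groq_key: bool = False    # user brought a Groq key
    has_openai_key: bool = False  # user brought an OpenAI key


def pick_engine(s: Settings) -> str:
    """Choose a transcription engine for one capture."""
    if s.offline_only:
        return LOCAL_WHISPER
    # Fast path: English goes to Groq when a key is available.
    if s.language == "en" and s.has_groq_key:
        return GROQ
    # Hinglish and other languages go to OpenAI (assumed to also cover
    # English when no Groq key is present).
    if s.has_openai_key:
        return OPENAI
    # No usable cloud key: fall back to the local model.
    return LOCAL_WHISPER
```

For example, `pick_engine(Settings("hi-en", has_groq_key=True, has_openai_key=True))` returns `"openai"`, and the same call with `offline_only=True` returns `"local-whisper"`.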

What I got wrong, then fixed.

  1. 01 · the problem

    Fast Whisper variants tend to be English-only, and Whisper hallucinates ghost phrases like 'thank you for watching' on silent audio. One transcription model could not cover Hinglish, fast English, and clean output at once.

    what I did

    Route fast English to Groq and multilingual and Hinglish speech to OpenAI, then run a hallucination guard that strips known phantom text before any cleanup pass touches it.

  2. 02 · the problem

    Good cleanup needs context, but a dictation widget that steals keyboard focus or demands Accessibility permission up front is dead on arrival.

    what I did

    Made fn+key the primary hotkey so it works without Accessibility permission, built the Dynamic Notch to morph through listening and thinking states without taking focus, and logged every capture to an auditable run log.

  3. 03 · the problem

    Memory is useless if you can only find things by exact words. Keyword search misses what you meant.

    what I did

    Hybrid retrieval on-device: SQLite FTS5 for keywords plus local embeddings for meaning, fused into a small RAG layer that answers questions over your own transcripts and cites the source.

  4. 04 · the problem

    Cloud transcription means handing your voice to someone else. Local models are slow.

    what I did

    You bring your own keys for the cloud engines, everything personal stays on-device by default, and local LLMs through LM Studio or Ollama are a first-class fallback, not an afterthought.
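
The hallucination guard from the first item could look roughly like the sketch below, written in Python for brevity. It assumes the guard is a blocklist applied sentence by sentence; only 'thank you for watching' comes from the text above, and the second phrase, the function names, and the sentence-splitting rule are hypothetical.

```python
import re

# Assumed blocklist. Only the first phrase is named in the text; the second
# is an illustrative addition.
PHANTOM_PHRASES = ("thank you for watching", "thanks for watching")


def _normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    lowered = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(lowered.split())


def strip_phantoms(transcript: str) -> str:
    """Drop sentences that consist solely of a known phantom phrase."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    kept = [s for s in sentences if s and _normalize(s) not in PHANTOM_PHRASES]
    return " ".join(kept)
```

Matching whole sentences rather than substrings keeps real speech intact: `strip_phantoms("Send the report by Friday. Thank you for watching!")` returns `"Send the report by Friday."`, while a sentence that merely contains the phrase, such as "I said thank you for watching the kids.", is left alone.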

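The text does not say how the keyword and embedding results are fused. One common option is reciprocal rank fusion, sketched here in Python as an assumption rather than as Vordi's actual method. The inputs are ranked lists of transcript row ids, one from a keyword query (for example FTS5) and one from an embedding search; the function name and the constant `k=60` are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into a single ranking.

    Each id scores 1 / (k + rank) for every list it appears in, so ids
    that rank well in both keyword and semantic search rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


keyword_hits = [3, 1, 7]    # hypothetical row ids from a keyword query
semantic_hits = [1, 9, 3]   # hypothetical row ids from an embedding search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # [1, 3, 9, 7]
```

Rows 1 and 3 appear in both lists and land first. Rank-based fusion also sidesteps the problem that keyword scores and cosine similarities are on different scales.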