NexTalk

The missing voice input for Linux. Beautiful, private, offline.

2026-01-08

Product Introduction

  1. Definition: NexTalk is a native Linux voice input tool leveraging offline automatic speech recognition (ASR). It integrates directly with the Fcitx5 input method framework via Unix domain sockets for system-level text injection.
  2. Core Value Proposition: It delivers a sub-20ms latency voice typing experience exclusively for Linux, prioritizing 100% offline privacy, a minimalist transparent UI, and native desktop integration without cloud dependencies.

Main Features

  1. Transparent Capsule UI: Uses Flutter (Dart) to render a hardware-accelerated, transparent overlay at 60 FPS. The overlay appears only during voice input and disappears once speech ends, minimizing screen clutter.
  2. 100% Offline ASR Inference: Powered by Sherpa-onnx with Zipformer models, processing audio locally. No cloud APIs are involved, so audio never leaves the device and the stated sub-20ms latency does not depend on internet connectivity.
  3. Native Fcitx5 Integration: Communicates over low-overhead local IPC using Unix domain sockets, enabling direct text injection into any Fcitx5-supported application (terminals, IDEs, browsers) on X11 and Wayland compositors. Avoids input-simulation tools such as ydotool, which can be unreliable.
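
To make the socket-based hand-off described above more concrete, here is a minimal sketch of passing recognized text over a Unix domain socket. It is not NexTalk's actual protocol: the socket file name, the 4-byte length prefix, and the function names (send_text, recv_exact, receive_one, demo) are all assumptions made for illustration, and the receiving side is a stand-in for whatever component runs inside Fcitx5.

```python
import os
import socket
import struct
import tempfile
import threading


def send_text(sock_path: str, text: str) -> None:
    """Send one UTF-8 message, prefixed with its 4-byte big-endian length.

    The framing is a hypothetical choice for this sketch, not NexTalk's format.
    """
    payload = text.encode("utf-8")
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        buf.extend(chunk)
    return bytes(buf)


def receive_one(server: socket.socket) -> str:
    """Accept one connection and return the decoded message.

    Stand-in for the receiving side; a real input method would commit
    this string to the focused application.
    """
    conn, _ = server.accept()
    with conn:
        (length,) = struct.unpack(">I", recv_exact(conn, 4))
        return recv_exact(conn, length).decode("utf-8")


def demo(text: str = "hello from the recognizer") -> str:
    """Round-trip one message through a temporary socket file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nextalk-demo.sock")  # hypothetical name
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            received = []
            worker = threading.Thread(
                target=lambda: received.append(receive_one(server))
            )
            worker.start()
            send_text(path, text)
            worker.join()
        return received[0]


if __name__ == "__main__":
    print(demo())
```

The length prefix lets the receiver know where one utterance ends on a stream socket, which matters because a single recv call may return only part of a message.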

Problems Solved

  1. Pain Point: Addresses the lack of high-performance, privacy-focused voice input on Linux, where cloud-based alternatives compromise latency and data security.
  2. Target Audience: Linux developers, privacy advocates, multilingual professionals, and accessibility users requiring efficient, offline-capable dictation.
  3. Use Cases: Dictating code in IDEs (VSCode, JetBrains), composing emails/chat messages, terminal command input, and multilingual transcription without latency-induced disruptions.

Unique Advantages

  1. Differentiation: Unlike tools built for other platforms (e.g., Windows Speech Recognition), NexTalk is optimized solely for Linux, with deeper system integration (Fcitx5 sockets) and no telemetry. Outperforms cloud-dependent tools in latency and privacy.
  2. Key Innovation: Combines Sherpa-onnx’s Zipformer (state-of-the-art streaming ASR) with direct Fcitx5 socket communication, bypassing Wayland restrictions. The Flutter-rendered capsule UI sets a new standard for Linux-native application aesthetics.

Frequently Asked Questions (FAQ)

  1. Does NexTalk work on Wayland?
    Yes. Its native Fcitx5 integration via Unix sockets bypasses Wayland’s input restrictions, ensuring seamless functionality across GNOME, KDE Plasma, and Sway.
  2. What languages does NexTalk support?
    Currently optimized for English and Mandarin Chinese using Sherpa-onnx’s Zipformer models. Additional language models are planned via community contributions.
  3. Is NexTalk truly offline?
    Absolutely. All speech recognition (Sherpa-onnx) runs locally, so no audio data leaves your device. No internet connection is required after installation.
  4. How does NexTalk achieve sub-20ms latency?
    Through optimized Sherpa-onnx inference and efficient IPC via Unix sockets, minimizing processing and communication delays end-to-end.
  5. Is NexTalk free and open source?
    Yes. Licensed under MIT/GPL, available on GitHub. Free for personal and commercial use. Development is community-driven.
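
For the latency question above (FAQ 4), the IPC share of the delay can be checked independently of the recognizer. The sketch below times round trips over a connected pair of Unix domain sockets using only the Python standard library. It measures socket overhead only, not ASR inference, so it neither reproduces nor confirms the sub-20ms figure. The function names and the echo peer are assumptions made for illustration.

```python
import socket
import statistics
import threading
import time
from typing import List


def echo(conn: socket.socket) -> None:
    """Echo every received chunk back until the peer closes."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_round_trips(message: bytes, rounds: int = 200) -> List[float]:
    """Return per-round-trip times in milliseconds over an AF_UNIX socket pair."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    worker = threading.Thread(target=echo, args=(b,))
    worker.start()
    timings = []
    with a:
        for _ in range(rounds):
            start = time.perf_counter()
            a.sendall(message)
            remaining = len(message)
            # Read until the whole echoed message has come back.
            while remaining:
                chunk = a.recv(remaining)
                if not chunk:
                    raise ConnectionError("echo peer closed unexpectedly")
                remaining -= len(chunk)
            timings.append((time.perf_counter() - start) * 1000.0)
    # Closing `a` above signals end-of-stream to the echo thread.
    worker.join()
    return timings


if __name__ == "__main__":
    samples = measure_round_trips("recognized text".encode("utf-8"))
    print(f"median round trip: {statistics.median(samples):.3f} ms")
```

Reporting the median rather than the mean keeps a few scheduler-induced outliers from dominating the result.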
