Audio Tools

Explore the best new audio tools and products curated by the community.

Kokori
Transform text to speech with a powerful macOS app
API, Menu Bar Apps, Audio

Kokori TTS app for macOS: Transform text to speech with a powerful local API server & desktop app. High-quality voices, speed control, and seamless menubar integration.

2026-01-30
63
Sayline
The Most Productive Way to Type in 2026
Productivity, Menu Bar Apps, Audio

Sayline is a native macOS app that brings private, local voice dictation to any text field. Use global hotkeys to replace typing in Gmail, Slack, VS Code, or Notes. It runs entirely on-device (using NVIDIA Parakeet and MLX), so you can fix grammar, translate, or format text instantly without your audio ever leaving your Mac. Stop worrying about cloud privacy. Just SAY the LINE and watch it appear. Pure magic.

2026-01-28
53
Qwen3-TTS
Voice design, cloning & 97ms streaming
Open Source, Artificial Intelligence, Audio

A family of SOTA speech models (0.6B & 1.7B) supporting 10 languages. Features prompt-based Voice Design, 3s zero-shot cloning, and extreme low-latency streaming.

2026-01-23
61
Session Pilot
Offline Transcription
Privacy, Artificial Intelligence, Audio

Session Pilot is a fully offline speech transcription application designed for privacy, reliability, and independence from cloud services. It converts live or recorded audio into accurate text entirely on-device, with no internet connection required. Ideal for meetings, interviews, and research, Session Pilot ensures sensitive data stays local while delivering efficient, high-quality transcription anytime, anywhere.

2026-01-13
55
Voquill
The open source WisprFlow alternative
Open Source, Writing, GitHub, Audio

Voquill is the open source alternative to WisprFlow. Type 4x faster by using your voice. It works in any app and runs natively on any operating system (macOS, Windows, or Linux). Whether you're using agent mode or AI dictation mode, Voquill understands your intent and turns what you say into beautifully formatted text.

2026-01-13
67
AudioPriorityBar
Priority-based audio switching for macOS menu bar
GitHub, Menu Bar Apps, Apple, Audio

A native macOS menu bar app that automatically manages audio device priorities. Set your preferred order for speakers, headphones, and microphones - the app automatically switches to the highest-priority connected device.

2025-12-30
68
Chatterbox Turbo
Fast, expressive, open source TTS with native watermarking
Open Source, Artificial Intelligence, Audio

Chatterbox Turbo is a 350M parameter open-source TTS model. It features paralinguistic tags (control laughs, sighs, etc.), zero-shot cloning, and runs 6x faster than real-time. It uniquely includes built-in PerTh watermarking for safety.

2025-12-30
69
NOIZ AI
Use emoji to voice season's greetings with emotion
Emoji, Artificial Intelligence, Audio

Holidays aren’t always joyful. Sometimes you’re far from home. Sometimes someone is missing. Noiz lets you send a voice message that actually sounds how you feel. Emojis guide the emotion in the voice — not as decoration, but as direction. A message can pause, soften, smile, or ache. Santa or your own voice simply helps carry what’s hard to say when you can’t be there.

2025-12-20
71
SAM Audio
Segment any sound with text, visual, or time prompts
Open Source, Artificial Intelligence, Audio

SAM Audio is a unified model that separates any sound from any source. Use text ("dog barking"), visual clicks on video, or time spans to isolate specific audio. It unifies speech, music, and sound effect separation into one promptable model.

2025-12-19
65
VoiceNotes
Speak your thoughts. Get clean notes.
Productivity, Notes, Audio

Most voice tools stop at recording or raw transcripts. VoiceNotes.me is built for frictionless capture → usable notes. One tap to record, live transcription as you speak, and automatic cleanup into clean, editable notes. No files, no setup, no reformatting. Just think out loud and keep moving.

2025-12-19
65
Grok Voice Agent API
Bringing the power of Grok Voice to all developers
API, Artificial Intelligence, Audio

Grok Voice Agent API lets developers build real-time voice agents using xAI's in-house stack (VAD, tokenizer, audio models). It features <1s latency, function calling, and native multilingual support.

2025-12-18
93
Pavis
Real-time fact-checking and manipulation spotting on calls
Meetings, Artificial Intelligence, Audio

Real-time analysis of your calls. Detect manipulation, fact-check, and come up with unique questions on the spot. Pavis transcribes conversations and instantly detects manipulation tactics, fact-checks claims, and suggests critical questions you'd miss in the moment. Stop walking into bad deals—whether it's investor pitches, sales negotiations, or contractor quotes. See pressure tactics as they happen. Verify statistics before you respond. Ask the questions that change outcomes.

2025-11-20
72
ElevenLabs Image & Video
The best audio, image & video models now in one platform
Artificial Intelligence, Audio, Photo & Video

ElevenLabs now has image and video generation. Generate visuals with top models like Sora, Veo, and Kling, then export to the Studio to add high-quality voiceovers, music, AI sound effects, and captions. It's a unified creative platform.

2025-11-19
67
Typeless
AI Voice Dictation That's Actually Intelligent
Productivity, Artificial Intelligence, Audio

Speak naturally, and Typeless will turn your words into polished messages, emails, and documents that read like you carefully typed them. Our AI understands context, fixes grammar, and adapts to your style - so you can focus on what you want to say, not how to say it.

2025-11-18
71
VNYL
Truly unlimited podcast hosting
Audio

Modern podcast hosting with truly unlimited storage and downloads at a flat rate: no caps, no overage fees, no surprises. While competitors charge per download tier (forcing you to delete episodes or upgrade), we leverage modern cloud infrastructure to make unlimited hosting genuinely affordable. Built-in team collaboration, IAB-compliant analytics, publish scheduling, and a dedicated podcast website are all included.

2025-11-17
57
Hathora
Explore, test & deploy production ready voice models.
Developer Tools, Artificial Intelligence, Audio

Build voice agents on open source or closed models with zero DevOps. Start instantly on shared endpoints and upgrade to dedicated infrastructure for privacy, compliance, or VPC requirements. Models run in 14 regions for ultra low latency. Bring your own models or custom containers as you scale.

2025-11-12
73
Scribe v2 Realtime
The most accurate real-time Speech to Text model.
Languages, Audio

Built for voice agents, meeting notetakers, and live applications, it transcribes in 150ms across 90+ languages, including English, French, German, Italian, Spanish, Portuguese, Hindi, and Japanese.

2025-11-12
64
Omnilingual ASR
Advancing automatic speech recognition for 1,600+ languages
Open Source, Artificial Intelligence, Audio

Meta's Omnilingual ASR is an open-source (Apache 2.0) speech recognition model supporting 1,600+ languages. It uses an LLM-based architecture that can be extended to new languages with just a few in-context examples, without retraining.

2025-11-11
67
voicebrief.io
Turn PDFs into professors
ai, education, study, productivity


2025-11-09
72
MyClone
AI Clone that scales your expertise
Productivity, Artificial Intelligence, Audio

An AI platform built for knowledge professionals to help scale their services. Check out the live demo at https://www.myclone.is/. The clone acts as an extension of you, continuously learning from your YouTube, podcasts, documents, videos, and audio. It speaks in your voice and language, and integrates into your current workflow (website, Slack, etc.). You can also completely white-label it. Think of MyClone as "Shopify for knowledge professionals".

2025-11-07
56
Stream Ring by Sandbar
Capture thoughts & build ideas anywhere
Wearables, Artificial Intelligence, Audio

Stream is a conversational self extension. It's designed for talking through ideas and capturing notes, with an Inner Voice personalized to you. Stream Ring is a new device for fast, private voice interactions. Hold to speak, whisper in a crowd, and control music effortlessly. No interruptions, no pulling out your phone, and no talking loudly in public. Now available for preorder in limited supply.

2025-11-06
63