Audio Tools
Explore the best new audio tools and products curated by the community.
Kokori TTS app for macOS: Transform text to speech with a powerful local API server & desktop app. High-quality voices, speed control, and seamless menubar integration.
Sayline is a native macOS app that brings private, local voice dictation to any text field. Use global hotkeys to replace typing in Gmail, Slack, VS Code, or Notes. It runs entirely on-device (using NVIDIA Parakeet and MLX), so you can fix grammar, translate, or format text instantly without your audio ever leaving your Mac. Stop worrying about cloud privacy. Just SAY the LINE and watch it appear. Pure magic.
A family of SOTA speech models (0.6B & 1.7B) supporting 10 languages. Features prompt-based Voice Design, 3s zero-shot cloning, and extremely low-latency streaming.
Session Pilot is a fully offline speech transcription application designed for privacy, reliability, and independence from cloud services. It converts live or recorded audio into accurate text entirely on-device, with no internet connection required. Ideal for meetings, interviews, and research, Session Pilot ensures sensitive data stays local while delivering efficient, high-quality transcription anytime, anywhere.
Voquill is the open-source alternative to WisprFlow. Type 4x faster by using your voice. It works in any app and runs natively on macOS, Windows, and Linux. Whether you're using agent mode or AI dictation mode, Voquill understands your intent and turns what you say into beautifully formatted text.
A native macOS menu bar app that automatically manages audio device priorities. Set your preferred order for speakers, headphones, and microphones - the app automatically switches to the highest-priority connected device.
Chatterbox Turbo is a 350M parameter open-source TTS model. It features paralinguistic tags (control laughs, sighs, etc.), zero-shot cloning, and runs 6x faster than real-time. Uniquely includes built-in PerTh watermarking for safety.
Holidays aren’t always joyful. Sometimes you’re far from home. Sometimes someone is missing. Noiz lets you send a voice message that actually sounds how you feel. Emojis guide the emotion in the voice — not as decoration, but as direction. A message can pause, soften, smile, or ache. Santa or your own voice simply helps carry what’s hard to say when you can’t be there.
SAM Audio is a unified model that separates any sound from any source. Use text ("dog barking"), visual clicks on video, or time spans to isolate specific audio. It unifies speech, music, and sound effect separation into one promptable model.
Most voice tools stop at recording or raw transcripts. VoiceNotes.me is built for frictionless capture → usable notes. One tap to record, live transcription as you speak, and automatic cleanup into clean, editable notes. No files, no setup, no reformatting. Just think out loud and keep moving.
Grok Voice Agent API lets developers build real-time voice agents using xAI's in-house stack (VAD, tokenizer, audio models). It features <1s latency, function calling, and native multilingual support.
Real-time analysis of your calls. Detect manipulation, fact-check, and come up with unique questions on the spot. Pavis transcribes conversations and instantly detects manipulation tactics, fact-checks claims, and suggests critical questions you'd miss in the moment. Stop walking into bad deals—whether it's investor pitches, sales negotiations, or contractor quotes. See pressure tactics as they happen. Verify statistics before you respond. Ask the questions that change outcomes.
ElevenLabs now has image and video generation. Generate visuals with top models like Sora, Veo, and Kling, then export to the Studio to add high-quality voiceovers, music, AI sound effects, and captions. It's a unified creative platform.
Speak naturally, and Typeless will turn your words into polished messages, emails, and documents that read like you carefully typed them. Our AI understands context, fixes grammar, and adapts to your style - so you can focus on what you want to say, not how to say it.
Modern podcast hosting with truly unlimited storage and downloads at a flat rate, no caps, no overage fees, no surprises. While competitors charge per download tier (forcing you to delete episodes or upgrade), we leverage modern cloud infrastructure to make unlimited genuinely affordable. Built-in team collaboration, IAB-compliant analytics, publish scheduling, and dedicated podcast website, all included.
Build voice agents on open-source or closed models with zero DevOps. Start instantly on shared endpoints and upgrade to dedicated infrastructure for privacy, compliance, or VPC requirements. Models run in 14 regions for ultra-low latency. Bring your own models or custom containers as you scale.
Built for voice agents, meeting notetakers, and live applications, it transcribes in 150ms across 90+ languages, including English, French, German, Italian, Spanish, Portuguese, Hindi, and Japanese.
Meta's Omnilingual ASR is an open-source (Apache 2.0) speech recognition model supporting 1,600+ languages. It uses an LLM-based architecture that can be extended to new languages with just a few in-context examples, without retraining.
Turn PDFs into professors
An AI platform built for knowledge professionals to help scale their services. Check out the live demo at https://www.myclone.is/ The clone acts as an extension of you, continuously learning from your YouTube, podcasts, documents, videos, and audio. It speaks in your voice and language, and integrates into your current workflow (website, Slack, etc.). You can also completely white-label it. Think of MyClone as "Shopify for knowledge professionals".
Stream is a conversational self-extension. It's designed for talking through ideas and capturing notes, with an Inner Voice personalized to you. Stream Ring is a new device for fast, private voice interactions. Hold to speak, whisper in a crowd, and control music effortlessly. No interruptions, no pulling out your phone, and no talking loudly in public. Now available for preorder, in limited supply.