Product Introduction
- Definition: Shadow is a native macOS application that functions as a local-first AI interface and automation agent. It runs as a background process that captures real-time screen content, audio (both system audio and microphone input), and user-selected text, then uses that context to execute user-defined AI prompts known as "Skills."
- Core Value Proposition: Shadow exists to eliminate the friction between human thought and AI-powered action. It serves as an ambient AI co-pilot for Mac users, automatically capturing on-screen and spoken context to generate outputs like meeting notes, transcribed text, emails, and code, without manual copy-pasting and without a bot joining the call.
Main Features
- Meeting Skills (Autopilot Mode): This feature enables fully automated meeting documentation. Shadow uses audio and screen detection to start and stop automatically with a call. It performs real-time, on-device transcription using a local speech-to-text engine, avoiding cloud processing. Concurrently, it employs computer vision to identify and capture the active meeting window (e.g., Zoom, Google Meet) as a "Smart Screenshot." After the call, it runs a user-selected prompt (Skill) on the combined transcript and screenshot data to generate structured outputs like notes, BANT breakdowns, or follow-up emails.
- Action Skills (Keyboard-Triggered): This is a manual, on-demand execution mode. When the user presses a global keyboard shortcut, Shadow captures a screenshot of the active window, records voice input, and/or reads any selected text. This multimodal context is then sent to a configured Large Language Model (LLM) API to run a specific Skill, and the output is pasted directly into the focused application or displayed in a notification.
- Voice Typing Skill: A specialized Action Skill that converts spoken dictation into polished, written text. The underlying prompt is engineered to clean audio transcription by removing filler words, applying self-corrections, formatting lists, adding punctuation based on pauses, and preserving the speaker's original tone and technical terminology, resulting in text that reads as if natively typed.
- Custom Skills & Prompt Editor: Every Skill is a modular, editable prompt template. Users can modify the system instructions, define the input context (Voice, Screen, Selected Text), and set the output destination (Paste at cursor, Display). This transforms Shadow from a fixed tool into a programmable AI workflow platform for macOS.
- Local-First Privacy Architecture: All audio processing and transcription occur locally on the user's Mac. Audio data never leaves the device, and meeting recordings and transcripts are stored in local files. Calls to external AI APIs are made only when a Skill explicitly triggers them, and user data is not used for model training by default.
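As a rough illustration of the Skill concept described in the features above (a prompt template, a set of declared input sources, and an output destination), the following Python sketch models a Skill as a small data structure. It is not Shadow's actual implementation: the names `Skill`, `InputSource`, `OutputDestination`, and `build_prompt` are hypothetical, and screen context is represented here as plain text rather than an image.

```python
from dataclasses import dataclass
from enum import Enum


class InputSource(Enum):
    # The three context types named in the feature list.
    VOICE = "voice"
    SCREEN = "screen"
    SELECTED_TEXT = "selected_text"


class OutputDestination(Enum):
    PASTE_AT_CURSOR = "paste_at_cursor"
    DISPLAY = "display"


@dataclass(frozen=True)
class Skill:
    """Hypothetical model of a Skill: prompt + declared inputs + output target."""
    name: str
    system_instructions: str
    inputs: frozenset
    output: OutputDestination

    def build_prompt(self, context: dict) -> str:
        """Combine the instructions with only the context this Skill declared."""
        parts = [self.system_instructions]
        # Sort for a deterministic prompt layout.
        for source in sorted(self.inputs, key=lambda s: s.value):
            value = context.get(source)
            if value:
                parts.append(f"[{source.value}]\n{value}")
        return "\n\n".join(parts)


follow_up = Skill(
    name="Follow-up email",
    system_instructions="Write a follow-up email.",
    inputs=frozenset({InputSource.VOICE, InputSource.SELECTED_TEXT}),
    output=OutputDestination.PASTE_AT_CURSOR,
)
```

In this sketch, context captured from a source the Skill did not declare (for example, a screenshot passed to `follow_up`) is simply left out of the prompt, which mirrors the idea that each Skill defines its own input context.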
Problems Solved
- Pain Point: The disruptive workflow of manually switching contexts to record meetings, take notes, and summarize action items, leading to lost information and reduced meeting engagement.
- Pain Point: Inefficient creation of written content (emails, documentation, code comments) due to the cognitive switch between thinking/verbalizing and typing/formatting.
- Target Audience: Remote professionals and hybrid workers (Sales Executives, Account Managers, Product Managers, Engineers, Consultants) who spend significant time in video calls and need accurate, actionable follow-ups.
- Target Audience: Writers, content creators, developers, and executives who require fast, hands-free text input and contextual automation without sacrificing privacy.
- Use Cases: Automatically generating a structured sales call summary, with competitor mentions and agreed next steps, from a Zoom call; drafting a lengthy client email by verbally describing the intent while the project brief is on screen; instantly converting a spoken idea into formatted documentation or a code snippet while working in a development environment.
Unique Advantages
- Differentiation vs. Otter.ai/Fireflies: Unlike cloud-based meeting assistants that require joining calls as a bot and process all data on their servers, Shadow operates invisibly on the user's local machine. It offers deeper system integration (screen capture, global shortcuts) and is not limited to calendar-linked meetings, enabling ad-hoc audio/screen context capture for any task.
- Differentiation vs. Traditional Mac Automation: Unlike Apple Shortcuts or keyboard macro tools, Shadow natively integrates multimodal AI (vision and language models) to understand and act upon unstructured screen content and natural language voice input, enabling far more complex and intelligent automations.
- Key Innovation: "Smart Screenshot" technology programmatically identifies and isolates the relevant meeting or application window from the rest of the desktop. This gives the AI precise visual context instead of a full-screen capture, keeping unrelated on-screen content out of the Skill's input and improving the relevance of its output.
- Key Innovation: The "Skill" abstraction, which packages a specific LLM prompt, a defined set of input sources, and an output method into a single, triggerable entity. This makes advanced AI workflows accessible and reusable without coding.
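The window-isolation idea behind Smart Screenshot can be sketched as a simple selection step over a list of open windows. This is an illustrative Python sketch under stated assumptions, not Shadow's actual algorithm: `WindowInfo`, `MEETING_HINTS`, and `pick_meeting_window` are hypothetical names, the keyword list is invented for the example, and the window list is assumed to come from a system window-enumeration API.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed keyword list for illustration; browser-based tools such as
# Google Meet are matched on the window title rather than the owning app.
MEETING_HINTS = ("zoom", "google meet", "microsoft teams")


@dataclass
class WindowInfo:
    """Hypothetical record for one open window."""
    owner: str      # name of the owning application
    title: str      # window title
    width: int
    height: int
    on_screen: bool = True


def pick_meeting_window(windows: list) -> Optional[WindowInfo]:
    """Return the largest visible window that looks like a meeting, or None."""
    candidates = [
        w for w in windows
        if w.on_screen
        and any(hint in f"{w.owner} {w.title}".lower() for hint in MEETING_HINTS)
    ]
    if not candidates:
        return None
    # Prefer the largest match so small toolbars or popups are skipped.
    return max(candidates, key=lambda w: w.width * w.height)


windows = [
    WindowInfo("Finder", "Downloads", 1200, 800),
    WindowInfo("zoom.us", "Zoom Meeting", 1280, 720),
    WindowInfo("zoom.us", "Zoom", 300, 200),
    WindowInfo("Google Chrome", "Google Meet - Standup", 1000, 700, on_screen=False),
]
selected = pick_meeting_window(windows)
```

With the sample list above, the main Zoom meeting window is selected; the small Zoom toolbar and the hidden browser tab are skipped. A real implementation would then capture only the selected window's bounds rather than the full desktop.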
Frequently Asked Questions (FAQ)
- How does Shadow record meetings without joining the call? Shadow captures meeting audio locally from your Mac's system audio. It does not join as a participant; it listens to the audio output from your computer and, optionally, your microphone input, all processed on-device for privacy.
- Is Shadow secure and private for confidential business meetings? Yes, Shadow employs a local-first privacy architecture. Audio transcription occurs on your Mac, and raw audio never leaves your device. Skills contact external LLM APIs (like OpenAI) only when explicitly triggered, and you can disable this to keep all data completely local.
- What is the difference between Shadow and Apple's built-in dictation? While Apple Dictation converts speech to basic text, Shadow's Voice Typing Skill uses AI to rewrite dictation into polished, naturally phrased text, removing fillers and applying formatting. Furthermore, Shadow can combine voice with on-screen context to perform complex tasks far beyond simple transcription.
- Can I use Shadow with Google Meet, Microsoft Teams, and Zoom? Yes, Shadow's meeting detection and Smart Screenshot are designed to be agnostic to the conferencing platform. It works with any application that plays audio through your system or receives input from your microphone.
- Does Shadow work on Windows or iPhone? Currently, Shadow is a native application built exclusively for macOS. Its deep integration with the Mac operating system (e.g., global keyboard shortcuts, screen capture APIs) is central to its functionality. Cross-platform support has not been announced.
