Product Introduction
- GenAI API for Apple Shortcuts is a TypeScript-based API built with Hono, deployed as a Cloudflare Worker, and designed to integrate generative AI capabilities directly into Apple Shortcuts workflows. It lets users enhance automation tasks with large language models (LLMs) such as Google’s Gemini, Anthropic’s Claude, and OpenAI’s models through a lightweight, serverless architecture. The API acts as a middleware layer, translating Shortcuts’ HTTP requests into AI model completions and returning plain-text responses for seamless integration.
- The core value lies in bridging Apple’s native automation ecosystem with advanced AI capabilities, enabling users to create intelligent shortcuts for tasks like text generation, data analysis, and contextual decision-making without requiring coding expertise. By abstracting AI model interactions into simple API calls, it democratizes access to generative AI within iOS/macOS workflows while keeping costs low through free usage tiers (Cloudflare Workers’ free plan plus the AI provider’s free quota of roughly 1,500 daily requests).
Main Features
- Cloudflare Worker Deployment: The API is optimized for serverless execution on Cloudflare’s edge network, providing low-latency request routing and global scalability (total response time is dominated by the AI model’s inference, not the worker itself). It uses Wrangler for deployment automation and secret management, with environment-specific configurations (.dev.vars, .prod.vars) for secure API key storage.
- Multi-Model Support: Integrates Google AI Studio’s Gemini models (including gemini-2.0-flash), Anthropic’s Claude, and OpenAI’s models through a unified endpoint, allowing users to specify models via POST parameters. System prompts and temperature controls enable precise output tuning directly from Shortcuts actions.
- Security Framework: Implements Bearer Token authentication via Cloudflare Secrets, requiring all requests to include a valid Authorization header. Secrets are encrypted at rest, and the codebase includes Biome linting/formatting rules for a consistent, reviewable codebase.
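The bearer-token check described above can be sketched as a small, framework-agnostic function. The function name and the commented Hono wiring are illustrative assumptions, not the repository’s actual code:

```typescript
// Minimal sketch of bearer-token validation. `expectedToken` would come
// from a Cloudflare Secret (e.g. BEARER_TOKEN set via `wrangler secret put`);
// the function name and shape are illustrative, not the repo's actual code.
function isAuthorized(authHeader: string | null, expectedToken: string): boolean {
  if (!authHeader) return false;
  const [scheme, token] = authHeader.split(" ");
  return scheme === "Bearer" && token === expectedToken;
}

// In a Hono handler this might gate the request (hypothetical wiring):
// if (!isAuthorized(c.req.header("Authorization") ?? null, c.env.BEARER_TOKEN)) {
//   return c.text("Unauthorized", 401);
// }
```

A request without the header, with the wrong scheme, or with a mismatched token would all fall through to a 401 response, matching the behavior described in the FAQ below.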
Problems Solved
- Limited Shortcut Intelligence: Addresses Apple Shortcuts’ inability to process unstructured data or perform complex language tasks by adding LLM-powered text generation and analysis capabilities. Enables dynamic content creation (emails, summaries) and smart data parsing from documents/photos within automation flows.
- Developer/Pro User Bottlenecks: Targets iOS/macOS power users and shortcut creators who need AI enhancements without maintaining backend infrastructure. Eliminates the need to build separate microservices for AI integrations by providing a production-ready API template.
- Cost-Effective Scaling: Removes the cost barrier to sharing AI-powered shortcuts by leveraging free usage tiers, allowing creators to deploy and distribute AI workflows without upfront hosting costs. Supports high-volume use cases like batch processing through optimized cold-start performance.
Unique Advantages
- Edge-Native Architecture: Unlike traditional server-based AI gateways, this solution runs on Cloudflare’s globally distributed Workers platform, reducing latency for international users by processing requests at the nearest edge location.
- Prompt Engineering Flexibility: Supports both user prompts and system-level instructions in API requests, enabling context-aware workflows (e.g., “Always respond in JSON format for parsing”) that adapt to specific shortcut requirements.
- Low-Touch Updates: The Makefile-driven workflow (make init, make deploy) lets users rotate security tokens and redeploy with updated configuration without editing source code, with automatic TypeScript type generation helping keep endpoint types consistent across deployments.
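The prompt and tuning parameters described above might be combined into a request body along these lines. The field names (`model`, `prompt`, `system`, `temperature`) are illustrative assumptions rather than the worker’s confirmed schema:

```typescript
// Hypothetical shape of the POST body a Shortcut would send to the worker.
// Field names are assumptions for illustration; check src/index.ts for the
// actual schema in your deployment.
interface ShortcutRequest {
  model: string;        // e.g. "gemini-2.0-flash"
  prompt: string;       // the user prompt collected in the Shortcut
  system?: string;      // optional system-level instruction
  temperature?: number; // lower = more deterministic output
}

// Serialize the body, applying a default temperature when none is given.
function buildBody(req: ShortcutRequest): string {
  return JSON.stringify({ temperature: 0.7, ...req });
}
```

In Shortcuts, the same structure would be built with a “Get Contents of URL” action using a JSON request body, with the system instruction field carrying context like “Always respond in JSON format for parsing.”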
Frequently Asked Questions (FAQ)
- How to handle API authentication in Shortcuts? Add an Authorization header with Bearer [token] when making HTTP requests in Shortcuts, where the token is set via wrangler secret put BEARER_TOKEN during deployment. The API rejects unauthenticated requests with 401 errors.
- Can I use OpenAI models instead of Gemini? Yes, modify the inference endpoint logic in src/index.ts to route requests to OpenAI’s API, then set OPENAI_API_KEY as a Cloudflare Secret. The current implementation prioritizes Gemini for cost efficiency.
- What’s the maximum input length supported? Cloudflare Workers accept request bodies far larger than typical Shortcuts payloads (the exact limit depends on your Cloudflare plan), and gemini-2.0-flash supports a context window of up to 1M tokens. For responsive Shortcuts, keep inputs to a few thousand characters; input length is independent of the temperature setting.
- How to monitor API usage? Integrate Cloudflare’s Analytics Engine via wrangler.toml to track request volumes and model usage. Note that the 1,500-request daily allowance is the AI provider’s free-tier quota (Cloudflare’s own Workers free plan permits 100,000 requests per day), and it is shared across all users of your deployed worker.
- Can I chain multiple AI operations in one shortcut? Yes, use the API’s plain-text output as input for subsequent HTTP actions in Shortcuts, enabling multi-step workflows like summarization → translation → calendar scheduling in a single automation.
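The chaining pattern from the last answer can be sketched abstractly. In a real Shortcut each stage would be an HTTP call to the worker; the stage functions below are stand-ins so the flow itself is visible:

```typescript
// Conceptual sketch of chaining: each stage maps plain-text output to the
// next stage's input, mirroring sequential HTTP actions in a Shortcut.
type Stage = (input: string) => string;

// Run the stages left to right, feeding each output into the next input.
function runChain(input: string, stages: Stage[]): string {
  return stages.reduce((text, stage) => stage(text), input);
}

// Stand-ins for "summarize" and "translate" API calls (not real model calls):
const summarize: Stage = (t) => t.split(".")[0] + ".";
const translate: Stage = (t) => `[fr] ${t}`;

runChain("First sentence. Second sentence.", [summarize, translate]);
// → "[fr] First sentence."
```

Because the worker returns plain text, each stage’s output drops directly into the next “Get Contents of URL” action without any parsing step in between.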
