Product Introduction
- Definition: Yavy is an AI knowledge infrastructure platform (technical category: MCP server solution) that transforms public websites into Model Context Protocol (MCP)-compatible data sources. It automates the crawling, indexing, and serving of web content via standardized APIs for AI agents.
- Core Value Proposition: Yavy reduces AI hallucinations and removes manual data transfers by letting AI agents retrieve current, accurate information from live websites. Its primary keywords include MCP server integration, semantic search for AI, and zero-config knowledge base.
Main Features
- AI-Ready Indexing Engine: Automatically discovers, parses, and structures website content using recursive URL crawling and sitemap.xml analysis. Extracts text, metadata, and hierarchical relationships, then indexes data using vector embeddings (e.g., sentence-transformers) for semantic search.
- MCP Protocol Server: Serves indexed content via HTTP-based MCP endpoints compatible with tools like Claude, Cursor, and OpenAI. Delivers JSON-formatted responses with relevance scores, snippets, and source URLs, enabling precise LLM context injection.
- Live Sync Technology: Continuously monitors source websites for changes via millisecond-level checks. Propagates updates instantly across all connected AI tools, ensuring answers reflect real-time content (e.g., pricing page edits or API doc revisions).
- Semantic Search API: Uses cosine similarity and vector databases to resolve conceptual queries (e.g., "pricing tiers for startups" matches "/blog/announcing-new-pricing"). Supports filters, allow/deny rules, and OAuth 2.1 for private repositories.
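The ranking step behind the Semantic Search API can be sketched as follows. This is a minimal illustration of cosine-similarity search with deny rules, not Yavy's actual implementation: the `INDEX` contents, the three-dimensional toy vectors, and the `search` function with its `deny_prefixes` parameter are all hypothetical. A real system would obtain vectors from an embedding model and store them in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


# Hypothetical index: hand-made toy vectors stand in for real embeddings.
INDEX = [
    {"url": "/blog/announcing-new-pricing",
     "snippet": "New plans for small teams",
     "vector": [0.9, 0.1, 0.0]},
    {"url": "/docs/v1/install",
     "snippet": "Install the CLI",
     "vector": [0.0, 0.2, 0.9]},
    {"url": "/pricing",
     "snippet": "Compare tiers",
     "vector": [0.8, 0.3, 0.1]},
]


def search(query_vector, index, top_k=2, deny_prefixes=()):
    """Rank pages by similarity to the query, skipping denied URL prefixes."""
    results = []
    for page in index:
        if any(page["url"].startswith(prefix) for prefix in deny_prefixes):
            continue  # deny rule: this page is excluded from results
        results.append({
            "url": page["url"],
            "snippet": page["snippet"],
            "score": cosine_similarity(query_vector, page["vector"]),
        })
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:top_k]


if __name__ == "__main__":
    # Toy query vector standing in for an embedded "pricing tiers for startups".
    for hit in search([1.0, 0.2, 0.0], INDEX):
        print(f'{hit["score"]:.3f}  {hit["url"]}  {hit["snippet"]}')
```

Each result carries a relevance score, a snippet, and a source URL, mirroring the response fields the feature list describes. Because ranking depends on vector direction rather than shared keywords, a conceptual query can surface a page whose URL and title never mention the query terms.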
Problems Solved
- Pain Point: AI assistants hallucinate when they lack current source data. Yavy grounds responses in verified, current website data, which is critical for accuracy in developer docs, support bots, and compliance-sensitive domains.
- Target Audience:
  - Developer teams building AI-enhanced IDEs or CLI tools
  - Technical writers managing documentation portals (e.g., Docusaurus, ReadTheDocs)
  - Product managers maintaining help centers or competitive intelligence dashboards
- Use Cases:
  - Customer support bots answering queries using real-time knowledge base content
  - AI coding assistants pulling context from SDK documentation during development
  - Automated monitoring of competitor blogs/documentation for change alerts
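The change-alert use case can be illustrated by comparing content fingerprints between two crawls. The sketch below is one common way to detect edits and is not a description of how Yavy's sync works: the `fingerprint` and `detect_changes` helpers are hypothetical names, and the whitespace normalization is an assumption chosen so that purely cosmetic reflows do not raise alerts.

```python
import hashlib


def fingerprint(text):
    """Hash page text after collapsing whitespace runs to single spaces."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare a stored {url: fingerprint} map with freshly crawled {url: text}.

    Returns a report of added, changed, and removed URLs, plus the new
    fingerprint map to store for the next comparison.
    """
    current_prints = {url: fingerprint(text) for url, text in current.items()}
    added = sorted(u for u in current_prints if u not in previous)
    removed = sorted(u for u in previous if u not in current_prints)
    changed = sorted(
        u for u in current_prints
        if u in previous and previous[u] != current_prints[u]
    )
    report = {"added": added, "changed": changed, "removed": removed}
    return report, current_prints


if __name__ == "__main__":
    # Hypothetical earlier crawl of a competitor site.
    stored = {"/pricing": fingerprint("Plans start at $10")}
    # Hypothetical later crawl: the price changed and a new post appeared.
    latest = {"/pricing": "Plans start at $12", "/blog/launch": "We shipped"}
    report, stored = detect_changes(stored, latest)
    print(report)
```

Storing only fingerprints keeps the comparison cheap; an alerting layer would then fetch and diff the full text only for URLs listed under `changed`.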
Unique Advantages
- Differentiation: Unlike static scrapers (e.g., manual Selenium scripts), Yavy offers MCP-native integration, structured JSON outputs for LLMs, and sub-second sync, outpacing alternatives like Apify or Diffbot in AI-readiness.
- Key Innovation: Patented semantic indexing combines recursive crawling with transformer-based embeddings, enabling concept-based search beyond keyword matching. MCP standardization ensures plug-and-play compatibility with 50+ AI tools.
Frequently Asked Questions (FAQ)
- How does Yavy handle authentication for private websites?
Yavy uses OAuth 2.1 to securely index password-protected content from platforms like Confluence or GitHub, with granular access controls for team members.
- What counts as a "page" in Yavy's pricing tiers?
Each unique URL crawled (e.g., /pricing, /docs/v1/install) constitutes one page. Redirects, assets, or blocked URLs don’t count toward limits.
- Can Yavy index JavaScript-heavy SPAs (Single-Page Applications)?
Yes, Yavy’s crawler executes JavaScript to extract dynamically rendered content from React, Angular, or Vue.js sites, ensuring comprehensive coverage.
- How quickly do AI tools reflect content updates after a sync?
Changes appear in MCP search results within milliseconds on Pro/Team plans, while Free/Starter tiers update within 24 hours.
- Does Yavy support non-English websites?
Yes, multilingual semantic search works for 15+ languages (including Spanish, German, Japanese) via language-agnostic embeddings.
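The page-counting rule from the pricing FAQ (unique URLs count; redirects, assets, and blocked URLs do not) can be sketched as a small filter. This is only an illustration of that rule under stated assumptions: the crawl record fields (`url`, `status`, `content_type`, `blocked`) are hypothetical, a redirect is assumed to be any 3xx status, and an asset is assumed to be anything not served as HTML. Yavy's actual accounting may differ.

```python
def count_billable_pages(records):
    """Count unique crawled URLs, skipping redirects, assets, and blocked URLs."""
    pages = set()
    for record in records:
        if record["blocked"]:
            continue  # excluded by an allow/deny rule
        if 300 <= record["status"] < 400:
            continue  # assumption: any 3xx response is a redirect
        if not record["content_type"].startswith("text/html"):
            continue  # assumption: non-HTML responses are assets
        pages.add(record["url"])  # the set collapses duplicate URLs
    return len(pages)


# Hypothetical crawl log used only for illustration.
SAMPLE_CRAWL = [
    {"url": "/pricing", "status": 200, "content_type": "text/html", "blocked": False},
    {"url": "/pricing", "status": 200, "content_type": "text/html", "blocked": False},
    {"url": "/old-pricing", "status": 301, "content_type": "text/html", "blocked": False},
    {"url": "/logo.png", "status": 200, "content_type": "image/png", "blocked": False},
    {"url": "/admin", "status": 200, "content_type": "text/html", "blocked": True},
    {"url": "/docs/v1/install", "status": 200,
     "content_type": "text/html; charset=utf-8", "blocked": False},
]

if __name__ == "__main__":
    print(count_billable_pages(SAMPLE_CRAWL))
```

In this sample only `/pricing` and `/docs/v1/install` survive the filter: the duplicate, the redirect, the image, and the blocked URL are all skipped.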
