
Known Agents

Track the bots and AI agents crawling your website

2026-05-11

Product Introduction

  1. Definition: Known Agents is a specialized analytics and management platform for AI agents, bots, and other non-human web traffic. Technically, it is a SaaS (Software as a Service) product that combines real-time traffic analytics, automated robots.txt management, agent identification APIs, and LLM referral tracking.
  2. Core Value Proposition: It exists to provide website owners, developers, and publishers with critical visibility and control over the rapidly growing segment of AI-powered traffic. This includes AI data scrapers (e.g., GPTBot), search engine crawlers, LLM assistants, and automated shopping agents, turning a blind spot into a measurable and optimizable growth channel.

Main Features

  1. AI Agent & Bot Analytics Dashboard: This feature provides real-time, granular visibility into non-human traffic. It works by ingesting and analyzing server logs or JavaScript beacon data to identify, categorize, and track the behavior of automated agents. The dashboard displays metrics such as visit volume, page views, per-agent session replay, geographic origin, and traffic spikes. Specific technologies include advanced user-agent parsing, IP reputation databases, and behavioral fingerprinting to distinguish legitimate crawlers from spoofed bots (see the classification sketch after this list).
  2. LLM Referral Tracking (GEO/AEO): This feature measures the impact of AI platforms on human traffic. It tracks Generative Engine Optimization (GEO) and Answer Engine Optimization (AEO) by identifying when your pages are cited in AI chat responses (e.g., from ChatGPT, Claude) and then measuring the human click-through rate from those citations. It works by detecting referral headers from AI platforms and correlating them with user sessions (a minimal detection sketch follows this list), providing data on total citations, referral volume per AI platform, and your most-recommended pages.
  3. Automatic Robots.txt Management & Bad Bot Detection: This is a dynamic content-protection system. It automatically generates and updates your site's robots.txt file from a continuously updated database of known AI agents and crawlers, so users can block entire categories (e.g., all AI data scrapers) with a single rule (see the generation sketch after this list). The complementary bad bot detection feature identifies and blocks agents that disobey robots.txt rules, using pattern analysis and rate limiting, typically integrated via a WordPress plugin or server-side module that issues HTTP 403 rejections.
  4. Agent Identification API (Web Bot Auth): This is a developer-facing API for programmatic agent authentication. It lets backend systems verify an incoming request's claimed identity (e.g., "Googlebot") using the emerging Web Bot Auth standard, which relies on public-key cryptography. The API handles the complex verification and returns a structured JSON result confirming the agent's legitimacy, enabling fine-grained access control for licensed or paid content (a simplified verification sketch follows this list).
  5. MCP & Shopping Observability: This feature provides monitoring for AI commerce endpoints. It tracks traffic to Model Context Protocol (MCP) servers and optimizes funnels for AI-powered commerce surfaces such as Universal Checkout Pages (UCP) and the Agentic Commerce Protocol (ACP). It shows how AI shopping agents (e.g., from ChatGPT Instant Checkout) discover, evaluate, and purchase products, so you can optimize funnels specifically for agentic behavior (a minimal observability sketch follows this list).
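
The sketches below illustrate these mechanisms in simplified Python; they are illustrative, not Known Agents' actual implementation. First, the user-agent parsing behind the analytics dashboard (feature 1). The signature table is a tiny hypothetical stand-in for the platform's continuously updated agent database.

```python
# Minimal sketch of user-agent-based agent classification.
# The signature table is illustrative, not an actual agent database.
from dataclasses import dataclass

@dataclass
class AgentMatch:
    name: str
    category: str

# Hypothetical signatures: substring of the User-Agent header -> agent identity.
SIGNATURES = {
    "GPTBot": AgentMatch("GPTBot", "ai-data-scraper"),
    "ClaudeBot": AgentMatch("ClaudeBot", "ai-data-scraper"),
    "Googlebot": AgentMatch("Googlebot", "search-crawler"),
    "PerplexityBot": AgentMatch("PerplexityBot", "ai-assistant"),
}

def classify_user_agent(user_agent: str) -> AgentMatch | None:
    """Return the first matching known agent, or None for presumed-human traffic."""
    for token, match in SIGNATURES.items():
        if token.lower() in user_agent.lower():
            return match
    return None

print(classify_user_agent("Mozilla/5.0 AppleWebKit/537.36; compatible; GPTBot/1.1"))
# AgentMatch(name='GPTBot', category='ai-data-scraper')
```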
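Next, LLM referral detection (feature 2): mapping a request's Referer header to an AI platform. The domain list is an assumption for illustration; real platforms change their referrer behavior over time.

```python
# Minimal sketch of LLM referral detection from the HTTP Referer header.
# The domain list is an illustrative assumption, not the platform's actual list.
from urllib.parse import urlparse

AI_REFERRER_DOMAINS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}

def detect_ai_referral(referer_header: str | None) -> str | None:
    """Map a request's Referer header to an AI platform name, if any."""
    if not referer_header:
        return None
    host = urlparse(referer_header).hostname or ""
    for domain, platform in AI_REFERRER_DOMAINS.items():
        if host == domain or host.endswith("." + domain):
            return platform
    return None

print(detect_ai_referral("https://chatgpt.com/"))  # ChatGPT
```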
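Category-based robots.txt generation (feature 3) can be sketched as rendering directives from the agent database; regenerating the file after each database update is what keeps future agents covered without manual edits. The agent-to-category mapping below is a stand-in for the live database.

```python
# Minimal sketch of category-based robots.txt generation.
# AGENT_DB stands in for the continuously updated agent database.
AGENT_DB = {
    "GPTBot": "ai-data-scraper",
    "CCBot": "ai-data-scraper",
    "ClaudeBot": "ai-data-scraper",
    "Googlebot": "search-crawler",
}

def render_robots_txt(blocked_categories: set[str]) -> str:
    """Emit Disallow directives for every agent in a blocked category."""
    lines = []
    for agent, category in sorted(AGENT_DB.items()):
        if category in blocked_categories:
            lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines)

# Blocking the whole "AI data scrapers" category in one rule:
print(render_robots_txt({"ai-data-scraper"}))
```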
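For the Agent Identification API (feature 4), here is a heavily simplified sketch of signature-based verification in the spirit of Web Bot Auth (HTTP Message Signatures, RFC 9421), using the third-party `cryptography` package. A real verifier must reconstruct the exact signature base the spec defines and fetch keys from the agent's published key directory; the demo key pair and registry below are hypothetical.

```python
# Highly simplified sketch of cryptographic agent verification in the spirit of
# Web Bot Auth. A real implementation must build the exact RFC 9421 signature
# base and resolve keys from the agent's key directory; this shows the flow only.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Demo key pair standing in for an agent's published signing key.
demo_key = Ed25519PrivateKey.generate()
TRUSTED_AGENT_KEYS = {"ExampleBot": demo_key.public_key()}  # hypothetical registry

def verify_agent(claimed_agent: str, signature: bytes, signed_bytes: bytes) -> dict:
    """Return a structured verdict, analogous to the API's JSON result."""
    public_key = TRUSTED_AGENT_KEYS.get(claimed_agent)
    if public_key is None:
        return {"agent": claimed_agent, "verified": False, "reason": "unknown agent"}
    try:
        public_key.verify(signature, signed_bytes)  # raises on mismatch
        return {"agent": claimed_agent, "verified": True}
    except InvalidSignature:
        return {"agent": claimed_agent, "verified": False, "reason": "bad signature"}

message = b"GET /licensed-article"
print(verify_agent("ExampleBot", demo_key.sign(message), message))
# {'agent': 'ExampleBot', 'verified': True}
```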
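Finally, MCP observability (feature 5) reduces, at its simplest, to attributing requests on MCP routes to the agents making them. The `/mcp` path prefix and user-agent strings below are assumptions for illustration.

```python
# Minimal sketch of MCP-endpoint observability: tallying which agents hit
# MCP routes so their traffic can be charted over time.
from collections import Counter

mcp_hits: Counter[tuple[str, str]] = Counter()

def record_request(path: str, user_agent: str) -> None:
    """Count requests to MCP routes, keyed by (path, user agent)."""
    if path.startswith("/mcp"):
        mcp_hits[(path, user_agent)] += 1

record_request("/mcp/tools/list", "ChatGPT-User/1.0")
record_request("/mcp/tools/call", "ChatGPT-User/1.0")
print(mcp_hits.most_common())
```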

Problems Solved

  1. Pain Point: The "invisible traffic" problem: up to 40% of website visits come from unmonitored AI agents and bots, creating analytics blind spots, unnecessary server-load costs, and content scraping without attribution or control.
  2. Target Audience: Website Developers & DevOps Engineers who need to manage server load and implement technical access controls; Digital Marketing Managers & SEO Specialists who must understand new traffic sources like GEO/AEO; Content Publishers & SaaS Companies with licensed content vulnerable to scraping; E-commerce Managers optimizing for AI shopping agents.
  3. Use Cases: A news publisher uses it to block unauthorized AI scrapers while allowing licensed partners. An e-commerce site uses LLM referral tracking to see which products are most cited by AI assistants and optimizes those pages. A developer uses the Agent Identification API to gate API access only to verified partner bots. A company uses bad bot detection to stop a DDoS-style scraping attack from overloading their servers.

Unique Advantages

  1. Differentiation: Unlike traditional web analytics (e.g., Google Analytics), which filters bot traffic out, or generic security tools that simply block threats, Known Agents is purpose-built for the nuanced ecosystem of modern AI traffic. It provides both analytics and management, distinguishing beneficial agents (e.g., search engines, licensed LLMs) from harmful ones, whereas competitors often treat all non-human traffic as a security threat.
  2. Key Innovation: Its continuously updated, centralized agent database is a core innovation. The platform automatically identifies new AI agents and crawlers as they emerge, pushing updates to users' robots.txt rules and detection systems. This removes the manual, reactive burden of tracking new bots like "Crawl4AI" or "DeepSeekBot" from individual website operators.

Frequently Asked Questions (FAQ)

  1. How does Known Agents detect and identify AI bots vs. human visitors? Known Agents uses a multi-layered identification system combining HTTP request headers (User-Agent), IP address analysis against known hosting ranges (e.g., OpenAI, Google Cloud), behavioral patterns (crawl rate, navigation paths), and, where available, verification via the Web Bot Auth standard to accurately classify AI agent traffic (see the IP-range sketch after this FAQ).
  2. What is the difference between Known Agents and traditional bot management software? Traditional bot management focuses on security—blocking malicious bots, carding fraud, and DDoS. Known Agents is focused on visibility and optimization for the broad category of legitimate and emerging AI agents, LLM crawlers, and search bots, providing analytics on their behavior and tools to manage their access productively, not just block them.
  3. Can Known Agents help with SEO and Generative Engine Optimization (GEO)? Yes, directly. Its LLM Referral Tracking feature measures which of your pages are cited by AI platforms like ChatGPT and how much human traffic those citations drive. This data is essential for GEO, allowing you to double down on content that performs well in AI answers and optimize for AI-driven discovery.
  4. How does the Automatic Robots.txt update work technically? Known Agents serves your robots.txt file on your behalf and manages it dynamically. When you configure rules (e.g., "Disallow all AI data scrapers"), the platform generates the file on the fly, appending directives for all current and future agents in that category from its live database. This keeps your disallow rules comprehensive without manual edits.
  5. Is the Agent Identification API suitable for protecting paid API endpoints? Absolutely. The Agent Identification API is designed for this. It allows you to authenticate incoming requests from AI agents using cryptographically verified Web Bot Auth. You can grant access only to agents from partners with formal agreements, effectively monetizing or controlling programmatic access to your content and data (a gating sketch follows this FAQ).
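
To make the IP-address layer of the identification system concrete, here is a minimal sketch using Python's standard ipaddress module. The CIDR blocks are placeholders drawn from reserved documentation ranges, not the operators' real published ranges.

```python
# Minimal sketch of the IP-range layer of identification: checking whether a
# visitor's address falls inside a crawler operator's published CIDR blocks.
import ipaddress

# Placeholder ranges (TEST-NET-1/2), NOT OpenAI's or Google's actual blocks.
PUBLISHED_RANGES = {
    "OpenAI": [ipaddress.ip_network("192.0.2.0/24")],
    "Google": [ipaddress.ip_network("198.51.100.0/24")],
}

def ip_operator(addr: str) -> str | None:
    """Return the operator whose published ranges contain this address, if any."""
    ip = ipaddress.ip_address(addr)
    for operator, networks in PUBLISHED_RANGES.items():
        if any(ip in net for net in networks):
            return operator
    return None

print(ip_operator("192.0.2.10"))  # 'OpenAI' per the placeholder table
```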
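And a minimal sketch of gating a paid endpoint on a verification verdict shaped like the API's JSON result. The partner allowlist and the use of HTTP 402 for unlicensed agents are illustrative assumptions.

```python
# Minimal sketch of gating a protected endpoint on an agent-verification
# verdict. Partner names and the 402 response are hypothetical choices.
LICENSED_PARTNERS = {"ExampleBot"}  # agents with a formal access agreement

def access_decision(verdict: dict) -> tuple[int, str]:
    """Map a verification result to an HTTP status for the protected endpoint."""
    if not verdict.get("verified"):
        return 403, "Forbidden: unverified agent"
    if verdict["agent"] not in LICENSED_PARTNERS:
        return 402, "Payment Required: no license agreement on file"
    return 200, "OK"

print(access_decision({"agent": "ExampleBot", "verified": True}))  # (200, 'OK')
print(access_decision({"agent": "ScraperX", "verified": False}))   # (403, ...)
```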
