LLM-Content: /llms.txt

# ——— OPENAI ———
# ChatGPT search. NOT used for model training.
User-agent: OAI-SearchBot
Disallow: /

# User-driven browsing from ChatGPT and Custom GPTs. Acts after a human click.
User-agent: ChatGPT-User
Disallow: /

User-agent: ChatGPT-User/2.0
Disallow: /

# Model-training crawler.
User-agent: GPTBot
Allow: /

# ——— ANTHROPIC (Claude) ———
User-agent: anthropic-ai # bulk model training
Allow: /

User-agent: ClaudeBot # chat citation fetch
Allow: /

User-agent: claude-web # web-focused crawl
Allow: /

# ——— GOOGLE (Gemini) ———
User-agent: Google-Extended
Allow: /

User-agent: Google-CloudVertexBot # for building Vertex AI agents
Allow: /

# ——— PERPLEXITY ———
User-agent: PerplexityBot # index builder
Disallow: /

User-agent: Perplexity-User # human-triggered visit. Ignores robots.txt.
Disallow: /

# ——— MICROSOFT (Bing / Copilot) ———
User-agent: BingBot
Disallow: /

# ——— AMAZON ———
User-agent: Amazonbot
Disallow: /

# ——— APPLE ———
User-agent: Applebot # for Siri/Spotlight Search
Disallow: /

User-agent: Applebot-Extended # for model training
Allow: /

# ——— META ———
User-agent: FacebookBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# ——— LINKEDIN ———
User-agent: LinkedInBot
Disallow: /

# ——— BYTEDANCE ———
User-agent: Bytespider
Allow: /

# ——— DUCKDUCKGO ———
User-agent: DuckAssistBot
Allow: /

# ——— COHERE ———
User-agent: cohere-ai
Allow: /

# ——— ALLEN INSTITUTE / COMMON CRAWL / OTHER RESEARCH ———
User-agent: AI2Bot
Allow: /

User-agent: CCBot
Allow: /

User-agent: Diffbot
Allow: /

User-agent: omgili
Disallow: /
# It's search, but let's leave it for now to see how prevalent they are.

# ——— EMERGING SEARCH START-UPS ———
User-agent: TimpiBot
Allow: /

User-agent: YouBot
Allow: /

# ––––––––– DISALLOWING SEARCH ENGINE CRAWLERS –––––––––

# ––– GOOGLE –––
User-agent: Googlebot
Allow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: Googlebot-Video
Disallow: /

User-agent: Googlebot-News
Disallow: /

User-agent: StoreBot-Google
Disallow: /

User-agent: Google-InspectionTool
Allow: /

User-agent: GoogleOther
Disallow: /

User-agent: GoogleOther-Video
Disallow: /

# ––– Yandex –––
User-agent: YandexBot
Disallow: /

User-agent: Yandex
Disallow: /

# ––– DuckDuckGo –––
User-agent: DuckDuckBot
Disallow: /

# ––– Catch-all –––
User-agent: *
Disallow: /