Oct 22, 2024 · 6 min read

robots.txt for AI Agents: Complete Guide

Learn how to read, respect, and work with robots.txt when building AI agents that access the web.

Marcus Chen

Founder & CEO

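robots.txt is the standard way for site owners to tell bots what they may and may not access. It implements the Robots Exclusion Protocol, published as RFC 9309. Compliance is voluntary, so understanding the file is essential for building ethical AI agents.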

What is robots.txt?

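robots.txt is a plain-text file served at the root of a website (e.g., https://example.com/robots.txt) that gives bots crawling instructions. Each scheme, host, and port combination has its own file, so a subdomain's rules are separate from the main site's.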
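Because the file always lives at the root, its location can be derived from any page URL. The sketch below shows one way to do that with the standard URL class; the function name robotsUrlFor is hypothetical and not part of any library.

```typescript
// Derive the robots.txt location for any page URL.
// Resolving an absolute path against the page URL keeps the scheme, host,
// and port, and drops the original path, query string, and fragment.
function robotsUrlFor(pageUrl: string): string {
  return new URL("/robots.txt", pageUrl).toString();
}
```

Calling robotsUrlFor with a deep link on a site returns that site's root robots.txt URL, which is the only location crawlers consult.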

Basic Syntax

# Allow all bots
User-agent: *
Allow: /

# Block all bots
User-agent: *
Disallow: /

# Block specific paths
User-agent: *
Disallow: /admin/
Disallow: /private/

# Allow specific bot
User-agent: Tryb-Agent
Allow: /

Common Directives

Directive         Meaning
User-agent: *     Applies to all bots
Disallow: /       Block entire site
Disallow: /path/  Block specific path
Allow: /path/     Explicitly allow path
Crawl-delay: 10   Wait 10s between requests
Sitemap: url      Location of sitemap
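The last two directives carry values an agent may want to read directly. The following is a rough sketch of pulling them out of a robots.txt body. It is deliberately simplified: it takes the first Crawl-delay it finds regardless of which user-agent group it belongs to, and the names CrawlHints and readCrawlHints are hypothetical.

```typescript
type CrawlHints = { crawlDelaySeconds: number | null; sitemaps: string[] };

// Collect the first Crawl-delay value and every Sitemap URL in the file.
function readCrawlHints(robotsTxt: string): CrawlHints {
  const hints: CrawlHints = { crawlDelaySeconds: null, sitemaps: [] };
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":"); // first colon separates field from value
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "sitemap" && value !== "") {
      hints.sitemaps.push(value);
    } else if (field === "crawl-delay" && hints.crawlDelaySeconds === null) {
      const seconds = Number(value);
      // Ignore empty, negative, or non-numeric values.
      if (value !== "" && Number.isFinite(seconds) && seconds >= 0) {
        hints.crawlDelaySeconds = seconds;
      }
    }
  }
  return hints;
}
```

An agent could multiply crawlDelaySeconds by 1000 and wait that long between requests to the same site. Crawl-delay is not part of RFC 9309, and some sites omit it, so a null result should fall back to a conservative default of the agent's own choosing.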

AI Agent Considerations

AI agents should:

  1. Check robots.txt before scraping any domain
  2. Use a descriptive User-agent string
  3. Respect Crawl-delay directives
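  4. Cache robots.txt (refresh every 24h)

import robotsParser from 'robots-parser';

// Returns true if robots.txt permits Tryb-Agent to fetch the given URL.
async function canScrape(url: string): Promise<boolean> {
  const origin = new URL(url).origin;
  const robotsUrl = `${origin}/robots.txt`;

  const response = await fetch(robotsUrl);
  // Per RFC 9309, a server error (5xx) means assume everything is disallowed,
  // while a missing file (4xx) means there are no restrictions.
  if (response.status >= 500) return false;
  if (!response.ok) return true;

  const robotsTxt = await response.text();
  const robots = robotsParser(robotsUrl, robotsTxt);
  // isAllowed returns undefined when the URL does not belong to this robots.txt.
  return robots.isAllowed(url, 'Tryb-Agent') ?? false;
}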
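The caching recommendation above can be sketched as a small wrapper that refetches a site's robots.txt only after 24 hours. This is an illustrative outline, not Tryb's implementation: createRobotsCache, Fetcher, and ONE_DAY_MS are hypothetical names, and the fetcher and clock are passed in so the cache works with any HTTP client and can be tested without network access.

```typescript
// A function that downloads the robots.txt body for a given robots.txt URL.
type Fetcher = (robotsUrl: string) => Promise<string>;

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Returns a lookup function that caches one robots.txt body per origin.
function createRobotsCache(
  fetcher: Fetcher,
  ttlMs: number = ONE_DAY_MS,
  now: () => number = () => Date.now(),
): (pageUrl: string) => Promise<string> {
  const entries = new Map<string, { body: string; fetchedAt: number }>();

  return async function getRobotsTxt(pageUrl: string): Promise<string> {
    // All pages on the same origin share a single cache entry.
    const robotsUrl = new URL("/robots.txt", pageUrl).toString();
    const cached = entries.get(robotsUrl);
    if (cached && now() - cached.fetchedAt < ttlMs) {
      return cached.body; // still fresh, so skip the network
    }
    const body = await fetcher(robotsUrl);
    entries.set(robotsUrl, { body, fetchedAt: now() });
    return body;
  };
}
```

In practice the fetcher would wrap an HTTP request that sends the agent's descriptive User-agent header, and the returned body would be handed to a parser before each scrape.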

Legal Status

robots.txt is not legally binding on its own, but:

  • Courts have referenced it in scraping cases
  • Ignoring it may support a claim of trespass or a terms-of-service violation
  • Following it demonstrates good faith
