AI crawlers in robots.txt
Some teams block or allow AI-related crawlers explicitly. SEO Perception reads the same robots.txt used for crawling and applies a heuristic to detect broad blocks of common AI user-agents—useful as a reminder, not legal or contractual advice.
What we look for
Site-wide rules that appear to disallow common AI crawlers (the exact names and patterns depend on the product version). When such blocks are detected, we may surface a low-severity finding.
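The kind of detection described above can be sketched as a small scan for robots.txt groups that name a known AI user-agent and disallow the whole site. This is an illustrative sketch only: the agent list and matching logic are assumptions for the example, not SEO Perception's actual heuristic.

```python
# Sketch of a site-wide-block heuristic (illustrative, not the product's
# real rules). The agent names are real published crawler tokens, but the
# list is an assumption and will drift over time.
AI_AGENTS = {"gptbot", "claudebot", "ccbot", "google-extended"}

def detect_ai_blocks(robots_txt: str) -> set[str]:
    """Return AI agents that appear blocked site-wide (Disallow: /)."""
    flagged: set[str] = set()
    agents: list[str] = []
    in_rules = False  # becomes True once the current group has rule lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:  # a rule line ended the previous group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("disallow", "allow"):
            in_rules = True
            if field == "disallow" and value == "/":
                flagged.update(a for a in agents if a in AI_AGENTS)
    return flagged

sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: Googlebot\nAllow: /\n"
print(detect_ai_blocks(sample))  # {'gptbot'}
```

A production check would also need wildcard groups (`User-agent: *`) and more nuanced path rules; the point here is only the shape of the scan.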
What to do
Confirm your robots.txt matches your content licensing and business rules. Adjust rules if you want AI systems to crawl or cite your pages; keep blocks if that is intentional.
Example
robots.txt is plain text at the site root (not HTML). Illustrative directives—verify names and syntax against each crawler’s documentation:
# https://example.com/robots.txt
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /
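One way to sanity-check rules like the example above is Python's standard-library robots.txt parser, which answers per-agent fetch questions. The URL below reuses the example's placeholder domain:

```python
# Check the example robots.txt with Python's built-in parser.
from urllib import robotparser

ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("GPTBot", "https://example.com/page"))     # False
print(rp.can_fetch("Googlebot", "https://example.com/page"))  # True
```

This only tells you what a compliant parser would decide; whether a given crawler actually honours these directives is up to its operator.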
Technical details
This is a heuristic only—verify behaviour with each provider’s documentation. See How the crawler works.