AI crawlers in robots.txt
Some teams block or allow AI-related crawlers explicitly. SEO Perception reads the same robots.txt used for crawling and applies a heuristic to detect broad blocks of common AI user-agents—useful as a reminder, not legal or contractual advice.
What we look for
Site-wide rules that appear to disallow common AI crawlers (the exact names and patterns depend on the product version). When such blocks are detected, we may surface a low-severity finding.
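The kind of detection described above can be sketched as a small scan for robots.txt groups that name a known AI user-agent and disallow the whole site. This is an illustrative sketch only: the agent list and matching logic are assumptions for the example, not SEO Perception's actual heuristic.

```python
# Sketch of a site-wide-block heuristic (illustrative, not the product's
# real rules). The agent names are real published crawler tokens, but the
# list is an assumption and will drift over time.
AI_AGENTS = {"gptbot", "claudebot", "ccbot", "google-extended"}

def detect_ai_blocks(robots_txt: str) -> set[str]:
    """Return AI agents that appear blocked site-wide (Disallow: /)."""
    flagged: set[str] = set()
    agents: list[str] = []
    in_rules = False  # becomes True once the current group has rule lines
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:  # a rule line ended the previous group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("disallow", "allow"):
            in_rules = True
            if field == "disallow" and value == "/":
                flagged.update(a for a in agents if a in AI_AGENTS)
    return flagged

sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: Googlebot\nAllow: /\n"
print(detect_ai_blocks(sample))  # {'gptbot'}
```

A production check would also need wildcard groups (`User-agent: *`) and more nuanced path rules; the point here is only the shape of the scan.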
What to do
Confirm your robots.txt matches your content licensing and business rules. Adjust rules if you want AI systems to crawl or cite your pages; keep blocks if that is intentional.
Example
robots.txt is plain text at the site root (not HTML). Illustrative directives—verify names and syntax against each crawler’s documentation:
# https://example.com/robots.txt
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /
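One way to sanity-check rules like the example above is Python's standard-library robots.txt parser, which answers per-agent fetch questions. The URL below reuses the example's placeholder domain:

```python
# Check the example robots.txt with Python's built-in parser.
from urllib import robotparser

ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("GPTBot", "https://example.com/page"))     # False
print(rp.can_fetch("Googlebot", "https://example.com/page"))  # True
```

This only tells you what a compliant parser would decide; whether a given crawler actually honours these directives is up to its operator.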
Technical details
This is a heuristic only—verify behaviour with each provider’s documentation. See How the crawler works.