How to Create a robots.txt File
A robots.txt file tells search engine crawlers which parts of your site they may access. Getting it right matters for SEO: block the wrong paths and crawlers can no longer reach important content; leave everything open and they waste crawl budget on irrelevant pages.
When You Need robots.txt
- Launching a website and setting up SEO basics
- Blocking admin panels, API endpoints, or staging environments
- Preventing AI crawlers from scraping your content
- Reducing crawl load on resource-heavy pages
- Pointing crawlers to your sitemap
How to Create robots.txt
Step 1: Understand the Syntax
robots.txt uses simple directives: User-agent specifies which crawler the rules apply to, Disallow blocks a path, and Allow permits a path (overriding a broader Disallow).
User-agent: *
Disallow: /admin/
Allow: /admin/public/
Sitemap: https://example.com/sitemap.xml
Step 2: Choose Your Rules
Open the robots.txt Generator and either start from a preset or build rules manually. Add User-agent blocks for specific crawlers, then define Allow/Disallow paths.
Step 3: Add Your Sitemap
Include a Sitemap: directive pointing to your XML sitemap. This helps crawlers discover all your pages efficiently.
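For example (with example.com standing in for your domain, and a hypothetical second sitemap to show that multiple Sitemap lines are allowed):

```text
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-news.xml
```

Unlike Allow/Disallow, the Sitemap directive is not tied to a User-agent block and can appear anywhere in the file.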
Step 4: Deploy
Download the generated file and place it at the root of your website. Verify it's accessible at yourdomain.com/robots.txt.
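Before uploading, you can sanity-check your rules locally with Python's built-in `urllib.robotparser`. This is a quick sketch with example rules; note that Python's parser applies the first matching rule rather than Google's longest-match semantics, so the narrower Allow line is listed before the broader Disallow:

```python
from urllib.robotparser import RobotFileParser

# Example rules. Python's parser applies the first matching rule
# (Google instead prefers the longest matching path), so the narrower
# Allow comes before the broader Disallow to get the same effect.
rules = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("*", "/admin/secret.html"))       # False: blocked
print(parser.can_fetch("*", "/admin/public/page.html"))  # True: allowed
print(parser.can_fetch("*", "/blog/post"))               # True: no rule matches
```

To check the deployed file instead, call `parser.set_url("https://yourdomain.com/robots.txt")` followed by `parser.read()`.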
Common Configurations
- Allow all: User-agent: * followed by Allow: /
- Block all: User-agent: * followed by Disallow: /
- Block AI bots: Add separate User-agent blocks for GPTBot, CCBot, Google-Extended, etc.
- WordPress: Block /wp-admin/ but allow /wp-admin/admin-ajax.php
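The WordPress configuration, written out in full (a sketch; swap in your own domain on the Sitemap line):

```text
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/sitemap.xml
```

Under the Robots Exclusion Protocol (RFC 9309), the longest matching path wins, so the more specific Allow rule keeps admin-ajax.php crawlable even though /wp-admin/ is blocked.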
Tips
- Test your robots.txt with the robots.txt report in Google Search Console before deploying
- Use the OG Meta Previewer to check that blocked pages aren't breaking social share previews
- Combine robots.txt with the Heading Checker for a complete SEO audit
- Remember: robots.txt is a suggestion, not enforcement — malicious bots ignore it
FAQ
What is robots.txt?
robots.txt is a text file placed at the root of your website (example.com/robots.txt) that tells search engine crawlers which pages they can and cannot access. It follows the Robots Exclusion Protocol standard.
Does robots.txt block pages from appearing in search results?
No. robots.txt prevents crawling, but pages can still appear in search results if other sites link to them. To prevent indexing, use a 'noindex' meta tag or X-Robots-Tag HTTP header instead.
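For example, this meta tag keeps a page out of the index (the page must remain crawlable so the directive can be seen, so don't also block it in robots.txt):

```html
<!-- In the page's <head>: crawlable, but excluded from the index -->
<meta name="robots" content="noindex">
```

The equivalent HTTP header, useful for non-HTML files such as PDFs, is `X-Robots-Tag: noindex`.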
How do I block AI crawlers like GPTBot?
Add a User-agent block for each AI crawler you want to block. For example: User-agent: GPTBot followed by Disallow: /. Common AI crawlers include GPTBot, ChatGPT-User, CCBot, Google-Extended, and anthropic-ai.
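Put together, a file blocking several AI crawlers looks like this:

```text
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

Each crawler gets its own User-agent block; the blanket Disallow: / denies that crawler access to the entire site while leaving other crawlers unaffected.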
Where should I place robots.txt?
Place it in the root directory of your website so it is accessible at https://yourdomain.com/robots.txt. It must be at the root — subdirectory robots.txt files are ignored by most crawlers.
Try It Now
Ready to create your robots.txt? Open the robots.txt Generator — it works entirely in your browser with no sign-up required.