Where does robots.txt go?

It must live at the root of your domain, served as text at https://example.com/robots.txt. Crawlers only read it from that exact location, not from subfolders.
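
As a small illustration of that rule, the robots.txt location that governs any page can be derived from the page's scheme and host alone. The following is a minimal Python sketch; the helper name robots_url is hypothetical and not part of any tool mentioned here.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL that governs page_url (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=7"))
```

Because the host is part of the result, a subdomain such as shop.example.com resolves to its own robots.txt rather than the one on example.com.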

What does Disallow do?

Disallow tells compliant crawlers not to fetch URLs matching the given path prefix. An empty Disallow value means nothing is blocked, allowing the whole site.

Does robots.txt guarantee privacy?

No. It is advisory and only respected by well-behaved bots. Sensitive pages should be protected by authentication, not by Disallow, since the file itself is public.

What does Crawl-delay do?

Crawl-delay asks a crawler to wait the given number of seconds between requests. Google ignores it, but Bing and some other crawlers honor it to reduce server load.

Should I include a Sitemap line?

Yes. A Sitemap directive pointing to your sitemap URL helps search engines discover your pages. It must use an absolute URL, and a file can contain more than one Sitemap line.
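
To see how a crawler-side library might read the Crawl-delay and Sitemap values, here is a short sketch using Python's standard urllib.robotparser module. The sample file contents and the bot name ExampleBot are made up for illustration, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file contents, used only for this illustration.
SAMPLE = """\
User-agent: *
Disallow: /cart
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# "ExampleBot" has no group of its own, so the * group applies to it.
delay = parser.crawl_delay("ExampleBot")
sitemaps = parser.site_maps()
blocked = not parser.can_fetch("ExampleBot", "https://example.com/cart/checkout")

print(delay, sitemaps, blocked)
```

A polite crawler would sleep for the returned delay between requests; whether a given real crawler does so is up to its operator.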

What is the robots.txt Builder?

It builds a valid robots.txt from crawl rules per user agent, allow and disallow paths, an optional crawl-delay, and a sitemap URL, then lets you copy or download the file for your site root. It runs free in your browser on Gera Tools, with nothing uploaded.

robots.txt Builder — Gera Tools

Name: robots.txt Builder
Creator: Gera Tools
License: https://creativecommons.org/licenses/by/4.0/

robots.txt, built correctly

robots.txt is the file crawlers read first to learn which parts of your site they may fetch. A small mistake, such as a stray Disallow: / or a malformed path, can block crawlers from an entire site, so it pays to generate the file cleanly. This builder produces a valid file from simple inputs and lets you copy or download it.

How it works

The Robots Exclusion Protocol groups directives under one or more User-agent lines. This builder emits a single group: a User-agent line, then one Disallow: line per disallow path and one Allow: line per allow path. Paths are matched as prefixes from the site root, so Disallow: /admin blocks /admin, everything beneath it, and any other path that begins with the same characters, such as /administrator. When an Allow and a Disallow rule both match a URL, crawlers that follow RFC 9309 apply the longest matching rule, so a narrower Allow can carve an exception out of a broader Disallow. An optional Crawl-delay line is added when you set a positive number. Finally, a Sitemap: directive with your absolute sitemap URL is appended at the end; it applies regardless of user-agent group.
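
The assembly steps described above can be sketched in a few lines of Python. This is an illustration only, not the builder's actual code: the function name build_robots_txt and its parameters are hypothetical, and emitting an empty Disallow: line when no paths are given is an assumption about how "allow everything" is written out.

```python
from typing import Optional, Sequence


def build_robots_txt(
    user_agent: str = "*",
    disallow: Sequence[str] = (),
    allow: Sequence[str] = (),
    crawl_delay: Optional[int] = None,
    sitemap: Optional[str] = None,
) -> str:
    """Assemble a single-group robots.txt (illustrative sketch)."""
    lines = [f"User-agent: {user_agent}"]
    if disallow:
        lines += [f"Disallow: {path}" for path in disallow]
    else:
        # Assumption: an empty Disallow value is emitted to allow the whole site.
        lines.append("Disallow:")
    lines += [f"Allow: {path}" for path in allow]
    if crawl_delay is not None and crawl_delay > 0:
        lines.append(f"Crawl-delay: {crawl_delay}")
    if sitemap:
        # Sitemap is not tied to a group, so it goes last.
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


print(build_robots_txt(
    disallow=["/admin", "/cart"],
    allow=["/admin/help"],
    crawl_delay=10,
    sitemap="https://example.com/sitemap.xml",
))
```

Keeping the inputs as plain lists of paths mirrors the form fields, and a zero or missing crawl delay simply omits the Crawl-delay line.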

Example output

To allow the whole site, leave the disallow box empty. With two disallow paths, one allow path, a crawl delay of 10, and a sitemap URL, a generated file looks like this:

User-agent: *
Disallow: /admin
Disallow: /cart
Allow: /admin/help
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
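
To show how a compliant crawler might evaluate the Allow and Disallow lines in a file like this one, the sketch below applies the precedence rule from RFC 9309: the longest matching path wins, and Allow wins a tie. It is a simplification that handles plain prefixes only (no * or $ wildcards), and the names RULES and is_allowed are hypothetical.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, path prefix)

# The Allow/Disallow lines from the example file above.
RULES: List[Rule] = [
    ("disallow", "/admin"),
    ("disallow", "/cart"),
    ("allow", "/admin/help"),
]


def is_allowed(path: str, rules: List[Rule]) -> bool:
    """Longest matching prefix wins; Allow wins a tie; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty values and non-matching prefixes are skipped
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


for p in ("/admin/help/faq", "/admin/settings", "/administrator", "/blog"):
    print(p, is_allowed(p, RULES))
```

Note that /administrator is blocked too, because matching is by prefix rather than by folder. Some parsers, including Python's own urllib.robotparser, apply rules in file order instead, so results for overlapping rules can differ between tools.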

Targeting specific crawlers

Using User-agent: * applies your rules to every compliant bot that has no more specific group of its own. You can instead target a named crawler, for example User-agent: Googlebot, to set different policies for different bots. A crawler that finds a group naming it follows only that group and ignores the * group, so repeat any shared rules in each named group. Multiple groups are supported in the full robots.txt format; this builder generates one group per session, so run it twice and combine the outputs if you need separate policies for search crawlers and AI training crawlers.

Common named crawlers you might want to address:

Crawler	Operator	Typical purpose
Googlebot	Google	Search indexing
GPTBot	OpenAI	AI model training
ClaudeBot	Anthropic	AI model training
CCBot	Common Crawl	Open dataset
Bingbot	Microsoft	Search indexing
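
Combining separately generated groups into one file, as suggested above, could look like the following Python sketch. The function name combine_groups and the sample policy are hypothetical; the only format facts relied on are that groups are separated by blank lines and that a Sitemap line is independent of any group.

```python
from typing import Dict, List, Optional


def combine_groups(groups: Dict[str, List[str]], sitemap: Optional[str] = None) -> str:
    """Join per-crawler rule lines into one file, one blank line between groups."""
    blocks = []
    for user_agent, rule_lines in groups.items():
        blocks.append("\n".join([f"User-agent: {user_agent}", *rule_lines]))
    text = "\n\n".join(blocks) + "\n"
    if sitemap:
        # Sitemap applies to the whole file, so one line at the end is enough.
        text += f"\nSitemap: {sitemap}\n"
    return text


# Hypothetical policy: different rules for a search crawler, an AI crawler,
# and everyone else.
policy = {
    "Googlebot": ["Disallow: /cart"],
    "GPTBot": ["Disallow: /"],
    "*": ["Disallow: /admin"],
}
combined = combine_groups(policy, "https://example.com/sitemap.xml")
print(combined)
```

In this sample, Googlebot would follow only its own group, so /admin is not blocked for it unless that rule is repeated there.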

Common mistakes to avoid

Disallow: / blocks everything. If you accidentally leave a forward slash alone in the disallow field, every crawler is locked out of your whole site. Always double-check the generated file before uploading.

Paths are case-sensitive. Rules are compared against the URL path exactly, so a rule for /Admin does not cover /admin. Write each path with the same capitalization as your actual URLs.

robots.txt is not a security tool. It is advisory and public. Any path you list in a Disallow rule is readable by anyone who opens the file. For real access control, use authentication.

Blocking a URL does not deindex it. If a page is already in Google’s index, a new Disallow rule stops recrawling but does not remove the existing listing. To remove it, use a noindex meta tag, which works only while the page stays crawlable so that Google can see the tag, or the Removals tool in Search Console.

Always verify the result before relying on it. Google Search Console’s robots.txt report, which replaced the older robots.txt tester, shows how Google fetched and parsed your live file.
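
Some of the mistakes listed above can also be caught with a quick automated check before uploading. The following Python sketch is a hypothetical linter, not a feature of the builder or of Search Console; it flags only the few patterns named in its comments and does not validate the file fully.

```python
from typing import List


def lint_robots_txt(text: str) -> List[str]:
    """Return human-readable warnings for a few common robots.txt mistakes."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        is_path_rule = field in ("allow", "disallow")
        if field == "disallow" and value == "/":
            warnings.append(f"line {number}: Disallow: / blocks the whole site")
        elif is_path_rule and value and not value.startswith(("/", "*")):
            warnings.append(f"line {number}: path should start with /")
        elif is_path_rule and value != value.lower():
            warnings.append(f"line {number}: mixed-case path, check it matches the real URL")
        elif field == "sitemap" and not value.startswith(("http://", "https://")):
            warnings.append(f"line {number}: Sitemap should be an absolute URL")
    return warnings


# Hypothetical file containing one instance of each flagged mistake.
risky = "User-agent: *\nDisallow: /\nDisallow: admin\nAllow: /Admin/help\nSitemap: /sitemap.xml\n"
for warning in lint_robots_txt(risky):
    print(warning)
```

A warning here is a prompt to look again, not proof of an error: a site-wide Disallow: / is sometimes intentional, for example on a staging host.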