Robots.txt Tester

Check your robots.txt file to make sure bots can crawl the site properly

Enter URLs to test whether your robots.txt file blocks them

How to read a robots.txt file?

User-agent

This directive identifies the specific crawler (or, with an asterisk, all crawlers) that the rules below it apply to. Each search engine has its own bot: Google has Googlebot, Bing has Bingbot, and Yahoo! has Slurp. Most search engines run multiple spiders for their regular index, ad programs, images, videos, etc. The robots.txt validator will show which crawlers can or can't request your website content.

Allow

This directive specifies website files, categories, and pages that the designated crawlers may access. When no path is specified, the directive is ignored. It's used to counteract the Disallow directive, for example, to allow access to a page or file within a disallowed directory. The robots.txt tester will show you which pages bots can access.

Disallow

This directive is added to robots.txt to prevent search engines from crawling specific website files and URLs. You can disallow internal and service files, for example, a folder with user data specified during registration. The tool will show which of the entered pages are not allowed for crawling.
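
The three directives above can be tried out locally. The following is a minimal sketch that uses Python's standard urllib.robotparser module; the robots.txt content, the example.com URLs, and the load_rules helper are assumptions made up for illustration. Note that Python's parser applies the first matching rule in file order, so the Allow line is listed before the Disallow line; search engines may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /account/help.html
Disallow: /account/

User-agent: *
Disallow: /tmp/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Googlebot matches its own group: the Allow line opens one page
# inside the otherwise disallowed /account/ folder.
print(rules.can_fetch("Googlebot", "https://example.com/account/help.html"))
print(rules.can_fetch("Googlebot", "https://example.com/account/settings"))

# Any other bot falls back to the "User-agent: *" group.
print(rules.can_fetch("Bingbot", "https://example.com/tmp/cache.html"))
```

Because Googlebot has its own group here, the /tmp/ rule from the catch-all group does not apply to it in this sketch.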

How to use our online Robots.txt Tester?

We created the robots.txt tester so that everyone can quickly check their file. To use our tool, paste the necessary URLs into the input field and click Check your robots.txt. As a result, you will learn whether specific pages are allowed or blocked from crawling. A URL will be highlighted in red if it's blocked from crawling, and if the page is allowed to be crawled by bots, it'll be highlighted in green. Moreover, the tool will show you the robots.txt file for each entered domain (if you check the Show the robots.txt file box).
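
For readers who want a rough local equivalent of this kind of check, here is a minimal sketch in Python. It is not how the online tool is implemented; the check_urls function, the sample rules, and the example.com URLs are hypothetical, and the "allowed"/"blocked" labels simply stand in for the green and red highlighting.

```python
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt, urls, user_agent="*"):
    """Return a dict mapping each URL to "allowed" or "blocked"."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        url: "allowed" if parser.can_fetch(user_agent, url) else "blocked"
        for url in urls
    }


# Hypothetical rules and URLs, for illustration only.
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /checkout/\n"
report = check_urls(
    SAMPLE_ROBOTS_TXT,
    [
        "https://example.com/",
        "https://example.com/checkout/cart",
    ],
)

for url, status in report.items():
    print(status, url)
```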

FAQ

Why is a robots.txt file necessary?

Robots.txt files provide search engines with important information about crawling files and web pages. This file is used primarily to manage crawler traffic to your website in order to avoid overloading your site with requests.

You can solve two problems with its help:

First, reduce the likelihood of certain pages being crawled and, as a result, indexed and shown in search results.
Second, save crawl budget by blocking crawlers from pages that don't need to be indexed.

However, if you want to prevent a page or another digital asset from appearing in Google Search, a more reliable option is to add the noindex value to the robots meta tag. The page must stay crawlable for this to work: a crawler blocked by robots.txt never sees the tag.
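
To confirm that a page really carries this robots meta tag, its HTML can be inspected. The sketch below uses Python's standard html.parser module; the RobotsMetaParser class, the has_noindex helper, and the sample HTML are assumptions for illustration, and the check only looks at meta tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the values of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of values.
            for token in attr_map.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page's robots meta tag contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, for illustration only.
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(has_noindex(page))
```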

How to make sure robots.txt is working fine?

A quick and easy way to make sure your robots.txt file is working properly is to use special tools.

For example, you can validate your robots.txt by using our tool: enter up to 100 URLs and it will show you whether the file blocks crawlers from accessing specific URLs on your site.

To quickly detect errors in the robots.txt file, you can also use Google Search Console.

Common robots.txt issues

The file is not in the .txt format. In this case, bots will not be able to find and crawl your robots.txt file because of the format mismatch.
Robots.txt is not located in the root directory. The file must be placed in the top-most directory of the website. If it is placed in a subfolder, your robots.txt file is probably not going to be visible to search bots. To fix this issue, move your robots.txt file to your root directory.

The Disallow directive must specify the particular files or pages that crawlers should not request. It can be combined with the User-agent directive to block the website from a particular crawler.

Disallow without value. An empty Disallow: directive tells bots that they can visit any website pages.
Blank lines in the robots.txt file. Do not leave blank lines between the directives of one group. Some parsers treat a blank line as the end of a group and may misread the rules that follow it. Use an empty line only before a new User-agent group.
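
The last two issues can be reproduced with a short sketch based on Python's standard urllib.robotparser module. The is_allowed helper, the three sample files, and the example.com host are hypothetical. The blank-line result reflects how Python's parser behaves; other parsers may be more lenient.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, path, user_agent="*"):
    """Check one path against robots.txt text (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "https://example.com" + path)


# An empty Disallow value permits everything.
ALLOW_EVERYTHING = "User-agent: *\nDisallow:\n"
# A single slash blocks the whole site.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
# A blank line between User-agent and its rule: Python's parser
# closes the group early and drops the Disallow line.
STRAY_BLANK_LINE = "User-agent: *\n\nDisallow: /private/\n"

print(is_allowed(ALLOW_EVERYTHING, "/any/page.html"))
print(is_allowed(BLOCK_EVERYTHING, "/any/page.html"))
print(is_allowed(STRAY_BLANK_LINE, "/private/data.html"))
```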

Robots.txt best practices

Use the proper case in robots.txt. Bots treat folder and section names as case-sensitive, so a rule for /Folder/ does not match /folder/, and vice versa.
Each directive must begin on a new line. There can only be one parameter per line.
The use of space at the beginning of a line, quotation marks, or semicolons for directives is strictly prohibited.
There is no need to list every file you want to block from crawlers. You just need to specify a folder or directory in the Disallow directive, and all of the files from these folders or directories will also be blocked from crawling.
You can use wildcards to create robots.txt with more flexible instructions. Robots.txt does not support full regular expressions, only these two special characters:
- The asterisk (*) matches any sequence of characters.
- The dollar sign ($) marks the end of the URL path, so a rule ending in $ matches only URLs that end exactly there.
Use server-side authentication to block access to private content. Robots.txt is publicly readable and does not restrict access, so it cannot protect sensitive data on its own.
Use one robots.txt file per domain. If you need to set crawl guidelines for different sites, create a separate robots.txt for each one.
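
The asterisk and dollar sign rules from the list above can be tested locally by translating a robots.txt path pattern into a Python regular expression. This is a minimal sketch under stated assumptions: pattern_matches is a hypothetical helper, it checks a single pattern against a single path, and it does not decide which rule wins when several match. Python's built-in urllib.robotparser does simple prefix matching, which is why a small custom matcher is used here.

```python
import re


def pattern_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern (with * and $) against a URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts and let each * match any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    compiled = re.compile(regex)
    if anchored:
        # A trailing $ means the path must end where the pattern ends.
        return compiled.fullmatch(path) is not None
    # Without $, the pattern only has to match the start of the path.
    return compiled.match(path) is not None


# Hypothetical patterns and paths, for illustration only.
print(pattern_matches("/*.pdf$", "/files/report.pdf"))
print(pattern_matches("/*.pdf$", "/files/report.pdf?download=1"))
print(pattern_matches("/Private/", "/private/data.html"))
```

The last call also illustrates the case-sensitivity point: the capitalized pattern does not match the lowercase path.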

Other ways to test your robots.txt file

You can analyze your robots.txt file using the Google Search Console tool.

This robots.txt tester shows you whether your robots.txt file is blocking Google crawlers from accessing specific URLs on your website. The tool is unavailable in the new version of GSC, but you can access it by clicking this link.

Choose your domain and the tool will show you the robots.txt file, its errors, and warnings.

Go to the bottom of the page, where you can type the URL of a page in the text box. As a result, the robots.txt tester will verify that your URL has been blocked properly.

What should be in a robots.txt file?

Robots.txt files contain information that instructs crawlers on how to interact with a particular site. It starts with a User-agent directive that specifies the search bot to which the rules apply. Then you should specify directives that allow and block certain files and pages from crawlers. At the end of a robots.txt file, you can optionally add a link to your sitemap.
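
As an illustration of this layout, the sketch below parses a small file with Python's standard urllib.robotparser module and reads back both the crawl rule and the sitemap link. The file content and the example.com URLs are made up for the example, and site_maps() requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: a User-agent line, a rule, then an optional sitemap link.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The rule applies to every bot because of "User-agent: *".
print(parser.can_fetch("*", "https://example.com/admin/login"))

# site_maps() returns the listed sitemap URLs, or None if there are none.
print(parser.site_maps())
```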

How to open a robots.txt file?

In order to access the content of any website’s robots.txt file, you have to type https://yourwebsite/robots.txt into the browser.
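
Since the file always sits at the root of the host, its address can be derived from any page URL. The following minimal sketch uses Python's standard urllib.parse module; the robots_txt_url helper and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Build the robots.txt address for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, drop the path, query string, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=7#top"))
print(robots_txt_url("https://shop.example.com/cart"))
```

Each subdomain gets its own address here, which matches the advice to keep a separate robots.txt per site.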

Can bots ignore robots.txt?

Well-behaved crawlers check for a robots.txt file when visiting a website. However, although the file provides rules for bots, it can't enforce them. The robots.txt file is a list of guidelines for crawlers, not strict rules, so some bots may ignore these directives.

How to test if robots.txt is working properly?

You can check the robots.txt file with our tool. Just enter the necessary URLs. Here you’ll see if a given website URL is allowed or blocked from crawling.

How do I fix robots.txt?

A robots.txt file is a text document. You can edit the current file in a text editor and then upload it again to the website root directory. What's more, many CMSs, including WordPress, have plugins that let you edit the robots.txt file directly from the admin dashboard.

Can robots.txt be redirected?

The file can only be accessed at http://yourwebsite/robots.txt and cannot be redirected to other website pages. At the same time, you can set up a redirect to the robots.txt file of another domain.

Does Google respect robots.txt?

When visiting a website, Google's crawlers first refer to the robots.txt file and follow its crawling guidelines. However, robots.txt controls crawling, not indexing: Google may still index a blocked URL without crawling it if other pages link to it.