ABOUT ROBOTS.TXT FILE
A robots.txt file is a simple plain-text file. It helps search engines index a website more appropriately and tells search engine crawlers, such as those of Google, Bing, and Yahoo, which content on the website they should not crawl.
There are five common terms you’re likely to come across in a robots.txt file:
User-agent: The specific web crawler to which you’re giving crawl instructions (usually a search engine).
Disallow: The command used to tell a user-agent not to crawl a particular URL. Each “Disallow:” line can name only one URL path.
Allow: The command to tell Googlebot it can access a page or subfolder even though its parent page or subfolder may be disallowed.
Crawl-delay: How many seconds a crawler should wait before loading and crawling page content. Note that Googlebot does not acknowledge this command, but crawl rate can be set in Google Search Console.
Sitemap: Used to call out the location of any XML sitemap(s) associated with this URL. Note this command is only supported by Google, Bing, and Yahoo.
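
To make the five terms concrete, here is a minimal sketch that feeds a small sample file to Python's standard-library parser, urllib.robotparser, and asks it how it reads each directive. The domain www.example.com, the paths, and the crawler name ExampleBot are placeholders, not values from any real site. Python's parser applies rules in file order, so the narrower Allow line is listed before the broader Disallow line; other crawlers may resolve conflicts differently. The site_maps() call needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample file that uses all five terms.
SAMPLE = """\
User-agent: *
Allow: /admin/help/
Disallow: /admin/
Crawl-delay: 10

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

base = "https://www.example.com"

# "ExampleBot" has no group of its own, so the "*" group applies to it.
admin_ok = parser.can_fetch("ExampleBot", base + "/admin/")                   # False
help_ok = parser.can_fetch("ExampleBot", base + "/admin/help/faq.html")      # True
blog_ok = parser.can_fetch("ExampleBot", base + "/blog/post.html")           # True

delay = parser.crawl_delay("ExampleBot")   # seconds requested by the file
sitemaps = parser.site_maps()              # list of sitemap URLs, or None

print(admin_ok, help_ok, blog_ok, delay, sitemaps)
```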
SAMPLE ROBOTS.TXT FILE
The Robots.txt Generator tool is designed to help you create a robots.txt file without a lot of technical knowledge. The robots.txt file helps search engines index your site more appropriately. Search engines use website crawlers, or robots, that review all the pages of a website. There may be parts of your website that you do not want them to crawl and include in user search results, such as an admin page. Please be careful, though: the contents of your robots.txt file can have a significant impact on whether Google is able to access your website.
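
The warning above comes down to a single character in some cases. The sketch below compares two hypothetical files, one that hides only an admin directory and one that blocks the whole site, using Python's urllib.robotparser as a stand-in for a crawler. The helper is_allowed, the domain, and the paths are illustrative assumptions, and a real crawler such as Googlebot may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, agent="Googlebot"):
    """Return True if this robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Blocks only the admin area.
BLOCK_ADMIN_ONLY = "User-agent: *\nDisallow: /admin/\n"
# Blocks the entire site for every crawler.
BLOCK_EVERYTHING = "User-agent: *\nDisallow: /\n"
# An empty Disallow value blocks nothing.
BLOCK_NOTHING = "User-agent: *\nDisallow:\n"

home = "https://www.example.com/"
admin = "https://www.example.com/admin/login"

for name, text in [("admin only", BLOCK_ADMIN_ONLY),
                   ("everything", BLOCK_EVERYTHING),
                   ("nothing", BLOCK_NOTHING)]:
    print(name, is_allowed(text, home), is_allowed(text, admin))
```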
HOW TO CREATE ROBOTS.TXT?
To generate the robots.txt file, follow these steps:
- Open the Robots.txt Generator.
- Here you will see several options. Not all of them are mandatory, but choose carefully. The first option contains the default values for all robots/web crawlers and a crawl delay.
- If you want a crawl delay, you can select the value in seconds as per your requirements.
- The next option is about a sitemap. Please enter a valid XML sitemap URL.
- The next options list the individual search engine bots. If you want a specific search engine bot to crawl your website, select “Allowed” from the dropdown for that bot. If you don’t want a particular search engine bot to crawl your website, select “Refused” from the dropdown for that bot.
- The last option is for restricted directories, that is, directories you want to keep crawlers out of. Each path is relative to the root and must end with a trailing slash “/”.
- Click the “Create Robots.txt” button. Copy the generated text and paste it into a text file. You can also click “Create and Save as Robots.txt” to download the file directly to your local PC.
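
The options in the steps above can be sketched as a small text-building function. This is not the generator's actual code: the function build_robots_txt, its parameter names, and the choice to repeat the restricted directories and crawl delay in every allowed bot's group are all assumptions made for illustration.

```python
def build_robots_txt(default_allowed=True, crawl_delay=None, sitemap_url=None,
                     bots=None, restricted_dirs=()):
    """Build robots.txt text from generator-style options (hypothetical API).

    bots maps a crawler name to True ("Allowed") or False ("Refused").
    """
    # Restricted paths are relative to the root and need a trailing slash.
    for path in restricted_dirs:
        if not (path.startswith("/") and path.endswith("/")):
            raise ValueError(f"expected a path like /dir/, got {path!r}")

    def group(agent, allowed):
        lines = [f"User-agent: {agent}"]
        if not allowed:
            lines.append("Disallow: /")          # refuse the whole site
        elif restricted_dirs:
            lines.extend(f"Disallow: {path}" for path in restricted_dirs)
        else:
            lines.append("Disallow:")            # empty value blocks nothing
        if crawl_delay is not None:
            lines.append(f"Crawl-delay: {crawl_delay}")
        return "\n".join(lines)

    # Bot-specific groups first, then the default group for everyone else.
    blocks = [group(agent, allowed) for agent, allowed in (bots or {}).items()]
    blocks.append(group("*", default_allowed))
    if sitemap_url:
        blocks.append(f"Sitemap: {sitemap_url}")
    return "\n\n".join(blocks) + "\n"


robots_text = build_robots_txt(
    crawl_delay=10,
    sitemap_url="https://www.example.com/sitemap.xml",
    bots={"Googlebot": True, "Baiduspider": False},
    restricted_dirs=["/admin/"],
)
print(robots_text)
```

Saving the returned string as robots.txt, for example with pathlib.Path("robots.txt").write_text(robots_text), corresponds to the copy-and-paste step; the file then has to be uploaded to the root of the website for crawlers to find it.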