Create a robots.txt file
1. Create a list of pages or directories you don’t want search engines to crawl.
Examples of things you might not want search engine bots to crawl include a staging version of your website or resources like PDFs. Blocking these helps ensure non-relevant pages don’t use up your crawl budget and leave relevant pages uncrawled.
2. Check whether you already have a robots.txt file by typing domain.com/robots.txt into your browser address bar.
Replace domain.com with your domain name. Some website builders, CMS platforms, and ecommerce platforms, such as WordPress, BigCommerce, and Shopify, automatically create a robots.txt file. If you don’t have a robots.txt file, you should see a 404 error page.
3. Use a plain text editor like Notepad to create and save a new robots.txt file.
If you’re using WordPress and found a robots.txt file in step 2, log into your web hosting server and check for a robots.txt file in the site root. If there isn’t one, WordPress is serving a virtual file. Copy the content you saw on the domain.com/robots.txt page into the file you just created.
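If you prefer to create the file with a script rather than a text editor, a minimal sketch in Python looks like this. The rules here are placeholders; after writing the file, upload it to your site’s root so it is served at domain.com/robots.txt.

```python
# Write a minimal robots.txt to the current directory.
# The Disallow path below is a placeholder -- replace it with your own rules.
rules = "User-agent: *\nDisallow: /staging/\n"

with open("robots.txt", "w", encoding="utf-8") as f:
    f.write(rules)
```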
4. Fill your robots.txt with rules specifying what you want search engine bots to crawl.
For example, if you want to allow Googlebot, Google’s search engine bot, to crawl a specific page, you would use:

User-agent: Googlebot
Allow: /page-url/

If you want to disallow Googlebot from crawling a page, you would use:

User-agent: Googlebot
Disallow: /page-url/

Replace Googlebot with any specific search engine bot you want to create rules for, or use * to target all search engine bots. Replace /page-url/ with the URL path of the page you want the rules to apply to.
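Putting several rules together, a complete robots.txt might look like the following. The paths and the sitemap URL are placeholders for illustration, not values you should copy verbatim:

```text
# Block all bots from the staging area
User-agent: *
Disallow: /staging/

# Additionally block Googlebot from crawling PDFs
User-agent: Googlebot
Disallow: /*.pdf

# Optional: point bots at your sitemap
Sitemap: https://domain.com/sitemap.xml
```

Groups of rules are separated by blank lines, and each group starts with a User-agent line naming the bot it applies to.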
5. Test your robots.txt by copying and pasting the file contents into a tester like Google’s or TechnicalSEO’s to check for any errors or warnings.
If there are any, compare the flagged line against your robots.txt file and correct it.
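You can also sanity-check your rules locally with Python’s built-in urllib.robotparser before uploading the file. The rules and URLs below are illustrative placeholders; substitute the contents of your own robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules -- replace with the contents of your own robots.txt.
rules = """User-agent: Googlebot
Disallow: /staging/

User-agent: *
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Googlebot is blocked from /staging/, but other bots are not.
print(parser.can_fetch("Googlebot", "https://example.com/staging/page"))  # False
print(parser.can_fetch("Bingbot", "https://example.com/staging/page"))    # True
```

This only checks how the rules parse; it doesn’t replace an online tester, which can also flag syntax warnings.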