robots.txt

Do you understand the importance of the robots.txt file and how it affects your search engine ranking? If you are not sure, this article is for you. It covers everything you should know about robots.txt.

First, let us be clear: changes to the robots.txt file do affect your website's SEO performance and search engine ranking. The file instructs search engine bots how to crawl and index the pages on your website. In simple words, it lets search engines know which pages of your site they should index. The robots.txt file is very powerful, so it should be used with care.

You can check your website's live robots.txt by typing the following URL in your browser's address bar.

www.example.com/robots.txt //Replace 'example' with your website domain name//

If nothing shows up, the robots.txt file has not yet been created in your site's root directory. We highly recommend that you create one.
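If you would rather script this check than use the browser, here is a minimal Python sketch using only the standard library. The helper names robots_url and fetch_robots are our own, chosen for illustration, and the sketch assumes you pass a full URL that includes http:// or https://.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(page_url):
    """Return the robots.txt URL for the site that serves page_url."""
    parts = urlsplit(page_url)
    # robots.txt lives at the root of the host, whatever the page path is.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(page_url, timeout=10):
    """Return the robots.txt text, or None if it is missing or unreachable."""
    try:
        with urlopen(robots_url(page_url), timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        # HTTP errors (such as 404), connection errors and timeouts are all OSError subclasses.
        return None
```

Calling fetch_robots("https://www.example.com/") with your own domain should return the file contents if the file exists, and None if it does not.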

How to Create a Robots.txt file?

Create a blank text file, save it as robots.txt, and upload it to your website’s root folder. To upload a file to the root folder you need an FTP client. Connect to your website with the FTP client, open the root folder, and upload the file. If you are creating a new robots.txt file, you can configure it before uploading.
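The same steps can be sketched in Python with the standard-library pathlib and ftplib modules. This is only an illustration: write_robots and upload_robots are hypothetical helper names, the host and login details are yours to supply, and we assume your FTP login directory is the website root, which is not true on every host.

```python
from ftplib import FTP
from pathlib import Path


def write_robots(directory, rules="User-agent: *\nAllow: /\n"):
    """Save the given rules as robots.txt inside directory and return the path."""
    path = Path(directory) / "robots.txt"
    path.write_text(rules, encoding="utf-8")
    return path


def upload_robots(path, host, user, password):
    """Upload the file as robots.txt to the FTP login directory.

    Assumption: the login directory is the website root folder.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(path, "rb") as handle:
            ftp.storbinary("STOR robots.txt", handle)
```

You would call write_robots first, review the file, and only then call upload_robots with your own FTP details.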

How to Configure Robots.txt?

The basic thing you have to know about in a robots.txt file is the user agent. User agents are the bots you are instructing how to crawl and index the pages on your website. The ‘Allow’ and ‘Disallow’ instructions tell them which parts you want them to crawl and which parts you do not. Bots that follow the file skip all the disallowed parts.

You should study the bots that crawl your site and decide which parts of your site you do not want them to crawl. Check your website's logs to see which bots visit your site.
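One way to get that list from your logs is a short script. The sketch below assumes your server writes the common "combined" access-log format, where the user agent is the last quoted field, and that crawlers identify themselves with words such as "bot", "crawler" or "spider". The function name count_bots and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format, where the user agent is the last quoted field.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')
# Assumption: crawlers name themselves with one of these words.
BOT_HINT = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def count_bots(log_lines):
    """Count requests per bot-like user agent string."""
    counts = Counter()
    for line in log_lines:
        match = USER_AGENT.search(line)
        if match and BOT_HINT.search(match.group(1)):
            counts[match.group(1)] += 1
    return counts


# Made-up sample lines; in practice, read the lines from your own access log.
sample = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [01/Jan/2024:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
    '203.0.113.7 - - [01/Jan/2024:10:00:09 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
]

for agent, hits in count_bots(sample).most_common():
    print(hits, agent)
```

Bots that hide behind a normal browser user agent will not show up this way, so treat the result as a starting point.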

Configure your Robots.txt file according to your needs. Use the following sample formats for your convenience.

1. Allow indexing of everything

User-agent: * 
Allow: /

or

User-agent: *
Disallow:

2. Disallow indexing of everything

User-agent: *
Disallow: /

3. Disallow indexing of specific folders

User-agent: *
Disallow: /folder1/
Disallow: /folder2/

For example:

User-agent: *
Disallow: /cgi-bin/
Disallow: /comments/feed/

4. Disallow indexing of specific webpages

User-agent: *
Disallow: /webpage.html

For example:

User-agent: *
Disallow: /readme.html

5. Allow or disallow indexing of specific webpages/folders for specific bots

User-agent: Googlebot
Allow: /folder1/
Disallow: /folder2/
Disallow: /webpage.html
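
Before uploading rules like these, you can check how a parser reads them. Here is a small sketch with Python's standard urllib.robotparser module, using the sample 5 rules with the placeholder names written without spaces. The helper allowed and the www.example.com domain are assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# Rules from sample 5, with the placeholder names folder1, folder2 and webpage.html.
rules = """\
User-agent: Googlebot
Allow: /folder1/
Disallow: /folder2/
Disallow: /webpage.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(agent, path):
    """Return True if agent may crawl path on the placeholder site."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


print(allowed("Googlebot", "/folder1/page.html"))  # True
print(allowed("Googlebot", "/folder2/page.html"))  # False
print(allowed("Googlebot", "/webpage.html"))       # False
print(allowed("Bingbot", "/folder2/page.html"))    # True, no group applies to Bingbot
```

Keep in mind that Python's parser applies the first matching rule in file order and does not understand wildcards such as *, so its answers can differ from how Google reads the same file.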

6. Disallow access to all URLs that include a question mark (?)

User-agent: *
Disallow: /*?

7. Sitemap Parameter

User-agent: * 
Allow: /
Sitemap: http://www.yourwebsitename.com/sitemap.xml

If you submit your sitemaps manually in Google or Bing webmaster tools, you do not need the Sitemap parameter in your robots.txt file.
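
If you do keep the Sitemap parameter, you can confirm that a parser picks it up. The sketch below uses the site_maps() method of Python's urllib.robotparser, which is available from Python 3.8, and the placeholder sitemap URL from the sample above.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /
Sitemap: http://www.yourwebsitename.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# site_maps() returns a list of Sitemap URLs, or None when the file declares none.
print(parser.site_maps())
```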

Warning: Unless it is intentional, do not use the following format:

User-agent: *
Disallow: /

If you use the above format in your robots.txt file, no content on your site will be indexed by any search engine.
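
To catch this mistake before a file goes live, you could run a quick guard like the following. It is a rough sketch built on Python's urllib.robotparser; the function name blocks_site_root is hypothetical, and it only checks whether the site root is blocked, which is the usual sign of a blanket Disallow.

```python
from urllib.robotparser import RobotFileParser


def blocks_site_root(robots_text, agent="*"):
    """Return True if the rules stop agent from crawling the site root."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # The URL is a placeholder; only its path ("/") is compared with the rules.
    return not parser.can_fetch(agent, "https://www.example.com/")


print(blocks_site_root("User-agent: *\nDisallow: /"))  # True
print(blocks_site_root("User-agent: *\nDisallow:"))    # False
```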

If you have made changes to your robots.txt file, submit a request to let Google know that the file has been updated. Go to Google Webmasters > Search Console > Select Property > Crawl > robots.txt Tester > Refresh the Page > Click Submit.

The most commonly used robots.txt file:

User-agent: *
Allow: /

Here is the one we use on GIZEST:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Disallow: /cgi-bin/
Disallow: /comments/feed/

User-agent: NinjaBot
Disallow: 

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

Sitemap: http://www.gizest.com/sitemap_index.xml

We hope this article helped you. Subscribe to our newsletter to get daily updates in your inbox.
Leave a comment below if you have any questions about this topic.

 
